Backfilling Data

In Switchboard, when you publish a recipe or change an existing one, the recipe applies from that date forward.

In many cases, you will want to ingest data from earlier dates or update existing data to reflect a changed transformation. Each connector has its own sense of which dates it needs to pull; most connectors pull yesterday’s data by default.

If you need data from an earlier date or date range, you can use the backfill functionality.

Initiating Backfills

To initiate a backfill, select “Downloads > Files” from the navigation and click the “Backfill Options” button under the filter options.

First, pick the period of time for which you want to pull data (e.g., March 1st through 31st) with the Date Range calendar selector. Then, select the source from the Imports select box (the labels correspond to those defined in the import or download sections of your recipes). Finally, click the “Launch Backfill” button.

A modal dialog listing all of the dates you selected will appear on the screen; you may close it.

Your backfill request will be queued. Switchboard will request reports for the dates you selected, pull the data in, and reprocess it.

Restrictions on Backfilling

The main restriction on backfilling is that Switchboard currently limits each backfill request to a 30-day date range.

However, you can submit multiple requests (for example, you could pull data for February and then for January). This is a simple guardrail to prevent too many users from backfilling enormous amounts of data all at once.
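When the period you need is longer than 30 days, it can help to work out the individual date ranges before entering them in the Date Range selector. The following is a minimal Python sketch of that arithmetic, not part of Switchboard itself. It assumes the limit means at most 30 calendar days per request, counted inclusively; the function name split_into_backfill_windows and the MAX_DAYS constant are hypothetical.

```python
from datetime import date, timedelta

# Assumption: one backfill request may cover at most 30 calendar days, inclusive.
MAX_DAYS = 30


def split_into_backfill_windows(start, end, max_days=MAX_DAYS):
    """Split [start, end] (inclusive) into consecutive windows of at most max_days days."""
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    window_start = start
    while window_start <= end:
        # The window covers max_days days including its first day.
        window_end = min(window_start + timedelta(days=max_days - 1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows


# Example: plan the requests needed to cover January and February 2024.
windows = split_into_backfill_windows(date(2024, 1, 1), date(2024, 2, 29))
for window_start, window_end in windows:
    print(window_start.isoformat(), "to", window_end.isoformat())
```

Each printed range would then be entered as its own backfill request.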

Backfilling is also limited by how long each data source retains historical data, which may vary greatly from source to source.

“Backfillable” Data Sources

Switchboard can backfill data sources that have a “sense of date,” such as Google AdX. Any file-based or API-based data is generally backfillable.

The one type of data that is not backfillable is what we call “immediate scheduled downloaders.” This includes full syncs of databases that pull from queries. Those reports don’t really have a sense of date; they only have a sense of immediacy.

Another limitation is data that the source no longer preserves, such as a Salesforce table from 10 years ago.

Handling Data from Snapshot Data Sources

In cases where the data in the source is already bounded by dates that the client does not submit as parameters (e.g., the report is a daily snapshot), it may still be possible to effectively backfill the data.

Because the contents of the report determine the range of dates available for upload, an ordinary backfill will merely populate each specified date with whatever the report currently contains. The result is duplicate data that is unrelated to the time period you intended to backfill.

Therefore, in order to “backfill” this connector, you must create a new report that contains the desired data needed for backfilling and use this new report’s identifier as the target.

You can perform this action from a draft copy of the recipe you would ordinarily run: replace the unique identifier of the report with that of the “backfill report,” specify a table for the upload with the test_table parameter, and run the test. Doing so will populate Switchboard with the desired data. There is no need to publish this draft once the test is run.
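To see why an ordinary backfill duplicates snapshot data, consider the toy Python model below. It is only an illustration under stated assumptions and does not reflect how Switchboard is implemented: fetch_snapshot, naive_backfill, and the sample row are all hypothetical. The key point is that the snapshot ignores the requested date, so every backfilled date receives the same rows.

```python
from datetime import date, timedelta

# Hypothetical snapshot report: its rows are fixed by the report itself,
# so the requested date has no effect on what comes back.
SNAPSHOT_ROWS = [
    {"report_date": date(2024, 3, 31), "impressions": 1200},
]


def fetch_snapshot(requested_date):
    """Return the report's current contents, ignoring requested_date."""
    return [dict(row) for row in SNAPSHOT_ROWS]


def naive_backfill(start, end):
    """Load the snapshot once per requested date, as an ordinary backfill would."""
    loaded = []
    day = start
    while day <= end:
        for row in fetch_snapshot(day):
            loaded.append({"loaded_for": day, **row})
        day += timedelta(days=1)
    return loaded


rows = naive_backfill(date(2024, 3, 1), date(2024, 3, 3))
distinct_report_dates = {row["report_date"] for row in rows}
print(len(rows), "rows loaded,", len(distinct_report_dates), "distinct report date")
```

Three requested dates yield three copies of the same row, none of which describes March 1st through 3rd. Pointing a draft recipe at a separate report that actually contains the desired period avoids this.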