Google Cloud Storage File Uploader
Google Cloud Storage can be used to export files of the following types:
- json
- csv
- compressed csv
Prerequisites
Google Cloud Storage must be enabled for your account.
Uploading to Google Cloud Storage requires authentication through either a Google Service Account or Google OAuth.
Google Service Account
Service accounts differ from user accounts in several key ways.
To authenticate via a Google Service Account, add a Key with the Credential Type Google Service Account and provide the following fields:
- Client Email
- Email address associated with the Service Account
- Key Name
- Value set here will be used in your Switchboard Script.
- Key Type
- Corresponds to the format of the Service Account Key. Typically the value is json.
- Service Account Key
- Entire text of the key pair provided when a Service Account is generated
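When filling in the fields above, the Client Email can usually be read from the key itself: a JSON service account key downloaded from Google Cloud contains a client_email field. The following is a minimal Python sketch of that lookup. The function name and the sample key values are hypothetical placeholders, not part of Switchboard.

```python
import json


def summarize_service_account_key(key_text: str) -> dict:
    """Collect the values needed for the Key form from a JSON service account key."""
    key = json.loads(key_text)
    if key.get("type") != "service_account":
        raise ValueError("not a service account key")
    return {
        "client_email": key["client_email"],  # goes in the Client Email field
        "key_type": "json",                   # the key parsed as JSON
        "service_account_key": key_text,      # paste the entire text, unmodified
    }


# Hypothetical key with placeholder values, for illustration only.
sample_key = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "uploader@my-project.iam.gserviceaccount.com",
})
fields = summarize_service_account_key(sample_key)
print(fields["client_email"])
```

The Service Account Key field should receive the full key text exactly as downloaded, which is why the sketch passes it through unchanged.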
Google OAuth
To authenticate via OAuth, add a Key with the Credential Type Google OAuth, enter a Key Name (used in the Switchboard Script below), and complete the OAuth flow by clicking the button labeled Connect.
Parameters
- clear_files boolean
- optional
- Flag indicating whether the destination should be cleared before data is uploaded. By default this value is set to true.
- compression string
- optional
- Indicates the compression method. Accepted values include gzip. If utilized, the configured file extension in the destination parameter must end with .gz
- destination string
- required
- The path to the location where the file will be uploaded. If the path does not exist, it will be created. Our standard for Google Cloud Storage is:
gs://<storage_root_path>/<purpose:prod|dev|test>/<source_name>/<file_type:csv|json|parquet|avro>/<report_name>/<report_name>_YYYY-MM-DD_NN.csv.gz
- The following date patterns may be used to fill in the current date automatically:
- 2022-06-01 (YYYY-MM-DD)
- 2022-6-1 (single digit month and date)
- 2022-Jun-1 (three letter month abbreviation)
- 2022-June-1 (full month name)
- The string UUID will be replaced with an eight-character alphanumeric string to ensure the uniqueness of the filename. It may be used in conjunction with date formatting, such that gs://switchboard-example/fileYYYYMMDD_UUID.csv.gz will be formatted as ../file20220727_30FG06et.csv.gz
- If you use these two patterns together, we strongly recommend placing the date first so that the files can be sorted by date.
- format string
- required
- The format in which data should be uploaded. Allowed values include csv, json, parquet, and avro.
- headers boolean
- optional
- Indicates whether a header row should be written to the destination file (for csv files).
- partition_count integer
- optional
- Some systems cannot handle very large files, so uploads must be “partitioned” (split into smaller files). The default is a single partition, which is typically what you want, but a higher count may be convenient if downstream processes consuming the data require smaller files.
- project string
- required
- The name of the Google Cloud Project to which the file will be uploaded.
- primary_source_name string
- required
- The name of the initial download from which data is being uploaded. Required for uploads where a date pattern is utilized in the naming convention.
- test_destination string
- optional
- The location to which data is written during test runs. Unless this is specified, test data will not be uploaded. Note that date-based variables cannot be used in a test destination name.
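To make the destination placeholders described above more concrete, here is a minimal Python sketch of how a path containing a date pattern and UUID might expand. It is an illustration only and not Switchboard's implementation: the function name is hypothetical, it handles just the YYYY-MM-DD and YYYYMMDD patterns, and it assumes the unique string is drawn from letters and digits.

```python
import datetime
import secrets
import string


def expand_destination(pattern: str, run_date: datetime.date) -> str:
    """Illustrative expansion of date and UUID placeholders in a destination path."""
    # Fill in the date patterns first so the random token is never re-scanned.
    expanded = pattern.replace("YYYY-MM-DD", run_date.strftime("%Y-%m-%d"))
    expanded = expanded.replace("YYYYMMDD", run_date.strftime("%Y%m%d"))
    # Assumed alphabet: ASCII letters and digits, eight characters long.
    alphabet = string.ascii_letters + string.digits
    token = "".join(secrets.choice(alphabet) for _ in range(8))
    return expanded.replace("UUID", token)


run_date = datetime.date(2022, 7, 27)
print(expand_destination("gs://switchboard-example/fileYYYYMMDD_UUID.csv.gz", run_date))
```

Because the date precedes the random string in this example, the resulting file names sort chronologically, which is the reason for putting the date first.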
Switchboard Script Syntax
upload example_gs_report to {
type: "gcs";
key: "my_google_key";
destination: "gs://my_google_storage/prod/my_report_source/my_file_type/my_report_name/example_gs_report_YYYY-MM-DD.csv.gz";
project: "my-project";
headers: true;
compression: "gzip"; //optional compression method, if utilized, the configured file extension in destination must end with .gz as shown above
format: "csv";
partition_count: 1;
clear_files: false;
primary_source_name: "example_gs_report_raw"; //only necessary for uploads where date is utilized in the naming convention, corresponds with the initial download name
test_destination: "gs://my_google_storage_root/test/my_report_source/my_report_name/example_report_test.csv.gz"; //Test destinations are supported by using this field
};
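A configuration like the one above can be sanity-checked against the rules on this page before it is run. The Python sketch below is a hypothetical helper and not a Switchboard tool. It assumes that the parameter names documented here, plus type and key from the example, form the complete set, that the format values named above are exhaustive, and that any date pattern contains YYYY.

```python
from typing import List

REQUIRED = ("destination", "format", "project", "primary_source_name")
KNOWN = set(REQUIRED) | {
    "type", "key", "clear_files", "compression", "headers",
    "partition_count", "test_destination",
}
FORMATS = {"csv", "json", "parquet", "avro"}  # assumed to be exhaustive


def check_upload_config(config: dict) -> List[str]:
    """Return a list of problems found in an upload configuration."""
    problems = []
    for name in REQUIRED:
        if name not in config:
            problems.append(f"missing required parameter: {name}")
    for name in config:
        if name not in KNOWN:
            problems.append(f"unknown parameter (possible typo): {name}")
    if "format" in config and config["format"] not in FORMATS:
        problems.append(f"unsupported format: {config['format']}")
    destination = config.get("destination", "")
    if config.get("compression") == "gzip" and not destination.endswith(".gz"):
        problems.append("gzip compression requires a destination ending in .gz")
    # Heuristic: treat any YYYY in the test destination as a date pattern.
    if "YYYY" in config.get("test_destination", ""):
        problems.append("test_destination must not contain date patterns")
    return problems


example_config = {
    "type": "gcs",
    "key": "my_google_key",
    "destination": "gs://my_google_storage/prod/src/csv/report/report_YYYY-MM-DD.csv.gz",
    "project": "my-project",
    "compression": "gzip",
    "format": "csv",
    "primary_source_name": "example_gs_report_raw",
    "test_destination": "gs://my_google_storage/test/src/report/report_test.csv.gz",
}
print(check_upload_config(example_config))
```

An empty list means none of these checks failed. It does not guarantee that the upload itself will succeed.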