Amazon S3 File Uploader

Amazon Simple Storage Service (S3) can be used to export the following file types:

  • json
  • csv
  • compressed csv

Prerequisites

Exports to S3 require that you create an Amazon AWS Credential key in the Key Editor with access to the storage target.

Parameters

clear_files boolean
optional
Flag indicating whether the bucket to which data is uploaded should be cleared before uploading. By default this value is set to true.
compression string
optional
Indicates the compression method. Accepted values include gzip. If used, the file extension configured in the destination parameter must end with .gz
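To see what a gzip-compressed CSV upload file looks like on disk, here is a small sketch using only Python's standard library (the filename is illustrative, not a Switchboard convention):

```python
import csv
import gzip

# Write CSV rows into a gzip-compressed file; note the extension
# ends with .gz, matching the compression: "gzip" setting.
rows = [["id", "name"], ["1", "alpha"], ["2", "beta"]]
with gzip.open("example_report.csv.gz", "wt", newline="") as f:
    csv.writer(f).writerows(rows)

# Reading the file back confirms the round trip.
with gzip.open("example_report.csv.gz", "rt", newline="") as f:
    restored = list(csv.reader(f))
```

The first row here doubles as the header row, which corresponds to the headers parameter below.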
destination string
required
The path to the location where the file will be uploaded. Our standard for s3 storage is:
s3://<bucket_name>/<purpose:prod|dev|test>/<source_name>/<file_type:csv|json|parquet|avro>/<report_name>/<report_name>_YYYY-MM-DD_UUID.csv.gz
 
The following date patterns may be used to fill in the current date automatically.
  • 2022-06-01 (YYYY-MM-DD)
  • 2022-6-1 (single-digit month and day)
  • 2022-Jun-1 (three letter month abbreviation)
  • 2022-June-1 (full month name)
 
The string UUID will be replaced with an eight-character alphanumeric string to ensure the uniqueness of the filename. It may be used in conjunction with date formatting, so that s3://switchboard-example/fileYYYYMMDD_UUID.csv.gz will be formatted as ../file20220727_30FG06et.csv.gz
 
We strongly recommend that when you use these two patterns together you place the date first, so that the files sort by date.
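The substitution described above can be approximated with a short sketch (a hypothetical helper for illustration; Switchboard performs this internally, and the real behavior may differ in detail):

```python
import secrets
from datetime import date

def expand_destination(template: str, today: date) -> str:
    """Expand the literal tokens YYYY, MM, DD, and UUID in a destination
    template. Simplified: assumes the tokens appear only in the filename."""
    result = (template
              .replace("YYYY", f"{today.year:04d}")
              .replace("MM", f"{today.month:02d}")
              .replace("DD", f"{today.day:02d}"))
    # UUID becomes an eight-character alphanumeric string.
    return result.replace("UUID", secrets.token_hex(4))
```

For example, `expand_destination("s3://bucket/fileYYYYMMDD_UUID.csv.gz", date(2022, 7, 27))` yields a path of the form `s3://bucket/file20220727_XXXXXXXX.csv.gz`, with the date sorting ahead of the random suffix as recommended above.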
format string
required
The format in which data should be uploaded. Allowed values include csv, json, parquet, and avro
headers boolean
optional
Indicates whether a header row should be written to the destination file (csv files only)
partition_count integer
optional
Some systems can’t handle very large files, so uploads must be “partitioned” (split into smaller files). Typically you want only one partition, which is the default, but more may be convenient if downstream processes consuming the data require smaller files.
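To make the effect of this parameter concrete, here is a rough sketch of splitting rows into a fixed number of partitions (a hypothetical helper, not Switchboard's actual implementation):

```python
def partition_rows(rows: list, partition_count: int) -> list:
    """Split rows into partition_count roughly equal chunks, mirroring
    the effect of writing partition_count smaller output files."""
    size, extra = divmod(len(rows), partition_count)
    chunks, start = [], 0
    for i in range(partition_count):
        # The first `extra` chunks absorb one leftover row each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks
```

With partition_count of 1 (the default) the entire dataset lands in a single output file; larger values trade one big file for several smaller ones.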
primary_source_name string
required
The name of the initial download from which data is being uploaded. Required for uploads where a date pattern is utilized in the naming convention.
test_destination string
optional
The location to write data for the purpose of test runs — unless this is specified, test data will not be uploaded. Note that date-based variables cannot be used in a test destination name.

Switchboard Script Syntax


upload example_s3_report to {

    type: "s3-ng";
    key: "my_aws_key";
    
    // Our standard for s3 storage is:
    // s3://<bucket_name>/<purpose:prod|dev|test>/<source_name>/<file_type:csv|json|parquet|avro>/<report_name>/<report_name>_YYYY-MM-DD_UUID.csv.gz
    destination: "s3://my_aws_storage/prod/my_report_source/my_file_type/my_report_name/example_s3_report_YYYY-MM-DD.csv.gz";
    headers: true;
    compression: "gzip"; //optional compression method; if used, the configured file extension in destination must end with .gz as shown above
    format: "csv";
    partition_count: 1;
    clear_files: false;
    
    primary_source_name: "example_s3_report_raw"; //only required for uploads where a date pattern is utilized in the naming convention; corresponds to the initial download name
    test_destination: "s3://my_aws_storage/.../example_report_test.csv.gz"; //test destinations are supported by using this field
};