Switchboard Script for SQL Users
Switchboard Script (SBS) was designed to closely resemble standard SQL, the lingua franca of data work. However, SBS does a number of things that SQL cannot do - describe complex data transformation and workflow steps, define data sources and sinks, and configure monitoring. In addition, SBS attempts to streamline SQL syntax that may be redundant or unnecessary for data operations work.
SBS aggregations do not require “GROUP BY”
Switchboard can aggregate columns for a group of rows, similar to SQL. However, Switchboard does not need an explicit GROUP BY clause like SQL does. Instead, we determine the key columns automatically:
table my_new_table is select key1, key1, max(creation_date) as max_date from my_table;
You can filter groups matching only certain aggregation criteria:
table my_new_table is select key1, key1, count() as group_count from my_table having group_count >= 10;
Use ? instead of CASE
SQL conditional statments use the CASE syntax. Switchboard Script uses a more compact “ternary” conditional format.
SQL SELECT CASE WHEN user_name == "lauren" THEN "daenerys" WHEN user_name == "gene" THEN "drogo" WHEN user_name == "michael" THEN "bran" WHEN user_name == "topher" THEN "jon" ELSE "joffrey" END as character FROM t1; Switchboard Script table t2 is select user_name == literal "lauren" ? "daenerys" : user_name == literal "gene" ? "drogo" : user_name == literal "michael" ? "bran" : user_name == literal "topher" ? "jon" : "joffrey" as character from t1;
Use this SBS shorthand to refer to many columns at once
Switchboard makes it easy to deal with data with hundreds, if not thousands, of columns. Switchboard provides a “column comprehension” syntax for addressing lists of columns succinctly.
SELECT column_1, column_2, column_3, ... /* hundreds of lines of column names */ column_998, column_999, column_1000 FROM my_table;
table my_new_table is select << my_table.$c for c in my_table from my_table;
It’s also possible to leave out specific columns. For example, you can leave out column_1 and column_2 by using the
table my_new_table is select << my_table.$c for c in my_table except [column_1, column_2] from my_table;
Columns name versus labels
Get literal: Anything that’s not a column name in SBS need to be declared ‘literal’
SBS introduces the concept of the literal tag, to definitively identify anything that is meant to be considered as a literal string (and not a reference to a column name).
Here’s a transformation step with a new column (called new_column) with nothing but hello world in every row:
table my_new_table is select column_1, column_2, literal 'hello world' as my new_column from my_table;
Or, you could concatenate the value of every column in the table with a suffix:
table my_new_table is select original_column, original_column + literal '_01' as my new_column_with_a_suffix from my_table;
Use literal whenever you use a regular expression in a function.
table my_new_table is select extract(original_column, literal "(?<key>\w+)=(?<value>\w+)") as new_column from my_table;