Switchboard Script for SQL Users
Switchboard Script (SBS) was designed to closely resemble standard SQL, the lingua franca of data work.
However, SBS does a number of things that SQL cannot do, such as
- describe complex data transformation and workflow steps
- define data sources and sinks
- configure monitoring
In addition, SBS attempts to streamline SQL syntax that may be redundant or unnecessary for data operations work.
Aggregations
SBS aggregations do not require GROUP BY
Switchboard can aggregate columns for a group of rows, similar to SQL. However, Switchboard does not need an explicit GROUP BY clause like SQL does. Instead, we determine the key columns automatically:
table my_new_table is select
key1, key1, max(creation_date) as max_date
from my_table;
You can filter groups matching only certain aggregation criteria:
table my_new_table is select
key1, key1, count() as group_count
from my_table
having group_count >= 10;
Conditionals
In SBS, use ?
instead of CASE
SQL conditional statments use the CASE
syntax. Switchboard Script uses a more compact “ternary” conditional format.
SQL
SELECT
CASE
WHEN user_name == "lauren" THEN "daenerys"
WHEN user_name == "gene" THEN "drogo"
WHEN user_name == "michael" THEN "bran"
WHEN user_name == "topher" THEN "jon"
ELSE "joffrey"
END as character
FROM t1;
Switchboard Script
table t2 is select
user_name == literal "lauren" ? "daenerys" :
user_name == literal "gene" ? "drogo" :
user_name == literal "michael" ? "bran" :
user_name == literal "topher" ? "jon" :
"joffrey" as character
from t1;
List Comprehensions
Use the following SBS shorthand to refer to many columns at once
Switchboard makes it easy to deal with data with hundreds, if not thousands, of columns. Switchboard provides a “column comprehension” syntax for addressing lists of columns succinctly.
SQL
SELECT
column_1,
column_2,
column_3,
... /* hundreds of lines of column names */
column_998,
column_999,
column_1000
FROM my_table;
Switchboard Script
table my_new_table is select
<< my_table.$c for c in my_table
from my_table;
It’s also possible to leave out specific columns. For example, you can leave out column_1 and column_2 by using the except
clause:
table my_new_table is select
<< my_table.$c for c in my_table except [column_1, column_2]
from my_table;
Columns Name Versus Labels
Get Literal
Anything that’s not a column name in SBS need to be declared ‘literal’SBS introduces the concept of the literal tag, to definitively identify anything that is meant to be considered as a literal string (and not a reference to a column name).
Here’s a transformation step with a new column (called new_column) with nothing but hello world in every row:
table my_new_table is select
column_1,
column_2,
literal 'hello world' as my new_column
from my_table;
Or, you could concatenate the value of every column in the table with a suffix:
table my_new_table is select
original_column,
original_column + literal '_01' as my new_column_with_a_suffix
from my_table;
Use literal whenever you use a regular expression in a function.
table my_new_table is select
extract(original_column, literal "(?<key>\w+)=(?<value>\w+)") as new_column
from my_table;
Joins
Switchboard supports SQL like syntax for joins, including left joins, and unions
For example, take the following spreadsheets:
A table of customers
CustomerID | CustomerName | ContactName | Country |
---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Germany |
2 | Ana Trujillo Emparedados y helados | Ana Trujillo | Mexico |
3 | Antonio Moreno Taquería | Antonio Moreno | Mexico |
A table of orders, where CustomerID is a foreign key:
OrderID | CustomerID | OrderDate | OrderItem |
---|---|---|---|
10308 | 1 | 1996-09-18 | Blue Widget |
10309 | 2 | 1996-09-19 | Blue Widget |
10310 | 2 | 1996-09-20 | Red Widget |
Using the following Switchboard script you can download each table, join them on customerID and upload the result to a new spreadsheet
download SwitchboardSampleCustomers_raw from {
type: "google_sheets";
key: "{your Google Oauth Key}";
spreadsheet_id: "{your spreadsheet ID}";
data_range: "A2:D4";
} using {
"CustomerID" : integer;
"CustomerName" : string;
"ContactName" : string;
"Country" : string;
};
download SwitchboardSampleOrders_raw from {
type: "google_sheets";
key: "{your Google Oauth Key}";
spreadsheet_id: "{your spreadsheet ID}";
data_range: "A2:D4";
} using {
"OrderID" : integer;
"CustomerID" : integer;
"OrderDate" : date;
"OrderItem" : string;
};
table SwitchboardSampleJoin_final is
select
o.OrderID,
o.OrderDate,
o.OrderItem,
o.CustomerID,
c.CustomerName as CustomerName
from SwitchboardSampleOrders_raw as o
join SwitchboardSampleCustomers_raw as c
on o.CustomerID == c.CustomerID;
upload SwitchboardSampleJoin_final to {
type: "google_sheets";
key: "{your Google Oauth Key}";
primary_source_name: "SwitchboardSampleOrders_raw";
folder_name: "switchboard_uploads";
spreadsheet_name: "Sample Join Upload"; // Required
test_spreadsheet_name: "Sample Join Upload Test";
};
This would upload the following spreadsheet:
OrderID | OrderDate | OrderItem | CustomerID | CustomerName |
---|---|---|---|---|
10308 | 1996-09-18 | Blue Widget | 1 | Alfreds Futterkiste |
10309 | 1996-09-19 | Blue Widget | 2 | Ana Trujillo Emparedados y helados |
10310 | 1996-09-20 | Red Widget | 2 | Ana Trujillo Emparedados y helados |
Outer Joins
Alternatively you could do a left outer join:
table SwitchboardSampleLeftJoin_final is
select
c.CustomerID,
c.CustomerName as CustomerName,
c.Country as CustomerCountry,
o.OrderID,
o.OrderDate,
o.OrderItem
from SwitchboardSampleCustomers_raw as c
left join SwitchboardSampleOrders_raw as o
on o.CustomerID == c.CustomerID;
Which would produce the this spreadsheet instead:
CustomerID | CustomerName | CustomerCountry | OrderID | OrderDate | OrderItem |
---|---|---|---|---|---|
1 | Alfreds Futterkiste | Germany | 10308 | 1996-09-18 | Blue Widget |
2 | Ana Trujillo Emparedados y helados | Mexico | 10310 | 1996-09-20 | Red Widget |
2 | Ana Trujillo Emparedados y helados | Mexico | 10309 | 1996-09-19 | Blue Widget |
3 | Antonio Moreno Taquería | Mexico |
Unions
Like SQL, Switchboard script has a UNION
operator that is used to combine the result-set of two or more SELECT statements
table new_unioned_thing is
table_1
union
table_2;