VI. SQL

Reference: General SQL Query Structure

SELECT <attribute list> // specifies columns (variables) to be retrieved

FROM <table list> // tables to be queried

JOIN <table2> ON table1.column_field = table.common_field

WHERE <condition> LOGICAL CONNECTIVE <condition>  // subset based on condition

GROUP BY <grouoping attribute(s)> // grouping based on a conditon

HAVING <grouping condition>

ORDER BY <attribute(s)> ASC/DESC;

General Application

Overview of Databricks Platform 12 results
OrderFile NameComments
1_Introduction to SQLAbout SQL, how SQL statements are executed (`;`), comparison using `NULL` and the 3 logical values in SQL (`TRUE`, `FALSE`, and `UNKNOWN`).
2AliasingCan be applied to attributes and relation.
3Creating Views (Virtual Tables)Creating views and how it is different from regular result-set.
4Checking for Missing ValuesHow to check for missing values in a column using `IS NULL` and `IS NOT NULL`.
5Handling Ambiguous Attribute NamesHow to deal with retrieving attributes with the same names but in different relation - `qualifying relation with attribute name`
6USE CommandHow to specify which database to work with in a DBMS.
7Logical and Comparison OperatorsLogical gates (AND, OR, NOT) and comparison operators (incl. `LIKE`, `BETWEEN`, 'IN')
8Wildcard Matching (Pattern Matching)Pattern matching within `string` values.
9Set OperationsSet operations to combine or compare results of multiple `SELECT` statements.
10_TerminologyTerminology of commonly used words in SQL.
11Arthmetic Operations with SELECTUsing standard arithmetic with the `SELECT` statement.
12_FAQ For SQL

Basic SQL Queries

Overview of Databricks Platform 13 results
OrderFile NameComments
1The CREATE TABLE StatementAKA `Base relations`. Creating tables and adding content.
2The SELECT StatementSpecifying the type of information to be retrieved from the database. Using `*` and `DISTINCT`.
3The JOIN ClauseCombining records from two different tables.
4The WHERE ClauseFiltering result of a returned query. Can also be used with other keywords like `JOIN` to link tables together based on a specified condition.
5The WHERE Clause (Dynamic)How to create dynamic WHERE Clauses with `WHERE 1=1`.
6The GROUP BY ClauseGroup rows by a specified criterion, filtering by using a `HAVING` clause, extensions of the function using `CUBE()`, `ROLLUP()`, and custom GROUPING SETS.
7The ORDER BY KeywordOrdering result-set by a `ascending` or `descending` order.
8The LIMIT ClauseRestricting the size of the result-set.
9The CREATE SCHEMA StatementUnderstanding the use of schemas in a DBMS, how to initialize a schema by code (very rarely used), and how to add objects to a schema.
10The INSERT INTO CommandHow to add records to a table (individually and in bulk).
11The UPDATE StatementHow to modify an existing tuple in a relation.
12The DELETE StatementUsed to delete specified rows in a table or all.
13The TRUNCATE() StatementUsed to delete all the rows from a table.

Complex SQL Queries

Overview of Databricks Platform 12 results
OrderFile NameComments
1Nested QueriesBuilding nested queries using different comparison operators within the `WHERE` clause (with/without additional keywords (e.g., `SOME`, `ALL`)), and collapsing nested queries (only for `IN`). Overview of subqueries after the FROM clause.
2EXISTS, NOT EXISTS, UNIQUE Keywords with Nested QueriesKeywords to use with nested queries (`EXISTS`, `UNIQUE`, `NOT EXISTS`, `IN`).
2.1Using "Select 1 from" to Check ExistenceChecking for existence of rows that match certain conditions without returning any data.
3Correlated Nested QueriesNested queries where the inner query refers the same table from the outer query.
4Multiway JoinsHow to join three or more tables in a query.
5Generating Temp Result-set Using `With` as CTEsAKA `common table expression (CTE)`. Referenced in main query as like any relation. Can think of it as a "subquery". Plus how to created multiple CTEs.
6Recursive (Hierarchical) QueryCTEs
6Conditional Logic using CASEDefine set of conditions and corresponding actions to be taken based on those conditions. Analogous to the `IF-THEN-ELSE` in Python.
7Conditional Logic Using `Decode()` DECODE()  is a single expression conditional function that compares an expression against a sequence of search values and returns the corresponding result when a match is found. `CASE WHEN` conditional logic is preferable for more complex logic.
8Querying with JUNCTION TablesQuerying with junction tables.
9UNIONs vs JOINsWhen to use `UNION ALL` vs `JOINs`.
10Loop ConstructsOnly works with procedural extensions like PL/SQL in Oracle. No native looping constructs in SQL.

Specifying Constraints When Creating a DB

Overview of Databricks Platform 7 results
OrderFile NameComments
1The CONSTRAINT KeywordNaming constraints and for altering constraints.
2Primary KeysHow to specify primary keys when creating tables. Dropping and altering PKs.
3Foreign KeysSpecifying foreign keys and specifying `referential triggered action` clause in response to changes or deletion of a record in the referenced table.
4Candidate KeysSpecifying candidate keys using the `UNIQUE` clause
5Attribute DomainsSpecifying the domain of an attribute to enfore data integrity (e.g., ensure values are within permitted range of values given the datatype).
6Semantic ConstraintsCreating `ASSERTIONS` and `TRIGGERS`
7Attribute ConstraintsOther restrictions on the domains of attributes.

Aggregate Functions

SQL aggregate functions are used to perform calculations on a set of values and return a single value.

Restrictions in Calculation

Aggregate functions only aggregate vertically. If you want to perform a calculation across rows, you would do this with simple arithmetic. SQL aggregation is the task of collecting a set of values to return a single value

Aggregate Functions 6 results
OrderFile NameComments
1SUM FunctionComputing the total value of over an entire column.
2COUNT FunctionCounting the number of rows in a table or the number of values in a particular column.
3AVG FunctionReturning the average value of a *numeric* column.
4MAX and MIN FunctionsRetrieving the MAX and MIN values of a column or group.
5LISTAGG Function LISTAGG()  lets you collect values from multiple rows and join them into a single string, usually separated by a comma or other delimiter.
6PERCENTILE_CONT FunctionFind value that cuts off the bottom n-th percentile of a ordered list of values.

Scalar Functions

How is this different from Scalar Functions?

  • Aggregate Functions: Operate over multiple rows and produce a summary result (i.e., computes over a set of rows and return a single result for each group or the entire dataset).

  • Scalar Functions: Operate on individual values and produce a result for each row.

Overview of Databricks Platform 1 results
OrderFile NameComments
1The GREATEST() FunctionReturn the largest value for a given row in a dataset against a list of expressions. These expressions can be columns directly from the dataset or combined with other functions.

Useful SQL Functions

Overview of Databricks Platform 8 results
OrderFile NameComments
1The Conditional AggregationHow to use aggregate data based on specified conditions.
2The DECODE() FunctionFunction unique to **Oracle Databases**. Extension of a more complex and powerful conditional statement in SQL.
3The OVER() and PARTITION BY() FunctionHow to apply aggregation functions to a dataset by partitioning data into specific groups (i.e., partitions) based on 1 or more columns.
3The NVL() and NVL2() FunctionOracle. How to replace NULL values with a specified default value with `NVL` function and `NVL2`.
4The WINDOW FunctionWhat is a window function and example of how to assign sequential integer to each row to a returned result set using ROW_NUMBER() window function.
5The FIRST_VALUE() FunctionRetrieve first value of a selected column of a specified window or partition.
6The LAG() FunctionRetrieve value of the previous row of the same result set.
7The OFFSET KeywordSkip a specified number of rows from the beginning of a query's result set before returning the remaining rows.

Schema Changes

Good Practice!

Back-up schema before making any changes!

In SQL, schema changes refer to modifications made to the structure or design of a database.
To change an operational database schema without recompilation of the entire database, below are some common schema change commands.

Overview of Databricks Platform 2 results
OrderFile NameComments
1ALTER TABLEHow to alter the structure of an existing table in the database schema.
2DROP TABLEHow to remove an existing table from a database.
Powered by Forestry.md