Last modified: July 28, 2020
Syntax conventions allows people to easily scan through your query and understand what metrics are being measured and analyzed.
All Caps SQL Commands
Show example queries with ALL CAPS, First Letter Caps, and lowercase SQL commands.
For readability, all SQL commands should be written in uppercase letters. This allows the reader to identify the keywords in the SQL statement and easily determine what the query is executing. Avoid writing first letter caps and lowercase queries as they are difficult to scan quickly and separate into different clauses.
❌
Select created_at, gender, age_at_registration, provincefrom users
✅
SELECT created_at, gender, age_at_registration, provinceFROM users
New Lines and Indenting
Lay out each column or logical statement on new lines so that it easy to identify what is being queried and filtered. When they are placed in the same line as the SQL command they may not even fit within your SQL editor window. Keep Aliases on the same line to make it clear what they are referring to.
Keep the SQL keywords on lines of their own and indent the other parts of the query. This makes it easy to read, edits and spot any unexpected errors.
❌
SELECT u.created_at, u.age_at_registration,u.gender,AVG(p.amount) AS average_user_payment_amountFROM users AS uINNER JOIN payments AS p ON u.user_id = p.user_idWHERE p.has_refund = FALSE
✅
SELECTu.created_at,u.age_at_registration,u.gender,AVG(p.amount) AS average_user_payment_amountFROMusers AS uINNER JOIN payments AS p ON u.user_id = p.user_idWHEREp.has_refund = FALSE
When writing simple queries for up to 3 three (3) columns, you can write the columns in line with the SQL commands.
✅
SELECT created_at, age_at_registration, current_ageFROM customers
GROUP BY Numbers and Names
GROUP BY allows you to separate data into groups, which can be aggregated independently of one another. It is generally recommended to use column names to be clear as to what you are grouping by but when you are grouping many columns or long titled columns you may substitute numbers that correspond with the select statement order.
✅
SELECTdate_of_customer_registration,number_of_user_logins,number_of_user_payments,COUNT(*) AS total_user_engagementFROMuser_metricsGROUP BYdate_of_customer_registration,number_of_user_logins,number_of_user_paymentsORDER BYtotal_user_engagement DESC
Due to the length of the column names, rewriting them for the GROUP BY statement can become tedious and time consuming. Instead, the GROUP BY could number the columns in order (1, 2, 3).
✅ ?
SELECTdate_of_customer_registration,number_of_user_logins,number_of_user_payments,COUNT(*) AS total_user_engagementFROMuser_metricsGROUP BY1, 2, 3ORDER BYtotal_user_engagement DES
At a quick glance the numbers become difficult to determine the metrics under analyses. Another risk with this method is if you update any column in the SELECT statement, and you used numbers in the GROUP BY it will not produce any errors so you may assume nothing is wrong. The numbers reference whatever columns are in the SELECT. You will need to check to see if the numbers are still corresponding to the columns you are expecting in the SELECT statement to have an accurate query.
Happy syntax-ing!
Written by: Katherine Chiodo
Reviewed by: Matt David
Created by
©2019 All Rights Reserved • Terms