SQL can seem daunting for a non-techie, but it doesn’t have to be. Whether you’re just setting up your first database, or learning the ropes of a well-established data set, this is a useful sheet of basics to print out and keep above your monitor for those times you feel stuck.
Clauses in SQL
- SELECT – chooses which columns you wish to see from the data. While it is the first clause written in a query, it is executed after FROM, JOIN, WHERE and GROUP BY
- FROM – identifies the first table to be included in the query.
- JOIN – identifies an additional table to be joined to one of the tables already included. You can have multiple joins, and either have INNER JOINs or LEFT JOINs or both.
- WHERE – filters data according to criteria. You can have multiple filters by using AND/OR
- GROUP BY – groups the data by a particular column – you can use functions like count(), sum(), max(), min(), etc in the SELECT clause to report aggregate data according to the categories determined in the GROUP BY clause.
- HAVING – similar to WHERE, except it is calculated after the GROUP BY function, so you can use it to filter according to aggregate categories, instead of by the original data.
- ORDER BY – sorts data according to one or many columns. You can have the data sorted descending or ascending by using ‘desc’ or ‘asc’.
- LIMIT – can cut out off the number of results produced – if used in conjunction with ORDER BY, it will order before it cuts the data off at the limit.
Example query
SELECT first_name, last_name, mobile_number, p.number postcode, sum(t.amount_in_cents) / 100 total_donated
FROM users u
JOIN donations d on d.user_id = u.id
JOIN transactions t on t.donation_id = d.id
LEFT JOIN postcodes p on p.id = u.postcode_id
WHERE date(d.created_at) = ‘2016-01-27’
GROUP BY u.id
HAVING total_donated <= 100
ORDER BY total_donated desc, first_name asc
LIMIT 20
How do aliases work?
You can give an alias to any table, by adding the alias after the table name when it is introduced (either in the FROM or JOIN clause) with a space in between. An easy option is to give tables single-letter aliases – ‘u’ for ‘users’, ‘c’ for ‘campaigns’, ‘d’ for ‘donations’, etc.
You can then specify which table a field belongs to by including the table’s alias, followed by the field name, with a full stop in between – eg. ‘c.id’ represents the ‘id’ field in the ‘campaigns’ table.
You need to specify the table the field belongs to when you are joining tables, and when a field name is used in more than one table.
You can also add aliases for particular fields in the SELECT field – this is useful when field names are unclear (eg. the ‘number’ field in the postcode table is often aliased as ‘postcode’) or the column is produced from a formula – such as ‘sum(t.amount_in_cents)’. You would want to give that column a simpler name, both because it looks neater and allows you to refer to that column further down.