This will cover core SQL concepts at a high level that should be learned after getting a rundown of the basics from A Precis of SQL and perhaps PostgreSQL which both cover realistic use cases.
Table of Contents
- Relationships
- [[#Relationships#Cascading|Cascading]]
- Normal Forms
- Sublanguages
- [[#Sublanguages#Summary|Summary]]
- [[#Sublanguages#DDL - Data Definition Language|DDL - Data Definition Language]]
- [[#Sublanguages#DML - Data Manipulation Language|DML - Data Manipulation Language]]
- [[#Sublanguages#DQL - Data Query Language|DQL - Data Query Language]]
- Subqueries
Relationships
1:1 (one to one) - Like a person and a passport
1:M (one to many) - Like a person and their orders
M:N (many to many) - Like students enrolled in classes
A primary key uniquely identifies a row in the table. A foreign key references the primary key of another table. The table with the primary key being referenced is considered the parent table, whereas the table with a foreign key is the child table because it depends on the existence of an entry in the parent table.
Cascading
Understanding the parent-child relationship is crucial for understanding this. As we said, this relationship forms a dependency, so that if the parent does an action like deleting an entry, we’d like to respond to it (cascade) to avoid orphaning an entry in the child table.
In other words, the cascade happens in one direction only: parent action → child reaction. Never the reverse.
The two actions are ON DELETE and ON UPDATE. The reactions are CASCADE to apply the same action as the parent, SET NULL, SET DEFAULT, and RESTRICT/NO ACTION to block.
E.g. ON DELETE CASCADE - If parent is deleted, delete this entry. If parent is updated, update this entry.
Normal Forms
1NF - Eliminates duplicate records
- All columns must contain atomic values—i.e. a Phone Numbers column should not contain multiple phone numbers in one row.
- No duplicate rows.
- No duplicate column names.
2NF - Eliminates partial dependency - Every non-key column must fully depend on the entire primary key.
- Emphasis on the entire key, not parts of it (when composite).
3NF - Eliminates transitive dependency - Columns shouldn’t depend on other columns, only the key.
- E.g. if you need to get
advisor_roomusingadvisorfound instudent_id, you should probably split into two separate tables.- Otherwise, updating
advisor_roomwill require finding all entries and updating them accordingly.
BCNF (3.5NF) - Every determinant must be a superkey
- Otherwise, updating
Sublanguages
Summary
DDL - Data Definition Language
- Defines the data structure/schema
ALTER TABLE users ADD COLUMN nickname TEXT;
DML - Data Manipulation Language- Insert, update, delete records
INSERT INTO users (email) VALUES ('a@b.com');
DCL - Data Control Language- Grants or revokes permissions to certain roles
GRANT SELECT ON orders TO analytics_role;
DQL - Data Query Language- For reading information, often with filtering or aggregation
SELECT date(created_at) FROM users;
TCL - Transaction Control Language- Defines boundaries for concurrent operations; in other words bundling operations so they succeed/fail together, like a funds transfer
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;DDL - Data Definition Language
Assuming the caller has permissions…
CREATE- Can be used to create objects like a database, user, table, index, trigger, function, view
DROP- Destroy anything made with
CREATE
- Destroy anything made with
ALTER- Add, drop, or modify characteristics of an object (data types, columns, constraints)
RENAMETRUNCATE- Remove all data while preserving shape/structure of table
COMMENT
Examples
CREATE DATABASE IF NOT EXISTS IAM;
CREATE TABLE IF NOT EXISTS permissions (
id BIGINT PRIMARY KEY,
categoryId BIGINT NOT NULL,
name VARCHAR(30) NOT NULL UNIQUE,
INDEX(categoryId, name)
);
ALTER TABLE permissions ADD CONSTRAINT fk_permissions_category_id FOREIGN KEY (categoryId) REFERENCES permission_categories(id);About constraints
NOT NULLUNIQUE- no duplicatesPRIMARY KEY- implies UNIQUE, NOT NULL, and creates an indexFOREIGN KEY- links the columns to another tables primary keyCHECK- validates any condition, similar toWHEREDEFAULT- default value when none is providedON UPDATE/ON DELETE- controls cascading behavior for foreign key changes
Constraints can be defined at the column-level AND table-level.
DML - Data Manipulation Language
For changing existing data or schemata…
INSERT- Format is
INSERT INTO <table> (col1, col2...) VALUES (val1, val2...), ()... - e.g.
INSERT INTO roles (id, name) VALUES (1, 'ADMIN'), (2, 'OWNER'), (3, 'EDITOR'), (4, 'VIEWER');
- Format is
UPDATE- Format is
UPDATE <table> SET col_val='new_val' [WHERE condition] - e.g.
UPDATE permissions SET categoryId=1 WHERE name='PAYROLL';
- Format is
DELETE- Format is
DELETE FROM <table> [WHERE condition] DELETE FROM roles WHERE name='VIEWER';
- Format is
DCL - Data Control Language
Regarding rights and permissions of users on database objects.
GRANTREVOKE
The format for these commands is<GRANT|REVOKE> PRIVILEGES ON object FROM user, with the following privileges:SELECTINSERTDELETEUPDATEINDEXCREATEALTERDROPALL- All permissions except grantGRANT- AllowsGRANTstatements
E.g.
REVOKE SELECT, INSERT, UPDATE, DELETE ON posts FROM 'community.manager';TCL - Transaction Control Language
A sequence of one or more SQL operations treated as a single unit of work, the point is for them to complete in their entirety (commit) or have no effect whatsoever, not in between. For example, a bank transaction—debit account A, credit account B.
START TRANSACTION/BEGINCOMMIT- Makes all pending changes in a transaction permanent by saving itROLLBACK- Restores data to the last save point, unless no save point exists then the last committed stateSAVEPOINT- An intermediate point within a transaction that can be rolled back
E.g.
START TRANSACTION;
UPDATE bankaccounts SET funds = funds - 50 WHERE account_no = ACC1 AND funds > 50;
UPDATE bankaccounts SET funds = funds + 50 WHERE account_no = ACC2;
COMMIT;DQL - Data Query Language
For searching, aggregating, joining, filtering, and grouping existing data…
SELECT- That’s it.
The format for this is somewhat complicated at first, but generally follows this structure. Learn this in more detail at A Precis of SQL.
SELECT [ALL | DISTINCT]
select_expr [, select_expr] ...
[into_option]
[FROM table_ref]
[WHERE where_condition]
[GROUP BY {col_name | expr | position}]
[HAVING having_condition]
[ORDER BY {col_name | expr | position}]
[ASC | DESC]
[LIMIT {[offset,] row_count | row_count OFFSET offset}];
Subqueries
Convenient and clear, but almost always outperformed by a join. Can be placed in three main places…
WHERE- Nested Query- Placed in the conditional portion of the outer query.
- e.g.
SELECT a.id, a.name, b.evalName, b.mark FROM students a, evals b WHERE a.id = b.studentId AND b.evalName = 'quiz 1' AND b.mark > (SELECT mark FROM evals WHERE evalName = 'quiz 1' AND studentId = 2);
FROM- Inline Query- Great for essentially creating a temporary table.
- e.g.
SELECT a.name, b.evalName, b.mark FROM students a, (SELECT studentId, evalName, mark FROM evals WHERE mark > 90) b WHERE a.id = b.studentId;
SELECT- Inner Query- Great for creating a temporary column.
- e.g.
SELECT a.id, a.name, (SELECT AVG(mark) FROM evals WHERE studentId = a.id GROUP BY studentId) avg FROM students a;i
Joins
- Inner Join — return rows where both tables match on the join condition.
- Left Join — return all left-table rows + matching right-table rows; unmatched right becomes NULL.
- Right Join — return all right-table rows + matching left-table rows; unmatched left becomes NULL.
- Full Outer Join — return all rows from both tables; non-matching sides filled with NULLs.
- Theta Join — join using any comparison operator (
<,>,<=,>=,<>, not just=). - Cross Join / Cartesian Join — pair every row in table A with every row in table B (no condition).
- Self Join — join a table to itself to relate rows within the same table (e.g., employees to managers).
Set Operations
Combines the result of two queries, provided that the order, data, and number of columns between them are the same.
- UNION — Like a set union, combines the results (EXCLUDING duplicates)
- UNION ALL — Like a set union, combines the results (INCLUDING duplicates)
- INTERSECT — Like a set intersection, only includes COMMON rows between the two
- MINUS/EXCLUDE — Like a set difference, removes B rows found in A.
Indexes
The go-to solution for optimization. Like using a bookmark to quickly find a page, it reduces the amount of data that needs to be scanned.
While it improves reading/query performance, there is an overhead on write operations. In your average database (e.g. a weather data db updated once a month), reading operations tend to dominate writing operations which is why this becomes a fairly important concept. Contrast this to a financial ticker or event logging database where constant writes on a database could potentially blow up.
Rule of thumb: Try indexing columns you often find in a WHERE, JOIN, ORDER BY, or GROUP BY clause.
Two types:
- Clustered: Rows are stored in order of the index key, limit 1 clustered index.
- Non-clustered: The index key contains a sorted order of the data, but the actual rows are stored somewhere else from the index.
Database Objects
SEQUENCE
Used to create custom sequences, possibly for use as a primary key.
CREATE SEQUENCE example_1
AS INT
START WITH 10
INCREMENT BY 10; TRIGGER
Runs in response to an event on the database.
CREATE TRIGGER [trigger_name]
[BEFORE | AFTER]
{INSERT | UPDATE | DELETE}
ON [table_name]
[FOR EACH ROW]
[trigger_body] PROCEDURE
A set of statements that are named and stored to be executed at a future point in time.
CREATE PROCEDURE UpdateEmployeeSalaries(IN PercentageIncrease DECIMAL(5,2))
BEGIN
-- Updating the salaries
UPDATE Employees
SET Salary = Salary * (1 + PercentageIncrease / 100);
-- Logging the update
INSERT INTO SalaryUpdateLog (UpdateDate, PercentageIncrease)
VALUES (NOW(), PercentageIncrease);
ENDFUNCTION
The same as a procedure except that it returns a result.
CREATE FUNCTION get_balance(acc_no INT)
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE acc_bal INT;
SELECT order_total
INTO acc_bal
FROM sample2.orders
WHERE customer_id = acc_no
LIMIT 1;
RETURN acc_bal;
END