PostgreSQL SQL Mastery Guide

This guide provides a hands-on walkthrough of PostgreSQL SQL, from basic queries to advanced optimization and real-world scenarios. Each section includes annotated code and practical tips to help you write efficient, effective SQL for any project.

๐Ÿ˜ PostgreSQL SQL Mastery Guide


1. ๐Ÿงฑ Basics

SELECT, FROM, WHERE, ORDER BY, LIMIT

Used to retrieve data.

SELECT name, salary FROM employees
WHERE department = 'Engineering'
ORDER BY salary DESC
LIMIT 5;

Filtering with AND, OR, IN, BETWEEN, LIKE

SELECT * FROM employees
WHERE salary BETWEEN 60000 AND 90000
AND department IN ('HR', 'Engineering')
AND name LIKE 'A%';

Aggregations: COUNT, SUM, AVG, MIN, MAX

SELECT department, COUNT(*) AS total, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;

GROUP BY and HAVING

SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 70000;

2. ๐Ÿ”„ Intermediate

JOIN types: INNER, LEFT, RIGHT, FULL

SELECT e.name, d.name AS department
FROM employees e
LEFT JOIN departments d ON e.department_id = d.id;

Subqueries (SELECT, FROM, WHERE)

SELECT name
FROM employees
WHERE salary > (
  SELECT AVG(salary) FROM employees
);

CASE WHEN logic

SELECT name, salary,
  CASE
    WHEN salary > 90000 THEN 'High'
    WHEN salary > 60000 THEN 'Medium'
    ELSE 'Low'
  END AS salary_band
FROM employees;

DISTINCT, UNION, EXCEPT, INTERSECT

SELECT DISTINCT department FROM employees;

SELECT name FROM engineers
UNION
SELECT name FROM hr;

Window Functions: ROW_NUMBER, RANK, OVER

SELECT name, salary,
  ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees;

3. ๐Ÿง  Advanced

CTEs (WITH)

WITH avg_salary AS (
  SELECT department_id, AVG(salary) AS avg
  FROM employees
  GROUP BY department_id
)
SELECT e.name, e.salary
FROM employees e
JOIN avg_salary a ON e.department_id = a.department_id
WHERE e.salary > a.avg;

Recursive Queries

WITH RECURSIVE nums(n) AS (
  SELECT 1
  UNION
  SELECT n + 1 FROM nums WHERE n < 5
)
SELECT * FROM nums;

JSON and JSONB

SELECT data->>'name' AS name
FROM users
WHERE data->>'active' = 'true';

Indexes and Performance

CREATE INDEX idx_salary ON employees(salary);

EXPLAIN ANALYZE
SELECT * FROM employees WHERE salary > 70000;

Functions in PLpgSQL

CREATE FUNCTION high_earners(threshold INT)
RETURNS TABLE(name TEXT, salary INT) AS $$
BEGIN
  RETURN QUERY
  SELECT name, salary FROM employees WHERE salary > threshold;
END;
$$ LANGUAGE plpgsql;

Transactions and Locking

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

4. ๐Ÿ›  Real-World Practice

Sample Schema: Payroll

-- employees table
-- departments table
-- payroll_entries table

Build Query from Requirement

Q: Show department-wise top earner(s) and average salary

WITH stats AS (
  SELECT department_id, MAX(salary) AS max_salary, AVG(salary) AS avg_salary
  FROM employees GROUP BY department_id
),
top_earners AS (
  SELECT e.name, e.salary, d.name AS department
  FROM employees e
  JOIN stats s ON e.department_id = s.department_id AND e.salary = s.max_salary
  JOIN departments d ON d.id = e.department_id
)
SELECT * FROM top_earners;

Debug with EXPLAIN

EXPLAIN ANALYZE
SELECT * FROM employees WHERE department_id = 2 AND salary > 80000;

๐Ÿ’ก Tip

Practice using PostgreSQL on a local setup or tools like pgAdmin or db-fiddle.



๐Ÿš€ Indexes and Query Optimization

๐Ÿ”น What is an Index?

An index is a data structure that improves the speed of data retrieval at the cost of slower writes and more storage.

Think of it like a book index โ€” it tells you exactly where to find information without scanning the whole book.


๐Ÿ“˜ Types of Indexes in PostgreSQL

  • B-Tree (default): Great for equality and range queries.
  • Hash: Good for equality only (rarely used).
  • GIN: Useful for full-text search and JSONB.
  • BRIN: Efficient for very large tables with naturally ordered data.
  • Composite Index: Index on multiple columns.

โš™๏ธ Creating Indexes

-- Index on one column
CREATE INDEX idx_salary ON employees(salary);

-- Index on multiple columns
CREATE INDEX idx_dept_salary ON employees(department_id, salary);

-- JSONB index
CREATE INDEX idx_json_name ON users USING GIN ((data->>'name'));

๐Ÿงช Using EXPLAIN to Analyze Queries

EXPLAIN ANALYZE
SELECT * FROM employees WHERE salary > 70000;
  • Look for Seq Scan โ†’ can be replaced with Index Scan if an index exists.
  • Check actual timing and row estimates.

๐Ÿ“ˆ Real-World Use Case: Optimizing a Report Query

Scenario:

Show the top 3 highest-paid employees per department (out of 10 million rows).


๐Ÿ›  Step-by-Step Solution

-- Index for faster filtering and ordering
CREATE INDEX idx_dept_salary ON employees(department_id, salary DESC);
-- Optimized Query
SELECT *
FROM (
  SELECT name, department_id, salary,
         ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
  FROM employees
) ranked
WHERE rank <= 3;
  • Uses window function to rank salaries per department
  • Index helps sort within partitions
  • Avoids full table sort

โšก Tips to Speed Up Queries

  • Use EXPLAIN ANALYZE regularly
  • Use indexes wisely on columns used in WHERE, JOIN, ORDER BY
  • Avoid SELECT * in production queries
  • Avoid unnecessary DISTINCT and nested subqueries
  • Batch inserts and updates
  • Normalize and denormalize when needed
  • Materialize expensive CTEs (PostgreSQL 12+): MATERIALIZED

๐Ÿง  Summary

Indexes can dramatically reduce query time โ€” especially on large datasets. But use them carefully:

  • Too many indexes = slow writes
  • Use compound indexes when queries filter on multiple columns
  • Always validate with EXPLAIN ANALYZE

Comments

Add a Comment