What is GROUP BY?
The GROUP BY clause groups rows that share the same values into summary rows. It's used with aggregate functions like COUNT, SUM, AVG to calculate statistics for each group.
Think of it like organizing students by grade level and calculating the average test score for each grade. GROUP BY does the grouping, and aggregate functions do the calculations.
Basic Syntax
SELECT column_name, AGGREGATE_FUNCTION(column)
FROM table_name
WHERE condition
GROUP BY column_name
HAVING group_condition
ORDER BY column_name;💡 Pro Tip: Every column in SELECT that's not inside an aggregate function must appear in GROUP BY. This is SQL's "grouping rule."
Common Patterns
Count how many rows exist in each group
SELECT country, COUNT(*) AS customer_count
FROM customers
GROUP BY country;Result: Shows number of customers per country: USA: 45, Canada: 23, UK: 17...
See Aggregation In Action
Watch how GROUP BY transforms raw data into summary statistics step by step.
Count Orders per Customer
Aggregate using COUNT
Input Data
order_idINTEGER | customer_idINTEGER | productTEXT | amountDECIMAL | order_dateDATE |
|---|---|---|---|---|
| 101 | 1 | Laptop | 1200 | 2024-01-15 |
| 102 | 1 | Mouse | 25 | 2024-01-16 |
| 103 | 2 | Keyboard | 75 | 2024-01-18 |
| 104 | 3 | Monitor | 350 | 2024-01-20 |
| 105 | 3 | Webcam | 120 | 2024-01-22 |
SQL Query
SELECT customer_id, COUNT(*) as order_count
FROM orders
GROUP BY customer_id
ORDER BY customer_idResult
5 rowscustomer_idNUMBER | order_countNUMBER |
|---|---|
| 1 | 2 |
| 2 | 1 |
| 3 | 2 |
| 4 | 1 |
| 5 | 2 |
Aggregate Functions
COUNT()
Counts the number of rows in each group.
COUNT(*)Use COUNT(*) for all rows, COUNT(column) to exclude NULLs
SUM()
Adds up all values in a numeric column.
SUM(amount)Only works with numeric columns
AVG()
Calculates the average (mean) of numeric values.
AVG(price)NULL values are excluded from calculation
MAX()
Finds the largest value in each group.
MAX(salary)Works with numbers, text (alphabetically), and dates
MIN()
Finds the smallest value in each group.
MIN(price)Works with numbers, text (alphabetically), and dates
Multiple Aggregates
Combine multiple functions in one query.
COUNT(*), AVG(price), MAX(price)Get comprehensive statistics in a single query
WHERE vs HAVING
WHERE
Filters individual rows BEFORE grouping happens.
WHERE price > 100- Applied before GROUP BY
- Cannot use aggregate functions
- Filters rows
HAVING
Filters groups AFTER aggregation is done.
HAVING COUNT(*) > 10- Applied after GROUP BY
- Can use aggregate functions
- Filters groups
Complete Example Using Both
SELECT country, COUNT(*) AS customer_count, AVG(age) AS avg_age
FROM customers
WHERE age >= 18 -- Filter rows: only adults
GROUP BY country -- Group by country
HAVING COUNT(*) > 5 -- Filter groups: only countries with 5+ customers
ORDER BY customer_count DESC; -- Sort resultsThis query finds countries with 5+ adult customers and shows their average age, sorted by customer count.
Common Mistakes to Avoid
❌ Column not in GROUP BY
SELECT country, name, COUNT(*) FROM customers GROUP BY country;`name` isn't in GROUP BY and isn't aggregated. SQL doesn't know which name to show per country! Remove `name` or add it to GROUP BY.
❌ Using aggregate in WHERE instead of HAVING
WHERE COUNT(*) > 10WHERE can't use aggregates! Use: HAVING COUNT(*) > 10
❌ Forgetting GROUP BY entirely
SELECT country, COUNT(*) FROM customers;Mixing non-aggregate column (country) with aggregate (COUNT). Need: GROUP BY country
❌ COUNT(*) vs COUNT(column) confusion
COUNT(email) -- Excludes rows where email is NULLIf you want all rows: COUNT(*). Use COUNT(column) only when you want to exclude NULLs.
✅ Correct GROUP BY with multiple aggregates
SELECT category, COUNT(*) AS products, AVG(price) AS avg_price FROM products WHERE price > 0 GROUP BY category HAVING COUNT(*) > 3;Perfect! All non-aggregate columns in GROUP BY, WHERE filters rows, HAVING filters groups.