While studying for the Databricks Data Analyst Associate certification, I hit a bit of a wall with some of SQL’s advanced grouping options. GROUP BY was familiar, but GROUPING SETS, ROLLUP, and CUBE left me scratching my head.
So, I decided to break things down and write the blog post I wish I had when I started. If you’ve ever felt confused by these SQL tools, this one’s for you!
🧱 Starting with the Basics: GROUP BY
If you’ve worked with SQL before, GROUP BY is probably second nature.
What it does:
Groups rows that have the same values in specified columns. Then, you can use aggregate functions like SUM(), COUNT(), or AVG() on each group.
🔍 Example:
Let’s say you have a simple sales table:
| category | region | sales |
|---|---|---|
| Furniture | East | 100 |
| Technology | East | 200 |
| Furniture | West | 150 |
| Technology | West | 300 |
To get total sales per product:
SELECT category, SUM(sales) AS total_sales
FROM superstore
GROUP BY category;
Result:
| category | total_sales |
|---|---|
| Furniture | 250 |
| Technology | 500 |
🔄 GROUPING SETS: Multiple Summaries in One Query
What it does:GROUPING SETS allows you to combine multiple GROUP BY queries into a single result. It’s like running separate aggregations side by side, but way more efficient.
💡 When to use:
- You want summaries at different levels (e.g., by category, by region, and an overall total).
- You’re building reports that require multiple perspectives.
🔍 Example:
SELECT category, region, SUM(sales) AS total_sales
FROM superstore
GROUP BY GROUPING SETS (
(category),
(region),
()
);
Result:
| category | region | total_sales |
|---|---|---|
| Furniture | NULL | 250 |
| Technology | NULL | 500 |
| NULL | East | 300 |
| NULL | West | 450 |
| NULL | NULL | 750 |
📝 Note: NULL in a column means that column wasn’t part of the grouping at that row level (e.g., total across all categories or all regions).
🔁 ROLLUP: Subtotals + Grand Total
What it does:ROLLUP creates subtotals as you move up a hierarchy of columns, ending with a grand total. It’s great for multi-level reporting.
💡 When to use:
- You want a breakdown by category and subcategory.
- You also want subtotals and an overall total—without writing multiple queries.
🔍 Example:
SELECT category, region, SUM(sales) AS total_sales
FROM superstore
GROUP BY ROLLUP (category, region);
Result:
| category | region | total_sales |
|---|---|---|
| Furniture | East | 100 |
| Furniture | West | 150 |
| Furniture | NULL | 250 |
| Technology | East | 200 |
| Technology | West | 300 |
| Technology | NULL | 500 |
| NULL | NULL | 750 |
📝 Note: Rows with NULL in region are subtotals for each category. The row with both columns as NULL is the grand total.
🧊 CUBE: All the Combinations
What it does:CUBE is like ROLLUP, but more powerful—it gives you every possible combination of the grouping columns.
💡 When to use:
- You want to analyze your data from every possible angle.
- You’re building a pivot-style report or multidimensional dashboard.
🔍 Example:
SELECT category, region, SUM(sales) AS total_sales
FROM superstore
GROUP BY CUBE (category, region);
Result:
| category | region | total_sales |
|---|---|---|
| Furniture | East | 100 |
| Furniture | West | 150 |
| Furniture | NULL | 250 |
| Technology | East | 200 |
| Tehnology | West | 300 |
| Technology | NULL | 500 |
| NULL | East | 300 |
| NULL | West | 450 |
| NULL | NULL | 750 |
📊 Quick Reference Table
| Feature | Example | What You Get | Best Use Case |
|---|---|---|---|
GROUP BY | GROUP BY category | Basic grouping | Simple aggregations |
GROUPING SETS | GROUPING SETS ((category),()) | Multiple specific summaries in one | Comparative reporting |
ROLLUP | ROLLUP (category, region) | Subtotals + Grand Total | Hierarchical summaries |
CUBE | CUBE (category, region) | All combinations of groupings | Multidimensional analysis |
✅ Final Thoughts
These SQL tools might seem intimidating at first, but once you understand what each one does, they unlock a ton of power for summarizing and analyzing your data.
Here’s a cheat sheet to help decide:
- 👉 Use
GROUP BYfor simple groupings - 👉 Use
GROUPING SETSwhen you need multiple distinct groupings in one result - 👉 Use
ROLLUPto add hierarchy-based subtotals and a grand total - 👉 Use
CUBEwhen you want to analyze across all combinations of columns
Once I got the hang of these, I was ready to sit the exam!
Got questions or want to see another topic broken down? Drop a comment below 👇 or reach out—I’d love to help.
Happy querying!

Leave a comment