Putting It All Together: Saving $550,000 on One Query
Concept. A single unindexed weekly aggregation query running 100 times a day on a 1 TB table can cost a company $550,000 a year in cloud compute. Stacking the right optimizations (index, columnar storage, compression) drops the bill to under $400.
Intuition. A naive Hash Partition Join across a full 1 TB of listen history costs ~49,200 IOs per run. Adding a B+Tree index on listen_time prunes the query to the 1% of rows in this week's range, which cuts the cost to ~554 IOs. Switching to columnar storage reads only the 4 columns the query actually touches, cutting further to ~60 IOs. Compression shrinks those columns to ~35 IOs. A 1,400× improvement overall.
The Bill
Goal: Understand how indexing, columnar storage, and compression stack to tackle a costly query disaster.
An unoptimized weekly "Trending Genres" aggregation query runs 100 times daily at 9 hours per run, costing $1,500 a day (about $550,000 a year) in AWS compute.
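The headline figures follow from simple arithmetic; here is a quick sketch using only the numbers quoted above:

```python
# Back-of-the-envelope arithmetic for the baseline bill.
RUNS_PER_DAY = 100
HOURS_PER_RUN = 9
DAILY_COST_USD = 1_500                  # quoted daily compute cost

print(RUNS_PER_DAY * HOURS_PER_RUN)     # 900 compute-hours of query work per day
print(DAILY_COST_USD * 365)             # 547,500 -> the ~$550,000/year figure
```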
System Archeology: The Schema
To optimize the workload, we start by laying out the physical schema.
| Component | Stats | Footprint (64 MB pages) |
|---|---|---|
| Songs table | ~50 M songs | 400 pages (25 GB) |
| Listens table | ~10 B listens | 16,000 pages (1 TB) |
| Weekly scope | Oct 21–27 filter | 150 pages (~1% of data) |
| Memory (B) | Buffer pool | 20 pages (1.28 GB) |
All calculations below use a 64 MB page size.
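The footprints in the table fall straight out of that page size; a minimal conversion sketch (no new numbers, just the arithmetic):

```python
# Convert page counts to footprints at a 64 MB page size.
PAGE_MB = 64

print(400 * PAGE_MB / 1024)     # Songs:       25.0 GB
print(16_000 * PAGE_MB / 1024)  # Listens:     1000.0 GB, i.e. ~1 TB
print(20 * PAGE_MB)             # Buffer pool: 1280 MB, i.e. ~1.28 GB
print(150 / 16_000)             # Weekly slice: ~0.009, i.e. ~1% of the Listens pages
```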
Problem 0: The Baseline Investigation
Problem. The weekly trending report takes 9 hours. Before optimizing, we need the baseline cost. What is the IO cost of a Hash Partition Join (HPJ) on these two tables? Should the optimizer choose a Block Nested Loop Join (BNLJ) instead?
Solution.
Option 1: Block Nested Loop Join (BNLJ)
If the optimizer picked BNLJ, it would be a catastrophe.
- Formula: M + (M / B) × N
- Math: 400 + (400 / 20) × 16,000 = 320,400 IOs
- Why: BNLJ scans the 1 TB Listens table once for every block of the 25 GB Songs table. This is the "disaster scenario" for big-data joins.
Option 2: Hash Partition Join (HPJ)
HPJ is better, but still wasteful.
- Formula: 3 × (M + N)
- Math: 3 × (400 + 16,000) = 49,200 IOs
- Why: HPJ is efficient for joining two large tables, but here we only need 1% of the data. Scanning the entire 1 TB to find that 1% is massive waste. (The arithmetic for both joins is worked through in the sketch below.)
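Here is a minimal sketch of both cost formulas with the schema numbers plugged in (M = Songs pages, N = Listens pages, B = buffer pool pages, as in the table above):

```python
import math

M, N, B = 400, 16_000, 20   # Songs pages, Listens pages, buffer pool pages

def bnlj_io(m: int, n: int, b: int) -> int:
    # Block Nested Loop Join: scan Songs once, rescan Listens for every block of Songs.
    return m + math.ceil(m / b) * n

def hpj_io(m: int, n: int) -> int:
    # Hash Partition Join: partition both tables (read + write), then read again to probe.
    return 3 * (m + n)

print(bnlj_io(M, N, B))   # 320,400 IOs -- the disaster scenario
print(hpj_io(M, N))       # 49,200 IOs  -- the baseline the optimizer should pick
```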
Problem 1: The Index Win
Problem. Only about 150 of the 16,000 Listens pages contain data for Oct 21–27. How do we skip the other pages, and what is the new IO cost?
Solution.
- Logic. Create a B+Tree index on Listens(listen_time).
- Design. The index allows a targeted range scan of the week's pages instead of feeding the full Listens table into the join.
- IO math (worked in the sketch below):
  - Scan Songs: 400 IOs
  - Index lookup: ~4 IOs
  - Fetch filtered Listens: 150 IOs
  - Total: ~554 IOs
- Why. Indexes prune the search space. By jumping directly to the start date and reading only the ~1% of pages in range, we get a roughly 90× improvement over plain HPJ.
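A minimal sketch of the indexed plan's cost; the ~4 IO descent is the assumed height of the B+Tree, as stated above:

```python
SONGS_PAGES = 400           # full scan of the small Songs table
BTREE_DESCENT_IO = 4        # assumed root-to-leaf height of the Listens(listen_time) index
WEEK_LISTENS_PAGES = 150    # pages holding the Oct 21-27 rows

indexed_io = SONGS_PAGES + BTREE_DESCENT_IO + WEEK_LISTENS_PAGES
print(indexed_io)           # 554 IOs
print(49_200 / indexed_io)  # ~88.8 -> the roughly 90x win over plain HPJ
```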
Problem 2: The Column Store Win
Problem. Even with an index, row-based storage loads the whole 150-byte row just to read the 24 bytes we actually need (song_id, user_id, listen_time, duration). How do we eliminate this column waste, and what is the estimated IO savings?
Solution.
- Logic. Switch to a columnar store (BigQuery, Parquet).
- Design. Each column is physically stored in its own set of pages. The engine reads only the columns referenced in the SELECT and WHERE clauses, ignoring IP addresses, device IDs, and anything else.
- IO math (see the sketch below):
  - We read only about 1/6th of the row width (24 of 150 bytes).
  - Total: ~60 IOs (Listens ~25 + Songs ~35)
- Why. Columnar storage is optimized for aggregation. Reading only the relevant columns cuts IO by another 9×.
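A minimal sketch of the column-pruning estimate; the 24-of-150-byte ratio comes from the problem statement, and the ~35-page Songs-side figure is taken from the solution rather than derived here:

```python
ROW_BYTES = 150             # full Listens row width
NEEDED_BYTES = 24           # song_id + user_id + listen_time + duration
WEEK_LISTENS_PAGES = 150    # row-store pages for the filtered week

# A column store only touches pages belonging to the referenced columns.
listens_io = round(WEEK_LISTENS_PAGES * NEEDED_BYTES / ROW_BYTES)
songs_io = 35               # Songs-side column pages, as given in the solution

print(listens_io)               # 24 -> the ~25 IO estimate for Listens
print(listens_io + songs_io)    # 59 -> the ~60 IO total
```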
Problem 3: The Compression Win
Problem. How do we further shrink the genre, rating, and user ID columns without losing data?
Solution.
- Technique 1 (Dictionary encoding). Map the 200 unique genres to 1-byte IDs (16 pages down to 1 page).
- Technique 2 (Bit packing). Ratings are 1–5, so store them in 4 bits instead of a 32-bit integer (7 pages down to 1 page).
- Technique 3 (Run-length encoding). Since the data is sorted, compress repeating user IDs (13 pages down to 5 pages).
- Final result: ~35 IOs, 20 seconds per run, ~$400 a year.
- Why. Domain-specific compression. The more you know about the shape of your data, the more of it you can throw away. (A toy sketch of all three encodings follows.)
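A toy sketch of the three encodings on made-up values (the genres, ratings, and user IDs below are illustrative, not the real data):

```python
# Dictionary encoding: replace repeated genre strings with 1-byte codes
# (200 unique genres fit comfortably in one byte).
genres = ["pop", "rock", "pop", "jazz", "pop", "rock"]
codes = {g: i for i, g in enumerate(sorted(set(genres)))}   # {'jazz': 0, 'pop': 1, 'rock': 2}
encoded_genres = bytes(codes[g] for g in genres)            # 6 bytes instead of ~24 bytes of text

# Bit packing: ratings 1-5 need only 4 bits, so two ratings share one byte.
ratings = [5, 3, 4, 1, 2, 5]
packed = bytes((hi << 4) | lo for hi, lo in zip(ratings[0::2], ratings[1::2]))
assert packed.hex() == "534125"                             # 3 bytes instead of 24 (six int32s)

# Run-length encoding: sorted user IDs repeat, so store (value, run_length) pairs.
user_ids = [7, 7, 7, 9, 9, 12]
runs = []
for uid in user_ids:
    if runs and runs[-1][0] == uid:
        runs[-1][1] += 1
    else:
        runs.append([uid, 1])
assert runs == [[7, 3], [9, 2], [12, 1]]
```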
Summary: The Compound ROI
Three independent optimizations multiplied together take a $550,000 disaster down to $400.
| Stage | IOs per run | Annual cost | Win factor |
|---|---|---|---|
| Baseline (HPJ) | 49,200 | $550,000 | 1× |
| + Index | ~554 | ~$6,500 | 90× |
| + Columnar | ~60 | ~$730 | 9× |
| + Compression | ~35 | ~$400 | 2× |
Total impact: 1,400× improvement.
The lesson isn't that any single technique is magic. It's that indexes, columnar storage, and compression are orthogonal: each one attacks a different kind of waste (scanning too many rows, reading too many columns, storing too many bytes), and they multiply when stacked.
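For completeness, a quick sketch that recovers the per-stage factors and the total from the IO counts in the table (the annual cost column scales roughly with IOs, so it follows the same ratios):

```python
ios = {"baseline (HPJ)": 49_200, "+ index": 554, "+ columnar": 60, "+ compression": 35}
counts = list(ios.values())

step_wins = [counts[i] / counts[i + 1] for i in range(len(counts) - 1)]
print([round(w) for w in step_wins])   # [89, 9, 2] -> the per-stage win factors
print(round(counts[0] / counts[-1]))   # 1406 -> the ~1,400x total improvement
```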
Next
Indexing Quiz → Apply the four index types and these optimizations to a few practice queries.