SQL CASE WHEN COUNT: Practical Guide with Real-World Examples

Ryan Forrester

Oct 2nd, 2024

Blog

Breakdown of SQL CASE WHEN COUNT

SQL’s CASE WHEN statement combined with the COUNT function is a great for data analysis. This combo lets you create custom categories and count occurrences within those categories in a single query.

If you’ve ever found yourself running multiple queries to get counts for different conditions, this technique will save you time and simplify your code.

SQL Made Simple - Try Five
Five is the Easiest Way to Create, Model and Query a SQL Database

___

Here’s What You Can Do With Five

Create and Model a SQL Database ✅
Write or Build SQL Queries ✅
Visualize Queries as Charts or in Dashboards ✅
Add CRUD Permissions to Control Data Access ✅
Host Your Database Online ✅

“Five bridges the gap between SQL and the web,
allowing me to create full-stack applications almost entirely in SQL”
– Crag Jones, Database Administrator (DBA)

Table of Contents

Breakdown of SQL CASE WHEN COUNT
Why Use CASE WHEN with COUNT?
Basic Syntax: CASE WHEN COUNT
Real-World Example: Analyzing Customer Orders
Advanced Usage: Multiple Conditions and Aggregations
CASE WHEN COUNT for Percentage Calculations
Combining CASE WHEN COUNT with Other Aggregations
Performance Considerations
Common Pitfalls and How to Avoid Them
Wrapping Up

Why Use CASE WHEN with COUNT?

The CASE WHEN COUNT combination shines when you need to:

Count occurrences based on multiple conditions
Create custom groupings on the fly
Perform conditional aggregations
Generate summary reports with various metrics

Let’s dive into some practical examples to see how this works in action.

Basic Syntax: CASE WHEN COUNT

Before we jump into complex scenarios, let’s review the basic syntax:

SELECT
  COUNT(CASE WHEN condition THEN result END) AS count_name
FROM
  table_name;

This structure allows you to count rows that meet specific conditions. The CASE WHEN statement acts as a filter, and COUNT tallies up the matches.

Real-World Example: Analyzing Customer Orders

Imagine you’re working with an e-commerce database and want to analyze customer orders. You have a table called orders with columns like order_id, customer_id, order_date, and total_amount.

Task: Count Orders by Value Range

Let’s say you want to count orders in different value ranges: under $50, $50-$100, and over $100.

SELECT
  COUNT(CASE WHEN total_amount < 50 THEN 1 END) AS low_value_orders,
  COUNT(CASE WHEN total_amount BETWEEN 50 AND 100 THEN 1 END) AS mid_value_orders,
  COUNT(CASE WHEN total_amount > 100 THEN 1 END) AS high_value_orders
FROM
  orders;

This query gives you a quick overview of order distribution across different value ranges. It’s much more efficient than running separate queries for each range.

Adding Time-Based Conditions

Now, let’s add a time dimension to our analysis. We’ll count orders for the last 30 days, 30-60 days ago, and 60-90 days ago.

SELECT
  COUNT(CASE WHEN order_date >= CURRENT_DATE - INTERVAL '30 days' THEN 1 END) AS last_30_days,
  COUNT(CASE WHEN order_date BETWEEN CURRENT_DATE - INTERVAL '60 days' AND CURRENT_DATE - INTERVAL '30 days' THEN 1 END) AS days_30_to_60,
  COUNT(CASE WHEN order_date BETWEEN CURRENT_DATE - INTERVAL '90 days' AND CURRENT_DATE - INTERVAL '60 days' THEN 1 END) AS days_60_to_90
FROM
  orders;

This query provides insights into recent order trends without the need for multiple date-range filters.

Advanced Usage: Multiple Conditions and Aggregations

The real power of CASE WHEN COUNT comes into play when you combine multiple conditions and aggregations.

Task: Analyze High-Value Customers

Let’s identify and count high-value customers based on their order history.

SELECT
  COUNT(CASE WHEN total_orders > 10 AND total_spend > 1000 THEN 1 END) AS high_value_customers,
  COUNT(CASE WHEN total_orders > 5 AND total_spend > 500 THEN 1 END) AS potential_high_value
FROM (
  SELECT
    customer_id,
    COUNT(*) AS total_orders,
    SUM(total_amount) AS total_spend
  FROM
    orders
  GROUP BY
    customer_id
) customer_summary;

This query uses a subquery to calculate total orders and spend per customer, then applies CASE WHEN COUNT to categorize and count customers based on these metrics.

CASE WHEN COUNT for Percentage Calculations

Another practical use of CASE WHEN COUNT is calculating percentages within your dataset.

Task: Calculate Product Category Distribution

Suppose you have a products table with a category column. You want to see the distribution of products across categories as percentages.

SELECT
  category,
  COUNT(*) AS category_count,
  COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () AS category_percentage
FROM
  products
GROUP BY
  category;

This query uses COUNT(*) with CASE WHEN implicitly (the CASE WHEN is applied by the GROUP BY). The OVER () clause creates a window function that calculates the total count across all categories.

Combining CASE WHEN COUNT with Other Aggregations

You can mix CASE WHEN COUNT with other aggregation functions for more comprehensive analysis.

Task: Analyze Order Values and Counts by Customer Type

Let’s say you have a customers table with a customer_type column (e.g., ‘retail’, ‘wholesale’, ‘VIP’). You want to analyze both the count and total value of orders for each customer type.

SELECT
  c.customer_type,
  COUNT(CASE WHEN o.total_amount < 100 THEN 1 END) AS low_value_orders,
  COUNT(CASE WHEN o.total_amount >= 100 AND o.total_amount < 500 THEN 1 END) AS mid_value_orders,
  COUNT(CASE WHEN o.total_amount >= 500 THEN 1 END) AS high_value_orders,
  SUM(CASE WHEN o.total_amount < 100 THEN o.total_amount ELSE 0 END) AS low_value_total,
  SUM(CASE WHEN o.total_amount >= 100 AND o.total_amount < 500 THEN o.total_amount ELSE 0 END) AS mid_value_total,
  SUM(CASE WHEN o.total_amount >= 500 THEN o.total_amount ELSE 0 END) AS high_value_total
FROM
  customers c
JOIN
  orders o ON c.customer_id = o.customer_id
GROUP BY
  c.customer_type;

This query provides a detailed breakdown of order counts and values across different customer types and order value ranges.

Performance Considerations

While CASE WHEN COUNT is incredibly useful, it’s important to consider performance, especially when working with large datasets.

Indexing: Ensure that columns used in CASE WHEN conditions are properly indexed.
Subqueries: For complex conditions, using subqueries (as in the high-value customers example) can sometimes be more efficient than multiple CASE WHEN statements.
Materialized Views: For frequently run analyses, consider creating materialized views that pre-compute these conditional counts.

Common Pitfalls and How to Avoid Them

Null Handling: Remember that COUNT(expression) only counts non-null values. If your CASE WHEN might return null, use COUNT(*) instead:

   COUNT(CASE WHEN condition THEN 1 ELSE NULL END)

vs.

   COUNT(*) FILTER (WHERE condition)

Exclusive Conditions: Ensure your CASE WHEN conditions are mutually exclusive if you’re categorizing data. Overlapping conditions can lead to double-counting.
Date and Time Zones: When working with date/time conditions, be aware of time zone differences. Use consistent time zone handling in your queries.

Wrapping Up

The CASE WHEN COUNT technique in SQL is a versatile tool for data analysis. It allows you to create nuanced, condition-based counts in a single query, reducing the need for multiple queries or complex joins. By using this approach, you’ll be able to write more efficient, readable SQL code and derive insights from your data more quickly.

Remember, the key to effective use of CASE WHEN COUNT is to think about your data categorization needs upfront. What conditions are important for your analysis? How can you group these conditions efficiently? With practice, you’ll find yourself reaching for this technique regularly in your SQL toolkit.