MySQL HAVING

Filtering grouped results

🔍 What is HAVING?

HAVING filters groups created by GROUP BY based on aggregate function results. Unlike WHERE which filters individual rows before grouping, HAVING filters groups after aggregation, allowing conditions on COUNT, SUM, AVG, MAX, and MIN results.


-- Simple HAVING example
SELECT department, COUNT(*) AS emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
                                    

Output:

department emp_count
Sales 15
IT 12

Only departments with more than 10 employees

Key HAVING Concepts

🔢

Filters Groups

Filters after GROUP BY aggregation

HAVING COUNT(*) > 5
📊

Aggregate Conditions

Works with aggregate functions

HAVING SUM(amount) > 1000
⚖️

After Grouping

Applied after GROUP BY processing

GROUP BY dept
HAVING AVG(salary) > 50000
🔄

vs WHERE

WHERE filters rows, HAVING filters groups

WHERE status='Active'
HAVING COUNT(*) > 3

🔹 Basic HAVING Syntax

HAVING must be used with GROUP BY and comes after it in the query. It filters the grouped results based on conditions applied to aggregate functions. You can use comparison operators like greater than, less than, equals, and logical operators like AND, OR with HAVING clauses.

-- Basic HAVING syntax
SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1
HAVING aggregate_function(column2) condition;

-- HAVING with multiple conditions
SELECT department, COUNT(*) AS count
FROM employees
GROUP BY department
HAVING COUNT(*) > 5 AND COUNT(*) < 20;

Example Result:

Only groups that meet the HAVING condition

🔹 HAVING vs WHERE Comparison

WHERE filters individual rows before grouping occurs, while HAVING filters groups after aggregation. Use WHERE to filter raw data and HAVING to filter aggregated results. You can use both in the same query: WHERE filters first, then GROUP BY groups, then HAVING filters the groups.

-- Create sample table
CREATE TABLE sales (
    id INT,
    salesperson VARCHAR(100),
    region VARCHAR(50),
    amount DECIMAL(10,2),
    sale_date DATE
);

-- Insert sample data
INSERT INTO sales VALUES
(1, 'John', 'North', 1000, '2024-01-15'),
(2, 'John', 'North', 1500, '2024-01-20'),
(3, 'Jane', 'South', 800, '2024-01-18'),
(4, 'Mike', 'North', 2000, '2024-01-22'),
(5, 'Jane', 'South', 1200, '2024-01-25');

-- WHERE filters rows before grouping
SELECT salesperson, SUM(amount) AS total_sales
FROM sales
WHERE region = 'North'
GROUP BY salesperson;

-- HAVING filters groups after aggregation
SELECT salesperson, SUM(amount) AS total_sales
FROM sales
GROUP BY salesperson
HAVING SUM(amount) > 1500;

-- Using both WHERE and HAVING
SELECT salesperson, SUM(amount) AS total_sales
FROM sales
WHERE region = 'North'
GROUP BY salesperson
HAVING SUM(amount) > 1000;

WHERE Output (filters rows first):

salesperson total_sales
John 2500.00
Mike 2000.00

HAVING Output (filters groups):

salesperson total_sales
John 2500.00
Jane 2000.00
Mike 2000.00

🔹 HAVING with Different Aggregate Functions

HAVING works with all aggregate functions including COUNT, SUM, AVG, MAX, and MIN. You can combine multiple aggregate conditions using AND and OR operators to create complex filtering logic. This allows precise control over which groups appear in your final results based on various statistical criteria.

-- HAVING with COUNT
SELECT region, COUNT(*) AS sale_count
FROM sales
GROUP BY region
HAVING COUNT(*) >= 3;

-- HAVING with SUM
SELECT salesperson, SUM(amount) AS total_sales
FROM sales
GROUP BY salesperson
HAVING SUM(amount) > 2000;

-- HAVING with AVG
SELECT region, AVG(amount) AS avg_sale
FROM sales
GROUP BY region
HAVING AVG(amount) > 1000;

-- HAVING with multiple conditions
SELECT 
    salesperson,
    COUNT(*) AS sale_count,
    SUM(amount) AS total_sales,
    AVG(amount) AS avg_sale
FROM sales
GROUP BY salesperson
HAVING COUNT(*) >= 2 
AND SUM(amount) > 1500
AND AVG(amount) > 1000;

Output:

salesperson sale_count total_sales avg_sale
John 2 2500.00 1250.00
Jane 2 2000.00 1000.00

🔹 Practical HAVING Examples

HAVING is essential for business analytics and reporting. Use it to find top performers, identify underperforming segments, detect anomalies, and create threshold-based reports. It helps answer questions like "which products sold more than 100 units" or "which customers spent over $10,000" by filtering aggregated data.

-- Find customers with more than 5 orders
SELECT 
    customer_name,
    COUNT(*) AS order_count,
    SUM(amount) AS total_spent
FROM orders
GROUP BY customer_name
HAVING COUNT(*) > 5
ORDER BY total_spent DESC;

-- Find products with average rating above 4.5
SELECT 
    product_name,
    COUNT(*) AS review_count,
    AVG(rating) AS avg_rating
FROM product_reviews
GROUP BY product_name
HAVING AVG(rating) > 4.5 AND COUNT(*) >= 10;

-- Find departments with high average salary
SELECT 
    department,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary,
    MAX(salary) AS highest_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000
ORDER BY avg_salary DESC;

🔹 Real-World Use Cases

HAVING enables sophisticated data analysis for business intelligence. Common applications include identifying VIP customers who exceed spending thresholds, finding popular products that meet minimum sales volumes, detecting fraud patterns with unusual transaction counts, analyzing regional performance above targets, and generating executive reports with qualified metrics that meet specific business criteria.

-- Use Case 1: Identify high-value customers
SELECT 
    customer_id,
    customer_name,
    COUNT(*) AS total_orders,
    SUM(order_amount) AS lifetime_value,
    AVG(order_amount) AS avg_order_value
FROM customer_orders
WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR)
GROUP BY customer_id, customer_name
HAVING SUM(order_amount) > 10000
ORDER BY lifetime_value DESC;

-- Use Case 2: Find underperforming products
SELECT 
    category,
    product_name,
    SUM(quantity_sold) AS units_sold,
    SUM(revenue) AS total_revenue
FROM product_sales
WHERE sale_date >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
GROUP BY category, product_name
HAVING SUM(quantity_sold) < 50
ORDER BY category, units_sold;

-- Use Case 3: Regional sales analysis
SELECT 
    region,
    COUNT(DISTINCT salesperson_id) AS sales_team_size,
    COUNT(*) AS total_sales,
    SUM(amount) AS revenue,
    AVG(amount) AS avg_deal_size
FROM sales_data
GROUP BY region
HAVING SUM(amount) > 100000 
AND COUNT(*) > 50
ORDER BY revenue DESC;

🧠 Test Your Knowledge

When should you use HAVING instead of WHERE?