MySQL Self Join

Joining a table with itself

🔄 What is Self Join?

Self Join is a technique where a table is joined with itself to compare rows within the same table. It's useful for hierarchical data like employee-manager relationships or comparing records within one table.


-- Simple Self Join example
SELECT e1.name AS Employee, e2.name AS Manager
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id;
                                    

Output:

Employee Manager
John Sarah
Mike Sarah
Lisa John

Key Self Join Concepts

🏷️

Table Aliases

Must use aliases to distinguish copies

FROM employees e1
JOIN employees e2
🌳

Hierarchical Data

Perfect for parent-child relationships

ON e1.manager_id = e2.id
🔍

Compare Rows

Find relationships within same table

WHERE e1.salary > e2.salary
🔗

Same Table Twice

Treats one table as two separate tables

SELECT a.*, b.*
FROM table a, table b

🔹 Basic Self Join Syntax

Self Join requires table aliases to differentiate between the two instances of the same table. You treat the table as if it were two separate tables by giving each instance a unique alias, then join them using a common column relationship.

-- Basic Self Join syntax
SELECT 
    a.column1,
    b.column2
FROM table_name a
JOIN table_name b ON a.common_column = b.related_column;

-- Alternative with INNER JOIN
SELECT 
    t1.column_name,
    t2.column_name
FROM table_name t1
INNER JOIN table_name t2 ON t1.id = t2.parent_id;

Example Result:

Rows from the same table matched based on relationship

🔹 Employee-Manager Hierarchy Example

The most common use of Self Join is displaying organizational hierarchies. This example shows how to match employees with their managers by joining the employees table with itself, where each employee's manager_id references another employee's id in the same table.

-- Create employees table
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT,
    position VARCHAR(100)
);

-- Insert sample data
INSERT INTO employees VALUES
(1, 'Sarah Johnson', NULL, 'CEO'),
(2, 'John Smith', 1, 'Manager'),
(3, 'Mike Brown', 1, 'Manager'),
(4, 'Lisa Davis', 2, 'Developer'),
(5, 'Tom Wilson', 2, 'Developer');

-- Self Join to show employee-manager relationship
SELECT 
    e.name AS Employee,
    e.position AS Position,
    m.name AS Manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
ORDER BY m.name, e.name;

Output:

Employee Position Manager
Sarah Johnson CEO NULL
John Smith Manager Sarah Johnson
Mike Brown Manager Sarah Johnson
Lisa Davis Developer John Smith
Tom Wilson Developer John Smith

🔹 Comparing Rows in Same Table

Self Join allows you to compare different rows within the same table. This is useful for finding pairs of records that meet specific criteria, such as employees in the same department, products with similar prices, or customers from the same city.

-- Find employees earning more than their managers
SELECT 
    e.name AS Employee,
    e.salary AS Employee_Salary,
    m.name AS Manager,
    m.salary AS Manager_Salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;

-- Find pairs of employees in same department
SELECT 
    e1.name AS Employee1,
    e2.name AS Employee2,
    e1.department
FROM employees e1
JOIN employees e2 ON e1.department = e2.department
WHERE e1.id < e2.id;

Output:

Employees paired based on comparison criteria

🔹 Real-World Use Cases

Self Join is essential for working with hierarchical structures, finding duplicates, and analyzing relationships within a single dataset. Common applications include organizational charts, product recommendations based on similar items, social network friend connections, and geographical proximity analysis between locations in the same table.

-- Use Case 1: Find all subordinates of a manager
SELECT 
    m.name AS Manager,
    COUNT(e.id) AS Number_of_Subordinates,
    GROUP_CONCAT(e.name) AS Subordinates
FROM employees m
LEFT JOIN employees e ON m.id = e.manager_id
GROUP BY m.id, m.name
HAVING COUNT(e.id) > 0;

-- Use Case 2: Product recommendations (similar products)
SELECT 
    p1.product_name AS Product,
    p2.product_name AS Similar_Product,
    p1.category
FROM products p1
JOIN products p2 ON p1.category = p2.category
WHERE p1.id != p2.id
AND ABS(p1.price - p2.price) < 10;

🧠 Test Your Knowledge

Why do we need table aliases in Self Join?