MySQL Self Join
Joining a table with itself
🔄 What is Self Join?
Self Join is a technique where a table is joined with itself to compare rows within the same table. It's useful for hierarchical data like employee-manager relationships or comparing records within one table.
-- Simple Self Join example
SELECT e1.name AS Employee, e2.name AS Manager
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id;
Output:
| Employee | Manager |
|---|---|
| John | Sarah |
| Mike | Sarah |
| Lisa | John |
Key Self Join Concepts
Table Aliases
Must use aliases to distinguish copies
FROM employees e1
JOIN employees e2
Hierarchical Data
Perfect for parent-child relationships
ON e1.manager_id = e2.id
Compare Rows
Find relationships within same table
WHERE e1.salary > e2.salary
Same Table Twice
Treats one table as two separate tables
SELECT a.*, b.*
FROM table a, table b
🔹 Basic Self Join Syntax
Self Join requires table aliases to differentiate between the two instances of the same table. You treat the table as if it were two separate tables by giving each instance a unique alias, then join them using a common column relationship.
-- Basic Self Join syntax
SELECT
a.column1,
b.column2
FROM table_name a
JOIN table_name b ON a.common_column = b.related_column;
-- Alternative with INNER JOIN
SELECT
t1.column_name,
t2.column_name
FROM table_name t1
INNER JOIN table_name t2 ON t1.id = t2.parent_id;
Example Result:
Rows from the same table matched based on relationship
🔹 Employee-Manager Hierarchy Example
The most common use of Self Join is displaying organizational hierarchies. This example shows how to match employees with their managers by joining the employees table with itself, where each employee's manager_id references another employee's id in the same table.
-- Create employees table
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
manager_id INT,
position VARCHAR(100)
);
-- Insert sample data
INSERT INTO employees VALUES
(1, 'Sarah Johnson', NULL, 'CEO'),
(2, 'John Smith', 1, 'Manager'),
(3, 'Mike Brown', 1, 'Manager'),
(4, 'Lisa Davis', 2, 'Developer'),
(5, 'Tom Wilson', 2, 'Developer');
-- Self Join to show employee-manager relationship
SELECT
e.name AS Employee,
e.position AS Position,
m.name AS Manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
ORDER BY m.name, e.name;
Output:
| Employee | Position | Manager |
|---|---|---|
| Sarah Johnson | CEO | NULL |
| John Smith | Manager | Sarah Johnson |
| Mike Brown | Manager | Sarah Johnson |
| Lisa Davis | Developer | John Smith |
| Tom Wilson | Developer | John Smith |
🔹 Comparing Rows in Same Table
Self Join allows you to compare different rows within the same table. This is useful for finding pairs of records that meet specific criteria, such as employees in the same department, products with similar prices, or customers from the same city.
-- Find employees earning more than their managers
SELECT
e.name AS Employee,
e.salary AS Employee_Salary,
m.name AS Manager,
m.salary AS Manager_Salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
-- Find pairs of employees in same department
SELECT
e1.name AS Employee1,
e2.name AS Employee2,
e1.department
FROM employees e1
JOIN employees e2 ON e1.department = e2.department
WHERE e1.id < e2.id;
Output:
Employees paired based on comparison criteria
🔹 Real-World Use Cases
Self Join is essential for working with hierarchical structures, finding duplicates, and analyzing relationships within a single dataset. Common applications include organizational charts, product recommendations based on similar items, social network friend connections, and geographical proximity analysis between locations in the same table.
-- Use Case 1: Find all subordinates of a manager
SELECT
m.name AS Manager,
COUNT(e.id) AS Number_of_Subordinates,
GROUP_CONCAT(e.name) AS Subordinates
FROM employees m
LEFT JOIN employees e ON m.id = e.manager_id
GROUP BY m.id, m.name
HAVING COUNT(e.id) > 0;
-- Use Case 2: Product recommendations (similar products)
SELECT
p1.product_name AS Product,
p2.product_name AS Similar_Product,
p1.category
FROM products p1
JOIN products p2 ON p1.category = p2.category
WHERE p1.id != p2.id
AND ABS(p1.price - p2.price) < 10;