How to create a MySQL hierarchical recursive query?

I have a MySQL table which is as follows:

id name parent_id
19 category1 0
20 category2 19
21 category3 20
22 category4 21
... ... ...

Now, I want to have a single MySQL query to which I simply supply the id [for instance say id=19] then I should get all its child ids [i.e. result should have ids '20,21,22']....

The hierarchy of the children is not known; it can vary....

I know how to do it using a for loop... but how to achieve the same using a single MySQL query?


For MySQL 8+: use the recursive with syntax.
For MySQL 5.x: use inline variables, path IDs, or self-joins.

MySQL 8+

with recursive cte (id, name, parent_id) as (
  select     id,
             name,
             parent_id
  from       products
  where      parent_id = 19
  union all
  select     p.id,
             p.name,
             p.parent_id
  from       products p
  inner join cte
          on p.parent_id = cte.id
)
select * from cte;

The value specified in parent_id = 19 should be set to the id of the parent you want to select all the descendants of.

MySQL 5.x

For MySQL versions that do not support Common Table Expressions (up to version 5.7), you would achieve this with the following query:

select  id,
        name,
        parent_id 
from    (select * from products
         order by parent_id, id) products_sorted,
        (select @pv := '19') initialisation
where   find_in_set(parent_id, @pv)
and     length(@pv := concat(@pv, ',', id))

Here is a fiddle.

Here, the value specified in @pv := '19' should be set to the id of the parent you want to select all the descendants of.

This will work also if a parent has multiple children. However, it is required that each record fulfills the condition parent_id < id, otherwise the results will not be complete.

Variable assignments inside a query

This query uses specific MySQL syntax: variables are assigned and modified during its execution. Some assumptions are made about the order of execution:

  • The from clause is evaluated first. So that is where @pv gets initialised.
  • The where clause is evaluated for each record in the order of retrieval from the from aliases. So this is where a condition is put to only include records for which the parent was already identified as being in the descendant tree (all descendants of the primary parent are progressively added to @pv).
  • The conditions in this where clause are evaluated in order, and the evaluation is interrupted once the total outcome is certain. Therefore the second condition must be in second place, as it adds the id to the parent list, and this should only happen if the id passes the first condition. The length function is only called to make sure this condition is always true, even if the pv string would for some reason yield a falsy value.

All in all, one may find these assumptions too risky to rely on. The documentation warns:

you might get the results you expect, but this is not guaranteed [...] the order of evaluation for expressions involving user variables is undefined.

So even though it works consistently with the above query, the evaluation order may still change, for instance when you add conditions or use this query as a view or sub-query in a larger query. It is a "feature" that will be removed in a future MySQL release:

Previous releases of MySQL made it possible to assign a value to a user variable in statements other than SET. This functionality is supported in MySQL 8.0 for backward compatibility but is subject to removal in a future release of MySQL.

As stated above, from MySQL 8.0 onward you should use the recursive with syntax.

Efficiency

For very large data sets this solution might get slow, as the find_in_set operation is not the most ideal way to find a number in a list, certainly not in a list that reaches a size in the same order of magnitude as the number of records returned.

Alternative 1: with recursive, connect by

More and more databases implement the SQL:1999 ISO standard WITH [RECURSIVE] syntax for recursive queries (e.g. Postgres 8.4+, SQL Server 2005+, DB2, Oracle 11gR2+, SQLite 3.8.4+, Firebird 2.1+, H2, HyperSQL 2.1.0+, Teradata, MariaDB 10.2.2+). And as of version 8.0, also MySQL supports it. See the top of this answer for the syntax to use.

Some databases have an alternative, non-standard syntax for hierarchical look-ups, such as the CONNECT BY clause available on Oracle, DB2, Informix, CUBRID and other databases.

MySQL version 5.7 does not offer such a feature. When your database engine provides this syntax or you can migrate to one that does, then that is certainly the best option to go for. If not, then also consider the following alternatives.

Alternative 2: Path-style Identifiers

Things become a lot easier if you would assign id values that contain the hierarchical information: a path. For example, in your case this could look like this:

ID NAME
19 category1
19/1 category2
19/1/1 category3
19/1/1/1 category4

Then your select would look like this:

select  id,
        name 
from    products
where   id like '19/%'

Alternative 3: Repeated Self-joins

If you know an upper limit for how deep your hierarchy tree can become, you can use a standard sql query like this:

select      p6.parent_id as parent6_id,
            p5.parent_id as parent5_id,
            p4.parent_id as parent4_id,
            p3.parent_id as parent3_id,
            p2.parent_id as parent2_id,
            p1.parent_id as parent_id,
            p1.id as product_id,
            p1.name
from        products p1
left join   products p2 on p2.id = p1.parent_id 
left join   products p3 on p3.id = p2.parent_id 
left join   products p4 on p4.id = p3.parent_id  
left join   products p5 on p5.id = p4.parent_id  
left join   products p6 on p6.id = p5.parent_id
where       19 in (p1.parent_id, 
                   p2.parent_id, 
                   p3.parent_id, 
                   p4.parent_id, 
                   p5.parent_id, 
                   p6.parent_id) 
order       by 1, 2, 3, 4, 5, 6, 7;

See this fiddle

The where condition specifies which parent you want to retrieve the descendants of. You can extend this query with more levels as needed.


From the blog Managing Hierarchical Data in MySQL

Table structure

+-------------+----------------------+--------+
| category_id | name                 | parent |
+-------------+----------------------+--------+
|           1 | ELECTRONICS          |   NULL |
|           2 | TELEVISIONS          |      1 |
|           3 | TUBE                 |      2 |
|           4 | LCD                  |      2 |
|           5 | PLASMA               |      2 |
|           6 | PORTABLE ELECTRONICS |      1 |
|           7 | MP3 PLAYERS          |      6 |
|           8 | FLASH                |      7 |
|           9 | CD PLAYERS           |      6 |
|          10 | 2 WAY RADIOS         |      6 |
+-------------+----------------------+--------+

Query:

SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS';

Output

+-------------+----------------------+--------------+-------+
| lev1        | lev2                 | lev3         | lev4  |
+-------------+----------------------+--------------+-------+
| ELECTRONICS | TELEVISIONS          | TUBE         | NULL  |
| ELECTRONICS | TELEVISIONS          | LCD          | NULL  |
| ELECTRONICS | TELEVISIONS          | PLASMA       | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS  | FLASH |
| ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS   | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL  |
+-------------+----------------------+--------------+-------+

Most users at one time or another have dealt with hierarchical data in a SQL database and no doubt learned that the management of hierarchical data is not what a relational database is intended for. The tables of a relational database are not hierarchical (like XML), but are simply a flat list. Hierarchical data has a parent-child relationship that is not naturally represented in a relational database table. Read more

Refer the blog for more details.

EDIT:

select @pv:=category_id as category_id, name, parent from category
join
(select @pv:=19)tmp
where parent=@pv

Output:

category_id name    parent
19  category1   0
20  category2   19
21  category3   20
22  category4   21

Reference: How to do the Recursive SELECT query in Mysql?


Did the same thing for another quetion here

Mysql select recursive get all child with multiple level

The query will be :

SELECT GROUP_CONCAT(lv SEPARATOR ',') FROM (
  SELECT @pv:=(
    SELECT GROUP_CONCAT(id SEPARATOR ',')
    FROM table WHERE parent_id IN (@pv)
  ) AS lv FROM table 
  JOIN
  (SELECT @pv:=1)tmp
  WHERE parent_id IN (@pv)
) a;

Try these:

Table definition:

DROP TABLE IF EXISTS category;
CREATE TABLE category (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(20),
    parent_id INT,
    CONSTRAINT fk_category_parent FOREIGN KEY (parent_id)
    REFERENCES category (id)
) engine=innodb;

Experimental rows:

INSERT INTO category VALUES
(19, 'category1', NULL),
(20, 'category2', 19),
(21, 'category3', 20),
(22, 'category4', 21),
(23, 'categoryA', 19),
(24, 'categoryB', 23),
(25, 'categoryC', 23),
(26, 'categoryD', 24);

Recursive Stored procedure:

DROP PROCEDURE IF EXISTS getpath;
DELIMITER $$
CREATE PROCEDURE getpath(IN cat_id INT, OUT path TEXT)
BEGIN
    DECLARE catname VARCHAR(20);
    DECLARE temppath TEXT;
    DECLARE tempparent INT;
    SET max_sp_recursion_depth = 255;
    SELECT name, parent_id FROM category WHERE id=cat_id INTO catname, tempparent;
    IF tempparent IS NULL
    THEN
        SET path = catname;
    ELSE
        CALL getpath(tempparent, temppath);
        SET path = CONCAT(temppath, '/', catname);
    END IF;
END$$
DELIMITER ;

Wrapper function for the stored procedure:

DROP FUNCTION IF EXISTS getpath;
DELIMITER $$
CREATE FUNCTION getpath(cat_id INT) RETURNS TEXT DETERMINISTIC
BEGIN
    DECLARE res TEXT;
    CALL getpath(cat_id, res);
    RETURN res;
END$$
DELIMITER ;

Select example:

SELECT id, name, getpath(id) AS path FROM category;

Output:

+----+-----------+-----------------------------------------+
| id | name      | path                                    |
+----+-----------+-----------------------------------------+
| 19 | category1 | category1                               |
| 20 | category2 | category1/category2                     |
| 21 | category3 | category1/category2/category3           |
| 22 | category4 | category1/category2/category3/category4 |
| 23 | categoryA | category1/categoryA                     |
| 24 | categoryB | category1/categoryA/categoryB           |
| 25 | categoryC | category1/categoryA/categoryC           |
| 26 | categoryD | category1/categoryA/categoryB/categoryD |
+----+-----------+-----------------------------------------+

Filtering rows with certain path:

SELECT id, name, getpath(id) AS path FROM category HAVING path LIKE 'category1/category2%';

Output:

+----+-----------+-----------------------------------------+
| id | name      | path                                    |
+----+-----------+-----------------------------------------+
| 20 | category2 | category1/category2                     |
| 21 | category3 | category1/category2/category3           |
| 22 | category4 | category1/category2/category3/category4 |
+----+-----------+-----------------------------------------+