Find Parent Recursively using Query

I am using postgresql. I have the table as like below

parent_id    child_id
----------------------
101       102
103       104
104       105
105       106   

I want to write a sql query which will give the final parent of input.

i.e suppose i pass 106 as input then , its output will be 103.

(106 --> 105 --> 104 --> 103)

Here's a complete example. First the DDL:

test=> CREATE TABLE node (
test(>   id SERIAL,
test(>   label TEXT NOT NULL, -- name of the node
test(>   parent_id INT,
test(>   PRIMARY KEY(id)
test(> );
NOTICE:  CREATE TABLE will create implicit sequence "node_id_seq" for serial column "node.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "node_pkey" for table "node"
CREATE TABLE

...and some data...

test=> INSERT INTO node (label, parent_id) VALUES ('n1',NULL),('n2',1),('n3',2),('n4',3);
INSERT 0 4
test=> INSERT INTO node (label) VALUES ('garbage1'),('garbage2'), ('garbage3');
INSERT 0 3
test=> INSERT INTO node (label,parent_id) VALUES ('garbage4',6);
INSERT 0 1
test=> SELECT * FROM node;
id |  label   | parent_id 
----+----------+-----------
 1 | n1       |          
 2 | n2       |         1
 3 | n3       |         2
 4 | n4       |         3
 5 | garbage1 |          
 6 | garbage2 |          
 7 | garbage3 |          
 8 | garbage4 |         6
(8 rows)

This performs a recursive query on every id in node:

test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
 SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path 
 FROM node AS tn 
 WHERE tn.parent_id IS NULL
UNION ALL
 SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth, 
        (p.path || '->' || c.id::TEXT) 
 FROM nodes_cte AS p, node AS c 
 WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n ORDER BY n.id ASC;
id |  label   | parent_id | depth |    path    
----+----------+-----------+-------+------------
 1 | n1       |           |     1 | 1
 2 | n2       |         1 |     2 | 1->2
 3 | n3       |         2 |     3 | 1->2->3
 4 | n4       |         3 |     4 | 1->2->3->4
 5 | garbage1 |           |     1 | 5
 6 | garbage2 |           |     1 | 6
 7 | garbage3 |           |     1 | 7
 8 | garbage4 |         6 |     2 | 6->8
(8 rows)

This gets all of the descendents WHERE node.id = 1:

test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
 SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path FROM node AS tn WHERE tn.id = 1
UNION ALL                   
 SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth, (p.path || '->' || c.id::TEXT) FROM nodes_cte AS p, node AS c WHERE c.parent_id = p.id
)                                                                
SELECT * FROM nodes_cte AS n;
id | label | parent_id | depth |    path    
----+-------+-----------+-------+------------
 1 | n1    |           |     1 | 1
 2 | n2    |         1 |     2 | 1->2
 3 | n3    |         2 |     3 | 1->2->3
 4 | n4    |         3 |     4 | 1->2->3->4
(4 rows)

The following will get the path of the node with id 4:

test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
 SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path 
 FROM node AS tn 
 WHERE tn.parent_id IS NULL
UNION ALL
 SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth, 
        (p.path || '->' || c.id::TEXT) 
 FROM nodes_cte AS p, node AS c 
 WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n WHERE n.id = 4;
id | label | parent_id | depth |    path    
----+-------+-----------+-------+------------
 4 | n4    |         3 |     4 | 1->2->3->4
(1 row)

And let's assume you want to limit your search to descendants with a depth less than three (note that depth hasn't been incremented yet):

test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
  SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path 
  FROM node AS tn WHERE tn.id = 1
UNION ALL
  SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth, 
         (p.path || '->' || c.id::TEXT) 
  FROM nodes_cte AS p, node AS c 
  WHERE c.parent_id = p.id AND p.depth < 2
)
SELECT * FROM nodes_cte AS n;
 id | label | parent_id | depth | path 
----+-------+-----------+-------+------
  1 | n1    |           |     1 | 1
  2 | n2    |         1 |     2 | 1->2
(2 rows)

I'd recommend using an ARRAY data type instead of a string for demonstrating the "path", but the arrow is more illustrative of the parent<=>child relationship.


Use WITH RECURSIVE to create a Common Table Expression (CTE). For the non-recursive term, get the rows in which the child is immediately below the parent:

SELECT
   c.child_id,
   c.parent_id
FROM
   mytable c
LEFT JOIN
   mytable p ON c.parent_id = p.child_id
WHERE
   p.child_id IS NULL

 child_id | parent_id
----------+-----------
      102 |       101
      104 |       103

For the recursive term, you want the children of these children.

WITH RECURSIVE tree(child, root) AS (
   SELECT
      c.child_id,
      c.parent_id
   FROM
      mytable c
   LEFT JOIN
      mytable p ON c.parent_id = p.child_id
   WHERE
      p.child_id IS NULL
   UNION
   SELECT
      child_id,
      root
   FROM
      tree
   INNER JOIN
      mytable on tree.child = mytable.parent_id
)
SELECT * FROM tree;

 child | root
-------+------
   102 |  101
   104 |  103
   105 |  103
   106 |  103

You can filter the children when querying the CTE:

WITH RECURSIVE tree(child, root) AS (...) SELECT root FROM tree WHERE child = 106;

 root
------
  103