Convert a series of parent-child relationships into a hierarchical tree?

I have a bunch of name-parentname pairs, that I'd like to turn into as few heirarchical tree structures as possible. So for example, these could be the pairings:

Child : Parent
    H : G
    F : G
    G : D
    E : D
    A : E
    B : C
    C : E
    D : NULL

Which needs to be transformed into (a) heirarchical tree(s):

D
├── E
│   ├── A
│   │   └── B
│   └── C   
└── G
    ├── F
    └── H

The end result that I want is a nested set of <ul> elements, with each <li> containing the child's name.

There are no inconsistencies in the pairings (child is it's own parent, parent is child's child, etc), so a bunch of optimizations can probably be made.

How, in PHP, would I go from an array containing child=>parent pairs, to a set of Nested <ul>s?

I have a feeling that recursion is involved, but I'm not quite awake enough to think it through.


Solution 1:

This requires a very basic recursive function to parse the child/parent pairs to a tree structure and another recursive function to print it out. Only one function would suffice but here's two for clarity (a combined function can be found at the end of this answer).

First initialize the array of child/parent pairs:

$tree = array(
    'H' => 'G',
    'F' => 'G',
    'G' => 'D',
    'E' => 'D',
    'A' => 'E',
    'B' => 'C',
    'C' => 'E',
    'D' => null
);

Then the function that parses that array into a hierarchical tree structure:

function parseTree($tree, $root = null) {
    $return = array();
    # Traverse the tree and search for direct children of the root
    foreach($tree as $child => $parent) {
        # A direct child is found
        if($parent == $root) {
            # Remove item from tree (we don't need to traverse this again)
            unset($tree[$child]);
            # Append the child into result array and parse its children
            $return[] = array(
                'name' => $child,
                'children' => parseTree($tree, $child)
            );
        }
    }
    return empty($return) ? null : $return;    
}

And a function that traverses that tree to print out an unordered list:

function printTree($tree) {
    if(!is_null($tree) && count($tree) > 0) {
        echo '<ul>';
        foreach($tree as $node) {
            echo '<li>'.$node['name'];
            printTree($node['children']);
            echo '</li>';
        }
        echo '</ul>';
    }
}

And the actual usage:

$result = parseTree($tree);
printTree($result);

Here's the contents of $result:

Array(
    [0] => Array(
        [name] => D
        [children] => Array(
            [0] => Array(
                [name] => G
                [children] => Array(
                    [0] => Array(
                        [name] => H
                        [children] => NULL
                    )
                    [1] => Array(
                        [name] => F
                        [children] => NULL
                    )
                )
            )
            [1] => Array(
                [name] => E
                [children] => Array(
                    [0] => Array(
                        [name] => A
                        [children] => NULL
                    )
                    [1] => Array(
                        [name] => C
                        [children] => Array(
                            [0] => Array(
                                [name] => B
                                [children] => NULL
                            )
                        )
                    )
                )
            )
        )
    )
)

If you want a bit more efficiency, you can combine those functions into one and reduce the number of iterations made:

function parseAndPrintTree($root, $tree) {
    $return = array();
    if(!is_null($tree) && count($tree) > 0) {
        echo '<ul>';
        foreach($tree as $child => $parent) {
            if($parent == $root) {                    
                unset($tree[$child]);
                echo '<li>'.$child;
                parseAndPrintTree($child, $tree);
                echo '</li>';
            }
        }
        echo '</ul>';
    }
}

You'll only save 8 iterations on a dataset as small as this but on bigger sets it could make a difference.

Solution 2:

Yet Another Function To Make A Tree (no recursion involved, uses references instead):

$array = array('H' => 'G', 'F' => 'G', ..., 'D' => null);

function to_tree($array)
{
    $flat = array();
    $tree = array();

    foreach ($array as $child => $parent) {
        if (!isset($flat[$child])) {
            $flat[$child] = array();
        }
        if (!empty($parent)) {
            $flat[$parent][$child] =& $flat[$child];
        } else {
            $tree[$child] =& $flat[$child];
        }
    }

    return $tree;
}

Returns an hierarchical array like this one:

Array(
    [D] => Array(
        [G] => Array(
            [H] => Array()
            [F] => Array()
        )
        ...
    )
)

Which can easily be printed as a HTML list using recursive function.

Solution 3:

Another, more simplified way to convert the flat structure in the $tree into a hierarchy. Only one temporary array is needed to expose it:

// add children to parents
$flat = array(); # temporary array
foreach ($tree as $name => $parent)
{
    $flat[$name]['name'] = $name; # self
    if (NULL === $parent)
    {
        # no parent, is root element, assign it to $tree
        $tree = &$flat[$name]; 
    }
    else
    {
        # has parent, add self as child    
        $flat[$parent]['children'][] = &$flat[$name];
    }
}
unset($flat);

That's all for getting the hierarchy into a multidimensional array:

Array
(
    [children] => Array
        (
            [0] => Array
                (
                    [children] => Array
                        (
                            [0] => Array
                                (
                                    [name] => H
                                )

                            [1] => Array
                                (
                                    [name] => F
                                )

                        )

                    [name] => G
                )

            [1] => Array
                (
                    [name] => E
                    [children] => Array
                        (
                            [0] => Array
                                (
                                    [name] => A
                                )

                            [1] => Array
                                (
                                    [children] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [name] => B
                                                )

                                        )

                                    [name] => C
                                )

                        )

                )

        )

    [name] => D
)

The output is less trivial if you want to avoid recursion (can be a burden with large structures).

I always wanted to solve the UL/LI "dilemma" for outputting an array. The dilemma is, that each item does not know whether or not children will follow up or how many preceding elements need to be closed. In another answer I already solved that by using a RecursiveIteratorIterator and looking for getDepth() and other meta-information that my own written Iterator provided: Getting nested set model into a <ul> but hiding “closed” subtrees. That answer shows as well that with iterators you're quite flexible.

However that was a pre-sorted list, so would not be suitable for your example. Additionally I always wanted to solve this for a sort of standard tree structure and HTML's <ul> and <li> elements.

The basic concept I came up is the following:

  1. TreeNode - Abstracts each element into a simple TreeNode type that can provide it's value (e.g. Name) and whether or not it has children.
  2. TreeNodesIterator - A RecursiveIterator that is able to iterate over a set (array) of these TreeNodes. That is fairly simple as the TreeNode type already knows if it has children and which ones.
  3. RecursiveListIterator - A RecursiveIteratorIterator that has all the events needed when it recursively iterate over any kind of RecursiveIterator:
    • beginIteration / endIteration - Begin and end of the main list.
    • beginElement / endElement - Begin and end of each element.
    • beginChildren / endChildren - Begin and end of each children list. This RecursiveListIterator only provides these events in a form of function calls. children lists, as it is typical for <ul><li> lists, are opened and closed inside it's parent <li> element. Therefore the endElement event is fired after the according endChildren event. This could be changed or made configurable to broaden the use this class. The events are distributed as function calls to a decorator object then, to keep things apart.
  4. ListDecorator - A "decorator" class that is just a receiver of the events of RecursiveListIterator.

I start with the main output logic. Taken the now hierarchical $tree array, the final code looks like the following:

$root = new TreeNode($tree);
$it = new TreeNodesIterator(array($root));
$rit = new RecursiveListIterator($it);
$decor = new ListDecorator($rit);
$rit->addDecorator($decor);

foreach($rit as $item)
{
    $inset = $decor->inset(1);
    printf("%s%s\n", $inset, $item->getName());
}

First let's look into the ListDecorator that simply wraps the <ul> and <li> elements and is deciding about how the list structure is output:

class ListDecorator
{
    private $iterator;
    public function __construct(RecursiveListIterator $iterator)
    {
        $this->iterator = $iterator;
    }
    public function inset($add = 0)
    {
        return str_repeat('  ', $this->iterator->getDepth()*2+$add);
    }

The constructor takes the list iterator it's working on. inset is just a helper function for nice indentation of the output. The rest are just the output functions for each event:

    public function beginElement()
    {
        printf("%s<li>\n", $this->inset());
    }
    public function endElement()
    {
        printf("%s</li>\n", $this->inset());
    }
    public function beginChildren()
    {
        printf("%s<ul>\n", $this->inset(-1));
    }
    public function endChildren()
    {
        printf("%s</ul>\n", $this->inset(-1));
    }
    public function beginIteration()
    {
        printf("%s<ul>\n", $this->inset());
    }
    public function endIteration()
    {
        printf("%s</ul>\n", $this->inset());
    }
}

With these output functions in mind, this is the main output wrap-up / loop again, I go through it step by step:

$root = new TreeNode($tree);

Create the root TreeNode which will be used to start iteration upon:

$it = new TreeNodesIterator(array($root));

This TreeNodesIterator is a RecursiveIterator that enables recursive iteration over the single $root node. It is passed as an array because that class needs something to iterate over and allows re-use with a set of children which is also an array of TreeNode elements.

$rit = new RecursiveListIterator($it);

This RecursiveListIterator is a RecursiveIteratorIterator that provides the said events. To make use of it, only a ListDecorator needs to be provided (the class above) and assigned with addDecorator:

$decor = new ListDecorator($rit);
$rit->addDecorator($decor);

Then everything is set-up to just foreach over it and output each node:

foreach($rit as $item)
{
    $inset = $decor->inset(1);
    printf("%s%s\n", $inset, $item->getName());
}

As this example shows, the whole output logic is encapsulated in the ListDecorator class and this single foreach. The whole recursive traversal has been fully encapsulated into SPL recursive iterators which provided a stacked procedure, that means internally no recursion function calls are done.

The event based ListDecorator allows you to modify the output specifically and to provide multiple type of lists for the same data structure. It's even possible to change the input as the array data has been encapsulated into TreeNode.

The full code example:

<?php
namespace My;

$tree = array('H' => 'G', 'F' => 'G', 'G' => 'D', 'E' => 'D', 'A' => 'E', 'B' => 'C', 'C' => 'E', 'D' => null);

// add children to parents
$flat = array(); # temporary array
foreach ($tree as $name => $parent)
{
    $flat[$name]['name'] = $name; # self
    if (NULL === $parent)
    {
        # no parent, is root element, assign it to $tree
        $tree = &$flat[$name];
    }
    else
    {
        # has parent, add self as child    
        $flat[$parent]['children'][] = &$flat[$name];
    }
}
unset($flat);

class TreeNode
{
    protected $data;
    public function __construct(array $element)
    {
        if (!isset($element['name']))
            throw new InvalidArgumentException('Element has no name.');

        if (isset($element['children']) && !is_array($element['children']))
            throw new InvalidArgumentException('Element has invalid children.');

        $this->data = $element;
    }
    public function getName()
    {
         return $this->data['name'];
    }
    public function hasChildren()
    {
        return isset($this->data['children']) && count($this->data['children']);
    }
    /**
     * @return array of child TreeNode elements 
     */
    public function getChildren()
    {        
        $children = $this->hasChildren() ? $this->data['children'] : array();
        $class = get_called_class();
        foreach($children as &$element)
        {
            $element = new $class($element);
        }
        unset($element);        
        return $children;
    }
}

class TreeNodesIterator implements \RecursiveIterator
{
    private $nodes;
    public function __construct(array $nodes)
    {
        $this->nodes = new \ArrayIterator($nodes);
    }
    public function  getInnerIterator()
    {
        return $this->nodes;
    }
    public function getChildren()
    {
        return new TreeNodesIterator($this->nodes->current()->getChildren());
    }
    public function hasChildren()
    {
        return $this->nodes->current()->hasChildren();
    }
    public function rewind()
    {
        $this->nodes->rewind();
    }
    public function valid()
    {
        return $this->nodes->valid();
    }   
    public function current()
    {
        return $this->nodes->current();
    }
    public function key()
    {
        return $this->nodes->key();
    }
    public function next()
    {
        return $this->nodes->next();
    }
}

class RecursiveListIterator extends \RecursiveIteratorIterator
{
    private $elements;
    /**
     * @var ListDecorator
     */
    private $decorator;
    public function addDecorator(ListDecorator $decorator)
    {
        $this->decorator = $decorator;
    }
    public function __construct($iterator, $mode = \RecursiveIteratorIterator::SELF_FIRST, $flags = 0)
    {
        parent::__construct($iterator, $mode, $flags);
    }
    private function event($name)
    {
        // event debug code: printf("--- %'.-20s --- (Depth: %d, Element: %d)\n", $name, $this->getDepth(), @$this->elements[$this->getDepth()]);
        $callback = array($this->decorator, $name);
        is_callable($callback) && call_user_func($callback);
    }
    public function beginElement()
    {
        $this->event('beginElement');
    }
    public function beginChildren()
    {
        $this->event('beginChildren');
    }
    public function endChildren()
    {
        $this->testEndElement();
        $this->event('endChildren');
    }
    private function testEndElement($depthOffset = 0)
    {
        $depth = $this->getDepth() + $depthOffset;      
        isset($this->elements[$depth]) || $this->elements[$depth] = 0;
        $this->elements[$depth] && $this->event('endElement');

    }
    public function nextElement()
    {
        $this->testEndElement();
        $this->event('{nextElement}');
        $this->event('beginElement');       
        $this->elements[$this->getDepth()] = 1;
    } 
    public function beginIteration()
    {
        $this->event('beginIteration');
    }
    public function endIteration()
    {
        $this->testEndElement();
        $this->event('endIteration');       
    }
}

class ListDecorator
{
    private $iterator;
    public function __construct(RecursiveListIterator $iterator)
    {
        $this->iterator = $iterator;
    }
    public function inset($add = 0)
    {
        return str_repeat('  ', $this->iterator->getDepth()*2+$add);
    }
    public function beginElement()
    {
        printf("%s<li>\n", $this->inset(1));
    }
    public function endElement()
    {
        printf("%s</li>\n", $this->inset(1));
    }
    public function beginChildren()
    {
        printf("%s<ul>\n", $this->inset());
    }
    public function endChildren()
    {
        printf("%s</ul>\n", $this->inset());
    }
    public function beginIteration()
    {
        printf("%s<ul>\n", $this->inset());
    }
    public function endIteration()
    {
        printf("%s</ul>\n", $this->inset());
    }
}


$root = new TreeNode($tree);
$it = new TreeNodesIterator(array($root));
$rit = new RecursiveListIterator($it);
$decor = new ListDecorator($rit);
$rit->addDecorator($decor);

foreach($rit as $item)
{
    $inset = $decor->inset(2);
    printf("%s%s\n", $inset, $item->getName());
}

Outpupt:

<ul>
  <li>
    D
    <ul>
      <li>
        G
        <ul>
          <li>
            H
          </li>
          <li>
            F
          </li>
        </ul>
      </li>
      <li>
        E
        <ul>
          </li>
          <li>
            A
          </li>
          <li>
            C
            <ul>
              <li>
                B
              </li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

Demo (PHP 5.2 variant)

A possible variant would be an iterator that iterates over any RecursiveIterator and provides an iteration over all events that can occur. An switch / case inside the foreach loop could then deal with the events.

Related:

  • Nested List with RecursiveArrayIterator
  • The RecursiveIteratorIterator class