Space Cat, Prince Among Thieves

Simple Single Pass Doubly Linked Flat Tree Building Algorithm

A hierarchy is a common structure, especially for structuring pages in web development. A problem with hierarchies is they often need to be flattened to be stored in a relational database, and then expanded again after being pulled out.

An Example Hierarchy

  • Top level item
    • Sub Item
      • Sub Sub Item
        • Sub Sub Sub Item
    • Sibling of "Sub Item"
  • Top level as well

A common and simple way to store them is with a table structure similar to the following.

id parent_id title
1 0 Top level item
2 0 Top level as well
3 1 Sub Item
4 1 Sibling of "Sub Item"
5 3 Sub Sub Item
6 6 Sub Sub Sub Item

Now there are several ways to process this into an array or other navigable structure. A painfully common and naïve way would be recursive queries ala:

function fetch_items($parent = 0) {
    $data = array();
    $qry = mysql_query("select * from `tablename` where parent_id = " . (int)$parent );
    while( $row = mysql_fetch_array($qry) ) {
        $data = array('data' => $row, 'children' => fetch_items($row['id']));
    }
    return $data;
}

The problem with this approach is the sheer number of queries this approach uses. Our simple example hierarchy would execute 6 queries, whereas more complicated hierarchies would execute many more.

A vastly superior approach is to pull all the data out with a single query, and then process it into the structure we need. There are several ways to do this, but I find the following DOM like structure this generates incredibly useful.

function fetch_tree() {
    $data = array();
    $qrt = mysql_query("select * from `tablename` order by parent_id asc");
    while( $row = mysql_fetch_array($qry) ) {
        $data[ $row['id'] ]['data'] = $row;
        $data[ $row['parent_id'] ]['children'][ $row['id'] ] =& $data[ $row['id'] ];
        $data[ $row['id'] ]['parent'] =& $data[ $row['parent_id'] ];
    }
    return $data;
}

Essentially, in a single pass through the data, this gets us an array of hierarchy elements, where rather than an element directly having its children, every entry has a pointer to its parent element as well as an array of pointers to its children. This is incredibly flexible, and as the tree is flattened it allows you to start at any point in the hierarchy by ID ala $data[<$id>], and navigate your way up or down. Viewing any elements children is as simple as $data[<$id>]['children'], and its parent as simple as $data[<$id>]['parent'].

I use this basic methodology a ton in my work, personal and professional (the comments and sitemap of this site are powered by it!) but the concept seems foreign when I try to explain it to people. I hope this explanation sparks some ideas for people.



Email address will never be publicly visible.

Basic HTML allowed.