User login

Determining all possible parent paths for an arbitrary-depth one-to-many categorization structure

Wherein Ben discovers he doesn't know computer programming at all.

The community-managed taxonomy module needs to know all possible parent paths for an arbitrary-depth categorization structure. Even though taxonomy terms have a many-to-many relationship, when tracing back from one term to its parent(s), and that parent's parent(s), and so on to root level, it's one-to-many from that perspective. Barring illegal loops, there must be a finite number of possible ancestor paths, but all my attempts to walk through them have not quite resulted in predictable, correct results

Basically I want to display every possible one-to-one lineage for an item with ancestors.

Everything I tried, and asked, and was answered about this question is below.

When I have the answer it will be linked here.

Most recent desperate plea:

I know this is programming 101, but it's eating my brain.

A taxonomy term can have n parents, and each parent can have n parents, out to an arbitrary level. This is described in two columns, tid and parent, of the cmt_term_hierarchy (same as term_hierarchy in taxonomy.module)

For display purposes I need to change this term "tree" (or "roots") into something like, for term 100 say with parents term 50 and 80:

100 < 50 < 49 < 12 < 0
100 < 80 < 10 < 23 < 8 < 0
100 < 80 < 10 < 18 < 0
100 < 80 < 10 < 9 < 0
100 < 80 < 0

I had had a function I thought did this at my first commit to the repository way back (labeled funkiest function at the time) but it never quite worked, and neither did anything else I tried, and I pretty much blew my chances of recovering with a decent prototype trying to figure this out.

Initial resources

This conversation with Brian probably has the answer, if I could understand it: Representing a complex ancestry hierarchy for a point as multiple linear point-to-single-point hierarchies (answers from a C programmer).

Some more possible resources for "drupal get all possible hierarchies for a term"

http://vsmith.info/DrupalTaxonomy
http://www.greenash.net.au/posts/thoughts/cross-vocabulary-taxonomy-hierarchies

taxonomy hierachies: http://drupal.org/node/58451

Building a hierarchical site with taxonomy, having breadcrumb top-level link to all terms
http://drupal.org/node/41349

I also looked at everything in my PHP books about menu trees, recursion, and design patterns. Nothing I could use.

This article by Gijs Van Tulder restates more clearly what I've read repeatedly elsewhere:

Storing trees is a common problem, with multiple solutions. There are two major approaches: the adjacency list model, and the modified preorder tree traversal algorithm.

http://www.sitepoint.com/article/hierarchical-data-database

But I do not see how to apply it to multiple hierarchies, to terms that can have more than one parent. According to quick searches on "modified preorder tree traversal" "multiple parents" it can be modified to work with many parents per item. And where else would someone (bdragon) try to do this, but in Drupal?

Idea: Bloody fast trees is the closest thing to the answer I've found... right in that metaphorical own backyard.

But I'm not sure even implementing a modified-for-multiple-parents modified pre-order tree traversal solves much of the problem of pulling out each of the distinct paths the hierarchies can create. (Of course, I'm not sure if gravity pulls one toward the earth anymore...)

I learned a lot, but did not find how to create lists of all possible ancestor paths in a multiple parentage hierarchical vocabulary, in these posts:

After I tried all (but the last) below, my mentor wrote:

Write a recursive function, that catches all possibilites.
To prevent loops, mark all "used" items in the tree and do nothing but
returning if you find a mark.

I know that's a bit unspecific, but that's all I can do about it except
coding it ;)

regards,
fago

As yet, it is too vague for me.

Attempts

Here are all the functions that I tried and didn't delete in disgust. I thought I was so close for so long. In fact, I still think I'm close...

<?php
/**
 * Create a hierarchical representation of a vocabulary.
 *
 * @param $vid
 *   Which vocabulary to generate the tree for.
 *
 * @param $parent
 *   The term ID under which to generate the tree. If 0, generate the tree
 *   for the entire vocabulary.
 *
 * @param $depth
 *   Internal use only.
 *
 * @param $max_depth
 *   The number of levels of the tree to return. Leave NULL to return all levels.
 *
 * @return
 *   An array of all term objects in the tree. Each term object is extended
 *   to have "depth" and "parents" attributes in addition to its normal ones.
 *   Results are statically cached.
 */
function cmt_get_tree($vid, $parent = 0, $depth = -1, $max_depth = NULL) {
  static
$children, $parents, $terms;
 
$depth++;
 
// We cache trees, so it's not CPU-intensive to call get_tree() on a term
  // and its children, too.
 
if (!isset($children[$vid])) {
   
$children[$vid] = array();
   
$result = db_query('SELECT t.tid, t.*, parent FROM {cmt_term_data} t INNER JOIN {cmt_term_vocab} v ON t.tid = v.tid INNER JOIN {cmt_term_hierarchy} h ON t.tid = h.tid WHERE v.vid = %d ORDER BY weight, name', $vid);
    while (
$term = db_fetch_object($result)) {
     
$children[$vid][$term->parent][] = $term->tid;
     
$parents[$vid][$term->tid][] = $term->parent;
     
$terms[$vid][$term->tid] = $term;
    }
  }
 
$max_depth = (is_null($max_depth)) ? count($children[$vid]) : $max_depth;
  if (
$children[$vid][$parent]) {
    foreach (
$children[$vid][$parent] as $child) {
      if (
$max_depth > $depth) {
       
$term = drupal_clone($terms[$vid][$child]);
       
$term->depth = $depth;
       
// The "parent" attribute is not useful, as it would show one parent only.
       
unset($term->parent);
       
$term->parents = $parents[$vid][$child];
       
$tree[] = $term;
        if (
$children[$vid][$child]) {
         
$tree = array_merge($tree, cmt_get_tree($vid, $child, $depth, $max_depth));
        }
      }
    }
  }
  return
$tree ? $tree : array();
}

/**
 * Create a hierarchical representation of a vocabulary.
 *
 * @param $vid
 *   Which vocabulary to generate the tree for.
 *
 * @param $parent
 *   The term ID under which to generate the tree. If 0, generate the tree
 *   for the entire vocabulary.
 *
 * @param $depth
 *   Internal use only.
 *
 * @param $max_depth
 *   The number of levels of the tree to return. Leave NULL to return all levels.
 *
 * @return
 *   An array of all term objects in the tree. Each term object is extended
 *   to have "depth" and "parents" attributes in addition to its normal ones.
 *   Results are statically cached.
 */
function cmt_get_roots($vid, $tid = 0, $depth = -1, $max_depth = NULL) {
  static
$ancestors, $parents, $terms;
 
$depth++;
 
// We cache trees, so it's not CPU-intensive to call get_tree() on a term
  // and its children, too.
 
if (TRUE || !isset($ancestors[$vid][$tid])) {
   
$ancestors[$vid][$tid] = array();
   
$result = db_query('SELECT t.tid, t.*, parent FROM {cmt_term_data} t INNER JOIN {cmt_term_vocab} v ON t.tid = v.tid INNER JOIN {cmt_term_hierarchy} h ON t.tid = h.tid WHERE t.tid = %d AND v.vid = %d ORDER BY weight, name', $tid, $vid);
    while (
$term = db_fetch_object($result)) {
     
$ancestors[$vid][$tid][] = $term->parent;
/*     
      $children[$vid][$tid][$term->parent][] = $term->tid;
      $parents[$vid][$term->tid][] = $term->parent;
      $terms[$vid][$term->tid] = $term;
*/
   
}
  }
 
$max_depth = (is_null($max_depth)) ? count($ancestors[$vid][$tid]) : $max_depth;
  if (
$children[$vid][$parent]) {
    foreach (
$children[$vid][$parent] as $child) {
      if (
$max_depth > $depth) {
       
$term = drupal_clone($terms[$vid][$child]);
       
$term->depth = $depth;
       
// The "parent" attribute is not useful, as it would show one parent only.
       
unset($term->parent);
       
$term->parents = $parents[$vid][$child];
       
$roots[] = $term;
        if (
$children[$vid][$child]) {
         
$roots = array_merge($roots, cmt_get_roots($vid, $parent, $depth, $max_depth));
        }
      }
    }
  }
  return
$roots ? $roots : array();
}

function

cmt_get_ancestors($tid, $vid, &$ancestors, $stack=array(), $thread = 0) {
   
$result = db_query('SELECT t.tid, t.*, parent FROM {cmt_term_data} t INNER JOIN {cmt_term_vocab} v ON t.tid = v.tid INNER JOIN {cmt_term_hierarchy} h ON t.tid = h.tid WHERE t.tid = %d AND v.vid = %d ORDER BY weight, name', $tid, $vid);
   
$parents = array();
    while (
$term = db_fetch_object($result)) {
     
$parents[] = $term->parent;
    }
// agaric_a($parents);   
   
if (count($parents) == 0) {
     
$ancestors[$thread] = $stack;
     
$stack = array();
     
$thread++;
    }
    else {
      foreach (
$parents as $parent) {
       
$stack[] = cmt_get_ancestors($parent, $vid, $ancestors, $stack, $thread);
      }
    }
  return
$parents
}

function

cmt_brian($tid) {
 
}

/**
 * I am the funkiest function.   ... and I don't work.  Anyone want to explain why?
 */
function cmt_get_hierarchies($tid) {
 
$term = cmt_get_term($tid);
 
$hierarchies = array();
 
$hierarchies[0] = array(0 => $term);
 
$root = 0;
 
$h = count($hierarchies);
  for (
$i = 0; $i < $h; $i++) {
    if (
$hierarchies[$i][count($hierarchies[$i])-1] == "root") {
     
$root++;
      if(
$root == $h) {
//        for($k=0;$k<$h;$k++) {
//          $hierarchies[$k] = array_reverse($hierarchies[$k]);
//        }
       
return $hierarchies;
      }
    }
    else {
     
$parents = array_values(cmt_get_parents($hierarchies[$i][count($hierarchies[$i])-1]->tid));
// agaric_a($parents);
     
$c = count($parents);
      if (
$c > 1) $copy = $hierarchies[$i];
      if (!
$parents[0]) $parents[0] = "root";
     
$hierarchies[$i][] = $parents[0];    
      for (
$j = 1; $j < $c; $j++) {
       
$hierarchies[$i+$j] = $copy;
        if(!
$parents[$j]) $parents[$j] = "root";
       
$hierarchies[$i+$j][] = $parents[$j];
      }
      if (
$hierarchies[$i][0] == "root")  unset($hierarchies[$i]);
     
$h = count($hierarchies);
     
$i = 0;
    }
  }
}

/**
 * Find all ancestors of a given term ID.
 */
function cmt_get_parents_all($tid) {
 
$parents = array();
  if (
$tid) {
   
$parents[] = cmt_get_term($tid);
   
$n = 0;
    while (
$parent = taxonomy_get_parents($parents[$n]->tid)) {
     
$parents = array_merge($parents, $parent);
     
$n++;
    }
  }
  return
$parents;
}
?>

Still trying...

<?php
function cmt_get_ladders($tid) {
 
// each level is either done, or has the tid (or count?) of the next term to trace
  // ladder[0] has levels 0, 1, 2, 3
  // level 2 had an extra tid (term 1 had two parents)
  // level 1 had two extra tid (had three parents)
  // when we reach level 3, the term given has no parents
  // return to the most recent fork, level 2
  // duplicate ladder [0] to ladder [1], unset everything above level
  // on ladder[1], follow level 2 to the point that it has no parents
  // return to the remaining fork, level 1
  // duplicate ladder
 
$ladder = 0;
 
$level = 0;
 
$lterm = array(0 => $tid);  // term we're currently dealing with by level
 
$ladders[$ladder][$level][$tid] = cmt_get_term($tid);
 
$ladders = cmt_build_ladder($tid, $ladders, $ladder, $level, $lterm);
  return
$ladders;
}

function

cmt_build_ladder($tid, $ladders = array(), $ladder = 0, $level = 0, $lterm = array()) {
drupal_set_message("tid: " . $tid . ", ladder: " . $ladder . ", level: ".$level);
agaric_a($ladders);
agaric_a($lterm);
if (
$ladder > 20 || $level > 25)  return $ladders;
  if (
$parents = cmt_get_parents($tid, 'tid')) {
   
$level = $level + 1;
   
$ladders[$ladder][$level] = $parents;
   
$terms = array_keys($parents);
   
sort($terms);  // sort function works by reference
   
$tid = $terms[0];
   
$lterm[$level] = $tid;
   
// $ladders[$ladder][$level] =
   
cmt_build_ladder($tid, $ladders, $ladder, $level, $lterm);
  }
  else {
// we reached the top!
   
for ($total_levels = $level; $level > 0; $level--) {
      if (
count($ladders[$ladder][$level]) > 1) {
       
// make new ladder and start counting back up
       
$old_ladder = $ladder;
       
$new_ladder = $ladder+1;
       
$ladders[$new_ladder] = array(0 => $ladders[$old_ladder][0]);  // the original term, so there'll be only one
       
for ($new_level = 1; $new_level <= $level; $new_level++) {
         
$ladders[$new_ladder][$new_level] = $ladders[$old_ladder][$new_level];
          foreach (
$ladders[$old_ladder][$new_level] AS $key => $value) {
            if (
$key != $lterm[$new_level]) {
              unset(
$ladders[$old_ladder][$new_level][$key]);
            }
          }
        }
       
$old_tid = $lterm[$level];
        unset(
$ladders[$new_ladder][$level][$old_tid]);
       
// get the next tid
       
$new_tid = _cmt_next_highest_key($ladders[$old_ladder][$level], $old_tid);
       
$lterm[$level] = $new_tid;
       
// $ladders =
       
cmt_build_ladder($new_tid, $ladders, $new_ladder, $level, $lterm);
      }
     
    }
   
// we reached the bottom of a clean stack!
   
return $ladders;
  }
}

/**
 * Get the next highest key in an array
 *
 * An extremely unoptimized function within an extremely unoptimized function.
 */
function _cmt_next_highest_key($array, $key = 0) {
 
$array = array_keys($array);
 
sort($array);
  foreach (
$array AS $array_key) {
    if (
$array_key > $key)  return $array_key;
  }
 
// there was no larger key
 
return FALSE;
}
?>

Comments

Why had no one ever pointed

Why had no one ever pointed me to this? Oh why oh why.

http://drupal.org/project/lineage

"The taxonomy hierarchy system, while powerful, cannot easily generate lists of nodes sorted by hierarchy depth. ...

It is very difficult to get those nodes in the right order. So much so, in fact, that it requires a recursive query algorithm."

At least I feel like less of a total idiot that this seemingly minor problem killed my summer of code project. If Earl Miles of Panels and Views fame wrote a module just to deal with the issue...

I think my solution in the sweet victory over this tricky listing all possible parental paths function is still better and probably different, also, as the nested trees method is for going down trees, not up multihierarchy trees...

Post new comment

The content of this field is kept private and will not be shown publicly.
  • You may post code using <code>...</code> (generic) or <?php ... ?> (highlighted PHP) tags.
  • You can use Markdown syntax to format and style the text. Also see Markdown Extra for tables, footnotes, and more.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd> <img> <blockquote> <small> <h2> <h3> <h4> <h5> <h6> <sub> <sup> <p> <br> <strike> <table> <tr> <td> <thead> <th> <tbody> <tt> <output>
  • Lines and paragraphs break automatically.

More information about formatting options

By submitting this form, you accept the Mollom privacy policy.