Bulk taxonomy import term-by-term is just too slow

Struggling with taxonomy import.

After stripping everything out, my conclusion is that taxonomy_save_term by itself, even without the hooks that add extra data, is simply too slow: it makes a minimum of three database calls per term (data, hierarchy, vocabulary).
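To illustrate why per-term calls add up, here is a minimal sketch of the alternative: writing a whole batch of terms in one transaction instead of several calls per term. It is written in Python against an in-memory SQLite database purely for illustration. It is not Drupal code and does not use taxonomy_save_term; the simplified two-table schema, the function names make_db and bulk_insert_terms, and the sample terms are all assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE term_data (
            tid INTEGER PRIMARY KEY,
            vid INTEGER,
            name TEXT,
            description TEXT
        );
        CREATE TABLE term_hierarchy (tid INTEGER, parent INTEGER);
        """
    )
    return conn


def bulk_insert_terms(conn, vid, terms):
    """Insert many terms at once.

    terms is a list of (tid, name, description, parent_tid) tuples.
    Both tables are filled with one executemany call each, inside a
    single transaction, rather than several calls per term.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO term_data (tid, vid, name, description) "
            "VALUES (?, ?, ?, ?)",
            [(tid, vid, name, desc) for tid, name, desc, _ in terms],
        )
        conn.executemany(
            "INSERT INTO term_hierarchy (tid, parent) VALUES (?, ?)",
            [(tid, parent) for tid, _, _, parent in terms],
        )
    return len(terms)


# Hypothetical sample data: two terms in vocabulary 7.
conn = make_db()
sample_terms = [
    (1, "biological process", "root term", 0),
    (2, "metabolic process", "child term", 1),
]
inserted = bulk_insert_terms(conn, 7, sample_terms)
term_count = conn.execute("SELECT COUNT(*) FROM term_data").fetchone()[0]
parent_of_2 = conn.execute(
    "SELECT parent FROM term_hierarchy WHERE tid = 2"
).fetchone()[0]
```

Whether a comparable batch insert is practical inside Drupal depends on which hooks and caches the import still needs to trigger, which this sketch ignores.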

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1048576 bytes) in /RCS/agaric/agaric-sites/scf/includes/bootstrap.inc on line 836

That limit, 134,217,728 bytes, is 128 MB.

I increased the limit to 256 MB. It ran for a very long time, then returned this:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 42949104 bytes) in /RCS/agaric/agaric-sites/scf/includes/common.inc on line 317
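As a quick check of the numbers in the two errors, the byte counts can be converted to megabytes. This is a small Python sketch for the arithmetic only; the helper name to_mib is made up for the example.

```python
# Convert the byte counts from the two PHP fatal errors into MiB.
BYTES_PER_MIB = 1024 * 1024


def to_mib(num_bytes):
    """Return a byte count expressed in MiB (1 MiB = 1,048,576 bytes)."""
    return num_bytes / BYTES_PER_MIB


first_limit = to_mib(134217728)   # memory limit in the first error
second_limit = to_mib(268435456)  # memory limit after the increase
failed_alloc = to_mib(42949104)   # single allocation that failed in common.inc

print(first_limit, second_limit, round(failed_alloc, 1))
```

The two limits come out at exactly 128 and 256, and the allocation that failed the second time was a single request of roughly 41 MB.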

It imported more than 16,000 terms into the Biological processes vocabulary alone: http://localhost/scf/admin/content/taxonomy/7?page=159

(161 pages with 100 on each full page)

More than 2,300 terms into Cellular components
http://localhost/scf/admin/content/taxonomy/8?page=23

And more than 9,100 terms into the Molecular functions vocabulary.
http://localhost/scf/admin/content/taxonomy/9?page=91

I think it imported everything and then choked on the relationship processing that was already part of taxonomy_xml.

But I lost my descriptions.

I haven't quite been able to follow what canonicize accomplishes when it creates $term->predicates.

Ah, ok, it's what makes the taxonomy_xml_add_all_children_to_queue($term) function work.

Resolution

Searched words: 
mass add terms
