Stupid Lucene Tricks: Hierarchies
(reposted from the now-sadly-extinct http://use.perl.org/use.perl.org/_Mark+Leighton+Fisher/journal/40449.html)
You can search on hierarchies in Lucene if your hierarchy can be represented as a path enumeration (a Dewey-Decimal-like style of encoding a path, like "001.014.003" for the 3rd grandchild of the 14th child of the 1st branch).
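As a minimal sketch of the encoding itself, a path can be built from 1-based sibling positions. The helper names `encode_path` and `parent_path` and the fixed width of 3 digits are assumptions for illustration, not part of Lucene. The zero-padding matters: without it, a prefix such as "1*" would also match "10" and "11".

```python
def encode_path(positions, width=3):
    """Join 1-based sibling positions into a zero-padded, dot-separated path.

    Hypothetical helper: the name and the default width are illustrative.
    """
    for p in positions:
        # Every segment must fit in the fixed width so prefixes stay unambiguous.
        if not 1 <= p < 10 ** width:
            raise ValueError(f"position {p} does not fit in {width} digits")
    return ".".join(f"{p:0{width}d}" for p in positions)


def parent_path(path):
    """Return the path with its last segment removed ('' for a top-level path)."""
    head, _, _ = path.rpartition(".")
    return head


print(encode_path([1, 14, 3]))     # 001.014.003
print(parent_path("001.014.003"))  # 001.014
```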
For example, a query like:
hierarchy:001
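would return only the direct children of the 1st branch (this reading assumes each document's hierarchy field holds the path of the node it sits under, matched as one exact term), while: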
hierarchy:001*
would return all descendants of the 1st branch.
To get only the children of a particular node, you specify only that node, like:
hierarchy:001.014.003
To get all of the descendants, you match everything that starts with that node's path:
hierarchy:001.014.003*
To get only the descendants after the children (grandchildren, etc.), you specify:
hierarchy:001.014.003.*
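The three query forms can be modeled without Lucene to see which paths each one selects. This is only a sketch under stated assumptions: the hierarchy field is treated as a single untokenized term, each document stores the path of the node it sits under, and the standard library's `fnmatchcase` stands in for Lucene's `*` wildcard. The document ids and the `search` helper are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical index: document id -> value of its "hierarchy" field.
DOCS = {
    "a": "001",
    "b": "001.014",
    "c": "001.014.003",
    "d": "001.014.003.002",
    "e": "001.014.003.002.001",
    "f": "002",
}


def search(pattern):
    """Return the ids whose hierarchy value matches the pattern.

    A pattern without '*' behaves like an exact term match;
    '*' matches zero or more characters.
    """
    return sorted(doc_id for doc_id, path in DOCS.items()
                  if fnmatchcase(path, pattern))


print(search("001.014.003"))    # ['c']            exact term only
print(search("001.014.003*"))   # ['c', 'd', 'e']  the term plus everything below it
print(search("001.014.003.*"))  # ['d', 'e']       only what lies below it
```

The trailing dot in the last pattern is what excludes the exact match, because a deeper path must continue with a "." after the node's own segments.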
2019-05-21: I haven't tried it, but it looks like you could do this right in Perl with the now-quiescent Apache Lucy loose port of Lucene.