Stupid Lucene Tricks: Hierarchies

(reposted from the now-sadly-extinct http://use.perl.org/use.perl.org/_Mark+Leighton+Fisher/journal/40449.html)

You can search on hierarchies in Lucene if your hierarchy can be represented as a path enumeration (a Dewey-Decimal-like way of encoding a path, like "001.014.003" for the 3rd child of the 14th child of the 1st branch).
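As a minimal sketch of the encoding itself (not anything Lucene provides), the path can be built from a list of 1-based child positions. The names encode_path and decode_path and the fixed width of 3 digits are assumptions made for illustration, taken from the "001.014.003" example.

```python
def encode_path(indices, width=3):
    """Join 1-based child positions into a zero-padded, dot-separated path."""
    for i in indices:
        if not 0 < i < 10 ** width:
            raise ValueError(f"{i} does not fit in {width} digits")
    return ".".join(f"{i:0{width}d}" for i in indices)


def decode_path(path):
    """Split an encoded path back into integer child positions."""
    return [int(segment) for segment in path.split(".")]


path = encode_path([1, 14, 3])
print(path)               # 001.014.003
print(decode_path(path))  # [1, 14, 3]
```

Fixed-width segments keep every level the same length, which matters once you start matching on string prefixes.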

For example, a search phrase like:

hierarchy:001

would return only the direct children of the 1st branch, while:

hierarchy:001*

would return all descendants of the 1st branch.
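The difference between the two queries can be modeled in plain Python, without Lucene. This is only a sketch of the matching behavior, under two assumptions of mine: each document's hierarchy field holds the path of its parent node as a single untokenized term, and the sample values below are hypothetical.

```python
# Hypothetical sample data: each value is the parent path stored in one
# document's "hierarchy" field (an assumption about the index layout).
stored = ["001", "001", "001.014", "001.014.003", "002", "010.001"]


def term_match(values, term):
    """Model of an exact term query such as hierarchy:001."""
    return [v for v in values if v == term]


def prefix_match(values, prefix):
    """Model of a trailing-wildcard query such as hierarchy:001*."""
    return [v for v in values if v.startswith(prefix)]


print(term_match(stored, "001"))    # ['001', '001']
print(prefix_match(stored, "001"))  # ['001', '001', '001.014', '001.014.003']

# Without zero padding, the prefix "1" would also catch branch 10.
unpadded = ["1", "1.14", "10.1"]
print(prefix_match(unpadded, "1"))  # ['1', '1.14', '10.1']
```

The last three lines show why the zero-padded, fixed-width segments are worth the trouble: an unpadded prefix cannot tell branch 1 from branch 10.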

To get only the children of a particular node, you specify only that node, like:

hierarchy:001.014.003

To get all of the descendants you specify everything that starts with that node:

hierarchy:001.014.003*

To get only the descendants below the children (grandchildren and deeper), you add the separator before the wildcard:

hierarchy:001.014.003.*
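The three query shapes above can be generated from a node's path. The sketch below builds the query strings and checks them against a tiny in-memory model, assuming (my reading, not something stated outright) that each document's hierarchy field stores the path of its parent. The helper names and the sample documents are hypothetical, and the matcher only imitates exact and trailing-wildcard matching, not Lucene's real query parser.

```python
def children_query(node, field="hierarchy"):
    """Exact match: documents whose parent is this node."""
    return f"{field}:{node}"


def descendants_query(node, field="hierarchy"):
    """Prefix match: children and everything below them."""
    return f"{field}:{node}*"


def deeper_descendants_query(node, field="hierarchy"):
    """Prefix match past the separator: grandchildren and below."""
    return f"{field}:{node}.*"


def matches(query, stored_value):
    """Toy matcher: a trailing * means prefix match, otherwise exact match."""
    _, _, pattern = query.partition(":")
    if pattern.endswith("*"):
        return stored_value.startswith(pattern[:-1])
    return stored_value == pattern


def select(query, docs):
    """Return the names of the documents whose stored path matches."""
    return [name for name, value in docs.items() if matches(query, value)]


# Hypothetical documents, keyed by name, valued by their parent's path.
docs = {
    "child": "001.014.003",
    "grandchild": "001.014.003.002",
    "great-grandchild": "001.014.003.002.007",
    "cousin": "001.014.004",
}

node = "001.014.003"
print(select(children_query(node), docs))            # ['child']
print(select(descendants_query(node), docs))         # ['child', 'grandchild', 'great-grandchild']
print(select(deeper_descendants_query(node), docs))  # ['grandchild', 'great-grandchild']
```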

2019-05-21: I haven't tried it, but it looks like you could do this right in Perl with Apache Lucy, the now-quiescent loose port of Lucene.

About Mark Leighton Fisher

Perl/CPAN user since 1992.