Lucene Does Not Create Parse Trees

Lucene (in the Java and .NET versions, at least) does not create a parse tree when you use Lucene(.Net)?.QueryParser.QueryParser.Parse() to parse your queries. Instead, Lucene rewrites the query into a set of parenthesized terms, each of which is either optional, required, or prohibited. That is, the AND, OR, and NOT operators disappear: required terms get a + prefix, prohibited terms get a - prefix, and optional terms have no prefix.

For example: (ampicillin AND mcnc) OR (penicillin AND NOT scnc)
is rewritten into (assuming that the default operator is OR): (+ampicillin +mcnc) (+penicillin -scnc)
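
To make the prefix notation concrete, here is a minimal sketch in plain Java that models the rewritten form of the example above and prints it. It does not call Lucene at all: the names ClauseSketch, Node, Term, Group, and example() are hypothetical and exist only for this illustration. The Occur values are named after Lucene's BooleanClause.Occur constants (MUST, SHOULD, MUST_NOT), but the rendering logic is my own assumption about how the printed form is put together, not Lucene's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the rewritten query form; this is NOT Lucene's API.
class ClauseSketch {

    // Named after Lucene's BooleanClause.Occur values; the prefixes follow the post.
    enum Occur {
        MUST("+"), SHOULD(""), MUST_NOT("-");

        final String prefix;

        Occur(String prefix) {
            this.prefix = prefix;
        }
    }

    // A node is either a single term or a group of prefixed clauses.
    interface Node {
        String render();
    }

    static final class Term implements Node {
        final String text;

        Term(String text) {
            this.text = text;
        }

        public String render() {
            return text;
        }
    }

    static final class Group implements Node {
        private final List<Occur> occurs = new ArrayList<>();
        private final List<Node> nodes = new ArrayList<>();

        Group add(Occur occur, Node node) {
            occurs.add(occur);
            nodes.add(node);
            return this;
        }

        // Joins clauses with spaces; nested groups are wrapped in parentheses.
        public String render() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < nodes.size(); i++) {
                if (i > 0) {
                    sb.append(' ');
                }
                Node node = nodes.get(i);
                sb.append(occurs.get(i).prefix);
                sb.append(node instanceof Group ? "(" + node.render() + ")" : node.render());
            }
            return sb.toString();
        }
    }

    // Builds the rewritten form of:
    // (ampicillin AND mcnc) OR (penicillin AND NOT scnc)
    static String example() {
        Group left = new Group()
                .add(Occur.MUST, new Term("ampicillin"))
                .add(Occur.MUST, new Term("mcnc"));
        Group right = new Group()
                .add(Occur.MUST, new Term("penicillin"))
                .add(Occur.MUST_NOT, new Term("scnc"));
        return new Group()
                .add(Occur.SHOULD, left)
                .add(Occur.SHOULD, right)
                .render();
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints (+ampicillin +mcnc) (+penicillin -scnc)
    }
}
```

Each AND operand becomes a MUST clause, the NOT operand becomes a MUST_NOT clause, and the two OR operands become SHOULD clauses at the top level, which is why they carry no prefix.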

(Written up here because I couldn't find anything useful on the InterWeb about Lucene parse trees.)

(If I am wrong about this issue, please tell me.)

About Mark Leighton Fisher

Perl/CPAN user since 1992.