Elasticsearch Custom Scoring

By mateu on October 8, 2014 6:49 PM

Elasticsearch has a built-in scoring algorithm that works quite well in practice, but sometimes you want to roll your own. Let's examine how to create a custom scoring algorithm using the function score query.

Let's assume we want to search and score Perl job offers based on a set of weighted keyw…
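
To make the idea of weighted keywords concrete, here is a minimal sketch that builds a function score query body in Python. The keywords, their weights, and the field name description are hypothetical and not taken from the post. The filter and weight keys follow the function score query as documented for recent Elasticsearch releases; older releases may expect different key names.

```python
import json

# Hypothetical keywords and weights, for illustration only.
KEYWORD_WEIGHTS = {"moose": 3, "catalyst": 2, "dbix": 1}


def build_function_score_query(keyword_weights, field="description"):
    """Build a function_score query body with one weighted function per keyword.

    `field` is an assumed field name holding the job offer text.
    """
    functions = [
        {"filter": {"match": {field: keyword}}, "weight": weight}
        for keyword, weight in keyword_weights.items()
    ]
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": functions,
                # Add up the weights of every keyword function that matches.
                "score_mode": "sum",
                # Use the combined function score instead of the query score.
                "boost_mode": "replace",
            }
        }
    }


query = build_function_score_query(KEYWORD_WEIGHTS)
print(json.dumps(query, indent=2))
```

The printed JSON could be sent as the body of a search request. Summing the weights means an offer that mentions several keywords ranks above one that mentions a single keyword.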

Elasticsearch Token Filters

By mateu on September 29, 2014 9:48 PM

We recently saw an example of an elasticsearch token filter called the catalan_stemmer. The Catalan language has other token filters available:

catalan_stop
catalan_elision
catalan_keywords

Let's see what they do.

Stop

The catalan_stop filter removes a list of common words. Given the example search:

  porros amb balsàmic

applying the catalan_stop filter will remove the word amb.
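
The effect of a stop filter can be sketched in a few lines of Python. This is only an illustration of the idea, not how Elasticsearch implements it. The stop list below is a small hypothetical sample: only amb comes from the example above, the other entries are assumed, and the real catalan_stop list is much longer.

```python
# Hypothetical sample stop list; only "amb" is taken from the example above.
CATALAN_STOP_SAMPLE = {"amb", "de", "el", "la", "i"}


def remove_stop_words(tokens, stop_words=CATALAN_STOP_SAMPLE):
    """Return the tokens that are not in the stop list (case-insensitive)."""
    return [token for token in tokens if token.lower() not in stop_words]


tokens = "porros amb balsàmic".split()
filtered = remove_stop_words(tokens)
print(filtered)
```

Here the remaining tokens are porros and balsàmic, which matches the behaviour described for the example search.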

Elasticsearch Templates

By mateu on September 26, 2014 6:10 PM

When dealing with elasticsearch, one has to consider how to manage the analysis of the content that is ingested. Templates are a way to ease the burden of managing analyzer settings. Let's learn by example...

Catalan Stemmer

Here's a template that defines an analyzer, cat_stems, which uses the built-in Catalan stemmer. For example, both the singular porro and the plural porros are reduced to porr when analyzed by the stemmer. Moreover, this template will be applied to any index created with a name that starts with…
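
As a rough sketch of what such a template might look like, the Python snippet below builds a template body and prints it as JSON. The index pattern example-* is a hypothetical placeholder, since the actual prefix is not shown in this excerpt. The lowercase filter and the top-level template key are also assumptions: the template key follows the older index template format, while newer releases use index_patterns.

```python
import json

# Hypothetical placeholder; the real index name prefix is not given above.
INDEX_PATTERN = "example-*"

template = {
    # Older template format; newer releases use "index_patterns" instead.
    "template": INDEX_PATTERN,
    "settings": {
        "analysis": {
            "filter": {
                # Stemmer token filter configured for Catalan.
                "catalan_stemmer": {"type": "stemmer", "language": "catalan"}
            },
            "analyzer": {
                "cat_stems": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # "lowercase" is an assumed extra step before stemming.
                    "filter": ["lowercase", "catalan_stemmer"],
                }
            },
        }
    },
}

print(json.dumps(template, indent=2))
```

Any index whose name matches the pattern would pick up the cat_stems analyzer definition when it is created, so the settings do not have to be repeated per index.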

Token - Elasticsearch Analyze API

By mateu on September 19, 2014 12:07 AM

Yesterday we looked at an example of how to both index and search using elasticsearch. Today, we'll talk a little about what takes place during indexing, particularly tokenization. For example, what happens when we tokenize the phrase:

porros amb balsàmic

To find out, we can pass the phrase to the elasticsearch analyze API like so:

curl -XGET 'localhost:9200/_analyze?tokenizer=standard' -d 'porros amb balsàmic'
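
For intuition about what tokenization produces, here is a rough Python approximation that splits on whitespace and records character offsets. It is not the standard tokenizer, which follows Unicode text segmentation rules and reports more detail per token. The function name whitespace_tokens is hypothetical; for this particular phrase a plain whitespace split yields the same three words.

```python
import re


def whitespace_tokens(text):
    """Split text on whitespace, keeping each token's start and end offsets."""
    return [
        {"token": m.group(), "start_offset": m.start(), "end_offset": m.end()}
        for m in re.finditer(r"\S+", text)
    ]


tokens = whitespace_tokens("porros amb balsàmic")
for token in tokens:
    print(token)
```

Each printed entry pairs a token with its position in the original phrase, the kind of information the index needs to support highlighting and phrase matching.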

…
