Elasticsearch Token Filters

We recently saw an example of an Elasticsearch token filter called catalan_stemmer. Other token filters are available for the Catalan language:

  • catalan_stop
  • catalan_elision
  • catalan_keywords

Let's see what they do.

Stop

The catalan_stop filter removes a list of common words (stop words). Given the example search:

  porros amb balsàmic

applying the catalan_stop filter removes the word amb (with) before the tokens are indexed.

This stop filter is defined as:
  "catalan_stop": {
    "type":       "stop",
    "stopwords":  "_catalan_" 
  }

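and it is customizable: the stopwords setting accepts either a predefined list such as _catalan_ or an array of your own words.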
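
To make the effect concrete, here is a minimal Python sketch of what a stop filter does to a token stream. It does not talk to Elasticsearch. The three-word stop list is an assumption for illustration, not the real _catalan_ list, and remove_stopwords is a hypothetical helper name.

```python
# Toy stop list (assumption): the real _catalan_ list is much longer.
TOY_CATALAN_STOPWORDS = {"amb", "de", "i"}


def remove_stopwords(tokens, stopwords=TOY_CATALAN_STOPWORDS):
    """Return the tokens that are not in the stop list, in original order."""
    return [token for token in tokens if token not in stopwords]
```

Calling remove_stopwords(["porros", "amb", "balsàmic"]) keeps porros and balsàmic and drops amb, which mirrors the example search above.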

Elision

The catalan_elision filter removes elisions. Given the example search:

  Amanida d'escalivada

applying the catalan_elision filter strips the leading d' before the token is indexed.

The catalan_elision filter is defined as:
  "catalan_elision": {
    "type": "elision",
    "articles": [ "d", "l", "m", "n", "s", "t"]
  }
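
The behaviour can be approximated in a few lines of Python. This is only a sketch of the idea, not the Lucene implementation: strip_elision is a hypothetical name, and comparing the prefix case-insensitively is an assumption made here so that a capitalised L' is also handled.

```python
# Articles copied from the catalan_elision definition above.
ARTICLES = {"d", "l", "m", "n", "s", "t"}


def strip_elision(token, articles=ARTICLES):
    """Remove a leading article plus apostrophe, e.g. d'escalivada -> escalivada."""
    # Try both the ASCII apostrophe and the typographic one.
    for apostrophe in ("'", "\u2019"):
        prefix, found, rest = token.partition(apostrophe)
        # Lowercasing the prefix is an assumption of this sketch.
        if found and rest and prefix.lower() in articles:
            return rest
    return token
```

Tokens without an elided article, such as Amanida, pass through unchanged.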

Keywords

The catalan_keywords filter protects selected words from being stemmed. An example definition of the keywords filter is:

  "catalan_keywords": {
    "type":       "keyword_marker",
    "keywords":   ["porró"] 
  }

In this example, the word porró would not be stemmed to porr.
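
The interplay between the keyword marker and the stemmer can be sketched in Python. Everything here is illustrative: toy_stem is a crude stand-in that only strips a few endings (it is not the real Catalan stemming algorithm), and stem_unless_keyword is a hypothetical helper name.

```python
def toy_stem(token):
    """Toy stand-in for the Catalan stemmer (assumption, not the real algorithm)."""
    for suffix in ("os", "ó", "o"):
        if token.endswith(suffix):
            return token[: -len(suffix)]
    return token


def stem_unless_keyword(tokens, keywords):
    """Stem every token except those protected as keywords."""
    return [token if token in keywords else toy_stem(token) for token in tokens]
```

With keywords set to {"porró"}, the tokens porros and porro are still reduced to porr while porró is left alone.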

Catalan Analyzer - Sum the Parts
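
One way the parts could add up is sketched below in Python, so the structure can be checked without a running cluster. The filter definitions are the ones shown above plus a catalan_stemmer filter of type stemmer. The analyzer name catalan_sum and the exact filter order are assumptions of this sketch; the ordering constraint it does rely on is that the keyword marker runs before the stemmer.

```python
import json

# Filter definitions taken from the sections above, plus the stemmer.
FILTERS = {
    "catalan_elision": {"type": "elision", "articles": ["d", "l", "m", "n", "s", "t"]},
    "catalan_stop": {"type": "stop", "stopwords": "_catalan_"},
    "catalan_keywords": {"type": "keyword_marker", "keywords": ["porró"]},
    "catalan_stemmer": {"type": "stemmer", "language": "catalan"},
}

# Assumed order: elision, lowercase, stop words, keyword marker, stemmer.
FILTER_CHAIN = [
    "catalan_elision",
    "lowercase",
    "catalan_stop",
    "catalan_keywords",
    "catalan_stemmer",
]


def build_settings(analyzer_name="catalan_sum"):
    """Assemble index settings with one custom analyzer (name is hypothetical)."""
    return {
        "settings": {
            "analysis": {
                "filter": FILTERS,
                "analyzer": {
                    analyzer_name: {"tokenizer": "standard", "filter": FILTER_CHAIN}
                },
            }
        }
    }


if __name__ == "__main__":
    # Non-ASCII characters are escaped in the printed JSON by default.
    print(json.dumps(build_settings(), indent=2))
```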

Travis and Dist::Zilla projects

Getting started with Travis

I felt like I should give Travis a try and picked a recent GitHub project of mine that I knew had a decent test suite.

The instructions on how to get started with Travis were quite simple, and soon the project had its own Travis page.

Unknown build failure

I wondered why the build failed after adding a very simple .travis.yml file to the project.

$ cpanm --quiet --installdeps --notest .

! Configuring . failed. See /home/travis/.cpanm/work/1411809721.1412/build.log for details.

The command "eval cpanm --quiet --installdeps --notest ." failed. Retrying, 2 of 3.

So cpanm failed, but where can I see what's written in the log? My attempts to view it with the Travis "after_failure" directive failed, and after trying a few more things I checked the project files again. I noticed it was a Dist::Zilla project with no Makefile.PL or Build.PL present, and Travis needs one of them to get started.
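
The diagnosis boils down to a file check that could have been run locally before pushing. Here is a small Python sketch of it. The function name and the returned labels are hypothetical, and treating a dist.ini file as the sign of a Dist::Zilla project is an assumption of this sketch.

```python
from pathlib import Path


def installer_status(project_dir):
    """Classify a Perl project by the build files found in its top directory."""
    root = Path(project_dir)
    # An installer script must exist before the distribution can be configured.
    if (root / "Makefile.PL").exists() or (root / "Build.PL").exists():
        return "installable"
    # Only dist.ini: the installer has to be generated by Dist::Zilla first.
    if (root / "dist.ini").exists():
        return "dist-zilla-only"
    return "unknown"
```

A result of dist-zilla-only for the repository root would have pointed at the missing Makefile.PL or Build.PL straight away.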

DWIM Perl for Linux - version 5.20.1.9 released

After almost a week of adding more and more modules, I've got to the 9th revision of DWIM Perl for Linux.

It explicitly includes more than 400 CPAN modules, but with their dependencies the total is probably a lot higher. The idea behind this distribution is to make it very fast and easy to get started with Perl, without learning how to brew perl and how to install CPAN modules, and without fighting external dependencies or some failure in the latest release of a CPAN module.

I need your help to test-drive the distribution and to fill the holes: the modules that might be really needed but have not yet been included.

It would be of great help if you downloaded the latest distribution, configured it as described on the website, and let me know which additional modules your application might need or if something is broken.

Breakfast, Lunch, Band & Social Event for Free

Thanks to our generous sponsors we can provide a free breakfast, lunch, live band
and social event for every attendee of the Perl::Dancer conference in Hancock, New York.

Attendees and speakers from the USA, Canada and Europe are coming to have lots of fun at
the marvelous venue.

There are still tickets available for both the training and presentation days. Please
go to registration to purchase your ticket.

Extended Rules to support Modern Perl in Atom symbols-view package

Some time ago I wrote some additional rules for Perl in my .ctags file and published it. I even invited you to help improve and polish it or simply use it and modify it as you wish.

Recently I tried the Atom editor, which is very trendy now.
I was happy to find it uses Exuberant Ctags and the rules in my ~/.ctags file just worked.
Then I made a pull request to https://github.com/atom/symbols-view. The pull request was finally accepted, and the extended support for Perl has been available since version 0.65.0.

Exuberant Ctags can be used with Vim, jEdit, Sublime Text, UltraEdit... any IDE or text editor that uses it natively or has a plugin for it. I use it to quickly jump around in my Ado project.

Keymaps: ctrl/cmd+r to see current file symbols; shift+ctrl/cmd+r for project symbols.

Enjoy it, modify it as you wish and improve it.

Elasticsearch Templates

When dealing with Elasticsearch, you have to decide how to manage the analysis of the content that is ingested. Templates ease the burden of managing analyzer settings. Let's learn by example...

Catalan Stemmer

Here's a template that defines an analyzer, cat_stems, which uses the built-in Catalan stemmer. For example, both the singular porro and the plural porros are reduced to porr when analyzed by the stemmer. Moreover, this template is applied to any index created with a name that starts with cat.

Template

{
  "template": "cat*",
  "settings": {
    "analysis": {
      "filter": {
        "catalan_stemmer": {
          "type":       "stemmer",
          "language":   "catalan"
        }
      },
      "analyzer": {
        "cat_stems": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "catalan_stemmer"
          ]
        }
      }
    }
  }
}
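
Which index names pick up the template can be sketched with Python's fnmatch module. This is an approximation: fnmatchcase also gives special meaning to ? and [...], which goes beyond the simple * wildcard used here, and template_applies is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

TEMPLATE_PATTERN = "cat*"  # the "template" value from the JSON above


def template_applies(index_name, pattern=TEMPLATE_PATTERN):
    """Return True if the index name matches the template's wildcard pattern."""
    return fnmatchcase(index_name, pattern)
```

Under this sketch, hypothetical index names such as catalan_recipes or cat match, while dog_cat does not, because the pattern is anchored at the start of the name.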

Register Template

Here's how we register the new template with Elasticsearch so that any future index whose name starts with the letters cat will include the analyzer cat_stems in its settings.


curl -H "Content-Type: application/json" --data-binary @cat_stems.template.json -XPUT http://localhost:9200/_template/catalan_stemmer
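
The same request can be built from Python with only the standard library. This is a sketch under the same assumptions as the curl command (a node on localhost:9200 and a template named catalan_stemmer); the function names are hypothetical, and nothing is sent until register_template is called.

```python
import json
import urllib.request


def build_put_template_request(template, name="catalan_stemmer",
                               base_url="http://localhost:9200"):
    """Build (but do not send) the PUT request that registers a template."""
    body = json.dumps(template).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_template/{name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def register_template(template, **kwargs):
    """Send the request; this needs a reachable Elasticsearch node."""
    request = build_put_template_request(template, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Loading cat_stems.template.json with json.load and passing the result to register_template would mirror the curl call above.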

Get Template

To see what we just created:

curl -XGET 'http://localhost:9200/_template/catalan_stemmer?pretty'

which results in:

Next stable DBD::SQLite to be released in late October

I now consider DBD::SQLite 1.43_08 a release candidate for the next stable DBD::SQLite. Please test it with your modules and applications and let me know if you find anything. Due to changes in the upstream SQLite library (since SQLite 3.8.5), this release candidate is known to break older versions of DBIx::Class (prior to 0.082800, released on 2014-09-25). If you use an older version of DBIx::Class, you might want to upgrade it as well, or keep DBD::SQLite 1.42 (bundled with the older SQLite 3.8.4.1 library) for now. Other major O/R mappers do not seem to be affected by this upgrade. If there is no blocker or request to wait, I'll release 1.44 in late October, hopefully on the 26th.

Other notable changes since the last stable release follow:

  • This release candidate contains new modules to support custom virtual tables written in Perl (by DAMI).
  • If you set sqlite_unicode to true, SQL statements will be upgraded to avoid inconsistency between embedded params and bind params (RT #96877) (by DAMI).

See the Changes file in the distribution for other fixes and improvements.

Veure Update

Just in case you're curious, I'm still hacking on Veure, though the last month has kept me busy on a bunch of other things (our daughter just started school, so that's a big one!)

I've been building so much of the infrastructure that you might be surprised to realize that I've only just gotten around to being able to equip weapons and armor.

My last entry gives some hints on how this works.

The other developer has been working on the cockpit view. If you travel from system to system in your own ship, the experience should be different than if you take public shuttles. I haven't actually seen his work yet, so no screenshot on that one.

Update: OK, I have some of the initial screenshots for the cockpit work. They look great, but I'm not sharing them until some things are settled.

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. Written in Perl and offering the modern features you’ve come to expect in blog platforms, the site is run by Dave Cross and Aaron Crane, with a design donated by Six Apart, Ltd.