Perl API for ElasticSearch

I was about to start implementing the Sphinx full text search engine on our site when I saw that a new open source search engine ElasticSearch has just been released.

The overview shows off some of its many features but in summary, it:

  • is easy to set up
  • is designed to be distributed, and to scale from one node to hundreds
  • is real time
  • is schema-free
  • is based on Lucene
  • speaks JSON over HTTP
  • supports multitenancy, which includes multiple indices, and multiple types per index, with the ability to query across any combination of the two
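
To make "JSON over HTTP" and the multi-index, multi-type addressing a bit more concrete, here is a rough sketch of how a request might be put together by hand. The `es_path` helper is hypothetical (it is not part of ElasticSearch.pm), the `facebook` index and `status` type are made-up sample names, and the comma-separated path layout is my reading of the REST interface, so treat it as an illustration rather than a reference. Nothing is actually sent over the wire:

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper (not part of ElasticSearch.pm): build a REST path
# from one or more indices and types, plus an optional id and action.
sub es_path {
    my (%args) = @_;
    my @parts;
    for my $key (qw(index type)) {
        my $val = $args{$key};
        next unless defined $val;
        # an array ref of names becomes a comma-separated path segment
        push @parts, join( ',', ref($val) ? @$val : $val );
    }
    push @parts, $args{id}     if defined $args{id};
    push @parts, $args{action} if defined $args{action};
    return '/' . join( '/', @parts );
}

# canonical() sorts the keys so the encoded output is stable
my $json = JSON::PP->new->canonical;

# A single document: one index, one type, one id
my $doc_path = es_path( index => 'twitter', type => 'tweet', id => 1 );
my $doc_body = $json->encode(
    { user => 'kimchy', message => 'trying out Elastic Search' } );

# A search across several indices and types at once (sample names)
my $search_path = es_path(
    index  => [ 'twitter', 'facebook' ],
    type   => [ 'tweet',   'status' ],
    action => '_search',
);

print "PUT $doc_path\n$doc_body\n";
print "GET $search_path\n";
```

The point is only that a document is a plain JSON body sent to a path built from index, type and id, and that a search can name any combination of indices and types.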

I liked the look of it so much that I've written a simple Perl API, which should be available on CPAN at http://search.cpan.org/~drtech/ElasticSearch-0.01/

One nice thing that ElasticSearch.pm does is retrieve a list of all available nodes in the ElasticSearch cluster and try to spread the load across them automatically.

Also, if the current node disappears, then it tries to connect to the other nodes that it knows about. Only if no other nodes are available does it fail.
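
The failover idea can be sketched in a few lines of plain Perl. This is not the code from ElasticSearch.pm, just a simplified illustration: `request_with_failover` is a hypothetical name, the `$send` callback stands in for whatever actually talks to a node, and picking a random node per request is my own shortcut for "spreading the load". The addresses are sample values.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the ElasticSearch.pm source.
# $servers is an array ref of node addresses; $send is a code ref that
# takes one address and dies if that node cannot be reached.
sub request_with_failover {
    my ( $servers, $send ) = @_;
    my @candidates = @$servers;    # copy, so the caller's list is untouched

    while (@candidates) {
        # pick a random remaining node so requests spread across the cluster
        my $server = splice( @candidates, int( rand(@candidates) ), 1 );

        my $result;
        my $ok = eval { $result = $send->($server); 1 };
        return ( $server, $result ) if $ok;
        # this node failed: fall through and try another one
    }
    die "No live nodes available\n";
}

# Sample cluster in which two of the three nodes are down
my @nodes = ( '127.0.0.1:9200', '127.0.0.1:9201', '127.0.0.1:9202' );
my %down  = ( '127.0.0.1:9200' => 1, '127.0.0.1:9201' => 1 );

my ( $used, $reply ) = request_with_failover(
    \@nodes,
    sub {
        my $server = shift;
        die "connection refused\n" if $down{$server};
        return "ok from $server";
    },
);
print "$reply\n";
```

Only when every candidate has been tried and has failed does the call die, which mirrors the behaviour described above.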

ElasticSearch.pm is an alpha release (it doesn't even have a test suite yet), and feedback is more than welcome.

Getting a server running is dead simple (you need at least Java 1.6). On *nix:

cd ~
git clone git://github.com/elasticsearch/elasticsearch.git
cd elasticsearch
./gradlew clean devRelease

cd /path/where/you/want/elasticsearch
unzip ~/elasticsearch/distributions/elasticsearch*

To start a test server in the foreground, running on 127.0.0.1:9200:

./bin/elasticsearch -f

You can start multiple servers by repeating this command; they will autodiscover each other.

Then in Perl, you can test it out with:

use ElasticSearch;
use Data::Dump qw(pp);   ## just using pp to dump the return values

my $e = ElasticSearch->new( servers => '127.0.0.1:9200', debug => 1 );

# index a "document"
pp $e->index(
    index => 'twitter',
    type  => 'tweet',
    id    => 1,
    data  => {
        user        => 'kimchy',
        postDate    => '2009-11-15T14:12:12',
        message     => 'trying out Elastic Search'
    }
);

# retrieve it by ID
pp $e->get(
    index => 'twitter',
    type  => 'tweet',
    id    => 1
);

# search for it by query term
pp my $results = $e->search(
    index => 'twitter',
    type  => 'tweet',
    query => {
        term    => { user => 'kimchy' },
    }
);
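
# Working with the search results
#
# A sketch only: the structure below is what I'd expect a decoded search
# response to look like (a total plus a list of hits, each carrying its
# index, type, id and original document). The exact keys are an
# assumption, so dump the real return value with pp before relying on them.

```perl
use strict;
use warnings;

# Assumed shape of a decoded search response; sample values only.
my $results = {
    hits => {
        total => 1,
        hits  => [
            {   _index  => 'twitter',
                _type   => 'tweet',
                _id     => 1,
                _source => {
                    user     => 'kimchy',
                    postDate => '2009-11-15T14:12:12',
                    message  => 'trying out Elastic Search',
                },
            },
        ],
    },
};

# Walk the hits defensively: a missing key yields an empty list, not a crash
my @summaries;
for my $hit ( @{ $results->{hits}{hits} || [] } ) {
    my $doc = $hit->{_source} || {};
    push @summaries,
        sprintf( '%s/%s/%s: %s said "%s"',
            @{$hit}{qw(_index _type _id)},
            @{$doc}{qw(user message)} );
}

print "$_\n" for @summaries;
```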

The example above shows how easy it is to get started, but don't be fooled into thinking that ElasticSearch is a toy. While it hides a lot of complexity, it provides the functionality to tune your indexing and searches to the nth degree.

Git repo at http://github.com/clintongormley/ElasticSearch.pm

4 Comments

Very, very timely. I was just reading about Elastic Search and thinking about the possibilities, and now there's one more reason to give it a try. :-)

Phillip.

OpenSearch says it is built on top of Lucene... I thought Lucene was replaced by Solr?

Sorry John, it's probably my post that is confusing you.

Our project was the "Lucene Web Service" or lucene-ws for short. A totally different project from Lucene proper. As I said in our post, we no longer use lucene-ws and have switched to Solr.

ElasticSearch (I assume your "OpenSearch" bit was a typo, thanks to my post again, no doubt) and Solr both use Lucene proper underneath.

Hope that helps.

About Clinton Gormley

The doctor will see you now...