Memcached to the rescue

Every day the Internet becomes faster, and every day new and more complex content is delivered via web applications. The problem is that, much of the time, these rich and complex applications aren't fast enough to answer large volumes of requests. One trick often used to improve the throughput of slow applications is caching. Instead of processing every request, which often requires data from one or more external sources, a possible solution is to cache the entire output to answer upcoming requests, or to cache smaller components that can be assembled to produce the final output.

Memcached is one of the most popular cache engines on the web. It can cache any arbitrary key/value pair; later in the process you only need the key to retrieve the stored value. It is also an easy caching solution to use from Perl. To start using Memcached you can use, for example, the following module:

    use Cache::Memcached;

Next, a new connection needs to be established:

    my $cache = new Cache::Memcached {
        'servers' => [ "127.0.0.1:11211" ],  # host:port of your memcached server (11211 is the default port)
    };

Now we can use $cache to perform operations. For example:

    # store in cache
    $cache->set($key, $value);

    # retrieve from cache
    my $value = $cache->get($key);

There are many ways to take advantage of caching, and imagination is the limit. You can cache the results of your database queries, cache entire web pages to be served immediately, or do something in between: cache several components of your website and put them together as needed to produce the final output.
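As a sketch of the database case, one common pattern is to build the cache key from the query parameters. The function names below (`get_user`, `fetch_user_from_db`) are illustrative, standing in for your real DBI code:

```perl
use strict;
use warnings;
use Cache::Memcached;

my $cache = new Cache::Memcached { 'servers' => [ "127.0.0.1:11211" ] };

sub get_user {
    my ($user_id) = @_;
    my $key  = "user:$user_id";       # key derived from the query parameters
    my $user = $cache->get($key);
    unless ($user) {
        $user = fetch_user_from_db($user_id);  # slow path: the actual database query
        $cache->set($key, $user);              # cache it for subsequent requests
    }
    return $user;
}
```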

A typical workflow that caches entire web pages in a dispatcher could look something like this:

    # handle request and arguments
    my $key = calculate_request_key();

    my $content = $cache->get($key);
    unless ($content) {
        $content = generate_content();  # stands in for your page-building code
        $cache->set($key, $content);
    }

    # return content to client

These techniques can greatly improve the number of requests per second a complex application can answer. Keep in mind that caching information does not mean your site cannot serve completely dynamic content, since you can set an expiry time on cached information. That means a cached entry can be valid for, say, a minute, and you can also have other processes (cron jobs, for example) that talk to the cache engine and update the information. Instead of having the application process the output for 10 requests per second, it processes the output once and immediately returns it for the same request over the next 30 seconds. You can argue that your content is then 30 or 60 seconds (whatever your cache lifetime is) out of date, and that is true, but the time a slow application would spend processing thousands of requests in those same 30 or 60 seconds could introduce a much bigger content delay, or even content not being served at all.
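With Cache::Memcached, the expiry time is passed as a third argument to set, so a 30-second cache lifetime looks like this:

```perl
# expire this entry 30 seconds after it is stored
$cache->set($key, $content, 30);

# after 30 seconds, get() returns undef and the
# application regenerates and re-caches the content
```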

There are some hairy problems with this type of approach, and some issues to be aware of, but there are also some simple solutions you can introduce in your application to handle them. More details on those in a post to come.


Why not use CHI::Driver::Memcached? That'd give you Memcached while using a general cache interface, which makes it easy to switch to different backends later, or for testing.

Here is the problem that I see with these caching solutions: why write so much code to retrieve one cached value?

    my $content = $cache->get($key);
    unless ($content) {
        $content = generate_content();
        $cache->set($key, $content);
    }

5 lines to get one value!! Ouch!!

I think we should have something along the lines of:

    $result = $cache->get( subroutine(), ttl => 5 );

The name 'subroutine()' itself should be stored as the key, and every time the TTL for that sub expires, the subroutine should be called again to retrieve fresh information, updating the cache with the latest value in the background.
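A minimal sketch of that idea is a small wrapper that works with any cache object providing get/set; the helper name `get_or_compute` is made up here (CHI actually ships a similar `compute()` method):

```perl
use strict;
use warnings;

# Call $code and cache its result under $key for $ttl seconds;
# return the cached value on subsequent calls until it expires.
sub get_or_compute {
    my ($cache, $key, $ttl, $code) = @_;
    my $value = $cache->get($key);
    unless (defined $value) {
        $value = $code->();             # slow path: compute the value
        $cache->set($key, $value, $ttl);
    }
    return $value;
}

# usage:
# my $result = get_or_compute($cache, 'expensive_report', 5, \&expensive_report);
```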


About smash

I blog about Perl.