Tau Station, The Internal Adventures: DBIx::Class::Row cache (avoid the rabbits!)

Tau Station, "the free-to-play narrative sci-fi MMORPG (TM)", has a nicely complex database. Currently, we have 190 tables with 740 relationships between those tables. DBIx::Class does an amazing job at managing that complexity, since each relationship is simply an accessor on the DBIx::Class::Row object.

However, there is a subtle issue when using those relationship accessors. Using a relationship accessor creates a new Row object and stores it in the calling object. This behavior can easily lead to duplicate DBIC Row objects for a single database row. At best, the duplicates waste memory and database queries. At worst, they cause update anomalies, since updates done through one Row object are not seen by the duplicate objects.
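
To make this concrete, here is a minimal sketch using the Character->inventory->owner cycle described later in this post (the credits column is an illustrative assumption):

use feature 'say';
use Scalar::Util qw( refaddr );

# The inventory's owner is the character we started from, but the
# relationship accessor runs its own query and builds a second,
# duplicate Character object:
my $character_dup = $character->inventory->owner;

say $character->id == $character_dup->id;            # 1: same database row
say refaddr($character) == refaddr($character_dup);  # '': different objects

# An update made through one object is not seen by the other:
$character->update({ credits => 100 });
say $character_dup->credits;                         # stale in-memory value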

With a highly-connected schema like we have in Tau Station, trying to handle the object duplication can pretty soon feel like we're trying to handle rabbits in Australia.

In order to avoid this Row duplication, we have developed a cache of DBIC Row objects that is shared within the application. In most cases, this allows us to ensure that we have one DBIC Row object per database row while processing an HTTP request, avoiding those subtle update anomalies.

Analysis

Since Tau Station is a web application, everything is done in an HTTP request cycle. Within one request cycle, we need to retrieve all of the needed data, make the necessary changes, and save the changes back to the database.

Lifecycles

Many of the Rows needed during a request cycle are part of the game environment, and are immutable in production (such as Stations and Missions). These Rows should be shared across request cycles, since re-creating them for every request is a waste of resources.

The rest of the Rows may be modified during a request cycle, so they must only be shared within one request cycle, but not across request cycles.

This means that we need two different lifecycles for the Row cache: "permanent", and "per-request".

However, some of the ResultSources that are immutable in production need to be mutable on our narrative development installation, so that the team can develop the game. This means that the lifecycle for each ResultSource must be configurable per installation.

Current Workarounds

Having duplicated Row objects requires some interesting anti-patterns in our code to ensure correctness. One goal of the shared cache is to remove the need for these workarounds.

  • Sometimes we need to re-query a Row (discard_changes) at non-obvious places in the code (see the sketch after this list).

    This recovers from updates that were done through a duplicate Row object.

  • Sometimes we use database queries instead of DBIC relationships.

    This ensures that we get a "fresh" Row from the database, rather than a possibly-"stale" object that was stored in the Row object. However, this creates yet another duplicate Row object.

  • Sometimes we manually pass DBIC Row objects across layers to avoid duplication.

  • Sometimes we use polymorphic sub signatures, accepting either a slug or a Row.

    This can ensure that the Row object always has "fresh" data, at the cost of yet another duplicate Row (and database query).
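
For example, the discard_changes workaround looks roughly like this (a sketch; the surrounding code is illustrative):

# Some other code path may have updated a *duplicate* of $character,
# so throw away the in-memory column data and re-query the database
# before trusting it:
$character->discard_changes;

my $credits = $character->credits;   # now guaranteed fresh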

In other words, having these duplicate Rows means that we are using the database as the sole point for data synchronization, even for actions that are done within a single request cycle.

These duplicate database queries and redundant Row objects harm both the performance and the scalability of the overall system.

Needing these workarounds also makes it more complex to write correct code, since we always need to be aware of whether there could be duplicates of a given Row, and whether some other code could be updating one of those duplicate Rows.

Use Cases

There are five distinct use cases that can result in duplicate Row objects.

  1. retrieving an object using its unique slug

    Each unique object in the game has a "slug": a unique alphanumeric key that identifies that object. Both the public API and internal data structures need to retrieve the Row for each given slug. Of course, each retrieval by slug can create a new DBIC object, which could be a duplicate.

  2. retrieving an object from the per-request Character object

    Each request cycle gets a new Character object. Any immutable object that is retrieved using a relationship off of the Character object (such as Character->station or Character->inventory->item->item_type) will create a new DBIC object, which could be a duplicate.

  3. retrieving objects from a "diamond" pattern in the schema relationship graph

    In some cases, the same database object can be reached by multiple paths from the Character object. For instance, the ItemType for items in a character's inventory is retrieved by following three DBIC relationships: Character->inventory->item->item_type. If a character has multiple items of the same type in the inventory, then we will end up with duplicate ItemType objects, since each Item object does its own query and creates its own ItemType object (see the sketch after this list).

  4. cycles in the schema relationship graph

    In some cases, the same database object can be reached by following a relationship cycle starting at the original Row object. For instance, using Character->inventory->owner returns to the original Character, but the ->owner will be a duplicate Character object that is stored in the Inventory object.

  5. filtered collections of immutable objects

    In some cases, we need a collection of immutable objects that pass a given predicate. For instance, we may need all stations that can be reached via shuttle from the current station. Normally, we would do this using criteria on a DBIC search, which creates new Row objects each time the query is run.
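
Here is a sketch of the diamond duplication from use case 3, following the Character->inventory->item->item_type path (the exact accessor forms are assumptions):

use feature 'say';
use List::Util qw( uniq );
use Scalar::Util qw( refaddr );

# One ItemType row in the database, but one ItemType object per Item:
my @item_types = map { $_->item_type } $character->inventory->item;

say scalar uniq( map { $_->id }      @item_types );  # 1: one database row
say scalar uniq( map { refaddr($_) } @item_types );  # N: N duplicate objects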

Solution

In Tau Station, we now have a singleton DBIC::Cache object that handles process-level caching for all of the Row objects. All caching is configured in a single per-installation YAML file, which specifies the caching needed for each ResultSource.

Since the DBIC::Cache is per-process, and we are not using threads (yay), we can guarantee that the cache will contain unique Row objects.

The DBIC::Cache object provides one DBIC::Cache::Base object for each cached ResultSource, available through a simple accessor. For instance, $cache->station is the Station cache.
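
Obtaining the singleton and a per-ResultSource cache might look like this (the instance accessor name is an assumption; $cache->station is as described above):

my $cache = DBIC::Cache->instance;     # process-level singleton (assumed accessor)

my $station_cache = $cache->station;   # the DBIC::Cache::Base for Station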

The configuration file specifies the behavior of the Cache::Base object for each ResultSource:

caches:
    Station:
        enabled: 1
        lifecycle: permanent
        index_fields: [ 'slug' ]
        order_fields: [ 'level', 'name' ]
        prefetch: [ 'affiliation' ]
        subcaches:
            all_active:
                predicates: ['active']
            all_active_for_star_id:
                predicates: ['active']
                partition_fields: ['star_id']

We can add or modify caches by simply editing this configuration file.

Lifecycle

Each ResultSource can have either a "permanent" or a "per-request" lifecycle.

Having the lifecycle in the configuration file lets the narrative installation keep Stations editable by using the "per-request" lifecycle, while the production installation is optimized by using the "permanent" lifecycle (since we know that Stations are immutable in production).
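
For example, the narrative installation's configuration could simply override the lifecycle (a hypothetical excerpt):

# narrative installation: Stations remain editable
caches:
    Station:
        enabled: 1
        lifecycle: per-request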

Row Accessors

A Cache::Base provides both single-row accessors and multi-row accessors for retrieving the Row objects.

Each accessor only retrieves rows that are not already in the cache, and uses no more than one database query to retrieve all needed rows.
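
A rough sketch of that cache-then-batch-query pattern (the internal field names here are assumptions, not the actual implementation):

sub _rows_for_ids {
    my ( $self, @ids ) = @_;

    # Only query for the rows that are not already cached...
    my @missing = grep { !exists $self->{_rows}{$_} } @ids;

    # ...and fetch all of them in one database query.
    if (@missing) {
        for my $row ( $self->resultset->search(
            { $self->id_field => { -in => \@missing } }
        )->all ) {
            $self->{_rows}{ $row->id } = $row;
        }
    }

    return map { $self->{_rows}{$_} } @ids;
}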

A single-row accessor is provided for each field specified in index_fields. In addition, there is always a for_id accessor. For the Station example, we have:

# retrieve cached Station by station_id
$station = $cache->station->for_id($station_id);

# retrieve cached Station by slug
$station = $cache->station->for_slug($station_slug);

An all-row accessor is always available for each ResultSource. Additional sub-cache accessors can be defined that apply filtering or partitioning to the retrieved Row objects.

# retrieve all Stations
@stations = $cache->station->all;

# retrieve all active stations
@stations = $cache->station->all_active;

# retrieve all active stations for the given star
@stations = $cache->station->all_active_for_star_id($star_id);

DBIC-specific Features

The order_fields option allows us to specify the default order that is used in multi-row accessors.

The prefetch option allows us to specify which DBIC relationships should be prefetched.
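
These presumably translate into the standard DBIC search attributes when the cache is filled, roughly like this (a sketch, not the actual code):

my $rs = $schema->resultset('Station')->search( undef, {
    order_by => [ 'level', 'name' ],   # from order_fields
    prefetch => [ 'affiliation' ],     # from prefetch
} );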

Cache Management

We can fill specific caches before the HTTP server forks, so that each fork has a pre-filled cache:

$cache->station->fill;

We can clear specific caches, all caches, or all per-request caches:

# clear one cache
$cache->station->clear;

# clear all caches
$cache->clear_all;

# clear the per-request caches for each request cycle
$cache->clear_per_request_caches;
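
The per-request clearing can be wired into the request cycle itself, for example as a Plack middleware (hypothetical wiring, shown only to illustrate where the call belongs):

package My::Middleware::ClearPerRequestCaches;
use parent 'Plack::Middleware';

sub call {
    my ( $self, $env ) = @_;

    # Start each request cycle with empty per-request caches.
    DBIC::Cache->instance->clear_per_request_caches;

    return $self->app->($env);
}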

We can control the caching:

# is caching enabled?
if ( $cache->station->enabled ) { ... }

# enable caching for a ResultSource
$cache->station->enable;

# disable caching for a ResultSource
$cache->station->disable;

Integration

Use Case 1 is directly handled by using single-row cache accessors in every place where we need to retrieve a Row by a unique identifier.

Use Cases 2, 3, and 4 needed a bit more work. In order to handle them, we needed to wrap the DBIC relationship accessor methods so that they would retrieve the shared Row objects from the unified cache, rather than querying the database and creating new Row objects. (This deserves another blog post.)
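
Very roughly, the wrapping routes a relationship accessor through the cache instead of letting DBIC run its own query, something like this (a heavily simplified sketch):

use Class::Method::Modifiers qw( around );

# In the Character Result class: resolve ->station through the
# shared cache rather than creating a fresh duplicate Row.
around station => sub {
    my ( $orig, $self ) = @_;
    return DBIC::Cache->instance->station->for_id( $self->station_id );
};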

Use Case 5 is directly handled by defining multi-row cache accessors that provide the needed filtering, and using them instead of the DBIC queries.

Discussion

Before this unified cache was implemented, we already cached some DBIC rows, but that design was primarily focused on performance rather than on eliminating duplicates. It also required writing new code for each additional ResultSource cache.

Developing and integrating this unified Row cache into the application gave us a mechanism for addressing the Row duplication, and a simple way to add and modify the Row caching.

When the unified cache was integrated into the application, there were 21 ResultSources that were cached, all with "permanent" lifecycle. In the following three months, we have added 9 more ResultSources, two of which are using the "per-request" lifecycle.

A nice side effect of integrating this cache structure was that it forced us to add explicit predicate methods to some of the Result classes, so that we can use those predicates in the filtered multi-row cache accessors.
