Pagination done the PostgreSQL way done the DBIx::Class way

Recently, I read Pagination Done the Postgresql Way. The premise is that offset/limit combined with a date index gets slower as you page, but if you page in a way where you are always selecting by date then you will be using the index properly.

A simple example would be a table with an entry per day and a date index. Asking for Tuesday would use the date index more effectively than asking for the 3rd entry of the table sorted ascending, so you could page by just asking for an entry with a date greater than the previous one you pulled limit 1.

Example in perl using DBIx::Class: ( gist )

my $dtf = $schema->storage->datetime_parser;
sub _rs {
    my ( $prev_date, $prev_id ) = @_;
    my $search = undef;
    my $attr   = {
        prefetch => 'content_item',
        order_by => { -desc => [ 'me.content_item_id', 'me.published_date' ] },
        rows     => 5000,
    };
    if ($prev_date) {
        $search = \[
            '(me.content_item_id,me.published_date) < (?,?)',
            $prev_id, $dtf->format_datetime($prev_date),
        ];
    }
    return $schema->resultset("ContentItemPublishedDate")
        ->search( $search, $attr );
}

my $rs = _rs;
while ( my @rows = $rs->all ) {
    for (@rows) {
        my $content = $utf8->decode( $_->content_item->content );
        #WORK HAPPENS HERE
    }
    my $last_row = $rows[$#rows];
    $rs = _rs($last_row->published_date,$last_row->content_item_id);
}

Leave a comment

About Samuel Kaufman

user-pic CTO / Codemonkey, Socialflow.com