DBIx::Class::Report - generate ad-hoc dbic classes from SQL

By Ovid on March 19, 2015 10:39 AM

Object-Relational Mappers (ORMs) are awesome if you think about your database a collection of objects. They're awful if you think about your database as a database. My primary client right now is ZipRecruiter and my work involves rewriting one of their internal systems to be more flexible. However, it involves tons of reporting. For example, I have this SQL:

  SELECT var.name, ce.event_type, count(*) as total
    FROM tracking_conversion_event ce          JOIN tracking_visitor visitor      ON visitor.tracking_visitor_id      = ce.tracking_visitor_id
    JOIN tracking_version_variant curr ON curr.tracking_version_variant_id = visitor.tracking_version_variant_id
    JOIN tracking_version ver          ON ver.tracking_version_id          = curr.tracking_version_id
    JOIN tracking_variant var          ON var.tracking_variant_id          = curr.tracking_variant_id
   WHERE ver.tracking_id = ?
     AND ver.version     = ?
GROUP BY 1, 2

What you can't see in that SQL is that there are many subtleties in it:

That count(*) as total sometimes needs to be replaced by sum(ce.num_events) as total
Sometimes I need different sets of fields returned (and that impacts the GROUP BY, too)
One of those JOIN statements sometimes needs to be a LEFT JOIN.

As you can imagine, that's frustrating. These are the sorts of things that don't map too well to an ORM. As a result, I've hard-coded some SQL and used $dbh->selectall_arrayref, and repeatedly struggled with the fact that sometimes the columns can be in different positions of that array ref.

Or I could return hashrefs and wonder if $result->{total} is really 0, or did I write $result->{totla} again?

Or in another case I found, we had a resultset return many rows of data with about 20 columns per row, but we only needed 4 or 5 columns. Restricting our data to only what we need not only makes the queries faster, it can reduce I/O when you're fetching results from another server. Class::DBI tried to allow this, but that only works at a class level, not query-by-query.

In reality, everyone here is used to working with DBIC resultsets. Returning complex data structures and having people try to remember which index maps to which column, or if the hash keys have changed is frustrating. Wouldn't it be nice if we could just take a chunk of arbitrary SQL and get a resultset from it? Well, now you can.

The code is called DBIx::Class::Report and it's very much a proof of concept! It builds a DBIx::Class view on the fly. It works like this:

use DBIx::Class::Report;

my $report = DBIx::Class::Report->new(
    schema  => $dbic_schema_object,
    sql     => $complicated_sql,
    columns => \@accessor_names,
);
my $resultset = $report->fetch(@bind_params_for_complicated_sql);
while ( my $result = $resultset->next ) {
    # use like a normal dbic result, but it's read only
}

That's the shell of it, but let's use the query I showed above, just to be clear:

my $sql = <<'SQL';
  SELECT var.name, ce.event_type, count(*)
    FROM tracking_conversion_event ce
    JOIN tracking_visitor visitor      ON visitor.tracking_visitor_id      = ce.tracking_visitor_id
    JOIN tracking_version_variant curr ON curr.tracking_version_variant_id = visitor.tracking_version_variant_id
    JOIN tracking_version ver          ON ver.tracking_version_id          = curr.tracking_version_id
    JOIN tracking_variant var          ON var.tracking_variant_id          = curr.tracking_variant_id
   WHERE ver.tracking_id = ?
     AND ver.version     = ?
GROUP BY 1, 2
SQL

my $events_per_name = DBIx::Class::Report->new(
   schema  => $schema,
   sql     => $sql,
   columns => [qw/name event_type total/],
);

my $resultset = $events_per_name->fetch( $tracking_id, $version );
say $resultset->count; # yeah, it behaves just like a normal dbic resultset

while ( my $result = $resultset->next ) {
   say $result->name;
   say $result->event_type;
   say $result->total;
}

The core idea is simple enough that you should be able to throw just about anything at this and it should just work (but it's tricky with SQLite). This also helps when you don't want to litter your codebase with a bunch of one-off DBIC views that are only called in a single method. Also, note that the data you get back is read-only. You cannot update anything.

If I get it stable, I'll push it to the CPAN at some point.

Many thanks to ribasushi++ and castaway++ for helping me debug this.

7 comments

Tagged as:

DBIx::Class

7 Comments

JT Smith | March 20, 2015 10:35 PM

This is awesome! I've wanted this so many times! Next time this comes up I'll give it a whirl and let you know the result.

Mark Dominus | March 21, 2015 2:41 AM

This is a great idea! I think the worst thing about DBIC is its dependence on the horrendous SQL::Abstract for specifying queries, and I've wanted something like this for years.

Thanks a lot!

Laurent Dami | March 21, 2015 7:33 AM

"These are the sorts of things that don't map too well to an ORM."

Well, it depends on which ORM you choose :-)

In DBIx::DataModel :

a) choosing joins

# joins as given by default in the schema declaration
my $src = $schema->join(qw/table1 other1 other2 ../) ;

# forcing inner (<=>) or left (=>) joins
$src = $schema->join(qw/table1 <=> other1 => other2/);

b) changing columns

my @grouped_cols = qw/col1 col2/;
my @other_cols   = qw/count(*)|total/; # '|' is aliasing
my $result = $src->select(
  -columns => [@grouped_cols, @other_cols],
  -where => ...,
  -group_by => \@grouped_cols,
);

Laurent Dami replied to comment from Mark Dominus | March 21, 2015 1:47 PM

@Mark : please elaborate, what is it that you find 'horrendous' in SQL::Abstract ? Do you think that SQL::Abstract::More does any improvement ?

JT Smith | March 21, 2015 10:09 PM

As it happens, I had an occasion to use it this already this weekend! It worked like a charm. Ovid++

kablamo | March 23, 2015 1:04 PM

Is this pretty similar to DBIx::Raw? I am not complaining, just trying to know when to use which tool.

https://metacpan.org/pod/DBIx::Raw

Ovid replied to comment from kablamo | March 23, 2015 1:13 PM

kablamo, DBIx::Raw is very different. It's useful, but doesn't serve the needs I have now.

In short, if you have a DBIx::Class based system, using DBIx::Raw doesn't help because you're still stuck with hashrefs, or assigning the data directly. DBIx::Class::Report allows you to use raw SQL, but still have a DBIx::Class interface (resultsets and results). As a result, end-consumers don't have to learn a different interface for however you want to present your data. Plus, you also get the composability of dbic searches.

About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/

More info »

Ovid