DBIx::Class::Report - generate ad-hoc dbic classes from SQL
Object-Relational Mappers (ORMs) are awesome if you think about your database a collection of objects. They're awful if you think about your database as a database. My primary client right now is ZipRecruiter and my work involves rewriting one of their internal systems to be more flexible. However, it involves tons of reporting. For example, I have this SQL:
SELECT var.name, ce.event_type, count(*) as total
FROM tracking_conversion_event ce JOIN tracking_visitor visitor ON visitor.tracking_visitor_id = ce.tracking_visitor_id
JOIN tracking_version_variant curr ON curr.tracking_version_variant_id = visitor.tracking_version_variant_id
JOIN tracking_version ver ON ver.tracking_version_id = curr.tracking_version_id
JOIN tracking_variant var ON var.tracking_variant_id = curr.tracking_variant_id
WHERE ver.tracking_id = ?
AND ver.version = ?
GROUP BY 1, 2
What you can't see in that SQL is that there are many subtleties in it:
- That
count(*) as total
sometimes needs to be replaced bysum(ce.num_events) as total
- Sometimes I need different sets of fields returned (and that impacts the
GROUP BY
, too) - One of those
JOIN
statements sometimes needs to be aLEFT JOIN
.
As you can imagine, that's frustrating. These are the sorts of things that don't map too well to an ORM. As a result, I've hard-coded some SQL and used $dbh->selectall_arrayref
, and repeatedly struggled with the fact that sometimes the columns can be in different positions of that array ref.
Or I could return hashrefs and wonder if $result->{total}
is really 0, or did I write $result->{totla}
again?
Or in another case I found, we had a resultset return many rows of data with about 20 columns per row, but we only needed 4 or 5 columns. Restricting our data to only what we need not only makes the queries faster, it can reduce I/O when you're fetching results from another server. Class::DBI tried to allow this, but that only works at a class level, not query-by-query.
In reality, everyone here is used to working with DBIC resultsets. Returning complex data structures and having people try to remember which index maps to which column, or if the hash keys have changed is frustrating. Wouldn't it be nice if we could just take a chunk of arbitrary SQL and get a resultset from it? Well, now you can.
The code is called DBIx::Class::Report and it's very much a proof of concept! It builds a DBIx::Class view on the fly. It works like this:
use DBIx::Class::Report;
my $report = DBIx::Class::Report->new(
schema => $dbic_schema_object,
sql => $complicated_sql,
columns => \@accessor_names,
);
my $resultset = $report->fetch(@bind_params_for_complicated_sql);
while ( my $result = $resultset->next ) {
# use like a normal dbic result, but it's read only
}
That's the shell of it, but let's use the query I showed above, just to be clear:
my $sql = <<'SQL';
SELECT var.name, ce.event_type, count(*)
FROM tracking_conversion_event ce
JOIN tracking_visitor visitor ON visitor.tracking_visitor_id = ce.tracking_visitor_id
JOIN tracking_version_variant curr ON curr.tracking_version_variant_id = visitor.tracking_version_variant_id
JOIN tracking_version ver ON ver.tracking_version_id = curr.tracking_version_id
JOIN tracking_variant var ON var.tracking_variant_id = curr.tracking_variant_id
WHERE ver.tracking_id = ?
AND ver.version = ?
GROUP BY 1, 2
SQL
my $events_per_name = DBIx::Class::Report->new(
schema => $schema,
sql => $sql,
columns => [qw/name event_type total/],
);
my $resultset = $events_per_name->fetch( $tracking_id, $version );
say $resultset->count; # yeah, it behaves just like a normal dbic resultset
while ( my $result = $resultset->next ) {
say $result->name;
say $result->event_type;
say $result->total;
}
The core idea is simple enough that you should be able to throw just about anything at this and it should just work (but it's tricky with SQLite). This also helps when you don't want to litter your codebase with a bunch of one-off DBIC views that are only called in a single method. Also, note that the data you get back is read-only. You cannot update anything.
If I get it stable, I'll push it to the CPAN at some point.
Many thanks to ribasushi++ and castaway++ for helping me debug this.
This is awesome! I've wanted this so many times! Next time this comes up I'll give it a whirl and let you know the result.
This is a great idea! I think the worst thing about DBIC is its dependence on the horrendous SQL::Abstract for specifying queries, and I've wanted something like this for years.
Thanks a lot!
Well, it depends on which ORM you choose :-)
In DBIx::DataModel :
a) choosing joins
b) changing columns
@Mark : please elaborate, what is it that you find 'horrendous' in SQL::Abstract ? Do you think that SQL::Abstract::More does any improvement ?
As it happens, I had an occasion to use it this already this weekend! It worked like a charm. Ovid++
Is this pretty similar to DBIx::Raw? I am not complaining, just trying to know when to use which tool.
https://metacpan.org/pod/DBIx::Raw
kablamo, DBIx::Raw is very different. It's useful, but doesn't serve the needs I have now.
In short, if you have a DBIx::Class based system, using DBIx::Raw doesn't help because you're still stuck with hashrefs, or assigning the data directly. DBIx::Class::Report allows you to use raw SQL, but still have a DBIx::Class interface (resultsets and results). As a result, end-consumers don't have to learn a different interface for however you want to present your data. Plus, you also get the composability of dbic searches.