DBIx::Class populate Gotcha

By Wolfgang Kinkeldei on September 21, 2011 1:37 PM

Today, we watched a series of test cases failing. Our tests started with a blank database and a series of populate method-calls to create their own test records.

Everything looked obviously right, but after a while of debugging, we found out that the populate method of DBIx::Class::ResultSet behaves differently depending on the context used. In Scalar or List context, the records are inserted using the INSERT statement, while in void context, a bulk insert method is used.

From a performance standpoint, this absolutely makes sense. The documentation does explain this very well.

However, some optimizations have been made under the hood that are not obvious unless you know about -- I did not know that before.

My code looked like this:

    MyApp->model('DB::Element')->populate([
        {element_id => 113, name => 'f4'},
        {element_id => 114, name => 'f5', usage_starts => '2000-04-01'},
    ]);

And the statement sent to the database was:

INSERT INTO element( element_id, name ) 
  VALUES( ?, ? ) : '__BULK_INSERT__'

Every column which was not mentioned in the first record got lost and simply was ignored.

Conclusion? A bulk insert can only operate on a well known set of columns. The logic behind DBIx::Class retrieves the set of rows from the keys of the first hashref. This is important to keep in mind when using this syntax.

Solutions? TIMTOWTDI :-) If speed is not important, a populate method call could easily put into list context forcing the usage of single insert statements that do what you expect.

    # force list context, ugly but usable :-)
    () = MyApp->model('DB::Element')->populate([
        # ...
    ]);

5 comments

Tagged as:

DBIx::Class

5 Comments

Steffen Mueller | September 21, 2011 2:57 PM | Reply

To be honest, this smells like a bit of a premature optimization. IMHO DBIx::Class should, by default, play it safe and only do the bulk insert if

a) You told it to use the first record to grok all.

b) It scanned the records and knows it's safe.

Peter Rabbitson replied to comment from Steffen Mueller | September 21, 2011 5:02 PM | Reply

Absolutely. This is one of the more retarded design decisions (which tend to range from mildly to very retarded :) that happened during the early years. Unfortunately it's not something that can be easily killed, so is there to stay.

Peter Rabbitson | September 21, 2011 5:04 PM | Reply

@Wolfgang - this smells like a bug to me. Which version of DBIC are you using?

Wolfgang Kinkeldei replied to comment from Peter Rabbitson | September 21, 2011 7:39 PM | Reply

@Peter: thanks for your reply. I hope you did not misunderstand me. My intention was not to blame anybody for the behavior I described. I like using DBIx::Class and I strongly appreciate everybody's work for this great project. I discovered the effects described in DBIx::Class 0.08195.

If I had used the arrayref of arrayrefs syntax for populating, I had never seen this effect. But I decided to use hashrefs. I can live with the way things work right now. Knowing how things work helps to understand what happens. I personally think it would be sufficient to see a warning if the hashrefs handed to populate() have differing keys.

Having the choice between possibly inefficient single INSERTs and a bulk mode generally is great, as it allows to operate fast if one knows about the side effects.

Peter Rabbitson replied to comment from Wolfgang Kinkeldei | September 21, 2011 8:18 PM | Reply

The fact that you can live with it doesn't mean it is not an actual bug :) Should be fixed in next rel. Cheers!

Name

Email Address

URL

Remember personal info?

Comments (You may use HTML tags for style)

About Wolfgang Kinkeldei

I blog about Perl.

More info »

Wolfgang Kinkeldei