Comma quibbling in Perl
Sinan posted his answer to Eric Lippert's comma quibbling exercise. There are a couple of Perl solutions in Rosetta Code.
Eric says:
I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer.
Sinan's answer works. He handles the cases where the array reference has zero, one, and more than one element. In his solution for more than one element, he uses a special case where an array slice will return zero elements when the "high" number of the range is less than the "low" number and the special case where a join with one list item returns only that item. That collapses two cases of the problem into one:
sub comma_quibbling {
    my $x = shift; # an array ref
    my $n = @$x;
    $n or return '';
    $n == 1 and return $x->[0];
    return join(' and ' =>
        # array slice and range operator
        join(', ' => @$x[0 .. ($n - 2)]),
        $x->[-1],
    );
}
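The two behaviours that paragraph describes can be checked in isolation. This is only a minimal sketch; the variable names `@items`, `@empty`, `$single`, and `$joined` are made up for the illustration and are not part of Sinan's code:

```perl
use strict;
use warnings;

my @items = qw(ABC DEF);

# A range whose high end is below its low end is empty,
# so the slice built from it has no elements.
my @empty = @items[0 .. -1];
print scalar(@empty), "\n";    # prints 0

# join with a single list item returns that item unchanged;
# the separator is never used.
my $single = join(', ' => @items[0 .. 0]);
print "$single\n";             # prints ABC

# With two elements, the inner join sees one item and the
# outer join supplies the ' and '.
my $n = @items;
my $joined = join(' and ' => join(', ' => @items[0 .. ($n - 2)]), $items[-1]);
print "$joined\n";             # prints ABC and DEF
```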
The Perl solution in Rosetta Code does the same sort of thing, but encodes all of the cases in a conditional operator (this one takes a list instead of an array ref):
sub comma_quibbling(@) { return "{$_}" for @_ < 2 ? "@_" : join(', ', @_[0..@_-2]) . ' and ' . $_[-1]; }
Those work, but I think they are a bit too clever. I think most workaday programmers would have trouble recognizing the specification from those solutions. Until we have to optimize the problem, I'd rather translate the specification directly:
sub comma_quibbling {
    my( $x ) = @_;
    if( @$x == 0 ) { return '' }
    elsif( @$x == 1 ) { return $x->[0] }
    elsif( @$x == 2 ) { return join ' and ', @$x }
    elsif( @$x > 2 ) {
        return
            join ', ',
            @$x[0 .. $#$x - 2],
            join ' and ', @$x[-2, -1];
    }
}
I don't expect any of these to be much faster than any of the others. They are all doing the same thing presented a bit differently.
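One way to check that expectation is the core Benchmark module. The following is a sketch only: the sub names `quibble_slice` and `quibble_direct` and the sample input are assumptions made for the comparison, and the timings will vary by machine and perl version, so no results are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Slice-and-join version, following the first solution above.
sub quibble_slice {
    my $x = shift;
    my $n = @$x;
    $n or return '';
    $n == 1 and return $x->[0];
    return join(' and ' => join(', ' => @$x[0 .. ($n - 2)]), $x->[-1]);
}

# Direct translation of the specification, following the last solution.
sub quibble_direct {
    my( $x ) = @_;
    if( @$x == 0 ) { return '' }
    elsif( @$x == 1 ) { return $x->[0] }
    elsif( @$x == 2 ) { return join ' and ', @$x }
    elsif( @$x > 2 ) {
        return join ', ', @$x[0 .. $#$x - 2], join ' and ', @$x[-2, -1];
    }
}

my $words = [ qw(ABC DEF G H) ];   # hypothetical sample input

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    slice  => sub { quibble_slice($words) },
    direct => sub { quibble_direct($words) },
});
```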
As a stylistic matter I'd strongly recommend doing `my @x = @$x;` at the start. All that needless dereferencing makes it that unnecessary bit more ugly.
That said, I prefer the last example over the others, but am confused why you deal with the elses, when you could just make it a stack of returns with postfix-if conditions. (Even if you don't like postfix-if, you could just drop the els bits and lose no functionality.) Also, the last example seems to have no default case?
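For illustration, the last example rewritten along those lines might look like the sketch below. It is one reading of the suggestion, not code posted by anyone in the thread: the array is copied once, each case is a postfix-if return, and the final return serves as the default case.

```perl
use strict;
use warnings;

sub comma_quibbling {
    my( $x ) = @_;
    my @x = @$x;    # copy once so the rest needs no dereferencing

    return ''               if @x == 0;
    return $x[0]            if @x == 1;
    return join ' and ', @x if @x == 2;

    # Default case: three or more elements.
    return join ', ', @x[0 .. $#x - 2], join ' and ', @x[-2, -1];
}

print comma_quibbling([qw(ABC DEF G H)]), "\n";   # prints ABC, DEF, G and H
```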
I don't know if this is interesting, but this is Actual Code from my transit-schedule-presentation project (not just written for the exercise):
If I were going to handle the empty list as a possibility, I probably would have it croak rather than just return nothing.
And now I reread it and found the bug!
return "$things[0] and $things[1]" if @things == 2;
should have been
return "$things[0] $and $things[1]" if @things == 2;
Ha.
Thank you for pointing to this. I have had a lot of fun playing around with it.
First I thought all cases could collapse along this:
- join elements with comma
- replace last comma with "and"
(See the Python example on rosetta for comparison.)
But I think it is buggy because the last element can contain commas, too.
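A small sketch of that failure mode, using a hypothetical helper named `quibble_by_substitution` written only for this illustration (it is not the Rosetta Code Python example translated line by line):

```perl
use strict;
use warnings;

# Join everything with commas, then turn the last ", " into " and ".
sub quibble_by_substitution {
    my @words = @_;
    my $text = join ', ', @words;
    # The greedy .* makes this match the final ", " in the string.
    $text =~ s/(.*), /$1 and /s;
    return $text;
}

print quibble_by_substitution(qw(ABC DEF G)), "\n";
# prints ABC, DEF and G

print quibble_by_substitution('ABC', 'DEF', 'G, H'), "\n";
# prints ABC, DEF, G and H (the comma inside the last element was replaced)
```

The second call shows the problem: the substitution cannot tell a separator comma from one that belongs to an element, so the intended "ABC, DEF and G, H" never appears.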
Then I went back to brian's if-elsif solution and quite liked it, especially the formatting.
I played around with some given-when which is very readable, too.
In the meantime, Aaron posted his solution. I like it a lot, especially after some more formatting. Sadly I cannot get code in the comments nicely formatted.
Popping the last element (and getting rid of the array subscripting) seems favourable.
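A pop-based version could look roughly like this. It is a hedged sketch of the general idea, not Aaron's posted code, and it assumes the sub takes a plain list rather than an array ref:

```perl
use strict;
use warnings;

sub comma_quibbling {
    my @words = @_;               # copying keeps the caller's array intact
    return '' unless @words;

    my $last = pop @words;        # remove the final element
    return $last unless @words;   # it was the only one

    return join(', ', @words) . " and $last";
}

print comma_quibbling(qw(ABC DEF G H)), "\n";   # prints ABC, DEF, G and H
```

Because the last element is held in its own variable, no slice or negative subscript is needed, and a comma inside that element is left alone.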
Should solutions also contain the curlies {ABC and DEF} or are they irrelevant?
(It’s ugly in several ways to make this expect an arrayref so I won’t.)
I should say I find code massively irritating that explicitly lists every effectively possible condition and then repeats some code across several of the branches, esp. when the repeat must be written in a slightly different way each time. Don’t make me very carefully compare identical-in-intent code to figure out that it’s also identical in effect. Do your job. It’s almost always possible to write it without repeating yourself and without being too clever by half about factoring out the repetition – but you have to work at it and not stop at the first way of writing it that pops into your head. Or even the second.
(That is, btw, the generic “you” and not directly aimed at you, brian. The solution you proposed is not egregious (though I still find it irritating) – but the threshold for that is very low indeed. So I avoid writing code like that on principle.)
I understand what you are saying. I did consider factoring out the join, but otherwise the code doesn't repeat itself. But, even in your solutions you have repeated return statements. I'd probably remove all the returns I used, although many programmers don't seem to like it.
There are many dimensions to this, though. I mitigate the repeated structure by how I place the code on the line. Similar things line up.
But, if I write that sort of code, it's to hit the middle of the distribution of programmer skill. In a work environment, someone at your skill level doesn't even factor into it because there are probably not that many highly skilled programmers, and if there are, they are working on different things. You might immediately understand all of the other "tight" solutions, but I had to take a couple of minutes to figure them out and think about edge cases.
As such, the "do your job" depends on what the work situation is. I don't see my job as one of shortest typing or constantly tweaking something that works. When it comes to performance or optimization, sure, but this isn't one of those cases.
But still, I think your solution is better than mine. :)
Eliminating returns is a question of the overall structure, to me. I find code most readable that doesn’t have any nesting in its control flow – just straight line execution with “if you got here and condition X holds, do this and bail out” exit points. Then you can just read the code from top to bottom, and the conditions simply compound, like a narrowing funnel, so it’s easy to understand how you reach any given point, and what happens in which sequence. (Sometimes I actually factor pieces of code out into a sub or method just so I get to use `return` to write it like this.) In such a case I am very willing, glad in fact, to accept some amount of repetitive control flow structure at each exit point – you don’t have any reason to be comparing the structure of one exit point to that of another to make sure they are the same, so the repetition has no cognitive cost.
I don’t try to write code for tightness at any cost. That is what I meant by not stopping at the first way of writing it that comes into one’s head – I’m not satisfied with code that is simply “mathematically” well factored, it has to explain itself really clearly too. That may well be more verbose than the shortest or simplest solution possible; not all verbosity is bad, and not all of it even costs anything, as with the repetitive `return`s above. (But I do sometimes struggle between clarity to an experienced reader, where judicious use of implicitness can make code really concise, and to a novice, who might not be able to follow the exploitation of language features in such code. But I struggle because I feel the novice-friendly code is actually less clear – not because it’s insufficiently golfed and clever.)
(As for mitigating repetitions by way of aligning things: I consider that a fallback of last resort. Aligned structures are very fragile in the face of receiving maintenance. If new requirements come in that only affect some case or other, where they can’t be fit into the existing structure easily, alignment goes out the window very quickly, yet the repeated logic defaults to staying. This is bad. In this particular example you get away with it fine, because the repetition is trivial and the problem fairly circumscribed and static, so there’s no great danger. But I’ll rely on alignment only if I can’t think of anything better.)
As usual, you make very good points, and we just disagree on some of them.
However, I don't think that many people would disagree that no practice is safe from ongoing maintenance. :)
And I'm curious how you're typing in your replies. The mails I get with the messages have apparently zapped all the smart quotes. I wish MT was smarter about that sort of thing.
Oh, the mail comes as Latin-1 but the web view is UTF-8. Heh.
It's all clear now why Moose has such a heavy startup :-)