Don't use something or another

It seems that a lot of people are keen on telling us not to use CGI.pm, but to use something else instead. These discussions verge on religious fervour, with each side finding small problems with CGI.pm or its alternatives, and then telling us that these small problems are actually the end of the world.

I don't use CGI.pm, I haven't used it for at least ten years, and I'm not about to defend it, but since we're all telling people not to use something, I thought I would chip in with something which I don't think you should use.

Since about 2006 I've been running a web site which converts Japanese numbers into other kinds of numbers, and vice versa. For most of those years, until relatively recently, I used Lingua::JA::Numbers by Dan Kogai. Dan Kogai's module converts the numbers by changing the Japanese numerals into digits, then passing the result to an "eval" statement to compute the numerical value:

https://metacpan.org/source/DANKOGAI/Lingua-JA-Numbers-0.05/lib/Lingua/JA/Numbers.pm#L375

I'd like to argue that the "eval" statement is impossible to use correctly even for this limited case, based on about twelve years of nearly-endless bugs.

The first problem is that, to make sure that this eval statement works correctly, one has to validate the input sufficiently. The second problem is that, for whatever reason, people go to a web site which promises to convert Japanese numbers into Western numbers, and they type in their names, or addresses, or other random things. I recently computed the statistics for the site, and about twenty percent of the inputs over the last eight years (I don't have logs for the first four years) were just random characters or nonsense. So before trying to convert the input, I first had to check that it really was a number, and not someone's name or random ASCII or something.

Although this validation sounds like a relatively simple task on the face of it, no matter what code I wrote to validate the numbers, someone would input some random thing which passed all of my validation tests, and break it. The final straw was some nonsense input which actually looked like "decimal point" "ten to the power twenty", and caused yet more errors.
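To make this failure mode concrete, here is a toy sketch of the general substitute-then-eval technique. It is written in Python rather than Perl, it is not the code of Lingua::JA::Numbers, and the name `naive_eval_convert` and the substitution tables are invented for illustration. The validation step only checks that every character is a known numeral, which is roughly the kind of check that seems sufficient at first.

```python
# Toy illustration only: this is NOT the code of Lingua::JA::Numbers.
# All names and tables here are hypothetical.
DIGITS = {"一": "1", "二": "2", "三": "3", "四": "4", "五": "5",
          "六": "6", "七": "7", "八": "8", "九": "9"}
UNITS = {"十": "*10+", "百": "*100+", "千": "*1000+"}


def naive_eval_convert(text):
    """Substitute each numeral with a code fragment, then eval the result."""
    expr = ""
    for ch in text:
        if ch in DIGITS:
            expr += DIGITS[ch]
        elif ch in UNITS:
            expr += UNITS[ch]
        else:
            # "Validation": reject anything that is not a known numeral.
            raise ValueError(f"not a number character: {ch!r}")
    # Drop the dangling "+" left behind by a trailing unit.
    expr = expr.rstrip("+")
    return eval(expr)


print(naive_eval_convert("二百三十五"))  # the expression is "2*100+3*10+5"
```

Both of the following inputs consist entirely of valid numeral characters, so they pass the validation. The input "三三" silently becomes the expression "33", which is not what the characters mean, and "百百" becomes "*100+*100", which raises a SyntaxError inside the eval. Closing each such gap means checking the order and grouping of the numerals, which is most of the work of parsing the number.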

Finally I came to this conclusion: I just don't think it's possible to validate the input fully before sending it to an eval statement without actually doing the entire computation, which makes the eval statement completely redundant. So I suggest that if you're making a module and you think "eval" might be a good trick to do something, you might want to think again.
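As a rough sketch of what "doing the entire computation" can look like, the following Python function parses a kanji number directly and rejects anything it does not understand, with no eval involved. The function name and the rules are assumptions made for illustration, not the code used on the site: it handles only the numerals 一 to 九 with the units 十, 百, 千, 万, 億 and 兆, and it ignores zero, decimals, and positional notation such as 二〇二四.

```python
# A minimal sketch, not a complete converter. All names are hypothetical.
DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"万": 10**4, "億": 10**8, "兆": 10**12}


def parse_japanese_number(text):
    """Return the integer value of a kanji number, or raise ValueError."""
    if not text:
        raise ValueError("empty input")
    total = 0            # sum of the completed 万/億/兆 groups
    group = 0            # value of the current group (below 10000)
    digit = None         # a digit still waiting for its unit
    last_small = 10**4   # small units must strictly decrease within a group
    last_big = None      # big units must strictly decrease overall
    for ch in text:
        if ch in DIGITS:
            if digit is not None:
                raise ValueError("two digits in a row")
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            unit = SMALL_UNITS[ch]
            if unit >= last_small:
                raise ValueError("units out of order")
            # A unit with no digit in front of it counts once, as in 十 = 10.
            group += (digit if digit is not None else 1) * unit
            digit = None
            last_small = unit
        elif ch in BIG_UNITS:
            unit = BIG_UNITS[ch]
            if last_big is not None and unit >= last_big:
                raise ValueError("units out of order")
            if digit is not None:
                group += digit
                digit = None
            if group == 0:
                raise ValueError("big unit with no multiplier")
            total += group * unit
            group = 0
            last_small = 10**4
            last_big = unit
        else:
            raise ValueError(f"not a number character: {ch!r}")
    if digit is not None:
        group += digit
    return total + group


print(parse_japanese_number("二百三十五"))
print(parse_japanese_number("一万二千"))
```

In this sketch the validation and the computation are the same loop: an input is accepted exactly when the parser can assign it a value, so nonsense such as "三三" or "百百" raises a ValueError instead of reaching an interpreter.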

2 Comments

There are legitimate uses for eval when you are doing language-level stuff. I’m finding it hard to think of how to circumscribe them with a reasonably simple and reasonably accurate criterion though. To put it unhelpfully abstractly, eval is an appropriate tool when you want to generate code but not when you are just performing a computation. And in any case it needs to be fed stringently controlled input.

What you absolutely don’t want to do is treat random user input as code after just running some substitutions on it without ever actually parsing it yourself.

About Ben Bullock

Perl user since about 2006, I have also released some CPAN modules.