May 2013 Archives

Parallel Forking and Process Management

Stop me if you have heard this one before. You have a text file listing the files you need to process, one item per line. Handling this is fairly simple: you read a line in and process it, over and over again, until you have processed the whole list. This works great, but if that list is 40,000 items long and each item takes up to 30 seconds to run, it suddenly takes a very long time to finish (up to roughly 14 days if the items run one at a time). In this case, processing each item is just a system call to another CLI application with no shared resources, so the items can be processed in parallel with no fuss. For this task I am using Parallel::ForkManager, and here are the important bits:
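What follows is a minimal sketch of the pattern rather than the original listing. The command name `process-item`, the limit of 8 workers, and the helper `process_items` are all hypothetical placeholders. The sketch relies only on the documented `new`, `start`, `finish`, `run_on_finish`, and `wait_all_children` methods of Parallel::ForkManager.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Run @command once per item, with at most $max_workers children at a time.
# Returns the items whose command exited non-zero.
# (process_items is a hypothetical helper name, not part of the module.)
sub process_items {
    my ($items, $max_workers, @command) = @_;
    my @failed;

    my $pm = Parallel::ForkManager->new($max_workers);

    # Called in the parent whenever a child exits; $ident is the
    # value passed to start() below.
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident) = @_;
        push @failed, $ident if $exit_code != 0;
    });

    for my $item (@$items) {
        # start() forks. It returns the child's PID in the parent
        # (true, so the parent moves on) and 0 in the child.
        $pm->start($item) and next;

        # Child: list-form system() bypasses the shell, so file names
        # with spaces or metacharacters are passed through unchanged.
        my $status = system(@command, $item);
        $pm->finish($status == 0 ? 0 : 1);
    }

    $pm->wait_all_children;    # block until every child has exited
    return @failed;
}

# Assumed usage: perl script.pl files.txt
if (my $list_file = shift @ARGV) {
    open my $fh, '<', $list_file or die "Cannot open $list_file: $!";
    chomp(my @items = grep { /\S/ } <$fh>);    # skip blank lines
    close $fh;

    # 'process-item' and 8 are placeholders; substitute the real CLI tool
    # and a worker count suited to the machine.
    my @failed = process_items(\@items, 8, 'process-item');
    warn "failed: $_\n" for @failed;
}
```

The list is read into memory before any forking, which is cheap at 40,000 lines and keeps the children from sharing an open read handle with the parent. The worker limit is the main thing to tune: roughly the number of cores is a reasonable starting point when each item is CPU-bound.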

About Kimmel

I like writing Perl code, and since most of it is open source I might as well talk about it too. @KirkKimmel on Twitter