Thread pool for a boss/worker model
This is a pretty simple idea: a boss thread assigns work to a pool of worker threads that do nothing until some work enters their queue. This way the boss can fill the queue very quickly, and multiple back-end workers can consume it.
I'm using threads rather than an async framework because some of the work I'll be assigning is long-polling. The workers will hit REST API routes on another application, and some of those routes take up to 30 seconds to complete or have dependencies or follow-up work. Rather than block and spin in an async call, it's easier for my tasks to have a queue of work and workers that execute it.
Each task may have multiple ordered sub-tasks, so each worker is assigned a "task group". That way the worker can manage its own dependencies, and I can manage the total load on the server it's sitting on, the cloud, and the database. It's not about "speed" in the sense of less time to execute; it's more about keeping a small pipe filled and not blocking other workers during a long blocking call.
There aren't a lot of thread-pooling modules on CPAN that I could see. It's not a complex task, but there are a lot of things to think about. The Thread::Pool module causes problems with the Perl MongoDB module, and that's really the only recent option that seems to fit my problem set.
I ended up rolling my own module using threads::shared, Thread::Queue, and Thread::Semaphore. I spawn X threads, using the semaphore to hand out tokens so the number of threads stays at the right level. Workers get their work through non-blocking checks against shared queues. Each worker thread has a shared variable that controls its loop and kills it when it's time. This also allows for things like stopping the boss and waiting for the workers to drain the queue before killing the pool. You can add workers and stop individual workers. You can keep the workers running and kill the boss, or kill the workers and let the boss run. The boss and the workers have callback functions for executing their tasks. It's shiny and runs great.
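Roughly, the design described above could be sketched like this. This is not the module itself, just a minimal illustration under some assumptions: the names ($MAX_WORKERS, %run, spawn_worker, stop_worker, worker_loop) and the task-group layout (a hash with "name" and "steps") are made up for the example, and the callback only joins the step names instead of calling a REST API.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;
use Thread::Semaphore;

my $MAX_WORKERS = 3;                                  # hypothetical pool size
my $slots = Thread::Semaphore->new($MAX_WORKERS);     # one token per allowed worker
my $work  = Thread::Queue->new;                       # boss -> workers
my $done  = Thread::Queue->new;                       # workers -> boss
my %run :shared;                                      # per-worker "keep looping" flags

sub worker_loop {
    my ($id, $callback) = @_;
    while (1) {
        # Read the flag before polling, so that a cleared flag plus an
        # empty queue really means there is nothing left to drain.
        my $stopping = do { lock(%run); !$run{$id} };
        my $group = $work->dequeue_nb;                # non-blocking check
        if (defined $group) {
            $done->enqueue($callback->($group));
            next;
        }
        last if $stopping;
        select(undef, undef, undef, 0.05);            # brief idle sleep
    }
    $slots->up;                                       # hand the token back
}

sub spawn_worker {
    my ($id, $callback) = @_;
    return unless $slots->down_nb;                    # no free token, no new thread
    { lock(%run); $run{$id} = 1; }
    return threads->create(\&worker_loop, $id, $callback);
}

sub stop_worker {
    my ($id) = @_;
    lock(%run);
    $run{$id} = 0;                                    # worker exits once the queue is empty
}

# Worker callback: runs the ordered sub-tasks of one task group.
# Here it only records the order; real work would go in its place.
my $callback = sub {
    my ($group) = @_;
    return join ':', $group->{name}, join('>', @{ $group->{steps} });
};

# Boss: ask for one worker too many; the semaphore refuses the extra one.
my @pool = map { spawn_worker($_, $callback) } 1 .. $MAX_WORKERS + 1;

$work->enqueue({ name => "g$_", steps => [qw(create poll verify)] }) for 1 .. 5;

# Clear the flags only after everything is queued, then wait for the drain.
stop_worker($_) for 1 .. $MAX_WORKERS;
$_->join for @pool;

my @results;
while (defined(my $r = $done->dequeue_nb)) { push @results, $r }
@results = sort @results;
print "$_\n" for @results;
```

The flag is read before the queue is polled on purpose: if the order were reversed, a worker could see an empty queue, miss work the boss adds a moment later, and then exit on the cleared flag with items still pending.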
I'm 99% certain my code will pass any test case I throw against it and it will work fine for what I'll be using it for (work queue for a back-office cloud controller).
Did I miss a module? Is there something better out there that is less intrusive than Thread::Pool? If not, I'll clean up the POD on this and submit it to CPAN, but I'd rather not add Yet Another CPAN Module to a problem that already has 10 different variations.