The problem with these for my application lies in the additional dependency of a job server. Both of these server daemons are written in C, and I could find packages for them, but that's yet another dependency chain to track. A partial framework already exists for this design and it's all in perl, and I'd rather continue to update this design (DB backend), which works great, than switch to a new client/server model.
I've finished a rough draft of Thread::Workers. I need to polish its edges a bit and make it pretty before I expose it to the world.
It's a simple pool/queue design with no priorities - in my case, I need FIFO access for all queues. I assume if one wanted priorities, one could create multiple pools of boss/workers and manage the priorities from the main application thread using this module.
For me, it's enough to create a single pool. I hook the boss's fetch-work callback to a mongo query. If the query returns results, they are fed into a FIFO queue, which the idle workers pick up on a first come, first served basis.
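To make the boss side concrete, here is a minimal sketch of a fetch callback feeding a FIFO queue. This is not the Thread::Workers code: `boss_loop`, `$fetch` and the canned `@pending` batches are hypothetical names for illustration, the mongo query is replaced by a stub, and `end` assumes Thread::Queue 3.01 or later.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical boss loop: call the fetch callback $polls times and feed
# whatever it returns into the shared FIFO queue.
sub boss_loop {
    my ($fetch_cb, $queue, $polls) = @_;
    for (1 .. $polls) {
        my @jobs = $fetch_cb->();
        $queue->enqueue(@jobs) if @jobs;
    }
    $queue->end;    # tell blocked workers that no more work is coming
}

my $queue = Thread::Queue->new;

# Stand-in for the DB query: each poll returns the next canned batch.
my @pending = ([qw(job1 job2)], [], [qw(job3)]);
my $fetch   = sub { @{ shift(@pending) || [] } };

# A single worker records the order in which it receives jobs.
my $worker = threads->create(sub {
    my @seen;
    while (defined(my $job = $queue->dequeue)) {
        push @seen, $job;
    }
    return join ',', @seen;
});

boss_loop($fetch, $queue, 3);
my $order = $worker->join;
print "$order\n";
```

With one worker the jobs come out in the order the boss queued them, which is the FIFO behaviour I need.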
Gearman is definitely lean in terms of dependencies, but it's single threaded and I'd still have to manage multiple worker threads (and a server thread and a manager to stuff tasks), or just have separate apps to run each of these, though the complexity drops a bit. Fits most of the requirements but has stuff I don't want or need (socketed client/servers, IPC).
Hopkins is a static job runner - they call themselves a better cron job, and this fulfills part of what I need, but it requires static configuration of XML up front. Under the hood, it runs POE::Component::JobQueue for its heavy lifting, which is async and not threaded, and will block on some of my tasks. Not quite what I need.
TheSchwartz/Helios - also has the ability of registering capabilities, but adds in the requirements of a DBI driven database. Much closer to what I'm looking for, but a bit of overkill. Also async and most likely will block on a few of my tasks.
My problem was I was also searching for lighter-weight thread pooling modules, to use across multiple apps for consistency, and I didn't see the modules you listed. I don't need any real IPC other than to signal a thread to exit its loop. I don't really want/need additional sockets open, other than a single outbound connection to a DB or job queue.
One task for this thread pool will be on a VM with only a single worker to manage jobs coming in from a master controller. A single threaded queue/run program would actually suffice here but there are a couple of situations where a system will be heavily using its workers and I'd like them to be able to service more than one at a time.
It's really the controller(s) I'm worried about. But I don't need a million workers, I'm starting with 5 and will be scaling to ~25, and Future Growth may increase that. I just was looking for a Simple Thread Pool management module, very much like Thread::Pool, except one that worked with non-thread friendly modules like MongoDB. ;(
Zeromq is pretty cool from what I can tell! It's just overkill for what I need on the messaging/queue side.
It's *thread management* that I really want to make sure I'm doing right.
I'm using threading and not an async thing because some of the work I'll be assigning to threads is long-polling operations. The workers will hit some REST API route on some other application, and some of those routes take up to 30 seconds to complete or have dependencies or followup work. Rather than block and spin in an async call, for my tasks, it's easier to have a queue of work and workers that execute them.
Each task may have multiple sub-tasks associated that are ordered, so each worker will be assigned a "task group". This way it can manage its own dependencies and I can manage the total load on the server it's sitting on, the cloud, and the database. It's not about "speed" in the sense of less time to execute, it's more about keeping a small pipe filled and not blocking other workers while on a long blocking call.
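A rough sketch of what a worker could do with one "task group": run the ordered sub-tasks and stop at the first failure so later steps never run without their dependencies. The group layout, `run_task_group` and the step names are all made up for illustration. Code refs can't be passed through a Thread::Queue, so a real pool would more likely queue a group ID and build the steps inside the worker.

```perl
use strict;
use warnings;

# Hypothetical task-group runner: execute the steps in order and stop at
# the first one that dies, reporting what completed and what failed.
sub run_task_group {
    my ($group) = @_;
    my @done;
    for my $step (@{ $group->{steps} }) {
        my $ok = eval { $step->{run}->(); 1 };
        unless ($ok) {
            return { ok => 0, done => \@done, failed => $step->{name} };
        }
        push @done, $step->{name};
    }
    return { ok => 1, done => \@done };
}

my $group = {
    name  => 'provision-vm',
    steps => [
        { name => 'create',    run => sub { 1 } },
        { name => 'configure', run => sub { die "API timed out\n" } },
        { name => 'start',     run => sub { 1 } },
    ],
};

my $result = run_task_group($group);
print $result->{ok}
    ? "all steps done\n"
    : "stopped at $result->{failed}\n";
```

Here the 'start' step is never attempted because 'configure' failed first.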
There aren't a lot of thread pooling type modules that I could see on CPAN. It's not a complex task, but there are a lot of things to think about. The Thread::Pool module causes problems with the perl MongoDB module, and that's really the only recent option that seems to fit my problem set.
I ended up rolling my own module using threads::shared, Thread::Queue, and Thread::Semaphore. I basically spawn X threads, using the semaphore as a creator of the tokens to keep the number of threads at the right level. I use shared queues and non-blocking checks against the queue from the worker to get its work. I use a shared variable with each worker thread to control its loop and kill it when it's time. This also allows for things like stopping the boss and waiting for the workers to drain the queue before killing the pool. You can add workers and stop individual workers. You can keep the workers running and kill the boss, or kill the workers and let the boss run. Boss/workers have callback functions for executing their tasks. It's shiny and runs great.
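A stripped-down sketch of the worker side, assuming a perl built with thread support. It is not the module itself: `run_pool`, `$stop` and the 50 ms idle sleep are illustrative choices, it uses one shared stop flag for the whole pool rather than one per worker, and it leaves out the Thread::Semaphore bookkeeping. It shows the non-blocking queue check, the shared variable controlling the loop, and draining the queue before the pool shuts down.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

# Hypothetical helper: push every item through $work_cb using
# $worker_count threads, drain the queue, and return the results.
sub run_pool {
    my ($worker_count, $work_cb, $items) = @_;

    my $work_q   = Thread::Queue->new;    # FIFO of pending work
    my $result_q = Thread::Queue->new;    # results coming back from workers
    my $stop :shared = 0;                 # shared flag that ends the worker loops

    my @workers = map {
        threads->create(sub {
            while (1) {
                # Read the flag BEFORE checking the queue, so a worker only
                # exits after seeing an empty queue once stop was requested.
                my $stopping = do { lock($stop); $stop };
                my $item = $work_q->dequeue_nb;    # non-blocking check
                if (defined $item) {
                    $result_q->enqueue(scalar $work_cb->($item));
                    next;
                }
                last if $stopping;
                select(undef, undef, undef, 0.05); # idle briefly, then re-check
            }
        });
    } 1 .. $worker_count;

    $work_q->enqueue(@$items);
    { lock($stop); $stop = 1; }    # request stop; workers drain the queue first
    $_->join for @workers;

    my @results;
    while (defined(my $r = $result_q->dequeue_nb)) {
        push @results, $r;
    }
    return @results;
}

my @doubled = sort { $a <=> $b } run_pool(3, sub { $_[0] * 2 }, [1 .. 10]);
print "@doubled\n";
```

Because the stop flag is set only after the work is queued, and each worker reads the flag before it checks the queue, nothing is left behind when the workers exit. Work items and results must be defined scalars here, since an undef from `dequeue_nb` is how a worker detects an empty queue.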
I'm 99% certain my code will pass any test case I throw against it and it will work fine for what I'll be using it for (work queue for a back-office cloud controller).
Did I miss a module? Is there something better out there that is less intrusive than Thread::Pool? If not I'll clean up the POD on this and submit it to CPAN but I'd rather not duplicate with Yet Another CPAN Module that already has 10 different variations.
I'm autodidactic, but I get to the point where my growth is limited by my ability to talk through problems with other people. I live in perl-land. And it's awesome that I don't have to change my toolset to keep up with the times or to keep away from social ignorance.
I'll admit that part of the reason I chose to write my blog here is the recent number of posts I've seen about Sexual Harassment and what it means to be a Member of a Community that has standards. The more voices in a community speak out, the more the community *is* those voices.
I won't post anything else about it - this isn't 'my community' in that I don't attend conferences, I don't go to meet ups or anything like this, and I have a Perl Monks account but never log in (every question I need to ask has already been asked!). I guess I'm making it my community by actively voicing my concern and opinion, and it would be nice to talk about perl problems with people who actually know perl!
One thing good about this community: I haven't seen anyone who says "groping a stranger at a conference is acceptable behavior." I have seen some dissension over whether comparing Moose to augmented breasts is considered sexual harassment, but even that has been a minor backlash - most people completely understand the lines being drawn. Those that don't sound relatively immature.
I think it's pretty easy - there are Communities of Like Minded People. There are Communities of People with Similar Interests. The Perl Community is a group of people with similar interests, not necessarily Like Minded.
Reddit, for example, is a community of Like Minded People. I have a friend who is a redditor - I refuse to touch that site simply because I am a bit too easily offended by thoughtless discriminatory humor. But I'm not going on there telling people they shouldn't post their junk - that's their community, a random place for random people who aren't easily offended and don't want to think beyond the next bit of humor or 'wow' feeling.
Perl events, perl websites, perl blogs, are about *this* community, and I think this sort of outcry of support for a safe-from-negative-social-stigmas environment is Right and Good.
So I'm adding my voice to the many others who want a community where we can read about perl, discuss perl, get help with perl, and write silly perl poetry and make code jokes without having to worry about the harassing behavior. Or the months of blog posts and discussions that inevitably follow these incidents.
In words that are simply for the Brogrammers to understand:
Flaming me over which is better, '$x unless $y' vs 'if $y $x', that's acceptable.
Making a joke about my preference that involves genitalia or topics that would make a typical grandmother blush, not acceptable.
@vti/Ranguard -
Yep, gone through all those sites, and Task::Kensho is a fantastic reference for CPAN modules!
@Joel -
I just started using the MongoDBI code in a project. It's an interesting idea and requires some work to set up, but it's keeping the rest of the code cleaner, which I like!
This is probably where things have changed the most. You'd find the GUI toolkits had a decent amount of 'OOP-ness' to them, but it seems like everything is based on Moose now.
Hand-rolled mysql connections and 50k lines of code aren't good practice anymore - not that they ever were, but it was at least accepted and common practice. I've started using MongoDB as well, and its collection-of-documents metaphor makes a lot of sense when you can pair JSON and perl data types so easily.
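As a small illustration of how directly JSON maps onto perl data types, here is a round trip using the core JSON::PP module. The document contents are invented for the example, and it doesn't touch MongoDB at all.

```perl
use strict;
use warnings;
use JSON::PP;

# A JSON document becomes a plain perl hash ref: objects map to hashes,
# arrays map to array refs.
my $json = '{"name":"job1","steps":["create","start"]}';
my $doc  = decode_json($json);

# Work with it like any other perl data structure.
push @{ $doc->{steps} }, 'verify';

# canonical() sorts hash keys so the encoded output is stable.
my $round_trip = JSON::PP->new->canonical->encode($doc);
print "$round_trip\n";
```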
Perl has moved on since the early 5.008 days, but the web itself is more like a glacier - slowly moving and slowly shedding the older cruft from search results, but at a slower pace than before. Searching for results will give you tutorials from the late 90s alongside a tutorial from 2011 that both show a
C/C++ with the new 2011 standard will face the same problem - new ways of doing things will be overshadowed by the glacier of static history for many years to come.
There are plenty of CPAN modules that stay updated, and new maintainers take over from the old. I haven't run across a module yet that hasn't worked, even if it wasn't updated in the last 4 years.
To be fair, I've been programming perl for years professionally as duct tape and glue, and I only recently decided to really investigate how far the language has come as I'm using it in a much more extensive project.
And in 2012, I finally learned how far perl has come in the last 4 years. The glacier is slowly moving on. I'm blogging about it, so I guess I am too.