Parallel programming with fork() and tail()ing logs
I recently worked on a small freelance project: a log watcher, plus another tiny program watching the watcher (a watcher-watcher).
Basically, it's an extended `tail -f` that simultaneously watches multiple logs generated by some persistent programs and sends alerts based on configuration data. So there were two main questions to settle before starting to code:
- Use threads or use fork()?
- Hand-craft the tailing code, or check whether there is already a module for that?
Oh, there are also other side issues, like fetching the configuration and sending emails.
For the configuration part, the client initially thought about connecting to a database and SELECTing the configuration from an (Oracle) table. That meant we'd need either DBD::Oracle or DBD::ODBC, and apart from having to make sure those work everywhere, they are still dependencies after all, and just to fetch a configuration. So the client dismissed that idea and we decided that the code would fetch the configuration from an HTTP URL instead (how they implement that side is not my concern).
At this point, the problem of GET()ing that URL arose. The code will run on machines with different setups, and it's impossible to assume that LWP will be pre-installed (although some distributions are known to bundle it), and making LWP a dependency would pull in a bunch of other modules too (they are not using Perl otherwise; it's only needed for this project). So I thought that maybe I could use sockets directly, or perhaps someone had already written a low-calorie user agent.
And as one can assume, CPAN already has that wheel re-invented by Adam "Tiny" Kennedy as HTTP::Lite (one might expect it to be named HTTP::Tiny, though). As a side note: I also noticed that perlbrew uses it, and I even submitted a patch to add support for mirror selection (that unknown committer is me; I forgot to set the Git environment variables).
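Fetching the configuration with it boils down to a few lines. A minimal sketch, assuming a made-up endpoint URL and using HTTP::Lite's documented `request`/`body` calls:

```perl
use strict;
use warnings;
use HTTP::Lite;

my $conf_url = 'http://example.com/watcher/config';   # assumed endpoint

my $http = HTTP::Lite->new;
my $code = $http->request( $conf_url )
    or die "Unable to fetch configuration: $!\n";
die "Configuration server returned HTTP $code\n" if $code != 200;

my $raw_conf = $http->body;   # parse this however the config is encoded
```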
For sending email, I chose MIME::Lite as it's a light module (as the name suggests) and I like its API. SASL authentication support was also added to it recently, and although we didn't need it, it's easy to enable when needed.
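Sending an alert with it looks roughly like this; the addresses, subject and SMTP host are placeholders, not the project's real settings:

```perl
use strict;
use warnings;
use MIME::Lite;

my $msg = MIME::Lite->new(
    From    => 'watcher@example.com',
    To      => 'oncall@example.com',
    Subject => 'Log alert',
    Data    => "Pattern matched in application.log\n",
);

# send via a local or configured SMTP relay
$msg->send( 'smtp', 'smtp.example.com', Timeout => 60 );
```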
For parallel processing I decided to use fork(), as it seemed more natural than using threads; I hadn't really used threads before, and there was a possibility of running on non-threaded perls. Actually, I hadn't used fork() in production code before either, so it seemed like a good opportunity to exercise it extensively. The usage is quite simple, as you know:
```perl
foreach my $foo ( @logs ) {
    # pre-init stuff here(*)
    my $pid = fork();
    if ( $pid ) {
        # parent: remember the child's PID
        push @children, $pid;
        $counter++;
    }
    elsif ( defined $pid && $pid == 0 ) {
        # child
        # real action happens here
    }
    else {
        die "Couldn't fork: $!\n";
    }
    # other stuff here
}
```

I collect the children's PIDs in the code, to wait for them below (preventing zombies) and also to kill them if needed (and it is needed!):
```perl
foreach my $pid ( @children ) {
    waitpid( $pid, 0 );
}
```
`$counter` is there to limit the forks to a constant value and prevent fork bombs.
```perl
# pre-init stuff here(*)
if ( $counter >= FORK_LIMIT ) {
    warn "A warning to inform that the program will "
        . "discard any logs from now on\n";
    warn "The user either has to change the hard coded limit "
        . "or create another instance\n";
    last;
}
```

For some reason, the user (or the system admin) may decide to stop the watcher program. It's easy to kill it or hit
[CTRL]+C, but that's not a good user interface and it's messy. So it's better to have a nice command like `$0 -stop`. And the simplest way to implement `$0 -stop` is:
```perl
# the SIGINT/SIGKILL constants are assumed to come from POSIX
# (use POSIX qw( SIGINT SIGKILL ))
local $SIG{INT} = sub {
    kill SIGKILL, @children;
    warn "All stopped.\n";
    exit 0;
};
```

And notice that, just before the program exit()s, it calls the END blocks, and since I did an OO implementation, the DESTROY method of the object will also be called, where I can do some cleanup and save state to disk.
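That cleanup hook might look roughly like this; the `save_state_to_disk` helper and the `lock_file` attribute are hypothetical names, not the project's actual ones:

```perl
sub DESTROY {
    my $self = shift;
    # hypothetical helper that serialises the current state to disk
    $self->save_state_to_disk if $self->can('save_state_to_disk');
    # remove the PID lock file (described below) if this instance created it
    unlink $self->{lock_file} if $self->{lock_file} && -e $self->{lock_file};
    return;
}
```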
But how will $0 (the program) know which process to send the `INT` signal to (where it'll be caught by `$SIG{INT}`)?
That needs a little trick used by a lot of programs, even by your cpan shell. Basically, the program creates a lock file when it starts and saves its PID into it. When someone executes the `$0 -stop` command, the program does not over-write (re-create) the lock file; instead, it reads it, sends a kill to the other process, and then exits:
```perl
kill SIGINT, $pid_from_lock_file;
```

Another requirement was to check for a new configuration automatically every hour. I wrote this as the last part of the code, with an alarm handler, which proved easy to implement in the end.
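Putting the lock file and the hourly refresh together, a minimal sketch of how those bits might look; the lock-file path, the one-hour interval and the `refresh_configuration` helper are assumptions for illustration, not the project's actual code:

```perl
use strict;
use warnings;
use POSIX qw( SIGINT );

my $LOCK_FILE     = '/tmp/logwatcher.pid';   # assumed location
my $CONF_INTERVAL = 60 * 60;                 # re-fetch the configuration hourly

if ( @ARGV && $ARGV[0] eq '-stop' ) {
    # -stop mode: read the PID from the lock file and signal the
    # running instance instead of starting a new one
    open my $fh, '<', $LOCK_FILE or die "No running instance found: $!\n";
    chomp( my $pid_from_lock_file = <$fh> );
    close $fh;
    kill SIGINT, $pid_from_lock_file;
    exit 0;
}

# normal startup: record our own PID so a later "-stop" can find us
open my $fh, '>', $LOCK_FILE or die "Can't create lock file: $!\n";
print {$fh} "$$\n";
close $fh;

# hourly configuration refresh via an alarm handler
sub refresh_configuration { warn "re-fetching the configuration\n" }  # stand-in

$SIG{ALRM} = sub {
    refresh_configuration();
    alarm $CONF_INTERVAL;    # re-arm for the next hour
};
alarm $CONF_INTERVAL;
```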
The whole thing sounds easy, right? But the real implementation took a little longer than I anticipated, because there were some other requirements around it that made things a bit more complex. The main logic is really simple, though. However, the problem with this kind of IPC is that every child gets copies of everything instead of a simple shared variable (that also took me a little while to realise). So if you set some flags in a child, you can't get them back in the parent or in the other siblings. To do that you need a shared memory area where everyone can read and write.

I decided to use IPC::ShareLite. From a "normal" programming point of view it has a ridiculous API and only supports strings as the shared storage, but there are workarounds for that, and the suggested one is to use the excellent Storable module to store and fetch complex structures (which I did). If you are wondering why I needed shared memory at all: it's needed to implement a pause() functionality, where the watcher stops sending emails for a defined period of time when a threshold is reached (and some other stuff).
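Roughly how that shared state might look with IPC::ShareLite plus Storable; the key, the `paused_until` field and the 15-minute window are illustrative only:

```perl
use strict;
use warnings;
use IPC::ShareLite;
use Storable qw( freeze thaw );

# a shared segment every process (parent and children) attaches to
my $share = IPC::ShareLite->new(
    -key     => 1971,      # any agreed-upon key
    -create  => 'yes',
    -destroy => 'no',
) or die "Can't create shared memory: $!";

# in a child: raise the pause flag when the alert threshold is hit
$share->lock;
$share->store( freeze( { paused_until => time + 15 * 60 } ) );
$share->unlock;

# in the parent (or a sibling): consult the flag before sending an alert
my $state = thaw( $share->fetch || freeze( {} ) );
if ( ( $state->{paused_until} || 0 ) > time ) {
    # inside the pause window: skip the email
}
```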
At this point, all I had written was the wrapper code; I hadn't actually written the code that tails the logs. For tailing, it seemed simple enough to open() the file and then use a loop to get the new entries. But that turned out to be a bad idea once I realised that I also needed to check whether the file had been rotated, and handle some other corner cases. So, after losing a couple of hours,
I ended up using File::Tail.
It really has more in it than meets the eye and I strongly suggest using it instead of implementing your own. It takes care of pretty much everything needed to implement `tail -f` in Perl.
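For reference, the basic File::Tail loop looks roughly like this; the file name and the tuning parameters are just placeholders:

```perl
use strict;
use warnings;
use File::Tail;

my $tail = File::Tail->new(
    name        => '/var/log/application.log',  # placeholder path
    maxinterval => 60,   # poll at most once a minute when the log is quiet
    adjustafter => 7,    # back off after a few empty reads
);

while ( defined( my $line = $tail->read ) ) {
    # hand the new line over to the alert logic here
    print $line;
}
```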
But I didn't just use File::Tail; I decided to improve it, as leaving such handy code in abandonment (not updated for five years, with open bugs) seemed like a bad idea. I forked the code from Michael Schwern's gitpan project and, since I don't like git much, I created a new Mercurial repository on Bitbucket at bitbucket.org/burak/cpan-file-tail.
I did a major refactoring, improved the unit tests, fixed Windows issues (apart from rotation detection) and implemented `$/` support, since it's on the TODO list and someone had requested it. I haven't done any benchmarks yet to see if there are speed issues, but feel free to fork it or test it. I've also opened an RT ticket to inform the original author. If I don't get a response, or he rejects it, I'll probably fork it on CPAN under a new name.
A few things:
1. Interesting! Thank you for this post.
2. Great job on File::Tail. You won't necessarily have to fork it. If the author is AWOL and you've made a solid attempt at tracking him down (but failed), you can simply ask PAUSE admins for co-ownership. I reckon it's better than forking. If you indeed fork it, I suggest writing a CPAN rating for the original module mentioning your fork. Also, include the name "File::Tail" in yours so it comes up in searches.
3. I would use LWP::UserAgent and bundle it in if I cannot be sure it will be installed there.
4. Beyond that, I would simply use POE, since it's such solid ground that it already implements all of this; it's async and has tail, useragent, etc. It also uses forks in the background, not threads.
Check out POE::Component::Client::HTTP and POE::Wheel::FollowTail.
Instead of using POE, it'd be good to have an AnyEvent::Tail or something like that, so it can be used with all sorts of event loops.
1. Why SIGKILL the children and not SIGTERM them?
2. You may be interested in Proc::Fork – it’ll let you write your loop like so:
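A minimal sketch of that style, using Proc::Fork's `run_fork`/`parent`/`child`/`error` blocks (the loop body is just adapted from the fork() example above):

```perl
use Proc::Fork;

foreach my $foo ( @logs ) {
    run_fork {
        parent {
            my $child_pid = shift;
            push @children, $child_pid;
            $counter++;
        }
        child {
            # real action happens here
        }
        error {
            die "Couldn't fork: $!\n";
        }
    };
}
```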
(Note that you get `die "Can't fork: $!\n"` for free if you don't include your own error block.)

Yes, POE can be used with all sorts of event loops too.
You're welcome, Sawyer X.
2. I just opened an RT ticket and will wait for a couple of months.
3. As I said, the client does not use Perl directly or indirectly; they only needed it for this project, and LWP has many prereqs. I wanted to keep the prereqs to a minimum so that they don't have to install many modules on many machines (as you can guess, they do not have a Perl deployment process). HTTP::Lite is good for this job.
4. Again, POE is sometimes a PITA to install, and it's a big dependency. If the client were a Perl shop, I'd have considered POE :)
@Aristotle
1. Hmm... Good point. I don't have a solid answer, but since the child does not manipulate any files, I guess it's better to just get rid of it.
2. Proc::Fork looks nice. I missed it :)
Fair enough. :)