Parallel programming with fork() and tail()ing logs

I've worked on a small freelance project recently. It was a log watcher plus another tiny program watching it (a watcher watcher). Basically, it's an extended `tail -f` that watches multiple logs generated by some persistent programs simultaneously and sends alerts based on configuration data. So there were two main problems to settle before starting to code:

  1. Use threads or use fork()?
  2. Hand-craft the tailing code or check if there is already a module for that?

Oh, there are also other side issues like fetching configuration and sending emails. For the configuration part, the client initially thought about connecting to a database and SELECTing the conf from an (Oracle) table. That would have meant using either DBD::Oracle or DBD::ODBC, and apart from having to make sure these dependencies work, they are dependencies after all, and just to get a configuration. So the client dismissed that and we decided that the code would fetch the conf from an HTTP URL instead (and I don't care how they implement that end).

At this point, the problem of GET()ing the URL arose. The code will run on machines with different setups and it's impossible to assume that LWP will be pre-installed (although some distributions are known to bundle it), and making LWP a dependency would pull in a bunch of other modules too (they are not otherwise using Perl; it's needed only for this project). So I thought that maybe I could use sockets directly, or perhaps someone had already written a low-calorie user agent.

And as one can assume, CPAN already has that wheel re-invented by Adam "Tiny" Kennedy as HTTP::Lite (one might expect it to be named HTTP::Tiny, though). As a side note: I noticed that perlbrew uses it too, and I even submitted a patch to add support for mirror selection (that unknown committer is me; I forgot to set the git ENV variables).
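
Just to illustrate the idea, fetching the configuration with HTTP::Lite looks roughly like this; the URL is made up and real code would need more error handling:

    use HTTP::Lite;

    my $url  = 'http://config.example.com/watcher.conf';  # hypothetical URL
    my $http = HTTP::Lite->new;
    my $code = $http->request( $url )
                    or die "Unable to fetch the configuration: $!";
    die "Server returned HTTP status $code\n" if $code != 200;
    my $raw_conf = $http->body;  # the configuration payload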

For sending email, I chose MIME::Lite as it's a light module, as the name suggests, and I like its API. Also, SASL authentication support was recently added to it, and although we didn't need it, it's easy to enable when needed.
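
As a rough sketch of how an alert goes out with MIME::Lite (the addresses, host and message body are placeholders, not the project's actual values):

    use MIME::Lite;

    my $msg = MIME::Lite->new(
        From    => 'watcher@example.com',   # placeholder sender
        To      => 'admin@example.com',     # placeholder recipient
        Subject => 'Log alert',
        Type    => 'TEXT',
        Data    => "A configured pattern matched in one of the logs\n",
    );

    # plain SMTP; AuthUser/AuthPass can be passed here once SASL is needed
    $msg->send( 'smtp', 'mail.example.com', Timeout => 60 );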

For parallel processing I decided to use fork(), as it seemed more natural than using threads, I hadn't really used threads before, and there was a possibility of non-threaded perls on the target machines. Actually, I hadn't used fork() in production code before either, and it seemed like a good opportunity to test its usage extensively. The usage is quite simple, as you know:


    foreach my $foo ( @logs ) {
        # pre-init stuff here(*)
        my $pid = fork();
        if ( ! defined $pid ) {
            # fork() failed
            die "Couldn't fork: $!\n";
        }
        elsif ( $pid ) {
            # parent: remember the child and count it
            push @children, $pid;
            $counter++;
        }
        else {
            # child ($pid == 0)
            # real action happens here
        }
        # other stuff here
    }

I collect the child PIDs in the code so I can wait for them below to prevent zombies, and also to kill them if needed (and it is needed!):

    foreach my $pid ( @children ) {
        waitpid($pid, 0);
    }


$counter is there to limit the forks to a constant value to prevent fork bombs.

        # pre-init stuff here(*)
        if ( $counter >= FORK_LIMIT ) {
            warn "A warning to inform that the program will "
                ."discard any logs from now on\n";
            warn "The user either has to change the hard coded limit "
                ."or create another instance\n";
            last;
        }

For one reason or another, the user (or the system admin) may decide to stop the watcher program. It's easy to kill it or hit [CTRL]+C, but that's not a good user interface and it's messy. So, it's better to have a nice command like "$0 -stop". And the simplest way to implement "$0 -stop" is:

    local $SIG{INT} = sub {
        # SIGKILL is assumed to be imported (e.g. from POSIX);
        # kill 'KILL', @children works as well
        kill SIGKILL, @children;
        warn "All stopped.\n";
        exit 0;
    };

And notice that, just before the program exit()s, it'll call the END blocks, and since I did an OO implementation, the DESTROY method of the object will be called, where I can do some cleanup and save state onto the disk.
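
Something along these lines works; the attribute names here are invented for the example, and Storable's nstore() is just one way to write the state out:

    sub DESTROY {
        my $self = shift;
        # save whatever should survive a restart, then tidy up
        require Storable;
        Storable::nstore( $self->{state}, $self->{state_file} );
        unlink $self->{lock_file} if $self->{lock_file} && -e $self->{lock_file};
        return;
    }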

But how will $0 (the program) know which process to send the INT signal to (where it'll be caught by $SIG{INT})? That needs a little trick used by a lot of programs, and even by your cpan shell. Basically, the program creates a lock file when it starts and saves its PID into it. When someone executes the "$0 -stop" command, the program does not over-write (re-create) the lock file; instead it reads it, sends a kill to the other process and then returns (exits):


    kill SIGINT, $pid_from_lock_file;
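
For concreteness, here is a minimal sketch of that lock-file handshake; the file path and the $opt_stop flag are assumptions for the example, not the actual project code:

    # hypothetical lock-file handling
    my $lock_file = '/var/run/logwatcher.pid';

    if ( $opt_stop ) {
        open my $fh, '<', $lock_file
            or die "No lock file ($lock_file); is the watcher running? $!";
        chomp( my $pid_from_lock_file = <$fh> );
        close $fh;
        kill SIGINT, $pid_from_lock_file;  # caught by the $SIG{INT} handler above
        exit 0;
    }

    # normal start-up: record our own PID for a later "-stop"
    open my $pid_fh, '>', $lock_file or die "Can't create $lock_file: $!";
    print {$pid_fh} "$$\n";
    close $pid_fh;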

Another requirement was to automatically check every hour whether there is a new configuration. I wrote this as the last part of the code with an alarm handler, which in the end proved to be easy to implement.
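
The hourly check boils down to something like the following; fetch_configuration() is a made-up name standing in for the real refresh routine:

    # re-read the configuration roughly every hour
    local $SIG{ALRM} = sub {
        fetch_configuration();   # GET the conf URL again and apply any changes
        alarm 3600;              # re-arm the timer for the next hour
    };
    alarm 3600;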

The whole thing sounds easy, right? The real implementation took a little longer than I anticipated, because there are some other requirements around it that made things a little more complex, but the main logic is really simple. However, the problem with fork()-based IPC is that you get copies of everything instead of a simple shared variable (that also took a little while to realise). So, if you set some flags in a child, you can't get them back in the parent or in the other siblings. To do that you need a shared memory area where everyone can read and write. I decided to use IPC::ShareLite. From a "normal" programming point of view, it has a ridiculous API and only supports strings as shared memory storage, but there are workarounds for that, and the suggested one is to use the excellent Storable module to store and fetch complex structures (which I did). If you are wondering why I needed that shared memory: it's needed to implement a pause() functionality where the watcher stops sending emails for a defined period of time when a threshold is reached (and some other stuff).
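
The Storable workaround looks more or less like this; the segment key and the structure of the shared state are assumptions for the example:

    use IPC::ShareLite;
    use Storable qw( freeze thaw );

    my $share = IPC::ShareLite->new(
        -key     => 1971,      # arbitrary (assumed) segment key
        -create  => 'yes',
        -destroy => 'no',
    ) or die "Unable to create shared memory segment: $!";

    # parent seeds the shared state
    $share->store( freeze( { paused_until => 0 } ) );

    # a child flips the pause flag when the threshold is reached
    my $state = thaw( $share->fetch );
    $state->{paused_until} = time + 15 * 60;   # assumed 15-minute pause
    $share->store( freeze( $state ) );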

At this point, all I had written was the wrapper code. I hadn't actually written the code that tails the logs. For tailing, it seemed simple enough to open() the file and then use a loop to get new entries. But that turned out to be a bad idea after I realised that I also need to check whether the file has been rotated, and do some other stuff, etc. So, after losing a couple of hours, I ended up using File::Tail. It really has more in it than meets the eye, and I strongly suggest using it instead of implementing your own. It takes care of pretty much everything needed to implement tail -f in Perl.
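
For reference, the basic File::Tail loop looks like this (the log path and the polling intervals are just example values):

    use File::Tail;

    my $tail = File::Tail->new(
        name        => '/var/log/some-daemon.log',  # example path
        maxinterval => 10,   # poll the file at most every 10 seconds
        interval    => 1,
    );

    while ( defined( my $line = $tail->read ) ) {
        # match the line against the configured patterns and alert if needed
        print $line;
    }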

But I didn't just use File::Tail; I decided to improve it, as leaving such a handy piece of code in abandonment (not updated for 5 years, with open bugs) seemed like a bad idea. I've forked the code from Michael Schwern's gitpan project and, since I don't like git much, I've created a new Mercurial repository on Bitbucket at bitbucket.org/burak/cpan-file-tail. I did a major refactoring, improved the unit tests, fixed Windows issues (apart from rotate detection) and implemented $/ support, as it was on the TODO list and someone had requested it. I still haven't done any benchmarks to see if there are speed issues, but feel free to fork it or test it. I've also opened an RT ticket to inform the original author. If I don't get a response or he rejects it, I'll possibly fork it on CPAN under a new name.

7 Comments

A few things:

1. Interesting! Thank you for this post.

2. Great job on File::Tail. You won't necessarily have to fork it. If the author is AWOL and you've made a solid attempt at tracking him down (but failed), you can simply ask PAUSE admins for co-ownership. I reckon it's better than forking. If you indeed fork it, I suggest writing a CPAN rating for the original module mentioning your fork. Also, include the name "File::Tail" in yours so it comes up in searches.

3. I would use LWP::UserAgent and bundle it in if I cannot be sure it will be installed there.

4. Beyond that, I would simply use POE, since it's solid ground that already implements all of this: it's async and has tail, a useragent, etc. It also uses forks in the background, not threads.

Check out POE::Component::Client::HTTP and POE::Wheel::FollowTail.

Instead of using POE, it'd be good to have an AnyEvent::Tail or something like that, so it can be used with all sorts of event loops.

1. Why SIGKILL the children and not SIGTERM them?

2. You may be interested in Proc::Fork – it’ll let you write your loop like so:

    foreach my $foo ( @logs ) {
        # pre-init stuff here
        run_fork {
            parent {
                my $pid = shift;   # Proc::Fork passes the child PID to the parent block
                push @children, $pid;
                $counter++;
            }
            child {
                # real action happens here
            }
        };
        # other stuff here
    }

(Note that you get die "Can't fork: $!\n" for free if you don’t include your own error block.)

Yes, POE can be used with all sorts of event loops too.

Fair enough. :)


About Burak Gürsoy

CPAN Author, Fumetto addict.