How I didn't fix AnyEvent::ForkManager
For April, I was assigned AnyEvent::ForkManager, which claims to provide an interface similar to Parallel::ForkManager, but compatible with AnyEvent. The module had some CPAN testers’ failures as well as an issue reported on GitHub, so I tried to fix it. I wasn’t quite successful, though.
The issue reported that the module’s tests hung on MSWin. At work, I use Cygwin, so I tried to install the module there to see how the Linux/MSWin hybrid would do. I was able to install all prerequisites, including the main one, AnyEvent, “a framework to do event-based programming”. Nevertheless, the tests for the module itself got stuck, albeit in a different place than reported in the issue.
I sprinkled the code with debugging messages (Basic debugging checklist #2) to discover the following line doesn’t return:
  isnt $$, $pm->manager_pid, 'called by child';
At first, I thought that manager_pid was the problem, so I extracted the call from the statement:
  my $mpid = $pm->manager_pid;
  isnt $$, $mpid, 'called by child';
Surprisingly, $mpid was populated correctly; it was the isnt that didn’t return. That seemed very suspicious: isnt is used in test suites all over CPAN, so it shouldn’t cause problems! Or maybe the isnt wasn’t the isnt I thought it was. I checked the dependencies and discovered Test::SharedFork, which defines its own testing subroutines. Adding some debugging output to it revealed the real problem in the constructor of a Test::SharedFork::Store::Locker object:
  flock $store->{fh}, LOCK_EX or die $!;
The flock was waiting for the exclusive lock indefinitely. Out of curiosity, I inserted the following before the problematic line:
  use Data::Dumper;
  $Data::Dumper::Deparse = 1;
  warn Dumper($store);
Strangely, not only was I able to explore the structure, but all the tests passed. “Race condition!” I thought, and tried replacing the lines with Time::HiRes::usleep(200). The tests were still passing, but when I lowered the value, they started to hang again.
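A blocking flock gives no hint about why it is stuck. One way to check whether the lock is really held elsewhere is a non-blocking attempt with LOCK_NB. The following is only a minimal sketch: try_lock is a hypothetical helper and the temporary file stands in for the store file; none of it comes from Test::SharedFork.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical helper: try to take an exclusive lock without blocking.
# Returns 1 on success, 0 if another handle currently holds the lock.
sub try_lock {
    my ($fh) = @_;
    return 1 if flock $fh, LOCK_EX | LOCK_NB;
    return 0 if $!{EWOULDBLOCK} || $!{EAGAIN};
    die "flock failed: $!";
}

# A temporary file stands in for the shared store file (assumption).
my ($first, $path) = tempfile(UNLINK => 1);
open my $second, '+<', $path or die "Cannot reopen $path: $!";

my $got_first  = try_lock($first);    # first handle takes the lock
my $got_second = try_lock($second);   # second handle tries while it is held
flock $first, LOCK_UN or die "Cannot unlock: $!";
my $got_after  = try_lock($second);   # second handle tries again after release

print "first: $got_first, second: $got_second, after unlock: $got_after\n";
```

On Linux, where Perl uses the native flock(2), the second attempt should be refused while the first handle holds the lock, and succeed once it is released. Swapping LOCK_NB into a hanging test turns a silent hang into an immediate, reportable failure.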
Race conditions appear only sometimes, so I tried running the test suite 50 times on my Linux desktop. It failed 7 times with the following detail:
Interrupted system call at /home/choroba/perl5/lib/perl5/Test/SharedFork/Store.pm line 104.
Can't use an undefined value as a HASH reference at /home/choroba/perl5/lib/perl5/Test/SharedFork/Store.pm line 51.
END failed--call queue aborted at xt/nonblocking.t line 104.
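Repeated runs like these are easy to script. Here is a minimal sketch; count_failures is a hypothetical helper, and the prove command in the comment is an assumption about how the test might be invoked, not the exact command used for the numbers reported here.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command $runs times and return
# how many of the runs exited with a non-zero status.
sub count_failures {
    my ($runs, @command) = @_;
    my $failures = 0;
    for (1 .. $runs) {
        system(@command) == 0 or $failures++;
    }
    return $failures;
}

# Assumed invocation, from the distribution directory:
#   my $failed = count_failures(50, 'prove', '-l', 'xt/nonblocking.t');

# Self-contained demonstration with commands that always fail or always pass.
my $always_fail = count_failures(3, $^X, '-e', 'exit 1');
my $always_pass = count_failures(3, $^X, '-e', 'exit 0');
print "always failing: $always_fail/3, always passing: $always_pass/3\n";
```

Note that a run which hangs blocks the whole loop, so a script like this only counts runs that actually finish.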
On my laptop, the failures were less frequent (about 2/50), and sometimes, the message was different:
Interrupted system call at /home/choroba/perl5/lib/perl5/Test/SharedFork/Store.pm line 104.
Magic number checking on storable file failed at /usr/lib/perl5/5.18.1/x86_64-linux-thread-multi/Storable.pm line 398, at /home/choroba/perl5/lib/perl5/Test/SharedFork/Store.pm line 51.
END failed--call queue aborted at xt/nonblocking.t line 104.
Line 104 in Store.pm is the flock line shown above.
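“Interrupted system call” is the message for EINTR: the flock call returned early because a signal arrived while it was waiting. A common way to cope with that is to retry the call. The sketch below assumes the interruption is harmless and the lock should simply be requested again; flock_retry is a hypothetical helper, not code from Test::SharedFork, and it isn’t a verified fix for the failures above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical helper: like flock, but retries when the call
# is interrupted by a signal (EINTR) instead of giving up.
sub flock_retry {
    my ($fh, $mode) = @_;
    while (1) {
        return 1 if flock $fh, $mode;
        next if $!{EINTR};           # interrupted by a signal: try again
        die "flock failed: $!";      # any other error is fatal
    }
}

# Demonstration on a temporary file (assumption: stands in for the store file).
my ($fh, $path) = tempfile(UNLINK => 1);
my $locked   = flock_retry($fh, LOCK_EX);
my $unlocked = flock_retry($fh, LOCK_UN);
print "locked: $locked, unlocked: $unlocked\n";
```

Whether retrying is the right behaviour depends on why the signal was delivered, which is exactly the part that still needs debugging.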
Unfortunately, I didn’t have enough time to debug this further. It was already the end of April, so I had to ask to “Stick” with the assignment. To get rid of it and get my May assignment, I just fixed some typos in the documentation, removed use utf8 where it wasn’t needed, and replaced select undef, undef, undef with Time::HiRes::usleep, especially because Time::HiRes was already used.
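For readers who haven’t met the idiom: the four-argument select with three undefs and a timeout in seconds is an old way to sleep for a fraction of a second, while usleep takes microseconds. A minimal sketch of the equivalence; the 0.2 second pause is an illustrative value, not the one used in the module.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep time);

my $start = time;

# Old idiom: four-argument select with no filehandles,
# only a timeout given in (fractional) seconds.
select undef, undef, undef, 0.2;
my $select_elapsed = time - $start;

# Replacement: the same pause expressed in microseconds.
my $before_usleep = time;
usleep(200_000);
my $usleep_elapsed = time - $before_usleep;

printf "select paused %.2f s, usleep paused %.2f s\n",
    $select_elapsed, $usleep_elapsed;
```

The usleep call states its intent directly, which is the main argument for the change when Time::HiRes is loaded anyway.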
                        
 I blog about Perl.