Asynchronous, parallel and... dead. My Perl 6 daily bread.

I love Perl 6 asynchronous features. They are so easy to use, and can give an instant boost just by changing a few lines of code, that I got addicted to them. I became an asynchronous junkie. And finally overdosed. Here is my story...

I was processing a document that was divided into chapters, sub-chapters, sub-sub-chapters and so on. Parsed into a data structure it looked like this:

    my %document = (
        '1' => {
            '1.1' => 'Lorem ipsum',
            '1.2' => {
                '1.2.1' => 'Lorem ipsum',
                '1.2.2' => 'Lorem ipsum'
            }
        },
        '2' => {
            '2.1' => {
                '2.1.1' => 'Lorem ipsum'
            }
        }
    );

Every chapter required processing of its children before it could be processed itself. Also, processing each chapter was quite time consuming - no matter which level it was on or how many children it had. So I started by writing a recursive function to do it:

    sub process (%chapters) {
        for %chapters.kv -> $number, $content {
            note "Chapter $number started";
            &?ROUTINE.($content) if $content ~~ Hash;
            sleep 1; # here the chapter itself is processed
            note "Chapter $number finished";
        }
    }
    
    process(%document);

So nothing fancy here. Maybe except the &?ROUTINE compile-time variable, which makes recursive code less error prone - there is no need to repeat the subroutine name explicitly.
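
To see &?ROUTINE in isolation, here is a minimal sketch (the factorial is just an illustration, not part of the document processing):

    # anonymous recursion: &?ROUTINE refers to the enclosing routine,
    # so the sub can call itself without knowing its own name
    my &fact = sub ($n) {
        return 1 if $n <= 1;
        return $n * &?ROUTINE.($n - 1);
    };
    say fact(5); # 120

After running the processing script I got the expected DFS (Depth First Search) flow: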

    $ time perl6 run.pl
    Chapter 1 started
    Chapter 1.1 started
    Chapter 1.1 finished
    Chapter 1.2 started
    Chapter 1.2.1 started
    Chapter 1.2.1 finished
    Chapter 1.2.2 started
    Chapter 1.2.2 finished
    Chapter 1.2 finished
    Chapter 1 finished
    Chapter 2 started
    Chapter 2.1 started
    Chapter 2.1.1 started
    Chapter 2.1.1 finished
    Chapter 2.1 finished
    Chapter 2 finished
    
    real    0m8.184s

It worked perfectly, but it was too slow. Because 1 second was required to process each chapter in a serial manner, it ran for 8 seconds total. So without hesitation I reached for Perl 6 asynchronous goodies to process chapters in parallel.

    sub process (%chapters) {
        await do for %chapters.kv -> $number, $content {
            start {
                note "Chapter $number started";
                &?ROUTINE.outer.($content) if $content ~~ Hash;
                sleep 1; # here the chapter itself is processed
                note "Chapter $number finished";
            }
        }
    }
    
    process(%document);

Now every chapter is processed asynchronously in parallel, and each one first waits for its children to also be processed asynchronously in parallel. Note that after wrapping the processing in an await/start construct, &?ROUTINE must now point to the outer scope (hence the .outer call).
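
To see the pattern in isolation, here is a minimal, self-contained sketch (the numbered tasks are made up for illustration): start schedules a block on the thread pool and immediately returns a Promise, and await blocks until all given promises are kept.

    # minimal sketch of the await/start pattern:
    # three blocks run in parallel, await gathers their promises
    await do for 1..3 -> $i {
        start {
            note "task $i started";
            sleep 1; # simulated work
            note "task $i finished";
        }
    }

Running the rewritten document processing gives: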

    $ time perl6 run.pl
    Chapter 1 started
    Chapter 2 started
    Chapter 1.1 started
    Chapter 1.2 started
    Chapter 2.1 started
    Chapter 1.2.1 started
    Chapter 2.1.1 started
    Chapter 1.2.2 started
    Chapter 1.1 finished
    Chapter 1.2.1 finished
    Chapter 1.2.2 finished
    Chapter 2.1.1 finished
    Chapter 2.1 finished
    Chapter 1.2 finished
    Chapter 1 finished
    Chapter 2 finished
    
    real    0m3.171s

Perfect. Time dropped to the expected 3 seconds - it was not possible to go any faster because the document had 3 nesting levels and each level required 1 second to process. Still smiling, I threw a bigger document at my beautiful script - 10 chapters, each with 10 sub-chapters, each with 10 sub-sub-chapters. It started processing, ran for a while... and DEADLOCKED.

Friedrich Nietzsche said that "when you gaze long into an abyss the abyss also gazes into you". The same rule applies to code. After a few minutes my code and I were staring at each other. And I couldn't find why it worked perfectly for small documents but was deadlocking at random moments for big ones. Half an hour later I blinked and was defeated by my own code in a staring contest. So it was time for debugging.

I noticed that whenever it deadlocked there were always exactly 16 chapters still in progress. And that number looked familiar to me - the thread pool!

    $ perl6 -e 'say start { }'
    Promise.new(
        scheduler => ThreadPoolScheduler.new(
            initial_threads => 0,
            max_threads => 16,
            uncaught_handler => Callable
        ),
        status => PromiseStatus::Kept
    )

Every asynchronous task that is planned needs a free thread so it can be executed. And on my system only 16 concurrent threads are allowed, as shown above. To analyze what happened, let's use the document from the first example but assume the thread pool is limited to 4:

    $ perl6 run.pl          # 4 threads available by default
    Chapter 1 started       # 3 threads available
    Chapter 1.1 started     # 2 threads available
    Chapter 2 started       # 1 thread available
    Chapter 1.1 finished    # 2 threads available again
    Chapter 1.2 started     # 1 thread available
    Chapter 1.2.1 started   # 0 threads available
                            # deadlock!

At this moment the subtree of chapter 1 holds three threads and waits for one more so that chapter 1.2.2 can complete everything and the recursion can start ascending. And the subtree of chapter 2 holds one thread and waits for one more so that chapter 2.1 can descend into the recursion. As a result, processing reaches a point where at least one more thread is required to proceed, but all threads are taken and none can be returned to the thread pool. The script deadlocks and hangs here forever.
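
A tempting quick fix is to simply raise the limit. On Rakudo (an assumption about the compiler being used) the pool size can be raised with the RAKUDO_MAX_THREADS environment variable, but that only postpones the problem - a deep enough document will exhaust any fixed pool:

    $ RAKUDO_MAX_THREADS=64 perl6 run.pl    # bigger pool, same deadlock risk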

How to solve this problem and maintain parallel processing? There are many ways to do it :) The key to the solution is to process asynchronously only those chapters that do not have unprocessed chapters on a lower level.

Luckily, Perl 6 offers a perfect tool - promise junctions. It is possible to create a promise that waits for other promises to be kept, and until that happens it is not sent to the thread pool for execution. The following code illustrates that:

    my $p = Promise.allof( Promise.in(2), Promise.in(3) );
    sleep 1;
    say "Promise after 1 second: " ~ $p.perl;
    sleep 3;
    say "Promise after 4 seconds: " ~ $p.perl;

Prints:

    Promise after 1 second: Promise.new(
        ..., status => PromiseStatus::Planned
    )
    Promise after 4 seconds: Promise.new(
        ..., status => PromiseStatus::Kept
    )
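
The rewrite below also relies on the .then method, which schedules a block of code to run once a promise completes and returns a new promise for that block - a minimal sketch:

    my $first  = Promise.in(1);                                # kept after 1 second
    my $second = $first.then( { say "first one completed" } );
    await $second;   # prints after about 1 second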

Let's rewrite processing using this cool property:

    sub process (%chapters) {
        return Promise.allof(
            do for %chapters.kv -> $number, $content {
                my $current = {
                    note "Chapter $number started";
                    sleep 1; # here the chapter itself is processed
                    note "Chapter $number finished";
                };
                
                if $content ~~ Hash {
                    Promise.allof( &?ROUTINE.($content) )
                        .then( $current );
                }
                else {
                    Promise.start( $current );
                }
            }
        );
    }
    
    await process(%document);

It solves the problem of a chapter competing with its sub-chapters for free threads while at the same time needing those sub-chapters processed before it could process itself. Now awaiting for sub-chapters to complete does not require a free thread. Let's run it:

    $ perl6 run.pl
    Chapter 1.1 started
    Chapter 1.2.1 started
    Chapter 1.2.2 started
    Chapter 2.1.1 started
    -
    Chapter 1.1 finished
    Chapter 1.2.1 finished
    Chapter 1.2.2 finished
    Chapter 1.2 started
    Chapter 2.1.1 finished
    Chapter 2.1 started
    -
    Chapter 1.2 finished
    Chapter 1 started
    Chapter 2.1 finished
    Chapter 2 started
    -
    Chapter 1 finished
    Chapter 2 finished
    
    real    0m3.454s

I've added a separator for each second passed so it is easier to follow. When the script starts, chapters 1.1, 1.2.1, 1.2.2 and 2.1.1 do not have sub-chapters at all, so they can take threads from the thread pool immediately. When they complete after one second, the promises that were awaiting all of them are kept, and chapters 1.2 and 2.1 can be processed safely on the thread pool. It keeps going like this until the recursion unwinds.

After trying the big document again, it was processed flawlessly in 72 seconds instead of the linear 1000.

I'm high on asynchronous processing again!

You can download the script here and try different data sizes and algorithms for yourself (params are taken from the command line).

