Spewing Sentences

Imagine the following scenario. Something has gone wrong. It's Monday (of course it is), you're doing your best to find the underlying cause, yet a strange force is actively interfering -- you are required to provide a status update every 15 minutes.

Trouble is, you can't give much of an update. The best you can do is write something like

We're still trying to find the source of the problems, and a possible workaround.

A while later, after several variations of the above phrase have been posted, you transition to another state. Now, you can write

We have found the underlying cause, and we're working on fixing it.

Again, it's not much of an update, but it's something. Both phrases are general enough that some problems later, you start to wonder: "Could I generate those updates automatically?"

Of course you could.

The main issue here is providing enough variety for the messages you post. You could start by writing down several phrases that state your basic message -- that you don't know anything yet, but at least you're doing something.

$variations{'dont_know'} = [
    'We are still trying to find the cause of the problems.',
    'A cause has not yet been determined.',
    'The cause of the problems is still unknown.',
];

Now, all you have to do is pick a phrase at random:

my @list = @{$variations{$theme}};
my $message = $list[int(rand(@list))];
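
Put together as a runnable script, the two snippets above might look like the sketch below. The %variations declaration and the random_message name are assumptions added for illustration, not part of the original snippets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Phrases grouped by theme; 'dont_know' is the theme used in the text.
my %variations = (
    dont_know => [
        'We are still trying to find the cause of the problems.',
        'A cause has not yet been determined.',
        'The cause of the problems is still unknown.',
    ],
);

# Hypothetical helper: return one random phrase for the given theme.
sub random_message {
    my ($theme) = @_;
    my $list = $variations{$theme}
        or die "Unknown theme: $theme\n";
    return $list->[ int( rand(@$list) ) ];
}

print random_message('dont_know'), "\n";
```

Keeping the phrases in a hash keyed by theme means a second state (say, 'found_cause') only needs another list, not more code.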

Picking sentences out of a bag is not exactly generating them. We will need a grammar, and fortunately, adding a simple one will not make the process much more complicated. Let's assume we are dealing with a fixed order of sentence parts. Then, all that we would need is:

  • a way to pick part variations at random

    sub a_random {
        my @list = @_;
        return $list[int(rand(@list))];
    }
    
  • some variations for each part

    my @subject = (
        'We',
        'Our administrators',
    );
    my @action = (
        'are trying to find',
        'are trying to determine',
        'are seeking',
    );
    my @object = (
        'a cause of the problems',
        'a possible solution',
        'a way to restore system operating status',
    );
    
  • and finally, our grammar -- a defined order of sentence parts

    my $sentence = join " ", (
        a_random(@subject),
        a_random(@action), 
        a_random(@object),
    );
    

By joining three random part variations, we can improve the variety of our sentences a bit, but this is still far from a good solution to the problem. Fortunately, nothing forces us to stick with a defined order of sentence parts. Our grammars can be more complex than that.
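
To see how limited the variety still is, one can simply count: 2 subjects, 3 actions and 3 objects give 2 x 3 x 3 = 18 possible sentences. The sketch below enumerates them; all_sentences is a hypothetical helper, and the part lists are copied from above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @subject = ('We', 'Our administrators');
my @action  = ('are trying to find', 'are trying to determine', 'are seeking');
my @object  = (
    'a cause of the problems',
    'a possible solution',
    'a way to restore system operating status',
);

# Hypothetical helper: build every sentence the fixed-order grammar allows.
sub all_sentences {
    my @sentences;
    for my $subj (@subject) {
        for my $act (@action) {
            for my $obj (@object) {
                push @sentences, join ' ', $subj, $act, $obj;
            }
        }
    }
    return @sentences;
}

my @all = all_sentences();
printf "%d possible sentences\n", scalar @all;   # prints "18 possible sentences"
```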

Consider this:

[START]
|AreHappy| |UWrote|

[AreHappy]
We are |Happy|

[UWrote]
that you |Wrote|.

[Happy]
happy
glad

[Wrote]
wrote
contacted us
sent us an email

This simple grammar is organised into sections. Every section contains variations with |tokens| that can be expanded by looking at another section matching the |name|. Beginning at the [START] section, you could generate multiple sentences like these:

We are happy that you contacted us.
We are glad that you wrote.
We are glad that you contacted us.

Assuming that we can pick a variation from a section at random with pick_var_from('Section'), we could write expand_tokens_in(\$variation), which replaces all |token| occurrences with a random variation from the appropriate section. We could also write is_expandable($variation), which returns true if there are any |tokens| within a given variation. With those few subs in place, our sentence generator would be as simple as:

my $var = pick_var_from("START");
expand_tokens_in(\$var) while (is_expandable($var));
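
For illustration, here is one way those three subs could be written. It is a minimal sketch, not a definitive implementation: it assumes the grammar has already been loaded into a hash (%grammar is a made-up name) instead of being parsed from the [Section] text format, and that token names consist of word characters only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The example grammar, stored as section name => list of variations.
# (Assumption: parsing the [Section] text format is left out.)
my %grammar = (
    START    => ['|AreHappy| |UWrote|'],
    AreHappy => ['We are |Happy|'],
    UWrote   => ['that you |Wrote|.'],
    Happy    => ['happy', 'glad'],
    Wrote    => ['wrote', 'contacted us', 'sent us an email'],
);

# Pick one variation from the named section at random.
sub pick_var_from {
    my ($section) = @_;
    my $vars = $grammar{$section}
        or die "Unknown section: $section\n";
    return $vars->[ int( rand(@$vars) ) ];
}

# True if the text still contains at least one |token|.
sub is_expandable {
    my ($text) = @_;
    return $text =~ /\|\w+\|/ ? 1 : 0;
}

# Replace every |token| in the referenced string, in place.
# Tokens introduced by a replacement are handled on the next pass.
sub expand_tokens_in {
    my ($ref) = @_;
    $$ref =~ s/\|(\w+)\|/pick_var_from($1)/ge;
    return;
}

my $var = pick_var_from('START');
expand_tokens_in(\$var) while is_expandable($var);
print "$var\n";
```

Note that a grammar with a section that refers back to itself could keep the while loop running for a long time; this sketch does nothing to guard against that.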

We could put all those subroutines in place, but there is already a better, pre-invented wheel on CPAN: Inline::Spew.

#!/usr/bin/perl
use Modern::Perl;

use Inline Spew => <<'GRAMMAR';
START: "the " noun verb
noun: "dog " | "cat " | "rat "
verb: "eats" | "sleeps"
GRAMMAR

say spew();
# the cat sleeps

As the example shows, the grammar syntax is a bit different, but everything works as expected. Finally, we can start writing the status update generator. The message stays the same:

We don't know anything yet, but we're doing something.

Here is the grammar in Inline::Spew syntax.

StatusUnknown:
Situation "is being analysed"
| Situation "is being investigated"
| "We are " Investigating Situation
| "Follow-up in progress"
| Investigation "is ongoing"
| "There is an ongoing " Investigation
| Investigation "is in progress"
| Investigation "in progress"

DoingSomething:
We "are in the process of applying the necessary means, so that " AllIsOK
| We "are working on " Solution
| We "are trying to determine the cause "
| We "are gathering the necessary input and actively working on " Solution
| We "are trying to determine the " Solution
| Solution "is still being determined"
| We "do not have " Solution "yet"
| We "are still looking for " Solution

Solution:
"a solution " | "a fix " | "a possible solution "
| "a possible fix " | "a way to fix this " | "a way to solve this "

AllIsOK:
"all functionality is restored"
| "everything is fully operational"
| "the system is operating as usual"
| "the affected system is restored to normal operating status"
| "system is fully available again"
| "all is OK" | "everything works fine" | "all works again"

We:
"we " | "involved teams " | "involved parties " | "responsible teams "
| "responsible parties " | "persons involved " | "persons on this case "
| "the teams on the case " | "teams on the case "

The grammar is the main ingredient. But what about the actual code? It's extremely short.

#!/usr/bin/perl
use Modern::Perl;

use Inline Spew => <<'GRAMMAR';
( ... ) # here comes the grammar described before
GRAMMAR

sub sentence {
    my $start = shift;
    my $sentence = spew($start);
    $sentence =~ s/\s*$/./;
    return ucfirst $sentence;
}

say sentence('StatusUnknown');
say sentence('DoingSomething');

The generated statements provide enough variety to be useful.

Investigation is in progress.
Responsible parties are still looking for a solution.
---------------------------
There is an ongoing investigation.
A way to solve this is still being determined.
---------------------------
Investigation in progress.
Responsible parties are working on a possible solution.
---------------------------
Follow-up in progress.
Involved parties are still looking for a fix.
---------------------------
There is an ongoing analysis.
We are working on a fix.
---------------------------
Analysis in progress.
A fix is still being determined.

Every 15 minutes, one of those updates could be posted automatically, allowing you to focus on the real issue -- the unknown problem that you are supposed to fix. Unless, of course, it already fixed itself.
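
The posting itself could be driven by a small loop. The sketch below is hypothetical throughout: post_updates and its named arguments are made up for illustration, the generator is a stand-in for something like sub { sentence('StatusUnknown') }, and the "post" step just prints with a timestamp instead of talking to a real status page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical driver: ask $arg{generate} for a fresh message, hand it to
# $arg{post}, then wait $arg{interval} seconds; repeat $arg{count} times.
sub post_updates {
    my (%arg) = @_;
    for my $n (1 .. $arg{count}) {
        $arg{post}->( $arg{generate}->() );
        sleep $arg{interval} if $n < $arg{count};   # no wait after the last one
    }
    return;
}

# Example wiring with stand-ins for the generator and the posting step.
post_updates(
    generate => sub { 'Investigation is in progress.' },
    post     => sub { print scalar(localtime), " $_[0]\n" },
    count    => 2,
    interval => 0,    # use 15 * 60 for a real run
);
```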


This post is largely based on a talk I've given at the German Perl Workshop 15.0, and later, with some changes, at the Polish Perl Workshop 2013.

This is a good opportunity to say thank you to all the people involved for two really great conferences. Thank you!

3 Comments

BTW, I grew up with the "$picked = $ary[rand @ary];" idiom, and see "$ary[int(rand(@ary))]" as less idiomatic.

But I've seen people write array_rand() functions on more than one occasion, so at least writing "$ary[int(rand(@ary))]" is relatively better :)

About blindluke

Unix admin by profession, Perl developer by choice. Also, a very mediocre chess player.