Perl 6 Archives

IO::Glob for Perl 6

This is a module I wrote a while back but never announced. I am pretty happy with how it came out, so here's the announcement. Some JAPHs might be disappointed to learn that one feature of Perl 5 that did not make it to Perl 6 is globbing. That is, doing something like this:

for my $file (glob "src/core/*.pm") { say $file }

With just Perl 6, you need to do something like this instead:

for "src/core".IO.dir(:test(/ .* ".pm" $/)) -> $file { say ~$file }

That's not too terrible, but I still miss the simplicity of globs. In that case, I can use IO::Glob:

use IO::Glob;
for glob("src/core/*.pm") -> $file { say ~$file }

That does the same thing as the Perl 5 code, more or less.

But, that's not all. I always wished that globs could be used for pattern matching. Sometimes, just matching a string against a glob is handy, but Perl 5's globs are narrow minded. IO::Glob is not:

use IO::Glob;
for <abc acc acdc>.grep(glob('ac*')) { .say }

You can apply globs to anything. However, if you want directory/file matching semantics, you really want to work with IO::Path objects:

use IO::Glob;
for <abc acc acdc>.map({ .IO }).grep(glob('ac*')) { .say }

In this particular case, it does not make any difference. Be aware, though, that a path-separating slash is treated as just another character when matching strings, but as a significant separator when matching paths.

Finally, this is nice, but there is more to globbing than just '*', right? By default, IO::Glob parses globs using a BSD-style grammar. Therefore, without doing anything special, you can use *, ?, [abc], [!abc], ~, {ab,cd,efg} and they do what you expect.

Sometimes, though, that's just overkill. You can request a simple grammar that just supports * and ?:

use IO::Glob;
for glob("src/core/*.pm", :grammar(IO::Glob::Simple)) -> $file { say ~$file }

Or you can use SQL-style globbing, if you prefer:

use IO::Glob;
for glob("src/core/%.pm", :grammar(IO::Glob::SQL)) -> $file { say ~$file }

From here you could even write your own grammar, but I am not going to go into those details here.

If you are trying out Perl 6 and find that you want to glob some files, IO::Glob is here to help.

Cheers.

Async Aborts and P6SGI

So, it's been a couple of months or so since I last posted about this. Since then, I gave a talk about it at the Pittsburgh Perl Workshop. After that, I took a Perl 6 hiatus because life got busy and I was a little burned out. In the past few weeks, I've done a little bit of work: cleaning up some things, making changes that had been slow cooking in my brain during the hiatus, etc. However, I'm putting P6SGI on another hiatus, but this time it has nothing to do with me and everything to do with the state of Perl 6.

Why? In order to make any further progress on P6SGI, I really need an implementation to prove that what I'm proposing can actually be done the way I'm proposing it. I'm pretty sure it can, but it surely needs a few tweaks that will only be found with an implementation. There are also some blue sky ideas I want to play with to see if they're feasible, but can't without an implementation because they are so far out in space.

To that end, I have built a small library that will be named HTTP::Supply (though it was provisionally named HTTP1::StreamParser until I could come up with something better). This HTTP parser is the core of what must happen to make an implementation of P6SGI: a library that takes bytes from a socket connection and turns them into an asynchronous Supply of HTTP requests.

This new hiatus basically comes down to this output I get when testing the library:

% perl6 -Ilib t/http-1.0.t
1..40
ok 1 - environment looks good
ok 2 - input found in environment
ok 3 - message body looks good
ok 4 - no more requests expected
ok 5 - environment looks good
ok 6 - input found in environment
ok 7 - message body looks good
ok 8 - no more requests expected
ok 9 - environment looks good
ok 10 - input found in environment
ok 11 - message body looks good
ok 12 - no more requests expected
ok 13 - environment looks good
ok 14 - input found in environment
ok 15 - message body looks good
ok 16 - no more requests expected
ok 17 - environment looks good
ok 18 - input found in environment
ok 19 - message body looks good
ok 20 - no more requests expected
ok 21 - environment looks good
ok 22 - input found in environment
ok 23 - message body looks good
ok 24 - no more requests expected
ok 25 - environment looks good
ok 26 - input found in environment
ok 27 - message body looks good
ok 28 - no more requests expected
ok 29 - environment looks good
ok 30 - input found in environment
ok 31 - message body looks good
ok 32 - no more requests expected
ok 33 - environment looks good
ok 34 - input found in environment
zsh: abort      perl6 -Ilib t/http-1.0.t

That abort there is the issue. My code could most definitely be at fault, but I am of the opinion that it should not be an abort message, but a Perl 6 failure. This is Perl, not C. I'm not really an expert in tracking down my errors when the VM is aborting without giving me any clues at all.

I have been able to track down where this is happening inside of MoarVM and libuv. My C fu is that good, but my libuv fu and pthreads fu are pretty weak. All I know is that it appears to be some sort of action-at-a-distance issue occurring during thread cleanup within MoarVM. At least that's what it looks like to me. I'm a module guy, though, not a VM hacker. I need help on this one.

If you're interested and have the skills required, or are willing to acquire them, to help me past this hiatus, I'll happily buy you coffee or a beer. Help me and then tell me where to send the gift card. In the meantime, I'm moving on to other Perl 6 problems I've been thinking of tackling.

Cheers.

P6SGI: More of a Journey than a Destination

When I started working on P6SGI, I thought, "Hey, I'll just update PSGI to use Perl 6, take advantage of some async data structures, and be done." That is not how this process has gone down. First, I learned that I needed to know more about Perl 6. Then, I found that I need to know more about HTTP/1.1 and more about PSGI. Most recently, I have been researching HTTP/2, Mojolicious, WebSockets, Akka, and a whole pile of other things.

So, here's the progress report on things that have changed in the last week or so on our way toward a complete P6SGI standard, which is still a ways off.

Standardizing the application life cycle. I count this as the most significant change this week. An application server is now expected to call the application as soon as the request headers have been read. That is, it should not wait until the entire request body has been read from the incoming socket. This makes it possible for applications to respond to Expect: 100-continue headers. Furthermore, the application server should send the response headers back at the earliest opportunity. Therefore, it is not mandated, but strongly encouraged that application servers always run concurrently with their applications.

Using Supply for all the things. The p6sgi.input and p6sgi.errors streams have been replaced with Supply objects. The input stream comes from the server as a series of Blob objects while the application and middleware write Str objects to the error stream.

Also the output stream has been expanded to allow Lists of Pairs so that trailing headers or multiple-headers (as may occur handling 1xx responses) may be sent. In PSGI, this was handled just by writing out the headers as part of the content directly. This is a problem, however, if the application server wants to implement HTTP/2 or WebSockets as they have special framing requirements. This way the application needs to know a bit less about the protocol and can focus on the content more.
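As a rough sketch of how a server might consume such an output stream, the following separates content chunks from a trailing List of Pairs. The header name X-Example-Trailer and the variable names are made up for illustration, and the dispatch shown is only one way a server could do it, not something the spec mandates.

```perl6
# Sketch only: a body Supply that mixes content with a trailing header list.
# 'X-Example-Trailer' is a hypothetical header used for illustration.
my $body = Supply.from-list(
    "part 1\n",
    "part 2\n",
    [ 'X-Example-Trailer' => 'done' ],
);

my @content;    # Str or Blob chunks to write as the message body
my @trailers;   # Pairs to frame as headers, however the protocol requires

react {
    whenever $body -> $item {
        given $item {
            # Check Str and Blob first, because Blob is also Positional.
            when Str | Blob { @content.push($item) }
            when Positional { @trailers.append($item.list) }
        }
    }
}

say @content.elems;
say @trailers[0].key;
```

Because the headers arrive as data rather than as pre-formatted text, the server stays free to frame them for HTTP/1.1, HTTP/2, or whatever protocol it speaks.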

Another consequence of input streaming is that buffering is now gone from the spec. It will come back as an extension of some kind to handle application servers that are inherently buffered (like CGI) or those that provide it at the server for convenience. Details TBD.

Additional promises have been added. I have specified new Promises to be kept by the server. The p6sgi.ready Promise is kept when the server has tapped and is ready to receive the output, just in case the application wants to deliver a live Supply for some reason. The p6sgix.header.done Promise is kept when the server has sent the headers and the p6sgix.body.done Promise is kept when the server has finished sending the body.
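To illustrate why a ready Promise matters for a live Supply, here is a minimal sketch using only core types. The environment is faked with a plain hash, and the split between "application side" and "server side" is my own simplification rather than how any real server is structured. Values emitted on a live Supply before anyone has tapped it are simply lost, so the application waits for the Promise first.

```perl6
# Sketch only: a fake environment holding the ready Promise.
my $ready = Promise.new;
my %env   = 'p6sgi.ready' => $ready;

# Application side: a live Supply that must not emit until the server has tapped.
my $supplier = Supplier.new;
my $app-done = start {
    await %env<p6sgi.ready>;
    $supplier.emit("chunk $_\n") for 1..3;
    $supplier.done;
};

# Server side: tap the output first, then keep the Promise.
my @sent;
$supplier.Supply.tap(-> $chunk { @sent.push($chunk) });
$ready.keep;

await $app-done;
say @sent.elems;
```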

Extensions have been ported over from PSGI. I have ported the extensions of PSGI over without much modification or thought. These will likely change. The p6sgix.io extension, in particular, is controversial, problematic, and might not even be necessary. (Partly discussed below.)

Backpressure extensions have been added. I have added a set of "backpressure" extensions (also providing a Supply) that allow the application to monitor when the output socket is blocking. This allows an application that is sending a long stream to a slow connection to pause processing while waiting for packets to clear the intertubes between peers. (Though, implementation of these will wait until non-blocking I/O is implemented.)

Protocol header extension. I am actively thinking through the first draft of this issue as I finish this post. One problem with PSGI is that it has a strong assumption that your application only cares about some flavor of HTTP/1.1. This is hardly likely in the modern web and a weakness of PSGI. PSGI provides a basic extension, psgix.io, that aims to allow the application to implement more advanced subprotocols. This has at least two major problems:

  1. The details of psgix.io are implementation-specific. This means that an application using it is tied to a specific implementation or has to provide multiple implementations.

  2. Why would you want protocol implementation details in the application? The reason you use an application server is so that the server takes care of those details on your behalf. The application should not be responsible for implementing something like WebSocket or HTTP/2 or Comet or whatever.

To address this issue, I am working on a mechanism that would allow the application server to specify additional protocols supported and then allow the application to request that the server negotiate the handshake for these. So far, this works for protocols like HTTP/2 or WebSocket, where the application can detect a client request to upgrade a connection (by the presence of an Upgrade header). In those cases, the application could check an environment variable to see which protocols the server supports. If the desired protocol is supported, the application may tell the server to negotiate the handshake for that protocol and take any other actions required by including a special header, possibly P6SGIx-Upgrade, which specifies the desired upgrade.

It is my hope that we can encourage servers to implement specialized protocols by providing upgrades like this or other forms of server-application communication rather than relying on a kludge like psgix.io.

In conclusion, it is my hope that more details of protocol handling can be kept to the server and the applications can focus more directly on the details of message content. It is my hope as well that through better asynchronous communication and concurrent operation of application and application server, we can resolve many of the weaknesses of PSGI and gain an overall more flexible and useful standard for the modern web.

P.S. I will be at the Pittsburgh Perl Workshop in a couple weeks and giving a talk on the P6SGI process and where we're headed with it.

P6SGI: 4 Myths Dispelled

So my last couple of posts have brought up some grousing from people who do not like PSGI in Perl 5. I am sorry to say that some of this grousing against P6SGI has made some assumptions based on PSGI and not based on P6SGI. I am now going to try to dispel some of these myths.

  • P6SGI is a direct port of PSGI to Perl 6. If you have started with that assumption, you are wrong. P6SGI is fundamentally different from PSGI in significant ways, mostly in that it is completely asynchronous from the start: literally, your application must be wrapped in a start-block, which joins a thread.

  • P6SGI will not do X. This is an early spec. Anything can change right now. Nothing is bolted down for at least 3 to 6 months and nothing is final until 12 to 18 months from now. There is time to make it right. Come contribute. If you want a commit bit to make your changes, I'm handing them out if you ask nicely.

  • There is no need for P6SGI/the approach is dead. I beg to differ. I want a P6SGI implementation for myself. Therefore, even if I end up being the only one, there is at least one user that wants it. I am hoping that by engaging the community, I can find other users interested and other use-cases to support.

  • P6SGI is just re-hashing 15 year old technology. I would rather say that it is improving upon 15 year old technology. P6SGI is asynchronous while most of its predecessors are not. It will provide tools for streaming input in and out asynchronously without requiring raw access to the socket connection (not totally spec'd, but it will be soon). This is not your father's web gateway.

Seriously folks. We have history with PSGI and WSGI and whatever, but we're not stopping there. Let's get it right this time so that Perl 6 has the implementation everyone is copying for the next 15 years.

P6SGI: Smack the Reference Implementation

The P6SGI standard is progressing reasonably well now. There are a number of issues yet to be worked out, but it is a reasonably good start. However, before we can really be sure of that, we need an implementation that puts the standard to use and helps us find the warts as well as provides a way to get started working with it.

Introducing: Smack.

Perl has Plack. Perl 6 has Smack. So far, the basics of its own standalone server are built and working and CGI is started. Some of the built-in apps and some standard middleware have been drafted, but there is a large amount of work to be done.

I have a regular job, 3 boys, and limited spare time. If you have any interest in helping work on the next generation of Perl application servers, your help would be most welcome.

Cheers.

P6SGI: Revising and Reviewing

After my previous post I received quite a bit of really good, constructive feedback. Thank you all who responded to my request for comments! I would say the gist of the feedback was this:

  • Make the interface simpler for middleware and to a lesser extent, servers.
  • Consider always requiring the Promise interface.
  • Consider using Supply instead of Channel as it is non-blocking and likely to perform better.

After wrangling and playing around with various things and learning more about Perl 6 than I knew before, I have come up with what I think will be a stable interface that satisfies all of these suggestions.

First, since Promises are easy to do, I agree, let's just always require them. This means that the basic Hello World app now looks like this:

sub app(%env) {
    start {
        200, [ Content-Type => 'text/plain' ], [ 'Hello World' ]
    };
}

After much research and playing around with the Supply type, I have decided that it is generally superior to Channel. Therefore, we will use Supply to generate the message body, not a Channel. The main problem with Channel is that it will block the server on receive. You can get around this, but it is either a headache or requires a dedicated thread. I originally chose Channel over Supply, though, because Supply is flexible enough to cause trouble if the application developer is sloppy. The benefits, however, far outweigh the risks. Besides, since when has Perl ever shied away from letting sloppy developers shoot themselves in the foot?

Finally, I have unified the interface so that middleware and servers can process the application output without much effort, but in a way that allows the application flexibility to respond in just about any way it pleases. This is possible because of the way Perl 6 coercions work. Let's examine the definition of an application and then explore a couple examples.

  • The application must return a Promise or it must return an object that coerces into a Promise. In Perl 6, an object coerces to another type if it provides a method with the name of the coercion to perform. Therefore, an application could return a Supply as it can be turned into a Promise automatically.

  • The Promise must be kept with a Capture that contains three positional arguments, or it must provide something that coerces into such a Capture, such as a 3-element List or Array.

  • The first argument of the Capture is the status code, which must be an Int or (you guessed it) something that coerces into an Int. That means it could be an Enum or some other kind of object representing each of the possible HTTP status codes.

  • The second argument of the Capture is the list of headers. As before, this is a list of Pairs mapping header names to header values. Again, it could be any object that coerces to such a list.

  • The third argument of the Capture is a Supply that emits Str and Blob values, or any object that coerces into such a Supply. Both List and Array coerce into such a Supply safely, so we can just return an Array or List in simple cases.
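The rules above can be sketched from the server's point of view. This is only an illustration under my own assumptions: run-app, list-app, and supply-app are hypothetical names, not part of the spec, and a real server would write chunks to a socket rather than join them into a string.

```perl6
# Hypothetical server-side helper: wait for the response, coerce each part,
# and drain the body no matter how the application chose to produce it.
sub run-app(&app, %env) {
    # .list guards against the result arriving itemized.
    my ($status, $headers, $body) = (await app(%env)).list;

    # Lists and Arrays coerce into an on-demand Supply; a Supply is used as is.
    my Supply $supply = $body ~~ Supply ?? $body !! $body.Supply;

    my @chunks;
    react {
        whenever $supply -> $chunk { @chunks.push($chunk) }
    }
    %( status => $status.Int, headers => $headers, body => @chunks.join );
}

# One application returns a plain Array body ...
sub list-app(%env) {
    start { 200, [ Content-Type => 'text/plain' ], [ "a\n", "b\n" ] }
}

# ... and another returns a Supply body.
sub supply-app(%env) {
    start {
        200, [ Content-Type => 'text/plain' ],
        Supply.on-demand(-> $out { $out.emit("a\n"); $out.emit("b\n"); $out.done })
    }
}

my %a = run-app(&list-app, %());
my %b = run-app(&supply-app, %());
say %a<body> eq %b<body>;
```

The server code never needs to ask which style the application used, which is the point of leaning on coercion.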

This means that we have the simplicity for this to work:

sub app(%env) {
    start {
        my $n = %env.Int;
        my $acc = 1.FatRat;

        200,
        [ Content-Type => 'text/plain' ],
        [ do for 1..$n { "{$acc *= $_}\n" } ]
    };
}

and the flexibility to make something that outputs its first byte faster like this:

sub app(%env) {
    start {
        my $n = %env.Int;
        200,
        [ Content-Type => 'text/plain' ],
        Supply.on-demand(-> $content {
            my $acc = 1.FatRat;
            for 1..$n {
                $content.emit("{$acc *= $_}\n");
            }
            $content.done;
        });
    };
}

and still have middleware that processes each like this:

sub add-one(%env) {
    callsame().then(-> $p {
        # Bind with := so that @h takes only the header list instead of slurping the rest.
        my ($s, @h, Supply(Any) $body) := $p.result;
        $s, @h, $body.map(* + 1);
    });
}
&app.wrap(&add-one);

I think that's pretty awesome.

Cheers.

Edit: The folks in #perl6 corrected a misunderstanding I had about the start { } deprecation. I have adjusted the code here accordingly.

P6SGI: Perl 6 Web Service Gateway Interface

So, I have been meaning to start a Perl 6 blog for a couple of months. At that point, though, this site was having issues, and I have this perverse desire to write blog software every time I think about blogging, so things got put off for a bit. I am now starting this here and I want to get right off and start with what I think is my most important Perl 6 contribution thus far and one I want to get your feedback on: P6SGI!

For those that need instant gratification, here is a P6SGI application:

    # Perl 6
    sub app(%env) { 
        (200, [ 'Content-Type' => 'text/plain' ], [ 'Hello World!' ]) 
    }

That looks quite a bit like its PSGI cousin. That's not the real power of P6SGI, though. For that, you will have to bear with me for a short bit.

I am a web application developer by day and a fan (mostly) of PSGI, the Perl Web Service Gateway Interface. PSGI is a standard that defines something akin to CGI, but is modernized and implementation agnostic. It was developed by Tatsuhiko Miyagawa after seeing the success of similar standards for Python (WSGI) and Ruby (Rack). The nice thing about PSGI is that it requires nothing but the Perl language itself to build a web application, which means it is really simple to use. The bad thing, though, is that when you need to use the advanced PSGI features, you can end up with a callback that calls a callback that calls a callback: it is hard to read and not ideal. That is not so much a weakness of PSGI, but a weakness of what is built in to Perl.

While I have been learning Perl 6 over the past few months, I have learned that Perl 6 has a number of very useful, built-in types that could actually fix these problems. Looking at early app server implementations, I found that each had implemented support for PSGI-style apps like the one above, but none had delved into the deferred or streaming aspects of the standard. I decided I wanted to see what could be done with this and began experimenting.

Let us start with the simplest of these, the deferred PSGI application:

    # Perl 5
    sub app {
        my $env = shift;
        return sub {
            my $res = shift;

            my $content = some_long_running_process($env);
            $res->([ 200, [ 'Content-Type' => 'text/plain' ], [ $content ] ]);
        }
    }

What is going on? If your application returns a code reference, then the server is supposed to call that reference and supply to it another callback that can be called by your application when it has finished running some long running task.

This is verbose and confusing. I did not want to use a callback that calls another callback to implement this in Perl 6. Fortunately, Perl 6 has a tool that is perfect and exactly suited to solving this problem, a Promise:

    # Perl 6
    sub app(%env) {
        start {
            my $content = some-long-running-process(%env);
            (200, [ 'Content-Type' => 'text/plain' ], [ $content ])
        };
    }

This is equivalent to the above in P6SGI, but is much easier to read. What is going on? First, start is a routine that starts an asynchronous process; this is kind of like a fork in Perl 5. Perl 6 is responsible for running that block, and start returns a Promise object right away. The server can then wait for the Promise to be kept or broken. Perl 6 FTW.
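For the server's half of that exchange, here is a minimal sketch. The body of some-long-running-process is a made-up stand-in, and PATH_INFO is used only as an example key; a real server would do far more than await and print.

```perl6
# Hypothetical stand-in for a slow task; not part of P6SGI.
sub some-long-running-process(%env) {
    sleep 0.1;
    "Hello from " ~ %env<PATH_INFO>;
}

sub app(%env) {
    start {
        my $content = some-long-running-process(%env);
        (200, [ 'Content-Type' => 'text/plain' ], [ $content ])
    };
}

# What a server might do: call the app, get a Promise back immediately,
# and only block when it actually needs the result.
my %env     = PATH_INFO => '/';
my $promise = app(%env);

# .list guards against the result arriving itemized.
my ($status, $headers, $body) = (await $promise).list;
say $status;
say $body.join;
```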

That takes care of deferred responses, but what about streaming? Often, you have a large file you need to return or one that returns in bits over a long period of time. For this, you want a streaming response.

In PSGI, a streaming response looks pretty similar to a deferred response, like this:

    # Perl 5
    sub app {
        my $env = shift;    
        return sub {
            my $res = shift;
            my $out = $res->([ 200, [ 'Content-Type' => 'text/plain' ] ]);

            while (my $content = more_data($env)) {
                $out->write($content);
            }
            $out->close;
        }
    }

Again, we have a callback that calls a callback, but if we call that callback with only the status and header bits, we get back a writer object, which is kind of like another callback. We can then use the write and close methods on that writer to send our content back. This works, but again requires a complicated bit of reasoning to understand the callback-callback-callback parts.

Perl 6 comes to our rescue again with a built-in type that seems tailor made to fix this problem, Channels:

    # Perl 6
    sub app(%env) {
        my $stream = Channel.new;
        start {
            while my $content = more_data(%env) {
                $stream.send($content);
            }
            $stream.close;
        };
        (200, [ 'Content-Type' => 'text/plain' ], $stream)
    }

A Channel is an object capable of encapsulating an asynchronous stream of data to a single recipient. If the content returned is a Channel object, the server can receive data from the Channel as it arrives and send it on. There are no callbacks, just clean reactive code.
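As a sketch of the receiving side, assuming the server simply wants every chunk in order, the Channel can be drained with its list method, which blocks until the sender closes it. The chunk values and variable names here are placeholders of my own.

```perl6
# Sketch only: the application fills a Channel from another thread.
my $stream = Channel.new;
start {
    $stream.send($_) for <one two three>;
    $stream.close;
};

# Server side: iterate until the Channel is closed.
my @received;
for $stream.list -> $chunk {
    @received.push($chunk);   # a real server would write to the socket here
}

say @received.join(', ');
```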

If you need to defer and stream, you can do that too by combining a Promise with a result returning a Channel, which is probably what I will normally do since it is shorter and cleaner to write:

    # Perl 6
    sub app(%env) {
        start {
            my Channel $stream = long-process-fills-a-channel(%env);
            (200, [ 'Content-Type' => 'text/plain' ], $stream);
        }
    }

Now, you have a Promise to stream through a Channel.

Another smaller difference in P6SGI from PSGI is the encoding of data. Most developers do not like dealing with encoding at all. When applications interact with other applications over a socket, though, they may not have that luxury. Fortunately, Perl 6 simplifies this by making a strong distinction between encoded and decoded data.

All of the P6SGI application examples here deal with strings and send them to the server without encoding them into blobs first. This puts the burden of encoding on the server. P6SGI specifies how servers should do this. However, applications that want to make sure this is always done precisely can pass the server blobs instead of strings:

    # Perl 6
    sub app(%env) {
        (200, [ 'Content-Type' => 'text/plain; charset=UTF-8' ], [
            "Hello World".encode('UTF-8')
        ]);
    }
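On the server side, the encoding step might look like the following sketch. The helper name to-blob and the UTF-8 default are assumptions for illustration; the spec itself is the authority on which charset a server must use.

```perl6
# Hypothetical helper: pass blobs through untouched, encode everything else.
# The UTF-8 default is an assumption, not a statement of what P6SGI requires.
sub to-blob($chunk, Str :$charset = 'UTF-8') {
    $chunk ~~ Blob ?? $chunk !! $chunk.Str.encode($charset);
}

my $from-str  = to-blob("caf\x[E9]");          # a Str gets encoded by the server
my $from-blob = to-blob(Blob.new(104, 105));   # a Blob is sent exactly as given

say $from-str.elems;
say $from-blob.decode('UTF-8');
```

An application that cares about the exact bytes on the wire hands over blobs and skips the first branch entirely.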

Anyway, I have written all of this up on GitHub in full detail and hope that you will give feedback. Nothing is set in stone at this point as I have not even gotten around to writing the reference implementation yet, but I am very happy with the natural improvements Perl 6 grants to PSGI through Promises, Channels, and better encoding tools.

Please, let me know what you think and, as always, patches welcome.

Cheers.

About Sterling Hanenkamp

Perl hacker by day, Perl 6 module contributor by night.