IO::Glob for Perl 6

This is a module I wrote a while back but never announced. I am pretty happy with how it came out, so here's the announcement. Some JAPHs might be disappointed to learn that one feature of Perl 5 that did not make it into Perl 6 is globbing. That is, doing something like this:

for my $file (glob "src/core/*.pm") { say $file }

With just Perl 6, you need to do something like this instead:

for "src/core".IO.dir(:test(/ .* ".pm" $/)) -> $file { say ~$file }

That's not too terrible, but I still miss the simplicity of globs. In that case, I can use IO::Glob:

use IO::Glob;
for glob("src/core/*.pm") -> $file { say ~$file }

That does the same thing as the Perl 5 code, more or less.

But that's not all. I always wished that globs could be used for pattern matching. Sometimes just matching a string against a glob is handy, but Perl 5's globs are narrow-minded. IO::Glob is not:

use IO::Glob;
for <abc acc acdc>.grep(glob('ac*')) { .say }
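For comparison, here is roughly what that filter looks like in core Perl 6 without IO::Glob. The regex is a hand translation of 'ac*', shown only to illustrate what the glob stands in for. It is not what IO::Glob generates internally.

```raku
# Hand-written regex equivalent of the glob 'ac*': anchored at both ends,
# the literal "ac", then any run of characters.
my @matches = <abc acc acdc>.grep(/^ 'ac' .* $/);
.say for @matches;
```

This keeps acc and acdc and drops abc, since abc does not start with "ac".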

You can apply globs to anything. If you really want directory/file matching semantics, though, you want to work with IO::Path objects:

use IO::Glob;
for <abc acc acdc>.map({ .IO }).grep(glob('ac*')) { .say }

In this particular case, it does not make any difference. However, a path-separating slash is treated as just another character when matching strings, but as a significant file separator when matching paths, so be aware of that difference.
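To make the slash distinction concrete, here is a minimal sketch of a matcher that handles only * and ?. The name glob-match and its :path flag are hypothetical and are not part of IO::Glob. The sketch also assumes the usual shell rule that, in path mode, wildcards never match across a "/".

```raku
# Hypothetical helper, not IO::Glob code: match $text against a glob that
# supports only * and ?. With :path, wildcards refuse to consume a "/".
sub glob-match(Str $pattern, Str $text, Bool :$path = False --> Bool) {
    my @p = $pattern.comb;
    my @t = $text.comb;

    # go($i, $j): does @p from index $i match @t from index $j to the end?
    sub go(Int $i, Int $j --> Bool) {
        return $j == @t.elems if $i == @p.elems;
        given @p[$i] {
            when '*' {
                # Try every length for the wildcard, shortest first.
                for $j .. @t.elems -> $k {
                    return True if go($i + 1, $k);
                    # In path mode the wildcard may not swallow a slash.
                    last if $path && $k < @t.elems && @t[$k] eq '/';
                }
                return False;
            }
            when '?' {
                return False if $j == @t.elems;
                return False if $path && @t[$j] eq '/';
                return go($i + 1, $j + 1);
            }
            default {
                # Any other character must match literally.
                return $j < @t.elems && @t[$j] eq $_ && go($i + 1, $j + 1);
            }
        }
    }

    go(0, 0);
}

say glob-match('src/*.pm', 'src/core/Str.pm');         # True: "/" is just a character
say glob-match('src/*.pm', 'src/core/Str.pm', :path);  # False: * stops at "/"
```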

Finally, this is nice, but there is more to globbing than just '*', right? By default, IO::Glob parses globs using a BSD-style grammar. Therefore, without doing anything special, you can use *, ?, [abc], [!abc], ~, {ab,cd,efg} and they do what you expect.

Sometimes, though, that's just overkill. You can request a simple grammar that just supports * and ?:

use IO::Glob;
for glob("src/core/*.pm", :grammar(IO::Glob::Simple)) -> $file { say ~$file }

Or you can use SQL-style globbing, if you prefer:

use IO::Glob;
for glob("src/core/%.pm", :grammar(IO::Glob::SQL)) -> $file { say ~$file }

From here you could even write your own grammar, but I am not going to go into those details here.

If you are trying out Perl 6 and find that you want to glob some files, IO::Glob is here to help.

Cheers.

Async Aborts and P6SGI

So, it's been a couple of months or so since I last posted about this. Since then, I gave a talk about it at the Pittsburgh Perl Workshop. After that, I took a Perl 6 hiatus because life got busy and I was a little burned out. In the past few weeks, I've done a little bit of work: cleaning up some things, making changes that had been slow-cooking in my brain during the hiatus, etc. However, I'm putting P6SGI on another hiatus, but this time it has nothing to do with me and everything to do with the state of Perl 6.

Why? In order to make any further progress on P6SGI, I really need an implementation to prove that what I'm proposing can actually be done the way I'm proposing it. I'm pretty sure it can, but it surely needs a few tweaks that will only be found with an implementation. There are also some blue sky ideas I want to play with to see if they're feasible, but can't without an implementation because they are so far out in space.

To that end, I have built a small library that will be named HTTP::Supply (though it was provisionally named HTTP1::StreamParser until I could come up with something better). This HTTP parser is the core of what must happen to make an implementation of P6SGI: a library that takes bytes from a socket connection and turns them into an asynchronous Supply of HTTP requests.
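As a rough illustration of the idea, and not the actual HTTP::Supply API, the sketch below turns a Supply of byte chunks into a Supply of raw request-header blocks. The name header-blocks is hypothetical, a Supplier stands in for the socket, and request bodies are ignored entirely.

```raku
# Hypothetical helper, not the HTTP::Supply API: emit one string per
# complete header block found in a stream of byte chunks.
sub header-blocks(Supply $chunks --> Supply) {
    my $separator = "\r\n\r\n";   # the blank line that ends a header block
    supply {
        my $buffer = '';
        whenever $chunks -> $blob {
            # A single-byte encoding keeps arbitrary bytes decodable.
            $buffer ~= $blob.decode('iso-8859-1');
            loop {
                my $end = $buffer.index($separator);
                last unless $end.defined;
                emit $buffer.substr(0, $end);
                # Use .chars rather than a hard-coded length: Perl 6
                # treats "\r\n" as a single character.
                $buffer = $buffer.substr($end + $separator.chars);
            }
        }
    }
}

my $socket = Supplier.new;   # stands in for a real socket connection
my @requests;
header-blocks($socket.Supply).tap: { @requests.push($_) };

# Chunks may arrive split at arbitrary points.
$socket.emit("GET / HTTP/1.0\r\nHo".encode('ascii'));
$socket.emit("st: example\r\n\r\nGET /two HTTP/1.0\r\n\r\n".encode('ascii'));
$socket.done;

say @requests.elems;
```

A real parser would also have to parse the request line and headers and hand the body off separately. This only shows the buffering step.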

This new hiatus basically comes down to this output I get when testing the library:

% perl6 -Ilib t/http-1.0.t
1..40
ok 1 - environment looks good
ok 2 - input found in environment
ok 3 - message body looks good
ok 4 - no more requests expected
ok 5 - environment looks good
ok 6 - input found in environment
ok 7 - message body looks good
ok 8 - no more requests expected
ok 9 - environment looks good
ok 10 - input found in environment
ok 11 - message body looks good
ok 12 - no more requests expected
ok 13 - environment looks good
ok 14 - input found in environment
ok 15 - message body looks good
ok 16 - no more requests expected
ok 17 - environment looks good
ok 18 - input found in environment
ok 19 - message body looks good
ok 20 - no more requests expected
ok 21 - environment looks good
ok 22 - input found in environment
ok 23 - message body looks good
ok 24 - no more requests expected
ok 25 - environment looks good
ok 26 - input found in environment
ok 27 - message body looks good
ok 28 - no more requests expected
ok 29 - environment looks good
ok 30 - input found in environment
ok 31 - message body looks good
ok 32 - no more requests expected
ok 33 - environment looks good
ok 34 - input found in environment
zsh: abort      perl6 -Ilib t/http-1.0.t

That abort there is the issue. My code could most definitely be at fault, but I am of the opinion that it should not be an abort message, but a Perl 6 failure. This is Perl, not C. I'm not really an expert in tracking down my errors when the VM is aborting without giving me any clues at all.

I have been able to track down where this is happening inside of MoarVM and libuv. My C fu is that good, but my libuv fu and pthreads fu are pretty weak. All I know is that it appears to be some sort of action-at-a-distance issue occurring during thread cleanup within MoarVM. At least that's what it looks like to me. I'm a module guy, though, not a VM hacker. I need help on this one.

If you're interested and have the skills required, or are willing to acquire them, to help me past this hiatus, I'll happily buy you coffee or a beer. Help me and then tell me where to send the gift card. In the meantime, I'm moving on to other Perl 6 problems I've been thinking of tackling.

Cheers.

P6SGI: More of a Journey than a Destination

When I started working on P6SGI, I thought, "Hey, I'll just update PSGI to use Perl 6, take advantage of some async data structures, and be done." That is not how this process has gone down. First, I learned that I needed to know more about Perl 6. Then, I found that I need to know more about HTTP/1.1 and more about PSGI. Most recently, I have been researching HTTP/2, Mojolicious, WebSockets, Akka, and a whole pile of other things.

So, here's the progress report on things that have changed in the last week or so on our way toward a complete P6SGI standard, which is still a ways off.

Standardizing the application life cycle. I count this as the most significant change this week. An application server is now expected to call the application as soon as the request headers have been read. That is, it should not wait until the entire request body has been read from the incoming socket. This makes it possible for applications to respond to Expect: 100-continue headers. Furthermore, the application server should send the response headers back at the earliest opportunity. Therefore, although it is not mandated, application servers are strongly encouraged to always run concurrently with their applications.

Using Supply for all the things. The p6sgi.input and p6sgi.errors streams have been replaced with Supply objects. The input stream comes from the server as a series of Blob objects while the application and middleware write Str objects to the error stream.
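A minimal sketch of what consuming the input stream might look like from the application side. Only the p6sgi.input name comes from the draft. The response shape (status, list of header Pairs, body Supply, delivered through a Promise) is an assumption borrowed from PSGI, and the app itself is hypothetical.

```raku
# Hypothetical application: counts request-body bytes as they stream in.
sub app(%env) {
    start {
        my $bytes = 0;
        # p6sgi.input is a Supply of Blob chunks; react ends when it is done.
        react {
            whenever %env<p6sgi.input> -> $chunk {
                $bytes += $chunk.bytes;
            }
        }
        # Assumed response shape: status, headers, body Supply.
        200, [ Content-Type => 'text/plain' ],
            Supply.from-list("received $bytes bytes\n");
    }
}

# Stand-in for what a server would pass; a real one feeds socket data.
my %env = 'p6sgi.input' => Supply.from-list('ab'.encode, 'cde'.encode);
my ($status, $headers, $body) = await app(%env);
my $text = $body.list.join;
print $text;
```

Because the body arrives as a Supply, the application can start working on the first chunk without waiting for the rest of the request.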

Also the output stream has been expanded to allow Lists of Pairs so that trailing headers or multiple-headers (as may occur handling 1xx responses) may be sent. In PSGI, this was handled just by writing out the headers as part of the content directly. This is a problem, however, if the application server wants to implement HTTP/2 or WebSockets as they have special framing requirements. This way the application needs to know a bit less about the protocol and can focus on the content more.

Another consequence of input streaming is that buffering is now gone from the spec. It will come back as an extension of some kind to handle application servers that are inherently buffered (like CGI) or those that provide it at the server for convenience. Details TBD.

Additional promises have been added. I have specified new Promises to be kept by the server. The p6sgi.ready Promise is kept when the server has tapped and is ready to receive the output, just in case the application wants to deliver a live Supply for some reason. The p6sgix.header.done Promise is kept when the server has sent the headers and the p6sgix.body.done Promise is kept when the server has finished sending the body.
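Here is a small sketch of why a ready Promise matters for a live Supply. Both sides are simulated in one script, and the variable names are made up for illustration. In a real server the Promise would arrive through the environment as p6sgi.ready.

```raku
my $ready  = Promise.new;    # plays the role of p6sgi.ready
my $output = Supplier.new;   # live Supply: emits are lost if nobody has tapped
my @sent;

# Application side: wait for the server before emitting anything.
my $app = start {
    await $ready;
    $output.emit("chunk $_") for 1..3;
    $output.done;
}

# Server side: tap first, then keep the promise.
$output.Supply.tap: { @sent.push($_) };
$ready.keep;

await $app;
say @sent.join(', ');
```

Without the await, the application could emit before the tap exists and those chunks would simply vanish.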

Extensions have been ported over from PSGI. I have ported the extensions of PSGI over without much modification or thought. These will likely change. The p6sgix.io extension, in particular, is controversial, problematic, and might not even be necessary. (Partly discussed below.)

Backpressure extensions have been added. I have added a set of "backpressure" extensions (also providing a Supply) that allow the application to monitor when the output socket is blocking. This allows an application that is sending a long stream to a slow connection to pause processing while waiting for packets to clear the intertubes between peers. (Though, implementation of these will wait until non-blocking I/O is implemented.)

Protocol header extension. I am actively thinking through the first draft of this issue as I finish this post. One problem with PSGI is that it has a strong assumption that your application only cares about some flavor of HTTP/1.1. This is hardly likely in the modern web and a weakness of PSGI. PSGI provides a basic extension, psgix.io, that aims to allow the application to implement more advanced subprotocols. This has at least two major problems:

  1. The details of psgix.io are implementation-specific. This means that an application using it is tied to a specific implementation or has to provide multiple implementations.

  2. Why would you want protocol implementation details in the application? The reason you use an application server is so that the server takes care of those details on your behalf. The application should not be responsible for implementing something like WebSocket or HTTP/2 or Comet or whatever.

To address this issue, I am working on a mechanism that would allow the application server to specify additional protocols supported and then allow the application to request that the server negotiate the handshake for these. So far, this works for protocols like HTTP/2 or WebSocket, where the application can detect a client request to upgrade a connection (by the presence of an upgrade header). In those cases, the application could check an environment variable to see which protocols the server supports. If the desired protocol is supported, the application may tell the server to negotiate the handshake for that protocol and take any other actions required by including a special header, possibly P6SGIx-Upgrade, which specifies the desired upgrade.
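A sketch of the application-side check described above. Every name here is a placeholder: the environment key p6sgix.protocols is invented for illustration, HTTP_UPGRADE assumes the PSGI-style convention for request headers, and P6SGIx-Upgrade is only the tentative header name. None of it is settled in the spec.

```raku
# Hypothetical helper: decide which extra response headers to send so the
# server will negotiate a protocol upgrade on the application's behalf.
sub upgrade-headers(%env --> List) {
    # What the client asked for, if anything (assumed PSGI-style key).
    my $wanted = (%env<HTTP_UPGRADE> // '').lc;
    # What the server says it can negotiate (invented key name).
    my @supported = (%env<p6sgix.protocols> // []).list;

    return () unless $wanted && $wanted eq any(@supported);
    ('P6SGIx-Upgrade' => $wanted,);
}

my %env = HTTP_UPGRADE => 'WebSocket', 'p6sgix.protocols' => ['websocket', 'h2c'];
say upgrade-headers(%env);
```

The application never touches the handshake itself. It only states which upgrade it wants and leaves the framing to the server.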

It is my hope that we can encourage servers to implement specialized protocols by providing upgrades like this or other forms of server-application communication rather than relying on a kludge like psgix.io.

In conclusion, it is my hope that more details of protocol handling can be kept to the server and the applications can focus more directly on the details of message content. It is my hope as well that through better asynchronous communication and concurrent operation of application and application server, we can resolve many of the weaknesses of PSGI and gain an overall more flexible and useful standard for the modern web.

P.S. I will be at the Pittsburgh Perl Workshop in a couple weeks and giving a talk on the P6SGI process and where we're headed with it.

P6SGI: 4 Myths Dispelled

So my last couple of posts have brought up some grousing from people who do not like PSGI in Perl 5. I am sorry to say that some of this grousing against P6SGI has made some assumptions based on PSGI and not based on P6SGI. I am now going to try to dispel some of these myths.

  • P6SGI is a direct port of PSGI to Perl 6. If you have started with that assumption, you are wrong. P6SGI is fundamentally different from PSGI in significant ways, mostly in that it is completely asynchronous from the start: literally, your application must be wrapped in a start-block, which joins a thread.

  • P6SGI will not do X. This is an early spec. Anything can change right now. Nothing is bolted down for at least 3 to 6 months and nothing is final until 12 to 18 months from now. There is time to make it right. Come contribute. If you want a commit bit to make your changes, I'm handing them out if you ask nicely.

  • There is no need for P6SGI/the approach is dead. I beg to differ. I want a P6SGI implementation for myself. Therefore, even if I end up being the only one, there is at least one user that wants it. I am hoping that by engaging the community, I can find other users interested and other use-cases to support.

  • P6SGI is just re-hashing 15 year old technology. I would rather say that it is improving upon 15 year old technology. P6SGI is asynchronous while most of its predecessors are not. It will provide tools for streaming input in and out asynchronously without requiring raw access to the socket connection (not totally spec'd, but it will be soon). This is not your father's web gateway.

Seriously folks. We have history with PSGI and WSGI and whatever, but we're not stopping there. Let's get it right this time so that Perl 6 has the implementation everyone is copying for the next 15 years.

P6SGI: Smack the Reference Implementation

The P6SGI standard is progressing reasonably well now. There are a number of issues yet to be worked out, but it is a reasonably good start. However, before we can really be sure of that, we need an implementation that puts the standard to use, helps us find the warts, and provides a way to get started working with it.

Introducing: Smack.

Perl has Plack. Perl 6 has Smack. So far, the basics of its own standalone server are built and working, and CGI support has been started. Some of the built-in apps and some standard middleware have been drafted, but there is a large amount of work to be done.

I have a regular job, 3 boys, and limited spare time. If you have any interest in helping work on the next generation of Perl application servers, your help would be most welcome.

Cheers.