perl6 Archives

Perl 6 Advent Calendar 2018 Call for Authors

Read this article on Rakudo.Party

Every year since 2009, the Perl 6 community has published a Perl 6 advent calendar in the form of blog posts on perl6advent.wordpress.com.

To keep up this great tradition, we need 24 blog posts and volunteers to write them. If you want to contribute a blog post about anything related to Perl 6, please add your name (and, if you already have one, a topic) to the schedule. If you don't yet have a login on the advent blog, please tell Zoffix or someone on the #perl6 IRC channel your email address so that they can send you a WordPress invitation for the site.

Perl 6 advent blog posts should be finished the day before they are due and scheduled for publication at midnight (UTC) on the due date.

If you have any questions, or want to discuss blog post ideas, please join the #perl6 IRC channel on irc.freenode.org.

The 100 Day Plan: The Update on Perl 6.d Preparations

Today is a milestone of sorts: it's 100 days before the first scheduled Rakudo Perl 6 compiler release that will occur after this year's festival of Diwali. As some know, Diwali is also the code name for the next major release of the Perl 6 language, version 6.d, which means there's a high chance that in about 100 days you'll be able to install and use it.

I figured I'd write an update on the subject.

When?

The oft-asked question is when 6.d is going to be released. The plan is to have the 6.d specification good and ready to release on this year's Diwali, which is November 6–7.

About 10 days later the Rakudo compiler will be released (compiler-only, not Rakudo Star), with the 6.d language enabled by default. That is, you'll no longer need the use v6.d.PREVIEW pragma to get 6.d features, and if you wish to get the old 6.c behaviour, you'll need an explicit use v6.c pragma.

However, there's a ton of work to do and the work is largely done by volunteers, so we have no compunction about delaying the release of any of the deliverables indefinitely, if the need arises.

What?

The 6.d major version of the Perl 6 programming language includes over 3,400 new commits in its specification. The vast majority of these are clarifications to 6.c spec. In other words, most of these define previously undefined behaviour, rather than specify entirely new features.

Many of the clarifications and new features do not conflict with the 6.c specification. If you're using the Rakudo compiler, you are likely already reaping some of the benefits of the 6.d language, as such things do not require an explicit use v6.d.PREVIEW pragma.

Those who've seen the 6.d Teasers frequently ask for the full list of 6.d changes. That list does not yet exist, as the spec is still in the process of being reviewed. The changelog will be available some time in October. You may have seen the 6.d prep repo, but that just contains guiding info for coredevs and isn't descriptive of the actual 6.d content.

The aforementioned ton of work includes:

  • Still have to review about 2,100 spec commits
  • Still have ~95% of ChangeLog to write
  • Still have to implement 7 TODO features, costing 110 hours
  • Still have 0.3 policies to write (a draft already exists, but needs polishing)
  • Review and spec of any new features that were implemented in Rakudo but were not specced in the language
  • Marketing stuff regarding creation of marketing name alias for the language

The Future

Going forward for future language releases, I foresee us doing a point release every 6 months, and 6.e being released 2 or 3 years after 6.d. The previously 6.d-blocking Issue R#1289 still blocks a number of language changes, and all the 6.d changes blocked by that Issue were pushed to later language versions. So, if that Issue is resolved, that will likely be a reason to cut a language release soon thereafter.

Conclusion

The prep work for the next major release of the Perl 6 Programming Language, version 6.d (Diwali), is in high gear. There's lots of work to do. We will likely release the spec on November 7th, with the compiler release following about 2 weeks after that. The list of changes will be ready in October and does not currently exist in any user-consumable form.

Let the hype begin \o/

-Ofun

Introducing: Perl 6 Marketing Assets Web App

As some of you may already know from occasional tweets and mentions in the Weekly, we have a perl6/marketing repo that contains some flyers and brochures for Perl 6.

With one of the Perl 6 coredevs making a living as a Multi-Media Designer, the repo has seen a steady stream of new pieces designed, when inspiration strikes, or when someone makes a request. There are now several pieces available, but GitHub isn't the greatest interface for this sort of stuff.

Introducing marketing.perl6.org

To make it easier to see what we have available, we made a front-end for our marketing repo, that lets you browse all of the assets. It's hosted at marketing.perl6.org

The Assets

Under the thumbnail of each asset, there are a few buttons that show you which formats are available for download. The last two buttons are the GitHub button and the pencil button. The former will lead you to the GitHub folder that particular asset is in, where you can download any files that aren't shown on the front end (e.g. the source files). The latter will lead you to the New Issue page on the marketing repo, with the title/ID of the piece pre-filled. This is in case you'd like to request a different format, size, or some other changes for that piece.

Each piece has an ID number (a Unix timestamp, e.g. 1516098660). If you want to refer to some piece, try to include its ID, as that's the easiest way for the designer to know what piece you're talking about.

INB4, the Camelia logo variants are so numerous because the rules allow for her colours to be changed. Personally, I prefer transparent wings, as they're easier on my retinas than the default logo.

Keep in mind, you can request new pieces as well. Just file a new Issue in the marketing repo, describing the content that you want, including the sizes/colour restrictions, and our volunteers will hook you up.

The Prints

While files themselves are easy to make for free, the same isn't the case for hard copies. We (the volunteers handling the marketing repo) can't print any hardcopies for you. Unless you are able to use a local printing company and pay out of your own pocket, your best bet would be to contact The Perl Foundation and ask them if they can sponsor the prints. I know they made prints of the Introducing Perl 6 brochure for a conference in the past.

Licenses

The assets shown in the marketing web app are licensed under the Creative Commons Attribution 4.0 International License. The Camelia logo is copyright by Larry Wall. Some of the pieces use purchased stock, which may have licenses that limit super-large print runs (50,000+ copies). Check the files in the repo or contact Zoffix if you have an unusual use case for the materials and wish to clarify the licensing.

The source files (InDesign/Photoshop/Adobe Illustrator) themselves can be modified freely, under the terms of the Creative Commons Attribution 4.0 International License. Any images/fonts/other assets used by those source files might have additional licensing restrictions, which will usually be noted in the directory for that asset.

Conclusion

Going out for some tech meetup? Print out a few pieces from our marketing web app, hand them out, share the Perl 6 love!

-Ofun

Talk Slides and Recording: "Intro Into Perl 6 Regexes and Grammars"

Last week I gave an "Intro Into Perl 6 Regexes and Grammars" talk at the Toronto Perl Mongers, whom I thank for letting me speak.

Instead of the Google Hangout that is usually set up, we got to use the fancy equipment provided by the company that was letting us use their space. Unfortunately, it's currently unclear if the talk was recorded and if there will be a video of it.

So, I figured I'd make a screencast of the talk. You won't get some of the discussions that occurred during the meeting, but the content of the talk itself is pretty much identical.

You can view the slides at https://tpm-regex.perl6.party/ and the screencast of the talk is on YouTube:

Talk Slides and Recording: "Faster Perl 6 Programs"

Last week I gave a "Faster Perl 6 Programs" talk at the Toronto Perl Mongers, whom I thank for letting me speak.

Instead of the Google Hangout that is usually set up, we got to use the fancy equipment provided by the company that was letting us use their space. Unfortunately, it's currently unclear if the talk was recorded and if there will be a video of it.

So, I figured I'd make a screencast of the talk. You won't get some of the discussions that occurred during the meeting, but the content of the talk itself is pretty much identical.

You can view the slides at https://tpm-perf.perl6.party/ and the screencast of the talk is on YouTube:

The Missing Contributors of Perl 6

Today, I came across a reddit post from a couple months back, from a rather irate person claiming to be possibly the only person to never receive any credit for their work on Perl 6.

I was aware that person had made at least one commit and, knowing the contributors list is generated automagically with a script, I thought to myself "Well, that's clear and provable bullshit." And I went to prove it.

Moar No More

I looked up the commit I knew about, looked at the release announcement for the release it went into and… that person was indeed missing! It was the 2017.02 release, which I released. So what was going on? Did I have an alter-ego that shamelessly erased random people from the contrib list without my having any memory of it?!

First, a brief intro on how the contrib script works: it uses git to look up commits in checkouts of 5 repos: Rakudo, NQP, MoarVM, Docs, and Roast. Until December 2016 the script just used the day of the release as the time of the last release; it was later switched to using the timestamp on Rakudo's tag. The script gathers all the contributors from commits, crunches the names through the names map in the CREDITS files in the repos, and spits out the names ordered by the number of commits made, largest first.
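A rough sketch of the mapping-and-counting step described above might look like the following. The author names, the %names-map hash (standing in for the names map in the CREDITS files), and the contributors sub are all made up for illustration; this is not the actual contrib script, and the git lookup is left out entirely.

```raku
# Hypothetical author names, as they might come out of `git log`
my @commit-authors = 'alice', 'Alice Example', 'bob', 'alice', 'bob', 'carol';

# Hypothetical stand-in for the names map in the CREDITS files
my %names-map = alice => 'Alice Example', bob => 'Bob Sample';

sub contributors (@authors, %map) {
    my %commits;
    # Normalize each name through the map; keep unmapped names as they are
    %commits{ %map{$_} // $_ }++ for @authors;

    # Largest number of commits first; ties are broken alphabetically
    %commits.sort({ $^b.value <=> $^a.value || $^a.key cmp $^b.key }).map: *.key
}

say contributors(@commit-authors, %names-map).join: ', ';
# OUTPUT: «Alice Example, Bob Sample, carol␤»
```

Note that a name missing from the map (carol here) is still credited, just under whatever name the commit carried.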

I set out to figure out why a person was missing from the release announcement. After digging through commits, CREDITS files, and tracing the code in the contributor-generating script, I found out that in September 2016, I introduced a bug into the contributors script. After some refactoring I accidentally left out the MoarVM repository from the list of repos the script searches, so all the contributors to MoarVM since September 2016 were missing! Since many of them also contribute to the other 4 repos, it was harder to spot that something was wrong.

I filed the problem as R#2024 and left it at that for the time being.

Missing More

I started working on the problem and implemented a new feature in the contributors script that lets you look up contributors for past releases. Neat! So let's try it out for some release before my bug was made, shall we?

I ran the contrib script for the 2016.08 release and then ran another script that diffed the names from that output against what is in the release announcement. The output was:

Announcement has these extra names: David Warring
Contrib script has these extra names: Arne Skjærholt, Bart Wiegmans

The announcement had an extra name and was missing two. The way the contrib script figures out when one release ends and another starts is iffy, especially so in the past. There's a gap of about a day where contributors can slip through: e.g. release manager runs the release at 2PM, someone commits at 3PM, and that commit didn't make it into this release and will be included in the next, or might even be missed entirely.

So that was one problem I noticed. Is that where the difference for 2016.08 release names list comes from? Let's try the earliest post-The-Christmas release: 2016.01-RC1

Announcement has these extra names: Andy Weidenbaum, Lloyd Fournier, skids
Contrib script has these extra names: A. Sinan Unur, Aleks-Daniel Jakimenko-Aleksejev,
Brad Gilbert, Brian S. Julin, Brock Wilcox, Bruce Gray, Carl Masak,
Christian Bartolomäus, Christopher Bottoms, Claudio Ramirez, Dale Evans,
Dave Olszewski, David Brunton, Fritz Zaucker, Jake Russo, James ( Jeremy )
Carman, Jeffrey Goff, Jim Davis, John Gabriele, LLFourn, Marcel Timmerman,
Martin Dørum Nygaard, Neil Shadrach, Salvador Ortiz, Shlomi Fish, Siavash
Askari Nasr, Stéphane Payrard, Sylvain Colinet, Wenzel P. P. Peppmeyer,
Zoffix Znet, fireartist, sylvarant, vinc17fr

That's huge! One name stood out to me in that list—and it isn't my own—it was that same person from reddit who was complaining that they don't get credit. They got left out twice: in 2016.01 and again in 2017.02. No wonder they're pissed off, but I wish they would've said something in 2016.01, so we'd've fixed the Missing Persons issues back then instead of now.

The 2016.02 release has a bunch of missing names as well. I can surmise the cause of the issue is a previously fixed mis-implementation of the contributors script, where it'd be quiet if some of the repo checkouts were missing. Neither I (until that point) nor earlier release managers had all of the repos at the locations the script was expecting, so it's possible that's how some repos were missed.

At the time, I assumed only the docs repo was missing and we credited missing Docs contributors in the 2016.08 announcement. However, now I realize that other release managers likely had different directory setups and thus missed different sets of people.

The Future

Thus, we have identified four issues with the way the contributors script is or has been generating the list of contributors:

  • Relying on the time when the release manager runs the contributor script, potentially creating a gap of unrecorded contributions between the time the script is run and the time the next run of contrib script considers as "last release"
  • Relying on release manager's setup of directories/repos. Even after a previous fix in this area, we're still relying on the release manager to have up-to-date checkouts of repos
  • Missing contributors from entire repositories due to unnoticed bug in the code
  • What happens with commits made at the time of past release in a branch that is merged at the time of the next release? Do they get lost?

I'm taking the lazy way out and leaving it to the current release managers to resolve these problems. I filed R#2028 with the list of issues and have full trust the solution that will be implemented will be suitable :)

The Found

And now, of course, the list of previously unsung heroes who made Perl 6 better in the past two and a half years, in alphabetical order. I've also added them to our past release announcements.

It's possible this list includes the missing-found from the 2016.08 announcement as well as people who were not logged in the CREDITS file in the past but are now, but I figure it's better to list them twice than not at all.

If you still believe we're missing someone, let us know so the problem can be fixed.

2016.01/2016.01-RC1

A. Sinan Unur, Aleks-Daniel Jakimenko-Aleksejev, Brad Gilbert, Brian S. Julin, Brock Wilcox, Bruce Gray, Carl Masak, Christian Bartolomäus, Christopher Bottoms, Claudio Ramirez, Dale Evans, Daniel Perrett, Dave Olszewski, David Brunton, Fritz Zaucker, Jake Russo, James ( Jeremy ) Carman, Jeffrey Goff, Jim Davis, John Gabriele, LLFourn, Marcel Timmerman, Martin Dørum Nygaard, Neil Shadrach, Salvador Ortiz, Shlomi Fish, Siavash Askari Nasr, Stéphane Payrard, Sylvain Colinet, Wenzel P. P. Peppmeyer, Zoffix Znet, fireartist, raiph, sylvarant, vinc17fr

2016.02

Bart Wiegmans, Brian S. Julin, Brock Wilcox, Daniel Perrett, David Brunton, Eric de Hont, Fritz Zaucker, Marcel Timmerman, Nat, Pepe Schwarz, Robert Newbould, Shlomi Fish, Simon Ruderich, Steve Mynott, Wenzel P. P. Peppmeyer, gotoexit, raiph, sylvarant

2016.03

Ahmad M. Zawawi, Aleks-Daniel Jakimenko-Aleksejev, Bahtiar kalkin- Gadimov, Bart Wiegmans, Brian S. Julin, Brock Wilcox, Claudio Ramirez, Emeric54, Eric de Hont, Jake Russo, John Gabriele, LLFourn, Mathieu Gagnon, Paul Cochrane, Siavash Askari Nasr, Zoffix Znet, jjatria, okaoka, sylvarant

2016.04

Brian S. Julin, Brock Wilcox, Christopher Bottoms, David H. Adler, Donald Hunter, Emeric54, Itsuki Toyota, Jan-Olof Hendig, John Gabriele, Mathieu Gagnon, Nick Logan, Simon Ruderich, Tom Browder, Wenzel P. P. Peppmeyer, Zoffix Znet

2016.05

Aleks-Daniel Jakimenko-Aleksejev, Brian Duggan, Brian S. Julin, Brock Wilcox, Christopher Bottoms, Clifton Wood, Coleoid, Dabrien 'Dabe' Murphy, Itsuki Toyota, Jan-Olof Hendig, Jason Cole, John Gabriele, Mathieu Gagnon, Philippe Bruhat (BooK), Siavash Askari Nasr, Sterling Hanenkamp, Steve Mynott, Tadeusz “tadzik” Sośnierz, VZ, Wenzel P. P. Peppmeyer, Will Coleda

2016.06

(contrib script missing repos issue is fixed around this point, so the number of missing persons drops. Remaining ones are likely the ones that fell into the gap between releases; particularly MoarVM and docs contributors)

Itsuki Toyota, Matthew Wilson, Will Coleda, parabolize

2016.07

Bart Wiegmans, Brian S. Julin, Daniel Perrett, David Warring, Dominique Dumont, Itsuki Toyota, thundergnat

2016.08

Arne Skjærholt, Bart Wiegmans

2016.09

(missing MoarVM bug is introed at this point; we start to see the missing MoarVM devs who mostly work on MoarVM and not other repos. Also a bunch of docs people who likely fell into the gap between releases)

Alexey Melezhik, Bart Wiegmans, Paul Cochrane

2016.10

Brent Laabs, Jimmy Zhuo, Steve Mynott

2016.11

Bart Wiegmans, Itsuki Toyota, Jimmy Zhuo, Mark Rushing

2016.12

Bart Wiegmans, Jimmy Zhuo, LemonBoy, Nic Q, Reini Urban, Tobias Leich, ab5tract

2017.01

Antonio Quinonez, Jimmy Zhuo, M. Faiz Zakwan Zamzuri

2017.02

A. Sinan Unur, Bart Wiegmans, Benny Siegert, Jeff Linahan, Jimmy Zhuo, Lucas Buchala, M. Faiz Zakwan Zamzuri

2017.03

Jonathan Scott Duff, Lucas Buchala, Moritz Lenz

2017.04

Bart Wiegmans, eater

2017.05

Bart Wiegmans, Paweł Murias

2017.06

Bart Wiegmans, Jimmy Zhuo, Oleksii Varianyk, Paweł Murias, Robert Lemmen, gerd

2017.07

Bart Wiegmans, Douglas Schrag, Gerd Pokorra, Lucas Buchala, Paweł Murias, gerd

2017.08

Bart Wiegmans, Dagfinn Ilmari Mannsåker, Douglas L. Schrag, Jimmy Zhuo, Mario, Mark Montague, Nadim Khemir, Paul Smith, Paweł Murias, Philippe Bruhat (BooK), Ronald Schmidt, Steve Mynott, Sylvain Colinet, rafaelschipiura, ven

2017.09

Bart Wiegmans, Dan Zwell, Itsuki Toyota, Jan-Olof Hendig, Jimmy Zhuo, Mario, Paweł Murias, Rafael Schipiura, Skarsnik, Will Coleda, smls

2017.10

Bart Wiegmans, Jimmy Zhuo, Joel, Julien Simonet, Justin DeVuyst, M, Mario, Martin Ryan, Moritz Lenz, Patrick Sebastian Zimmermann, Paweł Murias, bitrauser, coypoop, eater, mryan, smls

2017.11

Bart Wiegmans, Jimmy Zhuo, Martin Barth, Patrick Zimmermann, Paweł Murias

2017.12

Bart Wiegmans, Paweł Murias, Stefan Seifert, brian d foy

2018.01

Bart Wiegmans, Daniel Dehennin, Paweł Murias, Stefan Seifert, Will Coleda

2018.02

Bart Wiegmans, Daniel Green, Paweł Murias, cygx, wukgdu

2018.03

Bart Wiegmans

2018.04

Bart Wiegmans, Paweł Murias, cc, gerd

2018.05

Antonio, Bart Wiegmans, elenamerelo

2018.06

Bart Wiegmans, JJ Merelo

Conclusion

So this was quite a fun investigation and hopefully all the missing people have been found and this is the last missing-found persons list we compile.

The most important lesson, however, is: report problems as soon as you find them. We could've fixed this at the start of 2016, and those who knew they were left out could've saved two years of being upset about it.

-Ofun

Perl 6 Colonpairoscopy

If I were to pick the most ubiquitous construct in the Perl 6 programming language, it'd most definitely be the colonpair. Hash constructors, named arguments and parameters, adverbs, and regex modifiers—all involve the colonpair. It's not surprising that with such breadth there would be many shortcuts when it comes to constructing colonpairs.

Today, we'll learn about all of those! Doing so will have us looking at the simplest as well as some of the more advanced language constructs, so if parts of this article make you scratch your head, don't worry—you don't have to learn all of it at once!

PART I: Creation

Colonwhaaaa?

The colonpair gets its name from (usually) being a Pair object constructor and (usually) having a colon in it. Here are some examples of colonpairs:

:foo,
:$bar,
:meow<moo>,
heh => hah

The last one doesn't have a colon in it, but since it's basically the same thing as other colonpairs, I personally consider it a colonpair as well.

We can see the colonpairs make Pair objects by dumping their .^name:

say :foo.^name; # OUTPUT: «Pair␤»

However, when used in argument lists, the colonpairs are specially handled to represent named arguments. We'll get to that part later in the article.
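As a small preview of that special handling, here is a sketch using a made-up arg-counts sub (not something from the language itself) that reports how many positional and named arguments it received. A bare colonpair in the argument list arrives as a named argument, while wrapping it in an extra set of parentheses passes the Pair object positionally:

```raku
# Hypothetical helper: reports how its arguments were received
sub arg-counts (*@pos, *%named) {
    "{+@pos} positional, {+%named} named"
}

say arg-counts(:foo);   # OUTPUT: «0 positional, 1 named␤»
say arg-counts((:foo)); # OUTPUT: «1 positional, 0 named␤»
```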

The Shortcuts

Here's a mostly-complete list of available ways to write a colonpair you can glance over before we dive in. I know, it looks like a huge list, but that's why we're reading this article—to learn the general patterns that make up all of these permutations.

# Standard, take-any-type, non-shortcut form
:nd(2).say;             # OUTPUT: «nd => 2␤»
:foo('foo', 'bar').say; # OUTPUT: «foo => (foo bar)␤»
:foo( %(:42a, :foo<a b c>) ).say;
# OUTPUT: «foo => {a => 42, foo => (a b c)}␤»

# Can use fat-arrow notation too:
# (parentheses around them are here just for the .say call)
(nd => 2).say; # OUTPUT: «nd => 2␤»
(foo => ('foo', 'bar') ).say; # OUTPUT: «foo => (foo bar)␤»
(foo => %(:42a, :foo<a b c>) ).say;
# OUTPUT: «foo => {a => 42, foo => (a b c)}␤»

# Booleans
:foo .say; # OUTPUT: «foo => True␤»
:!foo.say; # OUTPUT: «foo => False␤»

# Unsigned integers:
:2nd   .say; # OUTPUT: «nd => 2␤»
:1000th.say; # OUTPUT: «th => 1000␤»

# Strings and Allomorphs (strings that look like numbers are Str + numeric type)
:foo<bar>      .say; # OUTPUT: «foo => bar␤»
:bar<42.5>     .say; # OUTPUT: «bar => 42.5␤»
:bar<42.5>.perl.say; # OUTPUT: «:bar(RatStr.new(42.5, "42.5"))␤»

# Positionals
:foo['foo', 42.5] .say; # A mutable Array:   OUTPUT: «foo => [foo 42.5]␤»
:foo<foo bar 42.5>.say; # An immutable List: OUTPUT: «foo => (foo bar 42.5)␤»
# angled brackets give you allomorphs!

# Callables
:foo{ say "Hello, World!" }.say;
# OUTPUT: «foo => -> ;; $_? is raw { #`(Block|82978224) ... }␤»

# Hashes; keep 'em simple so it doesn't get parsed as a Callable
:foo{ :42a, :foo<a b c> }.say; # OUTPUT: «foo => {a => 42, foo => (a b c)}␤»

# Name and value from variable
:$foo;  # same as :foo($foo)
:$*foo; # same as :foo($*foo)
:$?foo; # same as :foo($?foo)
:$.foo; # same as :foo($.foo)
:$!foo; # same as :foo($!foo)
:@foo;  # same as :foo(@foo)
:@*foo; # same as :foo(@*foo)
:@?foo; # same as :foo(@?foo)
:@.foo; # same as :foo(@.foo)
:@!foo; # same as :foo(@!foo)
:%foo;  # same as :foo(%foo)
:%*foo; # same as :foo(%*foo)
:%?foo; # same as :foo(%?foo)
:%.foo; # same as :foo(%.foo)
:%!foo; # same as :foo(%!foo)
:&foo;  # same as :foo(&foo)
:&*foo; # same as :foo(&*foo)
:&?foo; # same as :foo(&?foo)
:&.foo; # same as :foo(&.foo)
:&!foo; # same as :foo(&!foo)

Let's break these up and take a closer look!

Standard, Take-any-Type, Non-Shortcut Form

The "standard" form of the colonpair consists of a colon (:), a valid term that functions as the .key of the created Pair object, and then a set of parentheses inside of which is the expression with the .value for the Pair:

:nd(2).say;                       # OUTPUT: «nd => 2␤»
:foo('foo', 'bar').say;           # OUTPUT: «foo => (foo bar)␤»
:foo( %(:42a, :foo<a b c>) ).say;
# OUTPUT: «foo => {a => 42, foo => (a b c)}␤»

As long as the key is a valid identifier, all other forms of colonpairs can be written this way. For keys that aren't valid identifiers, you can simply use the .new method, as in Pair.new('the key', 'value'), or the "fat arrow" syntax.
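To make that concrete, here is a short sketch (the variable names are only for illustration) showing Pair.new with a key that isn't a valid identifier, and confirming that the standard colonpair form and the fat arrow produce equivalent Pair objects:

```raku
# A key with a space in it can't be written in :key(value) form
my $pair = Pair.new('the key', 'value');
say $pair;     # OUTPUT: «the key => value␤»
say $pair.key; # OUTPUT: «the key␤»

# For identifier keys, the two notations build equivalent Pairs
my $colon = :nd(2);
my $arrow = (nd => 2);
say $colon eqv $arrow; # OUTPUT: «True␤»
```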

Fat Arrow Syntax

If you ever used Perl 5, you need no introductions to this syntax: you write the key—which will get auto-quoted if it's a valid identifier, so in those cases you can omit the quotes—then you write => and then you write the value. The quotes around the key are required if the key is not a valid identifier and the fat arrow is the only operator-involved syntax that will let you construct Pairs with such keys:

# (outer parentheses are here just for the .say call)
(nd => 2).say; # OUTPUT: «nd => 2␤»
(foo => ('foo', 'bar') ).say; # OUTPUT: «foo => (foo bar)␤»
(foo => %(:42a, :foo<a b c>) ).say;
# OUTPUT: «foo => {a => 42, foo => (a b c)}␤»
("the key" => "the value").say; # OUTPUT: «the key => the value␤»

There are some extra rules with how this form behaves in argument lists as well as sigilless variables and constants, which we'll see later in the article.

Boolean Shortcut

Now we start getting into shortcuts! What would the most common use of named parameters be? Probably, to specify boolean flags.

It'd be pretty annoying to always have to write those as :foo(True), so there's a shortcut: simply omit the value entirely. If you want :foo(False), omit the value and put the negation operator (!) right after the colon:

# Booleans
:foo .say; # OUTPUT: «foo => True␤»
:!foo.say; # OUTPUT: «foo => False␤»

# Equivalent calls:
some-sub :foo :!bar :ber;
some-sub foo => True, bar => False, ber => True;

The shortcut form is a lot shorter. This is also the form you may see in adverbs and regex syntax, such as the :g adverb on the m// quoter and :s/:!s significant whitespace modifier inside the regex:

say "a b c def g h i" ~~ m:g/:s \S \S [:!s \S \s+ \S]/;
# OUTPUT: «(「a b c d」 「f g h i」)␤»

Here's also another trick from my personal bag: since Bool type is an Int, you can use boolean shortcuts to specify Int values 1 and 0:

# set `batch` to `1`:
^4 .race(:batch).map: { sleep 1 };
say now - ENTER now; # OUTPUT: «1.144883␤»

However, for clarity you may wish to use unsigned integer colonpair shortcut instead, which isn't much longer.

Unsigned Integer Shortcut

The Perl 6 programming language lets you grab an nth match when you're matching stuff with a regex:

say "first second third" ~~ m:3rd/\S+/;
# OUTPUT: «「third」␤»

As you can probably surmise by now, the :3rd after the m in the m// quoter is the adverb, written as a colonpair in the unsigned integer shortcut. This form consists of a colon and the name of the key, with an unquoted unsigned integer value placed between them. No signs, no decimal dots, and not even underscore separators between digits are permitted.

The primary use of this shortcut is for things with ordinal suffixes like :1st, :2nd, :9th, etc. It offers great readability there, but personally I have no reservations about using this syntax for all unsigned integer values, regardless of what the name of the key is. It feels slightly offcolour when you first encounter such syntax, but it quickly grows on you:

some-sub :1st :2nd :3rd :42foo :100bar;
^4 .race(:1batch).map: { sleep 1 };

Hash/Array/Callable Shortcuts

Using standard colonpair format you may notice some forms are too parentheses-heavy:

:foo(<42>)                 # value is an IntStr allomorph
:foo(<a b c>)              # value is a List
:foo([<a b c>])            # value is an Array
:foo({:42foo, :100bar})    # value is a Hash
:foo({.contains: 'meows'}) # the value is a Callable

In these forms, you can simply omit the outer parentheses and let the inner brackets and curlies do their job:

:foo<42>                   # value is an IntStr allomorph
:foo<a b c>                # value is a List
:foo[<a b c>]              # value is an Array
:foo{:42foo, :100bar}      # value is a Hash
:foo{.contains: 'meows'}   # the value is a Callable

It looks a lot cleaner and is simpler to write. Both the Hash and Callable use the same set of curlies and the same simple rules as used by the {…} construct elsewhere in the language: if the content is empty, or contains a single list that starts with a Pair literal or %-sigiled variable, and the $_ variable or placeholder parameters are not used, a Hash is created; otherwise a Block (Callable) is created.
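A quick way to see which side of that rule a given colonpair lands on is to ask the value for its type name. The keys and contents below are arbitrary and chosen only for illustration:

```raku
# Starts with a Pair literal and doesn't use $_ or placeholders: a Hash
:foo{ :42a, :b<c> }.value.^name.say; # OUTPUT: «Hash␤»

# Uses the topic variable $_ (implicitly, via .uc): a Block
:foo{ .uc }.value.^name.say;         # OUTPUT: «Block␤»

# Uses a placeholder parameter: also a Block
:foo{ $^a + 1 }.value.^name.say;     # OUTPUT: «Block␤»

# A Block value can be called like any other Callable
:foo{ .uc }.value.('meow').say;      # OUTPUT: «MEOW␤»
```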

The angle bracket form (:foo<…>) follows the same rules as the angle bracket quoter used elsewhere in the language:

:foo< 42  >.value.^name.say; # OUTPUT: «IntStr␤»
:foo<meows>.value.^name.say; # OUTPUT: «Str␤»
:foo<a b c>.value.^name.say; # OUTPUT: «List␤»

And keep in mind that these two forms are not equivalent:

:42foo
:foo<42>

The first creates an Int object, while the second one creates an IntStr object, which is an allomorph. This difference is important for things that care about object identity, such as set operators.
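Here is a minimal sketch of how that difference can bite (the variable names are just for illustration). The two values compare equal numerically, yet they are different objects as far as identity is concerned, so a set holding the allomorph does not contain the plain Int:

```raku
my $int       = :42foo.value;   # Int
my $allomorph = :foo<42>.value; # IntStr

say $int ==  $allomorph;        # OUTPUT: «True␤»  (numeric comparison)
say $int === $allomorph;        # OUTPUT: «False␤» (object identity)

say $int ∈ set($allomorph);     # OUTPUT: «False␤»
say $int ∈ set($allomorph.Int); # OUTPUT: «True␤»
```

Coercing the allomorph with .Int (or using the :42foo form to begin with) avoids the surprise.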

Sigiled Shortcut

The one thing I find a pain in the butt to write in other languages is constructs like this:

my $the-thing-with-a-thing = …
…
some-sub the-thing-with-a-thing => $the-thing-with-a-thing;

It's fairly common to name your variables the same as some named argument to which you wish to pass that variable as a value. The Perl 6 programming language offers a colonpair shortcut precisely for that case. Simply prepend a colon to the variable name to construct a colonpair with the key named the same as the variable (without including the sigil) and the value being the value of that variable. The only catch is the variable must have a sigil, so you can't use this shortcut with sigilless variables or constants.

my $the-thing-with-a-thing = …
…
some-sub :$the-thing-with-a-thing;

You'll notice that the syntax above looks exactly like how you'd declare a parameter that takes such a named argument—consistency is a good thing. All available sigils and twigils are supported, which makes the full list of variants for this shortcut look something like this:

# Name and value from variable
:$foo;  # same as :foo($foo)
:$*foo; # same as :foo($*foo)
:$?foo; # same as :foo($?foo)
:$.foo; # same as :foo($.foo)
:$!foo; # same as :foo($!foo)
:@foo;  # same as :foo(@foo)
:@*foo; # same as :foo(@*foo)
:@?foo; # same as :foo(@?foo)
:@.foo; # same as :foo(@.foo)
:@!foo; # same as :foo(@!foo)
:%foo;  # same as :foo(%foo)
:%*foo; # same as :foo(%*foo)
:%?foo; # same as :foo(%?foo)
:%.foo; # same as :foo(%.foo)
:%!foo; # same as :foo(%!foo)
:&foo;  # same as :foo(&foo)
:&*foo; # same as :foo(&*foo)
:&?foo; # same as :foo(&?foo)
:&.foo; # same as :foo(&.foo)
:&!foo; # same as :foo(&!foo)

This about wraps up the list of currently available colonpair shortcuts. As you can see, the huge list of shortcuts was reduced to a few simple patterns to follow. However, this might not be all the shortcuts that will exist for all the time…

The Future!

While they aren't currently available, the following two shortcuts might become part of the language in future language versions.

The first one is the indirect lookup shortcut. If you have a named variable and the name of that variable in another variable, you can access the value of the first variable using the indirect lookup construct:

my $foo = "bar";
my %bar = :42foo, :70bar;
say %::($foo); # OUTPUT: «{bar => 70, foo => 42}␤»

If you squint, the indirect lookup is sort'f like a sigilled variable, and colonpair shortcuts for sigilled variables already exist. So it makes sense for the language to be consistent and support an indirect lookup colonpair shortcut, where the value of $foo contains the name of the key for the colonpair. It would look something like this:

:%::($foo)

This form is currently listed simply as an unimplemented feature in R#1532, so it'll likely see life some day.

The second possible future construct is the :.foo form, which was proposed in RFC R#1462. This form calls method .foo on the $_ topical variable and uses the return value as the value for the created Pair, with the name of the method being the name of the key.

This form comes up semi-frequently when you're passing values of attributes of one object to another with similarly-named attributes, so something like this:

Some::Other.new: :foo(.foo) :bar(.bar) :ber(.ber) with $obj

Would in shortcut form be written like this:

Some::Other.new: :.foo :.bar :.ber with $obj

At the time of this writing, this RFC has been self-rejected, but you never know if there'd be more calls for introduction of this syntax.
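Until such a shortcut exists, the long-hand form does the job. Here's a minimal sketch of it; the Source and Target classes are hypothetical and exist only to give the example something to copy between:

```perl6
# Hypothetical classes, used only for illustration.
class Source { has $.foo = 1; has $.bar = 2 }
class Target { has $.foo;     has $.bar     }

my $obj = Source.new;

# `with` sets the topic $_ to $obj, so .foo and .bar are called on it.
my $copy = Target.new(:foo(.foo), :bar(.bar)) with $obj;

say $copy.foo; # OUTPUT: «1␤»
say $copy.bar; # OUTPUT: «2␤»
```

Each attribute name has to be typed twice here, which is the repetition the proposed :.foo form was meant to remove.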

PART II: Use

Now that we're familiar with how to write all the forms of colonpairs, let's take a look at some of their available uses, especially those with special rules.

Parameters

To specify that a parameter should be a named rather than a positional parameter, simply use the sigilled variable colonpair shortcut:

sub meow($foo, :$bar) {
    say "$foo is positional and $bar is named";
}
meow 42, :100bar; # 42 is positional and 100 is named
meow :100bar, 42; # 42 is positional and 100 is named

Since parameters need some sort of a variable to bind their stuff to, pretty much all other forms of colonpairs are not available for use in parameters. This means that you can't, for example, declare sigilless named parameters and must instead explicitly use the is raw trait to get the rawness:

sub meow (\foo, :$bar is raw) {
    (foo, $bar) = ($bar, foo)
}
my $foo = 42;
my $bar = 100;
meow $foo, :$bar;
say [$foo, $bar]; # OUTPUT: «[100 42]»

The one other colonpair form available in parameters is the standard form, which is used for aliasing multiple names to the same parameter and for parameter destructuring:

sub meow (:st(:nd(:rd(:$nth))), Positional :list(($, $second, |))) {
    say [$nth, $second];
}
meow :3rd, :list<a b c>; # OUTPUT: «[3 b]»

Pro-tip: if you're using the Rakudo compiler you may wish to take it easy with aliasing. Using aliases more than 1 level deep will cause the compiler to switch to the slow-path binder, which, as the name suggests, is about 10x slower.

A trick you can use is to use more than one parameter, each with aliases at most 1 level deep, and then merge them in the body:

sub meow (:st(:$nd is copy), :rd(:$nth)) {
    $nd //= $nth;
}
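To see the merge at work, here's a small sketch that reuses the signature above and returns the merged value explicitly. The calls and the explicit return are my additions for illustration:

```perl6
sub meow (:st(:$nd is copy), :rd(:$nth)) {
    # Whichever of the two parameters received a value ends up in $nd.
    $nd //= $nth;
    $nd
}

say meow :1st;  # OUTPUT: «1␤»
say meow :2nd;  # OUTPUT: «2␤»
say meow :3rd;  # OUTPUT: «3␤»
say meow :4nth; # OUTPUT: «4␤»
```

All four names remain usable by the caller, yet no alias is nested more than one level deep.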

Argument Lists

Use of colonpairs in argument lists deserves a separate section due to a rule that's subtle enough to earn a spot in the language's traps section. The rule involves the problem that a programmer may wish to pass Pair objects in argument lists as either a named or a positional argument.

In the majority of cases, the colonpairs will be passed as named arguments:

sub args {
    say "Positional args are: @_.perl()";
    say "Named      args are: %_.perl()";
}

args :foo, :50bar, e => 42;
# OUTPUT:
# Positional args are: []
# Named      args are: {:bar(50), :e(42), :foo}

To pass a Pair object as a positional argument, you can do any of the following:

  1. Wrap the entire colonpair in parentheses
  2. Call some method on the colonpair, such as .self or .Pair; weird stuff like using R meta op on the => operator applies as well
  3. Quote the key in foo => bar syntax
  4. In foo => bar syntax, use a key that is not a valid identifier
  5. Put your Pairs in a list and slip it in with the | "operator"

Here's that list of options in code form:

my @pairs := :42foo, :70meow;
args :foo.Pair, (:50bar), "baz" => "ber", e R=> 42, 42 => 42, |@pairs;

# OUTPUT:
# Positional args are: [:foo, :bar(50), :baz("ber"),
#   42 => 2.718281828459045e0, 42 => 42, :foo(42), :meow(70)]
# Named      args are: {}

Number (3) is especially worth keeping in mind if you're coming from other languages, like Perl 5, that use the fat arrow (=>) for key/value separation. This construct gets passed as a named argument only if the key is unquoted and only if it's a valid identifier.
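A minimal sketch that isolates just this rule; the count-args helper is a hypothetical routine written for this illustration:

```perl6
# Hypothetical helper: reports how many arguments arrived in each form.
sub count-args (*@pos, *%named) {
    "{+@pos} positional, {+%named} named"
}

say count-args foo => 1;   # OUTPUT: «0 positional, 1 named␤»
say count-args "foo" => 1; # OUTPUT: «1 positional, 0 named␤»
say count-args 42 => 1;    # OUTPUT: «1 positional, 0 named␤»
say count-args (foo => 1); # OUTPUT: «1 positional, 0 named␤»
```

Only the first call, with an unquoted key that's a valid identifier, produces a named argument.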

Should it happen that you have to use one of these constructs, yet wish to pass them as named arguments instead of positionals, simply wrap them in parentheses and use the | prefix to "slip" them in. For the list of Pairs we were already slipping in in the previous example, you'll need to coerce it into a Capture object first, as Pairs stuffed into a Capture (unlike a list) end up being named arguments when the Capture is slipped into the argument list:

my @pairs := :42foo, :70meow;
args |(:foo.Pair), |(:50bar),   |("baz" => "ber"),
     |(e R=> 42),  |(42 => 42), |@pairs.Capture;

# OUTPUT:
# Positional args are: []
# Named      args are: {"42" => 42, :bar(50),
#                      :baz("ber"), :foo(42), :meow(70)}

The same slipping trick can be used to provide named arguments conditionally:

sub foo (:$bar = 'the default') { say $bar }

my $bar;
foo |(:bar($_) with $bar);  # OUTPUT: «the default␤»
$bar = 42;
foo |(:bar($_) with $bar);  # OUTPUT: «42␤»

If $bar is not defined, the with statement modifier will return Empty, which when slipped with | will end up being empty, allowing the parameter to attain its default value. Since |(:) looks like a sideways ninja, I call this technique "ninja-ing the arg".

Auto-Quoting in => Form

The sharp-eyed in the audience might have noticed the e => 42 colonpair in the previous section used letter e as a key, yet in reversed form e R=> 42, the e became 2.718281828459045e0, because the core language has e defined as Euler's number.

The reason it remained a plain string e in the e => 42 form is because this construct auto-quotes keys that are valid identifiers and so they will always end up as strings, even if a constant, a sigilless variable, or a routine with the same name exists:

my \meows = '🐱';
sub ehh { rand }
say %(
    meows => 'moo',
    ehh   => 42,
    τ     => 'meow',
); # OUTPUT: «{ehh => 42, meows => moo, τ => meow}␤»

A multitude of ways exist to avoid this autoquoting behaviour, but I'll show you just one that's good enough: slap a pair of parentheses around the key:

my \meows = '🐱';
sub ehh { rand }
say %(
    (meows) => 'moo',
    (ehh)   => 42,
    (τ)     => 'meow',
);
# OUTPUT: «{0.58437052771857 => 42, 6.283185307179586 => meow,
#           🐱 => moo}␤»

Simple!

Conclusion

That's pretty much all there is to know about colonpairs. We learned they can be used to construct Pair objects, used as adverbs, and used to specify named arguments and parameters.

We learned about various shortcuts, such as using key only for boolean True, sticking the negation operator or an unsigned integer between the colon and the key to specify boolean False or an Int value. We also learned that parentheses can be omitted on colonpair values if there's already a set of curly, square, or angle brackets around the value and that prepending a colon to a sigilled variable name will create a colonpair that will use the name of that variable as the key.

In the second half of the article, we went over available colonpair syntaxes when specifying named parameters, the peculiarities in passing colonpairs as either named or positional arguments, as well as how to avoid auto-quoting of the key by wrapping it in parentheses.

I hope you found this informative.

-Ofun

WANTED: Perl 6 Historical Items

Read this article on Rakudo.Party

The Perl 6 programming language had a turbulent birth. It was announced in the summer of 2000 and the first stable language release shipped out only 2 years ago, on Christmas, 2015. A lot has happened during that decade and a half, yet the details are hard to piece together.

After my recent facelift to rakudo.org, I'm working on a (second) facelift to perl6.org website.

Part of the work involves bringing all the Perl 6 deliverables under one umbrella, so the user isn't thrown around multiple websites, trying to find what to install. At the same time, we want to strengthen the distinction between Perl 6 the language and the compilers that implement it, as well as encourage more implementors to give it a go at implementing a Perl 6 programming language compiler.

The Perl 6 Programming Language Museum will be part of that effort and along with interesting tidbits of Perl 6 history, it'll showcase past implementation attempts that may no longer be in active development today. Since I don't know much about what happened before I came to the language sometime in 2015, I need your help in collecting those tidbits.

Larry Wall at FOSDEM 2015, photo by Klapi

In my mind's eye, I'm imagining a few pages on perl6.org; something in the same vein as Computer History Museum's pages—pictures, years, and info, and potentially links to code repositories. Depending on the content we collect, it's possible there will be a digital PDF version of the Museum that can also be printed and handed out at events, if desired.

I'm looking for:

  • Descriptions of interesting/significant events (like the mug throwing incident).
  • Descriptions of interesting/significant implementations of Perl 6 or influential Perl 6 projects. Having links to repos/tarballs of their code is a plus.
  • Samples of interesting/significant email threads or chat logs.
  • Pictures of interesting/significant objects (first sight at plush Camelias?).
  • Pictures of interesting/significant humans (a filled out model release form is required).
  • Anything else that's Museum worthy.

If you have any of these items, please submit them to the appropriate year directory in the Perl 6 Museum Items repository. If you're a member of Perl 6 GitHub org, you should already have a commit bit to that repo. Otherwise, submit your items via a pull request.

Let's build something cool and interesting for the people using Perl 6 a hundred years from now to look at and remember!

If you have any questions or need help, talk to a human on our IRC chat.

-OFun

Perl 6: On Specs, Versioning, Changes, and... Breakage

Read this article on Rakudo.Party

Recently, I came across a somewhat-frantic comment on StackOverflow that describes a 2017.01 change to the type of return value of .sort:

"you just can't be sure what ~~ returns" Ouch. […] .list the result of a sort is presumably an appropriate work around. But, still, ouch. I don't know of a blog post or whatever that explains how P6 approaches changes to the language; and to roast; and to Rakudo. Perhaps someone will write one that also explains how this aspect of 2017.01 was conceived, considered and applied; what was right about the change; what was wrong; etc.

Today, I decided to answer that call to write a blog post and reply to all of the questions posed in the comment, as well as explain how it's possible that such an "ouch" change made it in.

On Versioning

The '6' in Perl 6 is just part of the name. The language version itself is encoded by a sequential letter, which is also the starting letter of a codename for that release. For example, the current stable language version is 6.c "Christmas". The next language release will be 6.d with one of the proposed codenames being "Diwali". The version after that will be 6.e, then 6.f, and so on.

If you've used Perl 6 sometime between 2015 and 2018, you likely used the "Rakudo" compiler, which is often packaged as "Rakudo Star" distribution and is versioned with the year and the month of the release, e.g. release 2017.01.

In some languages, like Perl 6's sister language Perl 5, what the compiler does is what the language itself is. Bugs aside, if the latest (2017.09) Perl 5 compiler gives 4 for 2+2, then that's the definition of what 2+2 is in the Perl 5 language.

In Perl 6, however, how a compiler (e.g. "Rakudo") behaves or what it implements does not define the Perl 6 language. The Perl 6 language specification does. The specification consists of a test suite of about 155,000 tests and anything that passes that test suite can call itself a "Perl 6 compiler".

It's this specification that version 6.c "Christmas" refers to. It was released on December 25, 2015 and at the time of this writing, it's the first and only release of a stable language spec. Aside from a few error corrections, there were no changes to that specification… The latest version of Rakudo still passes every single test; it's a release requirement.

On Changes

Ardent Perl 6 users would likely recall that there have been many changes in the Rakudo compiler since Christmas 2015, including the "ouch" change referenced by that StackOverflow comment. If the specification did not change and core devs are not allowed to make changes that break the 6.c specification, how is it possible that the return type of .sort could have changed?

The reason is—and I hope the other core devs will forgive me for my choice of imagery—the specification is full of holes!

It doesn't (yet) cover every imaginable use and combination of features. What happens when you try to print a Junction of strings? As far as the 6.c version of the Perl 6 language is concerned, that's undefined behaviour. What object do you get if you call .Numeric on a Rat type object rather than an instance? Undefined behaviour. What about the return value of .sort? You'll get sorted values in an Iterable type, but whether that type is a Seq or a List is not specified by the 6.c specification.

This is how the 2017.01 version of Rakudo managed to change the return type of .sort, despite being a compliant implementation of the 6.c language: the spec was not precise about what Iterable type .sort must return; both Seq and List are Iterable, thus both conform to the spec. (It's worth noting that since 2017.01 we implemented an extended testing framework that also guides our decisions on whether we actually allow changes that don't violate the spec.)
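As a rough sketch of what that means for user code: rely only on what the spec promises (an Iterable), and ask for a List explicitly when you need one, which is the .list work-around mentioned in the StackOverflow comment. The variable names are arbitrary:

```perl6
my @unsorted = 3, 1, 2;

# Holds whether .sort returns a Seq or a List: both are Iterable.
my $sorted = @unsorted.sort;
say $sorted ~~ Iterable; # OUTPUT: «True␤»

# Request a List explicitly instead of assuming one.
my $list = @unsorted.sort.list;
say $list ~~ List;       # OUTPUT: «True␤»
say $list[1];            # OUTPUT: «2␤»
```

Code written this way behaves the same on either side of the 2017.01 change.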

In my personal opinion, the 6.c spec is overly sparse in places, which is why we saw a number of large changes in 2016 and early 2017, including the "ouch" change the commenter on StackOverflow referred to. But… it won't stay that way forever.

The Future of the Spec

At the time of this writing, there have been 3,129 commits to the spec, since 6.c language release. These are the proposals for the 6.d language specification. While some of these commits address new features, a lot of them close those holes the 6.c spec contains. The main goal is not to write a "whole new spec" but to refine and clarify the previous version.

Thus, when 6.d is released, it'll look something like this:

A few more slices of new features, but largely the same thing. Still some holes (undefined behaviour) in it, but a lot fewer than in the 6.c language. It now defines that printing a Junction will thread it; that calling .Numeric on a Numeric type object gives a numeric equivalent of zero of that type and a warning; and that the .sort's Iterable return type is a Seq, not a List.

As more uses and combinations of features that the original designers hadn't thought of come around, even more holes will be closed in future language versions.

Breaking Things

The cheese metaphor covers refinements to the specification, but there's another set of changes the core developers sometimes have to make: changes that violate previous versions of the specification. For 6.d language, the list of such changes is available in our 6.d-prep repository (some of the listed changes don't violate 6.c spec, but still have significant impact so we pushed them to the next language version).

This may seem to be a contradiction: didn't I say earlier that passing the 6.c specification is part of the compiler's release requirements? The key to resolving that contradiction lies in the ability to request different language versions in different comp units (e.g. in different modules) that are used by the same program.

A single compiler can support multiple language versions. Specifying use v6.c pragma loads 6.c language. Specifying use v6.d (currently available as use v6.d.PREVIEW) loads 6.d language. Not specifying anything loads the newest version the compiler supports.

One of the changes between the 6.c and 6.d languages is that await no longer blocks the thread in 6.d. We can observe this change using a single small script that loads two modules. The code in the two modules is the same, except they request different language versions:

# file ./C.pm6
use v6.c;
sub await-c is export {
    await ^10 .map: {
        start await ^5 .map: { start await Promise.in: 1 }
    }
    say "6.c version took $(now - ENTER now) secs";
}

# file ./D.pm6
use v6.d.PREVIEW;
sub await-d is export {
    await ^10 .map: {
        start await ^5 .map: { start await Promise.in: 1 }
    }
    say "6.d version took $(now - ENTER now) secs";
}

# $ perl6 -I. -MC -MD -e 'await-c; await-d'
# 6.c version took 2.05268528 secs
# 6.d version took 1.038609 secs

When we run the program, we see that no-longer blocked threads let 6.d version complete a lot faster (in fact, if you bump the loop numbers by a factor, 6.d would still complete, while 6.c would deadlock).

So this is the Perl 6 mechanism that lets the core developers make breaking changes without breaking users' programs. There are some limitations to it (e.g. methods on classes), so for some things there still will be standard deprecation procedures. We also try to limit the number of such spec-breaking changes, to reduce the maintenance burden and impact on users who don't want to lock their code down to some older version. Thus, don't worry about getting some weird new language on the next language release; the differences will be minimal.

Who Decides?

This all brings us to one of the questions posed by that StackOverflow user: how do language changes get conceived, considered, and applied—in short: who decides what the behaviour is to be like? What is the process?

As far as conception goes, many of the current ideas are based on seeing what our users need. Some proposals come directly from users; others get inspired as more elegant solutions to problems users showed they were trying to solve. Some of the changes proposed for 6.d language were informed by problematic areas of currently-implemented features that weren't foreseen during original implementation.

When it comes to implementation, the scope of the feature and core developer's expertise with the given area of the codebase generally drive the process. With the "ouch" change, the expert in the area of Iterables deemed Seq to be a superior type for .sort to return, due to its non-caching behaviour as well as its ease of degenerating into a caching List.

Some changes get opened as an Issue on the bugtracker first, to notify other devs of the impending change. Large changes usually get a proposed design written down first. The proposal is shared with the core devs and feedback is gathered before the proposal is actually implemented. The implementation of significant things is also merged far away from the date of the next release, to let the bleeding-edge users find any potential problems in the work.

Geth, our IRC bot, announces all commits in our development IRC channel. Most of the core devs backlog that channel, so any of the potentially problematic commits—even if one of the devs goes ahead and commits the change—get discussed and at times reverted.

The Perl 6 pumpking (Jonathan Worthington) and the BDFL (Larry Wall) are available to provide feedback on controversial, questionable, or large changes being proposed. They also have the veto power on any changes. Our messaging bot helps us request feedback from them, even if they're currently not in the chat.

When it comes to errata to previous specifications, unless the test to be changed is "obviously wrong", the decision on whether the errata can be applied is delegated to the Release Manager (AlexDaniel), and informed by the pumpking/BDFL, if required.

The Future

The current process is a bit loose in places. A test that's "obviously wrong" to one person might have some valid reasons behind it to someone else. This is why the TODO for 6.d release lists several documents to be written that will refine the procedures for various types of changes.

It won't be on the scale of PEP, but simply something more concrete for the core devs to refer to, when performing changes that have some impact on the users. It's a balancing act between organization and procedure and letting through a consistent flow of contributions.

And if breaking changes have to be made, an alert will be pushed to the P6lert service for users of Perl 6 to get informed of them in advance.

Conclusion

Today, we gleaned an insight into how Perl 6 core devs introduce changes to the compiler and the language.

The language specification and the compiler's behaviour are separate entities. The 6.c language specification has places of unspecified behaviour, which is how changes that have large impact on the users slipped through in the past.

The extended testing framework, together with the 6.d proposal tests that refine the specification and close its holes of undefined behaviour, reduces unforeseen impact on the users.

The core dev team bases its decisions on users' feedback and the way the language is used by the community. Large changes get written up as proposals and the pumpking/BDFL offer advice on anything controversial.

In the future, more refined practices for how changes are made will be defined, as we work on making upgrade experience more predictable and non-breaking for our users. The P6lert service helps that goal and is already available today.

Hope this answers all the questions :)

Perl 6 Core Hacking: QASTalicious

Read this article on Rakudo.Party

Over the past month, I spent some time in Rakudo's QAST land writing a few optimizations, fixing bugs involving warnings, as well as squashing a monster hive of 10 thunk scoping bugs with a single commit. In today's article, we'll go over that last feat in detail, as well as learn what QAST is and how to work with it.

PART I: The QAST

"QAST" stands for "Q" Abstract Syntax Tree. The "Q" is there because it's comes after letter "P", and "P" used to be in "PAST" to stand for "Parrot", the name of an earlier, experimental Perl 6 implementation (or rather, its virtual machine). Let's see what QAST is all about!

Dumping QAST

Every Rakudo Perl 6 program compiles down to a tree of QAST nodes and you can dump that tree if you specify --target=ast or --target=optimize command line option to perl6 when compiling a program or a module:

$ perl6 --target=ast -e 'say "Hello, World!"'
[...]
- QAST::Op(call &say) <sunk> :statement_id<?> say \"Hello, World!\"
  - QAST::Want <wanted> Hello, World!
    - QAST::WVal(Str)
    - Ss
    - QAST::SVal(Hello, World!)
[...]

The difference between --target=ast and --target=optimize is that the former shows the QAST tree as soon as it has been generated, while the latter shows the QAST tree after the static optimizer has had a go at it.

While the command line option gives you the QAST for the entire program (excluding modules pre-compiled separately), each QAST::Node object has a .dump method you can use to dump specific QAST pieces of interest from within Rakudo's source code.

For example, to examine the QAST generated by the statement token, I'd find method statement in src/Perl6/Actions.nqp and stick nqp::say('statement QAST: ' ~ $past.dump) close to the end of the method.

Since Rakudo's compilation takes a couple of minutes for each go, I like to key my debug dumps on env variables, like this:

nqp::atkey(nqp::getenvhash(),'ZZ1') && nqp::say('ZZ1: something or other');
...
nqp::atkey(nqp::getenvhash(),'ZZ2') && nqp::say('ZZ2: something else');

Then, I can execute the compiled ./perl6 as if I didn't add anything, and enable my dumps by running ZZ1=1 ./perl6, ZZ2=1 ./perl6, or both dumps at the same time with ZZ1=1 ZZ2=1 ./perl6.

Viewing QAST

Looking at the output of --target dumps in the terminal is sufficient for a quickie glance at the trees, but for extra assistance you can install CoreHackers::Q module that brings in q command line utility.

Simply prefix your regular perl6 invocation with q a or q o to produce --target=ast and --target=optimize QAST dumps respectively. The program will generate out.html file in the current directory:

$ q a perl6 -e 'say "Hello, World!"'
$ firefox out.html

Pop open the generated HTML file and reap these benefits:

  • Color-coded QAST nodes
  • Color hints for sunk nodes
  • Ctrl+Click on any node to collapse it
  • Muted view of QAST::Want alternatives, makes it easier to ignore them

Eventually, I hope to extend this tool and make it more helpful, but at the time of this writing, that's all it does.

The QAST Forest

There are four main files in rakudo's source where you'd expect to be working with QAST nodes: src/Perl6/Grammar.nqp, src/Perl6/Actions.nqp, src/Perl6/World.nqp, and src/Perl6/Optimizer.nqp. If you're using Z-Script utility, you can even run z q command to open these four files in Atom editor.

Grammar.nqp is the Perl 6 grammar. Actions.nqp are the actions for it. World.nqp contains all sorts of helpful routines used by both Grammar.nqp and Actions.nqp that access them via the $*W dynamic variable containing a Perl6::World object. Lastly, Optimizer.nqp contains Rakudo's static optimizer.

The root (of all evil) is the QAST::Node object, with all the other QAST nodes being its subclasses. Let's review some of the popular ones:

QAST::Op

QAST::Op nodes are the workhorse of the QAST world. The :op named argument specifies the name of an NQP op or the name of one of Rakudo's NQP extension ops, and the node's children are the op's arguments.

Here's a say op printing a string value:

QAST::Op.new: :op<say>,
  QAST::SVal.new: :value('Hello, World!');

And here's a QAST node for a call op that calls Perl 6's infix:<+> operator; notice how the name of the routine we call is given via :name named argument:

QAST::Op.new: :op<call>, :name('&infix:<+>'),
  QAST::IVal.new( :value(2)),
  QAST::IVal.new: :value(2)

QAST::*Val

The QAST::SVal, QAST::IVal, QAST::NVal, and QAST::WVal nodes specify string, integer, float, and "World" object values respectively. The first three are the "unboxed" raw values, while World objects are everything else, such as DateTime, Block, or Str objects.

QAST::Want

Some of the objects can be represented by multiple QAST::*Val nodes, where the most appropriate value is used depending on what is wanted in the current context. QAST::Want node contains these alternatives, interleaved with string markers indicating what those alternatives are.

For example, numeric value 42 in Perl 6 could be wanted as an object to call some method on, or as a raw value to be assigned to a native int variable. The QAST::Want node for it would look like this:

QAST::Want.new:
  QAST::WVal.new(:value($Int-obj)),
  'Ii',
  QAST::IVal.new: :value(42)

The $Int-obj above would contain an instance of Int type with value set to 42. The Ii marker indicates the following alternative is an integer value and we provide a QAST::IVal object containing it. The other possible markers are Nn (float), Ss (string), and v (void context) alternatives.

When these nodes are later converted to bytecode, the most appropriate value will be selected, with the first child being the "default" value, to be used when none of the available alternatives make the cut.

QAST::Var

These nodes are used for variables and parameters. The :name named argument specifies the name of the variable and :scope its scope:

QAST::Op.new: :op('bind'),
  QAST::Var.new(:name<$x>, :scope<lexical>, :decl<var>, :returns(int)),
  QAST::IVal.new: :value(0)

The :decl named arg is present when the node is used for the variable's declaration (when it's absent, we simply reference the variable) and its value dictates what sort of variable it is: var for variables and param for routine parameters. Several other :decl types exist, as do optional arguments specifying additional configuration of the variable. You can find them discussed in the QAST documentation.

QAST::Stmt / QAST::Stmts

These are statement grouping constructs. For example, here, the truthy branch of an nqp::if contains three nqp::say statements, all grouped inside QAST::Stmts:

QAST::Op.new: :op<if>,
  QAST::IVal.new(:value(42)),
  QAST::Stmts.new(
    QAST::Op.new( :op<say>, QAST::SVal.new: :value<foo>),
    QAST::Op.new( :op<say>, QAST::SVal.new: :value<bar>),
    QAST::Op.new: :op<say>, QAST::SVal.new: :value<ber>),
  QAST::Op.new: :op<say>, QAST::SVal.new: :value<meow>,

The singular QAST::Stmt is similar. The difference is it marks a register allocation boundary, beyond which, any temporaries are free to be reused. When used correctly, this alternative can result in better code generation.

QAST::Block

This node is both a unit of invocation and a unit of lexical scoping. For example, code sub foo { say "hello" } might compile to a QAST::Block like this:

Block (:cuid(1)) <wanted> :IN_DECL<sub> { say \"hello\" }
[...]
  Stmts <wanted> say \"hello\"
    Stmt <wanted final> say \"hello\"
      Want <wanted>
        Op (call &say) <wanted> :statement_id<?> say \"hello\"
          Want <wanted> hello
            WVal (Str)
            - Ss
            SVal (hello)
        - v
        Op (p6sink)
          Op (call &say) <wanted> :statement_id<?> say \"hello\"
            Want <wanted> hello
              WVal (Str)
              - Ss
              SVal (hello)
[...]

Each block demarcates a lexical scope boundary—this detail comes into play in Part II of this article, when we'll be going over a fix for a bug.

Others

A few more QAST nodes exist. They're out of scope of this article, but you may wish to read the documentation or, since some of them don't appear in those docs, go straight to the source.

Executing QAST Trees

Having a decent familiarity with nqp ops (as well as Rakudo's nqp extensions) is helpful when working with QAST. A sharp eye would notice in QAST dumps that many QAST::Op nodes correspond to nqp::* op calls, where the :op named argument specifies the name of the op.

When writing large QAST trees, it's handy to write them down using pure NQP ops first, and then translate the result into a tree of QAST node objects. Let's look at a simplified example:

nqp::if(
  nqp::isgt_n(nqp::rand_n(1e0), .5e0),
  nqp::say('Glass half full'),
  nqp::say('Glass half empty'));

We have an NQP op, so we'll start with a QAST::Op node, using 'if' as the value for :op. The op takes three positional arguments: the three ops used for the conditional, the truthy branch, and the falsy branch. Some of the ops also take float and string values, so we'll use QAST::NVal and QAST::SVal nodes for those. The result is:

QAST::Op.new(:op('if'),
  QAST::Op.new(:op('isgt_n'),
    QAST::Op.new(:op('rand_n'),
      QAST::NVal.new(:value(1e0))
    ),
    QAST::NVal.new(:value(.5e0))
  ),
  QAST::Op.new(:op('say'),
    QAST::SVal.new(:value('Glass half full'))
  ),
  QAST::Op.new(:op('say'),
    QAST::SVal.new(:value('Glass half empty'))
  )
)

I find it easier to track the tree's nesting by using parentheses only when necessary, preferring colon method call syntax whenever possible:

QAST::Op.new: :op<if>,
  QAST::Op.new(:op<isgt_n>,
    QAST::Op.new(:op<rand_n>,
      QAST::NVal.new: :value(1e0)),
    QAST::NVal.new: :value(.5e0)),
  QAST::Op.new(:op<say>,
    QAST::SVal.new: :value('Glass half full')),
  QAST::Op.new: :op<say>,
    QAST::SVal.new: :value('Glass half empty')

If a .new is followed by a colon, there aren't any more nodes on the same level. If a .new is followed by an opening parenthesis, there are more sister nodes yet to come.

Due to Rakudo's lengthy compilation, it can be handy to execute your QAST tree without having to stick it into src/Perl6/Actions.nqp or similar file first. To some extent, it's possible to do that with a regular Perl 6 program. We'll simply access Perl6::World object in $*W variable inside a BEGIN block, where it still exists, and call .compile_time_evaluate method, giving it an empty variable as the first positional (it expects a Match object for the tree) and our QAST tree as the second positional:

use QAST:from<NQP>;
BEGIN $*W.compile_time_evaluate: $,
    QAST::Op.new: :op<if>,
      QAST::Op.new(:op<isgt_n>,
        QAST::Op.new(:op<rand_n>,
          QAST::NVal.new: :value(1e0)),
        QAST::NVal.new: :value(.5e0)),
      QAST::Op.new(:op<say>,
        QAST::SVal.new: :value('Glass half full')),
      QAST::Op.new: :op<say>,
        QAST::SVal.new: :value('Glass half empty')

The one caveat with this method is we're using full-blown Perl 6 language, whereas in src/Perl6/Actions.nqp and related files, as .nqp extension suggests, we're using NQP language only. Keep an eye out for strange explosions; it's possible your QAST tree that explodes in Perl 6 will compile just fine in the land of pure NQP.

Annotating QAST Nodes

All QAST nodes support annotations that allow you to attach an arbitrary value to a node and then read that value elsewhere. To add an annotation, use .annotate method, which takes two positional arguments—a string containing name of the annotation and the value to attach to it—and returns that value. Recent versions of NQP also have .annotate_self method that works the same, except it returns the QAST node itself:

$qast.annotate_self('foo', 42).annotate: 'bar', 'meow';

Later, you can read that value using .ann method that takes the name of the annotation as the argument. If the annotation doesn't exist, NQPMu is returned instead:

note($qast.ann: 'foo'); # OUTPUT: «42␤»

You can also check for whether an annotation merely exists using .has_ann method that returns 1 (true) or 0 (false):

note($qast.has_ann: 'bar'); # OUTPUT: «1␤»

Or dump all of the annotations on the node (to prevent potential flood of output, most values will be dumped as simply a question mark):

note($qast.dump_annotations); # OUTPUT: « :bar<?> :foo<?>␤»

Lastly, to clear all annotations on the node, simply call the .clear_annotations method.

Mutating QAST Nodes

A handy thing to do with QAST node objects is to mutate them into something better. That's essentially all the static optimizer in src/Perl6/Optimizer.nqp does. Named arguments can be mutated by calling them as methods and providing a value. For example, $qast.op('callstatic') will change the value of :op from whatever it is to callstatic. Positional arguments can be altered by re-assignment to a positional index, as well as shift, push, unshift, pop operations performed either via method calls with those names or nqp:: ops. Some nodes also support nqp::elems calls on them, which is slightly faster than the generic pattern of +@($qast) that can be used on all nodes to find out the number of children a node contains.

As an exercise, let's write a small optimization: some operations, like $foo < $bar < $ber, compile to nqp::chain ops. That is so even if we have only two children, e.g. $foo < $bar. In such cases, rewriting the op to be nqp::call has performance advantages: not only is nqp::call on its own a little bit faster than nqp::chain, but the static optimizer also knows how to do further optimizations on nqp::call ops.

Let's take a look at what both 2-child and 2+-child nqp::chain chains look like:

$ perl6 --target=ast -e '2 < 3 < 4; 2 < 3'

The first statement compiled to this (I removed QAST::Wants for clarity):

- QAST::Op(chain &infix:«<»)  :statement_id<?> <
  - QAST::Op(chain &infix:«<») <wanted> <
    - QAST::IVal(2)
    - QAST::IVal(3)
  - QAST::IVal(4)

And the second one to:

- QAST::Op(chain &infix:«<»)  :statement_id<?> <
  - QAST::IVal(2)
  - QAST::IVal(3)

Thus, to target our optimization correctly, we need to ensure neither child of our chain op is a chain op. In addition, we need to ensure that the op we're optimizing is not itself a child of another chain op.

Raking the code of the optimizer, we can spot that chain depth is already tracked via $!chain_depth attribute, so we merely need to ensure we're at the first link of the chain. The code then becomes:

$qast.op: 'call'
  if nqp::istype($qast, QAST::Op)
  && $qast.op eq 'chain'
  && $!chain_depth == 1
  && ! (nqp::istype($qast[0], QAST::Op) && $qast[0].op eq 'chain')
  && ! (nqp::istype($qast[1], QAST::Op) && $qast[1].op eq 'chain');

Once we find a chain QAST::Op, we index into it and use nqp::istype to check the type of kid nodes, and if those happen to be QAST::Op nodes, we ensure the :op parameter is not a chain op. If all of the conditions are met, we simply call .op method on our node with value 'call' to convert it into a call op.
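The same decision logic can be tried out in a regular Perl 6 program with a toy node class. This is only a sketch: ToyOp and rewrite-lone-chain are hypothetical names, and the chain depth is passed in by hand instead of being tracked by an optimizer.

```perl6
# ToyOp is a hypothetical stand-in for QAST::Op, not an NQP class.
class ToyOp {
    has $.op is rw;
    has @.kids;
}

# Rewrite a `chain` op to `call` only when it is a lone link:
# first in the chain and with no `chain` children of its own.
sub rewrite-lone-chain(ToyOp $node, Int :$chain-depth = 1) {
    sub is-chain($kid) { $kid ~~ ToyOp && $kid.op eq 'chain' }

    $node.op = 'call'
        if $node.op eq 'chain'
        && $chain-depth == 1
        && !is-chain($node.kids[0])
        && !is-chain($node.kids[1]);
    $node
}

# Models `2 < 3`: a single link, safe to rewrite
my $lone = ToyOp.new: :op<chain>, :kids(2, 3);

# Models `2 < 3 < 4`: the outer op has a chain child and must stay a chain
my $outer = ToyOp.new: :op<chain>,
    :kids(ToyOp.new(:op<chain>, :kids(2, 3)), 4);

say rewrite-lone-chain($lone).op;
say rewrite-lone-chain($outer).op;

# The inner link is visited at depth 2, so it is left alone as well
say rewrite-lone-chain($outer.kids[0], :chain-depth(2)).op;
```

Only the first node is converted to a call; both links of the longer chain keep their chain op.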

We then stick our optimization early enough into .visit_op method of the optimizer and its later portions will further optimize our call.

A fairly easy and straightforward optimization that can bring a lot of benefit.

PART II: A Thunk in The Trunk


Note: it took me three evenings to debug and fix the following tickets. To learn the solution I tried many dead ends that I won't be covering, to keep you from getting bored, and instead will instantly jump to conclusions. The point I'm making is that fixing core bugs is a lot easier than may seem from reading this article—you just need to be willing to spend some time on them.


Now that we have some familiarity with QAST, let's try to fix a bug that existed in Rakudo v2018.01.30.ga.5.c.2398.cc and earlier. The ticket in question is R#1212, which shows the following problem:

$ perl6 -e 'say <a b c>[$_ xx 2] with 1'

Use of Nil in string context
  in block  at -e line 1
Unable to call postcircumfix [ (Any) ] with a type object
Indexing requires a defined object
  in block <unit> at -e line 1

It looks like the $_ topical variable inside the indexing brackets fails to get the value from with statement modifier and ends up being undefined. Sounds like a challenge!

It's A Hive!

Both the with and xx operators create thunks (thunks are like blocks of code, without having explicit blocks in the code; this, for example, lets rand xx 10 produce 10 different random values; rand is thunked and the thunk is called for each iteration). This reminded me of some other tickets I've seen, so I went to fail.rakudo.party and looked through open tickets for anything that mentioned thunking or wrong scoping.
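A small sketch makes the thunking visible without relying on random numbers. The counter sub next-id is a made-up helper for this illustration.

```perl6
my $calls = 0;
sub next-id { ++$calls }

# The left side of xx is thunked, so it is re-evaluated for each repetition
my @thunked = next-id() xx 3;

# A value computed once and stored in a variable is simply repeated
my $once     = next-id();
my @repeated = $once xx 3;

say @thunked;
say @repeated;
```

The first array holds three different values because the call ran three times; the second holds the same value three times.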

I ended up with a list of 7 tickets, and with the help of dogbert++ later increased the number to 9, which with the original Issue gives us a total of 10 different manifestations of a bug. The other tickets are RT#130575, RT#132337, RT#131548, RT#132211, RT#126569, RT#128054, RT#126413, RT#126984, and RT#132172. Quite a bug hive!

Test It Out

Our starting point is to cover each manifestation of the bug with a test. Make all the tests pass and you know you've fixed the bug, plus you already have something to place into roast, to cover the tickets. My tests ended up looking like this, where I've used the gather/take duo to capture what the tickets' code printed to the screen:

use Test;
plan 1;
subtest 'thunking closure scoping' => {
    plan 10;

    # https://github.com/rakudo/rakudo/issues/1212
    is-deeply <a b c>[$_ xx 2], <b b>.Seq, 'xx inside `with`' with 1;

    # RT #130575
    is-deeply gather {
        sub itcavuc ($c) { try {take $c} andthen 42 };
        itcavuc $_ for 2, 4, 6;
    }, (2, 4, 6).Seq, 'try with block and andthen';

    # RT #132337
    is-deeply gather {
        sub foo ($str) { { take $str }() orelse Nil }
        foo "cc"; foo "dd";
    }, <cc dd>.Seq, 'block in a sub with orelse';

    # RT #131548
    is-deeply gather for ^7 {
        my $x = 1;
        1 andthen $x.take andthen $x = 2 andthen $x = 3 andthen $x = 4;
    }, 1 xx 7, 'loop + lexical variable plus chain of andthens';

    # RT #132211
    is-deeply gather for <a b c> { $^v.uc andthen $v.take orelse .say },
        <a b c>.Seq, 'loop + andthen + orelse';

    # RT #126569
    is-deeply gather { (.take xx 10) given 42 }, 42 xx 10,
        'parentheses + xx + given';

    # RT #128054
    is-deeply gather { take ("{$_}") for <aa bb> }, <aa bb>.Seq,
        'postfix for + take + block in a string';

    # RT #126413
    is-deeply gather { take (* + $_)(32) given 10 }, 42.Seq,
        'given + whatever code closure execution';

    # RT #126984
    is-deeply gather {
        sub foo($x) { (* ~ $x)($_).take given $x }; foo(1); foo(2)
    }, ("11", "22").Seq, 'sub + given + whatevercode closure execution';

    # RT #132172
    is-deeply gather { sub {
        my $ver =.lines.uc with "totally-not-there".IO.open
            orelse "meow {$_ ~~ Failure}".take and return 42;
    }() }, 'meow True'.Seq, 'sub with `with` + orelse + block interpolation';
}

When I brought up the first bug in our dev chatroom, jnthn++ pointed out that such bugs are often due to mis-scoped blocks, as p6capturelex op that's involved needs to be called in the immediate outer of the block it references.

Looking through the tickets, I also spotted skids++'s note that changing a conditional for statement_id in block migrator predicate fixed one of the tickets. This wasn't the full story of the fix, as the many still-failing tests showed, but it was a good start.

What's Your Problem?

In order to find the best solution for a bug, it's important to understand what exactly is the problem. We know mis-scoped blocks are the cause of the bug, so let's grab each of our tests, dump their QAST (--target=ast), and write out how mis-scoped the blocks are.

To make it easier to match the QAST::Blocks with the QAST::WVals referencing them, I made a modification to QAST::Node.dump to include CUID numbers and statement_id annotations in the dumps.

Going through most of the buggy code chunks, we have these results:

is-deeply <a b c>[$_ xx 2], <b b>.Seq, 'xx inside `with`' with 1;
# QAST for `xx` is ALONGSIDE RHS `andthen` thunk, but needs to be INSIDE

is-deeply gather {
    sub itcavuc ($c) { try {take $c} andthen 42 };
    itcavuc $_ for 2, 4, 6;
}, (2, 4, 6).Seq, 'try with block and andthen';
# QAST for try block is INSIDE RHS `andthen` thunk, but needs to be ALONGSIDE

is-deeply gather {
    sub foo ($str) { { take $str }() orelse Nil }
    foo "cc"; foo "dd";
}, <cc dd>.Seq, 'block in a sub with orelse';
# QAST for block is INSIDE RHS `orelse` thunk, but needs to be ALONGSIDE

is-deeply gather for ^7 {
    my $x = 1;
    1 andthen $x.take andthen $x = 2 andthen $x = 3 andthen $x = 4;
}, 1 xx 7, 'loop + lexical variable plus chain of andthens';
# each andthen thunk is nested inside the previous one, but all need to be
# ALONGSIDE each other

is-deeply gather for <a b c> { $^v.uc andthen $v.take orelse .say },
    <a b c>.Seq, 'loop + andthen + orelse';
# andthen's block is INSIDE orelse's but needs to be ALONGSIDE each other

is-deeply gather { (.take xx 10) given 42 }, 42 xx 10,
    'parentheses + xx + given';
# .take thunk is ALONGSIDE given's thunk, but needs to be INSIDE of it

is-deeply gather { take ("{$_}") for <aa bb> }, <aa bb>.Seq,
    'postfix for + take + block in a string';
# the $_ is ALONGSIDE `for`'s thunk, but needs to be INSIDE

is-deeply gather { take (* + $_)(32) given 10 }, 42.Seq,
    'given + whatever code closure execution';
# the WhateverCode ain't got no statement_id and is ALONGSIDE given
# block but needs to be INSIDE of it

So far, we can see a couple of patterns:

  • xx and WhateverCode thunks don't get migrated, even though they should
  • andthen thunks get migrated, even though they shouldn't

The first one is fairly straightforward. Looking at the QAST dump, we see xx thunk has a higher statement_id than the block it was meant to be in. This is what skids++'s hint addresses, so we'll change the statement_id conditional from == to >= to look for statement IDs higher than our current one as well, since those would be from any substatements, such as our xx inside the positional indexing operator:

($b.ann('statement_id') // -1) >= $migrate_stmt_id
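To see what the change from == to >= does, here is a toy model in regular Perl 6. The block names and statement IDs are made up for illustration, and plain hashes stand in for the annotations on QAST::Block nodes.

```perl6
# Hypothetical annotation data; names and IDs are illustrative only.
my @blocks =
    %( name => 'block declared in the statement itself', statement_id => 1 ),
    %( name => 'xx thunk in a substatement',             statement_id => 2 ),
    %( name => 'block from an earlier statement',        statement_id => 0 ),
    %( name => 'WhateverCode without statement_id' );

my $migrate-stmt-id = 1;

# Old predicate: only blocks with exactly our statement ID
my @old = @blocks.grep({ (.<statement_id> // -1) == $migrate-stmt-id })
                 .map(*.<name>);

# New predicate: also picks up blocks from substatements (higher IDs)
my @new = @blocks.grep({ (.<statement_id> // -1) >= $migrate-stmt-id })
                 .map(*.<name>);

.say for @old;
say '---';
.say for @new;
```

In this model the xx thunk is only selected by the >= version, and the block without a statement_id falls back to -1 and is skipped by both.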

The cause is very similar for the WhateverCode case, as it's missing statement_id annotation altogether, so we'll just annotate the generated QAST::Block with the statement ID. Some basic detective work gives us the location where that node is created: we search src/Perl6/Actions.nqp for word "whatever" until we spot whatever_curry method and in its guts we find the QAST::Block we want. For the statement ID, we'll grep the source for statement_id:

$ grep -FIRn 'statement_id' src/Perl6/
src/Perl6/Actions.nqp:1497:            $past.annotate('statement_id', $id);
src/Perl6/Actions.nqp:2326:                $_.annotate('statement_id', $*STATEMENT_ID);
src/Perl6/Actions.nqp:2488:                -> $b { ($b.ann('statement_id') // -1) == $stmt.ann('statement_id') });
src/Perl6/Actions.nqp:9235:                && ($b.ann('statement_id') // -1) >= $migrate_stmt_id
src/Perl6/Actions.nqp:9616:            ).annotate_self: 'statement_id', $*STATEMENT_ID;
src/Perl6/World.nqp:256:            $pad.annotate('statement_id', $*STATEMENT_ID);

From the output, we can see the ID is stored in $*STATEMENT_ID dynamic variable, so we'll use that for our annotation on the WhateverCode's QAST::Block:

my $block := QAST::Block.new(
    QAST::Stmts.new(), $past
).annotate_self: 'statement_id', $*STATEMENT_ID;

Let's compile and run our bug tests. If you're using Z-Script, you can re-compile Rakudo by running z command with no arguments:

$ z
[...]
$ ./perl6 bug-tests.t
1..1
    1..10
    ok 1 - xx inside `with`
    not ok 2 - try with block and andthen
    # Failed test 'try with block and andthen'
    # at bug-tests.t line 10
    # expected: $(2, 4, 6)
    #      got: $(2, 2, 4)
    not ok 3 - block in a sub with orelse
    # Failed test 'block in a sub with orelse'
    # at bug-tests.t line 16
    # expected: $("cc", "dd")
    #      got: $("cc", "cc")
    not ok 4 - loop + lexical variable plus chain of andthens
    # Failed test 'loop + lexical variable plus chain of andthens'
    # at bug-tests.t line 22
    # expected: $(1, 1, 1, 1, 1, 1, 1)
    #      got: $(1, 4, 3, 3, 3, 3, 3)
    not ok 5 - loop + andthen + orelse
    # Failed test 'loop + andthen + orelse'
    # at bug-tests.t line 28
    # expected: $("a", "b", "c")
    #      got: $("a", "a", "a")
    ok 6 - parentheses + xx + given
    ok 7 - postfix for + take + block in a string
    ok 8 - given + whatever code closure execution
    ok 9 - sub + given + whatevercode closure execution
    not ok 10 - sub with `with` + orelse + block interpolation
    # Failed test 'sub with `with` + orelse + block interpolation'
    # at bug-tests.t line 49
    # expected: $("meow True",)
    #      got: $("meow False",)
    # Looks like you failed 5 tests of 10
not ok 1 - thunking closure scoping
# Failed test 'thunking closure scoping'
# at bug-tests.t line 3
# Looks like you failed 1 test of 1

Looks like that fixed half of the issues already. That's pretty good!

Extra Debugging

Let's now look at the remaining failures and figure out why block migration isn't how we want it in those cases. To assist with our sleuthing efforts, let's make a couple of changes to produce more debugging info.

First, let's modify QAST::Node.dump method in NQP's repo to dump the value of in_stmt_mod annotation, by telling it to dump out the value verbatim if the key is in_stmt_mod:

if $k eq 'IN_DECL' || $k eq 'BY' || $k eq 'statement_id'
|| $k eq 'in_stmt_mod' {
    ...

Next, let's go to sub migrate_blocks in Actions.nqp and add a bunch of debug dumps inside most of the conditionals. This will let us track when a block is examined and see whether migration occurs. As mentioned earlier, I like to key my dumps on env vars using the nqp::getenvhash op, so after modifications my migrate_blocks routine looks like this; note the use of the .dump method to dump QAST node guts (tip: the .dump method also exists on Perl6::Grammar's match objects!):

sub migrate_blocks($from, $to, $predicate?) {
    my @decls := @($from[0]);
    my int $n := nqp::elems(@decls);
    my int $i := 0;
    while $i < $n {
        my $decl := @decls[$i];
        if nqp::istype($decl, QAST::Block) {
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: -----------------');
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: trying to grab ' ~ $decl.dump);
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: to move to ' ~ $to.dump);
            if !$predicate || $predicate($decl) {
                nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: grabbed');
                $to[0].push($decl);
                @decls[$i] := QAST::Op.new( :op('null') );
            }
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: -----------------');
        }
        elsif (nqp::istype($decl, QAST::Stmt) || nqp::istype($decl, QAST::Stmts)) &&
              nqp::istype($decl[0], QAST::Block) {
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: -----------------');
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: trying to grab ' ~ $decl[0].dump);
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: to move to ' ~ $to.dump);
            if !$predicate || $predicate($decl[0]) {
                nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: grabbed');
                $to[0].push($decl[0]);
                $decl[0] := QAST::Op.new( :op('null') );
            }
            nqp::atkey(nqp::getenvhash(),'ZZ') && nqp::say('ZZ1: -----------------');
        }
        elsif nqp::istype($decl, QAST::Var) && $predicate && $predicate($decl) {
            $to[0].push($decl);
            @decls[$i] := QAST::Op.new( :op('null') );
        }
        $i++;
    }
}

After making the changes, we need to recompile both NQP and Rakudo. With Z-Script, we can just run z n to do that:

$ z n
[...]

Now, we'll grab the first failing code and take a look at its QAST. I'm going to use the CoreHackers::Q tool:

$ q a ./perl6 -e '
    sub itcavuc ($c) { try {say $c} andthen 42 };
    itcavuc $_ for 2, 4, 6;'
$ firefox out.html

We can see that our buggy say call lives in QAST::Block with cuid 1, which gets called from within QAST::Block with cuid 3, but is actually located within QAST::Block with cuid 2:

- QAST::Block(:cuid(3)) <wanted> :statement_id<1>
        :count<?> :signatured<?> :IN_DECL<sub>
        :in_stmt_mod<0> :code_object<?>
        :outer<?> { try {say $c} andthen 42 }
    [...]
        - QAST::Block(:cuid(2)) <wanted> :statement_id<2>
                :count<?> :in_stmt_mod<0> :code_object<?> :outer<?>
            [...]
            - QAST::Block(:cuid(1)) <wanted> :statement_id<2>
                    :IN_DECL<> :in_stmt_mod<0> :code_object<?>
                    :also_uses<?> :outer<?> {say $c}
                [...]
                - QAST::Op(call &say)  say $c
    [...]
    - QAST::Op(p6typecheckrv)
        [...]
        - QAST::WVal(Block :cuid(1))

Looks like cuid 2 block steals our cuid 1 block. Let's enable the debug env var and look at the dumps to see why exactly:

$ ZZ=1 ./perl6 -e '
    sub itcavuc ($c) { try {say $c} andthen 42 };
    itcavuc $_ for 2, 4, 6;'

ZZ1: -----------------
ZZ1: trying to grab - QAST::Block(:cuid(1)) <wanted>
    :statement_id<2> :IN_DECL<> :in_stmt_mod<0> :code_object<?>
    :also_uses<?> :outer<?> {say $c}
[...]

ZZ1: to move to - QAST::Block  :statement_id<2>
    :in_stmt_mod<0> :outer<?>

ZZ1: grabbed
ZZ1: -----------------

We can see the theft in progress. Let's take a look at our migration predicate again:

! $b.ann('in_stmt_mod')
&& ($b.ann('statement_id') // -1) >= $migrate_stmt_id

In the dump we can see in_stmt_mod is false. Were it set to a true value, the block would not be migrated—exactly what we're trying to accomplish. Let's investigate the in_stmt_mod annotation, to see when it gets set:

$ G 'in_stmt_mod' src/Perl6/Actions.nqp
2327:                $_.annotate('in_stmt_mod', $*IN_STMT_MOD);
9206:                !$b.ann('in_stmt_mod') && ($b.ann('statement_id') // -1) >= $migrate_stmt_id

$ G '$*IN_STMT_MOD' src/Perl6/Grammar.nqp
1200:        :my $*IN_STMT_MOD := 0;                    # are we inside a statement modifier?
1328:        :my $*IN_STMT_MOD := 0;
1338:        | <EXPR> :dba('statement end') { $*IN_STMT_MOD := 1 }

Looks like it's a marker for statement modifier conditions. Statement modifiers have a lot of relevance to our andthen thunks, because $foo with $bar gets turned into $bar andthen $foo during parsing. Since, as we can see in src/Perl6/Grammar.nqp, in_stmt_mod annotation gets set for with statement modifiers, we can hypothesize that if we turn our buggy andthen into a with, the bug will disappear:

$ ./perl6 -e 'sub itcavuc ($c) { 42 with try {say $c} };
    itcavuc $_ for 2, 4, 6;'
2
4
6

And indeed it does! Then, we have a way forward: we need to set in_stmt_mod annotation to a truthy value for just the first argument of andthen (and its relatives notandthen and orelse).
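The relationship between the with statement modifier and andthen that this experiment relies on can be seen in a regular Perl 6 program. This is just a behavioural sketch; it says nothing about how the rewrite is implemented in the grammar.

```perl6
my @seen;

# Statement-modifier form: the topic $_ is set to 'meow'
@seen.push(.uc) with 'meow';

# The andthen form, with the operands swapped
'meow' andthen @seen.push(.uc);

# Neither form runs its thunk when the condition is undefined
@seen.push('never') with Nil;
Nil andthen @seen.push('never');

say @seen;
```

Both forms topicalize the defined value for the thunk and skip the thunk entirely for an undefined one.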

Glancing at the Grammar, it doesn't look like it immediately offers an opportunity similar to how in_stmt_mod is set for the with statement modifier. Let's approach it differently. Since we care about this when thunks are created, let's watch for andthen QAST inside sub thunkity_thunk in Actions, then descend into its first kid and add the in_stmt_mod annotation. We'll cheat a bit and use the past_block annotation on the QAST::WVal with the thunk, since it contains the reference to the QAST::Block we wish to annotate. The code will look something like this:

sub mark_blocks_as_andnotelse_first_arg($ast) {
    if $ast && nqp::can($ast, 'ann') && $ast.ann('past_block') {
        $ast.ann('past_block').annotate: 'in_stmt_mod', 1;
    }
    elsif nqp::istype($ast, QAST::Op)
    || nqp::istype($ast, QAST::Stmt)
    || nqp::istype($ast, QAST::Stmts) {
        mark_blocks_as_andnotelse_first_arg($_) for @($ast)
    }
}

sub thunkity_thunk($/,$thunky,$past,@clause) {
    [...]

    my $andnotelse_thunk := nqp::istype($past, QAST::Op)
      && $past.op eq 'call'
      && ( $past.name eq '&infix:<andthen>'
        || $past.name eq '&infix:<notandthen>'
        || $past.name eq '&infix:<orelse>');

    while $i < $e {
        my $ast := @clause[$i];
        $ast := $ast.ast if nqp::can($ast,'ast');
        mark_blocks_as_andnotelse_first_arg($ast)
            if $andnotelse_thunk && $i == 0;
        [...]

First, we rake the $past argument given to thunkity_thunk for a QAST::Op for nqp::call that calls one of our ops; when we find one, we set a variable to a truthy value. Then, in the loop, when we're iterating over the first child node ($i == 0) of these ops, we pass its QAST to our newly minted mark_blocks_as_andnotelse_first_arg routine, inside of which we recurse over any nodes that can have kids and mark anything that has a past_block annotation with a truthy in_stmt_mod annotation.

Let's compile our concoction and give the tests another run. Once again, I'm using Z-Script to recompile Rakudo:

$ z
[...]
$ ./perl6 bug-tests.t
1..1
    1..10
    ok 1 - xx inside `with`
    ok 2 - try with block and andthen
    ok 3 - block in a sub with orelse
    not ok 4 - loop + lexical variable plus chain of andthens
    # Failed test 'loop + lexical variable plus chain of andthens'
    # at bug-tests.t line 23
    # expected: $(1, 1, 1, 1, 1, 1, 1)
    #      got: $(1, 4, 3, 3, 3, 3, 3)
    ok 5 - loop + andthen + orelse
    ok 6 - parentheses + xx + given
    ok 7 - postfix for + take + block in a string
    ok 8 - given + whatever code closure execution
    ok 9 - sub + given + whatevercode closure execution
    not ok 10 - sub with `with` + orelse + block interpolation
    # Failed test 'sub with `with` + orelse + block interpolation'
    # at bug-tests.t line 50
    # expected: $("meow True",)
    #      got: $("meow False",)
    # Looks like you failed 2 tests of 10
not ok 1 - thunking closure scoping
# Failed test 'thunking closure scoping'
# at bug-tests.t line 4
# Looks like you failed 1 test of 1

We got closer to the goal, with 80% of the tests now passing! In the first remaining failure, we already know from our original examination that chained andthen thunks get nested when they should not—we haven't done anything to fix that yet. Let's take care of that first.

Playing Chinese Food Mind Games

Looking back at the fixes we applied already, we have a marker for when we're working with andthen or its sister ops: the $andnotelse_thunk variable. It seems fairly straightforward that if we don't want the thunks of these ops to migrate, we just need to annotate them appropriately and stick the check for that annotation into the migration predicate.

In Grammar.nqp, we can see our ops are configured with the .b thunky, so we'll locate that branch in sub thunkity_thunk and pass $andnotelse_thunk variable as a new named param to the make_topic_block_ref block maker:

...
elsif $type eq 'b' {  # thunk and topicalize to a block
    unless $ast.ann('bare_block') || $ast.ann('past_block') {
        $ast := block_closure(make_topic_block_ref(@clause[$i],
          $ast, :$andnotelse_thunk,
          migrate_stmt_id => $*STATEMENT_ID));
    }
    $past.push($ast);
}
...

The block maker will use it to annotate the block and will add a check for that annotation to the migration predicate, so our block maker code becomes this:

 sub make_topic_block_ref(
    $/, $past, :$copy, :$andnotelse_thunk, :$migrate_stmt_id,
 ) {
    my $block := $*W.push_lexpad($/);

    # Add annotation to thunks of our ops:
    $block.annotate: 'andnotelse_thunk', 1 if $andnotelse_thunk;

    $block[0].push:
        QAST::Var.new( :name('$_'), :scope('lexical'), :decl('var') );
    $block.push($past);
    $*W.pop_lexpad();
    if nqp::defined($migrate_stmt_id) {
        migrate_blocks($*W.cur_lexpad(), $block, -> $b {
               ! $b.ann('in_stmt_mod')

            # Don't migrate thunks of our ops:
            && ! $b.ann('andnotelse_thunk')

            && ($b.ann('statement_id') // -1) >= $migrate_stmt_id
        });
    }
    ...

One more compilation cycle and test run:

$ z
[...]
$ ./perl6 bug-tests.t
1..1
    1..10
    ok 1 - xx inside `with`
    ok 2 - try with block and andthen
    ok 3 - block in a sub with orelse
    ok 4 - loop + lexical variable plus chain of andthens
    ok 5 - loop + andthen + orelse
    ok 6 - parentheses + xx + given
    ok 7 - postfix for + take + block in a string
    ok 8 - given + whatever code closure execution
    ok 9 - sub + given + whatevercode closure execution
    not ok 10 - sub with `with` + orelse + block interpolation
    # Failed test 'sub with `with` + orelse + block interpolation'
    # at bug-tests.t line 50
    # expected: $("meow True",)
    #      got: $("meow False",)
    # Looks like you failed 1 test of 10
not ok 1 - thunking closure scoping
# Failed test 'thunking closure scoping'
# at bug-tests.t line 4
# Looks like you failed 1 test of 1

So close! Just a single test failure remains. Let's give it a close look.

Within and Without

Let's repeat our procedure of dumping QASTs as well as enabling the ZZ env var and looking at what's causing the thunk mis-migration. I'm going to run a slightly simplified version of the failing test, to keep the cruft out of QAST dumps. If you're following along, when looking at the full QAST dump keep in mind what I mentioned earlier: with gets rewritten into an andthen op call during parsing.

$ q a ./perl6 -e '.uc with +"a" orelse "meow {$_ ~~ Failure}".say and 42'
$ firefox out.html

- QAST::Block(:cuid(4)) :in_stmt_mod<0>
    [...]
    - QAST::Block(:cuid(1))  :statement_id<1> :in_stmt_mod<1>
      [...]
      - QAST::Op(chain &infix:<~~>) <wanted> :statement_id<2> ~~
        - QAST::Var(lexical $_) <wanted> $_
        - QAST::WVal(Failure) <wanted> Failure
    - QAST::Block(:cuid(2)) :statement_id<1>
        :in_stmt_mod<1> :andnotelse_thunk<1>
      [...]
      - QAST::Op(callmethod Stringy) <wanted>
        - QAST::Op(call) <wanted> {$_ ~~ Failure}
          - QAST::Op(p6capturelex) <wanted> :code_object<?>
            - QAST::Op(callmethod clone)
              - QAST::WVal(Block)

$ ZZ=1 ./perl6 -e '.uc with +"a" orelse "meow {$_ ~~ Failure}".say and 42'
[...]
ZZ1: -----------------
ZZ1: trying to grab - QAST::Block(:cuid(1))
  :statement_id<1> :in_stmt_mod<1>
  [...]
ZZ1: to move to - QAST::Block
  :statement_id<1> :andnotelse_thunk<1> :in_stmt_mod<1>
  [...]
ZZ1: -----------------

Although QAST::WVal lacks .past_block annotation and so doesn't show the block's CUID in the dump, just by reading the code dumped around that QAST, we can see that the CUID-less block is our QAST::Block :cuid(1), whose immediate outer is QAST::Block :cuid(4), yet it's called from within QAST::Block :cuid(2). It's supposed to get migrated, but that migration never happens, as we can see when we use the ZZ env var to enable our debug dumps in the sub migrate_blocks.

We can see why. Here's our current migration predicate (where $b is the examined block, which in our case is QAST::Block :cuid(1)):

   ! $b.ann('in_stmt_mod')
&& ! $b.ann('andnotelse_thunk')
&& ($b.ann('statement_id') // -1) >= $migrate_stmt_id

The very first condition prevents our migration, as our block has truthy in_stmt_mod annotation, because it's part of the with's condition. At the same time, it does need to be migrated because it's part of the andthen thunk that's inside the statement modifier!

Since we already have $andnotelse_thunk variable in the vicinity of the migration predicate we can use it to tell us whether we're migrating for the benefit of our andthen thunk and not the statement modifier. However, recall that we've used the very same in_stmt_mod annotation to mark the first argument of andthen and its brother ops. We need to alter that first.

And so, the sub mark_blocks_as_andnotelse_first_arg we added earlier becomes:

sub mark_blocks_as_andnotelse_first_arg($ast) {
    if $ast && nqp::can($ast, 'ann') && $ast.ann('past_block') {
        $ast.ann('past_block').annotate: 'in_stmt_mod_andnotelse', 1;
    }
    ...

And then we tweak the migration predicate to watch for this altered annotation and to consider the value of $andnotelse_thunk variable:

migrate_blocks($*W.cur_lexpad(), $block, -> $b {
    (    (! $b.ann('in_stmt_mod_andnotelse') &&   $andnotelse_thunk)
      || (! $b.ann('in_stmt_mod')            && ! $andnotelse_thunk)
    )
    && ($b.ann('statement_id') // -1) >= $migrate_stmt_id
    && ! $b.has_ann('andnotelse_thunk')
});

Thus, we migrate all the blocks that have a statement_id equal to or higher than ours and are all of the following:

  • Not thunks of actual andthen, notandthen, or orelse
  • Not thunks inside a statement modifier, unless they're inside thunks of andthen or related ops
  • If we're considering migrating them inside one of the andthen's thunks, then also not part of the first argument to andthen (or related ops)
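The rules above can be exercised in a regular Perl 6 program with a toy model of the predicate. In this sketch, should-migrate is a hypothetical name, a plain hash stands in for a block's annotations, and the sample annotation values are modelled on the dumps shown earlier.

```perl6
# A toy model: annotations are a plain Hash, not a QAST::Block.
sub should-migrate(
    %ann, Bool :$andnotelse-thunk = False, Int :$migrate-stmt-id = 1
) {
    ?(
        (    (! %ann<in_stmt_mod_andnotelse> &&   $andnotelse-thunk)
          || (! %ann<in_stmt_mod>            && ! $andnotelse-thunk) )
        && (%ann<statement_id> // -1) >= $migrate-stmt-id
        && ! (%ann<andnotelse_thunk>:exists)
    )
}

# Block inside a statement modifier, moving into an andthen/orelse thunk
say should-migrate %( statement_id => 1, in_stmt_mod => 1 ),
    :andnotelse-thunk;

# The same block, but the target is not an andthen-family thunk
say should-migrate %( statement_id => 1, in_stmt_mod => 1 );

# Block from the first argument of andthen stays where it is
say should-migrate %( statement_id => 2, in_stmt_mod_andnotelse => 1 ),
    :andnotelse-thunk, :migrate-stmt-id(2);

# Thunks of andthen-family ops themselves never migrate
say should-migrate %( statement_id => 1, andnotelse_thunk => 1 ),
    :andnotelse-thunk;
```

Only the first call reports that the block should migrate; the other three are rejected by one of the three rules.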

That's a fancy-pants predicate. Let's compile and see if it gets the job done:

$ z
[...]
$ ./perl6 bug-tests.t
1..1
    1..10
    ok 1 - xx inside `with`
    ok 2 - try with block and andthen
    ok 3 - block in a sub with orelse
    ok 4 - loop + lexical variable plus chain of andthens
    ok 5 - loop + andthen + orelse
    ok 6 - parentheses + xx + given
    ok 7 - postfix for + take + block in a string
    ok 8 - given + whatever code closure execution
    ok 9 - sub + given + whatevercode closure execution
    ok 10 - sub with `with` + orelse + block interpolation
ok 1 - thunking closure scoping

Success! Now, let's remove all of the debug statements we added. Then, recompile and run make stresstest, to ensure we did not break anything else. With Z-Script, we can do all that by just running z ss:

$ z ss
[...]
All tests successful.
Files=1287, Tests=153127, 159 wallclock secs (21.40 usr  3.27 sys + 3418.56 cusr 179.32 csys = 3622.55 CPU)
Result: PASS

All is green. We can now commit our fix to Rakudo's repo, then commit our tests to the roast repo, and all that remains is closing those 10 tickets we fixed!

Job well done.

Conclusion

Today, we learned quite a bit about QAST: the Abstract Syntax Trees Perl 6 code compiles to in the Rakudo compiler. We examined the common types of QAST and how to create, annotate, mutate, execute, and dump them for examination.

In the second part of the article, we applied our new knowledge to fix a hive of mis-scoped thunking bugs that plagued various Perl 6 constructs. We introspected the generated QAST nodes to specially annotate them, and then used those annotations to reconfigure the migration predicate, so that it migrates the blocks correctly.

Hopefully, this knowledge inspires you to fix the many other bugs we have on the RT tracker as well as our GitHub Issue tracker.

-Ofun

About Zoffix Znet

I blog about Perl.