Perl 6 IO TPF Grant: Monthly Report (March, 2017)
This document is the March, 2017 progress report for the TPF Standardization, Test Coverage, and Documentation of Perl 6 I/O Routines grant.
My delivery of the Action Plan was one week later than I originally expected. The delay let me assess some of the big-picture consistency issues, which led to a proposal to remove 15 methods from IO::Handle and to iron out naming and argument format for several other routines.
I still hope to complete all the code modifications before the end of the weekend of April 15, so all of them can be included in the next Rakudo Star release. A week after that, I plan to complete the grant.
Note: to minimize user impact, some of the changes may be included only in
the 6.d language, which will be available in the 2017.04 release only if the user uses
the use v6.d.PREVIEW pragma.
IO Action Plan
I finished the IO Action Plan, placed it into
/doc of rakudo's repository, and made it available to other core devs for
review for a week (period ends on April 1st). The Action Plan is 16 pages long and
contains 26 sections detailing proposed changes.
Overall, I proposed many more small changes and fewer large, breaking changes than I originally expected. This has to do with a much better understanding of how rakudo's IO routines are "meant to" be used, so I think the improvement of the documentation under this grant will be much greater than I originally anticipated.
A lot of this has to do with lack of explanatory documentation for how to
manipulate and traverse paths. This had the effect that users were using the
$*SPEC object (157 instances of its use in the ecosystem!) and its routines
for that goal, which is rather awkward.
This concept is prevalent enough that I even wrote
the SPEC::Func module in the
past, due to user demand, and certain books whose draft copies I read used
$*SPEC as well.
$*SPEC is an internal-ish thing and unless you're writing your
own IO abstractions, you never need to use it directly. The changes and
additions to the
IO::Path methods done under this grant will make traversing
paths even more pleasant, and the new tutorial documentation I plan to write
under this grant will fully describe the Right Way™ to do it all.
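To make the contrast concrete, here is a minimal sketch of path traversal using only IO::Path methods, next to one $*SPEC call doing the same job. The path is hypothetical and does not need to exist on disk; the results noted in the comments assume a POSIX system and a current Rakudo, and this is only an illustration, not the tutorial material planned under the grant.

```perl6
# Hypothetical path, used purely for illustration; nothing is read from disk.
my $file = '/tmp/project/lib/Foo.pm6'.IO;

my $name = $file.basename;    # last component of the path: Foo.pm6
my $ext  = $file.extension;   # text after the last dot:    pm6
my $dir  = $file.parent;      # IO::Path for /tmp/project/lib

# Walk two levels up, then descend into a sibling directory.
my $tests = $file.parent.parent.add('t');

# The same join done through $*SPEC, which returns a plain Str.
my $via-spec = $*SPEC.catdir('/tmp/project', 't');

say $name;         # Foo.pm6
say $ext;          # pm6
say $dir.Str;      # /tmp/project/lib  (POSIX)
say $tests.Str;    # /tmp/project/t    (POSIX)
say $via-spec;     # /tmp/project/t    (POSIX)
```

The IO::Path version stays in terms of path objects the whole way, so the result can be handed straight to routines such as .slurp or .dir without converting back from a string.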
In fact, removal of
$*SPEC in future language versions is currently under consideration.
lizmat++ pointed out that we can gain significant performance improvements by
removing the $*SPEC infrastructure and moving it into module-space. For example,
a benchmark of slurping a 10-line file shows that removal of all the
path processing code makes the benched program run more than 14x faster. During
IO::Path creation, dynamic var lookup alone takes up 14.73% of
the execution time.
The initial plan was to try to make IO routines handle all OSes in a unified
way (e.g. using
/ on Windows); however, it was found this would create
several ambiguities and would be buggy, even if fast.
However, I think there are still a lot of improvements that can be gained
by making the $*SPEC infrastructure internal. So we'd still have the
IO::Spec-type modules but they'll have a private API we can optimize freely,
and we'll get rid of the dynamic lookups and consolidate what code we can into
IO::Path, while keeping the functionality that differs between OSes in
those modules.
Since this all sounds like guesstimation and there's a significant-ish use of
$*SPEC in the ecosystem, the plan now is to implement it all in a module
first and see whether it works well and offers any significant performance
improvements. If it does, I believe it should be possible to swap
to use the fast version in
6.d language, while still leaving
and its modules in core, as deprecated, for removal in
This won't be done under this grant, and while trying not to over-promise, I
hope to release this module some time in May-June. So keep an eye out for it; I
already picked out a name:
As per the original goals of the grant, I reviewed the code in Rakudo's 2014–2015
newio branch, to look for any salvageable ideas. I did not have any masterplan
design documents to go with it and I tried a few commits but did not find one
that didn't have merge conflicts and compiled (it kept complaining about
ModuleLoader), so my understanding of it comes solely from reading the source
code, and may be off from what the original author intended it to be.
The major difference between
newio and Rakudo's current IO structure is
type hierarchy and removal of
PIO roles which are done by
IO::Huh classes that represent various IO objects. Rakudo's current
system has fewer abstractions:
IO::Path represents a path to an IO
IO::Handle provides read/write access to it, with
handling pipes, and no special objects for directories (their contents are
IO::Path.dir method and their attributes are modified via
Since the 6.d language is additive to the 6.c language, completely revamping the
type hierarchy may be challenging and messy. I'm also not entirely sold on what
appears to be one of the core design ideas in
newio: most of the
abstractions are of IO objects as they were at the object instantiation time. An
IO::Pathy object represents an IO item that exists, despite there being
no guarantees that it actually does. Thus,
True, while its
.d method always returns
undoubtedly gives a performance enhancement, however, if
$ rm foo were executed after
IO::File object's creation, the
would no longer return correct data and if then
$ mkdir foo were
.d methods would be returning incorrect data.
Until recently, Rakudo cached the result of the
.e call, and that produced
behaviour users did not expect. I think the
issue will be greatly exacerbated if this sort of caching is extended to entire
objects and many of their methods.
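The caching concern can be illustrated with a small sketch. It assumes a current Rakudo, where .e on an IO::Path queries the filesystem on every call; the temporary file name is made up for the example.

```perl6
# Hypothetical scratch file inside the system temp directory.
my $path = $*TMPDIR.add("io-grant-example-{$*PID}.txt");

$path.spurt("some content");   # create the file
my $before = $path.e;          # the file exists at this point

$path.unlink;                  # delete it behind the object's back
my $after = $path.e;           # asks the filesystem again

say $before;   # True
say $after;    # False, because the answer was not cached
```

If the first answer were cached on the object, the second check would still claim the file exists, which is the kind of stale result described above.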
However, I do think the removal of
$*SPEC is a good idea. And as described in
the previous section, I will try to make a
FastIO module, using ideas from
branch, for possible inclusion in future language versions.
Experimental MoarVM Coverage Reporter
As was mentioned in my grant proposal, the coverage reporter was busted by
the upgrade of information returned by
.line methods on core
MasterDuke++ made several commits fixing numerous issues to the coverage
parser and last night I identified the final piece of the breakage. The
annotations and hit reports all use the new
format. The setting file has
SETTING::src/core/blah markers inside of it.
The parser, however, still thinks it's being fed the old
filenames, so once I teach it to calculate proper offsets
into the setting file, we'll have coverage reports on perl6.wtf back up and running and I'll be able to use them
to judge IO routine test coverage required for this grant.
Although not planned by the original grant, I was able to make the following performance enhancements to IO routines. So hey! Bonus deliverables \o/:
- rakudo/fa9aa47 Make
- rakudo/0111f10 Make IO::Spec::Unix.catdir 3.9x Faster
- rakudo/4fdebc9 Make IO::Spec::Unix.split 36x Faster
  - Affects IO::Path's .parent, .parts, .volume, .dirname, and .basename
  - Measurement of first call to .basename shows it's now 6x-10x faster
- rakudo/dcf1bb2 Make IO::Spec::Unix.rel2abs 35% faster
- rakudo/55abc6d Improve IO::Path.child perf on
  - make IO::Path.child 2.1x faster on
  - make IO::Spec::Unix.join 8.5x faster
  - make IO::Spec::Unix.catpath 9x faster
- rakudo/4032953 Make IO::Handle.open 75% faster
- rakudo/4eef6db Make IO::Spec::Unix.is-absolute about 4.4x faster
- rakudo/ae5e510 Make IO::Path.new 7% faster when creating from Str
- rakudo/0c6281 Make IO::Pipe.lines use IO::Handle.lines for 3.2x faster performance
Performance Improvements Made By Other Core Developers
lizmat++ also made these improvements in the IO area:
- rakudo/b4d80c0 Make .IO.slurp about 2x as fast
- rakudo/9da50e3 Introducing IO::Handle.iterator
- rakudo/9019a5b Streamline IO::Handle.get/getc
- rakudo/4bc826d Streamline IO::Handle.get
Along with the commits above, she also made IO::Handle.lines faster and
eliminated a quirk that required
a separate .lines implementation in IO::Pipe (which is a subclass of IO::Handle).
Due to that, I was able to remove the old IO::Pipe.lines implementation and make it
use the new-and-improved IO::Handle.lines, which made the
method about 3.2x faster.
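As a rough illustration of the relationship described above, the sketch below reads lines from a child process through an IO::Pipe. It assumes a current Rakudo; the child program is a made-up one-liner run with the same executable, chosen only so the example does not depend on external commands.

```perl6
# Start a child Rakudo process and capture its STDOUT through a pipe.
my $proc = run $*EXECUTABLE, '-e', 'say $_ for 1..5', :out;

# The captured handle is an IO::Pipe, which is also an IO::Handle.
my $is-handle = $proc.out ~~ IO::Handle;

# .lines on the pipe; assigning to an array reads it to the end.
my @lines = $proc.out.lines;
$proc.out.close;

say $is-handle;     # True
say @lines.elems;   # 5
```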
Will (attempt to) fix as part of the grant
- There is a .t method from
IO::Handle to check if the handle is a TTY; however, an attempt to call it causes a segfault. MasterDuke++ already found the candidate for the offending code (MoarVM/Issue#561) and this should be resolved by the time this grant is completed.
Don't think I will be able to fix these as part of the grant
- Found a strange error generated when
IO::Pipe's buffer is filled up. This is too deep in the guts for me to know how to resolve yet, so I filed it as RT#131026
- Found that IO::Path had a vestigial .pipe method that delegated to a non-existent IO::Handle method. Removed in rakudo/a01d67
- Fixed IO::Pipe.lines not accepting a Whatever as limit, which is accepted by all other .lines. Fixed in rakudo/0c6281. Tests in roast/465795 and roast/add852
- Fixed issues due to caching of
IO::Handle.e. Reported as RT#130889. Fixed in rakudo/76f718. Tests in roast/908348
- Rejected rakudo PR#666
and resolved RT#126262 by explaining why the methods return
Str objects instead of
IO::Path on the ticket/PR and improving the documentation by fixing mistakes (doc/ccae74) and expanding (doc/3cf943) on what the methods do exactly.
- IO::Path.Bridge was defunct, as it was trying to call .Bridge on Str, which does not exist. Resolved the issue by deleting this method in rakudo/212cc8
- Per demand, made
IO::Path.dir a multi, so module-space can augment it with other candidates that add more functionality. rakudo/fbe7ace