Perl 6 IO TPF Grant: Monthly Report (March, 2017)
This document is the March, 2017 progress report for the TPF Standardization, Test Coverage, and Documentation of Perl 6 I/O Routines grant.
Timing
I delivered the Action Plan one week later than I originally expected. The delay let me assess some of the big-picture consistency issues, which led to a proposal to remove 15 methods from IO::Handle and to iron out naming and argument format for several other routines.
I still hope to complete all the code modifications before the end of the weekend of April 15, so that all of them can be included in the next Rakudo Star release. A week after that, I plan to complete the grant.
Note: to minimize user impact, some of the changes may be included only in the 6.d language, which will be available in the 2017.04 release only if the user uses the use v6.d.PREVIEW pragma.
IO Action Plan
I finished the IO Action Plan, placed it into /doc of rakudo's repository, and made it available to other core devs for review for a week (period ends on April 1st). The Action Plan is 16 pages long and contains 26 sections detailing proposed changes.
Overall, I proposed many more small changes and fewer large, breaking changes than I originally expected. This comes from a much better understanding of how rakudo's IO routines are "meant to" be used, so I think the improvement of the documentation under this grant will be much greater than I originally anticipated.
A lot of this has to do with the lack of explanatory documentation on how to manipulate and traverse paths. As a result, users were using the $*SPEC object (157 instances of its use in the ecosystem!) and its routines for that goal, which is rather awkward.
This concept is prevalent enough that I even wrote the SPEC::Func module in the past, due to user demand, and certain books whose draft copies I read used $*SPEC as well.
In reality, $*SPEC is an internal-ish thing and unless you're writing your own IO abstractions, you never need to use it directly. The changes and additions to the IO::Path methods done under this grant will make traversing paths even more pleasant, and the new tutorial documentation I plan to write under this grant will fully describe the Right Way™ to do it all.
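As a rough illustration of the difference, here is a minimal sketch that performs the same path operations through $*SPEC and through IO::Path methods. The path /tmp/project/lib/Foo.pm6 is hypothetical and nothing is read from disk; this is not taken from the Action Plan, only a sketch using routines that exist in Rakudo.

```perl6
# Hypothetical path, used only for illustration; nothing is read from disk.
my $file = '/tmp/project/lib/Foo.pm6'.IO;

# Going through $*SPEC means handing plain strings to helper routines...
my $spec-base = $*SPEC.basename($file.Str);
my $spec-join = $*SPEC.catdir('/tmp/project', 'lib');

# ...while IO::Path offers the same operations as methods on the path itself.
my $path-base = $file.basename;
my $path-join = '/tmp/project'.IO.child('lib');

say $path-base;                # Foo.pm6
say $file.extension;           # pm6
say $file.parent;              # an IO::Path for the containing directory
say $spec-base eq $path-base;  # True
```

The IO::Path versions return IO::Path objects where that makes sense, so calls can be chained without converting back and forth between strings and paths.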
In fact, removal of $*SPEC in future language versions is currently under consideration...
Removal of $*SPEC
lizmat++ pointed out that we can gain significant performance improvements by removing the $*SPEC infrastructure and moving it into module-space. For example, a benchmark of slurping a 10-line file shows that removing all the path processing code makes the benched program run more than 14x faster. When benching IO::Path creation, the dynamic variable lookup alone takes up 14.73% of the execution time.
The initial plan was to try to make IO routines handle all OSes in a unified way (e.g. using / on Windows); however, it was found that this would create several ambiguities and would be buggy, even if fast.
However, I think there are still a lot of improvements to be gained by making the $*SPEC infrastructure internal. We'd still have the IO::Spec-type modules, but they'd have a private API we can optimize freely, and we'd get rid of the dynamic lookups and consolidate what code we can into IO::Path, while keeping the functionality that differs between OSes in the ::Spec modules.
Since this all sounds like guesstimation and there's a significant-ish use of $*SPEC in the ecosystem, the plan now is to implement it all in a module first and see whether it works well and offers any significant performance improvements. If it does, I believe it should be possible to swap IO::Path to use the fast version in the 6.d language, while still leaving the $*SPEC dynvar and its modules in core, as deprecated, for removal in 6.e.
This won't be done under this grant, and while trying not to over-promise, I hope to release this module some time in May-June. So keep an eye out for it; I already picked out a name: FastIO.
newio Branch
As per the original goals of the grant, I reviewed the code in Rakudo's 2014–2015 newio branch, to look for any salvageable ideas. I did not have any masterplan design documents to go with it, and I tried a few commits but did not find one that both compiled and had no merge conflicts (it kept complaining about ModuleLoader), so my understanding of it comes solely from reading the source code, and may be off from what the original author intended it to be.
The major difference between newio and Rakudo's current IO structure is the type hierarchy and the removal of $*SPEC. newio provides IO::Pathy and PIO roles, which are done by the IO::File, IO::Dir, IO::Local, IO::Dup, IO::Pipe, and IO::Huh classes that represent various IO objects. Rakudo's current system has fewer abstractions: IO::Path represents a path to an IO object and IO::Handle provides read/write access to it, with IO::Pipe handling pipes, and no special objects for directories (their contents are obtained via the IO::Path.dir method and their attributes are modified via IO::Path methods).
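To make the current division of labour concrete, here is a small sketch of how IO::Path, IO::Handle, and IO::Path.dir fit together. The scratch file name is hypothetical, and the sketch assumes $*TMPDIR is writable; it is only an illustration of the existing API, not of anything proposed in newio.

```perl6
# Hypothetical scratch file in the temp directory, for illustration only.
my $path = $*TMPDIR.child("io-report-demo-{$*PID}.txt");

# IO::Handle provides read/write access to whatever the path points to.
my $fh = $path.open(:w);
$fh.say("first line");
$fh.close;

# IO::Path describes the location and answers questions about it.
say $path.slurp.chomp;   # first line
say $path.f;             # True

# Directories have no class of their own: their contents come from
# IO::Path.dir, which yields more IO::Path objects.
my @matches = $path.parent.dir.grep(*.basename eq $path.basename);
say @matches.elems;      # 1

# Clean up the scratch file.
$path.unlink;
```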
Since the 6.d language is additive to the 6.c language, completely revamping the type hierarchy may be challenging and messy. I'm also not entirely sold on what appears to be one of the core design ideas in newio: most of the abstractions are of IO objects as they were at object instantiation time. An IO::Pathy object represents an IO item that exists, despite there being no guarantees that it actually does. Thus, IO::File's .f and .e methods always return True, while its .d method always returns False. This undoubtedly gives a performance enhancement; however, if $ rm foo were executed after the IO::File object's creation, the .e method would no longer return correct data, and if $ mkdir foo were then executed, both the .f and .d methods would be returning incorrect data.
Until recently, Rakudo cached the result of the .e call, and that produced behaviour users did not expect. I think the issue will be greatly exacerbated if this sort of caching is extended to entire objects and many of their methods.
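The rm/mkdir scenario can be replayed with a short sketch against Rakudo's current, non-caching IO::Path. The scratch name is hypothetical and the sketch assumes $*TMPDIR is writable and that the name is not already taken. Because each test queries the filesystem when it is called, the answers follow the changes on disk; an object that fixed its answers at creation time would keep reporting the first set.

```perl6
# Hypothetical scratch name in the temp directory, for illustration only.
my $path = $*TMPDIR.child("io-report-stale-{$*PID}");

# Start out as a regular file.
$path.spurt("hello\n");
say $path.e;   # True
say $path.f;   # True

# Equivalent of `rm foo`: the same object now reports the file is gone.
$path.unlink;
say $path.e;   # False

# Equivalent of `mkdir foo`: the same name is now a directory.
$path.mkdir;
say $path.d;   # True
say $path.f;   # False

# Clean up the scratch directory.
$path.rmdir;
```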
However, I do think the removal of $*SPEC is a good idea. And as described in the previous section, I will try to make a FastIO module, using ideas from the newio branch, for possible inclusion in future language versions.
Experimental MoarVM Coverage Reporter
As was mentioned in my grant proposal, the coverage reporter was busted by the upgrade of information returned by the .file and .line methods on core routines.
MasterDuke++ made several commits fixing numerous issues in the coverage parser, and last night I identified the final piece of the breakage. The annotations and hit reports all use the new SETTING::src/core/blah file format. The setting file has SETTING::src/core/blah markers inside of it. The parser, however, still thinks it's being fed the old gen/moar/CORE.setting filenames, so once I teach it to calculate proper offsets into the setting file, we'll have coverage reports on perl6.wtf back up and running and I'll be able to use them to judge the IO routine test coverage required for this grant.
Performance Improvements
Although not planned by the original grant, I was able to make the following performance enhancements to IO routines. So hey! Bonus deliverables \o/:
- rakudo/fa9aa47 Make R::I::SET_LINE_ENDING_ON_HANDLE 4.1x Faster
- rakudo/0111f10 Make IO::Spec::Unix.catdir 3.9x Faster
- rakudo/4fdebc9 Make IO::Spec::Unix.split 36x Faster
  - Affects IO::Path's .parent, .parts, .volume, .dirname, and .basename
  - Measurement of first call to .basename shows it's now 6x-10x faster
- rakudo/dcf1bb2 Make IO::Spec::Unix.rel2abs 35% faster
- rakudo/55abc6d Improve IO::Path.child perf on *nix:
  - make IO::Path.child 2.1x faster on *nix
  - make IO::Spec::Unix.join 8.5x faster
  - make IO::Spec::Unix.catpath 9x faster
- rakudo/4032953 Make IO::Handle.open 75% faster
- rakudo/4eef6db Make IO::Spec::Unix.is-absolute about 4.4x faster
- rakudo/ae5e510 Make IO::Path.new 7% faster when creating from Str
- rakudo/0c6281 Make IO::Pipe.lines use IO::Handle.lines for 3.2x faster performance
Performance Improvements Made By Other Core Developers
lizmat++ also made these improvements in the IO area:
- rakudo/b4d80c0 Make .IO.slurp about 2x as fast
- rakudo/9da50e3 Introducing IO::Handle.iterator
- rakudo/9019a5b Streamline IO::Handle.get/getc
- rakudo/4bc826d Streamline IO::Handle.get
Along with the commits above, she also made IO::Handle.lines faster and eliminated a quirk that required a custom .lines implementation in IO::Pipe (which is a subclass of IO::Handle). Because of that, I was able to remove the old IO::Pipe.lines implementation and make it use the new-and-improved IO::Handle.lines, which made the method about 3.2x faster.
Bugs
Will (attempt to) fix as part of the grant
- IO::Pipe inherits the .t method from IO::Handle to check whether the handle is a TTY; however, attempting to call it causes a segfault. MasterDuke++ already found the candidate for the offending code (MoarVM/Issue#561) and this should be resolved by the time this grant is completed.
Don't think I will be able to fix these as part of the grant
- Found a strange error generated when IO::Pipe's buffer is filled up. This is too deep in the guts for me to know how to resolve yet, so I filed it as RT#131026
Already Fixed
- Found that IO::Path had a vestigial .pipe method that delegated to a non-existent IO::Handle method. Removed in rakudo/a01d67
- Fixed IO::Pipe.lines not accepting a Whatever as limit, which is accepted by all other .lines. Fixed in rakudo/0c6281. Tests in roast/465795 and roast/add852
- Fixed issues due to caching of IO::Handle.e. Reported as RT#130889. Fixed in rakudo/76f718. Tests in roast/908348
- Rejected rakudo PR#666 and resolved RT#126262 by explaining why the methods return Str objects instead of IO::Path on ticket/PR and improving the documentation by fixing mistakes (doc/ccae74) and expanding (doc/3cf943) on what the methods do exactly.
- IO::Path.Bridge was defunct, as it was trying to call .Bridge on Str, which does not exist. Resolved the issue by deleting this method in rakudo/212cc8
- Per demand, made IO::Path.dir a multi, so module-space can augment it with other candidates that add more functionality. rakudo/fbe7ace