Perl 5 Porters Mailing List Summary: September 20th-26th

I have started compiling summaries of the Perl 5 Porters (p5p) mailing list. Thank you to everyone who helped improve them. Following is the first report.

Bug reports and bug fixes

Fuzzing to find bugs

Dan Collins and Brian Carpenter opened more tickets as a result of fuzzing perl, uncovering interesting ways to crash it.

Brian Carpenter found two bugs causing a null pointer dereference, leading to a segfault (Perl #126191, Perl #126192).

Brian Carpenter found a couple of assertion failures (Perl #126193, Perl #126170).

Dan Collins found a double free problem (Perl #126199). Vincent Pit was able to provide a summary of this bug in the form of:

$[ .= *[ = 'y'

which I found too cool to leave out.

Dan Collins provided a segfault (Perl #126204) caused by the following reduced regex:

/(?[()-!])/

Perl #126206 is a floating point exception, reported by Dan Collins.

Perl #126042 is a stack corruption caused by perl losing track of the stack pointer. Reported by Dan Collins, fixed by Father Chrysostomos.

Father C. raised a problem with the implementation details of PL_sv_yes, and cautiously proposed at least a specific usage of it be removed from perl space. Zefram supported and added that -- even if implemented correctly -- it would still be wrong.

Perl #126064: Yet another stack corruption; like the previous one, reported by Dan Collins, fixed by Father Chrysostomos.

Perl #126188: When requiring IO::File in an attempt to resolve missing method on a hash, a segfault happens. Yet more results of fuzzing. Reported by Dan Collins.

Ricardo Signes bumped another fuzzing bug which caused a segfault. Shlomi Fish provided a patch, which was seconded by Reini Urban, discussed and approved on #p5p, and then applied. (Perl #125350)

Portability fixes

Dan Collins provided a patch for a problem compiling perl on GNU/Linux, which was bisected to one of the recent AmigaOS-related changes. (Perl #126152)

Dan Collins reported a problem with building quadmath Perl. Bulk88 provided more comments and Lukas Mai delved into util.c, finding a potential memory leak in the quadmath-related code. (Perl #126203)

Sisyphus raised an issue of a test that has been failing for him on Windows 7 since perl 5.23.3 having to do with read-only file attributes. With the help of Tony Cook the packaging problem was solved. (Perl #126133)

Bulk88 provided additional information on a ticket requesting the revert of a patch that caused a problem on Visual C. Karl Williamson does not wish to revert it and instead offers to handle the specific compiler explicitly. (Perl #126045)

sv_backoff optimisation

Bulk88 provided a patch for making sv_backoff more tailcall friendly (Perl #126171).

sv_backoff would change its return type from int to void, but as Bulk88 explains, since it was previously only returning a meaningless constant value and since it should be reached via the public API function sv_setsv(sv, &PL_sv_undef) (or several others) anyway, it shouldn't be a problem. He also added this explanation as a documentation patch.

File::Find portability issue

Robert Mah raised a problem with File::Find working between CIFS and GNU/Linux systems - one supports nlink while the other does not and File::Find failed to understand this. Dave Mitchell showed the commit that tries to identify it, and additional research by Robert has shown other C utilities seem to have this problem too. (Perl #126144)

More optimisations

Bulk88 bumped a ticket he opened at the beginning of 2014 to optimize two functions using tied hashes. The bump resulted in additional comments from Tony Cook, following several rounds of comments and fixes between Dave Mitchell and Bulk88. (Perl #121348)

He also provided a patch for optimizing some stack manipulation. (Perl #126196)

Mysterious mod_perl compilation crash

Michael Schout provided additional information on a crash involving pre-compiling modules under mod_perl, tracing back to a commit which fixed a seemingly unrelated bug. (Perl #126145)

Documenting the names of types

Linda Walsh opened a ticket regarding additional reference type documentation. Ricardo adds an explanation of Regexp vs. REGEXP and notes additional types that should probably be documented along the way. (Perl #126150)

Finding modules on case-insensitive filesystems

Patrick Zimmermann raised an issue with loading modules on case-insensitive systems. Zefram provided an explanation of why this happens: unfortunately, it is a combination of correct behavior and an API design in which a module's import method is optional, something that cannot be changed. (Perl #126167)
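To illustrate the "optional import method" part, here is a minimal sketch. The package name is hypothetical and nothing defines it; the sketch only shows that calling import on a class that has no such method is quietly ignored, while any other missing method is an error. It does not reproduce the filesystem side of the ticket.

```perl
use strict;
use warnings;

# Hypothetical package name: no such package or import method is defined anywhere.
my $import_survived = eval { My::Hypothetical::Module->import; 1 };

# Any other missing method dies with "Can't locate object method ...".
my $other_died = !eval { My::Hypothetical::Module->frobnicate; 1 };

print "import on an undefined package: ",
    ($import_survived ? "silently ignored" : "died"), "\n";
print "frobnicate on an undefined package: ",
    ($other_died ? "died" : "survived"), "\n";
```

This is presumably why a use statement with the wrong letter case can appear to succeed on a case-insensitive filesystem: the file is found, and the import call on the misspelled package name does nothing.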

Crashing perl with x

Dan Collins raised a problem with the x operator causing a segfault when operating on a list. The ticket contains an in-depth discussion on the problem and possible fixes. (Perl #125937)

What's in the %! hash

Felipe Gasper opened a ticket suggesting to document the ability to use the values of %!. (Perl #125350)
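For readers who have not used %! before, a small sketch of the common usage follows. The file path is a made-up one that is assumed not to exist; the element for the error currently in $! is true, and the others are false.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine.
my $path = '/hypothetical/no/such/file';

my $opened = open my $fh, '<', $path;

# Capture the values right away, before anything else can change $!.
my $enoent = $!{ENOENT};    # true when $! is currently ENOENT
my $eacces = $!{EACCES};    # false unless $! is currently EACCES

print "open failed with ENOENT\n" if !$opened && $enoent;
```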

Several regexp bugs

Victor ADAM raised a ticket noting that the regexp pattern ]]]]][\\ should raise an error but does not. Karl Williamson was able to reproduce it, wrote a patch, and will seek additional cases before pushing it. (Perl #126141)

He opened several other regex-related tickets:

Discussions

Smart Match, again

Ricardo Signes has laid out plans on cleaning up Smart Match and has provided test cases for the new expected behavior of Smart Match.

It would seem like Smart Match is going to get very clear, simple, and most importantly, expected syntax.

The thread itself is quite long and I recommend reading it only if you're interested in what people had snagged on. The aforementioned gist provides a clear spec of what Smart Match would become. The discussion thread also contains some comments by Zefram on what he believes is also confusion in Smart Match in Perl 6.

Revising version string semantics

Following the Lyon QA Hackathon and its decisions, Ricardo Signes provided a summary and asked for any reasoned objections to moving forward with the recommendations. Questions were asked to clarify specific situations, and the linked gist was amended accordingly.

Spaces in qr/\p L/

Karl Williamson asked for comments on having spaces when using \p (the syntax for named Unicode properties) in regular expressions:

qr/\p L/

vs.

qr/\p L/x

Ricardo Signes, Yitzchak Scott-Thoennes, and Abigail agreed that both should fail.

Removing legacy code from B

Nicolas R. suggested removing "dead code" from core (specifically from B, relating to some PERL_VERSION checks) and provided a patch removing it. This turned into a conversation on what "dead code" is and whether it should be removed.

Main positions:

  • Dave Mitchell agrees with removing the code.
  • Tony Cook applied the patch.
  • Reini Urban disagrees, as it was used as boilerplate code on CPAN.
  • Todd Rinaldo supports removal and suggests considering adding docs.

Continued discussion:

Rocco Caputo asked about guidelines on obsolete implementations and dead code in general while Aristotle Pagaltzis asked to think of this change in the context of the recently-surfaced concept of the CPAN river and doubts whether the code in question is, in fact, dead code.

Ricardo Signes proposed a practical solution along the lines of Todd's suggestion: finding a proper place, in the documentation, for any important information which might be lost in this commit.

Bulk88 offered an example of Encode supporting older perl versions with what could be described as "dead code", which Ricardo explained was an incorrect example, since Encode can be installed on older perl versions, while new versions of B cannot.

Unshifting undef to @ISA

Vadim Pushtaev opened a ticket about unshifting more than one value to @ISA, leading to a discussion about the problem.

Zefram was able to distill that example further. Aristotle Pagaltzis suggested this raises two bugs instead of one: a loop that occurs when pushing undef to @ISA, and unshifting multiple defined values causing perl to unshift undef. Eirik Berg Hanssen was able to suggest a third bug, which occurs inside an eval and shows a different behavior.

Paul "LeoNerd" Evans suggested a possible reason for the warnings which both Vadim and Zefram agree is the real cause. The problem? In Paul's own words:

Random guess: 'unshift' has to create multiple holes at the start of the array so it doesn't suffer O(n^2) behaviour.

Zefram expands:

Pretty much. pp_unshift() internally performs an av_unshift() followed by a bunch of av_store()s. av_unshift() doesn't take parameters for the values to unshift; it always sticks undefs in. (Actually null pointers internally.) The av_store() calls invoke magic on the array.

Dagfinn Ilmari Mannsåker points out that there is a separate bug unearthed during the debugging process in which storing undef values in @main::ISA warns 101 times before dying when detecting inheritance recursion. Paul's observation should be noted:

Oops; sounds like the code to detect and warn against the chance of an infinite recursion bug itself suffers an infinite recursion bug.

Ilmari provided a patch to delay the @ISA set magic until all items are assigned (which is what is done in push but not unshift).

Tony Cook explained why the original magic delay code for push (and thus the proposed same solution for unshift) is actually wrong and offered a different solution instead.

Ilmari offered a new patch fixing both according to Tony's suggestion.
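For context, the kind of code under discussion looks roughly like the sketch below. The package names are hypothetical, and the sketch only shows the end result one would expect once all values are stored; on perls affected by the bug, the intermediate state (with undef placeholders visible to the @ISA magic) is where the warnings came from.

```perl
use strict;
use warnings;
use mro;

# Hypothetical classes, made up for this illustration.
package My::Base::A { sub hello { 'A' } }
package My::Base::B { sub hello { 'B' } }
package My::Child   { our @ISA; }

package main;

# Unshift more than one value onto @ISA in a single call.
unshift @My::Child::ISA, 'My::Base::A', 'My::Base::B';

# Inspect the resulting method resolution order.
my @linear   = @{ mro::get_linear_isa('My::Child') };
my $greeting = My::Child->hello;

print "MRO: @linear\n";
print "hello() resolves to: $greeting\n";
```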

Optimizing reference checks

At the beginning of the month Jarkko Hietaniemi raised a personal annoyance with the fact that:

ref $foo eq 'ARRAY'

actually gets compiled into a string eq against the string constant ARRAY, which makes it not just inefficient, but also a possible problem with blessed references - whether arrayref or ARRAY package name, not to mention typo possibilities.
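The blessed-reference problem can be seen in a short sketch. It uses Scalar::Util's reftype and blessed, which is one existing way to check the underlying type and not the native check discussed in the thread; the class name and the is_plain_array helper are made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

my $plain    = [1, 2, 3];
my $object   = bless [1, 2, 3], 'My::Hypothetical::Class';
my $impostor = bless {}, 'ARRAY';    # a hash blessed into a package named ARRAY

# Hypothetical helper: true only for an unblessed array reference.
sub is_plain_array {
    my ($thing) = @_;
    return !defined(blessed($thing)) && (reftype($thing) // '') eq 'ARRAY';
}

for my $thing ($plain, $object, $impostor) {
    printf "ref=%-24s reftype=%-6s plain array? %s\n",
        ref($thing), reftype($thing), is_plain_array($thing) ? 'yes' : 'no';
}
```

The string comparison misses $object, an array blessed into a class, and is fooled by $impostor, a hash whose package happens to be named ARRAY.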

Zefram suggested a solution similar to what Params::Classify does. Kent noted that Perl should have a native typeof kind of check.

Bulk88 further delved into implementation details of other languages in this respect and what possible changes could happen in perl 5 in order to accommodate this (and beyond). Unfortunately his email did not receive comments.

Dereferencing and "anonymous scalars"

Bob Kleemann hit a snag when trying to dereference a variable referencing a previous variable with the same name using version numbers.

my $v = shift;
$v = \$v;
say sprintf("v%vd", $$v); # prints address, not value

Tony Cook explains it succinctly:

You're setting $v to a reference to itself, so $$v is the value of $v, which is a reference [...].

Bob found a solution as the following code:

$v = \eval { $v };

This introduced an interesting thread on what Eirik Berg Hanssen referred to as anonymous scalars.

By the way, following an exhaustive explanation by Aristotle Pagaltzis, Bob eventually went with:

$v = \do { my $copy = $v };
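A minimal sketch of the difference, assuming a v-string literal as the input (the thread's input came from shift): the first variable ends up referring to itself, while the second refers to a fresh copy of the original value.

```perl
use strict;
use warnings;

# Self-reference: after this, $$loop is just $loop again.
my $loop = v1.22.333;
$loop = \$loop;
my $points_to_itself = ($$loop == $loop);    # same address

# Reference to a copy: $$v still holds the original version string.
my $v = v1.22.333;
$v = \do { my $copy = $v };
my $formatted = sprintf "v%vd", $$v;

print "self-referential: ", ($points_to_itself ? "yes" : "no"), "\n";
print "formatted: $formatted\n";
```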

AUTOLOAD on non-existing tied hash methods

Bulk88 asks whether AUTOLOAD should be called when some methods of a tied hash do not exist.

Chas. Owens shared code demonstrating that the only function that must be handled is TIEHASH.
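This is not the code from the thread, only a small sketch of the same point with a hypothetical class: tie succeeds with nothing but TIEHASH defined, and the other methods are only missed once an operation actually needs them.

```perl
use strict;
use warnings;

# Hypothetical tie class that defines TIEHASH and nothing else.
package My::Hypothetical::Tie;
sub TIEHASH { my ($class) = @_; return bless {}, $class }

package main;

my %h;
my $tied_ok = tie(%h, 'My::Hypothetical::Tie') ? 1 : 0;

# Reading an element needs FETCH, which this class does not provide.
my $value = eval { $h{answer} };
my $error = $@;

print "tie succeeded: $tied_ok\n";
print "fetch error: $error";
```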

Possible optimization for tied hashes

Bulk88 wondered about possibly calling scalar keys on a tied hash instead of iterating over the keys and values, which would result in a nice optimization. Tony Cook explained why it will not be suitable for the purpose.

stat with an array

Following a ticket raised by Jozef Mojzis, Dan Collins writes that, as a side effect of fixing at least two crash-inducing bugs, stat no longer works on arrays. He suggests declaring it a WONTFIX or adding a warning and a note in perldelta.

Father Chrysostomos adds a reasonable use-case for stat(@_) and suggests researching additional usages before attempting to fix it, one way or the other, and Eirik Berg Hanssen offers reasons to both document in perldelta and raise a warning.

Dan Collins provided a patch to warn with a documentation change.

Stack overflow with XS_RETURN?

Following a ticket (mentioned above) referring to the x operator causing a segfault, Bulk88 wondered whether the XS_RETURN family of macros could introduce a segfault as well. Dave Mitchell explained how it would not be possible, but added that maybe we should make it explicit in the documentation and by adding assertions.

Negative values in XSRETURN

Doug Bell asked what should happen when XSRETURN receives a negative value as a parameter. Dave Mitchell, backed by H. Merijn Brand (Tux), suggests that since this corrupts the stack, an assertion should be added.

A patch was provided and is being smoked.

Branch cleanups

Dave Mitchell sent another email with a list of temporary Git branches to be deleted, asking people to prune them.

News

perl 5.23.3 released!

Peter Martini released perl 5.23.3.

His epigraph follows:

Little of all we value here
Wakes on the morn of its hundredth year
Without both feeling and looking queer.
In fact, there’s nothing that keeps its youth,
So far as I know, but a tree and truth.
(This is a moral that runs at large;
Take it. — You’re welcome. — No extra charge.)

    -- The Deacon’s Masterpiece or The Wonderful "One-Hoss Shay": A Logical Story
Oliver Wendell Holmes

The announcement.

Encode 2.78 is out

Dan Kogai announced a new version of Encode. Biggest change is preloading the CP1252 encoding.

3 Comments

Thank you, great idea!

This is great Sawyer -- thanks for doing this!

I have two suggestions. First, there's a lot of content, and I guess there often will be. So how about a slightly more interactive format? I took your summary and did a rough conversion to a markdown format, then hacked together a script which converts it to a simple structured format.

This is clearly going to be a lot of work. I think it's more likely to keep getting done if there's a team who can take turns. You could set up a github organisation and use github pages to provide a basic website for the summaries. That way people could send PRs to expand / correct / fill in sections, and editors for the summaries could come and go.

About Sawyer X

Gots to do the bloggingz