Perl 5 Porters Mailing List Summary: September 20th-26th
I have started compiling summaries of the Perl 5 Porters (p5p) mailing list. Thank you to everyone who helped improve them. Following is the first report.
Bug reports and bug fixes
Fuzzing to find bugs
More tickets were opened by Dan Collins and Brian Carpenter, results of fuzzing perl and uncovering interesting ways to crash perl.
Brian Carpenter found two bugs causing a null pointer dereference, leading to a segfault (Perl #126191, Perl #126192).
Brian Carpenter found a couple of assertion failures (Perl #126193, Perl #126170).
Dan Collins found a double free problem (Perl #126199). Vincent Pit was able to provide a summary of this bug in the form of:
$[ .= *[ = 'y'
which I found too cool to leave out.
Dan Collins provided a segfault (Perl #126204) caused by the following reduced regex:
/(?[()-!])/
Perl #126206 is a floating point exception. Reported by Dan Collins.
Perl #126042 is a stack corruption caused by perl losing track of the stack pointer. Reported by Dan Collins, fixed by Father Chrysostomos.
Father C. raised a problem with the implementation details of PL_sv_yes
,
and cautiously proposed at least a specific usage of it be removed from perl
space. Zefram supported and added that -- even if implemented
correctly -- it would still be wrong.
Perl #126064: Yet another stack corruption; like the previous one, reported by Dan Collins, fixed by Father Chrysostomos.
Perl #126188: When requiring IO::File in an attempt to resolve missing method on a hash, a segfault happens. Yet more results of fuzzing. Reported by Dan Collins.
Ricardo Signes bumped another fuzzing bug which caused a segfault and Shlomi Fish provided a patch, seconded by Reini Urban and discussed and approved on #p5p - leading to the patch being applied. (Perl #125350)
Portability fixes
Dan Collins provided a patch for compiling perl on GNU/Linux which was bisected to one of the recent AmigaOS-related changes. (Perl #126152)
Dan Collins reports a problem with building quadmath Perl. Bulk88
provided more comments and Lukas Mai delved into util.c
finding a
potential memory leak with the quadmath-related code.
(Perl #126203)
Sisyphus raised an issue of a test that has been failing for him on Windows 7 since perl 5.23.3 having to do with read-only file attributes. With the help of Tony Cook the packaging problem was solved. (Perl #126133)
Bulk88 provided additional information to a ticket requested to revert a patch that caused a problem on Visual C. Karl Williamson does not wish to revert it and instead offers to handle the specific compiler explicitly. (Perl #126045)
sv_backoff
optimisation
Bulk88 provided a patch for making sv_backoff
more tailcall friendly
(Perl #126171).
sv_backoff
would change its return value from int to void, but
as Bulk88 explains, since it was previously only returning a meaningless
constant value and since it should be reached via the public API function
sv_setsv(sv, &PL_sv_undef)
(or several others) anyway, it shouldn't be
a problem. He also added this explanation as a documentation patch.
File::Find
portability issue
Robert Mah raised a problem with
File::Find working between CIFS
and GNU/Linux systems - one supports nlink while the other does not
and File::Find
failed to understand this. Dave Mitchell showed the
commit that tries to identify it, and additional research by Robert has
shown other C utilities seem to have this problem too.
(Perl #126144)
More optimisations
Bulk88 bumped a ticket he opened at the beginning of 2014 to optimize two functions using tied hashes. The bump resulted in additional comments from Tony Cook, following several rounds of comments and fixes between Dave Mitchell and Bulk88. (Perl #121348)
He also provided a patch for optimizing some stack manipulation. (Perl #126196)
Mysterious mod_perl
compilation crash
Additional information provided by Michael Schout on a crash involving
pre-compiling modules under mod_perl
tracing back to commit which fixed
a seemingly unrelated bug.
(Perl #126145)
Documenting the names of types
Linda Walsh opened a ticket regarding additional reference type
documentation. Ricardo adds an explanation of Regexp
vs. REGEXP
and notes additional types that should probably be documented along the
way. (Perl #126150)
Finding modules on case-insensitive filesystems
Patrick Zimmermann raised an issue with loading modules on case-insensitive
systems. Zefram provided an explanation on why this happens, unfortunately
a combination of correct behavior with an API design of an optional
import
method in a module - something that cannot be changed.
(Perl #126167)
Crashing perl with x
Dan Collins raised a problem with the x
operator causing a segfault
when operating on a list. The ticket contains an in-depth discussion
on the problem and possible fixes.
(Perl #125937)
What's in th %!
hash
Felipe Gasper opened a ticket suggesting to document the ability to
use the values of %!
.
(Perl #125350)
Several regexp bugs
Victor ADAM has raised a ticket that the regexp pattern ]]]]][\\
should raise an error but does not. Karl Williamson was able to
reproduce, write a patch, and will seek additional cases before pushing
it. (Perl #126141)
He opened several other regex-related tickets:
- Perl #126179:
Regression with
\N{}
. - Perl #126181:
\c
inside(?[])
. - Perl #126182:
/(.(?2))((?<=(?=(?1)).))/
hangs, eats all available RAM. - Perl #126185:
/(?-p)/
should raise an error. - Perl #126186:
Document
(*COMMIT:arg)
and(*ACCEPT:arg)
. - Perl #126177:
(?n)
should be documented. - Perl #126187:
/\p /
and/\p^/
give strange warnings. - Perl #126180:
/(?[\ &!])/
segfaults. - Perl #126178:
/(?i/
and similar should raise an error.
Discussions
Smart Match, again
Ricardo Signes has laid out plans on cleaning up Smart Match and has provided test cases for the new expected behavior of Smart Match.
It would seem like Smart Match is going to get very clear, simple, and most importantly, expected syntax.
The thread itself is quite long and I recommend reading it only if you're interested in what people had snagged on. The aforementioned gist provides a clear spec of what Smart Match would become. The discussion thread also contains some comments by Zefram on what he believes is also confusion in Smart Match in Perl 6.
Revising version string semantics
Following Lyon QA Hackathon and its decisions, Ricardo Signes provided a summary and queried for any reasoned objections to moving forward with the recommendations. Questions were asked for clarifying specific situations and amending was done on the linked gist.
Spaces in qr/\p L/
Karl Williamson
asked for comments
on having spaces when using \p
(the
syntax for named Unicode properties) in regular expressions:
qr/\p L/
vs.
qr/\p L/x
Agreement that both should fail from Ricardo Signes, Yitzchak Scott-Thoennes, and Abigail.
Removing legacy code from B
Nicolas R. suggested
removing "dead code" from core (specifically
B relating to some PERL_VERSION
checks and
provided a patch removing it. This turned into a conversation on what
"dead code" is and whether it should be removed.
Main positions:
- Dave Mitchell agrees with removing the code.
- Tony Cook applied the patch.
- Reini Urban disagrees, as it was used as boilerplate code on CPAN.
- Todd Rinaldo supports removal and suggests considering adding docs.
Continued discussion:
Rocco Caputo asked about guidelines on obsolete implementations and dead code in general while Aristotle Pagaltzis asked to think of this change in the context of the recently-surfaced concept of the CPAN river and doubts whether the code in question is, in fact, dead code.
Ricardo Signes proposed a practical solution along Todd's suggestion, opting for finding a proper place for any important information which might be lost in this commit - documentation.
Bulk88 had offered an example with Encode
supporting older perl versions with what could be described as "dead code",
which Ricardo explained as a incorrect example, since Encode
can be installed
on older perl versions, while new versions of B
cannot.
Unshifting undef
to @ISA
Vadim Pushtaev opened
a ticket about
unshifting more than one value to @ISA
, leading to a
discussion
about the problem.
Zefram was able to distill that example further. Aristotle Pagaltzis
suggests this raises two bugs instead of one: A loop that occurring
pushing undef
to @ISA
, and unshifting multiple defined
values causing perl to unshift undef
. Eirik Berg Hanssen was able
to suggest a third bug which occurs inside an eval
showing a different
behavior.
Paul "LeoNerd" Evans suggested a possible reason for the warnings which both Vadim and Zefram agree is the real cause. The problem? In Paul's own words:
Random guess: 'unshift' has to create multiple holes at the start of the array so it doesn't suffer O(n^2) behaviour.
Zefram expands:
Pretty much. ppunshift() internally performs an avunshift() followed by a bunch of avstore()s. avunshift() doesn't take parameters for the values to unshift; it always sticks undefs in. (Actually null pointers internally.) The av_store() calls invoke magic on the array.
Dagfinn Ilmari Mannsåker reminds there is a separate bug unearthed during
the debugging process in which storing undef
values in @main::ISA
warns
101 times before dying when detecting inheritance recursion. Paul's observation
should be noted:
Oops; sounds like the code to detect and warn against the chance of an infinite recursion bug itself suffers an infinite recursion bug.
Ilmari provided a patch to delay the @ISA
set magic until all items are
assigned (which is what is done in push
but not unshift
).
Tony Cook explained why the original magic delay code for push
(and
thus the proposed same solution for unshift
) is actually wrong and
offered a different solution instead.
Ilmari offered a new patch fixing both according to Tony's suggestion.
Optimizing reference checks
At the beginning of the month Jarkko Hietaniemi raised a personal annoyance with the fact that:
ref $foo eq 'ARRAY'
actually gets compiled into a string eq
against the string constant
ARRAY
, which makes it not just inefficient, but also a possible problem
with blessed references - whether arrayref or ARRAY
package name, not
to mention typo possibilities.
Zefram suggested a similar solution to what
Params::Classify does. Kent
noted that it's Perl should have a native typeof
kind of check.
Bulk88 further delved into implementation details of other languages in this respect and what possible changes could happen in perl 5 in order to accommodate this (and beyond). Unfortunately his email did not receive comments.
Dereferencing and "anonymous scalars"
Bob Kleemann hit a snag when trying to dereference a variable referencing a previous variable with the same name using version numbers.
my $v = shift;
$v = \$v;
say sprintf("v%vd", $$v); # prints address, not value
Tony Cook explains it succinctly:
You're setting $v to a reference to itself, so $$v is the value of $v, which is a reference [...].
Bob found a solution as the following code:
$v = \eval { $v };
This introduced an interesting thread on what Eirik Berg Hanssen referred to as anonymous scalars.
By the way, following an exhaustive explanation by Aristotle Pagaltzis, Bob eventually went with:
$v = \do { my $copy = $v };
AUTOLOAD
on non-existing tied hash methods
Bulk88 asks
whether AUTOLOAD
should be called when some methods of a tied
hash do not exist.
Chas. Owen shared code demonstrating that the only function that must be
handled is TIEHASH
.
Possible optimization for tied hashes
Bulk88 wondered
about possibly calling scalar keys
on a tied hash instead
of iterating over the keys and values, which would result in a nice
optimization. Tony Cook explained why it will not be suitable for the
purpose.
stat
with an array
Following a ticket
raised by Jozef Mojzis, Dan Collins
writes
that a side effect of fixing at
least two crash-inducing bugs, stat
no longer works on arrays. He suggests
declaring it a WONTFIX or add a warning and a note in
perldelta.
Father Chrysostomos adds
a reasonable use-case for stat(@_)
and suggests
researching additional usages before attempting to fix it, one way or the
other, and Eirik Berg Hanssen offers reasons to both document in perldelta
and raise a warning.
Dan Collins provided a patch to warn with a documentation change.
Stack overflow with XS_RETURN
?
Following a ticket
(mentioned above) referring to the x
operator causing a segfault, Bulk88
wondered
whether the XS_RETURN
family of macros could introduce a segfault
as well. Dave Mitchell explained how it would not be possible, but added
that maybe we should make it explicit in the documentation and by adding
assertions.
Negative values in XSRETURN
Doug Bell asked
what should happen when XSRETURN
receives a negative
value as a parameter. Dave Mitchell, backed by H. Merijn Brand (Tux),
suggests that since this corrupts the stack, an assertion should be added.
A patch provided and being smoked.
Branch cleanups
Dave Mitchell sent another email with a list of temporary Git branches to be deleted, asking people to prune them.
News
perl 5.23.3 released!
Peter Martini released perl 5.23.3.
His epigraph follows:
Little of of all we value here
Wakes on the morn of its hundredth year
Without both feeling and looking queer.
In fact, there’s nothing that keeps its youth,
So far as I know, but a tree and truth.
(This is a moral that runs at large;
Take it. — You’re welcome. — No extra charge.)
-- The Deacon’s Masterpiece or The Wonderful "One-Hoss Shay": A Logical Story
Oliver Wendell Holmes
Encode 2.78 is out
Dan Kogai announced a new version of Encode. Biggest change is preloading the CP1252 encoding.
Thank you, great idea!
This is great Sawyer -- thanks for doing this!
I have two suggestions. First, there's a lot of content, and I guess there often will be. So how about a slightly more interactive format? I took your summary and did a rough conversion to a markdown format, then hacked together a script which converts it to a simple structured format.
This is clearly going to be a lot of work. I think it's more likely to keep getting done if there's a team who can take turns. You could set up a github organisation and use github pages to provide a basic website for the summaries. That way people could send PRs to expand / correct / fill in sections, and editors for the summaries could come and go.
Neil, I'm sorry you had to do this extra work. The summaries are already in Markdown format in the p5p-summaries repo. :)