These are all doomed modules: p5p refuses to take bug and security reports, most of them are unmaintained, and they are fixed and improved in cperl only.
Maintained by p5p, responsible: pjacklam
A regression in blsft (left shift) and refusal to fix it, showing a lack of understanding of float math.
Fixed in cperl only. In cperl we also heavily modernized all the bigint/bignum/bigrat/Big::* modules to use typed signatures.
There were several podcheck failures, which were promptly fixed. So they do react fast; their only problem is the lack of understanding.
maintained by p5p, responsible: nobody
Missing features, grave security problems with bless and tie, and other security problems on 64-bit with overlarge hashes and names. Fixed in cperl only.
tonyc recently replied to a similar but limited bugreport with a partial fix, which doesn't really help.
unmaintained by p5p, responsible: nobody, previously smpeters
p5p took it over from him, while smpeters refactored the test suite for a while. Many bug reports and patches are still sitting in the CPAN queue; no idea about the p5p queue, because they blocked me there.
Development stalled, so I added the most-needed features and fixes. Fixed in cperl and at https://github.com/rurban/Net-Ping
unmaintained by nwclark, responsible: nobody
Development and maintenance stalled, many fixes unanswered. I had to publish all fixes to CPAN as an UNAUTHORIZED RELEASE.
Fixed in cperl.
unmaintained by rgs
Development and maintenance stalled, many improvements ignored. I had to publish my own version, Opcodes. The maintainer only wants Opcode as a parent module for Safe, not to specify Opcode properties. This is hardly an acceptable point of view.
maintained by p5p
Basically this module is only plagued by old sins, exporting all symbols at once and not by request. But the maintainers obviously don't know about this and accepted a bug report that a new symbol is not automatically exported. This is a feature, not a bug.
The unsafe tmpnam is still exported, but at least deprecated. The modern tmpnam replacements tmpfile, mkstemp and mkdtemp are missing.
POSIX::1003 should be used instead.
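Until the POSIX module catches up, the core File::Temp module provides safe equivalents for the missing functions; a minimal sketch (File::Temp names, not POSIX ones):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# tmpfile/mkstemp replacement: create and open atomically,
# avoiding the tmpnam race between name generation and open
my ($fh, $filename) = tempfile(UNLINK => 1);
print $fh "scratch data\n";

# mkdtemp replacement: a private temporary directory,
# removed again when the process exits
my $dir = tempdir(CLEANUP => 1);
```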
maintained by pjf; several private core fixes, several unfixed problems
This is a stable pragma with a complicated scoped feature set, which does not really work in corner cases. It broke the compiler with gensyms and empty stashes, and it broke Porting/sync-with-cpan: no autodie does not properly restore system() from CORE::. I had to remove autodie from cperl's improved sync-with-cpan, and I urge everybody else not to use it, as the errors are often invisible and undetected.
Using it is also considered bad practice: you need to know which open can fail and which cannot. It's only recommended for complete newbies, like use warnings and Carp, which introduce heavy memory regressions.
maintained by pevans, but this guy can usually be trusted.
The queue is quite small.
But I had to add the missing IPv6 constants in cperl for Net::Ping; a lot more are still missing for ping6 functionality. http://search.cpan.org/~rurban/Socket-2.021_01/
unmaintained by makamaka
unwilling to support Cpanel::JSON::XS, the successor to the broken JSON::XS
unwilling to support cross-compatibility with the other JSON modules; it throws several redefinition warnings if the loading order differs.
Unresponsive: we typically wait more than a year for answers to feature requests. Last release: 2013.
In cperl we fixed overloaded stringify and eq as in Cpanel::JSON::XS, esp. "true" => true, not 1.
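The boolean stringification issue can be seen with the core JSON::PP, standing in here for the XS modules (stock behavior shown; the cperl Cpanel::JSON::XS behavior described above differs):

```perl
use strict;
use warnings;
use JSON::PP;

my $data = JSON::PP->new->decode('{"ok":true}');

# The stock boolean singleton stringifies as 1, not as "true":
my $str  = "$data->{ok}";                    # "1"
my $same = $data->{ok} == JSON::PP::true();  # numeric comparison works
```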
now maintained by mallen
Not many problems, but horrible perldoc.3 vs perldoc.1 conflicts in the core integration.
Horrible core integration of Pod::Man and a broken $Maintainers{MAP}
Properly maintained, but they still think they are alone in this world, the same problem as with Ingy: that there is only one installed perl, and only it can run the tests. $^X exists for a reason.
I fixed their t/search50.t test in Dec 2015, but it is still not handled.
Since cperl will ship its own heavily modernized version, it's no big deal. We signature all the methods and get type violations at compile time, which has already caught many errors.
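The $^X remark in practice: a test that spawns a child perl should use the interpreter that is running the test, not whatever perl is first in PATH. A minimal sketch:

```perl
use strict;
use warnings;

# $^X is the path of the interpreter executing this very script;
# spawn child perls with it so multi-perl installs test the right binary
my $out = qx{"$^X" -e "print 42"};
die "child perl failed" unless defined $out and $out eq '42';
```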
maintained by the toolchain gang
I don't expect the toolchain folks to support the cperl variant, so I do it myself, and added various new features along the way:
fixed wrong Config dependencies,
support PERL_USE_UNSAFE_INC=1,
call darwin dsymutil with DEBUGGING,
use the fixed and faster YAML and JSON cperl core parsers.
See below at Problems in core toolchain cpan modules; fixes were kept private.
maintained by exodist, very sceptical
The big Test2 feature he has been working on for 2 years is event-based streamable tests, which is still 2x slower so far. So a no-go. See http://search.cpan.org/~exodist/Test-Simple-1.302014_007/lib/Test2/Transition.pod
The old implementation is too heavy; we simplified and improved it internally. Since cperl will ship its own heavily modernized version, it's no big deal. We improved many abstraction methods, signatured all the methods, and get type violations at compile time, which has already caught many errors. Some of our type checks will break your tests, esp. that the skip count needs to be a number.
maintained by gisle, no response
Core traditionally had a wrong U32_ALIGNMENT_REQUIRED probe in Configure.
I added a proper probe to fix the intel architecture, and added compatible SIGBUS on alignment fails, to detect previously undetected alignment errors. Fixed 64bit and darwin multiarch probes. Fixed RT #77919
https://github.com/rurban/digest-md5/tree/intel-align-rt77919
unmaintained by alh
I had to fix the 5.16 binary names myself, which should have happened in the last 8 years, as p5p pushed for binary names, and I opposed it.
cperl also supports @INC without . (-Dfortify_inc)
See https://github.com/rurban/Devel-PPPort/tree/516gvhv
maintained by pevans
Many core features are improperly supported, and many features are missing, so people had to come up with List::MoreUtils. The current maintainers do not know about MULTICALL and lexical $_, the most trivial features needed for this module.
The recent div by zero fix was done much worse than in cperl.
At least he is responsive.
Only minor issues. This is the only core module published as .tgz, so it cannot be updated by p5p with sync-with-cpan. Only fixed in cperl.
Basically all toolchain modules which lack cperl support. Esp. the 'c' suffix version detection and comparison, and the lack of knowledge about cperl builtin modules without .pm: XSLoader, DynaLoader, strict, coretypes. The old toolchain still wants to install a missing XSLoader requirement, because it searches for modules statically, not dynamically.
CPAN-Meta-YAML, CPAN-Meta-Requirements, version, CPAN, CPAN-Meta, Parse-CPAN-Meta, EU-MM
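The 'c' suffix problem is easy to demonstrate with stock version.pm (the exact error text may vary between releases, so only the failure itself is asserted here):

```perl
use strict;
use warnings;
use version;

# cperl versions carry a trailing 'c' (e.g. 5.24.0c); stock version.pm
# refuses to parse them, so every toolchain module relying on it chokes
my $ok = eval { version->parse('5.24.0c'); 1 };
# $ok stays undef; $@ holds an "Invalid version format" style error
```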
The biggest problem is over-architecture, as the toolchain is typically the gathering place for all the important people too incompetent to do core. The old master David Golden is the only light in the dark.
Another problem is the lack of efficient JSON and esp. YAML support, and lack of YAML compliance. They gave up on YAML instead of fixing it, as we did.
With cperl we added fast XS variants for JSON and YAML, fixed the broken and overly strict YAML::XS, and use this throughout. And not the infant YAML::Tiny, which also deviates from the YAML specs. cperl added JSON and YAML validation tests for all existing JSON and YAML modules.
I've looked over all my >100 distroprefs patches, and in the end problems are only with ether and schmorp. schmorp at least knows what he is doing and eventually comes up with fixes by himself.
maintained by ether
Moose was compilable before ether took over; since then it has eroded. The maintainer has no idea how compile time vs run time works, which should be critical understanding for such a compile-time optimized module.
A lot of compile-time code is not run in BEGIN or CHECK blocks, rather as main_cv code of compile-time imported modules, which does not work when separated into compiled and run-time.
I didn't even bother to file a ticket, as first, compared to Mouse, Moose is totally unusable, and second, the maintainer might be willing to fix it, but is not able to maintain this.
maintained by ether
The new maintainer is not able to come up with 5.16 support (binary names), and has refused to merge my fixes for several years.
maintained and taken over by ether. I suspect a hostile takeover.
The new maintainer was not able to maintain API stability for OP::parent, one of the biggest reasons this module exists, and lied about the reasons for taking it over. A complete disaster.
I had to publish my own stable B::Utils1 module.
So in summary, only p5p and ether are failing to do their job.
With the core cpan modules there are some typical outliers, nothing dramatic. cpan itself looks very healthy, whilst p5p looks totally dead.
Now a new problem has arisen on the horizon: p5p dared to fix another bug, the inability to store overlarge strings. This would be okay if they had taken our fixes, which fix the security problem and all overlarge data >I32, which includes arrays and hashes also. But they didn't, so I have to complain again. People apparently don't like me complaining about p5p mistakes, but you should be concerned about what p5p is doing.
See perl #127743 for the wrong fix, adding 2 new ops where only one is needed; it only addresses a third of the overlarge problem, and not the security problems. There's no need for them to deviate from the API with a worse fix.
And compare that to the real fixes at cperl, which I sent to Tony some time ago, but apparently with no effect. So they are playing dumb again.
Note that this is now a security problem they are refusing to fix.
Note that I cannot fix that for you on CPAN, as Storable is maintained by p5p. It's only fixed for us. A fair warning.
And for the ones who are wondering: I'm not able to even post the fixes to the proper ticket and list, as they blocked me there. The reason for the block was entirely fabricated, normal censorship to silence criticism.
In 2013 Rafael Garcia-Suarez posted to p5p a plea, Salvaging lexical $_ from deprecation, arguing that the recent deprecation of my $_ is wrong and can easily be fixed.
He explained the problems pretty nicely on the language level, but forgot to explain the internal advantages of a lexical $_ over a global or localized $_. To add that, it's pretty simple: a lexical $_ is an indexed slot in the pad array, calculated at compile time, whilst a local or global $_ is a hash entry in the namespace. It's much faster to look up, write and restore. The slot index is compiled and carried around in the op. That's why it's called lexical: it is resolved at compile time, not at run time.
The best technical reason why a lexical $_ is needed is given by davem, the only smart guy in the room here, at RT #94682: given uses an implicit my $_ internally, and every match uses it.
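The pad-slot versus symbol-table difference can be seen directly in the optree with the core B::Concise module; a minimal sketch using a named lexical and a named global (since my $_ itself is no longer accepted by stock perls):

```perl
use strict;
use warnings;
use B::Concise ();

# Dump the optree of a sub using a lexical: the access compiles to a
# padsv op, whose pad index is baked in at compile time
my $lex = '';
B::Concise::walk_output(\$lex);
B::Concise::compile('-exec', sub { my $x = 1; $x })->();

# The same with a package global: a gvsv op, i.e. a symbol-table lookup
my $glob = '';
B::Concise::walk_output(\$glob);
B::Concise::compile('-exec', sub { our $y = 1; $y })->();
```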
If you try to read the answers by p5p you will get headaches, as their explanations are not only clueless and wrong, they even argue that using $_ lexically is semantically wrong! E.g. rjbs: "I think it is half-baked, confused, and confusing. I don't see how it can be made useful or straightforward, at the language level."
He clearly is confused because of the tricky bugs in the ops he is talking about, but here he was arguing that the semantics are broken(!). Using $_ explicitly or implicitly, or global, local or lexical, makes no difference, except to rjbs and some other porters.
Doy for example argues that my $_ should be removed because this example is broken:
my $_ = 'foo'; print any { $_ eq 'a' } qw(a b c)
which I fixed in 10 minutes by using the lexical $_ in the XS implementation and not the global. See the simple patch at RT #113939.
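For comparison, the same construct works fine with the default global $_, here using the core List::Util rather than List::MoreUtils:

```perl
use strict;
use warnings;
use List::Util qw(any);

# any aliases the global $_ to each element in turn
my $found   = any { $_ eq 'a' } qw(a b c);   # true
my $missing = any { $_ eq 'z' } qw(a b c);   # false
```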
The underlying problems with the implementation of my $_ were quite trivial to fix. I just had to fix the initial check for OPp_TARGLEX in the cperl commit 597e929c9 from August 2015, which I publicized quite well. But the message obviously didn't arrive at the p5p team, because they decided that my $_ cannot be fixed and needs to be removed at the end of August, this time without any deprecation period.
One other interesting new answer was by zefram: that this bug is not fixable, because we'd need support for lexical $_ in every op which supports implicit $_ or takes blocks, such as grep and map; obviously not knowing the codebase, as this is exactly what we have been doing quite efficiently since 2001. There is e.g. a special lexical grep bit. A git grep GREP_LEX would have helped; it was added in 2004 as a special bit to grep.
I think there's no way forward that looks at all similar to what we have. Any lexical topic variable would at least have to be selected on a per-construct basis: you can't have "my $_" causing grep and the like to use a lexical topic. A separate "grep_lexical" operator should be fine, but if we go that route we should consider forcing the lexical topic variable to have a name other than "$_" to avoid the confusing shadowing. Actual "my $_" should be no more valid than "my @_". - June 2015.
There was not a single answer to that clear message. In retrospect I also wouldn't know what to answer to so much nonsense in a single paragraph.
I concur with rgs that "In my opinion, the more important problem here is the impression that P5P is throwing away without much thought a perfectly nice and modern language feature (for some value of modern that means "post-FORTRAN"). This could give the impression of a lack of vision for Perl 5 (and reinforce the "perl is dead" death spiral as perceived by the outside world -- the Perl users)."
Well said, and I knew it beforehand, before it became true. But it's pretty easy to foresee such a disaster looking at the track record this team is piling up, and how they react to criticism, reviews and help. Their latest attempt to come up with an unwritten religious CoC rule, that you need to have faith in the maintainers in order to talk to them, is the final sign of "Death by Code of Conduct". It's pretty hard to have faith in someone who decides 90% on the wrong side and only by luck sometimes makes a right decision. Initially I thought this code of conduct could finally stop all the silly bullying and name calling, but it got even worse, and whenever I complain about such abuse p5p punishes the one who complains, which is clearly only in the interest of the p5p buddies maintaining the powers they are abusing, and not in the interest of anybody else.
Now that p5p has decided on this schism and gone forward with more totally wrong and outrageous decisions doing more and more harm to the perl5 language, I can only recommend using cperl, the perl5 implementation where such grave mistakes do not happen, where perl5 development actually continues in the spirit from before p5p took it over, and where the p5p principles of ruling by incompetence, power and abuse are not tolerated. cperl decisions are made rationally and professionally, and most of all "cperl is not a religion": you are allowed to show distrust and criticism. And in cperl such bugs actually get fixed, and the language is not harmed.
You will also hopefully not see those incredibly silly mailinglist threads as the cited one above. It's atrocious. We can only hope that p5p is put to an end soon, and a proper development process can start. There's no way this can go forward like this.
Maybe I should finally mention that undoing the removal of the lexical topic is not done merely by skipping the removal. The new CX code e.g. assumes that all the $_ there are global, that no lexical $_ can exist, and fixing this already cost me a few hours when merging with 5.24.0. It's pretty hard to deal with artificially stupidified code. At least the new 5.24 CX code is so wonderful that it's worth it. +20%
libyaml upstream now has a patch with the new options NonStrict and IndentlessMap: https://github.com/yaml/libyaml/pull/8
YAML::XS in https://github.com/ingydotnet/yaml-libyaml-pm/pull/43
YAML::XS now writes proper YAML which can be read with YAML.pm, and passes the CPAN::Meta validation tests. See https://github.com/Perl-Toolchain-Gang/CPAN-Meta/pull/107
For CPAN::Meta I've added validation tests for all existing YAML loaders, so you can see what's going on, and which version is conforming or failing. YAML::XS, YAML::Tiny, CPAN::Meta::YAML and YAML::Syck now pass; YAML fails.
I also have patches for Parse::CPAN::Meta and CPAN-Meta-YAML to use the new versions, but only in cperl, no PR yet.
I've also started merging libsyck from upstream into YAML::Syck, which had accumulated some horrible private extensions, and made those mergeable upstream. But this work is still ongoing at https://github.com/rurban/syck/commits/0.71 and https://github.com/rurban/YAML-Syck/commits/merge-upstream.
It is a mess, I admit, but easier to fix than the YAML::XS mess. So I took libsyck upstream, which is at 0.70, and merged it with our changes, which are at 0.61. Our perl-specific changes are a complete mess, so I cleaned them up to be acceptable upstream in a new 0.71.
merge back various changes from upstream (my own WIP version 0.71)
add proper type casts
sanitize various unmergeable hacks into proper flags, which can be set perl-specifically:
add emitter->nocomplexkey flag, default=0, 1 for perl.
and rename scalar2quote1 to scalar1quoteesc (JSON singlequote as single-quoted with dq-like escapes)
remove some other unmergeable hacks:
syck_base64enc requires an ending \n
YAML::Syck has many advantages over YAML::XS. It supports reading from and writing to file streams, which means it does not need to slurp each file into a buffer and process that; it can process streamable buffers. libyaml can do that also, but YAML::XS never implemented it. I only added a LoadFile method, but not DumpFile.
YAML::XS doesn't really use the nice architecture libyaml provides; it rather does its own perl-specific callbacks, bypassing many advantages of libyaml.
libsyck is much better written than libyaml, no question about that. It has far fewer bugs and many more options, but it got stuck at YAML 1.1. Does anybody really need YAML 1.2? I haven't checked the changes yet.
My changes (still WIP) are at:
So now I'm pondering convincing everybody to ditch YAML and YAML::XS completely in favor of YAML::Syck. Let's see how this will turn out... In fact it's only a tiny patch to CPAN, and I can do that on my own, since CPAN is in core.
My core integration for YAML::XS is at:
What I need now is a good YAML test suite which merges the validators required by core (CPAN::Meta) and various interop testing as I did with Cpanel::JSON::XS, esp. roundtrips; to add the perl module back to syck to give it into sane hands (this might be tricky as it involves testing with ruby, php, python, ...); to do benchmarks; and to go over the tickets.
What I know is that YAML.pm processing of my cpan prefs is ~10x slower than with YAML::Syck. The current performance is unacceptable, and ditto YAML::XS emitting unindented seq elements for a map child. Maybe I have to fork YAML::XS into a Cpanel::YAML::XS, but most of the fixes need to be done in libyaml itself, so let's see how fixing syck turns out.
.ini, .json and .xml.
But Houston, we have a problem. For a long time. I'll fix it.
We have the unique advantage that the spec author and maintainer, Ingy, is from the perl world and maintains the two standard libraries: YAML, the PP (pure perl) variant, and YAML::XS, the fast XS variant based on LibYAML.
This would be an advantage if those two libraries agreed on their interpretation and implementation of the specs. They do not.
Historically the YAML library is used as the default reader for CPAN .yml preferences, and a fork of YAML::Tiny, CPAN::Meta::YAML, which is in core, is used to read and write the package META.yml files.
The basic idea is to use the fastest library available and use a PP fallback for systems which don't have the fast variant. perl5 core does not ship a proper fast library for JSON and YAML, so you have to stick to the 10x slower PP variants. cperl will ship with YAML::XS and Cpanel::JSON::XS in core, so there this problem is gone.
But we still have the YAML problem: YAML, the default reader for CPAN, refuses to read .yml files produced by YAML::XS. You can only set YAML::Syck as yaml_module in ~/.cpan/CPAN/MyConfig.pm; using YAML::XS will get you into trouble. But YAML::Syck is not maintained anymore. It was written by _why the lucky stiff, also the author of potion, the VM for p2. It still kinda works, and it behaves better than YAML::XS, but it would be better to replace libsyck with libyaml after all, and get the YAML maintainers to fix their mess.
The fault is in the YAML::XS (i.e. LibYAML) dumper and in the YAML loader: YAML::XS writes .yml files which YAML cannot read.
YAML supports scalars, arrays (called sequences) and hashes (called mappings).
The current problem is the interpretation of the spec, in the current version 1.2, section 6.1 Indentation Spaces.
YAML::XS writes the elements of a sequence without indent, while all other YAML libraries expect an indent.
I.e. for {author => ['perl5-porters@perl.org']} YAML::XS writes
author:
- perl5-porters@perl.org
while all other libraries and the spec insist on at least a space before the -, the seq entry:
author:
  - perl5-porters@perl.org
"Each node must be indented further than its parent node. All sibling nodes must use the exact same indentation level. However the content of each sibling node may be further indented independently." http://yaml.org/spec/1.2/spec.html#id2777534
But in the meantime all other YAML loaders came to accept Ingy's interpretation on the seq indentation level, and do accept the missing seq indent. Just YAML not. YAML throws a MAP error. This is certainly a YAML loader bug.
Remember that YAML is the default reader in the CPAN config; all it needs to do is load the yaml. Which is broken.
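For illustration, the zero-indented form can be fed to the core CPAN::Meta::YAML, one of the loaders that accepts it (a hedged sketch; the sample document below is mine, not from any real META.yml):

```perl
use strict;
use warnings;
use CPAN::Meta::YAML;

# A sequence whose '-' sits in the same column as its parent key,
# the style YAML::XS used to emit:
my $yml = "---\nauthor:\n- perl5-porters\@perl.org\n";

my $doc  = CPAN::Meta::YAML->read_string($yml)
    or die CPAN::Meta::YAML->errstr;
my $addr = $doc->[0]{author}[0];
```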
All this has been known for a long time: Szabo wrote about the inconsistencies, and p5p put a variant of the better YAML::Tiny into core as CPAN::Meta::YAML. This is fine, but in the long run a fast library in core is preferred, and that's what I'm doing for cperl.
So what needs to be done:
Change yaml_module in ~/.cpan/CPAN/MyConfig.pm to either CPAN::Meta::YAML, YAML::Syck or YAML::XS. All of these can read those YAML files. YAML cannot, until it's fixed.
Fix YAML::XS to dump seq elements with indentation, as all the other YAML libraries do, and as the spec says. I'm working on that.
Fix YAML to accept seq elements without indentation, to be able to read old YAML::XS files. I'm working on that.
Fix YAML::XS to accept spec-violating elements in a new NonStrict mode, because the other libraries write those elements, and a YAML loader should be optionally non-fatal on illegal control chars, illegal utf-8 characters and such. All other YAML loaders silently replace illegal elements with undef. I'm working on that in https://github.com/ingydotnet/yaml-libyaml-pm/issues/44
Ingy insists that all other libraries are broken, they produce wrong YAML. Which would be acceptable if the libraries and the spec at least would be consistent. They are not. And historically all successful YAML readers are non-fatal.
cpanel_json_xs now has the options yaml, yaml-xs, yaml-tiny and yaml-syck to use those libraries for reading and writing from the command line. This way you can easily demonstrate the various inconsistencies:
cpanel_json_xs -f yaml -t yaml-xs <META.yml >XSMETA.yml
cpanel_json_xs -f yaml -t yaml <XSMETA.yml
YAML Error: Invalid element in map
Code: YAML_LOAD_ERR_BAD_MAP_ELEMENT
And you can try all other variants, which do work mostly.
For YAML::XS the following needs to be done: with NonStrict, allow character errors (control chars, invalid unicode), throw a warning, replace the element with the partial read or undef, and continue parsing. This way you lose data, but NonStrict is optional and a fallback for local configuration files, which are better read partially than not at all. We cannot lose everything on roundtrips.
The perl5.18 implementation of COW strings is totally broken, as it uses the COW REFCNT field within the string. You cannot ever get to a true copy-on-write COW scheme this way. You cannot put the string into the .rodata segment as with static const char* pv = "foo"; it needs to be outlined as static char* pv = "foo\000\001". The byte behind the NUL delimiter is used as the REFCNT byte, which prohibits its use in multi-threading or embedded scenarios. In cperl I was working on moving this counter to an extra field, but the 2 authors made it impossible to write it in a maintainable way. I could easily separate the refcnt flag, but I couldn't make it COW yet.
But even if the COW implementation in the libperl run-time is broken by design, it can still be put to good use to store more strings statically than expected. The problem was that since 5.18, with this COW feature, compiled binaries needed 20% more memory, as I couldn't store the strings statically anymore and had to allocate them dynamically.
In a first attempt I saved some kilobytes of memory by removing the IsCOW flag and storing more strings statically.
But now I do the opposite. Since 5.20 and with -O2 I set the IsCOW flag on many more strings, store them not as const char* so that the cow refcnt can be updated, and rely on the automatic cow and uncow functions in the runtime to move such a static buffer to the heap when it is written to. I don't need to rely on LEN=0 anymore, which indicates a normal static string.
With a typical example of a medium sized module, Net::DNS::Resolver, 64bit not threaded, the memory usage is now as follows:
5.22:
pcc -O0 -S -e'use Net::DNS::Resolver; my $res = Net::DNS::Resolver->new;
$res->send("www.google.com"); print `ps -p $$ -O rss,vsz`'
pcc -O3 -S -e'use Net::DNS::Resolver; my $res = Net::DNS::Resolver->new;
$res->send("www.google.com"); print `ps -p $$ -O rss,vsz`'
rss
without -fcow: 12832
with -fcow : 12112
cperl : 12532
A 6% memory win for 5.22. Even better than with cperl.
The current distribution of .rodata, .data and dynamic heap strings with this example is as follows:
.rodata .data heap
-fno-cow (-O0): 305 1945 1435
-fcow (-O3): 110 2225 1024
cperl -O3: 107 2112 1001
Thus with -O3 we traded 40% fewer dynamic strings for 3x fewer .ro strings, but 14% more static strings. With cperl the improvements are not so dramatic, as cperl already has many more static optimizations.
COG means that the array of SV* pointers is allocated by the compiler statically, not dynamically, and that the cperl runtime creates a new array whenever the array is extended (copy-on-grow).
COW means that the array of SV* pointers is allocated by the compiler as constant, static data in the .rodata segment, and that the cperl runtime creates a new array whenever an element of the array is changed (copy-on-write).
With a typical example of a medium sized module, Net::DNS::Resolver, the memory usage is as follows:
pcc -O0 -S -e'use Net::DNS::Resolver; my $res = Net::DNS::Resolver->new;
$res->send("www.google.com"); print `ps -p $$ -O rss,vsz`'
rss
with avcow: 12720
without : 13456
A 5.8% win.
The numbers with a small example are as follows:
rss vsz
cperl5.22.2-nt-avcow 2536 2438744
-O3 2532 2438740
cperl5.22.2d-nt-avcog 3516 2451728
perl5.22.1-nt 3316 2438912
perl5.20.3-nt 3264 2438696
perl5.18.2-nt 3036 2438468
perl5.18.4d 4276 2450540
perl5.18.4d-nt 4120 2451332
perl5.16.3 4072 2458904
perl5.16.3-nt 3008 2438420
perl5.14.4 3168 2447764
perl5.14.4-nt 2944 2447540
perl5.14.4-nt -O3 2852 2447472
perl5.12.5 3440 2449964
perl5.12.5-nt 3244 2447716
perl5.10.1-nt 3172 2456836
perl5.8.9 3176 2465976
perl5.8.9d-nt 3096 2438400
perl5.8.5d-nt 3228 2456836
perl5.8.4d-nt 3176 2457792
Here you see that the previously useful perl version perl5.14.4-nt with 2852 kB is now finally made obsolete by cperl with an RSS of 2532 kB.
5.16 introduced binary symbols, and 5.18 added a completely broken implementation of COW strings, which forced all previously statically allocated strings to be allocated dynamically. This caused a 20% memory increase in 5.22, which we could only overcome with cperl, and some tricks in the compiler to disable COW strings at all.
Theoretically I can set all arrays as COW to get the biggest memory win, but at run-time all writes need to copy those arrays to the heap, which is a performance and memory loss. So I cow only the arrays which are very likely to be not changed at all. I.e. all @ISA arrays, the @INC and all READONLY arrays.
The current distribution with this example is as follows:
24 COW arrays of size 1, 2x size 2, 1x size 3, 1x size 9: 28 COW arrays in total.
11 COG arrays of size 1; overall 90 COG array sizes with max 169 elements, 89 COG arrays in total.
1338 arrays and 16562 SVs in total.
I haven't measured the hit and miss rate yet, and I haven't fixed COW or COG for other data types, such as strings or hashes. A big improvement would be proper COW or COG for strings of course, with an expected memory win of 10-20%.
As I said in my interview, it's my belief that if all current p5p core committers stopped committing their bad code it would actually be the best thing for the perl5 project. They weren't able to implement any of the already properly designed features from perl6 in the last 12 years, and every feature they did implement is just so horribly bad, making our already bad code base (which led to the reimplementation efforts of perl6/parrot with a better core) even worse. With cperl I can only undo a little, but when they start breaking the API and planned features in an incompatible way they should just stop.
Nevertheless, 5.22 added a significant improvement from outside, syber's monomorphic inline caching for method calls besides the internal improvement of multideref by Dave Mitchell.
Now to the problems I had to fix in the last months with that 5.22.0 release:
This is something I cannot fix in the compiler. I updated my perl patcher App::perlall with new --patches=Compiler patches to fix this, and cperl of course also has this fix.
I had to write a complicated probe mechanism for ByteLoader to check if the used perl5.22 version is already patched or not. Probing a to-be-built XS submodule is not that easy, a typical chicken-and-egg problem. I could use my already existing B::C::Flags helper config, which allows custom compiler settings. There I initialize the variable $B::C::Flags::have_byteloader to undef, and when the XS modules are all built I call a helper script to probe for a working ByteLoader and patch $B::C::Flags::have_byteloader to 0 or 1. I can then use this in the tests to skip or run the bytecode tests. And I had to put this helper script into the hints directory to keep it from being installed; messing with EUMM libscan() was too dirty for me.
The internal compiler op.c creates a new main or eval environment with newPROG(), setting the entry points PL_main_start and PL_main_root from the intermediate parsed PL_compcv. In the case of an empty source the parser always adds a final ; semicolon, which leads to an empty optree starting with OP_STUB.
But with commit 34b5495 for [perl #77452] the compiler now always adds a LINESEQ in front of the STUB, while the logic in newPROG for source filters which have already set up PL_main_start and PL_main_root was not changed, which led to a broken ByteLoader.
This is an interesting commit, as it added a lot of wrong comments about the inner workings of this, but didn't update the logic.
The fix in cperl is here and for perlall here, and my perlbug report did not get through.
I can only guess that p5p blocked me again, because they didn't like me calling them incompetent. Blocking bug reports and fixes is worse than mere incompetence, but I got used to that recently. They blocked my simple fix for the horrific double-readonly system, and they proudly announced last week some new optimization regarding faster arithmetic, but didn't look at my fast-arithmetic optimizations which I wrote half a year ago, and which make them look very bad in the end. Everybody applauded poor Dave for this "fantastic breakthrough". The guys really are that simple. Looking through my improvements would have wasted less time and would have improved it upstream by 30%, not just 10%.
Multideref merges sequential hash or array accesses into one compressed op. This is a pretty good compiler optimization, if only the B design were not so bad.
The upstream design of the new 5.22 B::UNOP_AUX::aux_list method deviates significantly from proper B design. aux_list requires the curcv to be provided, which is not trivial to do for a B module, and it needs this to resolve shared SVs beforehand. Requiring the curcv to resolve the padoffset is unneeded and does not help B or any of its clients. Clients need the padoffset, and resolving it (e.g. in B::Deparse) should be done in B, as with all other threaded and shared-SV accessing methods.
Thankfully I can patch most B bugs myself, and don't have to fork it publicly under a worse name. B is already a good enough name, and I don't want to deviate from that, even if p5p has consistently refused to maintain B properly in the last years. There was a short period a few years ago when I could work without a patched B, but it did not last long, and none of my fixes were applied, while other new horrific mistakes made it in.
Stashes can be aliased to separate namespaces, and the ENAMES API to access these names never made it into B, and thus never into a compiler. Namespace aliases are rather rare, so this caused not too much trouble, but now I added ENAMES and could thereby fix most of the remaining compiler limitations, even for 5.14.
I explained that technically in my interview. Currently we limit the max name length of lexical variables to 60, because we statically allocate the buffers for them. It is not a practical problem, and I'll optimize that sooner or later to smaller static structs.
HEKs (shared hash keys) are still dynamic, not static, but at least I could fix the remaining refcount issues.
The cperl code to support static HEKs is already there, but I still need to add compiler code and probes to support that.
5.22 has a wrong RV->FLAGS for a GVOP_gv pointing to a CVREF gv(cv ref:). It returns the flags for a GV (0x808009) where it should be just 0x801, a ROK RV. This broke in 5.22 because of a new optimization they did, which is of course wrong.
I haven't fixed that yet in cperl; I just added a workaround in the compiler with this patch.
Overall we are very happy with the new 5.22 compiler, though we are not yet using the much more advanced cperl optimizations. The B::C optimizations alone lead to ~20% less memory; with cperl and its compiled readonly hashes for Config and warnings, and its upcoming support for static GV/AV/CV/PAD/HEK layout, it's much more dramatic. This will be a real COW (copy-on-write) mechanism then, being able to statically allocate readonly buffers and copy them to the heap when they are changed. For the compiler we only need to ensure that static buffers are not freed, which is trivial with the added flag.
-m support for perlcc, compiling to modules rather than single binaries, is also improving. This can split various optimizations per module/.pm file, so we can use B::CC-compiled modules or even rperl-compiled modules. Compile times should go down from 20min to ~5min, with much faster smoker feedback, and pushing updates live is much faster because they will be much smaller. The old compile times were 2 hours.
But since fixing B::C for 5.22 took so much more time than expected, I couldn't add most of the planned cperl optimizations for the upcoming cperl-5.22.2 and B-C-1.53 releases.
The name cperl stands for a perl with classes, types, compiler support, or just a company-friendly perl, but currently it's only a better 5.22-based variant without classes.
Currently it is about 1.5x faster than perl5.22 overall, >2x faster than 5.14, and uses the least amount of memory measured since 5.6, i.e. less than 5.10 and 5.6.2, which were the previous leaders, while perl5.22 uses the most memory yet measured.
See http://perl11.org/cperl/STATUS.html and http://perl11.org/cperl/ for an
overview, changes and docs.
Detailed changes are at https://github.com/perl11/cperl/blob/master/Changes
./Configure -sder -Dusecperl && make -s -j4 test && sudo make install
wget https://raw.githubusercontent.com/cyrus-and/gdb-dashboard/master/.gdbinit -O .gdb-dashboard
sed -i 's,python Dashboard.start(),#python Dashboard.start(),' .gdb-dashboard
joe .gdbinit   # then add the following two lines:
source .gdb-dashboard
python Dashboard.start()
My MacBook Air consistently gives better results in my hash function benchmarks than my big Linux desktop PC, because it has a newer i7 Haswell, while the Linux box only has an older i5 CPU. Both have fast SSDs and enough RAM.
But when I run the perl5 testsuite the Linux machine is twice as fast, typically 530s vs 1200s, which is odd and very annoying.
And then I fixed it with one little change.
$ time ./perl -Ilib -MNet::Domain -e'print Net::Domain::hostname()'
real 1m0.151s
user 0m0.028s
sys 0m0.011s
$ hostname
airc
$ sudo hostname airc.local
airc.local
$ time ./perl -Ilib -MNet::Domain -e'print Net::Domain::hostname()'
airc.local
real 0m0.039s
user 0m0.027s
sys 0m0.008s
You see that in the first run Net::Domain::hostname didn't return a value: it timed out. Great for testsuites and benchmarks.
For MacOS the code in Net::Domain just calls hostname, which is fast, and for darwin it calls sys_hostname. But this is fast also. So what else is going on?
sub domainname {
    return $fqdn if defined $fqdn;
    _hostname();    # sets the package variable $host as a side effect
    # *.local names are special on darwin. If we call gethostbyname below, it
    # may hang while waiting for another, non-existent computer to respond.
    if ($^O eq 'darwin' && $host =~ /\.local$/) {
        return $host;
    }
    # ... (rest of the sub elided)
I cannot beat this magic, so I changed my hostname on my laptop. Problem solved. Aargh
Elapsed: 637 sec
He struggles with his new super-op OP_SIGNATURE, which is on second thought a better idea than the old way to assign lexical values from the call stack at the beginning of subroutines, except that it cannot take the stack values directly: it has to go through an intermediate @_ copy. But that is just an implementation detail which can be optimized away. He then goes on like this:
Also, every op/optree related internals change now involves fixups to B and Deparse, so every change is that much more work.
If I could travel back in time and stop Malcolm B. writing B and friends, I would in an instant. Perl now would have been far, far better, and probably a lot more truly extensible than it is now.
The realisation that B (and B::C etc) were a failed experiment was one of the drivers of perl6. It's been an albatross round perl5's neck ever since.
Dear Dave, are you completely out of your mind?
I know that you all don't want to maintain your reflection API and AST. p5p was also not able to maintain B::C and the other 2 compilers, so I had to step up and fix it for them. Even if p5p is unwilling to fix the outstanding problems they create and is still doing more damage than good, B::C is a huge success story.
B::Generate, optimizer and types are not so easy to fix, because there the damage done by p5p is so outrageous that it can only be fixed in a forked version of perl. I still have to maintain patchsets in my perlall build suite (perlall build --patch=Compiler) to be able to create perls without those roadblocks, and to support windows, because p5p is not willing to export the needed API functions.
Nick Clark really claimed publicly that changing function bodies at run-time is too dangerous when used with concurrent threads. Let that be the problem of the optimizer, not yours. By further blocking dynamic optimizations B::Generate is worthless, and type- or profile-based optimizations cannot be done. Do you have an idea why javascript, an even more dynamic and worse language than perl, could be optimized so much? Apparently not.
B::C passes the complete core testsuite. B::C compiled code is faster and smaller than uncompiled perl. Look at the B::CC and rperl benchmarks. We are close to v8 and in the next step we will be there.
B::C is successfully used in production by cPanel, which is in fact the largest and most successful company using perl. We just don't shout it out as loud as Booking.com, because we are privately held and don't need to publish our numbers. Nevertheless cPanel is de facto one of the backbones of the internet, used by ca. 70% of all webhosters worldwide, with compiled perl applications and its distribution based on CentOS.
We have to compile perl to get a low memory footprint for our daemons. They need to be smaller than at least apache and mysql, and they need to run on hosting VMs which are low on memory.
Even with p5p's inability to come up with non-bloated versions of their releases, and their inability to come up with any improvements since 5.6, B::C is a huge success.
Dave and Nick, you are the real albatross around perl5's neck, and have been for years. Finally stop doing your destructive work, step back and let the people do the work who have a track record and an idea of what they are doing. You both got paid for years to improve perl5, and the results are hopeless. One year to fix eval in regex? My dear.
Still not a single feature written and discussed by p5p was ever successful, besides the trivial defined-or syntax. The only non-trivial improvement in the last years came from outside and was initially heavily criticized and not understood. ("Why do we need another hash table implementation?") But will this eventually lead to an efficient implementation of classes, roles (mixins), polymorphism and types? Or a better runloop? For sure not. This has been shot down since 2002, and everybody who was able to do it and was interested left p5p. Not to talk about fixing the easy stuff like smartmatch, switch, given/when or even hash tables. You just gave up.
I have to do my work now behind closed doors.
Stop doing your destructive work, start listening to advice and maybe even implement a good feature or library. Like OP_MULTIDEREF which is nice. Even if it just tampers around the fact that the runloop is too big and slow.
Without B perl5 would have been more successful?
What a ridiculous statement. Inspecting the AST, the optree, how perl compiles its ops? Certainly totally outrageous. Nobody would need that. You even refused to accept documentation for the optree. You are just bitching about your need for B::Deparse and B::Concise test updates. Who needs precise optree representations? It's just an implementation detail. The functionality needs to be tested, not how it looks internally.
But maybe you'll eventually learn what the optree (i.e. the AST) looks like, when you are forced to update some B modules. I'll do the rest for you anyway, for the parts you do not understand.
Realize that your work looks worse than PHP.
DaveM now introduced a new OP_SIGNATURE which assigns run-time args according to the compiled signature.
It basically speeds up these compiled checks
sub f ($a, $b = 0, $c = "foo") {};
=>
sub f {
    die sprintf("Too many arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ <= 3;
    die sprintf("Too few arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ >= 1;
    my $a = $_[0];
    my $b = @_ >= 2 ? $_[1] : 0;
    my $c = @_ >= 3 ? $_[2] : 'foo';
    ();
}
into one OP which does the same, similar to the new MULTIDEREF: moving op chains into C. DaveM is now the go-to guy for the big DWIM ops, compressing previous op chains into a single one.
This is far too much for a single op, but previously it was all handled either in ENTERSUB or in user code.
The arity checks should be done in the call op (ENTERSUB), the local assignments should be separate ops, we still have no syntax support for types which could previously be used in user code (my int $i = $_[0];), and now we even need type hooks. These could also be added after the SIGNATURE op, but make not much sense there, as side effects would appear too early.
XS calls do not need the checks, as they do their own, and XS calls are easily detected in ENTERSUB.
The assignment to the local lexicals is now buried in this single OP, which makes it impossible to change the currently only supported call-by-value to the faster call-by-reference. I.e. there's still no support to declare call by ref
sub myinc (\$a) { $a++ }; my $i=0; myinc($i); print $i; # => 1
so you still have to use $_[0] directly, which means @_ still needs to be filled with all args, which makes every signature usage still twice as slow as normal calls without a signature declaration: once for @_ in ENTERSUB and a second time for the named args in SIGNATURE.
So this new OP basically just hides this new slowness by design (blame Zefram for this idea) by bypassing the normal ops which assigned the locals.
Any optimizing compiler now needs to replace this new SIGNATURE op and cannot work on the optree. Fine, it cannot be used as is anyway.
Compiled or run-time polymorphism (dispatch on argument types) now needs to replace SIGNATURE and not ENTERSUB. There's not much difference, both are horrible ops to work with. SIGNATURE is probably easier to replace, but replacing ENTERSUB had its advantages by leaving out all the unneeded recursion, @_ and debugger code. So basically you have now to replace both.
Of course there's still no type support, and still no return-type declaration syntax, though it seems the post-declaration attribute list can now be used, as :const is now supported, just for anon subs only.
So real subs can soon look like:
sub myinc (int \$a) :int { $a++ }
and you can use the faster i_ops for the result, and since it's a reference, for the lifetime of the caller's variable.
Just don't expect that from p5p in the next 5 years. All the other dynamic languages (python, ruby, php, javascript) announce these features officially, while I have to implement them in private.
p5p still has no idea what they are doing, but will probably also announce it as a great breakthrough, as they did with the Zefram signatures before. Which is somewhat funny: announcing the worst of all existing signature implementations as a positive. People bought that, so it will work now too.
So far the biggest breakthrough lately, besides the new fast METHOD ops (THANKS!), was to go for the :const attribute for subs (THANKS!), so the other syntax possibilities => type, perl6-like returns type, or is ro are now very unlikely to appear. This was a great decision, even if it was made unconsciously, and I can finally go forward.
In order to find out with which perl this distro was built, we need to parse the generated Makefile.
Recent EUMM 7.0x introduced a new feature which broke all my scripts: they started double-quoting PERL and FULLPERL in the generated Makefile. The damage is already done; the only thing you can do is to remove the quotes.
PERL=`grep "^PERL =" Makefile|cut -c8-`
PERL=${PERL:-perl}
PERL=`echo $PERL|sed -e's,^",,; s,"$,,'`
They obviously were afraid of spaces in Windows paths. But only cmd.exe accepts "cmd"; no other shell does. So the obvious fix would be to add double quotes on Win32 only, and only if a space appears in the NAME or the PATH. Same as we have to do with $^X in system calls, where we have to double-quote $^X explicitly in string context. Like with
$X = $^X =~ / / ? qq("$^X") : $^X; system("$X ...")
Initial feedback from the maintainers was not positive; they don't care. EUMM needs to write Makefiles, nothing else.
The second reply was: just use sh -c $PERL $args. Yeah. Exactly.
So I fear the toolchain now also starts rotting with the newbies taking over. Test::Builder is also in great danger with a newbie maintainer: the initial trials were twice as slow, in order to be able to support streaming. Given that p5p has similar technical problems, it doesn't look too good for 5.2x being usable anytime soon. I'm still forced to use 5.14.4.
Let's just hope CPAN will not get new maintainers.
My fix: https://github.com/rurban/perl-compiler/commit/16379cf29cbffdf8ffce9d0822af0548cfb65051