Call it witchcraft if you like, but we identified the line to blame within the first few minutes of looking at the problem. Unfortunately, we could not convince each other that it was the culprit, and because the problem only showed up in a long-running soak test, we could not justify running it.
Perl's garbage collection works by reference counting, so circular references are only freed at exit. As we were dealing with a long-running daemon, we started by trying to locate circular references. Inspecting the code gave nothing away, so we decided to use Paul Evans' wonderful Devel::MAT module. Unfortunately, it did not turn up any circular references either.
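For readers who have not met this before, here is a minimal sketch of the kind of leak we were hunting for. The Node class and the $destroyed counter are made up for the illustration; only Scalar::Util's weaken() is a real core function.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # illustration only: counts DESTROY calls

{
    package Node;     # hypothetical class, not from our code base
    sub new     { my ($class) = @_; return bless {}, $class; }
    sub DESTROY { $destroyed++ }
}

# Two objects pointing at each other: the reference counts never
# reach zero, so neither object is freed when the scope ends.
{
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{peer} = $child;
    $child->{peer}  = $parent;
}
my $after_cycle = $destroyed;

# Same shape, but the back reference is weakened, so it does not
# count towards the parent's reference count.
{
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{peer} = $child;
    $child->{peer}  = $parent;
    weaken($child->{peer});
}
my $after_weak = $destroyed;

print "freed after cycle: $after_cycle, freed after weaken: $after_weak\n";
```

In a daemon the first pair would stay in memory until the process exits, which is why circular references were our first suspect.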
Finally, we decided to look at how Perl sees our code, using B::Deparse and the opcodes shown by B::Concise. Somewhere deep down we started to suspect some Coro magic; as you will see later, that suspicion was completely unnecessary. It was not Coro, and frankly Coro has lots of potential; I believe it should be in Perl's core to assure its future (though I know Marc Lehmann is of a different opinion).
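As a rough sketch of what "looking at how Perl sees our code" means, B::Deparse can turn a compiled sub back into source text. The do_thing sub here is a stand-in for the real code, not what we actually ran.

```perl
use strict;
use warnings;
use B::Deparse;

# Stand-in for the sub under suspicion.
sub do_thing {
    my ($n) = @_;
    my @list = ("x") x $n;
    my $logline = "Returned " . join(",", map { "$_/" } @list);
    return length $logline;
}

# -p asks the deparser to add parentheses, which makes the order
# of evaluation explicit.
my $deparser = B::Deparse->new("-p");
my $source   = $deparser->coderef2text(\&do_thing);

print "sub do_thing $source\n";
```

The same view of a whole script is available from the command line with perl -MO=Deparse script.pl, and perl -MO=Concise script.pl prints the opcode tree instead.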
Back to the story. By the end of the day we had found nothing, and nothing we had tried had helped. I suggested commenting out the line we had identified within the very first couple of minutes, which we did, and we left the soak test running overnight. In the morning, much to our surprise, we found it was no longer wasting memory. For interested souls, it was a line similar to the following: log_debug( 'Returned ' . join(", ", map { "$_" } @list ));
(Yes, ironically, it was an unnecessary call, as log_debug() does nothing in the normal mode we were running.) Nonetheless, consider the following dummy cut-down script:
#!/usr/bin/perl
use 5.14.0;
use warnings;
use Data::Dumper;
use Devel::MAT::Dumper;
sub log_it
{
    my ($line) = @_;
    return length $line;
}
sub do_thing
{
    my ($n) = @_;
    my @list = ("x") x $n;
    # large memory waste:
    my $logline = "Returned " . join(",", map { "$_/" } @list);
    my $length = log_it($logline);
    return $length;
}
my @results = map { my $ln = $_ * 10000; do_thing($ln); } (1..10);
say Dumper @results;
Devel::MAT::Dumper::dump("/tmp/map-leaker.pmat");
which shows a memory hog in the Devel::MAT interactive explorer (sorted by size):
$ pmat-explore-gtk /tmp/map-leaker.pmat
Interestingly, the following does not end up wasting memory:
#!/usr/bin/perl
use 5.14.0;
use warnings;
use Data::Dumper;
use Devel::MAT::Dumper;
sub log_it
{
    my ($line) = @_;
    return length $line;
}
sub do_thing
{
    my ($n) = @_;
    my @list = ("x") x $n;
    # small:
    my $logline = join(",", map { "$_/" } @list);
    my $length = log_it("Returned " . $logline);
    # large:
    #my $logline = "Returned " . join(",", map { "$_/" } @list);
    #my $length = log_it($logline);
    return $length;
}
my @results = map { my $ln = $_ * 10000; do_thing($ln); } (1..10);
say Dumper @results;
Devel::MAT::Dumper::dump("/tmp/map-leaker.pmat");
nor does the following:
# also small:
my $logline = "Returned " . (my $temp = join(",", map { "$_/" } @list));
my $length = log_it($logline);
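Since log_debug() did nothing in the mode we were running, another way to sidestep the problem is not to build the string at all unless it is needed. The sketch below is an assumption about how that could look, not our production logger: the $DEBUG switch, the $built counter and the code-ref convention are all invented for the example. It avoids the work rather than explaining why the memory is retained.

```perl
use strict;
use warnings;

our $DEBUG = 0;    # hypothetical switch; off in "normal mode"
my  $built = 0;    # illustration only: how often the message was built

# Accepts either a ready-made string or a code ref that builds one.
sub log_debug {
    my ($message) = @_;
    return unless $DEBUG;
    $message = $message->() if ref $message eq 'CODE';
    print STDERR "DEBUG: $message\n";
    return;
}

sub do_thing {
    my ($n) = @_;
    my @list = ("x") x $n;
    # The join/map only runs if log_debug() decides to call the sub.
    log_debug(sub { $built++; "Returned " . join(",", map { "$_/" } @list) });
    return scalar @list;
}

do_thing(10_000);
print "message built $built time(s) with debugging off\n";
```

With $DEBUG set to a true value the message is built and printed as before, so the call sites read almost the same.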
As one might expect, Coro does the right thing by copying padlists around, and hence multiplies the memory waste by, roughly, the number of active coroutines. Consider the following example:
#!/usr/bin/perl
use 5.14.0;
use warnings;
use Coro;
use EV;
use Coro::AnyEvent;
use Data::Dumper;
use Devel::MAT::Dumper;
sub log_it
{
    my ($line) = @_;
    cede;
    return length $line;
}
sub do_thing
{
    my ($n) = @_;
    my @list = ("x") x $n;
    # small:
    #my $logline = join(",", map { "$_/" } @list);
    #my $length = log_it("Returned " . $logline);
    # also small:
    #my $logline = "Returned " . (my $temp = join(",", map { "$_/" } @list));
    #my $length = log_it($logline);
    # large:
    my $logline = "Returned " . join(",", map { "$_/" } @list);
    my $length = log_it($logline);
    return $length;
}
# Either construction shows the problem, but the coro one leaks 10 instances.
my @coros = map { my $ln = $_ * 10000; async { do_thing($ln); }; } (1..10);
my @results = map { $_->join(); } @coros;
# my @results = map { my $ln = $_ * 10000; do_thing($ln); } (1..10);
say Dumper @results;
# Demonstrate that there's one leak per active coro, not one per coro that
# ever existed. So there will still be 10 leaked even though we do another
# 10 iterations here.
my @morecoros = map { my $ln = $_ * 10000; async { do_thing($ln); }; } (11..20);
my @moreresults = map { $_->join(); } @morecoros;
say Dumper @moreresults;
Devel::MAT::Dumper::dump("/tmp/map-leaker.pmat");
and the following outcome:
It's midnight here and I feel rather tired (as you might tell from how quickly I glossed over the last part), so I will finish here.
Some other ideas I came across:
* PyCon UK 2014 had a two-day education track
* Applications like Pyland would be great
* It might be great to adjust Perl for different levels, something like a staged Perl: allow only if statements for 5th grade pupils, allow feature x for 6th grade pupils, and so on, as long as that fits the syllabus and government requirements.
Because of different requirements. Perl for production needs to support what was done 10 years ago. Perl for education should enforce current best practices, and hence can be more aggressive.
> Strict, warnings, utf8 and newest Perl features on by default
>> Modern::Perl or a similar pragma.
I agree.
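For the record, a pragma of this kind is small. The sketch below is my assumption of what a minimal one could look like, in the spirit of Modern::Perl; the ePerl name is only the working title from this discussion, no such module is published, and the single-file $INC trick is there only so the example runs without a separate .pm file.

```perl
use strict;
use warnings;

{
    package ePerl;    # hypothetical pragma, not a real CPAN module
    use strict;
    use warnings;
    use feature ();
    use utf8 ();

    # Pragmas act on whatever scope is being compiled when import()
    # runs, i.e. the code that said "use ePerl".
    sub import {
        strict->import;
        warnings->import;
        utf8->import;
        feature->import(':5.14');
        return;
    }
}
BEGIN { $INC{'ePerl.pm'} = __FILE__ }    # pretend it was loaded from disk

my ($control_ok, $strict_ok, $strict_err, $state_val);
{
    no strict;
    no warnings;
    # Plain Perl accepts an undeclared variable...
    $control_ok = eval q{ $total = 1; 1 };
    # ...but the pragma turns it into a compile-time error.
    $strict_ok  = eval q{ use ePerl; $sum = 1; 1 };
    $strict_err = $@;
    # The 5.14 feature bundle is on as well, so "state" is available.
    $state_val  = eval q{ use ePerl; my $c = sub { state $n = 41; ++$n }; $c->() };
}

print "without pragma: ", ($control_ok ? "accepted" : "rejected"), "\n";
print "with pragma:    ", ($strict_ok  ? "accepted" : "rejected"), "\n";
```

This covers the "on by default" part only; it cannot remove features from the language.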
> Sub signatures and postfix dereferencing should be on and without experimental warnings
>> Easily achievable with a pragma.
True, as some of the other things, but not all.
> Most of the greatest CPAN modules should come preinstalled, and I am really talking about modules that helps beginners! i.e. Devel::REPL, Devel::DidYouMean, Moo, and many many other like Mojolicious, Dancer, Catalyst, whatever…
>> Task::Kensho, dwimperl, or a custom Task::.
While both Task::Kensho and dwimperl are awesome, they do not achieve what I would like to see in Perl for education. Furthermore, they are NOT backed by the Perl Foundation.
> Forbid/remove special cases like split emulating awk.. or indirect object notation and many other silly leftovers
>> indirect.pm for indirect object notation.
Sure, again, via a pragma you can disable indirect object notation and many other things, but you can't forbid things like one- or two-argument open, and so on.
>> In any case, they do no harm — nobody forces the teachers to teach these features.
In my opinion it does harm. Pupils may solve an exercise in a way that an experienced Perl developer (with 10 years of experience) would struggle to understand without a compiler, and here we are talking about teachers who have had no more than a month or two years of playing with it. Furthermore, I feel they are wasting their time explaining "what modern Perl is" with all the boilerplate.
>> This ePerl should just be a pragma and a Task::, probably in a single distro. Which would be easy to write. No need to break backwards compatibility or to use a sandbox.
Yes and no. Perl lost its fight in the education sector. It's a fact. Thanks to Gabor Szabo and many others, there are things like dwimperl, but as we can see, it's not enough. First of all, such an initiative to get back into the education market should be backed by the Perl Foundation, which might then find a sponsor and do proper research into why Perl is not there and what we can do to get there. I can only guess that it is:
- backwards compatibility
- no proper interactive shell
- too much boilerplate
- undermarketing PDL and friends
- need of PBP v2
- sponsors
- visibility(?)
btw, pdl.perl.org is down at the moment.
In my opinion, to make Perl more acceptable in school and university curricula we need to sell it to lazy teachers and lecturers, who need something like:
Other languages break backward compatibility. That makes current developers angry, but future generations don’t need to care what happened 10 years ago or why. Don’t get me wrong, backwards compatibility is superb, but it is Perl’s biggest weakness today. Beginners don’t care about the core, nor about how to achieve thing X in Y different ways. Furthermore, this would allow them to learn Modern Perl more quickly and more safely.
What do you think?
=== P.S.
...While I am writing about Perl5, I believe Perl6 will have the exact same problem...
...and yes, I am aware of breaking CPAN. Though if it was running in a sandbox, it would still be able to escape and use CPAN modules.
… I believe Perl is a unique language that needs it, because:
sub foo {
    my $bar = shift;
}
Why is it still fine within the community to skip @_? If we promote shift, then let’s use pop as well. Why not? It looks nice:
sub foo {
    return pop, shift;
}
Though I am sure someone already uses it. How about those who use shift at line 100 inside the sub? I hate that. It makes the code really hard to follow: for instance, is it the sixth or seventh unpacked argument? I think it’s bad practice.
I like it when code is consistent and self-documenting. I love it when the very first line inside the sub lists the expected parameters! Just look how beautiful and tidy it looks:
sub foo {
    my ( $foo, $bar, $baz ) = @_;
}
You might think that it is convenient to use shift in cases like:
# EXAMPLE1
sub init {
    shift;
    my %args = @_;
}
sub foo {
    my $bar = shift // 'default';
}
# EXAMPLE2
sub foo { shift->call() }
# EXAMPLE3
sub extends {
    my $meta = shift;
    if ( @_ ) {
        print "foo bar baz";
    }
    return @_;
}
# EXAMPLE4
sub foo {
    new $_[0], shift;
}
sub before {
    Foo::Bar::baz(shift, 'before', \@_);
}
But it’s horrible for newcomers! You are hurting them! What did they do to you?
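To show it costs very little, here is a sketch of how EXAMPLE1 and EXAMPLE3 above could look with the argument list unpacked on the first line. The parameter names ($class, @parents) and the return values are my own guesses at the intent, added so the subs can be tried out.

```perl
use 5.010;    # for the //= operator
use strict;
use warnings;

# EXAMPLE1, constructor style: the invocant and the named arguments
# are both visible on the first line.
sub init {
    my ( $class, %args ) = @_;
    return \%args;    # returned here only so the result can be inspected
}

# EXAMPLE1, optional argument with a default.
sub foo {
    my ($bar) = @_;
    $bar //= 'default';
    return $bar;
}

# EXAMPLE3: the rest of the list gets a name instead of staying in @_.
sub extends {
    my ( $meta, @parents ) = @_;
    print "foo bar baz\n" if @parents;
    return @parents;
}

say foo();         # default
say foo('bar');    # bar
```

Nothing here is cleverer than shift, which is exactly the point: a newcomer can read the parameter list without counting.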
Yet shift might look tidier compared to $_[0] when you are after performance and don’t want to assign named variables. But in 99% of cases it doesn’t matter, and if you do need the performance, document the need.
Let’s start preparing for Peter Martini’s wonderful new sub signatures and tidy up.
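For a taste of where this is heading, here is a small sketch using the signatures feature as it shipped (experimental) in Perl 5.20. The sub and its parameters are invented for the example, and the no warnings line is only needed on perls that still flag the feature as experimental.

```perl
use 5.020;
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# The expected parameters, including a default, are part of the
# declaration itself.
sub foo ( $foo, $bar, $baz = 'default' ) {
    return "$foo-$bar-$baz";
}

say foo( 1, 2 );       # 1-2-default
say foo( 1, 2, 3 );    # 1-2-3

# Calling with too few arguments is an error at run time.
my $ok = eval { foo(1); 1 };
say "too few arguments rejected" unless $ok;
```

A sub that already starts with my ( $foo, $bar, $baz ) = @_; converts almost mechanically, which is one more reason to tidy up now.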