Contributing to Perl and the Perl community has never been easier. Last week Questhub gained support for stencils: pre-scripted quests with clear instructions and bonus points. The Perl realm now has an initial set of stencils, each of which defines a specific way you can contribute to Perl, CPAN, or the Perl Foundation. Some of these only require a few minutes; others require a larger commitment of your time.
If you've only got a few minutes, you could subscribe to Perl Weekly or ++ some of your regularly-used modules on MetaCPAN. Do you like to clean things up? Update 5 CPAN distributions to have licence and repository link. Here's how Neil Bowers is doing exactly that.
Do you have extra money? Donate to the Perl Foundation. Or maybe you need extra money, in which case, Submit a grant request to TPF.
Completing these quests will bring you game points. It probably won't bring you fame and glory, but you'll hopefully get a feeling of satisfaction, and you may even end up on top of the Leaderboard.
So here's how you can do any of these:
Let us know if you've got ideas for additional Perl stencils, large or small. Add a comment to this meta-quest, or send your ideas to me@berekuk.ru.
What if you haven't got the time or inclination to take on a quest? You can still sign up and encourage others by voting on their quests! :-)
PS: I haven't been posting updates on Questhub for some time, because it's not just about Perl anymore; it's now a startup I'm working on, which happens to be written in Perl and to have originated in the Perl community. But if you want to know what else is happening, follow its development in the Meta realm, on Twitter, or on Facebook.
PPS: Half of this post was written / heavily edited by the awesome neilb. Thanks!
]]>I thought Beam::Wire was about configuration, something like Bread::Board?
Our Stream framework had a configuration layer, too; not DI-style, though, just a registry to reference all streams by their names. I haven't released its Flux rewrite yet.
Beam::Wire looks more like something related to messaging, but it's hard to tell without any implementations...
Another distribution similar to Flux is Message::Passing. There are two important differences between M::P and Flux: 1) M::P is asynchronous; that has its advantages, and with Flux we had to use some tricks in a few cases to unblock it, but generally I'm quite happy with the straightforward sequential code Flux lets me write; 2) Flux puts a bigger emphasis on keeping your data safe, with its explicit commit() calls.
What's it good for? Message queues; organizing your data-processing scripts in a scalable way; decoupling your processing pipeline elements, making them reusable and testable; seeing your system as a collection of Lego-like blocks which can be combined and replaced as you like. With Flux, your code is a series of tubes.
Flux is a rewrite of the Stream framework which we wrote and used at Yandex for many years [1]. The Stream:: namespace on CPAN is taken, though, which gave me a reason to do a cleanup before uploading it, as well as a chance to rewrite everything with Moo/Moose.
I'm planning to release Flux in small chunks, explaining them along the way in separate blog posts, as time allows. Today I'll explain the main ideas behind it, some core classes, and how all its parts work together.
Flux has input streams and output streams; storages, which are kind of like tanks storing the data as it flows through the system; pumpers to connect storages; and mappers, which modify the data.

Let's start with input streams.
Flux::In is the role for all Flux input streams (it's a Moo role, as all other core Flux interfaces). Here's how input streams work:
my $item = $in->read;
my $other_item = $in->read;
$in->commit; # save the reading position
my $arrayref_with_10_items = $in->read_chunk(10);
There are two important things to note about this simple example. First, commit() saves the reading position, so a restarted reader picks up where it left off. Second, read_chunk is part of the core interface. Why? Because real stream objects often include several layers of delegation, and method invocations in Perl are relatively expensive. I'll come back to this point later.
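To make that concrete, here's a dependency-free sketch in plain Perl (my own illustration, not the actual Flux source) of an in-memory input stream with a naive read_chunk fallback. A stream wrapping several delegation layers would override read_chunk so that a single method call moves a whole chunk, instead of N delegated read calls:

```perl
use strict;
use warnings;

package My::ArrayIn;
# Hypothetical, dependency-free sketch of the Flux::In contract;
# the real Flux::In is a Moo role.
sub new  { my ($class, $items) = @_; return bless { items => [@$items] }, $class }
sub read { my ($self) = @_; return shift @{ $self->{items} } }

# Naive fallback: one method call per item. A layered stream would
# override this so the whole chunk crosses the layers in one call.
sub read_chunk {
    my ($self, $n) = @_;
    my @chunk;
    while (@chunk < $n) {
        my $item = $self->read;
        last unless defined $item;
        push @chunk, $item;
    }
    return @chunk ? \@chunk : undef;
}
sub commit { }  # nothing to save for an in-memory stream

package main;
my $in    = My::ArrayIn->new([ "foo", "bar", "baz" ]);
my $chunk = $in->read_chunk(2);  # ["foo", "bar"]
```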
How can you construct an input stream? Most objects in Flux are polymorphic; there are many different implementations. Here's the most basic one:
use Flux::Simple qw(array_in);
my $in = array_in([ "foo", "bar" ]);
say $in->read; # foo
array_in doesn't support commit(); committing it does nothing. But the commit() method is still present. Some other features of Flux streams are optional, and high-level code which deals with input and output stream objects has to check for those features, but commit and read_chunk are fundamental and omnipresent.
array_in is useful for unit testing: since all input streams have the same basic interface, you can use it to replace your real input stream (which would read data from a file, a DB, or the network).
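For instance, if you have a function that consumes any input stream (process_events below is a hypothetical example, not part of Flux), you can test it against array_in instead of a real log or DB stream:

```perl
use strict;
use warnings;
use Test::More;
use Flux::Simple qw(array_in);

# Hypothetical code under test: drain any Flux::In and return what was read.
sub process_events {
    my ($in) = @_;
    my @seen;
    while (defined(my $item = $in->read)) {
        push @seen, $item;
    }
    $in->commit;
    return \@seen;
}

# An in-memory stream stands in for the real input stream.
is_deeply(process_events(array_in([ "foo", "bar" ])), [ "foo", "bar" ]);
done_testing;
```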
Now let's take a look at output streams, described by Flux::Out role/interface.
It mirrors the Flux::In interface:
$out->write("foo");
$out->write("bar");
$out->commit;
$out->write_chunk(["xxx", "yyy", "zzz"]);
When you commit an output stream, you force the data down the pipe. Depending on the implementation, this may mean flushing memory buffers into an HTTP POST request, writing or fsyncing data to disk, or committing an SQL transaction.
The simplest Flux::Out implementation is this:
use Flux::Simple qw(array_out);
my @data;
my $out = array_out(\@data);
$out->write("foo");
$out->write("bar");
$out->commit;
say for @data; # foo; bar
But all this doesn't sound very useful so far, does it?
Here's a more serious bit of code:
use Flux::Log;
my $storage = Flux::Log->new("/opt/events.log");
$storage->write(qq[{"type": "email", "title": "Hello"}\n]);
$storage->write(qq[{"type": "email", "title": "Goodbye"}\n]);
$storage->commit;
# in another script:
my $in = $storage->in("sendmail");
while (my $item = $in->read) {
do_sendmail($item);
}
$in->commit;
This code introduces storages. Storages implement the Flux::Storage interface; they are Outs which can generate input streams. Flux::Log is a storage which writes to a log file, and you can later read that log from another process, processing your data asynchronously.
In other words, it's a file-based message queue.
The "sendmail" string in the code above is called a client name, and the $in object can be referred to as a client, because it's a client reading our storage. You can read one storage with multiple clients at the same time; each of them will get its own copy of the data.
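For example (a sketch reusing the Flux::Log API shown above; handle_mail and index_item are hypothetical handlers, stubbed out here):

```perl
use strict;
use warnings;
use Flux::Log;

sub handle_mail { }  # hypothetical handler stubs
sub index_item  { }

my $storage = Flux::Log->new("/tmp/events.log");
$storage->write(qq[{"type": "email", "title": "Hello"}\n]);
$storage->commit;

# Each client name owns an independent reading position.
my $mailer  = $storage->in("sendmail");
my $indexer = $storage->in("search-index");

while (defined(my $item = $mailer->read)) {
    handle_mail($item);
}
$mailer->commit;  # advances only the "sendmail" position

# The "search-index" client still sees the items "sendmail" consumed.
while (defined(my $item = $indexer->read)) {
    index_item($item);
}
$indexer->commit;
```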
Flux::Log is not a trivial piece of code, by the way. It handles safe writing to the log file (which is *not* easy [2]), and its clients implement transparent reading, so you don't have to worry about logrotate rotating those logs (this part is handled by Log::Unrotate).
Finally, I want to introduce the concept of Mappers. Here's an example:
use Flux::Simple qw(mapper);
use JSON;
my $raw_in = $storage->in("sendmail");
my $in = $raw_in | mapper { decode_json(shift) };
my $item = $in->read; # decoded hashref
Mappers rewrite your data. You can chain them into pipelines using the shell-like "|" syntax sugar.
Mappers can be attached both to input streams (on the right side: $in | $mapper) and to output streams (on the left side: $mapper | $out).
Mappers can also filter, because returning an empty list drops the item:
$in | mapper { my $item = shift; $item->{type} eq 'email' ? $item : () }
Or for turning one item into several items:
my $double = mapper { my $item = shift; ($item) x 2 }
If you're wondering how mappers are different from plain functions, remember when I said that read_chunk (and write_chunk) are parts of the core interfaces? The low-level mapper interface is similar to the Flux::Out interface, with write_chunk and commit.
For example, you may write a mapper which gets items like {id => 123} and fills them in with data referenced by id from an SQL database. In this case, it might make sense to store multiple items in the mapper's inner memory buffer, and then do one bulk SELECT before returning them from the mapper's other end.
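Here's a dependency-free sketch of that buffering idea (the names are mine, and the real Flux mapper interface differs in details; in particular, a real mapper would also be able to return mapped items from write/write_chunk, not only on commit):

```perl
use strict;
use warnings;

package My::BulkLookupMapper;
# My own illustration, not Flux's real mapper base class.
sub new {
    my ($class, %args) = @_;
    return bless { lookup => $args{lookup}, buffer => [] }, $class;
}
# Buffer items instead of resolving them one by one.
sub write { my ($self, $item) = @_; push @{ $self->{buffer} }, $item; return () }
sub write_chunk {
    my ($self, $chunk) = @_;
    push @{ $self->{buffer} }, @$chunk;
    return ();
}
# On commit, resolve all buffered ids with one bulk lookup
# (e.g. a single SELECT ... WHERE id IN (...)) and release the items.
sub commit {
    my ($self) = @_;
    my @ids  = map { $_->{id} } @{ $self->{buffer} };
    my $rows = $self->{lookup}->(@ids);  # hashref: id => data
    my @out  = map { +{ %$_, data => $rows->{ $_->{id} } } } @{ $self->{buffer} };
    $self->{buffer} = [];
    return \@out;
}

package main;
my $mapper = My::BulkLookupMapper->new(
    lookup => sub { return +{ map { ($_ => "row-$_") } @_ } },  # stands in for a DB query
);
$mapper->write({ id => 1 });
$mapper->write_chunk([ { id => 2 }, { id => 3 } ]);
my $items = $mapper->commit;  # three items, each with a "data" field filled in
```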
So, those were the basic building blocks for programs using Flux. Inputs and outputs define your data sources; storages generally turn outputs of one program into inputs for another; and mappers let you abstract processing code into reusable objects.
Next time, I'll explain pumpers, the scripts which actually *do* stuff with the data, and formats, which are two-way lenses for encoding and decoding data.
Footnotes:
[1] https://blogs.perl.org/users/vyacheslav_matjukhin/2010/12/roadmaps.html - really, a very long time...
[2] See "Surprisingly hard task of writing logs". Flux::Log solves this issue by locking the file while writing, and doing look-behinds whenever necessary to check that the file's last line ends with "\n".
use autodie;
Here's a quick benchmark:
$ time perl -E 'say "package X$_; use autodie qw(:all);" for 1..100;' | perl
real 0m1.482s
user 0m1.431s
sys 0m0.047s
Compare with Moose:
$ time perl -E 'say "package X$_; use Moose;" for 1..100;' | perl
real 0m0.343s
user 0m0.328s
sys 0m0.016s
It doesn't get much better without qw(:all):
$ time perl -E 'say "package X$_; use autodie;" for 1..100;' | perl
real 0m1.212s
user 0m1.169s
sys 0m0.047s
But it gets significantly better if you import only a small number of functions:
$ time perl -E 'say "package X$_; use autodie qw(open close);" for 1..100;' | perl
real 0m0.175s
user 0m0.166s
sys 0m0.011s
Basically, you pay for each function you import, once per imported function, in each module, again and again. That's different from, for example, Moose, where 99% of the import performance hit is on the first use, when perl compiles all the code, and each subsequent import() is almost free. Because of this, if your app has many modules, autodie can easily become the biggest bottleneck in its startup performance.
So it's a bad idea to thoughtlessly add use autodie qw(:all) to your boilerplate alongside use strict; use warnings; use 5.0xx;. If you do need autodie, it's a good idea to explicitly list all the functions you want to replace.
PS: I don't know why it has to be this way. I know autodie 2.18 does more caching and is significantly faster than previous versions, but apparently it still doesn't cache much.
PPS: This post was brought to you by questhub, as usual :)
PlayPerl used Twitter credentials initially, because it was the easiest route to implement.
These days, PlayPerl (also known as Questhub) gives you the option to log in with email using Mozilla Persona, which is the best of both worlds: you don't have to invent a new password, but you can choose any credentials provider, be it Mozilla, some other big player which implements Persona protocol, or your own private server.
PS: It's not possible to migrate from Twitter account to Persona yet, but it will be possible in the future.
(… ubic-admin setup!)
Not all modules are packaged as RPMs, and those which are packaged are often out of date.
Ok, I checked; that's not the case on CentOS. perl-CPAN.x86_64 contains /usr/bin/cpan, which then installs stuff into /usr/local/bin/ with default settings.
But I admit I don't know anything about CentOS. Or how to customize CPAN.pm's behavior, for that matter... I usually just install cpanm with it and never look back.
Yeah, I know, old habit...
> For cron, I don't rely on PATH at all
This doesn't help if your script calls other scripts.
You get a root shell with sudo -s.
You enter: cpan App-cpanminus.
And then you enter: cpanm --help... and oops:
[root@localhost vagrant]# cpanm --help
bash: cpanm: command not found
WAT?
What just happened is that cpan installs scripts into /usr/local/bin/, and /usr/local/bin/ is not in root's $PATH.
I don't think the CPAN toolchain is to blame here. But if you can't run the software you've just installed, something somewhere is horribly wrong.
But wait, there's more.
There are several different ways to get a root shell:
sudo -s
su -
su
Let's see...
[vagrant@localhost ~]$ sudo -s
[root@localhost vagrant]# echo $PATH
/sbin:/bin:/usr/sbin:/usr/bin
[vagrant@localhost ~]$ su
Password:
[root@localhost vagrant]# echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/vagrant/bin
[vagrant@localhost ~]$ su -
Password:
[root@localhost ~]# echo $PATH
/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
The first one, sudo -s, uses the secure_path option from /etc/sudoers.
The second one, su, inherits the current user's environment.
And the third one, su - (the same as su -l), makes the shell a login shell, which resets the environment so that it's the same as if you had logged in as root in the first place.
But wait, it gets worse. Let's try cron:
[root@localhost ~]# crontab -l
* * * * * perl -E 'say $ENV{PATH}' >/tmp/cron.path
[root@localhost ~]# cat /tmp/cron.path
/usr/bin:/bin
Cron is the reason I wrote this post in the first place. The most popular complaint about my Ubic these days is that it doesn't start services after a reboot. That's because Ubic bootstraps services on reboot from the crontab, whose $PATH doesn't include /usr/local/bin/ (there are many ways to set up Ubic, e.g. with perlbrew, so I didn't want to make any assumptions about the user's environment).
I'm giving up. I'm just going to always add /usr/local/bin to $PATH if it's not already there.
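Something like this sketch, early in the bootstrap code (my illustration, not necessarily Ubic's actual implementation):

```perl
use strict;
use warnings;

# Prepend /usr/local/bin to $PATH unless it's already there, e.g. at the
# top of a script that will be run from cron, so executables installed
# by the CPAN toolchain can be found.
sub ensure_usr_local_bin_in_path {
    my @path = split /:/, (defined $ENV{PATH} ? $ENV{PATH} : '');
    unshift @path, '/usr/local/bin'
        unless grep { $_ eq '/usr/local/bin' } @path;
    $ENV{PATH} = join ':', @path;
}

ensure_usr_local_bin_in_path();
```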
BTW, the sudo -s $PATH issue is mostly specific to the RedHat family, but cron's $PATH issue is not, as far as I know.
But why? What's the reason for all this? Is it for the sake of security? Can there really be such a program that's safe to keep in /usr/local/bin, accessible to all users, but not to the root? Really?
PS: I know, I know, everyone should use perlbrew. I agree, but don't forget to source $HOME/perl5/perlbrew/etc/bashrc everywhere, especially in your crontabs.
If you enjoyed this blog post, you could give me a +1 on its Questhub quest.
I added the button to the play-perl's footer. Let's see if anyone clicks it!
And then let's see if I can manage to process the money, because I'm pretty sure PayPal doesn't support withdrawals in Russia. (But I think Skrill does.)