Large-scale deployments with Pogo at Yahoo!

Mike Schilli will be giving a talk at YAPC::NA 2012 described as:

Deploying releases to tens of thousands of hosts reliably while only disrupting a limited number of busy production servers at a time in different colocations is impossible to manage manually.

Pogo is a distributed deployment engine written in Perl that solves this problem. It’s an open source project on GitHub, written by Yahoo’s deployment tools engineering group. It enables one-button deployments in large infrastructures at Yahoo! every day.

The talk offers a glimpse at how large-scale deployments are done at Yahoo! and how Pogo manages these tasks. We’ll also cover the Pogo distributed architecture, how to configure and use it for your needs and how to get involved with the project.

[From the YAPC::NA Blog.]

Extract Mail Addresses from CSV

What do you think about the code below?
I have a file containing information about people, where the fifth field of each tab-separated line contains the mail address.
Each mail address appears multiple times, and I need to print each one only once.

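use strict;
use warnings;
use feature 'say';

my %mails;
open my $csv, '<', 'my.csv' or die $!;
while (<$csv>) {
    chomp;
    my $mail = ( split /\t/ )[4];   # fifth tab-separated field
    $mails{$mail} = 1;              # hash keys keep each address once
}
say foreach sort keys %mails;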

(and how do I indent code on blogs.perl.org?)
thx @Aristotle

Dist::Zilla, Pod::Weaver and bin

I use Dist::Zilla for managing my distributions. It's awesome and useful, although it lacks some bits of documentation every now and then; this lack is compensated in other ways, e.g. IRC.

I also use the PodWeaver plugin to automatically generate boilerplate POD stuff in the modules. Some time ago I needed to add some programs to a distribution of mine (which I also managed to forget at the moment, but this is another story), and this is where I got hit by all the voodoo.

The first program was actually a Perl program, consisting of a minimal script to call the appropriate run() method in one of the modules of the distro:

$ cat bin/perl-program 
#!/usr/bin/env perl
use prova;
prova->run();

This led Dist::Zilla to complain like this:

Tel Aviv Perl Mongers Meeting on 30 November, 2011

(The Hebrew text will be followed by an English one).

STF and How We Serve Your Data

Daisuke Maki will be giving a talk at YAPC::NA 2012 described as:

STF is a distributed object store that is used at Livedoor (NHN Japan).

The idea is the same as MogileFS, but it was built with open protocols (HTTP, PSGI) and commodity tools like Apache/Nginx, MySQL, and Q4M.

It serves several hundred million files (mainly image files) and a few million b/sec for our blog, photo share, geo-location, and other services.

In this talk I will describe what STF is, how we set it up, and how we operate it.

[From the YAPC::NA Blog.]

MacBook battery status and screen

While I'm writing my beginning Perl book, I've developed several tools to make my life easier. However, I've encountered a problem. I've often found that I am using iTerm2 in full-screen and this obscures my battery indicator. If I'm not plugged in, my battery can get dangerously low. I've fixed that.

On simple benchmarks

use Benchmark is not good enough. At all.

- You can specify -2 as the count, which means 2 seconds. Good.
- If you specify the test code as a string rather than a coderef, you also benchmark the parsing time on every iteration, not just the plain run time. That result is entirely unrealistic, since real code is compiled once and run often. Coderefs should be used.
- The iteration results are not used at all to check the statistical quality of the test.
- Without :hireswallclock you get time(2) precision, which is integer seconds.
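For what it's worth, here is a minimal sketch of the "least bad" way to call Benchmark given the points above: a coderef, a negative count, and :hireswallclock. The data set and the sorted workload are made up for illustration; only the Benchmark calls matter.

```perl
use strict;
use warnings;
use Benchmark qw(:hireswallclock timethis timestr);

# Hypothetical workload: sort a small list of numbers.
my @data = map { $_ % 97 } 1 .. 1_000;
my $sorted_count = 0;

# A negative count asks Benchmark to run for at least that many CPU seconds.
# The code is passed as a coderef, so it is compiled only once.
# The style 'none' suppresses Benchmark's own report.
my $t = timethis(
    -1,
    sub { my @s = sort { $a <=> $b } @data; $sorted_count = @s },
    'sort',
    'none',
);

print 'sort: ', timestr($t), "\n";
print 'iterations: ', $t->iters, "\n";
```

This still reports nothing about the spread of the individual iterations, which is exactly the complaint.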

benchmark-perlformance is too good and too slow. It's good to have a single special and reliable machine for this, but I see no useful results. And I miss simple tests with good op coverage. I do not even see op coverage at all.

How fast is my perl, how good is my test and how good is my test result?

Dumbbench at least reports some statistical quality, but needs too many args. initial_runs and target_precision should not be mandatory.

What is the Marpa algorithm?

I have referred to "the Marpa algorithm" many times. What is that? The implementation involves many details, but the Marpa algorithm itself is basically four ideas. Of these only the most recent is mine. The other three come from papers spanning over 40 years.

Idea 1: Parse by determining which rules can be applied where

The first idea is to track the progress of a parse by determining, for each token, which rules can be applied and where. Sounds pretty obvious. Not-so-obvious is how to do this efficiently.

In fact, most parsing these days uses some sort of shortcut. Regexes and LALR (yacc, etc.) require the grammar to take a restricted form, so that they can convert the rules into a state machine. Recursive descent, rather than list the possibilities, dives into them one by one. It, too, only works well with grammars of a certain kind.
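To make "which rules can be applied where" concrete, here is a toy recognizer in the style of Earley's algorithm. It is only a sketch and not Marpa's implementation: the grammar, the item layout, and the recognize() function are hypothetical, and it assumes a grammar with no empty rules. After each token it keeps a set of items, each recording a rule, how far into the rule the parse has progressed (the "dot"), and where that attempt started.

```perl
use strict;
use warnings;

# Hypothetical toy grammar: each rule is [lhs, rhs symbols...].
my @rules = ( [ 'S', 'S', '+', 'N' ], [ 'S', 'N' ], [ 'N', 'n' ] );
my %is_nonterminal = map { $_->[0] => 1 } @rules;

sub recognize {
    my ($start, @tokens) = @_;

    # $sets[$i] holds the items [rule, dot, origin] live after $i tokens.
    my @sets = map { [] } 0 .. @tokens;
    my @seen = map { +{} } 0 .. @tokens;
    my $add  = sub {
        my ($i, $rule, $dot, $origin) = @_;
        push @{ $sets[$i] }, [ $rule, $dot, $origin ]
            unless $seen[$i]{"$rule:$dot:$origin"}++;
    };
    $add->(0, $_, 0, 0) for grep { $rules[$_][0] eq $start } 0 .. $#rules;

    for my $i (0 .. @tokens) {
        # The set grows while we walk it, so index it explicitly.
        for (my $j = 0; $j < @{ $sets[$i] }; $j++) {
            my ($rule, $dot, $origin) = @{ $sets[$i][$j] };
            my $next = $rules[$rule][ $dot + 1 ];    # symbol after the dot
            if (!defined $next) {
                # Completed rule: advance every item that was waiting for its lhs.
                for my $waiting (@{ $sets[$origin] }) {
                    my ($r, $d, $o) = @$waiting;
                    my $want = $rules[$r][ $d + 1 ];
                    $add->($i, $r, $d + 1, $o)
                        if defined $want && $want eq $rules[$rule][0];
                }
            }
            elsif ($is_nonterminal{$next}) {
                # Predict: every rule for $next could start here.
                $add->($i, $_, 0, $i)
                    for grep { $rules[$_][0] eq $next } 0 .. $#rules;
            }
            elsif ($i < @tokens && $tokens[$i] eq $next) {
                # Scan: the token matches, so the item moves to the next set.
                $add->($i + 1, $rule, $dot + 1, $origin);
            }
        }
    }

    # Accept if a start rule that began at position 0 is complete at the end.
    return scalar grep {
        my ($rule, $dot, $origin) = @$_;
        $rules[$rule][0] eq $start && $origin == 0 && $dot == $#{ $rules[$rule] };
    } @{ $sets[-1] };
}

print recognize('S', qw(n + n)) ? "accepted\n" : "rejected\n";
print recognize('S', qw(n +))   ? "accepted\n" : "rejected\n";
```

Note that the grammar is left-recursive and the recognizer handles it without any special treatment, which is the kind of input the shortcuts above struggle with.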

I’m happy to announce that ActiveState has decided to...

I’m happy to announce that ActiveState has decided to sponsor YAPC::NA 2012.

ActiveState empowers innovation from code to cloud smarter, safer, and faster. ActiveState’s cutting-edge solutions give developers and enterprises the power and flexibility to develop in Java, Ruby, Python, Perl, Node.js, PHP, Tcl, and more. Stackato is ActiveState’s groundbreaking cloud platform for creating a private platform as a service (PaaS), and is the cost-effective, secure, and portable way to develop and deploy apps to the cloud. ActiveState is proven for the enterprise: More than two million developers and 97% of Fortune-1000 companies use ActiveState’s end-to-end solutions to develop, distribute, and manage their software applications. Global customers like Cisco, CA, HP, Bank of America, Siemens, and Lockheed Martin look to ActiveState to save time, save money, minimize risk, ensure compliance, and reduce time to market.

For more information, visit www.activestate.com.

[From the YAPC::NA Blog.]

The Perl Foundation Grants

As you might know, TPF awards grants to some people, for some tasks. There are some huge grants, like Nicholas Clark's grant or Dave Mitchell's grant on Perl 5. Unfortunately, not all of us have the time and/or the knowledge to help with such low-level tasks. Nevertheless, TPF has a grant committee that awards small grants (ranging from $500 to $2000) for smaller tasks. Some examples include writing documentation or tests for some relevant module, implementing some service on the web, developing a specific module, etc.

Note that if you have a project in mind and you think it is worth more than $2000, you can propose parts of it. Give a big picture of what you would like to do, and define a sub-task. It is important that the sub-task is useful by itself, of course. But once you complete it, if you show quality in your work and you meet deadlines, it is very probable that you will get a second grant to complete your work.

At the moment TPF has extended its deadline for grant proposals. You can submit them until the end of November. You can read the complete call for grants on The Perl Foundation blog.

Frequently installing apps via any FTP

Your app is in a tarball and clients only have FTP access to install your app on their host. Furthermore, you need to customize the config file (or do other processing) for each install. What you need is Net::xFTP and Archive::Tar. Net::xFTP's put allows you to pass in an open filehandle typeglob as the local file. So you can open a filehandle on a string reference and use that as the LOCAL FILE. This is a simplified version.
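A minimal sketch of just the in-memory filehandle part, using Archive::Tar from the core distribution. The archive contents, the file name app/config.ini, and the __HOST__ placeholder are hypothetical, and the actual Net::xFTP upload is left as a comment rather than guessed at.

```perl
use strict;
use warnings;
use Archive::Tar;

# Stand-in for the app tarball; file name and placeholder are hypothetical.
my $tar = Archive::Tar->new;
$tar->add_data('app/config.ini', "host=__HOST__\n");

# Pull the config out of the archive and customize it for this client.
my $config = $tar->get_content('app/config.ini');
$config =~ s/__HOST__/client.example.com/;

# Open a read filehandle on the string; no temporary file is written.
open my $fh, '<', \$config or die "Cannot open in-memory file: $!";

# An uploader that accepts a filehandle as the local file would read from
# $fh at this point. Here we just read it back to show what it would send.
my $uploaded = do { local $/; <$fh> };
close $fh;

print $uploaded;
```

The same filehandle trick works for any file in the archive, so nothing has to be unpacked to disk on the machine doing the install.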

$#boo

As we all know, $#boo returns the last index of the array @boo.

It is clear why we have the prefixes '$' and '@' ('$' is like the first
letter of the word 'scalar' and the '@' is like the first letter of the word
'array').

But it is unclear why there is a '#' after the dollar sign. I've checked out
Perl v1.0, and its man page contains this text:

> you may find the length of array @days by evaluating "$#days", as in csh.
> [Actually, it's not the length of the array, it's the subscript of the last
> element, since there is (ordinarily) a 0th element.]

So the answer to why the last index is written $#boo lies somewhere in csh.
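The difference the man page points out, last index versus length, is easy to see in a few lines (the array contents are arbitrary):

```perl
use strict;
use warnings;

my @boo = qw(a b c);

my $last_index = $#boo;          # subscript of the last element
my $length     = scalar @boo;    # number of elements

print "last index: $last_index, length: $length\n";    # last index: 2, length: 3

# For an empty array there is no last element, so $# gives -1.
my @empty;
my $empty_last = $#empty;
print "empty: $empty_last\n";    # empty: -1
```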

Perl Data Language at YAPC

We’ve got a start on a track about the Perl Data Language (PDL) at YAPC::NA 2012. This area of Perl is a great fit with our “Perl in the Wild” theme. So if you have some expertise using the PDL, by all means submit a talk about it. If we can get enough talks, we can put together a one day track about it.

Likewise, if you want to run a workshop to get people bootstrapped on PDL, you can submit a talk for that as well.

[From the YAPC::NA Blog.]

I wish UNSHIFT was called something else...

I wish unshift was called shove : )

It's easier to visualize :)

starting my twitter account

I think I want to use this account from now on to write more essay-like pieces. Many of my posts here were just short reports on the existence of slides, talks, articles and other things. Maybe I will write summaries that compare and reflect on such things, but for recent information please subscribe to my Twitter channel. Some messages there might be in German. Please don't mind.

https://twitter.com/#!/kephra_lk

Time spent waiting for tests you know will pass is time wasted

I've started using 'cpanm -n Module' to install Perl modules. The '-n' tells cpanminus to skip testing and just install the module.

"What, are you insane?"

Nope, I have just found that for most Perl modules, it is more time efficient to skip testing on the initial install, and sort out any problems later. Especially with a setup you know works.

If I were installing a new application for the first time, however, I would probably not skip the tests.

I’m quite pleased to announce that Shadowcat Systems has...

I’m quite pleased to announce that Shadowcat Systems has decided to sponsor YAPC::NA 2012!

Shadowcat Systems is a developer, sponsor of, and contributor to open source software projects including Catalyst, Moose, Moo, Tak, Devel::Declare and DBIx::Class. Shadowcat provides consultancy, training and support for these projects and for most of CPAN; systems management and automation; the design and implementation of network architecture; the development of proprietary and open source custom web applications; and offers Perl refactoring and project crisis management.

Shadowcat Systems is based in the United Kingdom but delivers solutions to a global community of clients via onsite supervision along with traditional and internet-based communications.

[From the YAPC::NA Blog.]

A lot of good tiny ideas

brian d foy was here in Houston for two days and I got a lot of good tiny ideas:

1. implement last out of grep/map (disabled because broken with 5.6)

2 days. step out 2 scopes in dopoptoloop:
grep, grep_item, block

$ p -MO=Concise,-exec -e'grep{last if $_ == 2} 1..3'
1 <0> enter
2 <;> nextstate(main 2 -e:1) v:{
3 <0> pushmark s
4 <$> const(AV ) s
5 <1> rv2av lKPM/1
6 <@> grepstart K*
7 <|> grepwhile(other->8)[t3] vK
8 <0> enter s
9 <;> nextstate(main 1 -e:1) v:{
a <$> gvsv(*_) s
b <$> const(IV 7) s
c <2> eq sK/2
d <|> and(other->e) sK/1
e <0> last s*
f <@> leave sKP
goto 7
g <@> leave[1 ref] vKP/REFC
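Until something like this lands, the usual way to stop at the first match is first from List::Util. This is a different technique, not the patch sketched above, and the counter is only there to show that the scan really stops early:

```perl
use strict;
use warnings;
use List::Util qw(first);

my $checked = 0;

# first() returns the first element for which the block is true
# and does not look at the rest of the list.
my $hit = first { $checked++; $_ == 2 } 1 .. 3;

print "found $hit after $checked checks\n";    # found 2 after 2 checks
```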

Method::Signatures : Some relief for MooseX::Declare users

I have been excited about OO programming in Perl thanks to MooseX::Declare but I have never especially liked its performance hit and its cryptic warnings. It turns out that much of this problem is due to MooseX::Method::Signatures, which is used under the hood.

Many moons ago, I was curious about Moose and MooseX::Declare and I posted a question on StackOverflow. Venerable Perl guy Schwern then posted a comment saying that Method::Signatures was better than MooseX::Method::Signatures, and that there was a mod in the works to use it with MX::D.

Adventures in Debugging C/XS

... or Why A Good Perl Developer Is Not Automatically A Good C Developer, the Story of C Programming via Google.

My tests failed, but only sometimes. I was building an XS module to interface with a C wrapper around a C++ library (wrapper unnecessary? probably). make test was failing with exit code 11. Some quick searching revealed that I had an intermittent segfault. Calling a function as_xml would fail with a SEGV in strlen(). This only happened in perl after as_xml, when perl was making an SV out of the return value. It also happened mainly during make test. Running prove myself would succeed 19 times out of 20, whereas make test would fail 19 times out of 20. Worse, my C test program would never fail at all.

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. It is written in Perl, with a graphic design donated by Six Apart, Ltd.