January 2019 Archives

AWS Lambdas, Furl & LWP

At re:Invent 2018, AWS announced custom Lambda runtimes, which make it possible to create Perl-based Lambdas. Although it was theoretically possible to create a Perl Lambda before this announcement (by invoking a shell from Python, for example), the new custom runtimes make it possible to use almost any language to create Lambdas.

I've been developing a Perl-based custom runtime framework so that creating a Perl Lambda is pretty much as simple as:

package MyLambda;

use strict;
use warnings;

use parent qw/Amazon::Lambda::Runtime/;

sub handler {
  my $self = shift;

  my ($event, $context) = @_;

  $self->get_logger->info("...Perl has just joined the party!");

  return "Hello World";
}

1;

Implementing a custom runtime for Perl seems rather straightforward; however, there are a lot of mysteries regarding the way Lambdas are actually invoked and executed. According to the AWS documentation, one should create an event loop that waits for Lambda events by calling an API endpoint. In experimenting with various tools I noticed that using Furl vs LWP produced an interesting reveal...it appears AWS can freeze time! Huh?

Well, not really, but certainly that's one explanation. Amazon seems to be able to do pretty much anything they want to do. Actually though, I think the phenomenon I observed has more to do with the way timeouts are implemented in Furl vs LWP - anyone who can verify or refute my theory is encouraged to chime in and educate me on this topic.

When using Furl it seems that timeouts are implemented based on the clock on the wall, however for LWP it appears that timeouts are based on elapsed processing time of the function. I've arrived at this conclusion by noting the behavior of Lambdas that are called shortly after one another. AWS maintains a Lambda environment (most likely using some kind of container - actually it's called Firecracker) for your Lambda for some period of time after invocation in order to minimize startup time and presumably to make Lambdas more efficient. One can capitalize on that behavior by caching certain data that might otherwise be expensive to retrieve each time. While you can't guarantee the data will be present when your Lambda is invoked, you can take advantage of it if it is present. Many people do this in their Python and Node Lambdas.

So let's say the timeout for Furl is set to 30 seconds. I make a call to my Lambda runtime, which uses Furl to fetch the event...all good...event retrieved. Now we continue in our event loop and make another call to the endpoint, waiting for an HTTP response. We would expect that if no event were present the Lambda would time out - however, here's where it gets weird. The Lambda actually stops executing after you return a response and make a request to the endpoint used for getting events. The INVOCATION_NEXT request to fetch the next event appears to trigger the end of the billing cycle for that Lambda and puts your Lambda to sleep, or at least in suspended animation.
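
The event loop described above can be sketched roughly as follows. This is a minimal illustration and not the framework discussed in this post: it swaps in HTTP::Tiny (a core module) for Furl or LWP, and the helper names next_url, response_url and run_loop are made up for the example. The endpoint paths, the AWS_LAMBDA_RUNTIME_API environment variable and the Lambda-Runtime-Aws-Request-Id header are taken from AWS's documented Runtime API; error reporting back to the API is left out.

```perl
use strict;
use warnings;
use HTTP::Tiny;

my $API_VERSION = '2018-06-01';

# URL that blocks until the next event is available (the "invocation next" call).
sub next_url {
    my ($host) = @_;
    return "http://$host/$API_VERSION/runtime/invocation/next";
}

# URL used to post the handler's result for a given request id.
sub response_url {
    my ($host, $request_id) = @_;
    return "http://$host/$API_VERSION/runtime/invocation/$request_id/response";
}

# Hypothetical event loop: fetch an event, run the handler, post the result.
sub run_loop {
    my ($host, $handler) = @_;

    my $http = HTTP::Tiny->new(timeout => 30);

    while (1) {
        my $event = $http->get(next_url($host));

        # HTTP::Tiny reports client-side failures, timeouts included, as 599.
        # Here we simply ask again.
        next if $event->{status} == 599;
        die "unexpected status $event->{status}" unless $event->{success};

        # HTTP::Tiny lower-cases response header names.
        my $request_id = $event->{headers}{'lambda-runtime-aws-request-id'};
        my $result     = $handler->($event->{content});

        $http->post(response_url($host, $request_id), { content => $result });
    }
}

# Only start the loop when running inside a Lambda environment.
run_loop($ENV{AWS_LAMBDA_RUNTIME_API}, sub { return 'Hello World' })
    if $ENV{AWS_LAMBDA_RUNTIME_API};
```

The interesting part for this discussion is the get call at the top of the loop: that is the request during which the Lambda appears to be put to sleep.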

There appear to be two "stopped" conditions for Lambdas. In condition A, right after a successful call, the Lambda appears to go to sleep but still maintains its environment. In condition B the Lambda environment is torn down and another call to the Lambda will recreate the runtime environment.

Now back to Furl vs LWP. When Furl is used and the Lambda is "woken", Furl times out and returns a 500 (internal response). When LWP is used and the Lambda is "woken", LWP returns the new event and does not present a timeout. Hmmm...curious. I'm therefore left to conclude that LWP is using relative execution time and Furl is using wall clock time. Or there is some other explanation or implementation detail (alerts?) that I have not thought of.
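
To make the distinction between the two notions of time concrete, here is a small sketch that measures the same block of code both ways. It says nothing about how Furl or LWP are actually implemented, and a sleeping process is only a stand-in for a frozen Lambda; the measure function is a name invented for this example.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Run $code and return (wall-clock seconds, CPU seconds) it consumed.
sub measure {
    my ($code) = @_;

    my $wall_start = time();
    my ($user_start, $sys_start) = times;

    $code->();

    my ($user_end, $sys_end) = times;
    my $wall = time() - $wall_start;
    my $cpu  = ($user_end - $user_start) + ($sys_end - $sys_start);

    return ($wall, $cpu);
}

# While sleeping, the wall clock keeps running but almost no CPU time is used.
my ($wall, $cpu) = measure(sub { sleep(0.5) });
printf "wall clock: %.2fs, CPU: %.2fs\n", $wall, $cpu;
```

A timeout that compares against the first number would expire while the process is idle; one that tracks something closer to the second would not.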

I'll be posting my project on GitHub soon if anyone is interested. So far, I'm confident that Perl Lambdas are going to be viable, fun and perhaps a way to see more Perl in the AWS environment!

Perl Blogs

What is the status of a replacement for this blogging platform? I imagine the community as a whole is reluctant to blog in a central place because of the difficulty just logging in to this particular site.

Is there an alternative site where Perl bloggers post?

Perl Dependency Checking

I'm working on a few projects right now, most notably one that helps me create a CPAN distribution so that I can create a Perl Lambda in the AWS environment. This has led me to some yak shaving exercises, most notably investigating how to check for Perl dependencies.

Without getting too far into the weeds on Perl Lambdas (that's another blog post in the writing), suffice it to say I need to vendor Perl modules and deploy them in the Lambda environment. I briefly looked at carton and that may solve the problem neatly, but my early dive indicated to me that another path might be a more direct shot on goal and produce a cleaner Lambda deployment methodology.

Back to the issue at hand...specifically this blog is going to discuss Perl dependency checking using these tools:

scandeps.pl
Devel::Modlist
/usr/lib/rpm/perl.req

I'm sure I'm overcomplicating things, and experienced CPAN authors will most likely point to the current set of methods available for both dependency checking and building CPAN distributions. Nevertheless, my confusion given the various methods available and the lack of a single authoritative voice compelled me to create something simple (at least from my toolchain's perspective) for creating CPAN distributions from uncomplicated projects.

Of course, if it's a simple distribution, why not just create a Makefile.PL as part of your project and be done with it? Sure, that works - but my needs and desire for more automation have gotten the best of me over the holidays - hence make-cpan-dist.

This is not an ad for that tool - I doubt anyone is interested. However, I wanted to blog a bit about dependency checking in the hopes that some of you who read these blogs might illuminate the broad, dark corners of my ignorance.

Some background. Where I work we have packaged Perl applications and modules using the Red Hat Package Manager (rpm), as we are very coupled to the Red Hat/CentOS environment. We have persevered with the system Perl for lo these many years with no major issues.

Aside: Truth be told, it worked for us but we are now moving a lot of our new development to Python for many of the reasons that have been hashed out in many other blogs and internet forums. Personally, I enjoy writing Perl and although I direct development efforts, our needs have dictated rethinking some of our development methods including what language and frameworks to use. We'll still have Perl around for some time and I will continue writing Perl code until I find it no longer meets my needs.

The Red Hat Package Manager includes a script (/usr/lib/rpm/perl.req) that manages to do a fairly decent job of teasing out the direct Perl dependencies of your module. One of its advantages is that it seems to semantically check for dependencies in a more complete manner than some other tools. For example, take this simple Perl script:

$ echo "use parent qw/Foo::Bar/;" > foo.pl

The rpm tool reports Foo::Bar as a dependency, whereas scandeps.pl does not.

$ scandeps.pl foo.pl
No modules found!

Now let's try using the -Rc options (don't recurse, compile).

$ scandeps.pl -Rc foo.pl
Can't locate Foo/Bar.pm in @INC (you may need to install the Foo::Bar module) (@INC contains: /usr/lib/bedrock/perl5 /opt/perl-5.28.1/lib/site_perl/5.28.1/x86_64-linux /opt/perl-5.28.1/lib/site_perl/5.28.1 /opt/perl-5.28.1/lib/5.28.1/x86_64-linux /opt/perl-5.28.1/lib/5.28.1) at /opt/perl-5.28.1/lib/5.28.1/parent.pm line 16.
BEGIN failed--compilation aborted at foo.pl line 5.
SYSTEM ERROR in compiling foo.pl: 512 at /opt/perl-5.28.1/lib/site_perl/5.28.1/Module/ScanDeps.pm line 1448.

Let's use /usr/lib/rpm/perl.req

$ /usr/lib/rpm/perl.req  foo.pl
perl(Foo::Bar)
perl(parent)

Now Devel::Modlist...

$ perl -MDevel::Modlist=nocore foo.pl
Can't locate Foo/Bar.pm in @INC (you may need to install the Foo::Bar module) (@INC contains: /usr/lib/bedrock/perl5 /opt/perl-5.28.1/lib/site_perl/5.28.1/x86_64-linux /opt/perl-5.28.1/lib/site_perl/5.28.1 /opt/perl-5.28.1/lib/5.28.1/x86_64-linux /opt/perl-5.28.1/lib/5.28.1) at /opt/perl-5.28.1/lib/5.28.1/parent.pm line 16.
BEGIN failed--compilation aborted at foo.pl line 4.

Okay, so the errors clearly indicate that the script is being compiled and the list of loaded modules (%INC, I suppose) is then dumped. So it appears we have at least 2 different methods of finding Perl dependencies.

  • compile and dump %INC
  • parse Perl and do semantic checking
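
The first approach can be sketched in a few lines. This is only an illustration of the idea, not the code used by scandeps.pl or Devel::Modlist; loaded_modules is a name invented for the example. Perl records every file it has loaded in the %INC hash, so after compiling and running a script, that hash lists everything that was pulled in, directly or indirectly.

```perl
use strict;
use warnings;

# Turn the keys of %INC (e.g. "List/Util.pm") into module names ("List::Util").
sub loaded_modules {
    my @modules;

    for my $file (sort keys %INC) {
        next unless $file =~ /\.pm$/;

        (my $module = $file) =~ s{/}{::}g;
        $module =~ s/\.pm$//;

        push @modules, $module;
    }

    return @modules;
}

# Load a core module for the demo, then dump everything that ended up loaded.
require List::Util;
print "$_\n" for loaded_modules();
```

Note that the listing includes modules loaded by other modules, which is why this approach reports more than the direct dependencies, and why it fails outright when a module such as Foo::Bar is not installed.
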
/usr/lib/rpm/perl.req appears to parse the Perl source. scandeps.pl also parses(?) the script, and has an option for compiling in order to tease out additional dependencies; however, it misses Foo::Bar when only parsing is specified. Both techniques have their advantages and disadvantages. Consider this:

$ echo "use LWP;" > foo.pl

Using scandeps.pl and allowing it to recurse...but no compilation.

$ scandeps.pl foo.pl | wc -l
96

Wow! Lots of dependencies.

scandeps.pl, no recursing, no compiling

$ scandeps.pl -R foo.pl | wc -l
1

scandeps.pl, no recursing, with compiling

$ scandeps.pl -Rc foo.pl | wc -l
13

...and Devel::Modlist??

$ perl -MDevel::Modlist=nocore foo.pl | wc -l
13

Okay, that's a little better. What about the rpm tool?

$ /usr/lib/rpm/perl.req foo.pl | wc -l
1

We know the only direct dependency of foo.pl is LWP, and both /usr/lib/rpm/perl.req and scandeps.pl with -R (no recursing, no compiling) can figure that out. However, scandeps.pl does get tripped up by our pathological case:

use parent qw/Foo::Bar/;
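
A static scan that copes with this case can be sketched with a couple of regular expressions. This is a deliberately naive illustration, not how /usr/lib/rpm/perl.req works; direct_deps is a name invented for the example, and it will be fooled by POD, heredocs, multi-line statements and string evals.

```perl
use strict;
use warnings;

# Return the modules named in use/require statements, including the classes
# handed to "use parent" or "use base".
sub direct_deps {
    my ($source) = @_;
    my %deps;

    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;    # skip comment lines

        next unless $line =~ /^\s*(?:use|require)\s+([A-Za-z_]\w*(?:::\w+)*)\s*(.*)/;
        my ($module, $rest) = ($1, $2);

        next if $module =~ /^v\d+$/;    # "use v5.10;" is a version, not a module
        $deps{$module} = 1;

        # parent and base load the classes in their import list,
        # unless -norequire is given.
        if ($module =~ /^(?:parent|base)$/ && $rest !~ /-norequire/) {
            $rest =~ s/\bqw\b//g;
            $deps{$_} = 1 for $rest =~ /([A-Za-z_]\w*(?:::\w+)*)/g;
        }
    }

    return sort keys %deps;
}

print "perl($_)\n" for direct_deps("use parent qw/Foo::Bar/;\n");
# prints:
# perl(Foo::Bar)
# perl(parent)
```

Because nothing is compiled, the scan works even when Foo::Bar is not installed, and it only ever reports what the file itself asks for.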

Why does any of this matter anyway? I'm not sure honestly; however, at the end of the day I do want to make sure I articulate module dependencies in my CPAN distribution correctly.

While it's true that using the options that recurse and compile will produce a more complete listing of dependencies, it seems redundant (and possibly problematic once we start talking about module versions and what's actually in your current Perl environment). If every CPAN module lists its direct dependencies, then tools like cpanm should have no problem resolving all dependencies for a given Perl module distributed on CPAN. Correct?

So it seems it would be "best" to list only your direct dependencies and hope everyone else has a complete list of their direct dependencies.
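
For what it's worth, here is roughly what listing only the direct dependencies looks like in a hand-written Makefile.PL, using the PREREQ_PM argument of ExtUtils::MakeMaker. The distribution name and version are placeholders, and this is not the output of make-cpan-dist.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME      => 'MyLambda',    # placeholder distribution name
    VERSION   => '0.01',        # placeholder version
    PREREQ_PM => {
        # Direct dependencies only; a version of 0 means "any version".
        # Whatever LWP itself needs is declared by LWP's own distribution.
        'LWP' => 0,
    },
);
```

An installer such as cpanm reads this list and then resolves each prerequisite's own prerequisites in turn.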

To do that (list direct dependencies only), from the tools I've looked at so far (which include others I have not discussed in this blog), /usr/lib/rpm/perl.req did the best job of telling me what the direct dependencies of my module are.

My gut tells me I am missing something and that the greater Perl community as always has the answers as to what's the best practice for describing dependencies in a CPAN distribution.

P.S. This is the most awfulest blog platform ever invented. It's almost impossible to determine how to format things as code or fixed fonts despite the various allegedly supported formats - not to mention the maddening need to reset my password every time I log in (to the same damn password!). It would be nice if Perl had a central place where the community could blog that actually worked well.

About bigfoot

I blog about Perl and Bedrock.