Sergey Kolychev [blogs.perl.org]

Machine learning in Perl: Kyuubi goes to a (Model)Zoo during The Starry Night.

By Sergey Kolychev on July 15, 2018 3:32 AM

Hello all, this is the fourth blog post in the Machine learning in Perl series, focusing on AI::MXNet, a Perl interface to Apache MXNet, a modern and powerful machine learning library.

If you'd like to refresh your memory, or are new to the series, please check the previous entries here: 1 2 3

If you follow ML research then you're probably well aware of the two most popular libraries out there: Google's TensorFlow and Facebook's PyTorch, a relative newcomer to the field that is rapidly gaining widespread acceptance.

The reason PyTorch has gained so much ground on TensorFlow is the dynamic nature of that library. TensorFlow started as a static graph library (which is easier to optimize), while PyTorch went with dynamically built graphs and a NumPy (read: PDL) style of programming, with robust GPU support and automatic differentiation, that is as easy to debug as ordinary Python code.

Of course, neither TF nor PyTorch is against common-sense compromises: TF recently added a dynamic style via the tf.eager package, and PyTorch is working on getting to a 'low memory usage, high speed, ready for production' kind of state.

Apache MXNet, which is supported by Amazon (but started as a free software project in academia), is less popular. I am not sure what the reason is; I'd guess herd mentality, perception over reality, that kind of stuff. Perl has suffered similarly from an undeserved, overly negative perception as a 'write only' language, and whatever the actual reality is does not seem to matter much.

MXNet stands for MiXNet: a mix of static (symbolic) and dynamic (tensor) styles of programming from the get-go. This is the philosophy of the lib. You can be flexible and code absolutely whatever you want with raw tensors (NDArrays in MXNet), or you can go with the strict symbolic style, and the lib will mercilessly optimize the execution of your graph, reducing memory usage and making training and inference as efficient and speedy as possible. Check it out on Google; MXNet consistently beats other libs across various benchmarks.

However, the raw tensor dynamic style of ML programming is not for everyone. It's hard, and you need to know a lot about the details. Besides, due to the popularity of ML, the topic is overrun with new, inexperienced people (me included) looking for a quick, easy, fool-proof solution. Hence the popularity of Keras (now tf.keras, an official front-end to TF), which covers the raw complexities with a layer of syntactic sugar, allowing for a more inclusive environment and wider enterprise adoption. That layer of sugar tends to hurt performance, but it seems to be a wise choice, judging by the wide adoption of the lib.

MXNet's answer to Keras and PyTorch is Gluon. It is essentially a layer very similar in syntax and capabilities to PyTorch and Keras: dynamically created graphs, full flexibility to do any kind of dynamic tensor operation with automatic differentiation, transparent multi-GPU and multi-machine training, etc. But with an MXNet twist. Gluon stays true to the MiX roots of the library, allowing for an extremely easy and transparent conversion of dynamic graphs to static ones, with all the optimization-related benefits, while not taking away (there are some caveats) the freedom of dynamic programming.

Okay, so far so good. But why is this wall of text being published on blogs.perl.org?

The reason is simple: AI::MXNet fully supports Gluon with all of its capabilities, and state-of-the-art networks from very recent ML papers can be implemented in Perl easily, efficiently and painlessly.

To demonstrate this fact, and hopefully to spark an interest in ML among the Perl community, I recently ported Gluon ModelZoo (a set of pretrained vision networks with state-of-the-art models for the ImageNet dataset) to Perl as AI::MXNet::Gluon::ModelZoo and added two new, really cool Gluon examples.

This new module and the examples are the main topics of this post.

Let's start with AI::MXNet::Gluon::ModelZoo itself. It's a collection of seven different deep neural networks capturing the effort to 'solve' the ImageNet dataset in the time span between 2012 and 2017 (it's effectively solved, and the state of the art has surpassed human performance).

For the sake of brevity we'll concentrate on the smallest network, AlexNet, the granddaddy of today's ML craze. AlexNet started it all in 2012 by beating its closest competitor by a whopping 41%.

Here is how it's defined in Gluon.


package AI::MXNet::Gluon::ModelZoo::Vision::AlexNet;
use strict;
use warnings;
use AI::MXNet::Function::Parameters;
use AI::MXNet::Gluon::Mouse;
extends 'AI::MXNet::Gluon::HybridBlock';

has 'classes' => (is => 'ro', isa => 'Int', default => 1000);

method python_constructor_arguments() { ['classes'] }

sub BUILD
{
    my $self = shift;
    $self->name_scope(sub {
        $self->features(nn->HybridSequential(prefix=>''));
        $self->features->name_scope(sub {
            $self->features->add(nn->Conv2D(64, kernel_size=>11, strides=>4,
                                            padding=>2, activation=>'relu'));
            $self->features->add(nn->MaxPool2D(pool_size=>3, strides=>2));
            $self->features->add(nn->Conv2D(192, kernel_size=>5, padding=>2,
                                            activation=>'relu'));
            $self->features->add(nn->MaxPool2D(pool_size=>3, strides=>2));
            $self->features->add(nn->Conv2D(384, kernel_size=>3, padding=>1,
                                            activation=>'relu'));
            $self->features->add(nn->Conv2D(256, kernel_size=>3, padding=>1,
                                            activation=>'relu'));
            $self->features->add(nn->Conv2D(256, kernel_size=>3, padding=>1,
                                            activation=>'relu'));
            $self->features->add(nn->MaxPool2D(pool_size=>3, strides=>2));
            $self->features->add(nn->Flatten());
            $self->features->add(nn->Dense(4096, activation=>'relu'));
            $self->features->add(nn->Dropout(0.5));
            $self->features->add(nn->Dense(4096, activation=>'relu'));
            $self->features->add(nn->Dropout(0.5));
        });
        $self->output(nn->Dense($self->classes));
    });
}

method hybrid_forward(GluonClass $F, GluonInput $x)
{
    $x = $self->features->($x);
    $x = $self->output->($x);
    return $x;
}

A few things here warrant attention. AI::MXNet::Gluon::Mouse is just a subclass of Mouse that implicitly adds an internal trigger to every attribute, so that the user does not have to add that trigger explicitly.

The network itself is a subclass of AI::MXNet::Gluon::HybridBlock, the class that allows Gluon to be true to its MiX roots, hence 'Hybrid' in its name. By simply calling the ->hybridize method on the net object, a user signals that the work on graph creation is complete and that the lib is now allowed to optimize all the innards to its liking.

sub BUILD is the heart of the net creation, the place where the magic happens. In order to make conversion of Python examples to Perl as simple as possible, I added auto-vivification of new attributes (via AUTOLOAD, which just calls Mouse's 'has' the first time an attribute is mentioned).
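
The auto-vivification idea can be illustrated without Mouse at all. What follows is a minimal sketch, not the actual AI::MXNet::Gluon::Mouse code: Toy::Block is a hypothetical class whose AUTOLOAD installs a plain hash-based accessor the first time an unknown attribute is mentioned, whereas the real module calls Mouse's 'has' at that point.

```perl
use strict;
use warnings;

package Toy::Block;   # hypothetical class, for illustration only

our $AUTOLOAD;

sub new { return bless {}, shift }

# Perl calls AUTOLOAD whenever an unknown method is invoked on the object.
sub AUTOLOAD {
    my ($name) = $AUTOLOAD =~ /::(\w+)$/;
    return if $name eq 'DESTROY';      # never vivify the destructor
    no strict 'refs';
    # Install a real accessor so AUTOLOAD is hit only once per attribute.
    *{$AUTOLOAD} = sub {
        my $self = shift;
        $self->{$name} = shift if @_;
        return $self->{$name};
    };
    goto &$AUTOLOAD;                   # re-dispatch to the new accessor
}

package main;

my $block = Toy::Block->new;
$block->features('conv stack');        # the attribute springs into existence
print $block->features, "\n";
print Toy::Block->can('features') ? "accessor installed\n" : "accessor missing\n";
```

After the first call the accessor is an ordinary method, so later calls no longer go through AUTOLOAD.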

method hybrid_forward is essentially what happens during the forward phase of the net execution. At this point you can think of it as something that executes the network, converting the input into the output: the image of a cat into the answer 'this picture contains a cat'.

You may be wondering what nn-> means. It's there to allow converting Python examples to Perl in the least painful manner. To Perl it's just 'AI::MXNet::Gluon::NN', a module that houses a vast collection of predefined blocks from which deep nets are built, like Lego.
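
One way such a shortcut can work in plain Perl is sketched below. This is an assumption about the mechanism, not a copy of the AI::MXNet source: Toy::Gluon::NN and its Dense method are hypothetical stand-ins, and the only point is that a sub returning a class name can be used as the invocant of a method call.

```perl
use strict;
use warnings;

package Toy::Gluon::NN;   # hypothetical stand-in for AI::MXNet::Gluon::NN

# A toy "layer factory": returns a description instead of a real layer.
sub Dense {
    my ($class, $units, %args) = @_;
    return sprintf('Dense(%d, activation=%s)', $units, $args{activation} // 'linear');
}

package main;

# The shortcut: a sub whose only job is to return the long class name.
sub nn { 'Toy::Gluon::NN' }

# Because a sub named nn exists in this package, Perl parses
# nn->Dense(...) as nn()->Dense(...), a class method call on the
# string returned by nn().
my $layer = nn->Dense(4096, activation => 'relu');
print "$layer\n";
```

The Python original would read nn.Dense(4096, activation='relu'), so the Perl line differs only in the arrow and the fat comma.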

Want to look deeper? Easy. Let's stringify the net and let it tell us its structure.


use AI::MXNet qw(mx);
print mx->gluon->model_zoo->vision->alexnet

AlexNet(
  (features): HybridSequential(
    (0): Conv2D(64, kernel_size=(11,11), stride=(4,4), padding=(2,2))
    (1): MaxPool2D(size=(3,3), stride=(2,2), padding=(0,0), ceil_mode=0)
    (2): Conv2D(192, kernel_size=(5,5), stride=(1,1), padding=(2,2))
    (3): MaxPool2D(size=(3,3), stride=(2,2), padding=(0,0), ceil_mode=0)
    (4): Conv2D(384, kernel_size=(3,3), stride=(1,1), padding=(1,1))
    (5): Conv2D(256, kernel_size=(3,3), stride=(1,1), padding=(1,1))
    (6): Conv2D(256, kernel_size=(3,3), stride=(1,1), padding=(1,1))
    (7): MaxPool2D(size=(3,3), stride=(2,2), padding=(0,0), ceil_mode=0)
    (8): Flatten
    (9): Dense(4096 -> 0, Activation(relu))
    (10): Dropout(p = 0.5)
    (11): Dense(4096 -> 0, Activation(relu))
    (12): Dropout(p = 0.5)
  )
  (output): Dense(1000 -> 0, linear)
)

Want to see how the input dimensions change layer by layer? Easy. Let's call the summary method for that.


use AI::MXNet qw(mx); 
use AI::MXNet::Gluon qw(gluon); 
my $net = mx->gluon->model_zoo->vision->alexnet; 
$net->initialize; 
$net->(nd->random->uniform(shape => [1,3,224,224])); 
$net->summary(nd->random->uniform(shape=>[1,3,224,224]))
--------------------------------------------------------------------------------
        Layer (type)                                Output Shape         Param #
================================================================================
               Input                            (1, 3, 224, 224)               0
        Activation-1                             (1, 64, 55, 55)               0
            Conv2D-2                             (1, 64, 55, 55)           23296
         MaxPool2D-3                             (1, 64, 27, 27)               0
        Activation-4                            (1, 192, 27, 27)               0
            Conv2D-5                            (1, 192, 27, 27)          307392
         MaxPool2D-6                            (1, 192, 13, 13)               0
        Activation-7                            (1, 384, 13, 13)               0
            Conv2D-8                            (1, 384, 13, 13)          663936
        Activation-9                            (1, 256, 13, 13)               0
           Conv2D-10                            (1, 256, 13, 13)          884992
       Activation-11                            (1, 256, 13, 13)               0
           Conv2D-12                            (1, 256, 13, 13)          590080
        MaxPool2D-13                              (1, 256, 6, 6)               0
          Flatten-14                                   (1, 9216)               0
       Activation-15                                   (1, 4096)               0
            Dense-16                                   (1, 4096)        37752832
          Dropout-17                                   (1, 4096)               0
       Activation-18                                   (1, 4096)               0
            Dense-19                                   (1, 4096)        16781312
          Dropout-20                                   (1, 4096)               0
            Dense-21                                   (1, 1000)         4097000
          AlexNet-22                                   (1, 1000)               0
================================================================================
Total params: 61100840
Trainable params: 61100840
Non-trainable params: 0
Shared params: 0
--------------------------------------------------------------------------------
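
The shapes and parameter counts in this table can be reproduced with a little arithmetic. The sketch below uses plain Perl and no MXNet at all; the helper names (out_size, conv_params, dense_params) are made up for this illustration. It assumes the usual formulas: a window of size k with stride s and padding p maps an axis of length n to floor((n + 2p - k) / s) + 1, a convolution has out * in * k * k weights plus one bias per output channel, and a dense layer has in * out weights plus one bias per output unit.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Output length of a convolution or pooling window along one spatial axis.
sub out_size {
    my ($in, $kernel, $stride, $pad) = @_;
    return floor(($in + 2 * $pad - $kernel) / $stride) + 1;
}

# Convolution: weights plus one bias per output channel.
sub conv_params {
    my ($in_ch, $out_ch, $kernel) = @_;
    return $out_ch * $in_ch * $kernel * $kernel + $out_ch;
}

# Dense layer: weights plus one bias per output unit.
sub dense_params {
    my ($in, $out) = @_;
    return $in * $out + $out;
}

# The feature extractor of AlexNet as listed above:
# [type, output channels, kernel, stride, padding]
my @layers = (
    [conv => 64,    11, 4, 2],
    [pool => undef, 3,  2, 0],
    [conv => 192,   5,  1, 2],
    [pool => undef, 3,  2, 0],
    [conv => 384,   3,  1, 1],
    [conv => 256,   3,  1, 1],
    [conv => 256,   3,  1, 1],
    [pool => undef, 3,  2, 0],
);

my ($channels, $size, $total) = (3, 224, 0);
for my $layer (@layers) {
    my ($type, $out_ch, $kernel, $stride, $pad) = @$layer;
    if ($type eq 'conv') {
        $total += conv_params($channels, $out_ch, $kernel);
        $channels = $out_ch;
    }
    $size = out_size($size, $kernel, $stride, $pad);
    printf "%-4s -> (1, %d, %d, %d)\n", $type, $channels, $size, $size;
}

# Flatten, then the three dense layers (4096, 4096, 1000 units).
my $flat = $channels * $size * $size;
my $in   = $flat;
for my $units (4096, 4096, 1000) {
    $total += dense_params($in, $units);
    $in = $units;
}
print "Flatten -> (1, $flat)\n";
print "Total params: $total\n";
```

The printed total agrees with the "Total params" line of the summary, and the intermediate shapes match the Conv2D and MaxPool2D rows.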

We can do even better: convert the net into a static graph and print out an image of its structure.

OK, that may be cool, but I think it's a little dry. Let's add some cute pictures to the mix. You may have wondered why the title of the blog post is what it is.

Kyuubi is my dog (love him soooo much :-)), a four-year-old Pembroke Welsh Corgi.

It's easy for networks with 95% accuracy on ImageNet to identify a Corgi. I do, however, have a photo that obscures his doggy features a bit. In this photo Kyuubi is enjoying the total solar eclipse of Aug 21, 2017 in Salem, OR. Naturally, he took some precautions in order to protect his eyes.

Let's see if one of the ModelZoo networks will be able to correctly classify what's in the picture. Below you can see the example that I included with the AI::MXNet::Gluon::ModelZoo module.



use strict;
use warnings;
use IO::File;
use AI::MXNet::Gluon::ModelZoo 'get_model';
use AI::MXNet::Gluon::Utils 'download';
use Getopt::Long qw(HelpMessage);

GetOptions(
    ## my Pembroke Welsh Corgi Kyuubi, enjoying the solar eclipse of August 21, 2017
    'image=s' => \(my $image = 'http://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/'.
                               'gluon/dataset/kyuubi.jpg'),
    'model=s' => \(my $model = 'resnet152_v2'),
    'help'    => sub { HelpMessage(0) },
) or HelpMessage(1);

## get a pretrained model (download parameters file if necessary)
my $net = get_model($model, pretrained => 1);

## ImageNet classes
my $fname = download('http://data.mxnet.io/models/imagenet/synset.txt');
my @text_labels = map { chomp; s/^\S+\s+//; $_ } IO::File->new($fname)->getlines;

## get the image from the disk or net
if($image =~ /^https/)
{
    eval { require IO::Socket::SSL; };
    die "Need to have IO::Socket::SSL installed for https images" if $@;
}
$image = $image =~ /^https?/ ? download($image) : $image;

# Following the conventional way of preprocessing ImageNet data:
# Resize the short edge to 256 pixels,
# and then perform a center crop to obtain a 224-by-224 image.
# The following code uses the image processing functions provided
# in the AI::MXNet::Image module.
$image = mx->image->imread($image);
$image = mx->image->resize_short($image, $model =~ /inception/ ? 330 : 256);
($image) = mx->image->center_crop($image, [($model =~ /inception/ ? 299 : 224)x2]);

## the image is read (via OpenCV) as height x width x channels;
## move the channels first and add a batch axis
$image = $image->transpose([2,0,1])->expand_dims(axis=>0);

## normalizing the image
my $rgb_mean = nd->array([0.485, 0.456, 0.406])->reshape([1,3,1,1]);
my $rgb_std = nd->array([0.229, 0.224, 0.225])->reshape([1,3,1,1]);
$image = ($image->astype('float32') / 255 - $rgb_mean) / $rgb_std;

# Now we can recognize the object in the image.
# We perform an additional softmax on the output to obtain probability scores,
# and then print the top-5 recognized objects.
my $prob = $net->($image)->softmax;
for my $idx (@{ $prob->topk(k=>5)->at(0) })
{
    my $i = $idx->asscalar;
    printf(
        "With prob = %.5f, it contains %s\n",
        $prob->at(0)->at($i)->asscalar, $text_labels[$i]
    );
}

The core of the script is quite simple: convert the input (an image) to the output (a one-dimensional array of size 1000, one entry per ImageNet class). The array is essentially a probability distribution (all values sum to 1), and the value at the index of a specific class is the probability that an object of that class is present in the picture.
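
To make the 'probability distribution' part concrete, here is a minimal pure-Perl sketch of what softmax and topk do, independent of MXNet. The raw scores and the four labels are made up for illustration; the real network produces 1000 scores per image.

```perl
use strict;
use warnings;
use List::Util qw(max sum);

# Softmax: exponentiate and normalize so that the scores sum to 1.
# Subtracting the maximum first keeps exp() from overflowing.
sub softmax {
    my @scores = @_;
    my $max  = max(@scores);
    my @exp  = map { exp($_ - $max) } @scores;
    my $norm = sum(@exp);
    return map { $_ / $norm } @exp;
}

# Indices of the k largest values, best first.
sub topk {
    my ($k, @values) = @_;
    my @order = sort { $values[$b] <=> $values[$a] } 0 .. $#values;
    return @order[0 .. $k - 1];
}

# Made-up raw network outputs and labels, only for illustration.
my @labels = ('Pembroke', 'Cardigan', 'beagle', 'basset');
my @prob   = softmax(4.0, 3.2, -3.4, -3.8);

printf "With prob = %.5f, it contains %s\n", $prob[$_], $labels[$_]
    for topk(2, @prob);
printf "sum of all probabilities = %.5f\n", sum(@prob);
```

Softmax preserves the ordering of the raw scores, so the top-5 labels would be the same without it; it only makes the numbers readable as probabilities.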

To convert the input into the output we just need to pretend that our network is a reference to a subroutine and feed it our input as we would any other Perl sub. Everything else is merely busy work: read the image, convert it to the appropriate format (people never seem to agree on which dimension ordering is superior) and then print out the five largest probability values along with their text labels.
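
Perl lets an object behave like a subroutine reference by overloading code dereference. The sketch below shows that mechanism with a hypothetical Toy::Net class; it is an illustration of the general technique, and whether AI::MXNet wires it up in exactly this way is an assumption.

```perl
use strict;
use warnings;

package Toy::Net;   # hypothetical class, for illustration only

# Overloading '&{}' lets an object be called like a sub reference.
use overload '&{}' => sub {
    my $self = shift;
    return sub { $self->forward(@_) };
};

sub new {
    my ($class, %args) = @_;
    return bless { scale => $args{scale} // 1 }, $class;
}

# Stand-in for the forward pass: scale every input value.
sub forward {
    my ($self, @input) = @_;
    return map { $_ * $self->{scale} } @input;
}

package main;

my $net = Toy::Net->new(scale => 2);
my @out = $net->(1, 2, 3);    # looks like a sub call, runs forward()
print "@out\n";
```

The object is still an ordinary blessed hash, so method calls such as $net->forward(...) keep working alongside the $net->(...) form.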


./image_classification.pl 
Downloading synset.txt from http://data.mxnet.io/models/imagenet/synset.txt ...
Downloading kyuubi.jpg from http://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/kyuubi.jpg ...
With prob = 0.69273, it contains Pembroke, Pembroke Welsh corgi
With prob = 0.30584, it contains Cardigan, Cardigan Welsh corgi
With prob = 0.00041, it contains beagle
With prob = 0.00029, it contains basset, basset hound
With prob = 0.00010, it contains Eskimo dog, husky

As we can see, the 152-layer-deep residual network (the input and output of a layer are summed up and fed to the next layer) had no trouble seeing through Kyuubi's shenanigans.
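
The residual idea fits in a few lines of plain Perl. This is only a toy sketch: layer() is a made-up stand-in that halves every value, whereas a real residual block wraps convolutions and normalization.

```perl
use strict;
use warnings;

# A stand-in "layer": any function from a vector to a vector of the same size.
sub layer { return map { 0.5 * $_ } @_ }

# Residual connection: add the layer's input to the layer's output.
sub residual {
    my @x  = @_;
    my @fx = layer(@x);
    return map { $fx[$_] + $x[$_] } 0 .. $#x;
}

my @y = residual(1, 2, 4);
print "@y\n";
```

Because the input is passed through unchanged and added back, the layer only has to learn a correction to its input, which is what makes very deep stacks trainable.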

But this is not all; there's one more Gluon example, artistic style transfer. You may have seen examples of this technique; there are even mobile apps that allow you to convert your photos into 'timeless art'.

Now you can just read well-formatted Perl code and see for yourself what's under the hood. The example implements absolute state-of-the-art style transfer, as good as it gets, and a real-time one to boot, with not much waiting needed. Feed it any style picture and your image/photo and prepare to be amused by the result in 10 to 15 seconds.

I'll refrain from explaining the code; the actual 'network' part is still something that I do not fully understand (working on it), and everything else is busy work with images.

I hope you'll have fun with this example and produce a lot of nice pictures with it. Below are the images produced by the example script from Kyuubi's photo and different classic paintings.

Style image: Kazimir Malevich, Black Square

Style image: random ornate stone wall image

Style image: Salvador Dali, The Enigma of Desire

Style image: Vincent van Gogh, The Starry Night

That's all that I wanted to share today, and it's time for a little rant.

When I started porting MXNet to Perl in Dec 2016, my motivation was to give the Perl community exposure to a modern ML lib, get people to use it, and mitigate in some way the severe lack of ML tools in the public Perl sphere.

Now, it worked pretty well for me personally on many levels. Exposure to complex Python taught me a lot about that language, and porting the lib itself taught me a lot about the under-the-hood details of the ML process. Using AI::MXNet I was able to introduce ML at my day job (we are mostly a Perl shop) with considerable success.

But I am now having second thoughts about the actual value of this work to you guys. Does the Perl community actually need ML? Or is it that everybody who needs ML just switches to Python, and that's it?

In these two years I have remained the sole contributor to the codebase (with the exception of a bit of help from a work colleague), almost no issues have been submitted, and the number of docker downloads is at about 130 or so.

When I started to learn Perl in 1998 I wrote a module called Net::RawIP to help myself get a hold of the language. That module is an absolute disaster of crappy code, like really horrible. But in the days after releasing it I got dozens of emails and patches; people contacted me and helped me port it to Solaris, *BSD, etc. The work felt needed.

With AI::MXNet the feeling is the opposite. It looks like I am writing the module for myself and it's not really needed or asked for. Hopefully that is not the case. I am looking for more contributors to the Perl part of the MXNet codebase; there's a lot of work and I could use some help.

Thank you for reading this far.


Perl client for NATS Streaming Messaging System

By Sergey Kolychev on November 12, 2017 9:29 PM

Hello all,

With micro-services and the cloud being the buzzwords of the day, it's no surprise that the market for messaging systems is pretty busy at the moment.

One such system, NATS (and its persistent version, NATS Streaming), seems to be a leader among the relatively new arrivals. If interested, please read more about it on its official page, nats.io.

There are already a lot of clients for the original NATS on the market, including one for Perl, https://metacpan.org/pod/…


Machine learning in Perl, Part3: Deep Convolutional Generative Adversarial network

By Sergey Kolychev on October 7, 2017 8:48 PM

Hello all,
Quick update on the status of AI::MXNet.

Recently MXNet proper got a cool addition: a new imperative, PyTorch-like interface called Gluon. If interested, please read about it at the Gluon home page.

I am pleased to announce that Perl (as of AI::MXNet 1.1) is joining the happy family of Lua and Python, languages that are able to express ML ideas with Torch-like elegance and fluidity.

Today's code is from Perl examples, and if you would like to understand it deeper please read the details at /var/www/users/sergey_kolychev/index.html


Machine learning in Perl, Part2: a calculator, handwritten digits and roboshakespeare.

By Sergey Kolychev on April 23, 2017 8:29 PM

Hello all,
The eight weeks that have passed since then were quite fruitful: I've ported the whole Python test suite, fixed multiple bugs, and added docs, examples and a high-level RNN interface, and the Perl API docs have been added to the official MXNet website.
This time I'd like to review in detail three examples from the examples directory.
The first one is a simple calculator, a fully connected net that is structured to learn four basi…


Machine learning in Perl

By Sergey Kolychev on February 21, 2017 11:58 PM

Hello everybody, this is my first post, so please go easy on me. I have worked with Perl for the last 19 years and it has always treated me extremely well in terms of interesting work and material compensation. This past December my company decided that it's time to finally join in the fashion of the day and start experimenting with ML.

I started researching and found out that my lovely Perl is stuck in the past in regard to ML support, and that there have been no recent developments in this area (for the full last decade).

Now look at Python! TensorFlow, MXNet, Keras, Theano, Caffe, and many, many more. Java has its deeplearning4j, Lua has Torch, and what did Perl have?

We had AI::FANN, an interface (a good one, I used it) to a C lib that has not seen any real development since 2007: only feed-forward neural networks, no convolutional networks, no recurrent networks. All the advances of the last 10 years just happened outside of Perl.

Please do not get me wrong: PDL is wonderful, CPAN is livelier than ever, and Perl 5 is actively being developed, but ignoring such an important topic completely does not look like a good state of things.

So I had a sad day around Dec 15 and decided to try to repay Perl for 19 years of taking care of me. After researching and comparing the existing libraries I settled on MXNet as the best lib around (it's the most scalable, fast and efficient, with the cleanest design, and has the backing of Amazon as its official ML lib).

I started to write a Perl interface to the lib. It took about 1.5 months of work, but it has finally reached a rough usable state (the pod is mostly still Python, some aspects of the Python interface are not yet ported, and I am actively working on fixing the remaining stuff). Today it was accepted into the official MXNet GitHub repository :-) and I hope to be able to keep the Perl interface on par with Python for years to come.

OK, now for the code example; this is straight out of the t/ dir. I am consciously keeping the outer sugar very close to the original Python usage so that examples written in Python are applicable (just add $ sigils :-))


    ## Convolutional NN for recognizing hand-written digits in MNIST dataset
    ## It's considered "Hello, World" for Neural Networks
    ## For more info about the MNIST problem please refer to http://neuralnetworksanddeeplearning.com/chap1.html
    use strict;
    use warnings;
    use AI::MXNet qw(mx);
    use AI::MXNet::TestUtils qw(GetMNIST_ubyte);
    use Test::More tests => 1;
    # symbol net
    my $batch_size = 100;
    ### model
    my $data = mx->symbol->Variable('data');
    my $conv1= mx->symbol->Convolution(data => $data, name => 'conv1', num_filter => 32, kernel => [3,3], stride => [2,2]);
    my $bn1  = mx->symbol->BatchNorm(data => $conv1, name => "bn1");
    my $act1 = mx->symbol->Activation(data => $bn1, name => 'relu1', act_type => "relu");
    my $mp1  = mx->symbol->Pooling(data => $act1, name => 'mp1', kernel => [2,2], stride =>[2,2], pool_type=>'max');
    my $conv2= mx->symbol->Convolution(data => $mp1, name => 'conv2', num_filter => 32, kernel=>[3,3], stride=>[2,2]);
    my $bn2  = mx->symbol->BatchNorm(data => $conv2, name=>"bn2");
    my $act2 = mx->symbol->Activation(data => $bn2, name=>'relu2', act_type=>"relu");
    my $mp2  = mx->symbol->Pooling(data => $act2, name => 'mp2', kernel=>[2,2], stride=>[2,2], pool_type=>'max');
    my $fl   = mx->symbol->Flatten(data => $mp2, name=>"flatten");
    my $fc1  = mx->symbol->FullyConnected(data => $fl,  name=>"fc1", num_hidden=>30);
    my $act3 = mx->symbol->Activation(data => $fc1, name=>'relu3', act_type=>"relu");
    my $fc2  = mx->symbol->FullyConnected(data => $act3, name=>'fc2', num_hidden=>10);
    my $softmax = mx->symbol->SoftmaxOutput(data => $fc2, name => 'softmax');
    # check data
    GetMNIST_ubyte();
    my $train_dataiter = mx->io->MNISTIter({
        image=>"data/train-images-idx3-ubyte",
        label=>"data/train-labels-idx1-ubyte",
        data_shape=>[1, 28, 28],
        batch_size=>$batch_size, shuffle=>1, flat=>0, silent=>0, seed=>10});
    my $val_dataiter = mx->io->MNISTIter({
        image=>"data/t10k-images-idx3-ubyte",
        label=>"data/t10k-labels-idx1-ubyte",
        data_shape=>[1, 28, 28],
        batch_size=>$batch_size, shuffle=>1, flat=>0, silent=>0});
    my $n_epoch = 1;
    my $mod = mx->mod->new(symbol => $softmax);
    $mod->fit(
        $train_dataiter,
        eval_data => $val_dataiter,
        optimizer_params=>{learning_rate=>0.01, momentum=> 0.9},
        num_epoch=>$n_epoch
    );
    my $res = $mod->score($val_dataiter, mx->metric->create('acc'));
    ok($res->{accuracy} > 0.8);

I hope this extension will be useful, and maybe it will be able to chip away just a bit at some of the negativity that surrounds Perl these days.

Thank you for reading this far! If interested, please check out MXNet at the MXNet GitHub repository.

