<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>David Mertens</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/" />
    <link rel="self" type="application/atom+xml" href="http://blogs.perl.org/users/david_mertens/atom.xml" />
    <id>tag:blogs.perl.org,2009-11-03:/users/david_mertens//664</id>
    <updated>2013-03-11T19:39:08Z</updated>
    <subtitle>A blog about the Perl programming language</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 4.38</generator>

<entry>
    <title>Request for help to verify a docs translation to Serbo-Croatian</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2013/03/request-for-help-to-verify-a-docs-translation-to-serbo-croation.html" />
    <id>tag:blogs.perl.org,2013:/users/david_mertens//664.4412</id>

    <published>2013-03-11T19:30:41Z</published>
    <updated>2013-03-11T19:39:08Z</updated>

    <summary>I am writing to solicit help from anybody who knows Perl and can read Serbo-Croatian. Vera Djuraskovic kindly offered to translate the documentation to PDL&apos;s threading engine, PDL::PP. Not light material, mind you. :-) The problem is that I only...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="pdl" label="PDL" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="translation" label="translation" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>I am writing to solicit help from anybody who knows Perl and can read Serbo-Croatian. Vera Djuraskovic kindly offered to translate the documentation to PDL's threading engine, PDL::PP. Not light material, mind you. :-)</p>

<p>The problem is that I only speak and read English (and maybe British). If you think you can help, or think you know somebody who can help, please check out <a href="http://science.webhostinggeeks.com/pdlpp-pdl-rutine">http://science.webhostinggeeks.com/pdlpp-pdl-rutine</a>. The original English can be found at <a href="http://pdl.perl.org/PDLdocs/PP.html">http://pdl.perl.org/PDLdocs/PP.html</a>.</p>

<p>Of course, the original docs are written in pod, and I would really like to distribute the translation with PDL itself. However, Vera has not expressed any interest in distributing these docs as pod, possibly because she (he?) would actually like to drive some viewers to his (her?) site. For my part, I am most concerned about having the translation. If the translator can derive a fringe benefit from the translation, I'm OK with that.</p>

<p>Thanks in advance!</p>]]>
        
    </content>
</entry>

<entry>
    <title>CUDA::Minimal, back where it should be (but why did it break?)</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2013/02/cudasimple-back-where-it-should-be-but-why-did-it-break.html" />
    <id>tag:blogs.perl.org,2013:/users/david_mertens//664.4341</id>

    <published>2013-02-17T07:21:10Z</published>
    <updated>2013-02-18T02:45:26Z</updated>

    <summary>I just posted an entry about how CUDA::Minimal was behaving weirdly. I hadn&apos;t dug around to figure out what was going on, and I went so far as to modify the way the code was compiled and linked to get...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="ofun" label="-Ofun" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="cuda" label="CUDA" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>I just <a href="http://blogs.perl.org/users/david_mertens/2013/02/cudasimple-takes-two-steps-backwards-one-step-forward.html">posted an entry</a> about how <a href="https://github.com/run4flat/perl-CUDA-Minimal">CUDA::Minimal</a> was behaving weirdly. I hadn't dug around to figure out what was going on, and I went so far as to modify the way the code was compiled and linked to get it to work! My <a href="http://blogs.perl.org/users/david_mertens/2013/02/cudasimple-takes-two-steps-backwards-one-step-forward.html">previous post</a> raised concerns about how I would continue with the newly-chosen path.</p>

<p>Well, I decided that I should really figure out what was giving trouble. After many hours (and staying up later than I ought), I figured it out. I highly suspect that these problems are due to interactions with C macros defined by nVidia, but I'm not sure. I'm going to post them here, and possibly also to the perl-xs mailing list, in hopes that they might help somebody solve problems.</p>

<p>There were two big changes. First, all XS code that uses <a href="http://p3rl.org/ExtUtils::nvcc">ExtUtils::nvcc</a> has to include this in its boot section:</p>

<pre><code>BOOT:
#undef PERL_VERSION
#define PERL_VERSION 0
</code></pre>

<p>Why? Because this generated C code that came later in the <code>BOOT</code> section was causing segfaults:</p>

<pre><code>#if (PERL_REVISION == 5 &amp;&amp; PERL_VERSION &gt;= 9)
  if (PL_unitcheckav)
       call_list(PL_scopestack_ix, PL_unitcheckav);
#endif
</code></pre>

<p>Second, in order to get the address of the pointer value part of a scalar, I use <code>sv_2pvbyte_nolen</code>. I originally used <code>SvPVX</code> (having verified that the slot existed), but that started to return nil in this new compilation setup. I also tried using <code>SvPVbyte_nolen</code>, but that also returned nil. The <a href="http://perldoc.perl.org/perlapi.html#pack_cat">perlapi docs for <code>sv_2pvbyte_nolen</code></a> state this function is "Usually accessed via the SvPVbyte_nolen macro," but I can't figure out if there's anything bad with using <code>sv_2pvbyte_nolen</code>.</p>

<p>You might ask why I don't file these on perlbug. Like I said, I'm not sure if these are bugs with Perl, or with the Perl-nvcc interactions. If it's the latter, I'm not so worried about them, and I'd rather at least begin by finding and documenting these troubles. I'll try to fix them if they really start giving trouble. If you have any ideas for what is going on or want me to try something to figure it out, feel free to respond here, or <a href="https://github.com/run4flat/perl-CUDA-Minimal/issues">file a bug report on CUDA::Minimal</a>.</p>

<p>And with that, I'm going to get my play-perl points. :-)</p>
]]>
        

    </content>
</entry>

<entry>
    <title>CUDA::Minimal takes two steps backwards, one step forward</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2013/02/cudasimple-takes-two-steps-backwards-one-step-forward.html" />
    <id>tag:blogs.perl.org,2013:/users/david_mertens//664.4340</id>

    <published>2013-02-17T04:57:15Z</published>
    <updated>2013-02-18T02:46:44Z</updated>

    <summary>Edit: I got my original approach to work, see my follow-up. A week ago I wrote about how I thought play-perl was great. I put up a bunch of ideas and waited to see what others would encourage me to...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="ofun" label="-Ofun" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="cuda" label="CUDA" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>Edit: I got my original approach to work, see <a href="http://blogs.perl.org/users/david_mertens/2013/02/cudasimple-back-where-it-should-be-but-why-did-it-break.html">my follow-up</a>.</p>

<p>A week ago I wrote about how I thought play-perl was great. I put up a bunch of ideas and waited to see what others would encourage me to do. Two of my ideas got two votes each (some others got single votes), one of which involved <a href="http://play-perl.org/quest/5115c734172f6c1d5e00000f">revising CUDA::Minimal so it works again</a>. (<a href="http://en.wikipedia.org/wiki/CUDA">CUDA is a means for executing massively parallelized code on your video card.</a>) Well, <a href="https://github.com/run4flat/perl-CUDA-Minimal">it compiles now</a>, but it doesn't quite work the way I had hoped and I am facing some basic architectural redesign issues. Herein I describe the old way things worked, and why that won't work anymore. Can anybody offer some ideas for how I might move forward?</p>]]>
        <![CDATA[<p>First, let me explain the original design. Originally, I released <a href="http://p3rl.org/ExtUtils::nvcc">ExtUtils::nvcc</a>. The goal of this module was to operate as a drop-in replacement for your C compiler in ExtUtils::MakeMaker, Module::Build, and Inline::C that actually invoked <a href="https://duckduckgo.com/NVIDIA_CUDA_Compiler">nVidia's nvcc</a>. nvcc lets normal C (or C++) code pass through without touching it, but it handles special minor language extensions, making it easy and relatively painless to mix device (video card) and host (regular CPU) code. Using this compiler wrapper, I could write XS code with CUDA-C, and it Just Worked.</p>

<p>That was three years ago (almost exactly), and I hadn't touched the code since July of 2010. Last September, I tried recompiling CUDA::Minimal, but it didn't work anymore. I do not understand in any way what is going on, but it seems as though nVidia introduced a C macro into their libraries (which get added when you compile with nvcc) that trips up some of Perl's internals. (I tried compiling with different versions of Perl---including the one I originally used in my development---and they all failed, so I presume it was a change on nVidia's end, not Perl's.)</p>

<p>I need to fix this because I wrote some serious numerical code a couple of years ago that I want to use again, and which is locked into this old system. I am also motivated to score my first points on play-perl.org. :-)</p>

<p>My current solution gets CUDA::Minimal back up and running. You can once again use it to transfer data to your video card, and transfer data back from your video card, because the underlying interfaces for those were C functions to begin with. However, you can't just write your CUDA code in your XS files anymore. You can't write simple XS wrappers around kernel launches.</p>

<p>To get an idea of what nvcc lets you do, see the code under <a href="https://llpanorama.wordpress.com/2008/05/21/my-first-cuda-program/">this post</a>. The function on line 10, which starts with __global__, is meant to be compiled for and run on your video card. You need to run that through nvcc. The other important piece is on line 30, where you see the triple-angle-bracket notation. That code is replaced by nvcc with a set of declarations and function invocations that builds the call stack and invokes it on the video card. Once upon a time, you could include these triple-angle-bracket invocations directly into your XS or Inline::C code and it worked. Now it doesn't.</p>
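
<p>For readers who would rather not follow the link, the two pieces look roughly like this. This is only a minimal sketch, not the code from the linked post: the kernel name <code>square_all</code> and the array size are hypothetical, and error checking is omitted for brevity.</p>

<pre><code>#include &lt;stdio.h&gt;
#include &lt;cuda_runtime.h&gt;

#define N 8  /* hypothetical array size */

/* Device code: __global__ marks a function that runs on the video card. */
__global__ void square_all(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i &lt; n) data[i] = data[i] * data[i];
}

int main(void)
{
    float host[N];
    float *dev = NULL;
    int i;

    for (i = 0; i &lt; N; i++) host[i] = (float)i;

    /* Allocate memory on the card and copy the input over. */
    cudaMalloc((void **)&amp;dev, N * sizeof(float));
    cudaMemcpy(dev, host, N * sizeof(float), cudaMemcpyHostToDevice);

    /* The triple-angle-bracket launch: one block of N threads. */
    square_all&lt;&lt;&lt;1, N&gt;&gt;&gt;(dev, N);

    /* Copy the results back; this copy waits for the kernel to finish. */
    cudaMemcpy(host, dev, N * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    for (i = 0; i &lt; N; i++) printf("%g ", host[i]);
    printf("\n");
    return 0;
}
</code></pre>

<p>Everything except the <code>__global__</code> qualifier, the built-in index variables, and the launch line is ordinary C, which is why nvcc can act as a stand-in for the regular compiler.</p>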

<p>Architecturally, you can get around this by creating a separate source file with C functions that perform whatever kernel invocation you want, and compiling that source file with nvcc. You would then write an XS file that uses a common C header file, and link the nvcc-compiled code at the last second. However, that's a major change in how one would call kernels from XS code. It's an extra pair of files, and managing the proper linker arguments is a pain.</p>
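
<p>A rough sketch of that split might look like the following. The file names, the function <code>launch_square_all</code>, and the build command in the comment are all hypothetical, and error checking is omitted; the point is only that the kernel and its launch live in a file that nvcc compiles, behind a plain C function that XS code can call.</p>

<pre><code>/* kernels.cu (hypothetical file name). Compile with something like
 *     nvcc -c kernels.cu -o kernels.o
 * and declare the wrapper in a shared header, say kernels.h:
 *     void launch_square_all(float *dev_data, int n);
 */
#include &lt;cuda_runtime.h&gt;

/* Device code: squares each element in place. */
__global__ void square_all(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i &lt; n) data[i] = data[i] * data[i];
}

/* extern "C" gives the wrapper an unmangled name, so code built by
 * the regular C compiler (such as an XS file) can link against it.
 * dev_data must already point to memory on the video card. */
extern "C" void launch_square_all(float *dev_data, int n)
{
    int threads = 256;
    int blocks = (n + threads - 1) / threads;  /* round up */
    square_all&lt;&lt;&lt;blocks, threads&gt;&gt;&gt;(dev_data, n);
    cudaDeviceSynchronize();  /* wait for the kernel to finish */
}
</code></pre>

<p>The XS file would then include the shared header and call <code>launch_square_all</code> like any other C function, with the nvcc-built object file and the CUDA runtime library added to the linker arguments.</p>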

<p>I could try to dig around in the Perl source to figure out which C macro is getting tripped, but that will be a lot of work. (The error is reportedly from somewhere in the regex engine.) I could try to write wrapper modules that would minimize the extra effort for consuming modules. I could try to introduce a new module that takes a string containing your source code and returns a ready-to-call sub. All of these are hard work, and none of them is as easy or as elegant as the original approach. Does anybody have a better idea?</p>]]>
    </content>
</entry>

<entry>
    <title><![CDATA[I <3 play-perl]]></title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2013/02/i-3-play-perl.html" />
    <id>tag:blogs.perl.org,2013:/users/david_mertens//664.4297</id>

    <published>2013-02-09T15:06:41Z</published>
    <updated>2013-02-09T19:57:07Z</updated>

    <summary>play-perl was only just announced, but I&apos;ve already fallen in love with it. There seems to be some confusion about how it works, so I thought I would lend my interpretation. Note that I did not write it nor am...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="ofun" label="-Ofun" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="playperl" label="play-perl" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p><a href="http://play-perl.org">play-perl</a> was only just announced, but I've already fallen in love with it. There seems to be some confusion about how it works, so I thought I would lend my interpretation. Note that I did not write it nor am I affiliated with it, but I think it's awesome and want to get people using it!</p>

<p>If you're like me, you have a lot of ideas floating around in your head for open source projects. Mine tend to be oriented towards computational science, but it could be anything. And, if you're like me, a big part of your open source experience centers on making others happy by helping them solve their problems. The question naturally arises: among all your random ideas, what would be the best thing to work on? What will make the most people happy if you complete it? Should you write a blog entry explaining a feature, or add a new feature?</p>

<p>Enter <a href="http://play-perl.org">play-perl.org</a>. There are two basic things you do on this site once you've registered. First, dump all your ideas into open quests. When you do this, you are asking others on play-perl what they would like to see done. Second, read through others' ideas and "like" stuff that you would like to see done.</p>

<p>And that's it! It's really quite simple.</p>

<p>The service is brand new and it will eventually become too big for its present form. Hopefully new features will arise, such as being able to add tags to quests so that people can filter others' quests on tags. Maybe I'll even dig around play-perl's source and try to add that functionality. But it's not a problem for now, as it's so new.</p>

<p>-Ofun</p>]]>
        
    </content>
</entry>

<entry>
    <title>$3M says Perl5 needs a new major version number</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2013/02/3m-says-perl5-needs-a-new-major-version-number.html" />
    <id>tag:blogs.perl.org,2013:/users/david_mertens//664.4268</id>

    <published>2013-02-07T16:29:42Z</published>
    <updated>2013-02-07T16:57:49Z</updated>

    <summary>Yesterday, Ovid started his discussion about moving the major version of Perl 5 to Perl 7. You know what else happened that day? Continuum Analytics won $3M from DARPA to undertake a huge renovation to NumPy. Three. Million. Dollars. Not...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="perl" label="perl" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>Yesterday, Ovid started his discussion about moving the major version of Perl 5 to Perl 7. You know what else happened that day? <a href="http://m.itworld.com/big-data/340570/python-gets-big-data-boost-darpa">Continuum Analytics won $3M from DARPA to undertake a huge renovation to NumPy</a>. Three. Million. Dollars. Not for Python. For an <strong>extension</strong> for Python. Continuum plans to add all kinds of capabilities, but bear in mind that PDL already possesses at least one of those, namely built-in support for missing data. From the technical standpoint, we were already ahead, and somebody else won $3M.</p>

<p>NumPy is very well run, and very well organized, and there are many more libraries available for NumPy than there are for PDL. Those dollars are very likely to be well spent. I will not argue that this was a bad decision by DARPA.</p>

<p>But think about it. Do you remember just how amazing it was when Craigslist gave the Perl Foundation $100,000? Or when Booking.com gave the Perl Foundation 100,000 Euros? Now, multiply that by 30.</p>

<p>chromatic replied to Ovid's post by saying that <a href="http://www.modernperlbooks.com/mt/2013/02/project-facepalm.html">we should write kick-ass software</a>. Well, I have. I wrote a plotting library that I think is really good. It addresses one of the most important shortcomings of PDL, namely the lack of a default plotting library. It still needs a ton of work, but it is fast and functional, today, and it has a strong foundation. The problem? Nobody uses the library besides me. Why? Because they already went through the labor of setting up some other plotting library, like PGPLOT, PLplot, or Gnuplot. That took a lot of effort, and climbing the learning curve took time for them. That means that the only way to really get users is to get <strong>new</strong> users. Now, tell me: where are new users coming from?</p>

<p>I repeat, where are new users coming from?</p>

<p>Potential new users ask their friends how they get their work done. Their friends say Matlab, or R, or Python. Potential new users are probably more likely to try using Ruby than they are to try using Perl because "it's cool." Everybody thinks Perl has gone the way of Tcl. And why shouldn't they? <a href="http://wiki.tcl.tk/1721">Tcl's latest major version is <strong>newer than</strong> Perl's.</a></p>

<p>What would I like? I'd like to see Perl 5.18+Moo released as Perl 7. Make it big. Make a splash. People will be wowed by the new syntax---even better than Python's or Ruby's OO sugar---and will be happy when p5mop finally hits and delivers a major speed-up.</p>

<p>Please, stop saying that "if you build it, they will come." We need hype.</p>]]>
        
    </content>
</entry>

<entry>
    <title>Viewing your weather forecast without a browser</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2013/01/viewing-your-weather-forecast-without-a-browser.html" />
    <id>tag:blogs.perl.org,2013:/users/david_mertens//664.4221</id>

    <published>2013-01-21T16:44:23Z</published>
    <updated>2013-01-21T19:08:25Z</updated>

    <summary>PDL::Graphics::Prima is a Perl plotting library written using PDL and the Prima GUI toolkit. It is targeted at PDL users with the hope of one day becoming the standard plotting library for PDL. PDL::Graphics::Prima provides a complete plotting widget, but...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="mojolicious" label="Mojolicious" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="pdl" label="PDL" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="pdlgraphicsprima" label="PDL::Graphics::Prima" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p><a href="http://p3rl.org/PDL::Graphics::Prima">PDL::Graphics::Prima</a> is a Perl plotting library written using <a href="http://pdl.perl.org">PDL</a> and <a href="http://www.prima.eu.org">the Prima GUI toolkit</a>. It is targeted at PDL users with the hope of one day becoming the standard plotting library for PDL. PDL::Graphics::Prima provides a complete plotting widget, but one of the great aspects of the library is that it also provides a very simple interface for building one-off plots called <a href="http://p3rl.org/PDL::Graphics::Prima::Simple">PDL::Graphics::Prima::Simple</a>. In this post, I use Mojolicious to pull down some weather data and plot it using PDL::Graphics::Prima. For the initial pass, the initial plotting code is just 1 line. :-)</p>
]]>
        <![CDATA[<p>Check out the full script at <a href="https://gist.github.com/4587417">this gist</a>. If you want to run this on your own machine, you should install PDL::Graphics::Prima (which will install PDL, which will take about 10 minutes). Then install Mojolicious or revise the script to use your user agent of choice. This is a complete but rather lengthy script, so for now turn your attention to <a href="https://gist.github.com/4587417#file-forecast-view-simple-pl-L41">lines 41-42</a>, which read thus: </p>

<script src="https://gist.github.com/4587699.js"></script>

<p>Try running the script and give it your zip-code and see what it gives you. I particularly enjoy doing this for zip codes of people I know, and Beverly Hills, since I know that one off the top of my head, too.</p>

<p>Of course, it might be nice to add axis labels and such. There are two ways to do that. The first way is to get the window and plot object that are returned from line_plot in list context, modify the plot object, and execute the window:</p>

<script src="https://gist.github.com/4587806.js"></script>

<p>Unlike the line_plot command in void context, this function call does not display the plot (and block the script) as soon as it is called. Instead it returns the plot object and the window that will display that plot which you can run later. Or, you can accumulate a collection of such windows and call Prima->run to view them all at once. That will kick off the Prima event loop, which is unfortunate since you will need to interrupt it with CTRL-C or use some other strategy in order to break out of it. There are other, better ways to handle this, but that discussion will have to wait for later. In the meantime, it's best to stick with one window, or do your plotting in the PDL shell.</p>

<p>Another approach is to specify all of this in a single plot() command:</p>

<script src="https://gist.github.com/4587857.js"></script>

<p>The advantage of this approach is that it brings you much closer to the deeper PDL::Graphics::Prima widget interface, and thus to building full GUI data analysis applications. That will also have to wait for another post. :-)</p>

<p>Enjoy!</p>
]]>
    </content>
</entry>

<entry>
    <title>The Quantified Onion is not just another echo chamber</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2012/07/the-quantified-onion-is-not-just-another-echo-chamber.html" />
    <id>tag:blogs.perl.org,2012:/users/david_mertens//664.3591</id>

    <published>2012-07-21T03:33:03Z</published>
    <updated>2012-07-21T04:00:30Z</updated>

    <summary>In tandem with the creation of perl4science.github.com, gizmo_mathboy created a new google group called The Quantified Onion. Both of these web properties are meant to give greater visibility to Perl&apos;s role in science, and to spread the word to newcomers...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="bioperl" label="BioPerl" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="pdl" label="PDL" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="science" label="Science" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>In tandem with the creation of perl4science.github.com, gizmo_mathboy created a new google group called <a href="https://groups.google.com/forum/#!forum/the-quantified-onion">The Quantified Onion</a>. Both of these web properties are meant to give greater visibility to Perl's role in science, and to spread the word to newcomers that Perl is a great language for scientific computing and data analysis.</p>

<p>It doesn't take long to realize that The Quantified Onion could easily slip into becoming just another room in the Perl echo chamber, a forum-based extension to blogs.perl.org with a scientific bent. If our goal had been to create a space for Perl scientists to hang out, this might be acceptable. However, I am convinced that Perl needs to get itself out of the lonesome offices and into the halls of academia. We have to grab the interest of undergraduates and graduate students doing science. Heck, I'll even take a postdoc or professor if I can get their attention. How do we achieve this?</p>

<p>We put on workshops for scientists and engineers.</p>]]>
        <![CDATA[<p>I believe that we need to organize one-day events at institutions of higher learning. The workshops should be free, and they should be entitled something like "Introduction to Perl for Scientists and Engineers." If we can offer free pizza, that'd be even better. The first part would be a basic introduction to Perl; the second part would be a basic introduction to PDL; the third part would be a basic introduction to BioPerl. Or something like that. I'm very open to suggestions.</p>

<p>I am interested in organizing this sort of thing and would be happy to cover the PDL part. I can travel pretty far and wide in the midwest as long as it's on a Saturday. I'll even pay for half of a hotel room. The major missing links are (1) somebody to talk about BioPerl, (2) a good introductory curriculum, (3) enthusiasm and help, and (4) students and/or Perl Mongers at the various institutions willing to organize the room and distribute flyers for the event.</p>

<p>What do you think? Did I miss a major topic in science or engineering? Do you think the Perl Foundation might spring for the cost of pizza? Are you interested in helping out?</p>

<p>Although comments here (at blogs.perl.org) are welcome, I'd particularly appreciate if you could comment at <a href="https://groups.google.com/forum/#!forum/the-quantified-onion">The Quantified Onion</a>.</p>

<p>Thanks!</p>]]>
    </content>
</entry>

<entry>
    <title>Adapting PDL to a Big Data Landscape</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2012/06/adapting-pdl-to-a-big-data-landscape.html" />
    <id>tag:blogs.perl.org,2012:/users/david_mertens//664.3447</id>

    <published>2012-06-30T04:57:58Z</published>
    <updated>2012-06-30T05:29:21Z</updated>

    <summary>Note: although this article is directed at current PDL users, I would particularly appreciate the opinion of Perl users who are considering using PDL. Does my assessment seem accurate to you? I was just watching a few of the talks...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="bigdata" label="Big data" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="pdl" label="PDL" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>Note: although this article is directed at current PDL users, I would particularly appreciate the opinion of Perl users who are considering using PDL. Does my assessment seem accurate to you?</p>

<p>I was just watching a <a href="http://www.youtube.com/watch?v=6xUR83ndsuo&feature=plcp">few</a> of the <a href="http://www.youtube.com/watch?v=F476qz3eSI0&feature=plcp">talks</a> on YouTube from <a href="http://yapcna.org/">YAPC::NA</a> that I wanted to attend in Madison but could not because I was busy (writing my talks) or attending other talks. And it reminded me of the revelation that I had at YAPC. Although I am not looking for a job, I spoke with the <a href="http://www.booking.com/">sponsors</a> at their <a href="http://www.shutterstock.com/">job booths</a>, just to get a feel for what's out there. Is it possible for a Perl programmer to get a job doing real data crunching? The answer, happily, was "yes".</p>

<p>Almost immediately, I began to realize that there is a whole world of data analysis that is on the horizon for which PDL is well suited. PDL was written by and for scientists, but there's no reason it couldn't be applied to the analysis of <a href="http://en.wikipedia.org/wiki/Big_data">Big Data</a> (made possible in large part due to Chris Marshall's work on fully cross-platform memory mapping and 64-bit cleanups). Analyses of large data sets are already happening at many private corporations using languages such as SAS, SPSS, S, and R. Some of them might use Matlab; a rare few might use Python or Perl. Due to our limited marketing budget (ha!), the only corporations that will choose to use Perl and PDL are those which already use Perl in some significant capacity. We PDL folks have two major things to take away from this. First, we must engage with the wider Perl community, and second, we must make it easy for PDL outsiders to learn about and use the full breadth of PDL.</p>]]>
        <![CDATA[<p><strong>Engaging the wider community</strong>  I am happy to report that my <a href="http://www.youtube.com/watch?v=rf1yfZ2yUFo&feature=plcp">Introduction to PDL</a> was very well attended. In other words, the Perl people care about and are interested in PDL. We PDL people simply need to make ourselves better known and accessible to the other Perl people who live and work in our midst. I highly recommend attending your local <a href="http://www.pm.org/">Perl Mongers</a>. If there is no such group and you're the outgoing type, try searching on LinkedIn for other Perl folks in your neck of the woods and contact them if you can. A sysadmin who knows about PDL is one thing; a sysadmin that can put his coworker in touch with you, a PDL user that he sees once a month, is a much more powerful thing. If you're less outgoing, join the #pdl channel at irc.perl.org. If you don't have an irc client or don't know how to use irc, just use the <a href="http://www.mibbit.com/chat/?url=irc://irc.perl.org/pdl">in-browser mibbit client</a>.</p>

<p><strong>Making it easy for outsiders</strong>  With the release of the <a href="http://pdl.perl.org/content/pdl-book-toc.html">PDL::Book</a>, we finally have a single comprehensive resource for learning PDL. This is great. However, both the core docs and the Book can be improved. As the need to analyze Big Data grows, new users will come to PDL needing new functionality, and they will need to be able to learn to implement that functionality. Do you understand the intricacies of PDL threading? At the very least, do you feel like you could sit down with another programmer and hack at it until you got it right? Or, going further, have you used PDL::PP? There's a chapter in the book on that, too. (I should know, I wrote it. :-) If not, read selected chapters from the book and give your feedback. (Credits are listed in the back of the book, or just email the <a href="http://pdl.perl.org/?page=mailing-lists">mailing list</a>. New users, you have to sign-up to send mail.) The better we can make the book and the docs, the better we will be able to accommodate newcomers. The more people in our little community who understand these things, the more responsive we can be when newcomers arrive and ask questions, and the more of them will stay and start contributing, making PDL even better.</p>

<p>Finally, yes, I am talking to you, Jane PDL Hacker. I know that some of you, even some of the PDL Big Wigs, do not attend your local Perl Mongers. You should. Furthermore, only one person gave me thorough feedback on the PDL::PP chapter, and I must shamefully admit that I have yet to read most of the rest of the book. If you do not help, you should not be surprised if PDL slowly bitrots into oblivion. But if you tell others about PDL and give useful feedback on the docs, PDL will grow and improve and your efforts will pay off in the form of an even more awesome tool.</p>

<p>The tide of Big Data is coming. Do Your Part: help make PDL awesome, and help other Perlers discover how awesome it is, and maybe even make it better.</p>]]>
    </content>
</entry>

<entry>
    <title>Yet Another YAPC::NA Report</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2012/06/yet-another-yapcna-report.html" />
    <id>tag:blogs.perl.org,2012:/users/david_mertens//664.3433</id>

    <published>2012-06-27T11:14:36Z</published>
    <updated>2012-06-27T12:25:23Z</updated>

    <summary>Finally, I can sit down to write my report! This was my first YAPC, and it was fantastic! For many years my knowledge of the Perl community has been through the PDL mailing list, through the many Perl blogs, and...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="yapc" label="YAPC" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>Finally, I can sit down to write my report!</p>

<p>This was my first YAPC, and it was fantastic! For many years my knowledge of the Perl community has been through the PDL mailing list, through the many Perl blogs, and through occasional IRC. I attended a few Chambana.pm meetings, but they were social and didn't really get me fired up for Perl. My first experience with a collection of Perl programmers was joining Chicago.pm last fall, and my first Perl conference was the DC/Baltimore Perl Workshop this spring. But wow, 400+ Perl programmers in one place!</p>

<p>I gave two talks at the conference: an introduction to the Perl Data Language (PDL) and an introduction to my new plotting library called PDL::Graphics::Prima. Both were well attended and well received, and I have gotten a handful of follow-up email and irc discussions as a result of both talks. Building the PDL community was one reason I attended YAPC::NA and I get the impression that it's paying off.</p>]]>
        <![CDATA[<p>A few weeks (or months?) before YAPC, I began noticing the many blog entries by JT mentioning sponsors, and mentioning that they were hiring. This piqued my curiosity. I am a postdoc, I just got my contract renewed for another year, and I aim to remain in academia. In other words, I'm not looking for a job. Still, I was curious about the job market for a Perl programmer who specializes in Big Data, and I was pleasantly surprised to discover that such jobs exist. In the past, I had considered leaving academia so that I could program in Perl full-time (because I enjoy programming in Perl that much), but the prospect of leaving academia for a sysadmin job was not appealing. Data analysis jobs in industry exist, but many of them focus on SAS. It is helpful and sobering to know that I can get a real job doing data analysis with Perl instead of SAS.</p>

<p>In addition to talking with recruiters, I met lots of great Perl programmers. The most anticipated was meeting Maggie Xiong, a PDL programmer who wrote the impressive and much-needed PDL::Stats distribution. She is only the second person I have met face-to-face who can write code that utilizes PDL::PP. (I met Chris Marshall, PDL pumpking, in mid-April at the DCBPW, a surprise indeed since I didn't know he would be there!) Putting a face and a voice to a personality that I knew only through email and irc was a great experience.</p>

<p>To my surprise, I had the opportunity to speak with Larry Wall. I called an ad-hoc meeting of scientists "in the corner by the piano" at the end of the first day's lightning talks and Larry decided to join us. My goal had been to identify that handful of us who did science (as opposed to web development :-) so that we could find each other in the hallways through the rest of the week, and that mostly worked. Larry decided to join our group because he was interested in meeting Perlers interested in discussing <a href="http://perlcabal.org/syn/S09.html">Synopsis 9</a>, and that mostly worked, too. Larry and I chatted for a bit about how he wanted to incorporate PDL ideas into Perl6, and I explained my concerns about doing it "right". I have since had a handful of ideas that I've been meaning to write down and email off to Larry. Now that I've written this report, maybe I'll be able to get around to doing that.</p>

<p>I come away with new perspectives (which I hope to explore with more blogs) and a lot of good intentions. I hope that I can convert this energy into practical results, whether those be modules on CPAN or workshops with Chicago.pm. The first of those good intentions is finally fulfilled: writing my YAPC::NA report.</p>]]>
    </content>
</entry>

<entry>
    <title>CUDA::Minimal and Error Handling</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2011/07/cudaminimal-and-error-handling.html" />
    <id>tag:blogs.perl.org,2011:/users/david_mertens//664.1927</id>

    <published>2011-07-01T05:24:07Z</published>
    <updated>2011-07-01T05:46:40Z</updated>

    <summary>In the last few days I&apos;ve been introducing my CUDA bindings for Perl that I&apos;ve put on github called CUDA::Minimal. CUDA is a framework for writing and running massively parallel code on the highly parallel computing architecture that is your...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="cuda" label="CUDA" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>In the last few days I've <a href="http://blogs.perl.org/users/david_mertens/2011/06/perls-first-real-cuda-bindings-released.html">been</a> <a href="http://blogs.perl.org/users/david_mertens/2011/06/cuda-and-the-perl-data-language.html">introducing</a> my CUDA bindings for Perl that I've put on github called <a href="https://github.com/run4flat/perl-CUDA-Minimal">CUDA::Minimal</a>. <a href="http://en.wikipedia.org/wiki/CUDA">CUDA</a> is a framework for writing and running massively parallel code on the highly parallel computing architecture that is your video card (assuming your card is capable of CUDA in the first place). Today I am going to discuss error handling in CUDA.</p>

<p>Error handling is a boring topic, but it's important, so I'm going to motivate it a bit. Consider this statement from version 4.0 of the CUDA C Best Practices Guide (which you can find <a href="http://tegradeveloper.nvidia.com/nvidia-gpu-computing-documentation">here</a>):</p>

<blockquote>
  <p>Code samples throughout the guide omit error checking for conciseness.
Production code should, however, systematically check the error code returned
by each API call...</p>
</blockquote>

<p>In CUDA::Minimal, you don't have to sacrifice conciseness for error-checking...</p>
]]>
        <![CDATA[<h2>Example: Unable to Allocate Memory</h2>

<p>For example, suppose you try to allocate memory on your device and you run into trouble. This error is hard to make in Perl because it allows using underscores in numbers like 10_000, but let's assume that you accidentally allocate far more memory than you meant to allocate. Here's a fully working script that should produce the error (and only the error), at least on current hardware:</p>

<pre><code>use strict;
use warnings;
use CUDA::Minimal;

# Oops: 10e12 floats is roughly 40 terabytes of memory!
my $input_dev_ptr = Malloc( Sizeof f =&gt; 10e12);

print "I've escaped the error!\n";
</code></pre>

<p><br />
Although it may seem a little contrived, this will croak with an informative message:</p>

<blockquote>
  <p>Unable to allocate 4294967295 bytes on the device: out of memory at cuda-error.pl line 6</p>
</blockquote>

<p><code>Malloc</code> can croak for a handful of reasons, but this error comes to <code>Malloc</code> from CUDA itself: we're asking for too much memory. The system croaks with this message. Contrast that with the CUDA-C code, which would chug merrily along unless you checked the error condition.</p>

<p>The simple way to handle these sorts of errors is to use <code>eval</code> blocks and capture errors when you know how to respond.</p>
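
<p>Here is a minimal sketch of that pattern. So that it runs without a GPU, it uses a hypothetical <code>fake_malloc</code> that stands in for <code>Malloc</code> and croaks above a pretend 1 GiB limit. The <code>allocate_with_fallback</code> helper is also made up for illustration and is not part of CUDA::Minimal.</p>

<pre><code>use strict;
use warnings;
use Carp;

# Hypothetical stand-in for CUDA::Minimal's Malloc so that this sketch runs
# without a GPU: it croaks when asked for more than a pretend 1 GiB device.
sub fake_malloc {
    my ($nbytes) = @_;
    croak "Unable to allocate $nbytes bytes on the device: out of memory"
        if $nbytes &gt; 2**30;
    return 0xdeadbeef;    # pretend device pointer
}

# Try the allocation we want; if it croaks, report why and retry smaller.
sub allocate_with_fallback {
    my ($wanted, $fallback) = @_;
    my $ptr = eval { fake_malloc($wanted) };
    return ($ptr, $wanted) if defined $ptr;
    warn "Falling back after error: $@";
    return (fake_malloc($fallback), $fallback);
}

my ($ptr, $got) = allocate_with_fallback(4 * 10e12, 4 * 1000);
print "Allocated $got bytes\n";
</code></pre>

<p>With the real module you would put the call to <code>Malloc</code> inside the <code>eval</code> block in the same way.</p>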

<h2>Thread Launch Problems</h2>

<p>An interesting feature of CUDA is that kernel launches are non-blocking. When you launch a kernel, as demonstrated in the <a href="http://blogs.perl.org/users/david_mertens/2011/06/perls-first-real-cuda-">opening example</a>, control returns to the CPU immediately after the kernel starts running. This is handy because it allows you to do other things on the CPU while the calculations run on the GPU, such as logging.</p>

<p>However, non-blocking kernel launches cannot directly report run-time errors in your kernel, such as a segmentation fault. To make matters even more confusing, kernel launch failures <em>will</em> trip errors when you call other functions, such as memory allocations or transfers, because the system remains in an error state until you clear it. So if you get an "Unspecified launch failure," scan backward from that point in the code to find the offending kernel launch.</p>

<p>To be sure you've found the correct problematic kernel launch, place the following immediately after your kernel launch:</p>

<pre><code># croak comes from Carp, so say "use Carp;" earlier in the script
ThreadSynchronize;
croak("Found error") if ThereAreCudaErrors;
</code></pre>

<p><br /></p>

<h2>Error-related Functions</h2>

<p>At the moment, there are four error-related functions you should know about:</p>

<ol>
<li><code>ThereAreCudaErrors</code>: boolean function that returns true when
 an error exists and false otherwise.</li>
<li><code>PeekAtLastError</code>: Returns the string describing the last error, or 'no errors'</li>
<li><code>GetLastError</code>: Like PeekAtLastError, but also resets the error status.</li>
<li><code>DeviceReset</code>: Resets the device so that future kernel launches do
 not fail from a previous "Unspecified launch failure"</li>
</ol>
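
<p>The difference between peeking and getting is easier to see in a toy model. The sketch below does not call CUDA at all. The <code>toy_</code> functions are hypothetical stand-ins that only imitate the behavior described in the list above, with the error state kept in an ordinary Perl variable.</p>

<pre><code>use strict;
use warnings;

# Toy model only: these subs imitate the semantics described above and
# never touch the device. The real functions live in CUDA::Minimal.
my $last_error = 'no errors';

sub toy_set_error          { $last_error = shift }
sub toy_there_are_errors   { return $last_error ne 'no errors' }
sub toy_peek_at_last_error { return $last_error }
sub toy_get_last_error {
    my $error = $last_error;
    $last_error = 'no errors';    # "get" also resets the error status
    return $error;
}

toy_set_error('out of memory');
print "peek: ", toy_peek_at_last_error(), "\n";    # status is left alone
print "still in an error state\n" if toy_there_are_errors();
print "get:  ", toy_get_last_error(), "\n";        # status is reset
print "clean again\n" unless toy_there_are_errors();
</code></pre>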
]]>
    </content>
</entry>

<entry>
    <title>CUDA and the Perl Data Language</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2011/06/cuda-and-the-perl-data-language.html" />
    <id>tag:blogs.perl.org,2011:/users/david_mertens//664.1921</id>

    <published>2011-06-29T04:00:00Z</published>
    <updated>2011-07-28T02:39:06Z</updated>

    <summary>Yesterday I announced the release of my Perl-accessible bindings for CUDA. CUDA is marketed as a massively parallel, high-performance computing architecture. When you think about Perl and high-performance computing, I would hope that PDL, the Perl Data Language, comes to...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="cuda" label="CUDA" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="pdl" label="PDL" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p><a href="http://blogs.perl.org/users/david_mertens/2011/06/perls-first-real-cuda-bindings-released.html">Yesterday</a> I announced the release of my Perl-accessible bindings for <a href="http://en.wikipedia.org/wiki/CUDA">CUDA</a>. CUDA is marketed as a massively parallel, high-performance computing architecture. When you think about Perl and high-performance computing, I would <em>hope</em> that <a href="http://pdl.perl.org/">PDL</a>, the Perl Data Language, comes to mind. :-)</p>

<p>PDL is a CPAN distribution that gives Perl the ability to compactly store and speedily manipulate the large N-dimensional data arrays which are the bread and butter of scientific computing. Today I will discuss how <a href="https://github.com/run4flat/perl-CUDA-Minimal">CUDA::Minimal</a> and PDL talk with each other. (In case you're curious, tomorrow I discuss <a href="http://blogs.perl.org/users/david_mertens/2011/07/cudaminimal-and-error-handling.html">error handling</a> in CUDA::Minimal.)</p>
]]>
        <![CDATA[<h2>Terminology</h2>

<p>The common lingo among PDL folk is to call PDL objects by the name 'piddle'. We discuss the possibility of a name change perennially on the mailing list and I've never come up with anything better, until tonight. For this blog post, I'm going to refer to PDL objects as 'pdsets', short for Perl Data Sets.</p>

<p>However, if you ever visit the mailing lists, you should probably use the term 'piddle' if you need to make reference to PDL objects.</p>

<h2>Dropping-in</h2>

<p>PDL's automated vectorization is powerful but it will play no role in today's post. My interest in linking PDL and CUDA::Minimal focuses on transferring data contained in pdsets to and from the device. The cool part is that you don't have to change anything from the case of using packed scalars. Just use the pdset in place of the original scalar and everything will work just fine.</p>

<p>For example, these three sets of code end up with the same data on the device:</p>

<pre><code>my $packed_data = pack('f*', 0..24);
my $input_dev_ptr = MallocFrom($packed_data);

my $pdset = sequence(float, 25);
my $input_dev_ptr = MallocFrom($pdset);

my $input_dev_ptr = MallocFrom(sequence(25)-&gt;float);
</code></pre>

<p><br />
You can use the <code>Transfer</code> function that I discussed in my last post:</p>

<pre><code>my $pd_results = zeroes(float, $N_data_points);
Transfer($dev_ptr =&gt; $pd_results);
</code></pre>

<p><br />
In short, if you use a pdset in place of a packed scalar, it should just work.</p>

<h2>Object Methods</h2>

<p>CUDA::Minimal also installs three methods in the PDL namespace: <code>send_to</code>, <code>get_from</code>, and <code>nbytes</code>. You can use the first two of these methods directly if you prefer:</p>

<pre><code>my $pd_results = zeroes(float, $N_data_points);
$pd_results-&gt;get_from($dev_ptr);
</code></pre>

<p><br />
or, more compactly, using the chaining idiom common to PDL code:</p>

<pre><code>my $pd_results = zeroes(float, $N_data_points)-&gt;get_from($dev_ptr);
</code></pre>

<p><br /></p>

<h2>Working with Slices</h2>

<p>PDL provides a method for selecting subsets of your full pdset to directly manipulate. The manipulations flow back to the original pdset unless you intentionally sever the connection. Although it is not very efficient, CUDA::Minimal properly handles slices in this way.</p>

<h2>Extending to Other Classes</h2>

<p>So far I have focused on PDL. But when <code>Transfer</code> or any other CUDA::Minimal function encounters an object (a blessed reference) as one of its arguments where it was expecting a packed scalar or an integer with the device pointer, it attempts to use the object methods discussed in the previous section. CUDA::Minimal can work with any class that supplies methods <code>send_to</code>, <code>get_from</code>, and <code>nbytes</code>. The details are discussed in the documentation.</p>
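
<p>To make the idea concrete, here is a rough sketch of that kind of duck-typing check. It is my illustration of the approach, not CUDA::Minimal's actual code. <code>My::Buffer</code> and <code>looks_transferable</code> are hypothetical names, and the two transfer methods are left as stubs because their real calling conventions are described in the documentation rather than here.</p>

<pre><code>use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical class that wraps a packed scalar and offers the three
# method names mentioned above.
package My::Buffer;
sub new {
    my ($class, @values) = @_;
    return bless { data =&gt; pack('f*', @values) }, $class;
}
sub nbytes   { my $self = shift; return length $self-&gt;{data} }
sub send_to  { die "send_to is left unimplemented in this sketch\n" }
sub get_from { die "get_from is left unimplemented in this sketch\n" }

package main;

# True only for blessed references that supply all three methods.
sub looks_transferable {
    my ($arg) = @_;
    return 0 unless blessed($arg);
    for my $method (qw(send_to get_from nbytes)) {
        return 0 unless $arg-&gt;can($method);
    }
    return 1;
}

print looks_transferable(My::Buffer-&gt;new(1 .. 25)) ? "object\n" : "plain scalar\n";
print looks_transferable(pack('f*', 1 .. 25))      ? "object\n" : "plain scalar\n";
</code></pre>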
]]>
    </content>
</entry>

<entry>
    <title>Perl&apos;s first real CUDA bindings released</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2011/06/perls-first-real-cuda-bindings-released.html" />
    <id>tag:blogs.perl.org,2011:/users/david_mertens//664.1913</id>

    <published>2011-06-28T04:14:13Z</published>
    <updated>2011-07-01T05:34:46Z</updated>

    <summary>Since my first blog post back in December I&apos;ve written and made thorough use of a simple Perl interface for CUDA. Today, I&apos;ve posted it on github, and in this post I&apos;ll give a relatively simple example of how to...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    <category term="cuda" label="CUDA" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>Since my first blog post back in December I've written and made thorough use of a simple Perl interface for <a href="http://en.wikipedia.org/wiki/CUDA">CUDA</a>. Today, I've <a href="https://github.com/run4flat/perl-CUDA-Minimal">posted it on github</a>, and in this post I'll give a relatively simple example of how to use CUDA with Perl via Inline::C. (In case you're wondering, CUDA is a technology provided by nVidia that lets you compile and execute highly parallel code on your CUDA-capable video card.)</p>

<p>First, of course, you'll need to install <a href="http://p3rl.org/ExtUtils::nvcc">ExtUtils::nvcc</a>. At the moment this only works with Linux (maybe with Mac OSX, definitely not yet with Windows). It has only been confirmed with Ubuntu. See directions <a href="http://github.com/run4flat/perl_nvcc/wiki/Installing">on the ExtUtils::nvcc wiki</a>. (If you manage to install it on other systems, please let me know and edit the wiki or send me your notes!) Once you have that installed, installing CUDA::Minimal is just a simple CPAN install.</p>

<h1>First Script</h1>

<p>So, at this point I will assume you've installed CUDA::Minimal. What can you do with it? Here's a simple example:</p>
]]>
        <![CDATA[<pre><code>use strict;
use warnings;
use CUDA::Minimal;
use ExtUtils::nvcc;
use Inline C =&gt; DATA =&gt; ExtUtils::nvcc::Inline;

# Some CUDA kernels and their Perl wrappers are defined below. Let's
# create some data and invoke them!

my $N_values = 10;
my $host_data = pack('f*', 1..$N_values);

# Copy the data to the video card and get the pointer in the video card's
# memory. MallocFrom allocates enough memory and copies the contents:
my $input_dev_ptr = MallocFrom($host_data);

# Before processing the data on the video card, I need to allocate some
# memory on the card where the results will be stored. Malloc allocates
# enough memory but does not copy contents:
my $output_dev_ptr = Malloc($host_data);

# Run the kernel:
invoke_the_kernel($input_dev_ptr, $output_dev_ptr, $N_values);

# We would like to see the results, so allocate a new host array:
SetSize(my $results_array, length($host_data));
# and copy the results back:
Transfer($output_dev_ptr =&gt; $results_array);
print "$_\n" foreach (unpack 'f*', $results_array);

# Finally, free the device memory:
Free($input_dev_ptr, $output_dev_ptr);

__END__

__C__

// A simple kernel that triples the value of the input data and stores
// the result in the output array:
__global__ void triple(float * in_g, float * out_g) {
    out_g[threadIdx.x] = in_g[threadIdx.x] * 3;
}

// A  little wrapper for the kernel that Inline::C knows how to parse:
void invoke_the_kernel(SV * in_SV, SV * out_SV, int N_values) {
    // Unpack the device pointers:
    float * d_in = INT2PTR(float *, SvIV(in_SV));
    float * d_out = INT2PTR(float *, SvIV(out_SV));

    // invoke the kernel:
    triple &lt;&lt;&lt;1, N_values&gt;&gt;&gt;(d_in, d_out);
}
</code></pre>

<p><br />
That's not exactly hello world. Let's pull it apart.</p>

<h1>Boiler Plate</h1>

<p>Starting from the top, we see some fairly standard boiler-plate using strictures and warnings. Since this is an example for CUDA::Minimal, we'll need that too.</p>

<p>The last two lines of use statements should look a bit intriguing to you:</p>

<pre><code>use ExtUtils::nvcc;
use Inline C =&gt; DATA =&gt; ExtUtils::nvcc::Inline;
</code></pre>

<p><br />
ExtUtils::nvcc is part of the CUDA toolchain and it provides some simple functions for configuring the three main build tools: ExtUtils::MakeMaker, Module::Build, and as shown here Inline::C. (This sets the cc and ld flags, as explained <a href="http://p3rl.org/ExtUtils::nvcc">in the docs</a>.)</p>

<h1>Allocating Memory</h1>

<p>After creating a Perl string filled with a packed array of floating-point data, we come to the first lines of CUDA::Minimal:</p>

<pre><code>my $input_dev_ptr = MallocFrom($host_data);
</code></pre>

<p><br />
<code>MallocFrom</code> was imported from CUDA::Minimal by default. (Yeah, it imports functions by default. It's supposed to be easy to use. :-) <code>MallocFrom</code> is one of those handy functions that packs a lot of functionality compared with its CUDA C counterparts. It (1) determines the size of your host-side memory, (2) allocates the same amount of memory on the device, (3) copies the host-side data to the device, and (4) returns the pointer to the memory location on the device. All that with one function call!</p>
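
<p>Step (1) is nothing exotic, because a packed scalar is just a string of bytes and Perl already knows its length. The plain-Perl sketch below needs no GPU and shows the byte count such a function would see for <code>$host_data</code>. It only inspects the host-side string and does not call CUDA::Minimal. The size of a float is whatever your platform's native <code>f</code> pack format uses, which is 4 bytes on most systems.</p>

<pre><code>use strict;
use warnings;

my $N_values  = 10;
my $host_data = pack('f*', 1 .. $N_values);

# The packed scalar is a byte string, so length() gives its size in bytes.
my $bytes_per_float = length pack('f', 0);
printf "%d floats occupy %d bytes (%d bytes each)\n",
    $N_values, length($host_data), $bytes_per_float;
</code></pre>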

<p>The next step allocates even more memory on the device. This is the memory on the video card where the results will go:</p>

<pre><code>my $output_dev_ptr = Malloc($host_data);
</code></pre>

<p><br />
<code>Malloc</code> is very similar to <code>MallocFrom</code> except that it does not copy the contents of <code>$host_data</code> over to the device. It simply allocates the memory and returns the device pointer.</p>

<h1>Location and Terminology</h1>

<p>This is a good point to introduce some terminology. CUDA provides a way for running almost arbitrary code in parallel on your video card. Video cards do not have direct access to your CPU's RAM, and your CPU does not have direct access to your video card's RAM. (nVidia's CUDA Toolkit 4.0 makes this a small lie, but stick with me.) Therefore, it is common convention to refer to the video card as the device and the CPU and its associated RAM as the host. Device pointers are often prefixed with a <code>d_</code> and host pointers are commonly prefixed with a <code>h_</code> to help with bookkeeping.</p>

<p>Although we clarify which memory is which (host vs device), we use an entirely different name for functions run on the video card. They are called kernels. We <em>call functions</em> on the host CPU and we <em>launch kernels</em> on the device.</p>

<h1>Launching the Kernel</h1>

<p>CUDA::Minimal does not provide a means for launching kernels directly. (Perl bindings for the so-called CUDA Driver API, which allows you to do this and many other things outside the scope of CUDA::Minimal, are my next project.) However, Perl provides a means for calling C functions using either Inline::C or plain ol' XS. If said code is compiled using nvcc (using ExtUtils::nvcc to simplify configuration), you can invoke a kernel using the CUDA-C kernel invocation syntax. I'll discuss that in a little bit. The point is that from the standpoint of Perl we are simply calling a function which happens to be defined using XS code instead of Perl code:</p>

<pre><code>invoke_the_kernel($input_dev_ptr, $output_dev_ptr, $N_values);
</code></pre>

<p><br />
The kernel launch is a bit of magic of which Perl is blissfully unaware.</p>

<h1>Getting the Results and Cleaning Up</h1>

<p>Having run the kernel, I next bring back the results. I do this by first allocating some new memory on the CPU. Perl provides a handful of methods for setting the length of scalar variables, but I can never remember them so I created a <code>SetSize</code> function to handle it for me. I copy the results from the video card back to this host memory and print the results:</p>

<pre><code>SetSize(my $results_array, length($host_data));
Transfer($output_dev_ptr =&gt; $results_array);
print "$_\n" foreach (unpack 'f*', $results_array);
</code></pre>

<p><br />
You may have noticed that I use <code>Transfer</code> to copy data both to and from the device. The use of the fat comma (<code>=&gt;</code>) is highly recommended as it gives a very clear indication of the flow of data. Under the hood, <code>Transfer</code> examines the details of the scalars that you pass and determines if either or both arguments are device pointers, and takes the appropriate action. (If both are device pointers, however, you must specify the number of bytes to copy in a third, optional argument.) Finally, we free the device-side memory:</p>

<pre><code>Free($input_dev_ptr, $output_dev_ptr);
</code></pre>

<p>Your program will execute fine without freeing the memory and the memory will (as best I can tell) be reclaimed by the video card at the close of your program. However, if you have a long-running script, failure to free device memory may lead to allocation issues, so it is a good practice to free device-side memory when you're done with it.</p>

<h1>The Kernel Definition</h1>

<p>CUDA kernels are defined like normal functions in C with one special addition: the use of <code>__global__</code> before the return type:</p>

<pre><code>__global__ void triple(float * in_g, float * out_g) {
    out_g[threadIdx.x] = in_g[threadIdx.x] * 3;
}
</code></pre>

<p><br />
Furthermore, all kernels have access to the variables <code>threadIdx</code>, <code>blockIdx</code>, <code>blockDim</code>, and <code>gridDim</code>, though I will not explain those now. Note that Inline::C does not recognize declarations that begin with <code>__global__</code> as normal C function declarations, so it does not try to wrap kernels for Perl. That is very important for the use of Inline::C with CUDA. (It is likely a bug, but it sure is useful.)</p>
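
<p>If you want to check your expectations without a video card, the kernel's arithmetic is easy to imitate on the host. The plain-Perl sketch below is an emulation, not CUDA: an ordinary loop visits each index in turn, whereas on the device each index is handled by its own thread. The hypothetical <code>$thread_idx</code> plays the role of <code>threadIdx.x</code>.</p>

<pre><code>use strict;
use warnings;

my $N_values  = 10;
my $host_data = pack('f*', 1 .. $N_values);

# One loop iteration per CUDA thread: $thread_idx stands in for threadIdx.x
my @in = unpack 'f*', $host_data;
my @out;
for my $thread_idx (0 .. $N_values - 1) {
    $out[$thread_idx] = $in[$thread_idx] * 3;
}

# Pack and print the results the same way the first script does:
my $results_array = pack 'f*', @out;
print "$_\n" foreach unpack 'f*', $results_array;
</code></pre>

<p>This prints 3, 6, 9, and so on up to 30, one value per line.</p>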

<h1>The Kernel Wrapper</h1>

<p>As I said earlier, Perl does not (yet) have wrappers for the API that would allow for direct kernel invocation. However, Inline::C knows how to parse a standard function definition and expose that function to Perl for me. The role of the C function is then to unpack the arguments and invoke the kernel:</p>

<pre><code>void invoke_the_kernel(SV * in_SV, SV * out_SV, int N_values) {
    // Unpack the device pointers:
    float * d_in = INT2PTR(float *, SvIV(in_SV));
    float * d_out = INT2PTR(float *, SvIV(out_SV));

    // invoke the kernel:
    triple &lt;&lt;&lt;1, N_values&gt;&gt;&gt;(d_in, d_out);
}
</code></pre>

<p><br />
The first two lines of this function convert Perl's internal representation of the input scalars into native C float pointers. These pointers point to the location on the video card where the data resides and are passed along to the kernel so it knows from where it retrieves its input and stores its results.</p>

<p>The last and most interesting part of this whole chunk of code is the means by which we invoke the kernel: <em><code>func-name</code></em> <strong><code>&lt;&lt;&lt;</code></strong> <em><code>N-blocks, block-size</code></em> <strong><code>&gt;&gt;&gt; (</code></strong> <em><code>args</code></em> <strong><code>)</code></strong>. If you strip away the triple-brackets and everything between them, we have something that looks like a normal function call. The triple-brackets are an extension to ANSI C provided by nvcc and are one reason that CUDA code must be compiled with nvcc: when nvcc finds triple-brackets, it inserts code to initialize the video card to run the kernel with the associated block dimensions and grid dimensions.</p>

<h1>Summary</h1>

<p>Today I've given a simple Perl script that manages CUDA memory, compiles a CUDA kernel, and invokes the CUDA kernel with a very thin wrapper written in C. <a href="http://blogs.perl.org/users/david_mertens/2011/06/cuda-and-the-perl-data-language.html">Tomorrow</a> I'll discuss CUDA interoperability with PDL.</p>
]]>
    </content>
</entry>

<entry>
    <title>CUDA, Perl, and perl_nvcc</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/david_mertens/2010/12/cuda-perl-and-perl-nvcc.html" />
    <id>tag:blogs.perl.org,2010:/users/david_mertens//664.1287</id>

    <published>2010-12-30T13:57:34Z</published>
    <updated>2010-12-30T14:19:08Z</updated>

    <summary>Over the summer I had the privilege of attending a week-long workshop on CUDA hosted by the Virtual School of Computational Science and Engineering. It was free for students from the University of Illinois (and other partner institutions, I...</summary>
    <author>
        <name>David Mertens</name>
        <uri>https://github.com/run4flat/</uri>
    </author>
    
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/david_mertens/">
        <![CDATA[<p>Over the summer I had the privilege of attending a week-long workshop on CUDA hosted by the Virtual School of Computational Science and Engineering. It was free for students from the University of Illinois (and other partner institutions, I presume) and it was excellent. If you want to learn CUDA quickly and you want to learn it well, I highly recommend attending such a workshop.</p>

<p>Over the fall I started writing and using CUDA kernels in my research. This meant writing code in C. C is a great language, but it is not known for its <a href="http://use.perl.org/~ziggy/journal/26131">whipuptitude</a>. Almost immediately, I noticed that my main() function did little more than manage memory and coordinate kernel launches. This, I thought to myself, is exactly what scripting languages are for, and wished there was something out there to let me manage CUDA memory and invoke CUDA kernels from Perl.</p>

<p>This is how I started down the path of writing perl_nvcc.</p>]]>
        
    </content>
</entry>

</feed>
