<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>ugexe</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/ugexe/" />
    <link rel="self" type="application/atom+xml" href="http://blogs.perl.org/users/ugexe/atom.xml" />
    <id>tag:blogs.perl.org,2009-11-03:/users/ugexe//1622</id>
    <updated>2012-10-28T18:37:11Z</updated>
    <subtitle>A blog about the Perl programming language</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 4.38</generator>

<entry>
    <title>DBIx::Class::FilterColumn: making transformation easier</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/ugexe/2012/10/dbixclassfiltercolumn-making-transformation-easier.html" />
    <id>tag:blogs.perl.org,2012:/users/ugexe//1622.3998</id>

    <published>2012-10-28T18:07:53Z</published>
    <updated>2012-10-28T18:37:11Z</updated>

    <summary>Over a year ago I was tasked with creating a data warehouse for sports data. Having known absolutely nothing about data warehousing/ETL, my first sport ended up quite the mess; scrapers would extract and transform at the same time then...</summary>
    <author>
        <name>ugexe</name>
        
    </author>
    
    <category term="dbixclasswarehousefilters" label="dbix::class warehouse filters" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/ugexe/">
        <![CDATA[<p>Over a year ago I was tasked with creating a data warehouse for sports data. Having known absolutely nothing about data warehousing/ETL, my first sport ended up quite the mess; scrapers would extract and transform at the same time, then stuff the result into a database where it most likely needed additional transformations. At the time, additional transformations meant writing a script to iterate over every row and rewrite a column with whatever regex I had constructed. Sometime later, after I'd generated a report, I'd find something wrong, often missing data due to a bad transform regex, which meant re-scraping websites (and often purchasing another membership). </p>]]>
        <![CDATA[<p>Naturally I started saving the pages to be scraped, and then just scraped the files themselves. But I was still left loading the data into the db with DBIx::Class (by choice) and then running some sort of transformation on every column. My scripts folder got messy quickly, filled with various 20-30 line transformation scripts to modify a column or two's data. That is all fine and dandy, except as the data grew the need for the transformation to just 'happen' automatically became more apparent.</p>

<p>Welcome to DBIx::Class::FilterColumn. For this specific example, I was getting spreads where certain letter strings and non-alphanumeric characters needed to be represented numerically. To clarify, I wanted a decimal number (negative or positive), but would often get spreads such as:</p><p>pk # means 0</p><p>-5-05 # means -5</p><p>ev-05 # means 0</p><p>½ # means .5</p><p>Not only does DBIx::Class::FilterColumn handle this for us, but if we want to spit the data back out it's easy to swap the substituted characters back in (we still lose anything we flat out removed, but for the scope of this post we'll stop here).</p><p><br /></p>

<p>sub odds_to_storage   {&nbsp;</p><p>&nbsp; &nbsp; return undef unless defined $_[1] and length $_[1]; # keep a real 0, store empty input as NULL<br />&nbsp; &nbsp; $_[1]=~s~\275~.5~; # convert ½ to .5<br />&nbsp; &nbsp; $_[1]=~s~-\d\d$|ev$|u\d+$|\+\d+$~~; # strip off vig<br />&nbsp; &nbsp; ($_[1] eq 'pk' or $_[1] eq 'ev') ? 0 : $_[1]; # pk or ev -&gt; 0</p><p>}<br /></p><p>
sub odds_from_storage {&nbsp;</p><p>&nbsp; &nbsp; return undef unless defined $_[1];<br />&nbsp; &nbsp; $_[1]=~s~\.0+$~~; # strip off needless trailing .0<br />&nbsp; &nbsp; $_[1]=~s~\.5$~\275~; # convert .5 to ½<br />&nbsp; &nbsp; ($_[1] eq '0') ? 'pk' : $_[1]; # 0 -&gt; pk<br />}</p><p>
foreach my $col (__PACKAGE__-&gt;columns) {<br /></p><p>
    &nbsp; &nbsp; __PACKAGE__-&gt;filter_column(<br />&nbsp; &nbsp; &nbsp; &nbsp; $col =&gt; {<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; filter_to_storage =&gt; 'odds_to_storage',<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; filter_from_storage =&gt; 'odds_from_storage',<br />&nbsp; &nbsp; &nbsp; &nbsp; })&nbsp;if __PACKAGE__-&gt;column_info($col)-&gt;{data_type} eq 'decimal';<br />
}</p>]]>
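        <![CDATA[<p>The two filters can also be tried out as plain functions, away from DBIx::Class. What follows is only a minimal sketch: the names spread_to_storage and spread_from_storage are hypothetical, it assumes the ½ arrives as the single Latin-1 byte \275 as in the code above, and the rule that a vig is only stripped when something precedes it is my own assumption rather than part of the original filters.</p>
<pre>
```perl
use strict;
use warnings;

# Hypothetical plain-function versions of the two filters. The real
# FilterColumn callbacks receive the row object first and the value second.
sub spread_to_storage {
    my ($raw) = @_;
    return undef unless defined $raw and length $raw;
    my $v = $raw;                                  # work on a copy of the input
    $v =~ s/\275/.5/;                              # Latin-1 one-half byte becomes .5
    $v =~ s/^(.+?)(?:-\d\d|ev|u\d+|\+\d+)$/$1/;    # drop a trailing vig, never the whole value
    return 0 if $v eq 'pk' or $v eq 'ev';          # pick'em and even both mean 0
    return $v;
}

sub spread_from_storage {
    my ($stored) = @_;
    return undef unless defined $stored;
    my $v = $stored;
    $v =~ s/\.0+$//;                               # 5.0 reads back as 5
    $v =~ s/\.5$/\275/;                            # a trailing .5 reads back as the one-half byte
    return $v eq '0' ? 'pk' : $v;                  # 0 reads back as pk
}

# Run the sample spreads from the post through the to-storage direction.
for my $raw ('pk', '-5-05', 'ev-05', "\275") {
    printf "%-6s => %s\n", $raw, spread_to_storage($raw);
}
```
</pre>
<p>With the sample spreads from above this gives 0, -5, 0 and .5. Requiring at least one character before the vig means a bare spread such as -10 is left alone instead of being stripped to nothing.</p>]]>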
    </content>
</entry>

<entry>
    <title>Assigning a user defined function at runtime</title>
    <link rel="alternate" type="text/html" href="http://blogs.perl.org/users/ugexe/2012/10/assigning-a-user-defined-function-at-runtime.html" />
    <id>tag:blogs.perl.org,2012:/users/ugexe//1622.3996</id>

    <published>2012-10-28T02:00:28Z</published>
    <updated>2012-10-28T18:01:58Z</updated>

    <summary>I recently wrote my first XS module, and found myself wanting to dynamically load it in the parent module if possible. The next problem was what if the user wants to change back to the pure Perl version? And then,...</summary>
    <author>
        <name>ugexe</name>
        
    </author>
    
    
    <content type="html" xml:lang="en" xml:base="http://blogs.perl.org/users/ugexe/">
        <![CDATA[<p>I recently wrote my first XS module, and found myself wanting to dynamically load it in the parent module if possible. The next problem was: what if the user wants to change back to the pure Perl version? And then, what if the user wants to use Other::Module's similar_function?</p>
]]>
        <![CDATA[<div><span style="font-family: monospace; ">sub _set_backend {</span></div>
<div><font face="monospace">&nbsp; my $be = shift;</font></div>
<div><font face="monospace">&nbsp; my $module = $be;</font></div>
<div><font face="monospace">&nbsp; $module =~ s/^(.*)::.*?$/$1/;</font></div>
<div><font face="monospace"><br /></font></div>
<div><font face="monospace">&nbsp; # Does the module exist?</font></div>
<div><font face="monospace">&nbsp; eval "require $module";</font></div>
<div><font face="monospace">&nbsp; unless($@) {</font></div>
<div><font face="monospace">&nbsp; &nbsp; # Does the module have such a function?</font></div>
<div><font face="monospace">&nbsp; &nbsp; # (defined never dies by itself, so die explicitly to set $@)</font></div>
<div><font face="monospace">&nbsp; &nbsp; eval "die unless defined &amp;$be";</font></div>
<div><font face="monospace">&nbsp; &nbsp; unless($@) {</font></div>
<div><font face="monospace">&nbsp; &nbsp; &nbsp; # Does it return a number if we give it 2 strings?</font></div>
<div><font face="monospace">&nbsp; &nbsp; &nbsp; eval "die unless(&amp;$be('four','fuor') =~ /[-+]?[0-9]*\\.?[0-9]+/)";</font></div>
<div><font face="monospace">&nbsp; &nbsp; &nbsp; unless($@) {</font></div>
<div><font face="monospace">&nbsp; &nbsp; &nbsp; &nbsp; # We welcome our new edistance overlord</font></div>
<div><font face="monospace">&nbsp; &nbsp; &nbsp; &nbsp; *edistance = \&amp;$be;</font></div>
<div><font face="monospace">&nbsp; &nbsp; &nbsp; }</font></div>
<div><font face="monospace">&nbsp; &nbsp; }</font></div>
<div><font face="monospace">&nbsp; }</font></div>
<div><font face="monospace">}</font></div>
<div><font face="monospace"><br /></font></div>
<div><font face="monospace"><br /></font></div>
<div><font face="monospace">Of course we really only need the last eval, but this way we can return an appropriate error code if we want.</font></div>]]>
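        <![CDATA[<p>The same three checks can be done without string eval by asking the package for a code reference with can(). This is just one possible sketch, not the module's actual code: My::Distance, My::Distance::PP, set_backend and $error are hypothetical names, and the placeholder metric only exists so the example runs without any XS or CPAN dependency.</p>
<pre>
```perl
use strict;
use warnings;

# Hypothetical stand-in for a pure-Perl backend; not a real CPAN module.
package My::Distance::PP;
sub distance {
    my ($s, $t) = @_;
    return abs(length($s) - length($t));    # placeholder metric, enough for the demo
}

package My::Distance;
our $error = '';

# Point edistance() at "Some::Module::function" if it loads and behaves.
# Returns 1 on success, or 0 after setting $My::Distance::error.
sub set_backend {
    my ($name) = @_;
    my ($module, $func) = $name =~ /^(.+)::([^:]+)$/
        or do { $error = "not a qualified function name: $name"; return 0 };

    # Only try to load from disk if the package does not already provide it.
    unless ($module->can($func)) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        eval { require $file; 1 }
            or do { $error = "cannot load $module"; return 0 };
    }
    my $code = $module->can($func)
        or do { $error = "$module has no function $func"; return 0 };

    # Smoke test: two strings in, something that looks like a number out.
    my $result = eval { $code->('four', 'fuor') };
    unless (defined $result and $result =~ /^[-+]?[0-9]*\.?[0-9]+$/) {
        $error = "$name did not return a number";
        return 0;
    }

    no warnings qw(redefine once);           # swapping backends repeatedly is expected
    *edistance = $code;                      # install as My::Distance::edistance
    return 1;
}

package main;

My::Distance::set_backend('My::Distance::PP::distance')
    or die $My::Distance::error;
print My::Distance::edistance('four', 'fuor'), "\n";    # prints 0 with the placeholder metric
```
</pre>
<p>Because each failed check sets its own message, the caller gets the "appropriate error code" mentioned above, and a failed switch leaves the previously installed backend in place.</p>]]>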
    </content>
</entry>

</feed>
