HTML Content Extraction / Instapaper
I recently found the old Instapaper extraction rules to rewrite HTML content in a way that is easier on the eyes for consumption. This find has resulted in me writing HTML::ExtractContent::FTR and HTML::ExtractContent::Pluggable to get a nice/concise way to scrape HTML from sites for consumption via RSS or mail.
MVC with Dancer2 and DBIC: Form Validation
A few days ago I pushed to GitHub a sample web application written in the MVC style with Dancer2 and DBIx::Class. In this very first post about it, I'd like to highlight how a route block that processes and validates form data can be made short and neat with the help of HTML::FormHandler.
Consider this HTML form from the application which creates a new user:
Perl 6 .rotor: The King of List Manipulation
Rotor. The word makes a mechanic think about brakes, an electrical engineer about motors, and a fan of the Red Letter Media YouTube channel about poorly executed films. But to a Perl 6 programmer, .rotor is a powerful tool for list operations.
Break up into chunks
At its simplest, .rotor takes an integer and breaks up a list into sublists with that many elements:
say <a b c d e f g h>.rotor: 3
>>>OUTPUT: ((a b c) (d e f))
We have a list of 8 elements, we called .rotor on it with argument 3, and we received two Lists with 3 elements each. The last two elements of the original list were not included, because they don't make up a complete, 3-element list. That can be rectified, however, using the :partial named argument set to True:
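For comparison, the same chunking behavior can be approximated in Perl 5. This is only a rough sketch: the helper name rotor_chunks and its argument order are made up for illustration and are not part of any library, and it covers only the simple fixed-size case described above.

```perl
use strict;
use warnings;

# Hypothetical helper: split @items into array refs of $size elements each.
# A short trailing chunk is kept only when $partial is true.
sub rotor_chunks {
    my ($size, $partial, @items) = @_;
    die "chunk size must be a positive integer\n"
        unless $size =~ /^[1-9]\d*$/;

    my @chunks;
    while (@items >= $size) {
        push @chunks, [ splice @items, 0, $size ];
    }
    push @chunks, [@items] if $partial && @items;
    return @chunks;
}

my @letters = 'a' .. 'h';
my @full    = rotor_chunks(3, 0, @letters);    # incomplete tail is dropped
my @partial = rotor_chunks(3, 1, @letters);    # incomplete tail is kept

print join(' ', map { '(' . join(' ', @$_) . ')' } @partial), "\n";
```

With the partial flag off, the leftover elements are simply discarded, which mirrors the default .rotor behavior described above.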
Social Media Meta Tags
Social media meta tags are HTML tags that allow you to make the most out of the content you share from a URL. You can determine what information is displayed from a post in Twitter, Facebook, LinkedIn, Pinterest and beyond. It gives developers control over the experience their content produces, as it shows up on these social networks.
UTF-16 and Windows CRLF, oh my
I recently had to do some quick search/replace on a bundle of Windows XML files. They are all encoded as UTF-16LE, with the Windows \r\n line endings encoded as 0D 00 0A 00.
Perl can handle UTF-16LE just fine, and it handles CRLF endings on Windows out of the box, but the problem is that the default CRLF translation happens too close to the filehandle, on the wrong side of the Unicode translation. The fix is to use the PerlIO layers :raw:encoding(UTF-16LE):crlf. The ":raw" prevents the default CRLF translation from happening at the byte level, the ":encoding(UTF-16LE)" layer translates between characters and the encoded bytes, and the final ":crlf" handles the line endings on the decoded characters rather than on the raw bytes.
Knowing that is half the battle. The other half is applying those layers. This was a one-time, quick-and-dirty command-line edit, along these lines:
perl -pi.bak -e "s/old-dir/new-dir/gi" file1.xml file2.xml file3.xml
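One way to apply those layers is through the mode argument of open(). The following is a minimal sketch under that assumption, not necessarily how the author did it; the helper name replace_in_utf16_file is hypothetical, and the whole file is read into memory, which is fine for small files only.

```perl
use strict;
use warnings;

# Layer stack from the paragraph above, listed from the file side outward:
# :raw removes any default byte-level CRLF handling, :encoding(UTF-16LE)
# converts between bytes and characters, and :crlf maps "\r\n" <-> "\n"
# on the character side.
my $layers = ':raw:encoding(UTF-16LE):crlf';

# Hypothetical helper: case-insensitive literal search/replace in one file.
sub replace_in_utf16_file {
    my ($path, $from, $to) = @_;

    open my $in, "<$layers", $path or die "Cannot read $path: $!";
    my @lines = <$in>;
    close $in;

    s/\Q$from\E/$to/gi for @lines;

    open my $out, ">$layers", $path or die "Cannot write $path: $!";
    print {$out} @lines;
    close $out or die "Cannot close $path: $!";
    return;
}
```

A call such as replace_in_utf16_file('file1.xml', 'old-dir', 'new-dir') would then play the role of the substitution in the one-liner, minus the .bak backup.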
curl + swat VS selenium
No, this is not about a holy war! I do respect other tools (-: really.
But while rambling around Stack Overflow I found quite an interesting question about web test automation. The author started out using curl for a fairly simple test automation task and then switched to Selenium WebDriver. The reason was quite obvious: curl has a request-oriented design, which makes it hard to use for complicated, sequential requests in a whole test story.
But curl is still too cool to get rid of, and you don't have to ... if you use swat PLUS curl.
So here is my answer ...
100+ Modules for Adoption! (Bit Rot Thursday)
EDIT: Just a note for PAUSE admins, as some emailed me, any module listed on this post can be given away to anyone who wishes to take it, without any need to ask me first. I do not wish to retain a co-maint either, so please just go ahead and transfer the ownership :) Thanks!
Today's Thursday, and if you regularly read blogs.perl.org, you know today is the first day of my plan to combat bit rot.
Happy Bit Rot Thursday, everyone!
The first step I'm undertaking is reducing the number of projects under my wing by means of deleting them entirely or putting them up for adoption. In total, there are about 107 modules I made adoptable, although some of them are a bundle deal.
Adoption
Mock Testing Web Services with Mojo
Occasionally, the need to write a web service client comes about. For example, when the decision gets made to move away from a piece of software that you run in-house to a suite of hosted apps.
The hosted apps offer RESTful APIs for communication that you will need to use to transfer your data. Let's pretend that there isn't yet a Perl client implementation to fit our needs. So, the first thing that needs to be done is to write a client for these web services (using Mojolicious) to handle the few API methods you'll need.
The client
You end up with an overly simplified client library that might look like this:
The Fuse Operator - A Suggested Language Extension
Perl 5 has become pretty stable, but there is always room for small improvements. I would like to discuss yet another "missing" operator. Its purpose is to make expressions handle some edge cases more gracefully. It could render some other extensions that have been suggested before unnecessary.
testing JSON/XML applications using swat
Hi!
I have just released swat version 0.1.80, which supports testing JSON/XML applications with the help of so-called response processors.
Regards
-- Alexey
Bit Rot Thursday
Part 1: There is a Problem
I don't think I'd have to look for long for someone who'd agree that writing new code is much more fun than fixing bugs in old code. A cool new idea gets written up, while older code is still lacking tests. A new module gets shipped, while there's still that API improvement proposal from 6 months ago in the other. And while you're drafting a design document for the Next Awesome Thing, the rest of your code is being slowly consumed by bit rot.
Having written 250–300 Perl 5 modules and now 32 Perl 6 modules and other ideas, I'm more than aware of what it feels like to be leaving a decaying pile of code in your wake. The problems I notice are these:
- Unfixed bugs
- Lack of comprehensive tests
- Lack of documentation
- Bad documentation (too wordy; incorrect; partial)
- Unimplemented new features, even if the proposal for them was approved
- Partial implementation (an FTP client that can only download, for example)
Introducing Scheme in Perl 6
Introducing Scheme in Perl 6: https://github.com/drforr/perl6-Inline-Guile
This is very much in its early days, and the interface is likely to change as I find the method(s) in the Guile library that I need. Specifically once I can figure out how to portably crack into a SCM return value the need for separate _i and _s functions should go away. Perl 6 is perfectly capable of making the distinction, but it's a segfault waiting to happen from C should I get the return values wrong.
Also, I need to cleanly dispose of the returned string; as it is, it'll leak memory.
Perl 5 Porters Mailing List Summary: January 11th-24th
Hey everyone,
Following is the p5p (Perl 5 Porters) mailing list summary for the past two weeks. Enjoy!
Perl 6.c (Christmas) Rakudo Star coming soon
Are you waiting for a Rakudo * Christmas Release? Or an MSI installer for Windows? Here's the story so far:
A Date with CPAN, Part 6: Time Won't Give Me Time
[This is a post in my latest long-ass series. You may want to begin at the beginning. I do not promise that the next post in the series will be next week. Just that I will eventually finish it, someday. Unless I get hit by a bus.
IMPORTANT NOTE! When I provide you links to code on GitHub, I’m giving you links to particular commits. This allows me to show you the code as it was at the time the blog post was written and ensures that the code references will make sense in the context of this post. Just be aware that the latest version of the code may be very different.]
Last time I added Time::ParseDate support to our date class, which made it fairly usable, if still incomplete. This time I decided to concentrate on getting a first cut at our datetime class.
In many ways, the datetime class is simpler than the date class, because it doesn’t need to do anything fancy like truncate to midnight or try to ignore times and timezones when parsing. Of course, datetimes do have to consider timezones, but I decided to defer that thorny issue until next time.
The problem with Exporters (Meet Importer)
The problem with Exporters
With Exporter, and most exporter tools, we have failed to separate concerns. Exporting fundamentally involves 2 parties: exporter and importer. Historically, however, we have only ever concerned ourselves with the exporter. The "standard" way of dealing with this in Perl has been to have a module that provides exports use its import() method to inject symbols into the importer's namespace.
What if we did this with other similar concepts? What if instead of:
use base 'foo';
we had:
package My::Base;
# Gives us an import() method that sets @{caller::ISA} = __PACKAGE__
use base;
...
package My::Subclass;
use My::Base; # Automatically sets @ISA
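To make the exporter-centric pattern concrete, here is a minimal sketch of a module whose own import() method injects a symbol into the importing package. The package names My::Greeter and My::Consumer are hypothetical, and the code only illustrates the general mechanism; it is not the implementation of Exporter or of Importer.

```perl
use strict;
use warnings;

package My::Greeter;    # hypothetical exporting module

our @EXPORT_OK = qw(greet);

sub greet { return "Hello, $_[0]!" }

# The exporter decides how symbols reach the importer: import() looks up
# the calling package and aliases the requested subs into its namespace.
sub import {
    my ($class, @names) = @_;
    my $caller = caller;    # the package doing the importing
    for my $name (@names) {
        die "$class does not export $name\n"
            unless grep { $_ eq $name } @EXPORT_OK;
        no strict 'refs';
        *{"${caller}::$name"} = \&{"${class}::$name"};
    }
    return;
}

package My::Consumer;    # hypothetical importing package

# "use My::Greeter qw(greet)" would make this call at compile time;
# both packages live in one file here, so it is made explicitly at run time.
My::Greeter->import('greet');

sub welcome { return greet('world') }

package main;

print My::Consumer::welcome(), "\n";
```

Note that My::Consumer has no say in how the symbol arrives: all of the logic lives in the exporter, which is the imbalance the post describes.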
Pure-Perl XML
In the past I sometimes used XML::Tiny and I found it perfect for the job. Admittedly, I only had to deal with small, well-controlled XML, so I knew I could do without a full-fledged XML parser.
A pretty stupid idea...
If you're a Perl developer in the UK, I will literally send you free money by email: http://eepurl.com/bNSF9P
A Naïve SQL Shell
For one client, I was told that our devs didn't have client access to a database with a problem, but they could connect via DBI. Thus, I whipped up the following to help them out.
It has command line history and mostly handles multi-line queries. It's not overly robust, but it's the sort of handy code you might just need in a pinch.
About blogs.perl.org
blogs.perl.org is a common blogging platform for the Perl community. Written in Perl with a graphic design donated by Six Apart, Ltd.