Web Scraping with Zydeco

By Toby Inkster on September 11, 2020 11:52 AM under DateTime, HTML, Moo, Zydeco, roles

So I like to keep local copies of my blogs.perl.org blog posts as Atom entries, but noticed yesterday that I had a few gaps in my collection. The Atom feeds offered by blogs.perl.org only have the most recent articles though, so I decided to write a quick script to scrape the posts. Luckily, I managed to get a table containing the URLs for each post I needed, so I didn't need to bother with following links to find the pages; I just needed to grab the content from them.

I thought some people might find the code interesting, especially for its use of lazy attributes. This is one of those "it only needs to be used once, so making the code maintainable isn't important" kinds of projects, so do bear that in mind. I've cleaned up the whitespace and added comments for this blog post, but other than that, it's just a quickly hacked-together script.

Tagged as: OOP

4 Comments

David Hodgkinson | September 14, 2020 11:03 AM

Why the :: in front of class names? Please point me at the relevant docs.

Sébastien Feugère replied to comment from David Hodgkinson | September 14, 2020 6:34 PM

Zydeco prefixes your class names with the package name. A leading :: on a class name avoids this.

This is explained with an example in this documentation.

Sébastien Feugère | September 14, 2020 6:59 PM

Thanks for sharing this: it helps to have a real-life example that goes beyond the Foo::Bar app.

Toby Inkster replied to comment from David Hodgkinson | September 14, 2020 10:17 PM

Zydeco treats class names and role names as relative to the container package unless you prefix them with "::".

Relevant docs.

About Toby Inkster

I'm tobyink on CPAN, IRC and PerlMonks.
