Atom Feed Help

It's more than a touch frustrating for me, but I need help processing an Atom feed (having never done this before). Specifically, I need help with the gitpan Atom feed. GitHub has a useful API, but it can't handle the huge number of repos which gitpan has, nor does it appear that the GitHub API offers any paging facilities.

I've already seen modules like XML::Atom, but what I'd like to see is something which allows me to pull past Atom entries (I know this is available because Google Reader can read the past entries). Heck, even reading the HTTP headers hasn't allowed me to decipher the exact incantation needed. Basically, I'm looking at the following (pseudo-code):

    my $atom = Some::Atom::Module->new($atom_url);
    my ( $limit, $offset ) = ( 100, 0 );
    while ( my $results = $atom->fetch( { limit => $limit, offset => $offset } ) ) {
        process($results);
        $offset += $limit;
    }

I see a number of Atom modules on the CPAN, but I've not found one which offers paging. Have I missed one? Is there a clear resource online to explain how I can at least fetch past Atom results via curl?

6 Comments

Your problem is at a conceptual level I think :-/

It looks like the github atom feed contains 35 entries. So you can only ever get the most recent 35 entries from parsing the atom feed.

I know that Google Reader looks like it can get older stuff. But I'm pretty sure that's only because it downloaded the atom feed when those entries were there and then cached the information in a database.
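If a reader really does build its history that way, the idea can be sketched in a few lines: poll the feed periodically and keep every entry whose Atom id has not been seen before. This is only an illustration of the caching idea, not a description of how Google Reader works internally. The function name `remember_entries` and the shape of the entry hashes (a hashref with an `id` key) are assumptions; a parser such as XML::Atom would supply the real data.

```perl
use strict;
use warnings;

# Merge freshly fetched entries into a store keyed by Atom id and
# return only the entries that were not already known.
# Assumption: each entry is a hashref with at least an `id` key.
sub remember_entries {
    my ( $store, @entries ) = @_;
    my @new;
    for my $entry (@entries) {
        next if exists $store->{ $entry->{id} };    # seen on an earlier poll
        $store->{ $entry->{id} } = $entry;
        push @new, $entry;
    }
    return @new;
}
```

Calling this after each poll, and persisting the store between runs (for example with the core Storable module), would slowly accumulate a history longer than the feed itself ever shows. It cannot recover entries that dropped out of the feed before polling started.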

All of which means that for a huge upload like gitpan, the atom feed is pretty much useless and you'll have to start digging around in the API - perhaps doing stuff a few repos at a time.

Let me know if I can be any more help.

I don't think I've ever seen an atom feed that follows those standards.

FWIW, your pseudo-code is kind of like OpenSearch.

Dave: all Blogger feeds have paging links per RFC 5005. (Hardly much help to Ovid, though.)
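For feeds that do publish RFC 5005 paging links, walking backward amounts to following each page's `rel="next"` link until there is none. The sketch below assumes XML::LibXML is installed (it is not a core module) and that the feed uses absolute URLs in its links. The names `parse_page` and `walk_feed`, and the `$fetch` callback, are hypothetical. It will not help with a feed that carries no such links, which appears to be the case for the GitHub feed discussed here.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

my $ATOM_NS = 'http://www.w3.org/2005/Atom';

# Parse one Atom document; return its entry ids and the rel="next" URL, if any.
sub parse_page {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml );
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs( atom => $ATOM_NS );

    my @ids = map { $xpc->findvalue( 'atom:id', $_ ) }
        $xpc->findnodes('/atom:feed/atom:entry');
    my $next = $xpc->findvalue('/atom:feed/atom:link[@rel="next"]/@href');
    return ( \@ids, ( length $next ? $next : undef ) );
}

# Follow the chain of pages. $fetch is a callback: URL in, XML string out.
# $max_pages and the %visited check guard against endless or looping chains.
sub walk_feed {
    my ( $url, $fetch, $max_pages ) = @_;
    $max_pages //= 10;
    my ( @all, %visited );
    while ( defined $url && !$visited{$url}++ && $max_pages-- > 0 ) {
        my ( $ids, $next ) = parse_page( $fetch->($url) );
        push @all, @$ids;
        $url = $next;
    }
    return @all;
}
```

In real use the callback could wrap the core HTTP::Tiny module, for instance `sub { HTTP::Tiny->new->get( $_[0] )->{content} }`, with error handling added. Keeping the fetch step separate also makes the loop easy to exercise against canned XML.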

Ovid: there is no automagical paging mechanism for feeds. A feed is no more special than a web page. The https://blogs.perl.org front page doesn’t have dynamic paging either, for example, so unless the publisher provides paging links there’s simply no way you can page backward.


About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/