A New Catalyst Sitemap Generator

I've just published a new Catalyst plugin to CPAN and would love to get some feedback on it before I increment the version to the magical 1.0 and declare it production-worthy (sometime next week).

There's already a sitemap plugin, written a few years ago, that works quite well. However, I ran into a problem with a client who has close to 40 million public URLs and wants them all represented in a sitemap.

A single sitemap file is limited to 50,000 URLs, but a Sitemap Index file can reference up to 50,000 sitemaps, each with up to 50,000 URLs. That gives a ceiling of 2.5 billion URLs per index.
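To make the numbers concrete, here's a rough sketch of the arithmetic and of the batching idea. It's written in Python purely for illustration and is not the plugin's code; the names sitemaps_needed and chunk_urls are made up for this example.

```python
# Illustrative sketch only: not taken from Catalyst::Plugin::BigSitemap.
# The function names below are hypothetical.

MAX_URLS_PER_SITEMAP = 50_000    # per-file URL limit mentioned above
MAX_SITEMAPS_PER_INDEX = 50_000  # sitemaps one index file can reference


def sitemaps_needed(total_urls, per_file=MAX_URLS_PER_SITEMAP):
    """Return how many sitemap files are needed to hold total_urls."""
    if total_urls < 0:
        raise ValueError("total_urls must be non-negative")
    # Integer ceiling division, avoids any floating point rounding.
    return (total_urls + per_file - 1) // per_file


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield lists of at most `size` URLs, one list per sitemap file."""
    batch = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, partially filled sitemap
        yield batch


if __name__ == "__main__":
    print(sitemaps_needed(40_000_000))                    # 800
    print(MAX_URLS_PER_SITEMAP * MAX_SITEMAPS_PER_INDEX)  # 2500000000
```

So 40 million URLs fit in 800 sitemap files, comfortably within what a single index file can reference.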

Rather than just write a one-off script, I decided to write this as a Catalyst plugin and publish it to CPAN under the name Catalyst::Plugin::BigSitemap. Its public interface matches the original sitemap plugin, so it can be used as a complete, drop-in replacement.

What's supported:


  • Configurable sitemap index and sitemap names.

  • Writing your sitemap files to an arbitrary directory on disk.

  • Overriding the base URL (in the event you're starting Catalyst from a cron job).


What's not supported:


  • Last Modified times in the Sitemap Index file.

  • Splitting sitemaps up by directory (i.e., a sitemap that covers only http://mywebsite.com/books). The sitemaps cover every URL that can be resolved, and they're all placed in the directory you specify.

I look forward to hearing feedback.

Thanks,

About Derek J. Curtis

American software developer living in Jakarta, Indonesia. Primarily working in Perl, Python and C#.