Web Scraping with Perl & PhantomJS

PhantomJS is a 'headless' WebKit browser, mainly intended for use as a web testing framework, and is controlled through a JavaScript API. Because it runs without a visible window yet still executes a page's scripts, it is also extremely useful for scraping JavaScript-heavy websites.
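To give a rough idea of what controlling PhantomJS through its JavaScript API looks like, here is a minimal sketch that loads a page, lets its scripts run, and reads links from the rendered DOM. The URL, the file name `scrape.js` and the helper name `collectLinks` are placeholders chosen for illustration, not anything taken from a real project.

```javascript
// Minimal PhantomJS sketch. Run with: phantomjs scrape.js
// (scrape.js, the URL and collectLinks are hypothetical names.)

// Executed inside the loaded page by page.evaluate(). PhantomJS runs it
// in the page's own context, so it may only use what exists there
// (such as `document`) and must return plain, serializable data.
function collectLinks() {
  var anchors = document.querySelectorAll('a[href]');
  var links = [];
  for (var i = 0; i < anchors.length; i++) {
    links.push({ text: anchors[i].textContent, href: anchors[i].href });
  }
  return links;
}

// Only drive the browser when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  page.open('http://example.com/', function (status) {
    if (status === 'success') {
      // The page's JavaScript has run by now; read the rendered DOM.
      console.log(JSON.stringify(page.evaluate(collectLinks), null, 2));
    } else {
      console.log('Failed to load page');
    }
    phantom.exit(status === 'success' ? 0 : 1);
  });
}
```

Keeping the extraction logic in its own function makes it easy to swap in a different selector without touching the page-loading code.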

The problem with PhantomJS, up until the v1.8 release on 23 December 2012, was that it wasn't very easy to understand or control if you were unfamiliar with JavaScript, CoffeeScript or Node.js (if you were using the Casper.js fork). Since the v1.8 release, PhantomJS supports …

About Rob Hammond

I blog mostly about SEO, but sometimes about Perl.