Processing schema.org markup with Perl

Someone on IRC asked me for an example of how to parse schema.org markup using my HTML::HTML5::Microdata::Parser module, so here is one. It pulls the microdata from the page and queries it using SPARQL.

#!/usr/bin/env perl

use strict;
use warnings;

use HTML::HTML5::Microdata::Parser;
use LWP::Simple 'get';
use RDF::Query;

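# Fetch the page and parse its microdata; the URI is also passed
# as the base for resolving relative links.
my $uri = "http://buzzword.org.uk/2012/schema-org.html";
my $microdata = HTML::HTML5::Microdata::Parser->new(
   get($uri),
   $uri,
);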

my $query = RDF::Query->new(<<'SPARQL');
PREFIX schema: <http://schema.org/>
SELECT ?name ?page
WHERE {
   ?person
      a schema:Person ;
      schema:name ?name ;
      schema:url ?page .
}
SPARQL

# Run the query against the RDF graph extracted from the microdata.
my $people = $query->execute($microdata->graph);

while (my $person = $people->next)
{
   printf(
      "Found person: %s %s\n",
      $person->{name},
      $person->{page},
   );
}
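
If you would rather print plain strings than rely on how the node objects stringify, a small helper can unwrap each result value first. This is only a sketch: node_text is a hypothetical name, and it assumes the values in each result row are RDF::Trine node objects, where literals provide literal_value and resources provide uri_value. It checks for those methods with can() rather than testing for specific classes.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical helper (not part of any module): turn a query result
# value into a plain string. Assumes RDF::Trine-style nodes, where
# literals offer literal_value and resources offer uri_value.
sub node_text {
   my ($node) = @_;
   return '' unless defined $node;          # variable was unbound
   return "$node" unless blessed($node);    # already a plain scalar
   return $node->literal_value if $node->can('literal_value');
   return $node->uri_value     if $node->can('uri_value');
   return $node->as_string     if $node->can('as_string');
   return "$node";                          # last resort
}
```

In the loop above, that would mean passing node_text($person->{name}) and node_text($person->{page}) to printf instead of the raw values.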

3 Comments

I'm curious about the syntax highlighting on this blog post's code. How did you do it?

Thanks. Ah, no wonder the color theme feels vaguely familiar.

About Toby Inkster

I'm tobyink on CPAN, IRC and PerlMonks.