Processing schema.org markup with Perl
Someone on IRC asked me for an example of how to parse schema.org markup using my HTML::HTML5::Microdata::Parser module, so here is one. It pulls the microdata from the page and queries it using SPARQL.
#!/usr/bin/env perl
use HTML::HTML5::Microdata::Parser;
use LWP::Simple 'get';
use RDF::Query;
my $uri = "http://buzzword.org.uk/2012/schema-org.html";
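# Fetch the page; LWP::Simple's get() returns undef on failure.
my $html = get($uri) or die "Could not fetch $uri\n";
# Parse the microdata, using the page URI as the base for relative links.
my $microdata = HTML::HTML5::Microdata::Parser->new($html, $uri);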
my $query = RDF::Query->new(<<'SPARQL');
PREFIX schema: <http://schema.org/>
SELECT ?name ?page
WHERE {
?person
a schema:Person ;
schema:name ?name ;
schema:url ?page .
}
SPARQL
my $people = $query->execute($microdata->graph);
while (my $person = $people->next)
{
	printf(
		"Found person: %s %s\n",
		$person->{name},
		$person->{page},
	);
}
I'm curious about the syntax highlighting on this blog post's code. How did you do it?
I used the HTML export feature of my text editor (SciTE), which exports the file with syntax highlighting, and then ran that through CSS::Inliner. The heredoc background and fixed-width font needed manual intervention.
Thanks. Ah, no wonder the color theme feels vaguely familiar.