Using XML::Compile to output XSD-compliant XML
As part of a recent project, I was given an XSD (XML Schema Definition) file and asked to output compliant XML. CPAN to the rescue: I found XML::Compile::Schema, a cool module that let me do this with very little fuss. The documentation is really good, but I think a tutorial-style post might be helpful.
To do this you’ll need to install XML::Compile and XML::LibXML.
You can use XML::Compile::Schema to read in your XSD file and output a Perl hash template. You then use that template as a guide to build a hash of real data, and have XML::Compile::Schema turn that hash into valid XML.
For this tutorial, download a sample .xsd file from here and save it as test.xsd. Then write a Perl script like this to dump a Perl hash template.
#!/usr/local/bin/perl
use warnings;
use strict;
use Data::Dumper;
use XML::Compile::Schema;
use XML::LibXML; # provides XML::LibXML::Document, used when writing the XML
my $xsd = 'test.xsd';
my $schema = XML::Compile::Schema->new($xsd);
# This will print a very basic description of what the schema describes
$schema->printIndex();
# This prints a hash template showing how to build a hash that the
# writer can turn into a valid XML file.
#
# Note: the second argument names the element to describe, here the
# root-level element of the XML document. It is required because a
# schema can declare more than one top-level element.
warn $schema->template('PERL', 'addresses');
The relevant output looks like this:
# is an unnamed complex
{ # sequence of address
  # is an unnamed complex
  # occurs 1 <= # <= unbounded times
  address =>
  [ { # sequence of name, street
      # is a xs:string
      # is optional
      name => "example",
      # is a xs:string
      # is optional
      street => "example", }, ], }
The comments are helpful (and were generated by XML::Compile::Schema, not by me). They say your data structure should be a hashref containing a key called “address”, which holds a reference to an array. Each item in that array is a hash reference with two keys, name and street. Both keys are optional, but there must be at least one address.
From this you can deduce that a valid hash will look something like this.
my $data = {
    address => [
        {
            name   => 'name 1',
            street => 'street 1',
        },
        {
            name   => 'name 2',
            street => 'street 2',
        },
    ],
};
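In a real project the records usually come from somewhere else, such as a database query or a CSV file. Here is a minimal sketch of building the same structure from a list of rows. The @rows array of [name, street] pairs is a hypothetical input, not part of the original script.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical input: each row is [ name, street ]. In practice this
# might come from a database or a CSV file.
my @rows = (
    [ 'name 1', 'street 1' ],
    [ 'name 2', 'street 2' ],
);

# Build the structure the template describes: a hashref whose "address"
# key holds an arrayref of { name, street } hashrefs.
# The leading "+" tells Perl the inner braces are a hash constructor,
# not a code block.
my $data = {
    address => [
        map { +{ name => $_->[0], street => $_->[1] } } @rows
    ],
};

$Data::Dumper::Sortkeys = 1;    # stable key order when dumping
print Dumper($data);
```

Dumping the result is a quick way to compare it by eye against the template before handing it to the writer.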
In order to output the XML, you have to do this:
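# The writer needs a document object to create its nodes in
my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');
# Compile a writer for the root element, then turn the hash into a node tree
my $write = $schema->compile(WRITER => 'addresses');
my $xml = $write->($doc, $data);
# Attach the generated tree as the document's root element
$doc->setDocumentElement($xml);
print $doc->toString(1); # 1 indicates "pretty print"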
My output looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<addresses>
  <address>
    <name>name 1</name>
    <street>street 1</street>
  </address>
  <address>
    <name>name 2</name>
    <street>street 2</street>
  </address>
</addresses>
The actual XSD and resulting XML files I was dealing with were much more complicated, but I followed this process and had no trouble whatsoever.
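To try the whole process in one file, the sketch below embeds a schema as a string instead of reading test.xsd. The schema is an assumption: it is reconstructed from the template output shown earlier and is not necessarily identical to the sample file. The script also reads the generated XML back with a READER, as a sanity check that what was written matches the schema, and shows one way to catch the error I would expect when a required element is missing.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Compile::Schema;
use XML::LibXML;

# Assumed schema, reconstructed from the template output:
# one or more <address> elements, each with optional name and street.
my $xsd = <<'XSD';
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="addresses">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="address" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name"   type="xs:string" minOccurs="0"/>
              <xs:element name="street" type="xs:string" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

# The constructor accepts XML text as well as a filename.
my $schema = XML::Compile::Schema->new($xsd);
my $write  = $schema->compile(WRITER => 'addresses');
my $read   = $schema->compile(READER => 'addresses');

my $data = {
    address => [
        { name => 'name 1', street => 'street 1' },
        { name => 'name 2', street => 'street 2' },
    ],
};

# Hash -> XML
my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');
$doc->setDocumentElement( $write->($doc, $data) );
my $xml_text = $doc->toString(1);
print $xml_text;

# XML -> hash: parse the text again and hand the root element to the reader.
my $root = XML::LibXML->load_xml(string => $xml_text)->documentElement;
my $back = $read->($root);
printf "read back %d address(es)\n", scalar @{ $back->{address} };

# The schema requires at least one address, so the writer should die
# when given an empty hash; eval turns that into a false return value.
my $bad_doc = XML::LibXML::Document->new('1.0', 'UTF-8');
my $ok = eval { $write->($bad_doc, {}); 1 };
print "writer rejected incomplete data\n" unless $ok;
```

Compiling the writer and reader once and reusing them is deliberate: compilation does the expensive schema work up front, and the resulting code references can then be called as often as needed.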