Another approach to XML processing
The buzz around XML has passed and we are left with a lot of Perl modules to process XML in different ways. I was surprised to still find a gap for another XML processing module.
Common schema-less approaches to XML processing in Perl seem to use XML::LibXML, to get a full XML DOM or a stream of parsing events, or XML::Simple (better used as XML::LibXML::Simple). XML::Simple transforms between XML and Perl data structures, but it was designed for "data-oriented" XML where element order does not matter much. With XML::Struct I created something like XML::Simple for "document-oriented" XML.
While XML::Simple uses hashes (or hashes of arrays when elements are repeated), XML::Struct uses arrays to represent XML data. This is best illustrated by an example:
<root>
  <foo>text</foo>
  <bar key="value"> <!-- mixed content here: -->
    text
    <doz/>
  </bar>
</root>
is transformed to this structure:
[
  root => { }, [
    [ foo => { }, "text" ],
    [ bar => { key => "value" }, [
        "text",
        [ doz => { } ]
      ]
    ]
  ]
]
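Because every element is an array of name, attribute hash, and content, the structure can be walked with a few lines of plain Perl. The following is only a minimal sketch: it writes the example structure out by hand instead of calling the module, and element_names is a hypothetical helper, not part of XML::Struct. It assumes that content may be missing, a plain string, or an array of child nodes, as in the example above.

```perl
use strict;
use warnings;

# The example structure, written out by hand (not produced by the module here).
my $doc = [
    root => { }, [
        [ foo => { }, "text" ],
        [ bar => { key => "value" }, [
            "text",
            [ doz => { } ],
        ] ],
    ],
];

# Hypothetical helper: collect element names in document order.
sub element_names {
    my ($element) = @_;
    my ($name, $attributes, $content) = @$element;

    # Assumption: content is either missing (empty element), a plain
    # string, or an array reference of child nodes. Normalize to a list.
    my @children = !defined($content)       ? ()
                 : ref($content) eq 'ARRAY' ? @$content
                 :                            ($content);

    # Child elements are array references, text nodes are plain strings.
    return ( $name, map { element_names($_) } grep { ref($_) } @children );
}

my @names = element_names($doc);
print join(' ', @names), "\n";
```

The names come back in the order the elements appear in the document, which is the property a hash-based representation cannot guarantee.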
XML attributes are transformed to hashes, which can be omitted with attributes => 0. If you still want a key-value structure for (parts of) a document, use hashifyXML. The distribution contains methods for both parsing and serializing, based on XML::LibXML. XML is processed as a stream, so one can also extract chunks from very large XML files.
Comments, bug reports, extensions etc. are welcome, especially at https://github.com/nichtich/XML-Struct.
Exactly what I wanted a little while ago. Keeping the order intact is the best part of it.