It's time to admit I've failed
Two days ago I was so excited! I had an idea for how to make the Perl world a bit better, faster and simpler. Of course, I didn’t spread such exciting news until I had checked, double-checked and benchmarked, until I was absolutely sure I’d found The Holy Grail.
Well, see the title. It hurts. All my benchmarks contained a terrible mistake. And the +20%, or maybe even +100%, speed boost a PugiXML interface could provide isn’t worth all the buzz I created.
Still, the Perl interface to PugiXML I described in my previous post could be (optimistically) twice as fast as LibXML. In some cases. But I’m so disappointed by my failure that I just don’t think it’s worth it.
Another lesson learned.
When HTML parsing feels too slow, please use something LibXML-based, like HTML::TreeBuilder::LibXML or just XML::LibXML. Just make sure you use the load_html() family instead of load_xml() and enable recover() mode, as is done in HTML::TreeBuilder::LibXML.
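Here is a minimal sketch of that advice, assuming XML::LibXML is installed. The HTML snippet is made up for illustration, and the two suppress options are my own choice to keep the output quiet; I have not checked them against what HTML::TreeBuilder::LibXML does internally.

```perl
use strict;
use warnings;
use XML::LibXML;

# Deliberately sloppy HTML: unclosed <p> and <b> tags (made-up sample).
my $html = '<html><body><p>First<p>Second <b>bold</body></html>';

# load_html() uses the HTML parser; load_xml() would reject this input.
my $dom = XML::LibXML->load_html(
    string            => $html,
    recover           => 1,   # keep parsing past markup errors
    suppress_errors   => 1,   # assumption: we do not want parser noise
    suppress_warnings => 1,
);

# Query the recovered tree with XPath as usual.
my @paragraphs = map { $_->textContent } $dom->findnodes('//p');
print scalar(@paragraphs), " paragraphs found\n";
```

If you need to know what the parser had to fix, drop the suppress options and read the warnings instead.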
For those who are still interested, the code of the prototype is published at https://github.com/yko/pugixml-perl
At some point I may decide to continue development. Unfortunately, it will not be as lightning fast as I initially expected.