[Pdx-pm] Iksemel XML parser testing results

Tyler Riddle triddle at gmail.com
Fri Dec 11 08:58:54 PST 2009


On Sun, Dec 6, 2009 at 1:03 PM, Erik Hollensbe <erik at hollensbe.org> wrote:
> If you find the time to get this working, alert me (or the list) of your
> discoveries, I am genuinely curious.
>

I got the Iksemel parser "working". If you'd like to see the code,
it's at https://triddle.projecthut.com/svn/triddle/XML_Speed_Test/bin/iksemel.c

"Working" is in quotes because it doesn't really work. I'm able to
get Iksemel to parse a simple test case of a Wikipedia dump file
(https://triddle.projecthut.com/svn/triddle/Parse-MediaWikiDump/t/pages_test.xml)
but it chokes with an XML document format error on the real XML dump
files from Wikipedia, while every other XML parser I tested handles
them just fine. So in this instance it would seem that the features
removed to make Iksemel go faster are the very ones XML needs for
anything beyond the simplest of documents.
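
For anyone who wants to narrow down where a dump file gets rejected,
one option is to stream the same file through a different parser and
see whether it reports an error, and at what position. The sketch
below uses Python's standard xml.parsers.expat module rather than
Iksemel, so it is not what iksemel.c does. The function name
check_well_formed and the chunk size are assumptions made up for
illustration.

```python
import xml.parsers.expat


def check_well_formed(path, chunk_size=1 << 20):
    """Stream a file through expat in fixed-size chunks.

    Returns None if the parser accepts the whole document, otherwise
    a (line, column, message) tuple describing the first error.
    The name and default chunk size are hypothetical choices.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                # isfinal=False: more data may follow this chunk
                parser.Parse(chunk, False)
        # An empty final call tells expat the document has ended
        parser.Parse(b"", True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None
```

If this returns None for a file that Iksemel rejects, that would
suggest the file is well-formed and the problem is on the Iksemel
side. It does not say which XML feature Iksemel is tripping over.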

Tyler Riddle

-- 
If you wish to make an apple pie from scratch you must first invent
the universe. -- Carl Sagan

