[Chicago-talk] Navigating an XSD

Brian Mielke zelein at gmail.com
Fri Jan 23 07:19:52 PST 2015


For writing, XML::Writer is sufficient, though I think some DOM libraries 
also let you print DOM objects to files.  As far as reader libraries go, 
the big ones that I've used are XML::Simple, XML::Twig, and 
XML::LibXML.  Avoid XML::Simple.  XML::Twig is OK.  XML::LibXML may take 
more time to get started with, but when I last tested those two, 
XML::LibXML was much faster.
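
Since an XSD is itself XML, a DOM parse plus XPath is one way to walk it 
like a Perl structure.  A minimal sketch with XML::LibXML, assuming a 
made-up schema fragment (the element names below are illustrative, not 
taken from Jay's actual XSD):

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical XSD fragment, for illustration only.
my $xsd = <<'XSD';
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="birthdate" type="xs:date"/>
        <xs:element name="ssn" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

# Parse the whole document into a DOM tree.
my $doc = XML::LibXML->load_xml( string => $xsd );

# XSD elements live in the XML Schema namespace, so bind a prefix for XPath.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( xs => 'http://www.w3.org/2001/XMLSchema' );

# Collect every element declaration that carries a type attribute.
my @fields;
for my $el ( $xpc->findnodes('//xs:element[@type]') ) {
    push @fields, {
        name => $el->getAttribute('name'),
        type => $el->getAttribute('type'),
    };
}

printf "%s => %s\n", $_->{name}, $_->{type} for @fields;
```

For a real schema you would pass location => $file to load_xml instead 
of a string.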

I've not used Mojo::DOM, but with DOM in the name, it probably has the 
same issues as any DOM parser: easy to code with and good for small data 
sets, but slow and memory intensive with large ones.  If you are doing 
anything with a lot of data, you'll want to read an article on DOM vs. 
SAX parsing, and you may want to look at XML::LibXML::Reader (I think).  
I think it's a module that lets you event-parse down to specific nodes 
and then build DOM objects from them, which are easier to work with in 
code.  It's been several years since I've had to do anything like that, 
though.  The most important thing with XML parsing is knowing when to 
use an event-based parser vs. a DOM parser.
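
A rough sketch of that hybrid approach with XML::LibXML::Reader: stream 
to each element of interest, then copy just that subtree into a small 
DOM object.  The sample data and the element names (person, birthdate) 
are assumptions borrowed from John's example, not a real file:

```perl
use strict;
use warnings;
use XML::LibXML::Reader;

# Hypothetical sample data; a real run would use location => $file.
my $xml = <<'XML';
<people>
  <person><birthdate>1970-01-01</birthdate><ssn>000-00-0000</ssn></person>
  <person><birthdate>1985-06-15</birthdate><ssn>000-00-0001</ssn></person>
</people>
XML

my $reader = XML::LibXML::Reader->new( string => $xml )
    or die "cannot create reader\n";

my @birthdates;

# Skip forward, event style, to each <person> element in turn.
while ( $reader->nextElement('person') == 1 ) {

    # Deep-copy only this subtree into a DOM node.
    my $person = $reader->copyCurrentNode(1);

    # Regular DOM/XPath calls now work on the small subtree.
    push @birthdates, $person->findvalue('birthdate');
}

print "$_\n" for @birthdates;
```

Only one person subtree is held as a DOM node at a time, which is the 
point of combining the two styles.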

There may be some schema support in XML::LibXML too, but fortunately 
it's been a while since I've had to do anything with big XML.
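
For what it's worth, the schema support I'm aware of there is 
XML::LibXML::Schema, which validates a document against an XSD rather 
than helping you navigate the XSD itself.  A minimal sketch, again with 
a made-up schema and documents (is_valid is a hypothetical helper name):

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical schema, for illustration only.
my $xsd = <<'XSD';
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="birthdate" type="xs:date"/>
        <xs:element name="ssn" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

my $schema = XML::LibXML::Schema->new( string => $xsd );

# validate() dies on an invalid document, so wrap it in eval.
sub is_valid {
    my ($doc) = @_;
    return eval { $schema->validate($doc); 1 } ? 1 : 0;
}

my $good = XML::LibXML->load_xml( string =>
    '<person><birthdate>1970-01-01</birthdate><ssn>000-00-0000</ssn></person>' );
my $bad = XML::LibXML->load_xml( string =>
    '<person><birthdate>not-a-date</birthdate><ssn>000-00-0000</ssn></person>' );

print "good: ", is_valid($good), "\n";
print "bad:  ", is_valid($bad),  "\n";
```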

- Brian

On 01/23/2015 08:45 AM, Jay Strauss wrote:
> Hi John, thanks. Right now I'm going to try Mojo::DOM and see how that goes.
>
> Sent from my iPhone
>
>> On Jan 23, 2015, at 8:23 AM, John Kristoff <jtk at depaul.edu> wrote:
>>
>>> On Fri, Jan 23, 2015 at 05:12:06AM +0000, Jay Strauss wrote:
>>> I have an XSD document like below.  I've been googling and cpan-ing,
>>> but can't find what I need (I know there are lots of XML packages),
>>> and I don't know how to parse it reliably using regexes.  I just want
>>> to read in this XSD, navigate it like a Perl structure, and extract
>>> field values like:
>> [...]
>>> Can anyone recommend a module or a method?
>> Hi Jay.  I have had some success using XML::Twig and XML::Writer for reading
>> and writing, respectively, relatively simple but large XML files.
>>
>> With XML::Twig, you can register handlers for just the blocks you're
>> interested in.  For example:
>>
>> use XML::Twig;
>>
>> my $t = XML::Twig->new(
>>     twig_handlers => {
>>         'person' => \&section,
>>     },
>> );
>> $t->parsefile($xmlfile);
>>
>> sub section {
>>     my ( $t, $section ) = @_;
>>
>>     # interested in a few elements within this section
>>     do_something_with( $section->first_child_text('birthdate') );
>>     do_somethingelse_with( $section->first_child_text('ssn') );
>>     # ...
>>
>>     # do not need that element again
>>     $section->purge;
>>
>>     return;
>> }
>>
>> See if that might work for you.
>>
>> John
>> _______________________________________________
>> Chicago-talk mailing list
>> Chicago-talk at pm.org
>> http://mail.pm.org/mailman/listinfo/chicago-talk