[Kc] Perl Question: XML::Twig module

Daryl Fallin darylvf at gmail.com
Fri Jun 25 17:31:37 PDT 2010


Hi All ....

I have been trying to work with XML::Twig lately to parse an XML file.

I just want to dump every element/tag of the XML file.  But either my while
loop is doing something weird or it is the way that XML::Twig works, I am
not sure which, because I get duplicate information from the original XML
file.  It's like it is running part of the while loop twice.

I know there are other modules that I could use, but I am using XML::Twig
for other parts of what will be a larger program, and I want the chunking
that XML::Twig allows.

Any help would be greatly appreciated.

Here is my sample code:

#!/usr/bin/perl

use strict;
use warnings;
use XML::Twig;

# Handler called once the XML_DIZ_INFO element has been fully parsed.
sub info {
        my ($twig, $info) = @_;
        my $elt = $info;
        # Walk the elements inside $info with next_elt
        while ($elt = $elt->next_elt($info))
        {
                $elt->set_remove_cdata(1);
                $elt->set_pretty_print("record");  # print one field per line
                printf "%s\n", $elt->sprint;
        }
}

my $xml = XML::Twig->new(
        TwigHandlers => {
                XML_DIZ_INFO       => \&info,
        }
);

# Parse the XML
$xml->parsefile('sample.xml');
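
To see what the while loop is actually visiting, it may help to print only
the tag names instead of the full sprint output.  The following is a minimal
sketch, not a definitive diagnosis: visited_tags is a hypothetical helper
name, and the tiny inline document is made up for illustration.  It relies
on next_elt walking the subtree in document order, text nodes included.

```perl
use strict;
use warnings;
use XML::Twig;

# Return the tag name of every node that next_elt() visits below the root.
# visited_tags is a hypothetical helper, not part of XML::Twig.
sub visited_tags {
    my ($xml_text) = @_;
    my @tags;
    my $twig = XML::Twig->new;
    $twig->parse($xml_text);
    my $root = $twig->root;
    my $elt  = $root;
    # Passing $root as the argument limits the walk to $root's subtree.
    while ( $elt = $elt->next_elt($root) ) {
        push @tags, $elt->gi;    # text nodes report '#PCDATA'
    }
    return @tags;
}

print join( ', ', visited_tags('<r><a><b>x</b></a><c>y</c></r>') ), "\n";
```

If the walk reaches nested elements and their text nodes as well as their
parents, then calling sprint at every step would print the same content
once per enclosing level.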

************************

sample.xml
-----------------
<?xml version="1.0" ?>
<XML_DIZ_INFO>
        <MASTER_PAD_VERSION_INFO>
                <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION>
                <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR>
                <MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO>
        </MASTER_PAD_VERSION_INFO>
        <Company_Info>
                <Company_Name>Moyea Software Co., Ltd.</Company_Name>
                <Country>China</Country>
                <Company_WebSite_URL>http://www.whatever.com</Company_WebSite_URL>
                <Contact_Info>
                        <Author_First_Name>Bob</Author_First_Name>
                        <Author_Last_Name>King</Author_Last_Name>
                        <Author_Email>product at moyea.com</Author_Email>
                </Contact_Info>
        </Company_Info>
</XML_DIZ_INFO>

============================================
The following is the output I get.  After the closing </Company_Info> it
should stop.
============================================

  <MASTER_PAD_VERSION_INFO>
    <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION>
    <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR>
    <MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO>
  </MASTER_PAD_VERSION_INFO>

    <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION>
1.0

    <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR>
Master Editor here

    <MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO>
information would go here

  <Company_Info>
    <Company_Name>Moyea Software Co., Ltd.</Company_Name>
    <Country>China</Country>
    <Company_WebSite_URL>http://www.whatever.com</Company_WebSite_URL>
    <Contact_Info>
      <Author_First_Name>Bob</Author_First_Name>
      <Author_Last_Name>King</Author_Last_Name>
      <Author_Email>product at moyea.com</Author_Email>
    </Contact_Info>
  </Company_Info>

    <Company_Name>Moyea Software Co., Ltd.</Company_Name>
Moyea Software Co., Ltd.

    <Country>China</Country>
China

    <Company_WebSite_URL>http://www.whatever.com</Company_WebSite_URL>
http://www.whatever.com

    <Contact_Info>
      <Author_First_Name>Bob</Author_First_Name>
      <Author_Last_Name>King</Author_Last_Name>
      <Author_Email>product at moyea.com</Author_Email>
    </Contact_Info>

      <Author_First_Name>Bob</Author_First_Name>
Bob

      <Author_Last_Name>King</Author_Last_Name>
King

      <Author_Email>product at moyea.com</Author_Email>
product at moyea.com
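
One possible reading of the output above: sprint serializes an element
together with everything nested inside it, while next_elt also steps into
the nested elements and their text, so each piece of content shows up once
per level.  A minimal sketch of one way to print each top-level section only
once follows, assuming that is the desired result.  dump_children is a
hypothetical helper name, the inline sample is a shortened version of
sample.xml, and the '#ELT' condition is used to skip text nodes.

```perl
use strict;
use warnings;
use XML::Twig;

# Serialize each direct child element of XML_DIZ_INFO exactly once.
# dump_children is a hypothetical helper name used for illustration.
sub dump_children {
    my ($xml_text) = @_;
    my @chunks;
    my $twig = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            XML_DIZ_INFO => sub {
                my ( $t, $info ) = @_;
                # '#ELT' skips text nodes; sprint already includes each
                # child's whole subtree, so no deeper walk is needed.
                push @chunks, scalar $_->sprint for $info->children('#ELT');
            },
        },
    );
    $twig->parse($xml_text);
    return @chunks;
}

my $sample = <<'XML';
<XML_DIZ_INFO>
  <MASTER_PAD_VERSION_INFO>
    <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION>
  </MASTER_PAD_VERSION_INFO>
  <Company_Info>
    <Company_Name>Moyea Software Co., Ltd.</Company_Name>
  </Company_Info>
</XML_DIZ_INFO>
XML

print "$_\n" for dump_children($sample);
```

With this approach the handler only looks at the direct children of
XML_DIZ_INFO, so the nested elements appear inside their parent's output
and are not printed again on their own.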