All -<br><br>I figured out the problem; Sterling Hanenkamp got me going in the right direction.<br><br>I used an abstract example to ask my question, so here is an explanation along with my actual code.<br><br>I am working with the Qualys API, and I want to pull all scan data back from Qualys so that I can store it and mash it up against other data sources.<br>
<br>The DTD for the Qualys XML is: <a href="https://qualysapi.qualys.com/scan-1.dtd" target="_blank">https://qualysapi.qualys.com/scan-1.dtd</a> (this gives you the structure of the XML file).<br><br>Here is the basic code that I ended up with. It runs on the XML file after it has been retrieved from Qualys.<br>
<br><br>*************************************************<br><div id=":1v7" class="ii gt">#!/usr/bin/perl -w<br><br># Indentation style: 1 tab = 4 spaces<div class="im"><br><br>require XML::Twig;<br><br>sub info {<br>    my ($xml, $info) = @_;<br>
    my $elt = $info;<br></div>
    if ($elt->tag =~ m/^(?:VULN|SERVICE|INFO|PRACTICE)$/) {<br>
        printf "VALUE: %s \n", $elt->parent->parent->parent->att("value");<br>
        printf "ENT: %s \n", $elt->tag;<br>
    }<br><br>    if ($elt->tag =~ m/^(?:OS|NETBIOS_HOSTNAME)$/) {<br>
        printf "VALUE: %s \n", $elt->parent->att("value");<br>
        printf "ENT: %s \n", $elt->tag;<br>
        printf "%s\n", $elt->text;<div class="im"><br>
    }<br>    while ($elt = $elt->next_elt($info))<br>    {<br></div>
        my $localname = $elt->local_name;<br>        if ($localname ne '#CDATA' && $localname ne '#PCDATA') {<br>
            printf "%s: ", $localname;<br>
            printf "%s\n", $elt->text;<br>        }<br>    }<br>
    printf "\n\n";<br>}<br><br>#===================================================<br>
#Main program section<div class="im"><br><br>my $xml = XML::Twig->new(<br>
    TwigHandlers => {<br></div>        SERVICE => \&info,<br>        VULN => \&info,<br>
        OS => \&info,<br>
        NETBIOS_HOSTNAME => \&info,<br>
        INFO => \&info,<br>
        PRACTICE => \&info,<br>
        HEADER => \&info,<br>
        #_all_ => \&info, # not using _all_ to ignore the toplevel SCAN tag<br>    },<br>
    error_context => 1,<div class="im"><br>);<br><br># Parse the XML<br>$xml->parsefile('sample.xml');<br>
<br></div>******************************************************************</div><br><br><div class="gmail_quote">On Fri, Jun 25, 2010 at 7:31 PM, Daryl Fallin <span dir="ltr"><<a href="mailto:darylvf@gmail.com">darylvf@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">Hi All,<br><br>I have been working with XML::Twig lately to parse an XML file.<br>
<br>I just want to dump every element/tag of the XML file. Either my while loop is doing something weird or it is the way XML::Twig works, I am not sure which, but I get duplicate information from the original XML file. It is as if part of the while loop runs twice.<br>
<br>I know there are other modules that I could use, but I am using XML::Twig for other parts of what will be a larger program, and I want the chunking that XML::Twig allows.<br><br>Any help would be greatly appreciated.<br>
<br>Here is my sample code:<br><br>#!/usr/bin/perl -w<br><br>require XML::Twig;<br><br>sub info {<br> my ($xml, $info) = @_;<br> my $elt = $info;<br> while ($elt= $elt->next_elt($info) )<br> {<br>
$elt->set_remove_cdata(1);<br> $elt->set_pretty_print("record"); # print one field per line<br> printf "%s\n", $elt->sprint;<br> }<br>}<br>
<br>$xml = new XML::Twig(<br> TwigHandlers => {<br> XML_DIZ_INFO => \&info,<br> }<br>);<br><br># Parse the XML<br>$xml->parsefile('sample.xml');<br><br>************************<br>
<br>sample.xml<br>-----------------<br><?xml version="1.0" ?><br><XML_DIZ_INFO><br> <MASTER_PAD_VERSION_INFO><br> <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION><br>
<MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR><br> <MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO><br> </MASTER_PAD_VERSION_INFO><br>
<Company_Info><br> <Company_Name>Moyea Software Co., Ltd.</Company_Name><br> <Country>China</Country><br> <Company_WebSite_URL><a href="http://www.whatever.com" target="_blank">http://www.whatever.com</a></Company_WebSite_URL><br>
<Contact_Info><br> <Author_First_Name>Bob</Author_First_Name><br> <Author_Last_Name>King</Author_Last_Name><br> <Author_Email><a href="mailto:product@moyea.com" target="_blank">product@moyea.com</a></Author_Email><br>
</Contact_Info><br> </Company_Info><br></XML_DIZ_INFO><br><br>============================================<br>The following is the output I get. After the closing </Company_Info> it should stop.<br>
============================================<br><br> <MASTER_PAD_VERSION_INFO><br> <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION><br> <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR><br>
<MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO><br> </MASTER_PAD_VERSION_INFO><br><br> <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION><br>1.0<br><br> <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR><br>
Master Editor here<br><br> <MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO><br>information would go here <br><br> <Company_Info><br> <Company_Name>Moyea Software Co., Ltd.</Company_Name><br>
<Country>China</Country><br> <Company_WebSite_URL><a href="http://www.whatever.com" target="_blank">http://www.whatever.com</a></Company_WebSite_URL><br> <Contact_Info><br> <Author_First_Name>Bob</Author_First_Name><br>
<Author_Last_Name>King</Author_Last_Name><br> <Author_Email><a href="mailto:product@moyea.com" target="_blank">product@moyea.com</a></Author_Email><br> </Contact_Info><br> </Company_Info><br>
<br> <Company_Name>Moyea Software Co., Ltd.</Company_Name><br>Moyea Software Co., Ltd.<br><br> <Country>China</Country><br>China<br><br> <Company_WebSite_URL><a href="http://www.whatever.com" target="_blank">http://www.whatever.com</a></Company_WebSite_URL><br>
<a href="http://www.whatever.com" target="_blank">http://www.whatever.com</a><br><br> <Contact_Info><br> <Author_First_Name>Bob</Author_First_Name><br> <Author_Last_Name>King</Author_Last_Name><br>
<Author_Email><a href="mailto:product@moyea.com" target="_blank">product@moyea.com</a></Author_Email><br> </Contact_Info><br><br> <Author_First_Name>Bob</Author_First_Name><br>
Bob<br><br> <Author_Last_Name>King</Author_Last_Name><br>
King<br><br> <Author_Email><a href="mailto:product@moyea.com" target="_blank">product@moyea.com</a></Author_Email><br><a href="mailto:product@moyea.com" target="_blank">product@moyea.com</a><br><br><br>
</blockquote></div><br>
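A note on the duplicate output in the quoted script: it appears to come from two things working together. `next_elt($info)` visits every descendant of the handled element in document order, text nodes included, and `sprint` serializes an element together with its whole subtree. Each nested value is therefore printed once for every ancestor and once more for its own text node. Below is a minimal sketch of one way to avoid that, assuming only leaf elements are wanted. The helper name `leaf_lines` and the trimmed sample document are illustrative and not part of the scripts above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: return one "tag: text" string per leaf element
# below $root, so no value is repeated for each of its ancestors.
sub leaf_lines {
    my ($root) = @_;
    my @lines;
    # '#ELT' restricts the walk to real elements (no #PCDATA / #CDATA nodes).
    for my $elt ($root->descendants('#ELT')) {
        next if $elt->first_child('#ELT');    # has child elements: not a leaf
        push @lines, sprintf '%s: %s', $elt->tag, $elt->text;
    }
    return @lines;
}

# Trimmed-down version of the sample document from the original question.
my $sample = <<'XML';
<XML_DIZ_INFO>
  <Company_Info>
    <Company_Name>Moyea Software Co., Ltd.</Company_Name>
    <Contact_Info>
      <Author_First_Name>Bob</Author_First_Name>
    </Contact_Info>
  </Company_Info>
</XML_DIZ_INFO>
XML

my @lines;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once the whole XML_DIZ_INFO element has been parsed.
        XML_DIZ_INFO => sub {
            my ($t, $elt) = @_;
            @lines = leaf_lines($elt);
        },
    },
);
$twig->parse($sample);

print "$_\n" for @lines;
```

With this input the sketch should print `Company_Name: Moyea Software Co., Ltd.` and `Author_First_Name: Bob`, one per line. Container elements such as `Company_Info` are skipped because they have element children.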