SPUG: MSXML & Perl

Rick Brannan rick at libronix.com
Tue Oct 24 00:36:13 CDT 2000


For folks on Windows platforms (like me):

Have you ever experimented with using Win32::OLE and Microsoft's MSXML? It
offers compiled code that quickly loads things into the DOM and allows you
to do all sorts of funky stuff with XSL/XPath queries. Consider the following
extremely simplified XML:

<?xml version="1.0"?>
<root>
<element>element 1 content</element>
<element>element 2 content</element>
<element>element 3 content</element>
</root>

And a very simple example:

#--------------------------------
use Win32::OLE qw(CP_UTF8);   # CP_UTF8 is only exported on request

# UTF-8 stuff in perl 5.0?? builds
# (I don't have ActiveState's 5.6 installed yet)
$Win32::OLE::CP = CP_UTF8;

my $xmlFile = "myxml.xml";
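# OLE failures are reported through Win32::OLE->LastError, not $!
my $xmlDom = Win32::OLE->new('Microsoft.XMLDOM')
	or die "Can't create Microsoft.XMLDOM: " . Win32::OLE->LastError();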

# don't worry about asynchronous access locally ...
$xmlDom->{async} = 0;

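# load it, and stop with the parser's own explanation if that fails.
$xmlDom->load($xmlFile)
	or die "Can't load $xmlFile: " . $xmlDom->{parseError}{reason};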

# get the elements
my $elements = $xmlDom->selectNodes("/root/element");
foreach my $nCount (0 .. ($elements->length - 1))
	{
	my $elementContent = $elements->item($nCount)->{text};
	print "\nContent: $elementContent";
	}
#--------------------------------

Output:

C:\temp\xml>perl temp.pl

Content: element 1 content
Content: element 2 content
Content: element 3 content


Basically, anything you could do with MSXML in JavaScript/VBScript/etc. you
can do with Perl too. I initially ran across this when converting someone
else's existing JavaScript stuff to Perl so I could have access to Perl's
regex superiority, and ever since then this has been my latest hammer, and
all my data have conveniently transformed into nails.

I've got some 3-5 MB XML files that load into the DOM *very* quickly and
then allow me to run all sorts of interesting and somewhat twisted queries
on the data using XPath syntax. More info on the available XMLDOM
methods/properties and XPath syntax is available on MSDN
(msdn.microsoft.com/xml). I have had some problems with munging of Unicode
data (Russian, specifically, though German and Spanish seem to be all right)
but I think 5.6 will fix that. Note that with $Win32::OLE::CP set to CP_UTF8,
as in the script above, the strings MSXML hands back arrive in Perl as UTF-8.
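
To give a flavor of those queries, here is a minimal sketch that runs two
XPath predicates against the sample document from earlier in this message.
It is an illustration under assumptions, not a tested recipe: it assumes an
MSXML version that accepts setProperty('SelectionLanguage', 'XPath') (3.0 or
later, as far as I know; older builds only speak the XSL pattern dialect,
which counts positions from 0 rather than 1), and it inlines the XML with
loadXML so that no file is needed.

```perl
use strict;
use warnings;
use Win32::OLE;

# The sample document from above, inlined so the script needs no file.
my $xml = <<'XML';
<?xml version="1.0"?>
<root>
<element>element 1 content</element>
<element>element 2 content</element>
<element>element 3 content</element>
</root>
XML

my $dom = Win32::OLE->new('Microsoft.XMLDOM')
    or die "Can't create Microsoft.XMLDOM: " . Win32::OLE->LastError();
$dom->{async} = 0;

# Assumption: this MSXML build supports the SelectionLanguage property.
# Without it, selectNodes may use the older XSL pattern syntax instead.
$dom->setProperty('SelectionLanguage', 'XPath');

$dom->loadXML($xml)
    or die "Parse failed: " . $dom->{parseError}{reason};

# Positional predicate: XPath positions start at 1.
my $second = $dom->selectSingleNode('/root/element[2]');
print "Second: $second->{text}\n";

# Function predicate: every <element> whose text contains "3".
my $hits = $dom->selectNodes("/root/element[contains(., '3')]");
for my $i (0 .. $hits->{length} - 1) {
    print "Match: ", $hits->item($i)->{text}, "\n";
}
```

selectSingleNode returns just the first matching node, which saves the
length/item loop when only one hit is expected.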

Hope it helps. I know I've found it handy. Now, back to lurk mode ...
hopefully I haven't stirred the 'M$ is evil' crowd.

-----------------------------------------
Rick Brannan -- rick at libronix.com
Book Design Manager, Libronix Corp.
http://www.libronix.com
"Unless Scripture is studied and preached
 with diligence, Christians will not know
 what God requires of them."
                          -- E.J. Carnell
