Any interest in Perl and XML
Bob La Quey
robertl1 at home.com
Fri May 19 17:18:59 CDT 2000
At 02:37 PM 5/19/00 -0700, you wrote:
>~sdpm~
>Dear Bob,
>
>> I have some code and would be interested in seeing what others
>> are up to as well.
>
> Check out:
>http://www-4.ibm.com/software/developer/library/xml-perl/
>
> and
>http://www.velocigen.com/~tdarugar/talks/
>
> For some of my stuff on XML. We use XML and Perl extensively
>here, so I'm definitely interested in the topic.
>
>Best,
>
>Parand Tony Darugar tdarugar at velocigen.com
Kool,
Here is a little ditty I wrote for xml.com
http://www.xml.com/pub/1999/11/sml/index.html
But why don't we post a little code up here so we can
have some fun and games. After all there is "more than
one way to do it" so I suspect we can have some amusing
arguments if not downright flamefests and religious wars.
I'll kick it off. This is some code I presented at the Linux
Programmers Study Group (LPSG) about a year ago. The LPSG is
an offshoot of the Kernel-Panic Linux Users Group (KPLUG).
LPSG meets twice a month here in San Diego at the San Diego
County of Education facility. See
http://www.kernel-panic.org/lpsg/ for details.
We also have an active mailing list.
Here is some Perl code that uses the XML::Parser
module to (surprise) parse XML.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

my $source_filename = shift or die "Usage: $0 file.xml\n";
my $count = 0;    # not used yet

###################################################################################
# Supporting data structures and subroutines for parser
###################################################################################

# set up specifics of my xml handlers
my %tags = ();    # not used yet

# start_handler pushes tag on tag_stack, end_handler pops tag off tag_stack
my @tag_stack = ();
my $sp = 1;       # stack pointer; the indent depth is $sp - 1

# handlers for xml parser events
sub init_handler {
    my ($p) = @_;
    print("Starting XML parsing\n");
}

sub final_handler {
    my ($p) = @_;
    print("End of XML parsing\n");
    return 1;     # parsefile returns whatever the Final handler returns
}

sub start_handler {
    my ($p, $tag) = @_;
    $tag_stack[++$sp] = $tag;
    indent();
    print("<", $tag, ">\n");
}

sub end_handler {
    my ($p, $tag) = @_;
    indent();
    print("</", $tag, ">\n");
    $sp--;
}

sub char_handler {
    my ($p, $el) = @_;
    my $current_tag = $tag_stack[$sp];
    indent();
    print("char_handler, current tag <", $current_tag, "> data element = \"", $el, "\"\n");
}

# default_handler is not registered with setHandlers below, so it is never invoked
sub default_handler {
    my ($p, $el) = @_;
    if ($el =~ /\w/) {    # only print strings containing a word character
        print("default_handler ", $p, " for ", $el, " ord = ", ord($el), "\n");
    }
}

# print four spaces per level of nesting
sub spaces {
    my ($deep) = @_;
    print(' ' x (4 * $deep));
}

sub indent {
    spaces($sp - 1);
}

###################################################################################
# The parser
###################################################################################
my $parser = new XML::Parser(ErrorContext => 2);
$parser->setHandlers(
    Init  => \&init_handler,
    Start => \&start_handler,
    Char  => \&char_handler,
    End   => \&end_handler,
    Final => \&final_handler,
);
$parser->parsefile($source_filename);
###################################################################################
Discussion:
Examine the last few lines of code. The program
1) Creates an instance of the XML parser
   my $parser = new ...
2) Tells the parser what handlers to use when
   the parser generates an event.
   $parser->setHandlers( ...
3) Does the actual parsing
   $parser->parsefile($source_filename);
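The three steps can also be tried in isolation. What follows is only a minimal sketch, not part of the script above: the sample XML string and the %seen hash are made up for illustration, and it calls parse on an in-memory string instead of parsefile on a file. XML::Parser->new(...) is the same constructor call as new XML::Parser(...), just written in direct method syntax.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document, held in a string for the sketch.
my $xml = '<list><item>one</item><item>two</item></list>';

my %seen;    # tag name => number of start tags seen

# 1) Instantiate the parser
my $parser = XML::Parser->new(ErrorContext => 2);

# 2) Tell the parser which handler to call for the Start event
$parser->setHandlers(
    Start => sub {
        my ($p, $tag) = @_;
        $seen{$tag}++;
    },
);

# 3) Do the actual parsing (parse takes a string or an open filehandle)
$parser->parse($xml);

# Prints "item: 2" and then "list: 1"
print "$_: $seen{$_}\n" for sort keys %seen;
```

Only the Start event has a handler here, so character data and end tags are silently skipped.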
perldoc XML::Parser lists among other things the Events
(see HANDLERS) that the parser generates. You may associate
any subroutine you want as the handler of an event.
Some important Events are:
1) Init
   generated just before the parsing starts
2) Final
   generated just after successful parsing stops.
   parse (and parsefile) return whatever this handler returns.
3) Start
   generated when an XML start tag is recognized
4) End
   generated when an XML end tag is recognized
5) Char
   generated when a non-markup string (character data) is recognized
There are other Events but these will do for the purposes of this
tutorial.
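To see the order in which these five Events fire, here is a small sketch that records each one in a list instead of printing it right away. The @events array and the one-line sample document are assumptions made for the illustration. The Final handler returns the number of events recorded, which shows up as the return value of parse.

```perl
use strict;
use warnings;
use XML::Parser;

my @events;    # one entry per event, in the order the parser generated them

my $parser = XML::Parser->new(ErrorContext => 2);
$parser->setHandlers(
    Init  => sub { push @events, 'Init' },
    Start => sub { my ($p, $tag) = @_; push @events, "Start $tag" },
    Char  => sub { my ($p, $str) = @_; push @events, "Char '$str'" },
    End   => sub { my ($p, $tag) = @_; push @events, "End $tag" },
    # Whatever Final returns becomes the return value of parse
    Final => sub { push @events, 'Final'; return scalar @events },
);

# Hypothetical sample document
my $result = $parser->parse('<a><b>hi</b></a>');

print "$_\n" for @events;
print "parse returned $result\n";
```

Init comes first and Final last, with the Start, Char and End events nested in between in document order. Note that the parser is free to deliver one run of character data in several Char events, so a real handler should not assume one call per text node.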
The subroutines defined here as event handlers are just
intended to print out the tags and node contents as a tutorial
exercise. Actual applications would use these handlers to do
something more substantial, e.g. generate HTML, or put data
into a database, etc.
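As a taste of "something more substantial", here is a hedged sketch of handlers that turn a list into HTML. The element names (list, item), the sample data and the escape helper are all invented for this example; it is one way to do it, not the way. Character data is accumulated in $text because the parser may hand it over in more than one Char event.

```perl
use strict;
use warnings;
use XML::Parser;

my @stack;        # names of the currently open elements, innermost last
my $text = '';    # character data collected for the current <item>
my @items;        # finished item texts, in document order

my $parser = XML::Parser->new(ErrorContext => 2);
$parser->setHandlers(
    Start => sub {
        my ($p, $tag) = @_;
        push @stack, $tag;
        $text = '' if $tag eq 'item';
    },
    Char => sub {
        my ($p, $str) = @_;
        $text .= $str if @stack && $stack[-1] eq 'item';
    },
    End => sub {
        my ($p, $tag) = @_;
        push @items, $text if $tag eq 'item';
        pop @stack;
    },
);

# Hypothetical sample document
$parser->parse('<list><item>apples</item><item>pears &amp; plums</item></list>');

# Hypothetical helper: re-escape the characters that are special in HTML
sub escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

my $html = "<ul>\n" . join('', map { "<li>" . escape($_) . "</li>\n" } @items) . "</ul>\n";
print $html;
```

The Char handler receives the text with entities already decoded ("pears & plums"), which is why the output has to be escaped again before it goes into HTML.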
Next we will look at using this Perl script to actually
parse some simple XML. (See next post)