OC-PM: SAX ponderings
Kip Hampton
kip at web.oakley.com
Mon Dec 9 20:23:03 CST 2002
Wilson, Douglas wrote:
> First let me say thanks again Kip for the presentation last night!
> I've now at least a start on what SAX is about...
Thanks.
>
> I was thinking about the theoretical problem Ryan
> mentioned last night, and how it might be handled with SAX.
> The problem as I recall was, say you have a document like this:
> ...
> <x>
> <a>...</a>
> <b>...</b>
> <c>...</c>
> </x>
> ...
>
> And you want to randomly eliminate one of the children of x (or maybe
> you want to randomly pick one to keep and eliminate the rest), and
> you don't know in advance how many children x has.
>
> Am I wrong in thinking that after you come across the x start_element
> event, you would then have to pass events on to a buffer (let's say you
> want to randomly eliminate a child), then when you get to the x end_element
> event you'd have to then pass the buffer through a filter which would
> eliminate the unlucky nth child?
Almost...
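You're right in that you'd need to buffer the events between the calls
to start_element() and end_element() for the 'x' element, but a separate
filter is not really needed: the same handler can replay the buffered
events itself once it sees the end of 'x'.
Consider the following (untested), which keeps one randomly chosen child
of 'x' and drops the rest: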
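package Ryans::RandomChildFilter;
use strict;
use XML::SAX::Base;
use vars qw( @ISA );
@ISA = qw( XML::SAX::Base );

# init in start_doc
sub start_document {
    my $self = shift;
    $self->{in_buffer}    = undef;  # defined only while inside <x>
    $self->{depth}        = 0;      # nesting depth below <x>
    $self->{e_count}      = 0;      # index of the child being buffered
    $self->{event_buffer} = [];     # one list of events per child of <x>
    $self->SUPER::start_document( @_ );
}

sub start_element {
    my $self = shift;
    my $e = shift;
    if ( defined( $self->{in_buffer} ) ) {
        $self->{depth}++;
        push @{$self->{event_buffer}->[$self->{e_count}]},
            ['start_element', $e];
    }
    else {
        # pass <x> itself through, then start buffering its children
        $self->{in_buffer} = 1 if $e->{LocalName} eq 'x';
        $self->SUPER::start_element( $e );
    }
}

sub characters {
    my $self = shift;
    my $chars = shift;
    if ( defined( $self->{in_buffer} ) ) {
        # text directly inside <x> (e.g. whitespace between children)
        # belongs to no child, so it is dropped
        push @{$self->{event_buffer}->[$self->{e_count}]},
            ['characters', $chars] if $self->{depth} > 0;
    }
    else {
        $self->SUPER::characters( $chars );
    }
}

sub end_element {
    my $self = shift;
    my $e = shift;
    if ( defined( $self->{in_buffer} ) && $self->{depth} == 0 ) {
        # closing </x>: pick a random child from the buffer
        my @buffer = @{$self->{event_buffer}};
        if ( @buffer ) {
            my $selected_index = int rand( @buffer );
            # forward the child's events from the buffer
            foreach my $event ( @{$buffer[$selected_index]} ) {
                my ( $method, $data ) = @{$event};
                # look the handler up on the base class by name
                my $forward = XML::SAX::Base->can( $method );
                $self->$forward( $data );
            }
        }
        # reset
        $self->{event_buffer} = [];
        $self->{in_buffer} = undef;
        $self->{e_count} = 0;
        # finally pass the end of <x> itself along
        $self->SUPER::end_element( $e );
    }
    elsif ( defined( $self->{in_buffer} ) ) {
        push @{$self->{event_buffer}->[$self->{e_count}]},
            ['end_element', $e];
        # a child of <x> is complete once we are back at depth 0
        $self->{e_count}++ if --$self->{depth} == 0;
    }
    else {
        $self->SUPER::end_element( $e );
    }
}
1;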
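
To sanity-check the buffer-and-replay idea without installing XML::SAX, the same logic can be sketched over a plain list of events. This is only an illustration: the [type, name] event tuples and the pick_one_child() name are made up here and are not part of XML::SAX, and, like the filter, it keeps one randomly chosen child of 'x' and drops text that sits directly between children.

```perl
use strict;
use warnings;

# Hypothetical helper (not an XML::SAX API): takes a flat list of
# [type, name] events and returns it with all but one randomly chosen
# child of <x> removed.
sub pick_one_child {
    my @events = @_;
    my ( @out, @children );
    my ( $in_x, $depth ) = ( 0, 0 );
    for my $ev (@events) {
        my ( $type, $name ) = @$ev;
        if ( !$in_x ) {
            # outside <x>: pass everything through, including <x> itself
            $in_x = 1 if $type eq 'start' && $name eq 'x';
            push @out, $ev;
        }
        elsif ( $type eq 'end' && $depth == 0 ) {
            # closing </x>: replay one buffered child, then the end tag
            push @out, @{ $children[ int rand @children ] } if @children;
            push @out, $ev;
            $in_x     = 0;
            @children = ();
        }
        else {
            if ( $type eq 'start' ) {
                push @children, [] if $depth == 0;  # new direct child of <x>
                $depth++;
            }
            next if $depth == 0;    # text between children is dropped
            push @{ $children[-1] }, $ev;
            $depth-- if $type eq 'end';
        }
    }
    return @out;
}

my @events = (
    [ start => 'x' ],
    [ start => 'a' ], [ text => 'one' ],   [ end => 'a' ],
    [ start => 'b' ], [ text => 'two' ],   [ end => 'b' ],
    [ start => 'c' ], [ text => 'three' ], [ end => 'c' ],
    [ end => 'x' ],
);
my @kept = pick_one_child(@events);
print join( ' ', map { "$_->[0]:$_->[1]" } @kept ), "\n";
```

Which child survives varies from run to run, but the output should always be the start of 'x', one complete child, and the end of 'x'. Note that the index is drawn with rand on the number of buffered children, so the last child can be picked too.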
Obviously, there may be cleaner ways to skin the same cat, but does this
help at all?
-kip