APM: Re: Regular Expression Guru's anyone?
Wayne Walker
wwalker at bybent.com
Thu Oct 10 19:29:33 CDT 2002
This will work (at least for my test data):
guru.pl:
#!/usr/bin/perl
$/ = undef; # unset record separator to read in entire file at once
use strict; # the only way to write perl :)
my ($data, $newdata, $text, $tag);
$data = <DATA>; # Read in all the lines following __DATA__
# Break the string into 3 pieces:
# text before a tag, tag, everything following the tag
# leading non-< characters; a <, then all non-> chars up to the next >; everything else.
while ( $data =~ /^([^<]*)(<[^>]*>)(.*)$/s)
{
# Lazy man's way to grab 3 vars at a time :)
($text, $tag, $data) = ($1, $2, $3);
# Fix the text
$text =~ s/bird/Hawk/gs; # Globally change, treat as a single line
# Add the text and the tag to the $newdata string
$newdata .= $text . $tag;
}
# take whatever is left when there are no more tags and fix it and
# append it to $newdata
$data =~ s/bird/Hawk/gs;
$newdata .= $data; # append the leftover $data, not $text (already appended in the loop)
print $newdata;
__DATA__
this is some text about a bird, a bird is cool, here is a picture of a
bird <img src='bird.jpg'>
this is some text about a bird, a bird is cool, here is a picture of a
bird <img src='bird.jpg'>
this is some text about a bird, a bird is cool, here is a picture of a
bird <img src='bird.jpg'>
this is some text about a bird, a bird is cool, here is a picture of a
bird <img src='bird.jpg'>
this is some text about a bird, a bird is cool, here is a picture of a
bird <img src='bird.jpg'>
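
For comparison, here is a shorter sketch of the same idea using split with a capturing group, so the tags come back in the list alongside the text. The helper name replace_outside_tags and its arguments are made up for illustration; they are not part of the script above. Like the loop version, it assumes tags never contain a literal > inside an attribute value and that there are no comments or script blocks to worry about.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the script above): replace $from with $to
# in the text between tags, leaving anything inside <...> untouched.
sub replace_outside_tags {
    my ($html, $from, $to) = @_;

    # With a capturing group, split returns the separators too, so the
    # list alternates: text, tag, text, tag, ... (even index = text).
    my @pieces = split /(<[^>]*>)/, $html;

    for my $i (0 .. $#pieces) {
        next if $i % 2;                        # odd index: a tag, skip it
        $pieces[$i] =~ s/\Q$from\E/$to/g;      # even index: plain text
    }
    return join '', @pieces;
}

print replace_outside_tags("a bird <img src='bird.jpg'> bird\n", 'bird', 'Hawk');
# prints: a Hawk <img src='bird.jpg'> Hawk
```

For anything beyond simple, well-behaved markup, a real HTML parser is probably the safer route than either regex approach.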
On Thu, Oct 10, 2002 at 03:49:13PM -0500, David Lyons wrote:
> Here is what I am trying to do, I need to match text that is in an html
> document but specifically not inside an HTML tag, ie:
>
> matching the word bird:
>
> this is some text about a bird, a bird is cool, here is a picture of a
> bird <img src='bird.jpg'>
>
> would hit on the two instances of "bird" but not on the one in the img
> tag (or any other HTML tag for that matter).
>
> Thanks,
> D
--
Wayne Walker