On Thu, Oct 05, 2000 at 08:54:59AM -0700, Todd Wells wrote:
> I'm working on a little web automation routine and I've used HTML::LinkExtor
> to extract the links from a web page, then I'm processing each of those
> links.
> What I'd like to know is if there's some easy way that I could get the
> original text that accompanied that link.  e.g., <a href =
> "http://thislink"> this text here I want </a>. 

You need to "Use Damian" 8-) !

His Text::Balanced module has a function called extract_tagged() that will
find and parse your anchor tag and return each part as a separate element
of a list.

For example:

$ cat extract
#! /usr/bin/perl -w
use Text::Balanced 'extract_tagged';

$_= '<a href = "http://thislink"> this text here I want </a> MORE STUFF';

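# In list context extract_tagged() returns, in order: the whole match,
# the remaining text, the skipped prefix, the opening tag, the enclosed
# text and the closing tag.  No tag patterns are passed here (an
# assumption), so the module's default tag patterns apply.
($parts{whole_match}, $parts{remnants}, $parts{skipover},
 $parts{first_tag},   $parts{enclosed}, $parts{last_tag}) =
    extract_tagged($_);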

print  map "$_\t=>'$parts{$_}'\n",  sort keys %parts;

$ ./extract
enclosed	=>' this text here I want '
first_tag	=>'<a href = "http://thislink">'
last_tag	=>'</a>'
remnants	=>' MORE STUFF'
skipover	=>''
whole_match	=>'<a href = "http://thislink"> this text here I want </a>'

Check the documentation of the *latest version* for details on setting
the $skip parameter, which controls skipping over text on the way to
finding the tag; you might find its behavior counter-intuitive.
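
To make the $skip point concrete, here is a rough sketch (not tested
against every version of the module) that walks a whole page and collects
each link with its text. The sample HTML, the explicit tag patterns and the
variable names are my own assumptions for illustration. By default the
prefix pattern is just optional whitespace, so extraction fails if anything
else precedes the tag; and because "." does not match newlines in a prefix
pattern, the (?s) is needed to skip across lines.

```perl
#! /usr/bin/perl
use strict;
use warnings;
use Text::Balanced 'extract_tagged';

# Hypothetical sample page; the URLs and link text are made up.
my $html = <<'END';
<p>Intro text
<a href="http://first.example/">First link</a> and then
<a href="http://second.example/">Second link</a> the end.</p>
END

# Prefix pattern: skip everything (newlines too, hence (?s))
# up to, but not including, the next <a ...> tag.
my $skip = '(?s).*?(?=<a\s)';

my @links;
my $rest = $html;
while (1) {
    # Explicit open/close patterns so only anchors are treated as tags.
    my ($whole, $remainder, $skipped, $open, $text, $close) =
        extract_tagged($rest, '<a\s[^>]*>', '</a>', $skip);

    # On failure the first element is undef (or empty in some versions).
    last unless defined $whole && length $whole;

    push @links, [ $open, $text ];
    $rest = $remainder;    # carry on after the tag just extracted
}

print "$_->[0]\t=>'$_->[1]'\n" for @links;
```

Each pass hands the remainder back in as the new input, so the loop stops
as soon as no further anchor can be reached from where it left off.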

