SPUG: extracting text between <a> and </a>
Tim Maher/CONSULTIX
tim at consultix-inc.com
Thu Oct 5 17:57:21 CDT 2000
On Thu, Oct 05, 2000 at 10:58:34AM -0700, Chuck Keagle wrote:
> Now I'm new with Perl (just had Dr. Tim's beginner class a couple weeks
Glad to see you're putting your education to use! 8-}
> ago), but would a pattern match do the trick?
>
> m|<a[\w:"/= \.]*> ([\w ]*)</a>| and $text = $1;
>
> If I'm way off base, please don't chastise me too harshly.
Nice try, but you've fallen into the trap of underestimating the
difficulty of getting the regular expression right. Even if it were
perfect in itself, you'd still have to worry about eliminating matches
within comments, which is an entire problem unto itself.
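
To make those pitfalls concrete, here is a small sketch that runs the
pattern exactly as posted against a few anchors. The sample HTML strings
are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# The pattern as posted (note the literal space it requires after '>').
my $re = qr{<a[\w:"/= \.]*> ([\w ]*)</a>};

# Hypothetical sample inputs, chosen to probe the pattern's limits.
my @samples = (
    '<a href="x.html"> Home</a>',          # the one shape it handles
    '<a href="x.html">Home</a>',           # no space after '>'
    '<a href="a-b.html"> Home</a>',        # '-' is not in the attribute class
    '<A HREF="x.html"> Home</A>',          # uppercase tag, pattern has no /i
    '<a href="x.html"> <b>Home</b></a>',   # markup inside the link text
    '<!-- <a href="x.html"> Old</a> -->',  # anchor inside an HTML comment
);

my @captured;
for my $html (@samples) {
    my ($text) = $html =~ $re;   # undef when the pattern does not match
    push @captured, $text;
    printf "%-40s => %s\n", $html,
        defined $text ? "matched '$text'" : 'no match';
}
```

Only the first and last samples match, and the last one is the
commented-out anchor that should have been skipped.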
Best to use a debugged module written by a person whose Hubris will
promote greater accuracy than a hand-rolled solution. Damian's
Text::Balanced is what I'd suggest; I showed a sample run in a
separate posting.
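
For readers without that other posting to hand, the following is a
minimal sketch of one way Text::Balanced's extract_tagged can be used
here. It is not the sample run mentioned above. The input string and
variable names are invented for illustration, and the tag patterns are
just one reasonable choice.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_tagged);

# Hypothetical input; the anchor sits at the start of the string because
# extract_tagged only skips optional whitespace before the opening tag
# unless a different prefix pattern is supplied.
my $html = '<a href="http://www.consultix-inc.com">Consultix</a> and more';

# In list context extract_tagged returns:
#   the whole tagged substring, the remainder, the skipped prefix,
#   the opening tag, the text between the tags, and the closing tag.
my ($extracted, $remainder, $prefix, $open, $inner, $close)
    = extract_tagged($html, '<a\b[^>]*>', '</a>');

if (defined $inner) {
    print "link text: $inner\n";
} else {
    print "no tagged text found at the start of the input\n";
}
```

This still does nothing about anchors inside comments; those would need
to be stripped or skipped separately before extraction.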
-"Dr. Tim"
*========================================================================*
| Dr. Tim Maher, CEO, Consultix (206) 781-UNIX/8649; ask for FAX# |
| Email: tim at consultix-inc.com Web: http://www.consultix-inc.com |
|Training- TIM MAHER: Unix, Perl DAMIAN CONWAY: Adv. Perl, OOP, Parsing |
|CLASSES: 10/9: Adv OO-Perl/Parsing 10/16: Int. Perl 10/23 Perl Prog. |
*========================================================================*
>
> --
>
> (fixed width font) //\\
> __________________________ \\
> Chuck Keagle \\ .__=.\\
> chuck.keagle at boeing.com \____ ,' H-D \-\<)
> Shared Services Group \ \______.,(_______/_:\
> (425) 865-5394 |==.\______// # /# #\ || : \____
> Fax: (425) 865-2221 '\\\ =''=//|_|##(O)##|| `./\---.
> M/S 7J-04 _____________ /\ / ,`--'./# ======='//, //.\ . \
> _______ \ \_(_:_ at O__)_///<_>O//// ( (@O ) )
> _____ \_____________/======'O' \ `-' /
> __`-----'__________________`---'___
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> POST TO: spug-list at pm.org PROBLEMS: owner-spug-list at pm.org
> Subscriptions; Email to majordomo at pm.org: ACTION LIST EMAIL
> Replace ACTION by subscribe or unsubscribe, EMAIL by your Email-address
> For daily traffic, use spug-list for LIST ; for weekly, spug-list-digest
> Seattle Perl Users Group (SPUG) Home Page: http://www.halcyon.com/spug/
>
>