a regexp question

chung chung at scripps.edu
Fri May 9 15:33:23 CDT 2003


~sdpm~
Wow!

My thanks to Dave, Steve, Charles, and Tkil.

Time to get out the 'Mastering Regular Expressions' book
and do some practicing.

You guys are all very cool.

john

>===== Original Message From Tkil <tkil-sdpm at scrye.com> =====
>>>>> "John" == John Chung <chung at scripps.edu> writes:
>
>John> 	s/href="([^"])+"/appendit($1)/eg;
>
>John> I noticed that the $1, instead of it being the entire URL
>John> inside the anchor tag (between <a href=" and  ">), is
>John> usually just the last letter of that URL.
>
>John> I'm confused.  Could someone help me so that I can just
>John> take the whole URL inside the anchor tag and pass it or
>John> refer to it?
>
>You misplaced your parentheses; in this case, the plus quantifier
>modifies the group, not the character class, so each repetition
>overwrites $1 and only the last character survives.  Simplest fix is:
>
>   s/href="([^"]+)"/appendit($1)/eg;
>
>Although this still isn't correct, since you remove the "href" portion
>of the tag as well.  Maybe:
>
>   s/(href=")([^"]+)(")/$1 . appendit($2) . $3/eg;
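
The difference between the two parenthesis placements is easy to check in isolation. The following is only a minimal sketch: appendit() here is a hypothetical stand-in (the real function was never posted), and the example.com URL and "sid=xxx" value are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the poster's appendit(); the real one is unknown.
sub appendit {
    my ($url) = @_;
    return $url . '?sid=xxx';
}

# Made-up sample input.
my $html = '<a href="http://example.com/page">link</a>';

# Quantifier outside the group: the group matches one character per
# repetition, so the capture ends up holding only the last character.
my ($last) = $html =~ /href="([^"])+"/;

# Quantifier inside the group: the capture holds the whole attribute value.
my ($whole) = $html =~ /href="([^"]+)"/;

# Capture the surrounding text as well, so the replacement keeps href="...".
(my $rewritten = $html) =~ s/(href=")([^"]+)(")/$1 . appendit($2) . $3/eg;

print "$last\n$whole\n$rewritten\n";
```

With this input, $last is the single character "e", $whole is the full URL, and $rewritten still has its href attribute intact.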
>
>Comments:
>
>1. /e is slow, and potentially insecure.  Consider doing the
>   replacement inline:
>
>      s/(href=")([^"]+)(")/$1$2?sid=xxx$3/g;
>
>2. The href url might already have a '?', so another one is incorrect
>   (should be ";" or "&")
>
>      s/(href=")([^"?]+)([^"]*)(")/$1 . $2 . $3 . ($3 ? "&" : "?") . "sid=xxx" . $4/eg;
>
>3. HTML element and attribute names are case-insensitive.  Consider using /i:
>
>      s/(href=")([^"]+)(")/$1$2?sid=xxx$3/ig;
>
>4. "href" is also used by other tags, such as LINK, AREA, and BASE.  :)
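
Comments 2 and 3 above can be combined into one substitution. This is a rough sketch rather than a robust HTML rewriter: add_sid() and the "sid=xxx" value are hypothetical names chosen for illustration, and it assumes double-quoted href values with no "#fragment" part.

```perl
use strict;
use warnings;

# Hypothetical helper: append sid=xxx to every double-quoted href value.
sub add_sid {
    my ($html) = @_;
    # $1 = href=" (any case, thanks to /i), $2 = URL up to any '?',
    # $3 = existing query string (possibly empty), $4 = closing quote.
    # An existing query string is kept and extended with '&';
    # otherwise a new one is started with '?'.
    $html =~ s/(href=")([^"?]+)([^"]*)(")/$1 . $2 . $3 . ($3 ? "&" : "?") . "sid=xxx" . $4/ieg;
    return $html;
}

print add_sid('<a href="a.html">one</a>'), "\n";
print add_sid('<A HREF="a.cgi?x=1">two</A>'), "\n";
```

Because $1 is reinserted as matched, the original case of the attribute name is preserved in the output.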
>
>This gets ugly in a hurry.  The slightly better answer is to parse
>things out in more detail; a regex that you might find helpful is
>discussed in:
>
>   http://archive.lug.boulder.co.us/bymonth/2001.08/msg00573.html
>
>Hopefully the tips above are enough to get you started, though.  If
>your HTML is regular enough to begin with, then just moving the + to
>be inside the parens should be enough.
>
>t.

