a regexp question
Tkil
tkil-sdpm at scrye.com
Fri May 9 15:23:04 CDT 2003
~sdpm~
>>>>> "John" == John Chung <chung at scripps.edu> writes:
John> s/href="([^"])+"/appendit($1)/eg;
John> I noticed that the $1, instead of it being the entire URL
John> inside the anchor tag (between <a href=" and ">), is
John> usually just the last letter of that URL.
John> I'm confused. Could someone help me so that I can just
John> take the whole URL inside the anchor tag and pass it or
John> refer to it?
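You misplaced your parentheses; the plus quantifier modifies the
grouping, not the character set. The group matches one character at a
time, and each repetition overwrites $1, so you end up with only the
last character. Simplest fix is: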
s/href="([^"]+)"/appendit($1)/eg;
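The difference is easy to see by running both patterns against the
same string. A small sketch, using a made-up URL:

```perl
use strict;
use warnings;

my $html = '<a href="http://example.com/page">link</a>';

# Quantifier outside the group: the group matches one character at a
# time, and $1 keeps only the final repetition.
my ($outside) = $html =~ /href="([^"])+"/;

# Quantifier inside the group: the group matches the whole run.
my ($inside) = $html =~ /href="([^"]+)"/;

print "outside: $outside\n";   # outside: e
print "inside:  $inside\n";    # inside:  http://example.com/page
```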
This still isn't correct, though, since the replacement discards the
href=" and the closing quote along with the URL. Maybe:
s/(href=")([^"]+)(")/$1 . appendit($2) . $3/eg;
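A runnable sketch of that substitution follows. The body of appendit()
was never posted, so the one here is only a hypothetical stand-in that
tacks a session id onto the URL:

```perl
use strict;
use warnings;

# Hypothetical stand-in for John's appendit(); the real one was not posted.
sub appendit {
    my ($url) = @_;
    return $url . '?sid=xxx';
}

my $html = '<a href="http://example.com/page">link</a>';

# Copy $html, then rewrite the copy; $1 and $3 put back the href=" and
# closing quote that the match consumed.
(my $fixed = $html) =~ s/(href=")([^"]+)(")/$1 . appendit($2) . $3/eg;

print "$fixed\n";   # <a href="http://example.com/page?sid=xxx">link</a>
```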
Comments:
1. /e is slow, and potentially insecure. Consider doing the
replacement inline:
s/(href=")([^"]+)(")/$1$2?sid=xxx$3/g;
2. The href URL might already have a '?', so another one is incorrect
(the separator should be ";" or "&"). Keep the existing query string
and pick the separator accordingly:
s/(href=")([^"?]+)([^"]*)(")/$1 . $2 . $3 . ($3 ? "&" : "?" ) . "sid=xxx" . $4/eg;
3. HTML tag attributes are case-insensitive. Consider using /i:
s/(href=")([^"]+)(")/$1$2?sid=xxx$3/ig;
4. "href" is also used by tags other than A, such as LINK, AREA and BASE. :)
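Points 2 and 3 can be combined into one substitution. This is only a
sketch: add_sid() is a made-up helper name, and it assumes every href
value is double-quoted. It keeps any existing query string and chooses
"?" or "&" to match:

```perl
use strict;
use warnings;

# Hypothetical helper: add a "sid" parameter to every double-quoted href.
sub add_sid {
    my ($html, $sid) = @_;
    # $2 is the URL up to any '?', $3 is the existing query string (or
    # empty). /i also matches HREF; /e evaluates the replacement as code.
    $html =~ s{(href=")([^"?]+)([^"]*)(")}
              {$1 . $2 . $3 . (length($3) ? '&' : '?') . "sid=$sid" . $4}gei;
    return $html;
}

print add_sid('<a href="/a">x</a> <A HREF="/b?c=1">y</A>', 'xxx'), "\n";
# <a href="/a?sid=xxx">x</a> <A HREF="/b?c=1&sid=xxx">y</A>
```

Like the one-liners above, this still rewrites every href in the
document, whatever tag it belongs to.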
This gets ugly in a hurry. The slightly better answer is to parse
things out in more detail; a regex that you might find helpful is
discussed in:
http://archive.lug.boulder.co.us/bymonth/2001.08/msg00573.html
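As a rough illustration of "parsing in more detail" (this is not the
regex from the message linked above, just a sketch), one option is to
match each opening <a ...> tag first and rewrite the href only inside
it. It assumes double-quoted attribute values with no ">" inside them,
and it ignores the existing-query-string problem from point 2:

```perl
use strict;
use warnings;

my $html = '<link href="style.css"> <a class="nav" href="/home">home</a>';

# Outer match: one whole opening <a ...> tag.
# Inner substitution: the href attribute within that tag only.
$html =~ s{(<a\b[^>]*>)}{
    my $tag = $1;
    $tag =~ s/(href=")([^"]+)(")/$1$2?sid=xxx$3/i;
    $tag;
}gei;

print "$html\n";
# <link href="style.css"> <a class="nav" href="/home?sid=xxx">home</a>
```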
Hopefully the tips above are enough to get you started, though. If
your HTML is regular enough to begin with, then just moving the + to
be inside the parens should be enough.
t.