RE question

Bret Webb BWebb at roadrunnersports.com
Fri May 14 12:16:02 CDT 2004


~sdpm~
I'm sure there are other, better approaches out there, and I'm looking
forward to seeing them :} but in case nobody responds, here are two
approaches I would use:

my $newStr = $FORM{'subject'};   # hash lookup uses braces, not parentheses
$newStr =~ s/[`^*|><]//g;        # delete each listed character

This approach requires you to know every special character you want to get
rid of. The other alternative is:

my $newStr = $FORM{'subject'};
$newStr =~ s/\W+//g;             # delete every non-word character

Unfortunately, this also removes spaces and other characters you may want to
keep. To preserve the special characters you don't want to get rid of,
substitute placeholders for them first, e.g.:

my $newStr = $FORM{'subject'};
$newStr =~ s/\./_PD_/g;
$newStr =~ s/ /_SP_/g;

Then strip everything else:

$newStr =~ s/\W+//g; # Remove all non-word characters, including spaces.
                     # Underscores count as word characters, so the
                     # placeholders survive.

If the value of $FORM{'subject'} is "The man on the moon.", then the string
looks like this after performing the regex:

	The_SP_man_SP_on_SP_the_SP_moon_PD_

Now substitute the original characters back in for the placeholders:

$newStr =~ s/_SP_/ /g;
$newStr =~ s/_PD_/./g;
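
For reference, the whole round trip can be put together as a small runnable
sketch. The sub name clean_subject and the sample %FORM data are made up for
illustration; the sketch relies on \W leaving underscores alone, which is why
the _SP_ and _PD_ placeholders survive the stripping step.

```perl
use strict;
use warnings;

# Hypothetical helper: keeps word characters, spaces and periods,
# and drops everything else.
sub clean_subject {
    my ($str) = @_;
    $str =~ s/\./_PD_/g;   # protect periods with a placeholder
    $str =~ s/ /_SP_/g;    # protect spaces with a placeholder
    $str =~ s/\W+//g;      # drop every remaining non-word character
    $str =~ s/_SP_/ /g;    # restore spaces
    $str =~ s/_PD_/./g;    # restore periods
    return $str;
}

my %FORM = ( subject => 'The <man> on the *moon*.' );   # sample data only
my $newStr = clean_subject( $FORM{'subject'} );
print "$newStr\n";   # prints "The man on the moon."
```

Note that underscores typed by the user also pass through, since \W does not
match them.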

Depending on what your form variable is trying to capture, using SP as a
placeholder for resubstitution may or may not be desirable. Using SPACE may
not be good either if the form variable is used on a NASA form. In my
opinion, the likelihood of SP and PD being submitted as valid values is
minimal.
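
To see what a placeholder collision would actually do, here is a small
sketch. The sub name strip_with_placeholders and the sample inputs are
assumptions for illustration. With the underscore-wrapped placeholders, the
bare words SP and PD are safe; only a literal _SP_ or _PD_ typed by the user
gets rewritten.

```perl
use strict;
use warnings;

# Hypothetical helper implementing the placeholder scheme from this message.
sub strip_with_placeholders {
    my ($str) = @_;
    $str =~ s/\./_PD_/g;
    $str =~ s/ /_SP_/g;
    $str =~ s/\W+//g;
    $str =~ s/_SP_/ /g;
    $str =~ s/_PD_/./g;
    return $str;
}

# The plain words SP and PD pass through unchanged.
print strip_with_placeholders('SP and PD'), "\n";    # prints "SP and PD"

# A user-typed placeholder collides and comes back as a space.
print strip_with_placeholders('Buy_SP_now'), "\n";   # prints "Buy now"
```

In this scheme a collision only turns the typed placeholder into a space or a
period, so the output still contains nothing beyond the allowed characters.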

I always found it easier to substitute what I wanted to keep rather than
remember all of the special characters I wanted to exclude. Good luck.
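
The same "say what you want to keep" idea can also be written directly as a
negated character class, which needs no placeholders. This is only a sketch:
the sub name keep_alnum and the $allow_period flag are made up for
illustration, and the character ranges are the ones named in the original
question (A-Z, a-z, 0-9, optionally the period).

```perl
use strict;
use warnings;

# Hypothetical helper: delete every character that is NOT in the allowed set.
# $allow_period is an assumed flag, not something from the original thread.
sub keep_alnum {
    my ($str, $allow_period) = @_;
    if ($allow_period) {
        $str =~ s/[^A-Za-z0-9.]//g;   # keep letters, digits and periods
    } else {
        $str =~ s/[^A-Za-z0-9]//g;    # keep letters and digits only
    }
    return $str;
}

print keep_alnum('Re: 50% off!! v2.0', 0), "\n";   # prints "Re50offv20"
print keep_alnum('Re: 50% off!! v2.0', 1), "\n";   # prints "Re50offv2.0"
```

Adding a space inside the brackets would keep spaces as well. Unlike \W, this
class also removes underscores.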



-----Original Message-----
From: Ken Loomis [mailto:kloomis at bigplanet.com]
Sent: Friday, May 14, 2004 9:31 AM
To: San Diego Perl Mongers
Subject: RE question


~sdpm~
As a web designer that enjoys programming, I joined this group a few 
years ago hoping to become proficient at PERL (or, is it Perl). I have 
learned a lot from looking at the exchanges here, but have to admit that 
RE's still baffle me.

I host a discussion board that is apparently being Spam'ed by hackers. 
The board is in Perl and I would like an RE that will strip everything 
except alphanumeric characters from the subject. For example, the 
subject line should only contain A-Z, a-z & 0-9.

The subject line is contained in $FORM{'subject'}, so I'd like an RE 
that will replace the contents of that variable with the stripped 
version. Also, I may decide to allow periods ('.'), and would like to 
see how I'd have to modify that RE to allow that.

If anyone can help, I'd appreciate it.

thanks,
Ken Loomis


-- 
Ken Loomis
Consultant
Windows, Macintosh, Internet, etc.
Helping to make your technology
experience more pleasant & profitable  ;-)
619-275-6919 / KLoomis at BigPlanet.com


~sdpm~

The posting address is: san-diego-pm-list at hfb.pm.org

List requests should be sent to: majordomo at hfb.pm.org

If you ever want to remove yourself from this mailing list,
you can send mail to <majordomo at happyfunball.pm.org> with the following
command in the body of your email message:

    unsubscribe san-diego-pm-list

If you ever need to get in contact with the owner of the list,
(if you have trouble unsubscribing, or have questions about the
list itself) send email to <owner-san-diego-pm-list at happyfunball.pm.org> .
This is the general rule for most mailing lists when you need
to contact a human.

