LPM: Question about escapes in a regex
David Hempy
hempy at ket.org
Thu Oct 19 17:05:43 CDT 2000
Okay...I want to replace a styleid (such as '\s15\qt'...and yes, those are backslashes in the ID, not escape sequences) with a stylename throughout an RTF file. So I try:
$styleid = '\s15\qt';
$wholefile =~ s/$styleid/$stylename/g;
Simple enough, right? Wrong. :-(
After pulling out the last three hairs on my head, I concocted a test script (see below if you care), which shows that the backslashes in $styleid's value are being interpreted as escape sequences by the regex once the variable is interpolated (\s, for instance, is read as the whitespace class rather than as a literal backslash followed by s). My current solution is to escape all those backslashes:
$styleid = '\s15\qt';
$styleid =~ s[\\][\\\\]g;
$wholefile =~ s/$styleid/$stylename/g;
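To make the failure mode concrete, here is a minimal sketch of what the unescaped pattern appears to match. The sample strings are made up for illustration, and the behavior of \q (passed through as a plain q) assumes a stock Perl without `use re 'strict'`.

```perl
use strict;

# Hypothetical sample value: a style ID containing literal backslashes.
my $styleid = '\s15\qt';

# Interpolated into a pattern, \s is read as the whitespace class,
# so the pattern does not match the literal text \s15\qt ...
my $matches_literal = ('{\s15\qt stuff}' =~ /$styleid/) ? 1 : 0;

# ... but it does match a whitespace character followed by "15qt".
my $matches_space = (' 15qt' =~ /$styleid/) ? 1 : 0;

print "literal: $matches_literal, whitespace form: $matches_space\n";
```

Doubling each backslash, as in the workaround above, turns \s into \\s, which the regex engine reads as a literal backslash followed by s.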
So I'm good to go, but I'm a little uneasy with the kludginess of my solution, and with the fact that I can't just make it work the way I want it to. I don't like modifying and then un-modifying the value (or creating a copy to modify).
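One possibility worth comparing against is Perl's \Q...\E escape, which applies quotemeta() to the interpolated value inside the pattern only. The sketch below is not the approach used in the test script; the sample data is a shortened, hypothetical version of the RTF fragment, and $result is a name introduced just for this example.

```perl
use strict;

# Hypothetical sample data, modeled on the RTF fragment in the test script.
my $wholefile = '{\s15\qt more stuff} but now some \s15\qt other stuff.';
my $styleid   = '\s15\qt';
my $stylename = 'TESTING';

# \Q...\E quotes regex metacharacters in the interpolated value,
# so the backslashes are matched literally. $styleid is left untouched.
(my $result = $wholefile) =~ s/\Q$styleid\E/$stylename/g;

print "$result\n";
print "styleid is still: $styleid\n";
```

Because the quoting happens inside the pattern, there is no need to modify the value or keep an escaped copy of it.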
Any Perl gurus have experience/insight with this?
-dave
Appendix A. The Script
use strict;
my $wholefile = '
{\s15\qt more stuff KET_TESTING_STYLE}
{\s12\qt more stuff KET_TESTING2_STYLE}
\s12\qt some stuff here is some \s12\qt stuff.
but now some \s15\qt other stuff.
';
my $stylesheet = $wholefile;
# Walk every {...} group in the stylesheet copy.
while ($stylesheet =~ /\{([^{}]*?)\}/g) {
    my $style = $1;
    # Only style definitions tagged KET_<name>_STYLE are of interest.
    next unless ($style =~ /^(\\s\d+\S+) .* KET_(\w+)_STYLE/);
    my ($styleid, $stylename) = ($1, $2);
    print "Found a style definition: $styleid = $stylename\n";
    print "\nNeed to replace [$styleid] in [$wholefile]\n\n";

    # Attempt 1: interpolate the style ID into the pattern as-is.
    print "\nTry substituting [$styleid] directly:\n";
    if ($wholefile =~ s/$styleid/$stylename/g) {
        print ":-) Substituted [$styleid]\n";
    } else {
        print ":-( [$styleid] was not substituted!\n";
    }

    # Attempt 2: double every backslash first, then substitute.
    $styleid =~ s[\\][\\\\]g;
    print "\nTry substituting it escaped: [$styleid]:\n";
    if ($wholefile =~ s/$styleid/$stylename/g) {
        print ":-) Substituted [$styleid]\n";
    } else {
        print ":-( [$styleid] was not substituted!\n";
    }
}
print "\nResult: [$wholefile]\n\n";
Appendix B. The Execution
D:\Temp>t.pl
Found a style definition: \s15\qt = TESTING
Need to replace [\s15\qt] in [
{\s15\qt more stuff KET_TESTING_STYLE}
{\s12\qt more stuff KET_TESTING2_STYLE}
\s12\qt some stuff here is some \s12\qt stuff.
but now some \s15\qt other stuff.
]
Try substituting [\s15\qt] directly:
:-( [\s15\qt] was not substituted!
Try substituting it escaped: [\\s15\\qt]:
:-) Substituted [\\s15\\qt]
Found a style definition: \s12\qt = TESTING2
Need to replace [\s12\qt] in [
{TESTING more stuff KET_TESTING_STYLE}
{\s12\qt more stuff KET_TESTING2_STYLE}
\s12\qt some stuff here is some \s12\qt stuff.
but now some TESTING other stuff.
]
Try substituting [\s12\qt] directly:
:-( [\s12\qt] was not substituted!
Try substituting it escaped: [\\s12\\qt]:
:-) Substituted [\\s12\\qt]
Result: [
{TESTING more stuff KET_TESTING_STYLE}
{TESTING2 more stuff KET_TESTING2_STYLE}
TESTING2 some stuff here is some TESTING2 stuff.
but now some TESTING other stuff.
]
D:\Temp>
--
David Hempy
Internet Database Administrator
Kentucky Educational Television
<hempy at ket.org> -- (859)258-7164 -- (800)333-9764