LPM: Question about escapes in a regex

David Hempy hempy at ket.org
Thu Oct 19 17:05:43 CDT 2000


Okay... I want to replace a style ID (such as '\s15\qt'; and yes, those are literal backslashes in the ID, not escape sequences) with a style name throughout an RTF file.  So I try:

	$styleid = '\s15\qt';
	$wholefile =~ s/$styleid/$stylename/g;

Simple enough, right?  Wrong.  :-(

After pulling out the last three hairs on my head, I concocted a test script (see Appendix A below if you care), which shows that the backslashes in $styleid's value are being interpreted as escape sequences by the regex (so \s is read as "whitespace") instead of being matched as literal characters.  My current solution is to escape all those backslashes:

	$styleid = '\s15\qt';
	$styleid =~ s[\\][\\\\]g;
	$wholefile =~ s/$styleid/$stylename/g;
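To make the effect described above concrete, here is a minimal sketch, separate from the real script. The sample string in $text and the variable names $direct, $space, $escaped and $literal are made up for illustration. It assumes the interpolated value is parsed as a regex, so \s means "whitespace" and the unrecognized \q is passed through as a plain q (with warnings enabled, Perl should also complain about the \q escape).

```perl
use strict;

my $styleid = '\s15\qt';                    # single quotes: real backslashes in the value
my $text    = 'some \s15\qt other stuff';   # illustrative sample, not the real RTF

# Interpolated as-is, the pattern means: whitespace, "15", "q", "t".
# That sequence does not occur in $text, so this does not match.
my $direct = ($text =~ /$styleid/) ? 1 : 0;

# The same pattern does match a string that really has whitespace there.
my $space = (" 15qt" =~ /$styleid/) ? 1 : 0;

# Doubling each backslash makes the regex look for literal backslashes.
(my $escaped = $styleid) =~ s[\\][\\\\]g;
my $literal = ($text =~ /$escaped/) ? 1 : 0;

print "direct=$direct space=$space literal=$literal\n";
# prints: direct=0 space=1 literal=1
```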


So I'm good to go, but I'm a little uneasy with the kludginess of my solution, and with the fact that I can't just make it work the way I want it to.  I don't like modifying then un-modifying the value (or creating a copy to modify).
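One possible way to avoid touching the value at all, offered as a sketch rather than a tested fix for the real file: Perl's \Q...\E quoting (the quotemeta function) backslash-escapes every non-word character in the interpolated text, so the ID should be matched literally. The sample string and the $result variable below are illustrative assumptions, not part of the test script.

```perl
use strict;

my $styleid   = '\s15\qt';
my $stylename = 'TESTING';
# Illustrative sample text, standing in for the RTF file contents.
my $wholefile = '{\s15\qt more stuff} but now some \s15\qt other stuff.';

# \Q...\E applies quotemeta() to the interpolated value, so each
# backslash is matched as a literal backslash. $styleid is not modified.
(my $result = $wholefile) =~ s/\Q$styleid\E/$stylename/g;

print "$result\n";    # prints: {TESTING more stuff} but now some TESTING other stuff.
print "$styleid\n";   # prints: \s15\qt
```

The same escaping is available as an ordinary function call, quotemeta($styleid), if building the pattern in a separate step reads better.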

Any Perl gurus have experience/insight with this?

-dave


Appendix A.  The Script

	use strict;


	my $wholefile = '
		{\s15\qt more stuff KET_TESTING_STYLE}
		{\s12\qt more stuff KET_TESTING2_STYLE}
		\s12\qt some stuff here is some \s12\qt  stuff. 
		but now some \s15\qt other stuff.
		';
	my $stylesheet = $wholefile;
	
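	# Each {...} group in the stylesheet is one style definition.
	while ($stylesheet =~ /\{([^{}]*?)\}/g) {
		my $style = $1;
		# Capture the style ID (e.g. \s15\qt) and the name between KET_ and _STYLE.
		next unless ($style =~  /^(\\s\d+\S+) .* KET_(\w+)_STYLE/);
		my ($styleid, $stylename) = ($1, $2);
		print "Found a style definition: $styleid = $stylename\n";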
		

		print "\nNeed to replace [$styleid] in [$wholefile]\n\n";
		print "\nTry substituting [$styleid] directly:\n";
		if ($wholefile =~ s/$styleid/$stylename/g) {
			print ":-) Substituted [$styleid]\n";
		} else {
			print ":-( [$styleid] was not substituted!\n";
		}
		
		$styleid =~ s[\\][\\\\]g;
		print "\nTry substituting it escaped: [$styleid]:\n";
		if ($wholefile =~ s/$styleid/$stylename/g) {
			print ":-) Substituted [$styleid]\n";
		} else {
			print ":-( [$styleid] was not substituted!\n";
		}
		
	}

	print "\nResult: [$wholefile]\n\n";




Appendix B. The Execution

D:\Temp>t.pl
Found a style definition: \s15\qt = TESTING

Need to replace [\s15\qt] in [
                {\s15\qt more stuff KET_TESTING_STYLE}
                {\s12\qt more stuff KET_TESTING2_STYLE}
                \s12\qt some stuff here is some \s12\qt  stuff.
                but now some \s15\qt other stuff.
                ]


Try substituting [\s15\qt] directly:
:-( [\s15\qt] was not substituted!

Try substituting it escaped: [\\s15\\qt]:
:-) Substituted [\\s15\\qt]
Found a style definition: \s12\qt = TESTING2

Need to replace [\s12\qt] in [
                {TESTING more stuff KET_TESTING_STYLE}
                {\s12\qt more stuff KET_TESTING2_STYLE}
                \s12\qt some stuff here is some \s12\qt  stuff.
                but now some TESTING other stuff.
                ]


Try substituting [\s12\qt] directly:
:-( [\s12\qt] was not substituted!

Try substituting it escaped: [\\s12\\qt]:
:-) Substituted [\\s12\\qt]

Result: [
                {TESTING more stuff KET_TESTING_STYLE}
                {TESTING2 more stuff KET_TESTING2_STYLE}
                TESTING2 some stuff here is some TESTING2  stuff.
                but now some TESTING other stuff.
                ]


D:\Temp>

-- 
David Hempy 
Internet Database Administrator
Kentucky Educational Television
<hempy at ket.org> -- (859)258-7164 -- (800)333-9764