I don't think that's exactly correct ... the $1 on the match side appears to refer to the current regex, not a previous regex.  Or perhaps I've misunderstood what you're saying?<br><br>[robj@rj-ul80vt ~]$ perl -e '$x = "one \t\t two"; print ".$1.\n"; print "$x\n"; $x =~ s/(\s)$1+/$1/; print "$x\n";'<br>

..<br>one          two<br>one two<br><br><br>compare with this ..<br><br>[robj@rj-ul80vt ~]$ perl -e '$x = "one \t\t two"; print ".$1.\n"; print "$x\n"; $x =~ s/(\s)\s+/$1/; print "$x\n";'<br>

..<br>one          two<br>one two<br><br>There is no g on the substitution, so $1 is set on the first pass, not the second.  It appears to act as if it were a \s, not the interpolated contents of the first match.<br><br>If the $1 were the contents of the parenthesized match, it would have been a space, and the space tab tab space would not have been replaced.  The match would have occurred only for the two tabs, which would have been replaced with one tab.  However, the $1 appears to act as if it were a \s, matching any whitespace character.  The parenthesized match matches the space, and the $1+ matches tab tab space.  On the replacement side, the $1 is just a space.<br>

<br>I looked over the perlre man page and didn't see anything about that.  I did find some new stuff: \g{1} instead of \1, to remove some ambiguity, and named capture groups, (?&lt;name&gt;xxxx).<br><br>There was also the warning about using $1 and the like anywhere, causing an overall slowdown.  That might be a reason to use \1 in the replacement, although it's possible that a \1 in that context would also slow perl down.<br>
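<br>A minimal sketch of those two syntaxes, assuming Perl 5.10 or later and the same test string as above.  The helper show() and the variable names are made up for illustration; it only makes the tabs visible when printing.<br><br>

```perl
use strict;
use warnings;

# Hypothetical helper: render tabs as a visible \t so the output is readable.
sub show {
    my ($str) = @_;
    $str =~ s/\t/\\t/g;
    return $str;
}

my $s = "one \t\t two";

# \g{1} is a true backreference: it matches the exact character that
# group 1 captured, so only the run of two tabs collapses.
(my $numbered = $s) =~ s/(\s)\g{1}+/$1/;

# Same idea with a named group: \k<ws> in the pattern, $+{ws} in the replacement.
(my $named = $s) =~ s/(?<ws>\s)\k<ws>+/$+{ws}/;

print show($numbered), "\n";   # one \t two
print show($named), "\n";      # one \t two
```

<br>Both forms leave the surrounding spaces alone, which is the result the $1 version in the pattern did not give.<br>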

<br>Also, I noticed that \1 in the replacement is "grandfathered", not deprecated: not for backwards compatibility, but to avoid shocking sed fans.  That means that \1 is not going away.  The warning section says its use is discouraged because of the ambiguity with other uses of \1.<br>

<br>This seems to explain what's happening:<br><br>>>> The operation of interpolation should not be confused with the operation of matching a backreference.  Certainly they mean two different things on the left side of the "s///".<br>
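<br>A small sketch of that distinction, under the assumption that a $1 in the pattern is interpolated with whatever value it holds before the match starts.  The strings and variable names here are only for illustration.<br><br>

```perl
use strict;
use warnings;

# Case 1: no earlier successful match, so $1 is undefined and interpolates
# as an empty string.  The pattern that actually runs is /(\s)+/, and the
# group keeps the last whitespace character it matched (the final space).
my $x = "one \t\t two";
{
    no warnings 'uninitialized';   # $1 is undef at this point
    $x =~ s/(\s)$1+/$1/;
}
print "[$x]\n";                    # [one two]

# Case 2: an earlier match leaves "b" in $1, so the same source text now
# compiles to /(\s)b+/.  The $1 in the replacement is the newly captured space.
my $y = "a bbb c";
if ("abc" =~ /(b)/) {
    $y =~ s/(\s)$1+/$1/;
}
print "[$y]\n";                    # [a  c]
```

<br>If this reading is right, the first case would account for the "one two" output above without the $1 in the pattern ever standing for \s.<br>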

<br>The only reason I can think of for this behaviour of $1 in a match regex is to replicate a pattern in the regex, perhaps with different modifiers.<br><br>-rob<br><br><div class="gmail_quote">On Sat, Oct 27, 2012 at 11:56 PM, Uri Guttman <span dir="ltr"><<a href="mailto:uri@stemsystems.com" target="_blank">uri@stemsystems.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 10/27/2012 08:52 PM, Rob Janes wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

sounds a bit muddy ...<br>

<br>

just to clarify, what I found to my surprise,  is that<br>

<br>

$exp =~ s/(\s)$1+/$1/g;<br>

<br>

replaces mixed strings of white space with the first character.  leading me<br>

to conclude that the first $1 actually is the \s, not the specific<br>

character matched.  However, in the second part of the s, the $1 did indeed<br>

give the white space.  So the $1 in the context of searching appears to<br>

render as the regex, not the characters matched.  While in the context of<br>

replacing, the $1 renders as the actual characters matched, not the regex<br>

used.<br>

<br>

hope that makes it clearer.<br>

</blockquote>

<br></div>

sorry but no.<br>

<br>

$1 in the regex is what $1 was BEFORE the regex. it gets interpolated and then parsed as a regex. it is not related to the () in the regex. only \1 will be the grabbed text from that (). and $1 in the replacement is just the string matched in the () which is any single whitespace char.<span class="HOEnZb"><font color="#888888"><br>


<br>

uri<br>

<br>

<br>

</font></span></blockquote></div><br>