SPUG: multi-line substitution, all at once.

Thu Jun 25 04:25:30 PDT 2009

Ryan Corder wrote:

> my $code=<<__CODE__;
>  :    sub hello {
>  :        say "Hello World!";
>  :    }
> __CODE__
> 
> I would like to get back:
> 
> [code]    sub hello {
>         say "Hello World!";
>     }
> [/code]

and

> my $code=<<__CODE__;
> foo bar
> baz
> 
>  :    sub hello {
>  :        say "Hello World!";
>  :    }
> asdf
> snap
> crackle
> pop
> __CODE__
> 

>     [code]
>         sub hello {
>             say "Hello World!";
>         }
>     [/code]

right?

It's a good idea to be explicit about your heredoc quoting.  An unquoted
terminator is treated as double-quoted by default, but it doesn't hurt to put
the quotes in.  I also don't recommend using @ as the delimiter in
substitutions, but that's because it hurts my eyes (and my syntax highlighting
doesn't understand it).  You had this:

$code =~ s@(?:(?<=\s\:).*)+@"[code]$&[/code]"@esg;

which could be more nicely written as:

$code =~ s{
        (?:                  # non-capturing group
           (?<=\s:)          # preceded by whitespace and a colon
           .*                # everything after it (/s lets . match newlines)
        )+                   # one or more times
        }
        {[code]$&\[/code]}sgx;

or, to get rid of the assertion:

$code =~ s{
                (
                 ^\s:                # the line starts with space:
                 .*
                )+                   # one or more
        }
        {[code]$&\[/code]}msgx;

Note that I do need to escape the second [ on the right-hand side (as your code
should have too), because otherwise Perl reads $&[ as the start of an array
subscript.  I don't need quotes or /e, though (neither did you).  The
right-hand side of a substitution already acts as a double-quoted string.

As you point out, your original and both rewrites return something like:

 :[code]    sub hello {
 :        say "Hello World!";
 :    }
[/code]

Which isn't quite right.  Worse, they behave completely wrong when given more
complex data, because under /s the .* runs on to the end of the string:

foo bar
baz

[code] :    sub hello {
 :        say "Hello World!";
 :    }
asdf
snap
crackle
pop
[/code]
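
To see the effect of /s in isolation, here is a rough sketch on a made-up
three-line string ($text, $with_s and $without_s are names invented for this
example).  The same anchored pattern is run with and without /s:

```perl
use strict;
use warnings;

my $text = " :one\n :two\nrest\n";

# With /s the dot also matches newlines, so .* swallows everything
# up to the end of the string, including the unmarked "rest" line.
my ($with_s) = $text =~ /(^\s:.*)/ms;

# Without /s the dot stops at the end of the first marked line.
my ($without_s) = $text =~ /(^\s:.*)/m;

printf "with /s:    %d characters\n", length $with_s;      # 17
printf "without /s: %d characters\n", length $without_s;   # 5
```

That is why the closing tag ends up after "pop" rather than after the last
marked line.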

The expression below works better:

my $code = '
foo bar
baz

 :    sub hello {
 :        say "Hello World!";
 :    }
asdf
snap
crackle
pop
';

$code =~ s{
        \A.*?           # Get rid of any excess at the top
        (               # start $1
         (
          ^             # start of a line
          \s:           # space then colon
          [^\n]*\n      # just 1 line (including newline)
         )+             # one or more times
        )               # end of $1
        .*
        }{[code]$1\[/code]\n}sxm;

print $code;

__END__

[code] :    sub hello {
 :        say "Hello World!";
 :    }
[/code]

The problem is that you want to capture the lines which are identified by
starting with space-colon and, in the same step, remove the space-colon from
each of them.  I don't believe that this can be done (at least not in an
efficient and readable fashion) without cheating.  Here's an example of
cheating:

my $code = '
foo bar
baz

 :    sub hello {
 :        say "Hello World!";
 :    }
asdf
snap
crackle
pop
';

$code =~ s{
        \A.*?           # Get rid of any excess at the top
        (               # start $1
         (
          ^             # start of a line
          \s:           # space then colon
          [^\n]*\n      # just 1 line (including newline)
         )+             # one or more times
        )               # end of $1
        .*
        }{"[code]\n".no_colon($1)."[/code]\n"}sexm;

print $code;

sub no_colon {
        my $lines = shift;      # the whole captured block, not a single line
        $lines =~ s/^ ://gm;    # strip the leading " :" from every line
        return $lines;
}

__END__

[code]
    sub hello {
        say "Hello World!";
    }
[/code]

This still uses two regular expressions; we just hide one of them away in a
subroutine.  I don't really like using /e, so I'd still suggest doing this in
two expressions: the previous example (with a newline added after [code], so
that the first " :" starts a line), followed by $code =~ s/^ ://gm;
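
Put together, the two-expression version might look like the sketch below.
It reuses the pattern from the earlier example unchanged; the only thing I
have assumed is the extra \n after [code] in the replacement, which puts the
first " :" at the start of a line where the second substitution can find it:

```perl
use strict;
use warnings;

my $code = '
foo bar
baz

 :    sub hello {
 :        say "Hello World!";
 :    }
asdf
snap
crackle
pop
';

# Step 1: keep only the run of " :" lines and wrap it in [code] tags.
$code =~ s{
        \A.*?           # Get rid of any excess at the top
        (               # start $1
         (
          ^             # start of a line
          \s:           # space then colon
          [^\n]*\n      # just 1 line (including newline)
         )+             # one or more times
        )               # end of $1
        .*
        }{[code]\n$1\[/code]\n}sxm;

# Step 2: strip the leading " :" marker from every line.
$code =~ s/^ ://gm;

print $code;
```

This prints the same block as the /e version, without needing a helper
subroutine.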

You *might* be able to achieve this in one regular expression using a re-entrant
expression, but I haven't been able to, and I don't think it would be easily
maintainable.

All the best,

	J

-- 
   ("`-''-/").___..--''"`-._          |  Jacinta Richardson         |
    `6_ 6  )   `-.  (     ).`-.__.`)  |  Perl Training Australia    |
    (_Y_.)'  ._   )  `._ `. ``-..-'   |      +61 3 9354 6001        |
  _..`--'_..-_/  /--'_.' ,'           | contact at perltraining.com.au |
 (il),-''  (li),'  ((!.-'             |   www.perltraining.com.au   |