[Chicago-talk] split function and zero-width separator

tiger peng tigerpeng2001 at yahoo.com
Thu Oct 22 12:37:16 PDT 2009


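I found the answer in the perldoc for split (perldoc -f split), which says:

If the PATTERN contains parentheses, additional list elements are created from each matching substring in the delimiter.

So it seems I have to simplify the separator (PATTERN) first. Here I replace each zero-width boundary with a marker character, chr(127), and then split on that marker, which contains no parentheses: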

perl -MData::Dumper -le '$s=chr(127);
                         $a = "15678.91 ml; r45.12 ";
                         $a =~ s/((?!(\d|\.))(?<=(\d|\.)))|((?=(\d|\.))(?<!(\d|\.)))/$s/g;
                         @a = split /$s/, $a;
                         print Dumper(@a)'
$VAR1 = '';
$VAR2 = '15678.91';
$VAR3 = ' ml; r';
$VAR4 = '45.12';
$VAR5 = ' ';


________________________________
From: tiger peng <tigerpeng2001 at yahoo.com>
To: Chicago.pm chatter <chicago-talk at pm.org>
Sent: Thu, October 22, 2009 2:11:29 PM
Subject: [Chicago-talk] split function and zero-width separator


Hello everyone,
 
I am trying to isolate the numbers in a string, manipulate them, and then put them back in their original positions. When I use a zero-width separator, split behaves strangely: it generates many more elements than I expect.

Is there a mistake somewhere? Why does split behave like this? (The regexp looks right when I use it with s///.)

Could anyone help?
 
Thanks,
Tiger

# build the separator
:-) perl -le '$a = "15678.91 ml; r45.12 ";
              $a =~ s/((?!(\d|\.))(?<=(\d|\.)))|((?=(\d|\.))(?<!(\d|\.)))/|/g;
              print $a'
|15678.91| ml; r|45.12|

# use the separator in split
:-) perl -MData::Dumper -le '$a = "15678.91 ml; r45.12 ";
                             @a = split /((?!(\d|\.))(?<=(\d|\.)))|((?=(\d|\.))(?<!(\d|\.)))/, $a;
                             print Dumper(@a)'
$VAR1 = '15678.91';
$VAR2 = '';
$VAR3 = undef;
$VAR4 = '1';
$VAR5 = undef;
$VAR6 = undef;
$VAR7 = undef;
$VAR8 = ' ml; r';
$VAR9 = undef;
$VAR10 = undef;
$VAR11 = undef;
$VAR12 = '';
$VAR13 = '4';
$VAR14 = undef;
$VAR15 = '45.12';
$VAR16 = '';
$VAR17 = undef;
$VAR18 = '2';
$VAR19 = undef;
$VAR20 = undef;
$VAR21 = undef;
$VAR22 = ' ';