SPUG: RE / Split Question

Yitzchak Scott-Thoennes sthoenna at efn.org
Sun Aug 3 15:14:17 CDT 2003


On Thu, 31 Jul 2003 00:02:11 -0700, krahnj at acm.org wrote:
>$ perl -le'
>$glob = "425 501 sttlwa01t 425 712 sttlwa01t tacwa02t 425 337 tacwa02t ";
>
>@array = $glob =~ /( \b\d+ \s+ \d+ (?:\s+ \D\w*)+ )/xg;
>
>print for @array;
>'
>425 501 sttlwa01t 
>425 712 sttlwa01t tacwa02t 
>425 337 tacwa02t 

The problem with this kind of approach is that it silently ignores bad
data (or good data, if you make a mistake in your regex): anything the
pattern fails to match is simply skipped.  I like to do this kind of
splitting with something like:

@array = $glob =~ /\G ( \b\d+ \s+ \d+ (?:\s+ \D\w*)+ ) \s+ /xgc;
print "error!" if (pos($glob)||0) != length($glob);

The \G always starts each match where the preceding one left off, and
the /c modifier keeps pos() from being reset when the final attempt
fails, so the pos() check then verifies that the entire string was
consumed.
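
As a rough sketch of how that might look wrapped up for reuse (the sub
name split_records, the error message, and the $bad sample string are
made up for illustration, not part of the code above):

```perl
use strict;
use warnings;

# Hypothetical helper: returns the list of records, or dies if any part
# of the string was left unconsumed.
sub split_records {
    my ($glob) = @_;

    # \G anchors each match at the end of the previous one; /c keeps
    # pos() where the last successful match ended when matching stops.
    my @records = $glob =~ /\G ( \b\d+ \s+ \d+ (?:\s+ \D\w*)+ ) \s+ /xgc;

    # pos() is undef if nothing matched at all, hence the "|| 0".
    my $pos = pos($glob) || 0;
    die "unparsed data at offset $pos: '" . substr($glob, $pos) . "'\n"
        if $pos != length $glob;

    return @records;
}

# The sample data from the quoted message.
my $good = "425 501 sttlwa01t 425 712 sttlwa01t tacwa02t 425 337 tacwa02t ";
my @records = split_records($good);
print "$_\n" for @records;

# Assumed example of bad input: the second record has a stray word
# where a number should be.
my $bad = "425 501 sttlwa01t 425 oops 712 sttlwa01t ";
my @partial = eval { split_records($bad) };
my $err = $@;
print "rejected: $err" if $err;
```

With the good string this should print the three records; with the bad
one it should die at the offset where parsing stopped, rather than
quietly returning only the first record the way the plain /g version
would.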


