SPUG: RE / Split Question

Chris Wilkes cwilkes-spug at ladro.com
Wed Jul 30 19:18:07 CDT 2003


On Wed, Jul 30, 2003 at 04:54:46PM -0700, Orr, Chuck  (NOC) wrote:
> 
>      I am being given a glob of data from a web page that I need to fix
> with perl.  It comes in as $blob looking like this:
>  
> 425 501 sttlwa01t 425 712 sttlwa01t tacwa02t 425 337 tacwa02t ...
>  
> I need to break this up so the word characters associated with the
> numbers stay with their numbers.  Ideally, I would have an array like
> this:
>  
> 425 501 sttlwa01t
> 425 712 sttlwa01t tacwa02t
> 425 337 tacwa02t

(note to Chuck: this is the same email I sent to you directly by
accident)

I'm not sure this is very efficient, but it works.  In fact, it's
probably quite inefficient: the substitution inside the loop rewrites
the line, so the match starts over from the beginning on every pass.

This works only if every record is guaranteed to look like
"### ### some text", where the text after the two three-digit numbers
never itself contains a pair of three-digit numbers.

Chris



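#!/usr/bin/perl
use strict;
use warnings;

my @ary;
while (<DATA>) {
   chomp;
   # Peel one record off the front of the line for as long as another
   # "### ###" pair follows it.
   while (/(\d{3} \d{3} .*?)\s+(\d{3} \d{3})/g) {
      push @ary, $1;
      s/^\Q$1\E\s+//;   # \Q..\E: match the record literally, not as a regex
   }
   # Whatever is left is the last record on the line.
   push @ary, $_ if /^\d{3} \d{3}/;
}

print join("\n", @ary), "\n";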


__DATA__
425 501 one 425 712 two twoB 425 337 three 123 456 four fourB fourC 789 012 five 000 000 six
123 456 simple one liner
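
For comparison, the same format assumption probably allows a single-pass
version that avoids the substitution: split on whitespace that is followed by
a "### ###" pair, using a lookahead so the pair stays with its record.  This is
only a sketch, not the script above; the name split_records is made up for
illustration, and it assumes $blob arrives as one string in the format Chuck
described.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_records is a hypothetical helper name, not part of the original script.
# Assumption: every record starts with two three-digit numbers ("### ###").
sub split_records {
    my ($blob) = @_;
    # Split on whitespace that is immediately followed by a "### ###" pair.
    # The lookahead (?=...) consumes nothing, so the pair stays with its record.
    # The grep drops any leading fragment that does not start with such a pair.
    return grep { /^\d{3} \d{3}\b/ }
           split /\s+(?=\d{3} \d{3}\b)/, $blob;
}

my $blob = '425 501 sttlwa01t 425 712 sttlwa01t tacwa02t 425 337 tacwa02t';
print "$_\n" for split_records($blob);
```

Run on the sample from the question, this should print the three lines Chuck
asked for.  It has the same limitation as the loop version: a record whose
text contains its own pair of three-digit numbers would be split in two.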


