SPUG: RE / Split Question

James Moore james at banshee.com
Wed Jul 30 19:36:27 CDT 2003


I wouldn't do it in one step.

Something vaguely like (not tested or compiled):

my $foundAWord = 1; # set to true to pick up the first element
my @results;
my @currentResult;
my @elements = split ' ', $incomingBlob;
for my $bit (@elements) {
  # An all-digit token that follows a word starts a new group.
  if ($foundAWord and $bit =~ /^\d+$/) {
    push @results, [@currentResult]
      if @currentResult;
    @currentResult = ();
    undef $foundAWord;
  }

  push @currentResult, $bit;

  # Any token containing a non-digit counts as a word.
  if ($bit =~ /\D/) {
    $foundAWord = 1;
  }
}
# Flush the last group, if there is one.
push @results, [@currentResult]
  if @currentResult;
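
If a single pattern is preferred over the loop, a global match can pull out
whole records instead of splitting between them.  This is only a sketch, and
it assumes that every record is one or more all-digit tokens followed by one
or more tokens that are not all digits.  The names $blob and @records are
just for illustration.  Unlike the loop above, it silently skips any leading
words and any trailing numbers that have no words after them.

```perl
use strict;
use warnings;

# Sample input copied from the original question, without the trailing "...".
my $blob = '425 501 sttlwa01t 425 712 sttlwa01t tacwa02t 425 337 tacwa02t';

# In list context, a global match with one capture group returns every capture.
my @records = $blob =~ m{
    (                               # capture one whole record
      (?<!\S) \d+                   # first numeric token, at a token boundary
      (?: \s+ \d+ )*                # any further numeric tokens
      (?: \s+ (?! \d+ (?:\s|\z) )   # the next token must not be all digits
          \S+ )+                    # one or more word tokens
    )
}gx;

print "$_\n" for @records;
```

On the sample input this should give the three lines asked for in the
original message.  Note also that the [A-Z] in the original split attempt
only matches uppercase letters, while the sample data is all lowercase.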


-----Original Message-----
From: spug-list-bounces at mail.pm.org [mailto:spug-list-bounces at mail.pm.org]
On Behalf Of Orr, Chuck (NOC)
Sent: Wednesday, July 30, 2003 4:55 PM
To: spug-list at mail.pm.org
Subject: SPUG: RE / Split Question

Hello All,
 
Please help with the following dilemma:
     I am being given a glob of data from a web page that I need to fix with
Perl.  It comes in as $blob looking like this:
 
425 501 sttlwa01t 425 712 sttlwa01t tacwa02t 425 337 tacwa02t ...
 
I need to break this up so the word characters associated with the numbers
stay with their numbers.  Ideally, I would have an array like this:
 
425 501 sttlwa01t
425 712 sttlwa01t tacwa02t
425 337 tacwa02t
 
As you can see, I am not assured of the number of words that will follow
each set of numbers.  Could you please suggest a split or some other tool
that will turn the glob into that array?  This is as close as we got:

$new_array = [ split /(?=[A-Z]\s\d)/,$scalar ];

It does not work.  It keeps the split characters, but in a funky way that I
cannot deal with.  It also always misses the last chunk of the glob.
 
Thanks,
Chuck



