[Pdx-pm] Simplistic, yet complicated to me question....

Michael G Schwern schwern at pobox.com
Mon Dec 9 13:30:12 CST 2002


On Mon, Dec 09, 2002 at 09:32:50AM -0800, A.J. Weinzettel wrote:
> I am parsing a text file and I have the following layout of text
> 
> 
> 123454 text                67890 Get Rid of text
> 234555 more text to keep

(I'm assuming you want to keep the 2nd number and only dump the text)

If it's actually set up as a bunch of fixed-width columns, use unpack() as
perlfaq5 suggests.

  my($id, undef, $text, $id2) =
  unpack('A6    AA20                 A5   AA*', $line);
      #   123454 text                67890 Get Rid of text
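
For a version you can run on its own, here is a minimal sketch. It assumes
the columns really are 6, 1, 20 and 5 characters wide (my reading of the
sample line, not something the original poster confirmed), and it builds
the line with sprintf so the widths are exact. It uses 'x' to skip the
separator instead of capturing it into undef.

```perl
use strict;
use warnings;

# Rebuild the sample line with explicit widths (assumed: 6, 1, 20, 5).
my $line = sprintf('%-6s %-20s%-5s %s',
                   '123454', 'text', '67890', 'Get Rid of text');

# A6  = first id
# x   = skip the one-character separator
# A20 = text column; 'A' strips the trailing space padding
# A5  = second id; anything after it is simply not unpacked
my ($id, $text, $id2) = unpack('A6 x A20 A5', $line);

print "$id|$text|$id2\n";   # 123454|text|67890
```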

Otherwise you can distinguish the textual part as a run of alphanumerics
separated by single spaces, each such space being matched here as a space
surrounded by word boundaries.
  my($id, $text, $id2) = $line =~ /^(\d+) ((?:\w|\b \b)+)\s*(\d+)/;
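
Here is that pattern tried against both sample lines, as a rough sketch.
parse_line is a hypothetical helper name, not anything from the original
question. Note that the second sample line has no trailing number, so the
pattern as written does not match it at all and the caller has to decide
what to do with such lines.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (id, text, id2), or an empty list on no match.
sub parse_line {
    my ($line) = @_;
    return $line =~ /^(\d+) ((?:\w|\b \b)+)\s*(\d+)/;
}

# First sample line: the run of spaces ends the text, then the second id follows.
my @fields = parse_line('123454 text                67890 Get Rid of text');
print join('|', @fields), "\n";                  # 123454|text|67890

# Second sample line: no second number anywhere, so the match fails.
my @none = parse_line('234555 more text to keep');
print @none ? "matched\n" : "no match\n";        # no match
```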

If the text itself can contain runs of multiple spaces and numbers, you have
a potentially unparsable data format.
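
To see how that goes wrong, here is a small sketch with a made-up line
(not from the original data) whose text contains a number. Because \w also
matches digits, the pattern cannot tell the text from the second id, and it
only matches by backtracking into the middle of the real id.

```perl
use strict;
use warnings;

# Made-up input: the text "room 12" contains a number of its own.
my $line = '123454 room 12 67890 junk';

my ($id, $text, $id2) = $line =~ /^(\d+) ((?:\w|\b \b)+)\s*(\d+)/;

# The match succeeds, but the split between text and id2 is wrong.
print "text='$text' id2='$id2'\n";   # text='room 12 6789' id2='0'
```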


-- 

Michael G. Schwern   <schwern at pobox.com>    http://www.pobox.com/~schwern/
Perl Quality Assurance      <perl-qa at perl.org>         Kwalitee Is Job One
I have this god-awful need to aquire useless crap!!!


