[Chicago-talk] Splitting up an undelimited file

Andy_Bach at wiwb.uscourts.gov Andy_Bach at wiwb.uscourts.gov
Mon Sep 12 07:20:01 PDT 2011


> This still looks rigidly structured - "date" "space" "run of text"

while (<>) {
  if (/(\w+ \d+, \d{4}) (.+)/) {
    my ($date, $memo) = ($1, $2);
    #do something interesting with $date and $memo
  }
}

Yeah, and just to be safe, use whitespace metas (and /x, for readability)
to get:
  if (/(\w+ \s+ \d+, \s+ \d+) \s+ (.+)/x) {

in case there's a chance of variability, as with those logs that pad a
single-digit day number with an extra space:
March  8
March  9
March 10

and add an 'else' if you want to worry about bad data.
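
Pulling that together, here is a rough sketch of the whole thing, not
tested against your real file. The split_line helper and the sample lines
are made up for illustration; only the regex comes from the discussion
above. It assumes every good line looks like "Month day, year memo text".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one line into (date, memo). Returns an
# empty list when the line does not fit the "date, space, text" shape.
sub split_line {
    my ($line) = @_;
    # Under /x, literal spaces in the pattern are ignored, so every gap
    # in the data has to be matched explicitly with \s+.
    if ($line =~ /(\w+ \s+ \d+, \s+ \d+) \s+ (.+)/x) {
        return ($1, $2);
    }
    return;
}

# Made-up sample lines, including a padded single-digit day.
my @sample = (
    "March  8, 2011 Rotated the logs",
    "March 10, 2011 Restored from backup",
    "not a dated line",
);

for my $line (@sample) {
    if (my ($date, $memo) = split_line($line)) {
        print "date=[$date] memo=[$memo]\n";
    }
    else {
        # The 'else' branch: report bad data instead of silently dropping it.
        warn "skipping bad line: $line\n";
    }
}
```

To run it on a real file, swap the loop over @sample for while (<>), and
chomp each line first if the trailing newline matters to you.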

a

----------------------
Andy Bach
Systems Mangler
Internet: andy_bach at wiwb.uscourts.gov
Voice: (608) 261-5738, Cell: (608) 658-1890

“One of the most striking differences between a cat and a lie is that a cat
has only nine lives.”
Mark Twain, Vice President, American Anti-Imperialist League, and erstwhile
writer

