[Chicago-talk] Regex and the whitespace before it.

Mike Fragassi frag at ripco.com
Wed Mar 26 10:24:59 PDT 2008


Mike --

You're welcome.  I hope this helps with the current problem:

    split /(\w+)\s*:/

Capturing parentheses in the split regex make split return the captured 
portion of each match in its output list, in between the fields that 
the match separates.
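
As a rough sketch of the difference (the sample string and the variable
names here are made up for illustration), compare the same split with
and without the parentheses:

```perl
use strict;
use warnings;

# Hypothetical sample input, not from the original problem.
my $text = "alpha : one beta : two";

# Without capturing parentheses, the matched "word :" separators are discarded.
my @plain = split /\w+\s*:\s*/, $text;

# With capturing parentheses, the captured word comes back between the fields.
my @captured = split /(\w+)\s*:\s*/, $text;

# Both lists start with an empty string, because the input begins with a match.
print join('|', @plain), "\n";       # |one |two
print join('|', @captured), "\n";    # |alpha|one |beta|two
```

The leading empty element is worth remembering if you go on to treat the
list as subject/description pairs.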

my $string =<<STRING;
subject1 : description of a subject. subject2 : description of a subject.
subject3 : description of a subject.  etc etc
STRING
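# $aa[0] is an empty string, because $string begins with a match.
my @aa = split /(\w+)\s*:\s*/, $string;
$, = "\n";    # output field separator: print each element on its own line
print @aa;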

Of course, this assumes that colons don't naturally occur in the 
descriptions, and that the subject is never more than one word.
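
To make those two assumptions concrete, here is a small sketch with
made-up input (the strings and variable names are hypothetical) showing
what the same regex does when each assumption is broken:

```perl
use strict;
use warnings;

# A colon inside a description ("10:30") is treated as another subject.
my @colon = split /(\w+)\s*:\s*/, "time : meet at 10:30 place : room 5";
print join('|', @colon), "\n";    # |time|meet at |10|30 |place|room 5

# A two-word subject: only the last word is captured, and the first
# word is left behind as part of the preceding field.
my @multi = split /(\w+)\s*:\s*/, "first name : Mike";
print join('|', @multi), "\n";    # first |name|Mike
```

If either case can turn up in your data, the regex would need something
more specific than \w+ and a bare colon to anchor on.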

If your descriptions wrap across lines and you want to quickly get rid 
of the internal newlines, you can just put a map block between the = and 
the split, like this:

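my @aa = map {s/\n//g;$_} split /(\w+)\s*:\s*/, $string;

Note that s/\n//g deletes each newline outright, so the words on either 
side of a line break get run together; use s/\n/ /g instead if you want 
a space left in its place.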
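
A slightly more defensive variant, as a sketch only: it copies each
piece before editing it, turns newlines into spaces, trims the ends, and
drops the empty leading element so that the list can be read as
subject/description pairs.  The sample string and the names @pieces and
%description_for are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical input with a description that wraps across a line.
my $string = "subject1 : a description that\nwraps. subject2 : another one.\n";

# Copy each piece, replace newlines with spaces, and trim both ends.
my @pieces = map { my $s = $_; $s =~ s/\n/ /g; $s =~ s/^\s+|\s+$//g; $s }
             split /(\w+)\s*:\s*/, $string;

# Drop the empty leading field produced by the match at the start.
shift @pieces if @pieces && $pieces[0] eq '';

# What remains alternates subject, description, subject, description.
my %description_for = @pieces;

print "$_ => $description_for{$_}\n" for sort keys %description_for;
```

This still relies on the same assumptions as before: one-word subjects
and no colons inside the descriptions.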

-- Mike F.


