[Chicago-talk] Need help on a parsing that used to work
Steven Lembark
lembark at wrkhors.com
Tue Nov 25 10:18:05 CST 2003
> RCPT FROM: bernie at mesatech.com
If you're really addicted to regexen try looking for
the shortest match up to a colon:
my ( $header, $value ) = m{^(.+?):\s*(.+)};
If I remember RFC822 headers properly the "\s*"
can be changed to "\s+" since there is supposed to
be some space after the colon.
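
As a minimal sketch of the regex approach (the sub name parse_header and the Date line are made up for illustration; only the RCPT FROM line comes from the quoted message), something like this shows the non-greedy match stopping at the first colon:

```perl
use strict;
use warnings;

# parse_header is a hypothetical helper name, not from the original post.
sub parse_header {
    my ($line) = @_;

    # Non-greedy (.+?) stops at the first colon, so any later colons
    # (e.g. in a timestamp) end up in the value, not the header name.
    # On a failed match this returns an empty list.
    return $line =~ m{^(.+?):\s*(.+)};
}

for my $line ( 'RCPT FROM: bernie at mesatech.com',
               'Date: Tue Nov 25 10:18:05 CST 2003' ) {
    if ( my ( $header, $value ) = parse_header($line) ) {
        print "[$header] [$value]\n";
    }
    else {
        print "no match: $line\n";
    }
}
```

Note that the header name is allowed to contain whitespace here, which is what the "RCPT FROM" line needs.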
You can also strip the colon and the space after it with split:
my( $header, $value ) = split /:\s*/, $_, 2;
  DB<1> $a = 'this is a test: it is only a test, do not be alarmed'
  DB<2> x split /:\s+/, $a, 2;
0  'this is a test'
1  'it is only a test, do not be alarmed'
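
The limit of 2 matters once the value has colons of its own. A rough sketch, assuming a made-up Date line and a hypothetical split_header helper (neither is from the original message):

```perl
use strict;
use warnings;

# split_header is a hypothetical helper name used only for this example.
sub split_header {
    my ($line) = @_;

    # Limit of 2: split at the first colon only, leave the rest intact.
    return split /:\s*/, $line, 2;
}

my $line = 'Date: Tue Nov 25 10:18:05 CST 2003';

my ( $header, $value ) = split_header($line);
print "[$header] [$value]\n";

# Without the limit, the timestamp is broken up at every colon.
my @pieces = split /:\s*/, $line;
print scalar(@pieces), " pieces without the limit\n";
```

With the limit the timestamp stays in one piece; without it the line comes back as four fields.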
Any time this doesn't work you can try it in the Perl
debugger.
The original version probably worked for the fairly large
number of headers that are single words (e.g., from:, to:),
so you never noticed that it fails for headers with
whitespace in their names.
--
Steven Lembark 2930 W. Palmer
Workhorse Computing Chicago, IL 60647
+1 888 359 3508