[Chicago-talk] Need help on a parsing that used to work

Steven Lembark lembark at wrkhors.com
Tue Nov 25 10:18:05 CST 2003


> RCPT FROM: bernie at mesatech.com

If you're really addicted to regexen try looking for
the shortest match up to a colon:

	my ( $header, $value ) = m{^(.+?):\s*(.+)};

If I remember RFC822 headers properly, space after the
colon is a convention rather than a requirement, so "\s*"
is the safe choice. Change it to "\s+" only if you want
to insist on the space.
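
For what it's worth, here is a rough sketch of that match as a
standalone script. The sub name parse_header and the second sample
line are made up for illustration; only the regex comes from above.

```perl
use strict;
use warnings;

# Hypothetical helper: break one header line at its first colon.
sub parse_header {
    my ($line) = @_;

    # (.+?) is non-greedy, so it stops at the first colon;
    # (.+) then takes the rest, including any later colons.
    my ( $header, $value ) = $line =~ m{^(.+?):\s*(.+)};

    return ( $header, $value );
}

for my $line (
    'RCPT FROM: bernie at mesatech.com',
    'Date: Tue Nov 25 10:18:05 CST 2003',    # made-up sample with extra colons
) {
    my ( $header, $value ) = parse_header($line);
    print "[$header] [$value]\n";
}
```

Both names come back whole ("RCPT FROM" and "Date"), and the colons
inside the time stay in the value.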

You can also skip the captures and let split eat the
colon and the space after it:

	my( $header, $value ) = split /:\s*/, $_, 2;

  DB<1> $a = 'this is a test: it is only a test, do not be alarmed'

  DB<2> x split /:\s+/, $a, 2;
0  'this is a test'
1  'it is only a test, do not be alarmed'


Any time this doesn't work you can try it in the Perl
debugger.
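
If you'd rather have a script than a debugger session, something
like the following sketch shows why the limit of 2 matters on the
split. The sub name split_header and the sample Date line are
assumptions of mine, not anything from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: split at the first colon plus optional whitespace.
sub split_header {
    my ($line) = @_;

    # A limit of 2 yields at most two fields, so any later
    # colons are left alone inside the value.
    return split /:\s*/, $line, 2;
}

my $line = 'Date: Tue Nov 25 10:18:05 CST 2003';    # made-up sample

my @with_limit    = split_header($line);
my @without_limit = split /:\s*/, $line;

print scalar(@with_limit),    " fields with the limit\n";
print scalar(@without_limit), " fields without the limit\n";
```

Without the limit the time gets chopped at each of its colons, which
is rarely what you want for a header value.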

The original version probably worked for the fairly large
number of headers that are single words (e.g., from:, to:)
and you never noticed that it fails for headers with
whitespace in their names.


--
Steven Lembark                               2930 W. Palmer
Workhorse Computing                       Chicago, IL 60647
                                            +1 888 359 3508


