[LA.pm] split() question

William Yardley lapm at veggiechinese.net
Thu Sep 28 17:10:09 PDT 2006


On Fri, Sep 29, 2006 at 07:33:08AM +0800, Benjamin J. Tilly wrote:
> "William Yardley" <lapm at veggiechinese.net> wrote:

> > I'm doing some maintenance on a CPAN module I didn't write
> > (Mail::DeliveryStatus::BounceParser), and (around line 356 if anyone
> > needs to actually look there), we do something like:

> > foreach my $para (split /\n{2,}/, $delivery_status_body) {
> >        my $report = Mail::Header->new([split /\n/, $para]);
> > 
> > What's strange, is that in some cases, there is a leading \n left, which
> > results in $report not having the stuff we expect there. I'd assume that
> > since the \n{2,} should be greedy, there wouldn't be any \ns left after
> > that split. However, if I print out $para inside the loop, I see a
> > leading \n.
 
> Is it possible that your data looks like "\nstuff\n\n\nmore stuff"?
> Then the first string has a \n at the start that the split is not
> going to catch.

Right. I thought it was two: the original message
(http://emailproject.perl.org/svn/Mail-DeliveryStatus-BounceParser/trunk/t/corpus/surfcontrol-extra-newline.msg)
has two, but it looks like it is down to one by the time I get
to it, which (basically) explains the problem.

--start--


Action: failed
Final-Recipient: rfc822;recipient at example.com
Diagnostic-Code: smtp; 554 Service currently unavailable
Status: 5.0.0



--end--

> You have my top two guesses in order.  First, $delivery_status_body
> sometimes starts with \n, and failing that, that there is a Perl bug.

Right - the problematic example started with a single blank line. I was
thinking it was two (since there are two \ns in the original message)
and/or that a blank line followed by text would count as two \ns total.
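
To make that concrete, here is a minimal sketch, assuming a body that
begins with exactly one blank line. The sample string is made up to
resemble the report above, and paragraphs_of is a hypothetical helper
name that just wraps the same split as in the loop:

```perl
use strict;
use warnings;

# Hypothetical helper: the same split as in the loop quoted above.
sub paragraphs_of {
    my ($body) = @_;
    return split /\n{2,}/, $body;
}

# Made-up sample: one leading "\n", a report, then trailing blank lines.
my $body = "\nAction: failed\nStatus: 5.0.0\n\n\n";

# The single leading "\n" is not a run of two or more, so it is never
# treated as a separator and stays attached to the first paragraph.
for my $para (paragraphs_of($body)) {
    (my $shown = $para) =~ s/\n/\\n/g;   # make the newlines visible
    print "[$shown]\n";
}
```

The trailing run of newlines does match, and split drops the empty
trailing field, so only the front of the string is affected.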

I guess I could do:
split /(?:\n{2,}|\A\n)/, $delivery_status_body

but probably the other workaround is better.
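
For what it's worth, a rough sketch of two ways to avoid the stray
newline. Two details of split matter here: capturing parentheses in the
pattern make split return the separators as extra fields, so a
non-capturing (?:...) group is safer, and a separator matched at the
very start of the string still yields an empty leading field. The helper
names and the sample string are made up for illustration, and the second
helper is only one possible approach, not necessarily the workaround
mentioned above:

```perl
use strict;
use warnings;

# Variant 1: also treat a single "\n" at the very start as a separator.
# (?:...) groups without capturing, so separators are not returned;
# grep drops the empty leading field that the match at \A produces.
sub split_with_anchor {
    my ($body) = @_;
    return grep { length } split /(?:\n{2,}|\A\n)/, $body;
}

# Variant 2: strip any leading newlines first, then split as before.
sub split_after_trim {
    my ($body) = @_;
    $body =~ s/\A\n+//;
    return split /\n{2,}/, $body;
}

# Made-up sample: one leading "\n" and two reports.
my $body = "\nAction: failed\nStatus: 5.0.0\n\n\nAction: delayed\n";

for my $splitter (\&split_with_anchor, \&split_after_trim) {
    my @paras = $splitter->($body);
    printf "%d paragraphs, first starts with: %s\n",
        scalar @paras, substr($paras[0], 0, 6);
}
```

Both variants hand back paragraphs with no leading newline, so each one
can be split on /\n/ without an empty first line.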

w


