[sf-perl] which first: remove non-data lines or process line continuations?

David Alban extasia at extasia.org
Tue Dec 11 12:28:08 PST 2007


greetings,

I'm parsing text files.  In these text files, a data line is any line
except a comment line, blank line, or null line.  I ignore all lines
that are not data lines.  That is, I process only lines not matching:

  m{ \A \s* (?: \# | \z ) }xms
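
To illustrate what that pattern treats as a non-data line, here is a minimal sketch.  The sample lines and the variable names ($NON_DATA, @samples, @data) are made up for illustration; only the pattern itself comes from the text above.

```perl
use strict;
use warnings;

# The non-data pattern from above, precompiled.
my $NON_DATA = qr{ \A \s* (?: \# | \z ) }xms;

# Hypothetical sample lines: two comments, a blank line, a bare
# newline, a null line, and two data lines.
my @samples = (
    "# comment\n",
    "   # indented comment\n",
    "   \n",
    "\n",
    q{},
    "foo : bar\n",
    "foo # trailing\n",
);

# Keep only the lines that do NOT match the non-data pattern.
my @data = grep { $_ !~ $NON_DATA } @samples;

print for @data;
```

Note that a '#' only makes a line a comment when nothing but whitespace precedes it, so "foo # trailing" still counts as data.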

I also allow line continuation.  That is, backslash-newline pairs are
deleted (after backslash-quoted backslashes are "protected").
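
One possible way to do the continuation step is sketched below.  It is not necessarily how my code does it: the helper name join_continuations is hypothetical, and instead of a separate "protect" pass it matches an escaped backslash first in a single substitution, so that a newline after a doubled backslash is left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the whole file as one string and deletes
# backslash-newline pairs.  The first alternative matches an escaped
# backslash (two backslashes) and puts it back unchanged, so its second
# backslash can never pair up with a following newline.
sub join_continuations {
    my ($text) = @_;
    $text =~ s{ ( \\\\ ) | \\ \n }{ defined $1 ? $1 : '' }gexms;
    return $text;
}

print join_continuations("foo \\\n: bat\n");
```

This sketch leaves the doubled backslash in the output; turning it back into a single backslash would be a later step.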

So I have a choice.  I can process removal of non-data lines first.
Or I can process line continuations first.

Take the following set of lines:

    foo \
    # : bar \
    : bat
    mumble \
    : squeak

If I process line continuations first, my data lines become:

    ( 'foo # : bar : bat', 'mumble : squeak' )

If I process removal of non-data lines first, I get:

    ( 'foo : bat', 'mumble : squeak' )
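
The two orderings can be compared side by side with a short sketch like the following.  The helper names (drop_non_data, join_continued) and the array names are hypothetical, the sample lines are written without their display indentation, and the continuation handling is only one plausible reading of the rule described above.

```perl
use strict;
use warnings;

my $NON_DATA = qr{ \A \s* (?: \# | \z ) }xms;

# Hypothetical helper: drop comment, blank, and null lines.
sub drop_non_data { return grep { $_ !~ $NON_DATA } @_ }

# Hypothetical helper: join continued lines.  Escaped backslashes are
# matched first and kept, so only a lone backslash-newline is deleted.
sub join_continued {
    my $text = join q{}, map { "$_\n" } @_;
    $text =~ s{ ( \\\\ ) | \\ \n }{ defined $1 ? $1 : '' }gexms;
    return split /\n/, $text;
}

# The sample lines from above (each '\\' here is a single backslash).
my @raw = (
    'foo \\',
    '# : bar \\',
    ': bat',
    'mumble \\',
    ': squeak',
);

my @joined_first   = drop_non_data( join_continued(@raw) );
my @filtered_first = join_continued( drop_non_data(@raw) );

print "continuations first: $_\n" for @joined_first;
print "non-data first:      $_\n" for @filtered_first;
```

With continuations first, the comment line is absorbed into the preceding data line.  With non-data lines first, the comment is dropped whole, along with its trailing backslash.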

I'm leaning toward removing the non-data lines first.  But I wanted to
see if anyone had any strong opinions or otherwise interesting
observations.

Thanks,
David
-- 
Live in a world of your own, but always welcome visitors.

