OxPM: Search and Extract

Thu Dec 5 06:43:33 CST 2002

On Thu 05 Dec 2002, Neil Hoggarth <neil.hoggarth at physiol.ox.ac.uk> wrote:
> You could set the input record separator ("$/", perldoc perlvar for
> info) to "<p>", then the kind of while(<>) loop that would normally
> process input line-by-line will work paragraph by paragraph.

Ooh, that's as cunning as a very cunning thing, and much simpler than
my overengineered solution.  Probably best to do it as a local,
though, saves hassle of setting it back.

# blah blah blah code
{
  local $/ = "<p>";
  # while loop and processing here
}
# more code, with the normal input record separator

Or you could live dangerously and assume that since your script
doesn't currently do any reading-in of data later on that it never
will (and that it isn't going to live on for ever and ever and get
edited/maintained by people who have no idea what $/ means and can't
be bothered to look it up[0]).

I was going to put a link here to the neat thing I saw on Perlmonks
that helps you remember which way round $/ and $\ go, but I can't
find it now.  Basically it used the mnemonic I/O and you have to
imagine a raindrop falling down the slash - if it's / then it'll fall
into I so you know $/ is the input record separator.  If it's \ then
it'll fall into O so you know $\ is the output record separator.
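
If it helps to see the two side by side, here is a small sketch that
uses both.  The semicolon-separated input and the in-memory filehandles
are made up for illustration.

```perl
use strict;
use warnings;

my $input  = "alpha;beta;gamma";    # hypothetical input, records split on ";"
my $output = '';

{
    local $/ = ";";     # $/ is the Input record separator: used by <$in> and chomp
    local $\ = "\n";    # $\ is the Output record separator: appended by every print
    open my $in,  '<', \$input  or die "Cannot open input: $!";
    open my $out, '>', \$output or die "Cannot open output: $!";
    while (my $record = <$in>) {
        chomp $record;            # strips the trailing ";"
        print {$out} $record;     # print adds "\n" for us via $\
    }
    close $in;
    close $out;
}
# $output now holds one record per line.
print $output;
```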

Other things I have used $/ for recently include reading in SQL
commands through the <DATA> filehandle - setting it to "\n\n" so I can
wrap my SQL commands nicely.  See for example
  http://search.cpan.org/src/KAKE/CGI-Wiki-0.05/lib/CGI/Wiki/Setup/MySQL.pm
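
Roughly, the idea looks like the sketch below.  This is not the code
from that module: the table definitions are invented, and I've used a
here-doc with an in-memory filehandle as a stand-in for the __DATA__
section so the example is self-contained.  Reading from <DATA> works
the same way.

```perl
use strict;
use warnings;

# Hypothetical SQL; the table and column names are made up.
my $sql_text = <<'END_SQL';
CREATE TABLE node (
  id   integer NOT NULL,
  name varchar(200)
);

CREATE TABLE content (
  node_id integer NOT NULL,
  text    text
);
END_SQL

my @statements;
{
    local $/ = "\n\n";    # one record per blank-line-separated block
    open my $fh, '<', \$sql_text or die "Cannot open in-memory handle: $!";
    while (my $statement = <$fh>) {
        $statement =~ s/\s+\z//;         # trim the trailing newlines
        next unless length $statement;
        push @statements, $statement;    # a real script might hand this to DBI
    }
    close $fh;
}
print scalar(@statements), " statements read\n";
```

Each statement can span as many lines as you like, as long as a blank
line separates it from the next one.  Setting $/ to "" (paragraph mode)
is a close relative that treats runs of blank lines as one separator.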

Kake
[0] 'perldoc perlvar' for the bemused - there, you have no excuse now.
Search it for INPUT_RECORD_SEPARATOR and you'll get the right section.
-- 
http://www.earth.li/~kake/cookery/ - vegan recipes, now with new search feature
http://grault.net/grubstreet/ - the open-source guide to London
http://www.penseroso.com/ - websites for the fine art and antique trade