SPUG: Reading a whole file into a scalar.

Duane Blanchard dblanchard at gmail.com
Wed Jul 20 23:28:34 PDT 2005


Thank you all. The community is a key part of what makes Perl rewarding.
I'm finally off to bed, having laid my XML to rest as well.

Duane

On 7/20/05, Adam Monsen <haircut at gmail.com> wrote:
> Not sure if what you mentioned is "best practice".
> 
> Here's the obligatory one-liner:
> 
> $ perl -0ne 'print "MATCH: $1\n" if m/(line.*that)/s' poo.html
> MATCH: line
> that
> 
> The -0 switch--documented in perldoc perlrun--specifies the input
> record separator. I didn't give it any digits, so the input record
> separator is the null character. Since there aren't any null
> characters in the file, the whole thing is sucked into $_. -n and -e
> are also documented in perldoc perlrun.
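
Inside a script, the same whole-file read can be sketched by undefining the
input record separator $/ for the duration of the read. This is a minimal
sketch, not the only way to do it: slurp_file and the temporary file are
hypothetical names made up for the example, and the file contents mirror the
sample from the original post. Unlike a bare -0, an undefined $/ (or -0777 on
the command line, per perldoc perlrun) reads the whole file even if it
contains null characters.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the entire contents of $path as one string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                 # undef separator: <$fh> reads to end of file
    my $contents = <$fh>;
    close $fh;
    return $contents;
}

# Build a throwaway file holding the sample from the original post.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "<file>\nthis is a line\nthat is a line\n</file>\n";
close $tmp_fh;

my $contents = slurp_file($tmp_path);

# \s matches the newline between the two lines, so no modifier is needed here.
my ($match) = $contents =~ m/(line\s+that)/;
print "MATCH: $match\n" if defined $match;
```

The "local" keeps the change to $/ confined to the helper, so line-by-line
reads elsewhere in the program are unaffected.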
> 
> But... I'm really an IO::All fan.
> 
> #!/usr/bin/perl -w
> use strict;
> use IO::All;
> my $contents = io('poo.html')->slurp;
> if ($contents =~ m/(line.*that)/s) {
>  print "MATCH! ... $1\n";
> }
> 
> I'm assuming that poo.html contains the XML-like markup example in
> your original post. 'poo.html' could be changed to a URL
> (http://example.com/poo.html) and IO::All would just do the right
> thing. Hm, well, let's just try it.
> 
> #!/usr/bin/perl -w
> use strict;
> use IO::All;
> my $contents = io('http://rafb.net/paste/results/GHgbSG94.txt')->slurp;
> if ($contents =~ m/(line.*that)/s) {
>  print "MATCH! ... $1\n";
> }
> 
> You'll need IO::All::LWP installed for that one to work.
> 
> When doing multiline matches, the only alternative to slurping in the
> file that I can think of is to make a mini state machine: when 'line'
> is found, switch states until 'that' is found. Good luck matching
> across hash or array elements.
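
The mini state machine described above might look roughly like the sketch
below. The helper name find_span and the state names are invented for
illustration, and the input is the sample from the original post read through
an in-memory filehandle. One difference from the slurp-and-regex approach:
this version stops at the first 'that' after the first 'line', whereas the
greedy m/(line.*that)/s runs to the last 'that' in the file.

```perl
use strict;
use warnings;

# Hypothetical helper: read one line at a time and return the text from the
# first 'line' through the next 'that', or nothing if no such span exists.
sub find_span {
    my ($fh) = @_;
    my $state  = 'SEEK_START';
    my $buffer = '';
    while (my $line = <$fh>) {
        if ($state eq 'SEEK_START') {
            # Keep everything from the first 'line' to the end of this line.
            next unless $line =~ m/(line.*)/s;
            my $rest = $1;
            # 'that' may already follow on the same line.
            return $1 if $rest =~ m/^(line.*?that)/s;
            $buffer = $rest;
            $state  = 'SEEK_END';
        }
        else {
            # Finish as soon as a line contains 'that'.
            return $buffer . $1 if $line =~ m/^(.*?that)/s;
            $buffer .= $line;
        }
    }
    return;    # reached end of input without completing the span
}

my $text = "<file>\nthis is a line\nthat is a line\n</file>\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $span = find_span($fh);
close $fh;

print "MATCH: $span\n" if defined $span;
```

Only the lines between the two markers are held in memory, which is the main
reason to prefer this over slurping when the file is very large.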
> 
> Other notes:
> * You're right about the 's'. The 's' modifier to the regular
> expression makes the . match everything, including newlines. The docs
> (perldoc perlop) say 's' causes the regex engine to "treat the entire
> string as one line", and in this case it means we can do a multiline
> match.
> * check out http://perlmonks.org ... I learned a ton from this site.
> Plus, it's fun. If you want to search it, use the crawler-friendly
> version. In Google, search for: "site:perlmonks.thepen.com keywords"
> (substituting your search terms for "keywords", of course). Some Perl
> heavyweights frequent this site.
> * are you parsing XML/HTML/etc. markup? If so, there are a ton of
> modules to make your life easier.
> * always use warnings, always use strict.
> * here's an article on IO::All ...
> http://www.perl.com/pub/a/2004/03/12/ioall.html
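
To make the note about the 's' modifier concrete, here is a small sketch that
contrasts it with the 'm' modifier mentioned in the original question. The
string is the two-line sample from the original post and the variable names
are arbitrary. It assumes nothing beyond core Perl.

```perl
use strict;
use warnings;

my $text = "this is a line\nthat is a line\n";

# Without /s, '.' does not match "\n", so the pattern cannot cross lines.
my $dot_plain = $text =~ m/line.*that/  ? 1 : 0;

# With /s, '.' matches "\n" too, so the pattern spans both lines.
my $dot_s     = $text =~ m/line.*that/s ? 1 : 0;

# \s already includes "\n", so this works with no modifier at all.
my $ws_plain  = $text =~ m/line\s+that/ ? 1 : 0;

# /m only changes where ^ and $ match: at every line, not just the string ends.
my @starts_m     = $text =~ m/^(\w+)/mg;
my @starts_plain = $text =~ m/^(\w+)/g;

print "dot without /s: $dot_plain\n";
print "dot with /s:    $dot_s\n";
print "\\s, no flags:   $ws_plain\n";
print "line starts with /m:    @starts_m\n";
print "line starts without /m: @starts_plain\n";
```

So for 'line\s+that' no modifier is required once the whole file is in one
scalar. /s matters for patterns that rely on '.', and /m matters for patterns
that rely on ^ or $.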
> 
> hope this helps,
> -Adam
> 
> On 7/20/05, Duane Blanchard <dblanchard at gmail.com> wrote:
> > I just found this. Is this the best practice?
> >
> > while ( <COLOURS> )
> > {
> >   $myfile = $myfile . $_;
> > }
> >
> > Duane
> >
> > On 7/20/05, Duane Blanchard <dblanchard at gmail.com> wrote:
> > > Hi gang,
> > >
> > > I'm too tired to think straight and too tired to keep looking on the
> > > 'Net. I want to match things like 'line\s+that' in the example file
> > > below.
> > >
> > > <file>
> > > this is a line
> > > that is a line
> > > </file>
> > >
> > > What has worn me out today is not realizing that I'll never match
> > > across lines of a file if I only read one line at a time. So, I either
> > > need a clever way to match across elements of an array or hash table,
> > > or (more likely) to read the whole file into a scalar. As I recall,
> > > I'll use the 'm' flag to hand the RE more than one line, and '\s'
> > > should handle '\n'.
> > >
> > > Someone, please give a little pointer. Thanks,
> > >
> > > D
> > > --
> > > Duane Blanchard
> > > 206.934.5873
> > >
> > > There are 10 kinds of people in the world;
> > > those who know binary and those who don't.
> > > _____________________________________________________________
> > > Seattle Perl Users Group Mailing List
> > >     POST TO: spug-list at pm.org
> > > SUBSCRIPTION: http://mail.pm.org/mailman/listinfo/spug-list
> > >    MEETINGS: 3rd Tuesdays, Location: Amazon.com Pac-Med
> > >    WEB PAGE: http://seattleperl.org/
> > >
> 
> 
> --
> Adam Monsen
> http://adammonsen.com/
> 



