[pm-h] reading a file as a string

Russell L. Harris rlharris at oplink.net
Sat May 19 20:38:43 PDT 2007


* Andy Lester <andy at petdance.com> [070519 21:35]:
> 
> On May 19, 2007, at 9:25 PM, Russell L. Harris wrote:
> 
> > Is there a preferred approach for copying an entire file into a string
> > variable, while preserving the record delimiters (the newline
> > character)?
> >
> > I have found two examples; is either of them a good approach?
> >
> >     open (FILE,$filename) || die "Cannot open '$filename': $!";
> >     undef $/;
> >     my $file_as_string = <FILE>;
> >
> >
> >     open (FILE,$filename) || die "Cannot open '$filename': $!";
> >     my $file_as_string = join '', <FILE>;
> 
> Of those two, choose the former.  The second one reads all the lines
> into an array, and then glomps them together into a big string.  The
> first one just reads into a string.
> 
> Do it this way:
> 
> my $file_as_string = do {
>      open( my $fh, $filename ) or die "Can't open $filename: $!";
>      local $/ = undef;
>      <$fh>;
> };
> 
> This lets you localize the $/ so that it gets set back outside the  
> scope of the block.  Otherwise, you might try to read from a file  
> somewhere else and not know that you changed $/.
> 
> Here's another way:
> 
> use File::Slurp qw( read_file );
> my $file_as_string = read_file( $filename );
 

Thanks for the quick response, Andy.  

After G. Wade's mentoring regarding the diamond operator, I dreamed up
the first approach:

    undef $/;
    my $file_as_string = <FILE>;

The second approach is something I ran across in the 4th edition of
"Learning Perl".  

My ultimate goal is to modify about a hundred document files by
tacking on a new head and a new tail to each.  The largest document
file is about 500 Kbytes; the head and tail each are less than a
kilobyte.  Here is the Perl script which I propose to use:

    $^I = ".bak";


    my $newhead = "newhead";
    open(NEWHEAD,$newhead) || die "failed to open input file $newhead : $!";
    undef $/;
    my $headstring = <NEWHEAD>;
    close(NEWHEAD) || die "failed to close input file $newhead : $!";


    my $newtail = "newtail";
    open(NEWTAIL,$newtail) || die "failed to open input file $newtail : $!";
    undef $/;
    my $tailstring = <NEWTAIL>;
    close(NEWTAIL) || die "failed to close input file $newtail : $!";


    my $bodystring  = '';
    my $newdocument = '';
    undef $/;
    while ($bodystring = <>)
       {
       $newdocument .= $headstring;
       $newdocument .= $bodystring;
       $newdocument .= $tailstring;
       print "$newdocument";
       $newdocument = '';
       }


I have tested the script on short dummy documents, but I wished to
make sure that I am not overlooking something which could corrupt the
document files.
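
For comparison, here is a minimal sketch of the same job with $/
localized as Andy suggests, so that the change cannot leak into other
reads.  The helper names slurp_file and wrap_files are hypothetical,
and the sketch assumes three-argument open, lexical filehandles, and
non-empty document files; it is one possible arrangement, not a
definitive version.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string, newlines intact.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";
    local $/;                       # slurp mode inside this sub only
    my $content = <$fh>;
    close($fh) or die "Cannot close '$path': $!";
    return $content;
}

# Hypothetical helper: wrap each document in place with the new head
# and tail, keeping the original of each as "<name>.bak".
sub wrap_files {
    my ($headfile, $tailfile, @files) = @_;
    return unless @files;           # with an empty @ARGV, <> would read STDIN

    my $headstring = slurp_file($headfile);
    my $tailstring = slurp_file($tailfile);

    # $^I and @ARGV are localized, so in-place editing is switched
    # off again when this sub returns.
    local ($^I, @ARGV) = ('.bak', @files);
    local $/;                       # each <> now returns one whole file
    while (my $bodystring = <>) {
        # While $^I is set, a plain print goes to the rewritten file.
        print $headstring, $bodystring, $tailstring;
    }
    return;
}
```

Calling wrap_files('newhead', 'newtail', @ARGV) from the command line
would then process every document named on the command line.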

RLH

