[pm-h] Perl merging many large files into one

G. Wade Johnson gwadej at anomaly.org
Mon Mar 31 05:35:55 PDT 2014


On Sun, 30 Mar 2014 21:00:46 -0700 (PDT)
"Michael R. Davis" <mrdvt92 at yahoo.com> wrote:

> Perl Folks,
> Can anyone tell me if the diamond operator is optimized in a print
> statement or does it really read the file into memory then print it? 

The short answer is that print puts the diamond operator in list
context, so it reads the entire file into memory and returns a list of
all of its lines.
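
The reason is context: print takes a list of arguments, so <$fh> is
evaluated in list context and readline returns every remaining line at
once. A quick sketch of the difference ("X" is just the placeholder
name from your example):

  use strict;
  use warnings;

  open my $fh, '<', 'X' or die "X: $!";  # 'X' is a placeholder name

  my $one_line  = <$fh>;  # scalar context: reads a single line
  my @all_lines = <$fh>;  # list context: slurps the rest of the file

  # print $out <$fh> supplies list context as well, so the whole file
  # is built up as an in-memory list before print sees any of it.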

> perl -e '
> use strict;
> use warnings;
> use Path::Class qw{file};
> my @files=qw{X Y Z}; #really large files
> my $out=file("out.txt")->openw;
> foreach my $file (@files) {
>   my $fh=file($file)->openr;
>   print $out <$fh>; # does this read to memory then print
>                     # or does it do something better?
> }
> '
>  
> Or do I really need to read line by line something like this...
>  
> perl -e '
> use strict;
> use warnings;
> use Path::Class qw{file};
> my @files=qw{X Y Z}; #really large files
> my $out=file("out.txt")->openw;
> foreach my $file (@files) {
>   my $fh=file($file)->openr;
>   my $line;
>   print $out $line while ($line=<$fh>);
> }
> '

Line by line is not as bad as it sounds; Perl buffers I/O internally
to avoid hitting the disk more often than necessary.
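
If the per-line overhead ever matters for really large files, another
option is to copy in fixed-size blocks instead of lines. A sketch,
where the 1 MB buffer size is an arbitrary choice and the file names
are the placeholders from your example:

  use strict;
  use warnings;

  my @files = qw(X Y Z);  # placeholder names
  open my $out, '>', 'out.txt' or die "out.txt: $!";
  for my $file (@files) {
      open my $in, '<', $file or die "$file: $!";
      # read() returns the number of bytes read, and 0 at end of file
      while (read($in, my $buf, 1024 * 1024)) {
          print {$out} $buf or die "write out.txt: $!";
      }
      close $in;
  }
  close $out or die "close out.txt: $!";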

The real question is: are you doing more than just concatenating the
files? If you're not doing anything else, there are more appropriate
tools (depending on your OS). If you are doing something else with the
lines, that might change how I would solve the problem.
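
If it really is plain concatenation but you would rather stay in Perl
than shell out to an OS tool, the core File::Copy module does the
chunked copying for you. A minimal sketch, again reusing the
placeholder names from your example:

  use strict;
  use warnings;
  use File::Copy qw(copy);

  my @files = qw(X Y Z);  # placeholder names
  open my $out, '>', 'out.txt' or die "out.txt: $!";
  binmode $out;           # copy() moves raw bytes
  for my $file (@files) {
      # copy() accepts a filename or an open filehandle on either
      # side; each copy continues writing where the previous stopped.
      copy($file, $out) or die "copy $file: $!";
  }
  close $out or die "close out.txt: $!";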

G. Wade

-- 
Make no decision out of fear.                     -- Bruce Sterling

