[Chicago-talk] Handling big files...

Steven Lembark lembark at wrkhors.com
Tue May 18 21:47:25 CDT 2004



>   I probably should have mentioned I am trying to get the data to/from
> a tape robot -- and that my only access to the robot is 2gig unix files.

If you have an autochanger then try:

	bzip2 -9 < $bigfile |
		split --bytes=$((1024*1024*1024*2 - 1)) --suffix-length=4

This generates a sequence of 2gig-minus-one-byte files named
"xaaaa" ... "xzzzz" (split's default output prefix is "x"), with
one runt at the end. The advantage here is avoiding the need for
a local copy of the whole .bz2 file (or .gz file if you use gzip)
alongside the split-up copies.
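Pulling the data back is just the reverse. Something like this
should do it (a sketch, assuming the pieces have been copied back
into the current directory under split's default names):

	# the shell glob sorts lexically, which matches split's
	# naming, so the pieces concatenate in the right order
	cat x???? | bunzip2 > $bigfile.restored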

If your files are remote from the server you can use Net::FTP to
download them into an open file handle that leads to a pipe.

	use Net::FTP;

	my $size = 2**31 - 1;

	# double quotes, so that $size interpolates into the command
	open my $fh, "| gzip --fast | split --suffix-length=4 --bytes=$size"
		or die "split: $!";

	my $ftp = Net::FTP->new( blah blah );

	# binary mode, or the transfer mangles the data
	$ftp->binary;

	$ftp->get( $remote_file, $fh );

	close $fh or die "close: $!";


This will slurp the file down from the remote server, squish it
on the fly, and split it into "xaaaa"-style pieces. After that you
can use the autochanger to spit the files onto tape.
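Getting the pieces onto the tape itself depends on your drive
setup. As a rough sketch, assuming a Linux st driver with a
non-rewinding device at /dev/nst0 (the device name and block size
here are assumptions, adjust for your hardware):

	# write each piece as its own file on the tape; closing the
	# non-rewinding device drops a filemark after each one
	for chunk in x????
	do
		dd if=$chunk of=/dev/nst0 bs=64k
	done

	# rewind and unload so the changer can swap tapes
	mt -f /dev/nst0 offline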

If you have ssh access you can run the gzip remotely and spit the
result into files on the fly:

    ssh $host "gzip --fast --stdout $bigfile" |
        split --suffix-length=4 --bytes=$((1024*1024*1024*2 - 1))

will do the trick. Note the --stdout: without it gzip compresses
$bigfile in place on the remote host instead of writing to the pipe.
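
Either way, check the round trip before trusting the tapes. A
quick sanity check, assuming md5sum is available on both ends
(any digest tool works):

	# digest of the original on the remote host...
	ssh $host "md5sum < $bigfile"

	# ...should match a digest of the reassembled stream
	cat x???? | gunzip | md5sum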



-- 
Steven Lembark                           9 Music Square South, Box 344
Workhorse Computing                                Nashville, TN 37203
lembark at wrkhors.com                                     1 888 359 3508


