[San-Diego-pm] odd chars in file "Killing" my console

Christopher Hahn xrz1138 at gmail.com
Wed Nov 10 18:56:12 PST 2010


Hey team,

I am trying to parse a huge (7 GB) file that is line-oriented but has
large sections that can contain arbitrary binary data.

(This is a p42svn dump file of a large Perforce repository.)

I tried several smarter things, but found that after running for a while
my console would just close... dead, gone:
============================
administrator@cmSVNDumper-09:/p42svn/testing$ ./p4dump-parse-new.pl
Killed
============================

I am sure that there are odd chars in the file that are doing this....

I tried setting binmode on the input file handle, and just loading the entire
file into a buffer, just as a test, as we have enough memory to do this.

The result:
===========================================
open(OUTF, ">SM_amanda_238037_fixed.dump")
  or die "Opening output file failed: $!";

open(INF, "SM_amanda_238037_bad.dump")
  or die "Opening input file failed: $!";
binmode INF;

my @buffer = <INF>;

print OUTF @buffer;

close(INF);
close(OUTF);
===========================================

I watched using "top" and after the memory used climbed to a tad
more than the size of the file, the "Killed" message appeared and the
console closed itself.
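
For comparison, here is a minimal sketch of the same copy done in fixed-size
chunks, so that only one chunk is held in memory at a time. It rests on an
assumption the post does not confirm: that the bare "Killed" message comes
from the process being killed for memory use (for example by the Linux
out-of-memory killer) rather than from any particular byte in the file.
Reading with `my @buffer = <INF>` creates one Perl scalar per line, so the
process can need noticeably more memory than the file size. The sub name
`copy_in_chunks` and the 1 MiB default chunk size are hypothetical choices
made for illustration.

```perl
use strict;
use warnings;

# Copy $src to $dst in fixed-size chunks so memory use stays bounded
# regardless of file size. Returns the number of bytes copied.
# NOTE: the sub name and the default chunk size are illustrative assumptions.
sub copy_in_chunks {
    my ($src, $dst, $chunk_size) = @_;
    $chunk_size ||= 1024 * 1024;    # 1 MiB per read

    # ':raw' has the same effect as calling binmode on the handle.
    open(my $in,  '<:raw', $src) or die "Opening input file failed: $!";
    open(my $out, '>:raw', $dst) or die "Opening output file failed: $!";

    my $total = 0;
    my $buf;
    while (1) {
        my $n = read($in, $buf, $chunk_size);
        die "Read failed: $!" unless defined $n;
        last if $n == 0;            # end of file
        print {$out} $buf or die "Write failed: $!";
        $total += $n;
    }

    close($in);
    close($out) or die "Closing output file failed: $!";
    return $total;
}
```

With the file names from the test above, this would be called as
`copy_in_chunks("SM_amanda_238037_bad.dump", "SM_amanda_238037_fixed.dump");`.
If the copy completes this way, the binary content itself is probably not
what kills the process.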

I have to stay at work until this is done, and so am just hoping that
someone is online and can give me the kick in the head that I need.

In any case, thanks for the attention,

Chris

--
Realisant mon espoir, je me lance vers la gloire.
Christopher Hahn == xrz1138 at gmail.com

