[San-Diego-pm] odd chars in file "Killing" my console
Shlomi Fish
shlomif at iglu.org.il
Thu Nov 11 01:30:36 PST 2010
On Thursday 11 November 2010 04:56:12 Christopher Hahn wrote:
> Hey team,
>
> I am trying to parse a huge (7 GB) file that is line oriented but has large
> sections that can contain any kind of binary character.
>
> (this is a p42svn dump file of a large perforce repository)
>
> I tried several smarter things, but found that after running for a while
> my console would just close....dead, gone:
> ============================
> administrator at cmSVNDumper-09:/p42svn/testing$ ./p4dump-parse-new.pl
> Killed
> ============================
>
> I am sure that there are odd chars in the file that are doing this....
>
> I tried setting binmode on the input file handle, and just loading the
> entire file into a buffer, just as a test, as we have enough memory to do
> this.
>
> The result:
> ===========================================
> open(OUTF, ">SM_amanda_238037_fixed.dump")
> or die "Opening output file failed: $!";
>
> open(INF, "SM_amanda_238037_bad.dump")
> or die "Opening input file failed: $!";
> binmode INF;
>
> my @buffer = <INF>;
>
Are you sure you want to load every line of a 7 GB file into an array? Perl
arrays carry a lot of per-element overhead, so doing this wastes a great deal
of memory. How much RAM do you have? You'll need much more than 7 GB for that.
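
As a rough sketch of the alternative, assuming all you need is to read the
dump and write a processed copy, you could handle one line at a time so that
only the current line is held in memory. The copy_lines name and the returned
line count are my own illustration, not something from your script:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): copy a file one
# line at a time, so only the current line is held in memory.
sub copy_lines {
    my ($in_path, $out_path) = @_;

    open(my $in, '<', $in_path)
        or die "Opening input file failed: $!";
    open(my $out, '>', $out_path)
        or die "Opening output file failed: $!";

    # Treat both handles as raw bytes, since the dump contains binary data.
    binmode $in;
    binmode $out;

    my $count = 0;
    while (my $line = <$in>) {
        # Any per-line processing would go here.
        print {$out} $line
            or die "Writing output file failed: $!";
        $count++;
    }

    close($in);
    close($out) or die "Closing output file failed: $!";
    return $count;
}

# Example call, using the file names from the quoted script:
# copy_lines("SM_amanda_238037_bad.dump", "SM_amanda_238037_fixed.dump");
```

Note that a binary section with no newline in it still comes back as one
long "line", so this only keeps memory low if the newlines are reasonably
frequent. If they are not, setting $/ to a reference to an integer (for
example $/ = \65536) makes the read operator return fixed-size chunks instead.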
Regards,
Shlomi Fish