[Chicago-talk] Performance issue
Jay Strauss
me at heyjay.com
Wed Apr 20 09:37:47 PDT 2011
Hi all,
I have a CSV file with quoted strings (i.e. "field1","field2",...). The
file is 3.5M records. I'm running Strawberry Perl on Windows 7 (not that I
think that's the issue). What I need to do is convert any embedded "|" to "-"
and convert the field delimiter ' "," ' to "|". I know there are CPAN modules
for parsing CSV, but my situation is pretty straightforward. I'm doing:
use strict;
while (<>) {
    $_ = substr $_, 1, -2;               # Remove first and last ", and remove
                                         # the \n at the same time
    s/\|/-/g;                            # Change embedded "|" into "-"
    my @words = split(/\",\"/, $_, -1);  # split on the remaining ","
    print join("|", @words), "\n";
}
But it's taking what seems like a long time to run (about 15 minutes). I'd
think this would be an ideal use for Perl, and that it could rip through the
file lickety-split.
Am I doing something costly in the script above that is making it run so
slowly?
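For comparison, here is a minimal sketch of the same per-line transformation
that avoids building an intermediate list: it uses tr/// for the
single-character replacement and one substitution for the delimiter, instead
of split and join. The helper name convert_line and the sample input are
hypothetical, not part of the script above. Whether this changes the runtime
noticeably on the real file is untested; it is only a variant to time against
the original.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): converts one
# fully quoted CSV line to pipe-delimited form.
sub convert_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the line ending (LF or CRLF)
    $line =~ s/\A"//;        # drop the leading quote
    $line =~ s/"\z//;        # drop the trailing quote
    $line =~ tr/|/-/;        # embedded "|" becomes "-" (must run before the next step)
    $line =~ s/","/|/g;      # field delimiter "," becomes "|"
    return $line;
}

# Made-up sample line, including an embedded "|" and an empty field.
print convert_line(qq{"a|b","c","","d"\n}), "\n";    # prints a-b|c||d
```

To run it over a whole file, the same calls could go inside the existing
while (<>) loop. The tr/// step has to come before the delimiter
substitution, otherwise the newly inserted "|" delimiters would be turned
into "-" as well.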
Thanks
Jay