[Chicago-talk] Performance issue

Jay Strauss me at heyjay.com
Wed Apr 20 20:37:52 PDT 2011


I stopped using pipes, started using file handles, and things sped up
nicely.
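
For anyone reading the archive, here is a minimal sketch of what that change might look like. The revised script was never posted, so this is an assumption and not the exact code: the sub names convert_line and convert_file, and the idea of passing both paths on the command line, are illustrative only.

```perl
use strict;
use warnings;

# Convert one record of the form "f1","f2",... into f1|f2|...
# Embedded "|" characters become "-", as in the original script.
sub convert_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the line ending
    $line =~ s/\A"//;        # drop the leading quote
    $line =~ s/"\z//;        # drop the trailing quote
    $line =~ tr/|/-/;        # embedded "|" -> "-"
    return join '|', split /","/, $line, -1;
}

# Read from and write to explicitly opened filehandles instead of
# STDIN/STDOUT. Both paths are supplied by the caller (hypothetical names).
sub convert_file {
    my ($in_path, $out_path) = @_;
    open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";
    while (my $line = <$in>) {
        print {$out} convert_line($line), "\n";
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
}

# Runs only when called with two arguments, e.g.:
#   perl convert.pl input.csv output.txt
convert_file(@ARGV) if @ARGV == 2;
```

The per-line logic is the same as before; only the I/O changes, from <> and a bare print to lexical filehandles opened with the three-argument open.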

Thanks
Jay

On Wed, Apr 20, 2011 at 12:03 PM, Warren Lindsey
<warren.lindsey at gmail.com> wrote:

> I assume from your use of <> and print without a filehandle that you are
> going through pipes, reading from STDIN and writing to STDOUT. I suspect
> opening input and output filehandles will be more efficient, with less data
> movement between buffers.
>
> Cheers,
> Warren
>
> On Apr 20, 2011, at 11:37 AM, Jay Strauss <me at heyjay.com> wrote:
>
> > Hi all,
> >
> > I have a CSV file with quoted strings (i.e. "field1","field2",...).  The
> > file is 3.5M records.  I'm running Strawberry Perl on Win7 (not that I think
> > that's the issue).  What I need to do is convert any embedded "|" to "-",
> > and convert the field delimiter ' "," ' to "|".  I know there are CPAN mods
> > for parsing CSV, but my situation is pretty straightforward.  I'm doing:
> >
> > use strict;
> >
> > while(<>) {
> >
> >       $_ = substr $_, 1, -2;  #       Remove first and last ", and remove
> >                               #       the \n at the same time
> >                               #
> >
> >       s/\|/-/g;               #       Change embedded "|" into "-"
> >
> >       my @words = split(/\",\"/,$_,-1);       # split on the remaining ","
> >
> >       print join("|", @words),"\n";
> > }
> >
> > But it's taking what seems like a long time to run (about 15 mins).  I'd
> > think this would be an ideal use for Perl, and that it could rip through
> > the file lickety-split.
> >
> > Am I doing something costly in the script above that is making it run so
> > slowly?
> >
> > Thanks
> > Jay
> >
> > _______________________________________________
> > Chicago-talk mailing list
> > Chicago-talk at pm.org
> > http://mail.pm.org/mailman/listinfo/chicago-talk