[Melbourne-pm] and the winner is C! (so far anyway, no big surprise)
toby.corkindale at strategicdata.com.au
Fri May 21 01:38:24 PDT 2010
On 21/05/10 18:23, Sam Watkins wrote:
> Here's the leaderboard of CSV readers in various languages, compared
> for a 100,000 line CSV file:
> C 1.00
> brace 1.16
> perl XS 11.33
> (bad) go 17.50
> scala 19.32
> perl 62.51
Umm, I think you're comparing the wrong results there; for the 100k
file, Perl only takes 1.1 seconds for me.
(However the C version only takes 0.11 seconds on that file!)
For the "big" 10 million row file, the C version takes 7.90 seconds on my
testbed, which puts it in the lead by a wide margin! (The next fastest
contender takes over a minute, and Perl takes 108 seconds.)
> I should fix the C / brace version to use fread not fgets for better
> correctness (allowing \n in quoted fields) and maybe to go a little faster.
> The printf output, even going to /dev/null, took more than half the time for
> the C code; so if we are testing just CSV reading, C is actually "more faster"
> than my figures indicate.
Someone else recently pointed this out -- the printf() is consuming the
majority of the time, apparently due to its flushing.
I'm going to modify the tests and give them another shot once I've
eliminated the buffer flushing.
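
For the C version, one way to do that might look like the sketch below. It
assumes the cost really does come from stdout being flushed too often
(line-buffered or unbuffered), which I haven't confirmed. The helper names
use_full_buffering() and print_rows() and the row format are made up for
illustration and are not taken from the actual test code; setvbuf() itself
is standard C.

```c
#include <stdio.h>

/* Hypothetical helper: switch a stream to full buffering so output is only
 * written when the buffer fills up or the stream is flushed/closed.
 * Passing NULL lets the C library allocate the buffer itself.
 * Must be called before any other I/O on the stream.
 * Returns 0 on success, non-zero on failure (same as setvbuf). */
int use_full_buffering(FILE *out, size_t size)
{
    return setvbuf(out, NULL, _IOFBF, size);
}

/* Hypothetical stand-in for the benchmark's output loop: prints `rows`
 * dummy CSV rows and returns the number of bytes written, or -1 on error. */
long print_rows(FILE *out, long rows)
{
    long written = 0;
    for (long i = 0; i < rows; i++) {
        int n = fprintf(out, "%ld,foo,bar\n", i);
        if (n < 0)
            return -1;
        written += n;
    }
    return written;
}
```

The idea would be to call use_full_buffering(stdout, 1 << 16) as the first
thing in main(), before anything is printed, and then time the run again.
If the numbers don't move, the flushing wasn't the bottleneck after all.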