[Melbourne-pm] and the winner is C! (so far anyway, no big surprise)
Toby Corkindale
toby.corkindale at strategicdata.com.au
Fri May 21 02:21:16 PDT 2010
On 21/05/10 18:54, Sam Watkins wrote:
> On Fri, May 21, 2010 at 06:38:24PM +1000, Toby Corkindale wrote:
>> On 21/05/10 18:23, Sam Watkins wrote:
>>> Here's the leaderboard of CSV readers in various languages, compared to C,
>>> for a 100,000 line CSV file:
>>>
>>> C 1.00
>>> brace 1.16
>>> perl XS 11.33
>>> (bad) go 17.50
>>> scala 19.32
>>> perl 62.51
>>
>>
>> Umm, I think you're comparing the wrong results there; for the 100k
>> file, Perl only takes 1.1 seconds for me.
>> (However the C version only takes 0.11 seconds on that file!)
>
> Yes, those figures were relative to the C version, that's why C is 1.00.
>
> 1.1 / 0.11 = 10 means perl XS gets a score of about 10 on your machine
> - it's 10 times slower than C.
>
> By the way, I compiled it something like this:
>
> gcc -pedantic -std=gnu99 -Wall -Wextra -O2 -o read-c read.c
>
>> For the "big" 10m row file, the C version takes 7.90 seconds on my
>> testbed, which definitely takes it into the lead, by far! (The next
>> fastest contender is over a minute, and perl takes 108 seconds)
>
> Yes, the C version is about 10 times faster than the next fastest
> (excluding brace, which is essentially just C anyway).
>
>> I'm going to modify the tests and give them another shot once I've
>> eliminated the buffer flushing..
>
> ok, cool. When I just commented out the printf I'm not sure how much other
> stuff the C compiler might have been discarding... "hey, he's not using this,
> no need to calculate it!"
Yeah!

I had a bit of a flamewar on another list about that. A guy insisted
that I should comment out the printf() in the Scala version, and then
compare its performance. Uh, but surely the intelligent JVM will
optimise out /heaps/ of stuff if I did that, and there'd be no way to
compare the output either.

The guy in question was yelling at me about how apparently I didn't know
what I was doing, and was crap at testing performance. I didn't think it
was that unreasonable to want all the tested programs to produce
identical output! Sheesh.
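
To illustrate the point: one common way to time the parsing without printing every row, while still stopping the compiler from throwing the work away, is to fold the parsed data into a checksum and print just that at the end. A rough C sketch follows. csv_checksum is a made-up name, not something from the read.c being benchmarked, and it splits naively on commas and newlines with no quote handling.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the benchmarked read.c.
 * Walks a CSV buffer, counting fields and folding every field byte into
 * a checksum. Assumes no quoted fields: ',' and '\n' always end a field.
 */
uint32_t csv_checksum(const char *data, size_t len, size_t *fields_out)
{
    uint32_t sum = 0;
    size_t fields = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)data[i];
        if (c == ',' || c == '\n')
            fields++;                 /* separator closes the current field */
        else
            sum = sum * 31u + c;      /* result depends on every byte parsed */
    }

    /* A final field with no trailing newline still counts. */
    if (len > 0 && data[len - 1] != '\n')
        fields++;

    if (fields_out != NULL)
        *fields_out = fields;
    return sum;
}
```

Printing the returned checksum and field count once at the end (one printf instead of one per row) keeps the result observable, so the optimiser has to keep the loop. It also gives a cheap sanity check that two runs parsed the same data.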

Curiously, I've made the Perl version use buffered IO and it seems to be
(very, very slightly) slower than the original, not faster! How odd.
I'm doing:

use IO::Handle;
my $output = IO::Handle->new->fdopen(fileno(STDOUT), 'w');
$output->autoflush(0);   # leave output buffered
...
$output->printf(...)
...
$output->close;

Does that seem right to everyone else?
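
For comparison, the stdio equivalent on the C side would look roughly like the sketch below. This assumes the C reader writes through printf(), as mentioned earlier in the thread. use_full_buffering is a hypothetical wrapper name, and the buffer size is an arbitrary choice, not a tuned value.

```c
#include <stdio.h>

/*
 * Hypothetical helper: ask stdio to fully buffer a stream, so output is
 * written in large chunks rather than line by line.
 * Should be called before any other I/O on the stream.
 * Returns 0 on success and nonzero on failure, as setvbuf() does.
 */
int use_full_buffering(FILE *stream, size_t size)
{
    /* NULL lets the C library allocate the buffer itself. */
    return setvbuf(stream, NULL, _IOFBF, size);
}
```

Calling use_full_buffering(stdout, 1 << 16) at the top of main() would be the usual place for it. When stdout is already a file or a pipe, stdio normally buffers fully by default, so this mainly matters when output goes to a terminal.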

Swapping to unbuffered I/O for Scala brought the time down from 89 to 67
seconds on the biggest file, and then some performance improvements
suggested by someone else got it down to a tiny 26 seconds! (And under
1 second for the small file.)

Poor old Perl is looking very sorry for itself now, at 111 seconds :(