[Phoenix-pm] PERL DBI

Loo, Peter # PHX Peter.Loo at source.wolterskluwer.com
Mon Mar 13 14:01:01 PST 2006


Hi,

I was wondering if auto commit is turned on by default.  Perhaps my
process is committing for each row.
 
Peter Loo
Wolters Kluwer Health
(602) 381-9553

-----Original Message-----
From: phoenix-pm-bounces+peter.loo=source.wolterskluwer.com at pm.org
[mailto:phoenix-pm-bounces+peter.loo=source.wolterskluwer.com at pm.org] On
Behalf Of Bill Nash
Sent: Monday, March 13, 2006 2:47 PM
To: Scott Walters
Cc: phoenix-pm at pm.org
Subject: Re: [Phoenix-pm] PERL DBI


You're very likely pressing up against DB execution overhead.

For starters, you're doing one insert per row. The database side
mechanics of this are:

1. Accept insert.
2. Update indexes.
3. Log query where applicable.
4. Platform specific spooky buffer action and related flushing.

Platform independent, if you're going to chuck a couple million rows
like this, disable your indexing first.

I don't know how you'd optimize this with Oracle, but mysql supports
bulk insert syntax, ala:

insert into foo (a, b, c, d, beer) values (1, 2, 3, 4, 'bar'), (3, 4, 5,
6, 'heineken'), ...;

Even with indexing enabled, this is far more efficient, as indexes are
updated en masse after the insert is accepted.

In psuedocode, something like:

prepare query
execute query
while iterating results,
 	push result set onto array.
 	increment counter.
 	if counter hits 5000,
 		bulk insert entries stored in array.
 		erase array.
 		reset counter.
 	end if
end while

I've gotten DBI on good hardware to iterate upwards of 20k rows a
second. 
Realistic insert rates are a different animal, because of the i/o (your
mileage may vary depending on media speeds), but 6-8k a second is not
unrealistic. Other factors include the engine type, the data i/o bus of
the machine, etc.

- billn

On Mon, 13 Mar 2006, Scott Walters wrote:

> On  0, "David A. Sinck" <phx-pm-list at grueslayer.com> wrote:
>>
>> HAHAHAHAHA.  Guess again.  There's an instance, at least with MySQL
>
> Okay, I don't use MySQL heavy (for reasons of it better garbage), so I

> over generalized.  I conceede your point that rare bugs will make this

> generalization not always true.
>
>> \_ and you don't suffer
>> \_ from SQL command length limitations,
>>
>> I bet I could could craft something that hits a buffer still using 
>> placeholders if I wanted.  :-)
>
> I didn't say he didn't suffer from *buffer* length considerations -- 
> that would be foolish indeed.  I only said that he doesn't suffer from

> this particular one.  I've seen databases cap the command buffer at
64k.
> Generally speaking (true in most cases, perhaps false in some) the 
> length-counted buffers for placeholder data will be significantly 
> larger than the command buffer.
>
>>
>> \_ and the database doesn't have to parse potentially \_ megs of data

>> just to parse the SQL command,
>>
>> Depends on the driver.  MySQL dbd, last I checked, actually sub'd the

>> values in before it got passed to the engine.  YMMV.
>
> That must have been a long time ago indeed.  But you caught me -- I 
> made another false generalization.  I should have said "Generally, "
> in front of it.
>
> By the way, people who argue small points while ignoring the point of 
> what was said annoy me.  Did you really think someone on the list 
> would be damaged thinking that placeholders always send data out of 
> band, even when using Informix?
>
> I wrote a hasty reply out of desire to be helpful to someone who was 
> having trouble with something, not to debate minutia, and have been 
> punished for it.
>
> -scott
>
>>
>> David
>> _______________________________________________
>> Phoenix-pm mailing list
>> Phoenix-pm at pm.org
>> http://mail.pm.org/mailman/listinfo/phoenix-pm
> _______________________________________________
> Phoenix-pm mailing list
> Phoenix-pm at pm.org
> http://mail.pm.org/mailman/listinfo/phoenix-pm
>
_______________________________________________
Phoenix-pm mailing list
Phoenix-pm at pm.org
http://mail.pm.org/mailman/listinfo/phoenix-pm


This E-mail message is for the sole use of the intended recipient(s) and
may contain confidential and privileged information.  Any unauthorized
review, use, disclosure or distribution is prohibited.  If you are not
the intended recipient, please contact the sender by reply E-mail, and
destroy all copies of the original message.


This E-mail message is for the sole use of the intended recipient(s) and may contain confidential and privileged information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If you are not the intended recipient, please contact the sender by reply E-mail, and destroy all copies of the original message.


More information about the Phoenix-pm mailing list