[Phoenix-pm] PERL DBI

Bill Nash billn at odyssey.billn.net
Mon Mar 13 13:53:59 PST 2006


Also, because of the linear nature of file seeks, serial insert rates drop 
as the size of your data store increases. I mentioned this in an earlier 
email, and it will vary by platform (Innodb, anything using raw disk 
hashing, won't have this issue as much. MyISAM is absolutely prone to it.)

- billn

On Mon, 13 Mar 2006, Loo, Peter # PHX wrote:

>
> Here is what it took to run:
>
> Mon Mar 13 13:05:34 MST 2006
> Mon Mar 13 14:31:27 MST 2006
>
> SQL> select count(*) from dssppv.tmp_falcon_projections;
>
>  COUNT(*)
> ----------
>   2413059
>
>
>
> Peter Loo
> Wolters Kluwer Health
> (602) 381-9553
>
> -----Original Message-----
> From: Scott Walters [mailto:scott at illogics.org]
> Sent: Monday, March 13, 2006 2:36 PM
> To: Loo, Peter # PHX
> Cc: phoenix-pm at pm.org
> Subject: Re: [Phoenix-pm] PERL DBI
>
>> This is sad.
>
> No, this is not sad.  Dump/undump programs are hand-tuned C designed to
> do exactly one thing and do it very quickly.  What you're doing would be
> nearly as slow in any language (or nearly as fast).  That aside, if it
> runs for
> 24 hours and you have to do it once, what are you crying about?  Was the
> project notice 48 hours to move from one database system to another?
> Are
> you doing daily syncs?  Go do something else.  Presumably the program
> prints status info.  If not, make it, just to detect possible
> pathological conditions.
>
> -scott
>
> On  0, "Loo, Peter # PHX" <Peter.Loo at source.wolterskluwer.com> wrote:
>>
>> Hi Scott,
>>
>> I kicked off the program at 13:05:34 and it is still running.  We are
>> only dealing with 2,000,000 rows.  This is not going to work.  I have
>> to think of something.  Dumping the data into and external table in
>> Neteeza on a NFS mount and using SQL_LOAD in Oracle will definitely be
> faster.
>> This is sad.
>>
>> Peter Loo
>> Wolters Kluwer Health
>> (602) 381-9553
>>
>> -----Original Message-----
>> From: Scott Walters [mailto:scott at illogics.org]
>> Sent: Monday, March 13, 2006 2:14 PM
>> To: Loo, Peter # PHX
>> Cc: phoenix-pm at pm.org
>> Subject: Re: [Phoenix-pm] PERL DBI
>>
>> Peter,
>>
>> The change in speed would be trivial then, I'm guessing, but there's
>> always Devel::Prof if you don't want a guess.
>>
>> You're suffering from round-trip time right now more than throughput,
>> I'm also guessing.
>>
>> -scott
>>
>> On  0, "Loo, Peter # PHX" <Peter.Loo at source.wolterskluwer.com> wrote:
>>>
>>> Scott,
>>>
>>> How about if I changed the fetchrow_array to fetchrow_arrayref?  I
>>> think that will help.  I am currently using the same database to
>>> test (Oracle production table to an Oracle temp table).  I am not
>>> using Neteeza (source database) yet.
>>>
>>> Peter Loo
>>> Wolters Kluwer Health
>>> (602) 381-9553
>>>
>>> -----Original Message-----
>>> From: Scott Walters [mailto:scott at illogics.org]
>>> Sent: Monday, March 13, 2006 1:29 PM
>>> To: Loo, Peter # PHX
>>> Cc: phoenix-pm at pm.org
>>> Subject: Re: [Phoenix-pm] PERL DBI
>>>
>>> Hi Peter,
>>>
>>> Was it running slow before?  I can't imagine that this would make it
>
>>> slower...
>>>
>>> You could get a little parallelism if you used threads and made a
>>> Thread::Queue to pass rows from a reader thread to a writer
> thread...
>>> but Perl threads are pretty grumpy.
>>>
>>> Just a thought.
>>>
>>> Cheers,
>>> -scott
>>>
>>> On  0, "Loo, Peter # PHX" <Peter.Loo at source.wolterskluwer.com>
> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> Alright, I got it working, but it is running so slow.  I am
>>>> wondering if there is anyway to speed this up?  Here is the syntax
>
>>>> I
>> am using:
>>>>
>>>>     $bindVars = join ', ', ('?') x @column_names;
>>>>     $d_sth = $d_dbh->prepare(qq{insert into
>>>> dssppv.tmp_falcon_projections ($columnNames)
>>>>                               values ($bindVars)}) || die
>>>> "$DBI::errstr\n";
>>>>     $s_sth = $s_dbh->prepare("select * from
>>>> dssppv.p_falcon_projections") || die "$DBI::errstr\n";
>>>>     $s_sth->execute() || die "$DBI::errstr\n";
>>>>     while ( my @row = $s_sth->fetchrow_array ) {
>>>>       $d_sth->execute(@row);
>>>>       }
>>>>
>>>>
>>>>
>>>> Peter Loo
>>>> Wolters Kluwer Health
>>>> (602) 381-9553
>>>>
>>>> -----Original Message-----
>>>> From: Scott Walters [mailto:scott at illogics.org]
>>>> Sent: Monday, March 13, 2006 10:30 AM
>>>> To: Loo, Peter # PHX
>>>> Cc: phoenix-pm at pm.org
>>>> Subject: Re: [Phoenix-pm] PERL DBI
>>>>
>>>> Hi Peter,
>>>>
>>>> Use placeholders:
>>>>
>>>>     my $question_marks = join ', ', '?' x @column_names;  # one ?
>>>> for each value
>>>>     my $sth = $dbh->prepare(qq{insert into foo (@column_names)
>>>> values
>>>> ($question_marks)}) or die $dbh->errstr;
>>>>     while(my @row = $old_database->fetchrow_array) {  # read from
>>>> old database
>>>>          $sth->execute(@row); # insert into new database
>>>>     }
>>>>
>>>> If you use placeholders, you never need to quote or escape data,
>>>> and
>>
>>>> you don't suffer from SQL command length limitations, and the
>>>> database
>>>
>>>> doesn't have to parse potentially megs of data just to parse the
>>>> SQL
>>
>>>> command, and it doesn't have to parse the SQL more than once at
>>>> all
>>>> --
>>>
>>>> you prepare it once and then execute it as many times as you like
>>>> for a huge speed increase.  If you're loading a lot of data,
>>>> you'll also want to drop indexes first and the rebuild them later.
>
>>>> This is
>>
>>>> a huge
>>>
>>>> time savings over incrementally updating them for millions of
>> records.
>>>>
>>>> I might have missed the point of your question -- if so, please
>>>> clarify, and I aplogize.
>>>>
>>>> Cheers,
>>>> -scott
>>>>
>>>> On  0, "Loo, Peter # PHX" <Peter.Loo at source.wolterskluwer.com>
>> wrote:
>>>>>
>>>>>    Hi,
>>>>>
>>>>>
>>>>>            $columnsToAssign = $columnsToAssign . ", \"" .
>>>> @$d_arrayref[0]
>>>>>    . "\"";
>>>>>            }
>>>>>          else {
>>>>>            $columnsToAssign = $columnsToAssign . ", " .
>>>> @$d_arrayref[0];
>>>>>            }
>>>>>
>>>>>    Is there an easier way?
>>>>>
>>>>>
>>>>>
>>>>>    Thanks.
>>>>>
>>>>>
>>>>>
>>>>>    Peter Loo
>>>>>
>>>>>
>>>>
>>>>
>>>> This E-mail message is for the sole use of the intended
>>>> recipient(s)
>>
>>>> and may contain confidential and privileged information.  Any
>>>> unauthorized review, use, disclosure or distribution is
> prohibited.
>>>> If you are not the intended recipient, please contact the sender
>>>> by reply E-mail, and destroy all copies of the original message.
>>>>
>>>>
>>>> This E-mail message is for the sole use of the intended
>>>> recipient(s)
>>> and may contain confidential and privileged information.  Any
>>> unauthorized review, use, disclosure or distribution is prohibited.
>>> If you are not the intended recipient, please contact the sender by
>>> reply E-mail, and destroy all copies of the original message.
>>>
>>>
>>> This E-mail message is for the sole use of the intended recipient(s)
>
>>> and may contain confidential and privileged information.  Any
>>> unauthorized review, use, disclosure or distribution is prohibited.
>>> If you are not the intended recipient, please contact the sender by
>>> reply E-mail, and destroy all copies of the original message.
>>>
>>>
>>> This E-mail message is for the sole use of the intended recipient(s)
>> and may contain confidential and privileged information.  Any
>> unauthorized review, use, disclosure or distribution is prohibited.
>> If you are not the intended recipient, please contact the sender by
>> reply E-mail, and destroy all copies of the original message.
>>
>>
>> This E-mail message is for the sole use of the intended recipient(s)
>> and may contain confidential and privileged information.  Any
>> unauthorized review, use, disclosure or distribution is prohibited.
>> If you are not the intended recipient, please contact the sender by
>> reply E-mail, and destroy all copies of the original message.
>>
>>
>> This E-mail message is for the sole use of the intended recipient(s)
> and may contain confidential and privileged information.  Any
> unauthorized review, use, disclosure or distribution is prohibited.  If
> you are not the intended recipient, please contact the sender by reply
> E-mail, and destroy all copies of the original message.
>
>
> This E-mail message is for the sole use of the intended recipient(s) and
> may contain confidential and privileged information.  Any unauthorized
> review, use, disclosure or distribution is prohibited.  If you are not
> the intended recipient, please contact the sender by reply E-mail, and
> destroy all copies of the original message.
>
>
> This E-mail message is for the sole use of the intended recipient(s) and may contain confidential and privileged information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If you are not the intended recipient, please contact the sender by reply E-mail, and destroy all copies of the original message.
> _______________________________________________
> Phoenix-pm mailing list
> Phoenix-pm at pm.org
> http://mail.pm.org/mailman/listinfo/phoenix-pm
>


More information about the Phoenix-pm mailing list