[Pdx-pm] DBI middle ground

Ben Prew btp at cpan.org
Thu Feb 1 07:29:01 PST 2007


I agree, it sounds like you are looking at each row, and you should
try and build one or two update statements that operate on the entire
set, instead of individual rows.

On 1/29/07, Tkil <tkil at scrye.com> wrote:
> >>>>> "MR" == Michael Rasmussen <mikeraz at patch.com> writes:
>
> MR> I'm querying a large data set.  The current code works something like:
> MR>    while( @lease_information = $lease->fetchrow_array) {
> MR>          modification of data, plugging in defaults where currently null
> MR>          ensuring constraints are met
> MR>          # another_db_sth is an update command
> MR>          $another_db_sth->execute(@lease_information);
> MR>    }
>
> MR> I don't have control over the source database, so the changes need
> MR> to be made in my code.
>
> MR> This takes longer to execute than I'd like.  I can't just do a
> MR> fetchall or selectall because the dataset is larger than my
> MR> available memory.  Nor have I found a middle ground.
>
> MR> Is there one?  If so, that I may study up, what are references to
> MR> it?
>
> A few things to check:
>
> 1. Are you using autocommit?  That can be really painful in this
>    situation.  My usual technique is to turn off autocommit, then
>    explicitly commit() every so often (either by time, or by number of
>    DML statements, or by number of rows affected by those DML, it
>    varies.)
>
> 2. It'll almost always be faster to do this work in the DB, if you
>    can.  You should be able to use MySQL's IFNULL() or Oracle's NVL()
>    to fill in the blanks, and hopefully you can use standard WHERE
>    conditions to check constraints.  (Although the lack of proper
>    regex support in most DBs is painful.)
>
> 3. Profile your code locally, to see if it's CPU bound somehow.  As
>    others pointed out, binding values with "fetch" is advertised as
>    the fastest way to get data through the DBI; doing some local
>    profiling might point fingers.  (There's also the DBI trace
>    methods, which can use high-resolution timing (i think?) to give
>    you further insight.)
>
> 4. Talk to your DBA, if you have one.  It might be that your update
>    can be tuned, or it might be much faster to load up a temp table
>    and then do the update in one swell foop.
>
> Happy hacking,
> t.
> _______________________________________________
> Pdx-pm-list mailing list
> Pdx-pm-list at pm.org
> http://mail.pm.org/mailman/listinfo/pdx-pm-list
>


-- 
--Ben


More information about the Pdx-pm-list mailing list