[Chicago-talk] Speeding up DBI::Pg do,commit (UPSERT)

John Kristoff jtk at depaul.edu
Tue Jan 19 16:02:37 PST 2016


Hello friends,

I'm not using the most current version of PostgreSQL, so instead of a
native UPSERT SQL statement, some of my database (DBD::Pg) statements
are implemented using a technique like the one described here (with the
SAVEPOINT feature):

  <http://www.depesz.com/2012/06/10/why-is-upsert-so-complicated/>
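
For concreteness, in DBI terms the technique looks roughly like this
(the table and column names are made up; this is just a sketch, not my
actual code):

  # Sketch of the savepoint-based upsert, assuming a hypothetical table
  # "tally (k TEXT PRIMARY KEY, n INTEGER)".
  use DBI;

  my $dbh = DBI->connect('dbi:Pg:dbname=test', '', '',
      { AutoCommit => 0, RaiseError => 1 });

  sub upsert {
      my ($k, $n) = @_;
      my $updated = $dbh->do(
          'UPDATE tally SET n = n + ? WHERE k = ?', undef, $n, $k);
      return if $updated > 0;           # row existed, nothing more to do
      $dbh->pg_savepoint('upsert');     # guard against a duplicate-key error
      eval {
          $dbh->do('INSERT INTO tally (k, n) VALUES (?, ?)',
                   undef, $k, $n);
          $dbh->pg_release('upsert');
      };
      if ($@) {
          # another session inserted first; roll back to the savepoint
          # and fall back to an UPDATE
          $dbh->pg_rollback_to('upsert');
          $dbh->do('UPDATE tally SET n = n + ? WHERE k = ?', undef, $n, $k);
      }
  }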

I have to roll through a loop a fairly large number of times (maybe a
million or two), which on my system can take a significant amount of
time.  NYTProf shows an extensive amount of time is taken up with calls
to DBI::db::do and DBD::db::commit.
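
To make concrete what NYTProf is pointing at, the shape of the hot loop
is roughly this (names are made up; upsert() and $dbh are from the
sketch above), i.e. one do()-driven upsert plus one commit() per row:

  # one upsert and one commit per input row, a million or two times
  for my $row (@rows) {
      upsert($row->{k}, $row->{n});
      $dbh->commit;
  }

Part of what I'm wondering is whether committing every few thousand
rows instead of every row is a sane middle ground, or whether any real
gain has to come from somewhere else.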

I might be able to optimize my code by assuming there will be no race
condition: risk doing a plain INSERT if an UPDATE fails, as shown at
the top of the page linked above, and ensure that this is the only
process that will ever write to the table and that it only runs once.
Dangerous perhaps, but I could do this and performance would probably
improve.  I might go that route if no better options to improve
performance are available.
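
Concretely, that riskier variant would just drop the savepoint, e.g.
(again only a sketch, using the same made-up table as above, with $dbh
connected as before):

  # Unsafe variant: no savepoint, so this is only reasonable if this
  # process is the single writer and the INSERT cannot race with
  # another session inserting the same key.
  sub upsert_unsafe {
      my ($k, $n) = @_;
      my $updated = $dbh->do(
          'UPDATE tally SET n = n + ? WHERE k = ?', undef, $n, $k);
      return if $updated > 0;    # row already existed
      $dbh->do('INSERT INTO tally (k, n) VALUES (?, ?)',
               undef, $k, $n);
  }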

I cannot easily convert this process to a bulk COPY since I'd
potentially be updating some rows and not just inserting them.  I'd
prefer to have a single table with rows that I can just update if
possible.

So my question is: short of making fundamental database changes, does
anyone have any general strategies for doing UPSERTs in a
performance-efficient way, better than what I'm currently using, of
course?

Thank you,

John

