From nagler at bivio.biz Mon Apr 5 16:31:27 2010
From: nagler at bivio.biz (Rob Nagler)
Date: Mon, 5 Apr 2010 17:31:27 -0600
Subject: [Boulder.pm] Multi-process caching
Message-ID: <19386.29391.564786.221342@r1.bivio.biz>

We are having some performance issues for some of our larger apps. We
have a Postgres table (Bivio::Biz::Model::RealmRole) which needs to
get access frequently, and is written relatively infrequently. We
need some sort of in memory cache.

I took a look at the IPC::* modules, and none of them seems so
brilliant. Before I go diving into performance testing each of them,
I was wondering if anybody on this list had opinions about which is
the most reliable.

We need to get to <100ms for accessing the data, which for our larger
apps, is about 500KB serialized with Data::Dumper. Deserialization is
too expensive (250ms).

We are also thinking about going with a *dbm module or perhaps
Berkeley DB. There are about 800K rows, which is really a two-level
tree (realm, role), which is why the serialization is so compact. I
don't know how dbm/DB work with this size of data.

So I'm looking for experience with these technologies from a
reliability (first) and performance (second) standpoint.

Thanks,
Rob

From devin.austin at gmail.com Mon Apr 5 18:06:40 2010
From: devin.austin at gmail.com (Devin Austin)
Date: Mon, 5 Apr 2010 19:06:40 -0600
Subject: [Boulder.pm] Multi-process caching
In-Reply-To: <19386.29391.564786.221342@r1.bivio.biz>
References: <19386.29391.564786.221342@r1.bivio.biz>
Message-ID:

On Mon, Apr 5, 2010 at 5:31 PM, Rob Nagler wrote:

> We are having some performance issues for some of our larger apps. We
> have a Postgres table (Bivio::Biz::Model::RealmRole) which needs to
> get access frequently, and is written relatively infrequently. We
> need some sort of in memory cache.
>
> I took a look at the IPC::* modules, and none of them seems so
> brilliant. Before I go diving into performance testing each of them,
> I was wondering if anybody on this list had opinions about which is
> the most reliable.
>
> We need to get to <100ms for accessing the data, which for our larger
> apps, is about 500KB serialized with Data::Dumper. Deserialization is
> too expensive (250ms).
>
> We are also thinking about going with a *dbm module or perhaps
> Berkeley DB. There are about 800K rows, which is really a two-level
> tree (realm, role), which is why the serialization is so compact. I
> don't know how dbm/DB work with this size of data.
>
> So I'm looking for experience with these technologies from a
> reliability (first) and performance (second) standpoint.
>
> Thanks,
> Rob
>
> _______________________________________________
> Boulder-pm mailing list
> Boulder-pm at pm.org
> http://mail.pm.org/mailman/listinfo/boulder-pm
>

What are you hitting the db with? Are you using an orm? straight dbi?
etc.

--
Devin Austin
http://www.codedright.net
9702906669 - Cell

From matt at beldyk.org Mon Apr 5 18:40:01 2010
From: matt at beldyk.org (Matthew Beldyk)
Date: Mon, 5 Apr 2010 19:40:01 -0600
Subject: [Boulder.pm] Multi-process caching
In-Reply-To:
References: <19386.29391.564786.221342@r1.bivio.biz>
Message-ID:

Have you looked at Storable? I've never used it, but it appears to be
included in the Perl core now, so I assume it might be reasonable:

"The heart of Storable is written in C for decent speed. Extra
low-level optimizations have been made when manipulating perl
internals, to sacrifice encapsulation for the benefit of greater
speed."

Also at one point of time I had a similar issue, we ended up creating
a custom binary file format that allowed us to retrieve the data much
faster than directly from the database. We used a C program to read
the files (although forked from Perl, the fork didn't account for much
of the overhead). Just using unpack might have been an option, but at
the time we had far more binary experience with C than with Perl (and
I hypothesize a C read() is faster than a Perl unpack(), but I've
never tested it.)

-Matt

--
Calvin: Know what I pray for?
Hobbes: What?
Calvin: The strength to change what I can, the inability to accept
what I can't, and the incapacity to tell the difference.

From nagler at bivio.biz Mon Apr 5 18:59:24 2010
From: nagler at bivio.biz (Rob Nagler)
Date: Mon, 5 Apr 2010 19:59:24 -0600
Subject: [Boulder.pm] Multi-process caching
In-Reply-To:
References: <19386.29391.564786.221342@r1.bivio.biz>
Message-ID:

Hey Matt,

Thanks for the reminder. Yes, Storable is indeed much faster. Quick
tests indicate 15ms for freeze and 11ms for thaw, and that's with
writing/reading files (in kernel cache, but that's likely to be where
the file will be anyway). The file isn't much smaller (200KB vs
500kB), but as I noted, it's the serialization that's the problem.

I think I'll implement our cache with that to see what happens.

Thanks,
Rob
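
For anyone who wants to repeat the kind of quick test Rob describes, here is a minimal sketch of a file-backed Storable cache. It is not Rob's code: the sample data, the sub names (build_sample, write_cache, read_cache) and the file name are hypothetical, and the write-then-rename step assumes a POSIX filesystem where rename() within one directory is atomic, so readers in other processes never see a half-written file. Timings will vary by machine and data size.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Storable qw(nstore retrieve);
use Time::HiRes qw(time);

# Hypothetical stand-in for the RealmRole data: a two-level tree,
# realm id => role id => integer value.
sub build_sample {
    my($realms, $roles) = @_;
    my($tree) = {};
    foreach my $realm (1 .. $realms) {
        foreach my $role (1 .. $roles) {
            $tree->{$realm}->{$role} = $realm * $roles + $role;
        }
    }
    return $tree;
}

# Serialize to a temporary file, then rename it into place so that
# concurrent readers see either the old file or the complete new one.
sub write_cache {
    my($tree, $file) = @_;
    my($tmp) = "$file.$$";
    nstore($tree, $tmp) || die("nstore $tmp: $!");
    rename($tmp, $file) || die("rename $tmp -> $file: $!");
    return;
}

# Deserialize the whole tree from the cache file.
sub read_cache {
    my($file) = @_;
    return retrieve($file);
}

my($dir) = tempdir(CLEANUP => 1);
my($file) = "$dir/realm_role.sto";
my($tree) = build_sample(1000, 8);

my($t0) = time();
write_cache($tree, $file);
my($t1) = time();
my($copy) = read_cache($file);
my($t2) = time();

printf("store: %.1f ms, retrieve: %.1f ms, size: %d bytes\n",
    ($t1 - $t0) * 1000, ($t2 - $t1) * 1000, -s $file);
```

Each reading process would call read_cache() when it notices the file has changed (for example by comparing its mtime), and the single writer would call write_cache() after the table is updated. How to detect changes is left open here, since the thread does not say.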