[sf-perl] DB design question
Josh Berkus
josh at agliodbs.com
Tue Aug 1 15:14:14 PDT 2006
Rich,
> This is one of those questions where there is no shortage of
> answers, but some are likely to be a lot better (for various
> reasons) than the others. So, I'm hoping for suggestions,
> rationales, etc.
Per previous discussions, you're up against your desire to use a completely
anonymous and flexible database structure (the EAV setup) and the need for
performance. Billions of entities is pushing the envelope for relational
db performance, so you are at the point where your design may center
around performance and not around the logical, or normalized, design.
In fact, given how little use you'll be making of relational features, I
would investigate how well object databases would hold up to your needs.
They probably won't -- last I checked, most ODBs use hash indexes
exclusively -- but it's worth a check. Or possibly there's another
specialized database type.
If I were doing your project in Postgres, it would require some hacking.
Here's what I'd do:
1) extend the intarray module to support the new Generic Inverted Indexes
available in PostgreSQL 8.2 (alpha);
2) improve GIN to support compression per Teodor's notes;
3) Create a denormalized table to hold entites and attributes:
entity (
name text not null primary key
attributes intarray
)
then you can do queries like:
select name from entities where attribues @@ ARRAY[57,201,399,18] >= 3;
.... for "get me all entities which have at least three of the four
attributes 57, 201, 399 and 18".
However, make no mistake: the project you're lanuching will be technically
challenging and may require you to learn to hack RDBMS code to get
acceptable performance (or to hire someone who can).
--
--Josh
Josh Berkus
PostgreSQL @ Sun
San Francisco
More information about the SanFrancisco-pm
mailing list