[LA.pm] Anthony Curtis presents Perl Stored Procedures for MySQL
jbrown at reachlocal.com
Wed Aug 19 12:02:18 PDT 2009
>> If I need complex processing, I'll normally create aggregate tables,
>> which are then simple tables that are accessed with simple queries.
>> Now, this doesn't necessarily contradict the idea of doing data
>> processing as close to where the data lives as possible.
> Actually, it does.
Doing no processing at query time (doing it offline, so to speak, by creating
aggregate tables) is still faster than doing the processing in the db at
query time. That is, it's often more efficient to do the work once, offline.
Whether that work is done in the db or slightly above it, it's most likely
still faster than doing the work in the db but at query time.
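
A minimal sketch of the aggregate-table idea described above, using Python's built-in sqlite3 module as a stand-in for the real database. The table names (orders, daily_totals) and the helper functions are hypothetical, chosen only for illustration; nothing here comes from the talk or from MySQL itself.

```python
import sqlite3


def build_daily_totals(conn):
    """Rebuild the aggregate table from the raw rows (the offline step)."""
    with conn:  # commit on success, roll back on error
        conn.execute("DROP TABLE IF EXISTS daily_totals")
        conn.execute(
            "CREATE TABLE daily_totals AS "
            "SELECT day, COUNT(*) AS n_orders, SUM(amount) AS total "
            "FROM orders GROUP BY day"
        )


def total_for_day(conn, day):
    """Query-time path: a simple lookup against the small aggregate table."""
    row = conn.execute(
        "SELECT total FROM daily_totals WHERE day = ?", (day,)
    ).fetchone()
    return row[0] if row else None


# Hypothetical raw data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2009-08-18", 10), ("2009-08-18", 5), ("2009-08-19", 7)],
)

build_daily_totals(conn)                  # done once, offline
print(total_for_day(conn, "2009-08-18"))  # cheap lookup at query time
```

Note that the GROUP BY still runs inside the database here; only the timing moves from query time to an offline rebuild.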
From: losangeles-pm-bounces+jbrown=reachlocal.com at pm.org
[mailto:losangeles-pm-bounces+jbrown=reachlocal.com at pm.org] On Behalf Of
Sent: Wednesday, August 19, 2009 11:49 AM
To: Aran Deltac
Cc: losangeles-pm at pm.org; Ask Bjørn Hansen
Subject: Re: [LA.pm] Anthony Curtis presents Perl Stored Procedures for MySQL
On Wed, Aug 19, 2009 at 11:22:21AM -0700, Aran Deltac wrote:
> > Here's how it goes, over and over and over again: when MySQL doesn't
> > have it, it's fluff and nobody could possibly want such frippery,
> > let alone need it. When they get some kind of nonstandard, buggy,
> > hemipygian implementation, it's suddenly the greatest thing and you
> > can't live without it.
> > As to this, "storage layer" business, that's what we call the stuff
> > at the other end of the SCSI (alternate spelling: SAS) cable, or if
> > you're unlucky, the network cable.
> > Jim Gray
> > <http://en.wikipedia.org/wiki/Jim_Gray_%28computer_scientist%29>
> > measured this back in 2003, and those metrics have moved even
> > further toward his conclusion, which was essentially, "do all the
> > processing you can as close to where the data lives as you can arrange."
> > http://research.microsoft.com/apps/pubs/default.aspx?id=70001
> Good info, thanks.
Clearly you didn't actually read it, or if you did, you didn't understand it.
> But, I have to agree, the less you do *in* the database, and the more
> you can shrug off processing to other parts of the system, the better.
The more processing you do *as close as possible* to where the data is
actually stored, the better off you are. Read the paper.
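
To make the point above concrete, here is a rough sketch contrasting the two approaches, again with Python's built-in sqlite3 module. The events table and both function names are hypothetical. An in-memory SQLite database has no network hop, so the row count returned to the caller is used only as a proxy for how much data would have to cross the wire with a client/server RDBMS.

```python
import sqlite3


def total_in_client(conn):
    """Pull every row to the client, then add them up in application code."""
    rows = conn.execute("SELECT amount FROM events").fetchall()
    return sum(amount for (amount,) in rows), len(rows)


def total_in_db(conn):
    """Let the database do the sum; only one row comes back."""
    rows = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM events"
    ).fetchall()
    return rows[0][0], len(rows)


# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (amount INTEGER)")
conn.executemany("INSERT INTO events VALUES (?)", [(i,) for i in range(1000)])

client_total, client_rows = total_in_client(conn)
db_total, db_rows = total_in_db(conn)
print(client_total == db_total)  # same answer either way
print(client_rows, db_rows)      # rows handed back to the caller in each case
```

Both functions return the same total; the difference is how many rows leave the database to get it.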
> I like to treat my database as a very fast flat file storage engine
> that does very little processing for me.
Yes, that's a common mistake, but that it's common doesn't make it not be a
mistake. OO coders are especially prone to this mistake, but it's far from
unknown among other kinds of coders who don't understand what an RDBMS is or
what it does.
> If I need complex processing, I'll normally create aggregate tables,
> which are then simple tables that are accessed with simple queries.
> Now, this doesn't necessarily contradict the idea of doing data
> processing as close to where the data lives as possible.
Actually, it does.
> It just rules out the database itself - there are other ways to get
> close to the DB. And, of course, there is no one-ring-to-rule them
> all - sometimes you just have to do it in the database.
You've got your assumptions backwards. There may be times when your RDBMS
simply can't do the work for you, say if you've got a broken piece of
garbage that's at least ten years out of date, and you *have* to take that
network hit, which is massively inefficient. Even then, you need to do
tremendously many operations on each byte you pull over the network to make
> This is what Facebook and others do (including my $employer), as much
> as possible.
Facebook is not making money. It's losing money, not least because it's
doing things that cost enormous amounts of money like not letting the RDBMS
do what it can.
> Maybe there is truth in your statement that everyone hems and haws
> about how useless features are, and then when MySQL supports it people
> can't live without it. But, you should be aware that there is a
> movement to not do a lot of complex processing within the database
I'm aware that there are a bunch of people who've been (usually not
deliberately) misinformed. That doesn't make them well informed. It just
makes more work for me when it turns out that approach simply cannot be made
to work. I suppose if I were more mercenary, I'd encourage this kind of
thing, but I'd rather work on stuff other than fixing giant systems broken
by this kind of misapprehension.
> Ignoring that and just assuming that people are being MySQL Zealots
> isn't going to help anyone.
> My 2 cents of opinion.
Everybody's entitled to an *informed* opinion. The thing is, Jim Gray went
out and measured in a technology-agnostic way, and he came to the opposite
conclusion.
Perhaps you'll go out and measure something different. It'll be worth your
very own Turing award if you manage to overturn the result in that paper.
David Fetter <david at fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter at gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate