[Oc-pm] June 20th Meeting Notes

Ben Tilly btilly at gmail.com
Mon Jun 25 07:15:00 PDT 2007


On 6/25/07, David Romano <david.romano at gmail.com> wrote:
> Last week's meeting was held at Panera and had four attendees: Ben
> Tilly, Pete Wilson, V.J. (didn't catch the last name), and me. Sorry
> for getting this out a few days late. My memory of what was discussed
> is now a little fuzzy, but here's a bit of what I remember:

You remember it better than I did. :-)

[...]
>     -   Interweaving SQL and Perl code
>         I had brought along Perl Hacks, which I'm currently working my
>         way through, and Ben looked at its discussion of using SQL from
>         Perl. Hack #23 suggests to create a subroutine for each
>         different query, and tuck away all those subroutines in a SQL
>         package. Ben explained a different way of managing SQL code in
>         Perl, and that is by writing chunks of SQL statements, storing
>         them in scalars, and combining them as appropriate. He said he
>         found this very useful when working with complex queries.
>         http://www.oreilly.com/catalog/perlhks/

I should note that those two techniques solve very different problems.
 SQL::Abstract solves the problem of generating SQL for someone who
doesn't really want to learn SQL.  The problem that I was trying to
solve was making complex SQL understandable for someone who
understands SQL.

In particular, for reporting purposes I often wind up writing
subqueries within subqueries of the main query.  In native SQL, even
with good indenting, it winds up being hard to correctly pair up the
SELECT with the WHERE clause.  (Because all of the subqueries went in
the FROM clause between them.)  So I write subqueries, and interpolate
them in.  The result is a series of queries like this:

  my $example_sql = qq{
    SELECT s.this
      , s.that
      , s.the_other
    FROM foo f
      , (} . indent($complex_subquery_sql) . qq{    ) s
    WHERE f.some_key = s.some_key
      AND f.some_field = 'some condition'
  };

each of which has a clear and readable intent, and the result of which
is nicely indented and is as legible as possible.  (ie not very.)

If you routinely wind up writing several hundred line SQL statements
which nest subqueries several deep, you may find this useful.  You may
likewise find my ideas on how to correctly format SQL useful as well.
(See the above for an example of my formatting.  In particular I put
commas and AND statements on the start of lines, which generally makes
it easier not to forget one when you add new lines.)

Of course if you're coding a CRUD application and wind up writing such
SQL statements, then you have another kind of problem...  (Note that
my job is reporting.  It is far more natural for me to want to write
complex SQL than it is for most people.)

>     -   GPS and General Relativity
>         Pete is a data analyst for a company in Anaheim, and his job is
>         to analyze extreme amounts of data from GPS devices his company
>         manufactures. Ben mentioned that GPS is the only real-world
>         application that takes into account the general theory of
>         relativity. I had just read the ABC of Relativity and was
>         still a bit patchy on what the general and special theories
>         were, but Ben gave a good explanation (and demonstration) of
>         general relativity.
>         http://en.wikipedia.org/wiki/GPS
>         http://en.wikipedia.org/wiki/General_relativity

Well I wouldn't say that I *explained* general relativity...

Incidentally http://en.wikipedia.org/wiki/GPS#Relativity discusses the
connection quite well.

[...]
>     -   Lingua::Romana::Perligata
>         It exists. If you're leaving a company you hate, you might
>         consider rewriting your code using it. At least that's what one
>         of Ben's acquaintances thought of doing.
>         http://search.cpan.org/~dconway/Lingua-Romana-Perligata-0.50/

Close.  My aquaintance thought of rewriting his code in Klingon.  He
got the idea because his friend who left the same company actually DID
rewrite his code in Latin.  I don't know what company it was, but it
provides a clear demonstration of why you shouldn't piss off your
programmers.  (The company is in New York, nobody need fear working
there by accident.)

>     -   Databases
>         Pete and I didn't know that much about databases. Ben Tilly
>         enlightened us :-) Ben explained the idea of relational
>         mapping using a Students table, Teachers table, and Classes
>         table. He stated the importance of designing tables
>         orthogonally, and only adding redundant fields to improve
>         performance. He also described what happens when a database
>         falls over. The number of requests to update a certain record
>         (or is it a certain table?) in a database is so high at a
>         certain time X. Each request's continuous check to see if the
>         record has been unlocked (and is now available for writing)
>         overloads the server resources and the server croaks. I hope
>         that makes some sense :-/

Actually I described one way that transactional databases can fall
over.  And that is in overhead for a single latch.  A latch being a
specific kind of (normally lightweight) lock.

There are operations where requests have to be queued up and processed
serially.  For instance you cannot access a particular data structure
until you have a lock so that you are sure that nobody else will
modify it out from under you.  The problem comes when the time to
process that data structure is longer than the time between requests
to do it.  What happens then is that processes pile up trying to get
the latch, aka lock.  Since they are doing a busy wait for the latch,
they consume all of the CPU, and the database falls over.

As for what a latch covers, I didn't go through that in detail.  It is
highly variable, but is seldom as small as a single record or as big
as an entire table.  I've seen this failure mode on things as varied
as fetching an already parsed SQL statement to reading a record from a
table.

The really frustrating thing about this failure mode is that if
requests are coming in at 80% of the maximum rate the latch can take,
there is absolutely no sign that there might be a problm coming.  At
98% you're experiencing really nasty spikes.  (That partially solve
the problem by driving users away from your site.)  And at 100% you
fall over.  Nobody, and I mean nobody, has a good handle on how to
predict when this will happen. :-(

[...]
>     -   Next Possible Meeting Location
>         Ben had remembered that the group had met at a good pub a few
>         years back, and wants to go again. I noticed an old e-mail from
>         the oc-pm archives that mentioned The Olde Ship. Pete mentioned
>         a pub in Fullerton whose name escapes me (The Shipyard?).
>         http://mail.pm.org/pipermail/oc-pm/2003-June/000085.html

:-)

Cheers,
Ben


More information about the Oc-pm mailing list