[Toulouse-pm] OSCON, tutorials
Michel Rodriguez
mirod at xmltwig.com
Tue Jul 8 15:49:07 CDT 2003
Hi,
OK, as promised (and not without difficulty: the wireless network is
down and the conference is seriously short of power outlets in the
rooms, so in general my laptop doesn't last until the end of the
sessions and I type up the end from my notes, grrr... as for YAPC,
don't worry guys, it can't possibly be worse!)
Making Programs Faster
MJD
Of course this morning the tutorial I had signed up for,
Advanced DBI, is cancelled, and my next choice, XSLT, is
sold out... I decide to sit in on MJD's "Making Programs Faster".
MJD is always an entertaining speaker. Plus I am terrible at
optimizing, the archetypal premature optimizer one might say.
MJD shows the Schwartzian transform.
He then shows a simple case (from a post on the newsgroup)
where it really doesn't make much sense to use it: "lc" is
fast, you can use it directly in a sort, and the overhead of
the ST is not worth it in this case; the ST version is slower
than the naive one.
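Roughly, the comparison was between something like this (my
reconstruction of the idea, not his actual slides):

    # naive version: just call lc() inside the comparison
    my @sorted = sort { lc($a) cmp lc($b) } @words;

    # Schwartzian transform: compute lc() once per element,
    # sort on the cached key, then throw the keys away
    my @sorted_st = map  { $_->[1] }
                    sort { $a->[0] cmp $b->[0] }
                    map  { [ lc($_), $_ ] } @words;

    # for something as cheap as lc(), building all those anonymous
    # arrays costs more than the repeated lc() calls it saves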
Now he gives us some examples of I/O bound, CPU bound and
memory bound code and adds some hints about how to optimize
them: parallelize I/O bound code (or switch from CGI to
mod_perl), optimize the code for CPU bound code, try reducing
the memory used or buy more memory for memory bound code. The
important thing is to figure out which factor impacts your
code the most.
Tools for optimizing
Timing
The shell "time" is the easiest way to quickly figure out how
much usr/CPU time a process takes. There are often 2 versions
of "time" on a system: the built-in and one in
"/usr/bin/time" or equivalent, with different output.
In Perl "time()" can be used, and "use Time::Hires;" to get
"time()" to work with a resolution better than 1s.
The usual way he writes benchmarks: an empty loop, then the
various options, so he has a base value to compare the
different options to.
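Something along these lines (my sketch of the approach, not
his exact code):

    use Time::HiRes qw( time );

    my $n = 100_000;

    # baseline: an empty loop, so we know what the loop itself costs
    my $t0 = time;
    for ( 1 .. $n ) { }
    my $base = time - $t0;

    # option 1: the code we actually want to measure
    $t0 = time;
    for ( 1 .. $n ) { my $x = lc "Hello, World" }
    my $opt1 = time - $t0;

    printf "empty loop: %.4fs  option 1: %.4fs  difference: %.4fs\n",
           $base, $opt1, $opt1 - $base;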
He does not use "Benchmark.pm": the results are different
from those of his simple benchmarks for some unknown reason,
they are not consistent if he repeats the test, and he even
gets results where the test takes... a negative time to
complete! Plus the machinery for a benchmark should be as
simple as possible, and you should be able to understand it.
Profiling
Profile your code before optimizing it, or you will probably
optimize the wrong functions.
Use "Devel::DProf":
perl -d:DProf toto.pl <args> > /dev/null
dprofpp
Send the output to "/dev/null" so that output-related
problems don't interfere.
This will give you a list of functions, and how much time the
process spent in each one.
The usual rule here is 90-10: 10% of the code accounts for
90% of the run time. So focus on the 10%!
"Devel::Smallprof" gives you an even more detailed report,
line by line.
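If I remember correctly it is run the same way, e.g.:

    perl -d:SmallProf toto.pl    # per-line counts and times end up in smallprof.out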
Examples
Generic advice
A very important piece of advice: TEST THE OPTIMIZED VERSION!
Make sure you don't break things in the process: *you
probably don't live in a world where you want to get the
wrong answer as fast as possible*.
Also think about the big picture: is the decrease in
maintainability worth it? Always remember that hardware is
cheap. We still live under the impression that hardware is
expensive and precious. This is no longer true.
MJD has a cute cat! And nice jokes about him that he somehow
manages to tie to the subject at hand.
Perception is important too: sometimes removing warning
messages that say "this will take a long time" actually makes
the users happier and stops them from complaining, as in fact
they don't notice the "long time"!
*Optimize for the common case*.
Speeding up a mailbox analyzer
- run and save the output (so we can check the optimized
version)
- profile
- look at the most used functions and see what can be done
In this case he found out that "Mail::Header" was taking a
lot of time. So he replaced it with custom code. This is
risky, as mail protocols (like most protocols IMHO) are hard
to get right.
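Something like this, a deliberately naive sketch of the idea,
certainly not his actual code, and it ignores most of the dark
corners of the mail format:

    my %header;
    my $field;
    while ( my $line = <> ) {                   # message on STDIN or in a file
        last if $line =~ /^\s*$/;               # blank line: end of the headers
        if ( $line =~ /^([\w-]+):\s*(.*)/ ) {   # "Field: value"
            $field = lc $1;
            $header{$field} = $2;
        }
        elsif ( defined $field && $line =~ /^\s+(\S.*)/ ) {
            $header{$field} .= " $1";           # folded (continuation) line
        }
    }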
As it turned out, in this case, when he checked, the output
of the program had changed... for the better! There was a
minor bug in "Mail::Header" that the simpler optimized code
fixed!
So the optimized version is better, and 81% faster!
Yipee!
A look at the profiler results, followed by a detailed
economic analysis, shows that he can stop here. That's
actually a very interesting analysis: if he spends 20 minutes
optimizing the next function, and actually gets it to improve
its speed by 20%, he will need to run the code 25 million
times to get a positive return on investment! I should do
this more often, that would save me a lot of time.
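The back-of-the-envelope calculation is simple enough to
script; the 240 microseconds per call below is my made-up
figure, picked so that the numbers come out like his:

    my $invested = 20 * 60;           # 20 minutes of programmer time, in seconds
    my $per_call = 240e-6;            # assumed current cost of the function, per call
    my $saved    = $per_call * 0.20;  # a 20% speedup saves this much per call
    printf "break-even after %.0f calls\n", $invested / $saved;  # 25 million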
Speeding up "pod2man"
This is useful because it is run quite often, any time you
install a new module for example.
- run and save the output (so we can check the optimized
version)
- profile
Here he finds out that he can optimize a POD tokenizer by
making it faster in the common case ("I<text>") at the
expense of the less common case "I<< text >>" (man, is it a
pain to type POD examples... in POD, as I am doing right now!
I even have to use the "Z<>" escape for the first time!)
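I did not copy the real code, but the idea is the classic one
of checking the cheap, common pattern first; purely as an
illustration (this is not the actual pod2man tokenizer):

    my $text = 'B<bold> and I<< italic >>';      # sample input
    my ( $code, $content );

    # try the common single-bracket form first...
    if ( $text =~ /([A-Z])<([^<>]*)>/ ) {
        ( $code, $content ) = ( $1, $2 );        # here it matches B<bold>
    }
    # ...and only pay for the rarer multi-bracket form when needed
    elsif ( $text =~ /([A-Z])<<\s+(.*?)\s+>>/ ) {
        ( $code, $content ) = ( $1, $2 );
    }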
To get a better idea of what's going on he needs more
detailed output than "Devel::DProf" provides.
"Devel::SmallProf" gives too much output, so he now teaches
us how to write our own profiling module, using the hooks
available to "Devel::" modules: "@{"::_<toto.pl"}",
%DB::sub, "DB::DB()" and "caller()".
It is not that difficult actually!
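From my notes the skeleton is roughly this (an untested
reconstruction; the file name Devel/LineCount.pm and the
report format are mine):

    # Devel/LineCount.pm - count how many times each line is executed
    package Devel::LineCount;

    package DB;                 # the debugger hooks live in package DB
    our %count;

    sub DB {                    # perl calls DB::DB before each statement under -d
        my ( $pkg, $file, $line ) = caller;
        $count{"$file:$line"}++;
    }

    END {                       # dump the counts, busiest lines first
        for my $spot ( sort { $count{$b} <=> $count{$a} } keys %count ) {
            printf "%8d  %s\n", $count{$spot}, $spot;
        }
    }

    1;

    # run it with: perl -d:LineCount toto.pl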
CGI (SOAP) application
A real-life example: speeding up a CGI application that was
accessed in bursts (several hundred times per minute for a
while, then nothing for a long time). It received XML data,
parsed it, updated a DB, then signaled success or failure.
The solution: as success depended only on whether the XML
parsed or not, just parse the XML, return success or error,
and save the XML to a file. A batch process then reads the
files and updates the DB at its own pace. The client gets its
answer much faster, and the DB still gets updated.
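A sketch of the fast path; the spool directory and the use of
XML::Parser to check the XML are my guesses, not necessarily
what he used:

    use XML::Parser;

    read STDIN, my $xml, $ENV{CONTENT_LENGTH} || 0;   # raw XML from the POST body

    if ( eval { XML::Parser->new->parse($xml); 1 } ) {
        # well-formed: spool it for the batch job and answer right away
        my $file = "/var/spool/myapp/" . time . ".$$.xml";
        open my $fh, '>', $file or die "cannot write $file: $!";
        print {$fh} $xml;
        close $fh;
        print "Content-Type: text/plain\n\nsuccess\n";
    }
    else {
        print "Content-Type: text/plain\n\nerror\n";
    }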
Plus he got rid of a couple of modules that were being used.
He doesn't like modules that do very simple stuff... but
through an OO interface (here the object was a string and the
method used was 2 lines of code).
Blunders
Now some examples of failed attempts at optimization.
He starts with pseudo-hashes. That's too easy! ;--) It's
quite a well-known story.
He actually gives a detailed, and interesting, explanation of
how they work and why they end up slower than regular hashes.
Then an example from the newsgroup where someone replaced an
"eval "$string"" with an "eval { $code }". This made the code
_much_ faster! That would be because "eval { $code }" does
not actually eval the code in $code.
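The difference is easy to see:

    my $code = 'print "I ran\n"';

    eval $code;      # string eval: compiles and runs the Perl in $code,
                     # so this prints "I ran"
    eval { $code };  # block eval: just evaluates $code as an expression
                     # (a plain string in void context), runs nothing...
                     # hence "much faster"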
Beware of benchmarks! Check that the different versions
return the same result first, and then benchmark.
Then he shows how the old trick of pre-allocating arrays was
found not to speed up execution. A benchmark showed that it
actually slowed things down.
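The trick in question is the old idiom of presizing the array
before filling it:

    my @buffer;
    $#buffer = 9_999;    # extend the array to 10_000 slots up front
    # ... then fill @buffer ...
    # his benchmark showed this no longer buys anything, and can even
    # be slower than just letting perl grow the array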
Darn! I thought I had guessed why the benchmark was wrong but
no!
It's a classic: Perl (or the OS) already optimizes things,
uses caches, pre-allocates arrays based on previous calls to
a function... so your optimization might be redundant, and
thus counter-productive.
He finishes with a similar example from "Tie::File", where he
wrote a really fancy caching algorithm for lines in the
file... that turned out not to be used in practice.
The next example is from a thread on Perlmonks... with tons
of silly advice on how to optimize a (rather simple)
numerical problem ([id://134419]). His morals: don't
micro-optimize, and *There is plenty of crappy optimization
advice*.
He finishes with some general advice that boils down to
"THINK before you start optimizing!" (then think some more).
Basically optimizing is rarely worth it!
Conclusion
Overall a good tutorial stressing the dangers and potential
pitfalls of optimization. As with most good tutorials, I liked
the fact that it showed how to go about optimizing code, from
the non-optimized version to the final one, through analysis,
refinements, mistakes and knowing when to stop. I find that
this is the most important thing you can get from such a
class: seeing how the instructor's mind works.
Efficient SQL
by Greg Sabino Mullane
The tutorial will focus on PostgreSQL and how to make DB
applications faster (do you see a trend in the tutorials I
attend?)
PostgreSQL 7.4
*SQL is usually the weakest link in the chain*: when someone
(usually not you) comes and tells you that your application
is too slow, the SQL code is usually where you can optimize.
How to speed things up? 6 ways: hardware, OS, DBMS,
Application, DB design and Query tuning.
Hardware and OS
RAM is the most important, fast disks can be useful too.
DBMS
By default some of the settings for PostgreSQL are set
way too low (in order to run out-of-the-box on low-end
machines), "sort_mem" and "shared_buffers" for example, see
section 3.4.2 of the admin guide.
Application
Not that important unless there is a major flaw in the
code.
He advises keeping the data as objects in Perl, with the
SQL in a separate module, isolated from the main code (see
the sketch after this list). This way optimizing the SQL
code is easier.
Try to leave as much work as possible on the DB side: it
will be faster and will make it easier to enforce
constraints on the DB.
DB Design
He thinks that normalization is really important, and
doesn't necessarily hurt speed. *That's what a database
does: JOINs*. Column order can impact the speed of the DB.
Query Tuning
That's what the tutorial is all about!
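Here is the kind of layout I understood him to mean for the
Application point above, keeping the SQL out of the main code
(the module name and the query are mine):

    # MyApp/Queries.pm - all the SQL lives here, nothing else does
    package MyApp::Queries;
    use strict;

    my %sql = (
        user_by_login => 'SELECT id, name, email FROM users WHERE login = ?',
    );

    sub statement { return $sql{ $_[0] } }   # the main code only asks by name

    1;

    # in the main code:
    #   my $sth = $dbh->prepare( MyApp::Queries::statement('user_by_login') );
    #   $sth->execute($login);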
When thinking about optimizing, first figure out whether the
problem is that all queries are slow (in which case you'd
better look at the previous items) or whether some specific
queries are too slow, in which case you can start working on
them.
He now describes how a SQL query is parsed, optimized and
executed by the DBMS.
"EXPLAIN" is used to show how the query will be parsed and
executed (without running it), the number of estimated rows
returned by each step and the estimated cost of each one.
"EXPLAIN ANALYZE" also runs the query and provide the actual
time spent in each step, number of rows returned by each
step.
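From Perl the same thing can be done through DBI, for example
(the connection parameters and the orders table are made up):

    use DBI;

    my $dbh = DBI->connect( 'dbi:Pg:dbname=test', '', '', { RaiseError => 1 } );

    # EXPLAIN shows the plan without running the query,
    # EXPLAIN ANALYZE runs it and adds the real times and row counts
    my $plan = $dbh->selectcol_arrayref(
        'EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42'
    );
    print "$_\n" for @$plan;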
At this point the tutorial became quite boring for a while as
he listed the various operators and their cost.
Indexes
Now we see how "ANALYZE" generates statistical data on the db
(in the "pg_stat" table). Frm there we can figure out which
columns should be indexed: anytime a slow operation
(typically a sequence scan) shows up we can add an index. The
results are as spectacular, as expected: from 27s to 5.5ms in
the example shown.
PostgreSQL can build indexes on functions, e.g. on
"lc(column)", to avoid having to recompute the function for
each row.
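In PostgreSQL the function is spelled "lower()"; assuming an
already connected $dbh and an invented users table, that
would look like:

    # the index stores lower(email), so queries written against
    # lower(email) can use it instead of recomputing it per row
    $dbh->do('CREATE INDEX users_email_lower_idx ON users ( lower(email) )');

    my ($id) = $dbh->selectrow_array(
        'SELECT id FROM users WHERE lower(email) = ?', undef, lc $address
    );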
Using a "WHERE" clause to build partial indexes that can be a
lot more efficient than full indexes. For exemple "NULL"
values can be excluded from the index. The results of
"ANALYZE" are important to figure out if it is worth to
narrow down the index.
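A partial index is just an index with its own "WHERE" clause,
e.g. (again with an invented table and $dbh):

    # only the not-yet-shipped orders end up in the index, so it
    # stays small and the planner uses it for the "to ship" queries
    $dbh->do(q{
        CREATE INDEX orders_unshipped_idx
                  ON orders ( customer_id )
               WHERE shipped_at IS NULL
    });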
"CLUSTER" can also be used to by moving the data physically
on the disks, which increases the access speed.
Miscellaneous tidbits
* prior to PostgreSQL 7.4, "EXISTS" should be favoured
over "IN",
* when doing "UPDATE"s, use "VACUUM", as "UPDATE" does a
"DELETE" and then an "INSERT",
* do not use "max(toto)" or "min(tata)"; use "ORDER BY toto
DESC LIMIT 1" and "ORDER BY tata ASC LIMIT 1"
respectively,
* the "ctid", which is the physical address of a record (a
page number and an index within the page), is the quickest
way to access a record.
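The min/max rewrite looks odd but lets the planner walk an
index instead of scanning the whole table; with the column
names from the talk and an invented table t:

    # instead of: SELECT max(toto) FROM t
    my ($max) = $dbh->selectrow_array(
        'SELECT toto FROM t ORDER BY toto DESC LIMIT 1' );

    # instead of: SELECT min(tata) FROM t
    my ($min) = $dbh->selectrow_array(
        'SELECT tata FROM t ORDER BY tata ASC LIMIT 1' );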
Conclusion
Overall the tutorial was very deep and thorough, showing
the process used to analyze and optimize SQL queries with
PostgreSQL. It was also thoroughly boring at times: I
found it hard to get excited by the fight to shave a
couple of milliseconds off a query (how to get from
6.07ms to 0.76ms in 5 painful^Heasy steps ;--)
There you go!
Michel Rodriguez
Perl & XML
http://www.xmltwig.com