Conditional http get for dynamic pages
david_dick at iprimus.com.au
Sun Nov 24 13:19:50 CST 2002
Scott Penrose wrote:
> On Saturday, Nov 23, 2002, at 16:13 Australia/Melbourne, David Dick wrote:
>> G'day all,
>> This is slightly off topic, but i figure there is some interest in
>> web stuff in the perl-mongers. I've been reading the http rfc over
>> the last week and realising just how cool conditional gets are for
>> dynamic pages. It seems to reduce network traffic by a good amount.
>> Is anyone else using them for dynamic pages? Any experiences on this?
> Are you talking about the standard IMS (If-Modified-Since) request?
Actually the INM (If-None-Match) request :)
> If so that has been a part of the standard for many years and built
> into most proxy and web servers. For example every time you make a
> request through SQUID it will do an IMS request to the up stream
> server to only get the response if it has been modified.
> However it is of course extremely difficult to do with dynamic pages.
> My preference is along the lines of this. If it is a dynamic page that
> changes only occasionally then you are best off generating the page. A
> good example of this would be the slashdot home page. Generate the
> page off line and let the web server deliver it. This has many
> advantages the most of which is that you don't have to write special
Hmmmmm... Correct. My problem is probably a specific one. I'm building
an inventory system (keeping track of the amount of goods in a warehouse).
This means that I can't use a cache for "volume of stock remaining"
type enquiries, cos it's vital that the application server is consulted
for every enquiry. However, it's quite likely that the level of stock
only changes relatively slowly.
But if I take an MD5 digest (for example) of the final page just before
actually sending it and send it with the page as an ETag, then the next
time the user requests a "level of stock remaining" page (for example :)),
the browser sends that ETag back in If-None-Match. If the page still has
the same MD5 digest, I can just send a 304 and save the network traffic
of a full response. This is also extremely applicable to "Search for a
document" type pages, which i am using quite a lot.
> However, if it is a dynamic page which changes depending on the user
> logged in etc, then you are not going to get any advantage out of an
> IMS header response. There are a few reasons for this. A proxy cache
> generally does not cache these pages (it can, but it is almost
> pointless, so they generally don't). Another is that things change in too
> complicated a way. Theoretically you should be able to get cookies and
> stuff with your IMS, but all this overhead generally slows things down
> quite a bit :-)
> To reduce network traffic for dynamically generated pages I recommend
> the following strategy:
> - Generate the pages and put them somewhere sensible if they can
> be cached. If they can't be cached (eg: always changing) then you gain
> nothing from IMS anyway.
Except that you do reduce network traffic. As above, it's probably a
more specific problem than i realised. :) i just got really excited on
realising that there was a technical optimisation that i could apply to
the problem. :)
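
A minimal sketch of the ETag / If-None-Match idea, in Python rather than
Perl so it is easy to try. The function names (make_etag, etag_matches,
respond) and the sample page text are made up for illustration. The page
is still generated on every request, so the application server is always
consulted; only the body is skipped when the digest matches. The naive
comma split is safe here only because the tags are plain hex digests.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Quoted hex MD5 digest of the finished page, used as the ETag."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def etag_matches(etag: str, if_none_match: Optional[str]) -> bool:
    """True if the client's If-None-Match header lists our ETag."""
    if if_none_match is None:
        return False
    candidates = [c.strip() for c in if_none_match.split(",")]
    if "*" in candidates:
        return True

    # If-None-Match uses weak comparison, so ignore any W/ prefix.
    def strip_weak(tag: str) -> str:
        return tag[2:] if tag.startswith("W/") else tag

    return strip_weak(etag) in {strip_weak(c) for c in candidates}


def respond(body: bytes,
            if_none_match: Optional[str] = None
            ) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a freshly generated page."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if etag_matches(etag, if_none_match):
        # Page unchanged since the client last saw it: no body is sent.
        return 304, headers, b""
    headers["Content-Length"] = str(len(body))
    return 200, headers, body


# First request has no validator; the second replays the ETag we sent.
page = b"Widgets in stock: 42\n"
status, headers, payload = respond(page)
status_again, _, payload_again = respond(page, headers["ETag"])
```

Once the stock level changes, the digest changes too, so the same
If-None-Match value gets a full 200 response again.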
> - Use a reasonable expiry time
> - GZIP the pages. If you use the right settings in apache they
> only get compressed once so it is reasonably efficient. You can even
> go further and compress yourself and then NOT check if the file is
> changed, just always send the current GZ file if GZIP is supported.
But if you take the hash before the gzip, you may not need to gzip at
all, since a 304 has no body to compress. :) Of course, if you take the
MD5 digest after any compression, you can also use it for a Content-MD5
field :)
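
A rough sketch of the "digest after compression" variant, again in
Python. The function name gzip_with_digests is made up for illustration.
It assumes Python 3.8 or later for the mtime argument, which pins the
timestamp in the gzip header so the same page always compresses to the
same bytes; without that the digest would change from one request to the
next. Content-MD5 is the base64 form of the same 128-bit digest that
gives the hex ETag, so one MD5 pass covers both.

```python
import base64
import gzip
import hashlib
from typing import Dict, Tuple


def gzip_with_digests(page: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Compress the page, then derive ETag and Content-MD5 from one digest."""
    # mtime=0 keeps the gzip output identical for identical input.
    compressed = gzip.compress(page, mtime=0)
    digest = hashlib.md5(compressed)
    headers = {
        "Content-Encoding": "gzip",
        "ETag": '"' + digest.hexdigest() + '"',
        "Content-MD5": base64.b64encode(digest.digest()).decode("ascii"),
    }
    return compressed, headers


body, headers = gzip_with_digests(b"Widgets in stock: 42\n")
```

The trade-off is that the page has to be compressed before you know
whether a 304 would do, which is the work the hash-before-gzip ordering
avoids.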