[San-Diego-pm] An old yet persistent problem

Al Tobey tobert at gmail.com
Mon Sep 27 10:47:08 PDT 2010


On Mon, Sep 27, 2010 at 9:21 AM, Joel Fentin <joel at fentin.com> wrote:
>
> Al,
>
> Thank you for getting back to me. Please see below.
> On 9/26/2010 11:35 AM, Al Tobey wrote:
>>
>> On Sun, Sep 26, 2010 at 10:42 AM, Joel Fentin <joel at fentin.com> wrote:
>>
>>    The browser makers could make this easy but they won't.
>>
>>    I have needed this for years. Here is the current version of
>>    the need:
>>    After some .JPG files swap names, I want the user to reload
>>    the page from the server. Not from the cache. Elsewise, the
>>    pictures appear in the wrong places.
>>
>>    I've been rummaging Stack Overflow and in plenty of other
>>    places, looking for advice. None of it works for me.
>>
>>    1. The most typical advice is along the lines of:
>>    <meta http-equiv="Cache-Control" content="no-cache, no-store,
>>    must-revalidate">
>>    <META HTTP-EQUIV="Pragma" CONTENT="no-cache">
>>    <META HTTP-EQUIV="Cache-Control" CONTENT="no-cache">
>>    <META HTTP-EQUIV="Pragma-directive" CONTENT="no-cache">
>>    <META HTTP-EQUIV="Cache-Directive" CONTENT="no-cache">
>>    <META HTTP-EQUIV="Expires" CONTENT="0">
>>    No combination of these works for me.
>>
>>    2. The one and only thing that works well is to press the F5
>>    button after the page loads. I have not seen an example of
>>    this in the Perl code.
>>
>>    3. Perhaps there is some javascript that would do the job. I
>>    would need it to do something like: xxx.pl?ID=1234. Not a link
>>    nor a button - but executed in and at the end of the Perl script.
>>
>>    If the cure is to work in only one browser, I prefer Firefox.
>>
>>
>> Have you tried Etags?
>
> Until I got your email, I had never heard of Etags. Since then, I have been reading and reading. I still don't get it.
>
> ITEMS:
>
> + In your example below, where does the 123456789 come from? Is it from within the jpg pulled out with Image::ExifTool?

It's just a contrived number. It could be a Unix epoch returned from
time(), an SVN revision number, or even a commit hash from git. Its
meaning is entirely up to you and has nothing to do with ETags.

> + Is the "v" a special browser variable or an HTML control name?

Nope, it's also contrived. The browser keys its cache on the full URI,
query string included, so changing the query string makes the browser
miss its cached copy and fetch the image again. Obviously your HTML
has to reference the new URI for the image to flip, but that tends to
be easy to change on the fly. For example:

#!/usr/bin/perl

use strict;
use warnings;
use Digest::MD5;

my $site_path = "/srv/site.com";
my $image = "/images/plus.png";
# inefficient - opening/reading the file on every request
my $csum = checksum($site_path . $image);

print <<EOHTML;
Content-type: text/html

<html><body><img src="$image?c=$csum"/></body></html>
EOHTML

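sub checksum {
    my $file = shift;
    my $md5 = Digest::MD5->new();

    # three-argument open so nothing in $file is taken as an open mode;
    # binmode because image data is binary
    open(my $fh, "<", $file) or die "open($file) failed: $!";
    binmode($fh);
    $md5->addfile($fh);
    close($fh);

    return $md5->hexdigest;
}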

__END__
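
If reading the whole file on every request bothers you, a cheaper
variant is to use the file's mtime as the version. This is only a
sketch: it assumes the mtime changes whenever the image is replaced,
and versioned_src() is a name I made up for illustration. The demo
runs against a throwaway temp file rather than a real image.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: build an <img> src whose query string changes
# whenever the file on disk changes. stat() does not read the file,
# so this is cheap enough to call on every request.
sub versioned_src {
    my ($site_path, $uri) = @_;
    my @st = stat($site_path . $uri)
        or die "stat($site_path$uri) failed: $!";
    # element 9 of stat() is the mtime in seconds since the epoch
    return "$uri?v=$st[9]";
}

# Demo against a throwaway file so the sketch runs anywhere.
my ($fh, $path) = tempfile(SUFFIX => ".jpg", UNLINK => 1);
print {$fh} "not really a jpeg";
close($fh);

# Pretend the file was written at epoch 1000, then replaced at 2000.
utime(1000, 1000, $path) or die "utime failed: $!";
print versioned_src("", $path), "\n";   # ends in ?v=1000

utime(2000, 2000, $path) or die "utime failed: $!";
print versioned_src("", $path), "\n";   # ends in ?v=2000
```

The trade-off: mtime is cheap, but two different files written in the
same second get the same version, which the MD5 approach avoids.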

> + If the browser makers can enable the F5 button for the convenience of the users, why won't they give us (the programmers) the same convenience?

That's exactly what ETag headers are for. Support used to be
hit-and-miss, but modern browsers (and intermediate caching servers!)
tend to honor them these days.

Maybe this is a better resource for them:
http://en.wikipedia.org/wiki/HTTP_ETag

ETags are the best you can hope for if you want caching enabled.
IIRC, Apache can build the ETag from the file's mtime, which makes
things very straightforward to set up and maintain. If you want a
certain image to always go back to the server, send
"Cache-Control: no-cache" in the image's headers: caches may still
keep a copy, but they have to revalidate it with the server before
every use. Use "Cache-Control: no-store" instead if the image must not
be stored anywhere in the pipeline. You can set any of the headers on
a per-directory or even per-file basis in your web server's
configuration, so it doesn't need to go through a separate web app.
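
If you do end up serving the image from Perl, the ETag exchange looks
roughly like the sketch below. Treat it as an illustration, not as
what Apache does internally: etag_for() and response_headers() are
names I made up, the mtime-size tag format is arbitrary, and the
comparison ignores the list, weak ("W/") and "*" forms that
If-None-Match can take. In a real CGI script the browser's value
arrives in $ENV{HTTP_IF_NONE_MATCH}.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: derive a validator from mtime and size.
# The "mtime-size" hex format is made up for this sketch.
sub etag_for {
    my $path = shift;
    my @st = stat($path) or die "stat($path) failed: $!";
    return sprintf('"%x-%x"', $st[9], $st[7]);
}

# Hypothetical helper: return the CGI response headers for $path,
# given the If-None-Match value the browser sent (undef on a first
# visit). Simplified: exact string match against a single tag only.
sub response_headers {
    my ($path, $if_none_match) = @_;
    my $etag   = etag_for($path);
    my $common = "ETag: $etag\nCache-Control: no-cache\n";

    if (defined $if_none_match && $if_none_match eq $etag) {
        # The browser's copy is still good: no body needed.
        return "Status: 304 Not Modified\n$common\n";
    }
    # First visit or the file changed: the image data would follow.
    return "Status: 200 OK\nContent-Type: image/jpeg\n$common\n";
}

# Demo against a throwaway file so the sketch runs anywhere.
my ($fh, $path) = tempfile(SUFFIX => ".jpg", UNLINK => 1);
print {$fh} "not really a jpeg";
close($fh);
utime(1000, 1000, $path) or die "utime failed: $!";

my $tag = etag_for($path);
print response_headers($path, undef);   # first visit: 200
print response_headers($path, $tag);    # unchanged file: 304

utime(2000, 2000, $path) or die "utime failed: $!";
print response_headers($path, $tag);    # file replaced: 200 again
```

The "Cache-Control: no-cache" line is what makes the browser ask
every time; the ETag is what lets the server answer that question
without resending the image.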

-Al

>>
>> http://developer.yahoo.com/performance/rules.html <- great stuff
>> in there in general
>>
>> Maybe you can add a version string to the image URL and get most
>> of what you want.   I haven't seen it used for images but it's
>> used quite a bit for javascript.
>>
>> <img src="/images/foo.jpg?v=123456789"/>  This can even point at a
>> regular file and Apache will eat the query.   If you change the
>> version in your HTML, it'll bypass your browser cache and come
>> back to the server even if the filename is the same - at least for
>> <script> tags.
>>
>> That said, Etag headers per-image are probably what you really want.
>>
>> -Al
>
> --
> Joel Fentin       tel: 760-749-8863
> Biz Website:      http://fentin.com
> Personal Website: http://fentin.com/me

