SPUG: Web Bugs

Doug Beaver doug at beaver.net
Mon Aug 20 21:14:35 CDT 2001


On Mon, Aug 20, 2001 at 06:34:28PM -0700, Jonathan Woodard wrote:
> In response to Doug...  Ralph Kimball explains the utility of
> transparent gifs better than I can in The_Data_Webhouse_Toolkit, pp.
> 123-126.  He calls it null logging.  I'll give a summary and my take
> on this.
> 
> You can specify the source of the transparent image to be a URI that
> points to a server or set of servers dedicated to collecting page view
> data by serving up only transparent images.  Let's call them logging
> servers.  The SRC attribute of the IMG tag used to embed the
> transparent gif can carry useful metadata - whatever you want to log
> about the page/frame it's embedded in.  To borrow Kimball's example,
> the tag could look like:
> 
> <IMG SRC="http://logserver.mega-merc.com/nullpic.gif?type=catalog&sku=bear089">
> 
> In this example, logserver.mega-merc.com points to the logging
> servers.  Then, instead of parsing through the front end servers'
> logs, you just go through the logging servers' logs.  They'll contain
> only HTTP GET requests for the transparent image, along with whatever
> metadata you embedded in the query string.  In the example above,
> you'd see a catalog page hit for SKU bear089.
> 
> In our case, parsing through smaller logs would be very useful - we
> collect over 20 GB of IIS logs daily, coming from several geographic
> locations and many servers.  Of course Perl handles this data
> effectively, but decreasing what gets parsed in the first place is
> very attractive to me.  I'd like to provide faster turnaround time for
> feedback on our sites - logging what we're interested in to a set of
> dedicated logging servers by embedding their URI in transparent gifs
> is one approach.

That is pretty cool.  I was going to argue that you could just parse
the access logs, but I hadn't thought about sending the image requests
to a separate logging server to reduce the log size.  I can see the
benefit now, thanks.
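
For fun, here's a rough sketch of what chewing through the logging
servers' access logs might look like in Perl.  I'm assuming common log
format and the nullpic.gif URL scheme from Kimball's example above, so
adjust the regex to taste:

    #!/usr/bin/perl -w
    # Tally catalog page views per SKU from a logging server's log.
    use strict;

    my %hits;
    while (<>) {
        # e.g. "GET /nullpic.gif?type=catalog&sku=bear089 HTTP/1.0"
        next unless m{"GET /nullpic\.gif\?(\S+) HTTP};
        my %param = map { split /=/, $_, 2 } split /&/, $1;
        $hits{ $param{sku} }++
            if $param{sku} and ($param{type} || '') eq 'catalog';
    }

    # most-viewed SKUs first
    printf "%-12s %6d\n", $_, $hits{$_}
        for sort { $hits{$b} <=> $hits{$a} } keys %hits;

Run it as "perl tally.pl access_log" or pipe the logs straight in.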

> As a user you can tell your browser not to show images to avoid
> getting logged, but I don't see the big deal, as long as you trust
> the sites and services you use.  For my group, knowing how customers
> (in aggregate) use our sites is extremely valuable to designers,
> management, marketing, operations, and content creators.  If we don't
> know how effective our site is, we might as well close up shop and go
> home.  I hope that services I use online are always looking for ways
> to improve.

I don't mind if they improve their services; it's just that it can be
hard to distinguish between good and evil web bugs.  So I'd rather they
didn't use them at all.

And I can't come up with a better argument against web bugs than that
they are too easily used for evil.  Even if web bugs were made illegal
or had never been invented, you could get the same data by parsing your
access logs and sending it back and forth to doubleclick or whoever is
doing the tracking for you.  It's just that web bugs make the tracking
so simple and immediate for the implementor.
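
(To make that concrete: the extraction half really is a one-liner.
Something like this over a combined-format access log would hand you
every image hit plus its query string, ready to ship off to whoever
does your tracking.  I'm assuming the request path lands in the
seventh whitespace-separated field, as it does in NCSA-style logs:

    perl -lane 'print "$F[0] $F[6]" if $F[6] =~ /\.gif\?/' access_log

The shipping-it-off half is where the tracking outfit's own protocol
would come in, and that's the part web bugs get for free.)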

> I have yet to think of any utility for transparent gifs in email
> other than spam and tracking the path of a message as it gets passed
> along by HTML mail clients.  Both uses are obnoxious.

Total agreement here.  Although it would be fun to send an HTML chain
email around the globe and see where it goes.  I could create one of
those my-rich-uncle-died-and-i-have-to-distribute-his-foreign-fortune
scams and track its travels (geographically, that is).  :-)

Doug

> -----Original Message-----
> From: Ken McGlothlen [mailto:mcglk at artlogix.com] 
> Sent: Monday, August 20, 2001 18:05
> To: Doug Beaver
> Cc: Jonathan Woodard; Wallendahl, Michael/SEA; SPUG
> Subject: Re: SPUG: Web Bugs
> 
> 
> Doug Beaver <doug at beaver.net> writes:
> 
> | What is it about transparent gifs (whether they are static or
> | generated by a cgi) that makes it easier to log and retrieve page
> | view data?  I am trying to see the benefit, but I can't.  Can you
> | explain a little more?
> 
> Specifically, when you visit a site (say, cnn.com), they have the option
> of dropping in a webbug (or set of them) from various other firms.  The
> cnn.com page might consist of:
> 
>         The HTML document
>         An IBM ad
>         A Compaq ad
>         A doubleclick.com webbug
> 
> The doubleclick.com webbug almost always has a way of encoding more
> information in the URL, so now doubleclick.com knows that you saw the
> article, which ads you saw, and when you saw it.  They also work with
> cnn.com to discover the referring URL.
> 
> Alone, this is no big deal, but you can see how, with enough webbugs
> on enough sites (and it doesn't take a majority of them),
> doubleclick.com can build a really good profile of individual users
> and come up with more effective (read "obnoxious") advertising
> tactics.
> 
> Even worse is email---it's like a read-receipt that mailreaders like
> Outlook won't let you block.  This is one of the primary reasons why I
> don't use a graphical mailreader.
> 
> | The thing that upsets me about web bugs is that you can't turn them
> | off.  At least you can turn off cookies.  Even if you're using a
> | proxy which strips your identifying headers, they can still track
> | you since the tracking info is encoded in the image name.
> 
> Well, there are ways.  On the Macintosh, for example, a popular web
> browser named OmniWeb allows you to do URL blocking (with regular
> expressions, no less), and that one ability (along with superior
> cookie management) has made it my favorite browser.  Mozilla will
> also let you block images from particular sites, whenever it becomes
> ready for prime time.  Your only other avenue is an HTML proxy like
> junkbuster, which blocks image requests from sites you select.
> 
> | You might be able to test for the existence of web bugs by using a 
> | proxy and doing a HEAD request on each "image" referred to by <img> 
> | tags.
> 
> Actually, if you can just get a list of IMG URLs out of the page
> efficiently, they're pretty easy to spot.  OmniWeb has the "Get Info"
> command; it will list all the resources a page attempts to load.  But it
> does take a pair of eyeballs to distinguish ads and webbugs from
> legitimate spacers and the like.
> 
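
P.S.  Since this is a Perl list, here's a quick stab at the
list-the-IMG-URLs idea.  It pulls every IMG SRC out of a page with
HTML::LinkExtor, then does a HEAD request on each so you can eyeball
the sizes; a suspiciously tiny content-length is a decent hint, though
as Ken says it still takes eyeballs.  A sketch, not a real bug
detector:

    #!/usr/bin/perl -w
    # List every image a page loads, with its content-length.
    use strict;
    use LWP::UserAgent;
    use HTTP::Request;
    use HTML::LinkExtor;
    use URI;

    my $url = shift or die "usage: $0 URL\n";
    my $ua  = LWP::UserAgent->new;

    my @imgs;
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        push @imgs, $attr{src} if $tag eq 'img' and $attr{src};
    });

    my $page = $ua->request(HTTP::Request->new(GET => $url));
    die $page->status_line, "\n" unless $page->is_success;
    $parser->parse($page->content);

    for my $src (@imgs) {
        my $abs  = URI->new_abs($src, $url);   # resolve relative SRCs
        my $head = $ua->request(HTTP::Request->new(HEAD => $abs));
        printf "%8s  %s\n", $head->content_length || '?', $abs;
    }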

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     POST TO: spug-list at pm.org       PROBLEMS: owner-spug-list at pm.org
      Subscriptions; Email to majordomo at pm.org:  ACTION  LIST  EMAIL
  Replace ACTION by subscribe or unsubscribe, EMAIL by your Email-address
 For daily traffic, use spug-list for LIST ;  for weekly, spug-list-digest
     Seattle Perl Users Group (SPUG) Home Page: http://zipcon.net/spug/




