Phoenix.pm: Job Hunting Saga

Scott Walters phaedrus at illogics.org
Wed Mar 6 17:07:15 CST 2002


This is meant to amuse.

I'm lucky enough to have a first-page ranking on Google for
search phrases such as "computer programmer resume". Hungry for job leads,
I cobbled together a quick script (thanks to Bill & O'Reilly for
the regex) to parse my Apache access_log for Google search referrals,
on the logic that a lot of people looking for resumes would
search Google while at work and then click on mine, and I would get
their domain name to follow up. The script is at the end for anyone
interested. It's partially disabled: one behavior, now skipped by an
early exit, was just to dump a count of the words that I got hits on.
Right now, it only prints out instances from the log of search hits
from Google for "resume" + other words.

What I came up with surprised me:

1. Almost all of the hits are from dialups/dynip DNS/cable, and .edu's. 
2. Of the real companies, most hits were from former employers (3).
3. "websensecache.tco.census.gov wants resume computer programmer" was one 
of the log entries.
4. About 5 in 100 appear to be valid hits from inside companies.
5. In the last 6 months, a first-page Google ranking has generated 0 non-spam
emails.

For the curious, the output is at http://www.illogics.org/google.html.

#!/usr/bin/perl

use CGI;
use Socket;

$ident = 'http://www.google.com/search?q=';

open $f, '<', 'access_log' or die "Can't open access_log: $!";
WEBHIT: while(<$f>) {

        ($host, $ident_user, $auth_user, $date, $time,
            $time_zone, $method, $url, $protocol, $status, $bytes,
            $referer, $agent) =
/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.+?) (\S+)" (\S+) (\S+) "([^"]+)" "([^"]+)"$/;

  next WEBHIT unless(substr($referer, 0, length($ident)) eq $ident);
  $qs = substr($referer, length($ident));
  $qs =~ s/&.*//;
  $qs = CGI::unescape($qs);
  $qs =~ s/[^ a-zA-Z0-9]//g;
  foreach my $i (split / /, lc($qs)) {
    $words{$i}++;
  }
  if($qs =~ m/resume/) {
    $host = gethostbyaddr(scalar inet_aton($host), AF_INET) or next WEBHIT;
    print qq{<tr><td>$host</td><td>wants</td><td>$qs</td></tr>\n};
  }
}
close $f;

exit 0;  # early exit: the word-count report below is currently disabled

foreach my $i (keys %words) {
  push @words, sprintf "%8d %s", $words{$i}, $i if $words{$i} > 1;
}
@words = sort @words;

foreach my $i (@words) {
  print $i, "\n";
}
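
For anyone who wants to try the referrer-parsing step on its own, here is a
minimal sketch of it as a standalone function. It is an illustration, not part
of the original script: the name extract_query and the sample referrer are made
up, the percent-decoding is done by hand as a stand-in for CGI::unescape, and
unlike the script it lowercases the whole phrase before returning it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Prefix that marks a Google search referrer (same string as $ident above).
use constant GOOGLE_PREFIX => 'http://www.google.com/search?q=';

# Hypothetical helper: return the cleaned-up, lowercased search phrase from a
# referrer URL, or nothing if the referrer is not a Google search.
sub extract_query {
    my ($referer) = @_;
    return unless index($referer, GOOGLE_PREFIX) == 0;

    my $qs = substr($referer, length(GOOGLE_PREFIX));
    $qs =~ s/&.*//;                               # drop parameters after q=
    $qs =~ tr/+/ /;                               # '+' encodes a space
    $qs =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # decode %XX escapes by hand
    $qs =~ s/[^ a-zA-Z0-9]//g;                    # keep letters, digits, spaces
    return lc $qs;
}

# Made-up referrer, for illustration only.
my $sample = 'http://www.google.com/search?q=computer+programmer+resume&hl=en';
print extract_query($sample), "\n";
```

Keeping this step in a function makes it easy to check against a few referrer
strings before pointing the full script at a real access_log.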





