SPUG: http link spiders, + open-source question

Parr, Ryan Ryan.Parr at wwireless.com
Tue May 7 14:46:46 CDT 2002


Can't help with #1, but for #2 you may want to contact London.pm about
http://nms-cgi.sourceforge.net; perhaps your script could be posted there.

-- Ryan Parr

Common sense is the collection of prejudices acquired by age eighteen.
		-- Albert Einstein


-----Original Message-----
From: dancerboy [mailto:dancerboy at strangelight.com] 
Sent: Tuesday, May 07, 2002 10:29 AM
To: Seattle Perl Users Group
Subject: SPUG: http link spiders, + open-source question


Hi all, I've got 2 questions.  The first isn't strictly Perl-related, 
but IME I'm likely to get better quality info from asking this list 
than I would from the more "appropriate" forums.  And I'll justify my 
post with question #2, which *is* more Perl-related...

QUESTION #1:

Can anyone recommend a good (preferably free) tool for 
scanning/spidering a web site for broken links and, more importantly, 
*orphaned files*?  My current tool for finding orphaned files does so 
strictly through reference-counting, but what I need is something 
more like mark-and-sweep.  I.e. my current tool will only find files 
that aren't linked to by *any* other file on the site -- but what I 
need is a tool that can find 
all of the files that are completely inaccessible when starting from 
a specific page or set of pages (e.g. starting from /index.html) even 
if those orphaned pages form cyclical links among themselves.  Does 
that make sense?
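
[A minimal sketch of the mark-and-sweep idea described above, not a 
recommendation of any particular tool.  It assumes the site's links have 
already been extracted into a dictionary mapping each file to the files 
it links to; the function name find_orphans and the sample file names 
are hypothetical.  It is written in Python for brevity; the same walk 
translates directly to Perl hashes and arrays.]

```python
from collections import deque


def find_orphans(links, roots):
    """Return the files in `links` that cannot be reached from any root."""
    # Mark phase: breadth-first walk from the roots, following links.
    queue = deque(r for r in roots if r in links)
    reachable = set(queue)
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            # Ignore broken links (targets that are not files on the site).
            if target in links and target not in reachable:
                reachable.add(target)
                queue.append(target)
    # Sweep phase: any file that was never marked is an orphan.
    return sorted(set(links) - reachable)


# Hypothetical site: old1.html and old2.html link to each other, so a
# reference count sees both as "linked", yet nothing reachable from
# /index.html points at either of them.
site = {
    "/index.html": ["/about.html", "/missing.html"],
    "/about.html": ["/index.html"],
    "/old1.html": ["/old2.html"],
    "/old2.html": ["/old1.html"],
}

print(find_orphans(site, ["/index.html"]))
# prints ['/old1.html', '/old2.html']
```

[Because the walk starts only from the given roots, a cycle of pages that 
link to one another is still reported as orphaned, which is the case 
plain reference counting misses.]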


QUESTION #2:

I'm working on a CGI script which I think would be very useful to a 
great number of people, *particularly* people of the less 
technically-inclined sort who are trying to maintain small, 
non-commercial web sites.  I intend to GPL the script and release it 
to the world.  My question is:  what's the best way to release it so 
that the people it's intended for can find it and use it?  Of course I 
plan to put it up on sourceforge, but that's a pretty geek-centric 
site, and I doubt if many of the intended users would find it there. 
(And no, I have no intention of turning it into a CPAN module: as I 
said, it's intended for non-programmers; and really, for anyone with 
enough programming savvy to know how to use a CPAN module, the 
functionality would be trivial to create from scratch.)  Any 
suggestions of other good places to post my script?  I don't want to 
go spamming all the web design forums with "hey everybody, check out 
my new script!" (I don't think my script is *that* important ;-) but 
I w
