[JaxPM] wget, etc...
Nate Campi
nate at campi.cc
Fri Aug 3 00:42:16 CDT 2001
On the jacksonville-pm-list; Jax.PM'er Nate Campi <nate at campi.cc> wrote -
Bill,
mod_rewrite your way around this. The request below sends no User-Agent
header at all, so none of your rules match.
(Granted, this is a painful way to do it, but I don't need pics, and I
could easily script it in sh to follow links and fetch them with netcat,
not to mention *gasp* Perl.)
[nate at monkey:nate]$ nc -v -v ora.sunhelp.org 80
ohno.mrbill.net [207.200.6.75] 80 (www) open
GET /index.html HTTP/1.1
Host: ora.sunhelp.org

HTTP/1.1 200 OK
Date: Fri, 03 Aug 2001 05:34:17 GMT
Server: Apache/1.3.20 (Unix) PHP/4.0.5
Last-Modified: Mon, 23 Jul 2001 04:10:53 GMT
ETag: "f3382-b9f-3b5ba3cd"
Accept-Ranges: bytes
Content-Length: 2975
Content-Type: text/html

<html>
<head><title>O'Reilly CD Bookshelf Library</title></head>
<body bgcolor="black" text="white" link="white" vlink="white">
<center>
<h3>This online reference is for private use only.</h3>
<hr><p>
<table width="75%" cellpadding="5" cellspacing="5">
<tr>
<td><a href="unix/upt/index.htm"><img
src="images/unixpowertools.jpg"></a></td>
<td><a href="unix/unixnut/index.htm"><img
src="images/unixnut.gif"></a></td>
<td><a href="unix/vi/index.htm"><img src="images/learnvi.jpg"></a></td>
<td><a href="unix/sedawk/index.htm"><img
src="images/sedawk.jpg"></a></td>
<td><a href="unix/ksh/index.htm"><img
src="images/learnkorn.jpg"></a></td>
<td><a href="unix/lrnunix/index.htm"><img
src="images/learnunix.jpg"></a></td>
</tr>
<tr>
<td><a href="networking/dnsbind/index.htm"><img
src="images/dnsbind.jpg"></a>
</td>
<td><a href="networking/tcpip/index.htm"><img
src="images/tcpip.jpg"></a></td>
<td><a href="networking/sendmail/index.htm"><img
src="images/sendmail.jpg"></a>
</td>
<td><a href="networking/smdref/index.htm"><img
src="images/senddesk.jpg"></a>
</td>
<td><a href="networking/firewall/index.htm"><img
src="images/firewalls.jpg">
</a></td>
<td><a href="networking/puis/index.htm"><img
src="images/security.jpg"></a>
</td>
</tr>
<tr>
<td><a href="perl/perlnut/index.htm"><img
src="images/perlnut.jpg"></a></td>
<td><a href="perl/learn/index.htm"><img
src="images/learnperl.jpg"></a></td>
<td><a href="perl/learn32/index.htm"><img
src="images/perlwin.gif"></a></td>
<td><a href="perl/prog/index.htm"><img
src="images/progperl.jpg"></a></td>
<td><a href="perl/advprog/index.htm"><img
src="images/advperl.gif"></a></td>
<td><a href="perl/cookbook/index.htm"><img
src="images/perlcook.jpg"></a></td>
</tr>
<tr>
<td><a href="webref/html/index.htm"><img
src="images/htmlguide.jpg"></a></td>
<td><a href="webref/cgi/index.htm"><img
src="images/cgiprog.jpg"></a></td>
<td><a href="webref/jscript/index.htm"><img
src="images/javascript.jpg"></a>
</td>
<td><a href="webref/perl/index.htm"><img
src="images/progperl.jpg"></a></td>
<td><a href="webref/webnut/index.htm"><img
src="images/webmaster.jpg"></a></td>
<td><a href="javaref/javanut/index.htm"><img
src="images/javanut.jpg"></a></td>
</tr>
<tr>
<td><a href="javaref/langref/index.htm"><img
src="images/javalang.jpg"></a></td>
<td><a href="javaref/awt/index.htm"><img
src="images/javaawt.jpg"></a></td>
<td><a href="javaref/fclass/index.htm"><img
src="images/javafund.jpg"></a></td>
<td><a href="javaref/exp/index.htm"><img
src="images/explorjava.jpg"></a></td>
<td><a href="oracle/prog2/index.htm"><img
src="images/oraplsql.gif"></a></td>
<td><a href="oracle/guide8i/index.htm"><img
src="images/ora8i.jpg"></a></td>
</tr>
<tr>
<td><a href="oracle/bipack/index.htm"><img
src="images/orabuilt.jpg"></a></td>
<td><a href="oracle/advprog/index.htm"><img
src="images/advora.jpg"></a></td>
<td><a href="oracle/webapp/index.htm"><img
src="images/oraweb.jpg"></a></td>
<td></td>
<td></td>
</tr>
</table>
</center>
</body>
</html>
sent 48, rcvd 3214
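The "script it in sh" idea mentioned above could look roughly like the following. This is only a sketch under stated assumptions: the function names (build_request, fetch, extract_links) are hypothetical and not from the original post, it uses HTTP/1.0 rather than the HTTP/1.1 typed by hand above so that the server closes the connection after responding, and it relies on `grep -o`, which GNU and BSD grep provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not from the original post.

# Print a minimal HTTP/1.0 request for path $2 on host $1.
# No User-Agent header is sent, just like the hand-typed request.
build_request() {
    printf 'GET /%s HTTP/1.0\r\nHost: %s\r\n\r\n' "$2" "$1"
}

# Send the request to port 80 of host $1 through netcat and print the reply.
fetch() {
    build_request "$1" "$2" | nc "$1" 80
}

# Read HTML on stdin and print the target of each <a href="..."> link,
# one per line. Only matches the exact lowercase form used on this page.
extract_links() {
    grep -o '<a href="[^"]*"' | sed -e 's/^<a href="//' -e 's/"$//'
}
```

Something like `fetch ora.sunhelp.org index.html | extract_links` would then list the relative paths (unix/upt/index.htm and so on), which a loop could feed back into fetch. The response headers pass through extract_links harmlessly because they contain no anchors.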
On Fri, Aug 03, 2001 at 12:21:49AM -0400, JONES, WILLIAM C wrote:
> On the jacksonville-pm-list; Jax.PM'er "JONES, WILLIAM C" <wcjones at exchange.fccj.org> wrote -
>
> Thx for reminding me about wget.
>
> I've set mod_rewrite to disallow that bot... I know I know - there are SO
> many others...
>
> (Plus you could change the finger-print of wget by recompiling...)
>
>
> But, what I've done will stop a LOT of script kiddies...
>
> Sx :]
>
>
> PS: The code, if interested -
>
> <IfModule mod_rewrite.c>
> RewriteEngine on
> RewriteLog /var/log/mod_rewrite.log
> RewriteLogLevel 0
>
> RewriteCond %{REQUEST_FILENAME} ^.+$
> RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
> RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
> RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
> RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
> RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
> RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
> RewriteCond %{HTTP_USER_AGENT} ^[Ww]get [OR]
> RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
> RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
> RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
> RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
> RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
> RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
> RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01 [OR]
> RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
> RewriteRule ^.*$ http://insecurity.org/nospam.html
> </IfModule>
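To see which clients the quoted User-Agent conditions would catch, the patterns can be approximated outside Apache. The sketch below folds the patterns from the RewriteCond lines into one extended regular expression and tests a string against it with grep. The function name ua_blocked is hypothetical, and grep's regex engine is only an approximation of the one mod_rewrite uses, so treat the results as indicative rather than exact.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the User-Agent string in $1
# matches any of the patterns from the quoted RewriteCond lines.
ua_blocked() {
    printf '%s\n' "$1" | grep -qE \
'^(EmailSiphon|EmailWolf|ExtractorPro|Mozilla.*NEWT|Crescent|CherryPicker|[Ww]get|[Ww]eb[Bb]andit|WebEMailExtrac.*|NICErsPRO|Telesoft|Zeus.*Webster|Microsoft.URL|Mozilla/3.Mozilla/2.01|EmailCollector)'
}
```

With this, `ua_blocked 'Wget/1.7'` succeeds, while `ua_blocked ''` fails: a request with a missing or empty User-Agent, such as the netcat one above, matches none of the patterns. Any client that sends a different string slips through the same way.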
--
Nate