[VPM] link 'bot' protection

Jer A jeremygwa at hotmail.com
Mon Feb 23 17:05:43 PST 2009



Thank you for your response.

What else can I do to prevent cross-site scripting? For example, if someone finds the HTML that references the form's CGI script and calls it from their own site, is there anything in Perl that would allow access from client computers (e.g. surfers) but block other domains (websites)?
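
One common, partial answer to this kind of question is to check the Referer header that browsers send with a form submission. The following is only a minimal sketch, not a complete defence: the host names in @ALLOWED_HOSTS and the helper referer_allowed() are hypothetical, and the Referer header can be omitted or forged by a script, so it only stops pages on other sites from pointing ordinary browsers at the CGI script.

```perl
use strict;
use warnings;

# Hypothetical list of hosts whose pages may submit the form.
my @ALLOWED_HOSTS = ('www.example.com', 'example.com');

# Return true if the Referer URL names one of the allowed hosts.
sub referer_allowed {
    my ($referer, @allowed) = @_;
    return 0 unless defined $referer && length $referer;

    # Take everything after the scheme up to the first '/', ':', '?' or '#'.
    my ($host) = $referer =~ m{^https?://([^/:?#]+)}i;
    return 0 unless defined $host;

    $host = lc $host;
    return scalar grep { $host eq lc $_ } @allowed;
}

# Under CGI the header arrives in $ENV{HTTP_REFERER}. The check only runs
# when the script is actually invoked by a web server.
if (exists $ENV{GATEWAY_INTERFACE}
    && !referer_allowed($ENV{HTTP_REFERER}, @ALLOWED_HOSTS)) {
    print "Status: 403 Forbidden\r\n";
    print "Content-Type: text/plain\r\n\r\n";
    print "Forbidden\n";
    exit;
}
```

Because the header is under the client's control, a stricter variant would put a per-session token in a hidden form field and verify it on submission; that is a different technique from the one sketched here.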


> Date: Mon, 23 Feb 2009 16:31:55 -0800
> From: semaphore_2000 at yahoo.com
> Subject: Re: [VPM] link 'bot' protection
> To: jeremygwa at hotmail.com
> 
> 
> I think ultimately that's fighting a rear-guard action. There are ways of blocking clients that grab too much too fast (many bots grab lots of pages in a short time, so they can be detected that way), and there are other tricks like that too. But if the scraper or bot is written correctly and is polite, taking pages slowly, ignoring robots.txt, and using a user-agent string that looks like an existing browser, then you'd have a hard time telling. Maybe use JavaScript to present the info so that scrapers that don't run JavaScript can't see it.
> 
> Anyway, I write web scrapers (er, in Perl - nice, well-behaved bots that do not suck a server's resources), and if you'd like, I can help you test. You might try it yourself by playing with the CPAN mech-shell, perhaps...
> 
> Doug
> 
> 
> --- On Mon, 2/23/09, Jer A <jeremygwa at hotmail.com> wrote:
> 
> > From: Jer A <jeremygwa at hotmail.com>
> > Subject: [VPM] link 'bot' protection
> > To: victoria-pm at pm.org
> > Date: Monday, February 23, 2009, 6:16 PM
> > Hi all,
> > 
> > I am designing a website service.
> > 
> > How do I keep automated bots, link scrapers, and
> > cross-site scripts from accessing the site without
> > hindering the user experience or the performance of
> > the host/server/site?
> > 
> > My site is not graphics-intensive, and I do not think anyone
> > would be interested in grabbing anything graphical,
> > only information/data.
> > 
> > I have thought of banning IPs by parsing log files,
> > but what should I look for that is 'fishy'?
> > 
> > Thanks in advance for all advice/help.
> > 
> > Regards,
> > Jeremy
> > 
> > 
> > _______________________________________________
> > Victoria-pm mailing list
> > Victoria-pm at pm.org
> > http://mail.pm.org/mailman/listinfo/victoria-pm
> 
> 
>       
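
On the log-parsing question in the thread, the one concrete signal mentioned is a client that grabs many pages in a short time. The following is a rough Perl sketch of that check, under stated assumptions: the access log is in Apache Common Log Format, the helper names parse_line() and fast_clients() are made up for illustration, and the thresholds (more than 20 requests in 10 seconds) are arbitrary values to be tuned against real traffic. As noted in the thread, a slow, polite scraper will not be caught this way.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Parse one Common Log Format line into [ip, epoch_seconds].
# Returns an empty list for lines that do not match.
sub parse_line {
    my ($line) = @_;
    my ($ip, $d, $mon, $y, $h, $m, $s, $sign, $oh, $om) = $line =~
        m{^(\S+) \S+ \S+ \[(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\]}
        or return;
    return unless exists $MONTH{$mon};

    # Treat the fields as UTC, then undo the logged timezone offset.
    my $epoch  = timegm($s, $m, $h, $d, $MONTH{$mon}, $y);
    my $offset = ($oh * 60 + $om) * 60;
    $epoch -= $sign eq '+' ? $offset : -$offset;
    return [$ip, $epoch];
}

# Given a reference to a list of [ip, epoch] pairs, return the sorted IPs
# that made more than $max requests within any span of $window seconds.
sub fast_clients {
    my ($hits, $window, $max) = @_;
    my (%recent, %flagged);
    for my $hit (sort { $a->[1] <=> $b->[1] } @$hits) {
        my ($ip, $time) = @$hit;
        my $queue = $recent{$ip} ||= [];
        push @$queue, $time;
        # Drop timestamps that have fallen out of the sliding window.
        shift @$queue while @$queue && $queue->[0] <= $time - $window;
        $flagged{$ip} = 1 if @$queue > $max;
    }
    return sort keys %flagged;
}

# Example use (thresholds are assumptions):
#   my @hits = map { parse_line($_) } <STDIN>;
#   print "$_\n" for fast_clients(\@hits, 10, 20);
```

The sliding window is kept per IP so that one burst is enough to flag a client; what to do with the flagged addresses (ban, throttle, or just review) is left open.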
