From rise at knavery.net Fri Nov 1 14:03:31 2002
From: rise at knavery.net (rise)
Date: Mon Aug 2 21:25:47 2004
Subject: [Boulder.pm] Nat Torkington on Regexes at Softpro on the 12th
Message-ID:

Something from SoftPro's newsletter that looks cool:

Nat Torkington on Regular Expressions

Place and Time: Boulder store -- Tuesday, November 12 at 7 pm.

Nat Torkington, co-author of Perl Cookbook and author of the upcoming
"Regular Expression Pocket Reference" (both by O'Reilly and
Associates), will be at the Boulder store to talk about how regular
expressions are becoming a fundamental part of languages other than
Perl (e.g. .NET and Java) and what Perl 6 has planned for regular
expressions.

If you are interested in attending, please email
events@softprowest.com with the word "Torkington" in the subject line.

--
Jonathan Conway rise@knavery.net
Eff the ineffable, scru the inscrutable. - Unknown

From walter at frii.com Tue Nov 12 11:37:29 2002
From: walter at frii.com (Walter Pienciak)
Date: Mon Aug 2 21:25:48 2004
Subject: [Boulder.pm] meeting/topic?
Message-ID:

Hi,

I enjoyed Rob's presentation on Extreme Programming last time.

I've been spending too much time thinking about spam and how to
deal with it. Once upon a time, I was happy to have everything
in one mailbox, and I'd just look at the subject lines.

Then the volume of mail rose as everyone started to use it, and
I fired up procmail to sort into folders and to tag some of the
remaining subject lines to make visual ID easier.

Then the spam started.

procmail started getting unwieldy, so I rolled my own basic
pattern-matching filtering program using Mail::Audit. But when
spammers can cost-effectively purchase a new domain for each spam
run, pattern-matching based on domain names becomes, uh, "less
effective."

So I move to a heuristics-based approach, with SpamAssassin.
Which works really well. But there's still some spam that sneaks
in under the radar, so I look at that, add custom rules . . .
And still some sneaks in, more than I want, and I realize that if
*I* were a smart spammer, I'd have a copy of SpamAssassin myself,
and would tweak the wording on my e-mails so that it didn't rise
above the default spam threshold with the default settings.

Huh.

So I've been checking out the Bayesian approaches lately.
Interesting, and it made me get out the encyclopedia, since I
never did take none o' them statistic classes in school.

A LOT of these programs are written in Perl. And so there's
the hook I need to make this on topic for the group.

Who would be interested in getting together for a meeting
centered around spam and the programs used to detect it?
Pros/cons of each, and if someone was feeling particularly academic
or informed about a program, they could give an intro to the
"interesting stuff" behind the mechanism -- e.g., Bayesian filtering.

Walter

From nagler at bivio.biz Tue Nov 12 12:39:59 2002
From: nagler at bivio.biz (Rob Nagler)
Date: Mon Aug 2 21:25:48 2004
Subject: [Boulder.pm] meeting/topic?
In-Reply-To:
References:
Message-ID: <15825.19199.957000.264119@gargle.gargle.HOWL>

Walter Pienciak writes:
> And still some sneaks in, more than I want, and I realize that if
> *I* were a smart spammer, I'd have a copy of SpamAssassin myself,
> and would tweak the wording on my e-mails so that it didn't rise
> above the default spam threshold with the default settings.

Yes, but it is hard to do. My latest venture (assurancesys.com) does
exactly this, and even translates SpamAssassin's regexes to English.
(That was an interesting project in itself. :) We also do delivery
and blacklist monitoring for interactive marketers.

> Who would be interested in getting together for a meeting
> centered around spam and the programs used to detect it?
> Pros/cons of each, and if someone was feeling particularly academic
> or informed about a program, they could give an intro to the
> "interesting stuff" behind the mechanism -- e.g., Bayesian filtering.
Bayesian is cool stuff, and essentially unbreakable. Spam was
originally repeated unsolicited email. Now, people opt-in, and still
call it spam. The Bayesian approach allows you to define "unwanted"
email tailored to your approach. I think it will work best, but can't
be installed for an entire site.

If you want site solutions, you may want to go with an ASP like
brightmail.com. Apparently brightmail has a very low false positive
rate (like near zero), which is the biggest danger with site-wide
filters.

As to a meeting topic, sounds good to me.

Rob

From zeb at utalk.org Tue Nov 12 15:35:48 2002
From: zeb at utalk.org (lz)
Date: Mon Aug 2 21:25:48 2004
Subject: [Boulder.pm] meeting/topic?
In-Reply-To:
References:
Message-ID: <65180776152.20021112143548@utalk.org>

Hello Walter,

Have you checked out MailWasher? Been using it a week now
and am impressed. It dl's just headers then you pick the
ones to blacklist and send a 'bounce' saying 'unknown user'
all from the isp server. The bounce should get one off spam
lists, little by little, as they are cleaned.

I've noticed a reduction from 20-25% down to about 5%.

Lots more to it. Info at:
http://www.mailwasher.net/

Only for windoz, if that's not for you then maybe you can
get some ideas from it for your pearl program. (to stay on
topic)

zya

Historians believe that on Tuesday, November 12, 2002
Walter Pienciak wrote and made these points
on the subject of "[Boulder.pm] meeting/topic?":

> Hi,
> I enjoyed Rob's presentation on Extreme Programming last time.
> I've been spending too much time thinking about spam and how to
> deal with it. Once upon a time, I was happy to have everything
> in one mailbox, and I'd just look at the subject lines.
> Then the volume of mail rose as everyone started to use it, and
> I fired up procmail to sort into folders and to tag some of the
> remaining subject lines to make visual ID easier.
> Then the spam started.
> procmail started getting unwieldy, so I rolled my own basic
> pattern-matching filtering program using Mail::Audit. But when
> spammers can cost-effectively purchase a new domain for each spam
> run, pattern-matching based on domain names becomes, uh, "less
> effective."
> So I move to a heuristics-based approach, with SpamAssassin.
> Which works really well. But there's still some spam that sneaks
> in under the radar, so I look at that, add custom rules . . .
> And still some sneaks in, more than I want, and I realize that if
> *I* were a smart spammer, I'd have a copy of SpamAssassin myself,
> and would tweak the wording on my e-mails so that it didn't rise
> above the default spam threshold with the default settings.
> Huh.
> So I've been checking out the Bayesian approaches lately.
> Interesting, and it made me get out the encyclopedia, since I
> never did take none o' them statistic classes in school.
> A LOT of these programs are written in Perl. And so there's
> the hook I need to make this on topic for the group.
> Who would be interested in getting together for a meeting
> centered around spam and the programs used to detect it?
> Pros/cons of each, and if someone was feeling particularly academic
> or informed about a program, they could give an intro to the
> "interesting stuff" behind the mechanism -- e.g., Bayesian filtering.
> Walter
> _______________________________________________
> Boulder-pm mailing list
> Boulder-pm@mail.pm.org
> http://mail.pm.org/mailman/listinfo/boulder-pm

--
Best regards,

From walter at frii.com Wed Nov 13 10:38:46 2002
From: walter at frii.com (Walter Pienciak)
Date: Mon Aug 2 21:25:48 2004
Subject: [Boulder.pm] meeting/topic?
In-Reply-To: <65180776152.20021112143548@utalk.org>
Message-ID:

On Tue, 12 Nov 2002, lz wrote:

> Hello Walter,
>
> Have you checked out MailWasher? Been using it a week now
> and am impressed.
> It dl's just headers then you pick the
> ones to blacklist and send a 'bounce' saying 'unknown user'
> all from the isp server. The bounce should get one off spam
> lists, little by little, as they are cleaned.
>
> I've noticed a reduction from 20-25% down to about 5%.
>
> Lots more to it. Info at:
> http://www.mailwasher.net/
>
> Only for windoz, if that's not for you then maybe you can
> get some ideas from it for your pearl program. (to stay on
> topic)
>
> zya

Nope, I'm a Unix fella these days. Plus, I'm trying to solve the
problem not just for my own inbox(es) but for other users also.

Bouncing isn't a bad response at all, but when I'm transitioning
among spam detectors, I like to eyeball it for a while to make
sure it's detecting ONLY spam. But when I have bouncing enabled,
I do see the same reduction in incoming spam that you report.

One issue there is that if you have admin responsibilities, and
someone forwards a spam to complain (either they're a user and they
received it, or someone thinks one of your users sent it), it's not
always pleasing to the sender to have the thing bounce back to them
with a "Bad Spammer! Bad! Bad!" message. Whitelists can *help*,
but I'm thinking the Bayesian approach is going to be the winner
there.

Walter

p.s. -- It's perl or Perl or PERL, but not pearl.

From walter at frii.com Wed Nov 13 10:45:34 2002
From: walter at frii.com (Walter Pienciak)
Date: Mon Aug 2 21:25:48 2004
Subject: [Boulder.pm] meeting/topic?
In-Reply-To: <15825.19199.957000.264119@gargle.gargle.HOWL>
Message-ID:

On Tue, 12 Nov 2002, Rob Nagler wrote:

> Bayesian is cool stuff, and essentially unbreakable. Spam was
> originally repeated unsolicited email. Now, people opt-in, and still
> call it spam. The Bayesian approach allows you to define "unwanted"
> email tailored to your approach. I think it will work best, but can't
> be installed for an entire site.

I wonder if you've bought into more of the spammer propaganda
than you realize?
;^)

I would rephrase "Now, people opt-in, and still call it spam." as
"Now, spammers pass my address amongst themselves, and call it
opting in."

Cheers,

Walter

From walter at frii.com Thu Nov 14 22:35:01 2002
From: walter at frii.com (Walter Pienciak)
Date: Mon Aug 2 21:25:48 2004
Subject: [Boulder.pm] meeting/topic?
In-Reply-To:
Message-ID:

Huh, not much response.

Rob, Don, maybe this is a better topic for a hike or walk
somewhere than a "meeting" -- 'tis a dud topic.

What do you think? Walk and talk some afternoon?

Walter

From nagler at bivio.biz Thu Nov 14 22:52:36 2002
From: nagler at bivio.biz (Rob Nagler)
Date: Mon Aug 2 21:25:48 2004
Subject: [Boulder.pm] meeting/topic?
In-Reply-To:
References:
Message-ID: <15828.32148.343000.771356@gargle.gargle.HOWL>

Walter Pienciak writes:
> What do you think? Walk and talk some afternoon?

Sure.

Rob