From extasia at extasia.org Thu Dec 6 11:52:31 2007
From: extasia at extasia.org (David Alban)
Date: Thu, 6 Dec 2007 11:52:31 -0800
Subject: [sf-perl] mine vs ours
Message-ID: <4c714a9c0712061152s3392a408re2d184cf2aad53e@mail.gmail.com>

greetings,

I want bar() to access a constant i define in foo().  It can't be
defined outside of foo() because in my "real" program, it depends on a
value computed in foo().  In fact there's going to be a lot of
routines called by foo() that will need

With the my() statement uncommented, it complains "Global symbol
"$SOME_CONSTANT" requires explicit package name".  With the our()
statement uncommented it seems to work.

#!/usr/bin/perl

use strict;
use warnings;

use Readonly;

foo();

sub foo {
    # Readonly::Scalar our $SOME_CONSTANT => 'FOOBAR';
    Readonly::Scalar my $SOME_CONSTANT => 'FOOBAR';
    bar();
} # foo

sub bar {
    print "$SOME_CONSTANT\n";
} # bar

So it would seem to me to use our() rather than my() for this.  Any
reason why I shouldn't?  Sanity checking...

Thanks,
David
--
Live in a world of your own, but always welcome visitors.

From extasia at extasia.org Thu Dec 6 11:59:33 2007
From: extasia at extasia.org (David Alban)
Date: Thu, 6 Dec 2007 11:59:33 -0800
Subject: [sf-perl] mine vs ours
In-Reply-To: <4c714a9c0712061152s3392a408re2d184cf2aad53e@mail.gmail.com>
References: <4c714a9c0712061152s3392a408re2d184cf2aad53e@mail.gmail.com>
Message-ID: <4c714a9c0712061159m28080b7cnd0aebc1c6362d0cf@mail.gmail.com>

heh.  using our() in the "real" program, i get:

Variable "$MASTER_PACKAGE_DIRECTORY" is not imported at mypgm line 5433.
Variable "$MASTER_PACKAGE_DIRECTORY" is not imported at mypgm line 5434.
Variable "$MASTER_PACKAGE_DIRECTORY" is not imported at mypgm line 5437.
Global symbol "$MASTER_PACKAGE_DIRECTORY" requires explicit package name at mypgm line 5433.
Global symbol "$MASTER_PACKAGE_DIRECTORY" requires explicit package name at mypgm line 5434.
Global symbol "$MASTER_PACKAGE_DIRECTORY" requires explicit package name at mypgm line 5437.

perhaps i need to give up making this a constant, and have it be a variable...

--
Live in a world of your own, but always welcome visitors.

From extasia at extasia.org Tue Dec 11 12:28:08 2007
From: extasia at extasia.org (David Alban)
Date: Tue, 11 Dec 2007 12:28:08 -0800
Subject: [sf-perl] which first: remove non-data lines or process line continuations?
Message-ID: <4c714a9c0712111228j5e3eeca7g3a5932c2fdba5217@mail.gmail.com>

greetings,

I'm parsing text files.  in these text files, a data line is any line
except a comment line, blank line, or null line.  I ignore all lines
that are not data lines.
That is, I process only lines not matching:

  m{ \A \s* (?: \# | \z ) }xms

I also allow line continuation.  That is, backslash-newline pairs are
deleted (after backslash-quoted backslashes are "protected").

So I have a choice.  I can process removal of non-data lines first.
Or I can process line continuations first.

Take the following set of lines:

  foo \
  # : bar \
  : bat
  mumble \
  : squeak

If I process line continuations first, my data lines become:

  ( 'foo # : bar : bat', 'mumble : squeak' )

If I process removal of non-data lines first, I get:

  ( 'foo : bat', 'mumble : squeak' )

I'm leaning toward removing the non-data lines first.  But I wanted to
see if anyone had any strong opinions or otherwise interesting
observations.

Thanks,
David
--
Live in a world of your own, but always welcome visitors.

From kvale at phy.ucsf.edu Tue Dec 11 12:48:32 2007
From: kvale at phy.ucsf.edu (Mark Kvale)
Date: Tue, 11 Dec 2007 12:48:32 -0800
Subject: [sf-perl] which first: remove non-data lines or process line continuations?
In-Reply-To: <4c714a9c0712111228j5e3eeca7g3a5932c2fdba5217@mail.gmail.com>
References: <4c714a9c0712111228j5e3eeca7g3a5932c2fdba5217@mail.gmail.com>
Message-ID: <475EF7A0.4030104@phy.ucsf.edu>

My opinion is that comments should be able to be removed completely
without changing the meaning or structure of data.  If people are
creating multi-line comments by using continuations, rather than
comment lines all starting with #, that's evil.

So I'd take out all comments first and then process the continuations.

Best of all is to get the file format of your data files from the
source that generated it.

Mark

David Alban wrote:
> greetings,
>
> I'm parsing text files.  in these text files, a data line is any line
> except a comment line, blank line, or null line.  I ignore all lines
> that are not data lines.  That is, I process only lines not matching:
>
>   m{ \A \s* (?: \# | \z ) }xms
>
> I also allow line continuation.  That is, backslash-newline pairs are
> deleted (after backslash-quoted backslashes are "protected").
>
> So I have a choice.  I can process removal of non-data lines first.
> Or I can process line continuations first.
>
> Take the following set of lines:
>
>   foo \
>   # : bar \
>   : bat
>   mumble \
>   : squeak
>
> If I process line continuations first, my data lines become:
>
>   ( 'foo # : bar : bat', 'mumble : squeak' )
>
> If I process removal of non-data lines first, I get:
>
>   ( 'foo : bat', 'mumble : squeak' )
>
> I'm leaning toward removing the non-data lines first.  But I wanted to
> see if anyone had any strong opinions or otherwise interesting
> observations.
>
> Thanks,
> David

From merlyn at stonehenge.com Tue Dec 11 12:48:57 2007
From: merlyn at stonehenge.com (Randal L. Schwartz)
Date: Tue, 11 Dec 2007 12:48:57 -0800
Subject: [sf-perl] which first: remove non-data lines or process line continuations?
In-Reply-To: <4c714a9c0712111228j5e3eeca7g3a5932c2fdba5217@mail.gmail.com> (David Alban's message of "Tue, 11 Dec 2007 12:28:08 -0800")
References: <4c714a9c0712111228j5e3eeca7g3a5932c2fdba5217@mail.gmail.com>
Message-ID: <861w9tot3a.fsf@blue.stonehenge.com>

>>>>> "David" == David Alban writes:

David> I'm leaning toward removing the non-data lines first.  But I wanted to
David> see if anyone had any strong opinions or otherwise interesting
David> observations.

I've seen it done both ways.  Whatever you do, make sure you can still
represent every possible input.  If I recall correctly, there's some
combination of backslash and newline that cannot be represented in a csh
single-quoted string because they fouled that up.

In other words, "\\" at the end of the line should be a real "\", where
"\" at the end of the string should swallow the newline and backslash as
if it doesn't exist.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

From sphink at gmail.com Tue Dec 11 14:45:25 2007
From: sphink at gmail.com (Steve Fink)
Date: Tue, 11 Dec 2007 14:45:25 -0800
Subject: [sf-perl] which first: remove non-data lines or process line continuations?
In-Reply-To: <4c714a9c0712111228j5e3eeca7g3a5932c2fdba5217@mail.gmail.com>
References: <4c714a9c0712111228j5e3eeca7g3a5932c2fdba5217@mail.gmail.com>
Message-ID: <7d7f2e8c0712111445w4b83b22gc1ee275e99d54d5a@mail.gmail.com>

> I'm leaning toward removing the non-data lines first.  But I wanted to
> see if anyone had any strong opinions or otherwise interesting
> observations.

I was going to say that gmake removes non-data first, but then I tried
it, and discovered that gmake is insane. This makefile:

--------
X = a \
# b \
c
d : D
-------

Seems to be processed as "Let's see... this X line is continued, so
add in the next line. Hmm... a comment! Let's start ignoring. The
comment ends with a backslashed newline, so let's ignore the next line
too. Okay, let's see. Now we have d : D. Great!" It forgets completely
about the X continuation. That should have been either "X=a c\nd : D"
or "X=a d : D". gmake does neither.

My initial vote would be for line continuations first, because I like
the global rule that any line can be continued by ending it with a
backslash, but on further reflection, I think I'd go for non-data
first. It is more useful in practice -- there are many times when I
have a long series of single words in a Makefile that I put on their
own line (followed by a backslash), and it's nice to be able to
comment them out individually.

From lavendula6654 at yahoo.com Wed Dec 12 16:14:57 2007
From: lavendula6654 at yahoo.com (Elaine)
Date: Wed, 12 Dec 2007 16:14:57 -0800 (PST)
Subject: [sf-perl] Computer Classes at Foothill
Message-ID: <145957.52401.qm@web31711.mail.mud.yahoo.com>

Winter quarter classes start Monday, 7 January, at Foothill College.
These two may be of interest to you:

1) Introduction to Python Programming
   Prerequisite: Any programming language experience
   CIS 68K - Monday evenings at Middlefield campus in Palo Alto

2) Application Software Development with Ajax
   Prerequisite: Knowledge of HTML and JavaScript
   COIN 71 - Thursday evenings at Middlefield campus in Palo Alto

If you are interested in taking a class, please register as soon as
possible by going to:
http://www.foothill.fhda.edu/reg/index.php
If not enough students sign up, a class may be cancelled.

If you have any questions, please contact the instructor, Elaine
Haight, at haightElaine at foothill.edu

____________________________________________________________________________________
Never miss a thing.  Make Yahoo your home page.
http://www.yahoo.com/r/hs

From quinn at fairpath.com Thu Dec 13 14:42:12 2007
From: quinn at fairpath.com (Quinn Weaver)
Date: Thu, 13 Dec 2007 14:42:12 -0800
Subject: [sf-perl] No PM meeting in December; BASS meeting December 26
Message-ID: <20071213224212.GA52519@fu.funkspiel.org>

There will be no Perl Monger meeting in December.  However, there is a
BASS dinner on December 26 (the fourth Wednesday).  I'll be out of town,
but all other Perl Mongers are invited to attend.

http://www.cfcl.com/rdm/bass/

Have fun, everyone!  I'll see you next year. :)

--
Quinn Weaver, independent contractor | President, San Francisco Perl Mongers
http://fairpath.com/quinn/resume/ | http://sf.pm.org/
510-520-5217

From rdm at cfcl.com Sun Dec 23 09:39:35 2007
From: rdm at cfcl.com (Rich Morin)
Date: Sun, 23 Dec 2007 09:39:35 -0800
Subject: [sf-perl] BASS Meeting (SF), Wed. December 26
Message-ID:

Our XO (OLPC) arrived recently, so we'll be bringing it to BASS for
folks to examine.  The UI is based on Squeak and Python is used for
a lot of the programming, so it's quite a testament to the power of
scripting languages.  Anyway, take a break from Christmas leftovers
and join us for some tasty Chinese food and scintillating talk...

The Beer and Scripting SIG rides again!  If you'd like to eat good
Chinese food, chat with other local scripters, and possibly take a
look at laptop-demoed scripting hacks, this is the place to do it!

For your convenience, here are the critical details:

  Date:   Wednesday, December 26, 2007 (4th. Wed.)
  Time:   8:00 pm
  Place:  Wild Pepper
          3601 26th St. (near San Jose Ave.)
          San Francisco, CA, USA
          415/695-767[89]

See the BASS web page for more information:

  http://cfcl.com/rdm/bass/

-r

P.S.  Mark your calendar for the second PeepCode & Pizza gathering
      (Thursday, 1/24 in Redwood City).  We'll be watching and
      discussing the Capistrano 2.1 screencast; for more information,
      see http://ruby.meetup.com/123/
--
http://www.cfcl.com/rdm            Rich Morin
http://www.cfcl.com/rdm/resume     rdm at cfcl.com
http://www.cfcl.com/rdm/weblog     +1 650-873-7841

Technical editing and writing, programming, and web development

From afife at untangle.com Mon Dec 24 13:31:37 2007
From: afife at untangle.com (Andrew Fife)
Date: Mon, 24 Dec 2007 13:31:37 -0800 (PST)
Subject: [sf-perl] Eric S. Raymond @ BALUG (Jan 15th)
Message-ID: <003a01c84674$5ebfd500$4301a8c0@Untangle.local>

Howdy Folks:

Eric S. Raymond will be kicking off the start of a great 2008 at The Bay
Area Linux Users Group (BALUG) with a talk on January 15th.  If you
haven't been to BALUG in a while, this a great opportunity to check out
what we're up to... and who knows you may just wind up eating dinner
with Eric S. Raymond at your table.

If you'd like to come, please RSVP: RSVP at balug.org

Upcoming 2008 speakers include:
Jan - Eric S. Raymond
Feb - Bruce Perens
March - TBD
April - Eric Allman
May - Jeremy Allison
June - Andrew Morton

So why not signup for BALUG's extremely low volume announce list:
http://lists.balug.org/listinfo.cgi/balug-announce-balug.org

Meeting Details...
6:30pm
January 15th, 2008
Four Seas Restaurant
731 Grant Ave.
San Francisco, CA 94108
Parking: http://www.portsmouthsquaregarage.com/
Cost: The meetings are always free, but dinner is $13

About BALUG:
BALUG is lively gathering of Linux users & free software enthusiasts
that combines great food, community & intimate access to featured
speakers.  We meet in the bar of the Four Seas Restaurant from 6:30pm.
At 7pm, we share a family-style Chinese dinner, which is followed by
our guest speaker.

BALUG Mailing list Policy:
BALUG promises not to abuse other LUGs mailing lists.  Our current
policy is to make one monthly announcement on other Bay Area LUGs
mailing lists.  If you feel this is not appropriate for a particular
list, please tell us which list and what you feel would be a more
appropriate policy for that list.  Please send feedback to
balug-contact at balug.org.

----------------------------------------
Andrew Fife
Untangle - Open Source Security Gateway
download.untangle.com
650.425.3327 (O)
415.806.6028 (C)
afife at untangle.com

From andy at petdance.com Mon Dec 24 18:17:25 2007
From: andy at petdance.com (Andy Lester)
Date: Mon, 24 Dec 2007 20:17:25 -0600
Subject: [sf-perl] Eric S. Raymond @ BALUG (Jan 15th)
In-Reply-To: <003a01c84674$5ebfd500$4301a8c0@Untangle.local>
References: <003a01c84674$5ebfd500$4301a8c0@Untangle.local>
Message-ID:

The San Francisco guys have much less trouble filling speaking slots.

> Eric S. Raymond will be kicking off the start of a great 2008 at The
> Bay
> Area Linux Users Group (BALUG) with a talk on January 15th.  If you
>
> Jan - Eric S. Raymond
> Feb - Bruce Perens
> March - TBD
> April - Eric Allman
> May - Jeremy Allison
> June - Andrew Morton

Jeepers!

xoxo,
Andy

--
Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance

From afife at untangle.com Tue Dec 25 12:36:37 2007
From: afife at untangle.com (Andrew Fife)
Date: Tue, 25 Dec 2007 12:36:37 -0800 (PST)
Subject: [sf-perl] Eric S.
 Raymond @ BALUG (Jan 15th)
In-Reply-To:
References:
Message-ID: <002f01c84735$d9b97d30$4301a8c0@Untangle.local>

Andy Lester:

Happy to try and get Untangle's CTO to give his talk on building entire
networks in software to speak at SF-PM if you are having trouble
recruiting speakers.  You can check out a description and .ppt slides
(scroll down) at PenLUG here:

http://www.penlug.org/twiki/bin/view/Home/MeetingAgenda20071025

Lemme know if that is something you are interested in.

-Andrew

----------------------------------------
Andrew Fife
Untangle - Open Source Security Gateway
download.untangle.com
650.425.3327 (O)
415.806.6028 (C)
afife at untangle.com

From andy at petdance.com Tue Dec 25 13:39:40 2007
From: andy at petdance.com (Andy Lester)
Date: Tue, 25 Dec 2007 15:39:40 -0600
Subject: [sf-perl] Eric S. Raymond @ BALUG (Jan 15th)
In-Reply-To: <002f01c84735$d9b97d30$4301a8c0@Untangle.local>
References: <002f01c84735$d9b97d30$4301a8c0@Untangle.local>
Message-ID: <20071225213940.GB26305@petdance.com>

> Happy to try and get Untangle's CTO to give his talk on building entire
> networks in software to speak at SF-PM if you are having trouble
> recruiting speakers.  You can check out a description and .ppt slides
> (scroll down) at PenLUG here:

Sorry about that, I meant to forward to my Chicago Perl Mongers group.

I'm out here in the Chicago area, and we don't have the heavy hitters
the bay area does.

But hey, if anyone's comin' out to the Chicago area, we're glad to have
'em!

xoa

--
Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance

From hwigoda at mindspring.com Wed Dec 26 02:45:26 2007
From: hwigoda at mindspring.com (hwigoda at mindspring.com)
Date: Wed, 26 Dec 2007 05:45:26 -0500 (EST)
Subject: [sf-perl] [Chicago-talk] Eric S.
 Raymond @ BALUG (Jan 15th)
Message-ID: <27785881.1198665927112.JavaMail.root@mswamui-blood.atl.sa.earthlink.net>

yes, but only chicago has the incorrigible andy lester.

-----Original Message-----
>From: Andy Lester
>Sent: Dec 24, 2007 9:17 PM
>To: San Francisco Perl Mongers User Group
>Subject: Re: [Chicago-talk] [sf-perl] Eric S. Raymond @ BALUG (Jan 15th)
>
>The San Francisco guys have much less trouble filling speaking slots.
>
>> Eric S. Raymond will be kicking off the start of a great 2008 at The
>> Bay
>> Area Linux Users Group (BALUG) with a talk on January 15th.  If you
>>
>> Jan - Eric S. Raymond
>> Feb - Bruce Perens
>> March - TBD
>> April - Eric Allman
>> May - Jeremy Allison
>> June - Andrew Morton
>
>
>Jeepers!
>
>xoxo,
>Andy
>
>--
>Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance
>
>
>
>
>_______________________________________________
>Chicago-talk mailing list
>Chicago-talk at pm.org
>http://mail.pm.org/mailman/listinfo/chicago-talk

From nheller at silcon.com Sun Dec 30 12:22:35 2007
From: nheller at silcon.com (Neil Heller)
Date: Sun, 30 Dec 2007 12:22:35 -0800
Subject: [sf-perl] Testing a web crawler
In-Reply-To: <003a01c84674$5ebfd500$4301a8c0@Untangle.local>
References: <003a01c84674$5ebfd500$4301a8c0@Untangle.local>
Message-ID: <000101c84b21$b733ac00$259b0400$@com>

I have been asked to consider ways to test a web crawler (aka search
engine).

There are lots of "positive"-type tests I can think of that mostly
deal with "are the returned pages really relevant to the request".

How does one test for pages that are (or might be) relevant but were
missed by the web crawler?

What might be the best (is that a trick word?) or most optimized
request given someone's desire to find information?
From friedman at highwire.stanford.edu Sun Dec 30 14:47:33 2007
From: friedman at highwire.stanford.edu (Michael Friedman)
Date: Sun, 30 Dec 2007 14:47:33 -0800
Subject: [sf-perl] Testing a web crawler
In-Reply-To: <000101c84b21$b733ac00$259b0400$@com>
References: <003a01c84674$5ebfd500$4301a8c0@Untangle.local> <000101c84b21$b733ac00$259b0400$@com>
Message-ID:

I don't know about the optimization tests, but negative tests pretty
much require "outside" knowledge. You need to know things that the
software doesn't so that you can predict non-matching data.

For example, you would use some other search engine to gather results
and then pick some of the relevant results from there that aren't in
your own search engine.

Or if you are using a limited dataset (which you should be, for
repeatable unit tests) you can intentionally place files in there that
are "close but don't meet the current algorithm". Then the test checks
to make sure those files don't appear in search results. If the
algorithm changes later and the files that you knew shouldn't appear
suddenly do, then you know you've become too inclusive and need to
ratchet back.

I haven't done work on search engines, but I do work with a journal
reference <-> journal citation matching algorithm that has to perform
similar discrimination between "good" and "not quite good enough"
matches. I've had to create data for each of the 8 factors involved in
my algorithm, both positive and negative, plus some random "don't
match anything" records. The dataset quickly becomes large, but if you
document it correctly it shouldn't be too hard to maintain.

Good luck!
-- Mike

On Dec 30, 2007, at 12:22 PM, Neil Heller wrote:

> I have been asked to consider ways to test a web crawler (aka search
> engine).
>
> There are lots of "positive"-type tests I can think of that mostly
> deal with
> "are the returned pages really relevant to the request".
>
> How does one test for pages that are (or might be) relevant but were
> missed
> by the web crawler?
>
> What might be the best (is that a trick word?) or most optimized
> request
> given someone's desire to find information?
>
>
>
> _______________________________________________
> SanFrancisco-pm mailing list
> SanFrancisco-pm at pm.org
> http://mail.pm.org/mailman/listinfo/sanfrancisco-pm

---------------------------------------------------------------------
Michael Friedman                     HighWire Press
Phone: 650-725-1974                  Stanford University
FAX:   270-721-8034
---------------------------------------------------------------------
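
The fixed-dataset approach to negative tests described in the reply above
could be sketched in Perl roughly as follows. This is only an illustration
under stated assumptions: the %corpus contents, the search() stand-in (a
naive "every term as a whole word" match), and the check_results() helper
are all hypothetical names invented for the example, not part of any real
crawler or of anyone's actual test suite.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical fixture corpus (document id => text). A real suite would
# use a small, fixed set of crawled pages so the tests stay repeatable.
my %corpus = (
    'perl-regex'   => 'perl regular expressions tutorial',
    'perl-io'      => 'reading files in perl',
    'pearl-diving' => 'pearl diving holidays',   # near miss: "pearl", not "perl"
    'python-regex' => 'python regular expressions tutorial',
);

# Stand-in for the engine under test (an assumption, not a real API):
# returns the sorted ids of documents containing every query term as a
# whole word.
sub search {
    my ($query) = @_;
    my @terms = split ' ', lc $query;
    my @hits;
    for my $id (keys %corpus) {
        my $text    = lc $corpus{$id};
        my $missing = grep { $text !~ /\b\Q$_\E\b/ } @terms;   # count of absent terms
        push @hits, $id unless $missing;
    }
    my @sorted = sort @hits;
    return @sorted;
}

# Compare results against "outside" knowledge: ids that must be returned
# and near-miss ids that must not be.
sub check_results {
    my ($query, $expected, $must_not_match) = @_;
    my %got    = map { $_ => 1 } search($query);
    my @missed = grep { !$got{$_} } @$expected;         # relevant but not returned
    my @leaked = grep {  $got{$_} } @$must_not_match;   # returned but should not be
    return (\@missed, \@leaked);
}

my ($missed, $leaked) = check_results(
    'perl regular expressions',
    ['perl-regex'],
    ['pearl-diving', 'python-regex', 'perl-io'],
);
print "missed: @$missed\n";
print "leaked: @$leaked\n";
```

With a real engine, search() would call the system under test, and the
expected list would come from outside knowledge such as results gathered
from another search engine. A non-empty "leaked" list after an algorithm
change is the "too inclusive" signal; a non-empty "missed" list is the
"relevant but not found" case.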