From jkeroes at eli.net Tue Nov 4 00:04:55 2003 From: jkeroes at eli.net (Joshua Keroes) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] Code review Message-ID: Come one, come all to a night of code review! We promise, this will be altogether unlike any code review at work. You lucky contestants get to: 1. Bring code. 2. Talk to people. 3. Get additional eyes to look over your code. 4. Treat those people to beer, food, pool, your HUMAN SOUL; or something else nice. Others may: 1. Constructively criticize code. 2. ...win a lifesize talking alarm clock of super-mega-sitcom star, Fran Drescher! Some lucky few may also witness code being refactored - before your very eyes! A select number will also watch as your fellow coders take on the physical and mental confidence of Charles Atlas that only smooth, clean, succinct, clear code can provide. No sand getting kicked in anyones' eyes here! --- So, the next Perl Mongers meeting is in about ten days. To make this happen, we need those who have improvable code and don't mind admitting that it can be improved, and those who can improve it without being either heavy-handed or low-handed. Reply if you're interested. There needs to be enough interested parties for this to work. -Joshua PS I'll post the location details later. From schwern at pobox.com Tue Nov 4 00:17:06 2003 From: schwern at pobox.com (Michael G Schwern) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] Code review In-Reply-To: References: Message-ID: <20031104061706.GB5649@localhost.comcast.net> On Mon, Nov 03, 2003 at 10:04:55PM -0800, Joshua Keroes wrote: > So, the next Perl Mongers meeting is in about ten days. To make this > happen, we need those who have improvable code and don't mind admitting > that it can be improved, and those who can improve it without being > either heavy-handed or low-handed. Can we at least administer corrective wedgies? -- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern/ Playstation? Of course Perl runs on Playstation. -- Jarkko Hietaniemi From poec at yahoo.com Tue Nov 4 11:44:53 2003 From: poec at yahoo.com (Ovid) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] Code review In-Reply-To: Message-ID: <20031104174453.20460.qmail@web40408.mail.yahoo.com> --- Joshua Keroes wrote: > Reply if you're interested. There needs to be enough interested parties > for this to work. Err ... the code I most want reviewed right now is some C with a thin Perl wrapper. Fair game? I suspect not :( Cheers, Ovid ===== Silence is Evil http://users.easystreet.com/ovid/philosophy/indexdecency.htm Ovid http://www.perlmonks.org/index.pl?node_id=17000 Web Programming with Perl http://users.easystreet.com/ovid/cgi_course/ __________________________________ Do you Yahoo!? Protect your identity with Yahoo! Mail AddressGuard http://antispam.yahoo.com/whatsnewfree From hydo at mac.com Tue Nov 4 18:12:18 2003 From: hydo at mac.com (Clint Moore) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] Code review In-Reply-To: <20031104174453.20460.qmail@web40408.mail.yahoo.com> References: <20031104174453.20460.qmail@web40408.mail.yahoo.com> Message-ID: On Nov 4, 2003, at 9:44 AM, Ovid wrote: > --- Joshua Keroes wrote: >> Reply if you're interested. There needs to be enough interested >> parties >> for this to work. > > Err ... the code I most want reviewed right now is some C with a thin > Perl wrapper. Fair game? I > suspect not :( > I'm probably pretty rusty but i'll take a look at it. Bring it! 
-cm From kyle at silverbeach.net Tue Nov 4 19:25:15 2003 From: kyle at silverbeach.net (Kyle Hayes) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] Code review In-Reply-To: <20031104174453.20460.qmail@web40408.mail.yahoo.com> References: <20031104174453.20460.qmail@web40408.mail.yahoo.com> Message-ID: <200311041725.15172.kyle@silverbeach.net> On Tuesday 04 November 2003 09:44, Ovid wrote: > --- Joshua Keroes wrote: > > Reply if you're interested. There needs to be enough interested parties > > for this to work. > > Err ... the code I most want reviewed right now is some C with a thin Perl > wrapper. Fair game? I suspect not :( I'll look at it. I've been playing with Inline::C anyway lately. Best, Kyle From john at digitalmx.com Tue Nov 4 20:58:34 2003 From: john at digitalmx.com (John Springer) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] saving state with CGI.pm Message-ID: I'm having a problem using CGI.pm to save state. Maybe I'm using the wrong tool?? Anyways... I have users going through several forms to collect information, and I'm saving the state of the CGI object in a session file. But I want to keep a "running list" of all the data that has been set across all the forms, so the user can bounce back and forth without losing anything. I got it to work but it's awkward and took a lot of trial and error. Here's what I'm doing: 1. create new cgi object with form values ($q= new CGI();) 2. open another CGI object from the previously saved state. ($p= new CGI(FILEHANDLE);) 3. add all the variables from $p that aren't in $q to $q. if($var is in $p but not in $q) { $val=$p->parm($var); $q->param($var,$val); #sets $var to $val in $q. } 4. Save the state of $new back to the file. $q->save(FILEHANDLE); One of the difficulties is that the ->param($var,$val) notation fails if $val is null. Also if $var was previously set but the current form clears it, it needs to get cleared. The if() in step 3 got rather complicated. I run through a sorted list of variables in $q and compare to a sorted list of variables in $p and pass only the vars in p but not in q. I have a feeling I've turned something simple into something really complicated. Can anyone put me on the path of righteousness? -- John Springer Somewhere in Portland Where it's probably raining. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/enriched Size: 1517 bytes Desc: not available Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20031104/32ee309d/attachment.bin From merlyn at stonehenge.com Tue Nov 4 21:05:40 2003 From: merlyn at stonehenge.com (Randal L. Schwartz) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] saving state with CGI.pm In-Reply-To: References: Message-ID: <867k2fms5b.fsf@blue.stonehenge.com> >>>>> "John" == John Springer writes: John> I have users going through several forms to collect information, and John> I'm saving the state of the CGI object in a session file. But I want John> to keep a "running list" of all the data that has been set across all John> the forms, so the user can bounce back and forth without losing John> anything. I got it to work but it's awkward and took a lot of trial John> and error. Consider the code at as a possible solution. It's using client-side state management, but that's certainly a viable solution. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. 
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! From darthsmily at verizon.net Tue Nov 4 21:59:25 2003 From: darthsmily at verizon.net (darthsmily) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] Code review In-Reply-To: References: Message-ID: <3FA8759D.4080205@verizon.net> Joshua Keroes wrote: > > Come one, come all to a night of code review! We promise, this will be > altogether unlike any code review at work. > > You lucky contestants get to: > 1. Bring code. > 2. Talk to people. > 3. Get additional eyes to look over your code. > 4. Treat those people to beer, food, pool, your HUMAN SOUL; or > something else nice. > > Others may: > 1. Constructively criticize code. > 2. ...win a lifesize talking alarm clock of super-mega-sitcom star, > Fran Drescher! > > Some lucky few may also witness code being refactored - before your > very eyes! > > A select number will also watch as your fellow coders take on the > physical and mental confidence of Charles Atlas that only smooth, > clean, succinct, clear code can provide. No sand getting kicked in > anyones' eyes here! > > --- > > So, the next Perl Mongers meeting is in about ten days. To make this > happen, we need those who have improvable code and don't mind > admitting that it can be improved, and those who can improve it > without being either heavy-handed or low-handed. > > Reply if you're interested. There needs to be enough interested > parties for this to work. > > -Joshua > > PS I'll post the location details later. > > _______________________________________________ > Pdx-pm-list mailing list > Pdx-pm-list@mail.pm.org > http://mail.pm.org/mailman/listinfo/pdx-pm-list > When and where? From tex at off.org Wed Nov 5 02:19:32 2003 From: tex at off.org (Austin Schutz) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] saving state with CGI.pm In-Reply-To: References: Message-ID: <20031105081932.GA20965@gblx.net> On Tue, Nov 04, 2003 at 06:58:34PM -0800, John Springer wrote: > I'm having a problem using CGI.pm to save state. Maybe I'm using the > wrong tool?? Anyways... One way to do it is to use cookies. Benefits are that you don't have to save any state yourself and the user can go back to any part of the form at any point in the future and still access their data. You can set cookies at any part of your website and have them readable everywhere, sort of like global variables. Some folks used to say that users wouldn't always allow cookies, but that's probably not true any more. Austin From rootbeer at redcat.com Thu Nov 6 11:50:08 2003 From: rootbeer at redcat.com (Tom Phoenix) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] Anti-cookie rhetoric (was: saving state with CGI.pm) In-Reply-To: <20031105081932.GA20965@gblx.net> References: <20031105081932.GA20965@gblx.net> Message-ID: On Wed, 5 Nov 2003, Austin Schutz wrote: > Some folks used to say that users wouldn't always allow cookies, > but that's probably not true any more. It's worth remembering that a few users may not be able to use cookies even if they want to. For example, the user might be at a school or library net terminal, unable to change the preferences, while the site admin has ordained "no cookies" since each computer is shared among many users. Even when cookies succeed, they don't hold information for the user; they hold information for the _browser_. If I borrow your computer and use your browser, sites will think that you're visiting. If you use a different computer or browser, sites may think that a different person is visiting. 
That's one reason that most cookies should expire within a few hours or at end-of-session, the sooner the better. (Exception: The user asks to save state, such as "Remember my settings". Or you have users who are sure to have cookie support and mostly one-user-per-browser, such as with an in-house application.) We should all laugh at sites which use cookies to keep voters on a web-based poll from "stuffing the ballot box" with multiple votes. That inconveniences some people who share browsers while being impotent to prevent fraudulent votes. (That's a task for a captcha: http://www.captcha.net/ - but there's no fair way to stop someone who wants to vote more than once, short of some non-net-based registration.) Cookies can work for some purposes, but they have a lot of shortcomings. --Tom From poec at yahoo.com Thu Nov 6 12:21:47 2003 From: poec at yahoo.com (Ovid) Date: Mon Aug 2 21:34:25 2004 Subject: [Pdx-pm] saving state with CGI.pm In-Reply-To: <20031105081932.GA20965@gblx.net> Message-ID: <20031106182147.97425.qmail@web40404.mail.yahoo.com> --- Austin Schutz wrote: > One way to do it is to use cookies. Benefits are that you don't > have to save any state yourself and the user can go back to any part of the > form at any point in the future and still access their data. You can set > cookies at any part of your website and have them readable everywhere, sort > of like global variables. Er, sorry, but I have to say that this is a terrible idea. http://use.perl.org/~Ovid/journal/15165 (my credit card number and pin was stored in a cookie) http://use.perl.org/~Ovid/journal/13542 (Friendster stored password in cookie) http://use.perl.org/~Ovid/journal/13471 (Microsoft abuses cookies and a young lady may have gotten in trouble because a cookie revealed the location of her online journal) You can read about those horror stories of storing user data in the cookies. One response might be "store everything *but* sensitive data in the cookie", but at that point, it means you already have a server-side mechanism for maintaining state and you no longer need to rely on the cookie. Cheers, Ovid ===== Silence is Evil http://users.easystreet.com/ovid/philosophy/indexdecency.htm Ovid http://www.perlmonks.org/index.pl?node_id=17000 Web Programming with Perl http://users.easystreet.com/ovid/cgi_course/ __________________________________ Do you Yahoo!? Protect your identity with Yahoo! Mail AddressGuard http://antispam.yahoo.com/whatsnewfree From tex at off.org Wed Nov 5 13:08:45 2003 From: tex at off.org (Austin Schutz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] saving state with CGI.pm In-Reply-To: <20031106182147.97425.qmail@web40404.mail.yahoo.com> References: <20031105081932.GA20965@gblx.net> <20031106182147.97425.qmail@web40404.mail.yahoo.com> Message-ID: <20031105190845.GA13945@gblx.net> On Thu, Nov 06, 2003 at 10:21:47AM -0800, Ovid wrote: > --- Austin Schutz wrote: > > One way to do it is to use cookies. Benefits are that you don't > > have to save any state yourself and the user can go back to any part of the > > form at any point in the future and still access their data. You can set > > cookies at any part of your website and have them readable everywhere, sort > > of like global variables. > > Er, sorry, but I have to say that this is a terrible idea. 
> > http://use.perl.org/~Ovid/journal/15165 > (my credit card number and pin was stored in a cookie) > http://use.perl.org/~Ovid/journal/13542 > (Friendster stored password in cookie) > http://use.perl.org/~Ovid/journal/13471 > (Microsoft abuses cookies and a young lady may have gotten in trouble > because a cookie revealed the location of her online journal) > > You can read about those horror stories of storing user data in the cookies. Three points of rebuttal... err.. I guess four: 1. If a credit card number has to be stored, I'd much rather have it stored on my computer than on some poorly maintained webserver run by joe shmoe on the other side of the 'Net. 2. You shouldn't be storing credit card information anyway. 3. Encryption works swell. Just because the data is stored on the user's computer doesn't mean it has to be available in plaintext. In addition to the point that if you can't trust the other users on an insecure operating system you shouldn't be using it anyway. In the "young lady" story her parents could just as well have installed a keystroke logger, etc. etc. etc. Austin From joe at oppegaard.net Thu Nov 6 13:38:29 2003 From: joe at oppegaard.net (Joe Oppegaard) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] saving state with CGI.pm In-Reply-To: References: Message-ID: On Tue, 4 Nov 2003, John Springer wrote: > I'm having a problem using CGI.pm to save state. Maybe I'm using the > wrong tool?? Anyways... > I have users going through several forms to collect information, and > I'm saving the state of the CGI object in a session file. But I want to > keep a "running list" of all the data that has been set across all the > forms, so the user can bounce back and forth without losing anything. My preferred way to do things like this are with sessions. See CGI::Session and the very good tutorial that comes with it. (Note the -ip-match switch). You can store the user sessionId in a cookie, hidden input fields, or in the URL query string itself, which is nice for non-cookie users. The preferred thing to do with sessions that hold sensitive data in the session file is to expire the sessionid after a set number of minutes or when the browser closes, making sure to cleanup the session files. Of course the session files should only be readable by the user the webserver is running as. I actually haven't done this in perl too much because most of the web code at my job uses PHP (/me ducks), which has very convienent built in session handling. -Joe Oppegaard From tex at off.org Wed Nov 5 13:41:46 2003 From: tex at off.org (Austin Schutz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Anti-cookie rhetoric (was: saving state with CGI.pm) In-Reply-To: References: <20031105081932.GA20965@gblx.net> Message-ID: <20031105194146.GB13945@gblx.net> On Thu, Nov 06, 2003 at 09:50:08AM -0800, Tom Phoenix wrote: > On Wed, 5 Nov 2003, Austin Schutz wrote: > > > Some folks used to say that users wouldn't always allow cookies, > > but that's probably not true any more. > > It's worth remembering that a few users may not be able to use cookies > even if they want to. For example, the user might be at a school or > library net terminal, unable to change the preferences, while the site > admin has ordained "no cookies" since each computer is shared among many > users. Sure, that could happen. That's a pretty smally minority, but it could be important. > > Even when cookies succeed, they don't hold information for the user; they > hold information for the _browser_. 
If I borrow your computer and use your > browser, sites will think that you're visiting. If you use a different > computer or browser, sites may think that a different person is visiting. > That's one reason that most cookies should expire within a few hours or at > end-of-session, the sooner the better. (Exception: The user asks to save > state, such as "Remember my settings". Or you have users who are sure to > have cookie support and mostly one-user-per-browser, such as with an > in-house application.) Well it's certainly possible to make sure the data in the cookies is user specific, and to make sure it's password protected and/or encrypted, all of which can be done without that much effort and without maintaining state by the server. The point of the exercise was to maintain state for the user anyway, so at some point unless the data gets flushed it will still be available in the browser no matter which method you use, and using any reasonable method the data can be flushed anyway. *shrug* > Cookies can work for some purposes, but they have a lot of shortcomings. They're definitely not a panacea, but they can be pretty handy IMO, especially for data that _isn't_ particularly sensitive, but should be stored over long periods. Preferences, in particular, can be saved in cookies and make a user's web browsing experience significantly better. I like 'em, but then again, I don't have the added burden of worrying or caring about library users, etc. Austin From poec at yahoo.com Thu Nov 6 14:04:07 2003 From: poec at yahoo.com (Ovid) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] 3 Simple ways to attack cookies (was: saving state with CGI.pm) In-Reply-To: <20031105190845.GA13945@gblx.net> Message-ID: <20031106200407.69151.qmail@web40412.mail.yahoo.com> --- Austin Schutz wrote: > Three points of rebuttal... err.. I guess four: > > 1. If a credit card number has to be stored, I'd much rather have it > stored on my computer than on some poorly maintained webserver run > by joe shmoe on the other side of the 'Net. Credit card numbers generally should not be stored. However, if they are, there is *no way* that information should be in a cookie. That information gets invisibly stored on whatever computer I'm using. If I am at a public place and I happen to pop over to the credit card company's Web site, then I've just stored my credit card number on this public computer (and session cookies can also be written to swap files and get stored on disk when you think they aren't!). There is no way we can hope to educate all users on how to manage this information and we shouldn't have to. The more things we ask people to remember, the more things they will forget. > 3. Encryption works swell. Just because the data is stored on the > user's computer doesn't mean it has to be available in plaintext. I can't remember the source of the quote, but I recall reading once a description of SSL as using an armored car to send credit card numbers from a guy on a park bench to a guy in a cardboard box. Now while that *seems* like an inappropriate analogy since you appear to be talking about encrypting the cookies as they are stored on the hard drive (thus making it an armored cardboard box), I think it's a perfect analogy because it reminds us that we are talking about complex systems and there are many parts to secure. For example, let's say you really, really think that you've got everything nailed down.
The Web site is using SSL for every single page (performance be damned), you have a physically secure computer and just to be paranoid, you use an encrypted file system. You've checked the server that the Web site is sitting on and, as far as you can tell, every security patch appears to be place and there are no known exploits. An in-depth security audit of the code also reveals that there are no known security holes in any portion of the Web code. Feel pretty safe, huh? Now the site uses your computer to store your personal data by storing it in a cookie on your side, but no worries, you're bullet proof. Tomorrow, the new programmer does a quick update to a page to allow a very limited subset of HTML in user-posted comments and someone slips in an XSS (cross-site scripting) attack and snags Joe User's cookie. Game over. Joe loses. But the Web site's security team is so top-notch that they would never allow anything like this to occur, so Joe User doesn't have to worry. And while Joe User isn't worrying, he is over at another site which, unbeknownst to him, allows users to add javascript and, as a result, Joe User finds himself the victim of an XST (cross-site tracing) attack whereby his cookie for the safe domain is sent to the attacker whose script resides on the unsafe domain. Nope. That won't work because Joe User is rather unusual in that he has his Javascript disabled (unlike the vast majority of surfers), so Joe User is now perfectly safe and doesn't have to worry. And while Joe User isn't worrying, he's connecting to the safe server and notices that they had a problem with their SSL certificate. Annoyed at seeing this *again* (so many sites are sloppy about this), he clicks "ignore" and is completely unaware of the man in the middle attack that's grabbing his credit card number. I could go on, but I think the point is made. Those were potential scenarious of attack that *assumed* everyone was cognizant of security issues. Most, as we are sadly aware, are not. With networks, there are so many ways of compromising them that it doesn't make sense to send more confidential information over the Web than is necessary. We know that this data will be sent, but it should be as limited as possible. This is not to say that we should never use the Web for anything personal. That's like saying we can never unlock our front door lest the thieves get in. The problem is that we shouldn't be leaving our front door unlocked and then driving to the coast (unless we have insurance and want new furniture). The point here is risk management. We need to understand what common sources of attack are and how we can guard against them. I probably won't hire a small army to guard against a physical assault against my servers as this is both unlikely and not cost-effective. On the other hand, XSS attacks are quite common and *should* be guarded against. Cheers, Ovid ===== Silence is Evil http://users.easystreet.com/ovid/philosophy/indexdecency.htm Ovid http://www.perlmonks.org/index.pl?node_id=17000 Web Programming with Perl http://users.easystreet.com/ovid/cgi_course/ __________________________________ Do you Yahoo!? Protect your identity with Yahoo! 
Mail AddressGuard http://antispam.yahoo.com/whatsnewfree From rootbeer at redcat.com Thu Nov 6 16:28:47 2003 From: rootbeer at redcat.com (Tom Phoenix) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Anti-cookie rhetoric (was: saving state with CGI.pm) In-Reply-To: <20031105194146.GB13945@gblx.net> References: <20031105081932.GA20965@gblx.net> <20031105194146.GB13945@gblx.net> Message-ID: On Wed, 5 Nov 2003, Austin Schutz wrote: > Well it's certainly possible to make sure the data in the > cookies is user specific, Am I missing something here? The only ways I can think of to ensure that the cookie data belong to a particular user, instead of browser, would obviate the need for long-term cookies at all. For example, if the user logs in with a username-password combo, you know which user it is - but now, why keep anything in the cookie jar? You've already got the username and password (in some form) in a database, so you may as well keep everything in there, or at least everything important. Cookies get lost, but databases get backed up. (We hope!) > and to make sure it's password protected and/or encrypted, Encrypting user data in cookies is using a cheap database that sometimes loses data. :-) Seriously, disk space on the server is cheap; bandwidth consumed by large cookies that go back and forth on many transactions is expenive. A small cookie that has a session-ID is okay, but that's designed to expire at the end of a session. If you must use large cookies, ensure that they're not sent to and from your server except when necessary. Some servers send and require every cookie even when you're fetching the eighteen images on every page. For some reason, these pages load slowly... :-D > especially for data that _isn't_ particularly sensitive, but should be > stored over long periods. Long-term cookies are generally problematic. Most browsers implement some limit on cookies, deleting old cookies to make room for new ones. The RFC has some information on this, even though its suggested limits are pretty permissive. Section 6.3 says, Applications should use as few and as small cookies as possible, and they should cope gracefully with the loss of a cookie. http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2109.html#sec-6.3 > Preferences, in particular, can be saved in cookies and make a user's > web browsing experience significantly better. Yes, that's the usage I mentioned - so long as the _user_ chooses to save the state. If I borrow your browser, some site shouldn't save my preferences as if they were yours, though. I'm not opposed to all uses of cookies. But I'm opposed to most of their uses on the web today. --Tom Phoenix From rootbeer at redcat.com Thu Nov 6 16:32:36 2003 From: rootbeer at redcat.com (Tom Phoenix) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] 3 Simple ways to attack cookies (was: saving state with CGI.pm) In-Reply-To: <20031106200407.69151.qmail@web40412.mail.yahoo.com> References: <20031106200407.69151.qmail@web40412.mail.yahoo.com> Message-ID: Ya know, maybe we should have a lightning talks session devoted to cookies: pro, con, uses of, abuses of, and recipies. Those who don't want to talk should bring the chocolate chip cookies. 
--Tom From tex at off.org Wed Nov 5 17:08:07 2003 From: tex at off.org (Austin Schutz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Anti-cookie rhetoric (was: saving state with CGI.pm) In-Reply-To: References: <20031105081932.GA20965@gblx.net> <20031105194146.GB13945@gblx.net> Message-ID: <20031105230807.GC13945@gblx.net> On Thu, Nov 06, 2003 at 02:28:47PM -0800, Tom Phoenix wrote: > On Wed, 5 Nov 2003, Austin Schutz wrote: > I'm not opposed to all uses of cookies. But I'm opposed to most of their > uses on the web today. > I'll just say that I generally disagree, but I like the lightning talk/chocolate chip cookie idea. Austin From jkeroes at eli.net Thu Nov 6 17:37:35 2003 From: jkeroes at eli.net (Joshua Keroes) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Afterhours meet Message-ID: <3330DA76-10B2-11D8-AC50-000A95C466EC@eli.net> After the next meeting, where shall we go for beer, etc? J From lemming at quirkyqatz.com Thu Nov 6 17:47:49 2003 From: lemming at quirkyqatz.com (Mark Morgan) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] 3 Simple ways to attack cookies (was: saving state with CGI.pm) In-Reply-To: References: <20031106200407.69151.qmail@web40412.mail.yahoo.com> Message-ID: <55415.134.134.136.3.1068162469.squirrel@webmail.pair.com> Tom Phoenix said: > Ya know, maybe we should have a lightning talks session devoted to > cookies: pro, con, uses of, abuses of, and recipies. Those who don't > want to talk should bring the chocolate chip cookies. I have been thinking I should be making my chocolate chip cookies again. They're not in the same class as my wife's rum balls, but they're pretty good. -- Mark http://www.kittydream.org - House of Dreams Cat Shelter p.s. sorry Tom for getting you twice on the reply From john at digitalmx.com Thu Nov 6 18:18:43 2003 From: john at digitalmx.com (John Springer) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] 3 Simple ways to attack cookies (was: saving state with CGI.pm) In-Reply-To: <55415.134.134.136.3.1068162469.squirrel@webmail.pair.com> References: <20031106200407.69151.qmail@web40412.mail.yahoo.com> <55415.134.134.136.3.1068162469.squirrel@webmail.pair.com> Message-ID: Seems like those in favor of cookies should bring cookies; those opposed bring rum balls. I may choose sides based on the refreshments, assuming there's enough rum is in the rum balls. -- John Springer Somewhere in Portland Where it's probably raining. On Nov 6, 2003, at 3:47 PM, Mark Morgan wrote: > I have been thinking I should be making my chocolate chip cookies > again. > They're not in the same class as my wife's rum balls, but they're > pretty > good. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/enriched Size: 565 bytes Desc: not available Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20031106/0a58be05/attachment.bin From kyle at silverbeach.net Thu Nov 6 22:55:38 2003 From: kyle at silverbeach.net (Kyle Hayes) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] saving state with CGI.pm In-Reply-To: <20031105190845.GA13945@gblx.net> References: <20031105081932.GA20965@gblx.net> <20031106182147.97425.qmail@web40404.mail.yahoo.com> <20031105190845.GA13945@gblx.net> Message-ID: <200311062055.38180.kyle@silverbeach.net> On Wednesday 05 November 2003 11:08, Austin Schutz wrote: > On Thu, Nov 06, 2003 at 10:21:47AM -0800, Ovid wrote: > > --- Austin Schutz wrote: > > > One way to do it is to use cookies. 
Benefits are that you don't > > [snip, horror stories] > > You can read about those horror stories of storing user data in the > > cookies. > > Three points of rebuttal... err.. I guess four: > > 1. If a credit card number has to be stored, I'd much rather have it > stored on my computer than on some poorly maintained webserver run > by joe shmoe on the other side of the 'Net. ??? but browsers will give it up to any server persuasive enough. Cookie attacks are legion. Tried cranking down control of your cookies real tight and then using Hotmail? HTTP cookies are like VD infections: they spread easily and no one wants to talk about it. Really, really important information I keep offline if at all possible. > 2. You shouldn't be storing credit card information anyway. Tricky to do if you need recurring billing. I agree that the best approach is to do everything possible to avoid storing a credit card in any form that could be turned back into a usable number. It is worth doing all kinds of gymnastics to avoid storing CC numbers in any usable form. > 3. Encryption works swell. Just because the data is stored on the > user's computer doesn't mean it has to be available in plaintext. This is only true when the decryption method is as safe as the credit card data itself. I.e. you've got all your credit card numbers carefully encrypted, but the Perl CGI that has the decryption key and salt is downloadable via misconfiguration or a web server bug.... Ouch. I've seen it happen. I've seen people make MySQL loadable user modules (compiled C++ code) to do the decryption. Great, tough to make the web server serve that module up as a binary. However, a little trickery, a couple of SQL injection attacks, and again, I've got your decryption key, or I can get your server to do the decryption for me. Cool. I prefer doing one-way crypto hashes on CC numbers. MD5 is a bit long in the tooth, but SHA1 and some of the newer, stronger crypto hashes do a fine job. You can't get the number back, but you can tell if you get the same CC twice (which is useful in stopping potential fraud). Sure, someone can crack your web site and stripmine the DB. Big deal. You haven't given away any CC numbers. It's far too easy to win the Visa lawsuit sweepstakes. The Secret Service will not help you. Visa will not help you. The first is too busy with too few people to do much. The second makes money on every transaction, coming or going. You are not worthy to talk to American Express. > In addition to the point that if you can't trust the other users > on an insecure operating system you shouldn't be using it anyway. In the > "young lady" story her parents could just as well have installed a > keystroke logger, etc. etc. etc. As noted in other venues (and on this list): The end points are not secure, but the transport is great. A bit of a conundrum for those of us that sometimes need to make sure that secret data stays secret. Best, Kyle From jkeroes at eli.net Tue Nov 11 11:44:30 2003 From: jkeroes at eli.net (Joshua Keroes) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Nov Meeting: Code Review Message-ID: Come one, come all to a night of code review! We promise, this will be altogether unlike any code review at work. You lucky contestants get to: 1. Bring code. 2. Talk to people. 3. Get additional eyes to look over your code. 4. Treat those people to beer, food, pool, your HUMAN SOUL; or something else nice. Others may: 1. Constructively criticize code. 2. 
...win a lifesize talking alarm clock of super-mega-sitcom star, Fran Drescher! Some lucky few may also witness code being refactored - before your very eyes! A select number will also watch as your fellow coders take on the physical and mental confidence of Charles Atlas that only smooth, clean, succinct, clear code can provide. No sand getting kicked in anyones' eyes here! --- Meeting: Weds Nov 12, 6:30-8:30, at the Urban Grind Cafe Map at http://urbangrindcoffee.com/ Afterhours: 8:30/9:00-whenever Goodfoot? Laurelwood? Moon & Sixpense? Basement Pub? Anywhere else? We'll vote on a location at Urban Grind. Email me privately if you only can hit the afterhours. -J From jkeroes at eli.net Wed Nov 12 14:50:56 2003 From: jkeroes at eli.net (Joshua Keroes) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Nov Meeting: Code Review TONIGHT Message-ID: Come one, come all to a night of code review! We promise, this will be altogether unlike any code review at work. You lucky contestants get to: 1. Bring code. 2. Talk to people. 3. Get additional eyes to look over your code. 4. Treat those people to beer, food, pool, your HUMAN SOUL; or something else nice. Others may: 1. Constructively criticize code. 2. ...win a lifesize talking alarm clock of super-mega-sitcom star, Fran Drescher! Some lucky few may also witness code being refactored - before your very eyes! A select number will also watch as your fellow coders take on the physical and mental confidence of Charles Atlas that only smooth, clean, succinct, clear code can provide. No sand getting kicked in anyones' eyes here! --- Meeting: Weds Nov 12, 6:30-8:30, at the Urban Grind Cafe Map at http://urbangrindcoffee.com/ Afterhours: 8:30/9:00-whenever Goodfoot? Laurelwood? Moon & Sixpense? Basement Pub? Anywhere else? We'll vote on a location at Urban Grind. Email me privately if you only can hit the afterhours. -J From poec at yahoo.com Wed Nov 12 16:37:00 2003 From: poec at yahoo.com (Ovid) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Nov Meeting: Code Review TONIGHT In-Reply-To: Message-ID: <20031112223700.65176.qmail@web40403.mail.yahoo.com> --- Joshua Keroes wrote: > Come one, come all to a night of code review! We promise, this will be > altogether unlike any code review at work. Crap. I can't make it. I'm nice and sick. Blah. Any chance that some of the before and afters can be posted to the Web site? Cheers, Ovid ===== Silence is Evil http://users.easystreet.com/ovid/philosophy/indexdecency.htm Ovid http://www.perlmonks.org/index.pl?node_id=17000 Web Programming with Perl http://users.easystreet.com/ovid/cgi_course/ __________________________________ Do you Yahoo!? Protect your identity with Yahoo! Mail AddressGuard http://antispam.yahoo.com/whatsnewfree From merlyn at stonehenge.com Wed Nov 12 16:42:52 2003 From: merlyn at stonehenge.com (Randal L. Schwartz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Nov Meeting: Code Review TONIGHT In-Reply-To: <20031112223700.65176.qmail@web40403.mail.yahoo.com> References: <20031112223700.65176.qmail@web40403.mail.yahoo.com> Message-ID: <86ekwdryxm.fsf@blue.stonehenge.com> >>>>> "Ovid" == Ovid writes: Ovid> Crap. I can't make it. I'm nice and sick. Blah. And I'm in LA. Nearly the same thing. Nice and sick (read: twisted). Ovid> Any chance that some of the before and afters can be posted to the Web site? Ditto on the request. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. 
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! From kellert at ohsu.edu Wed Nov 12 16:57:17 2003 From: kellert at ohsu.edu (Thomas Keller) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Mac OS X 10.3 Message-ID: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> On my Mac running OS X 10.2 and perl 5.6, I had installed the fink package installer, and had added various libraries required by some of the perl modules I was using. But I was having trouble getting some perl modules to install. So for better or worse, I decided to rebuild my machine. I did a clean install of Mac 10.3 with all the developers tools, xcode, etc. My questions are: What libraries should I install right away, and where on Panther? e.g. expat, libgd, freetype, etc? What do people think about using fink as a software installation tool on the Mac, specifically for perl required libraries, and perl modules? What's your favorite reference for this type of perl-oriented system administration? Thanks, Tom K. Tom Keller, Ph.D. http://www.ohsu.edu/core kellert@ohsu.edu 503-494-2442 From jkeroes at eli.net Wed Nov 12 17:00:42 2003 From: jkeroes at eli.net (Joshua Keroes) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Nov Meeting: Code Review TONIGHT In-Reply-To: <86ekwdryxm.fsf@blue.stonehenge.com> References: <20031112223700.65176.qmail@web40403.mail.yahoo.com> <86ekwdryxm.fsf@blue.stonehenge.com> Message-ID: <0AC262AF-1564-11D8-A723-000A95C466EC@eli.net> On Nov 12, 2003, at 2:42 PM, Randal L. Schwartz wrote: >>>>>> "Ovid" == Ovid writes: > > Ovid> Crap. I can't make it. I'm nice and sick. Blah. > > And I'm in LA. Nearly the same thing. Nice and sick (read: twisted). > > Ovid> Any chance that some of the before and afters can be posted to > the Web site? > > Ditto on the request. Excellent idea. Everyone: please post befores and afters to the PDX.pm kwiki. There's an example listed at http://pdx.pm.org/kwiki/ . -J From schwern at pobox.com Wed Nov 12 20:07:35 2003 From: schwern at pobox.com (Michael G Schwern) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Mac OS X 10.3 In-Reply-To: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> References: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> Message-ID: <20031113020735.GG715@localhost.comcast.net> On Wed, Nov 12, 2003 at 02:57:17PM -0800, Thomas Keller wrote: > On my Mac running OS X 10.2 and perl 5.6, I had installed the fink > package installer, and had added various libraries required by some of > the perl modules I was using. But I was having trouble getting some > perl modules to install. So for better or worse, I decided to rebuild > my machine. I did a clean install of Mac 10.3 with all the developers > tools, xcode, etc. > > My questions are: > What libraries should I install right away, and where on Panther? e.g. > expat, libgd, freetype, etc? I dunno, install stuff with CPANPLUS as you need it. > What do people think about using fink as a software installation tool > on the Mac, specifically for perl required libraries, and perl modules? fink doesn't have enough perl modules in its system to be usable as your only source of Perl modules. You could try rolling .info files for each module you want to use and throwing them into /sw/fink/dists/local but it's probably not worth the trouble.
> What's your favorite reference for this type of perl-oriented system > administration? CPANPLUS does the job well. -- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern/ ...someone always points out that we'll end up dressing like gay space pirates anyway, so why bother planning otherwise? - C.H.U.N.K. DCLXVI From john at digitalmx.com Wed Nov 12 23:39:37 2003 From: john at digitalmx.com (John Springer) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Mac OS X 10.3 In-Reply-To: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> References: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> Message-ID: This is Jaguar-oriented, but probably still useful: (How to install perl 5.8 on Jaguar) http://developer.apple.com/internet/macosx/perl.html -- John Springer Somewhere in Portland Where it's probably raining. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/enriched Size: 301 bytes Desc: not available Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20031112/a20f83d2/attachment.bin From schwern at pobox.com Thu Nov 13 00:54:49 2003 From: schwern at pobox.com (Michael G Schwern) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Mac OS X 10.3 In-Reply-To: References: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> Message-ID: <20031113065449.GA14138@localhost.personaltelco.net> On Wed, Nov 12, 2003 at 09:39:37PM -0800, John Springer wrote: > This is Jaguar-oriented, but probably still useful: > (How to install perl 5.8 on Jaguar) > > http://developer.apple.com/internet/macosx/perl.html I would recommend against changing the system perl on any machine. Leave /usr/bin/perl alone and don't overwrite /System/Library/Perl. Its likely to cause pain and suffering and pain. If you want a newer Perl, install it into /usr/local/perl5.x.y and put a symlink in /usr/local/bin. Configure it so that /Library/Perl is added when it asks you for extra directories for @INC. -- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern/ Kindly do not attempt to cloud the issue with facts. From merlyn at stonehenge.com Thu Nov 13 01:46:25 2003 From: merlyn at stonehenge.com (Randal L. Schwartz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Mac OS X 10.3 In-Reply-To: <20031113065449.GA14138@localhost.personaltelco.net> References: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> <20031113065449.GA14138@localhost.personaltelco.net> Message-ID: <86n0b0pv78.fsf@blue.stonehenge.com> >>>>> "Michael" == Michael G Schwern writes: Michael> If you want a newer Perl, install it into Michael> /usr/local/perl5.x.y and put a symlink in /usr/local/bin. Michael> Configure it so that /Library/Perl is added when it asks you Michael> for extra directories for @INC. I put my OSX Perl in /opt/perl/snap (for "snapshot") with the Configure line of: ./Configure -des -Dusedevel -Uversiononly -Dprefix=/opt/perl/snap -Dlocincpth=/sw/include -Dloclibpth=/sw/lib -Dperladmin=merlyn@stonehenge.com Note that I have fink installed, so I need to include the two /sw dirs as well. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! 
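A quick way to sanity-check a build along the lines described above is a short script (a minimal sketch, not tied to anyone's particular prefix or Configure answers) that reports which perl binary is actually running and whether the extra directories made it into @INC:

    #!/usr/bin/perl
    # Report which perl is executing, its version, and its module search
    # path, so a freshly built /opt or /usr/local perl can be told apart
    # from the system perl in /usr/bin.
    use strict;
    use Config;

    print "binary:  $^X\n";
    print "version: $Config{version}\n";
    print "\@INC:\n";
    print "  $_\n" for @INC;

Running it both as plain "perl" and via the full path of the new install makes PATH mixups like the ones discussed in the next message easy to spot.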
From tex at off.org Wed Nov 12 12:46:37 2003 From: tex at off.org (Austin Schutz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Mac OS X 10.3 In-Reply-To: <20031113065449.GA14138@localhost.personaltelco.net> References: <90818557-1563-11D8-8656-0003930405E2@ohsu.edu> <20031113065449.GA14138@localhost.personaltelco.net> Message-ID: <20031112184637.GC2485@gblx.net> On Wed, Nov 12, 2003 at 10:54:49PM -0800, Michael G Schwern wrote: > On Wed, Nov 12, 2003 at 09:39:37PM -0800, John Springer wrote: > > This is Jaguar-oriented, but probably still useful: > > (How to install perl 5.8 on Jaguar) > > > > http://developer.apple.com/internet/macosx/perl.html > > I would recommend against changing the system perl on any machine. Leave > /usr/bin/perl alone and don't overwrite /System/Library/Perl. Its likely > to cause pain and suffering and pain. > > If you want a newer Perl, install it into /usr/local/perl5.x.y and put a > symlink in /usr/local/bin. Configure it so that /Library/Perl is added when > it asks you for extra directories for @INC. > ..and after you do that you may want to put /usr/local/bin into your PATH ahead of your current PATH, e.g. in .profile: PATH=/usr/local/bin:$PATH; export PATH; Also on some machines I've used the sysadmins neglect to link /usr/local/bin/perldoc -> /usr/local/perl5.x.y/bin/perldoc. That can be quite bothersome when you are trying to read documentation for modules in the non-system version. Also another thing that's worked for me is to install it as /usr/local/bin/perl5 or perl5.8, so if you remember the name you can't confuse the two, even if you wind up using an account without /usr/local/bin in its PATH first. Ditto for perldoc, etc. Seems like this is in a faq somewhere, but my lack of coffee this morning isn't helping me find it. Austin From rootbeer at redcat.com Thu Nov 13 18:06:58 2003 From: rootbeer at redcat.com (Tom Phoenix) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Regular Expression Compendium Message-ID: A student in my class asked whether there's some sort of RE Compendium out there. We're thinking of a web page that has Perl patterns filed under entries like "North American phone number with optional area code" or "fully-qualified domain name" or "Character name from The Simpsons". Is there a page like this already? If not, it really should be on a Wiki, shouldn't it? I think I'll make one... http://pdx.pm.org/kwiki/index.cgi?RECompendium --Tom Phoenix From jkeroes at eli.net Thu Nov 13 18:25:57 2003 From: jkeroes at eli.net (Joshua Keroes) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Regular Expression Compendium In-Reply-To: References: Message-ID: <1D880654-1639-11D8-B9A8-000A95C466EC@eli.net> On Nov 13, 2003, at 4:06 PM, Tom Phoenix wrote: > A student in my class asked whether there's some sort of RE Compendium > out > there. We're thinking of a web page that has Perl patterns filed under > entries like "North American phone number with optional area code" or > "fully-qualified domain name" or "Character name from The Simpsons". http://search.cpan.org/~abigail/Regexp-Common-2.113/ won't do any of these things but I believe the first two are on the TODO list. 
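For a sense of the interface, here is a small sketch using two patterns Regexp::Common already documents (integers and IPv4 addresses); the phone-number and domain-name patterns discussed above are not assumed to exist:

    #!/usr/bin/perl
    # Match input lines against patterns from Regexp::Common's exported
    # %RE hash: a plain integer and a dotted-quad IPv4 address.
    use strict;
    use Regexp::Common;

    while (my $line = <STDIN>) {
        print "integer: $1\n" if $line =~ /($RE{num}{int})/;
        print "IPv4:    $1\n" if $line =~ /($RE{net}{IPv4})/;
    }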
J From joe at oppegaard.net Thu Nov 20 00:39:05 2003 From: joe at oppegaard.net (Joe Oppegaard) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Too much validation Message-ID: Mongers, I notice that sometimes in OO code that I write I'll do something like this obviously contrived example: ---- package WordCharacters; sub new { my ($class, $value) = @_; # Validation check here unless ($value =~ /^\w+$/) { die "Non-word character used in value: $value"; } my $self = { value => $value }; bless $self, ref($class) || $class; return $self; } package main; print "> "; chomp(my $input = <>); unless ($input =~ /^\w+$/) { # More validation here die "Word characters only!\n"; } my $wc = WordCharacters->new($input); ---- So as a general rule of thumb, when should data validation be done? Catch it early or catch it when it actually matters? Or both? (Ugh, duplicate code). Seems to me that typically you should catch it when it actually matters, so the calling code doesn't have to worry about what is and isn't acceptable. On the other hand, I guess I just feel dirty passing through data that I know could be invalid. -Joe Oppegaard From wcooley at nakedape.cc Thu Nov 20 00:56:19 2003 From: wcooley at nakedape.cc (Wil Cooley) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Too much validation In-Reply-To: References: Message-ID: <1069311379.9617.101.camel@denk.nakedape.priv> On Wed, 2003-11-19 at 22:39, Joe Oppegaard wrote: > So as a general rule of thumb, when should data validation be done? > Catch it early or catch it when it actually matters? Or both? (Ugh, > duplicate code). My guess (and IANAExpert) is that it should probably be done in both, depending on the circumstances. If you're writing a module or anything where you expect reuse, you should treat it as a black-box and make it as robust as possible. OTOH, if you're following an XP/YAGNI approach early in releases, then probably it's fine to just have it in one place. I've often wondered if it wouldn't be better to use objects instead of basic strings for a lot of attributes, where the class implements robust validation checks. How many places splattered throughout your code is the regex for testing if a telephone number or IP address is valid? Why do OO languages rarely ship standard with classes for common data formats? Wil -- Wil Cooley wcooley@nakedape.cc Naked Ape Consulting http://nakedape.cc * * * * Linux, UNIX, Networking and Security Solutions * * * * * Naked Ape Consulting http://nakedape.cc * -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20031119/e24df785/attachment.bin From schwern at pobox.com Thu Nov 20 02:54:15 2003 From: schwern at pobox.com (Michael G Schwern) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Too much validation In-Reply-To: References: Message-ID: <20031120085415.GA17333@windhund.schwern.org> On Wed, Nov 19, 2003 at 10:39:05PM -0800, Joe Oppegaard wrote: > So as a general rule of thumb, when should data validation be done? > Catch it early or catch it when it actually matters? Or both? (Ugh, > duplicate code). Depends on what you're doing, but in general I'd say catch it as the new data comes in. That way you an put the checks in one place and won't forget to validate on the way out. -- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern/ Cottleston, Cottleston, Cottleston Pie. 
A fly can't bird, but a bird can fly. Ask me a riddle and I reply: "Cottleston, Cottleston, Cottleston Pie." From kellert at ohsu.edu Thu Nov 20 15:54:22 2003 From: kellert at ohsu.edu (Thomas Keller) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] tilde in paths Message-ID: <19B5940C-1BA4-11D8-A217-0003930405E2@ohsu.edu> Greetings, Forgive my laziness for not searching for the answer to this. But does someone off the top of their head know how to open a file with a tilde in the path? Specifically, I want something like: { open FILE, '~/Documents/myfile' or die; my @info = ; } to open myfile in the Documents dir in the users home directory. I know the problem is passing it to the shell, but I don't know how to do that Thanks, Tom K. From jkeroes at eli.net Thu Nov 20 15:55:36 2003 From: jkeroes at eli.net (Joshua Keroes) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] tilde in paths In-Reply-To: <19B5940C-1BA4-11D8-A217-0003930405E2@ohsu.edu> References: <19B5940C-1BA4-11D8-A217-0003930405E2@ohsu.edu> Message-ID: <45C296F9-1BA4-11D8-89E2-000A95C466EC@eli.net> On Nov 20, 2003, at 1:54 PM, Thomas Keller wrote: > Greetings, > Forgive my laziness for not searching for the answer to this. But does > someone off the top of their head know how to open a file with a tilde > in the path? Found in /usr/local/perl581/lib/5.8.1/pods/perlfaq5.pod How can I translate tildes (~) in a filename? Use the <> (glob()) operator, documented in perlfunc. Older versions of Perl require that you have a shell installed that groks tildes. Recent perl versions have this feature built in. The File::KGlob module (available from CPAN) gives more portable glob functionality. Within Perl, you may use this directly: $filename =~ s{ ^ ~ # find a leading tilde ( # save this in $1 [^/] # a non-slash character * # repeated 0 or more times (0 means me) ) }{ $1 ? (getpwnam($1))[7] : ( $ENV{HOME} || $ENV{LOGDIR} ) }ex; Have a nice day. :-) -------------- next part -------------- A non-text attachment was scrubbed... Name: Joshua Keroes.vcf Type: text/directory Size: 363 bytes Desc: not available Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20031120/e6e6ac64/JoshuaKeroes.bin -------------- next part -------------- From tkil at scrye.com Thu Nov 20 18:09:58 2003 From: tkil at scrye.com (Tkil) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Too much validation In-Reply-To: References: Message-ID: >>>>> "Joe" == Joe Oppegaard writes: Joe> So as a general rule of thumb, when should data validation be Joe> done? Catch it early or catch it when it actually matters? Or Joe> both? (Ugh, duplicate code). With object-oriented codde, I feel you should let the class decide what is acceptable or not. This lets me change what is considered acceptable without editing all callers. There are two ways I'd code this convention in Perl. One requires a bit of checking, but it is unobtrusive and straightforward: | while ( my $raw_data = get_data() ) | { | # MyClass::new will return undef if $raw_data is invalid | my $obj = MyClass->new( $raw_data ) | or next; | | # do stuff with $obj here | } The other way -- which I've adopted in most of my code of late -- is to use "eval BLOCK" to catch "die" calls as exceptions: | while ( my $raw_data = get_data() ) | { | eval | { | # MyClass::new will 'die' if $raw_data is invalid | my $obj = MyClass->new( $raw_data ); | | # do stuff with $obj here | }; | | if ( $@ ) | { | # complain | } | } This has the advantage of providing a description of what went wrong in $@. 
Further, it allows any method in MyClass to "die" if it can't do what it promises to do. There are existing modules that can be told to "die" if something goes wrong, freeing you from checking the error return of every call. A fine example of this is DBI, with its RaiseError attribute. t. From wcooley at nakedape.cc Thu Nov 20 18:20:53 2003 From: wcooley at nakedape.cc (Wil Cooley) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Too much validation In-Reply-To: References: Message-ID: <1069374053.30864.10.camel@denk.nakedape.priv> On Thu, 2003-11-20 at 16:09, Tkil wrote: > The other way -- which I've adopted in most of my code of late -- is > to use "eval BLOCK" to catch "die" calls as exceptions: Have you looked at using the Exception.pm module from CPAN? I read the docs for it but never actually got around to using it. I like the idea of being able to use exceptions with the familiar 'try' syntax used in other OO languages, although the lack of pervasive exceptions makes it less than ideal. Wil -- Wil Cooley wcooley@nakedape.cc Naked Ape Consulting http://nakedape.cc * * * * * * Linux Services for Small Businesses * * * * * * * Easy, reliable solutions for small businesses * * Naked Ape Business Server http://nakedape.cc/r/sms * -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20031120/30bcf059/attachment.bin From tex at off.org Thu Nov 20 18:31:28 2003 From: tex at off.org (Austin Schutz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Too much validation In-Reply-To: References: Message-ID: <20031121003128.GB2390@gblx.net> On Wed, Nov 19, 2003 at 10:39:05PM -0800, Joe Oppegaard wrote: > So as a general rule of thumb, when should data validation be done? > Catch it early or catch it when it actually matters? Or both? (Ugh, > duplicate code). My suggestion would be when it can be done with the least work. That would be "where it matters" in your example. > > Seems to me that typically you should catch it when it actually matters, > so the calling code doesn't have to worry about what is and isn't > acceptable. On the other hand, I guess I just feel dirty passing through > data that I know could be invalid. > If you use the module in many places you will soon tire of repeating the same code and be thankful the module does the validation for you. The other great advantage is that if you change your mind about what constitutes valid input it's in a single spot. Otherwise you may be chasing down regexes in 50 different CGI scripts, etc. I dunno, that's my 2c. Austin From raanders at acm.org Fri Nov 21 16:23:12 2003 From: raanders at acm.org (Roderick A. Anderson) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question Message-ID: New to the list and glad I found it. Not from the Portland area (Hayden, ID). I've been through the list archives but found nothing on this topic (which almost ended up being 'Hash-ish question'). I'm working on a web interface to a commercial application using CGI.pm, Win32::ODBC, and a pile of other modules. Using neat tricks I've got from "Programming Perl", "Learning Perl", "Perl Cookbook" and almost every other O'Reilly book on perl plus those from a few other publishers, I solved an append of one hash to another need but was wondering if there was a better way. And there is one problem that is Win32::ODBC caused that I'm looking for a solution to. 
First the append hash solution. I use the hash generated from some CGI.pm
params, then query a SQL Server database and use DataHash to return the
row. To append to the original hash I'm using a variation on code I got
out of "Perl Cookbook".

    %SignUpInfo = (%SignUpInfo, $db2->DataHash());

Is there a better or more efficient way to do this?

Then the Win32::ODBC issue. When I use the above $db2->DataHash(), whether
appending or creating a new hash, I end up with an empty key/value in the
hash.

Doing a

    foreach my $key (sort keys %SignUpInfo) {
        print "$key: $SignUpInfo{$key}\n";
    }

gets me output with one line with only the colon on it. Is there a way to
remove this key/value combination? I think it has to do with Win32::ODBC
returning some kind of row identifier.

TIA,
Rod
-- 
"Open Source Software - Sometimes you get more than you paid for..."

From joe at radiojoe.org Fri Nov 21 17:45:55 2003
From: joe at radiojoe.org (Joe Oppegaard)
Date: Mon Aug 2 21:34:26 2004
Subject: [Pdx-pm] Hash question
In-Reply-To: 
References: 
Message-ID: 

On Fri, 21 Nov 2003, Roderick A. Anderson wrote:

> New to the list and glad I found it. Not from the Portland area (Hayden,
> ID).

Cool, welcome to the list.

> First the append hash solution. I use the hash generated from some CGI.pm
> params, then query a SQL Server database and use DataHash to return the
> row. To append to the original hash I'm using a variation on code I got
> out of "Perl Cookbook".
>
>     %SignUpInfo = (%SignUpInfo, $db2->DataHash());
>
> Is there a better or more efficient way to do this?
>
> Then the Win32::ODBC issue. When I use the above $db2->DataHash(), whether
> appending or creating a new hash, I end up with an empty key/value in the
> hash.
>
> Doing a
>
>     foreach my $key (sort keys %SignUpInfo) {
>         print "$key: $SignUpInfo{$key}\n";
>     }
>
> gets me output with one line with only the colon on it. Is there a way to
> remove this key/value combination? I think it has to do with Win32::ODBC
> returning some kind of row identifier.

I'm guessing that %SignUpInfo was initially empty up top and
$db2->DataHash() had some type of error condition and returned undef
or ''. Take the following for example, which will print just a colon:

----------
sub ret_undef {
    return undef;
}

%a = ret_undef();

foreach (keys %a) {
    print "$_ : $a{$_}\n";
}
----------

Or to get a better idea of what's in the hash:

    use Data::Dumper;
    print Dumper(\%a);

Which shows you that you actually do have a blank key:

    $VAR1 = {
        '' => undef
    };

I don't think you really want to remove the blank key/value combination,
you probably just want to make sure you're properly checking for error
conditions when doing the Win32::ODBC calls.

-Joe Oppegaard

From ebaur at aracnet.com Fri Nov 21 18:35:12 2003
From: ebaur at aracnet.com (Eric Shore Baur)
Date: Mon Aug 2 21:34:26 2004
Subject: [Pdx-pm] reading a broken CSV file
Message-ID: 

I'm doing an import from a CSV-style text file into a SQL database.
The data is set up so that I have one set of text files with a field
listing in them (so I know what matches up with what) and then the data
files in a parent directory.

The data format looks something like this:

"title","some text","a date is next",1999/05/10,T,123,F,F,T,"more text"

Fine... I can import that. Unfortunately, some of the records have
embedded newlines in them, so you end up with something like this:

"title","some text","a date is next",1999/05/10,T,123,F,F,T,"more text
goes here
until
the record
is done"

...
or, potentially: "title","some text goes over multiple lines","a date is next",1999/05/10,T,123,F,F,T,"more text" What I've been doing is simply doing the data import - letting those screwed up lines fail when the SQL inserts run and then going back and hand entering the screwed up data (since I"ll end up with partial records, so I can search for the missing last field). This is not, however, a very maintainable method. (I have to re-import things when the data set changes, I get all new files, not just changes.) Is there any neat/slick way to get this data in there on the first pass? I tried using ParseWords, but I'm not sure if I utilized it to its fullest extent. I briefly played with a CSV driver for DBI, but it couldn't handle things split over the newlines, either. This was awhile ago that I did this in the first place, I'm just picking the project back up off the shelf, so to speak. Although I had kind of figured I'd have to re-write from scratch, I didn't want to fight the same issues if there was an easy way out of it... any ideas? Thanks, Eric From sechrest at peak.org Fri Nov 21 18:00:04 2003 From: sechrest at peak.org (John Sechrest) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: Your message of Fri, 21 Nov 2003 16:35:12 PST. Message-ID: <200311220000.hAM004g00719@jas.peak.org> Why not do a text substitution ? Do you have any indicator of what an end of field looks like? Can you say that a record only ends when you have a " on the end of a line? Or do you have to count the records. Sounds like a pre-parser to force things into the right form is a good place to start. Eric Shore Baur writes: % % I doing an import from a CSV-style text file into a SQL database. % The data is set up so that I have one set of text files with a field % listing in them (so I know what matches up with what) and then the data % files in a parent directory. % The data format looks something like this: % % "title","some text","a date is next",1999/05/10,T,123,F,F,T,"more text" % % Fine... I can import that. Unfortunatly, some of the records have % embeded newlines in them, so you end up with something like this: % % "title","some text","a date is next",1999/05/10,T,123,F,F,T,"more text % goes here % until % the record % is done" % % ... or, potentially: % % "title","some text goes % over % multiple % lines","a date is next",1999/05/10,T,123,F,F,T,"more text" % % What I've been doing is simply doing the data import - letting % those screwed up lines fail when the SQL inserts run and then going back % and hand entering the screwed up data (since I"ll end up with partial % records, so I can search for the missing last field). This is not, % however, a very maintainable method. (I have to re-import things when the % data set changes, I get all new files, not just changes.) % Is there any neat/slick way to get this data in there on the first % pass? I tried using ParseWords, but I'm not sure if I utilized it to its % fullest extent. I briefly played with a CSV driver for DBI, but it % couldn't handle things split over the newlines, either. % % This was awhile ago that I did this in the first place, I'm just % picking the project back up off the shelf, so to speak. Although I had % kind of figured I'd have to re-write from scratch, I didn't want to fight % the same issues if there was an easy way out of it... any ideas? 
% % Thanks, % Eric % % _______________________________________________ % Pdx-pm-list mailing list % Pdx-pm-list@mail.pm.org % http://mail.pm.org/mailman/listinfo/pdx-pm-list ----- John Sechrest . Helping people use CTO PEAK - . computers and the Internet Public Electronic . more effectively Access to Knowledge,Inc . 1600 SW Western, Suite 180 . Internet: sechrest@peak.org Corvallis Oregon 97333 . (541) 754-7325 . http://www.peak.org/~sechrest From jeff at vpservices.com Fri Nov 21 18:03:45 2003 From: jeff at vpservices.com (Jeff Zucker) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: References: Message-ID: <3FBEA7E1.7010800@vpservices.com> Eric Shore Baur wrote: Why not use DBD::CSV, which will let you query the data files with SQL and which handles embedded newlines just fine. -- Jeff > I doing an import from a CSV-style text file into a SQL database. >The data is set up so that I have one set of text files with a field >listing in them (so I know what matches up with what) and then the data >files in a parent directory. > The data format looks something like this: > >"title","some text","a date is next",1999/05/10,T,123,F,F,T,"more text" > > Fine... I can import that. Unfortunatly, some of the records have >embeded newlines in them, so you end up with something like this: > >"title","some text","a date is next",1999/05/10,T,123,F,F,T,"more text >goes here >until >the record >is done" > > ... or, potentially: > >"title","some text goes >over >multiple >lines","a date is next",1999/05/10,T,123,F,F,T,"more text" > > What I've been doing is simply doing the data import - letting >those screwed up lines fail when the SQL inserts run and then going back >and hand entering the screwed up data (since I"ll end up with partial >records, so I can search for the missing last field). This is not, >however, a very maintainable method. (I have to re-import things when the >data set changes, I get all new files, not just changes.) > Is there any neat/slick way to get this data in there on the first >pass? I tried using ParseWords, but I'm not sure if I utilized it to its >fullest extent. I briefly played with a CSV driver for DBI, but it >couldn't handle things split over the newlines, either. > > This was awhile ago that I did this in the first place, I'm just >picking the project back up off the shelf, so to speak. Although I had >kind of figured I'd have to re-write from scratch, I didn't want to fight >the same issues if there was an easy way out of it... any ideas? > >Thanks, >Eric > >_______________________________________________ >Pdx-pm-list mailing list >Pdx-pm-list@mail.pm.org >http://mail.pm.org/mailman/listinfo/pdx-pm-list > > > > From ckuskie at dalsemi.com Fri Nov 21 18:19:55 2003 From: ckuskie at dalsemi.com (Colin Kuskie) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: References: Message-ID: <20031122001955.GJ8719@dalsemi.com> On Fri, Nov 21, 2003 at 04:35:12PM -0800, Eric Shore Baur wrote: > > Is there any neat/slick way to get this data in there on the first > pass? I tried using ParseWords, but I'm not sure if I utilized it to its > fullest extent. I briefly played with a CSV driver for DBI, but it > couldn't handle things split over the newlines, either. If the number of columns in each file is a constant, then you could try the following: Get a line. Feed it into some module that handles CSV and returns an array of elements Do I have enough columns? 
NO: Remove newline from present line; Fetch another line from file; Append it to the current line and check again. YES: Push data into database. From jeff at vpservices.com Fri Nov 21 18:32:41 2003 From: jeff at vpservices.com (Jeff Zucker) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: <20031122001955.GJ8719@dalsemi.com> References: <20031122001955.GJ8719@dalsemi.com> Message-ID: <3FBEAEA9.40608@vpservices.com> Colin Kuskie wrote: >Get a line. >Feed it into some module that handles CSV and returns an array of elements > If that module is Text::CSV_XS, then the rest of this is irrelevant because it handles embedded newlines. But since Eric is dealing with DBI and a database already, I can't think of any reason that DBD::CSV isn't the right tool for this job, but I'm prejudiced. >Do I have enough columns? >NO: Remove newline from present line; > Fetch another line from file; > Append it to the current line and check again. >YES: Push data into database. > Text::CSV_XS and DBD::CSV do all that for you. -- Jeff From bruce at gridpoint.com Fri Nov 21 22:11:35 2003 From: bruce at gridpoint.com (Bruce J Keeler) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question In-Reply-To: References: Message-ID: <1069474294.2179.154.camel@scrunge.gridpoint.com> On Fri, 2003-11-21 at 14:23, Roderick A. Anderson wrote: > First the append hash solution. I use the hash generated from some CGI.pm > params then query a SQL Server database and and use DataHash to returned > the row. To append to the original hash I'm using a variation on code I > got out of "Perl Cookbook". > > %SignUpInfo = (%SignUpInfo, $db2->DataHash()); > > Is there a better or more efficient way to do this? This will recreate the hash, re-adding all the elements that were there before. In cases where there are a lot of items in the hash to begin with, that's going to be inefficient. Something like this might be better in that case: %tmp = $db2->DataHash(); @SignUpInfo{keys %tmp} = values %tmp; Though now the new hash entries are going to be hashed twice instead, so that's less efficient in the case where there's more data being added than was there to begin with. --Bruce From merlyn at stonehenge.com Fri Nov 21 22:25:09 2003 From: merlyn at stonehenge.com (Randal L. Schwartz) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question In-Reply-To: <1069474294.2179.154.camel@scrunge.gridpoint.com> References: <1069474294.2179.154.camel@scrunge.gridpoint.com> Message-ID: <868ym9ghcv.fsf@blue.stonehenge.com> >>>>> "Bruce" == Bruce J Keeler writes: Bruce> This will recreate the hash, re-adding all the elements that were there Bruce> before. In cases where there are a lot of items in the hash to begin Bruce> with, that's going to be inefficient. Something like this might be Bruce> better in that case: Bruce> %tmp = $db2->DataHash(); Bruce> @SignUpInfo{keys %tmp} = values %tmp; Bruce> Though now the new hash entries are going to be hashed twice instead, so Bruce> that's less efficient in the case where there's more data being added Bruce> than was there to begin with. I think it's even been shown that iteration is better: my @array = $db2->DataHash(); while (@array) { $SignUpInfo{shift @array} = shift @array; } Of course, I'm cheating here, knowing that the left side is eval'ed before the right. If you don't want that much magic: my %tmp = $db2->DataHash(); while (my ($k, $v) = each %tmp) { $SignUpInfo{$k} = $v; } -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. 
- +1 503 777 0095 Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! From cdawson at webiphany.com Fri Nov 21 23:18:47 2003 From: cdawson at webiphany.com (Chris Dawson) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Net::SCP Message-ID: <3FBEF1B7.1090308@webiphany.com> Does anyone have experience using this? I plan to allow a user to upload files from a web script, and I am wondering if someone has an elegant way of generating keys. Perldoc suggests using keys over setting a password, and I see the logic here rather than storing them in cleartext within a script, but am not sure if I am thinking about this in the correct way. It would be nice if there were a OO method exposed for doing this sort of thing. This might sound like a rant or complaint about the interface, but trust me, it is a question. :) Thanks, Chris From bruce at gridpoint.com Sat Nov 22 15:38:18 2003 From: bruce at gridpoint.com (Bruce J Keeler) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question In-Reply-To: <868ym9ghcv.fsf@blue.stonehenge.com> References: <1069474294.2179.154.camel@scrunge.gridpoint.com> <868ym9ghcv.fsf@blue.stonehenge.com> Message-ID: <1069537097.2179.185.camel@scrunge.gridpoint.com> On Fri, 2003-11-21 at 20:25, Randal L. Schwartz wrote: > > I think it's even been shown that iteration is better: > > my @array = $db2->DataHash(); > while (@array) { > $SignUpInfo{shift @array} = shift @array; > } > Makes sense as it doesn't have to compute hash keys for %tmp. > Of course, I'm cheating here, knowing that the left side > is eval'ed before the right. If you don't want that much magic: > > my %tmp = $db2->DataHash(); > while (my ($k, $v) = each %tmp) { > $SignUpInfo{$k} = $v; > } You're saying that's cheaper than > @SignUpInfo{keys %tmp} = values %tmp; ? This I found hard to believe. Why would Perl pessimize it so? I whipped up the following: #!/usr/bin/perl use Benchmark qw( cmpthese ); push (@array, rand, rand) for (1..100); cmpthese ( -10, { iterated_hash => sub { my %dest; my %tmp = @array; while (my ($k, $v) = each %tmp) { $dest{$k} = $v; } }, atonce => sub { my %dest; my %tmp = @array; @dest{keys %tmp} = values %tmp; }, iterated_array => sub { my %dest; my @tmp = @array; while (@tmp) { $dest{shift @tmp} = shift @tmp; } }, } ); Results: Rate iterated_array iterated_hash atonce iterated_array 1198/s -- -56% -71% iterated_hash 2709/s 126% -- -34% atonce 4098/s 242% 51% -- It seems that the array method is worst of all. Most interesting. 
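One sanity check worth keeping next to those timings: the three styles are only interchangeable if they build the same hash. A quick sketch with made-up data (separate from the benchmark script) confirming that they do:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %orig  = ( a => 1, b => 2 );
    my @pairs = ( b => 20, c => 30 );   # flat key/value list, standing in for DataHash()

    # plain hash assignment
    my %h_assign = ( %orig, @pairs );

    # hash slice through a temporary hash (the "atonce" style)
    my %h_atonce = %orig;
    my %tmp      = @pairs;
    @h_atonce{ keys %tmp } = values %tmp;

    # walking the flat list two elements at a time (the "iterated_array" style)
    my %h_iter = %orig;
    my @copy   = @pairs;
    while (@copy) {
        my $k = shift @copy;
        my $v = shift @copy;
        $h_iter{$k} = $v;
    }

    # all three should print: a=1 b=20 c=30
    for my $h ( \%h_assign, \%h_atonce, \%h_iter ) {
        print join( ' ', map { "$_=$h->{$_}" } sort keys %$h ), "\n";
    }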
My perl is: bruce@scrunge| /tmp % perl -V Summary of my perl5 (revision 5.0 version 8 subversion 2) configuration: Platform: osname=linux, osvers=2.4.22-xfs+ti1211, archname=i386-linux-thread-multi uname='linux kosh 2.4.22-xfs+ti1211 #1 sat oct 25 10:11:37 est 2003 i686 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8.2 -Darchlib=/usr/lib/perl/5.8.2 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.2 -Dsitearch=/usr/local/lib/perl/5.8.2 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.2 -Dd_dosuid -des' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O3', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include' ccversion='', gccversion='3.3.2 (Debian)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so.5.8.2 gnulibc_version='2.3.2' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT Built under linux Compiled at Nov 15 2003 17:52:08 @INC: /etc/perl /usr/local/lib/perl/5.8.2 /usr/local/share/perl/5.8.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8.2 /usr/share/perl/5.8.2 /usr/local/lib/site_perl /usr/local/lib/perl/5.8.0 /usr/local/share/perl/5.8.0 . From rootbeer at redcat.com Sat Nov 22 21:50:03 2003 From: rootbeer at redcat.com (Tom Phoenix) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Fireside Cafe, open 124 hours a week! In-Reply-To: <20030731064406.GC24299@windhund.schwern.org> References: <20030731064406.GC24299@windhund.schwern.org> Message-ID: On Wed, 30 Jul 2003, Michael G Schwern wrote: > BEST COFFEEHOUSE EVER > > Fireside Cafe, SE 13th and Powell. Free wireless. Free ethernet. Power > jacks galore. Comfy faux "cabin in the woods" feel. Friendly clerk > (the owner). Little side room with rocking chairs for quietness. > Populated mostly by studying college students. > > And the hours: Open 7pm Sunday to Midnight on Friday. 124 hours a > week! YOW! And now, it's open eternally. (Which sounds better to me than saying "24/7".) Also, they have sandwiches and some snacks, so you don't have to live on coffee alone. 
But the bad news is that I can no longer connect to their free WiFi. It worked an hour ago. But after I put my Mac to sleep and re-awakened it, "There was an error joining the selected AirPort network." The attendant on duty says that "Macs sometimes have problems", but can't offer any more assistance. And I can't find anything in log files or elsewhere telling me more about what's going on. Dang this user-friendly interface! Details for those who are interested: PowerBook G4 using internal AirPort card with Mac OS X 10.2.8. It worked without a problem on the first try; I selected the network name from the pop-up menu and needed no password. I used it for over an hour. Trying to reconnect, I tried manually specifying the network name, tried no password, tried made-up passwords, tried the network name as a password, told it to use the strongest network, told it to use the last-used network, told it to use a specific network, called it some bad names, tried totally different network settings (using my cell phone to connect to Verizon's network, which works fine; I'm using it now), then went back and tried everything again. My best theory: When my machine went to sleep, it failed to properly disconnect from their access point. When I try to connect again, their box thinks my MAC address is already connected, and won't let me connect back up. (Can this happen to WiFi?) Although this implies that the problem would be cured by rebooting their equipment, I can't suggest that when at least half a dozen folks are using their access point at the moment. Anybody else have this happen? Did you figure out how to cure the problem? --Tom Phoenix From ebaur at aracnet.com Mon Nov 24 10:04:09 2003 From: ebaur at aracnet.com (Eric Shore Baur) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: <3FBEAEA9.40608@vpservices.com> Message-ID: Sorry for being quiet for a couple days... got busy. :) At any rate, this was kind of an old program and (if I remember correctly) DBD:CSV was not handling things properly at the time... it sounds like it does handle embeded newlines properly now, so I just need to give it another go. On the other hand, I may be able to do some pre-processing, like another post suggested. I was trying to make sure I had the right line and put it into the database all at the same time... I think its a much better idea to make two passes instead. Thanks for all the suggestions, Eric On Fri, 21 Nov 2003, Jeff Zucker wrote: > Colin Kuskie wrote: > > >Get a line. > >Feed it into some module that handles CSV and returns an array of elements > > > If that module is Text::CSV_XS, then the rest of this is irrelevant > because it handles embedded newlines. But since Eric is dealing with > DBI and a database already, I can't think of any reason that DBD::CSV > isn't the right tool for this job, but I'm prejudiced. > > >Do I have enough columns? > >NO: Remove newline from present line; > > Fetch another line from file; > > Append it to the current line and check again. > >YES: Push data into database. > > > Text::CSV_XS and DBD::CSV do all that for you. > > From MichaelRWolf at att.net Sun Nov 23 03:38:03 2003 From: MichaelRWolf at att.net (Michael R. 
Wolf) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: <3FBEAEA9.40608@vpservices.com> (Jeff Zucker's message of "Fri, 21 Nov 2003 16:32:41 -0800") References: <20031122001955.GJ8719@dalsemi.com> <3FBEAEA9.40608@vpservices.com> Message-ID: Jeff Zucker writes: > Colin Kuskie wrote: > >>Get a line. >>Feed it into some module that handles CSV and returns an array of elements >> > If that module is Text::CSV_XS, then the rest of this is irrelevant > because it handles embedded newlines. But since Eric is dealing with > DBI and a database already, I can't think of any reason that DBD::CSV > isn't the right tool for this job, but I'm prejudiced. Prejudiced has such a negative connotation.. how 'bout well-informed? [...] Wanting to be more informed... It appears that Text::CSV will *not* handle multi-line "records". Will any other non-DBD module handle multi-line CSV's. This is a timely thread for me. I just received a multi-line CSV from an application. It appears to read into Excel OK, but not OpenOffice. Since I'm the token Open Source guy in a mixed open/non-open new venture, I'm wanting to get things done, and also wanting to use Perl when I can. It's good to know that the DBD::CSV module does what I want. Do other CSV modules also handle multi-line records? Thanks, Michael Wolf -- Michael R. Wolf All mammals learn by playing! MichaelRWolf@att.net From jeff at vpservices.com Mon Nov 24 10:18:11 2003 From: jeff at vpservices.com (Jeff Zucker) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: References: <20031122001955.GJ8719@dalsemi.com> <3FBEAEA9.40608@vpservices.com> Message-ID: <3FC22F43.4020404@vpservices.com> Michael R. Wolf wrote: >Prejudiced has such a negative connotation.. how 'bout well-informed? > > How about -- is the maintainer of the module in question :-) >It appears that Text::CSV will *not* handle multi-line "records". > > Yes, it will. set binary=1. If you have problems, let me know, I'm also its maintainer now. -- Jeff From jeff at vpservices.com Mon Nov 24 10:42:28 2003 From: jeff at vpservices.com (Jeff Zucker) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] reading a broken CSV file In-Reply-To: <3FC22F43.4020404@vpservices.com> References: <20031122001955.GJ8719@dalsemi.com> <3FBEAEA9.40608@vpservices.com> <3FC22F43.4020404@vpservices.com> Message-ID: <3FC234F4.6080200@vpservices.com> Jeff Zucker wrote: > Michael R. Wolf wrote: > >> It appears that Text::CSV will *not* handle multi-line "records". >> >> > Yes, it will. set binary=1. If you have problems, let me know, I'm > also its maintainer now. Grrr, I should read before responding. You're correct Text::CSV does not handle newlines but Text::CSV_XS does. I don't maintain the former, I do maintain the later. -- Jeff From robbyrussell at pdxlug.org Mon Nov 24 11:06:50 2003 From: robbyrussell at pdxlug.org (Robby Russell) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Fireside Cafe, open 124 hours a week! In-Reply-To: References: <20030731064406.GC24299@windhund.schwern.org> Message-ID: <3FC23AAA.7060909@pdxlug.org> Tom Phoenix wrote: > On Wed, 30 Jul 2003, Michael G Schwern wrote: > > >>BEST COFFEEHOUSE EVER >> >>Fireside Cafe, SE 13th and Powell. Free wireless. Free ethernet. Power >>jacks galore. Comfy faux "cabin in the woods" feel. Friendly clerk >>(the owner). Little side room with rocking chairs for quietness. >>Populated mostly by studying college students. >> >>And the hours: Open 7pm Sunday to Midnight on Friday. 
124 hours a PDXLUG (http://www.pdxlug.org/) hosts their monthly meetings there. Good place. -Robby From raanders at acm.org Mon Nov 24 14:57:23 2003 From: raanders at acm.org (Roderick A. Anderson) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question In-Reply-To: Message-ID: On Fri, 21 Nov 2003, Joe Oppegaard wrote: > use Data::Dumper; > print Dumper(\%a); I'll do that if it get to be too much of an irritation. I do need to look for errors. Or at least more indepth errors. I am selecting some attributes from several tables using a username I know is good. (I already checked it.) No other error checks but thinking on this it could be the way the Win32::ODBC DataHash method is handling duplicate attribute names from different tables even though I'm using table identifiers in the select. Thanks for the ideas. Rod -- "Open Source Software - Usually you get more than you pay for..." "Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL" From raanders at acm.org Mon Nov 24 15:09:37 2003 From: raanders at acm.org (Roderick A. Anderson) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question In-Reply-To: <868ym9ghcv.fsf@blue.stonehenge.com> Message-ID: On 21 Nov 2003, Randal L. Schwartz wrote: > I think it's even been shown that iteration is better: > > my @array = $db2->DataHash(); > while (@array) { > $SignUpInfo{shift @array} = shift @array; > } Very cool. Amazingly simple once you see it. > Of course, I'm cheating here, knowing that the left side > is eval'ed before the right. If you don't want that much magic: This is a great point to keep in mind. Magic works for me. And when someone else looks at the code I can impress the hell out of them with this neat trick. :-) > my %tmp = $db2->DataHash(); > while (my ($k, $v) = each %tmp) { > $SignUpInfo{$k} = $v; > } I was hoping to avoid this as it seemed _so_ sledge-hammerish. I used it for awhile but wasn't happy with the concept. I really like the one above. Thanks for the insight. Rod -- "Open Source Software - Usually you get more than you pay for..." "Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL" From raanders at acm.org Mon Nov 24 15:18:51 2003 From: raanders at acm.org (Roderick A. Anderson) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question In-Reply-To: <1069537097.2179.185.camel@scrunge.gridpoint.com> Message-ID: On Sat, 22 Nov 2003, Bruce J Keeler wrote: > You're saying that's cheaper than > > > @SignUpInfo{keys %tmp} = values %tmp; This too looks properly perlish. > Results: > > Rate iterated_array iterated_hash atonce > iterated_array 1198/s -- -56% -71% > iterated_hash 2709/s 126% -- -34% > atonce 4098/s 242% 51% -- > > It seems that the array method is worst of all. Most interesting. Hum. Looks over speed. Reminds me of my younger years and the low-riders verses the hot-rodders. But since the atonce is fastest and looks great I could get the best of both worlds. A low rider that performs. Great paint job and dual quads. Cool! Rod -- "Open Source Software - Usually you get more than you pay for..." "Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL" From raanders at acm.org Mon Nov 24 15:23:25 2003 From: raanders at acm.org (Roderick A. Anderson) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Slow replies Message-ID: Sorry to be so slow replying. A procmail typo had the messages going into a non-visible folder. I kept checking all week-end as was starting to mutter ill things until I checked the list archives and realized it was on my end. 
Rod -- "Open Source Software - Usually you get more than you pay for..." "Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL" From raanders at acm.org Mon Nov 24 16:00:47 2003 From: raanders at acm.org (Roderick A. Anderson) Date: Mon Aug 2 21:34:26 2004 Subject: [Pdx-pm] Hash question In-Reply-To: <1069537097.2179.185.camel@scrunge.gridpoint.com> Message-ID: On Sat, 22 Nov 2003, Bruce J Keeler wrote: > You're saying that's cheaper than > > > @SignUpInfo{keys %tmp} = values %tmp; Now trying to put this in place I'm confused as hell. Shouldn't this be $SignUpInfo{keys %tmp} = values %tmp; I know perl usually does the right thing with what it's handed but I really don't understand the @something{} verses the $something{} here. Do the curly braces over-ride(?) the at-symbol? Rod -- "Open Source Software - Usually you get more than you pay for..." "Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL" From tcaine at eli.net Mon Nov 24 16:46:12 2003 From: tcaine at eli.net (Todd Caine) Date: Mon Aug 2 21:34:27 2004 Subject: [Pdx-pm] Hash question In-Reply-To: References: <1069537097.2179.185.camel@scrunge.gridpoint.com> Message-ID: <20031124224612.GB12285@eli.net> It's called a hash slice. http://tlc.perlarchive.com/articles/perl/ug0001.shtml On (Mon, Nov 24 14:00), Roderick A. Anderson wrote: > > > @SignUpInfo{keys %tmp} = values %tmp; > > Now trying to put this in place I'm confused as hell. Shouldn't this be > > $SignUpInfo{keys %tmp} = values %tmp; > > I know perl usually does the right thing with what it's handed but I > really don't understand the @something{} verses the $something{} here. > Do the curly braces over-ride(?) the at-symbol? From merlyn at stonehenge.com Mon Nov 24 16:59:56 2003 From: merlyn at stonehenge.com (Randal L. Schwartz) Date: Mon Aug 2 21:34:27 2004 Subject: [Pdx-pm] Hash question In-Reply-To: References: Message-ID: <864qwt2x06.fsf@blue.stonehenge.com> >>>>> "Roderick" == Roderick A Anderson writes: Roderick> On Sat, 22 Nov 2003, Bruce J Keeler wrote: >> You're saying that's cheaper than >> >> > @SignUpInfo{keys %tmp} = values %tmp; Roderick> Now trying to put this in place I'm confused as hell. Shouldn't this be Roderick> $SignUpInfo{keys %tmp} = values %tmp; Roderick> I know perl usually does the right thing with what it's handed but I Roderick> really don't understand the @something{} verses the $something{} here. One is a hash element, the other is a hash slice. Roderick> Do the curly braces over-ride(?) the at-symbol? Define "override". In the same manner that we go from: $array[3] = "fred"; to @array[3, 5, 8] = ("fred", "barney", "dino"); we go from: $hash{"fred"} = "flintstone"; to @hash{("fred", "barney", "dino")} = ("flintstone", "rubble", undef); A hash slice sets many items at once in hash, like an array slice sets many items at once in an array. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! From raanders at acm.org Mon Nov 24 18:24:47 2003 From: raanders at acm.org (Roderick A. Anderson) Date: Mon Aug 2 21:34:27 2004 Subject: [Pdx-pm] Hash question In-Reply-To: <864qwt2x06.fsf@blue.stonehenge.com> Message-ID: On 24 Nov 2003, Randal L. Schwartz wrote: > Define "override". Can't but I see further down you've explained it. 
> @hash{("fred", "barney", "dino")} = ("flintstone", "rubble", undef); > > A hash slice sets many items at once in hash, like an array slice > sets many items at once in an array. Thanks to you and Todd, Rod -- "Open Source Software - Usually you get more than you pay for..." "Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL" From tkil at scrye.com Mon Nov 24 22:02:26 2003 From: tkil at scrye.com (Tkil) Date: Mon Aug 2 21:34:27 2004 Subject: [Pdx-pm] Hash question In-Reply-To: <1069537097.2179.185.camel@scrunge.gridpoint.com> References: <1069474294.2179.154.camel@scrunge.gridpoint.com> <868ym9ghcv.fsf@blue.stonehenge.com> <1069537097.2179.185.camel@scrunge.gridpoint.com> Message-ID: Regarding various ways to append one hash onto another, I played with it a bit this afternoon. Parameters to tune for: * difficulty of coding * size of existing hash * number of elements to add to existing hash * are auxilary data structures required * memory efficiency * time efficiency Note that I cheated: in all my tests below, number of existing elements is equal to the number being added. Test program at: http://scrye.com/~tkil/perl/append-hash.plx Observations: The simplest method is hash assignment. It is regularly about half the speed of the other methods (this might be an artifact of my "cheat" above, though.) It is easy to code and difficult to get wrong. Until you know this is your bottleneck, consider sticking with this. For sets up to the low thousands (on this hardware, at least), using array slices of @array as fodder for hash slice assignment into %dest seems the fastest. This is the "even/odd" approach described below. For small sets, this can be nearly 40% faster than any other method examined. For small sets (less than 100 elements or so), Bruce's copy-hash-at-once is the best method that doesn't require any "outside" knowledge. Larger that, the best self-contained method investigated below is to index into the source array directly. Some speedups (5-10%) can be obtained by unrolling the loops. Summary rankings of simplest and fastest methods: 10 elements 100 elements 1000 elements 10000 elements ---------------- --------------- ---------------- ---------------- h_assign 15085/s h_assign 1485/s h_assign 95.6/s h_assign 4.97/s duffs_16 22318/s a_index 2326/s h_atonce 148/s h_atonce 7.00/s a_index 22454/s h_atonce 2425/s a_index 165/s a_index 9.88/s h_atonce 23247/s unr_16 2571/s duffs_16 179/s even_odd 9.94/s unr_16 23423/s duffs_16 2594/s unr_16 181/s duffs_16 10.3/s even_odd 31992/s even_odd 3283/s even_odd 195/s unr_16 10.4/s Randal, Tom -- did anything ever come out of the p5p discussions to allow "push HASH, LIST" to do this? Or am I making that up? Long-winded crap: To make it a bit more realistic, and to try it out with different sizes of hashes, I put it into a big loop and looked at 10, 100, 1000, and 10000-element hashes and arrays: | for my $n_hash_elts ( 10, 100, 1000, 10000 ) | { | my @array = map { rand } 1 .. 2*$n_hash_elts; | my %orig_dest = map { rand } @array; On each of the sets of benchmarks, I included the previous winner: | my $h_atonce = sub { | my %dest = %orig_dest; | my %tmp = @array; | @dest{keys %tmp} = values %tmp; | }; And my first alternate solution, using array indexing to avoid the cost of copying or modifying @array: | my $a_index = sub { | my %dest = %orig_dest; | for ( my $i = 0; $i < @array; $i += 2 ) { | $dest{ $array[$i] } = $array[$i+1]; | } | }; A failed experiment, where I just tried the brute force approach. 
(Note that this isn't really a failure; if this is fast enough, by all means, use it...) | my $h_assign = sub { | my %dest = %orig_dest; | %dest = ( %dest, @array ); | }; Another failed experiment, where I tried to use 'splice' to minimize the number of changes to the @tmp copy of @array: | my $a_splice = sub { | my %dest = %orig_dest; | my @tmp = @array; | while (@tmp) { | my ( $k, $v ) = splice @tmp, 0, 2; | $dest{$k} = $v; | } | }; Some simple unrollings of the array index case, at 8, 16, and 32 elements (I included 24 elements later): | my $unr_8 = sub { | my %dest = %orig_dest; | my $i = 0; | while ( $i < @array-8 ) { | $dest{ $array[$i ] } = $array[$i+1]; | $dest{ $array[$i+2 ] } = $array[$i+3]; | $dest{ $array[$i+4 ] } = $array[$i+5]; | $dest{ $array[$i+6 ] } = $array[$i+7]; | $i += 8; | } | while ( $i < @array ) | { | $dest{ $array[$i ] } = $array[$i+1]; | $i += 2; | } | }; On the "that makes me feel dirty" scale, how about one that uses Duff's Device? | my $duffs_16 = sub { | my %dest = %orig_dest; | my $t = @array % 16; | my $i = $t - 16; | goto "TARGET_$t"; | while ( $i < @array ) { | TARGET_0: $dest{ $array[$i ] } = $array[$i+1 ]; | TARGET_14: $dest{ $array[$i+2 ] } = $array[$i+3 ]; | TARGET_12: $dest{ $array[$i+4 ] } = $array[$i+5 ]; | TARGET_10: $dest{ $array[$i+6 ] } = $array[$i+7 ]; | TARGET_8: $dest{ $array[$i+8 ] } = $array[$i+9 ]; | TARGET_6: $dest{ $array[$i+10] } = $array[$i+11]; | TARGET_4: $dest{ $array[$i+12] } = $array[$i+13]; | TARGET_2: $dest{ $array[$i+14] } = $array[$i+15]; | $i += 16; | } | }; And, in my final bow to the benchmarking gods: | my @odd = grep { $_ & 1 } 0 .. $#array; | my @even = map { $_-1 } @odd; | | my $even_odd = sub { | my %dest = %orig_dest; | @dest{@array[@even]} = @array[@odd]; | }; There is another implementation that comes to mind, if we can assert these conditions: 1. Having extra entries in %dest is ok; and 2. The universe of keys is fully distinct from the universe of values. 2. No values are undef (or you are running in "no warnings"): Then you could do something like: @dest{ '', @array } = ( @array, '' ); Heh. "Careful with that axe, Eugene!" 
Anyway, here are results for various set sizes: | $ ./append-hash.plx | | === 10 elements === | | original methods: | Rate a_shift h_each a_index h_atonce | a_shift 9878/s -- -47% -56% -58% | h_each 18654/s 89% -- -17% -21% | a_index 22540/s 128% 21% -- -5% | h_atonce 23685/s 140% 27% 5% -- | | failed experiments: | Rate h_assign a_splice a_index h_atonce | h_assign 15085/s -- -0% -33% -36% | a_splice 15123/s 0% -- -33% -36% | a_index 22583/s 50% 49% -- -4% | h_atonce 23559/s 56% 56% 4% -- | | unrolled: | Rate unr_32 a_index unr_8 h_atonce unr_16 | unr_32 21599/s -- -4% -7% -8% -9% | a_index 22454/s 4% -- -3% -4% -6% | unr_8 23239/s 8% 3% -- -1% -2% | h_atonce 23420/s 8% 4% 1% -- -2% | unr_16 23833/s 10% 6% 3% 2% -- | | the contenders: | Rate duffs_16 a_index h_atonce unr_16 even_odd | duffs_16 22318/s -- -1% -4% -5% -30% | a_index 22454/s 1% -- -3% -4% -30% | h_atonce 23247/s 4% 4% -- -1% -27% | unr_16 23423/s 5% 4% 1% -- -27% | even_odd 31992/s 43% 42% 38% 37% -- | | === 100 elements === | | original methods: | Rate a_shift h_each a_index h_atonce | a_shift 1018/s -- -48% -56% -59% | h_each 1942/s 91% -- -17% -22% | a_index 2329/s 129% 20% -- -6% | h_atonce 2477/s 143% 28% 6% -- | | failed experiments: | Rate h_assign a_splice a_index h_atonce | h_assign 1485/s -- -4% -36% -38% | a_splice 1544/s 4% -- -34% -36% | a_index 2325/s 57% 51% -- -3% | h_atonce 2401/s 62% 56% 3% -- | | unrolled: | Rate a_index h_atonce unr_8 unr_32 unr_16 | a_index 2322/s -- -4% -7% -9% -10% | h_atonce 2431/s 5% -- -3% -5% -6% | unr_8 2504/s 8% 3% -- -2% -3% | unr_32 2546/s 10% 5% 2% -- -1% | unr_16 2574/s 11% 6% 3% 1% -- | | the contenders: | Rate a_index h_atonce unr_16 duffs_16 even_odd | a_index 2326/s -- -4% -10% -10% -29% | h_atonce 2425/s 4% -- -6% -7% -26% | unr_16 2571/s 11% 6% -- -1% -22% | duffs_16 2594/s 12% 7% 1% -- -21% | even_odd 3283/s 41% 35% 28% 27% -- | | === 1000 elements === | | original methods: | Rate a_shift h_each h_atonce a_index | a_shift 88.5/s -- -39% -45% -54% | h_each 145/s 63% -- -10% -24% | h_atonce 160/s 81% 11% -- -16% | a_index 191/s 116% 32% 19% -- | | failed experiments: | Rate h_assign a_splice h_atonce a_index | h_assign 95.6/s -- -19% -37% -45% | a_splice 117/s 23% -- -23% -32% | h_atonce 153/s 60% 30% -- -12% | a_index 173/s 81% 48% 13% -- | | unrolled: | Rate h_atonce a_index unr_8 unr_32 unr_16 | h_atonce 152/s -- -10% -15% -15% -16% | a_index 169/s 12% -- -5% -5% -6% | unr_8 177/s 17% 5% -- -0% -2% | unr_32 178/s 17% 5% 0% -- -2% | unr_16 181/s 19% 7% 2% 2% -- | | the contenders: | Rate h_atonce a_index duffs_16 unr_16 even_odd | h_atonce 148/s -- -10% -17% -18% -24% | a_index 165/s 11% -- -8% -9% -15% | duffs_16 179/s 21% 9% -- -1% -8% | unr_16 181/s 22% 10% 1% -- -7% | even_odd 195/s 32% 18% 9% 8% -- | | === 10000 elements === | | original methods: | Rate a_shift h_atonce h_each a_index | a_shift 6.00/s -- -15% -21% -40% | h_atonce 7.06/s 18% -- -7% -30% | h_each 7.60/s 27% 8% -- -24% | a_index 10.0/s 67% 42% 32% -- | | failed experiments: | Rate h_assign h_atonce a_splice a_index | h_assign 4.97/s -- -29% -31% -50% | h_atonce 7.03/s 41% -- -2% -29% | a_splice 7.19/s 45% 2% -- -28% | a_index 9.94/s 100% 41% 38% -- | | unrolled: | Rate h_atonce a_index unr_8 unr_32 unr_16 | h_atonce 7.03/s -- -29% -31% -32% -32% | a_index 9.90/s 41% -- -3% -4% -5% | unr_8 10.2/s 45% 3% -- -1% -2% | unr_32 10.3/s 46% 4% 1% -- -1% | unr_16 10.4/s 48% 5% 2% 1% -- | | the contenders: | Rate h_atonce a_index even_odd duffs_16 unr_16 | h_atonce 7.00/s -- -29% -30% -32% -32% | a_index 
9.88/s 41% -- -1% -4% -5% | even_odd 9.94/s 42% 1% -- -3% -4% | duffs_16 10.3/s 47% 4% 4% -- -1% | unr_16 10.4/s 48% 5% 4% 1% -- For small data sets, the "even odd" approach pretty clearly dominates the field. The fact that there is some preprocessing involved doesn't disqualify it in my mind; in a database environment, you are often fetching exactly the same number of fields each time, so building the even/odd arrays once is not a problem. And note that these are indexes into the result arrays, not results themselves, so they're quite reusable. The "hash at once" construct does well until the sets get particularly large. It has an advantage over the "array index" in smaller sets, and is probably easier to code correctly offhand. "array index" is nearly as fast as "hash at once" with smaller sets, catching up as early as 100 elements. As an added bonus, it has the smallest memory and icache footprint of any of these methods. Unrolling the array index method does give additional speed, but it is probably not worth the code bulk. (Hm... the thought of using eval STRING to generate an unrolled subroutine at run time is tempting.) The fact that 16 is regularly faster than 8 and 32 is interesting; I wonder if I'm hitting an icache limitation at the 32. A quick run with 24 showed it performing about as well as the others | unrolled: | Rate h_atonce a_index unr_32 unr_8 unr_24 unr_16 | h_atonce 151/s -- -9% -14% -14% -15% -16% | a_index 167/s 10% -- -5% -5% -7% -7% | unr_32 175/s 16% 5% -- -0% -2% -3% | unr_8 175/s 16% 5% 0% -- -2% -2% | unr_24 178/s 18% 7% 2% 2% -- -1% | unr_16 180/s 19% 8% 3% 2% 1% -- Duff's Device is just silly, and provides less and less return as the set size gets larger.
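A practical footnote on the even/odd result: the index arrays depend only on the shape of the flat key/value list, so in the database case they can be built once and reused for every row. A small sketch, with invented row data standing in for whatever a per-row fetch (DataHash() or similar) hands back:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Flat key/value lists that all share one shape.
    my @rows = (
        [ id => 1, name => 'fred',   job => 'crane operator' ],
        [ id => 2, name => 'barney', job => 'sidekick'       ],
    );

    # Build the even/odd index arrays once, from the first row's shape.
    my @odd  = grep { $_ & 1 } 0 .. $#{ $rows[0] };   # value positions
    my @even = map  { $_ - 1 } @odd;                  # key positions

    for my $row (@rows) {
        my %record;
        # One hash-slice assignment per row, no temporary hash needed.
        @record{ @{$row}[@even] } = @{$row}[@odd];
        print "$record{id}: $record{name} ($record{job})\n";
    }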