From davidnicol at gmail.com Mon May 2 14:23:41 2011
From: davidnicol at gmail.com (David Nicol)
Date: Mon, 2 May 2011 16:23:41 -0500
Subject: [Kc] one way to handle commas in quotes strings in CSV files
Message-ID:

one way to handle commas in quoted strings in CSV files -- of course
if your data has angle brackets in it, or form feeds, you'll need a
different three characters.

while (<>){

    # check the raw line once; the loop below inserts \f itself
    /"/ and m/[<>\f]/ and die "ANGLE BRACKET OR FF IN INPUT! [$_]";
    while (/"/){
        s/"/</;    # opening quote
        s/"/>/ or die "unpaired quotes in [$_]";    # closing quote
        1 while s/<(.*?),(.*?)>/<$1\f$2>/;    # hide commas inside <...>
        s/[<>]//g;
    };
    my @line = split ',',$_;
    s/\f/,/g for @line;    # restore the hidden commas

    ...

};

--
"During his first performance of the song he received one of the
highest honors you can achieve in New York's underground rap scene:
audience members touched his sneakers after a few of the more
particularly tight lines." -- Cal Newport

From ironicface at earthlink.net Mon May 2 14:59:46 2011
From: ironicface at earthlink.net (Teal)
Date: Mon, 02 May 2011 16:59:46 -0500
Subject: [Kc] one way to handle commas in quotes strings in CSV files
In-Reply-To:
References:
Message-ID: <4DBF2952.2020903@earthlink.net>

On 5/2/2011 4:23 PM, David Nicol wrote:
> one way to handle commas in quoted strings in CSV files -- of course
> if your data has angle brackets in it, or form feeds, you'll need a
> different three characters.
>
> while (<>){
>
>     # check the raw line once; the loop below inserts \f itself
>     /"/ and m/[<>\f]/ and die "ANGLE BRACKET OR FF IN INPUT! [$_]";
>     while (/"/){
>         s/"/</;    # opening quote
>         s/"/>/ or die "unpaired quotes in [$_]";    # closing quote
>         1 while s/<(.*?),(.*?)>/<$1\f$2>/;    # hide commas inside <...>
>         s/[<>]//g;
>     };
>     my @line = split ',',$_;
>     s/\f/,/g for @line;    # restore the hidden commas
>
>     ...
>
> };
>

I usually use _Numeral (example: _1) as replacement values. The
underscore is rare in file bodies, and using a numeral with it gives
the ability to replace a set of entities. Of course, order can be very
important.
:)
Teal

From stephenclouse at gmail.com Mon May 2 15:53:34 2011
From: stephenclouse at gmail.com (Stephen Clouse)
Date: Mon, 2 May 2011 17:53:34 -0500
Subject: [Kc] one way to handle commas in quotes strings in CSV files
In-Reply-To: <4DBF2952.2020903@earthlink.net>
References: <4DBF2952.2020903@earthlink.net>
Message-ID:

use Text::CSV ();

--
Stephen Clouse

From davidnicol at gmail.com Tue May 3 10:34:19 2011
From: davidnicol at gmail.com (David Nicol)
Date: Tue, 3 May 2011 12:34:19 -0500
Subject: [Kc] subtle distinction that looks like a bug
Message-ID:

with 5.12, also with 5.10 and presumably earlier too:

>perl -le 'sub Q{@_[0,1,2] = (7,6,5)}; Q($a,$b,$c); print qq[$a $b $c\n];'
7 6 5

>perl -le 'sub Q{@_ = (7,6,5)}; Q($a,$b,$c); print qq[$a $b $c\n];'


>

I was trying to have a subroutine that resets the variables passed to
it when it is called; the second line is the approach I took, it
didn't work. The first approach works though.

It appears that in the second one, we're assigning to a fresh array
variable, the old one's contents having first been jettisoned, instead
of overwriting the elements therein.

The massive headaches that would ensue if my initial approach did what
I wanted are easy to imagine. Or are they? When an array variable
holds copies instead of aliases, as they usually do, there would be
difference.

From jason+kcpm at jlrush.com Tue May 3 13:36:41 2011
From: jason+kcpm at jlrush.com (Jason Rush)
Date: Tue, 3 May 2011 15:36:41 -0500
Subject: [Kc] subtle distinction that looks like a bug
In-Reply-To:
References:
Message-ID:

> The massive headaches that would ensue if my initial approach did what
> I wanted are easy to imagine. Or are they? When an array variable
> holds copies instead of aliases, as they usually do, there would be
> difference.
Isn't it due to Q being pass-by-reference that allows you to assign
$_[0] and have the value assigned to $a because it is the same memory
location?

If it was pass-by-copy and not pass-by-reference, the values from $a,
$b, and $c would be passed in and Q would not have access to the
memory locations for $a, $b, and $c and thus would not be able to
store values there.

From sterling at hanenkamp.com Tue May 3 14:49:49 2011
From: sterling at hanenkamp.com (Sterling Hanenkamp)
Date: Tue, 3 May 2011 16:49:49 -0500
Subject: [Kc] subtle distinction that looks like a bug
In-Reply-To:
References:
Message-ID:

On Tue, May 3, 2011 at 3:36 PM, Jason Rush wrote:

> > The massive headaches that would ensue if my initial approach did what
> > I wanted are easy to imagine. Or are they? When an array variable
> > holds copies instead of aliases, as they usually do, there would be
> > difference.
>
> Isn't it due to Q being pass-by-reference that allows you to assign
> $_[0] and have the value assigned to $a because it is the same memory
> location?
>
> If it was pass-by-copy and not pass-by-reference, the values from $a,
> $b, and $c would be passed in and Q would not have access to the
> memory locations for $a, $b, and $c and thus would not be able to
> store values there.
>
>
Right, in Perl, the values in @_ are aliased, which is effectively the
same as pass-by-reference in other languages. Though, I tend to think
of pass-by-reference in Perl as using explicit references, rather than
aliasing.

Aliasing only works as long as you use the @_ variable directly by
slicing or using an index lookup. If you use an assignment, the alias
is not copied. If you modify @_ itself, the aliasing may also go away.
For example, assigning to @_:

@_ = qw( a b c );

eliminates aliasing entirely because @_ now refers to a completely
different array. Subsequently, assigning to $_[0] won't do anything to
the original parameters.

Also, you have to be careful when modifying aliased parameters.
For example, if we make a small change to the original code:

perl -le 'sub Q{@_[0,1,2] = (7,6,5)}; Q(1,$b,$c); print qq[$a $b $c\n];'

Running that will get you:

Modification of a read-only value attempted at -e line 1.

In general, I try to avoid changing @_ or working with it directly. In
my experience, side-effects in parameters tend to lead to unintended
consequences.

--
Andrew Sterling Hanenkamp
sterling at hanenkamp.com
785.370.4454

From davidnicol at gmail.com Wed May 11 11:44:59 2011
From: davidnicol at gmail.com (David Nicol)
Date: Wed, 11 May 2011 13:44:59 -0500
Subject: [Kc] strawberry error messages are inferior to windows native error messages
Message-ID:

An idiom I have found useful, to process all files in the current
directory, with a certain extension, is:

for (glob("*.something")){
    open F, '<', $_ or die "open: $!";
    dosomethingwith(join '',<F>);
    rename $_, "$_.processed" or die "rename: $!"
};

This has served, and continues to serve, very well, in various
situations.

I tried that today on a Windows machine and got "rename: permission
denied." I further loosened the security on the working directory to
no effect. Eventually I tried

die `move $_ $_.processed`;

instead of the rename line and got a different error message, about
the file being opened by another process.

Then I remembered: Perl does not automatically close a file handle
when it is done reading it. You can seek() on the handle to reread the
file from the beginning, or maybe the file will grow at the end --
anyway, the handle is open, and Windows won't let you rename a file
that has open handles on it.

A "close F" before the rename took care of the problem.

Is this worth trying to file as a bug with the Strawberry Perl
project, that the error messages can be misleading?
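
For illustration only, a minimal sketch of the idiom with the explicit
close in place. It assumes a lexical filehandle instead of the bareword
F, and a hypothetical process_files() wrapper whose $handler callback
stands in for dosomethingwith(); neither name comes from the note
above, and the sketch has not been checked against Strawberry Perl
specifically.

```perl
use strict;
use warnings;

# Hypothetical helper: process every file matching $pattern, then rename
# each one to "<name>.processed". $handler stands in for dosomethingwith().
# Assumes $pattern contains no whitespace (glob would split on it).
sub process_files {
    my ($pattern, $handler) = @_;
    my @done;
    for my $file (glob $pattern) {
        open my $fh, '<', $file or die "open $file: $!";
        my $content = do { local $/; <$fh> };    # slurp the whole file
        close $fh or die "close $file: $!";      # release the handle before renaming
        $handler->($content);
        rename $file, "$file.processed" or die "rename $file: $!";
        push @done, "$file.processed";
    }
    return @done;
}
```

Closing before the rename keeps the loop portable: on systems that
allow renaming an open file the close is harmless, and on Windows it
removes the open handle that blocked the rename.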

In general, your takeaway from this note is supposed to be a reminder
that Perl (at least Strawberry -- haven't tried my idiom with
ActiveState) error messages on Windows are inferior to the error
messages from native tools, and may be incorrect.

From amoore at mooresystems.com Mon May 16 09:23:02 2011
From: amoore at mooresystems.com (Andrew Moore)
Date: Mon, 16 May 2011 11:23:02 -0500
Subject: [Kc] oops!
Message-ID:

I just realized that the 2nd Tuesday of the month was last week, not
this one. I'm sorry I missed the meeting this month!

Next month, it's the 14th, I guess.

-Andy

From stephenclouse at gmail.com Thu May 19 09:03:39 2011
From: stephenclouse at gmail.com (Stephen Clouse)
Date: Thu, 19 May 2011 11:03:39 -0500
Subject: [Kc] oops!
In-Reply-To:
References:
Message-ID:

On Mon, May 16, 2011 at 11:23 AM, Andrew Moore wrote:

> I just realized that the 2nd Tuesday of the month was last week, not
> this one. I'm sorry I missed the meeting this month!
>

And I was sick last week, or I would have sent you my usual reminder :)

--
Stephen Clouse