APM: Regular Expression Question

Jeff Sumner jeff_sumner at hotmail.com
Wed Aug 16 09:12:56 PDT 2006


What about simply using capture groups (backreferences)?

while (my $line = <>) {
    if ($line =~ /(\S+)\t(\S+)\t(\S+)\t(\S+)\t(\S+)\t(\S+)\t(\S+)/) {
        # qq{} interpolates $1..$7 and \t; qw{} would print the text literally
        print qq{"$1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t$7\n};
    }
}


Or I suppose you could do it in one line with a substitution:

s/(\S+)\t(\S+)\t(\S+)\t(\S+)\t(\S+)\t(\S+)\t(\S+)\n/"$1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t$7\n/g
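If the number of columns can vary, a shorter pattern may be enough. The following is only a sketch, not something tested against the real data: it assumes fields never contain embedded tabs or newlines, and the sub name quote_all_but_last is made up for illustration. It quotes every field that is followed by a tab, so the last column is left alone.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wrap every tab-terminated
# field in double quotes. The last field has no tab after it, so it is
# never matched and stays unquoted.
sub quote_all_but_last {
    my ($line) = @_;
    (my $quoted = $line) =~ s/([^\t\n]*)\t/"$1"\t/g;
    return $quoted;
}

# Demo on one line of the sample data from the original question.
print quote_all_but_last("FL\tDP-999\tActuals\tWIP\tFY2006\t6\t19,416.86\n");
```

Because the pattern uses * rather than +, an empty field comes out as "" instead of being skipped. Lines with a single column pass through unchanged.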

Jeff Sumner


>From: "Barron Snyder (CE CEN)" <Barron.Snyder at wholefoods.com>
>To: <austin at pm.org>
>Subject: Re: APM: Regular Expression Question
>Date: Wed, 16 Aug 2006 10:56:34 -0500
>
>Here is some sample data (input):
>FL	999-NO_SUBTEAM	Actuals	ASSETS	FY2006	6	19,416.86
>FL	DP-999	Actuals	150000	FY2006	6	19,416.86
>FL	DP-999	Actuals	WIP	FY2006	6	19,416.86
>FL	DP-999	Actuals	TOT_PPE	FY2006	6	19,416.86
>FL	DP-999	Actuals	LT_ASSET	FY2006	6	19,416.86
>FL	DP-999	Actuals	ASSETS	FY2006	6	19,416.86
>FL	NON_MARGIN	Actuals	510000	FY2006	6	11,866.97
>FL	NON_MARGIN	Actuals	SUPP_PKG	FY2006	6	11,866.97
>
>And here is what it should end up like (output):
>"FL"	"999-NO_SUBTEAM"	"Actuals"	"ASSETS"	"FY2006"	"6"	19,416.86
>"FL"	"DP-999"	"Actuals"	"150000"	"FY2006"	"6"	19,416.86
>"FL"	"DP-999"	"Actuals"	"WIP"	"FY2006"	"6"	19,416.86
>"FL"	"DP-999"	"Actuals"	"TOT_PPE"	"FY2006"	"6"	19,416.86
>"FL"	"DP-999"	"Actuals"	"LT_ASSET"	"FY2006"	"6"	19,416.86
>"FL"	"DP-999"	"Actuals"	"ASSETS"	"FY2006"	"6"	19,416.86
>"FL"	"NON_MARGIN"	"Actuals"	"510000"	"FY2006"	"6"	11,866.97
>"FL"	"NON_MARGIN"	"Actuals"	"SUPP_PKG"	"FY2006"	"6"	11,866.97
>
>All values except those in the final column should be wrapped in
>double-quotes and tabs should separate the values.
>
>My solution does it like this:
>...
>foreach my $piece (@pieces) {
>    	my @strings = split(/\t/, $piece);
>    	print DATA_OUT "\"", join ("\"\t\"", $strings[0], $strings[1],
>$strings[2], $strings[3], $strings[4], $strings[5]), "\"\t",
>$strings[6], "\n";
>}
>...
>
>But as I mentioned, in my effort to learn more about Perl, I thought there
>might be a more elegant way using regular expressions.
>
>Thanks,
>
>Barron Snyder
>
>-----Original Message-----
>From: austin-bounces+barron.snyder=wholefoods.com at pm.org
>[mailto:austin-bounces+barron.snyder=wholefoods.com at pm.org] On Behalf Of
>Zach Vonler
>Sent: Wednesday, August 16, 2006 10:42 AM
>To: austin at pm.org
>Subject: Re: APM: Regular Expression Question
>
>On 8/16/06, Jay Flaherty <jayflaherty at gmail.com> wrote:
> > $piece =~ s/\t{0,3}/\"\t\"/g;
>
>There are two problems with this one, the first being that you have
>the ability to match on a null string, and the second being that
>whatever does get matched is replaced by only a single "\t".
>
>If the number of fields you want to modify is in $count, something like
>
>$repl = "\\\"\\t\\\"" x $count;
>$piece =~ s/\t{$count,$count}/$repl/;
>
>might get you most of the way there.  Note of course that it does not
>modify inputs with fewer than $count fields.
>
>Later,
>Zach
>_______________________________________________
>Austin mailing list
>Austin at pm.org
>http://mail.pm.org/mailman/listinfo/austin