APM: Regular Expression Question

Barron Snyder (CE CEN) Barron.Snyder at wholefoods.com
Wed Aug 16 08:56:34 PDT 2006


Here is some sample data (input):
FL	999-NO_SUBTEAM	Actuals	ASSETS	FY2006	6	19,416.86
FL	DP-999	Actuals	150000	FY2006	6	19,416.86
FL	DP-999	Actuals	WIP	FY2006	6	19,416.86
FL	DP-999	Actuals	TOT_PPE	FY2006	6	19,416.86
FL	DP-999	Actuals	LT_ASSET	FY2006	6	19,416.86
FL	DP-999	Actuals	ASSETS	FY2006	6	19,416.86
FL	NON_MARGIN	Actuals	510000	FY2006	6	11,866.97
FL	NON_MARGIN	Actuals	SUPP_PKG	FY2006	6	11,866.97

And here is what it should end up like (output):
"FL"	"999-NO_SUBTEAM"	"Actuals"	"ASSETS"	"FY2006"	"6"	19,416.86
"FL"	"DP-999"	"Actuals"	"150000"	"FY2006"	"6"	19,416.86
"FL"	"DP-999"	"Actuals"	"WIP"	"FY2006"	"6"	19,416.86
"FL"	"DP-999"	"Actuals"	"TOT_PPE"	"FY2006"	"6"	19,416.86
"FL"	"DP-999"	"Actuals"	"LT_ASSET"	"FY2006"	"6"	19,416.86
"FL"	"DP-999"	"Actuals"	"ASSETS"	"FY2006"	"6"	19,416.86
"FL"	"NON_MARGIN"	"Actuals"	"510000"	"FY2006"	"6"	11,866.97
"FL"	"NON_MARGIN"	"Actuals"	"SUPP_PKG"	"FY2006"	"6"	11,866.97

All values except those in the final column should be wrapped in
double-quotes and tabs should separate the values.

My solution does it like this:
...
foreach my $piece (@pieces) {
    my @strings = split(/\t/, $piece);
    print DATA_OUT "\"", join("\"\t\"", $strings[0], $strings[1],
        $strings[2], $strings[3], $strings[4], $strings[5]), "\"\t",
        $strings[6], "\n";
}
...

But as I mentioned, in my effort to learn more about Perl, I thought there
might be a more elegant way using regular expressions.
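
For comparison, and before reaching for a substitution, the loop body above
could probably be condensed with pop and map. This is only a sketch: the
helper name format_piece is made up for illustration, and it assumes every
record has at least two tab-separated fields and no trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): quote every field
# except the last one and rejoin the record with tabs.
sub format_piece {
    my ($piece) = @_;
    my @strings = split /\t/, $piece;
    my $last    = pop @strings;          # final column stays unquoted
    return join("\t", (map { qq("$_") } @strings), $last);
}

print format_piece("FL\tDP-999\tActuals\tWIP\tFY2006\t6\t19,416.86"), "\n";
```

Unlike the explicit $strings[0] .. $strings[5] list, this version does not
depend on there being exactly seven columns.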

Thanks,
            
Barron Snyder

-----Original Message-----
From: austin-bounces+barron.snyder=wholefoods.com at pm.org
[mailto:austin-bounces+barron.snyder=wholefoods.com at pm.org] On Behalf Of
Zach Vonler
Sent: Wednesday, August 16, 2006 10:42 AM
To: austin at pm.org
Subject: Re: APM: Regular Expression Question

On 8/16/06, Jay Flaherty <jayflaherty at gmail.com> wrote:
> $piece =~ s/\t{0,3}/\"\t\"/g;

There are two problems with this one, the first being that you have
the ability to match on a null string, and the second being that
whatever does get matched is replaced by only a single "\t".
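
To see the null-string problem concretely, here is a small sketch that runs
the quoted substitution on a shortened record and counts the replacements.
The helper name apply_quoted_pattern and the three-field sample are
assumptions made for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: run the substitution quoted above and return both
# the modified line and the number of replacements made.
sub apply_quoted_pattern {
    my ($line) = @_;
    # \t{0,3} is satisfied by zero tabs, so with /g it also matches the
    # empty string at positions where there is no tab at all.
    my $count = ($line =~ s/\t{0,3}/"\t"/g);
    return ($line, $count);
}

my ($result, $count) = apply_quoted_pattern("FL\tDP-999\t19,416.86");
(my $visible = $result) =~ s/\t/<TAB>/g;   # make tabs visible when printing
print "replacements: $count\n";
print "result: $visible\n";
```

The sample contains only two tabs, yet far more than two replacements are
reported, because the replacement text is also inserted before ordinary
characters.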

If the number of fields you want to modify is in $count, something like

$repl = "\\\"\\t\\\"" x $count;
$piece =~ s/\t{$count,$count}/$repl/;

might get you most of the way there.  Note of course that it does not
modify inputs with fewer than $count fields.
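
Since the sample rows have a single tab between fields rather than runs of
tabs, another approach is to quote each field that is followed by a tab.
This is a sketch, not a tested drop-in: the helper name quote_all_but_last is
invented here, and it assumes the last column is the only one that should
stay unquoted.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every tab-terminated field in double quotes.
sub quote_all_but_last {
    my ($line) = @_;
    # [^\t]* captures one field (possibly empty). Requiring the trailing
    # \t means the final field, which has no tab after it, never matches,
    # and every match consumes at least one character.
    $line =~ s/([^\t]*)\t/"$1"\t/g;
    return $line;
}

my $sample = "FL\tDP-999\tActuals\tWIP\tFY2006\t6\t19,416.86";
print quote_all_but_last($sample), "\n";
```

Because the pattern always consumes a tab, it cannot match the null string,
and it works for any number of fields without needing $count.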

Later,
Zach
_______________________________________________
Austin mailing list
Austin at pm.org
http://mail.pm.org/mailman/listinfo/austin

