[Pdx-pm] read and concatenate two lines at a time
Thomas J Keller
kellert at ohsu.edu
Tue Dec 7 16:09:10 CST 2004
Good point. As it turned out, once the header line was captured, the
file was consistent: each subsequent pair of lines "went together".
The data did have lots of blanks and odd characters, though, which made
the sorting task they wanted done (before separating the line-pairs
again) more difficult.
At the risk of exposing my already battered ego to more bruising, what
I decided to do (after concatenating the line-pairs into an unsorted
array) was create an array of default values, "fill in" any blanks
with the defaults, and then do the sorting via the Schwartzian transform.
After that, the lines are split back out to a new file as correctly
sorted pairs of lines. The following worked for this data file:
#!/usr/bin/perl
use strict;
use warnings;
my $header = <>; ## read header line
chomp $header;
my @keys = split(/\t+/,$header);
$keys[0] = "ID";
## need to add a new element $keys[6] = "variance assumption";
my @new_keys = (@keys[0..5],"Variance Parameter",@keys[6..$#keys]);
my @defaults = qw(no_id no_strain 0.0 99 99 99 none 99 99 99 99 99
no_strain 0.0 99 99 99 none 99 99 99 99 99);
## concatenate consecutive pairs of lines
my @unsorted;
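while (my $line = <>){
    chomp $line;
    my $next = <>;              ## grab next line
    last unless defined $next;  ## odd line count: stop before <> moves on to STDIN
    chomp($line .= $next);
    $line =~ s/\t\./\t1/g;      ## substitute "1" for "." values
    push @unsorted, $line;      ## push concatenated line-pair
}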
## fill empty fields with default values
my @unsorted_filled;
foreach my $line (@unsorted) {
    my @data = split "\t", $line;
    foreach (0..$#defaults) {
        ## the defined() test avoids warnings when a line has fewer fields
        if (defined $data[$_] && $data[$_] eq "0") { ## in case the data contains real 0 values
            $data[$_] = "0.000"; ## this "zero" won't evaluate to false in boolean comparisons
        } else {
            $data[$_] = $data[$_] || $defaults[$_];
        }
    }
    push @unsorted_filled, join "\t", @data;
}
## Sort Data by P-value ##
my @sorted =
    map  { $_->[0] }                ## return sorted array of lines
    sort { $a->[1] <=> $b->[1] }    ## sort on second value of each tuple
    map  { [$_, (split "\t")[7]] }  ## create [line, p-value] tuple as anon. array within array
    @unsorted_filled;               ## from unsorted lines
## Output ##
print join("\t", @new_keys), "\n";
foreach (@sorted) {
    my @data = split("\t",$_);
    print join("\t", @data[0..11]),"\n";
    print join("\t", ($data[0], @data[12..$#data])),"\n";
    #print join( "\t", @data), "\n";
}
Any other suggestions or warnings gladly accepted.
Thanks for your help folks. I very much appreciate it.
Tom Keller
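
One possible variation on the default-filling step in the script above: the `||` test treats a literal "0" as false, which is why the script rewrites "0" as "0.000" first. A minimal sketch that tests for defined, non-empty fields instead is shown here. The helper name `fill_defaults` and the sample row are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

## Hypothetical helper (not from the original script): replace undefined or
## empty fields with defaults, leaving a literal "0" untouched.
sub fill_defaults {
    my ($line, @defaults) = @_;
    my @data = split /\t/, $line, -1;    ## limit of -1 keeps trailing empty fields
    for my $i (0 .. $#defaults) {
        $data[$i] = $defaults[$i]
            unless defined $data[$i] && length $data[$i];
    }
    return join "\t", @data;
}

## Made-up sample row: the second and fourth fields are blank,
## the third is a real zero.
print fill_defaults("A1\t\t0\t", qw(no_id no_strain 99 none)), "\n";
```

With this test the "0.000" workaround should not be needed, since only missing or empty fields are replaced. Whether that suits the real data file is an assumption.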
On Dec 7, 2004, at 1:44 PM, Randal L. Schwartz wrote:
>>>>>> "Ken" == Ken Brush <ken at cgi101.com> writes:
>
> Ken> FYI, You can even reduce it by one more operation by doing:
>
> Ken> while( my $entry = <> . <>) {
> Ken> $entry =~ s/\n//g;
> Ken> ...
>
> Not safely. If the first operation returns undef, to indicate the end
> of the @ARGV list, the second operation will read a line from STDIN!
>
> --
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
> <merlyn at stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
> See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl
> training!
> _______________________________________________
> Pdx-pm-list mailing list
> Pdx-pm-list at mail.pm.org
> http://mail.pm.org/mailman/listinfo/pdx-pm-list
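
To illustrate the point in the quoted reply, here is a minimal sketch of reading lines two at a time while checking each read separately, so that a failed read is noticed before anything is concatenated. It reads from an explicit filehandle rather than the magic `<>`, which sidesteps the fall-through to STDIN entirely. The helper name `read_pairs` and the sample input are hypothetical.

```perl
use strict;
use warnings;

## Hypothetical helper: read lines two at a time from a filehandle and return
## each pair concatenated. An unpaired final line is reported and skipped.
sub read_pairs {
    my ($fh) = @_;
    my @pairs;
    while (defined(my $first = <$fh>)) {
        my $second = <$fh>;
        unless (defined $second) {
            warn "odd number of lines; dropping unpaired last line\n";
            last;
        }
        chomp($first, $second);
        push @pairs, $first . $second;
    }
    return @pairs;
}

## Made-up sample input, read from an in-memory filehandle. The second line of
## each pair starts with a tab, and the last line has no partner.
my $sample = "s1\t0.5\n\t0.7\ns2\t0.1\n\t0.2\ns3\t0.9\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
print "$_\n" for read_pairs($fh);
close $fh;
```

Skipping the unpaired line with a warning is just one choice; dying at that point would be equally reasonable if a stray line indicates a damaged file.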