[Edinburgh-pm] Code cleanup
Aaron Crane
perl at aaroncrane.co.uk
Mon Mar 5 09:24:55 PST 2012
Miles Gould <miles at assyrian.org.uk> wrote:
> So I thought I'd do
> a step-by-step cleanup of their code, so that they, and hopefully any
> interested bystanders, might learn.
Nice idea.
> Does anyone have any suggestions?
lines 20, 22, 52: perhaps use <> instead of manually opening the file
lines 26, 76: inconsistent that these aggregates get an empty
initializer, when most don't (lines 25, 56, 57)
line 37: this elsif is redundant now that you've got the
looks_like_number() test
lines 34, 41: hoist the chomps out of the conditional
lines 42-44: better: `my ($name, @values) = split " ", $line`
(clarifies what the syntax is, by avoiding the manual `shift`)
line 46: I prefer `if any { !predicate }` to `unless all { predicate
}`, but I suppose that's personal taste
line 47: `$dataArray[$indexLine] = \@values`
line 54: we copy $indexLine into $geneNumber, then never touch
$indexLine again; perhaps replace $indexLine with $geneNumber
throughout?
lines 71-78: delete; and modify lines 91-92 and 95-98 to declare the
variables as needed
line 81: `for my $i (0 .. $filterNumber-1)`
lines 82-89: much simpler as this:
my $data = $filterData[$i];
my @controlArray = @$data[ 0 .. 19]; # first 20
my @sampleArray = @$data[20 .. 40]; # next 21
or, if you're willing to require Perl 5.12, replace lines 81-89 with:
while (my ($i, $data) = each @filterData) {
my @controlArray = @$data[ 0 .. 19]; # first 20
my @sampleArray = @$data[20 .. 40]; # next 21
lines 106, 109, 112: if you're willing to require Perl 5.12, rename
$scoreCounter to $i, and replace with:
my @sorted = sort keys %scoreHash;
while (my ($i, $key) = each @sorted) {
lines 127, 129-131: `use List::Util qw<sum>` at the top, and then `my
$sum = sum(@$array)`
lines 133-139: similarly, I think this clarifies the computation:
my $squares_sum = sum(map { $_ * $_ } @$array);
my $stddev = sqrt(($squares_sum - $sum * $sum / $n) / ($n - 1);
(with a declaration of $n, but early enough to use in the calculation
of $mean, too) — but even better would be to use a CPAN module, as in
your stats_basic branch
> I haven't merged the stats_basic branch
> yet because it changes the output data extensively, and I'd like to run a
> few calculations by hand to determine which set of answers is right.
Do the code under discussion and Statistics::Basic maybe differ in
whether they divide by N versus N-1? (I forget the relevant
terminology, I'm afraid.)
--
Aaron Crane ** http://aaroncrane.co.uk/
More information about the Edinburgh-pm
mailing list