[VPM] Iterating through a dynamic array

Peter Scott Peter at PSDT.com
Thu Jun 3 18:51:37 CDT 2004


At 03:49 PM 6/3/2004, nkuipers wrote:
>Hello all,
>
>I am rather stuck on this, and would like some help please.
>
>I have 2 arrays, the first contains a bunch of string refs, the second a bunch
>of numeric scores, where the indices of the second array correspond to those
>of the first.  So, the stringref in array_one[0] has a score contained in
>array_two[0].

In general, splitting a data structure across independent variables is 
a bad idea.  Only under special circumstances would I concede that it 
is better than combining them.

>Now, I want to iterate through the scores, and if a score
>doesn't meet a certain threshold value, I want to splice() out the
>corresponding stringref.  Well, that's fine for the first splice operation,
>but then I am stuck with two sets of indices that no longer correspond.

Not if you combine them into one data structure.  How about having each 
element be an arrayref whose first element was the stringref (or 
string) and the second was the score?
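
A minimal sketch of that combined structure, assuming the two parallel 
arrays already exist.  The names @strings, @scores and @pairs, the sample 
values, and the value of $LIMIT are all illustrative and not from the 
original code:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two parallel arrays.
my @strings = (\'ACGT', \'ACGA', \'TTTT');
my @scores  = (0.95, 0.40, 0.10);
my $LIMIT   = 0.5;   # assumed threshold

# Zip the two arrays into one array of [stringref, score] pairs.
my @pairs = map { [ $strings[$_], $scores[$_] ] } 0 .. $#strings;

# A single grep now filters the strings and their scores together,
# so there are no separate index sets to keep in step.
@pairs = grep { $_->[1] <= $LIMIT } @pairs;

print "${ $_->[0] }: $_->[1]\n" for @pairs;
```

Once the score travels with its string, removing an element cannot 
leave the other half behind.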

>I suppose I could splice out of both arrays to keep them the same size, and
>then restart the iteration, but then I am redoing score comparisons that
>already passed in order to get to the next splice candidate, and that's ick.

Indeed.  Of course, you could remember where you left off, and pick up there.
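
One way to "remember where you left off", sketched under the assumption 
that both arrays are spliced together: advance the index only when 
nothing was removed.  The sample data and the value of $LIMIT are made up 
for illustration:

```perl
use strict;
use warnings;

my @init   = qw(a b c d e);              # hypothetical stand-ins for the stringrefs
my @scores = (0.9, 0.8, 0.1, 0.7, 0.2);
my $LIMIT  = 0.5;                        # assumed threshold

my $i = 0;
while ($i < @scores) {
    if ($scores[$i] > $LIMIT) {
        # Remove from both arrays so the indices stay aligned, and do not
        # advance: the next element has just slid into position $i.
        splice @scores, $i, 1;
        splice @init,   $i, 1;
    }
    else {
        $i++;
    }
}

print "@init\n";
```

Each splice shifts the tail of the array, so this does more work per 
removal than filtering with grep.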

>For those of you who speak better in code than prose, here it is, slightly
>beautified:
>
>my @init; # gets populated with stringrefs
>...
>
>remove_similar(get_similarity_matrix($shifted_stringref_from_init));
>
># similarity() is from String::Similar on CPAN
>sub get_similarity_matrix {
>         my ($seq1) = @_;
>         my @score_matrix = ();
>         foreach my $seq2 (@init) {
>                 push @score_matrix, similarity($$seq1, $$seq2);
>         }
>         return \@score_matrix;
>}
>
>sub remove_similar {
>         my @score_matrix = @{ shift @_ };
>         for (my $i = 0; $i <= $#score_matrix; $i++) {
>                 if ($score_matrix[$i] > $LIMIT) {
>                         splice @init, $i, 1; # ACK!!
>                 }
>         }
>}
>
>Thanks for any insight,

Referencing the global variable @init in remove_similar() is, as 
Douglas Adams would say, a dead giveaway that you have a suboptimal design.

Another approach: grep the score array for the indices that you want 
to keep, and then slice both arrays:

remove_similar(get_similarity_matrix($shifted_stringref_from_init), \@init);

sub remove_similar
{
   my ($score, $init) = @_;
   my @keep_indices = grep $score->[$_] <= $LIMIT => 0 .. $#$score;
   @$score = @$score[@keep_indices];
   @$init  = @$init[@keep_indices];
}

Okay, so I don't like the double pass there.  So how about:

sub remove_similar
{
   my ($score, $init) = @_;
   my @keep_indices;
   my $index = 0;
   @$score = grep {
      my $keep = $_ <= $LIMIT;
      push @keep_indices, $index if $keep;   # record the index before it advances
      $index++;
      $keep;
   } @$score;
   @$init  = @$init[@keep_indices];
}

(Untested, but they look right...)

But as I said, I'd sooner have one data structure:

my @init = map [ $_ ] => ...list of stringrefs ...
get_similarities(\@init);
remove_similar(\@init);

sub get_similarities
{
   my $arr = shift;
   ...
   push @$_, $score for @$arr;  # More or less
}

sub remove_similar
{
   my $arr = shift;
   @$arr = grep $_->[1] <= $LIMIT => @$arr;
}
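
For a runnable picture of the single-structure version, here is a sketch 
under stated assumptions.  toy_similarity() is a hypothetical stand-in for 
similarity() from String::Similar, the second argument to 
get_similarities() is an assumption about how the comparison string gets 
passed in, and the sample strings and $LIMIT are invented:

```perl
use strict;
use warnings;

my $LIMIT = 0.5;   # assumed threshold

# Hypothetical stand-in for String::Similar's similarity(): the fraction
# of positions at which the two strings agree.
sub toy_similarity {
    my ($s, $t) = @_;
    my ($short, $long) = sort { $a <=> $b } (length($s), length($t));
    return 1 if $long == 0;
    my $same = grep { substr($s, $_, 1) eq substr($t, $_, 1) } 0 .. $short - 1;
    return $same / $long;
}

# Append a score to each [stringref] element, compared against $$seq.
sub get_similarities {
    my ($arr, $seq) = @_;
    push @$_, toy_similarity($$seq, ${ $_->[0] }) for @$arr;
    return;
}

# Keep only the elements whose score is at or below $LIMIT.
sub remove_similar {
    my $arr = shift;
    @$arr = grep { $_->[1] <= $LIMIT } @$arr;
    return;
}

# Each element starts life as [ stringref ]; the score is added later.
my @init  = map { my $s = $_; [ \$s ] } qw(ACGT ACGA TTTT GGGG);
my $query = shift(@init)->[0];   # stringref to compare the rest against

get_similarities(\@init, $query);
remove_similar(\@init);

print "${ $_->[0] } ($_->[1])\n" for @init;
```

Both subs take the array by reference, so nothing depends on a global 
@init.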

-- 
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com/
*** New! *** http://www.perlmedic.com/



