APM: Sparse arrays

Ren Maddox renm at iname.com
Thu Jul 10 10:00:47 CDT 2003


On Thu, 2003-07-10 at 08:00, Brian Michalk wrote:
> Let's say I have a sparse array:
> $ary[100]=1;
> $ary[100000]=2;
> $ary[1000000]=3;
> 
> Now, I want to iterate over the array, either from lowest to highest array
> index, and still retain the index of the element returned.  How do I do
> that?
> 
> foreach $elt (@ary) {
> 	$some_value = $elt;
> 	$the_index = ????????;			# how do I get this information?
> }
> 
> 
> This algorithm is chewing up CPU time, causing the process to take days to
> complete... the dataset is several gigabytes large.
> 
> Using a hash, and sorting the keys will not give me a performance benefit.

Perhaps not, but in similar circumstances I have found that using a hash
and *storing* the sorted list of keys will give a huge performance
benefit.

my %hash = ( 100, 1, 100_000, 2, 1_000_000, 3 );
my @keys = sort keys %hash;

for my $key ( @keys ) {
  $some_value = $hash{$key};
  # $key *is* the index
}


Of course, some work is required to maintain consistency between @keys
and %hash.  Using one of the existing modules that provides ordered
hashes may be appropriate if that is a concern, though simply rebuilding
@keys periodically may be good enough depending on the situation.

-- 
Ren Maddox <renm at iname.com>




More information about the Austin mailing list