APM: Sparse arrays
Ren Maddox
renm at iname.com
Thu Jul 10 10:00:47 CDT 2003
On Thu, 2003-07-10 at 08:00, Brian Michalk wrote:
> Let's say I have a sparse array:
> $ary[100]=1;
> $ary[100000]=2;
> $ary[1000000]=3;
>
> Now, I want to iterate over the array, either from lowest to highest array
> index, and still retain the index of the element returned. How do I do
> that?
>
> foreach $elt (@ary) {
> $some_value = $elt;
> $the_index = ????????; # how do I get this information?
> }
>
>
> This algorithm is chewing up CPU time, causing the process to take days to
> complete... the dataset is several gigabytes large.
>
> Using a hash, and sorting the keys will not give me a performance benefit.
Perhaps not, but in similar circumstances I have found that using a hash
and *storing* the sorted list of keys will give a huge performance
benefit.
my %hash = ( 100, 1, 100_000, 2, 1_000_000, 3 );
my @keys = sort keys %hash;
for my $key ( @keys ) {
$some_value = $hash{$key};
# $key *is* the index
}
Of course, some work is required to maintain consistency between @keys
and %hash. Using one of the existing modules that provides ordered
hashes may be appropriate if that is a concern, though simply rebuilding
@keys periodically may be good enough depending on the situation.
--
Ren Maddox <renm at iname.com>
More information about the Austin
mailing list