APM: Sparse arrays
Mike Stok
mike at stok.co.uk
Thu Jul 10 10:35:17 CDT 2003
On 10 Jul 2003, Ren Maddox wrote:
> On Thu, 2003-07-10 at 08:00, Brian Michalk wrote:
> > Let's say I have a sparse array:
> > $ary[100]=1;
> > $ary[100000]=2;
> > $ary[1000000]=3;
> >
> > Now, I want to iterate over the array, either from lowest to highest array
> > index, and still retain the index of the element returned. How do I do
> > that?
> >
> > foreach $elt (@ary) {
> > $some_value = $elt;
> > $the_index = ????????; # how do I get this information?
> > }
> >
> >
> > This algorithm is chewing up CPU time, causing the process to take days to
> > complete... the dataset is several gigabytes large.
> >
> > Using a hash, and sorting the keys will not give me a performance benefit.
>
> Perhaps not, but in similar circumstances I have found that using a hash
> and *storing* the sorted list of keys will give a huge performance
> benefit.
>
> my %hash = ( 100, 1, 100_000, 2, 1_000_000, 3 );
> my @keys = sort keys %hash;
>
> for my $key ( @keys ) {
> $some_value = $hash{$key};
> # $key *is* the index
> }
>
>
> Of course, some work is required to maintain consistency between @keys
> and %hash. Using one of the existing modules that provides ordered
> hashes may be appropriate if that is a concern, though simply rebuilding
> @keys periodically may be good enough depending on the situation.
Why not just say
foreacn my $key (sort {$a <=> $b} keys %hash) {
...
}
unless you have so many keys that sort takes too long.
alternatively it could be possible to make an array-like object, though
too much Ruby has lowered my tolerance to the pain of doing this kind of
thing in perl :-(
Mike
--
mike at stok.co.uk | The "`Stok' disclaimers" apply.
http://www.stok.co.uk/~mike/ | GPG PGP Key 1024D/059913DA
mike at exegenix.com | Fingerprint 0570 71CD 6790 7C28 3D60
http://www.exegenix.com/ | 75D2 9EC4 C1C0 0599 13DA
More information about the Austin
mailing list