[Pdx-pm] Instance hash ( keyed array )

Michael G Schwern schwern at gmail.com
Tue Oct 17 22:16:06 PDT 2006


Roderick A. Anderson wrote:
> I'm working on a module [0] that needs to do something I have not seen 
> in any other module or documentation.  It may be my ignorance that is 
> making this complicated.
> 
> I need to build a, sometimes, quite large hash table ( 40,000 - 470,000 
> records ) one record at a time.  The method will return the key it is 
> used to the caller so it can be used to key the rest of the data not 
> passed in to the method.

If memory consumption starts to become a problem you might want to consider using a DBM file (see AnyDBM_File).
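
As a rough sketch of what that might look like: the file location, the "|" delimiter and the sample record below are made up for illustration, and which DBM backend you actually get depends on what AnyDBM_File finds on your system.

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_RDWR and O_CREAT
use File::Temp qw(tempdir);
use AnyDBM_File;

# Hypothetical location for the DBM file; a real module would
# probably take this as a parameter.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/addresses";

# Tie a hash to the on-disk file so the records live on disk
# rather than in memory.
my %addr_data;
tie( %addr_data, 'AnyDBM_File', $file, O_RDWR | O_CREAT, 0666 )
    or die "Cannot open $file: $!";

# DBM values are plain strings, so each record is stored as one.
$addr_data{1} = "123 Main St|Portland|OR";

my $stored = $addr_data{1};
print "$stored\n";

untie %addr_data;
```

The tied hash is used like any other hash, so the rest of the code barely changes. The catch is that a DBM file only holds strings: anything structured has to be serialized on the way in, and some backends limit how big a single record can be.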


> Here is the sub.
> 
> sub add_addr {
> 
>      my $self = shift;
>      my @stuff = @_;
> 
>      my $recid = $self->inc_recid();
>      my $addr = join( $FldDelim, $recid, @stuff );
>      $addr .= $RcdDelim;
> 
>      $self->{addr_data}->{$recid} = $addr;
> 
>      return $recid;
> }
> 
> Is there a better way to do this?

Hard to say without context as to how it's being used, but I can make some observations.

It's kind of weird that you're flattening the $recid and @stuff into a delimited string.  It reduces the flexibility of your data; it's easier to work with a list than an encoded string.  As a rule of thumb, separate formatting from functionality to increase flexibility.

And as Joshua pointed out, storing the key in the value is odd, though there are times when it's useful.

Oh, and lose the vowelless variable names because thy cn b hrd 2 rd.

So I would do this:

    # Functionality.
    sub add_address {
        my $self = shift;
        my @address = @_;

        my $id = $self->next_record_id();
        $self->{address_data}{$id} = \@address;

        return $id;
    }

and then flatten as necessary:

    # Formatting
    sub flatten_record {
        my($self, $id) = @_;

        my $address = $self->{address_data}{$id};
        return unless $address;

        return join( $Field_Delim, $id, @$address ) . $Record_Delim;
    }
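
To see the two pieces working together, here is a self-contained sketch. The class name, the constructor, the counter behind next_record_id() and the delimiter values are all assumptions made up for the example; only add_address() and flatten_record() are as written above.

```perl
use strict;
use warnings;

package AddressBook;    # hypothetical class name

# Assumed delimiters; substitute whatever the real module uses.
our $Field_Delim  = "\t";
our $Record_Delim = "\n";

# Assumed constructor: an empty record table and an id counter.
sub new {
    my $class = shift;
    return bless { address_data => {}, last_id => 0 }, $class;
}

# Assumed implementation: a simple incrementing counter.
sub next_record_id {
    my $self = shift;
    return ++$self->{last_id};
}

# Functionality: store the fields as a list, keyed by record id.
sub add_address {
    my $self = shift;
    my @address = @_;

    my $id = $self->next_record_id();
    $self->{address_data}{$id} = \@address;

    return $id;
}

# Formatting: build the delimited string only on request.
sub flatten_record {
    my($self, $id) = @_;

    my $address = $self->{address_data}{$id};
    return unless $address;

    return join( $Field_Delim, $id, @$address ) . $Record_Delim;
}

package main;

my $book = AddressBook->new;
my $id   = $book->add_address( '123 Main St', 'Portland', 'OR' );

# The fields are still a list, so one can be picked out without parsing.
my $city = $book->{address_data}{$id}[1];

# The delimited form is produced only when something needs it.
my $line = $book->flatten_record($id);
print $line;
```

Because the stored value is an array reference, changing the output format later means touching only flatten_record(), not every record already in the hash.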

