Scott Walters scott at illogics.org
Fri Jun 9 18:22:16 PDT 2006

Hi Bobby,

Preallocating only speeds up insertion of items into the top-level hash.
If you have a hash %foo and you're putting things into its subhashes
with $foo{whatever}{whatever}, then preallocating %foo won't
help you.  In fact, it'll waste RAM and CPU.  I didn't catch
whether you were doing single-level or multi-level, so I'm not sure if this
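
To make the top-level versus subhash distinction concrete, here is a
minimal sketch.  It assumes a hash named %foo as above; the key names
and sizes are made up for illustration and are not taken from your
program.  Assigning to keys() is the standard way to ask Perl to
presize a hash, and it only affects the one hash you name.

```perl
use strict;
use warnings;

my %foo;

# Ask for room for 40_000 entries in the top-level hash only.
# Perl rounds the request up to a power of two.
keys(%foo) = 40_000;

# Each subhash is created on demand (autovivification) as its own
# separate hash at the default size, so the presizing above does
# nothing for the inner level.
for my $i (1 .. 1000) {
    $foo{"outer$i"}{"inner$i"} = $i;
}

# If one particular subhash is going to be large, it can be presized
# on its own through its reference (hypothetical key 'big').
$foo{big} = {};
keys(%{ $foo{big} }) = 1024;
$foo{big}{$_} = 1 for 1 .. 1000;

my $top_count = keys %foo;            # top-level entries
my $big_count = keys %{ $foo{big} };  # entries in the one big subhash
```

Whether presizing pays off at all is something to measure on your own
data (the Benchmark module is handy for that) rather than assume.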


"Metz, Bobby W, WWCS" <bwmetz at att.com> wrote:
> 	This is kind of a follow-up question to my multi-level hash
> post.  Everything I've been reading on-line about how hashes work leads
> me to conclusions that don't seem to pan out in reality, e.g.
> pre-defining the # of hash buckets to increase performance on large data
> sets.  At least, I thought +40K records would be considered large...no
> jokes please.
> 	So, here's what I've observed using two methods to load +40K
> records into a single level hash.  I have always used method #1 as I
> learned it that way years ago but would love some thoughts around
> whether method #2 might be superior somehow as I know a lot of folks
> that do it that way instead.
> Method 1
> + Dynamically build hash from data file at run time.
> + Program load is consistently 3 seconds faster than Method 2.
> + Used 13M of memory to hold the records.
> Method 2
> + Used pre-built hashes loaded via "require".
> + Program load is consistently 3 seconds slower than Method 1.
> + Used 36M of memory to hold the records.
> 	Any of you know the inner workings of hashes enough to explain
> the difference?  I think the memory increase might have something to do
> with "require" mucking with the usual shared hash table used by perl,
> possibly forcing two copies.  But, that's just an uneducated guess.
> There was no discernable difference in output performance using a small
> test set against the +40K records, only the initial program load and
> total memory consumption.
> Thoughts?
> Thanks,
> Bobby
