[VPM] - data structures, performance and memory

Darren Duncan darren at DarrenDuncan.net
Sun Nov 12 20:23:15 PST 2006


At 5:30 PM -0800 11/10/06, Jer A wrote:
>the "hundred" records example was not a good one, I mean very large records
>put together in a very large array.....(in the thousands).
>what are the memory sizes for scalars,hashes,single-dim arrays,double-dim
>arrays, array of anon-hashes -etc. how can i use as little memory as
>possible, and how can i search these very large arrays efficiently.
>
>Some pointers would be great, I don't need any examples.

Generally speaking, each Hash or Array used, whether anonymous or 
not, uses more memory for overhead and per element than simple 
scalars do.

AFAIK, a scalar variable uses about 20 bytes of memory overhead plus 
its actual data, and a hash or array is maybe 50-100 bytes of 
per-variable overhead plus maybe 20-50 per element; a lot of the 
latter numbers are guesses.  I do know that hashes use more memory 
than arrays holding the same number of elements, maybe about 30% 
more overhead memory.
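Rather than guess, you can measure the actual sizes on your own 
perl build with the CPAN module Devel::Size (not core, so this 
assumes you have it installed); a quick sketch:

```perl
use strict;
use warnings;
# Devel::Size is a CPAN module, not core; install it first.
use Devel::Size qw(size total_size);

my $scalar = 'hello';
my @array  = (1 .. 1000);
my %hash   = map { $_ => 1 } (1 .. 1000);

# size() reports just the container's own overhead; total_size()
# follows references and counts all the elements too.
printf "scalar: %d bytes\n", total_size(\$scalar);
printf "array:  %d bytes\n", total_size(\@array);
printf "hash:   %d bytes\n", total_size(\%hash);
```

On a typical build the hash will come out noticeably larger than 
the array for the same thousand values.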

I would guess that, to save memory, you should use a 
single-dimensional array or hash rather than a 2-dimensional 
structure built from others; for example, 2 parallel 
single-dimension arrays or hashes will use less memory than 1 
array or hash of 2-element anonymous arrays or hashes.
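To illustrate the parallel-arrays idea (the field names here are 
just made up for the example): both layouts below hold the same 
data, but the first pays anonymous-hash overhead on every record, 
while the second pays container overhead only twice, total.

```perl
use strict;
use warnings;

# One array of anonymous hashes: every record is its own hash,
# so each row pays full hash overhead.
my @records = map { { name => "user$_", score => $_ } } (1 .. 3);

# Two parallel arrays holding the same data: only two containers'
# worth of overhead, and each element is a plain scalar.
my @names  = map { "user$_" } (1 .. 3);
my @scores = (1 .. 3);

# Lookup by index works the same way in both layouts.
print "$records[1]{name} => $records[1]{score}\n";  # user2 => 2
print "$names[1] => $scores[1]\n";                  # user2 => 2
```

The trade-off is that related fields are no longer bundled 
together, so you have to keep the arrays' indexes in sync yourself.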

>are elements on a array, of variant data-type...does this general type
>consume more memory than if  the type is explicitly defined, if so, how can
>I explicitly define the type, eg. int,string,bool in perl terms etc.

I don't know if it is possible to specify strong types like 
int/string/bool in Perl 5 without getting into Perl's internals or 
using some third-party module.  (You can in Perl 6, but that 
doesn't help you now.)  I've heard such a feature may have been 
added to Perl 5 starting around 5.8, but if so it is obscure, or I 
don't know where to look.
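One thing you can do without declared types is store many 
fixed-size numbers inside a single scalar using the built-in pack 
and unpack functions, so each number costs its raw machine size 
instead of a full scalar; a sketch:

```perl
use strict;
use warnings;

# Pack 1000 unsigned 32-bit ints into one scalar string:
# roughly 4 bytes each, versus a full scalar per element
# in an ordinary array.
my $packed = pack 'L*', (1 .. 1000);

# Pull out a single value by byte offset rather than
# unpacking the whole list back into scalars.
my $fifth = unpack 'L', substr($packed, 4 * 4, 4);
print "$fifth\n";  # 5
```

The cost is that you lose normal array syntax and must compute 
byte offsets yourself, so it only pays off for large homogeneous 
numeric data.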

>can operations on very large arrays eat up more memory, in execution?
>how can I control perl's allocation of memory.

A foreach loop that iterates over an array may be faster than a 
map that produces a new array from an existing one ... or not.
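Whatever the speed difference, the memory difference is real: map 
builds a complete second list, while a foreach that modifies 
through its alias touches the existing array in place.  For 
example:

```perl
use strict;
use warnings;

my @nums = (1 .. 5);

# map builds and returns a whole new array, so both copies
# exist in memory at once.
my @doubled = map { $_ * 2 } @nums;

# foreach's $_ is an alias to each element, so this doubles
# the existing array in place with no second copy.
$_ *= 2 for @nums;

print "@doubled\n";  # 2 4 6 8 10
print "@nums\n";     # 2 4 6 8 10
```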

But certainly, if you're processing a file, you want to read it one 
line at a time rather than slurp it.  And that's easy to do.
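That is, read with a while loop so only one line is ever in memory 
at once; here an in-memory filehandle (a 5.8+ feature) stands in 
for a real input file:

```perl
use strict;
use warnings;

# An in-memory file stands in for a real data file here.
my $contents = "first\nsecond\nthird\n";
open my $fh, '<', \$contents or die "can't open: $!";

# while (<$fh>) holds one line at a time; by contrast,
# my @lines = <$fh> would slurp the whole file into memory.
my $count = 0;
while (my $line = <$fh>) {
    chomp $line;
    # ... process $line here ...
    $count++;
}
close $fh;
print "$count\n";  # 3
```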

-- Darren Duncan
