[Melbourne-pm] returning hashes and arrays, source filters considered harmful?

Jacinta Richardson jarich at perltraining.com.au
Fri Feb 19 00:04:35 PST 2010


Sam Watkins wrote:

> but I normally do it like this because I feel like it should be more efficient:
> 
> 	...
> 	  return \@foo;
> 	}
> 
> 	$foo = my_func();

Yes, that is more efficient, although for small arrays containing small data I 
doubt the difference in speed would matter in almost any circumstances.

> This has the unfortunate effect of converting my array to an array ref.  If I
> wanted to reverse that, I think I could use:
> 
> 	@foo = @{my_func()};

That works, but should have about the same efficiency penalty as just returning 
an array to start with (since all you've done is move the copy).  However, if 
*most* of the time you want my_func to return a reference, and only sometimes 
you want a list, this isn't a bad way to go about it.
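
To make the two styles concrete, here is a minimal sketch.  The sub names 
(squares_list, squares_ref) and the data are made up for illustration and are 
not from the code being discussed:

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.

# Returns a list: the caller's array assignment copies every element.
sub squares_list {
    my @squares = map { $_ * $_ } 1 .. 5;
    return @squares;
}

# Returns a reference: only a single scalar is handed back.
sub squares_ref {
    my @squares = map { $_ * $_ } 1 .. 5;
    return \@squares;
}

my @copied  = squares_list();        # elements copied on return
my $ref     = squares_ref();         # no per-element copy
my @derefed = @{ squares_ref() };    # the copy happens here instead

print "last via list: $copied[-1]\n";
print "last via ref:  $ref->[-1]\n";
print "count derefed: ", scalar(@derefed), "\n";
```

The third form is the "moved copy": the sub itself is cheap, but the 
assignment into @derefed still copies each element.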

> which is getting ugly and inefficient too.  The same thing happens when
> returning hash variables, except that I suppose it is even more inefficient to
> return it via a list return compared to a reference.

For the same memory size, it *should* be the same efficiency.

> My question is, does perl actually optimize this so it sucks less than I am
> naively supposing it does?

No, Perl doesn't optimise this. On the other hand, my question for you is why 
are you doing all this premature optimisation to start with?  Are you actually 
hitting speed issues in all of your subroutines on a regular basis?  I 
appreciate a desire to be consistent, but most of the time this is, and 
should be, a non-issue.
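
If you want numbers rather than guesses, the core Benchmark module can compare 
the two approaches.  This is only a sketch: the sub names and the array size 
of 1000 are assumptions, so substitute data that looks like yours.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @data = (1 .. 1000);    # assumed size; adjust to match real data

# Hypothetical subs for illustration only.
sub as_list { return @data }     # caller copies all elements
sub as_ref  { return \@data }    # caller receives one scalar

# A negative count means "run each variant for at least that many
# CPU seconds"; cmpthese then prints a comparison table.
cmpthese(-1, {
    list => sub { my @copy = as_list(); },
    ref  => sub { my $ref  = as_ref();  },
});
```

Whatever the table says, weigh it against how often the sub is actually 
called in your program before changing any code.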

> Also, is there any way to make like an alias @foo
> for @$foo, so you can treat an array reference as a normal array without
> writing @$foo all the time?  (and also for hashes)

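Not for lexical (my) variables in core Perl 5, although a typeglob assignment 
(*foo = $foo) will alias a package variable.  Yes, in Perl 6.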

> These concerns make me reluctant to use normal perl @array and %hash types at
> all.  If I have to pass and return things by reference, and can't alias these
> back to normal @array and %hash types, I would prefer to use references for
> everything in order to be consistent and avoid having to rewrite code when I
> suddenly need to pass some variable to or from a sub.  This would make the @foo
> %bar syntax useless for anything but the one-liners.  So I'm hoping there is
> some workaround or optimization, and a way to alias them.

Most real world code I see uses arrays and hashes all over the place; taking 
references to them when passing them into subroutines, and handling whatever the 
subroutine returns.  Arrays and hashes are convenient, and machines are fast 
enough these days that programmer time and code maintainability are usually more 
important.
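
As a rough sketch of that common style (the sub name known_names and the data 
are invented for this example), the caller keeps ordinary arrays and hashes and 
only takes references at the call itself:

```perl
use strict;
use warnings;

# Hypothetical sub: takes an array ref and a hash ref, returns a plain list.
sub known_names {
    my ($names, $ages) = @_;
    return grep { exists $ages->{$_} } @$names;
}

my @names = qw(alice bob carol);
my %ages  = (alice => 30, carol => 25);

# Ordinary @array and %hash in the caller; references only at the call.
my @known = known_names(\@names, \%ages);
print "@known\n";
```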

When I'm asked to help increase the efficiency of these programs, changing _all_ 
the data types to be references would be one of the last suggestions I'd make. 
I might suggest changing how some specific, larger structures were passed around, 
but that's about it. I'd view this advice as akin to suggesting unwrapping methods 
to avoid calling the dispatcher.

By all means use only references in your code, but be aware that it's much more 
a style decision (and a fairly unique one at that) than a necessary practice.

	J
