[Chicago-talk] appending hashes
Steven Lembark
lembark at jeeves.wrkhors.com
Tue Nov 4 11:19:44 CST 2003
-- Andy_Bach at wiwb.uscourts.gov
> Interestingly (???) the map method:
> 'mapappend' => sub
> {
> %a = %b;
> %a = map {$_ => $c{$_} } keys(%c);
> },
>
> is slower and gets worse as the hashes get bigger. I bumped up c to:
> %c = map(($_, $_), 0 .. 6750);
> and:
> Benchmark: timing 1000 iterations of append, baseline, mapappend...
> append: 36 wallclock secs (33.94 usr + 0.22 sys = 34.16 CPU) @ 29.27/s (n=1000)
> baseline: 2 wallclock secs ( 2.08 usr + 0.02 sys = 2.10 CPU) @ 476.19/s (n=1000)
> mapappend: 65 wallclock secs (61.32 usr + 0.58 sys = 61.90 CPU) @ 16.16/s (n=1000)
>
> (left off unroll as it was really slow). Looks like slices are the way
> to go.
Makes sense: map has to walk the unrolled key list one element
at a time, aliasing each into $_, and build the output list
via -- essentially -- push as it goes along. Serializing the
operation is most of what causes the pain.
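To make that concrete, here is a minimal sketch of what the map version has to build: a flat list of key/value pairs, one pair per key, which is only then assigned to the hash. The hashes below are small stand-ins (not the benchmark data), and @pairs, %replaced and %appended are names made up for the illustration. It also shows that a plain assignment of the map result replaces the hash, so an append has to carry the old contents in the list as well.

```perl
use strict;
use warnings;

# Small stand-in hashes; the benchmark data was much larger.
my %b = ( one => 1,    two   => 2     );
my %c = ( two => 'II', three => 'III' );

# map emits one (key, value) pair per key into a temporary flat list.
my @pairs = map { $_ => $c{$_} } keys %c;

# Plain assignment replaces whatever was in the hash...
my %replaced = %b;
%replaced = @pairs;

# ...so an append has to carry the old contents along as well.
my %appended = %b;
%appended = ( %appended, @pairs );

print join( ',', map { "$_=$appended{$_}" } sort keys %appended ), "\n";
# prints: one=1,three=III,two=II
```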
So far as I know:
@foo{keys %bar} = values %bar
is the fastest, lowest overhead way to merge the hashes.
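A rough way to check this on your own machine, assuming nothing beyond the core Benchmark module: the sketch below merges the same stand-in data with a hash slice and with map, confirms both give the same result, and then times them. The sub names by_slice and by_map, the 'seed' key and the data size are all made up for the illustration; the timings will vary by perl version and hardware.

```perl
use strict;
use warnings;
use Benchmark qw( timethese );

# Stand-in data, not the original benchmark set.
my %bar = map { ( $_ => $_ ) } 0 .. 999;

sub by_slice
{
    my %foo = ( seed => 'kept' );
    # keys and values walk an unmodified hash in the same order,
    # so the slice pairs each key with its own value.
    @foo{ keys %bar } = values %bar;
    \%foo
}

sub by_map
{
    my %foo = ( seed => 'kept' );
    # Rebuild the hash from its old contents plus one pair per key.
    %foo = ( %foo, map { $_ => $bar{$_} } keys %bar );
    \%foo
}

# Sanity check: both approaches should produce the same hash.
my ( $s, $m ) = ( by_slice(), by_map() );
my $same = keys %$s == keys %$m
        && ! grep { $s->{$_} ne $m->{$_} } keys %$s;
print $same ? "results match\n" : "results differ\n";

timethese( 200, { slice => \&by_slice, map => \&by_map } );
```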
To merge multiple hashes, pass hash references to a sub (e.g.,
from a job I use to manage the environment):
#!/blah/perl
...
sub merge
{
my %bucket = ();
@bucket{ keys %$_ } = values %$_ for @_;
\%bucket
}
# don't wanna lose these either way...
my @inherit = qw( TERM HOME USER MAIL DISPLAY );
# configured environment
my %default = qw( ... ); # read from config files, whatever
my %host = qw( ... ); # point is they go from least specific
my %user = qw( ... ); # to most specific as you go down
my %job = qw( ... ); # the list
my $newenv = merge \%ENV, \%default, \%host, \%user, \%job;
@{$newenv}{ @inherit } = @ENV{ @inherit };
$newenv->{ENV_SETUP_SOURCES} = join ':', @sourcefiles;
# at this point the environment is hard-wired from
# the config files -- no need to worry about env
# vars from a working shell polluting dot-scripts.
%ENV = %$newenv;
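# "@ARGV || $ENV{SHELL}" would evaluate @ARGV in scalar context
# and exec its element count, so pick the command list explicitly.
my @cmd = @ARGV ? @ARGV : $ENV{SHELL};
exec @cmd;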
die "Roadkill: $!";
This gets stuffed into a #! script and called via something like:
[ "$ENV_SETUP_SOURCES" = "" ] && exec env_setup "$0" "$@";
at the top of shell scripts. If the environment has not yet
been set up then the multiple exec's leave the PID alone
(parent never gets a SIGCHLD) but the job is left running
with a fully configured environment. Only trick is to make
sure the env var used to flag the passthrough doesn't collide
with anything else.
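For anyone unsure about the precedence in merge, here is a small self-contained sketch. The sub is the same one shown above; the three hashes and their values are hypothetical settings made up for the illustration. Because each slice assignment overwrites existing keys, whatever comes later in the argument list wins.

```perl
use strict;
use warnings;

sub merge
{
    my %bucket = ();
    # Later hashes overwrite keys set by earlier ones.
    @bucket{ keys %$_ } = values %$_ for @_;
    \%bucket
}

# Hypothetical settings, least specific first.
my %default = ( EDITOR => 'vi',  PAGER  => 'more' );
my %host    = ( PAGER  => 'less' );
my %job     = ( EDITOR => 'ed',  TMPDIR => '/var/tmp' );

my $env = merge \%default, \%host, \%job;

print "$_=$env->{$_}\n" for sort keys %$env;
# prints:
# EDITOR=ed
# PAGER=less
# TMPDIR=/var/tmp
```

The inputs are left untouched, since merge only reads through the references and returns a fresh hash.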
--
Steven Lembark 2930 W. Palmer
Workhorse Computing Chicago, IL 60647
+1 888 359 3508