question about the nature of DBM ties
nkuipers
nkuipers at uvic.ca
Tue Sep 24 15:09:39 CDT 2002
Hi all,
When you tie a data structure to an external file, does populating that
structure with large input write each update directly to the file, or does
the data pile up in memory and only get dumped to the file later? In other
words, does tying in this manner free up RAM? The bottleneck in my code is
the unique function, but this function is necessary. All in all, the code
works correctly but takes too long.

#!/usr/bin/perl
use strict;
use warnings;
use DB_File;

my $infile = shift;
my $wordsize = 10;
my %clusters;   # key => value = 'id_string' => 'DNA_string'
my %k_strings;  # key => value = (e.g.) 'ACGTGGTCAC' => [id_string1, id_string2, ...]

tie(%k_strings, "DB_File", "index.tmp") or die "Can't open filename: $!";
%k_strings = &build_index(\%clusters);
untie %k_strings;

sub build_index {
    my $clusters_hashref = shift;
    my %k_hash;
    while ( my ($id, $sequence) = each %$clusters_hashref ) {
        my $tmp = $sequence;
        while ( length($tmp) >= $wordsize ) {
            my $kstring = substr($tmp, 0, $wordsize);
            if ( exists $k_hash{$kstring} ) {
                push @{ $k_hash{$kstring} }, $id
                    if unique(\@{ $k_hash{$kstring} }, \$id);
            } else {
                $k_hash{$kstring} = [ $id ];
            }
            $tmp =~ s/^\w//;
        }
    }
    return %k_hash;
}

sub unique {
    my ($array_ref, $id_ref) = @_;
    my $flag = 0;
    for (@$array_ref) {
        if ( $_ eq $$id_ref ) {
            $flag = 1;
            last;
        }
    }
    return $flag ? 0 : 1;
}
__END__
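
One possible way around the linear scan in unique, offered only as a rough
sketch: keep the ids for each k-string as hash keys instead of array
elements, so adding an id that is already present is a no-op. The name
build_index_seen and the sample data are made up for illustration. The
sketch assumes the sequences contain only word characters, so a sliding
substr window visits the same k-strings as the s/^\w// loop above. It also
keeps the index in memory, since a DB_File hash stores plain strings and
not nested references.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $wordsize = 10;

# Hypothetical variant of build_index: ids are stored as hash keys, so a
# repeated id simply overwrites itself and no membership scan is needed.
sub build_index_seen {
    my ($clusters_hashref) = @_;
    my %k_hash;    # 'ACGTGGTCAC' => { id_string1 => 1, id_string2 => 1, ... }
    while ( my ($id, $sequence) = each %$clusters_hashref ) {
        # Slide a window of $wordsize along the sequence without copying it.
        # The range is empty when the sequence is shorter than $wordsize.
        for my $offset ( 0 .. length($sequence) - $wordsize ) {
            my $kstring = substr($sequence, $offset, $wordsize);
            $k_hash{$kstring}{$id} = 1;
        }
    }
    return \%k_hash;
}

# Made-up sample data, for illustration only.
my %clusters = (
    seq1 => 'ACGTGGTCACACGTGGTCAC',
    seq2 => 'ACGTGGTCACT',
);
my $index = build_index_seen(\%clusters);
print join(',', sort keys %{ $index->{ACGTGGTCAC} }), "\n";    # prints seq1,seq2
```

Here seq1 contains ACGTGGTCAC twice but is listed only once, which is the
property unique was enforcing. The ids for a k-string come back from
keys %{ $index->{$kstring} } in no particular order, so sort them if the
order matters.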