[Pdx-pm] hash from array question
Thomas Keller
kellert at ohsu.edu
Fri May 25 22:55:40 PDT 2007
Onward and upward.
The problem I am working on is that I have a tab-delimited file of
many thousands of lines, and a second file of names that is a
subset of the names contained in the first field of the larger file.
I want to pull out the lines of the larger file that match those names.
I'm using a riff on p.60 of Intermediate Perl by R. Schwartz (any
problems are my own).
So the code snippet looks like this:
    my @index = grep {
        my $c = $_;
        if ( $c > $#lines or    # always false
            ( grep { $lines[$c] =~ m/$_/ } @names ) > 0 ) {
            1;    # yes, select it
        } else {
            0;    # no, skip it
        }
    } 0..$#lines;
    my @gene_of_interest = @lines[@index];    # ref Int. Perl, p.60
This works, but it is really slow. Is there a faster way?
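For reference, one likely reason it is slow: every line is tested against every name with a regex, so the work grows as (lines x names). A common alternative, sketched here under assumptions, is to load the names into a hash once and do a single lookup per line. Note this is not the same test as the snippet above: it assumes each name must equal the first tab-separated field exactly, and that the lines have already been chomped. The helper name select_by_first_field and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the lines whose first tab-separated field
# is one of the wanted names (exact match, not a regex match).
sub select_by_first_field {
    my ($lines, $names) = @_;                 # two array references
    my %wanted = map { $_ => 1 } @$names;     # build the lookup hash once
    return grep {
        my ($first) = split /\t/, $_, 2;      # only the first field is needed
        defined $first && exists $wanted{$first};
    } @$lines;
}

# Made-up sample data standing in for the two files.
my @lines = (
    "STM0001\tthrL\tleader",
    "STM0002\tthrA\tkinase",
    "STM0003\tthrB\tkinase",
);
my @names = ('STM0001', 'STM0003');

my @gene_of_interest = select_by_first_field(\@lines, \@names);
print "$_\n" for @gene_of_interest;
```

If the names can appear anywhere in the line rather than only as the first field, the hash lookup does not apply directly and a different approach would be needed.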
thanks,
Tom K
Here is a short version of the data:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: STM_Genome_Annotation2.txt
Url: http://mail.pm.org/pipermail/pdx-pm-list/attachments/20070525/ec0ce7a0/attachment.txt
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: names01.txt
Url: http://mail.pm.org/pipermail/pdx-pm-list/attachments/20070525/ec0ce7a0/attachment-0001.txt
-------------- next part --------------
On May 25, 2007, at 4:39 PM, Eric Wilhelm wrote:
> # from Thomas J Keller
> # on Friday 25 May 2007 02:23 pm:
>
>> line 32 print "genomes string: $genomes\n";
>> line 33 print "split gives: ", split(/,\s*/,$genomes),"\n";
>> line 34 my %genomes = split(/,\s*/,$genomes) ;
>> line 35 print map { "Key: $_ has value: $genomes{$_}\n" } sort keys %$genomes;
>
> Yep. What Andy said.
>
> While $genomes and %genomes are distinct, it is usually helpful to avoid
> using the same word with different sigils within a single scope. Had
> there not been a $genomes, the use of %$genomes would have triggered the
> error [Global symbol "$genomes" requires ...] rather than the slightly
> less obvious error [Can't use string ... as a hash reference].
>
> e.g.
>
> my %genomes = split(/,\s*/, $config{"$project.genomes"});
>
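A minimal sketch of the point about sigils, for illustration only: the string is held under a different name than the hash, and the hash is used directly as %genomes rather than dereferenced as %$genomes. The helper name parse_pairs, the variable $genome_list, and the sample string are assumptions, not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "key, value, key, value" into a hash.
sub parse_pairs {
    my ($string) = @_;
    my %pairs = split /,\s*/, $string;
    return %pairs;
}

my $genome_list = 'STM, Salmonella, ECO, Escherichia';   # made-up sample
my %genomes     = parse_pairs($genome_list);

# %genomes is the hash itself, so no dereference is needed here.
print map { "Key: $_ has value: $genomes{$_}\n" } sort keys %genomes;

# By contrast, %$genome_list would treat the string as a hash reference
# and die under "use strict" with "Can't use string ... as a HASH ref".
```

With distinct names, a stray %$genomes would be caught at compile time as an undeclared variable.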
> Anyway, now for the comic relief. I'll admit to not having any clue what
> I'm doing here, but I find the scary avoidance of exception handling rather
> ironic when coupled with the fact that string comparison is done so elegantly
> via "==" operator overloading. Also note the "let's all invent our own
> string libraries" fun coming from both ends. And of course the 2-line
> grep/map statement that would be required to do this in a 10-line method in
> Perl. Whee!
>
> bool wxMozillaBrowser::ScrollToElementByID(wxString id)
> {
>   //fprintf(stderr, ((wxT("id: ")+ id + wxT("\n")).mb_str()));
>   nsCOMPtr<nsIDOMWindow> domWindow;
>   nsresult rv;
>   rv = m_Mozilla->mWebBrowser->GetContentDOMWindow(getter_AddRefs(domWindow));
>   if (!domWindow)
>     return FALSE;
>   if (NS_FAILED(rv))
>     return FALSE;
>
>   nsCOMPtr<nsIDOMDocument> doc;
>   rv = domWindow->GetDocument(getter_AddRefs(doc));
>   if (NS_FAILED(rv))
>     return FALSE;
>
>   nsString element_id = wxString_to_nsString(id, wxConvISO8859_1);
>   nsCOMPtr<nsIDOMElement> domElement;
>   rv = doc->GetElementById(element_id, getter_AddRefs(domElement));
>
>   if(domElement) {
>     // GRR all our doc is a wiki, yay. So how do I use this domElement?
>     // We must make it be a member of the class that has the method.
>     nsCOMPtr<nsIDOMNSHTMLElement> hElement(do_QueryInterface(domElement));
>     // TODO maybe should check that this succeeds?
>     hElement->ScrollIntoView(TRUE);
>     return TRUE;
>   }
>
>   // ok, by-id got us nothing, so try to find the first named anchor
>   // fprintf(stderr, "seaching for a name=\n");
>   nsCOMPtr<nsIDOMNodeList> a_tags;
>   doc->GetElementsByTagName(
>     NS_LITERAL_STRING("a"), getter_AddRefs(a_tags)
>   );
>
>   if(!a_tags) return FALSE;
>
>   PRUint32 count;
>   a_tags->GetLength(&count);
>
>   if(!count) return FALSE;
>
>   for(PRUint32 i = 0; i < count; i++) {
>     // fprintf(stderr, "check tag %i\n", i);
>
>     nsCOMPtr<nsIDOMNode> node;
>     rv = a_tags->Item(i, getter_AddRefs(node));
>     if (NS_FAILED(rv) || !node) continue;
>
>     nsCOMPtr<nsIDOMHTMLAnchorElement> anc;
>     anc = do_QueryInterface(node);
>     if(!anc) continue;
>
>     // make thing, pass in to get return value, lather rinse, repeat...
>     nsAutoString name;
>     rv = anc->GetName(name);
>     if (NS_FAILED(rv)) continue;
>
>     fprintf(stderr, ((wxT("now check ") +
>       nsString_to_wxString(name, wxConvISO8859_1) + wxT("\n")).mb_str()));
>     if(name == element_id) {
>       nsCOMPtr<nsIDOMNSHTMLElement> hElement(do_QueryInterface(node));
>       hElement->ScrollIntoView(TRUE);
>       return TRUE;
>     }
>   }
>
>   return FALSE;
> }
>
>
> --Eric
> --
> "It works better if you plug it in!"
> --Sattinger's Law
> ---------------------------------------------------
> http://scratchcomputing.com
> ---------------------------------------------------