[mplspm]: In a sorted list [better]

Dave Bianchi djb at tc.umn.edu
Fri Mar 22 11:16:12 CST 2002


Okay, Jim and I are thinking the same way :-)

Jim Anderson wrote:
> 
> On Fri, Mar 22, 2002 at 10:59:12AM -0600, John J. Trammell wrote:
> > On Fri, Mar 22, 2002 at 09:40:54AM -0600, Jim Anderson wrote:
> > > And with 2 million strings, building a hash is a non-trivial amount of
> > > time.
> 
> A couple other points.  The original premise had the list sorted to start
> with.  Has it always been sorted?  Building the hash may be faster than
> doing a sort if it wasn't sorted to start with.  Where did the list come
> from?  If it came from inside a database, having an index on the field
> could eliminate the problem entirely.  If the data is in a flat file on
> the disk, and searches are relatively infrequent, it may be best to do
> a binary search on the file itself, and not even reading the list into
> memory.  There used to be a Unix utility that did exactly that (It might
> have been called 'look').  If a great deal of searches are being done,
> it might work to create the hash, especially if the very long startup
> time to build the hash can be tolerated.  Another possiblity, if the
> list is relatively stable, is to use a tie'd hash to a DBM file.  That
> way, you can build the hash once, and then use an index into the hash to
> only read the bit of the file that has the string you're looking for.
> This has the advantage of very fast searches, combined with little or no
> setup time for the user program.  There is a very slow initial startup
> time to build the hash in the first place, though.
> 
> --
> Jim Anderson                    (612) 782-0456     jim at acadcam.com
> Anderson CAD/CAM, Inc           Lucifer designed MS-DOS to try
> 3800 Apache Lane NE                 men's souls.
> St Anthony, MN  55421           Then he had a better idea...
> 
> --------------------------------------------------
> Minneapolis Perl Mongers mailing list
> 
> To unsubscribe, send mail to majordomo at pm.org
> with "unsubscribe mpls" in the body of the message.


--------------------------------------------------
Minneapolis Perl Mongers mailing list

To unsubscribe, send mail to majordomo at pm.org
with "unsubscribe mpls" in the body of the message.



More information about the Mpls-pm mailing list