[Melbourne-pm] Data::Token

Paul Fenwick pjf at perltraining.com.au
Wed May 28 18:01:30 PDT 2008


G'day Scott,

Hashing
=======

I notice that Data::Token is using MD5.  Unfortunately, we're starting to 
get very good at engineering MD5 collisions, with 
http://th.informatik.uni-mannheim.de/People/lucks/HashCollisions/ as a 
striking example of this.  For Data::Token this could be considered a
non-issue, as we just want our tokens to be hard to guess, rather than using
them as the hash of a real document.  Even so, I'd lean towards SHA1 as a
hashing algorithm with fewer known flaws.
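
As a rough sketch of how small the change is, the snippet below digests the
same string with both algorithms using Digest::MD5 and Digest::SHA.  The input
string is a made-up stand-in (a UUID followed by a number); how Data::Token
actually builds its input is not shown here, so treat that part as an
assumption.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Digest::SHA qw(sha1_hex);

# Hypothetical input: a UUID string followed by a random number.
my $input = '4162F712-1DD2-11B2-B17E-C09EFE1DC403' . '1234567';

my $md5_token  = md5_hex($input);    # 128-bit digest, 32 hex characters
my $sha1_token = sha1_hex($input);   # 160-bit digest, 40 hex characters

print "MD5:  $md5_token\n";
print "SHA1: $sha1_token\n";
```

The only visible difference to callers is the longer token (40 hex characters
rather than 32).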

Randomness
==========

Unfortunately, rand(time) isn't very random.  When Perl sees the use of rand
it will first try to seed its pseudo-random number generator (PRNG) with a
good source of entropy, typically /dev/urandom on modern unixes.  On
most systems, this gives you at most 32 bits of entropy, since that's all
the random seed will take.  rand(time) then generates a floating point
number between 0 and the number of seconds since the epoch.  This number can
be predicted from the current time and our original 32 bits of entropy
(which an attacker can brute force).
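
To make the brute-force point concrete, here is a toy sketch.  It shrinks the
seed space to 16 bits so that it finishes instantly, and it assumes the
attacker has seen one output of rand and can guess the timestamp; the seed and
timestamp values are invented for the demonstration.

```perl
use strict;
use warnings;

my $now = 1212022890;   # hypothetical: a timestamp the attacker can guess

# The "victim" seeds the PRNG; a 16-bit seed keeps this demo fast.
my $secret_seed = 31337;
srand($secret_seed);
my $observed = rand($now);

# The attacker replays every candidate seed until the output matches.
my $recovered;
for my $candidate (0 .. 2**16 - 1) {
    srand($candidate);
    if (rand($now) == $observed) {
        $recovered = $candidate;
        last;
    }
}

print "recovered seed: $recovered\n";
```

The same loop over a full 32-bit seed space is slower, but well within reach
of an ordinary machine.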

Uniqueness
==========
MD5 doesn't guarantee that its output is unique, even though the input has
been generated from unique identifiers, because every input is squeezed into
the same 128 bits.  It's *very* unlikely that we'll see a collision, but it's
still a possibility.
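
For a feel of how unlikely, the sketch below uses the usual birthday
approximation p ~= n^2 / 2^(bits + 1).  It assumes the digest behaves like a
uniformly random function and only covers accidental collisions, not
deliberately engineered ones; the function name and the token count are just
for illustration.

```perl
use strict;
use warnings;

# Birthday approximation: p ~= n^2 / 2^(bits + 1), valid while p is small.
sub collision_probability {
    my ($tokens, $bits) = @_;
    return $tokens**2 / 2**($bits + 1);
}

my $p = collision_probability(1e9, 128);   # a billion tokens, MD5-sized output
printf "approximate collision probability: %.2e\n", $p;
```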

Suggestion
==========
Rather than pushing our UUID and our random number through MD5, I would 
suggest a simple concatenation.  The UUID guarantees that our resulting 
string will be unique, and our random number (appropriately encoded) will 
ensure that it's hard to guess.  I would allow the user to supply an 
argument specifying how many bits of randomness they want, and possibly an 
argument to specify the quality of that randomness (are we willing to block 
for good randomness?).
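
A minimal sketch of the concatenation idea follows.  The helper names
(random_hex, make_token), the '-' separator and the sample UUID are all
invented for illustration, and it reads /dev/urandom directly just to stay
self-contained, so it assumes a unix-like system.

```perl
use strict;
use warnings;

# Read $bits of randomness (rounded up to whole bytes) from /dev/urandom
# and return it hex-encoded.
sub random_hex {
    my ($bits) = @_;
    my $bytes = int(($bits + 7) / 8);
    open(my $fh, '<:raw', '/dev/urandom')
        or die "Cannot open /dev/urandom: $!";
    my $got = read($fh, my $buffer, $bytes);
    close($fh);
    die "Short read from /dev/urandom"
        unless defined $got && $got == $bytes;
    return unpack('H*', $buffer);
}

# Hypothetical helper: the UUID supplies uniqueness, the random suffix
# supplies unpredictability.  No hashing is involved.
sub make_token {
    my ($uuid, $bits) = @_;
    return $uuid . '-' . random_hex($bits);
}

my $token = make_token('4162F712-1DD2-11B2-B17E-C09EFE1DC403', 128);
print "$token\n";
```

Because the UUID passes through untouched, two tokens can only be equal if
their UUIDs are equal.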

I recommend using Crypt::Random from CPAN as a way to get your random 
numbers.  It does the hard work of finding an appropriate source of 
randomness, including hooking into /dev/u?random, asking PARI, or talking to 
the entropy gathering daemon (if installed).  It also takes size and 
strength arguments, which can be passed straight through from the user.
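
Something along these lines could pass the user's choices through.  It is
only a sketch: random_octets is a made-up wrapper, and the fallback to reading
the device directly when Crypt::Random isn't installed is my own assumption,
added so that the example runs either way.

```perl
use strict;
use warnings;

# Return $length random bytes.  $strength follows Crypt::Random's
# convention: 0 selects the non-blocking source, 1 the blocking one.
sub random_octets {
    my ($length, $strength) = @_;

    if (eval { require Crypt::Random; 1 }) {
        return Crypt::Random::makerandom_octet(
            Length   => $length,
            Strength => $strength,
        );
    }

    # Fallback (assumption): read the device directly if the module
    # is not available.
    my $device = $strength ? '/dev/random' : '/dev/urandom';
    open(my $fh, '<:raw', $device) or die "Cannot open $device: $!";
    my $got = read($fh, my $buffer, $length);
    close($fh);
    die "Short read from $device"
        unless defined $got && $got == $length;
    return $buffer;
}

my $octets = random_octets(16, 0);   # 128 bits, willing to skip blocking
print unpack('H*', $octets), "\n";
```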

Further reading
===============
I discuss the troubles with generating good random numbers in Perl in 
chapter 10 of "Perl Security", available from 
http://perltraining.com.au/notes.html .  Feedback and comments appreciated.

Cheerio,

	Paul

-- 
Paul Fenwick <pjf at perltraining.com.au> | http://perltraining.com.au/
Director of Training                   | Ph:  +61 3 9354 6001
Perl Training Australia                | Fax: +61 3 9354 2681