[Melbourne-pm] Data::Token

Wed May 28 20:41:55 PDT 2008

> SHA1 and MD5 are in the same family, and successful attacks on (full)
> SHA1 have reduced collision generation to 2^69 trials from 2^80.
>
> Plan on replacing SHA1 everywhere within the next ten years, and on
> needing to step up to SHA256 or SHA512 in the interim, at the very
> least.

All of the above is correct, but it does not quite apply to this case.
MD5, SHA1 and up each just make collisions less likely, which helps
against brute force attacks - but that matters for signatures of text.
Remember that here the hash is just a way of hiding the secret. What it
needs to do is make sure that you need 1000s of guesses or more to get
the next entry, whereas using time (or, as shown, even rand(time)) is
predictable.
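
To make the predictability point concrete, here is a minimal sketch. It
assumes a hypothetical scheme whose token is a hash of the issue time
and nothing else; the names make_weak_token and guess_token and the 60
second window are made up for illustration and are not taken from any
module in this thread. An observer who knows roughly when the token was
issued only has to try each second in a small window:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical weak scheme: the token depends only on the issue time.
sub make_weak_token {
    my ($issued_at) = @_;
    return md5_hex($issued_at);
}

# Attacker side: try every second in a window around the estimated time.
# Returns the number of guesses needed, or undef if nothing matched.
sub guess_token {
    my ($token, $estimate, $window) = @_;
    my $guesses = 0;
    for my $t ($estimate - $window .. $estimate + $window) {
        $guesses++;
        return $guesses if md5_hex($t) eq $token;
    }
    return undef;
}

my $issued  = time();
my $token   = make_weak_token($issued);
my $guesses = guess_token($token, $issued + 7, 60);   # estimate is 7s off
print "recovered the token after $guesses guesses\n";
```

The hash hides the input but adds no unpredictability of its own, so the
search space is only as large as the attacker's uncertainty about the
inputs.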

One of the reasons cryptography is so hard is that you can't take a rule
from one case and apply it to another. The MD5 birthday attack scenarios
are useful only against documents you are signing, whereas all I need
here is a one-way hashing algorithm. I could probably use crypt :-)
(not really).

> [...]
>
>> Most of the algorithms around use a simple text string - "MySecret".
>> This is how tokens are generated for apache cookies and
>> examples for tokens in PHP and on Perl Monks - but that is silly in a
>> CPAN module, so I thought a bit of randomness.
>
> [...]
>
>> It is a sad fact that most of the Token code on CPAN and in the wild
>> use things like Database ID, Time stamp or similar to set the token
>> for a cookie :-)
>
> ...I agree that your model is substantially better, but I would
> generally encourage building secure first, then looking at allowing
> the protection to be weakened later.
>
> That way you fail safe rather than depending on programmers to
> actually have a notion of how to effectively secure the system.

Agreed.

>
> [...]
>
>> Good one thanks. I think the module should try and do well with zero
>> input (DWIM) - so I will look at Crypt::Random. But we can always
>> allow input into the function for increased randomness by passing
>> straight through.
>
> Allowing the end user to pass in "random" data to increase entropy
> will, in many cases, result in less entropy included because,
> frankly, most people don't really understand how to generate that. :/
>
> However, Crypt::Random is a blocking module, and your web server is
> likely to be fairly entropy constrained[1], so you want to be careful
> to set the strength of the input to low (Strength => 0) when setting
> it up.

We don't need to create the secret every time; it can be generated
once and kept in memory (yes, that is safe: it is not a cryptographic
key, just a means of making the token unpredictable). However, that
would only work if you are using mod_perl or similar, where the process
persists between requests.
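
A rough sketch of that "generate once, keep in memory" idea follows.
It is not the module's code: new_token and _secret are hypothetical
names, and it swaps Crypt::Random for a plain read from /dev/urandom
(assuming a Unix-like system) so that it uses only core modules and
does not block.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

my $secret;         # generated lazily, then reused for the process lifetime
my $counter = 0;    # keeps tokens from the same process distinct

# Fetch the per-process secret, creating it on first use.
# Assumption: /dev/urandom exists (Unix-like systems only).
sub _secret {
    return $secret if defined $secret;
    open my $fh, '<:raw', '/dev/urandom'
        or die "cannot open /dev/urandom: $!";
    my $buf = '';
    my $got = read($fh, $buf, 32);
    close $fh;
    die "short read from /dev/urandom" unless defined $got && $got == 32;
    $secret = $buf;
    return $secret;
}

# Hash the secret together with values that differ on every call.
sub new_token {
    return sha1_hex(_secret(), $$, time(), ++$counter);
}

print new_token(), "\n";
```

Under plain CGI each request is a fresh process, so the secret would be
regenerated on every call and the saving disappears.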

But as for inputs - I intend not to give the user any inputs, and
instead to make the security good enough by default. Rather than
provide a flexible module that does everything, this will just do one
thing well. Then as issues arise (SHA-1 becomes no good, better
randomness is required) I just change it.

> If you were extending this I would consider an implementation that can
> answer the key question "Is this my token" in a cryptographically
> secure fashion, ensuring that you don't need to store the token
> anywhere.

That is a great idea, but not for this module I think. I will consider
a way of supporting it, though. The problem is of course that you must
keep your secret, and a long-lived secret is vulnerable.

In the end though, a token really needs to be stored, so you can  
always just look it up. Nice idea though, good for form processing.
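
For reference, one common way to answer "Is this my token" without
storing anything is to append a keyed hash (HMAC) of the payload. The
sketch below is only an assumption about what was meant; sign_token and
is_my_token are hypothetical names, and the secret shown is a
placeholder.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Append an HMAC of the payload so the token can be checked later
# using nothing but the server-side secret.
sub sign_token {
    my ($secret, $payload) = @_;
    return $payload . ':' . hmac_sha256_hex($payload, $secret);
}

# Returns 1 if the token carries a valid HMAC for its payload, else 0.
sub is_my_token {
    my ($secret, $token) = @_;
    my ($payload, $mac) = $token =~ /\A(.*):([0-9a-f]{64})\z/s
        or return 0;
    my $expected = hmac_sha256_hex($payload, $secret);
    # Compare every character rather than stopping at the first mismatch.
    my $diff = 0;
    $diff |= ord(substr($mac, $_, 1)) ^ ord(substr($expected, $_, 1))
        for 0 .. 63;
    return $diff == 0 ? 1 : 0;
}

my $secret = 'placeholder-secret';    # illustrative only
my $token  = sign_token($secret, 'form42');
print is_my_token($secret, $token) ? "mine\n" : "not mine\n";
```

This shows the trade-off mentioned above: anyone who learns the secret
can mint valid tokens, so the secret has to be protected and rotated.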

On another topic, the security of using MD5: it seems that every module
I find on the net, from Java to PHP to Python to Perl, is using what I
originally wrote - MD5 of a random string (usually time) combined with a
unique number (often just generated from a sequence, time or a
combination of time, IP etc).

The most common PHP code is
	$token = md5(uniqid(rand(), TRUE));

uniqid is roughly equivalent to Data::UUID (with a different way of
calculating the value).

Even the praised Apache::Session and CGI::Session just use:

	md5_hex($$, time(), rand(time));

I can't find a single reference on the net that says this is insecure
in the way that has been described in this thread. Some people suggest
in threads that you should use SHA1, and in each case the reply is that
it is not required.

So the question is:

1) Am I missing the threads on the net?
2) Are we jumping to the wrong conclusion because we are mixing up
document signature forgery with unpredictability?
3) Is this really a problem, and are we the first to really solve it?

My gut is now telling me (2). If it is not then almost every single  
site on the internet is now vulnerable.

Note also that it is not only PHP, Apache::Session and CGI::Session.
Even Apache::AuthCookie just uses md5_hex($date, $PID, $PAC). I can't
find a single example on the net that does not use MD5, except the
insecure ones.

Scott