*****--------------------------=== Snowflake ===--------------------------*****
-----*****-------------------------^^^^^^^^^-------------------------*****-----

SUMMARY
-------

Snowflakes are considered unique. A common saying asserts that no two are
identical. However, under specific circumstances and environments their
shapes tend to be predictable. For example, the shape of a snowflake is
determined broadly by the temperature and humidity at which it is formed.

Snowflake is a framework that assists in exploiting randomness
vulnerabilities in PHP applications; in particular, it helps with the
implementation of seed recovery attacks. For more information on these
attacks, see the paper "I Forgot Your Password: Randomness Attacks Against
PHP Applications".

This release includes the standalone version of the snowflake cracking tool
along with its python interface, a sample password reset exploit against
mediawiki 1.18.1 that demonstrates the process of writing such exploits, and
a stripped version of the mt_rand() function from the PHP sources.

INSTALLATION
------------

Installation of snowflake is straightforward. Run the make command in the
top directory and all binaries and libraries will be compiled, along with
the hash library needed by the included mediawiki sample exploit. The
snowflake executable and library will be installed in the release directory,
while the hash library for the mediawiki exploit will be installed in the
exploits directory.

Snowflake does not have any unusual dependencies, so it should compile
successfully on any unix system.

Using snowflake
---------------

For those familiar with hash cracking tools, snowflake follows an
architecture similar to that of the rainbowcrack tool. The user writes an
application-dependent hash function and then compiles it as a shared
library, taking care to export the necessary information (see below).
The function should take as input a 32-bit integer (the seed) and return an
arbitrary output (defined as char *). Note that we use the term hash
function in a liberal sense: the function need not reduce the input length
or have any other property typically associated with such functions (e.g.,
collision resistance).

Afterwards, snowflake can do the following using this hash function:

1. Generate rainbow tables for all 2^32 possible seeds.
2. Search some given rainbow tables for a target hash.
3. Perform an online bruteforce over all 2^32 seeds to find a target hash.

In addition, using the python interface a user can perform actions 2 and 3
from a python script, which makes them much easier to use from within a
python exploit.

**** Defining a hash function ****

In order to define a new hash function one should write the hash function
and then define a hashFuncEntry array named hashFuncArray that contains the
information snowflake needs about the hash function. The hashFuncEntry
struct is the following:

    typedef struct {
        char *hashName;
        char *(*hashFunc)(unsigned int, char *);
        unsigned int hashLen;
    } hashFuncEntry;

where hashName is the name of the hash function that will be passed to
snowflake, hashFunc is a pointer to the function, and hashLen is the length
of the hashes that this function produces. Here is an example from the
mediawiki hash function:

    hashFuncEntry hashFuncArray[] = {  // this symbol will be exported.
        {"wikihash", mediawikiHash, 16},
        {0, 0, 0},                     // terminated by a zero entry.
    };

The array should always be terminated by an all-zero entry; a library can
contain more than one hash function. The hash function should look like
this:

    char *mediawikiHash(unsigned int seed, char hash[])

The second argument must be an array large enough to hold the hash, and the
function should also return a pointer to that array. Afterwards, the code
should be compiled as a shared library that will lie in the same directory
as snowflake.
The name of the library file should be hashlibN.so, where N is the number of
the library and can be from 0 to 10, e.g. hashlib0.so.

When writing hash functions that use the mt_rand generator, one can use the
stripped version that is included in the mt_rand directory. This version is
a standalone implementation of the mt_rand function extracted from the PHP
code. Depending on the hash function, one may be able to do some additional
optimizations in order to improve performance (e.g. see the mediawiki hash
function at hashlibs/mwikihash.c).

For hash functions that use the rand() function you may use the standard
libc implementation. Note that attacking rand() on windows is trivial, since
the state of the LCG of rand() is 15 bits, which makes it very easy to
attack even with a direct bruteforce over the network.

Now we are ready to use the hash function with snowflake.

**** Generating rainbow tables ****

In order to generate a number of rainbow tables we can use the standalone
snowflake tool. But first we need to calculate the success probability of
our tables; ideally we would like it to be as close to 100% as possible.
For this purpose we can use the prob.py script, which calculates the success
probability given the parameters of each table and the total number of
tables.

To generate one rainbow table with 10000000 entries and chain length 1000 we
issue the following command:

    snowflake generate 10000000 1000 1 myhashfunction

Check the usage menu of snowflake for a detailed explanation of the command.

Notice that because our search space is only 2^32, it is easy to reach a
success probability of almost one while keeping the tables moderately small
and the chains short. Some parameters follow:

    Chain Number | Chain length | no. of tables | Success Probability
        10m            1000              3              0.990317
        10m            3000              3              0.999879
         5m            3000              3              0.997676

Notice also that each chain entry is 64 bits, so a table with 10m entries
takes around 80 MB of space.
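The arithmetic behind the parameters above can be sketched in a few lines of
python. This is not the prob.py script itself: it assumes the standard
estimate from Oechslin's analysis of rainbow tables (distinct chain points
shrink column by column as chains merge), and the function names
table_success, total_success and table_bytes are hypothetical.

```python
import math

SEED_SPACE = 2 ** 32  # number of possible 32-bit seeds


def table_success(chains, chain_len, space=SEED_SPACE):
    """Estimated success probability of a single rainbow table.

    Assumption: Oechslin's estimate, where m is the expected number of
    distinct points in the current column of the table.
    """
    m = float(chains)
    miss = 1.0
    for _ in range(chain_len):
        miss *= 1.0 - m / space                  # target not in this column
        m = -space * math.expm1(-m / space)      # distinct points in next column
    return 1.0 - miss


def total_success(chains, chain_len, tables, space=SEED_SPACE):
    """Success probability when several independent tables are searched."""
    miss_one = 1.0 - table_success(chains, chain_len, space)
    return 1.0 - miss_one ** tables


def table_bytes(chains, entry_bytes=8):
    """Size of one table, given one 64-bit entry per chain."""
    return chains * entry_bytes


if __name__ == "__main__":
    for chains, length, tables in [(10000000, 1000, 3),
                                   (10000000, 3000, 3),
                                   (5000000, 3000, 3)]:
        print(chains, length, tables,
              round(total_success(chains, length, tables), 6))
    print(table_bytes(10000000), "bytes per 10m-entry table")
```

Under these assumptions the three parameter sets should come out close to
the probabilities listed in the table; if prob.py uses a different model,
trust its output over this sketch.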
**** Searching rainbow tables ****

In order to search a rainbow table for a target hash we issue the following
command:

    snowflake search rainbow_table_path target_hash

The hash should be given in human-readable hexadecimal form; snowflake will
convert it to its raw byte form. An important note is that all the
information about the rainbow table is encoded in its filename, so the
filename should stay intact. Otherwise you will experience all sorts of odd
behavior, ranging from errors to stack overflows.

**** Cracking the hash online ****

If no tables are available, or the hash was not found in the tables, we can
perform an exhaustive search over all 2^32 possible seeds. To do so we issue
the command:

    snowflake crack hash_func_name target_hash

The cracker is multithreaded and uses as many threads as there are CPU cores
in the system in order to optimize performance. With a 2.3 GHz quad-core
processor we were able to search the entire seed space for the mediawiki
hash function in about 20 minutes.

**** Using the python interface ****

Snowflake also includes a python module that exposes the search and crack
functionality to python scripts. The module uses ctypes to access the
functions from within python. In addition, the module includes a pure python
implementation of the mt_rand function as used by PHP. The mediawiki exploit
is a nice showcase of how to use both of these classes (check
exploits/mediawiki.py).

Initially import the classes:

    from snowflake import MtRand, Snowflake

The Snowflake class offers the following functions:

-- __init__( clib = './snowflake.so' )
   The class takes as an optional initialization argument the full path to
   the snowflake shared library; otherwise it will look for snowflake.so in
   the current directory.
-- searchRainbowTables( targetHash, tableList )
   This function takes as input a targetHash (in raw byte form; you can use
   the unhexlify function to convert it) and a list of rainbow tables, and
   tries to find the targetHash in each of these tables.

-- searchHashOnline( targetHash, hashFuncName )
   Likewise, this function takes a targetHash in raw byte form and the hash
   function name, and performs an exhaustive search to find the seed that
   generated the target hash.

-- oneWayOrAnother( targetHash, tableList, hashFuncName )
   This function is simply a wrapper that calls both of the above functions.
   It first searches the given rainbow tables and, if the hash is not found,
   performs an exhaustive search.

The MtRand class offers the following functions:

-- __init__( php = True )
   At initialization the user needs to tell the class whether it should use
   the PHP Mersenne Twister implementation or the original Mersenne Twister
   implementation. There are some differences between the implementations
   that are explained in the paper.

-- mtSrand( seed )
   Seeds the generator with the given seed using the PHP seeding algorithm.

-- mtRand( min = None, max = None )
   Generates a random number using the original Mersenne Twister algorithm,
   with an option to map the number to a smaller range using the PHP
   truncation algorithm.

-- phpMtRand( min = None, max = None )
   Generates a random number using the PHP Mersenne Twister implementation,
   again with the option to map the number to a smaller range.

Writing exploits using snowflake
--------------------------------

Okay, now the interesting stuff. We now have everything we need to mount
seed recovery attacks against PHP applications that use either rand() or
mt_rand().
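Since predicting a token means reproducing exactly how the application turns
raw generator outputs into values, it helps to be explicit about the range
mapping mentioned for mtRand and phpMtRand. The sketch below assumes the
scaling performed by the RAND_RANGE macro in the PHP 5 sources; the helper
name php_rand_range is hypothetical and is not part of the snowflake module,
and other PHP versions may map ranges differently.

```python
PHP_MT_RAND_MAX = 0x7FFFFFFF  # mt_rand() returns 31-bit values


def php_rand_range(n, lo, hi, tmax=PHP_MT_RAND_MAX):
    """Map a raw output n in [0, tmax] into [lo, hi].

    Assumption: mirrors PHP 5's RAND_RANGE macro, which scales n by a
    floating point factor and truncates, rather than using a modulo.
    """
    return lo + int((float(hi) - lo + 1.0) * (n / (tmax + 1.0)))


if __name__ == "__main__":
    # The smallest and largest raw outputs land on the ends of the range.
    print(php_rand_range(0, 1, 10))
    print(php_rand_range(PHP_MT_RAND_MAX, 1, 10))
```

Because the mapping scales rather than reduces modulo the range size, the
result is determined by the high bits of the raw output, which is worth
keeping in mind when writing the C hash function for a leak.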
The first thing we need to do is find a place within the application where
an output of the target function is leaked to the user in any form (it could
be a CSRF token, a randomly generated password, a password reset token; just
grep around...).

Afterwards we write some C code that generates this leak when a newly seeded
generator is used. This code will serve as our hash function. Then we
generate some rainbow tables for this hash function and write an exploit
that forces apache to spawn a new process, so that we obtain a sample of our
hash function. We then crack that sample using snowflake to recover the
initial seed that was used by the target application.

Now we can reset the target user's password within the same connection
(remember: keep-alive is your friend!) and predict the token that will be
generated. To find the token we create a new instance of the MtRand class
and seed it with the seed we obtained. Afterwards we can simply apply the
same algorithm as the application to predict the generated token.

For a code tutorial that better illustrates the process, check the mediawiki
exploit.

LICENCE
-------

This software was written based on research funded by the ERC project
CODAMODA and is released under the New BSD License.

Acknowledgements
----------------

This research was partly supported by ERC project CODAMODA, #259152.

TODO / BUGS / ETC
-----------------

We think that the tool is quite usable by now. We will probably add
multithreaded search of the rainbow tables in order to improve the search
time when complex hash functions are used, although with every hash function
we have used so far the search time is under 1 minute.

In addition, this is an ALPHA release. You will probably find a number of
bugs, and we would be grateful if you reported them. However, please only
report bugs that occur when the user is not acting maliciously. If someone
gets owned because they opened a malicious rainbow table, so be it!
Nevertheless, we will probably fix some of these bugs too (when we find some
spare time...).

Finally, if anybody wants to contribute some code, they are more than
welcome to do so.

Happy exploitation!