Welcome Guest [Log In] [Register]
Welcome to Crypto. We hope you enjoy your visit.


You're currently viewing our forum as a guest. This means you are limited to certain areas of the board and there are some features you can't use. If you join our community, you'll be able to access member-only sections, and use many member-only features such as customizing your profile, sending personal messages, and voting in polls. Registration is simple, fast, and completely free.


Join our community!


If you're already a member please log in to your account to access all of our features:

Username:   Password:
Add Reply
pride.crypto.metrics; python tools for statistical analysis of crypto
Topic Started: Apr 21 2016, 12:21 AM (457 Views)
E. Rose
Member
[ *  * ]
I develop a python module called pride.crypto.metrics. It contains various tests for various types of cryptographic primitives, such as hash functions and block ciphers. I have found the tools invaluable when researching my own algorithms. I am putting it here in the hope that someone else may find it useful as well.

Tests include:

  • avalanche/diffusion test
    - Measures how much a change in input influences output
  • randomness test (ent)
    - Standard measurement of random bytes using ent (entropy, chi square, serial correlation, etc)
  • bias test
    - Determines uniformity of the supplied random bytes; Measures the evenness of output distribution in terms of what symbols appear and where.
  • period test
    - Cycle detection test. Chains output as input to the hash and measures how regularly truncated output cycles.
  • collision test
    - Truncated collision detection test. Crypts permutations of inputs and notes the how often collisions happen using truncated outputs
  • compression test
    - Measures how long it takes a hash function to compress large input
  • performance test
    - Measures how long it takes to generate 1 MB of random data

The code is available on github here: https://github.com/erose1337/pride/blob/master/crypto/metrics.py and is being attached as a zip to this post.

You should not have to install the complete pride package just to use pride.crypto.metrics; The script should function as a standalone, save for the demo test of AES/random data. Testing of random data does require ent; ent should be located in the same directory the script is run from.

Let me know if there's any problems, if you need help using it, or if you find it helpful. It could be documented a little better, but it's pretty straightforward to use.
Attached to this post:
Attachments: metrics.py (9.48 KB)
Edited by E. Rose, Apr 21 2016, 12:23 AM.
Offline Profile Quote Post Goto Top
 
Replies:
Karl-Uwe Frank
NSA worthy
[ *  *  *  *  *  * ]
Great! Thanks again, it straightens the process firmly.

Your test routine seems to be a fine tool for measuring hash output quality.

Just one suggestion which I found in diehard and the TestU01 suites so useful is,
that you maybe include some common hash algorithm to choose from, enabling
someone who is testing his own algorithm to verify his results against those of
some common hash of checksum functions.



cHNiMUBACG0HAAAAAAAAAAAAAABIZVbDdKVM0w1kM9vxQHw+bkLxsY/Z0czY0uv8/Ks6WULxJVua
zjvpoYvtEwDVhP7RGTCBVlzZ+VBWPHg5rqmKWvtzsuVmMSDxAIS6Db6YhtzT+RStzoG9ForBcG8k
G97Q3Jml/aBun8Kyf+XOBHpl5gNW4YqhiM0=
Offline Profile Quote Post Goto Top
 
E. Rose
Member
[ *  * ]
Well thank you!

When run as a module, metrics.py will run metrics on sha256 and on the random oracle hash function. This provides examples results that a "real" hash function would produce, along with the metrics of a truly random function.

Oh, and I cleaned up the support for block ciphers/stream ciphers; The updated version on github supports test_block_cipher and test_stream_cipher functions too. Those methods accept an encrypt method, along with a key and iv/nonce; plaintext data/ciphertext data is supplied/created by the tests.
Edited by E. Rose, Apr 28 2016, 01:16 AM.
Offline Profile Quote Post Goto Top
 
Karl-Uwe Frank
NSA worthy
[ *  *  *  *  *  * ]
Just added two 32 bit hashes. Do you consider these results below correct?

Code:
 
...

import[space]struct

#-----------------------------------------
#[space]ugly[space]bad[space]hash
#
def[space]BadHash(data):
[space][space]h[space]=[space]int(0)

[space][space]#[space]Hash[space]the[space]Content
[space][space]for[space]i[space]in[space]range(len(data)):
[space][space][space][space]h[space]^=[space]h[space]>>[space]3
[space][space][space][space]h[space]^=[space]h[space]<<[space]5
[space][space][space][space]h[space]^=[space]ord(data[i])

[space][space]return[space]struct.pack('I',[space]h)

#-----------------------------------------
#[space]One-byte-at-a-time[space]hash[space]based[space]on[space]Murmur's[space]mix
#
def[space]MurmurOAAT(data):
[space][space]h[space]=[space]int(0)

[space][space]#[space]Hash[space]the[space]Content
[space][space]for[space]i[space]in[space]range(len(data)):
[space][space][space][space]h[space]^=[space]ord(data[i])
[space][space][space][space]h[space][space]=[space](h[space]*[space]0x5bd1e995)[space]&[space]0xffffffff
[space][space][space][space]h[space]^=[space]h[space]>>[space]15

[space][space]return[space]struct.pack('I',[space]h)


...

[space][space]metrics.test_hash_function([space]BadHash[space])

[space][space]metrics.test_hash_function([space]MurmurOAAT[space])



Code:
 
Testing[space]BadHash...
-------------------------------------------------
Testing[space]diffusion/avalanche...[space]
Minimum[space]Hamming[space]distance[space]ratio:[space][space]0.03125
Average[space]Hamming[space]distance[space]ratio:[space][space]0.03125
Maximum[space]Hamming[space]distance[space]ratio:[space][space]0.25
Testing[space]randomness[space]of[space]131072[space]bytes...[space]
Data[space]generated;[space]Running[space]ent...
Entropy[space]=[space]0.000000[space]bits[space]per[space]byte.

Optimum[space]compression[space]would[space]reduce[space]the[space]size
of[space]this[space]131072[space]byte[space]file[space]by[space]100[space]percent.

Chi[space]square[space]distribution[space]for[space]131072[space]samples[space]is[space]33423360.00,[space]and[space]randomly
would[space]exceed[space]this[space]value[space]less[space]than[space]0.01[space]percent[space]of[space]the[space]times.

Arithmetic[space]mean[space]value[space]of[space]data[space]bytes[space]is[space]0.0000[space](127.5[space]=[space]random).
Monte[space]Carlo[space]value[space]for[space]Pi[space]is[space]4.000000000[space](error[space]27.32[space]percent).
Serial[space]correlation[space]coefficient[space]is[space]undefined[space](all[space]values[space]equal!).
Testing[space]period[space]with[space]output[space]truncated[space]to[space]2[space]byte:[space]
Minimum/Average/Maximum[space]cycle[space]lengths:[space][space](1,[space]1.0,[space]1)
Testing[space]for[space]byte[space]bias...
Byte[space]bias:[space][space]4[space][256,[space]32,[space]1,[space]1,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0]
Symbols[space]out[space]of[space]256[space]that[space]appeared[space]anywhere:[space][space]256
Testing[space]for[space]collisions[space]with[space]output[space]of[space]3[space]bytes...[space]
Collision[space]after:[space][space]256[space];[space]Output[space]size:[space][space]3
-------------------------------------------------

Testing[space]MurmurOAAT...
-------------------------------------------------
Testing[space]diffusion/avalanche...[space]
Minimum[space]Hamming[space]distance[space]ratio:[space][space]0.21875
Average[space]Hamming[space]distance[space]ratio:[space][space]0.5
Maximum[space]Hamming[space]distance[space]ratio:[space][space]0.84375
Testing[space]randomness[space]of[space]131072[space]bytes...[space]
Data[space]generated;[space]Running[space]ent...
Entropy[space]=[space]0.000000[space]bits[space]per[space]byte.

Optimum[space]compression[space]would[space]reduce[space]the[space]size
of[space]this[space]131072[space]byte[space]file[space]by[space]100[space]percent.

Chi[space]square[space]distribution[space]for[space]131072[space]samples[space]is[space]33423360.00,[space]and[space]randomly
would[space]exceed[space]this[space]value[space]less[space]than[space]0.01[space]percent[space]of[space]the[space]times.

Arithmetic[space]mean[space]value[space]of[space]data[space]bytes[space]is[space]0.0000[space](127.5[space]=[space]random).
Monte[space]Carlo[space]value[space]for[space]Pi[space]is[space]4.000000000[space](error[space]27.32[space]percent).
Serial[space]correlation[space]coefficient[space]is[space]undefined[space](all[space]values[space]equal!).
Testing[space]period[space]with[space]output[space]truncated[space]to[space]2[space]byte:[space]
Minimum/Average/Maximum[space]cycle[space]lengths:[space][space](1,[space]1.0,[space]1)
Testing[space]for[space]byte[space]bias...
Byte[space]bias:[space][space]4[space][256,[space]256,[space]256,[space]256,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0]
Symbols[space]out[space]of[space]256[space]that[space]appeared[space]anywhere:[space][space]256
Testing[space]for[space]collisions[space]with[space]output[space]of[space]3[space]bytes...[space]
Collision[space]after:[space][space]2485[space];[space]Output[space]size:[space][space]3
-------------------------------------------------


cHNiMUBACG0HAAAAAAAAAAAAAABIZVbDdKVM0w1kM9vxQHw+bkLxsY/Z0czY0uv8/Ks6WULxJVua
zjvpoYvtEwDVhP7RGTCBVlzZ+VBWPHg5rqmKWvtzsuVmMSDxAIS6Db6YhtzT+RStzoG9ForBcG8k
G97Q3Jml/aBun8Kyf+XOBHpl5gNW4YqhiM0=
Offline Profile Quote Post Goto Top
 
E. Rose
Member
[ *  * ]
The ent test looks a little weird; It says all values are equal in the serial correlation section. The period test looks like it is suffering from the same problem somehow. Otherwise it looks appropriate. Let me take a look at what might be happening with the ent/period test problem.
Offline Profile Quote Post Goto Top
 
Karl-Uwe Frank
NSA worthy
[ *  *  *  *  *  * ]
If you look at the Test_Data_{}.bin files you will realise that they are filled with all 0x0.

By the way, it might be useful in my opinion either to delete these binary test files after the test run or rename same in a way that they correlate with the hash which generated them.
Edited by Karl-Uwe Frank, Apr 28 2016, 06:11 PM.
cHNiMUBACG0HAAAAAAAAAAAAAABIZVbDdKVM0w1kM9vxQHw+bkLxsY/Z0czY0uv8/Ks6WULxJVua
zjvpoYvtEwDVhP7RGTCBVlzZ+VBWPHg5rqmKWvtzsuVmMSDxAIS6Db6YhtzT+RStzoG9ForBcG8k
G97Q3Jml/aBun8Kyf+XOBHpl5gNW4YqhiM0=
Offline Profile Quote Post Goto Top
 
E. Rose
Member
[ *  * ]
I pushed an updated version to github. I incorporated your feature request and the files that hold random data for testing are now deleted at the end, and hopefully fixed that bug.
Offline Profile Quote Post Goto Top
 
Karl-Uwe Frank
NSA worthy
[ *  *  *  *  *  * ]
Still Test_Data.bin contains all zero byte.
cHNiMUBACG0HAAAAAAAAAAAAAABIZVbDdKVM0w1kM9vxQHw+bkLxsY/Z0czY0uv8/Ks6WULxJVua
zjvpoYvtEwDVhP7RGTCBVlzZ+VBWPHg5rqmKWvtzsuVmMSDxAIS6Db6YhtzT+RStzoG9ForBcG8k
G97Q3Jml/aBun8Kyf+XOBHpl5gNW4YqhiM0=
Offline Profile Quote Post Goto Top
 
E. Rose
Member
[ *  * ]
The problem is being caused because of how test data is generated. Data is generated via a hash chain. The hash chain begins with the hash of null bytes, and the output becomes the next input.

Both the hashes that you are testing with do not work on null strings. Applying either of them to a string of null bytes always returns a string of null bytes. For example:

Code:
 

from[space]hashlib[space]import[space]sha

print[space]sha256("\x00").digest()
print
print[space]sha256("\x00\x00").digest()
print
print[space]sha256("\x00"[space]*[space]100).digest()
print
print[space]MurmurOAAT("\x00")
print
print[space]MurmurOAAT("\x00\x00")
print
print[space]MurmurOAAT("\x00"[space]*[space]100)


When supplied "\x00", "\x00\x00", and "\x00" * 100, sha256 produces three distinct outputs. MurmurOAAT produces "\x00\x00\x00\x00" for all three inputs. This is what was causing your test data files to be filled with all zeros.

I can change the test to start with the hash of "\x01", which will fix the random data not being generated correctly. I'm also going to add an additional, separate test to detect this feature badhash/MurmurOAAT both happen to have, because it wouldn't have showed up if the initial hash weren't "\x00", and I feel it's something worth noting.

The updated version is on now github; I tested it with MurmurOAAT and it produces the following results:
Quote:
 

The supplied hash produces collisions for null input strings of varying length
Testing diffusion/avalanche...
Minimum Hamming distance ratio: 0.21875
Average Hamming distance ratio: 0.5
Maximum Hamming distance ratio: 0.84375
Testing randomness of 1048576 bytes...
Data generated; Running ent...
Testing period with output truncated to 2 byte:
Minimum/Average/Maximum cycle lengths: (1, 34.87, 577)
Testing for byte bias...
Byte bias: 4 [256, 256, 256, 256, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Symbols out of 256 that appeared anywhere: 256
Testing for collisions with output of 3 bytes...
Collision after: 2485 ; Output size: 3
Time testing compression function...
0.0355698579366
Testing time to generate 1024 * 1024 bytes...
4.10920479213


Thanks for your patience!
Edited by E. Rose, May 1 2016, 11:17 PM.
Offline Profile Quote Post Goto Top
 
Karl-Uwe Frank
NSA worthy
[ *  *  *  *  *  * ]
Well, now it throws the error

File "./pride.crypto/metrics.py", line 8, in <module>
from utilities import slide
ImportError: No module named utilities
cHNiMUBACG0HAAAAAAAAAAAAAABIZVbDdKVM0w1kM9vxQHw+bkLxsY/Z0czY0uv8/Ks6WULxJVua
zjvpoYvtEwDVhP7RGTCBVlzZ+VBWPHg5rqmKWvtzsuVmMSDxAIS6Db6YhtzT+RStzoG9ForBcG8k
G97Q3Jml/aBun8Kyf+XOBHpl5gNW4YqhiM0=
Offline Profile Quote Post Goto Top
 
E. Rose
Member
[ *  * ]
My apologies. I was refactoring my code and moved that. The updated version has it fixed. Sorry it's taken so long to sort this out!
Offline Profile Quote Post Goto Top
 
Karl-Uwe Frank
NSA worthy
[ *  *  *  *  *  * ]
No need to say sorry. You should take your time and you fixed it.
Code:
 
Testing[space]BadHash...
-------------------------------------------------
The[space]supplied[space]hash[space]produces[space]collisions[space]for[space]null[space]input[space]strings[space]of[space]varying[space]length
Testing[space]diffusion/avalanche...[space]
Minimum[space]Hamming[space]distance[space]ratio:[space][space]0.03125
Average[space]Hamming[space]distance[space]ratio:[space][space]0.03125
Maximum[space]Hamming[space]distance[space]ratio:[space][space]0.25
Testing[space]randomness[space]of[space]1048576[space]bytes...[space]
Data[space]generated;[space]Running[space]ent...
Entropy[space]=[space]6.721764[space]bits[space]per[space]byte.

Optimum[space]compression[space]would[space]reduce[space]the[space]size
of[space]this[space]1048576[space]byte[space]file[space]by[space]15[space]percent.

Chi[space]square[space]distribution[space]for[space]1048576[space]samples[space]is[space]16899671.41,[space]and[space]randomly
would[space]exceed[space]this[space]value[space]less[space]than[space]0.01[space]percent[space]of[space]the[space]times.

Arithmetic[space]mean[space]value[space]of[space]data[space]bytes[space]is[space]79.8866[space](127.5[space]=[space]random).
Monte[space]Carlo[space]value[space]for[space]Pi[space]is[space]3.915130291[space](error[space]24.62[space]percent).
Serial[space]correlation[space]coefficient[space]is[space]-0.043353[space](totally[space]uncorrelated[space]=[space]0.0).
Testing[space]period[space]with[space]output[space]truncated[space]to[space]2[space]byte:[space]
Minimum/Average/Maximum[space]cycle[space]lengths:[space][space](1,[space]34.86,[space]442)
Testing[space]for[space]byte[space]bias...
Byte[space]bias:[space][space]4[space][256,[space]32,[space]1,[space]1,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0,[space]0]
Symbols[space]out[space]of[space]256[space]that[space]appeared[space]anywhere:[space][space]256
Testing[space]for[space]collisions[space]with[space]output[space]of[space]3[space]bytes...[space]
Collision[space]after:[space][space]256[space];[space]Output[space]size:[space][space]3
These test result look reliable to me.
cHNiMUBACG0HAAAAAAAAAAAAAABIZVbDdKVM0w1kM9vxQHw+bkLxsY/Z0czY0uv8/Ks6WULxJVua
zjvpoYvtEwDVhP7RGTCBVlzZ+VBWPHg5rqmKWvtzsuVmMSDxAIS6Db6YhtzT+RStzoG9ForBcG8k
G97Q3Jml/aBun8Kyf+XOBHpl5gNW4YqhiM0=
Offline Profile Quote Post Goto Top
 
1 user reading this topic (1 Guest and 0 Anonymous)
ZetaBoards - Free Forum Hosting
ZetaBoards gives you all the tools to create a successful discussion community.
Learn More · Sign-up Now
« Previous Topic · Utilities · Next Topic »
Add Reply