Viewing Single Post From: Announcement: Official Vigenère Tutorial
rot13
Elite member
PulsarSL
May 25 2006, 01:27 AM
What we need now is a definitive guide to cracking a vigenere using IOC attacks.

Pulsar

Do keep in mind that IOC is useful for determining the key length, but after that, you have to use something else to actually recover the key. In general, you can shift through the possible values for each key letter and compare the decoded results with the standard letter frequencies for English. For a Vigenere text of a reasonable size, this technique works quite well. For shorter texts it can be a little more difficult and you may have to resort to trial and error, or crib dragging.
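A rough sketch of that shift-and-compare step in Python, assuming the key length is already known. The chi-squared score is just one common way to compare against English and not the only option, the frequency table is a set of approximate, commonly quoted percentages, and the function names are made up for illustration:

```python
# Approximate percentages for A..Z in English text (assumed, commonly quoted values).
ENGLISH_FREQ = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
    6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07,
]

def chi_squared(counts, total):
    """Chi-squared distance between observed letter counts and English."""
    score = 0.0
    for i in range(26):
        expected = ENGLISH_FREQ[i] / 100.0 * total
        score += (counts[i] - expected) ** 2 / expected
    return score

def best_shift(column):
    """Return the shift (0-25) whose decryption of `column` looks most like English."""
    if not column:
        return 0  # nothing to score, so fall back to shift 0
    best_score, best = None, 0
    for shift in range(26):
        counts = [0] * 26
        for ch in column:
            counts[(ord(ch) - ord('A') - shift) % 26] += 1
        score = chi_squared(counts, len(column))
        if best_score is None or score < best_score:
            best_score, best = score, shift
    return best

def recover_key(ciphertext, key_length):
    """Guess each key letter independently from its own column of ciphertext."""
    letters = [c for c in ciphertext.upper() if 'A' <= c <= 'Z']
    return ''.join(
        chr(ord('A') + best_shift(letters[i::key_length]))
        for i in range(key_length)
    )
```

With only a handful of letters per column the lowest score can easily be the wrong shift, which is where trial and error or cribs come in.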

I have found myself using IOC several times lately and I almost wonder if we need a tutorial on how to use IOC to attack various ciphers. Maybe it will do to just mention what it is good for and what it isn't. IOC tests whether the repetitiveness of a group of items is similar to that of a known group. The most common example is, do these letters repeat in roughly the same amounts that English letters do?

If letters are about evenly distributed over the 26-letter alphabet, the IOC is usually in the 0.03 to 0.04 range (1/26 is about 0.038). For English text it is in the 0.06 to 0.07 range. If only a few distinct letters are used, you could see much higher values. The IOC for the raw ADFGVX ciphertext in Loki's challenge is just under 0.2.

To compute the IOC (technically, what I am presenting here isn't IOC but the PHI test, but it works out the same), sum f*(f-1) over all letters, where f is the number of occurrences of each letter. If you have 5 A's, 3 B's and 1 C, you add 5*4 + 3*2 + 1*0 = 26. You divide this sum by N*(N-1), where N is the total number of letters. Again, with 5 A's, 3 B's and 1 C, that's 9 * 8 = 72. So the IOC is 26/72 or about 0.36.
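In Python the calculation comes out to just a few lines. This is only a sketch: the function name `ioc` is my own, and it assumes you only want to count the letters A-Z and ignore everything else:

```python
from collections import Counter

def ioc(text):
    """Sum of f*(f-1) over all letters, divided by N*(N-1)."""
    letters = [c for c in text.upper() if 'A' <= c <= 'Z']
    n = len(letters)
    if n < 2:
        return 0.0  # undefined for fewer than 2 letters; 0.0 is an arbitrary choice
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

# The worked example: 5 A's, 3 B's and 1 C.
print(round(ioc("AAAAABBBC"), 2))  # 0.36
```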

To apply IOC to a Vigenere or other periodic cipher, you are basically trying various key lengths and seeing if the IOC of the resulting alphabets is close to English (or whatever language you think the plaintext is in). For a key length of 3, for example, you take letters 1, 4, 7, 10, 13, etc. and compute the IOC. Then you take letters 2, 5, 8, 11, 14, etc. and compute the IOC, then take letters 3, 6, 9, 12, 15, etc. and compute the IOC. Then you average those 3 IOCs. Next, you do the same procedure for the 4 alphabets generated for a key length of 4, and so forth. The highest IOC probably indicates the right key length, although sometimes multiples of the actual key length may score higher.
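The key length search above might look something like this in Python. Treat it as a sketch: the function names and the `max_length` cutoff are my own choices, and it only looks at the letters A-Z:

```python
from collections import Counter

def ioc(letters):
    """Sum of f*(f-1) over all letters, divided by N*(N-1)."""
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

def average_ioc(text, key_length):
    """Split the text into key_length alphabets and average their IOCs."""
    letters = [c for c in text.upper() if 'A' <= c <= 'Z']
    columns = [letters[i::key_length] for i in range(key_length)]
    return sum(ioc(col) for col in columns) / key_length

def rank_key_lengths(text, max_length=20):
    """Return (key_length, average IOC) pairs, highest average first."""
    scores = [(k, average_ioc(text, k)) for k in range(1, max_length + 1)]
    return sorted(scores, key=lambda pair: -pair[1])
```

Look at the top few entries rather than just the first one, since a multiple of the real key length can come out on top.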

IOC can also be used for fractionating systems like ADFGVX and straddling checkerboards. For ciphers like these, you are not looking for a key length, but for an equivalent monoalphabetic substitution (i.e. instead of just solving the cipher outright, you convert it into a monoalphabetic substitution and THEN solve it). With ADFGVX, a plaintext letter is represented by a pair of letters, and these pairs are broken up and scrambled. If you think you have unscrambled these pairs back into the original representation, you can use IOC to see whether your unscrambling produces text with the same repetition characteristics as English. Likewise, with a straddling checkerboard, where a letter can be represented by either a single digit or by a 2-digit number, you can try various possibilities for the 2 digits that serve as the first digit of the 2-digit numbers, and check the resulting text to see if it has the same repetition characteristics as English. I found with the checkerboard that I often had alphabets with more repetition, so the IOC was higher. Instead of picking the highest, I picked what was closest to 0.066.
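Picking the candidate closest to 0.066 instead of the highest can be sketched like this in Python. It assumes each candidate has already been converted to ordinary letters (one letter per pair or per checkerboard symbol); the function names and the default target are just for illustration:

```python
from collections import Counter

def ioc(text):
    """Sum of f*(f-1) over all letters, divided by N*(N-1)."""
    letters = [c for c in text.upper() if 'A' <= c <= 'Z']
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

def closest_to_english(candidates, target=0.066):
    """Return the candidate text whose IOC is nearest to the target value."""
    return min(candidates, key=lambda text: abs(ioc(text) - target))
```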

IOC is typically useless for a monoalphabetic substitution because the IOC of the substituted text is the same as the original. Where in the original you might have 14 E's, the substitution may have 14 Q's, and when that's converted into a number, it makes no difference what the original letter was.
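You can check this for yourself with a few lines of Python. The substitution alphabet here is just the keyboard order, picked arbitrarily for the demonstration, and the function names are my own:

```python
from collections import Counter

def ioc(text):
    """Sum of f*(f-1) over all letters, divided by N*(N-1)."""
    letters = [c for c in text.upper() if 'A' <= c <= 'Z']
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

# An arbitrary monoalphabetic substitution: A->Q, B->W, C->E, and so on.
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "QWERTYUIOPASDFGHJKLZXCVBNM"

def substitute(text):
    """Apply the fixed substitution to the upper-cased text."""
    return text.upper().translate(str.maketrans(PLAIN, CIPHER))

message = "ATTACK THE EASTERN GATE AT DAWN"
# The counts just move to different letters, so the two values are identical.
print(ioc(message) == ioc(substitute(message)))  # True
```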

IOC on single letters is also pretty useless for a transposition, because the letter frequency should already be about the same as English. For some transpositions, however, it may be useful to compute a digraphic IOC. That is, instead of doing the IOC on the frequency of single letters, you can do it on every pair of letters. It is the same procedure: sum f*(f-1) over all digraph frequencies (how many AA's, AB's, ... ZZ's) and divide by N*(N-1), where N is the total number of digraphs, which should be the number of letters / 2. For English, that number should be around 0.0069, as opposed to 0.0015 for random text (I get these numbers from Lanaki's Lesson 2, which has a great discussion on IOC).
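A Python sketch of the digraphic version, using non-overlapping pairs so that the number of digraphs is the number of letters / 2 as described. The function name is my own, and a leftover odd letter at the end is simply dropped:

```python
from collections import Counter

def digraphic_ioc(text):
    """IOC computed over non-overlapping letter pairs instead of single letters."""
    letters = ''.join(c for c in text.upper() if 'A' <= c <= 'Z')
    # Pairs are letters 1-2, 3-4, 5-6, ...; a trailing single letter is ignored.
    pairs = [letters[i:i + 2] for i in range(0, len(letters) - 1, 2)]
    n = len(pairs)
    if n < 2:
        return 0.0
    counts = Counter(pairs)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

# Three AB's and one CD: (3*2 + 1*0) / (4*3)
print(digraphic_ioc("ABABABCD"))  # 0.5
```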

You can also use IOC to help identify ciphers. An issue of The Cryptogram last year had a table of expected test values for various ciphers, including the expected monographic and digraphic IOCs.