The "spread" Cipher; a cipher I'm playing with
Topic Started: Mar 30 2007, 03:49 PM (421 Views)
Donald
Elite member
[ *  *  *  *  * ]
The "SPREAD" cipher. There's nothing new about this; I'm certain someone else has done it before. I'm just having fun playing around.

In this cipher, we use a frequency-ordered alphabet. (Since there are several, you must agree ahead of time on which version you will use.) I'm picking this one:
Code:
 

E:12.4%   I: 6.7%   L: 3.6%   F: 2.2%   V: 0.8%   Z: 0.0%
T: 8.9%   H: 6.5%   U: 2.7%   G: 2.0%   K: 0.7%
A: 8.0%   S: 6.2%   M: 2.5%   Y: 2.0%   Q: 0.1%
O: 7.6%   R: 6.1%   W: 2.3%   P: 1.6%   X: 0.1%
N: 7.0%   D: 4.6%   C: 2.2%   B: 1.3%   J: 0.1%


Now then, we need a passphrase of 30 letters or more (drop anything after the 30th letter). I'll use:

"i adore homemade blueberry muffins"

Turn this passphrase into 3 sets of the digits 0-9 in pseudo-randomized order: split it into three blocks of 10 letters, then number the letters in each block in alphabetical order, numbering repeated letters from left to right:
Code:
 

5017924863
iadorehome

8034179526
madebluebe

5693801247
rrymuffins
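
The numbering step above can be sketched in Python. This is only an illustration, not part of the original description: the function names `number_block` and `key_rows` are made up here, and I'm assuming spaces and punctuation in the passphrase are simply dropped.

```python
def number_block(block):
    """Rank the letters of one 10-letter block 0-9 in alphabetical order,
    numbering repeated letters from left to right."""
    # sorted() is stable, so equal letters keep their left-to-right order
    order = sorted(range(len(block)), key=lambda i: block[i])
    digits = [0] * len(block)
    for rank, pos in enumerate(order):
        digits[pos] = rank
    return "".join(str(d) for d in digits)


def key_rows(passphrase):
    """Keep the first 30 letters and number them in three blocks of 10."""
    letters = [c for c in passphrase.lower() if c.isalpha()][:30]
    if len(letters) < 30:
        raise ValueError("passphrase needs at least 30 letters")
    return [number_block("".join(letters[i:i + 10])) for i in (0, 10, 20)]


print(key_rows("i adore homemade blueberry muffins"))
# ['5017924863', '8034179526', '5693801247']
```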


Now we combine the number sets and the freq ordered alphabet like this:
Code:
 

5 0 1 7 9 2 4 8 6 3
E T A O N I H > > >     FIRST ROW

8 0 3 4 1 7 9 5 2 6
S R D L U M W C F >     SECOND ROW

5 6 9 3 8 0 1 2 4 7
G Y P B V K Q X J Z     THIRD ROW
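
A rough Python sketch of how the three rows line up with the frequency alphabet. The name `build_table` and the dictionary layout are my own choices for illustration; the only things taken from the post are the alphabet order, the three number sets, and the positions of the `>` shift markers (last three slots of the first row, last slot of the second).

```python
ALPHABET = "ETAONIHSRDLUMWCFGYPBVKQXJZ"  # frequency order, read down the columns of the table
ROWS = ["5017924863", "8034179526", "5693801247"]  # from the example passphrase


def build_table(rows, alphabet=ALPHABET):
    """Return letter->digit maps for each row plus the shift digits."""
    row1 = dict(zip(alphabet[:7], rows[0]))    # 7 most frequent letters
    shifts1 = rows[0][7:]                      # 3 first-row shift digits
    row2 = dict(zip(alphabet[7:16], rows[1]))  # next 9 letters
    shift2 = rows[1][9]                        # single second-row shift digit
    row3 = dict(zip(alphabet[16:], rows[2]))   # last 10 letters
    return row1, shifts1, row2, shift2, row3


row1, shifts1, row2, shift2, row3 = build_table(ROWS)
print(shifts1, shift2)  # 863 6
print(row1["E"], row2["S"], row3["G"])  # 5 8 5
```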


The encryption process is fairly simple. Any letter on the first row (the highest frequency letters) is represented by the single number above it.
So E becomes 5. N becomes 9, etc.

BUT, if we find the letter on the SECOND row, then we use any one of the three "SHIFT" numbers from the first row (8, 6 or 3 in this example) followed by the number above that letter in the second row.
So S becomes 88, 68 or 38, and R becomes 80, 60 or 30.

For the letters in the third row (the least frequent), we need TWO shifts: any one of the shift numbers from the first row, then the shift number from the second row (6 in this example), followed by the third row number above the letter in question.

So G becomes 865, 665 or 365, and P becomes 869, 669 or 369.

An example of encrypting a message:
Code:
 

t  h  e  s  e  c  r  e  t  m  e  s  s  a  g   e  <-plain
0  4  5  88 5  35 60 5  0  7  5  38 68 1  865 5  <-crypt


Resulting in the final encrypted message of 04588535605075386818655
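
For anyone who wants to try the encryption rules by machine, here is a minimal sketch. It assumes the example key rows from this post and drops anything that isn't a letter; the `pick` parameter is something I've added so the choice among the three first-row shifts can be made random (the default) or fixed for testing.

```python
import random

ALPHABET = "ETAONIHSRDLUMWCFGYPBVKQXJZ"  # frequency order used in this post
ROWS = ["5017924863", "8034179526", "5693801247"]  # example key rows


def encrypt(plaintext, rows=ROWS, pick=random.choice):
    """Encrypt letters only; `pick` chooses one of the first-row shift digits."""
    r1, r2, r3 = rows
    shifts1, shift2 = r1[7:], r2[9]
    out = []
    for ch in plaintext.upper():
        i = ALPHABET.find(ch)
        if i < 0:
            continue  # skip spaces and punctuation
        if i < 7:     # first row: a single digit
            out.append(r1[i])
        elif i < 16:  # second row: first-row shift + digit
            out.append(pick(shifts1) + r2[i - 7])
        else:         # third row: first-row shift + second-row shift + digit
            out.append(pick(shifts1) + shift2 + r3[i - 16])
    return "".join(out)


# Always taking the first shift digit (8) makes the output repeatable.
print(encrypt("send help", pick=lambda s: s[0]))  # 8859834584869
```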

The decryption process is simply the reverse of the encryption process.

If the number translates to a letter on the first row, we decrypt to that letter.
If the number translates to a SHIFT on the first row, we skip it and check the 2nd number on the second row.
If the 2nd number translates to a letter on the 2nd row we decrypt to that letter, but if it translates to the SHIFT on the second row, we skip it and check the 3rd number on the third row.
If you get to the 3rd number, it will always decrypt to a letter on the third row.

so to decrypt the message 3859834584369 we would proceed as follows:

3 is a first row shift, so we skip it and check 8 on the second row.
on the second row, 8 becomes S. so 38=S

the next number is 5, which equals E on the first row. 5=e

9=N on the first row

8 is a first row shift, followed by 3 which equals D on the second row.

So far we have:
Code:
 

38 5 9 83 4584369
s  e n d


We continue with 4=H, 5=E, 8=Shift, 4=L, 3=Shift, 6=Shift, 9=P. For a final result of:
Code:
 

38 5 9 83 4 5 84 369
s  e n d  h e l  p


Tada!
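
The decryption walk-through above amounts to a small three-state machine, which might look like this in Python. Again this is a sketch built on the example key rows, with names I picked for illustration; it does no error handling for a message that ends in the middle of a shift.

```python
ALPHABET = "ETAONIHSRDLUMWCFGYPBVKQXJZ"  # frequency order used in this post
ROWS = ["5017924863", "8034179526", "5693801247"]  # example key rows


def decrypt(digits, rows=ROWS):
    """Walk the digit string, tracking which row the next digit belongs to."""
    r1, r2, r3 = rows
    row1 = dict(zip(r1[:7], ALPHABET[:7]))    # digit -> letter, first row
    row2 = dict(zip(r2[:9], ALPHABET[7:16]))  # digit -> letter, second row
    row3 = dict(zip(r3, ALPHABET[16:]))       # digit -> letter, third row
    out, level = [], 1
    for d in digits:
        if level == 1:
            if d in row1:
                out.append(row1[d])
            else:
                level = 2  # first-row shift: look at the second row next
        elif level == 2:
            if d in row2:
                out.append(row2[d])
                level = 1
            else:
                level = 3  # second-row shift: look at the third row next
        else:
            out.append(row3[d])  # third row always yields a letter
            level = 1
    return "".join(out)


print(decrypt("3859834584369"))  # SENDHELP
```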

Now then, I can certainly see some weaknesses in this cipher right off.
First: the encrypted message is longer than the plain text. That's minor, but a disadvantage.
Second: it's probably going to be possible to identify the shift numbers.
Third: the frequency analysis will be all skewed around, but it will NOT be useless.
Fourth: it's likely to be more subject to errors than a standard cipher, and will require more work to recover from them.

What do you folks think?
 
jdege
Elite member
[ *  *  *  *  * ]
Interesting how folks can take old ideas and use them in a slightly new way.

It has a lot of similarity to a straddling checkerboard, particularly its method of providing shorter cipher tokens for high-frequency plaintext letters.

Biggest problem is that what you've described doesn't have a keying mechanism. How would you build an alphabet out of a keyword, while retaining the frequency characteristics? It could be done, but you've not addressed it.

How limited would the key space be, if you restrict it so as to keep the frequency behavior?
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
 
Donald
Elite member
[ *  *  *  *  * ]
the keyspace is the randomization of the index numbers.
 
jdege
Elite member
[ *  *  *  *  * ]
Donald
Mar 30 2007, 07:42 PM
the keyspace is the randomization of the index numbers.

Sorry. I saw that when I read it. I don't know how I managed to forget it when I started replying.

Have you done any analysis of this? Done frequency counts, contact charts, etc., to see if any recognizable patterns show up?

I'm thinking about cribs. Suppose you knew (or guessed) that a message contained a certain word or phrase. What could you do with it?

Code:
 

t   h   e   s   e   c   r   e   t   m   e   s   s   a   g   e   <-plain
0   4   5   88  5   35  60  5   0   7   5   38  68  1   865 5   <-crypt


There are only 720 possible first-level shifts. Trying them all would not be impossible. Build a regex from each, and search for a match. If on a certain pass, s1=3, s2=6, and s3=8, then the letters that were encrypted from the first line would match [0124579], the letters that were encrypted from the second line would match [368][0-9], and the letters that were encrypted from the third would match [368][0-9][0-9].

So if your crib was the word "secret", one of the regexes you'd be using would be

Code:
 
[368][0-9][0124579][368][0-9][368][0-9][0124579][0124579]


Which, surprisingly enough, matches the ciphertext in the proper place.

Code:
 

04588535605075386818655
   ^^^^^^^^^          


My guess is that false positives would be fairly rare.
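
Here is a rough sketch of that crib search in Python. It assumes the attacker knows the frequency alphabet (so knows which row each crib letter falls in) but not the digits; `crib_regex` and `crib_hits` are names made up for this example. It loops over unordered sets of three shift digits, since the character class [368] is the same whatever order the shifts are tried in, and it uses a lookahead so overlapping matches are reported too.

```python
import re
from itertools import combinations

ROW1 = set("ETAONIH")    # first-row letters: one digit each
ROW2 = set("SRDLUMWCF")  # second-row letters: shift + digit


def crib_regex(crib, shifts):
    """Regex for `crib` assuming `shifts` are the three first-row shift digits."""
    s = "".join(sorted(shifts))
    plain = "".join(d for d in "0123456789" if d not in shifts)
    parts = []
    for ch in crib.upper():
        if ch in ROW1:
            parts.append(f"[{plain}]")
        elif ch in ROW2:
            parts.append(f"[{s}][0-9]")
        else:  # third row; the second-level shift digit is unknown
            parts.append(f"[{s}][0-9][0-9]")
    return "".join(parts)


def crib_hits(ciphertext, crib):
    """Map each candidate shift set to the positions where the crib fits."""
    hits = {}
    for shifts in combinations("0123456789", 3):
        pattern = re.compile("(?=(" + crib_regex(crib, shifts) + "))")
        positions = [m.start() for m in pattern.finditer(ciphertext)]
        if positions:
            hits["".join(shifts)] = positions
    return hits


print(crib_regex("secret", "368"))
# [368][0-9][0124579][368][0-9][368][0-9][0124579][0124579]
print(crib_hits("04588535605075386818655", "secret").get("368"))  # [3]
```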

What I'd try? Get a fairly short list of the most commonly used longer words, and run each through its paces: try the 720 possible shift combinations. You'll get a lot of matches, some true positive, some false. The true positive matches would all use the same combination of shift characters. The false positive matches would not. So the combination of shift characters that appeared most frequently would probably be the right set. If it were, as I'd suspect it to be, considerably more frequent than the second-place combination, I'd say you'd found them.

Alternatively, you could sort the prospective matches by the characters they match. Your example uses e=>5. So we'd expect the true positive matches to all use e=>5, and the false positives to be near-random.

And from there, you'd have the first-level shift characters, and the second-level shift character, and most of the most frequently-used letters.

You could do another pass through your dictionary with regexes constructed around what you do know, replacing unknown second-level letters with [SRDLUMWCF] and third-level with [GYPBVKQXJZ], but in truth, I doubt it'd be necessary. You should have enough that what's left would fall out by eye.


(BTW - the 'm' in "message" should have been encrypted as "[368]7", not as '7' - 'm' is in the second row.)
 
Donald
Elite member
[ *  *  *  *  * ]
"Jedge"
 
Have you done any analysis of this? Done frequency counts, contact charts, etc., to see if any recognizable patterns show up?

Just starting to play with it. I'll try to have some analysis up soon, but feel free to announce anything else you find in the meantime!

"Jedge"
 
There are only 720 possible first-level shifts. Trying them all would not be impossible. Build a regex from each

Very interesting approach! By limiting your attack to the shifts, you make the job much smaller.

"Jedge"
 
(BTW - the 'm' in "message" should have been encrypted as "[368]7", not as '7' - 'm' is in the second row.)

UGH! You are absolutely correct. Like I said, very vulnerable to errors. :)
 
jdege
Elite member
[ *  *  *  *  * ]
Donald
Mar 30 2007, 11:10 PM
"Jedge"
 
There are only 720 possible first-level shifts. Trying them all would not be impossible. Build a regex from each

Very interesting approach! By limiting your attack to the shifts, you make the job much smaller.

I've been playing with this, a bit.

Only 120 possible first-level shifts - order doesn't matter.
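
A quick check of the two counts, using Python's standard `math` module (the variable names are just for illustration): 720 is the number of ordered picks of three digits out of ten, and 120 is the number of unordered sets.

```python
from math import comb, perm

ordered = perm(10, 3)    # 10 * 9 * 8, order matters
unordered = comb(10, 3)  # same picks with the 3! orderings collapsed

print(ordered, unordered)  # 720 120
```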

But I've been seeing far more matches than I thought I would - enough that the proper match does not stand out, simply based on the count of number of matches for each attempted set of shifts.
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
I went back at it again, since my first approach didn't work.

For the first try, I threw together a script that would encrypt according to a random key, choosing among the three first-level shifts randomly. I used this and the Unix fortune program to create a text to play with.

It seemed to me that the first-level shifts + second level shift would show up high in a digraph frequency count. And they did.

I did a digraph frequency count, and sorted first by the second digit, then by the count. There are ten second digits, so ten sets of digraph frequencies. Of those ten, some could be eliminated because one of the high frequency pairs was a double. Others had one pair with significantly higher frequency than the rest. Only one had three pairs with nearly equal frequency, none of which were doubles.
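
The counting step might look something like the sketch below. This is not the script I actually used, just a minimal illustration: it counts overlapping digit pairs, groups them by second digit, and lists each group by descending count. The function name and the `top` parameter are made up for the example, and picking out the group with three near-equal, non-double leaders is still left to the eye.

```python
from collections import Counter, defaultdict


def digraphs_by_second_digit(ciphertext, top=3):
    """Count overlapping digit pairs and group them by their second digit."""
    counts = Counter(ciphertext[i:i + 2] for i in range(len(ciphertext) - 1))
    groups = defaultdict(list)
    # most_common() yields pairs in descending order of count
    for pair, n in counts.most_common():
        groups[pair[1]].append((pair, n))
    # keep only the `top` most frequent pairs for each second digit
    return {d: pairs[:top] for d, pairs in sorted(groups.items())}


for digit, pairs in digraphs_by_second_digit("04588535605075386818655").items():
    print(digit, pairs)
```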

So I assumed that these were the shift digits, and used them to convert the ciphertext into a monoalphabetic cipher, then did a mono frequency count on that, and had what looked very much like an ordinary language distribution.

So I went back to look at the plaintext (which I'd not looked at, up to then) and discovered that these were, in fact, the shift characters.

Thinking about it again, I'm not sure all of this is necessary. Simply take each possible set of shift characters (1200 by my count: 120 first-level sets times 10 candidates for the second-level shift), use it to convert the ciphertext to a monoalphabetic one, and compute the index of coincidence (IC) on each. Odds are only a very few will look like natural language.

For that matter, running all 1200 through a monoalphabet cracker wouldn't take all that long. So I'd run through them all, in order of descending IC.
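
A sketch of the brute-force idea, under my own assumptions about the details: each candidate is a set of three first-level shifts plus one second-level shift, the ciphertext is collapsed to (row, digit) symbols so that the three spellings of a letter become one symbol, and the candidates are ranked by IC. All the function names are hypothetical, and a real run needs far more text than a short sample before the IC means anything.

```python
from collections import Counter
from itertools import combinations


def to_mono(ciphertext, shifts1, shift2):
    """Collapse the digit stream to (row, digit) symbols for one candidate key."""
    symbols, level = [], 1
    for d in ciphertext:
        if level == 1 and d in shifts1:
            level = 2  # first-level shift: next digit is on row 2 or a second shift
        elif level == 2 and d == shift2:
            level = 3  # second-level shift: next digit is on row 3
        else:
            symbols.append((level, d))
            level = 1
    return symbols


def index_of_coincidence(symbols):
    """Probability that two symbols drawn without replacement are equal."""
    n = len(symbols)
    if n < 2:
        return 0.0
    counts = Counter(symbols)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def rank_candidates(ciphertext):
    """Score all 120 * 10 = 1200 candidate shift sets, highest IC first."""
    scored = []
    for shifts1 in combinations("0123456789", 3):
        for shift2 in "0123456789":
            ic = index_of_coincidence(to_mono(ciphertext, shifts1, shift2))
            scored.append((ic, "".join(shifts1), shift2))
    scored.sort(reverse=True)
    return scored


print(len(rank_candidates("04588535605075386818655")))  # 1200
```

From there, the top few candidates would be handed to whatever monoalphabetic solver you prefer.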
 
jdege
Elite member
[ *  *  *  *  * ]
You know, I hadn't intended to kill the conversation.

Where did everybody go?

 
Donald
Elite member
[ *  *  *  *  * ]
Sorry! My load at work suddenly exploded. I'll be getting back to this just as soon as I have the time!
 
insecure
Elite member
[ *  *  *  *  * ]
Donald
Apr 13 2007, 02:36 AM
Sorry! My load at work suddenly exploded. I'll be getting back to this just as soon as I have the time!

I've been kinda busy too. It doesn't help that, whenever Galeon crashes and I start it again, it asks whether I want to restore the previous state, and sometimes I answer "no". When I do so, obviously it no longer shows a tab for this forum, and I ... er... well, sorry, but I keep forgetting it exists!

Fortunately, I have a link to it on my home page, so I periodically notice it and re-remember.
 