| Welcome to Crypto. We hope you enjoy your visit. You're currently viewing our forum as a guest. This means you are limited to certain areas of the board and there are some features you can't use. If you join our community, you'll be able to access member-only sections, and use many member-only features such as customizing your profile, sending personal messages, and voting in polls. Registration is simple, fast, and completely free. Join our community! If you're already a member please log in to your account to access all of our features: |
| My New Cipher; Quite Interesting!! | |
|---|---|
| Topic Started: Apr 15 2008, 07:17 AM (899 Views) | |
| jdege | Apr 22 2008, 12:44 PM Post #16 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
The standard Vig is made by shifting the standard alphabet. PLaintext letter is across the top, keyword letter down the side. the intersection is the ciphertext letter. A mixed Vig uses a mixed alphabet: I'm sure you're familiar with frequency analysis - the way the uneven distributions of letters in normal text shows through a simple substitution, and makes cracking the cipher simple. When you use multiple substitution alphabets, those distributions are smeared - the highs are lower, the lows are higher. The more alphabets you use, the flatter the distribution.
The IC is a simple statistical measure of the uneveness of a frequency distribution. It's calculated from the frequency counts, Wikipedia has a good description of it, under the name Index of coincidence. What's important to understand about it is that it gives you a fast way of telling when you have text that has the same sort of distribution as normal text. Think about someone who was trying to crack your cipher. The first thing they'd do is try to strip off your additive sequences. So they write a little program that generates different additive sequences, and subtracts each of them from your ciphertext. What good does that do them? Quite a lot, if they have a fast way to test whether the result is correct. Your substitution pass, since it uses only two alphabets, has a distribution that is very different from random. So when his program tries to subtract the correct additive sequence, he's going to see that he has the correct one.
First, don't always use the same substitution. Make the substitution process part of the system that is determined by the key. Second, use more than two alphabets. Preferably, let the precise number of alphabets you use be determined by the key, as well. That way they can't just assume that there will only be two, three, or five alphabets, but will have to try every possible length. I'd suggest at least five. Some people like mixed Vig. I'm no great fan. It's easier to work with the standard table. Actually, I find it fastest to convert to numbers, add, then convert back. I can quickly scribble out a conversion table to speed the process: Plaintext 's' = 18, keyword 'k' = 10, sum is 28 = 2 = 'C'. And using a mixed alphabet doesn't help in hiding frequency distributions. A five-alphabet standard Vig and a five-alphabet mixed-Vig have the same statistical properties. So in your cipher, using a mixed Vig wouldn't make it harder for your attacker to determine that he has subtracted the right additive. Now in your cipher, you were reversing the text after applying the substitution, but before applying the additive sequences. This is a simple transposition, but it was always the same transposition, which doesn't increase security. You could do a more complicated transposition here, but the truth is that transposition doesn't change the statistical distribution of the text, so it doesn't make it harder to guess the correct additive. So I'll suggest that you not do a transposition here, but that you might consider adding a transposition pass after you've added your additive sequences. I'd planned on talking about your additives next, but this is already going on too long. Later
|
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
![]() |
|
| Paarth Dave | Apr 24 2008, 06:28 AM Post #17 |
|
Advanced Member
![]() ![]() ![]() ![]() ![]()
|
Thanks a lot for your valuable suggestions.
Do you mean that I swap the indices of alternate numbers after I have added my sequence? |
|
Cryptography Vanquished.... | |
![]() |
|
| jdege | Apr 24 2008, 12:15 PM Post #18 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
Swapping the indices within each number is just another substitution. What I meant was your reversal of the entire message. Changing the order of the letters is a transposition, and a good transposition can be a very effective part of a cipher. Of course, simply reversing the order of the letters isn't a good transposition. Not just because it's simple to recognize and reverse, but also because it's the same transposition every time, so that once someone figures out that that is what you're doing, they'll be able to reverse it in every other message, even after key changes. Remember what I said about piling substitution upon substitution, and how it doesn't matter how many you apply in succession, they don't increase security? That does not hold true for applying different kinds of ciphers. What you had was a substitution followed by a transposition (reversing the letters) followed by an additive sequence. In order to break the cipher, a cryptanalyst has to break each in order. And one of the most important parts of doing them well is to have the output of each stage look as much like random text as possible, so that it's hard for him to be sure he's done it correctly. My point about using a transposition prior to the additive is that transposition doesn't change letter frequencies. So if the cryptanalyst is attempting to remove the additive sequence by making guesses and checking to see if the result has a frequency distribution that looks like the result of a substitution cipher the transposition cipher won't help. Think about a simpler form of your cipher, in which you just did one substitution, A=1, Z=26, followed by adding a sequence. In your original message, 12% of your letters will be E. After the substitution 12% of your letters will be 5. After you add your sequence, each of those 5's will be added to a different number, so in the final result you won't have anything that stands out. But... When the cryptanalyst starts trying out different additive sequences, subtracting each from your ciphertext, what does he see? When he's guessed the right one, he sees that 12% of your letters are 5. So he can tell he's guessed the right one. Now suppose you try to make things more difficult by applying a transposition to your letters before you apply your additive. What does the cryptanalyst see after subtracting a trial additive sequence? Exactly what he did before - 12% of your letters are 5. Transpositions don't change letter frequencies. So it's just as easy for him to subtract the additive as it was before. And what happens if you apply a second substitution? Changing all of those 5's into 50's? When he subtracts the correct trial additive sequence, he'll see that 12% of your letters are 50. If you apply a second substitution and a transposition, he'll still see 12% of your letters are 50. That's why I was stressing, in my earlier posts, the importance of using multiple alphabets in your substitution phase. If you use two alphabets, half those Es will be changed to one value and half to another, and instead of having one letter making up 12% of your text, you'll have two letters making 6 or 7% each. If you're using ten alphabets, you'll have an almost entirely flat distribution, and the result of subtracting the correct additive won't look much different from subtracting an incorrect one. So, that said, on to the second problem, and that has to do with the way you're adding numbers. Have you ever thought about the distribution of the result of adding two numbers, when each of the addends is chosen from a fixed range? Think about a pair of dice. Each can have a range from 1-6. The sum can have a range from 2-12. How much can you tell about the separate values of the addends, just from the sum? Quite a lot, sometimes. If the sum is 2, you know both addends were 1. If the sum is 3, one of the addends is 1 and the other is 2. If the sum is four, the addends were either 1+3 or 2+2. It's only when the sum is 7 that you have no information about what the addends are. Consider, as an alternative, what happens when you add two dice mod-6. The result will range from 1-6, but it will be a flat distribution. If the sum is 2, it might be 1+1, 2+6, 3+5, 4+4, 5+3, or 6+2,. Every sum will have six possible pairs of addends. The same thing happens when you add letters. If you add them flat, using normal addition, the resulting distribution is peaked. If A = 1 and Z = 26, then a result of 2 means A+A and 52 means 26+26. Where if you add mod-26. a result of 2 could be anything. It's for this reason why the classical ciphers always did their math mod-26 (or mod however many letters there were in their alphabet). So my second suggestion is that you do your additions mod-26. That you limit yourself to substitutions that result in values between 1 and 26 and that you add your additives mod-26. |
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
![]() |
|
| Paarth Dave | Apr 27 2008, 08:50 AM Post #19 |
|
Advanced Member
![]() ![]() ![]() ![]() ![]()
|
What if I do mod-26 and the value goes in negative? Then what? |
|
Cryptography Vanquished.... | |
![]() |
|
| jdege | Apr 27 2008, 01:20 PM Post #20 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
Add 26.
It's pretty simple. You simply add or subtract multiples of 26 until you're in the range of 0-25. You can assign A=0 and Z=25, if you like, or A=1, in which case Z=26, which is equivalent to saying Z=0, and Y=25. You'll see people do it either way. It makes no difference so long as you're consistent. One of the neat things about addition mod some base is that if one of the addends has a flat probability distribution, then the sum has a flat probability distribution. Consider, for example, a real simple alphabet, with only two symbols, 0 and 1. If in your plaintext, 0 appears 75% of the time and 1 only 25%, but in your key 0 appears 50% and 1 appears 50%, then in a long text you'll see the following: And you'll notice that you have a sum of 0 exactly half the time. Adding a key stream with a random distribution hides all frequency information. In fact, if your key stream is perfectly random, and you never use the same one twice, you have a system called a "one time pad", and it's theoretically unbreakble. (The NSA managed limited breaks into the Soviet Venona documents only because the office that generated the random pads couldn't keep up with demand and sent out pads using duplicate pages.) I don't know if you were aware of this, but this mechanism you used, of encoding once and then encoding again using an additive stream, has a long history. It was the primary diplomatic system used around the world from the late 1800s until WWII. What they were using were codes, not ciphers. They had large books containing code words, each of which substituted for a name, a word, or a phrase. The problem with these code books is that they were expensive to reproduce, and so had to be used for extensive periods of time - years or decades. Which was easily long enough for them to be used in enough traffic for them to be broken. So they used what was called a super-encipherment. An additive sequence to encrypt the output of the code. Very much like your additive sequences in concept, though they made an effort to make their additive sequences more random, which yours isn't. What the Soviets were supposed to be doing was to ship pads with pages of random numbers, to be used for superencipherment. A message would be encoded, then superenciphered by adding the numbers from one of the pages. The number of the page would be added to the message, and the page would be thrown away, never to be used again. The recipient would fnd the other copy of the page and subtract the numbers, then decode the result. No page would ever be used twice. You'll notice that this means that every sender and recipient would have to share their own unique pad. If you have 10 stations, you'd need 2 copies each of 90 pads, if you have 100 stations, you'd need 2 copies of 9900 pads. Back before the days of xerox machines, this was an enormous amount of work. So most simply printed a pamphlet of additives, and sent to everybody. They'd define a mechanism by which a sender would pick a random starting point, and would communicate that to the recipient, and they'd keep track of how much traffic was using the pamphlet, and would send out a new set when they thought that there had been enough repeats that the enemy would have started to figure out what the additives were. In actual practice, most of the organizations using these systems didn't send out new pamphlets of additives often enough, and most of these systems were cracked. |
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
![]() |
|
| Paarth Dave | Apr 28 2008, 05:14 AM Post #21 |
|
Advanced Member
![]() ![]() ![]() ![]() ![]()
|
Are these one-time pads still unbreakable? I feel that this concept can only be achieved with the plaintext and the keyword being in perfect alignment. How are these combinations made and are there any existing ways of breaking such one-time pads? If there aren't, then maybe there won't be discovered even in the future... |
|
Cryptography Vanquished.... | |
![]() |
|
| Revelation | Apr 28 2008, 08:47 AM Post #22 |
|
Administrator
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
A one-time pad is uncrackable, almost per definition. If you've got a fully random key with the same size as the plaintext and you add them, you'll get a ciphertext you can't crack. Think about it: you can't analyse anything, because each letter in the key is random. The reason why one-time pads aren't used often is the enormous keylength for large texts. The only way to crack a OTP is if the key is used twice (or more). That's called a two-time pad. Now you can analyse. There's a TTP challange somewhere in this forum. |
|
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN | |
![]() |
|
| jdege | Apr 28 2008, 12:21 PM Post #23 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
Because the key is the same size as the message, for every possible message there is a key that will generate it. There is no way to determine which message is the correct one. No one has ever tried to use it on a large scale except for the Soviets. For every message sent, a key of random numbers has to be generated, two copies made, and one copy sent to the sender and the other to the recipient. It's an enormous amount of work, and it's no wonder that the Soviet office responsible for generating the pads sent out duplicates. But as secure a system as it is, when pads aren't reused, it provides no security at all if it is reused. Remember what I was saying about how the distribution looked random, if one of the addends was random? That provides an easy test to see if a pad has been used twice - simply subtract two messages from each other. Suppose you have two ciphertexts, C1 = P1 + K1, and C2 = P2 + K2. You know C1 and C2, you don't know P1, P2, K1, or K2, but you suspect that K1 and K2 might be equal. Calculate C1 - C2. This is equal to (P1 - P2) + (K1 - K2). If K1 and K2 are different random streams, the difference between them is random, and thus C1 - C2 will look random, It will have a flat distribution. If K1 = K2, then K1 - K2 = 0, and C1 - C2 = P1 - P2. P1 and P2 are different, perhaps unrelated plaintexts. Neither will have a distribution that is anything like random, and the difference between them will be very far from random. So you have a fast, easy test to determine whether a pad was ever used twice. You just need to subtract every message from every other message. Tedious, but simple. Easy work for a computer - and it's exactly this work that the first computers were designed and built for. As for cracking the code, extracting the two plaintext messages from the difference between them, there's a pretty good explanation on wikipedia: http://en.wikipedia.org/wiki/Autokey_cipher The next question is, if distributing random pads is so much work, why not just use random number generators? Algorithms that generate streams of random-looking numbers? The short answer is that random-looking isn't random. The long answer is too long for me to enter, right now. |
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
![]() |
|
| jdege | Apr 29 2008, 12:48 AM Post #24 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
So, to get back to the question. Let's start by looking at your original cipher. Suppose your evesdropper (let's call her "Eve") knows everything about your cipher except for the key you used. Which means she knows everything except the amount by which your two additive sequences increment. Think about what each individual value of the ciphertext represents: , where Cn is the n'th character of the even or odd set of ciphertext characters, Pn is the n'th character of the even or odd set of plaintext characters, encoded the way you encode the even or odd characters, and I is the increment you use for the even or odd characters. What do you get when you subtract an even or odd character from the next even or odd character? You get the difference between two plaintext characters plus I. And what happens when you calculate an average of these differences? Calculate the difference for each succeeding pair of ciphertext letters? The average of the difference between succeeding pairs of plaintext letters will be zero, leaving you with I. Do this once for the even characters, and once for the odd characters, and you've recovered the key. If the numbers had been added mod-26, this would not have worked. And if the sequence wasn't just sequential, it'd be more difficult. Suppose we used an additive sequence that looked more like a sequence of random numbers, and added them mod-26? The most commonly used pseudo-random number generator in computers is linear congruential. This works by taking each number, multiplying it by one number, adding another, and doing it all modulo some third number. The trick is that the number that you multiply by must be relatively prime to the modulus. The first number in the sequence is called the "seed", and is supplied the user. Let's do it mod-26, using a multiplicand of 5, an addend of 6, and a seed of 11.
There's one obvious problem with using this as a key - there aren't many key choices. There are only 12 numbers relatively prime to 26, mod 26 - the odd numbers other than 13. And the addend cannot be 0, so you have only 25 choices there, and you have 26 possible seeds. Which gives you a total of 7800 possible keys. Far too few. But there's another problem. It only takes one lucky guess to identify the entire sequence. Suppose Eve has intercepted a message you sent your Uncle Henry. She might guess that you've started your message with "DEAR". So she can subtract the encoding for "DEAR" from the first characters of the ciphertext, and that will give her a section of your additve sequence. The problem is that there are only four numbers involved in generating the sequence. If she's correctly guessed four values of the sequence, she has four equations in four unknowns, and calculating the four values is simple algebra, So what to do, if you want to use something like a random additive sequence, without being restricted to llists of previously generated random numbers? You need a pseudo-random number generator that has far more values involved in calculating each number, both so as to provide a larger key space, and to increase the size of a correct guess needed to extract those values. There are common tetragrams that are nearly certain to appear in any English language text - "tion", 'even', 'teen', 'enty', etc. Try each of these against each position in the ciphertext, calculate the variables of your PRNG, and the code is broken. There is only one method for generating random numbers that I am aware of that can be done by hand that has any chance at all of holding up to these techniques. Lagged Fibonacci. As used in the Soviet VIC cipher. More on that next. |
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
![]() |
|
| Paarth Dave | Apr 29 2008, 06:09 AM Post #25 |
|
Advanced Member
![]() ![]() ![]() ![]() ![]()
|
Good. Go on... |
|
Cryptography Vanquished.... | |
![]() |
|
| jdege | Apr 30 2008, 03:26 AM Post #26 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
OK - lagged fibonacci sequences. Most folks are familiar with the standard fibonacci sequence - a sequence of numbers in which every number is the sum of the two immediately prior numbers. X[n] = X[n-1] + X[n-2] Most commonly starting off with 0 and 1, the sequence runs:
Lagged fibonacci sequences are the same, except that every number is the sum of two prior numbers other than the two immediately prior numbers. X[n] = X[n-7] + X[n-10], for example. This has long been used as a pseudo-random number generator, using mudulo addition, and for certain "lags" it produces a pretty high-quality one. (Donald Knuth's "Art of Computer Programming" includes an extensive discussion of the use of these sequences as PRNG, including which lags provide the best-quality.) The Soviet's cold-war-era VIC cipher used a lagged fibonacci sequence with lags of 9 and 10. They started with ten digits, and then added the 1st and 2nd to create the 11th, the 2nd and 3rd to create the 12th, etc.: Continuing this creates a stream of random-looking digits for as long as you care to continue the process. As a keystream it has a significant advantage over the linear congruential generator we used earler. It's sequence is dependent upon the a number of characters equal to the lag. If the lag is 10, there are ten constants that define the sequence, as opposed to the three in the linear congruential. Or if that's not enough, use a lag of 20. And it's this that I thought of, when I was looking at your additive streams. Your streams are predictable. Replacing them with a lagged fibonacci sequence would make them very difficult to predict, and thus very much harder to break. Of course, you have the problem of creating rules that would generate a sequence that was suitable for your problem. For example, if you're going to be doing your addition mod-26. you'd probably want to generate your sequence mod-26. And you have to figure a way of generating the starting numbers. You'd probably want something that used a keyword:
A lagged fibonacci isn't really a cryptographically strong PRNG. It's simply the closest to one that I'm aware of that can be easily generated by hand. By itself, it could be broken, given enough ciphertext. But - if it's imposed on top of another system, like the Vig we started off talking about, it would likely be harder to crack. And if you imposed a transposition on it, as your third step, scrambling the relationship between the lagged characters and the number that was generated from them, I'm thinking you might have a pretty solid system. Certainly not one I'd know how to break. (Not that that's saying all that much.) Unfortunately, it's not one that would be very useful, because the greatest weakness of these sorts of ciphers is their lack of ability to recover from errors. Make a mistake, and odds are that your entire message is going to be unreadable. And there's simply no way anyone could make it through a non-trivial message without making a mistake. Still, it's an interesting intellectual exercise. |
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
![]() |
|
| Paarth Dave | Apr 30 2008, 06:55 AM Post #27 |
|
Advanced Member
![]() ![]() ![]() ![]() ![]()
|
How do you decrypt Lagged Fibonacci? Taking the last example you took. After Lagged Fibonacci, the ciphertext looked like
And the link that you provided about the Autokey cipher article in Wikipedia some posts ago..Remember? The cryptanalysis of this cipher shows that decrypting an Autokey cipher without a keyword requires guessing of some words and then choosing of possible digraphs and trigraphs....I was thinking about why not apply Caesar shift to the plaintext and then use Autokey? How would it be like? I tried it out myself and the chances of finding possible digraphs & trigraphs is almost null. |
|
Cryptography Vanquished.... | |
![]() |
|
| jdege | Apr 30 2008, 12:02 PM Post #28 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
You recreate the sequence, by starting with the same set of numbers, and then subtract them from the cyphertext, the way you'd added them to the plaintext to create the ciphertext in the first place. Supposing your lagged fibonacci sequence starts out "C Y J Q" and your message starts out "send". Encryption would be and decryption would be
If Eve knows you're doing a Caesar shift, all she needs do is to subtract out each of the 25 possible Caesar shifts, then try the Autokey cryptanalysis on the 25 results. So suppose you do a substitution cipher, then an autokey? She could subtract the ciphertext from itself, shifted by various amounts. When she had the right shift, she'd be left with just the substitution cipher, which would have a frequency distribution that looked like plain text. That's the general solution for breaking mutli-step ciphers - if you can distinguish the results of successfully reversing the last step, you can - at worst - do a exhaustive search. That's what we've been talking about over in the ADFGVX thread. |
| When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl. | |
![]() |
|
| Donald | Apr 30 2008, 05:42 PM Post #29 |
|
Elite member
![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
cool, nice info. thanks! |
![]() |
|
| Paarth Dave | May 1 2008, 07:09 AM Post #30 |
|
Advanced Member
![]() ![]() ![]() ![]() ![]()
|
How will Eve come to know if she has reached the substitution cipher if the keyword itself is not mentioned? |
|
Cryptography Vanquished.... | |
![]() |
|
| 1 user reading this topic (1 Guest and 0 Anonymous) | |
| Go to Next Page | |
| « Previous Topic · Challenges · Next Topic » |





![]](http://209.85.122.85/static/1/pip_r.png)



6:53 PM Nov 23