Welcome Guest [Log In] [Register]
Welcome to Crypto. We hope you enjoy your visit.


You're currently viewing our forum as a guest. This means you are limited to certain areas of the board and there are some features you can't use. If you join our community, you'll be able to access member-only sections, and use many member-only features such as customizing your profile, sending personal messages, and voting in polls. Registration is simple, fast, and completely free.


Join our community!


If you're already a member please log in to your account to access all of our features:

Username:   Password:
Add Reply
Brute Forcing The Adfgvx Cipher
Topic Started: Apr 19 2008, 08:06 AM (2,745 Views)
Revelation
Member Avatar
Administrator
[ *  *  *  *  * ]
The ADFGVX cipher is a pretty good cipher: it's a pain to crack manually and brute force probably takes a very long time.

But there must be a way to make a smarter brute force.

I'm thinking about quick guesses. If the F appears a lot in the ciphertext, it means the row and/or the column with F is a row with letters with high frequencies. So the chances of having a number in the F row and column is not that high.

Futhermore, if you've guessed the keylength and column order you can do a frequency check.

Have you got any more ideas to improve the brute forcer?
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN
RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN
RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN
RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN
RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN
Offline Profile Quote Post Goto Top
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
I'd think the keyspace of ADFGVX is far too large for brute-forcing to have any chance.

What I'm wondering if there are any statistical tests that would determine whether you had successfully reversed the transposition. It's a two-stage cipher, polybius-square substitution followed by columnar transposition. Is there a test we can perform on a trial solution of the transposition, that would determine we had it correct, without having had to break the substitution first?

I'd be very surprised if there were not, but I don't know what it would be.

Remember - this was cracked, in pre-computer days. But only occasionally, in times of high traffic. My guess is that they attacked the transposition by simultaneous anagramming of messages of identical size.
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post Goto Top
 
Revelation
Member Avatar
Administrator
[ *  *  *  *  * ]
Well, we could check if the letter frequencies match the ones of standard English when we've done a transposition. That might help.

We could also run a quick pattern check on the bigrams to see if a given crib is found.
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN
RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN
RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN
RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN
RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN
Offline Profile Quote Post Goto Top
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
From Wikipedia:
Quote:
 
ADFGVX was cryptanalysed by French Army Lieutenant Georges Painvin. The work was exceptionally difficult by the standards of classical cryptography, and Painvin became physically ill during it. His method of solution relied on finding messages with stereotyped beginnings, which would fractionate the same, then form similar patterns in the positions in the ciphertext that had corresponded to column headings in the transposition table. (Considerable statistical analysis was required after this step had been reached — all done by hand.)

This meant it was only effective during times of very high traffic — but, fortunately for the cryptanalysts, that was also when the most important messages were sent.


Aegean Park has a book, "General Solution of the ADFGVX Cipher System," by J. Rives Childs, that would be interesting. Friedman's Military Cryptanalysis IV also discusses it.

But it's so much more fun to try to figure out a break on your own, before you read about how others have done it.

When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post Goto Top
 
Donald
Elite member
[ *  *  *  *  * ]
I've always found the ADFGVX cipher to be one of the most intimidating pen and paper ciphers around. Which is why I used a variation of it in my impractical cipher.

Fractionated ciphers just seem impossible to me. Especially if you go to the next step and recombine the transposed fractionated message back into single letters.

I recognize that they are NOT impossible, but I can't figure out how to even get a theoretical toe hold on them. It's well beyond me. In the above referenced thread rot13 mentioned trying to attack it by tracing column and row coincidences. BUT, I'm certain I couldn't make that work. Not yet anyway.

And I note, NO ONE really made attempt on my challenge, even the known plain text one. :)

ADFGVX is quite a cipher.
Offline Profile Quote Post Goto Top
 
Revelation
Member Avatar
Administrator
[ *  *  *  *  * ]
Well, I still think that my pattern sweep can tell whether you've got a good transposition or not.

A transposition with keylength < 10 is cracked in about 10 seconds. That means that the pattern sweep takes about 15 seconds. That's do-able.

When you've got a pattern match, solve the mono sub using frequencies.

In theory it sounds like the ADFGVX can be brute forced in an acceptable amount of time. I want to make a C application that does all of the above.

Since most of us are programmers, we can find ways to optimise it.

Quote:
 
And I note, NO ONE really made attempt on my challenge, even the known plain text one.

Well, we all hate fractionating systems, for obvious reasons :P I'll have a look at it again when we have brute forced the ADFGVX cipher.
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN
RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN
RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN
RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN
RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN
Offline Profile Quote Post Goto Top
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
Revelation
Apr 28 2008, 09:49 AM
A transposition with keylength < 10 is cracked in about 10 seconds. That means that the pattern sweep takes about 15 seconds. That's do-able.

IFF you have a good test for whether the result of a given trial transposition looks like the output of the substitution phase.

The first step is a fractionated substitution, creating two letters for one. But it's a simple substitution, and a digraph frequency count would show a normal language distribution. For that matter, converting it back to single characters using any arbitrary Polybius square would yield a result that was only a simple substitution away from the plaintext.

So there's no question that there are statistical measures that distinguish the output of the substitution phase. The question is whether the output of the trial transpositions are statistically distinct. It's clear to me that most of them will be. The majority of incorrect transpositions will yield something like a random shuffle, that would result in a digraph frequency distribution very different from plain text. But I'd not feel in comfortable in stating that all of the incorrect transpositions will be so. It seems perfectly possible to me that some number of incorrect transpositions will look correct. Whether this is true, or how many of them there are, I'd not even pretend to say, but it seems to me that the number of transpositions that could be correct is the major driver as to whether this is a viable approach.

Of course, given multiple messages encoded with the same key, things are very different. While it might be possible that a number of incorrect transpositions yield normal-looking frequency distributions, the odds that the same incorrect transposition would yield normal distributions are very low.

Still, you've been playing around with this. How many false positives have you been seeing?
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post Goto Top
 
Donald
Elite member
[ *  *  *  *  * ]
"Jdedge"
 
converting it back to single characters using any arbitrary Polybius square would yield a result that was only a simple substitution away from the plaintext.

AH! Ok, it took me a bit, but I think I see your point.
If the sequence was the standard
step 1: fractionate,
step 2: transpose

Then we just try various transpositions until our digram frequency looks like a mono substitution frequency. And that undoes step 2. From there, any polybius square will convert the remaining crypt text into a mono alphabetic substitution, which is easy to solve. Brilliant! And NICE!!!!

BUT, If the sequence is:
step 1: fractionate,
step 2: transpose
step 3: unfractionate

THIS still baffles me. How do you undo step 3? All of your results should look more or less like random data. If you don't use the correct square, you get a digraph substitution of the original step 2 data, and the original step 2 data is scrambled to start with. I don't see any way to detect that you have gotten closer to (or hit) the correct step 2 text, nor do I see any way to use an incorrect digraph substitution to your advantage. Because the connection between the step 2 data and the original plain text is involved in the relationships between the letters positions in the fractionating step that happened in step 1. It's the relative positions. Right? an incorrect digraph substitution of step 2's output doesn't give us the same cipher, but with a different square for step 1. It gives us the wrong relationships between the letters, and therefore a completely invalid and useless result.

Of course, as long as the relationships between the letters used in the squares in step 1 and step 3 are the same, they do not need to be the exact same squares. But that should still eliminate all but a tiny fraction of the keyspace as useless during a brute force attack.

With a playfair you can use shotgun hill climbing. Just decrypt the cipher text with a random square and then make changes to that square keep the relationships that make the decrypted text look more and more like ordinary plain text. I only understand this approach on the most theoretical level, but I can see how it works sort of.

BUT, with the 3 step ADFGVX I don't see how shotgun hill climbing would work because the intermediate text does not show digram or mono text frequencies. You have to undo the transposition to get something with normal digram frequencies. Which makes a brute force attack very impractical.



Offline Profile Quote Post Goto Top
 
Revelation
Member Avatar
Administrator
[ *  *  *  *  * ]
Quote:
 
Then we just try various transpositions until our digram frequency looks like a mono substitution frequency. And that undoes step 2. From there, any polybius square will convert the remaining crypt text into a mono alphabetic substitution, which is easy to solve. Brilliant! And NICE!!!!


That's what I suggested in my second and third post and that's what my application would do: first sweeping the pattern of a crib through the text and then checking the frequency of the bigrams (btw, I haven't started writing it yet)

Your other sequence is a pain, especially if you use two different squares in step 1 and 3. I'm not sure how to solve that.
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN
RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN
RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN
RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN
RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN
Offline Profile Quote Post Goto Top
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
Donald
Apr 28 2008, 12:58 PM
"Jdedge"
 
converting it back to single characters using any arbitrary Polybius square would yield a result that was only a simple substitution away from the plaintext.

AH! Ok, it took me a bit, but I think I see your point.
If the sequence was the standard
step 1: fractionate,
step 2: transpose

Then we just try various transpositions until our digram frequency looks like a mono substitution frequency. And that undoes step 2. From there, any polybius square will convert the remaining crypt text into a mono alphabetic substitution, which is easy to solve. Brilliant! And NICE!!!!

BUT, If the sequence is:
step 1: fractionate,
step 2: transpose
step 3: unfractionate

THIS still baffles me. How do you undo step 3?


Can I get by with "ADFGVX doesn't unfractionate, do it doesn't matter"?

Didn't think so.

Now if we're doing Bifid, with an even key-length, it doesn't matter, either, because then the two halves of each letter map to the two halfs of another letter:

Code:
 

   P1  P2  P3  P4
   L1  L2  L3  L4
   R1  R2  R3  R4

  L1:L2  L3:L4 R1:R2 R3:R4


Notice how the first letter of the ciphertext is constructed from the left halves of the first and second letters of the plaintext, and how the third letter of the ciphertext is constructed from the right halves of the first and second letters of the plaintext.

In other words, with a key-length of four, the first and second letters of the plaintext entirely determine the first and third letters of the ciphertext, and the second and fourth letters of the plaintext entirely determine the second and fourth letters of the ciphertext.

It can be solved as an ordinary digram substitution.

With an odd key-length, this doesn't work. How to solve odd-length Bifids? I've a couple of explanations around here, but I've not studied them. (And the explanations aren't as straightforward as the even-length Bifid, so my fast skims haven't yielded anything useful.)

As for the generic problem, with an unknown transposition? Not a clue.
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post Goto Top
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
Revelation
Apr 28 2008, 01:09 PM
That's what I suggested in my second and third post and that's what my application would do: first sweeping the pattern of a crib through the text and then checking the frequency of the bigrams (btw, I haven't started writing it yet)


And you were a step or two ahead of us. I was just figuring it out.

Revelation
Apr 28 2008, 01:09 PM
Your other sequence is a pain, especially if you use two different squares in step 1 and 3.  I'm not sure how to solve that.

When I think about it, I'm always driven back to the same question - how can we recognize when we've properly reversed one step?

Fractionate+transpose+unfractionate. Is there something we can recognize in a fractionate+transpose, that would allow us to determine that we have properly reversed the unfractionate?

If so, does a not-quite-there attempt at unfractionating look more like a fully-there attempt at unfractionating?

If so, then the heuristic methods should work - shotgun hill climbing, simulated annealing, genetic algorithms. But it'd take an in-depth look at the properties of fractionate+transpose to determine whether that was the case.

One thing's for certain - no exhaustive search of a Polybius square keyspace is going to work. Exhaustive search of single columnar transposition is doable, assuming moderate key lengths. 8!, 9!, or 10! can be handled. So using exhaustive search to remove the transposition is feasible, particularly since we think we may have a good test for recognizing the correct result.

But 25! is a whole different world.



When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post Goto Top
 
Revelation
Member Avatar
Administrator
[ *  *  *  *  * ]
I've started building the app. It only does the pattern check now. The results are not great, but not bad too. The pattern check yields about 200 false positives for a simple and short pattern.

If the keylength gets larger (8+), the number of false positives increases rapidly.

I'm hoping the frequency check will do us some good.

For the .c file and the .exe, look in the zip. It should also compile on Unix, haven't tested it there yet.

Here's the code:
Code:
 

// By Ben Ruyl

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

#define MAXKEYLENGTH 30
#define MAXCRIB 30
#define MAXCIPHER 2000
#define MAXFILENAME 128

char crib[MAXCRIB];
int cribpat[MAXCRIB];
char ct[MAXCIPHER];

int begin = 1; // standard cipher length region, if not specified
int end = 12;

int results = 0; // count the matches

void createpattern(char* v, int* pat)
{
   int i = 0, j = 0, highpat = 0;

   for (i = 0; i < strlen(v); i++)
   {
       int found = 0;
       for (j = 0; j < i; j++)
           if (v[i] == v[j])
           {
               pat[i] = pat[j];
               found = 1;
           }

       if (!found)
       {
           pat[i] = highpat;
           highpat++;
       }
   }
}

void createpatternbigram(char* v, int* pat)
{
   int i = 0, j = 0, highpat = 0;

   assert(strlen(v) % 2 == 0);

   for (i = 0; i < strlen(v); i += 2)
   {
       int found = 0;
       for (j = 0; j < i; j += 2)
           if (v[i] == v[j] && v[i + 1] == v[j + 1])
           {
               pat[i / 2] = pat[j / 2];
               found = 1;
           }

       if (!found)
       {
           pat[i / 2] = highpat;
           highpat++;
       }
   }

}

void makepatofpat(int* v, int len, int* pat)
{
   int i = 0, j = 0, highpat = 0;

   for (i = 0; i < len; i++)
   {
       int found = 0;
       for (j = 0; j < i; j++)
           if (v[i] == v[j])
           {
               pat[i] = pat[j];
               found = 1;
           }

       if (!found)
       {
           pat[i] = highpat;
           highpat++;
       }
   }
}


void checkperm(const int *p, const int size)
{
   if (p != 0)
   {
       const int len = strlen(ct);
       int i = 0, j = 0, u = 0, v = 0;
       int pos = 0;
       int minrow = len / size;
       int rest = len % size;
       int rowsatperm = minrow;

       char table[MAXCIPHER]; // FIXME: terrible waste of space
       int ctpat[MAXCIPHER];
       int revp[MAXCRIB];

       memset(table, 0, MAXCIPHER);

       for (j = 0; j < size; j++)
           revp[p[j]] = j; // reverse it, index -> perm, perm -> index

       // untranspose the cipher
       for (i = 0; i < size; i++)
       {
           rowsatperm = minrow;

           if (revp[i] < rest)
               rowsatperm++;

           for (j = 0; j < rowsatperm; j++)
           {
               table[revp[i] + j * size] = ct[pos];
               pos++;
           }
       }

       createpatternbigram(table, ctpat); // create the pattern of the bigrams. we will also use this pattern for freq. analysis

       // sweep the crib through the pattern
       for (u = 0; u < strlen(table) / 2 - strlen(crib); u++) // FIXME: remove all those strlens
       {
           int matches = 1;
           for (v = 0; v < strlen(crib); v++)
           {
               // now make a pattern of this part of the pattern of the ciphertext, because 12 13 12 14 = 0 1 0 2!
               int patofpat[MAXCRIB];
               makepatofpat(ctpat + u, strlen(crib), patofpat);

               if (patofpat[v] != cribpat[v])
               {
                   matches = 0;
                   break;
               }

           }

           if (matches) // if it matches, print the permutation
           {
               results++;
               printf("PATTERN MATCH! perm:");

               for (v = 0; v < size; v++)
                   printf("%d", p[v]);

               printf("\n");
           }
       }

   }
}

// University of Exeter algorithm
// a fast algorithm, we don't care about sorting it like a dictionary
void permute(int *v, const int start, const int n)
{
   if (start == n - 1)
   {
       checkperm(v, n);
   }
   else
   {
       int i = 0;
       for (i = start; i < n; i++)
       {
           int tmp = v[i];

           v[i] = v[start];
           v[start] = tmp;
           permute(v, start + 1, n);
           v[start] = v[i];
           v[i] = tmp;
       }
   }
}


int main(int argc, char *argv[])
{
   int beginloc = -2, endloc = -2, cribloc = -2, fileloc = -2;
   long linenum = 0;
   int theoption, except = 0, number = 0, found =0;

   int i = 0;
   int v[MAXKEYLENGTH];

   // parse the arguments
   for (i = 0; i < argc; i++)
   {
       if (argv[i][0] == '-')
       {
           if (i == beginloc || i == endloc || i == cribloc || i == fileloc)
           {
               printf("No parameter entered");
               return -1;
           }

           switch (argv[i][1])
           {
           case 'b':
               beginloc = i + 1;
               break;

           case 'e':
               endloc = i + 1;
               break;

           case 'c':
               cribloc = i + 1;
               break;

           case 'f':
               fileloc = i + 1;
               break;
           }


       }
       else
       {
           if (i == beginloc)
           {
               int res = atoi(argv[i]);
               if (res >= 2)
               {
                   begin = res;
                   beginloc = -1;

               }
               else
               {
                   printf("Wrong starting value entered");
                   return -1;
               }
           }
           else
               if (i == endloc)
               {
                   int res = atoi(argv[i]);
                   if (res >= 2)
                   {
                       end = res;
                       endloc = -1;

                   }
                   else
                   {
                       printf("Wrong ending value entered");
                       return -1;
                   }
               }
               else
                   if (i == cribloc)
                   {
                       if (strlen(argv[i]) <= MAXCRIB)
                       {
                           strcpy(crib, argv[i]);
                           cribloc = -1;

                       }
                       else
                       {
                           printf("The crib is too long");
                           return -1;
                       }
                   }
                   else
                       if (i == fileloc)
                       {
                           if (strlen(argv[i]) <= MAXFILENAME)
                           {

                               FILE *file = fopen( argv[i], "r" );


                               if ( file == 0 )
                               {
                                   printf( "Could not open file\n" );
                                   return -1;
                               }


                               fgets(ct, MAXCIPHER, file); // read the cipher, FIXME: no support for spaces, no support for multiline

                               if (ct[strlen(ct) - 1] == '\n')
                                   ct[strlen(ct) - 1] = 0;

                               if (ct[strlen(ct) - 1] == '\r') // support for Windows
                                   ct[strlen(ct) - 1] = 0;

                               fclose(file);

                               fileloc = -1;
                           }
                           else
                           {
                               printf("The filename is too long");
                               return -1;
                           }
                       }
                       else // read the ciphertext, leave blank if you've specified a file
                       {
                           if (strlen(argv[i]) <= MAXCIPHER)
                           {
                               strcpy(ct, argv[i]);

                           }
                           else
                           {
                               printf("The ciphertext is too long");
                               return -1;
                           }
                       }
       }

   }

   assert(strlen(ct) % 2 == 0);

   // create a pattern from the crib
   createpattern(crib, cribpat);

   // print some stuff
   printf("Begin: %i\n", begin);
   printf("End: %i\n", end);
   printf("Crib: %s\n", crib);

   printf("Crib pattern: ");
   for (i = 0; i < strlen(crib); i++)
       printf("%d", cribpat[i]);
   printf("\n");


   printf("Cipher: %s\n", ct);

   // create the list that will be permutated. no need to create a new sorted one in the keylength loop, because we don't care about order
   for (i = 0; i < MAXKEYLENGTH; i++)
       v[i] = i;

   for (i = begin; i < end + 1; i++)
   {
       printf("\nStarting with keylength %d.\n", i);
       permute(v, 0, i);
   }


   printf("\nSearch done. %d matches found. Press any key to exit.\n", results);

   getchar();

   return 0;

}


If you've got some time, you may want to experiment with the app, like trying largers texts (maybe Loki's challenge). Haven't had the time for that.
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN
RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN
RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN
RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN
RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN
Offline Profile Quote Post Goto Top
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
It compiles clean on Linux.
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post Goto Top
 
Revelation
Member Avatar
Administrator
[ *  *  *  *  * ]
If you see room for optimisation, please let me know. It should run as fast as possible.
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN
RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN
RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN
RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN
RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN
Offline Profile Quote Post Goto Top
 
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
Revelation
May 14 2008, 02:27 PM
If you see room for optimisation, please let me know. It should run as fast as possible.

I don't think you're going to see much room for optimization in the code you have so far. It'll be your test routines that will be the pace where I'd expect most of your time to be spent.

And there, I'd expect two places where optimization might pay off.

1. How you access the statistics.
2. What sort of math you do on the statistics.

And in truth, I don't see much room for improvement in 1. You plan on matching 2-grams, or maybe 4-grams? Both are small enough to keep in a non-sparse array, there isn't anything that's faster than array lookup.

As for 2? Use integers, instead of floats. Add logarithms instead of multiplying, etc. Lot's of possibilities.

But my real advice? Use a profiling tool. No point in optimizing where your program isn't spending any time.

And remember - premature optimization is the root of all evil.


When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post Goto Top
 
1 user reading this topic (1 Guest and 0 Anonymous)
Go to Next Page
« Previous Topic · General · Next Topic »
Add Reply