Welcome Guest [Log In] [Register]
Viewing Single Post From: Another technique for identifying vowels
jdege
Member Avatar
Elite member
[ *  *  *  *  * ]
It's been quiet for a while, so I thought I'd post code for another technique for identifying vowels.

This showed up in a discussion in the ACA group on MSN. The guy who offered it (CryptoRat) had been using it for decades, and had no idea who had originated it.

I took his idea, wrote some code, and tried it out. It did astoundingly well.

Testing on 1000 runs of ""fortune -l" from bsdgames, CryptoRat's approach (as implemented my somewhat debuggified code), correctly identified all four high-frequency vowels 593 times. It identified three or more of them 880 times, it identified two or more of them 938 times.

I'd call this a very promising technique.

More - of the 287 instances in which this identified three of the four high-frequency vowels, 117 times the fourth letter found was 'U'. So you could say that it found four vowels 710 times. (Next most common fourth letters were 'H' and 'T', 40 and 39 times.)

Code:
 

//[space]FILE:[space]ccluster.cpp
#if[space]0
From[space]CryptoRat,[space]on[space]groups.msn.com/Cryptogram[space](ACA[space]forum)


From:[space]MSN[space]NicknameCryptoRat[space][space]in[space]response[space]to[space]Message[space]1[space][space][space][space]Sent:[space]5/17/2008[space]11:46[space]AM
I[space]find[space]the[space]most[space]reliable[space]system[space]is[space]a[space]statistical[space]test[space]best[space]done[space]by[space]computers,[space]but[space]it[space]can[space]be[space]done[space]by[space]hand.[space]You[space]have[space]to[space]take[space]every[space]group[space]of[space]4[space]letters[space]that[space]appear[space]in[space]the[space]con[space](but[space]if[space]you[space]start[space]with[space]the[space]4[space]most[space]frequent[space]and[space]work[space]your[space]way[space]down[space]you[space]can[space]usually[space]identify[space]the[space]best[space]group[space]pretty[space]early[space]on[space]and[space]get[space]at[space]least[space]two[space]or[space]three[space]vowels[space]right[space]very[space]quickly).

For[space]each[space]group[space]count[space]the[space]number[space]of[space]spaces[space]from[space]each[space]of[space]those[space]letters[space]to[space]the[space]next[space]one[space]that[space]appears[space]in[space]the[space]con,[space]then[space]square[space]that[space]number[space]and[space]add[space]it[space]to[space]the[space]running[space]total.[space]The[space]group[space]of[space]4[space]with[space]the[space]lowest[space]total[space]is[space]usually[space]all[space]vowels[space]or[space]contains[space]three[space]vowels[space]and[space]maybe[space]an[space]H.

Example:
Now[space]is[space]the[space]time[space]for[space]all[space]good[space]men[space]to[space]come[space]to[space]the[space]aid[space]of[space]their[space]country.

If[space]you[space]are[space]testing[space]the[space]group[space]OAEI:
start[space]from[space]the[space]beginning[space]and[space]count[space]the[space]distance[space]between[space]the[space]O[space]in[space]NOW[space]to[space]the[space]I[space]in[space]IS[space](2[space]squared[space]=[space]4)[space]then[space]from[space]that[space]I[space]to[space]the[space]E[space]in[space]THE[space](4[space]*[space]4[space]=[space]16[space]added[space]to[space]4[space]=[space]20),[space]then[space]from[space]that[space]E[space]to[space]the[space]I[space]in[space]TIME,[space]etc.[space]In[space]this[space]example[space]the[space]group[space]AEIO[space]scores[space]lower[space]than[space]any[space]other[space]combination.[space]The[space]test[space]isn't[space]100%[space]accurate,[space]so[space]I[space]keep[space]at[space]least[space]the[space]5[space]best[space](i.e.[space]lowest)[space]scoring[space]groups[space]to[space]explore.[space]In[space]my[space]computer[space]program[space]I[space]modified[space]the[space]test[space]to[space]prevent[space]certain[space]anomalies.[space]For[space]example,[space]if[space]the[space]group[space]produces[space]three[space]in[space]a[space]row,[space]I[space]add[space]a[space]large[space]number,[space]since[space]three[space]vowels[space]in[space]a[space]row[space]are[space]unusual,[space]even[space]though[space]the[space]score[space]would[space]be[space]1*1[space]+[space]1*1[space]=[space]2.

#endif


#if[space]0

----------------------------------
From:[space]CryptoRat
Message[space]5[space]in[space]Discussion

I[space]can't[space]take[space]credit[space]for[space]the[space]original[space]idea.[space][space]I[space]read[space]it[space]somewhere,
although[space]I[space]can't[space]remember[space]where.[space][space]I[space]recently[space]looked[space]at[space]my[space]code[space]in[space]my
interactive[space]solver[space](written[space]in[space]QuickBasic)[space]and[space]saw[space]that[space]I[space]had[space]comments
referring[space]to[space]the[space]same[space]routine[space]in[space]Applesoft[space]Basic,[space]so[space]that[space]means[space]I[space]coded
it[space]back[space]before[space]I[space]switched[space]from[space]my[space]Apple[space]][e[space]to[space]my[space]DOS[space]machine,[space]whenever
that[space]was.[space][space]Early[space]or[space]mid-80s,[space]I[space]think.[space][space]A[space]good[space]guess[space]might[space]be[space]Caxton
Foster's[space]Cryptanalysis[space]for[space]Microcomputers,[space]or[space]it[space]could[space]have[space]been[space]in[space]the
ACA[space]Computer[space]Supplement.[space][space]I[space]did[space]tweak[space]the[space]code[space]to[space]account[space]for[space]3-vowel
clusters[space]and[space]maybe[space]something[space]else[space]minor[space]like[space]that.[space][space]It's[space]a[space]good[space]algorithm.

#endif


#include[space]<stdio.h>
#include[space]<stdlib.h>
#include[space]<ctype.h>
#include[space]<time.h>

#include[space]<string>
#include[space]<iostream>
#include[space]<fstream>

#include[space]<iterator>
#include[space]<vector>
#include[space]<map>

using[space]namespace[space]std;

#include[space]"textbuffer.hpp"
#include[space]"xforms.hpp"

long[space]consonantClusterCount(const[space]TextBuffer[space]&in);

int[space]main(int[space]argc,[space]char[space]*argv[])
{
[space][space][space][space]TextBuffer[space]inText;
[space][space][space][space]if[space](argc[space]<=[space]1)
[space][space][space][space][space][space][space][space]inText.load(cin);
[space][space][space][space]else
[space][space][space][space]{
[space][space][space][space][space][space][space][space]for[space](int[space]i[space]=[space]1;[space]i<argc;[space]i++)
[space][space][space][space][space][space][space][space][space][space][space][space]inText.load(argv[i]);
[space][space][space][space]}

[space][space][space][space]consonantClusterCount(toUpperAlpha(inText));

[space][space][space][space]return[space]EXIT_SUCCESS;
}


long[space]consonantClusterCount(const[space]TextBuffer[space]&in)
{
[space][space][space][space]map<char,bool>[space]have;
[space][space][space][space]
[space][space][space][space]for[space](char[space]*ch[space]=[space]"ABCDEFGHIJKLMNOPQRSTUVWXYZ";[space]*ch;[space]ch++)
[space][space][space][space][space][space][space][space]have[*ch][space]=[space]false;

[space][space][space][space]for[space](vector<char>::const_iterator[space]iter[space]=[space]in.begin();[space]iter[space]!=[space]in.end();[space]++iter)
[space][space][space][space][space][space][space][space]have[*iter][space]=[space]true;

[space][space][space][space]char[space]saved[6];
[space][space][space][space]long[space]maxTotal[space]=[space]-1;

[space][space][space][space]for[space](int[space]v1='A';[space]v1<='Z';[space]v1++)
[space][space][space][space]{
[space][space][space][space][space][space][space][space]if[space](!have[v1])
[space][space][space][space][space][space][space][space][space][space][space][space]continue;
[space][space][space][space][space][space][space][space]for[space](int[space]v2=v1+1;[space]v2<='Z';[space]v2++)
[space][space][space][space][space][space][space][space]{
[space][space][space][space][space][space][space][space][space][space][space][space]if[space](!have[v2])
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]continue;
[space][space][space][space][space][space][space][space][space][space][space][space]for[space](int[space]v3=v2+1;[space]v3<='Z';[space]v3++)
[space][space][space][space][space][space][space][space][space][space][space][space]{
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]if[space](!have[v3])
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]continue;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]for[space](int[space]v4=v3+1;[space]v4<='Z';[space]v4++)
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]{
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]if[space](!have[v4])
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]continue;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]int[space]last=0;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]int[space]i=0;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]long[space]total[space]=[space]-1;

[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]bool[space]isVowel1[space]=[space]false;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]bool[space]isVowel2[space]=[space]false;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]for[space](vector<char>::const_iterator[space]iter[space]=[space]in.begin();
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]iter[space]!=[space]in.end();[space]++iter)
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]{
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]i[space]+=[space]1;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]const[space]char[space]ch[space]=[space]*iter;

[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]if[space](ch[space]==[space]v1[space]||[space]ch[space]==[space]v2[space]||[space]ch[space]==[space]v3[space]||[space]ch[space]==[space]v4)
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]{
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]if[space](total[space]==[space]-1)
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]total[space]=[space]0;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]long[space]distance[space]=[space]i[space]-[space]last;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]if[space](isVowel1[space]&&[space]isVowel2)
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]distance[space]=[space]100;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]total[space]+=[space]distance*distance;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]last[space]=[space]i;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]isVowel2[space]=[space]isVowel1;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]isVowel1[space]=[space]true;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]}
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]else
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]{
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]isVowel2[space]=[space]isVowel1;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]isVowel1[space]=[space]false;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]}
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]}
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]if[space]((maxTotal[space]==[space]-1[space]||[space]total[space]<[space]maxTotal)[space]&&[space]total[space]!=[space]-1)
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]{
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]saved[0][space]=[space]v1;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]saved[1][space]=[space]v2;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]saved[2][space]=[space]v3;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]saved[3][space]=[space]v4;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]saved[4][space]=[space]'\0';
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]maxTotal[space]=[space]total;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]cout[space]<<[space]saved[space]<<[space]":[space]"[space]<<[space]maxTotal[space]<<[space]endl;
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]}
[space][space][space][space][space][space][space][space][space][space][space][space][space][space][space][space]}
[space][space][space][space][space][space][space][space][space][space][space][space]}
[space][space][space][space][space][space][space][space]}
[space][space][space][space]}
}


Code:
 

//[space]FILE:[space]xforms.hpp
#ifndef[space]XFORMS_HPP_
#define[space]XFORMS_HPP_

#include[space]"textbuffer.hpp"

TextBuffer[space]toUpperAlpha(const[space]TextBuffer[space]&in);

TextBuffer[space]formatGroups(const[space]TextBuffer[space]&in);

#endif[space]//[space]#ifndef[space]XFORMS_HPP_


Code:
 

//[space]FILE:[space]textbuffer.hpp
#ifndef[space]TEXTBUFFER_H__
#define[space]TEXTBUFFER_H__

class[space]TextBuffer[space]:[space]public[space]vector<char>
{
public:

bool[space]load(const[space]string[space]&filename);
bool[space]load(istream[space]&in);

bool[space]dump(const[space]string[space]&filename);
bool[space]dump(ostream[space]&out);

};


#endif[space]//[space]#ifndef[space]TEXTBUFFER_H__


Code:
 

//[space]FILE:[space]xforms.cpp
#include[space]<stdio.h>
#include[space]<stdlib.h>
#include[space]<ctype.h>
#include[space]<time.h>

#include[space]<string>
#include[space]<iostream>
#include[space]<fstream>

#include[space]<iterator>
#include[space]<vector>

#include[space]<getopt.h>

using[space]namespace[space]std;

#include[space]"textbuffer.hpp"
#include[space]"xforms.hpp"

template[space]<typename[space]InIt,[space]typename[space]OutIt>
void[space]copy_groups([space]InIt[space]first,[space]InIt[space]last,[space]OutIt[space]out,[space]
unsigned[space]major,[space]unsigned[space]minor[space])
{
unsigned[space]i[space]=[space]0;
while[space](first[space]!=[space]last)
{
*out++[space]=[space]*first++;
if[space](++i[space]%[space]major[space]==[space]0)
*out++[space]=[space]'\n';
else[space]if[space](i[space]%[space]minor[space]==[space]0)
*out++[space]=[space]'[space]';
}

if[space]([space]i[space]%[space]major[space]!=[space]0[space])
*out++[space]=[space]'\n';
}

TextBuffer[space]toUpperAlpha(const[space]TextBuffer[space]&in)
{
TextBuffer[space]out;

transform(in.begin(),[space]in.end(),
back_inserter(out),[space]::toupper);

out.erase(
remove_if(out.begin(),[space]out.end(),
not1(ptr_fun(::isalpha))),
out.end());

return[space]out;
}

TextBuffer[space]formatGroups(const[space]TextBuffer[space]&in)
{
TextBuffer[space]out;

copy_groups(in.begin(),[space]in.end(),[space]
back_inserter(out),[space]60,[space]5);

return[space]out;
}


Code:
 

//[space]FILE:[space]textbuffer.cpp
#include[space]<stdio.h>
#include[space]<stdlib.h>
#include[space]<ctype.h>
#include[space]<time.h>

#include[space]<string>
#include[space]<iostream>
#include[space]<fstream>

#include[space]<iterator>
#include[space]<vector>

#include[space]<getopt.h>

using[space]namespace[space]std;

#include[space]"textbuffer.hpp"


bool[space]TextBuffer::load(const[space]string[space]&filename)
{
ifstream[space]in(filename.c_str());

bool[space]rc[space]=[space]load(in);

return[space]rc;
}

bool[space]TextBuffer::load(istream[space]&in)
{
in[space]>>[space]noskipws;
copy(istream_iterator<char>(in),[space]istream_iterator<char>(),
back_inserter(*this));

return[space]true;
}

bool[space]TextBuffer::dump(const[space]string[space]&filename)
{
ofstream[space]out(filename.c_str());

bool[space]rc[space]=[space]dump(out);

return[space]rc;
}

bool[space]TextBuffer::dump(ostream[space]&out)
{
copy(begin(),[space]end(),[space]ostream_iterator<char>(out));

return[space]true;
}


(And the source code tag in this new forum software doesn't seem to preserve leading whitespace....)

A typical run on unencrypted text (so we should find AEIO):

Code:
 

ABCD:[space]5194
ABCE:[space]3128
ABDS:[space]2976
ABEF:[space]2884
ABEU:[space]2558
ABOS:[space]2473
ACOS:[space]2469
AEFO:[space]2251
AEIO:[space]1703
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
Offline Profile Quote Post
Another technique for identifying vowels · General