Vowel identification - Sukhotin's Algorithm
Topic Started: Oct 29 2008, 03:34 AM (180 Views)
jdege
Elite member
The November-December 2008 issue of "The Cryptogram" arrived the other day. PARROT has an article "Solving P-Sp-1".

In it, he references an algorithm for identifying vowels I'd not heard of before. That sent me on a bit of a research spree.

B.V. Sukhotin published it in a Russian linguistics journal back in 1962. A translation was published in a French information systems journal in 1973. The first English-language description was in 1991, by Jacques Guy, in Cryptologia 15(3), 258-262.

It was later discussed by PHOENIX in The Cryptogram MA92 and SO92.

Anyone heard of this? The Cm SO92 article measures its effectiveness compared to a number of other published vowel-identification techniques, and ranks it the most effective.

Yet I've never seen it discussed before. A Google search finds hits on a 1992 Cryptologia article by Caxton Foster that does similar comparisons with the same results. (Was PHOENIX Caxton Foster? The nym is no longer in the ACA index.)

The process is simple.

Code:
 

1. Create a 26 x 26 matrix of the number of times each
letter contacts another. The result will be diagonally symmetrical.

2. Set all the cells on the main diagonal to zero.

3. Create a 1 x 26 vector of the sums of the rows.

4. Mark the highest sum that is greater than zero as a vowel.

5. From each row sum that is not marked as a vowel subtract
twice the number of times that the row letter occurs next to
your new vowel.

6. If there remain non-vowel rows with sums greater than zero,
go to step 4.

7. Exit.
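
The seven steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the awk script or the attached vowel.py. The function name `sukhotin_vowels` is made up for illustration, and two choices are assumptions the steps don't spell out: contacts are only counted inside words (not across spaces), and anything outside A-Z is dropped before counting.

```python
import string


def sukhotin_vowels(text):
    """Return the letters marked as vowels, in the order they were picked."""
    letters = string.ascii_uppercase
    index = {c: i for i, c in enumerate(letters)}

    # Step 1: symmetric 26 x 26 matrix of contact counts.
    # Assumption: contacts are counted within words only.
    counts = [[0] * 26 for _ in range(26)]
    for word in text.upper().split():
        cleaned = [c for c in word if c in index]  # assumption: keep A-Z only
        for a, b in zip(cleaned, cleaned[1:]):
            if a != b:  # Step 2: the main diagonal stays zero
                counts[index[a]][index[b]] += 1
                counts[index[b]][index[a]] += 1

    # Step 3: vector of row sums.
    sums = [sum(row) for row in counts]

    is_vowel = [False] * 26
    vowels = []
    while True:
        # Steps 4 and 6: look only at unmarked rows with a positive sum.
        candidates = [i for i in range(26) if not is_vowel[i] and sums[i] > 0]
        if not candidates:
            break  # Step 7
        v = max(candidates, key=lambda i: sums[i])
        is_vowel[v] = True
        vowels.append(letters[v])

        # Step 5: subtract twice the contact count with the new vowel.
        for i in range(26):
            if not is_vowel[i]:
                sums[i] -= 2 * counts[i][v]
    return vowels
```

On a toy input such as `sukhotin_vowels("banana")` this returns `['A']`: A starts with the highest row sum (5), and after the subtraction both B and N drop below zero, so the loop stops. Ties are broken alphabetically here, which is another arbitrary choice.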


Simple enough. It only took me a couple of minutes to write an awk script that would implement it. It does work.

But despite the articles on how well it works compared to other methods, it doesn't seem to work as well as the method I described in "Another technique for identifying vowels".

What is different, though, about this method is that it can be done with paper and pencil.
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
 
Revelation
Administrator
I've received the issue too. This is interesting stuff!
RRRREJMEEEEEPVKLWENFNVJKEEEEEAOLKAFKLXCFZAASDJXZTTTTTTTLSIOWJXMOKLAFJNNKFNXN
RAGRBAQEMHIGDJVDSEOXVIYCELFHWLELJFIENXLRATALSJFSLCYTKLASJDKMHGOVOKAJDNMNUITN
RRRRLJVEEEEECLYVYHNVPFTAEEEEEMWLMEIRNGLARWJAKJDFLWNTIERJMIPQWOTZEOCXKNUBNXCN
RJIRPOWEANFUSNCZVDVZNMSFEKLOEPZLDKDJWSAAAAAAAOERHJCTNCKFRIMVKSOFOMKMANREWNBN
RZUDRGXEEEEENFQIDVLQNCKNEEEEEDGLLLLLLAWIOSNCDARLODMTOEJXMILDFJROTKJSDNLVCZNN
 
jdege
I loved how he started out with an incorrect assumption, and from it managed to work his way to a solution.

It's the way it usually happens for me.

(Haven't started on any of the Cons, yet. Waiting for the digital copy. Hate wasting my time working around my typos. ;)
 
Revelation
I've made an implementation too, in Python. Be warned: this is my very first Python program.

It took me a while to find out how to set the size of an array (sum = [0] * 26). Can't believe that that's not in the documentation.
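
For anyone else hitting the same list-sizing question, here is a small sketch of the idiom and the one trap that goes with it. The variable names are made up for illustration (and `row_sums` is used rather than `sum`, since `sum` would shadow the built-in function).

```python
# A flat list of 26 zeros: fine, because integers are immutable.
row_sums = [0] * 26

# The same trick one level up does NOT give 26 separate rows:
# every entry refers to the same inner list.
aliased = [[0] * 3] * 3
aliased[0][0] = 1          # shows up in every "row"

# A list comprehension builds a fresh inner list per row.
independent = [[0] * 3 for _ in range(3)]
independent[0][0] = 1      # only the first row changes
```

So the 1 x 26 vector can be made with `[0] * 26`, but the 26 x 26 contact matrix is safer built with the comprehension form.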
Attached to this post:
Attachments: vowel.py (926 Bytes)
Edited by Revelation, Oct 29 2008, 11:14 PM.
 
jdege
One of the articles in Cm suggests that instead of calculating only the sum for each row, you calculate both the sum and the variety of contact (VOC, the number of different letters the row letter contacts). Then mark as the vowel the letter with the largest product of sum and VOC. Decrement each remaining sum by 2x the count, as we had been doing, and decrement each VOC by 1x the count. If any sum goes below zero, set it to 0; otherwise a negative sum times a negative VOC could come out as the highest product.
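
A rough Python sketch of that sum-times-VOC variant, going only by the description in this post and not by the Cm article itself. The function name `sukhotin_voc_vowels` is hypothetical. Assumptions: contacts are counted within words only, only A-Z is kept, "1x the count" is read literally as the number of contacts with the new vowel, and the loop stops once the best product is no longer positive.

```python
import string


def sukhotin_voc_vowels(text):
    """Pick vowels by the largest (row sum * variety of contact) product."""
    letters = string.ascii_uppercase
    index = {c: i for i, c in enumerate(letters)}

    # Symmetric contact matrix with a zero diagonal, as in the basic method.
    counts = [[0] * 26 for _ in range(26)]
    for word in text.upper().split():
        cleaned = [c for c in word if c in index]  # assumption: keep A-Z only
        for a, b in zip(cleaned, cleaned[1:]):
            if a != b:
                counts[index[a]][index[b]] += 1
                counts[index[b]][index[a]] += 1

    sums = [sum(row) for row in counts]
    # VOC: how many different letters each letter touches.
    voc = [sum(1 for n in row if n > 0) for row in counts]

    is_vowel = [False] * 26
    vowels = []
    while True:
        candidates = [i for i in range(26) if not is_vowel[i]]
        if not candidates:
            break
        v = max(candidates, key=lambda i: sums[i] * voc[i])
        if sums[v] * voc[v] <= 0:
            break  # assumption: stop when no positive product remains
        is_vowel[v] = True
        vowels.append(letters[v])

        for i in range(26):
            if not is_vowel[i]:
                # Sum drops by 2x the count and is clamped at zero, so a
                # negative sum can never pair with a negative VOC.
                sums[i] = max(0, sums[i] - 2 * counts[i][v])
                # VOC drops by 1x the count (literal reading of the post).
                voc[i] -= counts[i][v]
    return vowels
```

On "banana" this also returns `['A']`: the starting products are 10 for A, 1 for B and 4 for N, and after A is marked the remaining sums clamp to zero.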