# Describing Your Way to Decryption

Al’s talk to the ACM on STE and decryption.

### History

- Encryption has been used for thousands of years.
- Written security took many forms
- Shaved heads and then codes

- Computers begin being used around WWII
- The “modern” era of cryptography dates to Shannon in 1940’s

### Shannon Theory

- Comes from “Communication Theory of Secrecy” in 1949
- Talks about “confusion” and “diffusion”
- Discusses “unicity”, “redundancy”, and “entropy”
- These key concepts have resulted in DES, RSA, and CipherSmith
- Also responsible for the direction of modern cryptography

### Entropy

- Entropy is the uncertainty in a letter before you know it
- Based on Hartley’s function
- Gives an idea of how much uncertainty remains after randomly picking a member of a set
- Maximum where you have a uniform probability

### Entropy and Redundancy

- Redundancy indicates how often information is repeated in a string
- Languages are not uniformly random
- This is all conditional

### Unicity

- Unicity is the measure of the average number of symbols needed to break a code
- For a substitution cipher
*n*= 26 symbols - The R
_{L}measure was empirically determined - Minimum (lower limit) when R
_{L}= 1

### Set Theoretic Estimation

- Been used for years for many different problems
- The method is based largely on set theory and has certain requirements
- Uses “Last Man Standing”
- Requires:
- A function that resolves to a discrete point
- The function and any “noise” must be bounded
- The function and it’s inverse must be one-to-one and onto
- Must know elements of target set
- Must be continuous

### Property Sets

- Describe some property of the solution given an input
- The property set maps back to the possible inputs
- Each property must be a subset of possible inputs
- The solutions must be a member of all the property sets
- We can use AND and intersection
- Finding one property eliminates other property sets

### Sets

Language -> English -> Letters -> Sentences -> Words -> M-grams

### Why STE?

- A close examination of Shannon reveals he used a STE approach
- Shannon attempted to build up a solution with p(k) = 1
- Shannon used m-grams to find p(k)

### More M-grams

- We have a list of forbidden m-grams: qqq, zxy, bbqxnk, etc.
- We also used allowed m-grams

### Property Sets

- English language
- Doubled letters
- M-grams for 2 through 6 and 13
- Letter frequency for very uncommon and extremely common letters
- Words and sentences

### Simple Example

- Encrypted data: xyz
- Possible decryptions: amy, mut, oop, lly
- oop and lly are out since their patterns don’t match

### Where’s the data come from?

- Training done by checking generally recognizable works in English
- Works are chosen from literature from various genres
- Each author is represented by several texts
- Removed non-alphabetic characters to count words, m-grams, etc.

### Testing

- Encrypt file
- Randomly select start point
- Read one character at a time
- If the matrix changes, reapply the changes to the stuff we’ve already seen
- Continue until all letters are found (need 25)
- Constantly check against key to make sure we haven’t made a mistake

### Results

- Shift cipher
- Using all property sets
- 1142 tests
- 100% success
- 5.0 symbols average std. dev = .5545 symbols
- Traditional unicity is 1.3 symbols
- Used only 4 and 5-grams

- General substitution cipher
- Used 12 runs per work
- Ran a composite set of all authors against each author
- Ran a total of 12792 tests
- Authors always correctly decrypt their own works
- Authors of similar style and era sometimes work
- The composite set decrypted all works correctly
- Unicity was 17.456 symbols
- Std. dev. was 5.53 symbols
- High was 91 symbols
- Low was 13 symbols
- Average time / test was less than 1 ms
- Remove worst 5 cases and get std. dev. of 3.36 symbols

### Extensions and Future Work

- Add data from words (variable length m-grams)
- Add data from sentence structure
- Entropy needs a ton more attention
- Examine more ciphers to see if they’re vulnerable too (XOR, stream, block, etc.)
- Stylometry (Morton 1976)
- Language recognition via STE
- Parallel processing

### Conclusions

- Can’t apply theories of uniform and random distribution to language since they’re not valid
- Certain parts of a string are more vulnerable than others. How can we find them?
- In effect, we’re changing the probability distribution function
- STE can be a powerful tool for decryption
- STE provides a framework for explaining Shannon
- Need more people