Encipher/Decipher

Due Sunday, April 13th at 8pm

Sample Solution

Shock and horror! Noah T. Sosmart, the only offspring of the famous Professor I. M. Sosmart has been kidnapped by a gang of dimwitted student thugs who are holding him hostage until Professor Sosmart raises their grades. Luckily Noah (a big fan of spy movies) was able to get a secret coded message out to his father that revealed his location, as well as assurances that Noah is safe.

Unfortunately, Noah didn't have access to any computational resources to help him encode his message. Nor is Noah an elite cryptographer . So Noah chose a very simple cipher to encode his message: the Caesar Cipher. In the Caesar Cipher, each letter in the original message (the plaintext) is shifted by a constant number of letters to produce the coded message (the cyphertext). For example, if the plaintext is "I LOVE RIVERSIDE" and the key is 'E' (meaning you shift 5 characters towards 'Z', since 'E' is the 5th character of the alphabet):
ABCDEFGHIJKLMNOPQRSTUVWXYZ <- Plaintext alphabet
EFGHIJKLMNOPQRSTUVWXYZABCD <- Cyphertext alphabet

I LOVE RIVERSIDE           <- Plaintext
M PSZI VMZIVWMHI           <- Cyphertext
(Traditionally, punctuation and whitespace are removed, and all letters are converted to uppercase.)

To encode, simply take each letter from the plaintext, find the corresponding letter in the plaintext alphabet and find the letter below that in the cyphertext alphabet.

Your job is to write a program that will allow Professor Sosmart to decode his son's message and go rescue him. Hurry, there is no time to lose!

The Assignment

You will create a C++ program called cipher.cc which will implement the Caesar cipher. To aid in compilation, you will be expected to also create a Makefile for this assignment, which will produce an executable called cipher. Note that decryption of the Caesar cipher with key n is just the same as encryption with key 26 - n, so you only need to implement one version of the program to perform both tasks.

The program will first read in a single character (in the range 'a'..'z' or 'A' .. 'Z'), which will be used as the key. Then, read input line-by-line until the standard input stream is exhausted, printing out the encyphered version of each line as it is read. Here is an example:

> ./cipher
b
THIS IS SOME TEXT^D
UIJT JT TPNF UFYU
For a more robust sample, here are several sample input and output files, differing portions of Noah's message.
You can test that your program is working correctly by running the following (note the convention of having a ">" character to indicate that you type these at the Linux/UNIX command line.):
> ./cipher < crypt1.in | diff crypt1.out -
> ./cipher < crypt2.in | diff crypt2.out -
> ./cipher < crypt3.in | diff crypt3.out -
If any of these three commands produce any output, your program is not correct. As always, just passing these three tests does not guarentee that your program is correct, so try to take the time to come up with your own test cases.

Remember to focus some effort on style and documentation. A significant portion of your grade will be based on these aspects of your submission. Also, remember that we will be grading your programs with the -Wall -Werror compilation options, as always. If your program does not compile with these flags, you will be docked.

Remember to put your name and lab section at the top of your submission so we can make sure you get credit.

Bonus Credit

Bonus #1 Implement a program (or a special case of your current program) that tries all 26 possible Caesar keys as decryption keys for some cyphertext. Ask the user after each attempt if the decryption was sucessful, and terminate when the correct key is found, or print a message if no key is found.

Bonus #2 Implement the above, but rather than checking all the keys in order, check the keys in order of how likely they are, based on the following ordering. Generally, the letter 'E' appears far more often in English text than any other letter. Knowing this, take the sorted frequency list for the characters in the cyphertext, and try the decryption of the cyphertext for each in order, assuming that the letter in question was plaintext 'E'. That is, if the cyphertext were:
YMFY NX NK YMJ HDUMJWYJCY BJWJ
Then the frequency count would be:
A 0       B 1        C 1
D 1       E 0        F 1
G 0       H 1        I 0
J 5       K 1        L 0
M 3       N 2        O 0
P 0       Q 0        R 0
S 0       T 0        U 1
V 0       W 2        X 1
Y 5       Z 0
So first we would check if ciphertext 'J' were plaintext 'E', which gives us:
THAT IS IF THE CYPHERTEXT WERE
Which is the answer we were looking for, in 1 guess!

Your program should still prompt the user at each iteration to ask whether the decryption was correct or not.

Bonus #3 Using English letter-frequencies, write a program that will use either of the above decryption methods and attempt to guess which decryption is the correct one, based on matching the letter-frequencies of the possible plaintext to these frequencies. This is a challenging problem, but can be worth up to 20% bonus credit.

Before you begin, think about the following: Will my letter-frequencies match up exactly with the published ones? How will I tell which decryption matches most closely? You may want to look at the published literature online. Google is, as always, your friend.