Casual Coded Correspondence

In this project, I will encode and decode various messages generated by either Codecademy or myself. The Codecademy prompts for this project appear in a monospaced font.

Part 1: Caesar Cipher

Introducing the Caesar Cipher: Take a message, something like "hello", and then shift all letters by a certain offset. For example, if you chose an offset of 3 and a message of "hello", you would encode the message by shifting each letter 3 places to the left with respect to its alphabetical position. So "h" becomes "e", and "o" becomes "l". Then you have an encoded message, "ebiil"!
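As a quick check of the example in the prompt, here is a minimal sketch that shifts each letter of "hello" three places to the left. The variable names are my own, and the modulo handles wrap-around past "a".

```python
letters = 'abcdefghijklmnopqrstuvwxyz'
offset = 3

# Shift each letter `offset` places to the left, wrapping around with % 26.
encoded = ''.join(letters[(letters.index(ch) - offset) % 26] for ch in 'hello')
print(encoded)  # ebiil
```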

Step 1: Decode a Message

Here is an encoded message:

    xuo jxuhu! jxyi yi qd unqcfbu ev q squiqh syfxuh. muhu oek qrbu je tusetu yj? y xefu ie! iudt cu q cuiiqwu rqsa myjx jxu iqcu evviuj!

This message has an offset of 10. Can you decode it?

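In the cell below, I'll write a function to decode any Caesar cipher message given the offset. Since encoding shifts each letter to the left, decoding shifts it back to the right by the same offset.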

In [1]:
def cc_decrypt(message_c, offset):
    letters='abcdefghijklmnopqrstuvwxyz'
    message_p = ''
    for char in message_c:
        pos = letters.find(char)
        if pos<0:
            message_p += char
        else:
            message_p += letters[(pos+offset)%26]
    return message_p

ciphertext = 'xuo jxuhu! jxyi yi qd unqcfbu ev q squiqh syfxuh. muhu oek qrbu je tusetu yj? y xefu ie! iudt cu q cuiiqwu rqsa myjx jxu iqcu evviuj!'
plaintext  = cc_decrypt(ciphertext, 10)
plaintext
Out[1]:
'hey there! this is an example of a caesar cipher. were you able to decode it? i hope so! send me a message back with the same offset!'

Step 2: Encode a Message

Now I'll write a function that encodes any message with the Caesar cipher.

In [2]:
def cc_encrypt(message_p, offset):
    letters='abcdefghijklmnopqrstuvwxyz'
    message_c = ''
    for char in message_p:
        pos = letters.find(char)
        if pos<0:
            message_c += char
        else:
            message_c += letters[(pos-offset)%26]
    return message_c

plaintext  = 'hello! now it is my turn to give this encryption thing a try.  I guess if you are reading this that means you were successul, huh? Well, ttyl!'
ciphertext = cc_encrypt(plaintext, 10)
ciphertext
Out[2]:
'xubbe! dem yj yi co jkhd je wylu jxyi udshofjyed jxydw q jho.  I wkuii yv oek qhu huqtydw jxyi jxqj cuqdi oek muhu ikssuiikb, xkx? Wubb, jjob!'
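
Encoding and decoding differ only in the direction of the shift, so a round trip should return the original text. The sketch below illustrates this with a single hypothetical helper, `shift_text` (not one of the functions above), built on `str.maketrans`. Like the functions above, it only shifts lowercase letters.

```python
import string

def shift_text(text, shift):
    """Shift lowercase letters by `shift` places (negative = left); leave other characters alone."""
    lower = string.ascii_lowercase
    k = shift % 26
    # Map the letter at index i to the letter at index (i + k) % 26.
    table = str.maketrans(lower, lower[k:] + lower[:k])
    return text.translate(table)

message = 'hello, world!'
encoded = shift_text(message, -10)  # encode: shift left by 10
decoded = shift_text(encoded, 10)   # decode: shift right by 10
print(encoded)
print(decoded)
```

Building the translation table once per call avoids looking up each character's position individually, which is a design choice of this sketch rather than a requirement.
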
Here are two more messages, the first one encoded just like before with an offset of ten, and it contains the hint for decoding the second message!

First message:

    jxu evviuj veh jxu iusedt cuiiqwu yi vekhjuud.

Second message:

    bqdradyuzs ygxfubxq omqemd oubtqde fa oapq kagd yqeemsqe ue qhqz yadq eqogdq!

We'll use the same function written earlier to solve these.

In [3]:
ct1 = 'jxu evviuj veh jxu iusedt cuiiqwu yi vekhjuud.'
ct2 = 'bqdradyuzs ygxfubxq omqemd oubtqde fa oapq kagd yqeemsqe ue qhqz yadq eqogdq!'
pt1 = cc_decrypt(ct1, 10)
print(pt1)
pt2 = cc_decrypt(ct2, 14)
print(pt2)
the offset for the second message is fourteen.
performing multiple caesar ciphers to code your messages is even more secure!

Step 3: Solving a Caesar Cipher without knowing the offset value

To test your cryptography skills, this next message is going to be coded with a Caesar Cipher, but this time you won't know the value of the offset.

Here's the coded message:

    vhfinmxkl atox kxgwxkxw tee hy maxlx hew vbiaxkl tl hulhexmx. px'ee atox mh kxteer lmxi ni hnk ztfx by px ptgm mh dxxi hnk fxlltzxl ltyx.

Good luck!

I'll write a function below that computes the letter frequencies of the ciphertext and compares them with typical letter frequencies in English. It ranks all 26 possible offsets by how closely the shifted frequencies match English, then tries the most likely offsets first.
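Before the full comparison, a much cruder heuristic gives a feel for the idea: assume the most frequent ciphertext letter stands for "e", the most common letter in English. This is only a sketch under that assumption, and it can easily fail on short messages. The resulting guess is the value that would be passed as `offset` to `cc_decrypt`.

```python
from collections import Counter
import string

ct = "vhfinmxkl atox kxgwxkxw tee hy maxlx hew vbiaxkl tl hulhexmx. px'ee atox mh kxteer lmxi ni hnk ztfx by px ptgm mh dxxi hnk fxlltzxl ltyx."

# Count only lowercase letters; spaces and punctuation carry no information here.
counts = Counter(ch for ch in ct if ch in string.ascii_lowercase)
most_common_letter, _ = counts.most_common(1)[0]

# Assumption: the most common ciphertext letter decodes to 'e'.
guess = (ord('e') - ord(most_common_letter)) % 26
print(most_common_letter, guess)  # x 7
```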

In [10]:
def cc_get_offset_suggestions(ciphertext):
    letters='abcdefghijklmnopqrstuvwxyz'
    letter_frequencies = [8.167,1.492,2.782,4.253,12.702,2.228,2.015,6.094,6.966,0.153,0.772,4.025,2.406,6.749,7.507,1.929,0.095,5.987,6.327,9.056,2.758,0.978,2.360,0.150,1.974,0.074]
    # build frequency table
    ft = {}
    total_letters = 0
    for char in letters:
        ft[char] = 0
    for char in ciphertext:
        if char in letters:
            ft[char] += 1
            total_letters +=1
    frequency_list = [ft[letter]/float(total_letters)*100 for letter in letters]
    # find the rotation that minimizes sum(|letter_frequencies - frequency_list|)
    diff_list = []
    for i in range(26):
        diff_list.append(( i, sum([abs(lf-fl) for lf,fl in zip(letter_frequencies, frequency_list)]) ))
        frequency_list = frequency_list[-1:]+frequency_list[:-1] #rotate list
    diff_list.sort(key=lambda x: x[1])
    
    suggestion_list = [shift for shift, error in diff_list]
    
    for offset in suggestion_list:
        plaintext = cc_decrypt(ciphertext, offset)
        print('Possible decrypted text:')
        print(plaintext)
        print()
        print('Enter C to continue trying other offset values, and anything else to finish.')
        #time.sleep(.02) # to deal with asynchronous jupyter output if necessary
        response = input('> ')
        if not response.upper().startswith('C'):
            return plaintext

ct = "vhfinmxkl atox kxgwxkxw tee hy maxlx hew vbiaxkl tl hulhexmx. px'ee atox mh kxteer lmxi ni hnk ztfx by px ptgm mh dxxi hnk fxlltzxl ltyx."
cc_get_offset_suggestions(ct)
Possible decrypted text:
computers have rendered all of these old ciphers as obsolete. we'll have to really step up our game if we want to keep our messages safe.

Enter C to continue trying other offset values, and anything else to finish.
> 
Out[10]:
"computers have rendered all of these old ciphers as obsolete. we'll have to really step up our game if we want to keep our messages safe."

Looks like the function figured it out on the first try!
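
With only 26 possible offsets, plain brute force is also an option: decode the message with every offset and pick out the one that reads as English. The sketch below uses a hypothetical helper, `caesar_candidates`, and follows the same convention as `cc_decrypt` (decoding shifts letters to the right).

```python
import string

def caesar_candidates(ciphertext):
    """Return a dict mapping each offset (0-25) to the text decoded with that offset."""
    lower = string.ascii_lowercase
    candidates = {}
    for offset in range(26):
        # Decoding shifts each letter `offset` places to the right.
        table = str.maketrans(lower, lower[offset:] + lower[:offset])
        candidates[offset] = ciphertext.translate(table)
    return candidates

ct = 'vhfinmxkl atox kxgwxkxw'
for offset, text in caesar_candidates(ct).items():
    print(f'{offset:2d}: {text}')
```

Scanning the 26 printed lines by eye shows readable text at offset 7.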

Part 2: The Vigenère Cipher

Step 1: Decoding

As you can see, technology has made brute forcing simple ciphers like the Caesar Cipher extremely easy. We'll now move on to a slightly more secure cipher, the Vigenère Cipher.

The Vigenère Cipher is a polyalphabetic substitution cipher, as opposed to the Caesar Cipher, which is a monoalphabetic substitution cipher. This means that instead of applying a single shift to every letter, the Vigenère Cipher applies a shift that varies from letter to letter. The value of the shift for each letter is determined by a given keyword.

Consider the message

    barryisthespy

If we want to code this message, first we choose a keyword. For this example, we'll use the keyword

    dog

Now we repeat the keyword over and over to generate a keystream that is the same length as the message we want to code. So if we want to code the message "barryisthespy" our _keyword phrase_ is "dogdogdogdogd". Now we are ready to start coding our message. We shift each letter of our message by the place value of the corresponding letter in the keyword phrase, assuming that "a" has a place value of 0, "b" has a place value of 1, and so forth. Remember, we zero-index because this is Python we're talking about!

                  message:    b  a  r  r  y  i  s  t  h  e  s  p  y

                keystream:    d  o  g  d  o  g  d  o  g  d  o  g  d

    resulting place value:    4  14 23 20 12 14 21 7  13 7  6  21 1

So we shift "b", which has an index of 1, by the index of "d", which is 3. This gives us a place value of 4, which is "e". Then continue the trend: we shift "a" by the place value of "o", 14, and get "o" again; we shift "r" by the place value of "g", 6, and get "x"; we shift the next "r" by 3 places and get "u"; and so forth. Whenever a sum exceeds 25, it wraps around to the start of the alphabet: "y" (24) shifted by 14 gives 38, which wraps to 12, or "m". Once we complete all the shifts we end up with our coded message:

            eoxumovhnhgvb
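
The shifts in this example can be checked with a short sketch. It uses `itertools.cycle` to repeat the keyword and assumes the message contains only lowercase letters. The variable names are my own.

```python
from itertools import cycle

letters = 'abcdefghijklmnopqrstuvwxyz'
message = 'barryisthespy'
keyword = 'dog'

# Add the place value of each message letter to that of its keystream letter,
# wrapping around the alphabet with % 26.
place_values = [
    (letters.index(m) + letters.index(k)) % 26
    for m, k in zip(message, cycle(keyword))
]
coded = ''.join(letters[v] for v in place_values)

print(place_values)
print(coded)  # eoxumovhnhgvb
```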

This is a lot harder to crack without knowing the keyword! So now I'll give you a message and the keyword, and you'll get to crack it!

    dfc aruw fsti gr vjtwhr wznj? vmph otis! cbx swv jipreneo uhllj kpi rahjib eg fjdkwkedhmp!

and the keyword is 

    friends

Below I'll write my function to decode the Vigenère Cipher.

In [6]:
def v_decrypt(ctext, key):
    letters = 'abcdefghijklmnopqrstuvwxyz'
    ptext = ''
    key_idx=0
    for char in ctext:
        pos = letters.find(char)
        if pos<0: # leave anything that isn't a lowercase letter unchanged
            ptext += char
        else:
            ptext += letters[(pos-letters.find(key[key_idx]))%26]
            key_idx = (key_idx+1)%len(key)
    return ptext

ciphertext = 'dfc aruw fsti gr vjtwhr wznj? vmph otis! cbx swv jipreneo uhllj kpi rahjib eg fjdkwkedhmp!'
plaintext  = v_decrypt(ciphertext, 'friends')
plaintext
Out[6]:
'you were able to decode this? nice work! you are becoming quite the expert at crytography!'

Step 2: Encoding

We'll now modify the previous function to encode messages.

In [7]:
def v_encrypt(ptext, key):
    letters = 'abcdefghijklmnopqrstuvwxyz'
    ctext = ''
    key_idx=0
    for char in ptext:
        pos = letters.find(char)
        if pos<0: # leave anything that isn't a lowercase letter unchanged
            ctext += char
        else:
            ctext += letters[(pos+letters.find(key[key_idx]))%26]
            key_idx = (key_idx+1)%len(key)
    return ctext

plaintext  = 'These ciphers can get kinda cool! Next thing up is trying to crack this without knowing the key!'
ciphertext = v_encrypt(plaintext, 'friends')
ciphertext
Out[7]:
'Tmvai plhmvzw pdf lvb ovqvf twsy! Nhpy kpmaj mu za xebasx bs pushb blvv onkpshw csfemaj lmv sil!'
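
Since the encoding and decoding functions differ only in the sign of the shift, they could be folded into one. The sketch below shows one possible way to do that with a hypothetical `vigenere` function and a `direction` parameter. It assumes the key consists of lowercase letters only and, like the functions above, leaves every character that is not a lowercase letter untouched.

```python
import string

def vigenere(text, key, direction):
    """Apply the Vigenère shift; direction=+1 encodes, direction=-1 decodes."""
    lower = string.ascii_lowercase
    out = []
    key_idx = 0
    for ch in text:
        pos = lower.find(ch)
        if pos < 0:
            # Not a lowercase letter: copy it and do not advance the key.
            out.append(ch)
        else:
            shift = lower.index(key[key_idx % len(key)])
            out.append(lower[(pos + direction * shift) % 26])
            key_idx += 1
    return ''.join(out)

coded = vigenere('barryisthespy', 'dog', +1)
print(coded)                        # eoxumovhnhgvb
print(vigenere(coded, 'dog', -1))   # barryisthespy
```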

Conclusion

In this project we've explored two different ciphers and used Python to encode messages with them, as well as to decode messages with and without knowing the key.
