Advent of Code, Day 3: wishlist snooping


Some kids are very secretive, and encrypt their wishlists. Luckily, they don't know much about cryptography and only ever use caesar ciphers.

Caesar ciphers are susceptible to frequency analysis attacks. In English, the most common letters are e, then t, then a. If a letter to Santa is encrypted using the m as the key, it's pretty likely that the most common letters will be q, then f, then m.

We have a plaintext frequency table. (You can get this off of wikipedia, or just use your frequency_table on a book from Project Gutenberg. It looks something like this:

Letter:      A       B       C       D       E       F    ...  Z
Frequency: .08167  .01492  .02782  .04253  .12702  .02228   .00074

To figure out the encryption key, we will guess each one in turn, and compute the frequency table of the message under this decryption key. That looks something like this:

Letter:      A       B       C       D       E       F   ...  Z
Frequency: .02228  .02015  .06094  .06966  .00153  .00772   .12702

We need to compare each of these 26 tables to the normal English-language table to figure out which key gives us output that looks most like English. There are a number of good ways to do that, but here's one:

def distance(freq_table1, freq_table2):
    """
    Compute the sum-of-squares distance between
    two letter-frequency distributions
    """
    # dist= 0
    # For every letter a..z:
        # dist += ((freq of letter in freq_table1) - (freq of letter in freq_table2))**2
    # return dist

here are some example wishlists to try out:

When you've got a solution, submit the plaintext of the following letter: ghduvdqwdlzrxogolnhdwrbsxsdwrbnlwwhqdwrbexqqbdqgdwrbkruvhdqgdovrdvpdoosodqhwkdwlfrxogulghlqlzrxogolnhdfrorulqjerrnwrr