# Advent of Code, Day 5: cracking vigenere

Update: There was an issue with the key length guessing algorithm description. You should be comparing the first segment to subsequent segments because you are looking for breaks in the in the key length matching up with repeating lyrics.

Jay likes to sing along to music, but he has trouble remembering the lyrics. He keeps the text, to the lyrics of even his most favorite song. However, he's very secretive of his favorite music and encrypts these lyrics. Not only that, Jay is quite paranoid so he's using Vigenere and the key might be up to 40 characters long. That's going to take too long to try every single key like when cracking a Caeser Cipher.

However, if you think about it, a Vigenere secret message is just a key_length number of Caeser messages interleaved. Rather than trying more than `26^40+26^39...etc` key attempts, we could slice up the text and reassemble it into multiple Caeser cipher texts. Divided up we only have to try `26 * 40 + 26 * 39...etc` brute force attempts that's not too bad on a computer, but we can do even better. If we can figure out the length, we only have to make `26 * key_length` brute force attempts.

The first useful function is this endeaver, would be a normalize function. Since Jay left all his spaces and punction in his encrypted lyrics, having something that makes it easy to strip out all the non-letters and make them all lowercase will make analysis easier.

The Second useful function to have would be a hamming_distance function. Hamming Distance is a way to compare two equal length strings. We can use the fact that not only is the Vigenere key repeating itself, but the music lyrics repeat themselves. We can find the likely `key_length` by dividing the encrypted text by all the possible key lengths, taking the first 10 of these, and then use the hamming_distance to score those possible key lengths. More specifically you would do this by comparing the first segment to each of the following segments with this function and then summing them together and dividing by the trial `key_length`. Lowest score is likely the key length.

Once you have the likely key length, take every `nth` character and put it into `key_length` number of Caeser cipher texts. Perform Caeser analysis on each one of these like in LC101 advent of code day 3, revealing each `nth` letter of the key.

Here are the encrypted song lyrics. So what is the title of the song?

`Bn lohr zatk exgxc kwm sml'c kmQaacr fyoqr cku C ofgp fwm urmjI to cslcjo jakh qq fxd rgsrr uigJhsg ccn'g lonKgl mv owd ngtgl wvt vt, vhnmy cxFgp owfoc fbnry poze tiqxJERQ ses ch sr kl eleLoh gqtgt Vlevs uho osxf n lbt n sguhf vg onKanx pb ly cmhr srvead neg'l yjmfWolzqlsbl lboxia' av mrZruruag' apw umbJifh loh nkg jty xefubu' bzo ngtHrrr wvtj tuby befvlvuw umbPita-pitat rvznn jjgf Jctk-GsoaV jhsg bktr bzCx'k xhy bzo pmbk V dbn'g lkgum onMddphg lri yzal oa tue cnpxjuc gjysiqQmtr ig uc jvgiy ztqy ml xxlt dsoc soeeclnyAo zr iuvvah pa axjgaigeUa uaDit pofpaw lagdo'w yyl vn vt Tevtvg pckyq ppb adRy aa aa aa aa pa at tursFt ui fk ry aaaaTegtkn wbmac oam pbFk ry aa aa aa aa pa attuRs ft ui fk ryaaTegtvn litze qml amUi fk ry aa aa aa aa paatTu rs ft ui fkryTegtvn wiigl pon mlFt ui fk ry aa aa aa aapaAt tu rs ft uifk Afnt loh wnnv tb ugfp oamo bzo ogqWntph lowr fmkj cgm fpozd jyylGrliag vo qh cbel A wplEkqy-hnu mnmn-uph ztsu ggex jtgce'Ga tue ziqdne by zbi udni eadl rue euo-a-qudNb eupi xgk apw rerrrfMnd pawsr B mix xdhvz koerf ag tue YameelYyi ew hu bzo jgstl ynrq lknr ponl lzx YianipfMrt Nlv hg tbej gi A'e mom ybiygeftV gbt vhr ykpij xhy bzo jjnvbr bf n ctojw vfiskxyLB zpyl aaoghrrHrbf zbi hjbukw yj ruifYbue hkgugkmwGfef jsn gfvcxs eiqe kn zr cbmhkLvclr xm ghr wrsg tq tux kuwl lh apw xspghOohgut oy ubzm efv phbur 'ik to bfs a to qfsTn sik qxz g'svp wn dbn'g sgorIa mny aafmlz gb xfr (shmzeetkmr)B suowk ba pgd Kcgtvn witga wvm onRs ft ui fk ry aa aa aaaaPa at tu rs ftuiYoxrvn witgl wkt vmTu rs ft ui fk ry aa aaaaAa pa at tu rsftNmldml witgl wvt ktAt tu rs ft ui fk ry aaaaAa aa pa at tursYxabax ngtgl wvt vtPa at tu rs ft ui fk ryaaAa aa aa pa attu Iayaa-napxw V.S. vf loh ngeq t rcjlOav'a lri ivd vn ghr dtocPni idkx Dqdv WkvtuLvvvn' vhnm rcjw khtm uyrqvdrr n mltjRbvq zvge lvclr wreert go bng tjh lcjlzPvuwx yqrd go gensg mrZopi al mv uw xsu aipe nnq ecslLohgw A fvdwn yn yixe Tebrie ngj QlwwslgUbiyz tb tue zazizns C fw slrqf' 'oqUbuyd loh lkkr mu vsmfvl eadl rue orbtuet tutz'm tdsmpvmwRciee sre Jinl nmzugcag' 'luA beruee pyal bcly ponl Kztx ifn 'ikSlntgea 'eoPfribiZamaqf' iss ghbuthg I vobd g mtadeIcl S hgqn'gTeuft vhr egxc gx ff tapi que uigtvn'Jig akl aala h ljyt rbp jigh ghg rvuhirUjbi ngb qw zoz oa tue quglqcvlk hm XzspjlYbu grlipg gh lfip gg tmVyr'r oe fiyll Ggtgbt dmyyr dql sxLn nn nn nn nc nn gg hefsGh vs xe ln nnnnGrtvia coakq oba qlXe ln nn nn nn nc nn ggheFs gh vs xe lnnnGrtgip jvzms aal baVs xe ln nn nn nn nc nnggHe fs gh vs xelnGrtgia jkgtr ccx alGh vs xe ln nn nn nn ncnnGg he fs gh vsxe`

Credit to Jay Tuley for this problem