Cryptography: Old and New (part 1)
For as long as there have been stories there have been secrets – words unspoken for tactical advantage or for fear of reprisal. Secrets often need to be sent afar, and their remaining secret en route is of paramount importance. So it was when Xerxes’ attack on Sparta was thwarted by Demaratus (a Greek exile living in Persia, whose warning message was sent to Sparta hidden on an apparently blank wax tablet). And so it is when you send your credit card details across the ether to pay for gadgets, snacks or socks.
Most people will likely be familiar with a substitution cipher, in which one letter is replaced by another. The best-known of these is the Caesar cipher, in which each letter is replaced by one a fixed distance further down the alphabet, wrapping around when one runs out of letters. It is said that Julius Caesar used this method, replacing A with D, B with E, and so on, wrapping around with A replacing X, whereas his great-nephew Augustus favoured a shift of just one letter, in which A is replaced by B, B by C etc, but with no wraparound, so that Z is replaced by the symbol AA.
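The Caesar shift described above is simple enough to sketch in a few lines of Python. This is only an illustration using the modern 26-letter alphabet; the function name caesar and the sample message are assumptions of ours, not anything from the historical record.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: modern 26-letter alphabet


def caesar(text: str, shift: int) -> str:
    """Shift each letter A-Z by `shift` places, wrapping around at Z.

    Characters outside A-Z (spaces, punctuation) are passed through unchanged.
    """
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


# Caesar's reputed shift of three; a negative shift undoes it.
ciphertext = caesar("VENI VIDI VICI", 3)
plaintext = caesar(ciphertext, -3)
```

Decryption is just encryption with the opposite shift, which is why the shift alone serves as the key.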
The Kama Sutra also describes, among other rather more interesting tricks, the art of mlecchita-vikalpa (secret writing). It details a substitution cipher in which letters are paired and interchanged by a fixed random scheme, so that lovers can “conceal the details of their liaisons”.
An even older substitution system is Atbash, originally found in old (circa 500 BC) Hebrew texts. Here the first letter of the alphabet, aleph, is replaced by the last, tav; the second, beth, by the second to last, shin, and so on, effectively reversing the alphabet.
The Latin-alphabet equivalent is interchanging A and Z, B and Y, and so forth. The ROT13 system (a Caesar cipher with a shift of 13) is still used on some websites and newsgroups to obfuscate plot spoilers, punchlines or naughty words. These monoalphabetic substitution ciphers (MSCs) are not in any way cryptographically secure by today’s standards, but in their time they were probably effective enough – the highway bandits of Caesar’s time being likely illiterate, unlike the masterful wordsmiths of the modern internet.
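Both Atbash (in its Latin-alphabet form) and ROT13 can be sketched as fixed translation tables. The function names below are our own, and the sketch assumes upper-case A to Z only.

```python
import string

UPPER = string.ascii_uppercase

# Atbash: reverse the alphabet (A<->Z, B<->Y, ...).
ATBASH_TABLE = str.maketrans(UPPER, UPPER[::-1])
# ROT13: a Caesar shift of 13, exactly half the alphabet.
ROT13_TABLE = str.maketrans(UPPER, UPPER[13:] + UPPER[:13])


def atbash(text: str) -> str:
    """Apply the Latin-alphabet version of Atbash to the letters A-Z."""
    return text.upper().translate(ATBASH_TABLE)


def rot13(text: str) -> str:
    """Apply ROT13 to the letters A-Z."""
    return text.upper().translate(ROT13_TABLE)
```

Both ciphers are their own inverses: applying either one twice returns the original text, so the same routine serves for enciphering and deciphering.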
These ciphers do contain a germ of the idea of the modern cryptographic key, though. Whether it’s the length of the shift in a Caesar cipher, the dimensions of the Scytale, or the pairings used in the Kama Sutra (no, not those pairings), knowledge of the method of encryption, together with the key, allows one to decipher the message. We have 26 possible keys (including the trivial zero-shift) for a Caesar cipher, whereas ROT13 and Atbash are essentially single-key systems. The Kama Sutra cipher has a fairly large keyspace – there are about 8 trillion (8 followed by 12 zeroes) unique ways of pairing the alphabet.
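For the curious, the pairing count can be checked with a little arithmetic. There are 26! ways to write the alphabet in a row and read it off as 13 consecutive pairs, but swapping the two letters within a pair (2^13 ways) or shuffling the order of the pairs (13! ways) gives the same scheme. The sketch below assumes a 26-letter alphabet; it also works out the size of a fully general substitution alphabet for comparison.

```python
import math

# Caesar cipher: one key per possible shift, including the trivial zero shift.
caesar_keys = 26

# Kama Sutra-style cipher: ways to split 26 letters into 13 unordered pairs.
pairings = math.factorial(26) // (2**13 * math.factorial(13))

# Fully general monoalphabetic substitution: any permutation of the alphabet.
general_keys = math.factorial(26)
general_bits = math.log2(general_keys)  # keyspace size expressed in bits

print(f"pairings:     {pairings:,}")
print(f"general MSC:  {general_keys:.2e} keys, about {general_bits:.1f} bits")
```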
The general MSC has an astounding number of possible keys (26 factorial – about 4 followed by 26 zeroes – or a little more than 88 bits in modern binary terms), but size isn’t everything… The Arab polymath Al-Kindi, in a ninth-century manuscript titled On Deciphering Cryptographic Messages, gave the first description of breaking MSCs by frequency analysis – exploiting the fact that in an ‘average’ message, some letters will occur more frequently than others.
For example, in English the letter ‘e’ occurs with a relative frequency of about 13%, followed by ‘t’ with 9%, and so on. This is why Scrabble scoring is the way it is – the more common the letter, the less it scores. Other languages have different letters and frequencies, but the principle remains the same: replace the most frequently occurring letter in the ciphertext with the most frequently occurring letter in the language, then repeat for the next most frequent letter, and continue until you are able to fill in the blanks.
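The first pass of this procedure can be sketched as follows. The ranking string ENGLISH_ORDER is one commonly quoted ordering of English letters by frequency and should be treated as an approximation we have assumed; the function names are ours too. The result is only a first guess, to be tweaked by hand as described.

```python
from collections import Counter

# Assumption: an approximate most-to-least-frequent ordering of English letters.
ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def ranked_letters(ciphertext: str) -> list[str]:
    """Return the ciphertext letters A-Z, most frequent first."""
    counts = Counter(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    return [letter for letter, _ in counts.most_common()]


def first_guess_key(ciphertext: str) -> dict[str, str]:
    """Pair the n-th most common ciphertext letter with the n-th most common English letter."""
    return dict(zip(ranked_letters(ciphertext), ENGLISH_ORDER))


def apply_key(ciphertext: str, key: dict[str, str]) -> str:
    """Substitute every letter that has a guess; leave everything else as it is."""
    return "".join(key.get(ch, ch) for ch in ciphertext.upper())
```

On a short message the guessed key will be wrong in places; the point is that it gets enough common letters right for a human (or a dictionary search) to fill in the rest.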
The original message might not have exactly the same letter frequencies as the language, but provided it’s long enough it will at least be close enough that decryption will be possible with a little tweaking.
The discovery of the 1586 Babington Plot (which sought to assassinate Queen Elizabeth I) led to Mary Queen of Scots and her co-conspirators being executed after their correspondence was decrypted by renowned codebreaker Thomas Phelippes. Letters between Mary and Babington had been encrypted by substitution using symbols mostly from the Greek alphabet, and Phelippes was able to forge an addendum to one of Mary’s letters requesting the identities of the co-conspirators. Once they were thus incriminated, heads were off’d.
A milestone in the history of cryptography was the invention of the so-called Vigenère cipher in 1553. This was actually the work of cryptologist Giovan Battista Bellaso, who built on the ideas of Trithemius and Alberti. Vigenère did in fact publish a stronger autokeying cipher in 1586, but history has misattributed this earlier cipher to him. The cipher is a polyalphabetic substitution cipher which uses a keyword to switch cipher alphabets after each letter. Each letter is encrypted by a Caesar cipher with shift determined by the corresponding letter of the keyword.
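A minimal sketch of the keyword scheme just described is given below. It assumes the common convention that keyword letter A means a shift of zero, B a shift of one, and so on, and that the keyword advances only when a letter is actually enciphered; the function name and the sample keyword are ours.

```python
def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Encrypt (or decrypt) A-Z text with a repeating keyword.

    Each letter gets its own Caesar shift, taken from the next keyword letter
    (A = 0, B = 1, ...). Non-letters pass through and do not consume the keyword.
    """
    out = []
    key_index = 0
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = ord(keyword[key_index % len(keyword)].upper()) - ord("A")
            if decrypt:
                shift = -shift
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
            key_index += 1
        else:
            out.append(ch)
    return "".join(out)


secret = vigenere("ATTACK AT DAWN", "LEMON")
recovered = vigenere(secret, "LEMON", decrypt=True)
```

Note that the two Ts of ATTACK come out as different ciphertext letters, which is exactly what blunts a simple letter count.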
This (providing the keyword has more than one unique letter) thwarts traditional frequency analysis. The cipher was considered so strong that it was dubbed le chiffre indéchiffrable, and indecipherable it remained until work by Babbage and Kasiski in the mid-19th century. Their efforts centred on isolating the length of the key: once that is known, the ciphertext can be separated into that many chunks; each chunk will be encrypted by a different Caesar shift, which is easily dealt with by frequency analysis.
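The key-length hunt can be sketched roughly as follows. The idea usually credited to Kasiski is that a repeated fragment of ciphertext most likely comes from the same plaintext meeting the same part of the keyword, so the gap between repeats is a multiple of the key length. The function names are our own, and taking a single greatest common divisor is a simplification: on real traffic a chance repeat can spoil it, so one would tally common factors instead.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeat_distances(ciphertext: str, n: int = 3) -> list[int]:
    """Distances between successive occurrences of every repeated n-letter fragment."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    distances = []
    for where in positions.values():
        distances.extend(later - earlier for earlier, later in zip(where, where[1:]))
    return distances


def likely_key_length(ciphertext: str, n: int = 3):
    """Greatest common divisor of all repeat distances, or None if nothing repeats."""
    distances = repeat_distances(ciphertext, n)
    return reduce(gcd, distances) if distances else None


def split_columns(ciphertext: str, key_length: int) -> list[str]:
    """Chunk i holds every letter enciphered with keyword letter i."""
    return [ciphertext[i::key_length] for i in range(key_length)]
```

Each chunk returned by split_columns is a plain Caesar cipher, so counting its letters and lining up the most common one with E is usually enough to recover that letter of the keyword.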
Towards the end of World War I, the German army’s ADFGX field cipher was augmented with the letter V to make the imaginatively titled ADFGVX cipher. In 1918, in a phenomenal tour de force, the French cryptanalyst Georges Painvin managed to decrypt an ADFGVX-encrypted message which revealed where the German forces were planning to attack Paris. Painvin lost 15 kg of body weight over the course of this crypto-toil.
One may wonder if anyone can make a truly unbreakable cipher, and one may be shocked to learn that such a thing already exists. That it was invented in 1917 and patented two years later may leave one so utterly aghast as to impinge permanently on one’s health, but this is fact nonetheless. The chap responsible (for the patent at least) was Gilbert Vernam, and his invention is known as the One Time Pad. The trick is to ensure that there is as much key material as there is plaintext, that the key material is entirely random and perfectly secret, and that no part of the key material is used more than once. In practical terms, though, Vernam’s system is largely useless.
Generating truly random material is difficult, as is distributing a huge amount of it in secret and ensuring its destruction post-use.
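In modern terms the one-time pad is usually illustrated by XORing each byte of the message with a byte of pad, and the sketch below does just that. It is a stand-in rather than Vernam's teleprinter apparatus, the function names are ours, and Python's secrets module draws on the operating system's random source, which we are assuming here as a substitute for the truly random material the scheme demands.

```python
import secrets


def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Generate a fresh pad as long as the message and XOR the two together.

    Returns (pad, ciphertext). The pad must be kept secret and never reused.
    """
    pad = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))
    return pad, ciphertext


def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """XOR with the same pad again to recover the message."""
    if len(pad) != len(ciphertext):
        raise ValueError("pad must be exactly as long as the ciphertext")
    return bytes(c ^ k for c, k in zip(ciphertext, pad))


message = b"ATTACK AT DAWN"
pad, ciphertext = otp_encrypt(message)
recovered = otp_decrypt(ciphertext, pad)
```

The code makes the practical snag plain: the pad is exactly as long as the message, so every message sent means an equal quantity of secret material delivered in advance.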
Wartime cryptography relied heavily on codebooks which contained daily keys, and these had a bad habit of falling into enemy hands. Once such a breach occurred and news of it reached HQ, generals were faced with the tremendous logistical problem of alerting relevant personnel as to the breach and then manufacturing and distributing new key material. Long-range naval missions often failed to receive this, necessitating that messages be retransmitted using old keys. This exchange was sometimes intercepted, providing clues as to the new key.
During World War I, the decrypting of the Zimmermann telegram (which invited Mexico to ally with Germany) was instrumental to American involvement in the war. By World War II the Germans had upgraded the Enigma series of machines to present a sufficient cryptographic challenge to Bletchley Park. Polish researchers had broken the original design as early as 1932, and just prior to the outbreak of war they shared their intelligence with the British. Alan Turing designed the Bombe machine, which by 1940 was doing a fine job of breaking Jerry comms.
The Enigma machine, despite having a huge number of rotor and plugboard (Stecker) settings, had a weakness in that a letter was never encrypted to itself. This vastly reduced the amount of work that the Bombe and the computers (usually women with a good eye for detail and skill at crossword puzzles) had to do. After a letter was typed on the Enigma, the cipher alphabet was changed by the rotor mechanism, in a manner not dissimilar to the Vigenère cipher.
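The never-itself weakness is easy to exploit in code. Given a guessed fragment of plaintext (a crib, in Bletchley parlance), any alignment where a crib letter coincides with the ciphertext letter above it can be ruled out at once. The sketch below is our own illustration; the function name and the sample ciphertext are made up, and the crib EINS is simply a spelled-out German number.

```python
def possible_crib_positions(ciphertext: str, crib: str) -> list[int]:
    """Offsets at which the crib could sit, given that no letter encrypts to itself.

    An offset is rejected as soon as any crib letter equals the ciphertext
    letter it is aligned with.
    """
    ciphertext, crib = ciphertext.upper(), crib.upper()
    last_offset = len(ciphertext) - len(crib)
    return [
        offset
        for offset in range(last_offset + 1)
        if all(c != p for c, p in zip(ciphertext[offset:], crib))
    ]


# Made-up ciphertext: offsets 0 and 3 are impossible because E and I line up with themselves.
candidates = possible_crib_positions("EABCINSX", "EINS")
```

Every alignment struck off this way is one fewer for the machinery to test, which is where the saving in work comes from.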
There were other layers of encryption too, but a lot of these were constant settings made redundant when Enigma machines were captured. By the end of the war there were around 200 Bombes in use throughout England. The Americans, being in a much better position for obtaining supplies, were able to design and build 125 much faster Bombes, and the Allies were able to farm out work to these remote behemoths via (encrypted) cable.
Turing’s genius notwithstanding, much of the Enigma traffic was decrypted thanks to sloppy operational security. Message keys could have been changed with every transmission but were not, or when they were, the change was only slight and easily guessed. Numbers were often spelled out, so ‘einsing’ was a common technique – looking for occurrences that might decrypt to ‘eins’. If numerals had been allowed, this technique would have failed.
In the 1970s, two developments brought the cryptography game into the computer age. The first of these developments was the Data Encryption Standard, a block cipher based on work by Horst Feistel at IBM. Prior to its standardisation, it was slightly modified at the behest of the NSA. With no reasons being cited for these agency-mandated changes, suspicions were raised about a possible back door.
Two decades later, it emerged that the opposite was true: the S-boxes of the original cipher were susceptible to a technique called ‘differential cryptanalysis’, which at the time (cryptography being considered a munition) was classified. The NSA changes made the cipher more resistant to the technique, although they did also recommend a smaller 48-bit, as opposed to 64-bit, key size. Being the first publicly available cipher, DES became the subject of intense scrutiny and in many ways bootstrapped serious academic study of cryptography.
While the thousands of pages of journal articles on the subject provide all manner of theoretical attacks on DES, by far its most serious weakness is the short key size. IBM and the NSA eventually compromised on a nominal 64-bit key, but eight of these 64 bits were redundant parity bits, leaving an effective key of 56 bits. At the time of its introduction this was probably sufficient, but in the early 1990s machinery was proposed that could brute-force a key within hours. In 1997 an Internet-wide project successfully cracked a DES key for the first time. In 1998, the Electronic Frontier Foundation built a device (for a princely $250,000) which successfully cracked a key in a little over two days.
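Some back-of-envelope arithmetic shows why those few bits matter so much. Each extra bit doubles the number of keys, so the gap between 48, 56 and 64 bits is enormous. The search rate used below is a round figure we have assumed purely for illustration; it is not a measurement of any machine mentioned here.

```python
NOMINAL_BITS = 64
REDUNDANT_BITS = 8
EFFECTIVE_BITS = NOMINAL_BITS - REDUNDANT_BITS  # the bits that actually vary

# Assumption: a hypothetical machine testing a hundred billion keys per second.
ASSUMED_KEYS_PER_SECOND = 1e11


def keyspace(bits: int) -> int:
    """Number of distinct keys of the given length."""
    return 2 ** bits


def hours_to_exhaust(bits: int, keys_per_second: float) -> float:
    """Hours needed to try every key; on average the right one turns up in half that."""
    return keyspace(bits) / keys_per_second / 3600


for bits in (48, EFFECTIVE_BITS, NOMINAL_BITS):
    hours = hours_to_exhaust(bits, ASSUMED_KEYS_PER_SECOND)
    print(f"{bits}-bit key: {keyspace(bits):.2e} keys, {hours:,.1f} hours to exhaust")
```

At this assumed rate a 48-bit key lasts under an hour, the effective DES key a matter of days, and a full 64-bit key several years.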
Among the other attacks on DES it’s worth mentioning Matsui’s ‘linear cryptanalysis’. The attack involves building up approximations to parts of the cipher by finding expressions, linear modulo 2, that hold with a probability significantly different from 0.5. By collecting a huge number (2^43) of plaintext-ciphertext pairs, one can deduce a sufficient number of bits of the key that the remainder can be brute-forced.
Linear expressions can be found speedily thanks to the Walsh-Hadamard transform, and modern ciphers are all very careful to include a heavily nonlinear component to guard against these attacks. In some ways one can look at Matsui’s work as an abstraction of basic letter frequency analysis, using characteristics of the cipher rather than the language, and 1s and 0s rather than characters.
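To give a flavour of how such expressions are hunted down, here is a small sketch that applies a fast Walsh-Hadamard transform to a toy 4-bit S-box. The S-box values are chosen by us for illustration and the function names are our own; this is not an attack on DES. For an input mask a and output mask b, the transform yields a correlation c between the masked input bits and masked output bits, and the corresponding linear expression holds with probability 1/2 + c/32 for a 16-entry table.

```python
# Assumption: a toy 4-bit S-box (a permutation of 0..15) used only as an example.
TOY_SBOX = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]


def parity(value: int) -> int:
    """XOR of all the bits of value (0 or 1)."""
    return bin(value).count("1") & 1


def walsh_hadamard(values: list[int]) -> list[int]:
    """Fast Walsh-Hadamard transform; the input length must be a power of two."""
    v = list(values)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for j in range(start, start + h):
                v[j], v[j + h] = v[j] + v[j + h], v[j] - v[j + h]
        h *= 2
    return v


def correlations(sbox: list[int], out_mask: int) -> list[int]:
    """Entry a is the sum over x of (-1)^(a.x XOR out_mask.S(x)), for every input mask a."""
    signs = [(-1) ** parity(out_mask & sbox[x]) for x in range(len(sbox))]
    return walsh_hadamard(signs)


def strongest_correlation(sbox: list[int]) -> int:
    """Largest absolute correlation over all nonzero input and output masks."""
    return max(
        abs(c)
        for b in range(1, len(sbox))
        for a, c in enumerate(correlations(sbox, b))
        if a != 0
    )
```

A correlation of 16 in magnitude would mean the expression always (or never) holds, which is what a purely linear component gives; a well-designed S-box keeps every entry far from that extreme, and the attacker goes looking for the largest one left.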