There are various methods to hack into an exchange of encrypted information; a few examples that are often used in combination:
- Brute Force; bluntly trying all possible keys. A very long process, but if the keys were, for instance, selected badly, then it suddenly becomes feasible with modern resources. For asymmetrical algorithms, the encrypted information can be submitted to a Brute Force, but it is, perhaps, faster to check what private key would fit with the public key, which was calculated on the basis of the private key.
- Searching for “defects” in algorithms. In case of good encryption, the encrypted data always appear to be a completely random set of bits. However, it may be that under certain circumstances, for instance with a certain sequence of letters, the result of the encryption is not entirely random. These patterns can then be used in a targeted manner in order to decrypt a small piece of data. This in turn results in knowledge about the algorithm and keys used, which can then be used to decrypt the rest of the information. A recurring concern is that defects can also have been included intentionally. For instance, there have been accusations that NSA have spread algorithms with a back-door.
- Statistical analysis of the encrypted data and correlating them with a statistical analysis of the language used. Although this does not directly result in the decryption of data, it may offer leads, both about the content of the encrypted message and about other ways of decrypting the information.
- The main thing, already mentioned in point 1, is: making use of errors during the encryption process. The errors that can be made are manifold.
Applying cryptography well is more difficult than it seems. Assume, I want to send the Tesorion logo in an encrypted manner. To do this, I use AES with a block size of 256 bits. I select this key completely at random. I divide the image into a number of blocks with data and encrypt them with AES. I then send the image. Sounds good and secure, right? This is what happened to the image through the encryption:
After the encryption the image can still be recognised! That is because the image contains structure. This structure has recurring elements. When this element is encrypted, then the result will always be the same. It makes no difference where the block of blue or white pixels is placed, the encryption of the block provides the same result every time.
This way of encryption is, for that matter, referred to as “Electronic Codebook” (ECB). If we want to encrypt the image well, then we will therefore need to apply the algorithm in a different way.</p
Diagrammatically the following just happened:
This can easily be changed into:
Where we first perform a calculation for the encryption of every block, during which we use the block to be encrypted and the encrypted result of the previous block. This manner of encryption is referred to as “Cipher Block Chaining” (CBC). In addition to this method, there are also a number of other ways, like “Cipher Feedback” (CFB) and “Output Feedback” (OFB). In our instance the result is as follows:
When sending encrypted messages, a third party cannot see what the content of the message is, but it can observe the activities that take place at both parties. For instance, a party can send an encrypted message to hoist a certain flag. A third party sees the encrypted message and sees that the flag is being hoisted.
The next time this message is sent, the flag is hoisted again, and the third party can then guess what the content of the message was and anticipate this the next time, or even send the encrypted message to provoke the action. We refer to the latter as a “Replay Attack”. The “replay” of an encrypted message to accomplish a certain reaction. Hence, with this attack it is not necessary to know how the message needs to be decrypted!
It is therefore important that during the encryption of a text, a measure is taken that ensures that the encryption yields a different result every time.
Arithmetically this can be done with ease. In the CBC method, the value of the previous block is always used. There is no previous value for the first block. So, we can select this ourselves. This value is then referred to as the “Initialisation Vector” (IV).
The IV is usually selected at random. When both parties agree to use an IV only once, we speak of a “nonce” (number used once). The IV or the nonce can be communicated mutually, but arrangements can also be agreed on how to determine the IV or the nonce, for instance on the basis of the message number or the time.
It is important to determine the IV unpredictably / at random. In the past there have been attacks that relied on the predictability of the IV to test assumptions about the encrypted data. An example hereof is the BEAST attack against the TLS protocol that is used in https. A request to a web server, the first block in the CBC chain, almost always starts in the same way. The user will ask for a certain page. Take a predictable IV and the attacker has a number of data that he can use when hacking the connection.
Random numbers are frequently used during encryption, for the determination of the keys, for the determination of the IV or the nonce, and as salt for a hash.
If there is something that computers are bad at, it is the selection of a random number. It is even difficult for a human being. If an attacker can foresee how a computer selects a random number, then he can exclude a lot of keys / IVs / nonces in his attempt to hack the communication.
But how does a computer select a random number? Simply put, with a formula in combination with something that it can measure. That can be the time, a piece of data in a certain place on the hard drive, or the last move of the computer mouse. A frequently used formula is:
Xn+1= ( a* Xn+ b ) mod m
Where a, band m are integers and X a series of numbers where the next number is determined on the basis of the lastly calculated number
The first time that we use the formula we have nothing for X yet; Xn=0 is unknown. The first number in the series is often selected based on something that the computer can measure and is referred to as the seed.
The result is, of course, only seemingly random and that is why, in case of a computer, we usually refer to a Pseudo Random Number Generator (PRNG) instead of a Random Number Generator (RNG). We will use the formula with a number of values for a, b, m, and the seed in order to demonstrate the problem with the PRNG.
The table clearly shows the problem. If we select a, b, and m too small, then a random number is out of the question. Even if we select the numbers larger, we still see a recurring series of numbers. Only in the last column there is no recurrence visible in the first 10 numbers.
The formula in combination with the numbers selected in the example does not result in a good PRNG. Because the number are not distributed nicely, we refer to a bad entropy. A good PRNG does not have an easily detectable period and has a homogeneous distribution of random numbers, or rather a good entropy.
If a PRNG has a bad entropy, then an attacker can start making assumptions about the numbers that are used in, for instance, a key. Then it can suddenly become feasible to guess a key within a short period of time!
The weakest link
The examples that were mentioned in this blog thus far may appear to be worrying, however usually it goes wrong in a much easier way.
For example, a website needs to be provided with a certificate. You can purchase it on the internet from a TTP. The TTP has good services and offers the private key and the certificate for download with a simple click on the button. How did the TTP generate this private key, where did it store the key? Did the TTP take sufficient measures to detect a cyber-attack?
Private keys must always be generated by the person or organisation themselves. The TTP only needs to see the certificate request with the public key. Banks, among others, take this so seriously that they dispose of special equipment to handle the encryption of the most sensitive information, also referred to as Hardware Security Module (HSM), so that the (private) keys can always be generated securely and are locked away.
Another simple example: the longer the key is being used, the more encrypted data are available to conduct statistical analysis with. The longer you use the same key, the bigger the chance that it leaks following this kind of analysis. That is why keys should be changed regularly. This is something that is frequently forgotten. So is setting up procedures as to what needs to happen when the suspicion arises that a key has been leaked.
Do you know where you rely on encryption and where the keys are stored?
““Stay tuned” for the next part of this blog in which the block-chain is finally discussed!
- BEAST attack: Attack on TLS where a badly selected IV is used
- Cipher Block Chaining: Way of encryption where during the encryption of every block the result of the encryption of the previous block is used. This prevents patterns that are present in the source data from also being visible in the encrypted data.
- Electronic Codebook (ECB): Way of encryption where every block of data is encrypted individually.
- Entropy: Degree of randomness of a set of numbers or data. If something is completely random, then it has a good entropy. If something is not completely random then it has a bad entropy.
- Initialisation Vector (IV): Numbers or data that are used for the encryption of the first block of data in a conversation.
- Nonce: An IV that is only used once.
- PRNG: Pseudo RNG. A piece of software that imitates an RNG.
- Replay Attack: Attack where the attacker resends a piece of (encrypted) data in order to thus accomplish a certain reaction. The attacker does not necessarily need to know what the content of the data is.
- RNG: Generator for random numbers
- Seed: Number or input with which a PRNG is initialised.