Authors: Amit Ranjan & Rahul Veer
Once a cryptographic requirement is in place we need to start working on the design and the challenges we might face during its implementation. Designing a cryptographic solution has its own characteristic weaknesses which aren’t apparent at its initial phase. During implementation, we have to deal with practical nuances and choices that we make to avoid such vulnerabilities while maintaining adaptability for future modifications in standards.
Most parts of cryptographic implementations deal with maintaining the confidentiality of plain text or Image communicated over a channel. Broken cryptography is where an adversary can infer the data from ciphertext thereby compromising communication between two parties. In this article, we will understand typical cryptographic vulnerabilities along with certain elementary concepts of cryptography. In subsequent articles, we will discuss more techniques and best practices to avoid these vulnerabilities.
Let’s start with typical cryptographic vulnerabilities.
- Known Ciphertext attack is one such vulnerability where an adversary has access to a set of ciphertexts and can deduce plain text through some knowledge such as the language in which the plain text was written.
Impacted Algorithm: WEP (Wired Equivalent Privacy) and stream ciphers such as A5/1, A5/2 used in GSM telephony and initial versions of Microsoft implementation of VPN using PPTP Point-to-Point Tunneling Protocol,
- Known Plain text attack is an attack where the attacker has access to both the plain text and ciphertext of one or more messages. This can be used to attack information systems to reveal cryptographic keys.
Impacted Algorithm: The PKZIP stream cipher used for file compression is prone to this attack.
- Chosen ciphertext attack is an attack model where an adversary can decrypt some of the ciphertexts without knowing the key. Having access to some ciphertext, an adversary can fool the system to provide plain text which may be useful in analyzing the system. A non-adaptive (Lunchtime) chosen ciphertext attack refers to an attack where a system is available to obtain plain text of randomly chosen ciphertext while the owner of the system is out for lunch or not monitoring the system. An Adaptive chosen ciphertext attack is an attack where the ciphertext is chosen before and after the challenge ciphertext is given to the adversary. The adaptive ciphertext is planned after analysis of the cryptosystem using the non-adaptive chosen ciphertext attack.
Impacted Algorithm: ElGamal an asymmetric encryption algorithm based on Diffie–Hellman key exchange is not secure under chosen ciphertext attack. Early versions of RSA padding used in SSL protocol were vulnerable to adaptive chosen ciphertext attack that potentially revealed SSL session keys.
- Chosen plain text attack assumes that the attacker can obtain ciphertexts for any clear texts. It is more considerable in public key encryption where the public key is used for encryption. Consequently, an attacker can encrypt any plain text of their choice. Chosen Plain text attack can be Batch Chosen where all the plain text is selected without analyzing any of the ciphertexts. An attacker, however, can request ciphertexts of certain plain texts after carefully analyzing ciphertexts for some other set of plain texts. Such form of chosen Plain text attack is known as Adaptive Chosen plain text attack.
Impacted Algorithm: Vigenere Cipher that uses a series of different Caesar ciphers is found to be vulnerable to this attack.
In cryptography, plain text is encrypted predominantly using two distinct types of ciphers, stream and block. Block cipher uses a cryptographic key to encrypt a block of data (e.g. 64 contiguous bits) at a time. On the other hand, stream cipher encrypts one bit at a time.
Block ciphers fundamentally are deterministic algorithms, which while operating on a fixed block of data; always produce the same output given a particular input. Block ciphers today serve as important components in building any cryptographic protocols/systems that encrypt bulk data. Many block ciphers are designed based on the concept of iterated product cipher (sometimes also called as iterated block cipher) that carries out encryption in multiple rounds, each of which uses a different sub key derived from the original key. One common implementation of such cipher is DES. Most other designs of block ciphers like AES are based on a different type of iterated block cipher known as Substitution permutation networks. Substitution permutation network takes a block of plain text and key as input and applies several alternating rounds of substitution and permutation to produce ciphertext block.
Semantic security implies that the knowledge of ciphertext of some unknown plain text doesn’t reveal any information which could potentially be used to feasibly extract plain text. A block cipher primarily performs encryption only of a fixed block of data at a time and invariably produces same output for a given input. A block cipher by itself is only suitable for cryptographic transformation (encryption/decryption) of a block of data. To achieve semantic security, several so-called modes of operation have been designed. A mode of operation describes how to repeatedly apply a cipher's single-block operation to securely transform large volumes of data. Most modes require a unique sequence called initialization vector (IV) for each encryption operation. The IV ensures that distinct ciphertexts are produced even when same plain text is encrypted multiple times with the same encryption key.
Block ciphers could be operated under varying block sizes of data, but during transformation (encryption/decryption) the block size remains fixed. Block cipher modes operate on entire blocks and necessitate that the last part of data be padded to a full block if it is smaller than current block size, also it’s essential to build continuity in the blocks to add additional security.
The Electronic Code Book (ECB) mode of operation, splits a message (plain text) into blocks of data (possibly extending the last block with padding bits), and then each block is encrypted/decrypted independently. However, such a simple method is highly insecure as equal plain text blocks always generate equal ciphertext blocks (for same key), so patterns in plain text message are apparent in ciphertext output.
The Cipher Block Chaining (CBC) mode of operation, uses an IV along with plain text message. This IV is added in an exclusive-or (XOR) manner to the first plain text block before it is being encrypted. The resultant ciphertext block is then used as new IV for the next plain text block. So on, all the blocks of data are XORed with previous ciphertext before being encrypted. This way, each ciphertext block depends on all plain text blocks processed up to that point. CBC has been the most commonly used mode of operation. During decryption with incorrect IV, the first block of plain text would be corrupt but all subsequent plain text blocks would be correct. This is because a plain text block can be recovered from two adjacent blocks of ciphertext. Note that a single bit change to ciphertext causes complete corruption of corresponding block of plain text and inverts the corresponding bit in the following block of plain text, however, rest of the blocks remain unbroken. This unique behavior is exploited in padding oracle attacks, such as POODLE.
The CipherFeedback (CFB) mode turns a block cipher into a self-synchronizing stream cipher. The initialization vector is first encrypted and then added to the plain text block. CFB decryption is almost identical to CBC encryption performed in reverse.
The OutputFeedback (OFB) mode turns a block cipher into a synchronous stream cipher. It generates key stream blocks, which are then XORed with plain text blocks to get the ciphertext. Just as with other stream ciphers, changing a bit in ciphertext impacts a similar bit in plain text at the same location.
The Counter (CTR) mode creates a key stream but has advantage of requiring unique (and not necessarily random) values as initialization vectors; the essential randomness is computed internally by using initialization vector as a block counter and encrypting this counter for each block.
In the example below, it is evident that ECB mode ciphertext reveals too many patterns compared to the other modes.
Original image Encrypted using ECB mode Encrypted using non ECB mode
Some modes such as the ECB and CBC operate only on complete plain text blocks. So, consider a case where the plain text couldn’t be split down to an exact multiple of blocks. How do you encrypt the last set of plain text bytes that falls short of some more bytes to form a valid block? Here comes padding to our rescue. You, in essence, pad those vital extra bytes in order to form a valid block. While decrypting though, the receiving party must know how to remove the padding bytes in an unambiguous manner.
This article details out various mechanisms of padding used in ECB and CBC mode of operation with block ciphers. I am going to use ECB mode to demonstrate the use of padding, as it is the simplest to explain (and understand too), thereby implying the significance of it. However, reader is advised to note the security issues outlined here:
- There are major security issues in using ECB mode (as identical plain text blocks produce indistinguishable ciphertext blocks) and its use is generally discouraged.
- CBC mode can leak information if handled incorrectly.
- Caution must be exercised while generating unique initialization vector (IV) with algorithms that require one.
- There are limits on the amount of plain text that can be safely encrypted with a given (key, IV) pair.
The plain text is ASCII representation of actual data to be encrypted. Let us assume we need to encrypt a sentence ‘Now is the time for’. Now the plain text (ASCII encoded) form of his sentence would be the 19-byte sequence ‘4E 6F 77 20 69 73 20 74 68 65 20 74 69 6D 65 20 66 6F 72’. We will encrypt this plain text using DES algorithm run in ECB mode. Though, we shouldn’t be using DES with ECB in the real world, we are using it in order to demonstrate and help you understand the underlying concepts of padding effortlessly. Let us consider ‘0123456789ABCDEF’ will be our cryptographic key.
The data block size for DES algorithm is 64 bits (8 bytes). To encrypt, we break the plain text into blocks of 8 bytes each.
The original plain text:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
N | o | w | i | S | t | H | e | t | i | m | e | f | o | r | ||||
4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 | 66 | 6F | 72 |
The original plain text breaks up into two blocks of 8 bytes each and a third block of three bytes.
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 |
N | o | w | i | S | t | H | e | t | i | m | e | f | o | r | ||||
4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 | 66 | 6F | 72 |
However, in this encryption scheme we need data blocks to be of 8 bytes in size. In ECB mode, each 8-byte block is encrypted independently.
First Block:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
DES INPUT BLOCK | N | o | w | i | s | t | ||
HEX | 4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 |
DES OUTPUT BLOCK | 3F | A4 | 0E | 8A | 98 | 4D | 43 | 15 |
Second Block:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
DES INPUT BLOCK | H | e | t | i | m | e | ||
HEX | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 |
DES OUTPUT BLOCK | 6A | 27 | 17 | 87 | AB | 88 | 83 | F9 |
Third Block:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
DES INPUT BLOCK | F | o | r | X | X | X | X | X |
HEX | 66 | 6F | 72 | X | X | X | X | X |
So what do we do with the third data block of 3 bytes, which is falling short of 5 bytes to reach a block size of 8 bytes? We pad the last block with 5 bytes to make it up to the expected length.
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
N | o | w | i | s | t | h | e | t | i | m | e | f | o | r | X | X | X | X | X | ||||
4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 | 66 | 6F | 72 | X | X | X | X | X |
There are many standards followed to pad a data block to reach the expected length.
- PKCS5/PKCS7: Padding is done in whole bytes, all of the same value as the number of padding bytes required.
By far the PKCS5/PKCS7 is the most popular and is usually referred to as "PKCS5 padding". It recommends to Pad the plain text with a padding string of anywhere between 1 and 8 bytes to make the total length an exact multiple of 8 bytes. The value of each byte of the padding string is set to the number of bytes added - i.e. 8 bytes of value 0x08, 7 bytes of value 0x07, ..., 2 bytes of 0x02, or one byte of value 0x01.
Our third block of plain text data block is padded with 5 bytes of value 05:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
DES INPUT BLOCK | F | o | r | X | X | X | X | X |
HEX | 66 | 6F | 72 | 05 | 05 | 05 | 05 | 05 |
DES OUTPUT BLOCK | FD | 29 | 85 | C9 | E8 | DF | 41 | 40 |
At the receiving end after decrypting, read the last character decrypted and strip off those many bytes from the trailing end. This method can be used with any plain text, ASCII or binary. Don't forget to check first, that the number of characters to be stripped is between one and eight. This also gives you an extra check that the decryption has been carried out correctly.
- ISO/IEC 7816-4: Pad is done by appending 80 to the message followed by as many zero bytes as required. Add a single padding byte of value 80 and then pad the balance with enough bytes of value zero to make the total length an exact multiple of 8 bytes. If the single 80 byte makes the total length an exact multiple then do not add any zero bytes. This is known as "OneAndZeroes padding".
Our third block of plain text data is padded with 80 followed by 4 bytes of value 00:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
DES INPUT BLOCK | F | o | r | X | X | X | X | X |
HEX | 66 | 6F | 72 | 80 | 00 | 00 | 00 | 00 |
DES OUTPUT BLOCK | BE | 62 | 5D | 9F | F3 | C6 | C8 | 40 |
At the receiving end after decrypting, strip off all trailing zero bytes and the 80 byte. This method can be used with any plain text, ASCII or binary. Cryptographers who work with smart cards seem to prefer this method.
- ANSI X.923: Padding is done with zeroes except the last byte is made equal to the number of padding bytes added.
Our third block of plain text data is padded with 00 followed by a byte of value 05:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
DES INPUT BLOCK | F | o | r | X | X | X | X | X |
HEX | 66 | 6F | 72 | 00 | 00 | 00 | 00 | 05 |
DES OUTPUT BLOCK | 91 | 19 | 2C | 64 | B5 | 5C | 5D | B8 |
At the receiving end after decrypting, read the last character decrypted and strip off those many bytes from the trailing end. Don't forget to check first, that the number of characters to be stripped is between one and eight. This also gives you an extra check that the decryption has been carried out correctly. This method can be used with any plain text, ASCII or binary. The convention with this method is usually always to add a padding string, even if the original plain text was already an exact multiple of 8 bytes. The final byte could, therefore, have a value between 01 and 08.
- Zero Padding: Padding is done with zero (null) characters.
Our third block of plain text data is padded with 00 followed by a byte of value 05:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
DES INPUT BLOCK | F | o | r | X | X | X | X | X |
HEX | 66 | 6F | 72 | 00 | 00 | 00 | 00 | 00 |
DES OUTPUT BLOCK | 9E | 14 | FB | 96 | C5 | FE | EB | 75 |
At the receiving end after decrypting, trim all null characters found at the end until you find a non-null character. You cannot use this method when the plain text could contain a null value. This is not a problem if you are dealing with ASCII text, but would be if encrypting binary data like an EXE file.
The resulting ciphertext from these padding methods would look like:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
N | O | w | i | s | T | h | e | t | i | m | e | f | o | r | X | X | X | X | X | |||||
1 | 4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 | FD | 29 | 85 | C9 | E8 | DF | 41 | 40 |
2 | 4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 | BE | 62 | 5D | 9F | F3 | C6 | C8 | 40 |
3 | 4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 | 91 | 19 | 2C | 64 | B5 | 5C | 5D | B8 |
4 | 4E | 6F | 77 | 20 | 69 | 73 | 20 | 74 | 68 | 65 | 20 | 74 | 69 | 6D | 65 | 20 | 9E | 14 | FB | 96 | C5 | FE | EB | 75 |
Note that how different the last block of ciphertext is for each of the padding mechanisms.
Now you know about encryption, algorithms, modes and padding, in our next article we continue our journey in designing cryptographic solutions.