CS 161 Notes Lec 5-6
-
Course website: https://su21.cs161.org/
-
Course textbook: https://textbook.cs161.org/
Overview
Lecture 5: Introduction to Cryptography
Lecture 6: Block Ciphers and Modes of Operation
- Block ciphers
- Block cipher chaining modes
What is cryptography?
Cryptography is the study of secure communication over insecure channels.
Note that cryptography is something you never wanna try by yourself! Even with a secure algorithm, there’re too many edge cases to consider and a tiny mistake can make the whole system useless. An example related to CS 61A Summer 20 exam was present in the slides, basically the exam system developed by TAs has a vulnerability related with encryption that enabled students to see the exam paper before official time.
Definitions
Four characters are introduced here,
-
Alice and Bob: The main characters trying to send messages to each other over an insecure communication channel
-
Eve: An eavesdropper who can read any data sent over the channel
-
Mallory: A manipulator who can read and modify any data sent over the channel
-
We often describe cryptographic problems using a common cast of characters
- One scenario:
-
Alice wants to send a message to Bob.
-
However, Eve is going to eavesdrop on the communication channel.
-
How does Alice send the message to Bob without Eve learning about the message?
-
- Another scenario:
-
Bob wants to send a message to Alice.
-
However, Mallory is going to tamper with the communication channel.
-
How does Bob send the message to Alice without Mallory changing the message?(Here the goal is actually detect changing)
-
Three goals of cryptography
-
Confidentiality: An attacker can’t read the message.
-
Integrity: An attacker can’t change the message without being detected.
-
Authenticity: Receiver can prove that the message is sent from the one who claims to be the sender.
Keys
-
The most basic building block of any cryptographic scheme: The key
-
We can use the key in our algorithms to secure messages
-
Two models of keys:
-
Symmetric key model: Alice and Bob both know the value of a secret key.
-
Asymmetric key model: Everybody has two keys, a secret key and a public key.
-
Kerckhoff’s Principle
-
Crypto-systems should remain secure even when the attacker knows all internal details of the system
-
The key should be the only thing that must be kept secret
-
The system should be designed to make it easy to change keys that are leaked (or suspected to be leaked)
- If your secrets are leaked, it is usually a lot easier to change the key than to replace every instance of the running software
Plaintext
- The original message
Ciphertext
- The encrypted message
Graphic explanation
Confidentiality
Integrity & Authenticity
Threat models
What if Eve can do more than eavesdrop?
Some threat models for analyzing confidentiality:
Can Eve trick Alice into encrypting messages of Eve’s choosing? | Can Even trick Bob into decrypting messages of Eve’s choosing? | |
---|---|---|
ciphertext-only | ||
chosen-plaintext | Yes | |
chosen-ciphertext | Yes | |
chosen-plaintext-ciphertext | Yes | Yes |
In the class, chosen-plaintext threat model is picked.
In practice, cryptographers use the chosen plaintext-ciphertext model since it’s the most powerful one.
A brief history of cryptography
Three typical cryptography examples are mentioned in this part, I’ll just list some brief introduction about them, for more details, check the course slides or just search online, these stories are definitely interesting!
Caesar cipher
- One of the earliest cryptographic schemes
- Used by Julius Caesar
-
Choose a key K randomly between 0 to 25
- To encrypt a plaintext message M:
-
Replace each letter in M with the letter K positions later in the alphabet
-
If K = 3, plaintext DOG becomes GRJ
-
- To decrypt an encrypted ciphertext C:
-
Replace each letter in C with the letter K positions earlier in the alphabet
-
If K = 3, ciphertext GRJ becomes DOG
-
- A better cipher: create a mapping of each character to another character.
-
Example: A = N, B = Q, C = L, D = Z, etc.
-
Unlike the Caesar cipher, the shift is no longer constant!
-
- Key generation algorithm: KeyGen()
- Generate a random, one-to-one mapping of characters
- Encryption algorithm: Enc(K, M)
- Map each letter in M to the output according to the mapping K
- Decryption algorithm: Dec(K, C):
- Map each letter in C to the output according to the reverse of the mapping K
Enigma
- A mechanical encryption machine used by the Germans in WWII
- KeyGen():
-
Choose rotors, rotor orders, rotor positions, and plugboard settings
-
158,962,555,217,826,360,000 possible keys
-
- Enc(K, M) and Dec(K, C):
-
Input the rotor settings K into the Enigma machine
-
Press each letter in the input, and the lampboard will light up the corresponding output letter
-
Encryption and decryption are the same algorithm!
-
-
Germans believed that Enigma was an “unbreakable code”
-
Polish and British cryptographers built BOMBE, a machine to brute-force Enigma keys
- Why was Enigma breakable?
-
Kerckhoff’s principle: The Allies stole Enigma machines, so they knew the algorithm
-
Known plaintext attacks: the Germans often sent predictable messages (e.g. the weather report every morning)
-
Chosen plaintext attacks: the Allies could trick the Germans into sending a message (e.g. “soldiers at Normandy”)
-
Brute-force: BOMBE would try many keys until the correct one was found
-
Cryptography by computers
-
The modern era of cryptography started after WWII, with the work of Claude Shannon
- “New Directions in Cryptography” (1976) showed how number theory can be used in cryptography
- Its authors, Whitfield Diffie and Martin Hellman, won the Turing Award in 2015 for this paper
- This is the era of cryptography we’ll be focusing on
IND-CPA security
First, take a look at the cryptography roadmap
Symmetric-key | Asymmetric-key | |
Confidentiality | 1. One-time pads 2. Block ciphers with chaining modes (e.g. AES-CBC) | 1. RSA encryption 2. ElGamal encryption |
Integrity, Authentication | 1. MACs (e.g. HMAC) | 1. Digital signatures (e.g. RSA signatures) |
The course will first focus on encryption schemes that satisfy symmetric-key. We cause those schemes “symmetric-key encryption schemes”, which falls in the block above that satisfies both symmetric-key and confidentiality.
For modern schemes, we’ll assume the messages are bit strings(01010011010101010…).
Definition of “symmetric-key encryption”
- A symmetric-key encryption scheme has three algorithms:
-
KeyGen() → K: Generate a key K
-
Enc(K, M) → C: Encrypt a plaintext M using the key K to produce ciphertext C
-
Dec(K, C) → M: Decrypt a ciphertext C using the key K
-
- What properties do we want from a symmetric encryption scheme?
- Correctness: Decrypting a ciphertext should result in the message that was originally encrypted
- Dec(K, Enc(K, M)) = M for all K ← KeyGen() and M
-
Efficiency: Encryption/decryption should be fast
- Security: Confidentiality
- Correctness: Decrypting a ciphertext should result in the message that was originally encrypted
Well, it might be the time to give a more formal definition of “confidentiality”
- Recall our definition of confidentiality from earlier: “An adversary cannot read our messages”
- This definition isn’t very specific
-
What if Eve can read the first half of Alice’s message, but not the second half?
-
What if Eve figures out that Alice’s message starts with “Dear Bob”?
-
- This definition doesn’t account for prior knowledge
-
What if Eve already knew that Alice’s message ends in “Sincerely, Alice”?
-
What if Eve knows that Alice’s message is “BUY!” or “SELL” but doesn’t know which?
-
- This definition isn’t very specific
- Better definition:
-
The ciphertext should not give the attacker any additional information about the plaintext.
-
This definition is called IND-CPA(indistinguishability under chosen plaintext attack).
-
Define IND-CPA in terms of games:
-
Eve can send Alice any plaintext to encrypt and receive the ciphertext.
-
Eve can do that for as many times as she likes with any plaintext.
-
In the end, Eve will send Alice two plaintext, M0 and M1, Alice will randomly pick one of them and send its ciphertext back.
-
Eve should guess which plaintext corresponds to this ciphertext, with a probability higher than 1/2
-
If the encryption scheme is IND-CPA secure, no matter how many times Eve let Alice translate these messages, her possibility of identifying plaintext given ciphertext will never increase. The probability will always be 1/2.
Edge cases:
-
Length: In the IND-CPA scenario, we assume the possible messages share equal length(thus length of ciphertext will not provide extra information).
-
Attacker runtime: One common practical limit: Eve is limited to polynomial runtime algorithms
(no exponential-time algorithms) -
Negligible advantage: Sometimes it’s possible for Eve to win with probability 1/2 + 1/2128
-
This probability is greater than 1/2, but it’s so close to 1/2 that it’s as good as 1/2.
-
Eve’s advantage is so small that she can’t use it for any practical attacks
-
In the IND-CPA game: The scheme is secure even if Eve can win with probability > 1/2 + Ɛ, where Ɛ is negligible
-
One-time pads
Recall that one-time pad is one of the symmetric-key encryption schemes.
Key idea:
- XOR operation
0 ⊕ 0 = 0 |
0 ⊕ 1 = 1 |
1 ⊕ 0 = 1 |
1 ⊕ 1 = 0 |
- XOR properties
x ⊕ 0 = x |
x ⊕ x = 0 |
x ⊕ y = y ⊕ x |
(x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) |
(x ⊕ y) ⊕ x = y |
Note the last property, it’s useful :) .
How does one-time pad work?
-
Each time, before Alice and Bob communicate, a random bit string that has the same length as the message will be given to them.
-
Encryption will be done as simply bitwise XOR.
-
Decryption is also easy, since (x ⊕ y) ⊕ x = y, to get the plaintext, receiver only need to XOR the ciphertext with key string once again.
-
Demonstration:
It’s trivial that this one-time pad is definitely secure in the sense of confidentiality. It’s also trivial that this method is stupid ;) (Why should you use such an encryption when you already can send the key each time to both Alice and Bob?)
Before starting Lecture 6, we first define two new concepts, “traffic analysis” and “side channels”.
Traffic analysis: Analyzing who is talking to whom and when.
Side channels: Information about the plaintext revealed as a result of the implementation of the scheme, not the scheme itself.
- Modern crypto systems are usually broken through side channels
Block ciphers
Recall that “block ciphers with chaining modes” falls in the symmetric-key & confidentiality category just like “one-time pads”
Definition
Block cipher is an encryption/decryption algorithm that encrypts a fixed-sized block of bits.
- E__K(M) → C: Encryption
-
Inputs: k-bit key K and an n-bit plaintext M
-
Output: An n-bit ciphertext C
-
Sometimes written as: {0, 1}k × {0, 1}n → {0, 1}n
-
- D__K(C) → M: Decryption
-
Inputs: a k-bit key, and an n-bit ciphertext C
-
Output: An n-bit plaintext
-
Sometimes written as: {0, 1}k × {0, 1}n → {0, 1}n
-
The inverse of the encryption function
-
- Properties
-
Correctness: EK is a permutation(explanation can be found below)
-
Efficiency: Encryption/decryption should be fast
-
Security: E behaves like a random permutation
-
A graphic explanation would be:
Note that the input and output number of bits are always fixed and equal!
Correctness: one-to-one
Correctness of block ciphers means that EK must be a permutation(bijective function) on n-bit strings.
Why do we need it to be a permutation?
- Recall that block ciphers have fixed-size input and output, so plaintext M that encrypts to ciphertext C must be able to be decrypted back to M by C. This property forces the mapping between M and C to be bijective.
Security: random permutations
The security property of block cipher is defined as follows:
- A secure block cipher behaves like a randomly chosen permutation from the set of all permutations on n-bit strings
- A random permutation: Each n-bit input is mapped to one randomly-chosen n-bit output
- Defined by a distinguishing game
-
Eve gets two boxes: One is a randomly chosen permutation, and one is E__K with a randomly chosen key K
-
Eve should not be able to tell which is which with probability > 1/2
-
Efficiency
Also, don’t forget that block ciphers should be designed with efficiency.
-
Encryption and decryption should be computable in microseconds
- Block cipher algorithms typically use operations like XOR and bit-shifting
- Very fast on modern processors
- Modern CPUs provide dedicated support for block ciphers
Example: AES
To introduce AES, we first introduce DES.
DES(Data Encryption Standard)
-
Designed in late 1970s
-
Block size 64 bits (n = 64)
-
Key size 56 bits (k = 56)
- NSA influenced two facets of its design
-
Altered some subtle internal workings in a mysterious way
-
Reduced key size from 64 bits to 56 bits
-
Made brute force attacks feasible for an attacker with massive computational resources (by 1970s standards)
-
- The algorithm remains essentially unbroken 40 years later
- The NSA’s tweaking hardened it against an attack publicly revealed a decade later
- However, modern computer speeds make it completely unsafe due to small key size
AES(Advanced Encryption Standard)
-
1997–2000: NIST (National Institute of Standards and Technology) in the US held a competition to pick a new block cipher standard
-
One of the finalists, Twofish, was designed by Berkeley professor and occasional 161 instructor David Wagner!
- Out of the 5 finalists:
-
Rijndael, Twofish, and Serpent had really good performance
-
RC6 had okay performance
-
Mars had ugly performance
-
-
On any given computing platform, Rijndael was never the fastest
- But on every computing platform, Rijndael was always the second-fastest
- Twofish and Serpent each had at least one compute platform they were bad at
-
Rijndael was selected as the new block cipher standard
- Key size 128 bits (k = 128)
-
Versions with 192-bit and 256-bit keys also exist if you’re paranoid, but 128-bit keys are usually considered safe
-
Sometimes written as AES-128, AES-192, AES-256
-
- Block size 128 bits (n = 128)
- Note: The block size is still always 128 bits, regardless of key size
- The algorithm itself is out of scope
-
If you’re curious, here’s a comic (Definitely check this out! It’s excellent)
-
You can think of it as lots of shuffling with XORs and bit-shifting
-
-
There is no formal proof that AES is secure (indistinguishable from a random permutation)
- However, in 20 years, nobody has been able to break it, so it is assumed to be secure
- The NSA uses AES-256 for secrets they want to keep secure for the 40 years (even in the face of unknown breakthroughs in research)
Question: Are block ciphers IND-CPA secure?
Recall that IND-CPA can be defined as a game between Eve and Alice. So, what if Eve just send Alice message M0 to get its encryption? She would than be able to win the game with probability being 1! Thus block ciphers are not IND-CPA secure!
Block cipher chaining modes
Yep, block ciphers are not IND-CPA secure, but don’t worry, we can make them secure after applying them in some modes!
Firstly, the reason why block ciphers are not IND-CPA secure is that their results are deterministic(the same inputs will always give the same outputs), recall that this kind of property is avoided by one-time pads with its random key generation each time. Similarly, we have to introduce randomness to our scheme if we want block ciphers to be IND-CPA.
But before that, let’s consider one issue first. As block ciphers have their fixed length, what if we want to encrypt messages that don’t fit with this length? Shall we just apply the block cipher multiple times?
ECB mode
ECB stands for Electronic Code Book.
The key idea of this mode is:
-
Enc(K, M) = _C_1 _C_2 … Cn
Which is simply apply block ciphers to multiple chunks of the message and chain them together finally, all block ciphers share the same key.
Clearly this AES-ECB is not IND-CPA secure, we can break it with the same idea as how we break block ciphers.
An example:
Here’s a picture of Tux
encrypt it with ECB mode, here’s the ciphertext
There’s recognizable information leaked because no randomness has been added!
CBC mode
CBC stands for Cipher Block Chaining.
The key idea of this mode is:
-
C__i = E__K(M__i ⊕ C__i-1)
-
C0 = IV
-
IV(Initialization Vector) is a random k-bit string that gets updated each time.
The graph below is way more clear then the formula above ;)
Each block of cipher text will be used as the encryption vector added for next block(act as the randomness!). Recall that “⊕” is XOR operation.
Here’s how you decrypt CBC mode
Some properties of CBC mode:
-
CBC mode is IND-CPA secure.
-
Encryption can’t be parallelized since you need the former ciphertext when encrypting next block.
-
Decryption can be parallelized.
-
Padding bits when message is not exactly the multiply of block size. Don’t simply pad with zeros, padding is also critical for security.
Here’s Tux with CBC mode encryption, random IV :)
AES-CTR
CTR stands for Counter.
The key idea of this mode is:
-
Each time, generate a random nonce of n-bit string, each block will add it with the counter of that block to get its own random string.
-
Pass the random bit-string to block cipher, then XOR with plaintext segments to get the ciphertext segments.
Illustration with graph:
For decryption:
Yes, encryption of block cipher is used in both CTR mode encryption and CTR mode decryption, with a trivial reason.
Properties of CTR mode:
-
CTR mode is IND-CPA secure
-
Both encryption and decryption can be parallelized
-
No padding is needed! Since plaintext is encrypted by XOR instead of block cipher
-
CTR is similar with one-time pads, but the random keys are generated with block cipher and one shorter random key instead.
Here’s Tux with CTR mode encryption, random nonces :)
IVs and nonces
IV and nonce are basically the same thing, they all need to be randomly generated each time.
- Initialization vector (IV): A random, but public, one-use value to introduce randomness into the algorithm
-
For CTR mode, we say that you use a nonce (number used once), since the value has to be unique, not necessarily random.
-
In this class, we use IV and nonce interchangeably
-
- Never reuse IVs
-
In some algorithms, IV/nonce reuse only leaks a little information (e.g. CBC)
-
In some algorithms, IV/nonce reuse leads to catastrophic failure (e.g. CTR)
-
Recall that in What is cryptography? we mentioned an example about CS 61A summer exam which was encrypted improperly? Here’s what those TAs did:
- They used a Python library for AES
- A bad library for other reasons besides this example
- When they invoked CTR mode encryption, they didn’t specify an IV
-
Assumption: the crypto library would add a random IV for them
-
Reality: the crypto library defaulted to IV = 0 every time
-
-
The same IV was used to encrypt multiple exam questions
- If IV is not randomized each time, this CTR mode is no longer secure.
Comparing modes of operation
- If you need high performance, which mode is better?
- CTR mode, because you can parallelize both encryption and decryption
- If you’re paranoid about security, which mode is better?
-
CBC mode is better
-
Theoretically, CBC and CTR mode are equally secure if used properly
-
However, if used improperly (IV/nonce reuse), CBC only leaks partial information, and CTR fails catastrophically
- Consider human factors: Systems should be as secure as possible even when implemented incorrectly
-
CFB mode
CFB stands for Cipher Feedback.
CFB is also IND-CPA secure, its design is shown below, but the course didn’t explain them in detail
No integrity or authenticity
Keep in mind that block ciphers are not secure in sense of integrity and authenticity! Alice and Bob can’t tell if the message has been modified by Mallory.