CIS551 Lecture 10 Notes
2/16/2006
Project 2 is now available on the course webpage. It is due on 3/14/2006. Please follow the instructions carefully, especially with regard to filenames, to make automated testing easy.
Diffie-Hellman Key Exchange
Symmetric key cryptography only works when both communicating parties share a secret key. When out-of-band communication is possible, secret keys can be transferred in that manner. However, how do two parties communicating only over a public communications medium come to share a secret key?
The basic idea of the Diffie-Hellman Key Exchange protocol is that each of the communicating parties creates a part of the shared key which they keep private, but they publicly send something derived from that private part, so that all the communicating parties can combine the public part that they received with their own private part to get a common shared key.
There is an analogy in the lecture slides dealing with paint. We assume that someone who sees a new color of paint cannot discern which other colors were combined to create the new one. The Diffie-Hellman Key Exchange proceeds like so:
1. Alice chooses a public base color and tells Bart what it is publicly.
2. Alice also chooses a private color and tells no one.
3. Bart chooses a private color and tells no one.
4. Alice mixes her private color with the public color and sends it to Bart.
5. Bart mixes his private color with the public color and sends it to Alice.
6. Alice mixes the mixed color that she received from Bart with her private color.
7. Bart mixes the mixed color that he received from Alice with his private color.
8. Both Alice and Bart have the same shared secret color that no one else can reproduce just by watching their public exchange of colors (since there would be too much of the public color if they just mixed what they saw).
The mathematics behind the Diffie-Hellman Key Exchange is basic modular arithmetic. Recall that the finite field of a prime $p$ is $0, 1, 2, \ldots, p - 1$. A primitive root of a prime $p$ is a number that, when raised to different exponents mod $p$, generates all the non-zero numbers in the finite field of $p$. A primitive root is also called a generator. For example, 2 is a primitive root of 5 because $2^0 \bmod 5 = 1$, $2^1 \bmod 5 = 2$, $2^2 \bmod 5 = 4$, and $2^3 \bmod 5 = 3$. The key feature of a primitive root is that if you are given a primitive root, a prime, and a number generated by raising the primitive root to an unknown power modulo the prime, then it is computationally hard to figure out what the unknown exponent was (for sufficiently large primes). This is known as the discrete logarithm problem.
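The primitive root example can be checked mechanically. The following is a minimal sketch; the helper names `powers_mod` and `is_primitive_root` are made up for illustration, and the brute-force check is only practical for small primes.

```python
def powers_mod(g, p):
    """Return g^0, g^1, ..., g^(p-2), each reduced mod p."""
    return [pow(g, k, p) for k in range(p - 1)]


def is_primitive_root(g, p):
    """True if the powers of g mod p hit every value in 1..p-1.

    Assumes p is prime; this brute-force test is for small p only.
    """
    return set(powers_mod(g, p)) == set(range(1, p))


print(powers_mod(2, 5))         # [1, 2, 4, 3]
print(is_primitive_root(2, 5))  # True
print(is_primitive_root(4, 5))  # False: powers of 4 mod 5 are only 1 and 4
```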
1. Alice chooses a prime $p$ and a generator $g$ and sends them to Bart publicly.
2. Alice chooses a private exponent $a$ and sends $g^a \bmod p$ to Bart.
3. Bart chooses a private exponent $b$ and sends $g^b \bmod p$ to Alice.
4. Alice computes $(g^b \bmod p)^a \bmod p$, which equals $g^{ba} \bmod p$.
5. Bart computes $(g^a \bmod p)^b \bmod p$, which equals $g^{ab} \bmod p$.
6. Because multiplication of the exponents is commutative ($ba = ab$), $g^{ba} \bmod p = g^{ab} \bmod p$, so Bart and Alice have a shared secret.
7. Onlookers only have $p$, $g$, $g^a \bmod p$, and $g^b \bmod p$. Since the discrete logarithm problem is computationally hard, they cannot compute $a$ or $b$ easily. The obvious way to combine what they saw is $(g^a \bmod p)(g^b \bmod p) \bmod p = g^{a+b} \bmod p$, but $g^{a+b} \bmod p$ is in general not equal to $g^{ab} \bmod p$, so onlookers do not know the shared secret.
8. Now Alice and Bart can use their shared secret as the seed to a pseudo-random number generator to produce the same shared secret key for use in a symmetric key algorithm to communicate with confidentiality.
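The steps above can be sketched in a few lines of Python. This is a toy illustration, not a secure implementation: the parameters $p = 23$ and $g = 5$ are assumptions chosen so the arithmetic can be checked by hand, the function names are hypothetical, and `derive_key` hashes the shared secret with SHA-256 as a stand-in for the pseudo-random number generator step.

```python
import hashlib
import secrets

# Toy public parameters, chosen here only for illustration. Real
# deployments use primes that are thousands of bits long.
P = 23  # public prime p
G = 5   # public generator g (5 is a primitive root of 23)


def private_exponent(p):
    """Pick a random private exponent in the range 2..p-2."""
    return 2 + secrets.randbelow(p - 3)


def public_value(x, g=G, p=P):
    """Compute g^x mod p, the value sent over the public channel."""
    return pow(g, x, p)


def shared_secret(their_public, my_private, p=P):
    """Raise the other party's public value to my private exponent mod p."""
    return pow(their_public, my_private, p)


def derive_key(secret):
    """Stand-in for the PRNG step: hash the shared secret into 32 key bytes."""
    return hashlib.sha256(str(secret).encode()).digest()


a = private_exponent(P)   # Alice's private exponent
b = private_exponent(P)   # Bart's private exponent
A = public_value(a)       # sent from Alice to Bart
B = public_value(b)       # sent from Bart to Alice

alice_secret = shared_secret(B, a)
bart_secret = shared_secret(A, b)
print(alice_secret == bart_secret)                          # True
print(derive_key(alice_secret) == derive_key(bart_secret))  # True

# What an onlooker gets by multiplying the public values, using the fixed
# exponents a = 4 and b = 3 so the numbers can be checked by hand.
onlooker = (public_value(4) * public_value(3)) % P   # g^(4+3) mod p
actual = shared_secret(public_value(3), 4)           # g^(4*3) mod p
print(onlooker, actual)                              # 17 18
```

With the fixed exponents, the onlooker's product is 17 while the real shared secret is 18, which matches the argument in step 7.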
However, this pure Diffie-Hellman Key Exchange does not include any authentication mechanism, so it is susceptible to a man-in-the-middle attack. It could go like this:
1. Alice chooses a prime $p$ and a generator $g$ and sends them to Bart publicly.
2. Alice chooses a private exponent $a$ and sends $g^a \bmod p$ to Bart.
3. The man-in-the-middle (Evil) Eve intercepts $g^a \bmod p$ from Alice, and instead chooses a private exponent $a'$ and sends $g^{a'} \bmod p$ to Bart.
4. Bart chooses a private exponent $b$ and sends $g^b \bmod p$ to Alice.
5. Eve intercepts $g^b \bmod p$ from Bart, and instead chooses another private exponent $b'$ and sends $g^{b'} \bmod p$ to Alice.
6. Alice computes $(g^{b'} \bmod p)^a \bmod p$, which equals $g^{ab'} \bmod p$, as the secret key.
7. Bart computes $(g^{a'} \bmod p)^b \bmod p$, which equals $g^{a'b} \bmod p$, as the secret key.
8. Eve computes $(g^a \bmod p)^{b'} \bmod p$, which equals $g^{ab'} \bmod p$, as the shared key with Alice, and also computes $(g^b \bmod p)^{a'} \bmod p$, which equals $g^{a'b} \bmod p$, as the shared key with Bart.
9. Eve can listen in on the communication between Alice and Bart by intercepting each message from one, decrypting it, and then resending it encrypted with the appropriate shared key to the other, while both Alice and Bart think that they are talking directly to each other.
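The attack can be simulated with the same kind of toy parameters. In this sketch, $p = 23$, $g = 5$, and all four exponents are arbitrary assumptions, and `toy_cipher` is a made-up XOR cipher (not a real symmetric algorithm) that only handles messages of up to 32 bytes. It exists solely to show Eve decrypting and re-encrypting traffic.

```python
import hashlib

P, G = 23, 5  # toy public prime and generator (illustrative assumptions)


def public_value(x):
    return pow(G, x, P)


def shared_secret(their_public, my_private):
    return pow(their_public, my_private, P)


def toy_cipher(key, data):
    """XOR data with a keystream hashed from the key (toy only, <= 32 bytes).

    Applying it twice with the same key returns the original data.
    """
    stream = hashlib.sha256(str(key).encode()).digest()
    return bytes(d ^ s for d, s in zip(data, stream))


a, b = 6, 15             # Alice's and Bart's private exponents
a_eve, b_eve = 9, 13     # Eve's substitutes a' and b'

# Eve intercepts both public values and forwards her own instead.
to_bart = public_value(a_eve)    # Bart believes this came from Alice
to_alice = public_value(b_eve)   # Alice believes this came from Bart

alice_key = shared_secret(to_alice, a)            # g^(a b') mod p
bart_key = shared_secret(to_bart, b)              # g^(a' b) mod p
eve_alice = shared_secret(public_value(a), b_eve) # same as alice_key
eve_bart = shared_secret(public_value(b), a_eve)  # same as bart_key

# Eve relays a message: decrypt with Alice's key, re-encrypt with Bart's.
message = b"meet at noon"
from_alice = toy_cipher(alice_key, message)
read_by_eve = toy_cipher(eve_alice, from_alice)
to_bart_ct = toy_cipher(eve_bart, read_by_eve)
received = toy_cipher(bart_key, to_bart_ct)

print(read_by_eve)  # b'meet at noon'
print(received)     # b'meet at noon'
```

Neither Alice nor Bart sees anything unusual: Bart receives exactly what Alice sent, even though Eve read it in transit.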
In class, possible alternatives to the Diffie-Hellman Key Exchange were discussed. Any commutative encryption operation would work for creating a shared secret over a public communications medium. One possible alternative using RSA is:
1. Alice chooses an RSA keypair with public key $K_A$ and private key $k_A$, chooses a shared secret key $K_S$, and sends $K_A\{K_S\}$ to Bart.
2. Bart chooses an RSA keypair with public key $K_B$ and private key $k_B$, and sends $K_B\{K_A\{K_S\}\}$ to Alice.
3. Since RSA is commutative, $K_B\{K_A\{K_S\}\} = K_A\{K_B\{K_S\}\}$, so Alice can decrypt with her private key to get $k_A\{K_A\{K_B\{K_S\}\}\} = K_B\{K_S\}$, which she then sends to Bart. (The operations commute only when both keypairs use the same modulus; RSA operations under different moduli do not commute in general.)
4. Bart decrypts it with his private key to get $k_B\{K_B\{K_S\}\} = K_S$ as the shared secret key that no one else knows.
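The commutativity that this exchange relies on can be checked with textbook RSA arithmetic. The sketch below assumes both keypairs share one tiny modulus ($n = 61 \cdot 53$) so that the operations commute; the exponents 17 and 7 and the secret 1234 are arbitrary, and the helper names are hypothetical. It demonstrates only the algebra and is not a secure construction.

```python
from math import gcd

# Toy shared modulus (assumption): n = 61 * 53, so phi(n) = 60 * 52.
N = 61 * 53
PHI = 60 * 52


def make_keypair(e):
    """Return (e, d) with e * d = 1 mod phi(n)."""
    if gcd(e, PHI) != 1:
        raise ValueError("exponent must be coprime to phi(n)")
    return e, pow(e, -1, PHI)  # modular inverse, needs Python 3.8+


def apply_key(exponent, value):
    """Textbook RSA operation: value^exponent mod n (no padding)."""
    return pow(value, exponent, N)


KA, kA = make_keypair(17)  # Alice's public and private exponents
KB, kB = make_keypair(7)   # Bart's public and private exponents
KS = 1234                  # the secret Alice wants to share (must be < N)

step1 = apply_key(KA, KS)        # Alice -> Bart:  KA{KS}
step2 = apply_key(KB, step1)     # Bart -> Alice:  KB{KA{KS}}
step3 = apply_key(kA, step2)     # Alice -> Bart:  KB{KS}
recovered = apply_key(kB, step3) # Bart recovers KS

print(step2 == apply_key(KA, apply_key(KB, KS)))  # True: the order is irrelevant
print(recovered == KS)                            # True
```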
Cryptographic Hash Algorithms
Hash algorithms have two main properties:
1. Converts an input of arbitrary length into a fixed-length output known as a hash, digest, or summary.
2. Deterministic - always produces the same output from the same input.
Cryptographic hash algorithms (also known as one-way hash functions) have additional properties:
3. Hard to invert - given a hash, you cannot easily find an input that produced it.
4. Hard to find a collision for a given input - given an input and its hash, you cannot easily find another input that produces the same hash.
5. Hard to find any collisions - cannot easily craft two inputs that produce the same hash.
6. Diffusion - the hashes for inputs are uniformly distributed throughout the space of possible hashes; in particular, the hashes for similar inputs are generally quite dissimilar (although this is usually a good property even for non-cryptographic hash algorithms).
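Three of these properties (fixed-length output, determinism, and diffusion) are easy to observe with Python's standard `hashlib` module. The sketch below uses SHA-256; the helper names and sample inputs are chosen only for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(x: bytes, y: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(a ^ b).count("1") for a, b in zip(x, y))


# Property 1: the output length is fixed regardless of the input length.
print(len(sha256_hex(b"a")), len(sha256_hex(b"a" * 1_000_000)))  # 64 64

# Property 2: the same input always gives the same output.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))  # True

# Property 6: similar inputs give very different digests.
d1 = hashlib.sha256(b"attack at dawn").digest()
d2 = hashlib.sha256(b"attack at dusk").digest()
print(differing_bits(d1, d2), "of 256 bits differ")
```

For a hash with good diffusion, roughly half of the output bits would be expected to differ between the last two digests.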
Hashing is useful for verifying the integrity of a message. Rather than sending the entire message multiple times to ensure that it is received correctly, just send the message and its hash and if the hash received does not match the hash of the message received, then there has been an error. (Note that this method can only show that there are errors, not that there aren’t any errors. An attacker could easily replace the message and then replace the hash with the correct hash of the new message.) This method is used in network traffic for parity bits, checksums, and CRCs. Hashes can also be used for fingerprinting files for virus scanners.
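A minimal sketch of this integrity check, and of its limitation, follows. The function names `package` and `verify` are hypothetical, and SHA-256 is used as the hash for illustration.

```python
import hashlib


def package(message: bytes):
    """Return the message together with its hex digest, as a sender would."""
    return message, hashlib.sha256(message).hexdigest()


def verify(message: bytes, digest: str) -> bool:
    """Recompute the hash of the received message and compare it."""
    return hashlib.sha256(message).hexdigest() == digest


msg, digest = package(b"transfer $100 to Bart")

# Received intact: the check passes.
print(verify(msg, digest))                         # True

# Corrupted in transit: the check fails.
print(verify(b"transfer $900 to Bart", digest))    # False

# An attacker who replaces both the message and the hash goes undetected.
forged_msg, forged_digest = package(b"transfer $900 to Eve")
print(verify(forged_msg, forged_digest))           # True
```

The last case is why a bare hash detects accidental errors but does not by itself protect against a deliberate attacker.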
Cryptographic hash algorithms are designed to be collision resistant. The computational effort required to find inputs that have a hash collision is a relative measure of how secure a hash algorithm is.
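To get a feel for how collision effort scales with output size, the sketch below searches for a collision in a deliberately weakened hash: SHA-256 truncated to a few bits. The truncation and the function names are assumptions made for illustration. A generic search like this is expected to need on the order of $2^{n/2}$ attempts for an $n$-bit hash, which for a 160-bit hash is about $2^{80}$.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of SHA-256 (a deliberately weak hash)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Hash the messages b"0", b"1", b"2", ... until two of them collide.

    Returns (first_message, second_message, attempts).
    """
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


m1, m2, attempts = find_collision(16)
print(m1, m2, attempts)
print(truncated_hash(m1, 16) == truncated_hash(m2, 16))  # True
```

With 16 bits, a collision typically turns up after a few hundred attempts, far fewer than the $2^{16} = 65536$ possible hash values. Adding bits to the truncation makes the search correspondingly slower.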
MD5 is a popular hash function, although several serious flaws have been found in its design and numerous collisions have been found.
SHA-1 is the most commonly used hash algorithm today. It divides the input into 512-bit blocks and processes each block as 16 32-bit words, with 80 rounds for munging all the bits together using multiple binary operations (binary shifts, XOR, AND) along with the use of some special constants. It is the successor to SHA-0. Weaknesses have been found in both SHA-0 and SHA-1. Recently, it has been shown that two messages that have a hash collision can be found in $2^{63}$ operations (which is significantly less than the original $2^{80}$ operations that were thought to be required). However, just because two messages have a hash collision does not mean that they are semantically meaningful and can be used for an attacker’s gain. SHA-1 is currently approved by the U.S. Government as a secure hash algorithm.
There are four flavors of SHA-2, namely SHA-224, SHA-256, SHA-384, and SHA-512, based on the number of bits in the hash produced. SHA-2 is a more recent algorithm and has not been scrutinized as much, but no weaknesses have been found yet. All four flavors of SHA-2 are also currently approved by the U.S. Government as secure hash algorithms.
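The output and block sizes of these algorithms can be read directly from Python's standard `hashlib` module. In this short sketch, the helper name `sizes_in_bits` is made up for illustration.

```python
import hashlib


def sizes_in_bits(names):
    """Map each algorithm name to (digest bits, internal block bits)."""
    result = {}
    for name in names:
        h = hashlib.new(name)
        result[name] = (h.digest_size * 8, h.block_size * 8)
    return result


table = sizes_in_bits(["sha1", "sha224", "sha256", "sha384", "sha512"])
for name, (digest_bits, block_bits) in table.items():
    print(f"{name}: {digest_bits}-bit digest, {block_bits}-bit blocks")
```

SHA-1 reports a 160-bit digest over 512-bit blocks, and each SHA-2 variant reports the digest length in its name. SHA-224 and SHA-256 use 512-bit blocks, while SHA-384 and SHA-512 use 1024-bit blocks.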