Loading
Apuntes
Study Reminders
Support
Text Version

RSA Public-Key Cryptosystem

Set your study reminders

We will email you at these times to remind you to study.
  • Monday

    -

    7am

    +

    Tuesday

    -

    7am

    +

    Wednesday

    -

    7am

    +

    Thursday

    -

    7am

    +

    Friday

    -

    7am

    +

    Saturday

    -

    7am

    +

    Sunday

    -

    7am

    +

Foundations of Cryptography Dr. Ashish Choudhury Department of Computer Science International Institute of Information Technology – Bangalore Lecture – 45 RSA Assumption Hello everyone, welcome to this lecture. (Refer Slide Time: 00:42) So just to recap in the last lecture, we have seen Candidate Public Encryption Scheme, CPAsecure Public Encryption Scheme, mainly El Gamal encryption scheme. In this lecture and the next lecture, we will discuss another Candidate Public Encryption Scheme based on the different hard problem, which we call as RSA assumption, which will lead us to another public encryption scheme namely the RSA Public Encryption Scheme. (Refer Slide Time: 00:58)So to understand the RSA assumption, let us first try to understand the set ZN*. So ZN* basically is a set of integers modulo N, which are co-prime to the modulus N, namely ZN* is the collection of all elements b in the range {1,…,N-1} such that the element b is co-prime to the modulus N. Namely, their GCD is 1. For example, the set Z10* is nothing but the elements 1,3,7, and 9. Because all these elements 1, 3,7, and 9, they are co-prime to the modulus 10, and it is easy to see that if modulus N is prime, then the set ZN* basically consist of all the elements 1 to N-1. So, a well-known fact from the number theory whose proof I am skipping is that the set ZN* along with the operation multiplication modulo N constitutes a group and satisfies all the axioms of the group. So, for instance, I have drawn the matrix for the elements in Z10*, namely 1, 3, 7 and 9, and the result of performing the operation multiplication modulo 10 for any pair elements from the set, and now you can see that all the axioms of the groups are satisfied and that is why this is a group. Moreover, a well-known fact from the number theory is that we have polytime algorithms for computing the multiplicative inverse, which I denote by, a-1 , for any element a in the set ZN*. And there is a well-known algorithm for that, which we call as Extended Euclid’s GCD algorithm, which we will discuss in one of our subsequent lectures, and the running time of that algorithm is polynomial in the number of bits that we need to represent the modulus N. (Refer Slide Time: 02:47)So, let us next discuss Euler’s Totient Function, φ function. So the Euler’s Totient Function, which is the denoted by the symbol φ(N) is basically the cardinality of the set ZN*, namely it is the number of elements in the set 1 to N-1, which are co-prime to your modulus N. Some of the well-known facts again borrowed from Number theory are as follows. If N is a prime p, then the value of φ(p) is nothing but p -1. Because there are exactly p-1 elements, which are co-prime to p. On the other hand, if n is the product of 2 distinct primes, p and q, then φ(N) is nothing but p-1 times q-1. That means, there are these many elements, namely p-1 times q-1 elements in the range {1,…., N-1}, which will be co-prime to your modulus N. To verify that, let us take an example, say N = 10, which you can write as the product of 2 and 5, where 2 and 5 are prime. So your p is 2 and your q is 5, then we know that there are 4 elements in the set Z10*, so φ(10) should be 4, and 4 is nothing but p-1, namely 2-1 multiplied by q-1, which is 5-1, and finally another interesting property which we will use from Number Theory is that, for any element a in the ZN*, aφ(N) modulo N is 1. That means, if N is a prime p, then this φ(N) is nothing but p-1, so we get ap-1 modulo p to be 1. This property or this result is also called as Fermat’s Little Theorem, and we also get due to this fact, the corollary that if you have any element a belonging to the set ZN*, then ax modulo N is exactly the same as ax modulo φ(N) modulo N. That means, in the exponent you can reduce the exponent x to x modulo φ(N). So the reason this corollary issue is that, if your x is anyhow less than φ(N), then both L.H.S. and R.H.S. are same, but if your x is more than φ(N)then to compute ax modulo φ(N), what you can do is, you can rewrite ax as several batches of aφ(N) with φ(N) in the exponent. aφ(N) depending upon the relationship between x and φ(N), and the last batch will have ax modulo φ(N) , and everything in the base modulo N. Now, we know that aφ(N) modulo N is 1, that means each of these batches, you have full aφ(N) modulo N, are going to give you the answer 1. So, all this batches are going to give you the answer 1, 1, 1, 1 and 1 multiplied by 1 is going to give you 1. So, you are finally left with the last batch, which has ax modulo φ(N) . So, that is how we get this corollary. (Refer Slide Time: 06:03) So, with all this mathematics in place, now let us try to understand the RSA permutation. So this RSA permutation is a mapping from the set ZN* to ZN*, so the way this permutation is defined as follows. Imagine, you are given an exponent e, which is greater than 2, such that the exponent e is co-prime to your value φ(N). So, I stress it is co-primed to φ(N), not to N. Now, a well-known fact from Number Theory is that if you have an e, which is co-prime to φ(N), then you can find out the multiplicative inverse of that e modulo φ(N). In general, if you have 2 numbers, say a and b, such that GCD of a and b is 1, that means, a and b are co-prime to each other, then you can always find the multiplicative inverse of a modulo b. So, my modulus in this case is φ(N), because e is co-prime to φ(N), and if e is coprime to φ(N), by using the Extended Euclid algorithm, I can find d where e and d constitutes their mutual multiplicative inverses. That means, e times d modulo φ(N), and d times e modulo φ(N) is 1. Now, the RSA permutation is defined as follows. To go in the forward direction, the function is fe, and to compute the value of fe(x) where x is a member of ZN*, we basically compute xe and do mod N. On the other hand, the reverse direction function is basically a function of d, namely, fd(y), and this function is basically if I want to compute fd(y), I will compute yd , and then perform mod N. I claim here that the function fd is the inverse of the function fe and vice versa. So, for that consider any arbitrary x here, belonging to ZN*, and say I map it to y, as per the function fe, so my y is nothing but xe , and now let’s see what we obtain by reversing this value y as per the function fd. If I reverse this y as per fd, then I obtain xe mod N, and then whole raise to the power d mod N. So I can take the overall mod outside, and I obtain that this is the same as xe times d modulo N, and by using the corollary that we have stated in the last slide, xed modulo N in the exponent, I can reduce this exponent e*d to e*d modulo φ(N). I know that e*d modulo φ(N), which is there in the exponent, gives me value 1, because I have chosen e and d such that they are multiplicative inverse of each other. So, in the exponent, it is e times d modulo φ(N) reduces to 1, so hence I obtain that this xed modulo N is nothing but x modulo N, and since I have chosen my x to belong to the set ZN*. That means x is anyhow less than N, and if x is less than N, then x modulo N gives you x such that the effect of mod does not happen at all. So this proves that your function fd and fe are mutually inverse of each other. I also prove here now that my function fe(x) is a bijection, and that means, it is a one-to-one onto mapping. For that, let us consider an arbitrary pair of inputs x1, x2, from the set ZN*, and say the resultant images of x1 and x2 as per the function fe, are y1 and y2. Now, if y1 and y2 are same, that means, x1e modulo N, is same as x2e modulo N, that means, if I raise both the sides to the exponent d, I get x1 ed modulo N is same as the x2 ed modulo N. This basically means that x1 is equal to x2, because x1 ed modulo N, is nothing but x1 and x2 ed modulo N is nothing but x2. So that means, if I take an arbitrary x1, x2 mapping to the same y, then basically I end up showing that x1 is equal to x2, and hence I overall get that my function fe is a permutation here. (Refer Slide Time: 10:32)Now, let us discuss resultant assumption, which will be related to the RSA assumption, which we will finally discuss. This is called the Factoring Assumption, and the intuition behind this assumption is that, if you are given the product of two large arbitrary prime numbers, it turns out that finding the prime factors is indeed a difficult task. This is a very widely known and well-studied problem studied over several centuries. And till now, we do not have efficient polytime algorithms for factorizing a product of two arbitrary prime numbers, if those prime numbers are chosen arbitrarily. So, this intuition or the hardness of this problem is going to be formalized by an experiment, and for that experiment, let us first define an algorithm, which we call as GenModulus algorithm. It basically picks two random n-bit prime numbers, say p and q, and there are well-known algorithms to pick random prime numbers. I am not going into the exact details at how exactly you pick those two random prime numbers, in poly(n) time. You can refer to the book by Katz and Lindell for the exact algorithm. Now, once p and q are chosen, you compute the modulus, namely N, which is the product p and q. Now, the Factoring Assumption, with respect to the GenModulus algorithm, is modeled as an experiment, which we call as Factor Experiment, and the rules of that experiment are as follows. The challenger runs the GenModulus algorithm, generates the parameter N, p, and q, and the challenge is thrown to the adversary, namely N, and the challenge for the adversary is to come up with its prime factorization, namely p and q. So, it submits a pair of numbers, p’ and q’, and we say that the output of the experiment is 1, namely adversary has won the experiment if indeed, the pair p’, q’ is exactly the same as the prime factorization of your modulus N, and we say that the Factoring Assumption holds with respect to our algorithm GenModulus, if for every polytime adversary, the probability that the adversary could come up with correct factorization of the modulus N, is upper bounded by some negligible function. Notice that, there is always a guessing strategy by an adversary, it can guess some p’ and q’, and with non-zero probability, it may indeed with the correct factorization of N. (Refer Slide Time: 13:00) What we want is basically that no polytime adversary should be able to do anything better than that. That is basically the intuition of this experiment. Now, finally let us see the RSA assumption here. So, it turns out that Factoring Assumption, even though it looks like a candidate One-Way Function, that means that if I give you the product of two arbitrary large prime numbers and challenge you to come up with the factorization, then that looks like a candidate One-Way Function. But just based on this candidate One-Way Function, we cannot directly get a practical public-key cryptosystem. So, to get around that, we introduce a related problem, which we call as an RSA problem, whose difficulty is related to the hardness of the factoring problem. So, this RSA problem is described with respect to another parameter generation algorithm, which we call as GenRSA. So, what does GenRSA algorithm does is, it first runs the GenModulus algorithm and pick up random primes p and q of size n-bits each, and multiples them to get the modulus N, and now it computes the value φ(N) and it is computable because φ(N) is nothing but the product of p-1 and q-1. Then this GenRSA algorithm picks an exponent e such that e is relatively prime or co-prime to φ(N). And since e is co-prime to φ(N), by running the Extended Euclid’s algorithm, the multiplicative inverse d of e modulo φ(N) can be computed. Overall, this GenRSA algorithm now outputs N, p, q, e, and d. Intuitively, the RSA problem is, if you are given just a public modulus N and the public exponent e, and the random element y from the ZN* set, then the challenge for you is to compute the eth root of y modulo N without actually knowing the actual value of d or without knowing the prime factorization of N. I stress that the challenge is to compute eth root modulo N because if I just challenge you to compute the eth root without modulo N, that is very easy to do that. The challenge here is to compute the value of eth root of the random y modulo N, which is formalized by this experiment which I call as RSA-inv experiment. The experiment is as follows, the challenger runs the GenRSA algorithm to generate the parameters N, p, q, e, d are random-wise chosen from the set ZN*. Again, if you are wondering that how picking up a random element from the set ZN* is possible in poly(n) time, well it is possible, you can see the reference book by Katz and Lindell where the algorithm is given of how to pick random elements. Now the challenge for the adversary is to find out the eth root of this random y just based on the knowledge of public modulus N and the public exponent e. So, it has to output a group element say x and we say that adversary has won the experiment or the output of the experiment is 1, if and only if, x is indeed the eth root of random y modulo N, namely if xe modulo N is y. We say that RSA assumption holds with respect to this GenRSA algorithm, if for every polytime algorithm participating in this experiment, the probability that it can come up with the correct eth root of a random y is upper bounded by some negligible probability. (Refer Slide Time: 16:23)Now, let’s try to compare the RSA assumption and Factoring Assumption, because on a high level, they might look similar to you. So, on your left hand side you have the factoring experiment where the challenge for the adversary is to come up with the prime factors of the public modulus N. On the right hand side, you have the experiment modeling the RSA problem where adversary is now given some extra information as part of its challenge, namely, it is given the public exponent e and the random y from the set ZN*, and its goal is to come up with eth root of random y modulo N. It turns out that we can prove that if RSA assumption holds, then Factoring Assumption also holds and this can be proved by contrapositive, namely, if you assume that factoring is computationally easy, that means it is possible for a polytime adversary to come up with a prime factorization p, q of a modulus N. Then, once you factorize your modulus N into its prime factorization of p and q, then you can easily compute φ(N) because φ(N) is nothing but p-1 times q-1. And if φ(N) is computable in polytime and anyhow e is given to you, then by running the Extended Euclid algorithm, you can yourself compute the multiplicative inverse of e, namely d, and once you compute d, the eth root of y is nothing but yd modulo N. That means, this implication is indeed true. On the other hand, let’s try to consider the relationship between the Factoring Assumption and RSA, in the sense, can we say that if Factoring Assumption holds, then RSA assumption holds and we do not have any answer for this. That means, we cannot prove that if RSA assumption does not hold and the Factoring Assumption does not hold. It is a big open problem because there might be a way to efficiently solve the RSA problem without actually factoring your modulus. But we can have some kind of tradeoff and for most of the many practical instantiation, this is a common click, which is used to make the encryption process fast. Now, assume we are using an instantiation of RSA, where we have set e to be 3, and say we are consider an application where the underlying messages m ∈ZN* , but the magnitude of m is strictly less than the eth root of the modulus N. And if you are wondering that what kind of applications encounters such kind of m, if later on we will see the Hybrid RSA encryption. Then the size of the modulus that we typically use is of size 1024 bits, and the message that we would like to encrypt to the Hybrid RSA encryption will be roughly a 300-bit uniformly random bit string and we will set e to be 3. If that is the case, then the cipher text will be m3 modulo N. But since my magnitude of m is strictly less than the cube root of m, me or m3 modulo N will be equivalent to the integer cube and the effect of mod N will not take place at all. And if adversary is aware of the fact that we are using an application with underlying plain text, it is magnitude is strictly less than the cube root of modulus N, then it also will be aware that the underlying m, which has been encrypted. Its cube is actually contained in the cipher text c, namely the integer cube not the modulo cube, and it is very easy to recover the underlying unknown m from the c by just finding the integer cube root of the cipher text c, which can be computed in poly of the size of modulus N amount of computation. So that is our first class of attack. (Refer Slide Time: 14:14)We can have many other kinds of attack possible on Plain RSA Cipher, so imagine for the moment that we are considering an application where the same message needs to be encrypted and sent to multiple receivers. So, imagine if we say we have 3 receivers, receiver 1, 2, and 3 each of it has his own modulus N1, N2, and N3 respectively. But each of them is using a same public exponent say 3, and of course a different decryption exponent d1, d2, and d3 where d1 is the multiplicative inverse of 3 modulo N1, d2 is the multiplicative inverse of 3 modulo N2 and so on. So, the public key, which is available in the public domain are the respective public keys of the 3 receivers and imagine that we have a sender, which has a random message m ∈ZN1 * , as well as ZN2 * , as well as ZN3 * randomly chosen and say it wants to encrypt and send a same message to the 3 receivers. So, it will compute c1 which will be the m3 modulo N1, c2 which will be the m3 modulo N2 and so on. Now, assume that this modulus N1, N2, and N3, they are pair-wise prime. If not suppose for instance, the GCD of Ni and Nj is not 1, they are not co-prime to each other. That means there exist a common factor of Ni and Nj, then using that common factor, it is computationally easy to factorize either the modulus Ni or Nj and say for instance if Ni is factorizable, and since ei is also publicly known and if we know Ni’s factors, we can compute the value of i). And once we know i) and ei, we can compute the value of di in polynomial amount of time. If di is computable, then any encryption which is intended for the ith receiver can be decrypted by the adversary. So, it is safe to assume for the moment that pair wise, all the modulus N1, N2, and N3, they are co-prime to each other. Now, let me define a bigger modulus N* , which is the product of all the three modulus here, namely N1, N2, and N3, and since my underlying plaintext m, which is the sender has encrypted ∈ ZN1 * , as well as ZN2 * , as well as ZN3 *. That means m is strictly less than N1, it is strictly less than N2, it is strictly less than N3. That means the integer m3 will be strictly less than the bigger modulus N* and now what I can do is since my modulus N1, N2, and N2, N3 and N1, N3 they are co-prime to each other. I can invoke a very nice result from the Number Theory, which we call as the Chinese Remainder Theorem, which says that if your individual modulus N1, N2, and N3 they are coprime. Then it is possible to compute the integer value m3 in polynomial amount of time, polynomial in the size of the respective modulus N1, N2, and N3 such that the recovered m3 basically satisfies the system of following modular equations. So, you are given the value c1, which is some integer m3 modulo N1, where m3 is not known, and you are given c2, which is same unknown m3 modulo N2 and you are given the value of c3, which is the same unknown m3 modulo N3, and if your N1, N2, and N3, the respective modulus they are co-prime to each other. Then what basically the Chinese Remainder Theorem says is it is possible to recover that unknown m3 in polynomial amount of computation. Then, by applying the Chinese Remainder Theorem, an adversary who has eavesdrop these three cipher text c1, c2, c3, can apply the Chinese Remainder Theorem, and recover the unknown m3. Once the unknown integer m3 is recovered, it can find out the exact m. So, again this is another attack, which is possible on the Plain RSA Cipher even if we assume that a sender’s message m ∈ZN* set. It turns out that there could be several other possible attacks, which can be launched on the Plain RSA encryption process. (Refer Slide Time: 18:48)So, this brings us to the concept of what we call as padded RSA public-key cryptosystem and basically the general idea here is that, we have seen already that the Plain RSA encryption process is a deterministic encryption process. There is no internal randomness hidden and if there is no internal randomness, then we can never hope to achieve CPA security. So, the idea behind the padded RSA encryption process is that we try to append some random bit-strings to the plaintext bits, which we want to encrypt and convert everything into a group element in the set ZN* , and then we apply the RSA function for encryption. On the other hand, the decryption end, we will chop off the random bit strings, which we have added to the plain bit strings before encryption and then that will be the recovered plaintext.