Loading

Module 1: Pseudo-random Generators

Notes
Study Reminders
Support
Text Version

Pseudo-random Functions

Set your study reminders

We will email you at these times to remind you to study.
  • Monday

    -

    7am

    +

    Tuesday

    -

    7am

    +

    Wednesday

    -

    7am

    +

    Thursday

    -

    7am

    +

    Friday

    -

    7am

    +

    Saturday

    -

    7am

    +

    Sunday

    -

    7am

    +

Foundations of CryptographyProf. Dr. Ashish Choudhury(Former) Infosys Foundation Career Development Chair ProfessorIndian Institute of Technology-BangaloreLecture-14Pseudo Random Functions PRFsHello everyone, welcome to lecture 13.(Refer Slide Time: 00:31)The plan for this lecture is as follows. We will introduce a very important building block for the symmetric key cryptography namely pseudo random function. And later we will see that how the pseudo random function acts as a fundamental building block for designing many beautiful symmetric key primitives. We will also discuss variants of pseudo random function namely pseudo random permutation and strong pseudo random permutation. We will see how to construct pseudo random generators from pseudo random function.(Refer Slide Time: 01:01)So, let us start our discussion on pseudo random function. On a very high level, a pseudo random function is a deterministic algorithm with two inputs and a single outputwhere and . And we will assume that the key size is n and n is some security parameter. So that is why we often call the function F as a keyed function because it is going to be operated by a key. In practice, the size of n, l and L can all very and later on we will see various instantiations of pseudo random function. Indeed n, l, L all are different, but asymptotically everything has to be some polynomial function of your security parameter.Now, how we are going to use this pseudo random function. So, whenever we are designing any cryptographic primitives, the way we use this pseudo random function is as follows: at the beginning of the instantiations of the cryptographic primitive, which uses this function F, either the sender or the receiver is going to generate a uniformly random key. And somehow it will be established with the receiving party as well.And it would not be known to the attacker. Once the key has been fixed, we are not going to change the key throughout that session or throughout that instantiation. The key will remain the same. Now, once we fixed the key, where k is going to be fixed and it would not be known to the attacker. That is the way we are going to use a pseudo random function. Now, what exactly is the security property we require from the pseudo random function. And informally we require the following: if the key k is unknown, and uniformly randomly chosen from the domain of the key space, then we basically require that the behavior or the output of the key function of Fk should almost resemble the output of any truly random function to . So, remember, a truly random function is an unkeyed function. So, what basically want is that keyed function Fk, once the key has been randomly chosen, its behavior should almost resemble the behavior of a pseudo random function.(Refer Slide Time: 03:53)So, let us go a little bit deeper into what exactly I mean by the formal statement. Imagine you have a truly random function which is an unkeyed function, it does not have any key, it just takes an input of size l bits and it produces an output of size L bits. So, it is easy to imagine the behavior of a truly random function as follows: what does truly random function basically takes is an input of size x, and it produces an output of size L bits.And you can imagine that basically, this truly random function maintains a table consisting of 2lrows, where basically the first-row stores . The second-rowstores and the last row stores .So, whenever this truly random function receives an input x, what basically does is it internally checks whether there is already an entry for the value of this truly random function at the input x.If it is not there, then fill that row, namely F(x) by . For the future invocations of this truly random functions said that y to be the output of this truly random function on the input x.On the other hand, if the value x which has been passed, you already have an entry corresponding to that key x in this table then just pass on the value which is stored in that corresponding row as the output y. So that is the way you can imagine the behavior of a truly random function. The important thing here is that since this function is a truly random function, each row here is an independent string of length L bits. That means and are independent of each other, for any . That means, if there exist an algorithm, which has been notyet seen the value of the truly random function on some input x, it cannot predict what exactly the value of the function is going to be for that input x, except for guessing the output, and the guessing will be successful with probability 1/2L. Apart from that, there is no way to predict outcome of the truly random function on some an input x.And this holds even if that algorithm which is actually trying to predict outcome of this truly random function on the input x has already seen the output of this truly random function on several previous x values, which might be related to this new x values. But this is because each row in the table of this truly random function is independent of each other. The security property that we require from a pseudo random function is that once we fix the key by selecting the key uniformly randomly then that keyed function should also have the similar properties except with a negligible probability.(Refer Slide Time: 07:02)So, what exactly that means is that on your left-hand side you have a keyed function Fk, where the description of the function F is publicly known. I stress the description of the function is publicly known and what is not known is basically the value of the key. On the right-hand side, you have a truly random function which basically consists of 2Lrows each entry consisting of a uniformly random string of length L bits.And when I say that my function F is a pseudo random function, what basically I mean is that there exist no polynomial time statistical test or statistical algorithm which can significantly distinguish apart an output of the algorithm Fk from the output of this truly random function. That means, if our distinguisher or the statistical test is basically given outputs of either the function Fk or the output of the random function on several inputs of adversary’s or distinguisher choice.From the viewpoint of the distinguisher, those output could be as likely to come from the function Fk as it is likely to come from this truly random function. Now, before we go further, this condition is similar to the way we have defined the notion of pseudo random generator. So, remember in the pseudo random generator that security is defined by saying that there exist no statistical test, which when given a sample cannot distinguish apart whether that sample is generated by running a pseudo random generator, or by running a truly random generator.In that experiment, that distinguisher was given only one sample because our pseudo random generator is a single input function. For each invocation of the pseudo random generator, the sample is going to be different because the key for the pseudo random generator is going to be different. But in the context of pseudo random function, the way we are going to use a pseudo random function in real world application is that the key will be fixed once for all at the beginning of the session. And then the x inputs are going to be varied. And each invocation of the function will be with the same key.(Refer Slide Time: 09:14)So what we are now going to do is that in our formal definition, basically the adversary is going to be given many samples, and he has to distinguish apart whether those samples are generated by running a keyed function Fk or by running a truly random function. So let us see what exactly are the formal details. So you are given the description of a publicly known function, and the experiment we call as indistinguishability experiment against a PRF with respect to a distinguisher algorithm against the function F and l is the security parameter.The rules of the games are as follows: the distinguisher is allowed to ask for the function output at many x inputs of his choice, and it can ask its queries adaptively that means it can first ask for the function output on input x1. Then based on the response, it can decide what should be x2. And then based on the response, it can decide what should be x3, and so on. We put absolutely no restriction on what kind of queries distinguisher is asking.Now, once the distinguisher submits its queries, the challenger here has to come up with the response, namely, the output of the functions at those inputs. And the way the challenger would have prepared those response is as follows: basically, the challenger is going to toss a uniformly random coin, which is either going to output 0 or 1 with probability 1/2. If the coin toss is 0, then all these responses y1,.., yq are basically generated by running a truly random function on those x inputs. In a more detail all this y strings are basically independent of each other and each of them is basically a uniformly random bit string of length L bits. On the other hand, if the coin toss of the challenger is 1, then this y outputs are basically the output of a keyed function Fk, where the key is chosen uniformly randomly by the challenger. And now the challenge for the distinguisher is to find out how exactly this responses y1,.., yq are generated, whether they are generated by mechanism 0 or whether they are generated bymechanism 1. That is a challenge for our distinguisher. So, distinguisher in this case outputs b’ which is going to be a bit which basically says whether it feels that a samples y1,.., yq are generated by mechanisms 0 or by mechanism 1. And our securitydefinition is we say that the function F is a pseudo random function if for every probabilistic polynomial time algorithm D participating in this experiment, there exist a negligible function such as that the probability the distinguisher correctly identifies the label or the nature of the samples y1,.., yq is upper bounded by ½+ negl(n). Again, the probability is taken here over the random choice of the challenger and the random queries of the distinguisher.Another equivalent formulation of the same definition is that we say that the function F is a PRFif the distinguishing advantage of our distinguisher is upper bounded by a negligible function. That means it does not matter whether b = 0 or whether b = 1 which means it does not matter whether the y samples are generated by a truly random functions or whether they are generated by a pseudo random function.In both the cases distinguisher should output the same output, say b’ = 1 except with negligible probability. And again, we can prove that if we have a pseudo random function which satisfies the first condition, then it applies it is also satisfies the second condition and vice versa. So depending upon our convenience, we can use either of these two definitions. So that is the definition of a pseudo random function.Basically, the intuition in this experiment is that we are giving our distinguisher an oracle access to the function where the function is either a truly random function or a keyed function. And distinguisher has to distinguish apart whether it is interacting with a truly random function oracle or whether it is interacting with a keyed function oracle. This security definition demands that except with negligible probability it should not be able to distinguish.Notice that here, we are required to upper bound the success probability of the adversary by ½+negl(n). We cannot put a definition saying that a success probability of the distinguisher should be 0 because there is always possibility that distinguisher who can just guess that it is interacting with say either TRF or PRF and with probability half it can actually correctly identify or the probability half its guess could be correct.So, we can never put a condition that the success probability of the distinguisher should be 0. The additional negligible advantage is basically due to the necessary evil associated with the fact that we are in the computational world.(Refer Slide Time: 14:25)So, now let us see, whether it is easy or whether it is how easy or how difficult it is to construct a pseudo random function. So, imagine I design a function F as follows and for simplicity, I assume that the key length and block length and output length are of same size namely say n bitstrings and the way the output of the function is defined . That is the way output is computed.Our goal is to either prove or disprove whether this function is a pseudo random function. In fact we want to disprove this construction is not a PRF. And for that, we basically want to argue whether indeed, the outputs of this function F is going to produce pseudo random outputs once we fix the key. And if you go a little bit deeper into the algorithm, you can clearly see the following fact: if we have a truly random function mapping n bits strings to n bit strings then the output of the truly random function on 2 different inputs x1 and x2 will be completely independent of each other. On the other hand, for the function of that we are considering, , for any and any .That means you now have a test which will always pass or which will always hold for the samples which are generated by the function Fk. And you have a test, the same test may not always be applicable for the samples generated by a truly random function.So, this basically gives us an intuition to design a distinguisher which can distinguish apart the outcome of this function F from the outcome of a truly random function. So, here is the instance of the distinguisher: it basically asks for the value of the function at input x1, x2 which are different. In response, the challenger replies with outputs y1 and y2. The way it is y1 and y2 would have been generated as per the PRF indistinguishability game is as follows: the challenger would have basically tossed a coin if the coin toss is 0 then y1 and y2 are random n bit strings. Whereas, if the coin toss is 1 then y1 and y2 the outcomes of the keyed function Fk for the uniformly random key known only to the challenger. And now distinguisher can act smartly and basically distinguish apart whether y1 and y2 are generated by a truly random function or a pseudo random function by just performing this test.It checks whether x1 and x2 their XOR is the same as the XOR of y1 and y2. If that is the case,then it says that, the samples y1 and y2 are generated by the mechanism b = 1, namely, it submits b’ = 1. Whereas if the test fails, and it says, the samples y1 and y2 are generated by mechanism 0, namely b’ = 0. Now, let us analyze what is the distinguishing advantage of this particular distinguisher.So let us first analyze what is the probability that our distinguisher is correctly labeling the samples y1 and y2 generated by a pseudo random function indeed being the samples of a pseudo random function namely the pr[D outputs b’ = 1/ b = 1]. I claim that this probability is equal to 1. Because if indeed b = 1, that is a case, the samples y1 and y2 are as per the outputs of a pseudo random function. And in that case, this condition, the check that the adversary or the distinguisher is performing will always pass. That is why the probability 1, if b = 1, the strategy of the distinguisher will indeed output b’= 1.On the other hand, let us calculate the second probability that what is the probability that our distinguisher incorrectly labels truly random samples y1 and y2 being the samples of a pseudo random function. Well, if b = 0, that means our samples y1 and y2 are independent of each other. Then the only way the distinguisher can still output b’ = 1 is that for uniformly random y1 and y2this condition holds or in other words, the pr[D output b’ = 1/b = 0] = 1/2n.So that gives us the distinguishing advantage of the distinguisher that we have designed. And if you take the absolute difference, it is almost equal to 1 that makes with almost 100% probability. If n becomes larger then this 1 – 1/2nalmost becomes 1. So that is why with almost 100% probability a distinguisher can distinguish apart the outcome of key function F from the output of a truly random function. And that is why this function F is not the pseudo random function. So that means designing pseudo random function is indeed a challenging task. We will see the candidate constructions later on.(Refer Slide Time: 19:56)Now let us just define some other variants of pseudo random functions with more stronger properties and security guarantees. So the first variant is called as the pseudo random permutation, which is also known as block cipher. And here again we have a keyed function F.The only difference here is that the keyed function Fk should be now a bijection, . That is the only difference.Informally, the security property that we require here is that we require that once we fix the key by selecting a uniformly random key, and the key is not known to the attacker or a distinguisher. Then, no polynomial time distinguisher can distinguish apart the output behavior or the nature of this key function Fk from a truly random bijection mapping L bit strings to L bit strings, which again can be modeled as an indistinguishability experiment.(Refer Slide Time: 20:53)So this is indistinguishability experiment we call as the PRP indistinguishability experiment and we have a bijection here, keyed bijection. We want to capture the intuition that no distinguisher should be able to distinguish apart the behavior of this keyed bijection from an unkeyed truly random bijection. So the rules of the experiments are as follows: distinguisher queries for several x inputs of its choice and in response the challenger gives back the answers. The answers are prepared by pulling either of the following rules: it tosses a coin and if coin is 0, then all these samples y1,.., yq are basically generated by running a truly random permutation. Whereas if the coin toss is 1 that all these samples are generated by running the keyed function F on a uniformly random key not known to the distinguisher. The challenge for the distinguisher is to find out what exactly is the way this the samples are generated. That means it has to output a bit and our security definition is that we say that keyed bijection F is a PRP, if the probability that any polynomial time distinguisher can correctly identify the nature of the sample is upper bounded by ½+negl(n). Equivalently saying that the distinguishing advantage of our distinguisher should be upper bounded by a negligible function.So, in essence everything is same as for the case of pseudo random function, the only difference is that we are now basically in this case of PRP function is now a bijection.(Refer Slide Time: 22:34)So, it is interesting to see the relationship between these 2 primitives pseudo random function and pseudo random permutation. So, on your left-hand side part you have a pseudo random function. The difference here it is a function, that means the input length, the block length and output length could be different. Whereas in the case of pseudo random permutation, it is abijection. That means, in the case of pseudo random functions, it could be a many to one functionwhereas in the case of pseudo random permutation, it is a one to one mapping. Since our PRF may not be a bijection, it is easy to see that a PRF may not be a PRP. What about the other way around? Interestingly, we can prove that if the output size L is greater than equal to l, or in more generic terms, if the output size is some polynomial function of the security parameter n then we can view a pseudo random permutation as a pseudo random function. And the intuition for this statement is as follows.We imagine that we are given a keyed permutation. So, this Fk is a keyed bijection. And since it is a secure PRP that means no polynomial time distinguisher can distinguish apart and interaction with this keyed function Fk from a truly random unkeyed bijection, that sense both this primitives Fk and FTRP are computationally indistinguishable.Now, if we compare a truly random function mapping L bit strings to L bit strings, how exactly it is going to be different from a truly random permutation, well, the only difference between a truly random function from a unkeyed truly random function and an unkeyed truly random permutation is that a function need not be a bijection. That means there are chances of collisions that means it could be a many to one function where several x inputs could give you the same y output.Whereas in the case of truly random permutation, there are no chances of collisions. So the only way any distinguisher can distinguish apart unkeyed truly random function from this keyed bijection Fk is the following. If it so happens that our distinguisher is interacting with unkeyed truly random function and if so happens that some of its queries gives you the same output and it can clearly identify that it is interacting with a unkeyed truly random function.Because if it is interacting with this keyed bijection Fk, the collisions are not going to happen. That means we can say that conditioned on the event that our distinguisher queries are never going to lead to a collision, then the interaction of our distinguisher with Fk and FTRF is almost the same as if the distinguisher is interacting with Fk versus FTRP and since our function Fk is a keyed permutation. We know that that interaction is computationally indistinguishable. So, condition on the event at our distinguisher is not getting collisions from his queries, the interaction of our distinguisher with the unkeyed truly random function and keyed bijection Fkare almost going to be identical. Now the question is what is the probability that the distinguisher getting a collision by making q random queries through the unkeyed truly random function?If it is making q random queries then using a well known result, which we call as the birthday paradox, which we will discuss more rigorously in the context of hash function, we can prove that the chances of getting a collision can be upper bounded by the probability q2 /2L. And that iswhy if your L is some polynomial function of the security parameter n, then clearly, this is a negligible quantity. That means the chances of collisions being happening is negligible. And that is why we can say, or we can treat the keyed bijection Fk also as a pseudo random function. So that is the relationship between pseudo random functions and pseudo random permutations.(Refer Slide Time: 27:00)Now let us see the final variant of pseudo random functions which we call a strong pseudo random permutations or SPRP, which is a special kind of pseudo random permutation. And basically here we require that the keyed bijection Fk should be indistinguishable from a truly random permutation even if the distinguisher gets oracle access to the inverse of the permutation.What I mean by that is demonstrated in this experiment.So, this indistinguishability experiment is called as Experiment . And here the distinguisher now gives access or response for 2 types of queries. It has got oracle access to the values of the permutations, and it also has oracle access to the inverse of the permutation. What I mean by that is it can adaptively ask for the value of the permutations at several x inputs of its choice and in response, it gets back corresponding y outputs.And it is also allowed to ask for the inverse of the permutation of several y values of its choiceand see the corresponding x values. And the way the challenger would have responded is as follows it would have tossed a uniformly random coin if the coin toss is 0, then all these queries are responded by evaluating a truly random permutation. That means all these x values are evaluated as per this truly random permutation. And all these inverse queries are also answered by querying the inverse of that corresponding truly random permutation. On the other hand, if the coin toss b = 1, then all these y values and all these inverse values are computed by running the keyed function Fk and inverse of that keyed function Fk. And the challenge for the distinguisher as usual is to identify whether it has interacted with oracle which represents a truly random permutation or whether it is interacted with an keyed oracle.Our security definition is we say that function F is a strong pseudo random permutation if no polynomial time distinguisher can correctly identify the nature of its oracle, except with probability ½+negl(n) or put it in other words, that distinguishing advantage of our distinguisher should be upper bounded by a negligible quantity. It turns out that, if we have a strong pseudo random permutation, then by definition, it is also a pseudo random permutation.And we can give constructions where the construction will be a pseudo random permutation. That means, it will be secure only when the adversary gets access to the oracle queries for the function output. But it may not be a strong pseudo random permutation that means, as soon as we provide the distinguisher access to the inverse oracle the adversary can distinguish apart. That means the strong pseudo random permutation is more stronger primitive than the pseudo random permutation.(Refer Slide Time: 30:01)Now, let me end this lecture by giving an example of how to construct a pseudo random generator from a pseudo random function. So imagine you are given a secure PRF,. Now, using this I can construct a pseudo random generator G, .And basically, the way this algorithm G operates is as follows: it takes the seed k for the algorithm G and create the k for the pseudo random function F. And the pseudo random function F is now evaluated at publicly known inputs 1 2 3 up to t. That means the block inputs that are used inside this algorithm G are publicly known they are 1 to up to t, it is only the key which is not going to be known to the distinguisher.And each invocation of this function F is with the same key, which is actually the input of our pseudo random generators. And the output of the pseudo random generator is basically the concatenation of the individual outputs which are obtaining by running the t instances of the keyed function Fk, that is the way our pseudo random generator is going to be operated. Now we want to prove that if the keyed function Fk is a secure PRF as per our notion of indistinguishability, then the algorithm G which we have constructed is also a secure PRG. As per the PRG indistinguishability game provided, the number of times we have invoked the pseudo random function Fk, namely t, which is some polynomial function of the security parameter.To prove that, let us first understand our intuition. We basically consider another version of the algorithm G, which I call as GTRG, where each instances of keyed function Fk is replaced by an instance of a truly random function mapping l bit strings to L bit strings. We introduced the concept of pseudo random function; we saw the definition. And we introduced various variants of pseudo random functions like pseudo random permutation, strong pseudo random permutation, and we had seen how to construct provably secure pseudo random generator from secure pseudo random function. Thank you!