Foundations of CryptographyProf. Dr. Ashish Choudhury(Former) Infosys Foundation Career Development Chair ProfessorInternational Institute of Information Technology – BangaloreLecture – 7Semantic Security(Refer Slide Time: 00:32)Hello, everyone, welcome to lecture 7. The plan for this lecture is as follows. We will define the notion of semantic security in the ciphertext only attack model and we will see an equivalent version of this definition based on indistinguishability game and we will also introduce reduction-based proofs which is central to cryptography.(Refer Slide Time: 00:51)So, let us begin with the semantic-security definition in the ciphertext only attack model and the scenario is the following. So, we are in the ciphertext only attack model where we have an adversary, and now we will consider a computationally bounded adversary. Because remember, in the last lecture we have discussed that if key reusability is your ultimate goal, then you have to ensure that your adversary is computationally bound.So, we assume we have a computationally bounded adversary, who is seeing a ciphertext c of some unknown message m encrypted by the sender using an unknown key k as per the encryption algorithm, where the steps of the encryption algorithm is known to the adversary. Intuitively, we will say that our encryption process is semantically secure in this ciphertext only attack model if the ciphertext does not reveal any additional information about the underlying plaintext to the attacker.Moreover, this should hold even if the adversary have any kind of prior external information about the underlying plaintext, which could have been leaked to the attacker by any other means before the ciphertext have been communicated. So even though this intuition is very straightforward to understand, it is extremely challenging to formalize the above intuition. So let us proceed to formalize this intuition, right.(Refer Slide Time: 02:11)So, we first introduce an abstract function here, namely h(m), which models any kind of prior external information about the underlying plaintext, which might be leaked to the adversary through other means before any ciphertext have been communicated. So this h(m) is kind of some history function. There might be in some context or might be a scenario where adversary might have absolutely no prior external information, in that case, my function h(m)will be an empty function, but there might be a scenario where adversary might have some prior external information about the underlying plaintext, which has been leaked to the adversary through some other means. So whatever is the case, we introduce this abstract function to model this prior external information about the underlying plaintext, which the adversary has.We next introduce another function f(m), which basically models the additional information about the underlying plaintext, which adversary would like to compute after seeing the ciphertext or which adversary would like to know, right? So this models the additional information and intuitively, the goal of semantic security is to ensure the following. We say that our encryption process is semantically secure if the probability with which adversary could compute this additional information, namely the function f(m) by using the ciphertext cand using the help of the history function or the prior information is almost the same with which the adversary could have computed the function f(m) by just using the prior information in the absence of the ciphertext. If that is the case, then the implication that we get here is that ciphertext is of no help whatsoever for the attacker in computing f(m).(Refer Slide Time: 04:04)What I mean by this is pictorially the following. So you imagine that in this world, we have an adversary, who is actually seeing a ciphertext, which is an encryption of some unknown message under the unknown key and the adversary also have access to the history function, namely any kind of prior information, which would have been leaked to the adversary through some external mechanism without the knowledge of the ciphertext. So the adversary in this world is called A. You compare by imagining another world where we have again another adversary, say A’, who does not see the ciphertext and this adversary A’ has access only to the history function, namely the prior information about the underlying plaintext, which sender might communicate over the channel. Now, the intuition behind semantic security is that the probability with which A and A’ could compute the f(m), namely the additional information about the underlying plaintext are almost the same, namely, we will say that our encryptionprocess is semantically secure in the ciphertext attack model if the absolute difference between the following two probabilities is upper bounded by some negligible function. So letus see closer. Let us have a closer look into this respective probabilities, namely :??[? ???? ? , ℎ ? = ? ? ] and ??[?’(ℎ(?)) = ?(?)]So your first probability is the probability with which the adversary A, namely the adversary in the first world, outputs the value of f(m), where the adversary is given the ciphertext as well as the history function. Whereas the second probability is the probability with which the adversary in the second world, namely the adversary A’, computes the value of f(m) just using the value of history function. So if the absolute difference between these two probabilities is upper bounded by a negligible probability, then what it means is that whatever adversary could have computed by seeing the ciphertext, namely whatever the adversary could have known about f(m) using the help of c with almost the same probability adversary could have computed f(m) without actually seeing the c. If that is the case, then it means that our encryption process is so good that ciphertext is kind of some random bit strings, and it helps or provides no kind of aid to the adversary in computing f(m) with a significant advantage, that is what is the intuition behind the notion of semantic security, right.(Refer Slide Time: 06:42)So this is the original definition of semantic security, and it turns out that if we want to prove the semantic security of an arbitrary encryption process as per this original definition, then this is slightly complicated, where because here you have to bring history function as well as here you have to bring the arbitrary function f(m) which adversary would like to compute. Instead, what we are going to do is we will see an equivalent version of this definition based on indistinguishability based experiment.This alternate definition based on indistinguishability based experiment, you can imagine that it is the computationally secure variant of indistinguishability based definition of perfect secrecy, right.(Refer Slide Time: 07:27)So, let us first recall the indistinguishability based definition that we use to define perfect secrecy. So, the essence of that indistinguishability based definition for defining perfect secrecy is that if you have a scenario where a sender has 2 messages m0 or m1 and it has randomly encrypted one of those messages, and if adversary is aware of the fact that the sender has either encrypted m0 or m1, then even after having this prior information and seeing the ciphertext c, adversary should not be able to identify what has been encrypted in the ciphertext c with probability better than half. That was the intuition that we wanted to capture through the indistinguishability based definition of perfect secrecy. This was captured very nicely by the following experiment, right. So this is the experiment, which we use to define the notion of perfect secrecy, where we have a publicly known scheme, namely a triplet of algorithms over some message space, and in the model of perfect secrecy, we had a computationally unbounded adversary, and the name of the experiment was this : ??????,Π ???. So just to recall the nomenclature of the experiment is as follows. PrivK denotes that we want to model an experiment, which in the context of a private key or symmetric encryption, eav means we are considering an adversary who is an eavesdropper, A is the name of the adversarial algorithm and Π is the name of the scheme, and in this experiment, the rules are as follows. Adversary is allowed to submit any pair of messages from the plaintext space with the restriction that the size of the two plaintext should be same, and the experiment or the verifier, the hypothetical verifier does the following:It randomly generates a key by running a key generation algorithm and it randomly encrypts one of the messages using the key, and the challenge for the adversary is to identify what plaintext has been encrypted in the challenge ciphertext c, whether it is m0 or m1. So adversary outputs a bit, namely its guess about what exactly has been encrypted in the challenge ciphertext. We define the scheme Π to be perfectly secure or we say that a scheme is perfectly indistinguishable if the probability with which adversary could successfully identify what message has been encrypted is upper bounded by half.So if adversary could successfully identify what message has been encrypted in the challenge ciphertext, then we say that adversary has won the game or we say that output of the experiment is equal to 1. That means this notation that output of the experiment is 1 denotes the probability that b’ equal to b. So in the context of perfect secrecy, our requirement was that a probability adversary could identify b, b’ equal to b correctly should be upper bounded by half.(Refer Slide Time: 10:25)Now, let us see the indistinguishability based definition to model the notion of semantic security in the ciphertext only attack model. We will make the following changes. The first change that we are going to make is the following. Instead of assuming that our adversary is computationally unbounded, we will assume that our adversary is computationally bounded.This is to model the first relaxation that we agreed that we should make in computational security model right.So remember the last lecture we discussed that if key reusability is your ultimate goal, then we should target to achieve security only against a computationally bounded adversary, namely an adversary whose running time is upper bounded by some polynomial function of the security parameter. So, that is why we made this first change in the experiment, we will not be considering an adversary whose running time is computationally unbounded. Consequently, the name of the experiment is going to be ??????,Π ??? (?).So, instead of saying EAV, I am now calling this experiment COA to denote that this is the indistinguishability experiment in the ciphertext only attack model and the second difference here in the nomenclature is that I am now introducing this security parameter n because the running time of the adversary is going to be upper bounded by a polynomial function of the security parameter, whereas if you see in the nomenclature for the indistinguishability based experiment in the world of perfect secrecy, no such security parameter was there because our adversary was allowed to have unlimited running time.So this is the second relaxation. This is the second change in the experiment, and the third change is that instead of saying that our encryption process is perfectly indistinguishable, we will say that our encryption process is computationally indistinguishable. If the probability with which adversary could correctly identify what has been encrypted in the challenge ciphertext is upper bounded by some negligible function plus half, that is a third change we are going to make in our experiment, right.So in the experiment for perfect secrecy, the requirement was that adversary should not be able to identify what has been encrypted in the challenge ciphertext with probability better than half, but now we are giving the adversary extra negligible advantage to correctly identify what has been encrypted in the challenge ciphertext c. This extra advantage, namely an advantage of negligible function and advantage of some negligible probability is to model the second relaxation that we have to make in the model of computational security if the key reusability is your ultimate goal.So again recall in the last lecture, we have seen that if you want to design a scheme wherekey reusability is your ultimate goal, that instead of demanding that adversary should not learn anything additional, you should be willing to let the adversary learn something about your underlying message or to let the adversary break your scheme with some additional probability and that additional probability should be so small in which it should be a negligible probability, which for most practical purposes you can ignore it off.So, that is why I am bringing this additional advantage of negligible function of n in my security definition. So, that is the computationally secure indistinguishability based version experiment in the ciphertext only attack model. So the essence of this experiment is the following. What we want to capture through this experiment is the following. If you have a scenario where a sender is having a pair of message, say m0 and m1 and if the adversary is aware of this where our adversary is computationally bounded, if one of these 2 messages m0or m1 has been encrypted by the sender and communicated to the receiver and our adversary intercepts a ciphertext, then we require the following property from our encryption process. We require that a computationally bounded adversary should not be able to identify whether the ciphertext c which he is seeing is an encryption of m0 or whether it is an encryption of m1with probability better than half plus negligible. That is what is the scenario or real world scenario we are trying to capture through this experiment. (Refer Slide Time: 14:33)So now we have the following 2 definitions. The first definition is actually the original definition of semantic security in the ciphertext only attack model, where we want to capture that the advantage of the adversary in first world and the adversary in the second world is upper bounded by negligible probability, whereas the second definition is the indistinguishability based definition. It turns out that both these two definitions are equivalent.Namely if we have an encryption process which satisfies the first condition, then we canprove that for the same encryption process in the second condition also hold and vice versa. Namely, if we have an encryption process where the second condition hold, then for the same encryption process, the first condition also holds. I would like to stress the following. In the experiment, which we have discussed when I say that the probability that adversary correctly identifies what has been encrypted in the challenge ciphertext should be upper bounded by half plus negligible, then this probability is over the randomness of the experiment. So remember that the experiment could choose the message m0 to be encrypted in the challenge ciphertext with probability half and with probability half the experiment or the verifier could choose the message m1 to be encrypted in the ciphertext c. This probability of correctly identifying whatever has been encrypted in c should be also over the randomness of the adversary, right, because the entire experiment is going to be a randomized experiment, right.So, as I said that these 2 notions of security or these 2 definitions are equivalent, and the proof that these 2 definitions are equivalent is slightly complicated and due to interest of time, we will not be going into the details of the proof. However, if you want to have a very high level overview of the proof, you can refer to the book by Katz and Lindell. Interestingly, it turns out that the equivalence of these 2 definitions holds in the other models as well, rightSo, currently what we are considering is the ciphertext only attack model and in the ciphertext only attack model, the adversary has got access to the encryption of some message, but if I go to the higher attack model, by higher attack model means more powerful attack model, say the CPA attack model where apart from the ciphertext, adversary also gets access to the encryption oracle then we can have a corresponding semantically secure version of the definition that we are currently giving here for the COA model, right. Namely, we would like to state that adversarial advantage or the difference of the absolute probabilities of adversary computing the function f(n) in the 2 worlds should be upper bounded by a negligible function, where in the first world apart from the ciphertext, adversary will also get access to the encryption oracle if we take this definition to the CPAattack model. In the same way if we take this definition to the CCA attack model, then apart from the ciphertext, adversary in the first world will have access to the encryption oracle, it will have access to the decryption oracle and so on.So, we can come up with a semantically secure, we can come up with a version of the semantic security in the CPA model, in the CCA model and so on where the essence of the definition will be that the absolute difference between the two probabilities of adversity computing f(m) in the first world and f(m) in the second world should be upper bounded by a negligible probability, that will be the essence of the semantic security definition in CPAmodel, CCA model and so on.It turns out that irrespective of the model, we can come up with a corresponding indistinguishability based definition and we can prove that the semantically secure version of the definition, the original version of the semantic security will be equivalent to theindistinguishability based definition. So, that is why for the rest of the course, we will not be seeing the original version of the semantic security definition.We will be rather following the indistinguishability based security definition and depending upon whether we are in the CPA world, CCA world, we will enhance the indistinguishability based experiment, right.(Refer Slide Time: 18:53)So, this is the indistinguishability based definition in the ciphertext only attack model and it turns out that we could come up with an alternate version of this definition, right. So the original definition requires you that the probability that our adversary is correctly able to identify the message that is encrypted in c should be upper bounded by half plus negligible, that is what is the original definition. The alternate definition demands that the output of the adversary should be same irrespective of what exactly is the message which has been encrypted in the challenge ciphertext c.So, remember that since this indistinguishability based definition is a randomized experiment with probability half, my b could be equal to zero and with probability half the message which has been encrypted and ciphertext c could be m1, right. The goal of the adversary is to identify whether it is m0 which is encrypted in c or whether it is m1 which has been encrypted in c. So this alternate definition demands that output of the adversary should be same irrespective of whether it is m0 which is encrypted in c or whether it is m1 which is encrypted in c.More formally, it does not matter whether the message m1 has been encrypted or message m0has been encrypted in c, in both the cases, adversary’s output should be almost same except with a negligible probability. That means absolute advantage of the adversary distinguishing apart whether he is seeing an encryption of m0 in ciphertext c or whether he is seeing an encryption of m1 in the ciphertext c should be upper bounded by some negligible function.If this is the case, then we say that our encryption process has indistinguishable encryption in the ciphertext only attack model, right. So, another interpretation of this difference of these two probabilities is upper bounded by a negligible probability is that a distinguishing advantage, you can view the difference between these 2 probabilities as the distinguishing advantage of our adversary, right? So the essence of this alternate definition is that the distinguishing advantage of our adversary in this experiment should be upper bounded by a negligible probability.(Refer Slide Time: 21:12)It turns out that these 2 versions of the indistinguishability based definitions are equivalent. Namely, we can say that our encryption process is computationally indistinguishable if the probability with which adversary could correctly output b equal to b’ is upper bounded by half plus some negligible function. The second definition says that the distinguishing advantage of the attacker to distinguish apart whether he is seeing an encryption of m0 or whether he is seeing an encryption of m1 should be upper bounded by a negligible probability.It turns out that both these 2 conditions are equivalent. Namely, we can prove that if we have an encryption process Π where the first condition holds and for the same encryption process, the second condition also holds and vice versa. So, what we are going to do is we will follow the implication in the direction that the condition 2 implies condition 1, namely we will assume that say we have an encryption process where condition 2 holds, namely the distinguishing advantage of the attacker is upper bounded by some negligible probability.If that is the case, then we are going to show that irrespective of the way adversary participates in the indistinguishability based experiment, the probability that adversary could correctly identify b equal to b’ or it ensures b equal to b’ is upper bounded by half plus some negligible function. So, let us prove that. So, what is the probability that in the indistinguishability based experiment, adversary outputs b equal to b’ because if adversary outputs b equal to b’, that is what is the interpretation that the experiment outputs 1.Now, if you recall, there are 2 versions of the experiment. An equivalent version of this definition is the indistinguishability based definition, where the requirement is that if the adversary sees an encryption of a randomly chosen message from a pair of messages where the pair of messages is known to the attacker, then the probability with which it can identify whether it is an encryption of this 0th message or the first message is upper bounded by half plus negligible. So which can be also stated as the distinguishing advantage of the attacker and distinguishing apart whether the challenge ciphertext it sees in the experiment, in the indistinguishable based experiment belongs to m0 or m1, it cannot separate apart except with a negligible probability. We also saw an illustration where we showed actually that if your encryption process satisfies the indistinguishability based definition, then it indeed implies that adversary cannot compute any of the underlying bits of the plaintext. In that illustration, we introduced a reduction?based proof, which are central to cryptography. I hope you enjoyed this lecture. Thank you.
Log in to save your progress and obtain a certificate in Alison’s free An Introduction to Cryptography online course
Sign up to save your progress and obtain a certificate in Alison’s free An Introduction to Cryptography online course
Please enter you email address and we will mail you a link to reset your password.