Loading

Module 1: Block Ciphers

Notes
Study Reminders
Support
Text Version

Practical Construction of Block Ciphers

Set your study reminders

We will email you at these times to remind you to study.
  • Monday

    -

    7am

    +

    Tuesday

    -

    7am

    +

    Wednesday

    -

    7am

    +

    Thursday

    -

    7am

    +

    Friday

    -

    7am

    +

    Saturday

    -

    7am

    +

    Sunday

    -

    7am

    +

Foundations of CryptographyProf. Dr. Ashish Choudhury(Former) Infosys Foundation Career Development Chair ProfessorInternational Institute of Information Technology-BengaluruLecture-19Practical Constructions of Block Ciphers Part IWelcome to lecture 18, in this lecture we will discuss about the practical constructions of block ciphers. And this will be the part 1 of the discussion on the practical constructions of block ciphers.(Refer Slide Time: 00:41)So, the road map of this lecture is as follows, we will start discussing about confusion diffusion paradigm and then we will see how to instantiate the confusion diffusion paradigm using substitution permutation network or SPN. Later on in during our part 2 of the discussion on practical constructions block cipher, we will see that how SPN plays a very crucial role in the design of one of the most popularly used block ciphers, namely DES.(Refer Slide Time: 01:10)So let us start our discussion on confusion diffusion paradigm. So, this paradigm was introduced by Claude Shannon. And the underlying idea here is that we are interested to design a random looking keyed permutation, right. So that is our goal, we want to construct a random looking keyed permutation F with a larger block length by combining several random looking permutations, which are also keyed permutations.But with a small block length, so the random permutations with small block length those are denoted by fi. And for basically the confusion diffusion paradigm does is it tells you how to compose or to combine this random looking permutations with small block length to obtain a random looking permutation with a larger block length, right. And the way we do the composing is in such a way that the choice of the smaller permutations fi are determined by the secret key of your larger sized permutation.(Refer Slide Time: 02:17)So, to demonstrate how exactly the confusion diffusion paradigm works, let me take an example where my goal is to construct the keyed permutation F with a block length of 128 bits using several keyed permutations of block length of 8 bits, right. So, I beg your pardon here, so the smaller size permutations, they are not keyed permutations, they are unkeyed permutations. There will be truly random permutations and what basically the confusion diffusion paradigm does is:It tells you how to compose these small block size unkeyed permutations and obtain a larger blocks length, the keyed permutation. So for the demonstration purpose, we will assume that we have several random permutations, mapping 8 bits to 8 bit strings. And using that our goal is to compute or construct a keyed permutation F of block length of 128 bits. So the way we are going to design the function F is as follows.So we parse the input fx for the function F as a sequence of 16 bytes, so those bytes I amdenoting as x1, x2, x16. And the reason I am splitting it into bytes is because we are given at our disposal several permutations, mapping 8 bits to 8 bits, right. So that is why the input x for the keyed permutation F is divided into bytes here.(Refer Slide Time: 03:44)So imagine that you have the set of all permutations mapping 8 bit string to 8 bit strings. So that set, I am denoting by this notation from sub 8 right. So each of this small circles, you can imagine that it is a permutation mapping 8 bit string to 8 bit string right. And each of these permutations, say, if I take this particular permutation fi. You can interpret it as a table consisting of 28entries, right.So you can imagine it as a table consisting of 28entries where the first entry denotes the value of the permutation on the input all 0. The second entry denote the value of the permutation on the input 001 and like that the last entry denote the value of the permutation on the input 111. So, like that you will have 28rows, and each row will basically consisting of 3 bits, and 3 bits,namely the value of permutation on the corresponding inputs.So, in that sense, I can imagine or interpret each of these small permutations fi as a table consisting of 211 bits, right.(Refer Slide Time: 04:57)So, like that you have several such tables and each such table I am denoting as a red circle here. And a collection of all those red circles basically is my set Perm8 ok.(Refer Slide Time: 05:07)So, the way we are going to construct the keyed permutation F is as follows, we randomly choose 16 permutations f1, f2, f16 from this bigger set. And say the randomly chosen permutations of 16 permutations that I have chosen are denoted by this yellow circles. And each of these 16 permutations mapping 8 bit to 8 bit strings can be individually interpreted as a string of 211 bits.This is because as I said earlier, each of the permutations can be interpreted as a table consisting of 211 bits. So, the choice of the first small permutation that corresponds to a string k1, the choice of the next permutation that will be the string k2. And like that the choice of the 16th permutation mapping 8 bit string to 8 bit string can be interpreted as another string of 211 bits ok.(Refer Slide Time: 06:09)And now once we have randomly chosen the 16 permutations, which are going to be used for the construction of the function F, the overall key for the big permutation F is the concatenation of the individual tables. Namely, the concatenation of k1, k2, k16 and if I concatenate all the strings, I obtain the overall key for the function F. So, it is easy to see that the key size for the function Fwhich we are interested to construct will be of 215 bits.Because each of the smaller permutations which I am going to use is denoted by 211 bits and I have 16 such permutations. So overall the key size for the bigger permutation F is 215bits. And once I have decided the value of the key, the output of the keyed function Fk on the input x is determined to be the value of the first small size permutation f1 operated on the first byte,concatenated with the value of the second smaller permutation f2 on the second byte of the input x. And like that the value of the 16th smaller sized permutation on the 16th byte of the input. And if I concatenate all these individual outputs, that is what is going to be my output of the keyedfunction f k on the input x. So that is the way we are going to construct a function F using several smaller size permutations. Now the advantage of constructing the bigger function Fk like this is as follows.(Refer Slide Time: 07:56)If I would have directly tried to construct a function Fk, right mapping say 128 bit strings to 128 bit strings. Then that would have required me to actually store a table consisting of 2135 bits, because that table will have 2128 entries. And each entry would have further consisted of 8 bits. So that is why the overall size of that table would have been 2135 bits.But the way we have constructed the function Fk by combining several 16 smaller permutations, I just need to store a key of size 215 bits. And it is enough if I just store the value of the k namely, the description of the 16 smaller permutations which I have chosen. And that suffices for evaluating the value of F on any input x. So that is the advantage of constructing the keyed function or keyed permutation Fk by this confusion diffusion paradigm.So now you might be wondering why exactly the name confusion and diffusion in this term confusion diffusion paradigm, what exactly causes confusion and what exactly causes diffusion.(Refer Slide Time: 09:13)So let us go into the details, so this is the way in our example we have constructed the keyedfunction Fk. So, the smaller size function f1, f2, f16 which we have chosen here, they are called as round functions and they are dependent on the key, right. So that creates a confusion from the viewpoint of an adversary. In the sense if an adversary wants to compute the value of Fk(x), but he is not aware of the value of k.Then the adversary does not know what exactly are the 16 round functions I am going to use. And from the viewpoint of the adversary, it could be any 16 functions from this collection of all permutations mapping 8 bit strings to 8 bit strings. And a size of this collection of all permutations mapping 8 bit strings to 8 bit strings is enormously large. So that means adversary will be completely confused or clueless what exactly is the value of the key, that means what exactly are the 16 round functions.And hence, adversary cannot predict what exactly is going to be the value of Fk(x) even if it has seen several value of Fk(x), for several x of it is choice in the past. So, that creates the confusion aspect in this paradigm.(Refer Slide Time: 10:30)However, it turns out that even though the description of the keyed function Fk which we have constructed is very concise. Namely, it requires us to store only 215 bits, it is not pseudo random, right. So recall that the property of the pseudo random permutation is that. If I have 2 inputs x and x’ and if I compute the value of Fk(x) and Fk(x’) with respect to the same key, then even if x and x’ differs in a single bit or a single byte, the outputs should be significantly different. That iswhat we expect from a truly random permutation and if I say that Fk is a pseudo random permutation, then basically I expect almost similar behavior from the resultant Fk(x) function.(Refer Slide Time: 11:26)But it turns out that the way we have constructed the function Fk(x), this property is not achieved. It is easy to see that if I have 2 inputs, x and x’ which differs only in the first byte, then the output of Fk(x) and Fk(x’) will differ only in the first byte. All the remaining 16, 17, 15 bytes of the output of Fk(x) and Fk(x’) will be exactly the same. And that is not what exactly you expect from a pseudo random permutation.So the way we get around with this difficulty is that we actually apply the logic that we have used in the construction of Fk(x) several number of times by introducing what we call as a diffusion step.(Refer Slide Time: 12:05)So let me go into a little bit more detail, so imagine we have the input x for the function Fk, which we are interested to construct, and we have the value of the key. What we do is thatdepending upon the value of key, we determine the smaller sized permutations which we are going to apply on the individual bytes of x. And after we obtain the outcome of the individual round functions on the respective bytes, what we do is we apply a mixing permutations. Namely, we just shuffle the bits of the intermediate outcome that we have just obtained.(Refer Slide Time: 12:44)And what we say is that this one iteration of confusion followed by one iteration of this mixing permutation constitutes 1 round. And then we repeat this process again, that means we just do not output whatever we obtain after 1 round as the output of Fk(x), whatever output we obtain at the end of first round, that is considered to be the x input for the next iteration. And again, depending upon the value of the key, we determine the round functions that we are going to use.We apply the individual round functions on the intermediate output and again we do a mixing permutation. And then we repeat this process for several iterations for several fixed number of rounds. And by repeating this process for several number of rounds, it is ensured that even if there is a single change in the input bit, it affects over several output bits. Because every iteration, the result of mixing permutation will cause the bits of the intermediate output to shuffle around.And that is what creates a diffusion from the viewpoint of the adversary and that is why the name confusion diffusion paradigm. The confusion is created because the value of key is not known for the adversary. And that is why the choice of the round functions would not be known to the adversary. And diffusion is caused because of the mixing permutation which ensures that after every at the end of every round whatever output we obtain that is shuffled around.And that is as a result when we go to the beginning of the next iterations, the next round functions will be applied on different bytes. So that creates a diffusion, which ensures that even if there is a single change in the input bit, that affects several output bits and ensures that the overall keyed permutation Fk(x) that we obtained behaves like a pseudo random permutation, right.(Refer Slide Time: 14:39)So that is on a very high level, the confusion diffusion paradigm. And after we do this process for several number of iterations, the resultant output is denoted as Fk(x).(Refer Slide Time: 14:49)So now let us discuss about substitution permutation network or SPN in short, which implements the confusion diffusion paradigm. And this SPN is a very important building block used in the construction of practical block ciphers like DES, which we will discuss in the next lecture. And this SPN implements the confusion diffusion paradigm, but with few differences. The most important difference is that instead of choosing round functions which are key dependent, we will be now using key independent and publicly known round functions, which I denote by Si or which are also called us S boxes. Because they are known as substitution boxes that means, if I go back and if I see the description of the confusion diffusion paradigm. In the confusion diffusion paradigm, the confusion was coming because the value of key was determining the round functions that we are using.(Refer Slide Time: 15:51)But in the substitution permutation network, there are no round functions which are key dependent, everything is publicly known. So, now you might be wondering that if the round functions that we are going to use in SPN are publicly known and key independent, how exactly the confusion is going to be brought from with respect to the viewpoint of the adversary. Well, the confusion comes because even though the S boxes are publicly known, the S boxes are applied on an input, which depends upon the value of the key. Namely, if I want to apply the S box on an input x, then we do not directly compute the value of the S box on the input x. But rather on the value of x, XORed with the value of the key and that ensures that even though the value of the description of the S box is publicly known. The exact input on which the S box is operated upon is not known from the viewpoint of the attacker. And that is what creates that confusion aspect of the confusion diffusion paradigm.(Refer Slide Time: 16:52)So, again to demonstrate how exactly an SPN works, let us discuss how exactly we are going to construct a keyed permutation with a block size of 64 bit. Assuming that we are given 8 S boxes, right. So those as boxes we denote as S1, S2, S8 and they are publicly known that they are key independent. So this is the x input for my keyed permutation that I am interested to construct which I denote as x. And the first step in the SPN architecture will be the key mixing step, where we use a 64 bit sub key and mask it with my current x input.(Refer Slide Time: 17:36)And whatever output I obtain that I parse it as collection of 8 bytes, and now this individual 8bytes go through the 8 S boxes, which are publicly known. So the first step where actually we do the masking of the input : current input with the sub key, that step is denoted as the key mixing step. And this key mixing step is followed by a substitution step, where the outcome of the key mixing step are interpreted as individual bytes and they go through the individual S boxes.And as a result, now I obtain a 64 bit intermediate output and the substitution step is followed by a permutation step, where we do the shuffling. Namely, we apply a mixing permutation and this mixing permutation also will be publicly known. So, in this whole architecture, everything will be publicly known except the value of the key, the adversary would not be knowing the value of the key.Other than that, it will be knowing the entire architecture, it will be knowing the description of the S boxes, it will be knowing the description of the mixing permutation and so on. And once we apply the mixing permutation that finishes 1 round, and then we apply this process again. That means we go to the next round that means whatever intermediate output we obtain at the end of the mixing permutation that serves as the x input for the next iteration.And then we again go through the same process. And after doing it for several iterations, the final outcome is treated as the outcome of the resultant keyed permutation that we have constructed using the SPN architecture ok.(Refer Slide Time: 19:12)So if I say that I have a keyed permutation designed using an r-round SPN, then it means that we have done r iterations of key mixing followed by substitution, followed by permutation. And this r-iterations are finally followed by one final round of key mixing, right. So, you might be wondering that why this 1 round of final key mixing after the rth iteration. Well, it turns out that if we do not do this final key mixing at the end of the after the completion of the rth iteration, then the effect of the rth iteration, substitution step and permutation step are completely useless.Because the adversary knows the description of the permutation step and the description of the substitution step. Say if I do not apply this final round or final iteration of the key mixing at the end of the rth iteration. Then it is as good as saying that the effect of rth iteration substitution step and the permutation step is not going to be applicable. So that is why this final key mixing step is useful.(Refer Slide Time: 20:21)And now, the question comes is how do we determine this sub keys for the individual round. So, we assume that we have a master key k for the function Fk(x) which we are interested to construct. And this 64 bit sub keys are selected from this master key as per a publicly known key scheduling algorithm, right. Say depends upon how exactly my key scheduling algorithm is designed and this key scheduling algorithm will also be publicly known.Because remember, the only thing which a sender and a receiver who are going to operate this keyed permutation Fk will share is the value of the key. Other than that, we expect that they should not have any pre shared information. So that is why the description of the S boxes, the description of the mixing permutation, the description of the keyed scheduling algorithm everything is available in the public domain.So, depending upon the length of the master key, we determine a key scheduling algorithm according to which we decide which subset of the master key is going to serve as the sub key for the ith iteration and that description will be publicly known right.(Refer Slide Time: 21:31)So, a very important theorem which we can prove is that, if the S boxes which we are using in the architecture of SPN are permutations, then the keyed function Fk which we are going to obtain at the end of any r-round SPN is going to be a permutation. That means it is a one to one and onto mapping and it does not matter what exactly is the value of r. As long as you ensure that these S boxes are invertible, the overall keyed function Fk which we are going to obtain by deploying an r-round SPN is going to be a keyed permutation right.And the proof for this simply follows from the fact that at the receiving end if the master key is also known to the receiver, then the effect of each round of the SPN is invertible.(Refer Slide Time: 22:28)For instance, imagine that the sender has used an input x and it has operated an r-round SPN and it has obtained the value y. And imagine that the receiver knows the value of y and it also knows the master key k right. Now, the goal of the receiver is to uniquely go back from the output y to the input x. So, let us see whether it can reverse back the effect of the last iteration, right.So, since this mixing permutation is publicly known, from this output y, receiver can go back to this intermediate step. Because the mixing permutation is publicly known, so that as a result in inverse of this output y with respect to this mixing permutation. The input that would have given this output y with respect to this mixing permutation can be uniquely computed because it is a permutation.Now, once we have reached here, that means we have reached to the state where we know the output of the S boxes 8 S boxes and each of these S boxes are invertible. As a result, we can reverse back the effect of each of these S boxes and we have come to the output of the key mixing stage. And now, if I want to go back from the output of the key mixing stage to the previous input, what we have to do is basically we have to take out or we have to do the XOR of this sub key which I would have used in the last iteration.And that we can compute provided we know the value of the master key, and the key scheduling algorithm is anyhow publicly available. That means any entity which knows the value of key and y can uniquely invert back the output y and obtain back the input x. And as a result we can saythat the overall construction Fk(x) which we have constructed is indeed a keyed permutation.(Refer Slide Time: 24:24)So, now let us come to the security properties of the keyed permutation which we obtained by operating an r-round SPN, right. So, the construction is very straightforward right, now the important thing that we have to discuss is about the security aspect. Now, it turns out that the security of any keyed permutation Fk, which we obtained by running an r-round SPN, depends upon the exact choice of the S boxes which we are using, the exact choice of the mixing permutation that we are using, and the exact choice of key scheduling that we are using right. So in the architecture of SPN, till now I have not discussed what exactly should be the properties of the S boxes. We just required that the S boxes should be invertible to ensure that the overall Fk(x) that we obtain is also invertible. But for the security properties, namely to ensure that your keyed permutation Fk(x) is a pseudo random permutation, we need some desirable properties from these S boxes.In the same way we need some desirable properties from the mixing permutation and from the key scheduling algorithm right. It turns out that there are no hard hitting rules or sufficient conditions. Namely it is not known that if we ensure certain list of properties with respect to the S boxes mixing permutation and key scheduling algorithm, then the resultant construction is always going to be a pseudo random permutation.There are no such cookbook or algorithmic rules available. But it turns out that we do expect certain desirable properties from the underlying S boxes, mixing permutations and key scheduling algorithm to ensure that the resultant keyed permutation indeed behaves like a pseudo random permutations.(Refer Slide Time: 26: 12)So there is an important property called avalanche effect and what exactly avalanche effect means is that, if we have constructed a keyed permutation Fk using an r-round SPN, then we expect that even a minor change in the input of Fk should produce a significantly different output. Now the question is, how do we ensure that how do choose an SPN which ensures that the resultant Fk designed using the SPN indeed achieves this avalanche effect. So, for the moment assume that we have designed or we are using S boxes the publicly known S boxes and the mixing permutation having the following properties.(Refer Slide Time: 26:58)The S boxes that we are using which is publicly known and known to the adversary have the property that : even if there is a change of 1 bit of input in the S box, it ensures that there are at least 2 output bits in the output bits of S box which differ right. That means what I am saying is that assume you have the S box having the property that if you operate it with input x and input x’, where x and x’ differs only in say 1 bit, then the resultant outputs y and y’ differs in at least 2 bits.  My current right half namely R0, it simply goes and serves as the next L half. And I apply the current round function, namely, f1 on my current right half and XOR it with my current left half to obtain the next right half.And this is what we do in every iteration, what differs or what could differ is exactly the choice of the individual round functions. But the operation wise the structure of the operations that we are performing in each iteration is exactly the same.(Refer Slide Time: 41:41)And again, I am not going to prove this but we had seen in one of our earlier lectures, that irrespective of what exactly are your individual round functions, for any value of R and any choice of the round functions, the overall function, the composed Feistel network, the overall function that we have obtained by the composed Feistel network is indeed an invertible function.So, that brings me to the end of this lecture. Just to recall in this lecture, we have seen 2important building blocks namely SPN and Feistel network, which we use in the constructions of practical block ciphers. And in our next lecture, we will discuss that how these 2 building blocks are used to design some of the real world block ciphers like DES and AES, thank you.