Alison's New App is now available on iOS and Android! Download Now

Module 1: Functional Dependency

    Study Reminders
    Support

    So let us continue our discussion on this topic of normal forms. So the reference you can follow their (()) (00:26) this book and that will give you a different setof examples and all that. And the order in which these concepts are being presented to you in thisslides is completely different from the order in which they are presented in (()) (00:45) and thisthing is material is spread over 2 chapters there. I think 15 and 16, if I remember correctly do notgo by my numbers.But anyway, the concepts of spread to this one and this I am going to talk about the proof ofsoundness and completeness. That is actually not little bit of physical engineering will help (())(01:21), what I was telling you, is that the proof that I am going to present today is not availablein the mainstream textbooks, the ones that have given to you. But it is available in principles ofdatabase systems by Jeff Pullman very old book 70s book.Our library used to have a few copies of that book our department library used to have book ok.So let us start we already slightly delayed ok, in the last lecture we were talking about thesoundness and completeness of Armstrong's axioms. So basically, what we were discussing isthat given some set of functional dependencies on a scheme. The set of logic; the set offunctional dependencies; that can be logically derived from this given set of functionaldependencies, as per our definition of logical derivations.And the set of functional dependencies that can be derived mechanically using the Armstrong’saxioms they are one in the same. So that is what we are going to present today. So we call one ofthem as F underscore F double A, the other one has already been defined as F +. Now we aregoing to show that these 2 things are equal ok, so the way of showing 2 sets are equal is to showthat one is a subset of the other and the other also is a subset of the first one.So showing F underscore double A is subset of F + is what is called soundness. Because we arebasically saying that whatever you can derive from F using the Armstrong’s axioms is always acorrect functional dependency. It is always logically follows from F, whereas in thecompleteness, we argue that whatever can be logically derived from F can indeed be derivedfrom using F using Armstrong’s axioms that is what completeness is.So we will now look at the so this session is going to be a little bit you know dry and you knowtechnical. So but follow me carefully, you will be able to understand this course. Most bookswhich I have suggested to you for reading, skip this portion, skip this completely and then do noteven state this set F double A, instead they will mention the Armstrong’s axioms and then saywhich Armstrong’s axioms can be used for, you know deriving functional dependencies.Whereas we want to be clear, that they are indeed same as the entailment relation that we havetalked about earlier ok. So let us go about this prove the soundness proof ok, here is the diagram that I have shown toyou in the last lecture also. So whatever we derive using Armstrong's axioms is inside F + thatsoundness. And if you take something here and show that indeed it can be derived usingArmstrong's axiom that is what is completeness ok, any questions so far, what we are doing ok.So out of these 2 the soundness is actually easier much easier soundness is much easier.Basically why is it easy, is that if you are deriving using Armstrong's axioms, there are only 3 ofthem. So if we establish the correctness of these 3 axioms, then repeated application of theseaxioms will always give you correct new functional dependencies anyway, so that is theargument right. So I will show you that, so suppose X Y, X determines Y is derived from F using theArmstrong’s axioms in some n number of steps. So if each of these steps is correct thenobviously the overall reduction would be correct right. And each of the steps were using 1, 2 and3 the Armstrong’s axioms reflexive rule then they augmentation rule and the transitive rule RAPyou remember that, so we are using these rule.So, all that we have to do is establish the correctness of these 3 rules individually. And then weknow that you know you can actually construct a more formal proof by using induction on air ook, so I will skip that. But we will just establish the correctness of these 3 things formally againthough actually while introducing this Armstrong’s axioms itself a kind of told you as to whythere are correct while let us just repeat that thing for completeness sake.So rule 1 is reflexive rule, what does it say, it says that as long as the right hand side is a subsetof the left hand side that is a correct FD it is actually a trivial functional dependency. So since itgives rise to trivial functional dependencies which is always true. So application of this rule willalways result in correct functional dependencies, now rule 2 which says that augmentation. Sowhat it says is if you have a functional dependency X determines Y, you can augment it on bothsides by Z.That means, you can add attributes Z to left hand side and then add same set of attributes Z to theright hand side and they will get a correct function dependency. So this is also actually we haveestablished we have argued it but now let me I have written it down really now. So let us saysuppose t 1, t 2, r in some instance and they agree on XZ they agree on XZ. That means theyhave the same corresponding values for attributes in X and Z right.So if that is the case then in particular they agree on X because X is a subset of X Z, for inparticular they agree on X. And if they agree on X we have given the X determines Y and sothey will agree on Y and so they will actually agree on Y as well as Z. So hence you knowapplying this rule 2 always gives rise to correct functional dependency. Now rule 3 is transitivitytransitive rule.So X determines Y, Y determines Z entails X determines Z. So suppose t 1, t 2 that are there in ainstance agree on X, then because X determines Y is given, they will agree on Y. And since Ydetermines Z is given, they will agree on Z. And so if you assume that they agree on X, we willbe able to derive that they agree on X. So this establishes the correctness of the rule 3, so allthese 3 rules will always give rise to correct functional dependencies and your repeatedapplication of these 3 rules n number of times will always give you a correct functionaldependency ok.So whatever the rule that whatever is the new functional dependency that you are going to deriveby repeated application of these 3 rules will be such that it will be entailed by the given set offunctional dependencies. You can kind of establish it by induction because each step is can beproved ok, so do you agree on this now. So you want to recall what is soundness, soundnessagain is there are using Armstrong’s axioms.So you may do it some in n number of steps, but each step is correct is what we are saying. Andso at the end of all the n steps, you will only land up you will land up in always land up in F thatis the argument F doing ok. So to kind of prove the completeness, let us see what is the so we will need one importantauxiliary concept for this particular proof. So let me set that up first we will define what is calledattribute closure ok, X is a set of attributes remember that we are using the alphabets 1 the laterpart of the alphabet set as sets of attributes. So X is a set of attributes and so we will define so farwe had defined for the closure only for functional dependency sets of functional dependencies.Now we are defining it for a set of attributes ok, it is different, for a set of attributes let me definethe closure of X with respect to a set of functional dependency ok, closure X. So what that isdefine how that is different is like this. It is the set of all attributes A such that X determines Acan be derived from F using Armstrong’s axioms ok. Now remember one thing that here I am notbringing in entailment at all.I am just saying that these attributes can be derived from F using the Armstrong’s axioms thatmeans after some number of applications of axioms, you can derive X determines A. So if that isthe case, then we will say that A belongs to the attribute closure of X with respect to F, becausethe F is in the background using only these functional dependencies in F we are deriving it. Sowe can also think of it as the set of attributes that occur on the right hand side of for any FDwhose LHS is X as per this Armstrong’s axioms ok.Now we will use this we will make use of this attribute closure and then work out the proof ofcompleteness ok. One of the claims that we can make out of on this X + is like this X now saywe have X determines Y is some general you know FD. It can be derive from F usingArmstrong's axioms ok, if and only if Y is a subset of this X + F they should be subscript FDalso ok, Y is a subset of X +.So how is this claim useful, so if you want to check whether X determines Y is derive from Fusing Armstrong’s axioms all that you have to now do is that if this is true is that just computethe attribute closure and then check if Y is a subset of that attribute closure. We will later on seehow to compute attribute closures and how what will be the computational cost of that ok. Sothis is a very useful claim which is saying that the functional dependency can be there from usingthe Armstrong's axioms if and only if the right hand side is a subset of the attribute closure isdisappear.Now let us prove that, so these statements which have if and only if have 2 parts right. The proofhas 2 parts the; if part right if and only if part right. So the; if part it assumes that X determines Ycan be derived from F using Armstrong’s axioms. And then proves that Y is a subset of X + theonly if part that is the other way around. So let A be let this Y be some attributes A 1 to A n oh Ithink I said it wrongly right. The, if part assumes that Y is a subset of X + and then proves that Xdetermines Y can be derived from F using Armstrong’s axioms ok.So that is why we have let Y equals A 1 through A n and we have given that Y is a subset of X+.So what does that mean Y is a subset of X+ that means every attribute here all of these A i's orsuch that X determines A i can be derived from F using Armstrong’s axioms (()) (17:12). And ifthat is the case for each of these i, if the individual attributes F X this X determines A can bederived from F using Armstrong’s axioms, this is by definition of the attribute closure.Then we can basically make use of the union rule by making the using a union rule, what doesthe union rule say. We call that union rule, what it says is that if the left hand side the same asright hand sides are 2 different things. We can now derive a new functional dependency, wherethe left hand side is the same left hand side and then the right hand side the union of a (())(17:50) that is the union rule.So now that you have X determines A i individually, you can use the union rule and then simplyyou know derive this X determines Y. Y is nothing but this entire (()) (18:08) ok. So this can beby union rule it follows that X determines Y can be derived from, so basically the; if part is true.Now the only if part we assume that this it is in some sense the reverse of this argument Xdetermines Y can be derived from F using Armstrong's axioms.If that is the case, we can now use a projective rule and then say that X determines A i for eachof these 1 to n can indeed be derived from Armstrong’s axioms. So if that is the case, bydefinition each of these A i will belong to X+. So since each of these A i's belong to X + Ywhich is a same Y here is indeed a subset of X+. So it is actually not a very deep prove but it is asimple ok just that we just want to write down the proof.Y is a subset of X+ is same as saying that X determines Y can be derived from F usingArmstrong’s axioms that is how the definition of X+ is set up ok. So this claim is easy, so we canremember that but only thing is we should remember this claim what does is it, it is givingremember this statement. Now let us look at the completeness of this Armstrong's axioms. As a statement, let us look atthat, what does it say, it says that if F entails X determines Y then it implies that X determines Yfollows from F using Armstrong’s axioms that is what we need to prove for completeness. Nowlet us write the; you have studied implications and mathematical ok how to prove integration, sovarious ways of proving implications.Let us write the contrapositive of case what is the let us look at what is contrapositive; thecontrapositive is equivalent to the statement you remember that right, a contrapositive is like this.If X determines Y cannot be derived from F using Armstrong's axioms, the negation of thisimplies F does not entail X determines Y the negation of this implies the negation of this is thecontrapositive.So that is what we written here X determines Y cannot be derived from F using Armstrong’saxioms implies that F does not entail X determines Y ok. Now how do you restate this F does notentail X determines Y in terms of instances we can write it like this if this is the case if andactually we can even write this as if and only that there exists a relation instance R on capital Rsuch that all the FD’s of F hold on it.But X determines Y does not hold on only then will say that X determines Y is not entailed by Fright, there is at least one instance. Remember the definition of entails, says definition of entails.Let us recall that what it says is that if F entails X determines Y, if in any instance every instancewhere FD’s of F hold, then it is guaranteed that X determines Y also hold ok that is what entailsdefinition here.Now in order to disprove that what we need to do is that we should exhibit one instance, thereexists at least one relation instance R. Such that all the FD’s of F hold on it but X determines Ydoes not hold on it. If that is the case then we are done right to show that it F does not entail Xdetermines Y, is that clear, is that statement clear first. So this is what we are actually going tomake use of in establishing the proof of the completeness.So we are going to actually exhibit so this is instance a instance R constructed cleverly. Such thatall the FD’s of F hold on r but X determines Y does not hold on r ok. Now this r is constructed ina interesting way this r is a table by the way ok, this r is accurate table I mean I need has only 2tuples ok. So this is a theoretical tool right, so it has only 2 tuples and one of them is all ones theother one has all ones still some part and all zeros later.And the attributes are grouped such that these attributes are all attributes of X +, remember whatis X + is the attribute closure of X. So overall we can look at the attributes of R and then classifythem as 2 parts right. Those that belong to X + do not thus do not belong to X +. So put all theattributes that belong to X + in the front and do not belong to X + later and then you know put allones here and put all ones in the second row.You have all ones for attributes that belong to X + and then zeros for other attribute that is allthis is the nice a little instance. And then what we will actually the scheme a thing is that we willshow that this instance r does the job for us. This instance r is such that all the FD’s of F hold onit but X determines Y does not hold on it. If this is the case that X determines Y cannot bederived from F using Armstrong’s axioms.So under these assumptions if this is given to us we will show that this particular instance r issuch that all the FDs of F hold on it but X determines Y does not hold on it, so that will completethe proof. So we only need to now prove to specific statements saying that this particular. So youdo not ask me how this is particularly constructed. So there is a intuition behind how this proofneeds to be worked out and so we have constructed this r has been constructed like this is thatfine. So this is r is having exactly 2 rows the first row has all ones and I will repeat these r inother slides also.So that you will remember what the r is, so now we only have to show this thing that all FDs of Fhold on r and X determines Y does not hold on r, let us see how we can prove that. So claim 2 is that all FD’s of F or satisfied by r, the r is dissipated here. Well, this is againproved by proof by contradiction it is proved by proof by contradiction suppose not supposethere some FD in F let us call it W determines Z is not satisfied by r ok. If it is not satisfied by rthe structure of r is very simple it is like this right. If some FD is not satisfied by r that meanswhat, this W the left hand side should indeed be here right, only then this only existing 2 tupleswill agree on the left hand side.So some FD is not satisfied by r means what that there are they exist 2 tuples that agree on theleft hand side and do not agree on the right hand side right, that is what it is. So if W determinesZ is not satisfied by r then W should be in this part, only then these 2 tuples will agree on lefthand side right. And then the Z if Z also is here itself that Z means the set of attributes is right.These Z attributes are also somewhere here only then both tuples will agree on them becausethey all of them have an identical values 1 ok.So this instance will end up satisfying W determines Z but we have so that is not the case rightW determines Z is not satisfied by r. So W should be here and Z should not be here that is whatwe conclude. So Z is not a subset of X +, whereas W is a subset of X here ok, is that clear tillthat. Everything just logically follows from the definition of these thing, there is nothingsurprising individuals steps ok.So basically since we assumed that W determines Z is not satisfied by r it logically follows thatW should be here and Z should not be here it should be somewhere like this, it should containsome attributes from R - X + ok. So let some attribute pick up some attribute X A that is there inZ, but not in X + that means somewhere here not all of them need to be in Z. But some of themneed to be in Z at least one should be there in Z that we know because that should not be herecompletely.So we know that at least one attribute exists which belongs to Z and which is not here, so let uspick up that attribute A ok. Now there is a sequence of inferences that we can make which willgive rise by contradiction. So what is that sequence of the inferences we can make. Now sincewe know that W is a subset of X + from the definition of X + and the claim 1, X determines Wfollows from F using Armstrong’s axioms, the claim 1 basically said that.Because if W is a subset of X + then X determines W follows from F using Armstrong's axioms,because of the claim 1 we have proved that ok. So X determines W is in our hand now, so Xdetermines Z follows from F using the Armstrong's axioms by transitive rule why is that. Xdetermines Z, W is there and W determines Z is already given in F ok W determines Z is alreadyavailable in F and X determines W is there. So now X determines Z follows from here using thetransitive rule, now A is a member of Z, remember that A is a member of Z and it is not amember of.So since A is a member of Z, Z determines A follows from F using the reflexive rule because Ais a basically member of rule. So now we have X determines Z, Z determine A and so bytransitive rule X determines A follows from F using their Armstrong’s axioms ok. Now you cansee the contradiction, the contradiction is coming because by definition of this closures, if Xdetermines A follows from F using the Armstrong’s axioms which we have now shown.Then A must belong to X + that is how the definition of X + is say that of. So we started off withA saying that they must exist an A which is in Z but not in X + and then about that A we are ableto make a new conclusion that A must be in X +. So it is a contradiction to our assumption thatsuch an A exists. So it all started because we made this assumption they saying that let there besome W determine Z which is in F which is not satisfied by r.We make such an assumption, we will derive a contradiction and hence all FDs in F are indeedsatisfied by r ok. So remember that derivations using Armstrong's axioms from a given set offunctional dependencies here, you know do not take any instances into consideration at all. Soour assumption that you know F is not satisfied by r or let W be not satisfied by r, you know doesnot played a role here in the sense of you know in driving using F ok.Derivations of new functional dependencies from the given set of functional dependencies usingArmstrong's axioms, it does not depend on any instances whether any instance satisfies this ornot about. So you may be wondering that we are using it which is not satisfied by r but that doesnot matter as long as it is in F we can make use of it ok. So is this argument a bit technicalargument of both of course is the argument clear.That basically what we are saying here is that, if you assume make an assumption that Wdetermines Z is an FD that is not satisfied by r we will be able to derive a contradiction sayingthat they cannot be any attribute A that belongs to this part ok. Now let us move on to claim 3 which is actually very simple this is much simpler claim, whatthis says is X determines Y is not satisfied by r this is not very surprising actually. Again let usdo it by proof by contradiction suppose not let us say it is indeed satisfied by r ok. If it is satisfiedby r because of the structure of r will be forced to conclude that Y should be a subset of X + thatmeans this part.Because only then the 2 tuples will agree on both X as well as Y only then both tuples will agreeboth on left hand side and the right hand side that this part has zeros right, you follow me. So ifX determines Y is indeed satisfied by r if X is somewhere here you know if X cannot be herebecause X is supposed to be subset of X + right, X is always here only. And since X will alwaysbe a subset of X + remember that reflexive rule ok.So X is here, now if Y is somewhere here then X determines Y cannot hold. Because the 2 tupleswill not agree on Y attributes, so Y also is forced to be here only ok. So if Y is a subset of X +again because of the claim 1 X determines Y can be derived from F using the Armstrong'saxioms, that is the conclusion we make, but this contradicts the very assumptions that. We madeabout X determines Y, what is the assumption about X determines Y, is something which cannotbe derived from F using Armstrong's axioms.That is the fundamental assumption about the X determines Y. So it immediately gives rise thecontradiction and hence this is true X determines Y is not satisfied by r. So whenever, so thiskind of completes the proof actually because we proved claim 2, claim 3 separately rememberthis thing. This is the assumption that we started off with X determines Y cannot we derive fromhere using the Armstrong’s axioms.And then we showed that the exists relation instance r on capital R such that all the FDs of Fhold on it and X determines Y does not follow. The claim 3 proved this X determines Y does nothold on it because if it holds then it contradicts this itself the assumption (()) (38:31) ok. Sobecause of these 2 claims whenever X determines Y does not follow from F using theArmstrong's axioms, F does not logically imply X determines Y.So this actually proves that, Armstrong’s axioms are indeed complete. So soundness andcompleteness can actually be proved, so because the proof is a little bit technical and you knowyou have to go into the details of this thing. Many textbook actually gloss over this and thensimply define that the Armstrong's axioms are there and you can derive new axioms out of 3.And then they are all logically correct that is what they claim ok. Say the consequence of the completeness of this Armstrong’s axioms is basically that F + whichis the set of all functional dependencies that are entailed by F right or exactly the same as the setof all functional dependencies that are derived from F using Armstrong’s axioms. The ruleswhatever derived from F and what is logically implied they are same, so these 2 sets are same.So this we call it as F underscore W in one of the previous slides.And in a similar way the attribute closure X + also is a set of all A that you know attributes aresuch that X determines A follows from F using Armstrong’s axioms. It is the same as this a set ofall A such that F entails A. Basically wherever follows from F using Armstrong’s axioms is therewe can simply replace by entails A because the rules are sound and complete. Now let us spend a little bit of time on computing this closures, how exactly this F + right. Sohere is I am giving an example of an F which some kind of a pathological case is the worst case.So if have, so F has these functional dependencies, the left hand side is same for everybody A,attribute A determines B 1, A determines B 2, A determines B n number of them.Now you can see that A determines X, where X is a subset of B 1 through B n is always a newfunctional dependency that logically follows from these given functional dependency right. Forany subset of because you can use the addition rule union rule, you can use union rule and thenget back any set any subset of B 1 through B n on the right hand side, so it is a new functionaldependence.So now, you know that there are 2 power and subsets of this B 1 through B n. So for each ofthese 2 power n subsets, we can put it on the right hand side and then get a new functionaldependency A determines that subset. So if you look at the size of F + number of elements in F +it can be exponential in terms of the number of n number of V functional dependencies that aregiven.So theoretically speaking, F + is exponential in size compared to the size of F. So size of F + is 2power n, whereas size of F is just n number of FDs in F ok. So obviously, the lower bound onany algorithm that computes F + given F is 2 power n (()) (42:5) at least because there are tooutput all of them you need 2 power n right, so that is the lower bound. So, any algorithm thatproduces F + has to take exponential term.So that is why computing F + is computationally expensive, whereas fortunately because of allthis setup we have made especially this attribute closure. So remember the claim 1 X determinesY belongs to F + is same as Y is a subset of attributes closure of X. That is the claim 1 first claimwill make today morning that X determines Y belongs to F + ok, if and only if Y is a subset ofthis one and this also is consequence of the completeness of those.Now, so though computing F + is expensive checking the membership of some functionaldependency inside that F + can be done by checking Y is a subset of this one provided we cancompute this we can compute this easily. If we compute this, then we can check whether Y is asubset of that X + and then if that is the case, we can conclude that X determines Y is indeed (())(44:25). So computing this attribute closure is fortunately easier, I will just show that and thenclose.