Mega March Sale! 😍 25% off Digital Certs & DiplomasEnds in  : : :

Claim Your Discount!

Module 1: Functional Dependency

    Study Reminders
    Support

    Ok, so let us continue our discussion on normal forms. So in the last class I have defined in the last lecture we have seen the definition of the functionaldependencies. Let me briefly recall this important definition because we are going to spendconsiderable amount of time understanding this notion of functional dependencies. So itbasically says that one set of attributes is actually functionally dependent on X that means it isactually a function of X, so you all know what a way function.So the value of these attributes would uniquely determine the values of these attributes ok. Sothat is all we are saying but we are saying it in elaborate manner here that if any instance r of R if2 tuples agree on these attributes of X then they will also agree on the attributes of Y which is adifferent way of saying that it is a function ok. So having said that we would look at several examples. And in this class I am going to continue giving a couple of more examples and then we will startlooking at these functional dependencies in a little bit abstract way and then see what are it isimplications ok. So in the last class we also talked about what are called trivial and non trivialfunctional dependencies. So you notice that if the right hand side attributes are a subset of the lefthand side attributes.Then this kind of a functional dependency X determines Y will always fall on any scheme, whatit says here is that if 2 tuples agree on X, then they will agree on Y. But if Y is a subset of X, sothey are already agreeing on Y ok so there is nothing. So we call such kind of dependencies astrivial functional dependencies as they always hold. Now if Y is not a subset of X we will callthat as a non trivial functional dependency.And then if X and Y are actually completely they are disjoint then we will call that as acompletely non trivial functional dependencies. So most of the time we will be worried aboutnon trivial functional dependencies and we will see how exactly what are the other various otherthings we will do with functional dependencies ok. Before we proceed here is a bit of anotational convention, we will use this low end alphabets A, B, C, D along with their subscriptedversions like A subscript 1, B 1, C 1 like that.To denote individual attributes and then the alphabets on the high end the X, Y, Z, Z, Y, X, W,U, V etc those alphabets we will use four sets of attributes. So these letters we will use to standfor sets of attributes. That is why in the functional dependency we have always been using Xdetermines Y, so X is a set of attributes, Y is a set of attributes. And if we want to talk aboutsome individual attributes, then we will usually put it as A 1, A 2, B 1 B 2 like C 1, C 2 we useletters from a lower end of the occupants just to make it easier for us to remember. Now let us look at some more examples, consider this scheme prerequisite we already have thisin our scheme of institute scheme. We also have a lots of other relations in the scheme, if I leaveit to you as an exercise to look at the functional dependencies those things. Actually most of theother relations the functional dependency that holds would be key on the left hand side and theremaining attributes on the right hand side.If you can verify that but this one, what about this one does prerequisite functionally determinecourse Id. So prerequisite recall that prerequisite course is a course number, course Id is also acourse number. And we will have a row if some particular course is a prerequisite of some othercourse, right. And so, there will be all such tuples in this particular instance.Now the question is, if 2 rows agree on prerequisite course should they also agree on course Id inthis relation. So if they always agree in all instances of prerequisite relation then we say thatprerequisite course determines course Id. So from the logical understanding of the domain, whatdo you feel about this, this is not true.So a course might be prerequisite for many courses and so it might appear along with many othercourses as tuples here. So given the value of prerequisite we will not be able to uniquelydetermine what is the course Id in general ok. What about the other way around, does course Iddetermine prerequisite it is also not true because again a course may have many prerequisitesactually.So it is possible that no FDs hold on some schemes, so do not be looking for functionaldependencies in all possible schemes. It is possible that there are cases where no functionaldependencies hold at all ok. Let us proceed what about here student, remember that this is the example I told you that this is aexample of a bad scheme. And then we started all our discussion with this example. So let usreconsider that example where we have roll number, name, sex, department name, so and officephone and HOD. So what are all the assumptions here, the assumptions are that each studentbelongs to a single department.And each department is headed by a professor and the department has one office phone, so theseare the assumptions. So under these assumptions what is the key for this scheme, what is thehesitation for telling me the key here. It is simple right, we just chose to attach some moreinformation with each student. So that does not you know affect the you know property that rollnumber is the key or the relation.Roll number continues to be the key for this here, because given the roll number we will be ableto uniquely identify 1 row in this particular dataset. Even the roll number will be able to identify,whether there would not be 2 rows with same roll number except relation. The roll numbercontinues to be key, so roll number since roll number is a key. So roll number determines all theother attributes roll number attributes.Notice one more thing here that we could actually write roll number determines departmentname as a separate FD and it will still be valid ok. And because you know instead of writing rollnumber name determines name, roll number determines department Id etc. We now actuallyhave one FD which says roll number determines all the other attributes. And this kind of a thingas we have mentioned right in the beginning that this kind of thing always happens.If the the left hand side is a key, key will always determine all the other attributes ok. So that is afunctional dependency that always holds true in any scheme, of course if this key itself is all theattributes, then there is no possibility for the right hand side, that is what happened for theprerequisite relations, good. So that is the what about any other FDs that are existing hereanymore FDs exist.Let us pick up some attributes and then see whether they determine any other attribute. Thatmeans if 2 rows agree on that attribute then they have to agree on some other attribute right. Canyou see that something happening in this set, roll number, name, sex department name, officephone and HOD. Particularly let us look at this department name, department names are uniquein an institute we will not have 2 departments of computer science ok.So Computer Science and engineering with title of the name of the department is only onedepartment, so like that. So with this values for this if you focus on the values of this, let us say if2 rows of this particular table agree on the value of the department name. Then what should theyagree on what other attributes they should agree on, what is one more attribute that they shouldagree on at least a HOD right.You cannot say that in one tuple in computer science department has he escaped from a Chandrais HOD. And in some other tuple you cannot say that Krishna Sivalingam is the right, it will beinconsistent defect. So if 2 rows have department of computer science in as under this columnthen under the column HOD they have exactly one person, they cannot have 2 different valuesthere, they have to agree on.And so is the case with Office phone, if you mentioned office phone as one phone here and thensome other phone in somewhere else we have doubt actually which is the actual phone, so right.So department name indeed functionally determines office phone and HOD well good. Is thereanything else everything in one more thing or anything more let us look at this, they just half adozen of these attributes, so ok.Let me give you again what about HOD on the left hand side, what is the policy of the instituteof regarding HODs, can some professor be a HOD for 2 departments will any professor willagree for that, one department itself is a big headache. So I cannot be head of this department for2 head department, so nobody will agree generally for being a HOD for 2 departments, so ingeneral, a person is HOD for exactly one department.That is a reasonable assumption to make in this context. So given that so if you have a HOD onthe left hand side, or do you have on the right hand side. If 2 tuples agree on HOD then they willagree on department name and obviously phone. So HOD determines department name andoffice phone. In some places they may violate this policy and then make some poor professor asHOD of 2 departments.So it kind of depends on the policy for the institute, luckily we have a policy that each professorhad only one department. So again it brings to focus the fact that these functional dependenciesare actually knowledge about the domain that you are modeling. They are not they do not comefrom some thin air from somewhere, they are knowledge about the domain that you aremodeling.And particularly they cannot be inferred by just looking at it you have look at the model, youhave to look at the domain. Now no other FDs hold and so we can does office phone determinedepartment name, actually that, I think is. So given that if the exact each you know departmenthas, so even office phone can be part of a some kind of a determiner for good, so we can addthat.Office phone now determines department name and also determines HOD, good so some otherFDs also. So this actually again brings to focus ok this is one thing this kind of you knowdetermining the functional dependencies that hold on the domain. By you know understandingthe domain figuring out the key properties in the domain, knowledge about. But there are somekind of functional dependencies which can actually be you know derived sort of mechanicallyfrom the given functional dependencies. That is another situation that arises in this context andwe will look at that Given that some set of functional dependencies F hold on some scheme R, it turns out that wecan actually do some logical inferences. And then logically we can conclude that some otherfunctional dependencies should always hold ok. I will give you example for this, say for instancegiven that X determines Y and Y determines Z hold on some relations scheme. We can actuallyinfer that X determines Z must also hold, why is this.If 2 rows agree on X and because X determines Y holds they agree on Y attribute values. Andbecause Y determines at holds in the given scheme. Since they are agreeing on Y attributes theymust also agree on Z attributes. So that means basically if we assume that they agree on Xattrbutes, we will conclude that they agree on Z attributes this is a simple you know we can laterof course given nice name for this.We can actually call this as some kind of a transitivity property of the function that determinessymbol, well. Then they are going to be a lot more of these functional dependencies, which arekind of you know mechanically can be figured out given some set of functional dependencies.We can actually figure out a lot more functional dependencies by this process of deriving them.So then the question that comes up is how do we systematically obtain you know all such newfunctional dependencies.And obviously we as designers are you know how precious time, so we will not be able to sitdown and list on lots of things. We will list the crucial things and let is there some algorithmicprocess by which we can actually obtain F a lot of dependencies, that is one question that comesup. Of course under the we assume that unless all FDs are known a relation scheme is not kind offully specified, so ok. In this context, let me define a new kind of a thing we call it the entailment relation, it is a newname and bringing in a new symbol I will bringing in. This symbol actually is called the turnstilesymbol. So we say that a set of functional dependencies would entail some other functionaldependency X determines Y. So it is read multiple ways we say F entails X determines Y or Flogically implies X determines Y like that we will read it.Now the defining condition is that if in every instance r of some relation scheme R capital R. Onwhich these functional dependencies F hold, so if the left hand side functional dependency ishold. Then it is always the case that the right hand side functional dependency also holds ok. Ifthat is the case we say that then this right hand side functional dependency is logically impliedby the left hand side functional dependency.Or left hand side functional dependencies entail the right hand side functional dependency ok.Now we have already seen such an example, so in the previous slide we have seen that if Xdetermines Y, Y determines X hold on some scheme. Then they will logically imply that Xdetermines Z holds ok. So are there more such things, so there was a researcher by nameArmstrong who investigated this issue in the early you know when the relational theory wasgetting developed in the 80s 90s.And then he came up with it several inference rows for deriving new functional dependenciesfrom a given set of functional dependency ok. So we will study them ok, before we go one morenotation we call this F + F superscript + as the closure of F which is the set of all functionaldependencies X determines Y. That are logically implied by the given set of functionaldependencies here ok.The set of all functional dependencies that are entailed by a given set of functional dependency,that is F +. So given F, we talk about F + which is the in some sense analogical closure of F ok.Now so these logical the inference rules that Armstrong came up with. They are called Armstrong’s inference rules and they are also known as Armstrong's axioms wewill look at them off ok. The first one is called the reflexive rule which actually is you knowalways holds. So if F entails X determines Y all X determines Y such as Y is a subset of X theseare actually trivial function dependencies. So what basically saying here is that whatever we isthe given set of functional dependency we can always derive trivial function dependencies out ofit ok.So these trivial functional dependencies are those which have their right hand side as a subset ofthe left hand side so trivial FDs. Next one is what is called the augmentation rule, what this is isthat, suppose you are given a functional dependency X determines Y. Then that would logicallyimply in the a new functional dependency where you take some bunch of attributes and add themboth to the left hand side and to the right hand side ok.So we write them as XZ determines YZ where exactly some set of attributes from R. Now so weuse this notational convenience saying that instead of writing X union Z. We simply write it asXZ, ok you can also in our context of schemas it kind of makes sense because we are just goingto write these additional attributes concatenated with the old attributes ok. So XZ determines YZfor some Z which is a subset of R.So this will given that on all any instance if X determines Y holds, then what we are saying isthat XZ determines YZ also false. But it is not very difficult to see this actually, because anywaywe are adding the same set of attributes on the left hand side and right hand side. So if nowadditionally let us say to argue that this is correct. So on 2 rows if XZ attributes agree thenbecause we know that X determines Y holds ok they have to agree on Y attributes.And since you already know that they agree on Z attributes because in the left hand side that Z isalso there. So it is automatic that why is that the agree on why is that attributes nothing ok. Haveany questions please pause me because I am using a little bit new terms here. By now I think youare comfortable with this term called the 2 tuples agree on attributes that means they have thesame corresponding values on a attributes.Because they agree on XZ attributes, they will agree on YZ attribute, given that X determines Y,so that is called the augmentation rule. Then the next one is actually it will transitive rule and wehave already seen I have given you this example. So if X determines Y holds Y determines Ztogether hold on some scheme then we can actually argue that they are logically implied Xdetermines Z also holds ok.And the appropriate name to do for this is obviously transitive call it transitive rule. Now inaddition to this there are some other there are actually 3 more rules which are exactly I will showyou in the next slide that they are not exactly essential. But they are nice to have them you knowso let us look at that actually. So one of this called the decomposition or projective rule, what thissays is that.Given that X determines YZ see if I want to now distinguish between 2 sets of attributes, that iswhy I have written 2 symbols here Y and Z, Y is a set of attributes, Z is another set of attributes.X determines YZ is given then it logically implies X determines Y goes ok. So what we havebasically saying here is that we are only choosing some subset of the right hand side and thenclaiming that function dependency holds which is actually very obvious.Because if 2 given that this happens that means any 2 rows if they agree on X they will agree onYZ it is given. They will obviously agree on the subset of YZ that they already are agreeing onthe entire YZ. So we will agree on some subset of, this Y is a subset of YZ remember this wewill continue to use this notation the 2 sets write them together it is unique ok. So this is calleddecomposition rule which basically allows us to kind of drop some attributes on the right handside and then derive new functional dependencies from the given functional dependency.Now the reverse of this is what is called the union rule if you are able to do that, then obviouslywe can do this. So if X determines Y holds, X determines Z holds, then we might as wellcombine that together and then write that X determines YZ ok. Given that X determines Y and Xdetermines Z whole that means if 2 tuples agree on X. They are going to agree on Y, they aregoing to agree on Z.So we might as well write that X determines YZ which is in some sense the inverse of the aboutrule the projective rule, so that is why it is called union or additive rule. Now here is a one morerule called pseudo transitive rule, what this says is that if X determines Y and some WYdetermines Z is given. Then you can derive WX determines Z it is not very actually difficult tosee this. So if you start off with WX determines WX if they agree on WX then you know sincethey agree on X they will be agree on Y and since W is already there.So if they agree on WY and it is given that they agree on Z, so you can see that they agree onWX they will agree on Z. We can actually formally prove by using, I will show you I will takeone of these examples and then show how exactly we can so called derived then using. In factwhat we can show is that all these 4, 5, 6 are not really necessary. These that this is logicallyimplied by this can in fact be proved by using 1, 2, 3 itself ok.We will see that I think I will show you for this additive rule and I will leave it as an exercise foryou to show the other proof ok. So basically what is happening here is that there are somefunctional dependencies that can be in some sense logically derived from the given somefunctional dependencies. And now we are finding that this researcher Armstrong has postulatedgiven certain rules.And what we actually find interesting about this whole thing is that these first 3 rows or AT or infact necessary. And also sufficient to get hold of all the functional dependencies logically holdon a given a particular set of functional dependencies. In order to get hold of F +, these R A Tthis reflexive rule augmentation rule, transitive rule or necessary and sufficient. We will actuallymake that statement along with it, we will elaborate on next slide good. Before we proceed to that let us take rule 5 rule 4, 5, 6 actually are not really necessary. We willtake rule 5 and then see that we can actually prove that using 1, 2, 3 alone. So let me take this asa small exercise as to also illustrate to you as to how you if you are given some entailmentrelation like this, how do you actually prove it. So you prove it by writing down these giventhings on into different lines as given.And then start writing all the inferences you know what can be, so now given X determines Yextra X determines Z. We can use the augmentation rule on this 1 and then simply augment X onboth sides augment it with X on both sides. So X union X is X itself and X union Y is XY, so wewill get a new thing called X determines XY. Now take the other one X determines Z andaugmented Y, XY determines YZ ZY.Now that you have X determines XY, XY determines ZY use the transitive rule and then youwill get this X determines ZY or YZ. It does not matter whether you write it YZ or ZY becauseunion is committed. So what is the rule here saying X determines Y, X determines Z logicallyimplies X determines YZ we will be able to prove that X determines by making use of rule 1, 2,3 reflects the sorry.Rule augmentation rule 2 times and the transitive rule once, rules (()) (35:07) we have been ableto be sometimes produce. So try this as an exercise for the other 2 rules also 4, 6 also you will beable to prove them is in just want to (()) (35:21) ok. But it is kind of useful to have this 4, 5, 6 asyou know shortcut rules. Because every time you want to use this nice you know rule called theprojective rule or union rule, you do not have to again you know use 1, 2, 3 depend first derive.So well let us keep them as 4, 5, 6 as useful shortcuts but theoretically speaking they are notreally necessary ok. So what is this story so far, the story so far is that normally functionaldependencies have to be determined by the designers ok. But there are some kind of functionaldependencies which can be you know logically derived from the given set of functionaldependencies.In order to understand this we have defined a notion called logical implication among the sets offunctional dependencies ok and we call that has entitlement ok. And in that after we define thisentitlement relation we have now seen that there are 3 rules called the reflexive rule,augmentation rule and transitive rule using which we can derive other functional dependencies Now the interesting question that comes up is that are these rules arbitrarily chosen or they youknow good first thing is are they good and are they enough. So this also can be called as soundand complete inferences. So what Armstrong has not shown is that, he not only postulated these3 rules but he is also shown that these rules are what are called sound and complete. In the sensethat supposing I define this F I mean subscript double A for standing for Armstrong’s Axioms.As the set of all functional dependencies, X determines Y that can actually be derived from Fusing Armstrong's Axioms. Now that we have Armstrong axiom we clear understood them, soyou know also how to derive a new functional dependency given a set of functionaldependencies. So let us call that set of all functional dependencies that can be derived from usingthe Armstrong as let us give it a new name F underscore double AA.Now for soundness, what we have claiming here is that F double A is in fact subset of F +, whatis F +, F + is a closure that is the set of all functional dependencies that can be logically derivedfrom. So whatever the Armstrong's axioms derive or in fact correct ones. That mean they are infact members of F +, so every new functional dependency X determines Y derived from a givenset of functional dependencies F using the Armstrong’s axiom is such that F indeed entails that.So this is if you show this, this is called soundness and or correctness of these rules. The otherone is called the completeness, what this says is that given any functional dependencies that is sowhat actually what this says is F + is a subset of F underscore double A. That means given anyfunctional dependency X determines Y that is logically implied by F. That means is a member ofF + it can in fact be derived from here using Armstrong’s Axioms, this is surprising fact actuallyright.So if you are able to show these 2 things then we have made really very strong statement aboutthe Armstrong axiom is that. These rules are enough, do not look around for new rules theserules are enough ok. So that is very nice property of this Armstrong’s Axioms. So in the nextlecture, I will actually prove that the soundness and completeness of this Armstrong’s Axiomsindeed hold ok. Now there is a nice picture that I have here to illustrate this thing, that this is F and this is F + ok.So basically derive using double A, so what we have soundness what claims is that whatever youderive using Armstrong's axioms falls within F + it does not go outside that is the sound. Nowthe other one is that taken something from here you take some functional dependency that is amember of F + just it can in fact be derived using F.That is what the completeness F, F + is subset of so it is little bit interesting to prove these 2facts. We will see them in the next lecture ok, I suggest you reflect on these 2 interestingproperties of F transactions before we come to the next class. Actually try out a some schemes infact you have all you know some derived some schemes. So on them, you try to figure out whatare the functional dependencies that hold between attributes. Look at them from this angle ok,good.