Amazing April Sale! 🥳 25% off all digital certs & diplomas!Ends in  : : :

Claim Your Discount!

Module 1: Introduction to Relational Data Model

    Study Reminders
    Support

    So in the last lecture I was discussing this query and then I introduced several ideas. So, I kind ofadded couple of slides I thought I would repeat some of this detail before I proceed further. So, basically I illustrated this issue of you know using renaming and introducing temporaryrelations while doing the specification of the query. So, the use of renaming I have kind of specifically written here that this is a temporary relation tohold the intermediate results. And all these attributes are being renamed as eId, pname anddepartment number. So, we discussed this already and creating such relations basically helps usand understanding and formulating the query make easier manner. And in a similar way we havethe other department detail also as a temporary relation.And we are using this left arrow as a to indicate assignment operation okay this is also anothertemporary relation. Now, this renaming is necessary, because we want to ensure that the crossproduct has distinct attribute names, this also I mentioned in the last class. Now, the operator rho is a renaming operator and I tried to write this big expression using a thejournal here right, but I do not know whether you have been comfortable seeing that thing. So, Iread under written in this slide here. So, one can use the rename operator and write the wholequery as one big expression as an alternative to using this temporary relations.So, how it looks is like this okay. So you can now make use of, so what we are doing here is that,so, this is the professor relation that we have and then we are projecting the required attributesfrom here and then renaming them using this rho operator renaming them and then doing a crossproduct with the other. This is what we call us department detail in the in the previous slide,whereas this is the one which we call us professor detail right.And so we are doing a cross product of these 2 things, and then selecting the relevantcombinations by giving the selection condition department number equals dId, and then finallyprojecting eId, pname and dname attributes from this entire relation. So, these brackets seem tohave modes here and there in this slide, but this bracket closes this whereas this one closes thatright.So, one can write it as a big relational algebra expression like this, but then you can see that it iseasier for us to understand the and formulate the query if we use this meaningfully namedtemporary relations. So I would encourage you to do that students are encouraged to use thenotation of introducing the temporary relations and then writing the sub expressions and thenformulate the query okay.So, I hope these points are clear. Now, let us move on to the need for other operator which iscalled the joint operator I just mentioned in the last class. This is a very important operator, mainly we noticed that whenever we use cross product, we aregenerating you know combinations of tuples from 2 relations and most of the time all thecombinations will not make sense. Only certain combinations will make sense and so we need tomore often than not follow it up immediately with a selection condition to select the appropriatecombinations of tuples from the 2relations.So, if you are doing it very often, why do not we introduce an operator for that that is thethinking before for this joint operation. So, since cross product produces all combinations oftuples, and since only certain combinations are meaningful, we have to follow it with usingselection. And so we know introduce a new operator called join, what it does is combined tuplesfrom 2 relations combines meaning concatenate.In this case we will take 1 tuple from here take and another tuple and concatenate the tuples,concatenate tuples provided they satisfy a specified condition, they call the joint condition. So,this is kind of equivalent to performing the cross product followed by selection and it is veryuseful operation in practice okay. So, depending on the type of the condition that we specify,there are various kinds of joints that we that are kind of talked about. So we will talk about thesethings called theta join, equi join and natural join various different joints.And then later on of course, when we go to you know the implementation details about theseoperators, we will talk about specific algorithms for realizing these joint operations okay. So what is theta joint, l 1 is a relation with scheme r 1 equals A 1 through A m some m numberof attributes, r 2 is another relation with scheme B 1 through B n. And let us assume that withoutloss of generality we can make this assumption that the intersection of this r 1 and r 2 is null set.There are no common attribute names across the 2, because if there are there, we can alwaysrename it right.So, we can make this assumption now, there is no loss of generality in making this assumptionthat this A 1 set is designed from this B 1 through B n, there no common attributes. So r 1intersection r 2 is phi. So, we will make that assumption, we will actually also as sometimesrelax this assumption in the case of natural than as we see it later okay. So, now, we make thisassumption that the attribute sets the schemas are disjoint for these 2 relations.And we express this join with this symbol the join, so the subscript is the joint condition, so, r 1joint r 2 with the joint condition and what is the structure of the joint condition, how should it be.So, theta is the joint condition is of the we can assume that it is a conjunction of certain atomsand where each of the C i is of this form that whereas the A j compared with some B k, noticethat there is no need for having a condition in which there are attributes of the same relation ofcompound.Because we are talking about joint relation, joint condition here. So, attributes from one relationhave to be compared with attributes of the other relation. If you are comparing attributes of thesame relation, you might as well do it in the selection conditions right, you do not have to do it inthe join. So the in the joint conditions, the form will be like, we take an attribute from the leftside relation and take an attribute from the right hand side relation and then we do a comparison.So the comparisons is one of these 6 possible comparison operators. So, we make comparisonsand then we can do conjunction. And then and this is a general form of the theta, notice that wedo not need to really bother about the form where we are having r s and things like that, becausea every logical you know, expression can be rendered into conjunctive normal form. So, weknow that, so we basically make use of that thing. So, these are conjunctions.And what is the scheme of the result relation r, the scheme of the result relation r is basically theconcatenation of these 2 schemes right. So, A 1 A 2 A m followed by B 1 to B n of the schemeand the result relation itself is defined like this. So, the tuples even A 1 A 2 A m b 1 b 2 little bwe are using here to indicate the values. So, such a tuple such that A 1 through A m data tuplebelongs to r1 b q through b 2 belongs to r 2.And this combination satisfies the joint condition provided satisfies the joint condition. So, thatis the results such that this part of the tuple belongs to r1, this part of the tuple belongs to r 2 andit together satisfies the joint condition and what is the joint condition is in terms of these valuesof these. So, it says A j equals B k. So, you pick up the A j value B k value and then check thecondition and there was a bunch of conditions to be checked. You can do that okay.So that is what a joint is. So, this is very useful operation in practice and so the most generalform of that is called the theta join. And we will oftentimes do not use this very general form, butwe will use some specific forms of join and so we have names for those kinds of joints okay. Okay here is a set of some data that I will use in order to illustrate a join. Let us see. So, we haveprofessor relation and department relation. And the courses relation and professor we have acouple of computer science professors and in electrical engineering and mechanical engineeringprofessor and so on. And so we have here computer science, electrical engineering, mechanicalengineering departments.And then some courses like algorithm A.I, D.S.P and aerodynamics all that okay, so some data isthere, so we will just use this data in order to illustrate couple of queries. So now let us considerthis for each department find its name, find its name, the department's name, and the name, sexand phone number of the head of the department okay. So, where is that information thatinformation is obviously here in the department relation.And as in the professor relation, because they hod is an employee, hod is a professor andprofessors details are here. So you need to join department with professor with an appropriatejoint condition and what is the joint condition here. The joint condition is basically that heredepartment is mentioning the hods employee Id. So this employee Id should match the employeeId of the professor tuple only then that particular combination gives us the details about aparticular department.So the matching the joint condition here is this hid should be equal to the employee Id. And sowith that we will be able to formulate this query, so we will see how exactly we will do that. So, before we do that some renaming is necessary because there are some common attributenames between these 2 things. So name is common. The attribute name is common here okay.And what else is common, phone is common. Phone is also a common attribute. So we have todo renaming because we assume that these attribute lists have to be disjoint. So we will doappropriate renaming.And then so I will introduce a short form of professor as prof and then do the renaming here,employee Id, pname instead of name, I just made it pname and the rest of the things are as theyare, and then I rename phone as prof-phone, because then this will be different from thedepartment phone. So it is a prof-phone. So that is how I take professor and then project therequired attributes from the professor and then rename them here like this. And then similarly,then I can go straight for the join.So here I performance department join process prof, which is this temporary relation with thejoint condition hod equals employeeId right. And so once you do that, we get the relevantcombinations and from the relevant combinations we have been asked to find name or thedepartment name and other details of the head of the department. So we will project the so eventhough we just been asked name, we will know project department Id also.So that, you know, we know that the key of the department is the result. So, department Id namehod, pname for the name of the professor and other details. So, that is the result how exactly thisgets computed and all that we will not go into right now we just look at them as mathematicaloperators right now. But then of course, we have to worry about how exactly to implement theseoperators at an appropriate time later.Good. So I suppose this should, so if you do not have join operation, then we just have to wehave to do a cross product and then get all possible combinations of department prof and thenfollowed it by a selection condition with this selection thing, right, that is how we would havedone without the join operation with the existing the join save some you know, expression here,it simplifies the expression, the presence of joint okay. Now moving on, we have other kinds of joints called the equi join and natural joint. So equi joinis basically where the in the joint condition is not a generic kind of a joint condition, but it is avery specific kind of joint condition where equality is the only component operative used in thejoint condition, that is the case, we call it an equi joint. That is all. Sometimes it is useful becausemost of the time we are only doing about value matching. And so, equi joints would be enoughfor us.And natural join is where we make a specific assumption saying that r 1 and r 2 have indeedcommon attributes like say X 1 X 2 X 3 is let us say these are the 3 common attributes, okay.And then we do not even have to specify the join condition, so this particular natural joint has itsown small operator, star operator asterisk operator. So, where we assume that whatever be thecommon attributes between the 2 schemas will be simply equated.That is the joint condition. So the joint condition is always assumed to be like this R 1.X1 = R2.X1 R 1.X2 = R 2.X 2 R 1.X 3 = R 2.X3, X1 X2 X3 are the only common attributes between R1 R 2 and we will put the equality for them and then so values of common attribute should beequal is the join condition. And the results came of a Q will be R 1 union R 2 minus these 3attributes because you only need one copy of X 1 X 2 X 3.X 1 X 2 X 3 are present in both R 1 and R 2 but we need only one copy because it does not makesense to keep both copies because anyway they will be equal because we have putting thiscondition right here saying that there has to be equal. So in the result they will be equal. You getburned in the result, the result of a join the values of the X 1 from R 1 and the value of X 1 fromR 2 in the tuple will be the same where you are imposing the condition.So and the notation for the natural join is simply like this, r 1 start r 2 this is asterisk symbol. So,you do not have to specify any condition, the condition is assumed to be this always. If there areno common attributes, then there is almost like you know, you are not specifying any condition.So, it will actually, you know, come down to cross product. There are no things it will be equalto prosper. So, here is an example of equity joined, find courses offered by each department. So, we want tofind the list of courses offered by each department. So, each department offers several courses inour example data, we just had only 4 courses and so, 2 of them are offered by computer sciencedepartment with other offered by electrical and then the third one is offered by mechanicalengineering department.And so when you do a department so this is an example where equi join, so I just want to say thatis the quality condition. You cannot perform natural joint here because these attributes aredifferent, you know, they are not the same. So, they even though they are conceptually same, buttheir actual, you know, the strings are different. So they will be treated as different attributes. So, for example, involving natural join let us look at this. Let us say apart from the professorrelation, we also have the teaching relation, which has these details like CS01 employee isoffering some courses CS635, 636 like that. And these are the courses being offered in sounds ofsemesters and what are the classrooms etc. So now to find courses handled by each professor ortaught by each professor, what we do is to simply do a natural join between professor andteaching.Because employee Id is the common attribute. And so we want we can go back to that professortable that had employee Id and of course it has okay, I think it is almost there here right. Soprofessor, employee Id, name, sex, start year, department number, phone. These are the attributesof professor and now in the teaching we have employee Id. So, there will be a natural join thatoccurs and so we get details about.So, in case a particular professor is accurate teaching more than one course then the entire detailsabout the professor will be repeated right. In this case, it is not like that so, we get this lesson. Sothis is natural join okay. Moving on, here is another interesting kind of operator. I mean, calledthe division operator. It kind of gets used in certain may be specific kind of situations. So here is the details of thedivision operator, the necessary condition for us to determine the odd divided by s on instances Rand S is that the scheme is should be a subset of R proper subset of R ok. So the relation rdivided by s is a relation on the scheme R minus S. R minus S okay here is the actual definitionbefore we understand the definition let me introduce this little bit of notation here.We take a tuple, say t r whatever it is, and then write a subset of attributes are actuallysubsequence of attributes in this okay, a bunch of attributes. Let us say A 1 A 2 A 3, then whatthat means is these sub tuple of that particular tuple, consisting of values of these attributes, in Sokay, so that is the notation, we introduce a notation for that because we oftentimes have to talkabout that sub tuple.So, there is a wide tuple and then we have a subset of those attributes from that tuple. So, wewant to talk about that. So, we introduced a notation for that t r, S, S is a subset of the scheme isthe sub tuple of that tuple t r consisting of values of the attributes in that particular sets are thesequences okay. Now, so to define r developed by s, there are 2 conditions here that it willconsist of all the tuples t which R – S part of R okay.R – S part of the r the first left hand operator, but then they have this very unique property thatyou know. So, the combination of this tuple which is a tuple of R – S okay. It combines witheach of the tuples that are there in S and exists in R. So, that is actually expressed here that forevery tuple t s in s, there is a tuple t r in the relation r satisfying that the sub tuple for S is this asub tuple of t r for this particular attributes s is indeed this t s.And then the R minus S is the t that we are looking for okay, t is going to be in the result.Actually, the sub tuple t is the one that is we are looking at we are looking for. So, this is thetuple sub tuple that exists in the result, but this one will have the property that no it combineswith every tuple t s okay. So, if you focus on some tuple t S. So, we are guaranteed that sometuple t r exists in r such that the R – S portion of it is this particular t that we have.And the S portion is the tuple t s in s okay. So, basically it kind of you know we can alsorestarted it in this way maybe to help us to understand the definition of have a look at here okay.So the division operator produces a relation R X, now here I am slightly again renaming theschemas that includes all tuples T X that appear in r 1 in combination with a every tuple r 2 of r 2every tuple from r 2 okay where this you know schemas are appropriately worked out.So r 1 is some Z is R 2 is this Y, and Z is the union of these 2 things okay, so, I think this needsan example we can have a look at the example and then come back to this definition, so that wecan understand this definition better. So, let me show you an example first. Let us look at this okay, so let us say r is having attributes A B C D and S is A B and X is C D.Okay and now looking at r divided by s. And the scheme for r divided by s is r – s okay. So thatmeans the set of all attributes that are there in r but not in s that means C D right. So, X C D isthe output schema of the r divided by s okay, now how exactly this result gets computed, you canhave a look at it.Just let us look at this instance . And so s having 2 tuples a 1 b 1 a 2 b 2. Now, what are thosetuples in the C D part of the relation r that combined with every tuple here and exist here that isthe question we are asking. So, get hold of all those tuples in the C D part of this data, which hasthis interesting property that they combined with every tuple of s and exist here okay.So, now we can see that among so if you focus on the C D part, C and D part of the relation r,you can see that if you project is here, you will get C 1, D 1 C 2 D 2 C 3 D 3 right if you projectthis relation on C D, you will get 3 tuples C 1 D 1, C 2 D 2, C 3 D 3, right. So among the C 1 D1, C 2 D 2 C 3 D 3 which of those things have the property that they combined with a 1 b 1 and a2 b 2 and exist here.That is the question we are asking okay, so now we can see that C 2 D 2 does not have thatproperty, C 2 D 2 combines it only a 1 b 1, but does not combined with a 2 b 2, a 2 b 2 c 1 c 2 c 2d 2 does not exist in the data here. Whereas c 1 d 1 combines with both a 1 b 1 and a 2 b 2 existhere, whereas c 3 d 3 also combines with a 1 b 1 and a 2 b 2 and exists here only c 2 d 2 does nothave that property. So in the result we get only c 1 d 1 and c 3 d 3 okay. So is the process I meanis the semantics of division clear and basically what we are asking for let me know take it backto this definition.So, the division operator produces a relation R X that includes all tuples t X where X is that youknow the difference R – S kind of that appear in R 1 in combination with every tuple from r 2okay, r 2 is the other side the right hand side. So, now you can reread this definition t belongs tothe R - S portion of r and for every tuple t s in s, there is a t r in r such that the t r R - S portion ist and T s s portion is this particular t s.Basically we are saying that t combines with all t s which are tuples in s. Well, you mightwonder actually is this operator it still looks quite complicated when do we really need it do youuse it you know that kind of yes in some situations, it becomes useful. Any questions here. Havea look at the example again if you want. So the result is R – S. So which is C D and so. So c 2 d2 is not present in the result of the division as it does not appear in combination with all thetuples.So to kind of illustrate this, let me take an example of a set of students, you know, who have thisunique property. They are probably, you know, graduating soon or something like that, and theyhave registered for enroll for all the courses that are offered by computer science departmentright. And some time ago, we used to have this interesting situation that the number of facultywas less, and so the number of electives were less.So the dual degree students ended up doing almost all the courses that are available in thedepartment, you know, like that. So, of course nowadays, it is not really possible for many of you to do all the courses in thedepartment. So find those students who have enrolled for all courses offered in the department ofcomputer science department. So what is it that we are looking for here. So this is an apt examplefor this is an app situation for applying division operator, because we are now looking at theenrollment data and then, you know, we are asking for who are those students who combinedwith every course that is there in the set of courses offered by the computer science departmentright.That is what we are asking for. Who are those students, roll numbers that combined with everycourse that is existing in the set of courses that offer by computer science department and exist inthe enrollment relation, right. So that is how this the division operator will be used here. So getthe courses enrollment information for all students. Where do you get this. You get it from theenrollment relation okay.So I want to take a natural join between student and enrollment. In fact, I do not actually needthis because enrollment itself has roll number and roll number is there but if you want the namedetails, then you will have to combine with student let us assume that, so we are doing that okay.So we are doing a natural joining student and enrollment and getting the student enrollment data,roll number, name and the course ID.And then in the second step, we are getting the course Ids of all courses offered by the computerscience department. How can you get that, it is very simple. So we take the courses and thedepartment and do it join saying that department Id should be same as department number. In thecourses we have department Id attribute as what is the offering department. And in thedepartment we have department number.So we take the appropriate the combinations. So this will produce information about coursesalong with the details about the department that is offering. And then we do a selection based oncomputer science, the name of the department should be computer science. We are assuming thatwe do not know the idea of the computer science department. And so we have a selection here.We in fact, know that it is department number 1 right from the data.But then anyway, we have to go by what is given in the query. So gets course Ids of all coursesoffered by CS department, so we get you can project the course Ids, after doing a selection ofcomputer science of this joint, so we will get this CS courses. Now all that we had to do is tosimply divide the student enrolled relation instance with the CS courses to get the desired result.So, result a student enrolled division CS course set of all.So, this will me give me so what will be the scheme of this, what will be the scheme of the rollnumber, name, right is not this minus this right. So, roll number name is the scheme for this So,these students are these special kinds of students who have enrolled for all courses that are herein the computer science department they have nothing else to do, they have to leave okay. So, this can illustrate with data, so suppose the we skip the roll numbers for simplicity, let usassume that for the moment that the student names are unique okay, so we have name and courseId. Let us say this is the student enrollment data. And this is the course Ids just ridiculously smallI mean just 2 courses being offered by computer science department which is not the case andthen we do a division.So you can verify that this is indeed only this Mahesh and Piyush should have this oh, okay. Theonly man left out is Rajesh okay he is not done all the courses okay. So you get the point right, Isuppose. So this is for in some special situations like where you want to check whether a data setof data combines with every tuple in another relation. You basically use the division operatorokay.So, study this examples and also the definition so that you will get the you know the way wehave defined it whether okay. Now let us come to this complete set of operators question are all the relational. So, what are allthe relational algebra expressions we have introduced so far, we have introduced selection,projection, union, intersection, difference, cross product, join and division, rename, of course,rename is a not a computational object but its computational operator but it is the convenientoperator. So, some operators can be realized through other operators we have already seen that.So, that leads to a question as to are all algebra essential. The answers of course is not, that theyare all not essential, the minimal set of operators we can show it theoretically which we are notgoing to do it now in this course, but one can show this in theoretically saying that sigmaselection projection cross product union, difference, these are the complete set of operators thatmeans, these are necessary and sufficient.So, you can realize all the other operators through these operators. So, the intersection can berealized through union and difference, work it out as an exercise as to how you can introduce,you know, realize intersection using union and difference. The join and as I was alreadymentioned, can basically be made use of can be done by cross product followed by selection. Thedivision also you may take it as an exercise and work it out as to how you can do the divisionwithout using the other operators.Division is not an essential operator actually, it is you can do it without with the basic operatorsproject costs products and because I am just giving a hint here saying that you can do it withusing project cross product in difference. But you have to work it out as to how exactly to dothat. So, it is very interesting and that all the various kind of queries that we may have againstinformation system can all be expressed using this less than half dozen operators right.So, of course in practice, so are these the operators that are going to be used in practice.