Alison's New App is now available on iOS and Android! Download Now

Module 1: Fundamental SQL Principles

    Study Reminders
    Support

    I have uploaded the slides for TRC. So please go through the remaining. There is oneimportant aspect about the Tuple Relational Calculus which I could not discuss in classbecause of power failure last class, but please go through that. It is something to do veryimportant aspect. It is something to do about safe query; some Tuple Relational Calculusqueries are called safe Tuple Relational Calculus queries.So information is there in the slides. Okay let us get started off with this; so sequel it is alsocalled sequel SQL it originally expanded as Structured Query Language, but it is much morethan a query language. So now we think about it more as a standard so that is why the title formodule is the SQL Standard. So it is an international standard that specifies how a relationaldatabase is to be setup and so the various details about that.How do we create the schema and then how data is to be updated and inserted into thisrelations, how data is to be queries one uniform way of querying the data that happens to bethe principle you know function of this particular whole standard and that is why the nameoriginally was called as a query language and then how you know it also has constructs forspecifying when to start this transaction, how to stop it.And then setting certain parameters related to the transaction so how these transactions if thecollection of transactions how should they collectively behave, what is there is somethingcalled isolation levels can be setup. So should this transaction should isolate from each otherthese are the various things that you can specify using SQL and then when data is there inrelation database a program have to access the data and then make use of data.And then also update the data. So how do programs access data so how is this programmaticaccess to data which is there in the database, how is that to be performed all these things arestandardized and host of other things are also kind of standardized. So the advantage of astandard is basically that you get what is called interoperability that means if you setup yourdatabase and define it using standard SQL features.Then in principle you will be able to operate with that you know from other applicationsdevelop in any language and if you want to and you develop it in one system and you want toport it in other system it should be easy to do that because you know the standard is followed.Okay so the database management RDBMS vendors so all the RDBMS management systemsvarious RDBMS management system Oracle, Db2 Microsoft Sequel server, MySql so manyRDBMS systems.So all of them are required to support and implement this SQL standard okay that is when itcan standards are for enforcing consensus, standards are essentially for enforcing consensuswhere all of them should agree that yes these are the features, these are the way we supportthe features etcetera and that is when interoperability will be possible okay. So of course dueto certain commercial interest the RDBMS vendors may give some features which are overand above the SQL standard.See SQL standard says that these are the SQL commands, these are the various features, thesehas to be implemented, but nobody prevents me from doing a little bit more right I canalways add frills to my product and then make it more appeal so to say. So RDBMS vendorsmight will it is that to their commercial advantage to kind of actually give a little bit moreadditional features so that you may get attracted to use those features, but of course thedownside of using these vendor specific features is again portability.So today you are using oracle and then you have some issue with the oracle service or thecompany or whatever and then now you want to move your entire application to anotherRDBMS system like Db2 or some other Db2 is supported by IBM like that then what youwould expect is that your application you know it should be easy for you to put yourapplication into the new RDBMS system that is what the meaning of interoperability is right.So when both these vendors you know they implement SQL, but if over and top of that SQLif they have given some features and if you have used those features then you are likely to getdifficulty putting your application from one vendor to another vendor, shifting yourapplication from one specific system to another specific system. So the pragmatic advice hereis that stick to the standard.The vendors have they may their own reasons why they may give you additional features, butit is always advisable to check whether is this feature part of the SQL standard before youemploy it in your application in the interest of portability of the application. Okay so let us move on going a little bit of into the history of SQL if it is developed in byIBM in the early 70s and then it was actually implemented as part of the system or project atIBM San Jose Research Lab and that is actually the earliest version of SQL. Later on ofcourse it got standardized during 1986-89 one SQL standard version came then around 92 itwas SQL 2 has been standardized.And around 1999 to 2003 there is new standard that came which is this SQL 3 that includesalso object relational features and the kind of evolution continuous. It is as and when variousdevelopment take place in the computing science general domain the SQL standard also kindof tries to evolve to cater to those additional needs. For example, when lot of developmentsfor taking place in the XML world.So RDBMS standards have reacted and then kind of included additional features as to howyou can treat an entire XML chunk as a data item for example. Now you also have if youhave to deal with large binary objects then how do you handle them multi as multimediastorage has increased how do we handle multimedia objects as part of relation databasesystem. So many other you know additional details will you know need some standardization.And these standards are continuously kind of evolving. So one has to keep up to date if youare working in the database world then it is kind of essential for you to keep looking as whatis the development in the SQL standard and this module I just want to put a disclaimer herethat this module covers important principles of SQL. It is not you know I am not going togive a complete syntactical and not even all the features of SQL, SQL is vast standard.So we will touch upon the important features that are relevant for a introductory course likethis course and then so I urge you to pick up lot more information about SQL from the SQLreference manuals. So various components of SQL standard okay we can yeah so there are various features thatcan be kind of grouped into some functional groups. So there is a data definitions language socalled. So this basically has constructs for schema definition and the relation integrityconstraints definitions and use and schema modification all these constructs that carry outthese tasks are all can be thought of as the portion called data definition language.And then how do we specify querying of data in relation instances and how do we insert data,how do we update data all these things is in the data manipulation sub language so to say andthen what are the SQL commands or features that are required for programmatic access tothis relation database systems. So from a high level host language like C, C++, Java programshow do we access data features too. So we will look at embedded SQL as part of this moduleand these are dynamic SQL and embedded SQL are involved in this process of setting upproblematic access to databases. Okay now moving on we also have features for transaction control, commands for transactioncontrol, how transactions have to started, stopped etcetera and one major module within SQLis the access control data access control. Data is actually sometimes data is very sensitive datayou know you do not want the certain pieces of data to be seen by all people. You want to putsome restrictions and who are the people who can look at the data, who can update the data.Who are authorized to change the data etcetera these are access restrictions. So how do wespecify these restrictions? So there are features in SQL in order to do that it is calledauthorization features. We will not actually study much of these things. Okay let us start off with data definition. So far we have been saying there is a relationschema so how do we actually create a table. So it is the SQL command to do that. So here isthe create table command and then you give the name of the relation and then within bracesyou give various attribute definitions. So attribute definition, attribute definition some nnumber of attribute definitions followed by.The square bracket indicate optional thing so this is integrity constraints various integrityconstraints we can give these are also called the table constraints and we will go into thedetails about what are the integrity constraints we have already seen them actually these arecalled structural constraints right so how do we specify them etcetera. So this is the createtable command.And it has 2 things attribute definition and the integrity constraints. Attribute definition itbasically has the name of the attribute or attribute name and the domain type followed byoptionally followed by whether it is allowed to be null or not null. So if you give not null thenit is never allowed to be null if you do not give this then it is allowed to be null and then youcan also specify what is the default value for it.So there is a you can by using this word default and then supplying a value in quotes you cangive. For example, create table example 1 it has A and B and C these are the 3 attributes. SoA is 6 character length value it is not allowed to be null and it is default is this all zeros and Bis an integer and C is a single character whose default value is F. And there are lots of these domain types I will not go into full details about this, but most ofthe types that you are used to in programming languages have their kind of counterparts inSQL data types. So there are numeric data types, integers of various sizes like integers smallint, real numbers, real, float, double precision and then formatted numbers all of them arekind of available.So and then character strings of 2 types the fixed length characteristics and the varying lengthcharacteristics. So varying length means VARCHAR so the value can have maximum of ncharacters you can specify that and if it is character n then it is fixed length it has exactly ncharacters and things like that and in the and there are bit string data types also and thatmeans it is a string of bits and you have again 2 variations bit varying and just fixed lengthbit. So and then date data type, time data type and lot of other data types are there please look upthe SQL reference books when you come to developing the programs. Date is a 10 positionformat so year and then month and day time has hours, minutes and seconds etcetera. So,most of the data types that you are familiar within programming languages will be availablehere. Now let us come to the integrity constraints part. These are also called the table constraintsand what are the table constraints that we are familiar with. Domain constraints, keyconstraints and referential integrity constraints, so domain constraints are in some senseindirectly they are already specified here because you are giving the data type right for everyattribute we are giving the data type, domain type okay.So that specifies the domain constraints and then coming to the key constraints here is howyou can specify the key constraints. So you use the word prime key word primary key andthen give a list of attributes. So what it basically specifies here is that the set A1 through Akis the primary key for the table and then a table can have a relation can have multiple keysand so you want to pick one of them and then say that it is a primary key.And then the remaining things you can specify using this word called unique so this key wordunique you can use and then say that so this B1 through Bk also is a key. So other candidateskeys you will use this word unique to specify that these are also keys. So that kind ofcompletely specifies as to what are all the keys that the relation has. So there can obviouslybe more than one unique constraint, but only one primary key constraint for a table right. And then here is how you specify referential integrity constraints. So the key word here is touse foreign key and then specify what exactly is the keys sorry attribute or attributes thatconstitutes the foreign key and the key word references and the relation that it this that werefer to and what is the specific attribute in that relation that we refer to okay. So this phrasespecifies that attribute A1 of the table that is being defined let us say r1 is a foreign key.And it refers to attribute B1 of the table r2 okay. So we use the word table and relationinterchangeably in this module onwards okay. So though there is some technical differencebetween the results of SQL queries we will see it later. Okay so what does this meanspecifying this foreign key A1 references r2. You can recall that what this means is that eachvalue of this column A1 in the relation r1 in the relation instance of r1 the relation instance r1you know is either null is allowed to be null.Or is one of the values that are appearing in the column B1 of r2 that is what referentialintegrity constraint is right recall that. So you could specify so this is how the foreign keygets specified. So let us now look at the scenario as to what should happen if a referential integrity constraintgets actually violated okay. So we can actually specify while we are defining the table itselfwe can specify as to what are the actions that can be taken or should be taken actually if thereferential integrity constraint violation occurs. Okay when does a violation occur? A tuple inone relation instance is trying to refer to refer to tuple in other relation okay.A violation can occur if the refer tuple does not exist there okay. It should exist that is whatour referential integrity constraint means supposing you are operating your database and thereis a reference from one relation one tuple from one relation to a tuple in another relation andfor some reason the second tuple the referenced tuple gets deleted. If it gets deleted, thenthere is a referential integrity constraint violation okay.Or let us say it gets updated for some reason you know the actual column value B that this isbeing used for referencing has changed, the value has changed then also see while referringyou are using department number 3 and for some reason in the department number 3 thedepartment table 3 has got updated to 6 for some reason let us say. So then there is also aconstraint violation because 3 got updated to something.So this can occur the violation can occur if the reference tuple is either deleted or is modifiedunder these 2 circumstances a violation can occur and what we will be doing as part ofdefining the relation is that what is the action that is to be taken on the relation that is beingdefined or the instance relation is data when such a thing happens okay. So there are 3possibilities so these actions to be specified have these qualifiers.Because only upon deletion or the modification the violation might occur so if occurs due todeletion or if it occurs due to update what should be the action. So there are 3 possibilitiescan actually be specified. So one of them is set null the other one is set default and the otherone is cascade so let us look at those things. So, these actions that can be taken on thereferencing tuple.The tuple is referencing referring to null a tuple we will call it referencing tuple okay. Sothese actions will be taken on that referencing tuple. So what it says one of this first things isset null. Set null means you know this is the safest option like so if foreign key has certainvalue and due to update or delete that particular value is now not existing in the other tableand you detected this.If you detect, this the simplest thing that you can do is in the referencing tuple you can simplythat value to null, simply convert the value to null. If you convert the value to null, then youare kind of breaking that link there is no more referencing at all there is no more referencingat all. So that means in some situations you may favor this. In other situations, you might saythat set default.So the foreign key attribute value will be set to its default value. Let us say you know allemployees who join are first by default assigned to some department and then later changedto other departments. So there is a default department for all employees let us say so there is adefault value for the department ID or the section (()) (27:08) department ID. So you canchoose to set up that.So in case he loses the information about see the employee tuple for example is referencing tosome department and then that department vanishes for some reason and so you can set thisguide to work at the default department. Let us see later we will see how to handle thesituation kind of right. So that is called set default I will take an example to illustrate all this 3things.Then there is a more one more option called cascade what this is it is a bit of a very drasticaction what this says is that if the reference tuple is deleted then delete the referencing tupleor update the foreign key attribute if the reference tuple is updated propagate that updateokay. So this is the option. So I will show. Let us talk about these things again in the context of an example let us create this studentcreate table students. Now we have various attributes here roll number I would say fixed 8characters’ value which is not allowed to be null, name is a 15 character I am sure some ofyour names do not fit in 15 characters, but so 15 character up to 15 characters are not null,degree is 5 characters, year is small int, sex is 1 character not null.And department number is small integer, advisor is a 6 character field then the primary key isroll number and then we have table constraints so these are all comma separated just you cansee. So within the same thing there is no space separated and then comma separated. Theforeign key department number okay so this department number is declared also as a foreignkey and it references another relation called department and the attribute references isdepartment ID okay.So the student’s department is a foreign key that references the department ID attribute in thedepartment relation for further details about the department. So now here are the additionalthings that we specify in order to take care of the referential integrity constraint violations.What this says is on delete set null on delete set null and on update cascade okay. So what didhe saying here is that supposing there is one (()) (30:19) whose department is 9.And for some reason department 9 has been closed okay (()) (30:31). So what we do here iswe set the students department as null value this fellow is now without any department okayand the other thing is on update cascade his department was actually 9 for some reason it gotchanged into 10 because somebody decided that will give department numbers according toalphabetical order of the department names or something like that happened.So his department has now changed to department ID 10. So what this says is cascade it thatmeans whoever is was in earlier department number 9 you update all those fellowsdepartment ID to 10 now which will solve the problem right. So that is what on updatecascade that means basically propagate the change to the referencing tuples those tupleswhich are referencing this value.Okay so this is how you specify the actions to be taken on this table the data of this tablewhich is being defined when things happened on the tables that this table is referencing it isreferring to some other tables and if something happens in those tables what should be donehere okay that is what we are specifying here. So in case something happens in thedepartment table what should be done for the department number attribute in the tuples thatexist in the student relation that is what we are specifying.Okay the other is foreign key other foreign key is advisor here. Advisor references aprofessor table and the specifically the employee ID attribute there and again same actionsare found suitable here so for some reason somebody is advisor leaves the institute he may setit as null advisor as of now, somebody should figure out that this guy does not have anadvisor.And then Dean office will wake up Monday and then try to fix in advisor for this particularperson or if the advisor for example changed from one department to another departmentokay and then his employee ID for some reason changed and then propagate that employeeID, propagate update cascade. So that is what we are specifying in as to. So these are veryimportant and interesting actions that we can seek.Because these are kind of structural constraints we have this ability to kind of tell as to whatshould happen when these constraints are violated and that is given as part of the definition ofthe relation itself the schema definition we set up this actions also okay. So that is theexample so if you want to now go back to these things. So these are the 3 actions that can beso these of course these actions are specified for each of this foreign keys that are thereindependently.For each of the foreign keys that are there in the relation that is being defined you have to setup all these referentially triggered actions okay any questions there. So you can set up ondelete cascade it is a very drastic option on delete cascade. So your student is referring tosome advisor when advisor leaves then you say that drop the student also it is a very drasticaction.So you should be careful while specifying on delete cascade that if the delete cascades toother tuples then we will lose lot of data. It might be appropriate in some cases for examplelet us say we are keeping track of the information about the dependents of an employee in acompany and the employee leaves the company, if the employee leaves the company I do nothave any incentive to keep track of the information about the dependents of that particularemployee anymore in my company database.And so I might use this option of on delete cascade while I am specifying the dependentswhile I am specifying the dependents in the dependents table while the dependent eachdependent tuple refers to employee tuple and in case employee leaves drop the dependentalso so that is an appropriate. Okay so these are the various actions that you can specify. Now how do you modify a defined schema. So there is a command, there is a feature whichallows you to kind of change the table. So alter table command can be used to kind of modifythe schema by adding a new attribute. So far we did not have address in the students. So nowwe can say alter table student relation and add is a key word the address attributes with itsdata types.Now you might wonder what should actually what happens to the data. So far we have somedata and in which the address attribute was not there. So now we are trying to add thisaddress attribute so it is a issue as to how to pick up address value for all the existing tuplesand then etcetera. So schema modifications are pretty messy actually. So you should avoidthem to the extent possible.But in some cases that you may want to really record additional information which was listout while the design phase was going through you may want to do this. So there arecomments for doing it, but it is going to be pretty messy to handle the (()) (38:04) during thisschema modifications okay. So alter table add a new attribute and deleting an attribute is alsopossible.You no longer think that it is appropriate to keep some information about certain entities. Soif you delete an attribute you know what you need to also specify as to what should be doneabout the various views or constraints that refer to the attribute being dropped okay. ow in though I did not talk about it in the table definition in the SQL standard it is alsopossible for you to kind of give a name for this constraint. See this is a table constraint rightso you can give a numeric name for that particular constraint okay. So then you can later referto that constraint and then say that drop that constraint okay. So when you are deleting anattribute it is possible that attribute is being used in some views or it is being used in someconstraints etcetera.And you need to kind of specify as to what should be done for those constraints or views. Soagain there are 2 possibilities. The cascade is a drastic option delete the views or theconstraints also. In case you are dropping an attribute if there is a any constraint or a viewthat is referring to this particular attribute simply drop that also that is this option. Thisanother option called restrict what it says is that even though the delete request has beenmade do not delete the attribute.If there are some views or some constraints which are referring to it. They are using it someconstraints and some views is using this attribute then go back to the person who has giventhat command and then say that this is being used somewhere and you still want to delete it,drop it. So this is safer option restrict then you will the system will prompt you back sayingthat you know it is being used somewhere and you really want to drop it.So the command is alter table student drop degree restrict you want to some attribute I amdropping the important attribute of course similarly in an entire table definition can also bedeleted and all that. I have anyway in some sense which the logical point where we can takeup this discussion about the important very nice features SQL. So we will start doing it in thenext class.