So, we are going to talk about the vector data types. We will see what topology is and, in the context of GIS, what its use is. We are also going to look into the georelational data model and the different types of georelational data models.
(Refer Slide Time: 01:31)
So, let us see what the basic vector data types are. The first one is the point feature. A point layer in a GIS comprises a set of points. You can think of examples such as the locations of wells, bus stops, schools or any other such feature.
Now, the next basic vector data type is the line. It is one-dimensional and has only the property of length. So, in addition to the location, it has two end points, the start point and the end point, and in between we have the link. Joining the start point and the end point creates a line segment.
Now, it may be a connected set of line segments, so it may have multiple segments, or it could be a smooth curve. We can generate a smooth curve from points or line segments; these functions are generally known as splining algorithms. So, we can generate smooth lines as well.
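As a minimal sketch of the idea that a line has only length, and that a connected line is a chain of segments, the following Python snippet sums the segment lengths of one polyline. The vertex coordinates are the ones quoted for arc 1 in the example later in this lecture; the function name `polyline_length` is an illustrative assumption, not part of any GIS package.

```python
import math

def polyline_length(vertices):
    """Sum the straight-line lengths of consecutive segments."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))

# Vertices quoted for arc 1 later in the lecture: (1, 3), (1, 9), (4, 9).
arc_1 = [(1, 3), (1, 9), (4, 9)]
length_arc_1 = polyline_length(arc_1)
print(length_arc_1)  # 9.0 (a segment of 6 units plus a segment of 3 units)
```

The length is in whatever units the coordinates use, so for latitude-longitude data it would first have to be projected.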
So, a line feature is made out of a set of lines, just as a point feature is made out of a set of points. As an example, we can code the roads in a given city as line features and record them in a layer as a road feature. We can have the boundaries of villages, states and the different administrative areas in a hierarchical manner. We can record the natural streams which converge into drains, which may again join to form river courses. These are some examples where features can be coded as line features using the basic vector data types in a GIS framework.
The next data type is the polygon, which is two-dimensional: it has not only length but also the properties of area and perimeter. That is in addition to the location. When we talk about GIS data types, the data always has either latitude-longitude coordinates or projected coordinates. In our earlier lectures on projections and transformations we saw how latitude and longitude can be transformed into cylindrical, conic or orthographic projections. So, we have seen different contexts.
So, these data can either carry the latitude and longitude of the place, which is known as geographic units in the WGS 84 framework, or they can be in a projected coordinate system. So, the polygon has area and perimeter, as I said, in addition to the location coordinates.
So, it is made by joining line segments into a closed figure; it is not open. The line segments have to be closed, and they have to be non-intersecting. That is very important, because intersecting line segments would form multiple polygons.
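The area and perimeter of a closed, non-intersecting ring can be sketched with the shoelace formula. This is an illustrative assumption about how such values might be computed, not a description of what any particular GIS package does internally; the rectangle coordinates and the function names are hypothetical.

```python
import math

def ring_is_closed(ring):
    """A ring is closed when its first and last vertices coincide."""
    return len(ring) >= 4 and ring[0] == ring[-1]

def ring_area(ring):
    """Shoelace formula; valid only for closed, non-self-intersecting rings."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2

def ring_perimeter(ring):
    """Total length of the boundary segments."""
    return sum(math.dist(a, b) for a, b in zip(ring, ring[1:]))

# Hypothetical 4 x 3 rectangle in projected units.
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
print(ring_is_closed(rectangle))   # True
print(ring_area(rectangle))        # 12.0
print(ring_perimeter(rectangle))   # 14.0
```

If the segments crossed each other, the shoelace sum would no longer equal the enclosed area, which is one practical reason the non-intersecting condition matters.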
So, these are the basic vector data types. When we look at advanced vector data types, we will see how a GIS framework makes it possible to code intersecting lines, or polygons which may have holes in them, that is, zones which should not contribute any area. Those kinds of polygons can also be encoded in the GIS framework.
So, the perimeter or boundary defines the polygon. A polygon could be a standalone polygon, or it could share its boundary with adjoining polygons, and it may also have a hole within it, that is, an area which is not included. For example, if you are doing a land-use study and a pond falls within a farm, you would not want to include the area of the pond when calculating the productivity of that farm, that is, the output in terms of, say, tonnage of rice or wheat yield. In that case you would discount the pond area which falls within the enclosure of the agricultural field.
So, it is possible for us to code polygons which have holes, and we can remove that much area from the statistics. Polygons with embedded holes result in an interior and an exterior boundary. So, this feature consists of a polygon or a set of polygons. As for examples, one I told you is an agricultural field; others include vegetated areas, forested areas, urban areas and water bodies. So, you can have different area-based features and encode them as polygon features in GIS.
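The pond-in-a-field example can be sketched as an exterior ring minus an interior ring. All coordinates below are hypothetical and assumed to be in metres, and the helper `ring_area` is an illustrative shoelace implementation rather than a library call.

```python
def ring_area(ring):
    """Shoelace area of a closed, non-self-intersecting ring."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2

# Hypothetical agricultural field (exterior boundary), 100 m x 50 m.
field_exterior = [(0, 0), (100, 0), (100, 50), (0, 50), (0, 0)]
# Hypothetical pond inside the field (interior boundary), 20 m x 10 m.
pond_interior = [(10, 10), (30, 10), (30, 20), (10, 20), (10, 10)]

gross_area = ring_area(field_exterior)           # 5000.0
net_area = gross_area - ring_area(pond_interior)
print(net_area)  # 4800.0, the cultivable area with the pond discounted
```

Any yield-per-area statistic would then be computed against `net_area` rather than the gross area of the field.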
(Refer Slide Time: 08:31)
Now, let us see some examples. Here we have an example of points: a few points whose locations are given. For each point the attributes are given: this is the ID of the point, and we have the location coordinates in terms of x and y.
So, you can see that the location coordinates are given for the first point, the second point, the third point and the fourth point. This is how point data is stored in GIS data.
So, now we have an example of lines, where you can see we have the from node and the to node. Look at segment 1. This line segment, from 11 to 12, is arc ID number 1, which is given here as the link ID, and then you have the node IDs, 11 and 12, which are given as these points. So, we have the nodes, with the from node and the to node, and then we have the link IDs as arcs.
So, similarly, let us pick arc number 5. Looking at it, we might expect the starting point, the from node, to be 14. In this case, however, the from node is 15 and the to node is 14. The reason is that the line was digitized in that direction: when somebody digitized this line, they first clicked on point 15 and then on point 14. So, arc 5 has 15 as its from node and 14 as its to node.
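A rough sketch of such an arc-node table follows. Only the two arcs actually described in the lecture are listed; the `Arc` type and the `reverse_arc` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple

class Arc(NamedTuple):
    arc_id: int
    from_node: int  # the node clicked first while digitizing
    to_node: int    # the node clicked last while digitizing

# The two arcs described in the lecture.
arc_table = {
    1: Arc(1, from_node=11, to_node=12),
    5: Arc(5, from_node=15, to_node=14),
}

def reverse_arc(arc):
    """Return the same arc as if it had been digitized the other way."""
    return arc._replace(from_node=arc.to_node, to_node=arc.from_node)

print(arc_table[5])               # Arc(arc_id=5, from_node=15, to_node=14)
print(reverse_arc(arc_table[5]))  # Arc(arc_id=5, from_node=14, to_node=15)
```

The geometry of arc 5 is identical in both cases; only the stored direction differs, which is exactly the information a directed topology relies on.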
So, we can have the length of these arcs, and we can have the adjacency matrix. Now let us look at the polygon coverage. You can see the polygon IDs given here, 100, 101, 103 and 104, but 100 is a default ID for the area which lies beyond the lines that we see; this ID comes as a default. Any field which has a hash after it is a default field created by the software.
So, you can see that the polygon listed as 101 has line segment 1, starting from here, then line segment 4, which runs from here to here, and then line segment 6. That completes polygon 101. Similarly, for polygon 103 or 104 we can find the list of arcs which enclose, or make up, that particular polygon.
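The polygon-arc list can be sketched as a lookup from a polygon ID to the arcs that bound it, with the arcs chained into a closed ring. Polygon 101 with arcs 1, 4 and 6, and the coordinates of arc 1, come from the lecture; the coordinates of arcs 4 and 6 and the function `build_ring` are assumptions made so the example can run.

```python
# Arc geometry. Arc 1 uses the coordinates quoted in the lecture;
# arcs 4 and 6 are hypothetical and chosen so that the ring closes.
arc_coords = {
    1: [(1, 3), (1, 9), (4, 9)],
    4: [(4, 9), (4, 3)],
    6: [(1, 3), (4, 3)],  # digitized against the ring direction
}

# Polygon-arc list: polygon 101 is bounded by arcs 1, 4 and 6.
polygon_arcs = {101: [1, 4, 6]}

def build_ring(arc_ids):
    """Chain arcs end to end, flipping an arc when it runs the other way."""
    ring = list(arc_coords[arc_ids[0]])
    for arc_id in arc_ids[1:]:
        coords = arc_coords[arc_id]
        if coords[0] != ring[-1]:
            coords = coords[::-1]
        if coords[0] != ring[-1]:
            raise ValueError(f"arc {arc_id} does not connect to the ring")
        ring.extend(coords[1:])  # skip the shared node
    return ring

ring_101 = build_ring(polygon_arcs[101])
print(ring_101)  # [(1, 3), (1, 9), (4, 9), (4, 3), (1, 3)]
```

Because the polygon stores only arc IDs, a boundary shared with a neighbouring polygon is stored once and referenced twice.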
Now, we can also find the x-y coordinates of arc 1 given here; we were talking about line segments. You can see that the coordinates (1, 3), (1, 9) and (4, 9) create arc segment 1. So, you can go to the data, refer to it and see how the data is arranged. You can create a shapefile or an arc file in ArcGIS, QGIS or other software.
So, we will have one lecture in which I will show you how to install the software, and we can do some basic exercises with it. We will have a dedicated lecture for that, and we will try to work on the QGIS platform. Apart from this, you can also see that there are two matrices, the adjacency matrix and the incidence matrix, which we are going to talk about in our later slides.
So, in the adjacency matrix you can see that the adjacency of these points is given as a matrix. Take point 12: if it is adjacent to point 11, its score is given as 1. Now, if you see point 13, it is not adjacent to 11, so it is coded as 0, whereas for 14 the adjacency is 1, since you can reach 12 and you can reach 14 from point 11. But if you see the adjacency from point 13, you can only reach point 11 from point 13. So, the adjacency row for 13 has an adjacency only to point 11, which is this point.
So, now let us see the incidence matrix for the points and the line segments. Look at the link codes given here. For point 11, you can see that it is connected to the first segment, but it has a negative connection; the direction of flow is negative in this case. It is also connected to the second arc, and that is in the positive direction: the arc radiates from point 11, so it is given as 1.
So, similarly you can reach point 14 through line segment 4, so again that is a positive value. Likewise, if you are unable to reach a point directly, that is, if the connection is negative, it is represented as negative one in the incidence matrix. So, we have gone through the different types of data sets: the point data sets, the line data sets and the polygon data sets.
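To make the two matrices concrete, here is a small sketch that derives both from an arc table. The four-node network is hypothetical and is not the one on the slide. The sign convention (+1 where an arc leaves a node, -1 where it arrives) and the symmetric, undirected adjacency are assumptions; the slide may use a different convention.

```python
# Hypothetical network: arc_id -> (from_node, to_node).
nodes = ["A", "B", "C", "D"]
arcs = {1: ("A", "B"), 2: ("A", "C"), 3: ("B", "D"), 4: ("C", "D")}

row = {node: i for i, node in enumerate(nodes)}

# Adjacency matrix (node x node): 1 when an arc joins the two nodes.
adjacency = [[0] * len(nodes) for _ in nodes]
for from_node, to_node in arcs.values():
    adjacency[row[from_node]][row[to_node]] = 1
    adjacency[row[to_node]][row[from_node]] = 1

# Incidence matrix (node x arc): +1 where the arc starts, -1 where it ends.
arc_ids = sorted(arcs)
incidence = [[0] * len(arc_ids) for _ in nodes]
for col, arc_id in enumerate(arc_ids):
    from_node, to_node = arcs[arc_id]
    incidence[row[from_node]][col] = 1
    incidence[row[to_node]][col] = -1

for node in nodes:
    print(node, adjacency[row[node]], incidence[row[node]])
```

Under this convention every column of the incidence matrix sums to zero, because each arc leaves exactly one node and arrives at exactly one node.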
So, next we will look at what topology is. We have to build these relationships. In our earlier slide we saw that the polygons were coded and the points were coded; they had latitude and longitude, the links had numbers, and everything was arranged in a table. So, this is the relationship that we try to capture between the points and the lines, which represent real-world features, and it is encoded in the data.
So, we may have extended database attributes along with this table. It will have multiple columns in which we can add other information as well, multiple sets of information. This needs to be encoded in the form of a relationship.
So, for a topology to be built, the properties of the geometric objects have to remain invariant. For instance, if we are making a topology using lines which result in a polygon, that geometry should remain invariant. What does invariant mean? When we did georeferencing in our last class, we did transformations. We saw the polynomial and affine transformations, in which we are basically changing the geometric attributes, the values of the points.
So, in that case the data may undergo warping; it may undergo a deformation in terms of a change of shape. We had seen shearing, rotation and translation; these are different things that may happen to a data set. It could be an image, but today we are talking about vector data. So, when we do a transformation on a vector layer, there may be a deformation in terms of the geometry, but the relationships, the properties of the geometric objects, remain the same. The name of a road would not change if you do a geometric transformation; that needs to be preserved.
So, when we have a topology, these properties of the geometric objects remain invariant. They do not change under transformations such as bending or stretching. You can see the example of a rubber band, which we use all the time.
So, if you stretch the rubber band, and imagine it to be made out of lines, you would see that the entities remain the same. The line segments are the same; maybe the length increases or the band gets deformed, and if you make a polygon out of it, it may become deformed, but the entities remain. The geometric objects remain invariant even if you apply some kind of transformation such as stretching or skewing.
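The rubber-band idea can be sketched by applying an affine transformation to node coordinates and observing that the arc-node table is untouched. The node coordinates and the transformation parameters below are hypothetical, and `affine` is an illustrative helper, not a library function.

```python
import math

def affine(point, a, b, c, d, tx, ty):
    """Apply x' = a*x + b*y + tx, y' = c*x + d*y + ty."""
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)

# Hypothetical node coordinates and a one-arc topology table.
nodes = {11: (1, 3), 12: (4, 9)}
arcs = {1: (11, 12)}  # arc_id -> (from_node, to_node)

# Stretch in x, shear, then translate: the geometry changes...
moved = {n: affine(p, a=2, b=1, c=0, d=1, tx=10, ty=5)
         for n, p in nodes.items()}
print(moved)  # {11: (15, 8), 12: (27, 14)}

# ...and so does the length of the arc...
length_before = math.dist(nodes[11], nodes[12])
length_after = math.dist(moved[11], moved[12])
print(length_before != length_after)  # True

# ...but arc 1 still runs from node 11 to node 12.
print(arcs[1])  # (11, 12)
```

Only the coordinate table is rewritten by the transformation; the table holding the relationships needs no update.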
So, these relationships, these properties, can be explained through directed graphs, also known as digraphs. This is an aspect of graph theory. The directional nature can be expressed through digraphs, which show the arrangement of the geometric objects.
In our earlier slide, we saw lines which had directions. As you digitize a line, it also takes a direction, which is encoded in the from point and the to point. So, those relationships can be encoded and embedded in a table, in different columns. This basically is the topology, the relationships that we build with the points and the lines.
Next, we talked about adjacency and incidence. These are also fundamental relationships that are extensively used in GIS; they establish a relationship between the nodes and the arcs. Digraph topology requires additional files to store the spatial relationships. We have seen that the adjacency and incidence matrices were two separate matrices, two separate data sets apart from the basic data sets of the location, length or area of the point, line or polygon.
So, these are two additional data sets which store the adjacency and incidence information. We have already seen the example of the polygon topology, and in our earlier slide we explained what the adjacency matrix and the incidence matrix are, how they are connected and how they are encoded in this particular table.
Now, these topological relationships are also useful if we want to do some kind of spatial query. Suppose you are travelling somewhere and want to search for a road: you open Google Maps and you can see which segment of the road is congested and what the travel time is. It is indicated through colors: red indicates that there is a traffic jam and the traffic speed is very low in a particular segment; green or blue indicates a smooth flow of traffic in that line segment.
So, we need to do some queries. Those are queries which are generated automatically, but you can also give a specific query, for instance, what is the length of a particular road? You may have a particular road, which may be an expressway or part of a highway running through a city, and you may want to know its length. You can find that out using a spatial query.
So, these topological relationships help in creating spatial queries. Suppose there is a case of pollution: how many buildings are affected, or how much population is affected, because of this pollution? You can use measures of containment or intersection. These are two very important topological relationships for spatial data queries.
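A containment query of this kind can be sketched with a ray-casting point-in-polygon test. The pollution zone, the building locations and the function name are all hypothetical, and ray casting is just one common technique; real GIS software may use different algorithms and spatial indexes. Points lying exactly on the boundary are not handled specially here.

```python
def point_in_polygon(point, ring):
    """Ray casting: count boundary crossings of a ray going in +x."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # the edge straddles the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical pollution zone and building locations.
zone = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
buildings = {"B1": (2, 3), "B2": (12, 5), "B3": (9, 9)}

affected = sorted(name for name, location in buildings.items()
                  if point_in_polygon(location, zone))
print(affected)  # ['B1', 'B3']
```

Summing a population attribute over the affected buildings would answer the second half of the question in the same way.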
So, we shall see in due course, in our next series of lectures, how we can generate a spatial data query and how it is useful.
(Refer Slide Time: 23:55)
So, we will look into the georelational data model. It stores the geometry, the "geo", which signifies the graphic aspect, that is, how the data is coded as points, lines or polygons, and the "relation" that we build into this model, which is the database part, that is, the attribute part. It uses feature IDs to link these two components, the geometric component and the attribute component.
So, we can link these two components. The two components are synchronized, but they are two separate sets. Those of you who have worked on AutoCAD will have seen that you can draw lines, polygons, shapes and points. These are only drawing entities; they do not have any attributes. On the other hand, you can have an Excel table in which you have attributes. These two information sets can carry a unique tag or unique ID which links the two components, that is, the geometric component and the attribute component.
So, these are synchronized so that they can be queried, analyzed or displayed together. Examples of such georelational data models are the coverage and the shapefile. The coverage is a very popular Esri format, and the shapefile is an open format; we will have a look at it. The coverage is a topological model, whereas the shapefile is a non-topological model. We will discuss the difference between the coverage and the shapefile when we look into these two formats.
So, you can see that the arc coverage has two components. There is the geometry part that we talked about, and then we have the info file, which is basically the attributes. They are linked through a common field holding the entity numbers, which ties the geometry to the attributes.
So, whenever we do some kind of query or analysis, there are these two components, the geometry and the attributes, of this particular system. That is why we call it a GIS, a Geographical Information System. These components are synchronized, and they can be queried, analyzed or displayed together, in unison.
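The link between the two components can be sketched as two separate tables keyed by the same feature ID. The point at (2, 9) and the line through (1, 3), (1, 9), (4, 9) echo coordinates quoted in the lecture; the attribute values and the helper names are hypothetical.

```python
# Geometry component: feature ID -> (geometry type, coordinates).
geometry = {
    1: ("point", [(2, 9)]),
    2: ("line", [(1, 3), (1, 9), (4, 9)]),
}

# Attribute component, stored separately and keyed by the same ID.
# The attribute values are hypothetical.
attributes = {
    1: {"name": "Well A", "depth_m": 30},
    2: {"name": "Main Road", "lanes": 2},
}

def feature(feature_id):
    """Join the two components on the shared feature ID."""
    kind, coords = geometry[feature_id]
    return {"id": feature_id, "type": kind, "coords": coords,
            **attributes[feature_id]}

def find_by_name(name):
    """Attribute query that returns the matching geometry."""
    return [geometry[fid] for fid, row in attributes.items()
            if row["name"] == name]

print(feature(1)["name"])         # Well A
print(find_by_name("Main Road"))  # [('line', [(1, 3), (1, 9), (4, 9)])]
```

The query runs against the attribute table, and the shared ID is what lets the result be drawn from the geometry table.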
(Refer Slide Time: 27:19)
So, let us see the Esri coverage. This is a proprietary format from Esri, and the connectivity is through arcs and nodes. An area is defined when arcs are connected together to enclose it, and such areas are referred to as polygons. We can also identify contiguity, that is, which areas lie next to each other. So, we can find out the left polygon and the right polygon of each arc in the database.
Now, you can see the data structure for a point; we have already studied this. You can read the coordinates from this graph. For point 1 we have the coordinates 2 and 9: we travel two grid units in the x direction and nine grid units in the y direction to reach point 1. So, this is how the point data structure is stored. Then we have the data for the line coverage, in which again you can see that we have the arc ID, the from node and the to node.
So, we have already covered this: we have the from node as 11 and the to node as 12. Similarly, we can also have the polygon coverage, in which we have the left polygon and the right polygon. We talked about contiguity; here you can see it encoded in this data set as the left and the right polygon.
So, for arc 1 the left polygon is 100, which is the code for the outer area, and the right polygon is 101. Similarly, you may also have a polygon-arc table, in which you can find out which arcs enclose each polygonal area. Polygon 101 has arcs 1, 4 and 6. So, this is how the Esri arc coverage stores the data.
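Contiguity can be read straight off a left/right polygon table, as in this sketch. The entry for arc 1 (left 100, right 101) is from the lecture; the left polygons assigned to arcs 4 and 6 are assumptions made for illustration, as are the function names.

```python
# arc_id -> (left_polygon, right_polygon).
# Arc 1 is as described in the lecture; arcs 4 and 6 are hypothetical.
left_right = {
    1: (100, 101),
    4: (103, 101),
    6: (104, 101),
}

def neighbours(polygon_id):
    """Polygons that share at least one arc with the given polygon."""
    found = set()
    for left, right in left_right.values():
        if left == polygon_id:
            found.add(right)
        elif right == polygon_id:
            found.add(left)
    return found

def bounding_arcs(polygon_id):
    """Arcs that have the polygon on either side (the polygon-arc list)."""
    return sorted(arc_id for arc_id, sides in left_right.items()
                  if polygon_id in sides)

print(sorted(neighbours(101)))  # [100, 103, 104]
print(bounding_arcs(101))       # [1, 4, 6]
```

No coordinate comparison is needed: which polygons touch is answered purely from the stored topology.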
(Refer Slide Time: 30:15)
Now, let us see how the shapefile differs from the Esri arc coverage. The shapefile is a non-topological format; it is an open format and it is extensively used across a multitude of GIS software platforms. The shapefile treats a point as a pair of x-y coordinates, as we saw in the Esri coverage file. A line is a series of points, and a polygon is a series of line segments. The difference is that polygons have duplicate arcs where they have shared boundaries, which is not so in the arc coverage that we saw earlier.
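The duplication of shared boundaries can be sketched with two adjacent squares, each stored as its own complete ring in the way a non-topological format would store them. The coordinates and the helper `segments` are hypothetical, and this models only the idea, not the actual binary layout of a shapefile.

```python
def segments(ring):
    """Undirected boundary segments of a closed ring."""
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:])}

# Two hypothetical unit squares that share the edge from (1, 0) to (1, 1).
left_square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
right_square = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]

shared = segments(left_square) & segments(right_square)
stored = len(segments(left_square)) + len(segments(right_square))
unique = len(segments(left_square) | segments(right_square))

print(len(shared))     # 1 boundary segment appears in both polygons
print(stored, unique)  # 8 segments stored, only 7 distinct
```

In an arc-node structure the shared edge would be one arc with a left and a right polygon, so editing it could not leave a gap or overlap between the two squares.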
So, we can see that there are two types of files in this. One is the .shp file and the other is the .shx file. The .shp file records the geometry, whereas the .shx file maintains the spatial index. We talked about the spatial index in the last slide; so, it maintains the spatial index of the feature geometry.
So, there are a few advantages of using shapefiles. They can be displayed very rapidly on a computer monitor, which is very useful when a user only wants to view some data sets. These are non-proprietary formats; there was an initiative in 1990, and there was a demand for a non-proprietary, open GIS data file.
So, the shapefile format is the result of that initiative, and the consortium is known as the Open Geospatial Consortium, which came up in 1994. It stresses the interoperability of data sets, so that you can use them on a multitude of platforms. If you go to its website you can find the details regarding the Open Geospatial Consortium.
(Refer Slide Time: 33:03)
Now, we have the object-based data model. This is again a standard non-topological data format, and it is also used in Esri products. It is a format from Esri, a non-topological proprietary format. It stores geometries and attributes in a single system. Unlike earlier models such as the shapefile, in which the geometries and attributes are stored in different systems with a unique ID relating them, this model stores the geometries and attributes in a single system.
So, the geometry is stored as a collection of binary data in a specific field, which is known as a BLOB, a binary large object. Spatial features or objects are associated with a set of properties and methods. You can see here that the geometry of each land-use feature is coded here, and the shape is given as polygon for each land-use ID. So, this is how the object-based data model is stored.
Now, GIS operations are affected by the property, which describes an attribute or characteristic of the GIS object, such as its shape or its area, that is, its extent, and by the method, which performs a specific action such as copy or delete. So, GIS operations are governed by the properties and methods which are encoded in the object-based model.
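The idea of an object carrying its geometry, properties and methods together can be sketched with a small class. This is only an analogy in plain Python: the class name, field names and coordinates are hypothetical, and the geometry is kept as a tuple of coordinates rather than as the binary BLOB the lecture describes.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class LandUseFeature:
    landuse_id: int
    category: str
    shape: tuple  # closed ring of (x, y) vertices

    @property
    def area(self):
        """Property: a characteristic derived from the stored geometry."""
        twice_area = sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2)
                         in zip(self.shape, self.shape[1:]))
        return abs(twice_area) / 2

    def copy(self, new_id):
        """Method: an action performed on the object."""
        return replace(self, landuse_id=new_id)

# Hypothetical land-use polygon with geometry and attributes in one object.
parcel = LandUseFeature(1, "forest", ((0, 0), (4, 0), (4, 3), (0, 3), (0, 0)))
duplicate = parcel.copy(new_id=2)

print(parcel.area)           # 12.0
print(duplicate.landuse_id)  # 2
```

In contrast to the georelational sketch of two tables joined on an ID, here there is a single record per feature and no join step.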
(Refer Slide Time: 35:01)
Now, the next georelational data model is the geodatabase, which is again by Esri. It uses points, lines and polygons to represent vector data. It is very similar to the arc coverage in terms of its simple features, but it differs from the arc coverage in terms of composite features, as you can code routes or regions, which we shall discuss in the next few slides. It can also store raster data, triangulated irregular networks and location data.
(Refer Slide Time: 35:41)
So, vector data in the geodatabase is organized into feature classes and feature data sets. A feature class stores spatial features of the same geometry type; if you have points, the different points are stored as spatial features in one feature class. Feature classes can participate in topological relationships with one another. For example, you may have coincident boundaries: census data is very hierarchical in nature, with village, taluka or block, district, state and country boundaries. So, there can be coincident boundaries between these different scales of data.
So, the geodatabase participates in topological relationships where you have coincident boundaries. The feature data set, which we mentioned, stores feature classes that share the same coordinate system and area extent.
(Refer Slide Time: 36:57)
So, next, we mentioned the triangulated irregular network in our earlier slide on the geodatabase. Let us see what a triangulated irregular network is. It is used to code the undulating nature of a terrain. If you have a hilly area, you can code the slope, the height and other terrain information using this type of vector data structure, which is known as a TIN, a Triangulated Irregular Network.
So, a TIN is basically a set of non-overlapping triangles, and each triangle has a constant gradient. The slope within each triangle remains the same throughout that triangle; it does not change. So, you can say that each triangle is planar: the points lying in a triangle are coplanar. They lie in the same plane, which has a constant gradient.
Now, each node of the triangle is a point, each edge is a line, and the triangle itself is a polygon. That is how the data is structured. The data has the triangle number, the number of each adjacent triangle, and the list of points and edges, as well as the x, y and z values of the elevation points. So, here you can see what a triangulated irregular network looks like.
So, in a low-lying area where there is very little undulation in the height information, you can see that the triangles are much bigger than in areas with frequent changes in elevation. There, the triangulated facets are much denser compared to the plain areas, where the triangles are much bigger.
So, you can see the structure of a triangulated irregular network. We have the nodes; node 11, for instance, has an elevation value, that is, a height value, and it has an x and y reference with respect to the coordinate frame: 2 units along the x-axis and 9 units along the y-axis. So, this is where node 11 is stored. Now, triangle 101 is comprised of three nodes: node 11, node 12 and node 13.
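The constant gradient of one TIN facet can be sketched by fitting the plane through its three nodes. Node 11 at x = 2, y = 9 and triangle 101 with nodes 11, 12 and 13 come from the lecture; every elevation value and the positions of nodes 12 and 13 are assumptions, as is the function name.

```python
import math

# node_id -> (x, y, z). Only the x, y of node 11 is from the lecture;
# all z values and the positions of nodes 12 and 13 are hypothetical.
nodes = {
    11: (2.0, 9.0, 100.0),
    12: (6.0, 9.0, 104.0),
    13: (2.0, 12.0, 100.0),
}
triangles = {101: (11, 12, 13)}

def triangle_gradient(triangle_id):
    """Return (dz/dx, dz/dy, slope in degrees) of the facet's plane."""
    p, q, r = (nodes[n] for n in triangles[triangle_id])
    ux, uy, uz = (q[i] - p[i] for i in range(3))
    vx, vy, vz = (r[i] - p[i] for i in range(3))
    # Normal vector of the plane = u x v.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz == 0:
        raise ValueError("degenerate or vertical triangle")
    dz_dx, dz_dy = -nx / nz, -ny / nz
    slope_deg = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return dz_dx, dz_dy, slope_deg

dz_dx, dz_dy, slope_deg = triangle_gradient(101)
print(dz_dx, round(slope_deg, 6))  # 1.0 45.0
```

Because all three nodes define a single plane, the same `dz_dx` and `dz_dy` apply at every point inside triangle 101, which is what the constant-gradient statement means.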