Prof. Steve Lavalle
Department of Applied Mathematics
Indian Institute of Technology, Madras
Lecture – 5-3
Geometry of Virtual Worlds (eye transforms)
(Refer Slide Time: 00:15)
So, I would like to go into the eye transformation next, if there are no questions about this. Remember, all of this is being applied to points. So, if I have X, Y, Z, 1 here, then X, Y, Z is the point that I want to transform, such as a point on the yellow cube that I drew in the previous figure on the board.
I just put the point here, add a 1 to it, and apply all of these transformations, and then I get pixel coordinates that are floating point numbers or real numbers, depending on whether you are using a computer or not; these coordinates are not necessarily integers, alright. So, I want to figure out the eye transformation. (Refer Slide Time: 01:02)
And to make it really clear, I will call it the cyclopean eye transformation.
So, maybe people do not make this distinction in computer graphics, but if you are doing virtual reality, you could build a virtual reality headset for a Cyclops and then this would be just fine, right? But because you have 2 eyes, I do not want to pick the right or the left; I would rather pick the midpoint between them to make everything simple here, and then talk about how to do a nice offset for left and right. So, that is why we will imagine that we are designing a virtual reality headset for a Cyclops in the beginning here, all right.
So, the first thing to do is to express the eye's rotation and translation in the world. In other words, I imagine that this eye, which looks like an eyeball with a pupil, is defined in some kind of body frame. Let me give you what is essentially the body frame of the eye. I want to put the pupil right at the origin, because that is the point where the light comes in in the real world. This is all virtual that I am drawing here, but nevertheless I have z running into the eye, and the only reason we are doing that is so that we can maintain a right-handed coordinate system. So, it is another one of these annoyances that is here for a good reason, because if you do not do it this way then you will have more annoyances.
So, it is just a trade-off, and remember x is out. So, this is essentially a body frame for the eyeball. Now, if I take the eyeball and place it into the world, there is going to be a transformation for that, which is going to be a rotation and a translation of the eyeball. I want to figure out what that transformation is based on how I place the eyeball, and then I want to use that to make the eye transform.
So, one way to do that, which is very common, is to consider what is called a look-at; a somewhat strange name, but a look-at. So, here is how this works; we have three things. (Refer Slide Time: 03:58)
First, we have the position of the eye. Sometimes I will say position, sometimes I will say translation; sometimes I will say orientation, sometimes I will say rotation.
So, rotation and translation are the operations, and then the result is orientation and position, correct? So, it is fine to interchange those as long as you understand what is going on. So, 2, I want to have a looking direction, and 3, I need to specify an up direction. So, let me give some notation to these and I will explain it. The position of the eye I will just write as a vector e.
So, I will just write it as e like this, a single letter, and then the looking direction will be c hat; it is a little hard to write here, but I will put a hat on it. And the up direction will be u hat. Note that these are unit vectors, and you put a hat on to denote a unit vector. If you do not like my hats, leave them off, but these are unit vectors, so I am trying to denote that somehow, all right.
So, how can I obtain the look-at? Here is one possible way to do it. (Refer Slide Time: 05:50)
So, I have the eye, and I have decided that the pupil is going to be located at some location e. So, it is very easy to write down. Imagine you are given where the eye is; it will have some coordinates that correspond to where the pupil is, and I will call that e. And then suppose you are told that the pupil is looking at a particular point; let me get rid of that.
So, the eye is pointed towards a particular point here, call it l. Then you get a vector that corresponds to traveling from the pupil to the point l. I can just call this vector c, and c is obtained using the simple Cartesian coordinates of the world: l minus e, all right. So, that gives me that vector, and then my c hat, which is the looking direction, number 2 here (I have given you number 1 by simple specification), I get by just normalizing that vector.
So, c hat is l minus e divided by the length of that vector l minus e, correct? So, what am I doing here? I am trying to specify the coordinate frame that I get here: how is it oriented in the world when the eye is looking at a particular place? I want to figure out that coordinate frame, and then from that very easily construct a rotation matrix that will put the eye in that orientation, and of course construct a full homogeneous transformation matrix that will also put it in the right location; but that part is considerably easier because it is just translation by e.
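As a small illustration of the normalization step just described, here is a minimal Python sketch. The function name `look_direction` and the sample coordinates are made up for this example; only the formula, c hat equals l minus e divided by its length, comes from the lecture.

```python
import math

def look_direction(e, l):
    """Return c_hat, the unit vector pointing from eye position e toward point l."""
    c = [l[i] - e[i] for i in range(3)]            # c = l - e
    length = math.sqrt(sum(v * v for v in c))      # ||l - e||
    if length == 0.0:
        raise ValueError("look-at point coincides with the eye position")
    return [v / length for v in c]

# Made-up world coordinates (hypothetical values, in meters).
e = [1.0, 2.0, 3.0]     # pupil position
l = [1.0, 2.0, -2.0]    # point being looked at
c_hat = look_direction(e, l)
print(c_hat)            # [0.0, 0.0, -1.0]
```

The guard against a zero-length vector is an addition of this sketch; the lecture assumes the look-at point differs from the pupil position.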
So, I need to figure out the orientation; I am trying to put together the coordinate system. Once I have figured out c hat, I essentially have the z part, right? It is just pointing in the wrong direction, so c hat corresponds to minus z. And then there is also a possible roll with respect to that, which has to do with the up direction, which we will have to take care of. Once we have the up direction and we have, let us say, the forward-backward direction, we can do a simple right-handed cross product to determine the third axis; we have agreed to use right-handed coordinates, and so that will matter. (Refer Slide Time: 08:41)
So, if that is the case, then I want to figure out these axes, the coordinate axes for the eye in the world. I get x hat, y hat, z hat for these, and as I said, z hat is going to be minus c hat. So, if I have this eyeball oriented somewhere, I have figured out that part. Then for x hat, I take the up direction and I cross it with z hat, right? So, x hat is u hat cross z hat. And I guess y should just be u hat? That would make natural sense.
So, when I started teaching this I thought y should be u hat, and then I noticed that it is not that way in the textbook by Shirley that we are using, for example, and it is not in a lot of the graphics literature; instead, y hat is just z hat cross x hat, which is also fine. The reason they do not use u hat here directly is just in case someone gave you a u that was not perfectly parallel to the image plane, right? So, if the eye is actually my eye, and the up vector, instead of pointing straight up, is slanted some, this would correct for that if you do it in this way.
So, it will tolerate a sloppily defined up, if you want. So, yes?
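The axis construction above can be sketched in Python as follows. The helper names (`normalize`, `cross`, `eye_axes`) and the slanted up vector are made up for illustration. One step here is an assumption that the lecture does not state: x hat is renormalized after the cross product, because u cross z is only unit length when u is exactly perpendicular to z.

```python
import math

def normalize(v):
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

def cross(a, b):
    """Right-handed cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def eye_axes(c_hat, u):
    """Eye coordinate axes in the world, from looking direction c_hat and an up hint u."""
    z_hat = [-a for a in c_hat]           # the eye looks along -z
    x_hat = normalize(cross(u, z_hat))    # x = u x z (renormalized: an assumption of this sketch)
    y_hat = cross(z_hat, x_hat)           # y = z x x, unit length because z and x are orthonormal
    return x_hat, y_hat, z_hat

c_hat = [0.0, 0.0, -1.0]   # looking down the world -z axis (made-up value)
u = [0.0, 1.0, 0.3]        # a sloppy up vector, tilted out of the image plane (made-up value)
x_hat, y_hat, z_hat = eye_axes(c_hat, u)
print(x_hat, y_hat, z_hat)
```

With these inputs the recovered y hat comes out as the world y axis even though the supplied u was slanted. The sketch fails with a division by zero if u is parallel to the looking direction, a case the lecture does not discuss.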
Student: (Refer Time: 10:22) e and (Refer Time: 10:23).
These are all in the world frame, correct, yeah. I am trying to imagine that the eye has been placed into the world frame, and I am working with a description of that. Somebody tells me the pupil is at this location and it is looking at this particular point; in other words, that point should be in the center of its image, right? That is the point l here, which would be in the center of the image. From that, I make this vector c by just subtracting these 2 points.
And then I get the look-at direction essentially, right, which way you are looking. Then the only thing that matters is whether the camera is upside down or right side up; how exactly is it oriented? That is the next part that we are trying to figure out, which is handled by this last step. I am taking the up into account here, and I just keep performing cross products to get my coordinate frame orthogonal and right-handed,
regardless of whether somebody gave me a sloppy u value that is not perfectly orthogonal to my other axes. So, it allows you to be sloppy with u. Personally, I would say get it right the first time, do not be sloppy about it, but this will fix a little bit of slop. So, once I have that, I get a rotation matrix R eye, where, let me just write it out with dots, this column is x hat, this column is y hat, and this column is z hat.
So, this gives me a rotation matrix; do we agree with that? If I start off with my eye defined in its body frame and I apply this rotation matrix that I have constructed, it does not translate the eye, but it should rotate the eye so that it is looking in the right direction. Do you believe that? So, that is what we have. If I also want to translate it, then after performing the rotation I add e, its x, y and z components, to the result.
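Written out, the placement of the eye into the world described above might look like the following. This is a sketch in notation chosen for this note: the numeric subscripts on the axis components and the labels p_body and p_world (a point before and after placement) are assumptions, not taken from the board.

```latex
R_{eye} =
\begin{bmatrix}
\hat{x}_1 & \hat{y}_1 & \hat{z}_1 \\
\hat{x}_2 & \hat{y}_2 & \hat{z}_2 \\
\hat{x}_3 & \hat{y}_3 & \hat{z}_3
\end{bmatrix},
\qquad
p_{world} = R_{eye}\, p_{body} + e .
```

For example, the body-frame looking direction (0, 0, -1) is mapped to minus z hat, which is c hat, as intended.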
I can do it all together in one homogeneous transform. Does this solve our problem of trying to figure out what the perspective of the world is from the eye's coordinate frame? I think I did it backwards, didn't I? I figured out, from the coordinate frame of the world, how the eye appears; I need to do it in the other direction. How do I solve that? Alright, so let us see. I am standing here, and, we talked about this before, I turn my head, let us say, counterclockwise 45 degrees. How does the world appear to have moved from the reference point of my eye? If I hold my eye still, it appears that the world has rotated in the opposite direction, right? It is an inverse transform.
So, that is what we actually have to apply. We do not want R eye, we want its inverse; and we do not want the translation by e, we want its inverse. The way to imagine it is as if your eyeball is being held fixed and it is the world that is moving around all the time, quite wildly I guess, because if I turn my head quickly it would take a lot of energy to move the world around that fast, right? So, it is a bad relativity problem, but that is how the mathematics feels when you perform these transformations.
So, the eye is being held fixed all the time, because the screen is being held fixed relative to the eye, and you are going to be presenting images to that ultimately. So, the eye is being held fixed and the world is being transformed; so, we need the inverse. Now, remember I just gave you a warning about taking the inverse of one of these transformation matrices. So, let me give you the inverse now, and then we will stop and take a break, which I seem to have forgotten to do; I got a little too excited. So, let us see.
(Refer Slide Time: 14:15)
So, let me give you the final result. T eye is equal to, well, I need to have as the columns x hat, y hat and z hat; sorry, columns is wrong, I need to have x hat, y hat, z hat as the rows, because I am performing the inverse transform. Also, I need to extend this, because I am not just going to apply a rotation here; I am going to put it together in homogeneous form.
So, this is just the rotation part; this corresponds to R eye transpose, which, because it is a rotation matrix, is R eye inverse, and then I fill in the remaining components. So, that undoes the rotation, and then I have the part undoing the translation, which should have my negative e coordinates here: negative e x, negative e y and negative e z, right? So, I have inverted the placement of the eye. I undo the translation, which is this matrix here; it is on the right.
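For reference, the product just described can be written out as below. This is a reconstruction from the spoken description, not a copy of the board: the numeric subscripts on the axis components are notation assumed here, and the remaining entries are filled in the usual homogeneous way.

```latex
T_{eye} =
\begin{bmatrix}
\hat{x}_1 & \hat{x}_2 & \hat{x}_3 & 0 \\
\hat{y}_1 & \hat{y}_2 & \hat{y}_3 & 0 \\
\hat{z}_1 & \hat{z}_2 & \hat{z}_3 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & -e_x \\
0 & 1 & 0 & -e_y \\
0 & 0 & 1 & -e_z \\
0 & 0 & 0 & 1
\end{bmatrix}
```

The left factor holds R eye transpose, and the right factor is the translation by minus e.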
So, it gets applied first, and then I undo the rotation, which is the transpose of the rotation matrix that we built using the look-at. You do not have to build it from a look-at; you may get it directly from a filtering algorithm that is doing head tracking, right?
So, it may come from another place, but traditionally in graphics it comes from a look-at, so I wanted to build it from that as well. But it may just come directly from the tracking, and it may come from some combination of the 2 as you move a controller around. So, it will come from different sources; it is not a constant, but once it is given and fixed, this is how you obtain T sub eye. Questions about that?
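To make the whole look-at pipeline concrete, here is a minimal pure-Python sketch that builds T eye from e, l and u and checks two properties one would expect: the pupil position maps to the origin, and the look-at point lands on the negative z axis. All function names and sample values are hypothetical, and the renormalization of x hat is an assumption of this sketch rather than something stated in the lecture.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def eye_transform(e, l, u):
    """4x4 T_eye from a look-at: rotation rows are x_hat, y_hat, z_hat."""
    c_hat = normalize(sub(l, e))
    z_hat = [-a for a in c_hat]            # eye looks along -z
    x_hat = normalize(cross(u, z_hat))     # renormalized in case u is sloppy
    y_hat = cross(z_hat, x_hat)
    # R^T times the translation by -e has last column -R^T e.
    return [x_hat + [-dot(x_hat, e)],
            y_hat + [-dot(y_hat, e)],
            z_hat + [-dot(z_hat, e)],
            [0.0, 0.0, 0.0, 1.0]]

def apply(T, p):
    """Apply a 4x4 homogeneous transform to a 3D point (a 1 is appended)."""
    ph = list(p) + [1.0]
    return [sum(T[r][k] * ph[k] for k in range(4)) for r in range(3)]

e = [1.0, 2.0, 3.0]    # pupil position (made-up)
l = [4.0, 2.0, 7.0]    # look-at point, 5 units from e (made-up)
u = [0.0, 1.0, 0.0]    # up direction
T_eye = eye_transform(e, l, u)
print(apply(T_eye, e))  # approximately the origin
print(apply(T_eye, l))  # approximately (0, 0, -5)
```

If the orientation comes from head tracking instead of a look-at, only the construction of the three axes would change; the inverse structure stays the same.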
Student: (Refer Time: 16:14).
How do you find.
Student: (Refer Time: 16:18) are you find the rotational.
Oh, I think I just did it. So, if you remember when I was covering 3D rotations, in fact when I went all the way back to 2D rotations, I was trying to convince you that the columns correspond to how the coordinate axes get rotated, right? I did that in the 2D example, and the 3D example was the same way. That is why we had these constraints; remember when I was saying the columns have to be orthogonal.
So, they have to have an inner product of 0 and they have to have length one. So, these are how the coordinate axes are transformed by the rotation. I figured out what these are by the geometry of the look-at, and then I just write these directly into the matrix, and that automatically forms the rotation. So, it is a very good question and it is a very easy thing to forget, but whenever you look at a rotation matrix, remember that the columns are the coordinate axes, normalized, showing exactly the effect of this rotation matrix on the standard basis elements.
So, if you take the x, y, z basis elements, and I should go in order, the basic coordinate axes 1 0 0, 0 1 0 and 0 0 1, and transform those three with this rotation matrix, you will get exactly the expressions in these columns. So, it is just a direct step. Long answer for a short question; hopefully it is clear.
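The column property can be checked numerically. The sketch below uses a 2D rotation by 45 degrees as a stand-in (the angle and all names are chosen for this example); the same check carries over to the 3D case with a 3 by 3 matrix.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(M[r][k] * v[k] for k in range(len(v))) for r in range(len(M))]

theta = math.radians(45.0)
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

col_x = [R[0][0], R[1][0]]          # first column of R
col_y = [R[0][1], R[1][1]]          # second column of R
image_x = matvec(R, [1.0, 0.0])     # where the x basis vector goes
image_y = matvec(R, [0.0, 1.0])     # where the y basis vector goes

inner = col_x[0] * col_y[0] + col_x[1] * col_y[1]   # should be 0 (orthogonal)
len_x = math.hypot(col_x[0], col_x[1])              # should be 1
len_y = math.hypot(col_y[0], col_y[1])              # should be 1
print(image_x, col_x)
print(image_y, col_y)
print(inner, len_x, len_y)
```

Each transformed basis vector matches the corresponding column, and the columns are orthogonal with unit length, up to floating point rounding.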
Prof. Steve Lavalle
Department of Applied Mechanics
Indian Institute of Technology, Madras
Lecture – 6
Geometry of Virtual Worlds (eye transforms, cont’d)
Good to go; I can continue with the lecture.
(Refer Slide Time: 00:21)
We just finished giving you the homogeneous transform for the cyclopean eye. One thing I would like to add at this point is the pair of transformations that will make it stereo, so I can give you right and left eyes. If you want to do that, we get T left eye, which is equal to, let me make an identity matrix first (it is not a very good identity matrix, let me see here), and then put T eye here. So, as written, T left eye is equal to T eye, because this is just the identity.
So, I just want to do a little fix. Basically, the horizontal direction in the coordinate system we set up is the x direction, correct? So, I just want to take the x part here.
(Refer Slide Time: 01:27)
This is the x translation part, and I want to put some shift in there; I will call it t over 2, where t, let me see here,
(Refer Slide Time: 01:52)
is going to be the inter-eye distance in the virtual world. So, t is the inter-eye distance in the world.
And if you are curious about the real world, the average value of t in the real world, I mean on the earth among humans, not among the monkeys running around on your campus, is equal to 0.064 meters, or 64 millimeters.
So, that is just an average; there is quite a bit of variation from person to person. And this is a nice example of why, when you define the world, it is good to use whatever system of units you use in your daily life. So, if you are using the metric system, it is nice to set it all up so that your units in the world are meters. Then when you perform this shift, it makes appropriate sense, because we are still using the coordinates of the world here for the eye; we have not done any strange distortions or rescaling yet, right? These are all rigid body transformations.
So, it is good to use meter coordinates, and for the left eye I do this transformation. I guess one question we need to ask here is: does this need to be minus or plus, you think? So, let me think about this. I start off with a cyclopean view, and then I want to figure out what the perspective should be for my left eye. If I all of a sudden move my eye to the left, that means I should be shifting the world to the right, which means that I am adding to the x coordinates.
So, that is what I get; it is very easy to mess this up. In fact, in one of the very first Oculus demos we were working on, the right and left eyes were swapped, and everybody in the company thought it was fine for a while. It was very difficult to tell; you eventually had to look around the edge of some corner, so that your left eye cannot see beyond the corner and your right eye can, and you realize, hey, something is not right here. You have to learn to open one eye and close the other and try to resolve these things; that gets into the perception parts again.
So, your brain cannot necessarily distinguish when you make some of these mistakes. So, that is T left eye; and now for T right eye.
(Refer Slide Time: 04:20)
It is just the other way; we just use the other sign. We put minus t over 2 here, and then, oops, sorry, it is a one, we fill in the rest of the identity matrix and put T eye here. So, in that chain of transformations I gave you, if you want to go from the cyclopean eye to binocular vision, having left and right eyes, then you just replace T eye with T right eye and T left eye, and that will give you 2 different paths to go down for the rest of the chain. You can try to do some hacks at the end, where you just shift it all the way down in the pixel coordinates, but you may make mistakes.
So, it might look fine in some cases, and in others it may cause errors that are perceptible. So, this is the right way to fix it; even though it might not be the most efficient way, it is correct. There are other ways that may be more efficient, but they are hacks and they may make mistakes; we can talk about some of those kinds of things later in the course. Questions about this? Yes.
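The left and right eye transforms can be sketched in Python as below. The function names are hypothetical, the default t of 0.064 meters is the average quoted in the lecture, and the cyclopean T eye is taken to be the identity purely to keep the example short (that is, the eye sits at the world origin with no rotation).

```python
def identity4():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def matmul(A, B):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def shift_x(dx):
    """Identity matrix with dx placed in the x translation entry."""
    T = identity4()
    T[0][3] = dx
    return T

def stereo_transforms(T_eye, t=0.064):
    """Return (T_left, T_right) for inter-eye distance t, in world units (meters)."""
    T_left = matmul(shift_x(+t / 2.0), T_eye)    # eye moves left, world shifts right (+x)
    T_right = matmul(shift_x(-t / 2.0), T_eye)   # the other sign for the right eye
    return T_left, T_right

def apply(T, p):
    ph = list(p) + [1.0]
    return [sum(T[r][k] * ph[k] for k in range(4)) for r in range(3)]

T_eye = identity4()       # placeholder cyclopean transform (assumption)
p = [0.0, 0.0, -1.0]      # a point one meter straight ahead of the cyclopean eye
T_left, T_right = stereo_transforms(T_eye)
print(apply(T_left, p))   # x is about +0.032: the point appears to the right for the left eye
print(apply(T_right, p))  # x is about -0.032: the point appears to the left for the right eye
```

A quick way to catch swapped eyes, in the spirit of the story above, is to check that a point straight ahead ends up with positive x in the left-eye coordinates.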
Student: (Refer Time: 05:36).
Oh, I see. So, this is still changing just the viewing; I see what you mean. So, if they are both fixated on a single point in space, then you are right, there will be some vergence, and that is something separate that we are going to cover. We are not going to present images to the eyes based exactly on which way they are oriented with respect to your head, because we are not doing eye tracking. If we additionally add eye tracking, then we would also have to consider the individual rotations of the eyes as they converge; our brains are certainly considering that in the real world and taking that part into account. But if we are just going to render onto a display, the assumption will be that the center view directions of the eyes are in fact the same.
So, the eyes are in fact looking off at infinity, at something infinitely far away, and we are not taking the vergence into account. That is a very good question, and it is interesting. So, we are neglecting that part here, but in some other system you may take eye tracking into account and present to the eyes exactly what they are looking at, considering that rotation; this is called foveated rendering, for example. It is more expensive to do that now, and less clear how to solve all the engineering challenges, but it can be done, and if that is done then you have to take that extra transformation into account. Other questions?
Prof. Steve Lavalle
Department of General
University of Illinois
Lecture – 6-1
Geometry of Virtual Worlds (canonical view transform)
So, I want to do the canonical view transform, and then I will be finished with the chain of transformations, and then we will change topics significantly. Also, I should say that the distance between the eyes is called the interpupillary distance, sometimes called IPD; people who work in vision science, researchers in that area, just call it PD, pupillary distance. (Refer Slide Time: 00:38)
This will come up again. So, the canonical view transform. To make things reasonably easy for me to draw, let me assume that the eye is up here looking straight down.
(Refer Slide Time: 01:01)
So, you have to reorient using your imagination; the eye is looking straight down, and we are ultimately going to be rendering onto a display that is rectangular. Because of that, there is going to be a rectangular boundary in terms of the field of view of the eye; for the human eye it is not rectangular, right? We engineering people engineer rectangles more easily than circles and other kinds of curves. So, there is a rectangular part to this that is going to begin here and continue through the whole pipeline.
But if you ultimately have, let us say, a virtual reality display that is circular or some other shape, then it will show up in here, right? So, there are going to be rectangular boundaries coming up. So, imagine the viewing cone for the eye. Down here I want to draw what I will call a far plane. So, this will be called the far plane; I want this line to come right down and hit the center here. And then there will be a near plane; I will put it up here. Well, before drawing the near plane, let me connect the lines here.
So, I want to have the limits of how far I can see. Please forgive me if my lines get a little curvy, or if they do not quite match, but they should be here. If I take a horizontal cross section, I should get a rectangle; that is what you are supposed to be looking at through this strange perspective drawing. And then the farthest plane that I am going to consider is called the far plane. Any objects beyond that I will just consider to be out of bounds; they are not visible to the eye.
So, they are beyond the limits of this camera or eye, whatever it might be. There is also going to be a near plane, and anything closer than that is also going to be considered not within the limits. So, I will put it up here; let us see if I can get this one right. I have another plane here, and again this center line should be poking right through the center of it. We call this the near plane. And then, let me shade this shape that I get. I do not know what to do about the occlusion here, so let me just get the outer boundaries. So, I will shade this, and I think I got all the outer lines there.
So, I did not mean to color this part; let us just leave that. So, you should see a shape there, outlined by the pink lines, that corresponds to a frustum. Let me try to draw this in another orientation.
(Refer Slide Time: 04:51)
Now, in this picture, where I have my eye coordinates, the eye frame, with z going away from the eye, y going up, and x coming out, I should have some kind of picture like this; let me see here. Am I really going to get this right without making a mistake? Probably not. And let us see, some hidden lines in the back there as well. Your brain will play some tricks on you because I have not drawn the hidden lines correctly; I can use dashed lines for the hidden ones, I can go back and do that.
Let us see; this is the frustum, and in particular it is called the viewing frustum. That is what I was trying to show here in pink. Is that clear? So, that is the geometric piece: I have decided on the near plane and the far plane, and now, for all of the points of our model in the world that appear inside the viewing frustum, I need to figure out where they go inside of this canonical viewing cube; remember, that is the canonical cube that had coordinates plus or minus 1. In order to do that, I want to set up a perspective projection model, which is going to take the frustum, look at points inside of it, and consider the line that goes from each one of these points to the pupil, right? So, it is going to extend out like that, and you can imagine that the image I am trying to draw shows up in the near plane here, if you like.
So, in the near plane you can imagine there is going to be a picture that shows up: every point that is inside the frustum is connected by a line to the pupil, and at some point that line strikes the near plane, and that is how I get my image. So, if things are very large and close to the near plane, they are going to show up as very large in the image; but if they are the same size and you move them further and further back to the back end of the viewing frustum, they are going to look tiny, because all these lines come together and everything shrinks down to a point, right? So, think of every point in the viewing frustum as having an associated line. Moving from point space to line space is the basis of something called projective geometry; it turns out the homogeneous transforms that we did are also related to that, but I will not go into all those fascinating connections here.
So, because we are dealing with perspective projection and interpreting points as lines, I want to make a very simple picture. I want to drop the dimensions down and keep it very simple, explain the algebra, and then we will handle the complicated case. So, let us try to make a very simple side view of this picture. It will be a 2-dimensional version.
(Refer Slide Time: 08:27)
So, here is the focal point, or, as I have been saying, the pupil location. I draw a horizontal line here, and a line off at some angle, and now I want to consider a point up here, P. This point is just any point that was in the world, right, after we applied the earlier transformations to get into the world, and we then applied the eye transform as well.
So, I guess it is not quite in world coordinates; it is now in eye coordinates, excuse me, even though there has been no geometric distortion here. And so, if that is the case, based on this coordinate frame that we have set up, I have some Y p and Z p for this point.
(Refer Slide Time: 09:44)
So, I guess I do not really have an X part here because of this projection, but you can imagine the point looks like X p, Y p, Z p, and then this height here is Y p. So, I want to draw the diagram to look like this. Again, this point is somewhere inside the frustum, and then I will take the near plane, and suppose it is at a particular location, at some distance d; let me draw it up here. So, here I will have this focal plane, if you like; this is the place where I am going to be apparently drawing the image. It could be the near plane, or it could be somewhere closer; I can move it all the way down and have it very, very close to the focal point. It is just going to shrink everything, but everything is going to be in the right place, right?
So, all the information is going to be in the right place; it just might need to be scaled differently. So, I do not care too much where I pick along here; that is just going to affect the scaling, and I can always redo the scaling. It is like converting from millimeters to meters or something; that is no big deal, very simple. So, the scaling is not going to matter too much, but let us just pick this particular plane here, alright. So, I picked this plane, and now I would like to know what this height right here is going to be. Let us call this amount h, for height. One thing to notice, by the way I have set everything up here, is that we have similar triangles: we have this smaller triangle here, and we have the larger triangle out here. So, it is like a small triangle contained inside of a larger triangle, but they both have the same angle here, correct?
So, these are both right triangles, and I want to compare these lengths h and Y p. I want to figure out what h is, and express it in terms of d, Z p and Y p. Y p and Z p are already given, and d is also given, because I picked someplace to make a kind of fictitious image, if you like. I just need to figure out what h is going to be so that it ends up in the right place. Well, one thing that is very important in projective geometry, or these perspective transforms, is that ratios of certain kinds are preserved.
So, I expect this ratio to be preserved: h over d is equal to Y p over Z p, right?
Because this triangle on the left is just a small scaled version of the bigger triangle, right? So, they are only off by scale, and these ratios, h over d and Y p over Z p, should be identical. If that is the case, then this perspective transform applied to Y p gives, just solving this equation, h equals d over Z p times Y p, as I have organized it.
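The similar-triangles result, h = (d / Z p) Y p, is simple enough to try with numbers. The sketch below follows the 2D side view, where Z p is treated as a positive distance in front of the pupil; that sign convention, the function name, and the sample values are assumptions of this example.

```python
def project_height(y_p, z_p, d):
    """Height h where the line from the pupil to (y_p, z_p) crosses the plane at distance d."""
    if z_p <= 0.0:
        raise ValueError("z_p must be a positive distance in front of the pupil")
    return d * y_p / z_p      # from h / d = y_p / z_p

d = 1.0
print(project_height(2.0, 4.0, d))        # 0.5
print(project_height(2.0, 8.0, d))        # 0.25: same height, twice as far, half the size
print(project_height(2.0, 4.0, 2.0 * d))  # 1.0: moving the plane only rescales the image
```

The last line illustrates the remark about the choice of plane: doubling d doubles every projected height, so the image is the same up to scale.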