Virtual Reality Engineering
Department of Biomedical Engineering
Indian Institute of Technology, Madras
Lecture – 43
Welcome back. In one of the earlier classes we talked about the resolution needed for virtual reality from the anatomical point of view: we looked at the photoreceptors, rods and cones, and from there we derived what resolution is possible. In the later classes we talked about psychophysics, the measurement of perception. Is the resolution, therefore, going to be limited by perception? Psychophysics, as you all know, is not just physiology or anatomy, but actual perception: the relation between perception and the underlying anatomy or physiology.
So, we are going to see how the resolution dictated by the anatomy is modified by psychophysics, and we are going to come up with a further limiting resolution.
(Refer Slide Time: 01:20)
For this we will look at the Contrast Sensitivity Function (CSF), which we already discussed in one of the previous classes. The CSF shows, for different spatial frequencies, how the sensitivity varies; the y axis is the sensitivity, and the threshold, as we have seen, is the inverse of this sensitivity.
So, at the point of highest sensitivity we need the lowest threshold; as the sensitivity decreases we need a higher and higher threshold. For example, at this particular frequency we might need more threshold. The implication of this CSF, which we have already seen in the earlier class, is that if a picture has higher-frequency details, this diagram tells us we need more contrast for them to be seen. Similarly, very low frequencies also need a higher threshold; the least threshold is necessary only for the medium frequencies. That is the implication of this graph.
So, how we can use this sensitivity function to come up with a limiting resolution is our topic today.
(Refer Slide Time: 03:12)
For this, let us again get into the definitions of resolution and contrast, and then see whether we can connect these two things. Consider a spatially varying image such as this one, with black and white stripes. Let us say we give this as input to our system, a generic one, and the system outputs an image which is transformed in some way; in this case it gives an image like this. The input image was binary, black and white, and it is converted into a gray image. The black-and-white input has this change in contrast, this is the black and this is the white, a square-grating kind of pattern, and when this is given to the system it transforms the image into a sinusoidal gray-level image. Here the input contrast is 100 percent, because there is only black and white. The definition of contrast, if you look at it, is (I_max - I_min) / (I_max + I_min); expressed as a percentage, this is one definition of contrast.
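As a quick check of this definition, here is a small sketch (a hypothetical helper, not from the lecture) computing the percentage contrast of the two gratings:

```python
def contrast(i_max, i_min):
    """Percentage contrast: (Imax - Imin) / (Imax + Imin) * 100."""
    return (i_max - i_min) / (i_max + i_min) * 100.0

# A pure black-and-white grating (Imin = 0) gives 100 percent contrast:
print(contrast(1.0, 0.0))    # 100.0
# A gray-level output grating swinging between 0.95 and 0.05 gives about 90:
print(contrast(0.95, 0.05))
```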
So, in this case the input contrast is 100 percent, while here the contrast has reduced; let us say it is 90 percent. There are two concepts we can look at: the concept of resolution and the concept of contrast, and there is a connection between them. Resolution is the closest that two lines can be and still be separated. Suppose we give this image as input to the system: can the output be resolved into two separate lines, or what is the highest frequency that can still be resolved into two different lines in the output? That is the resolution. You can see that the resolution of the output image is limited by this contrast; let us look into this in a little more detail.
(Refer Slide Time: 06:40)
Let us look at how this contrast limits the resolution. Suppose we change the contrast gradually: from left to right we have only black and white, but slowly we are reducing the contrast.
This is the black and this is the white; initially it is completely black and white, and then the black is reduced until finally we reach some gray. Again, using the definition we have written, (I_max - I_min) / (I_max + I_min) will tell you what the contrast is over here, and this contrast is what limits the resolution.
(Refer Slide Time: 07:45)
For example, again we have our system over here. If we increase the frequency, let us say we double it, then the system is no longer separating the lines in the output image. You can see that the output image has less contrast; let us say in this case it is 20 percent. So now we will define the resolution in terms of the modulation.
(Refer Slide Time: 08:40)
So, we have a system with an input spatial frequency and an output frequency, an output modulated image, and we will see how the contrast is transferred. If you draw a graph with frequency on the x axis, on the y axis we have the modulation transfer factor (MTF). A system transfers the modulation of the given input to the output; if it is transferring it completely, then the factor is 1, and if it is reducing it, then the factor is less than 1.
So, let us plot this modulation transfer factor on the y axis for various frequencies: it may be 1 at low frequencies, but then it may start reducing. This function is called the modulation transfer function, or MTF. The MTF is a system property, and this system property can be measured using various approaches. Assuming everything is a linear system, the MTFs of the different components can be combined: for example, in our virtual reality system we have the electronics, the optical system, and the software system, and each one may have its own MTF, MTF 1, MTF 2, MTF 3. The system MTF is then the product of MTF 1, MTF 2, and MTF 3 (a cascade of linear components multiplies in the frequency domain, corresponding to convolution of their point spread functions in the spatial domain). So let us say we have found this system MTF from some test of our virtual reality headset.
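As a sketch of this cascade idea, the system MTF of a chain of linear components is the pointwise product of the component MTFs. The curve shapes below are invented placeholders, not measured MTFs:

```python
import numpy as np

freqs = np.linspace(0, 60, 121)           # spatial frequency, cycles/degree
mtf_optics   = np.exp(-freqs / 50.0)      # placeholder optics MTF
mtf_display  = np.exp(-freqs / 40.0)      # placeholder display/electronics MTF
mtf_software = np.exp(-freqs / 80.0)      # placeholder software/rendering MTF

# Cascaded linear systems multiply in the frequency domain:
mtf_system = mtf_optics * mtf_display * mtf_software

# The system never transfers more modulation than any single component.
assert mtf_system[0] == 1.0
assert (mtf_system <= mtf_optics + 1e-12).all()
```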
Now, as far as the MTF is concerned, there is no perception involved in it. Perception comes into the picture when the contrast sensitivity is involved: the CSF, if you remember, can now be brought in here. But the CSF measures sensitivity, and because the y axis here is the modulation transfer factor, it is the inverse of the CSF that needs to be plotted in this graph. So instead of the CSF we will use the contrast threshold function; remember, threshold is equal to 1 over sensitivity. The same function can therefore be plotted something like this: this is the CTF, the contrast threshold function. This threshold function tells you what threshold is necessary as we increase the frequency. As we saw in the CSF, there is a frequency at which the least threshold is needed; at a lower frequency it needs a higher threshold, and similarly as we increase the frequency it needs a higher threshold as well.
Now, the frequency at which the MTF and the CTF curves meet is valuable information: it can be called the limiting resolution. This limiting resolution is dictated by this perceptual graph; this is where psychophysics comes into the picture. So earlier the resolution was dictated by anatomy and physiology; the MTF is dictated by the system properties; and now we can put this together with the CSF, which is psychophysical, to find out the limiting resolution. Now let us look at one of these CSF curves.
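The crossing point can be found numerically. In this sketch both curves are illustrative stand-ins (an exponential MTF and a band-pass CSF shape); the constants are invented so that the crossing lands near the 40 cycles/degree figure discussed below, and none of them come from the lecture:

```python
import numpy as np

freqs = np.linspace(0.5, 60, 600)            # cycles per degree
mtf = np.exp(-freqs / 25.0)                  # system MTF, falling with frequency
csf = 30.0 * freqs * np.exp(-freqs / 8.0)    # band-pass contrast sensitivity
ctf = 1.0 / csf                              # threshold = 1 / sensitivity

# Limiting resolution: first frequency where the MTF drops below the CTF.
crossing = freqs[np.argmax(mtf < ctf)]
print(f"limiting resolution ~ {crossing:.0f} cycles/degree")
```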
(Refer Slide Time: 15:33)
So, from the CSF curve as measured, about 40 cycles per degree is where the limiting resolution happens; one axis is the modulation transfer factor, the other is the threshold.
So, this graph tells us that at about 40 cycles per degree the limiting resolution happens; the frequency here is per degree of arc. So if you have a 120-degree field of view, the number of pixels needed for 120 degrees is going to be 120 times 40, which is 4800 pixels. As of now we have about 1200 pixels; let us say that is the state of the art. So in order to reach these 4800 pixels we might need 10 years; in 10 years we will reach this limiting resolution, assuming Moore's law holds and stays valid.
So, every 5 years the pixel count is going to double; therefore after 5 years it will be 2400 pixels, and after 10 years we will reach 4800 pixels. But that assumes a 120-degree field of view, and industry is going to consider a much wider field of view, as we will see in later classes. So for 200 degrees it is going to take 200 times 40, about 8000 pixels, and to reach 8000 pixels we might need about 15 years, again assuming Moore's law holds and stays valid.
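This back-of-the-envelope timeline can be sketched as follows. The 1200-pixel starting point and 5-year doubling period are the lecture's assumptions:

```python
def years_to_reach(target_pixels, current_pixels=1200, doubling_years=5):
    """Years until the pixel count reaches the target, doubling every period."""
    years = 0
    while current_pixels < target_pixels:
        current_pixels *= 2
        years += doubling_years
    return years

print(years_to_reach(120 * 40))  # 4800 pixels for 120 degrees -> 10 years
print(years_to_reach(200 * 40))  # 8000 pixels for 200 degrees -> 15 years
```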
And so, we are right now at 1200; it will double in 5 years to 2400, double again to 4800 in 10 years, and double once more to 9600 in about 15 years. All of this assumes that the MTFs of the different components of the system, MTF 1, MTF 2, MTF 3, stay constant; in practice each of these MTFs may be improving, and therefore the requirement of 40 may be reducing in the future.
So, this might become 30 or better, and if that is the case the number of pixels required will all be less. I hope you see the point of how psychophysics can be used to find the limiting resolution, which is useful for designing better imaging systems or better visual systems for virtual reality. We will stop here.
Virtual Reality Engineering
Dr. Steve Lavalle
Department of General
Indian Institute of Technology, Madras
Lecture – 10-3
Human Vision (depth perception)
(Refer Slide Time: 00:15)
The next topic I have is depth perception. I will go a little bit into that; this is covered in chapter 10 of the Mather book.
Depth perception in general is a very important topic for virtual reality. We like to think that one of the biggest differences between looking at a screen and looking through a head-mounted display is that we can provide depth perception through stereo, but it is not completely true that that is the most important distinguishing feature. We get depth information from many, many sources, and I want to make sure that is very clear.
(Refer Slide Time: 00:51)
Depth perception follows, by the way, a very general pattern in perceptual psychology: there is something-perception. We have depth perception; we will talk next about motion perception; there are some other things I will not have a lot of time to talk about, but we could talk about scale perception, we could talk about color perception. This is just a template here, if you like. If you want to study more of these things later, it will follow the same mentality.
We will be studying what are called depth cues, and in this pattern, cues come up over and over again. If I am studying motion perception, then I will talk about motion cues. Cues are pieces of information that trigger the brain to perceive whatever it is we are trying to perceive. What is the key information that we need; what are the features that are going to be used? There are going to be two different kinds of depth cues. One nice way to separate them is into what I call metric and ordinal: metric cues vary continuously with distance, and ordinal cues are about ordering, as the name suggests, for example near to far, what is in front of what. Another way to name these: number one could be continuous, if you like, and number two could be combinatorial. That may appeal to the computer science in you, if you have that background.
Again, the thing I want to emphasize is that there is a multiplicity of depth cues, not just stereo. When you look at panoramic images, for example, if you have had a chance to look at those in the lab, they may look quite 3-dimensional even though the same image is being presented to both of your eyes. Why is that? We need to talk about that.
(Refer Slide Time: 03:37)
I will go through some examples. One: retinal image size. How far away something is may have to do with how much of your retina the image takes up, combined with your knowledge about what you are looking at, assuming you are looking at something familiar. Let me give some pictures here.
(Refer Slide Time: 04:05)
For example, just based on the size of that hand as an image on your retina, in the context of a person standing there behind, you make some inferences about closeness. Does the woman have an enormous hand in the image on the left, or is the hand just closer? You do not really see an arm there; it is the same picture either way. You are making some inference about depth, and you are not using stereo.
(Refer Slide Time: 04:27)
Again, what is the size of the image on the retina? We know, perhaps, how large a tree should be. And so if we see two trees and one is further away, the picture on the right shows that the smaller tree is perceived to be further away simply because it occupies a smaller region of the retina.
(Refer Slide Time: 04:52)
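A small sketch of why this works: the visual angle subtended by a familiar-sized object shrinks with distance, so the tree covering less of the retina is inferred to be further away. The sizes and distances below are made up for illustration:

```python
import math

def visual_angle_deg(object_size_m, distance_m):
    """Angle subtended at the eye by an object of the given size and distance."""
    return math.degrees(2.0 * math.atan(object_size_m / (2.0 * distance_m)))

near_tree = visual_angle_deg(10.0, 20.0)   # a 10 m tree at 20 m
far_tree  = visual_angle_deg(10.0, 100.0)  # the same tree at 100 m
assert near_tree > far_tree  # the nearer tree occupies more of the retina
```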
Another thing we can look at is height in the visual field.
Let me write that: height in visual field. We did not change the size of the person there; we just changed their height in the visual field relative to the horizon line. And so it appears that the man at the top is further away. That is the kind of information we are also using.
(Refer Slide Time: 05:21)
Here is another one: where is the spear going to hit? There is also some size changing there as well; I would say it is a combination of two cues, because it is height in the visual field, but there are also size differences. I do not think the elephant should be smaller than the animal in the front.
(Refer Slide Time: 05:40)
Texture gradients and perspective. Perspective is a part of this; it may not be a perfect texture. We have some examples of that: you perceive depth just from the street stones in this painting.
(Refer Slide Time: 06:09)
There is a lot of perspective near the top there; in fact, it looks a bit excessive, but at least from the tiles on the ground you get some idea of depth.
(Refer Slide Time: 06:20)
You get ideas of depth from this too, just an arrangement of lily pads. Again, it is depth from this kind of texture gradient.
(Refer Slide Time: 06:28)
Again, it is a kind of texture here a texture gradient.
(Refer Slide Time: 06:34)
Another one is image blur. In this one it looks like the blurry parts are further away; just by the way we have drawn this image, we perceive the other pink flower to be closer.
(Refer Slide Time: 06:57)
In this case we have blurring in the front, so we perceive the garden gnome to be further away.
(Refer Slide Time: 07:06)
This is like the case before.
(Refer Slide Time: 07:09)
Another one is atmospheric perspective. The hazy mountains in the distance seem to be further away.
(Refer Slide Time: 07:28)
Some additional cues; this one is great. Number six: shadows and shading. In the top picture, I perceive the balls to be going further and further away as we go from left to right, but in the bottom picture I perceive them to be at the same depth, just at different heights in the air, and the only difference between the top and the bottom is how the shadows are rendered. Clearly, you are using some additional information from the shadows to help you reason about depth; this is happening all the time for us.
(Refer Slide Time: 08:11)
You can also look at the shading to figure out whether it is a cylindrical shape, just from the way the shading works here. In addition to straightforward shadows, there is the shading across the object. For the one on the left, inward cylinder or outward cylinder, I am not sure we can tell; there may be ambiguities there. The other two look a little bit clearer, maybe; the last one looks like it clearly goes outward. Be careful with those, but you do get some depth information.
(Refer Slide Time: 08:42)
Another one is interposition: I perceive the yellow disc to be closer than the red square, which is closer than the green triangle.
We get some kind of depth ordering; that is an example of ordinal information.
(Refer Slide Time: 09:07)
And the same kind of thing is happening in this complex picture. There are a lot of boats out there and people in the front waving, and we get a lot of information just from the ordering, not necessarily using any of the other cues. In this particular case there is not a lot of extra information, but we start to infer where the various sailboats are located with respect to each other from this interposition. Let me give a few more; the list goes on and on, does it not? Number 8: accommodation, which we talked about before. This is the refocusing of your eye: just based on your brain's knowledge that your eye has had to refocus, you perceive something as being closer.
Now, if you mess that up with your head-mounted display by making everything appear to be at infinity all the time, you lose this cue. What is the effect of that? I do not know; someone should study it. Some people have studied it, but there are still a lot of unknowns. You are losing that cue, and many people would argue that it is a very important cue for depth, maybe more important than stereo.
(Refer Slide Time: 10:21)
Motion parallax. If I move back and forth while holding some kind of object stationary here, then I see this object passing in front of things that are further away at a faster rate. Objects at varying depths intermingle in a certain way, where the closer objects move more quickly as I move back and forth. Motion parallax is very important: it is important for your visual sense, it is important for your audio sense, and it shows up in other places. It is important here as well.
Now, if I have a head-tracking algorithm running with sensors for a head-mounted display, and I can only track rotation, I get some motion parallax by simulating the translation of my pupils from rotation alone, but if I translate like this I will not get motion parallax. If I look at a panoramic display that was captured by a panoramic camera at some fixed location and then start making these parallax motions, I will not get this beautiful arrangement of objects moving back and forth. You lose that if you capture a panorama with a stationary camera; you lose that bit of information.
It is very important to have that, but that gets lost.
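As a rough sketch of the geometry (the numbers are illustrative, not from the lecture): for a sideways head translation, the angular shift of a point's image falls off roughly as 1 over depth, which is why near objects sweep past faster:

```python
import math

def parallax_shift_deg(head_translation_m, depth_m):
    """Angular image shift caused by translating the eye sideways."""
    return math.degrees(math.atan(head_translation_m / depth_m))

# A 10 cm head translation shifts near objects much more than far ones:
for depth in (0.5, 2.0, 10.0):
    print(f"{depth:5.1f} m -> {parallax_shift_deg(0.1, depth):5.2f} degrees")
```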
(Refer Slide Time: 11:54)
And finally, number 10, which I saved for the end: the obvious one, stereo cues. What are these? There is the vergence angle: your brain knows how your eyes are oriented and how much they have converged; you have a signal for that. And you have what is called binocular disparity.
How different are the images between the two eyes?
(Refer Slide Time: 12:44)
Now, because your eyes rotate, it would not be the same as if I just had two cameras facing forward and moved them apart to get stereo: I am not rotating the cameras to face the object the way my eyes actually rotate to face the object of interest. Your cameras would not, unless you put some special motors on them to rotate them, so be a little bit careful. In stereo computer vision you get a very large displacement; when we are talking about human eyes that rotate and converge to fixate on the object of interest, there is still some disparity in the images. They cannot look identical, because you are looking at the scene from two different perspectives.
That is what I mean by binocular disparity: it is not as much as in the engineered case, but nevertheless your brain can detect that information. There is also diplopia, which I mentioned earlier: if I fixate on some close object, then there are multiple images in the periphery. That is additional information. A couple more things I will say about stereo: you have to pay very close attention to what is called the interpupillary distance, or IPD. The IPD in the real world becomes very important: if you are going to place lenses in front of your eyes, then you have to line them up correctly.
So that your eyes are centered when you are looking forward; that is the best you can do. If you do not get that right, if you do not have adjustments for this (do we have adjustments for those in the lab? It does not look like it), your IPD may be far from the average and you might not even know it; most people do not know what their IPD is. My IPD happens to be lower than most, maybe the 15th or 10th percentile or something; my eyes are actually close together compared to the world average, who would have thought.
So that is one question: are you looking through the center of the lenses? That is very important for the optical part of a head-mounted display. Another interesting question in virtual reality is: if you believe in a perfectly scaled world, have I matched the interpupillary distance perfectly, or is it too small or too large? If you make it very, very large in virtual reality, you will feel like a big monster or something; you may feel like Godzilla.
You can make an entire city look very small, or feel very small: you move your head around and it looks like it has been miniaturized. If you make your IPD very small, you might feel like you have become tiny. There are some applications people have written where you can make yourself feel like you are suddenly maybe 10 centimeters high, and part of the way to achieve this is by adjusting the interpupillary distance.
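A sketch of why IPD scaling changes perceived scale (the values are illustrative; 64 mm is a commonly quoted average IPD): the vergence angle for fixating a point depends only on the ratio of IPD to distance, so multiplying the rendered IPD by some factor produces the same vergence signal as shrinking the world by that factor:

```python
import math

def vergence_angle_deg(ipd_m, distance_m):
    """Angle between the two eyes' lines of sight when fixating a point."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))

normal = vergence_angle_deg(0.064, 2.0)   # average IPD, object at 2 m
scaled = vergence_angle_deg(0.640, 20.0)  # 10x IPD, object 10x further away
assert abs(normal - scaled) < 1e-12       # identical vergence signal:
# the large-IPD viewer perceives the 20 m scene as if it were 2 m away.
```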
That definitely has a lot to do with your perception of depth and your perception of scale. Let us see here.
(Refer Slide Time: 16:00)
This figure shows what is called the horopter, which is the region you get when you fixate: a common stereo focal region when you consider the clear images projected onto the retina. Remember we did all this study of focal planes; this shows the stereo focal surface that is common to your two eyes. There is the calculated theoretical surface for that, and then there is what has been determined empirically, which does not seem to match exactly; I personally do not know why they do not match.
I do not know what theories there are for that, but that is what we get.
(Refer Slide Time: 16:42)
This may also provide some motivation for people advertising curved displays, perhaps: if you sit exactly in the right place, then it may give you a perfect perception, but who knows. Let us see, are there any questions about anything I have gone over today? I will leave you with one final optical illusion, just for a final surprise; see if you have enough background to explain why it might be occurring.
(Refer Slide Time: 17:11)
Does that look strange to you? This is called the café wall illusion. All of those lines going back and forth are horizontal, but they seem to be bending, do they not? But they are not. I will leave you with that optical illusion to think about. That is it for today.