Measures of Association

In this lecture, we study the association between random variables. In the two previous lectures, we introduced the concept of a random variable and showed how to compute its expected value, which is a measure of central tendency, as well as its variance and standard deviation. We used these to compare two different random variables and to make decisions.

Now we look at the association between random variables. Just as we looked at covariance and correlation as measures of association in statistics, we will define these measures for random variables, show how to compute them, and make decisions based on the computation.

Let us look at two random variables, X and Y. X could be the change in the value of stock 1 and Y the change in the value of stock 2. Assume that X takes three values: it increases by 60 with a probability of 0.15, remains the same with a probability of 0.75, and decreases by 60 with a probability of 0.1. The second stock, Y, can also take three values, 100, 0 and -100, with probabilities 0.07, 0.88 and 0.05.

The expected value of X is 60 × 0.15 + 0 × 0.75 - 60 × 0.1 = 9 + 0 - 6, so μ_X = 3. We have not shown the details of the standard deviation here, but we have seen how to compute it in earlier lectures; σ_X turns out to be 29.85. Now we look at stock 2. The expected value is 100 × 0.07 + 0 × 0.88 - 100 × 0.05 = 7 + 0 - 5, so μ_Y = 2, and the standard deviation σ_Y turns out to be 34.58.
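The expected values and standard deviations quoted above can be reproduced with a short sketch. The helper name `mean_and_sd` is illustrative and not part of the lecture; the values and probabilities are the ones read out for the two stocks.

```python
from math import sqrt

def mean_and_sd(values, probs):
    """Return (expected value, standard deviation) of a discrete random variable."""
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, sqrt(variance)

# Change in stock 1 (X) and stock 2 (Y) with the probabilities from the lecture.
mu_x, sigma_x = mean_and_sd([60, 0, -60], [0.15, 0.75, 0.10])
mu_y, sigma_y = mean_and_sd([100, 0, -100], [0.07, 0.88, 0.05])

print(round(mu_x, 2), round(sigma_x, 2))  # 3.0 29.85
print(round(mu_y, 2), round(sigma_y, 2))  # 2.0 34.58
```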
Now, assume a risk-free rate of return of 5 percent per year on 1000. Since each random variable represents the change between two consecutive days, we need the rate per day, which is 5 percent of 1000 divided by 365. We take this as r = 0.14 for our computation.

We then compute what is called the Sharpe ratio. For stock 1 it is (μ_X - r)/σ_X, which is 0.096. For stock 2 it is (μ_Y - r)/σ_Y, which is 0.054. So, if the investor wishes to put all the money in one of the two stocks, the person will choose stock 1, which has the larger Sharpe ratio.

Now we try to find the association between the stocks and see whether it is advantageous to put part of the money in stock 1 and part of the money in stock 2.

The joint probability distribution of X and Y, written p(x, y), gives the probability of events of the form X = x and Y = y. It represents the simultaneous outcome of both random variables. The probability for X alone is obtained by adding over the values of Y. So P(X = 0) is P(X = 0 and Y = 100) plus P(X = 0 and Y = 0) plus P(X = 0 and Y = -100), because we know from the previous slides that Y can take the values 100, 0 and -100 with the given probabilities.

We also define independent random variables. Two random variables are independent if and only if the joint probability distribution is the product of the marginal probability distributions. X and Y are independent if p(x, y) = p(x) × p(y) for every x and y.

Next, the multiplication rule for the expected value of a product: if the random variables are independent, the expected value of their product is the product of the expected values. So E(XY) = E(X) × E(Y).
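The Sharpe ratio comparison for the two stocks can be checked with a minimal sketch. It assumes the rounded daily rate of 0.14 and the means and standard deviations quoted in the lecture; the function name `sharpe_ratio` is illustrative.

```python
def sharpe_ratio(mean_return, risk_free, sd):
    """Excess expected return per unit of standard deviation."""
    return (mean_return - risk_free) / sd

# 5 percent of 1000 spread over 365 days, rounded to two decimals as in the lecture.
risk_free_daily = round(0.05 * 1000 / 365, 2)

sharpe_1 = sharpe_ratio(3, risk_free_daily, 29.85)  # stock 1
sharpe_2 = sharpe_ratio(2, risk_free_daily, 34.58)  # stock 2

print(risk_free_daily)                          # 0.14
print(round(sharpe_1, 3), round(sharpe_2, 3))   # 0.096 0.054
```

Stock 1 has the larger ratio, which matches the choice made in the lecture for an investor who picks only one stock.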
The addition rule for expected values says that the expected value of a sum of random variables is the sum of the expected values, whether or not they are independent. So E(X + Y) = E(X) + E(Y).

Now, let us find the joint probability distribution of X and Y. X can take three values, -60, 0 and 60. Y can take -100, 0 and 100. Please note that we have just changed the order from the earlier slide. The probability of y = 100 is 0.07. The probabilities of x = -60, x = 0 and x = 60 add up to one, so we multiply 0.07 by these three probabilities and round suitably to get 0.01, 0.05 and 0.01, which add up to 0.07. Similarly, for y = 0 the products are rounded suitably so that they add up to 0.88, and for y = -100 the entries for all the x values add up to 0.05. The x probabilities are 0.1, 0.75 and 0.15, as we see here. So we have completed this table, just as we completed such a table in an earlier lecture in statistics.

We also know that μ_X is 3, σ_X is 29.85, μ_Y is 2 and σ_Y is 34.58. Suppose we invest in one share of stock 1 and one share of stock 2. Then E(X + Y) = E(X) + E(Y) = 3 + 2 = 5. Treating the two stocks as independent, the variance is additive, so σ_{X+Y} = √(29.85² + 34.58²) = 45.68. The Sharpe ratio is (μ_X + μ_Y - 2r)/σ_{X+Y}, which becomes 0.1033. This is larger than either individual ratio, so it is better to invest in one share of each together rather than put everything in the first stock or everything in the second.

Now, let us look at a similar example to understand this further. In a sweet shop, customers buy either a 100-gram sweet or a 200-gram sweet, along with either a 100-gram mixture or a 250-gram mixture. The joint probabilities are given.
The probability of buying a 100-gram sweet and a 100-gram mixture is 0.4, and so on; the table is shown here. Let X be the weight of the sweet and Y the weight of the mixture.

Find the marginal distributions of X and Y. The marginal distribution of X is 0.7 and 0.3, and the marginal distribution of Y is 0.6 and 0.4.

What is the expected total weight of the purchase? It is 130 + 160, which is 290 grams. How do we get this? For the sweet, 100 × 0.7 is 70 and 200 × 0.3 is 60, giving E(X) = 130. For the mixture, 100 × 0.6 is 60 and 250 × 0.4 is 100, giving E(Y) = 160. So the expected total weight is E(X) + E(Y) = 290.

Are X and Y dependent or independent? They are dependent. The marginal probabilities here are 0.7 and 0.6, so if X and Y were independent the joint probability would have been 0.7 × 0.6 = 0.42, but the table has only 0.4. Likewise, the number here would have been 0.6 × 0.3 = 0.18, but it is 0.2. Since the joint probabilities are not equal to the products of the marginal probabilities, we say that X and Y are dependent.

Is the variance of the total weight X + Y equal to, larger than, or smaller than σ_X² + σ_Y²? We find that σ_X² is 2100 and σ_Y² is 5400, which add up to 7500. For the total weight, X + Y can take the values 200, 300, 350 and 450 with the probabilities that are given. We can find the expected value and the variance of X + Y directly, and if we do that we get 6900, which is smaller than σ_X² + σ_Y².

Now, let us try to measure the dependence between random variables. Just as we introduced the covariance earlier in statistics, we now define the covariance between random variables. Computing the variance of X + Y directly from the table, as we did to get 6900, is a little difficult. We will see whether there is an easier way to do that.
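The sweet shop answers can be checked directly from the joint table. This is a sketch under one assumption: the lecture reads out only the cells 0.4 and 0.2, so the remaining two cells (0.3 and 0.1) are inferred here from the stated marginals. The helper names are illustrative.

```python
# Joint distribution p(x, y): x = sweet weight (g), y = mixture weight (g).
# 0.4 and 0.2 are quoted in the lecture; 0.3 and 0.1 are inferred from the
# marginals (0.7, 0.3) and (0.6, 0.4), which is an assumption.
joint = {(100, 100): 0.4, (100, 250): 0.3,
         (200, 100): 0.2, (200, 250): 0.1}

def marginal(joint, axis):
    """Sum the joint probabilities over the other variable."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum((v - m) ** 2 * p for v, p in dist.items())

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

# Distribution of the total weight X + Y: 200, 350, 300 and 450 grams.
total = {}
for (x, y), p in joint.items():
    total[x + y] = total.get(x + y, 0.0) + p

# Independent only if every cell equals the product of its marginals.
independent = all(abs(p - p_x[x] * p_y[y]) < 1e-9 for (x, y), p in joint.items())

print(round(mean(p_x) + mean(p_y), 2))                     # 290.0
print(round(variance(p_x), 2), round(variance(p_y), 2))    # 2100.0 5400.0
print(round(variance(total), 2))                           # 6900.0
print(independent)                                         # False
```

The directly computed variance of the total, 6900, is below 2100 + 5400 = 7500, which is the gap that the covariance explains.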
In statistics, the covariance is computed between columns of data. The covariance between x and y is (x1 - x̄)(y1 - ȳ) + (x2 - x̄)(y2 - ȳ) + ... + (xn - x̄)(yn - ȳ), divided by n - 1. The covariance between random variables is the expected value of the product of the deviations from the means. So Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)]. The covariance is positive when the distribution puts more probability on outcomes where X and Y are both larger, or both smaller, than their means.

Let us now find the covariance for the two stocks. x is -60, 0 and +60; y is -100, 0 and +100, and we have the probabilities in the table. μ_X is 3, σ_X is 29.85, μ_Y is 2 and σ_Y is 34.58. We also found that σ²_{X+Y} is 2235, which is not equal to σ_X² + σ_Y².

The covariance is the expected value of (X - μ_X)(Y - μ_Y). We know μ_X is 3 and the three values x takes; we know μ_Y is 2 and the three values y takes. The covariance turns out to be 65.6. Now, Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), which gives 2235 in this example. That is a way to find the variance of X + Y: rather than take each of the cases individually, it is easier to find the covariance and from the covariance calculate the variance of X + Y.

We can also find the correlation between two random variables. Corr(X, Y) = Cov(X, Y)/(σ_X σ_Y). In this case we already know σ_X, σ_Y and Cov(X, Y), and the correlation turns out to be 0.064. As usual, the correlation coefficient is between -1 and +1.
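As a rough check of the covariance route, the sketch below applies Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)] and Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) to the sweet shop table, where all four joint probabilities can be recovered (0.3 and 0.1 are inferred from the marginals, which is an assumption). It also recomputes the stock correlation from the figures quoted in the lecture. The function name `moments` is illustrative.

```python
from math import sqrt

# Sweet shop joint table: x = sweet weight (g), y = mixture weight (g).
# 0.3 and 0.1 are inferred from the stated marginals (assumption).
joint = {(100, 100): 0.4, (100, 250): 0.3,
         (200, 100): 0.2, (200, 250): 0.1}

def moments(joint):
    """Return (mean_x, mean_y, var_x, var_y, cov_xy) for a joint table."""
    mean_x = sum(x * p for (x, _), p in joint.items())
    mean_y = sum(y * p for (_, y), p in joint.items())
    var_x = sum((x - mean_x) ** 2 * p for (x, _), p in joint.items())
    var_y = sum((y - mean_y) ** 2 * p for (_, y), p in joint.items())
    cov_xy = sum((x - mean_x) * (y - mean_y) * p for (x, y), p in joint.items())
    return mean_x, mean_y, var_x, var_y, cov_xy

mean_x, mean_y, var_x, var_y, cov_xy = moments(joint)
var_total = var_x + var_y + 2 * cov_xy   # variance of X + Y via the covariance
corr_xy = cov_xy / sqrt(var_x * var_y)

print(round(cov_xy, 2))      # -300.0
print(round(var_total, 2))   # 6900.0
print(round(corr_xy, 3))     # -0.089

# Correlation of the two stocks from the values quoted in the lecture.
corr_stocks = 65.6 / (29.85 * 34.58)
print(round(corr_stocks, 3))  # 0.064
```

The negative covariance of the sweet shop table is what pulls the variance of the total weight down from 7500 to 6900.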
The addition rule for independent and identically distributed (iid) random variables says that if n random variables X1, X2, X3, ..., Xn are iid with mean μ_X and standard deviation σ_X, then the expected value of their sum, E(X1 + X2 + ... + Xn), is n μ_X, the variance of the sum is n σ_X², and the standard deviation of the sum is √n σ_X.

Now, the addition rules for weighted sums, like we did before. E(aX + bY + c) = a E(X) + b E(Y) + c. Var(aX + bY + c) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y); the constant c drops out.

We now have six items, numbered 1 to 6, so let us look at them. Item 1, positive covariance: when the covariance is positive, Var(X + Y) is greater than Var(X) + Var(Y), which is something we saw. Item 2, identically distributed: if X and Y are identically distributed, p(x) = p(y). Item 3, uncorrelated random variables: Var(X + Y) = Var(X) + Var(Y), because the covariance term is zero and therefore there is no correlation. Item 4, covariance: the definition of correlation is covariance divided by σ_X σ_Y, so the covariance is the correlation coefficient ρ multiplied by σ_X σ_Y. Item 5, weighted sum of two random variables: something like a multiple of X plus 3Y, where X is one random variable and Y is another. Item 6, correlation coefficient: it is given by the symbol ρ.

Let us look at some true or false questions. A restaurant has higher revenue on weekends. It treats the revenue on consecutive weekends as iid with mean μ and standard deviation σ. The restaurant expects the same revenue on average on the first and second weekends: true, because the revenues are independent and identically distributed, so you can expect the same average.
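The addition rules for iid sums and for weighted sums stated above can be written as two small helpers. This is only a sketch: the function names and the numbers in the usage lines are hypothetical and chosen purely for illustration, not taken from the lecture.

```python
from math import sqrt

def iid_sum_moments(n, mu, sigma):
    """Mean, variance and standard deviation of a sum of n iid random variables."""
    return n * mu, n * sigma ** 2, sqrt(n) * sigma

def weighted_sum_moments(a, b, c, mean_x, mean_y, var_x, var_y, cov_xy):
    """Mean and variance of aX + bY + c; the constant c shifts only the mean."""
    mean = a * mean_x + b * mean_y + c
    var = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
    return mean, var

# Hypothetical example: 4 iid variables with mean 10 and standard deviation 3.
print(iid_sum_moments(4, 10, 3))                       # (40, 36, 6.0)

# Hypothetical example: 2X + 3Y + 1 with E(X)=1, E(Y)=2, Var(X)=4, Var(Y)=9, Cov=1.5.
print(weighted_sum_moments(2, 3, 1, 1, 2, 4, 9, 1.5))  # (9, 115.0)
```

Note that the standard deviation of the iid sum grows with √n rather than n, which is the point tested in the true or false questions about the restaurant.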
If the revenue is low on the first weekend, it will be low on the second weekend: false, not necessarily. Only the expected values are equal; the random variable can still take different values on the two weekends.

The standard deviation of sales over two weekends is 2σ: false. It will be √2 σ, because we found that variance is additive and standard deviation is not. For two weekends the variance is σ² + σ², which is 2σ², and so the standard deviation is √2 σ.

If investors want small portfolio risk, would they choose investments with negative covariance, positive covariance, or no correlation? They would choose investments with negative covariance, so that the risk, which is the variance of X + Y, reduces. So it is good to choose investments that have negative covariance, so that the portfolio risk comes down.

Does a portfolio formed from a mix of three investments have less risk than a portfolio with two investments? I would say it is generally true, because the more the diversification, the less the risk. One also has to look at returns, which can come down, but in this question we are only looking at risk. Therefore, we would say generally true.

What are the covariance and the correlation coefficient between a random variable and itself? Between a random variable and itself, the covariance is the variance and the correlation coefficient is 1.

If the covariance is high, is the correlation close to 1? One would generally get the feeling that, since correlation is covariance divided by σ_X σ_Y, a high covariance leads to a correlation close to 1. But the actual answer is that a large covariance by itself does not tell us this, because the value of the covariance depends on the unit of measurement.
We have already seen that when amounts were measured in rupees the covariance had a certain value, and when they are measured in paise the covariance becomes different and much larger. So the answer depends on the unit of measurement of the random variables.

Would it be reasonable to model the daily sales as a sequence of independent and identically distributed random variables? Not necessarily, because we have to look at weekends before we do that. The sales might follow two different distributions, with weekend sales being different, as seen in the previous question.

Now, let us look at a few more simple questions. If the variance of X is 10, the variance of Y is 10 and the variance of X + Y is 16, what is the correlation between X and Y? Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), so 16 = 10 + 10 + 2 Cov(X, Y). Therefore Cov(X, Y) is -2, and the correlation is -2/(√10 × √10) = -0.2.

X is a random variable taking the values 1, 2, 3, 4, 5, 6, 7, 8 and Y takes the values 1, 1, 1, 1, 1, 1, 1, 1. What is the correlation? The covariance will be 0, and we cannot find a correlation, because the standard deviation of Y is 0 and we would be dividing by zero.

Now, a supermarket has 2 vehicles, and each driver on average makes 5 trips a day with a standard deviation of 2. The drivers operate independently of each other. The average time per trip is 1 hour for driver 1 and 45 minutes for driver 2. Find the mean and standard deviation of the number of trips and of the time taken.

Each driver makes 5 trips on average, so E(X + Y) is 5 + 5 = 10. The standard deviation for each driver is 2 and there are two independent drivers, so the standard deviation of X + Y is √2 × 2, which is 2.83.
The time taken is X + 0.75Y hours, so its expected value is 5 + 0.75 × 5 = 8.75, and the standard deviation of X + 0.75Y turns out to be 2.5 when we use the equations.

Another example: during the interval in a movie theater, the audience buys popcorn and cool drinks from a shop. The joint distribution is given; we assume that a customer buys one or two popcorns along with one or two cool drinks. Find the expected value and the variance of the number of popcorns and of cool drinks, and find the correlation. For popcorn, the expected value of X is 1 × 0.6 + 2 × 0.4 = 1.4, and the variance of X is 0.24. For cool drinks, the expected value of Y is 1 × 0.3 + 2 × 0.7 = 1.7, and the variance is 0.21. The covariance is found separately as the expected value of (X - μ_X)(Y - μ_Y), which gives 0.02, and the correlation is 0.089.

With this, we come to the end of the discussion on the association between random variables.
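The closing exercises can be verified with a few lines of arithmetic. The sketch below uses only the figures quoted in the lecture; for the popcorn example it starts from the stated variances and covariance, because the full joint table is not read out. Variable names are illustrative.

```python
from math import sqrt

# Quick question: Var(X) = Var(Y) = 10 and Var(X + Y) = 16.
cov_q = (16 - 10 - 10) / 2           # from Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
corr_q = cov_q / sqrt(10 * 10)
print(cov_q, corr_q)                 # -2.0 -0.2

# Drivers: each makes 5 trips on average with standard deviation 2, independently.
mean_trips = 5 + 5
sd_trips = sqrt(2 ** 2 + 2 ** 2)     # variances add, standard deviations do not
mean_time = 1.0 * 5 + 0.75 * 5       # hours: 1 hour and 45 minutes per trip
sd_time = sqrt(1.0 ** 2 * 2 ** 2 + 0.75 ** 2 * 2 ** 2)
print(mean_trips, round(sd_trips, 2))  # 10 2.83
print(mean_time, sd_time)              # 8.75 2.5

# Popcorn and cool drinks: correlation from the stated covariance and variances.
corr_popcorn = 0.02 / sqrt(0.24 * 0.21)
print(round(corr_popcorn, 3))          # 0.089
```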