Introduction to Probability

In this lecture, we introduce probability. From now until the end of this course, we will discuss concepts in probability, then look at random variables, and then study some known distributions such as the binomial and the normal. So, let us begin the discussion on probability.
We will start with some simple, well-known results. The probability of a tossed fair coin resulting in heads is half. This is something all of us have learned: when you toss a coin, the probability of getting a head is 0.5, and the probability of getting a tail is also 0.5. Now how does that happen? When we toss the coin, we believe that there are only 2 outcomes; it is either a head or a tail. If we know that the probability of getting a head is half, then, since the sum of the probabilities has to be 1, the probability of getting a tail is also half. The other way of looking at it is to conduct a large number of experiments and generalize from them. We will see about that as we move along.
The probability of getting 1 when a die is rolled is 1 by 6. A die has 6 faces, and we assume that there are dots representing numbers: 1 dot represents 1, 2 dots represent 2, and so on. Since there are 6 faces, the numbers 1 to 6 are there, one on each face. When the die is rolled, the probability of getting a 1 is 1 by 6, because any one of the 6 faces can show up with equal probability. Now, what is the probability of getting a 2? It is the same, 1 by 6, because any one of the 6 faces can show up. These are some known results, things that we have learned right from high school.
Now, let us ask another question: what is the probability that it will rain in Chennai on 5th May? The answers to the first 2 questions, which we discussed just before, came from equally likely outcomes. With heads and tails there are 2 equally likely outcomes, and therefore each has probability half. When a die is rolled there are 6 equally likely outcomes, and therefore the probability is 1 by 6. For the rain question, we require data; otherwise the answer will be an opinion, and we do not want to make a decision based on opinion. So, we go back into the past and find out in how many years it has actually rained in Chennai on 5th May. Whatever answer we give to this question will depend on the data that we have, and therefore there is a relationship between data and probability.
Let us assume that we have collected data for 10 years. I am not going to say that these are actual data; they may not represent the correct data. Each year is recorded as n for no or y for yes. If, in these 10 pieces of data, 2 out of 10 are yes, then we say the answer is a 20 percent probability, or a probability of 0.2.
That leads us to the next question: if, instead of 10 years of data, we had taken 20 years of data, would the answer be different? To put it simply, with 20 years of data, would we have 4 yes out of 20, because we had 2 yes out of 10? We need to answer that question, and we will do that as we move along. Now, let us look at another question. Though we said the probability of getting a head is half, let us still continue the discussion on this.
Consider the probability of winning the toss. In a sample of matches, say between February 2015 and November 2015, let us say the Indian captain won the toss in 7 out of 19 matches, under 2 different captains. Now, what is the probability of India winning the toss in a cricket match? Remember that winning the toss is a little more than tossing the coin, because one person tosses the coin and the other person calls out whether it is head or tail. If the call is correct, the person who called wins the toss; otherwise, the person tossing the coin wins the toss. Also remember that, in these 19 matches, it is not necessary that the Indian captain tossed all 19 times or called all 19 times. In spite of all this, from the information that we have, the Indian captain won the toss in 7 out of 19 matches, and therefore one would say that the probability of India winning the toss is 7 by 19, which is 36.8 percent.
Now, let us ask another question: if we had taken data for 100 matches, would the answer be close to 50 percent? Based on the discussion that we had on heads and tails, we can extrapolate that, in general, the probability of winning the toss is close to 50 percent. So, if we had taken more data, would the answer be close to 50 percent? And the next question is: does the size of the sample actually matter?
Now, we will look at some definitions and slowly try to answer some of the questions that we posed. So, what is probability? The probability of an event is defined as its long-run relative frequency. Frequency counts occurrences, and it is relative to the total number of observations. The long run implies more and more observations; it could mean more time, and it could also mean a larger sample size. There is a very important concept called the law of large numbers.
The law of large numbers guarantees that this intuition is correct in ideal examples such as tossing a coin. As we keep tossing a coin and repeat the experiment a larger and larger number of times, we will observe that about 50 percent of the time it is heads and about 50 percent of the time it is tails. The statement that the relative frequency of an outcome converges to a number, the probability of the outcome, as the number of observed outcomes increases is called the law of large numbers.
Take the 7 out of 19 example, and suppose that 7 out of 19 is correct data: we looked at 19 matches, and it so happened that the Indian captain won the toss in 7 of them, so the probability of winning the toss based just on this is 36.8 percent and not 50 percent. If we had taken 100 matches, the figure would typically be much closer to 50 percent than 36.8. So, the law of large numbers is extremely important to compute and understand probability.
Many times, in our discussions and computations, we use proportions as probabilities. For example, the Indian captain won the toss in 7 out of 19 matches, so the proportion of tosses won by the Indian captain is 36.8 percent; in most of our discussions we would even say the probability is 0.368, or 36.8 percent. Proportions become probabilities under the assumption of the law of large numbers. So, whenever we substitute a proportion for a probability, we have to assume that the number of trials is sufficiently large to make that generalization.
Let us continue this discussion a little more. What is the probability that a random number generated from, say, an Excel sheet or a calculator is greater than 0.5? We did a small experiment: in 100 trials, 40 of the numbers were greater than 0.5.
But when we did 500 trials, 253 of the numbers were greater than 0.5. So, how do we answer the question: what is the probability of getting a number greater than 0.5? Is it 0.4, as the 100 trials suggest, or 0.506, as the 500 trials suggest, or 0.5? If we do 1000 trials or 10,000 trials, we find that close to 50 percent of the time the value is more than 0.5.
Now for some simple exercises, explained in the context of large numbers. The last toss was won by the Indian captain; would the Indian captain win or lose the next toss? There are times we answer this question by saying: the last toss was won by the Indian captain, and therefore it is quite likely that he will not win the toss now, so that the average becomes 0.5. It does not work that way. Each toss is a separate event, and the probability of winning or losing the toss does not change because the previous toss was won or lost. So, we have to understand the idea of large numbers: we are not making a decision based on 2 observations; we understand the probability by repeating the experiment a large number of times.
The probability of an accident happening in a day is 0.1. We did not have an accident in the last 12 days; will we definitely have one today? The answer is no; we cannot say that we will definitely have one today. The only thing we can say is that there is a 10 percent chance that there will be an accident today. It does not matter whether one accident or 2 accidents happened yesterday; that has nothing to do with it. So, one more time, we have to understand the law of large numbers and then give answers to these questions.
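The random-number experiment described above can be tried out directly. The following is a minimal Python sketch, not the spreadsheet or calculator used in the lecture; the function name `fraction_above_half` and the fixed seed are assumptions made for illustration. It counts how often a uniform random number in [0, 1) exceeds 0.5 as the number of trials grows.

```python
import random


def fraction_above_half(trials, seed=0):
    """Return the fraction of `trials` uniform random numbers that exceed 0.5."""
    rng = random.Random(seed)  # fixed seed (an assumption) so that runs are repeatable
    hits = sum(1 for _ in range(trials) if rng.random() > 0.5)
    return hits / trials


if __name__ == "__main__":
    # Small samples can land well away from 0.5; large samples settle near it.
    for n in (100, 500, 1000, 10000, 100000):
        print(n, fraction_above_half(n))
```

With a small number of trials the fraction can be noticeably different from 0.5, which is the point of the 40-out-of-100 observation; with many trials it should settle close to 0.5.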
A visitor is expected at 5 pm; it is right now 5:10 pm. Is the probability of him coming in the next minute higher than that of him coming at 5:12? Once again we have to look at this in the context of the law of large numbers and say no, the probability of the person coming is still the same. So, these 3 different situations help us understand the role of large numbers. Many times we convert proportions to probabilities based on a limited sample or a small number of trials, and we always have to keep in mind that probabilities are computed with a large number of trials or experiments.
Now, we start introducing some rules, notation, and definitions which will help us. The first definition is the sample space: the set of all possible outcomes that can happen for a given chance situation. Tossing a coin can have 2 outcomes, head and tail. If we toss the coin 2 times, there are 4 outcomes: head-head, head-tail, tail-head, and tail-tail. The number of outcomes can become very large. By definition, an event is a portion or a subset of the sample space; it is a set of outcomes. So, there can be an event where I toss the coin 2 times and get a head and a tail. There can be an event that has the numbers 1, 5, and 3, which were the outcomes when a die was rolled 3 times. There can be an event called on-time arrival of an airplane; in that example we could have 2 outcomes, the plane arrives on time or the plane arrives late, and arrival on time is one of them. Similarly, R R would mean that it rained on 2 consecutive days, and so on.
There are some important rules in probability. Every event A has a probability, denoted by P of A. The first rule is called 'something must happen': the probability that some outcome in the sample space occurs is 1.
When we assign probabilities to outcomes, we must distribute all of the probability. If the probabilities do not add up to 1, it means we have missed something, or we have double-counted, or we have made an error. The easiest example to understand is head and tail: 2 outcomes, each with a probability of half, so they add up to 1. With 2 tosses we know quickly that there are 4 outcomes: head-head, head-tail, tail-head, and tail-tail. All of them are equally likely, so each one has a probability of 1 by 4, and the probabilities add up to 1, which is what the first rule says.
Let us look at this example: a bag has 4 red balls and 3 blue balls, and one ball is picked at random. The sample space has 2 outcomes: the ball that is picked is either a blue ball or a red ball. Since there are 3 blue balls out of a total of 7, the probability of picking a blue ball is 3 by 7, the probability of picking a red ball is 4 by 7, and the sum is equal to 1.
Now, look at the second situation. The bag has 4 red balls and 3 blue balls as before. One ball is picked at random and put back. Another ball is picked at random and again put back. Since we have picked 2 balls, the sample space is blue and blue, blue and red, red and blue, and red and red. The probability of picking 2 blue balls is 3 by 7 into 3 by 7: there are 3 blue balls out of 7, so 3 by 7, and since the ball has been put back, the probability stays at 3 by 7, giving 9 by 49. The probability of blue and then red is 3 by 7 into 4 by 7, which is 12 by 49; first picking red and then picking blue is also 12 by 49. And red and red is 4 by 7 into 4 by 7, which is 16 by 49. If we add all these, we get 9 plus 12 plus 12 plus 16, which is 49 by 49, equal to 1.
Now, look at a third scenario. Again the bag has 4 red balls and 3 blue balls. One ball is picked at random, and it is not put back. Another ball is picked at random, and again it is not put back.
Again the sample space is blue and blue, blue and red, red and blue, and red and red. The probability of blue and blue is 3 by 7 into 2 by 6. The 2 by 6 comes because we pick the first ball, and it is blue with a probability of 3 by 7. If this ball is blue and it is not put back, then there are 2 blue balls remaining out of the 6 balls inside, and therefore the probability of picking a second blue ball is 2 by 6. Therefore, this is 3 by 7 into 2 by 6, which is 6 by 42, which is 1 by 7. Now, blue and red will be 3 by 7 into 4 by 6: the first ball is blue with probability 3 by 7, it is not put back, so there are 6 balls remaining, out of which 4 are red, so 4 by 6. That gives 12 by 42, which is 2 by 7. Red and blue is 4 by 7 into 3 by 6, which is also 2 by 7. And red and red is 4 by 7 into 3 by 6, which is also 2 by 7. Therefore, the total is 1 by 7 plus 2 by 7 plus 2 by 7 plus 2 by 7, which is equal to 1. So, we now realize that when we write the sample space as the set of all possible outcomes and compute probabilities for these outcomes, they add up to 1.
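The two-draw calculations above, with and without putting the ball back, can be checked with exact fractions. This is a small illustrative sketch; the function name `two_draw_probabilities` and its parameters are assumptions introduced here, not part of the lecture.

```python
from fractions import Fraction
from itertools import product


def two_draw_probabilities(red=4, blue=3, replace=True):
    """Return {(first, second): probability} for two draws from the bag."""
    probs = {}
    for first, second in product(("red", "blue"), repeat=2):
        counts = {"red": red, "blue": blue}
        total = red + blue
        p_first = Fraction(counts[first], total)
        if not replace:
            # The first ball stays out of the bag for the second draw.
            counts[first] -= 1
            total -= 1
        p_second = Fraction(counts[second], total)
        probs[(first, second)] = p_first * p_second
    return probs


if __name__ == "__main__":
    for replace in (True, False):
        probs = two_draw_probabilities(replace=replace)
        print("with replacement" if replace else "without replacement")
        for outcome, p in probs.items():
            print(" ", outcome, p)
        print("  sum =", sum(probs.values()))
```

In both scenarios the four probabilities sum to exactly 1, which is the first rule applied to the full sample space.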

We continue our discussion on the basic concepts of probability. In the previous lecture, we looked at the sample space and events, and we saw the first rule: the probabilities of all the outcomes in the sample space add up to 1.
Now, we look at the second rule, which says that for every event A, the probability of A is between 0 and 1. So, P of A is greater than or equal to 0 and less than or equal to 1. The probability cannot be bigger than 1, and it cannot be less than 0. Sometimes, if a person does not lie, we say the probability of that person lying is minus 1; in conversation, we say that to stress the point that this person will not lie at all. Or if someone asks what the probability of the sun rising in the east is, we sometimes say 1.1, to stress that the sun will not rise in any other direction. In all these instances we have to understand that the actual answers are 0 and 1 respectively, and not any number less than 0 or greater than 1. So, what is the probability of the sun rising in the west? It is 0, and not anything less than that.
Let us continue. Events that have no outcomes in common are called disjoint events, sometimes also called mutually exclusive events. We define the union of 2 events A and B as the collection of outcomes in A or in B or in both, written as A or B. The third rule is the addition rule for disjoint events: the probability of a union of disjoint events is the sum of their probabilities. So, if A and B are mutually exclusive, P of A or B is P of A plus P of B.
Now, let us look at examples. A die is rolled once; what is the probability of getting an odd number? The probability of getting an odd number is the probability of getting either 1 or 3 or 5.
These are disjoint, or mutually exclusive; therefore, the probability of getting a 1 or a 3 or a 5 is the probability of getting a 1 plus the probability of getting a 3 plus the probability of getting a 5. In each of these cases the probability is 1 by 6, because it is a die, and therefore the answer is 1 by 6 plus 1 by 6 plus 1 by 6, which is half. Of course, there is another way of looking at it: you get either an odd number or an even number, and there are 3 of each, so the answer is half.
Now, look at another example. A bag has 2 circular plates with numbers 3 and 6, and 3 square plates with numbers 1, 2, and 5. One plate is drawn at random. What is the probability that it is a circle or it has an odd number? There are 5 plates. Let C and S represent the circular and the square plates respectively. So, we have C3, a circular plate with the number 3; C6, a circular plate with the number 6; and the 3 square plates S1, S2, and S5. Out of these 5, C3, C6, S1, and S5 meet the requirement of a circle or an odd number. So, 4 out of 5 meet the requirement, and therefore the probability of C or odd is 4 by 5.
Now look carefully: the circles are C3 and C6, while the odd numbers are C3, S1, and S5. These events are not disjoint, because C3 is common to both. Therefore, the probability of C or odd is not equal to the probability of C plus the probability of odd. Once again, in this example there are 5 plates, out of which 4 meet our requirement, and therefore the probability is 4 by 5. But if we look at the two events separately, there are 2 circular plates and 3 plates with an odd number, and there is one that is common, which is C3.
Therefore, the probability of circle or odd is not equal to the probability of circle plus the probability of odd.
Now let us look at the fourth rule, which is called the complement rule. The probability of an event is 1 minus the probability of its complement. So, P of A is equal to 1 minus P of A complement. The probability that it will rain today is 0.3; what is the probability that it will not rain today? It is 0.7. The probability that the stock price will go up tomorrow is 0.25; what is the probability that it will go down tomorrow? One could say 0.75, but since the price could also remain the same, one could say it is slightly less than 0.75.
Another example: the probability of India winning the world cup is 0.6; what is the probability of India not winning the world cup? It is 1 minus 0.6, which is 0.4. The probability of Jim getting an A grade is 0.5; what is the probability that he gets B or C or D? There are only 4 grades possible for the course. The probability of getting an A grade is 0.5, so the probability of getting a grade that is not an A is also 0.5. Since the person can get only one grade, B, C, and D are disjoint.
Rule 5 is the general addition rule: for 2 events A and B, the probability that one or the other occurs is the sum of the individual probabilities minus the probability of the intersection. So, P of A or B is equal to P of A plus P of B minus P of A and B; that is, P of A union B is equal to P of A plus P of B minus P of A intersection B. The Venn diagram shows what we are discussing. If A and B are disjoint, then P of A and B is 0, and therefore P of A or B is equal to P of A plus P of B. So, the rule that we saw for disjoint events comes under the general addition rule.
Example: a supermarket has movie theaters, garment shops, and restaurants on 4 floors. Part of the data is given below.
So, we have a movie theater on the first floor, a restaurant on the first floor, a restaurant on the second floor, a garment shop on the second floor, another movie theater on the second floor, another restaurant on the third floor, and a garment shop on the third floor. What is the probability that the next person is going to a garment shop or to the second floor?
If the person is going to a garment shop, the person can go to the garment shop on the second floor or the one on the third floor. If the person is going to the second floor, the person could be going to the restaurant, the garment shop, or the movie theater. Let A represent the garment shop and B represent the second floor, and let us assume that people are equally likely to go to each one of the 7 places. P of A is 2 by 7, because there are 7 places and 2 of them are garment shops. P of B is 3 by 7, because there are 3 places on the second floor. P of A intersection B is 1 by 7, because there is one garment shop on the second floor. Therefore, P of A union B is equal to P of A plus P of B minus P of A intersection B, so 2 by 7 plus 3 by 7 minus 1 by 7 is equal to 4 by 7.
We can also do this slightly differently. Let M, R, and G represent movie theater, restaurant, and garment shop, and let F, S, and T represent the first, second, and third floors. So, there are 7 outcomes: MF, RF, RS, GS, MS, RT, GT. Out of these, four involve G or S, that is, garment shop or second floor: RS, GS, MS, and GT. Therefore, the probability is 4 by 7. So, the same problem can be done in multiple ways. Sometimes we follow the sample space way, and sometimes we use the formulae; as we move along we will have to understand both ways of doing it and, more importantly, how they are related to each other.
Example: a box contains 3 blue balls and 4 green balls.
4 balls are drawn randomly; what is the probability that 2 blue and 2 green balls are drawn? The first way to do it is to list the orders in which this can happen: blue, blue, green, green; blue, green, blue, green; blue, green, green, blue; green, green, blue, blue; and so on. Take the probability of drawing blue, blue, green, green, assuming that the balls are not replaced: it is 3 by 7 into 2 by 6 into 4 by 5 into 3 by 4, which is 3 by 35. There are 6 such orders, and all 6 have the same probability of 3 by 35. So, the answer is 6 times 3 by 35, which is 18 by 35, which is 0.514.
Now we do this in another way. 2 blue balls from 3 blue balls can be chosen in 3 C 2, which is 3 ways. 2 green balls from 4 green balls can be chosen in 4 C 2, which is 6 ways. 4 balls out of 7 can be chosen in 7 C 4, which is equal to 7 C 3, which is 7 into 6 into 5 by 1 into 2 into 3, which is 35. Therefore, the probability is 3 into 6 by 35, which is 18 by 35, which is 0.514. So, knowledge of permutations and combinations helps in counting the outcomes.
Sometimes we do it the first way, where we compute individual probabilities and multiply. The same thing can be done slightly differently by counting the number of ways of doing something and then computing. So, this is another thing that we have to understand as we progress in our study of probability: there are multiple ways of looking at and solving a problem. The concepts are all the same; we have to learn how each method works and how they relate to each other.
Now, let us look at another example. In a card game, a pack of 52 cards is dealt to 4 players, so each player gets 13 cards. There are 4 aces in a pack of 52 cards. What is the probability that each player gets exactly 1 ace?
This is a slightly more involved example, where some of the things that we have learned till now will be put to use. Let us see how we solve it. Take the first person. The ace should come in one out of the 13 picks, because we assume that somebody is dealing the cards and this person gets one card per pick, 13 times. Look at the case where the ace is in pick one and the remaining 12 picks do not have an ace. The probability of an ace in position one is 4 by 52, because there are 52 cards of which 4 are aces. If you leave out the 4 aces, there are 48 non-aces. So, we multiply by 48 by 51 into 47 by 50 and so on; the 48 by 51 comes because the second pick is out of the 51 cards that are remaining, and it could be any of the 48 that are not aces. Remember, this is equivalent to the ball not being replaced: the card does not go back to the deck, and the whole deck is dealt to these 4 people. So, for the 12 remaining picks we start with 48 by 51, 47 by 50, and go on till 37 by 40. Therefore, this probability is 0.03376.
So, for the first person, the probability of getting an ace in pick one and not getting an ace in the remaining 12 picks is 0.03376. Now, going back to the problem, we want this person to get exactly one ace. This person can get that ace in the first pick, the second pick, and so on up to the 13th pick. It can come in any one of the 13 positions, and we can quickly realize that the probability is the same for each position. Therefore, the total probability is 13 into 0.03376, which is 0.4388. This is the probability that the first person gets exactly one ace in the 13 picks.
Now, for player 2 we have 3 by 39 into 36 by 38 and so on. How do we get these numbers? What normally happens when 4 people play cards is that we deal one card to each person in turn and do it 13 times.
Right now we are not assuming that; we are going to assume that the pack is well shuffled, and that the first person gets the first 13 cards, the second person gets the next 13 cards, and so on. Therefore, for the second person, there are only 39 cards remaining. Since the first person has got exactly one ace, 3 aces remain in these 39 cards, and therefore 3 by 39 is the probability of this person getting an ace in the first pick.
Now, the probability of not getting an ace in the remaining picks is 36 by 38 into 35 by 37 and so on; with every pick the denominator reduces by 1, because one card has been given out. The 36 by 38 comes because the second player has already got an ace with that 3 by 39, so there are only 2 aces and 36 non-aces among the 38 remaining cards. So, we have 36 by 38 and so on till 25 by 27, and together with the 3 by 39 the product is 0.03556. Again, this ace can come in any one of the 13 positions. Therefore, it is 13 times 0.03556, which is 0.4623.
For player 3, both players 1 and 2 have got their 13 cards, so only 26 cards remain, and since players 1 and 2 have got one ace each, 2 aces remain. So, we have 2 by 26, and then, out of the 25 cards left, 1 ace and 24 non-aces remain; therefore, we get 24 by 25 into 23 by 24 and so on, to get 0.04. This ace can come in 13 positions; therefore, 0.04 into 13, which is 0.52.
For player 4 the answer is actually 1, because player 4 does not have any choice: player 4 has to take the remaining 13 cards, which contain exactly one ace. The probability of the ace in a given position is 1 by 13, and there are 13 positions, so 13 into 1 by 13, which is 1. Therefore, the probability that each one gets one ace is 0.4388 into 0.4623 into 0.52 into 1, which is about 0.105. So, this is a more involved example.
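The player-by-player calculation above can be reproduced with exact fractions. This is a sketch under the same assumption as the lecture (each player in turn receives 13 consecutive cards from a well-shuffled pack); the function names `exactly_one_ace` and `each_player_gets_one_ace` are hypothetical names chosen for illustration.

```python
from fractions import Fraction


def exactly_one_ace(cards_left, aces_left, hand=13):
    """Probability that a hand of `hand` cards dealt from `cards_left` cards,
    of which `aces_left` are aces, contains exactly one ace."""
    # Ace on the first pick ...
    p = Fraction(aces_left, cards_left)
    non_aces = cards_left - aces_left
    # ... followed by a non-ace on each of the remaining hand - 1 picks.
    for k in range(hand - 1):
        p *= Fraction(non_aces - k, cards_left - 1 - k)
    # The ace can be in any of the `hand` positions with the same probability.
    return hand * p


def each_player_gets_one_ace():
    """Multiply the four per-player probabilities, dealing 13 cards at a time."""
    total = Fraction(1)
    for player in range(4):
        total *= exactly_one_ace(52 - 13 * player, 4 - player)
    return total


if __name__ == "__main__":
    for player in range(4):
        p = exactly_one_ace(52 - 13 * player, 4 - player)
        print("player", player + 1, float(p))
    print("all four:", float(each_player_gets_one_ace()))
```

Working with `Fraction` avoids the rounding that accumulates when the four decimal values are multiplied by hand.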
It is very common to study this kind of example, whether tossing a coin, rolling a die, or picking something from a pack of cards, whenever we study probability. What we can see is that the card problems become a little more complicated than the die problems or the ball-picking problems. We will look at some of these interesting problems as we move along. What is also required is to understand what happens and in what sequence or order it happens; once we understand that, the whole thing falls into place. So, about 0.105 is the probability that each of the 4 players gets exactly one ace. Please note that we have not qualified it by saying that the first person gets the ace of spades and so on; it becomes even more involved to do that. Each person getting one ace happens about 10 percent of the time.
Now, let us look at the same problem done in a completely different manner, using permutations and combinations. In a card game, a pack of 52 cards is dealt to 4 players; what is the probability that each player gets one ace? The computation we did is equal to 52 into 39 into 26 into 13 into 48 factorial by 52 factorial. Now, 52 cards can be divided into 4 groups of 13 in this many ways: 52 factorial by 13 factorial into 39 factorial, into 39 factorial by 13 factorial into 26 factorial, into 26 factorial by 13 factorial into 13 factorial, which simplifies to 52 factorial divided by the fourth power of 13 factorial. Similarly, the 48 non-aces can be divided into 4 groups of 12 in 48 factorial divided by the fourth power of 12 factorial ways, and the 4 aces can be given one to each player in 4 factorial ways. Therefore, the probability is 4 factorial into 48 factorial into the fourth power of 13 factorial, divided by 52 factorial into the fourth power of 12 factorial, which is about 0.105. This looks a little more complicated, but one has to understand how it actually happens.
So, it is a lot easier to look at this problem using the previous method, where we take one person, say that this person gets an ace in the first pick, the second pick, and so on up to the 13th pick, find the probability, and then move to the second person and so on. The permutation and combination approach involves a lot more factorials.
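The counting version of the card problem can also be checked numerically. The sketch below is illustrative only; `ways_to_split` and `one_ace_each` are hypothetical helper names, and it uses Python's `math.comb` to count the ways of dealing cards into 4 equal hands.

```python
from fractions import Fraction
from math import comb, factorial


def ways_to_split(cards, hands=4):
    """Number of ways to deal `cards` cards into `hands` labelled hands of equal size."""
    size = cards // hands
    ways = 1
    remaining = cards
    for _ in range(hands):
        ways *= comb(remaining, size)  # choose the next hand from what is left
        remaining -= size
    return ways


def one_ace_each():
    """Probability that each of 4 players, dealt 13 cards from 52, holds exactly one ace."""
    # 4! ways to give one ace to each player, times the ways to deal the
    # 48 non-aces into 4 hands of 12.
    favourable = factorial(4) * ways_to_split(48)
    return Fraction(favourable, ways_to_split(52))


if __name__ == "__main__":
    p = one_ace_each()
    print(p, float(p))
```

The result agrees with 52 into 39 into 26 into 13 into 48 factorial by 52 factorial, so the two approaches give the same probability.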