In this video, we visit our second notion of a measure of randomness. Namely, we want to find the least number of bits L such that, if I give you a uniform random variable on L bits, that is, L independent unbiased coin tosses, then you can find a function f that generates a given distribution P from this uniform variable U.
So, the distribution of f(U) is close to the given distribution P; it is close in total variation distance, within epsilon. So, what is the minimum number of bits required? This is the number of bits required to generate this random variable, and we will think of it as a measure of the randomness in this random variable. This is the complementary definition to the one we saw in the previous video. So, this problem is very much related to the one we saw last time, and rather than dwell on that connection, I will directly jump into the scheme.
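This target quantity can be written down compactly; here is a sketch in notation (where U_L denotes the uniform random variable on L bits and d_TV the total variation distance):

```latex
L^{*}(P,\varepsilon)
  \;=\; \min\Bigl\{\, L \;:\; \exists\, f\colon \{0,1\}^{L}\to\mathcal{X}
  \ \text{such that}\ d_{\mathrm{TV}}\bigl(P_{f(U_L)},\,P\bigr)\le\varepsilon \,\Bigr\},
\qquad
d_{\mathrm{TV}}(P,Q)=\tfrac{1}{2}\sum_{x}\bigl|P(x)-Q(x)\bigr|.
```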
The scheme is very much like the one we saw last time. Last time, we were given a random variable with a given distribution and we wanted to generate a uniform random variable out of it. What we did was divide the domain into small parts such that each part had roughly the same probability; some mass was left over, and that we dumped into a final part, which we called X_{M+1}.
But the first M parts were balanced: they had roughly the same probability. Therefore, if you declare different values on these different parts, the resulting random variable is almost uniform, and we showed this formally. The amazing part about the scheme was the number of bits you were able to extract: it was the entropy of the random variable.
So, now we will look at a complementary scheme where we are given a uniform random variable and we want to generate a PMF P out of it. This is something which may be very important in practice. You may have seen similar questions in a probability course, where you use the inverse of the cumulative distribution function to do this; but what we will do now is very different, because we are accounting for the number of random bits used.
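For contrast, here is a minimal sketch of that classical inverse-CDF approach (the distribution `pmf` is a hypothetical example); note that it consumes a real-valued uniform sample rather than a counted number of random bits:

```python
import random

def inverse_cdf_sample(pmf, u):
    """Return the first outcome whose cumulative probability exceeds u,
    where u is a sample from Uniform[0, 1)."""
    cum = 0.0
    for x, p in pmf:
        cum += p
        if u < cum:
            return x
    return pmf[-1][0]  # guard against floating-point round-off

pmf = [("a", 0.5), ("b", 0.3), ("c", 0.2)]  # hypothetical target PMF
x = inverse_cdf_sample(pmf, random.random())
```

The scheme below replaces the real number u with a uniform random variable on a finite, carefully chosen number of bits.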
So, here is the scheme. We define a new set T_lambda, different from the previous one. Unlike the previous one, where lambda was a large-probability lower bound on minus log P(x), here P is the distribution you want to generate, and lambda is a large-probability upper bound on minus log P(x). So, T_lambda is the set of x for which minus log P(x) is at most lambda, and we assume that this set has large probability.
So, you can very easily check that for each x in T_lambda, P(x) is greater than or equal to 2 to the power minus lambda. What we do is set M to be greater than 2 to the power lambda, the same lambda, and consider a partition of {1, ..., M}. Since each probability is greater than 2 to the power minus lambda, we can say something about the cardinality of the set.
If you sum the probabilities of all the elements of the set, that sum must be less than or equal to 1, but each element has probability at least 2 to the power minus lambda. Therefore, there are at most 2 to the power lambda elements in this set. So, consider a partition Y_1, ..., Y_{M'+1} of the set {1, ..., M}, where M' is equal to the cardinality of T_lambda.
Remember, the cardinality of T_lambda is less than or equal to 2 to the power lambda. Why? Because each element has probability at least 2 to the power minus lambda and the probabilities sum to at most 1, so the number of elements in T_lambda cannot exceed 2 to the power lambda. So, this M' is less than M by our assumption, since we assumed M is greater than 2 to the power lambda.
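As a quick sanity check, here is a small sketch in code of T_lambda and the cardinality bound (the PMF is a hypothetical example):

```python
def t_lambda(pmf, lam):
    """T_lambda = { x : -log2 P(x) <= lam }, equivalently P(x) >= 2**(-lam)."""
    return {x for x, p in pmf.items() if p >= 2.0 ** (-lam)}

pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}  # hypothetical PMF
lam = 2
T = t_lambda(pmf, lam)
# Each element of T has probability >= 2**(-lam), and the probabilities
# sum to at most 1, so |T| can never exceed 2**lam.
assert len(T) <= 2 ** lam
```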
Therefore, you can form this partition of {1, ..., M} of size M' plus 1. What do we want out of this partition? Remember, there are M' elements in T_lambda; what we want is that, for each element x_i, the part Y_i has probability under the uniform distribution between P(x_i) and P(x_i) plus 1/M.
So, the question is, why, or rather how, can we do this? Take each element; we know that each element has probability greater than 2 to the power minus lambda, and 1/M is smaller than that. The probability of a part under the uniform distribution is just the number of elements in that part divided by M, so you simply keep assigning values of {1, ..., M} to the part for x_i until its probability lands between P(x_i) and P(x_i) plus 1/M.
So, you can convince yourself that a partition Y_1, ..., Y_{M'+1} of this kind can be designed; it is just very simple bookkeeping. Here U denotes the uniform distribution on {1, ..., M}. So, this is what we have done: we have constructed these parts, just one more than the number of elements in T_lambda, and each of the first M' parts has probability between the probability of the corresponding element and that probability plus 1/M.
So, we roughly have parts whose probabilities match the probabilities of these elements. Now you can imagine what we are trying to do: whenever we see something in the part Y_i, we will declare x_i, because we want to generate something from the PMF P. It is off by at most 1/M, but that is the error we have to account for.
And the last element, x_{M'+1}, is some arbitrary element outside T_lambda that you can choose. Now, what is the function that we want to use to generate this PMF? Well, it is very simple: when you see something in Y_i, you declare x_i; that is the function f.
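Putting the pieces together, here is a hedged sketch of the whole construction in code. The inputs `pmf`, `lam`, and `eps` are illustrative, the choice M = 2^(lambda + log(1/epsilon)) anticipates the choice made later in the lecture, and the clipping in the loop is a pragmatic guard not spelled out here:

```python
import math

def build_generator(pmf, lam, eps):
    """Sketch of the scheme: set M = 2**(lam + ceil(log2(1/eps))), give each
    x_i in T_lambda a part Y_i of ceil(M * P(x_i)) consecutive values of
    {0, ..., M-1}, and map every leftover value to one arbitrary default
    element (standing in for x_{M'+1})."""
    M = 2 ** (lam + math.ceil(math.log2(1.0 / eps)))
    support = [x for x, p in pmf.items() if p >= 2.0 ** (-lam)]  # T_lambda
    table = {}
    idx = 0
    for x in support:
        size = min(math.ceil(M * pmf[x]), M - idx)  # clip if values run out
        for u in range(idx, idx + size):
            table[u] = x
        idx += size
    default = support[-1]  # arbitrary choice for the final leftover part
    return M, lambda u: table.get(u, default)

pmf = {"a": 0.5, "b": 0.5}  # hypothetical PMF
M, f = build_generator(pmf, lam=1, eps=0.5)
# Feed f a uniform value in {0, ..., M-1} to generate one sample from P.
```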
So, let us analyze the total variation distance between the generated output and the target distribution; the generated distribution and the target distribution P should be close, and that is what we want to show. For each element x in T_lambda, we compare the probability that we output that element with the probability of that element under P, and we claim that these differences, summed over all elements, are small. Note that this function never declares anything outside T_lambda,
except x_{M'+1}. That contribution, and the contribution of the final part Y_{M'+1}, we also account for in the error; you can verify this bound. So, we will bound each of these terms.
So, for the first term, for each of these elements, note that by construction the uniform probability of Y_i and P(x_i) differ by at most 1/M; that is how we constructed the parts Y_i. So, each part contributes at most 1/M, and there are M' elements in T_lambda, which gives M'/M.
The first term is easy; it is the third term, the last one, that requires some thought. What we have seen is that, for each x_i, P(x_i) is at most the uniform probability of Y_i; that is the construction of this partition. We already have the bound for the first term, and now we sum this inequality over all i.
Summing, you get P(T_lambda) on one side, and on the other side you get the total uniform probability of the first M' parts. But remember that we have partitioned the whole of {1, ..., M} into just these M' plus 1 parts, so that total is equal to 1 minus the uniform probability of the last part, correct? Rearranging, the uniform probability of the last part is at most 1 minus P(T_lambda), which is the probability of T_lambda complement, and by our assumption this is at most epsilon.
So, what we have shown is that the total variation distance collects one contribution of M'/M, one contribution of epsilon from T_lambda complement, and one more such epsilon term; you sum them and divide by 2. When you divide by 2 and substitute, you get the bound. I will put a question here for you to think about; maybe you can pause and think about it, because I will answer it right away.
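To make the error accounting concrete, here is a small sketch computing the total variation distance for a hypothetical target and a matching partition-induced distribution (M = 8; the part sizes 4, 3, 1 are illustrative, with the last one clipped so the sizes fit within M):

```python
def tv_distance(p, q):
    """Total variation distance: half the L1 distance between two PMFs."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Hypothetical target P and the distribution induced by parts of sizes
# 4, 3, 1 out of M = 8 uniform values.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 4 / 8, "b": 3 / 8, "c": 1 / 8}
d = tv_distance(p, q)  # each per-element gap is at most a few times 1/M
```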
So, remember that our construction tells us that each element in T_lambda has probability at least 2 to the power minus lambda, and from that we verified the cardinality bound earlier; let me repeat the proof. If this is true for all x in T_lambda, then if you add the probabilities of all the elements in T_lambda, this sum must be at least the cardinality of T_lambda times 2 to the power minus lambda. But this sum must be less than or equal to 1.
Because probabilities sum to at most 1, the cardinality of T_lambda must be at most 2 to the power lambda. So, you get this bound, and you can choose any M of your choice; we choose M to be 2 to the power lambda divided by epsilon, so that M'/M is at most epsilon. Substituting this choice of M, we get that the total variation distance is less than or equal to 3 epsilon by 2. And log M is the number of bits, the length of the uniform random variable that we used.
So, what we have shown here is the following theorem. Suppose that you can find a lambda for which minus log P(x) is less than or equal to lambda with probability greater than 1 minus epsilon; so it is a large-probability upper bound on minus log P(x). This is the quantity we have been playing with this week and also last week. So, suppose that is true.
This lambda is related to your PMF P; the lambda which gives this bound is associated with the PMF P you want to generate. Then, if you use L equal to lambda plus log 1 by epsilon bits, which is just log M, and you generate a uniform random variable of this length, that is, these many random bits, then you can find a function f, the one we specified above, such that the total variation distance between the distribution of f(U) and P is at most 3 epsilon by 2.
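In symbols, the statement above reads (a sketch, using the notation of this lecture):

```latex
\textbf{Theorem (sketch).}\quad
\text{If }\ \Pr\bigl[-\log P(X)\le\lambda\bigr]\;\ge\;1-\varepsilon,
\ \text{then with}\ L=\lambda+\log\tfrac{1}{\varepsilon}\ \text{uniform bits}
\ \text{there is a function } f \text{ such that}\
d_{\mathrm{TV}}\bigl(P_{f(U_L)},\,P\bigr)\;\le\;\tfrac{3\varepsilon}{2}.
```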
So, the number of bits needed is roughly lambda. Now, once again, what is a good estimate for lambda? Earlier we needed a large-probability lower bound on minus log P(x); now we need a large-probability upper bound on it. Well, a good estimate is the expected value of this quantity, which is the