If I had a way of hiding this slide, I could immediately ask a quiz question and see whether you remember: where will you find the 5' cap? At the very beginning of the 5' UTR. And what is the relationship of the transcription start site to the exons and introns? The 5' UTR can very well start at the transcription start site, so the start site can be in an exon too. Alright, we will do more quizzes on this later. For now we will continue. Today we are going to focus mostly on molecular biology techniques, some of which you may have learned in other courses and some of which may be new, because some are whole-organism-based techniques that you may not have come across elsewhere. Essentially, the main focus is the techniques by which we try to understand differential gene expression. That is going to be the focus of this lecture as well as the next one. (Refer Slide Time: 01:43) So we continue refreshing our memory of eukaryotic gene structure. The previous slide was a cartoon; this is the actual sequence of the beta-globin gene. You have the upstream promoter elements, like the TATA box and this -70 sequence. These are present in most promoters; that is why they are shown as part of the core structure here. This is the sequence that signals cap addition. The colored regions are exons and the grey ones are introns. You see that the 5' UTR is part of an exon, and there you have the cap site, then the start codon ATG, then the nucleotide sequence with the translated amino acid sequence. Then you have the beginning of an intron and the end of an intron, and so on. At the end you have the stop codon, and then the 3' UTR. Then you have AATAAA, the canonical poly-A signal, although variants of it also work as polyadenylation signals. They also help in identifying the cleavage site.
Cleavage here means the following: the pre-mRNA, or nascent transcript, can be longer at the 3' end than the mature, poly-A-tailed mRNA in the cytoplasm. Cleavage happens at the 3' end, so some of the transcribed RNA is removed, and a poly-A tail is added to the remainder. The machinery that does this is a major, highly regulated complex; it is as big as a ribosome. Another such major complex is the spliceosome, the splicing machinery. So these are the major complexes that operate here. (Refer Slide Time: 03:45) Next, we look at transcription itself, which is the first step here. As soon as the nascent mRNA, or nuclear RNA, comes out, the cap is added: the 7-methylguanosine (m7G) cap. At that stage, splicing is yet to happen; that is what I want to point out. These additions help in protecting the RNA. I am not sure why that step is so fast, but these additions happen at both ends before splicing starts. Then you have processing, which is essentially splicing. (Refer Slide Time: 04:28) Then you have the messenger RNA, or mature mRNA, which comes into the cytoplasm where it gets translated, and then you have the protein chain. That has to undergo two things: proper folding to the native conformation, and post-translational modification. Here, for example, the prosthetic group (heme) gets added. This is a four-subunit protein, a heterotetramer: two beta-globin and two alpha-globin chains come together to make the functional molecule. All these steps put together are what you call gene expression. So do not assume that only transcription is gene expression; everything up to the protein level, the whole thing, is gene expression. When you talk about a gene, the definition of a gene is a biological function, that is, the phenotype, and the protein is responsible for it.
So we do not distinguish these steps, starting from the modifications of chromatin structure, such as the heterochromatin-to-euchromatin transition, which happens through removal of methyl groups and addition of acetyl groups and specific methyl groups on the H3 tail, all the way to the post-translationally modified active or inactive protein. All of these steps together are called gene expression; each one is one step in gene expression. So now we know what a gene is and what its final functional form is. (Refer Slide Time: 06:13) Now we are again refreshing some other steps; we are still not getting into differential gene expression. For that, we first want to familiarize ourselves with what happens, so that we know where regulation can happen. The first thing we are going to look at is transcription initiation. This cartoon goes into some detail, but the summary is that a set of proteins has to bind in a specific order to enable RNA polymerase to start transcription. This initial assembly, and the recruitment of RNA polymerase to the promoter, is what you call transcription initiation. Here you have TFIID, transcription factor II D, which has to bind first. It binds the TATA box and then recruits TFIIA; then you have TFIIB and TFIIH, these clamp-like structures. (Refer Slide Time: 07:14) They add on, and only then is Pol II recruited, with TFIIE and TFIIF already bound to it. When it binds, this carboxy-terminal domain, the CTD, plays a significant role in regulation. It is bound to TFIID, the main molecule. (Refer Slide Time: 07:39) So everything is now set, but it is not yet a go. That requires phosphorylation of the CTD. So, to phosphorylate or not to phosphorylate is the step there. If phosphorylation does not happen, transcription will not be initiated, although everything has bound.
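To make the "everything has bound, but it is not yet a go" point concrete, here is a minimal sketch of initiation as an ordered checklist. It follows the order given in this lecture, with the factors written in standard TFII-style names; the function name, the state labels, and the list representation are illustrative assumptions, not a model of the real kinetics.

```python
# Assembly order as described in this lecture (a simplification).
ASSEMBLY_ORDER = ["TFIID", "TFIIA", "TFIIB", "TFIIH", "Pol II (with TFIIE, TFIIF)"]


def initiation_state(bound, ctd_phosphorylated):
    """Classify a promoter given the factors bound so far, in binding order.

    bound: list of factor names in the order they were added.
    ctd_phosphorylated: True once the Pol II CTD has been phosphorylated.
    """
    # Factors must arrive in the prescribed order.
    if bound != ASSEMBLY_ORDER[:len(bound)]:
        return "invalid assembly order"
    if len(bound) < len(ASSEMBLY_ORDER):
        return "assembling"
    # Fully assembled: CTD phosphorylation is the go / no-go switch.
    if ctd_phosphorylated:
        return "released for elongation"
    return "assembled, waiting for CTD phosphorylation"


print(initiation_state(["TFIID", "TFIIA"], False))
print(initiation_state(ASSEMBLY_ORDER, False))
print(initiation_state(ASSEMBLY_ORDER, True))
```

The only state that leads to transcription is the last one, which is why the phosphorylation step is a natural point of regulation.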
So that is a step where regulation often happens. Certain serine residues are critical for this phosphorylation. Developmental biologists often use antibodies specific to those phosphorylated serines, that is, phospho-CTD-specific antibodies, to measure whether transcription is happening in a given nucleus or not. With those phospho-specific antibodies, if you do not detect a signal, there is probably no mRNA transcription happening. Remember, we are talking only about mRNA, because these antibodies are Pol II specific; other transcription could still be happening. Once Pol II is phosphorylated, it is released from TFIID and is ready to elongate. So there is initiation and then elongation; this is the initiation step, and this is elongation. Even during elongation, the polymerase can sometimes pause. We are not going into those details, but it is enough to know that this is a potential step for regulation. (Refer Slide Time: 09:17) Alright, so now we are done with those basics. We are now going to look at a series of experiments that help us, first, to identify sequences in the promoter region, as well as elsewhere in the vicinity of the gene, that could be contributing to transcription; and second, to find transcription factors, that is, trans-acting factors such as proteins. The DNA sequence is called cis because it is in the same molecule as the open reading frame. The protein is encoded elsewhere and comes as a separate molecule, different from the piece of DNA where you have the gene; therefore it is called trans. So when you say trans-acting factors, you are talking about factors other than the gene's own sequence that influence that gene's expression. This next assay is very commonly used; people use it in multiple contexts where they want to test a nucleic acid-protein interaction.
Right now, we are looking at DNA-protein interaction, to find the promoter elements that may interact with a specific transcription factor. These transcription factors, shown in blue, aid in recruiting RNA Pol II to the promoter. The factors listed and named here are present for every gene; these are the core transcription factors, and there are also gene-specific transcription factors that we will learn about later. So how do we test whether a given trans-acting factor interacts with a given DNA element or not? To test that, we use an experiment called the gel shift assay, gel mobility shift assay, or gel retardation assay; there are multiple names for it. The most commonly used term is electrophoretic mobility shift assay, EMSA. Some labs pronounce it as a word, "EMSA", some spell it out, "E-M-S-A", and some do not use either and just say mobility shift or gel shift. It is a straightforward technique. You take a radiolabeled version of the DNA fragment that you want to test for interaction with a given protein; it is shown here in whatever color comes up on your screen, which for me looks purple. Then you incubate the DNA with the protein, which may be purified protein or a cell lysate in which the protein may be present, and then you run it on a gel. If the protein binds to that DNA fragment, the complex is bigger than the nucleic acid alone, so its mobility is reduced. Since you see this in the gel as a shift, an upward shift, you call it a gel mobility shift assay; or, since binding retards the mobility of the nucleic acid, you call it a gel retardation assay. All those names are valid. So that is how you see it. In this particular case, Pax6 is the transcription factor, about which we will see more later.
Without the protein, the free nucleic acid is what we call the free probe. For a given time under given gel conditions, for example gel pore size, the free probe moves a certain distance. When you add the transcription factor that binds this particular DNA fragment, its mobility is shifted, and since it is radiolabeled, you can detect it by autoradiography. So this is a very versatile and very useful experiment. In our lab, we primarily use it for finding RNA-binding proteins that interact with the 3' UTR and regulate translation. That is the context in which we have used it; it is useful for nucleic acid-protein interactions in different settings. In lane 3 you have Pax6 added, so the band shifts. In lane 4 Pax6 is also added; that lane is a control experiment. The shift by itself has two alternative explanations: the binding may be sequence-specific, or the protein may bind any nucleic acid non-specifically. To test that, you add a significant molar excess of the same nucleic acid fragment, but without the radiolabel. Depending on the molar excess, let us say 10-fold or 100-fold, the intensity of the shifted band will go down. Here there is only one such lane because it is a cartoon just showing you the scheme, but in an actual experiment we would have a series, 2-fold, 4-fold, 10-fold, 100-fold, of increasing amounts of the same unlabeled nucleic acid. The unlabeled fragment has an equal tendency to bind; it acts like a competitive inhibitor. So the band is still there, but it carries no radioactivity, so you do not detect it. Another control, not shown here, is to add a similar molar excess of a nucleic acid of the same length but with a nonspecific sequence. If the binding is sequence-specific, the same sequence, unlabeled, will compete out the radiolabeled probe, but a different sequence will not compete.
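The competition logic above can be put into numbers with a deliberately crude model. The sketch assumes the protein is limiting and fully occupied, and that it partitions between labeled probe and unlabeled competitor in proportion to amount times relative affinity. The function name and the relative affinity of 0.001 for the nonspecific competitor are assumptions chosen for illustration, not measured values.

```python
def shifted_signal(fold_excess, relative_affinity=1.0):
    """Relative intensity of the shifted (protein-bound) radiolabeled band.

    Toy model: labeled probe amount is fixed at 1; the unlabeled competitor is
    present at `fold_excess` times that amount and binds with
    `relative_affinity` compared to the labeled probe (1.0 = same sequence).
    """
    if fold_excess < 0:
        raise ValueError("fold_excess must be non-negative")
    labeled = 1.0
    competitor = fold_excess * relative_affinity
    # Share of the occupied protein that carries the radiolabel.
    return labeled / (labeled + competitor)


FOLDS = [0, 2, 4, 10, 100]      # the titration series mentioned in the lecture
NONSPECIFIC_AFFINITY = 0.001    # assumed: unrelated sequence binds 1000x worse

print("fold  specific  nonspecific")
for fold in FOLDS:
    specific = shifted_signal(fold)
    nonspecific = shifted_signal(fold, NONSPECIFIC_AFFINITY)
    print(f"{fold:>4}  {specific:8.3f}  {nonspecific:11.3f}")
```

Under these assumptions the shifted band fades steadily with the specific competitor and barely changes with the nonspecific one, which is the pattern that indicates sequence-specific binding.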
So even when it is in excess, the protein will still bind to that particular sequence. (Refer Slide Time: 15:41) Alright, the next one is the DNase protection assay. We no longer do DNA sequencing by the method in this picture, but when I was a student this was a routine gel; we saw this kind of image every day when we were doing DNA sequencing. In those days, we did not do a whole lot in developmental biology; these molecular biology experiments were what most people did. This is how you do DNA sequencing: you start with a radiolabeled oligo, and each reaction contains the dideoxy form of one nucleotide, so you have four reactions and you run four lanes on a gel. Then you look for the lane that has the smallest fragment, and from there you progressively count upward, and that is how you get the DNA sequence. So this is one such DNA sequencing gel, but here you are not sequencing the DNA itself. Instead, the DNA is incubated with a protein that binds to some region of it. This is a really large fragment compared with what is used in the gel shift assay, and you are trying to find which sequence within this larger DNA binds to the protein. Once the protein has bound to the DNA, you add a nuclease; a controlled nuclease treatment is done. The protein bound to the DNA protects the sequence it covers, and the rest gets cleaved. Since the digestion is controlled, you get partial digestion products of every nucleotide length when you run the gel. So in the control lane you see bands along the entire length, because the DNA could be digested at every position. But if you look at this portion in these two lanes, you see bands missing when you add Pax6.
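Reading a protection gel amounts to comparing two lanes position by position. As a minimal sketch, assume each lane has already been reduced to the list of fragment lengths at which a band is visible; the function name, the `min_length` cutoff, and the example coordinates are all hypothetical.

```python
def footprints(control_bands, protein_bands, min_length=3):
    """Return (start, end) runs of positions that have a band in the control
    lane but no band in the protein lane, i.e. candidate protected regions.

    Runs shorter than `min_length` are ignored as likely noise.
    """
    missing = sorted(set(control_bands) - set(protein_bands))
    runs = []
    start = prev = None
    for pos in missing:
        if start is None:
            start = prev = pos
        elif pos == prev + 1:
            prev = pos              # still inside the same gap
        else:
            if prev - start + 1 >= min_length:
                runs.append((start, prev))
            start = prev = pos      # a new gap begins
    if start is not None and prev - start + 1 >= min_length:
        runs.append((start, prev))
    return runs


# Control lane: a band at every length from 1 to 40.
control = range(1, 41)
# Protein lane: bands missing where the bound protein blocked the nuclease.
with_protein = [p for p in range(1, 41) if not 18 <= p <= 27]

print(footprints(control, with_protein))
```

A real gel also needs band calling and lane alignment before this comparison; the sketch covers only the final step of finding the gap.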
So comparing the first two lanes with the control lane, which has no Pax6, indicates that Pax6 binds to that region; essentially, you are testing which portion of the DNA is protected by your protein from nuclease cleavage. The same thing is done with RNase protection, to find the part of an RNA where an RNA-binding protein binds. So this is the DNase protection assay, which is also called a footprint assay, because what you see is like a footprint, the protein's footprint. (Refer Slide Time: 18:32) Before we go into the next set of techniques, we are going to look at a part of the gene that is a regulatory element. These elements do not belong to the coding part of the gene, and they can be anywhere; they need not always be in the promoter region. They can be at a considerable distance, and they can be downstream of the transcription initiation site as well. They are called enhancers. So, what are enhancers? The first point tells you that they control the efficiency and rate of transcription. They may increase the rate, or in some cases they very strictly specify the spatial or temporal expression of a gene, and that is what is shown here. In this cartoon of the Pax6 gene, the orange bars are the exons, the red one is the promoter itself, and the green ones are the enhancers. You see four enhancer fragments, and one of them is actually in an intron. 1, 2, 3, 4, 5, 5a, 6 and 7 are the exons, and between them are the introns, so one of the enhancers is part of an intron. As I mentioned, all these enhancers control efficiency and rate, and sometimes stringent temporal and spatial regulation. Here we see such spatial information: for example, the enhancer between exons 4 and 5 confers expression of Pax6 in the retina, and this particular enhancer here, just before the transcription start site, is a neural tube enhancer that makes sure the protein is expressed in the neural tube.
The other one is for the lens and cornea; if you do not have it, the gene is not going to be expressed there. And the most upstream one is pancreas-specific. So each of these four confers a different tissue-specific expression of this gene. You might have the core promoter, the core transcription factors, everything, but you are not going to have transcription if you do not have these enhancers and the transcription factors specific to them. We will see them one by one. This actual sequence shows part of the pancreatic enhancer, where you see binding sites for two different transcription factors. (Refer Slide Time: 21:28) We will see a little more detail on that two slides later. There are many ways to identify enhancers, and one easy-to-understand and very powerful technique is the enhancer trap. It involves a core promoter driving a reporter gene, all of it embedded in a transposable element. So this technique is useful in organisms in which natural, endogenous transposable elements exist. These elements hop in and hop out depending on the context, whether they are active or not; depending on that, they move around the genome. And usually, for transposable elements to get inserted, they need repetitive sequences, and wherever those are present they will go. So you use the transposable element as a vehicle to carry this reporter with its core promoter, without any regulatory elements, no enhancers or anything. If you provide the context in which transposable elements are active for a given period and then look, you will find different insertions. This is a kind of mutagenesis using transposable elements instead of some other mutagen. So in a given organism, the transposable element will be present in different parts of the genome. Suppose this transcriptional reporter gets inserted in the vicinity of an enhancer, as you see in this picture.
If this enhancer influences the expression of the reporter, the reporter gets expressed where the gene is usually expressed, and you can detect it this way, as in the picture. This is a Drosophila embryo, and you see the expression pattern. So essentially, you are trapping an enhancer with the help of a reporter gene. The reporter gene has a weak promoter, just the basic core one, and it is not going to be expressed on its own without coming under the influence of an enhancer. So when you do random insertion, if the reporter gets inserted near an enhancer and is expressed where that enhancer normally activates transcription of a gene, it shows that this enhancer works in that particular tissue. Later that can be verified by other methods as well. The reporter gene here is lacZ; this was done in an era long before GFP came into use as a reporter. Where exactly the insertion is, you will find out by sequencing the flanking region; that will not be difficult, but the point is that you have trapped an enhancer. Again, these thought processes come from genetics. In genetics, you do not worry about the molecule first. You first see whether you have hit something that is responsible for a function, and then you can search for and find the molecule. That is a lot easier than having a bucketful of molecules: you can take any organism, crush it and send it to a company, and they will give you all the proteins produced there and all the mRNAs produced there, but what will you do with it without connecting it to function? We usually call that a big laundry list; you do not know what to do with it, and many such lists have been hanging around for the last 20 years without people having made progress. This is not to belittle the usefulness of that; it is only to highlight the importance of directly connecting to the phenotype, because it is then straightforward to do an inverse PCR and sequence the neighboring sequence.
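As a rough illustration of inverse PCR (the steps themselves are spelled out right after this sketch), the toy below treats DNA as a plain string: digest, circularize the fragment that carries the known insert, and read outward from the insert around the circle. The genome and insert sequences are made up, EcoRI is only an example enzyme, and the function names are hypothetical; real inverse PCR also involves primer design and both strands, which are ignored here.

```python
SITE = "GAATTC"  # EcoRI recognition sequence; it cuts after the first G


def digest(dna, site=SITE, cut_offset=1):
    """Cut linear DNA at every occurrence of `site`; return the fragments."""
    fragments, start = [], 0
    idx = dna.find(site)
    while idx != -1:
        fragments.append(dna[start:idx + cut_offset])
        start = idx + cut_offset
        idx = dna.find(site, idx + 1)
    fragments.append(dna[start:])
    return fragments


def inverse_pcr_product(genome, insert, site=SITE):
    """Sequence amplified by outward-facing primers placed at the insert ends.

    The restriction fragment carrying the insert is circularized; reading
    from the right end of the insert around the circle back to its left end
    gives right flank + left flank, joined at the ligation junction.
    """
    for frag in digest(genome, site):
        pos = frag.find(insert)
        if pos == -1:
            continue
        end = pos + len(insert)
        # Rotate the circle so that it starts just after the insert.
        circle = frag[end:] + frag[:end]
        # Everything on the circle except the insert itself is flanking DNA.
        return circle[:len(frag) - len(insert)]
    raise ValueError("insert not found within a single restriction fragment")


INSERT = "ATGCATGCAT"                                   # made-up known sequence
GENOME = "TTGAATTCAAACCC" + INSERT + "GGGTTTGAATTCAA"   # made-up genome

print(inverse_pcr_product(GENOME, INSERT))
```

Sequencing that product and searching the genome for it tells you where the insert landed.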
Since you know the sequence of the insert, you can digest the DNA with restriction enzymes, and some will cut somewhere on either side. If you then ligate under very dilute conditions, intramolecular ligation will happen, so the fragment gets circularized. Then, within the sequence you know, you use primers pointing in opposite directions, away from each other; these primers PCR-amplify the flanking region, and finally you sequence it and you will know where the insertion is. So that is how you identify it, but the primary goal here is to ask: do I get tissue-specific activation? If so, that means I am in the vicinity of a potential enhancer. These experiments were done in the mid-to-late 1980s. (Refer Slide Time: 26:25) The next one is the same idea, but it helps you when, for example, you already know an enhancer and want to find out all the different places where that enhancer works. You can make reporters under the control of the enhancer and look. This is an older assay in which the reporter is lacZ, and you see its expression primarily in the central nervous system here, and muscle-specific expression here. And here what you are seeing is a lens crystallin enhancer, and you see where it is expressed. This is an image I particularly like because it shows tissue-specific control so clearly. There are a lot of such examples. Yesterday in the lab we had some images where a gene's GFP reporter is expressed only in the nuclei of a particular set of cells along the length of the worm. The worm has something called the seam, a structure running along the length of the body that provides structural reinforcement. It is made up of 16 pairs of cells, and this particular gene is expressed only in those seam cells; the rest of the 959 cells are all dark, and only these are glowing green. It was too late, so I did not make a picture that could be shown here today.
But this image does the same job: you see expression only in the lens, not even in the rest of the eye, because you do not want crystallin expressed elsewhere; you want it only in the lens. So now you are getting an idea of what differential gene expression is. (Refer Slide Time: 28:09) Now let us see the applications of enhancers. They are useful in two different contexts, and we are going to see two examples. The first is often used in the mouse, where you want to have a conditional knockout. Let us say a gene has a crucial role in the cleavage stage of the embryo; if you just hit it and create a null allele, it is going to be embryonic lethal. Now imagine that the gene also has a function in the liver, and you want to know specifically what it does there; you will never find out. Therefore, when people want to find the tissue-specific or organ-specific function of a gene, they make conditional knockouts, meaning conditional deletions. Knockout is a newer name; it is basically what you would call the deletion of a gene, and that is what non-geneticists initially called a knockout. Now geneticists also use it. Then some people say knock-in, and people say knockdown; a partial depletion they call knockdown. I just do not like that word because it does not convey the meaning the way depletion does. Similarly, knock-in means transgene insertion. So now we are looking at a conditional knockout, meaning we are not deleting the gene in the zygotic nucleus; we are going to knock it out in a particular somatic tissue. How do you do that? How do enhancers help? You have one strain of mouse in which you have a bacteriophage recombinase. So I will tell you a little bit about bacteriophages. I am not sure how many of you have learned the bacteriophage life cycle. Did anyone learn the lambda phage lysogenic and lytic cycles? What do bacteriophages do for a living? How do they survive?
They get integrated into the host genome, and then, when favorable conditions come, they can make multiple copies and come out of the host cell. They do it by using site-specific recombination, and that is what is exploited here; if I say recombinase and you do not know what a recombinase is, there is no point in going forward. Cre is one such recombinase. In this particular example, you express that protein under the control of the albumin enhancer. So you have a mouse strain that carries Cre recombinase under the albumin enhancer as a transgene. It will be expressed only in the liver, where albumin is produced; only there will Cre recombinase be produced. Now you have another mouse strain in which your gene of interest, in this case the transcription factor HNF4 alpha, has its exon 2 flanked by the sequence recognized by Cre recombinase. That is how site-specific recombination works. Cre comes from bacteriophage P1, and the sites it recognizes are called loxP sites. So loxP sites flank exon 2; people normally call this floxed, flanked by lox, so floxed. You flox the gene of interest. Exon 2 is not going to be removed anywhere in that strain, because the mouse normally has no recombinase that recognizes loxP. So that strain is going to be healthy. And simply expressing Cre recombinase in the liver is not going to cause any problem for the other strain, because Cre works only if there is a loxP site; otherwise you are just producing a protein that is neither toxic nor functionally interfering. But if you cross these two mice and generate a double mutant, a mouse that carries both, then in its liver you will have both functional elements brought together. You will have Cre recombinase produced there because the albumin enhancer makes it express only in the liver and nowhere else. You also need to remember that this is germline floxing.
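The two-strain logic can be written down as a small truth table. This is a minimal sketch, assuming a mouse is described by just two facts: whether exon 2 is floxed in its germline, and in which tissue (if any) an enhancer drives Cre. The class, field, and function names are hypothetical, and `cross` simply returns an offspring that inherited both transgenes, ignoring Mendelian ratios.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mouse:
    floxed_exon2: bool            # loxP sites flank exon 2 in every cell
    cre_tissue: Optional[str]     # tissue where the enhancer drives Cre, or None


def exon2_present(mouse, tissue):
    """Exon 2 is excised only where Cre is made AND the exon is floxed."""
    cre_here = mouse.cre_tissue == tissue
    return not (mouse.floxed_exon2 and cre_here)


def cross(a, b):
    """Offspring that inherited both transgenes (simplification)."""
    return Mouse(a.floxed_exon2 or b.floxed_exon2, a.cre_tissue or b.cre_tissue)


floxed_strain = Mouse(floxed_exon2=True, cre_tissue=None)
albumin_cre_strain = Mouse(floxed_exon2=False, cre_tissue="liver")
offspring = cross(floxed_strain, albumin_cre_strain)

for name, mouse in [("floxed only", floxed_strain),
                    ("albumin-Cre only", albumin_cre_strain),
                    ("both", offspring)]:
    status = {t: exon2_present(mouse, t) for t in ("liver", "brain")}
    print(name, status)
```

Each parental strain keeps exon 2 everywhere; only the offspring loses it, and only in the liver.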
So in this mouse strain, the genome of every cell will have this floxed sequence; we already know about genome equivalence, so that does not matter. Only in the liver, where you have Cre recombinase, will the loxP sites be recognized; Cre will treat the flanked segment like a bacteriophage genome and excise it. Then you will not have exon 2, so this is a conditional knockout. This is how people use enhancers. So now, in a journal club or a seminar, if someone says "conditional knockout" and "I floxed this gene", you will immediately understand what it is; it is essential to know this. (Refer Slide Time: 34:04) The second one is used primarily in Drosophila. It is very powerful, and it was developed a long time ago. Only recently have people been doing it successfully in C. elegans, probably because people did not find a critical context where they needed it, or were too lazy to develop the tool, or whatever. But recently they have shown that it works. In Drosophila it has been very well exploited, and this one example is good enough to understand how powerful it is. This is the GAL4-UAS system, very similar in principle to the previous one. You have two different strains. In one strain, the transgene is under the enhancer of a particular imaginal disc. I do not know how many of you know what an imaginal disc is; it goes back to metamorphosis. Drosophila, being an insect, has a worm-like larval stage from which the fly comes out, so there is metamorphosis. In that worm-like stage, it has the primordia for all the adult structures, and these primordia are called imaginal discs. So the wing primordium means the wing imaginal disc: the primordial cells present in the larva that will eventually expand into the wing in the adult. You use that enhancer to control the expression of a yeast transcription factor, GAL4.
GAL4 again has a particular short sequence that it recognizes and from which it activates transcription. So you have that transcription factor produced in one strain of Drosophila under this imaginal disc-specific enhancer. Now you have another strain in which this sequence is placed upstream of your gene of interest, or the gene itself is a transgene, for example Pax6. You are taking it from the mammalian system and putting it here, so you express it under the control of the promoter where GAL4 binds. This is called the upstream activating sequence, hence UAS. So here one strain carries UAS-Pax6, and the other carries GAL4 under, let us say, an eye-specific imaginal disc enhancer. Now, when you cross the two and make a double mutant, you are going to produce Pax6 protein wherever the enhancer drives GAL4 expression. In this particular case it is not the eye; it is another structure, in the thorax or in the head region, where by driving Pax6 they were able to induce the formation of an eye in the place of the antenna. What this tells you is that Pax6 acts like a master regulator of eye development. Once you provide Pax6, it looks like everything else needed to drive the formation of the eye is already there in that particular tissue. We will see similar formation of a complete organ in the wrong place in more examples later: a good example is the group of genes called HOX genes, the homeotic mutations, where one organ is replaced by another simply because you changed the master regulator there. The system also gives you versatility. For example, I can take this driver line and cross it with another line in which UAS controls a different gene. It gives you that flexibility: I can have a library, a set of driver lines and a set of UAS lines, and by bringing them together I can activate the gene I want in the tissue I want. That is the main reason it is used.
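The "library of drivers crossed with a library of UAS lines" idea can be sketched as a lookup over two small tables. All line names and tissue lists below are hypothetical placeholders rather than real stock names, and the model assumes the responder gene is expressed exactly where GAL4 is made and nowhere else.

```python
from itertools import product

TISSUES = ["eye disc", "antennal disc", "wing disc", "leg disc"]

# Driver lines: an enhancer decides where GAL4 is made (hypothetical names).
DRIVERS = {
    "eye-disc-GAL4": {"eye disc"},
    "antennal-disc-GAL4": {"antennal disc"},
    "wing-disc-GAL4": {"wing disc"},
}

# Responder lines: UAS placed upstream of a gene of interest (hypothetical names).
RESPONDERS = {
    "UAS-Pax6": "Pax6",
    "UAS-lacZ": "lacZ",
}


def expressed_in(driver_line, responder_line, tissue):
    """Gene expressed in `tissue` of the offspring, or None if UAS stays silent."""
    gal4_present = tissue in DRIVERS[driver_line]
    return RESPONDERS[responder_line] if gal4_present else None


def cross_table():
    """For every driver x responder cross, list the tissues expressing the gene."""
    return {
        (d, r): [t for t in TISSUES if expressed_in(d, r, t)]
        for d, r in product(DRIVERS, RESPONDERS)
    }


for (driver, responder), tissues in cross_table().items():
    print(f"{driver} x {responder}: {RESPONDERS[responder]} in {tissues}")
```

Three drivers and two responders already give six distinct experiments, and adding one new line to either table extends every combination without building a new construct for each.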
The previous one is the same in that respect. The reason is that you want to know, if I activate Pax6 in this imaginal disc, will I see this phenotype, or if I instead express Pax6 in different imaginal discs, what happens? So you want to know two things here. One is what this protein does if it is put in a particular biological context. The other is which imaginal disc can develop what kinds of structures: for example, a disc that usually forms the antenna can make an eye as well if you provide an eye-specific transcription factor. This has been extensively exploited by Drosophila developmental biologists; you will find a lot of papers, and you will hardly find a paper where they do not use this method. (Refer Slide Time: 40:15) Alright, that ends this part, but we have not finished. We can begin another lecture, which is a continuation of the same; we are going back to the same Pax6 promoter region. So it would help if you remember this: I am talking about the Pax6 promoter. Where Pax6 is expressed and how it is regulated is the focus. So we are looking at its enhancers, and Pax6 is in turn a transcription factor as well, so that we