Metabolic Engineering
Prof. Amit Ghosh
School of Energy Science and Engineering
Indian Institute of Technology – Kharagpur

Lecture – 18
Flux Variability Analysis (FVA) and Flux Coupling (FC)

Welcome to the metabolic engineering course. Today we will learn about flux variability analysis and flux coupling, two important topics in metabolic network analysis.

(Refer Slide Time: 00:41)

Both require some basics of linear programming, so we will first learn a little about linear programming today, followed by flux variability analysis and flux coupling.

(Refer Slide Time: 00:54)

Before we go to flux variability and flux coupling, recall that in the last class we learned about the biomass and maintenance requirements of the cell. The macromolecules constitute the cell mass: we assume that macromolecules such as protein and DNA make up the biomass equation, and cofactors are needed to drive the process. To simulate growth, the biomass and maintenance requirements have to be satisfied.

Mathematically, the model will grow only when all the biomass components can be synthesized inside the cell, which means that the biosynthetic pathways for all the biomass components are present. Otherwise the model will not grow. Apart from biomass, there is also a constant drain in the absence of growth.

Carbon flowing into secondary metabolites drains energy as well, along with the non-growth-associated reactions. So there are growth-associated (growth-coupled) reactions and non-growth-coupled reactions, and we have already explained how to tell them apart.

To check whether a reaction is growth coupled, you maximize that reaction.
If maximizing that reaction forces a flux through the growth reaction, that is, through the biomass equation, then you know it is a growth-coupled reaction. This is the strategy for finding out which reactions are growth coupled.

(Refer Slide Time: 02:53)

The constraint-based reconstruction and analysis (COBRA) approach involves many techniques, with different optimization algorithms available. For example, flux variability analysis, shown here on the slide, is what we are going to learn today, and we will also learn about flux coupling, shown here. So these are the two.

There are many techniques to optimize the network and extract new information. Of those, we will focus on numbers 14 and 11 on the slide. These require different types of optimization algorithm: linear programming is used mostly, and in rarer cases mixed-integer linear programming and nonlinear programming, but mostly it is linear programming.

(Refer Slide Time: 04:00)

So let us learn the basics of linear programming. Suppose you have a network with only two metabolites: an input metabolite A entering the cell and an output metabolite B. There are two internal reactions, reaction 1 with flux x1 and reaction 2 with flux x2; x1 produces ATP and x2 produces NADH.

This network can be maximized for ATP production as well as for NADH production, and the optimal solution lies on the boundary of the admissible solutions. If you maximize ATP, you will see that it lies on the x axis.
When you maximize ATP, you have the constraint x1 + x2 = rA, where rA is the input flux. Here rA = rB: the flux entering equals the flux going out. When you maximize ATP, the solution reaches the point x1 = rA, x2 = 0, so the maximum ATP this network can produce is rA. Similarly, the maximum amount of NADH the network gives is also rA, at x1 = 0, x2 = rA. These are the two extremes.

The feasible solutions lie only on a straight line, the orange line on the slide, which has two extreme points. When you maximize ATP you are at one end, and when you maximize NADH you are at the other, but any point on this line is also a solution, a combination of ATP production and NADH production. So this is a simple network that you optimize to see how much ATP and NADH you can produce given the input and output reactions.

(Refer Slide Time: 06:28)

Now think about a slightly more complicated problem with two variables, A and B, which are the numbers of toy cars and toy trucks you can produce in a workshop. You set up an industry producing toy cars and trucks; the number of cars you produce is denoted by A and the number of trucks by B.

So A and B are the variables. Due to resource limitations you can produce no more than 60 cars and no more than 50 trucks: with the infrastructure you have, your industry can produce only 60 cars and 50 trucks.
So A lies anywhere from 0 to 60, and B, the number of trucks, goes from 0 to 50. You are also limited by shipping, not just by infrastructure: the number of cars plus twice the number of trucks must be no more than 150. That is the constraint on how many cars and trucks you can ship every day.

You can sell these toy cars and trucks in the market at 20 dollars per car and 30 dollars per truck. Your total earnings then become Z = 20A + 30B, where A is the number of cars and B is the number of trucks, and that becomes your objective function, because as a company your main target is to increase your earnings.

So you define an objective function from the numbers of cars and trucks you sell and the prices given here. These are the constraints: constraint 1, constraint 2 and constraint 3, and this is the objective function. Now the problem is defined: a set of inequalities that act as constraints, and an objective function. You maximize Z: how many cars and trucks, A and B, should you produce every day so that your earnings are maximal? It is a very simple problem that you can maximize to see how much you can earn.

(Refer Slide Time: 09:25)

So you have three constraints: 0 ≤ A ≤ 60, 0 ≤ B ≤ 50, and the shipping constraint A + 2B ≤ 150.
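The toy car and truck problem above is small enough to solve by hand, and a short sketch can make the idea concrete. The following Python example is only an illustration, not the method used in the lecture: it relies on the fact that a linear program with a bounded feasible region attains its optimum at a corner, so it intersects every pair of constraint boundaries, keeps the feasible corners, and picks the one with the highest earnings. The names `CONSTRAINTS`, `vertices` and `earnings` are made up for this sketch.

```python
from itertools import combinations

# Each constraint is written as a*A + b*B <= c.
CONSTRAINTS = [
    (1, 0, 60),    # A <= 60   (car capacity)
    (-1, 0, 0),    # A >= 0
    (0, 1, 50),    # B <= 50   (truck capacity)
    (0, -1, 0),    # B >= 0
    (1, 2, 150),   # A + 2B <= 150 (shipping limit)
]
PRICE_CAR, PRICE_TRUCK = 20, 30


def feasible(a, b, tol=1e-9):
    """True if the point (a, b) satisfies every constraint."""
    return all(ca * a + cb * b <= c + tol for ca, cb, c in CONSTRAINTS)


def vertices():
    """Intersect every pair of constraint boundaries; keep feasible corners."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:  # parallel boundaries do not intersect
            continue
        # Cramer's rule for the 2x2 system of boundary equations.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            points.append((x, y))
    return points


def earnings(point):
    """Objective function Z = 20A + 30B."""
    a, b = point
    return PRICE_CAR * a + PRICE_TRUCK * b


best = max(vertices(), key=earnings)
print(best, earnings(best))  # (60.0, 45.0) 2550.0
```

For genome-scale networks this brute-force corner search is not practical, and a proper linear programming solver is used instead; the sketch only shows why the optimum sits at a corner of the feasible region.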
So these are the three constraints.

In dynamic flux balance analysis, a batch culture calculation has been performed for an anaerobic culture, and from it you can also calculate the yield, for example of biomass. Delta X is the change in biomass from t = 0 to some later time that you fix; if you want the maximum biomass, take t = 0 and t equal to infinity. The change in biomass concentration divided by the amount of glucose consumed is the biomass yield. Similarly for acetate: the amount of acetate produced divided by the amount of glucose consumed. For ethanol: the amount of ethanol produced divided by the total amount of glucose consumed. For lactate: the total amount of lactate produced divided by the total amount of glucose consumed. This way you can calculate the yield for biomass, acetate, ethanol and lactate; the yield is always expressed as millimole per millimole or gram per gram. From it you can find out how efficiently the cell is working.

So with dynamic FBA you can find the biomass yield or the acetate yield just by simulation, without going for experiments. You can also vary the starting point, for example start with an initial OD of 0.02, and check how the yield changes if the starting OD is different. When you design an experiment you have to choose an initial OD, and simulating different initial ODs helps you choose the starting point for running the experiment.
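The yield calculation described above is simple arithmetic on the start and end points of a simulated (or measured) batch culture. Here is a minimal Python sketch, assuming the concentrations are already in consistent units; the function name `yields_on_glucose` and all the numbers are hypothetical and are not the values from the lecture slide.

```python
def yields_on_glucose(start, end, substrate="glucose"):
    """Yield of each species = amount formed / amount of substrate consumed.

    `start` and `end` map species names to concentrations at the first and
    last time point. Units must be consistent (e.g. mmol/L for products and
    substrate, giving mmol per mmol).
    """
    consumed = start[substrate] - end[substrate]
    if consumed <= 0:
        raise ValueError("no substrate was consumed")
    return {
        name: (end[name] - start[name]) / consumed
        for name in start
        if name != substrate
    }


# Hypothetical start and end concentrations of a simulated batch culture.
start = {"glucose": 10.0, "biomass": 0.01, "acetate": 0.0, "ethanol": 0.0}
end = {"glucose": 0.0, "biomass": 0.51, "acetate": 8.0, "ethanol": 7.0}

for species, value in yields_on_glucose(start, end).items():
    print(f"{species}: {value:.2f} per unit glucose")
```

Repeating the calculation for simulations started from different initial biomass values is one way to check whether the yield depends on the starting OD.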
So you can check whether the biomass yield changes if you start at an OD of 0.02, and then change the starting OD. This kind of experiment you can perform on the computer to see what the starting OD should be.

(Refer Slide Time: 28:43)

Here a dynamic flux balance analysis has been performed, and the biomass yield was found to be around 0.26, the acetate yield around 0.76, formate 1.65 and ethanol 0.73. These are all simulation results. You can see the substrate, glucose, going down, while formate, acetate and ethanol go up with time, and the biomass concentration also increases. So only glucose goes down and everything else increases, and all of it is simulated. The calculated yields for biomass, acetate, formate and ethanol are also shown here. You can vary the initial cell concentration to check whether the biomass yield remains the same; this is the kind of experiment you can do.

(Refer Slide Time: 29:48)

Now I come to the gene deletion algorithm, and we will start with FBA.

(Refer Slide Time: 29:58)

You already know that in FBA we maximize an objective function, for example any one flux: biomass, ATP or any other product you want to maximize. Then you apply the steady-state condition S dot v = 0 and put lower and upper bounds on all fluxes. That constrains the fluxes to the solution space, shown in the form of a cone; all solutions lie in the cone. In the mutant, you set the flux of one reaction, V k, the reaction you want to knock out, to 0.
You do that by setting both the lower bound and the upper bound of V k to 0. This way you can remove a reaction from the network.

Removing a reaction from the network is tied to genes: you can look back at the gene-protein-reaction (GPR) mapping to see how many genes have to be removed. Suppose I want to remove a reaction, that is, make the flux through it 0. You do not have to delete the reaction from the network; you just set the flux through it to 0 by changing the lower bound and upper bound. If both bounds of a reaction are 0, the reaction is effectively taken out of the network. From the mapping you can then see which genes are involved, and you can go to the lab and knock out those genes to remove that reaction.

The moment you remove a reaction, the solution space shrinks, because you are adding a constraint. As I said in the previous class, as you add more constraints the solution space reduces and you get a more constrained solution. Here too, removing a reaction reduces the solution space: this is the wild type and this is the mutant. Removing a reaction also indirectly means removing a gene, because every reaction maps to a gene; it may be one gene, two genes or three genes, and you have to see how many genes you must knock out to remove the reaction.

(Refer Slide Time: 32:37)

This slide compares model predictions with experimental results.
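The link between a gene knockout and zeroed reaction bounds can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of any particular COBRA tool: the toy reactions `R1` to `R3`, the genes `g1` to `g4`, and the "OR of ANDs" encoding of GPR rules are all hypothetical. A reaction stays active if at least one of its enzyme variants still has all of its genes; otherwise both bounds are set to 0.

```python
# Hypothetical toy model: reaction -> (lower bound, upper bound).
bounds = {"R1": (0.0, 10.0), "R2": (-5.0, 5.0), "R3": (0.0, 8.0)}

# Hypothetical GPR rules in "OR of ANDs" form: each set is one enzyme
# (all of its genes are required); alternative sets are isozymes.
gpr = {
    "R1": [{"g1"}],            # one gene
    "R2": [{"g2", "g3"}],      # enzyme complex: g2 AND g3
    "R3": [{"g1"}, {"g4"}],    # isozymes: g1 OR g4
}


def knockout(bounds, gpr, deleted_genes):
    """Return new bounds with every reaction disabled by the deletion set to 0."""
    deleted = set(deleted_genes)
    mutant = dict(bounds)  # leave the wild-type bounds untouched
    for reaction, enzymes in gpr.items():
        # An enzyme still works only if none of its genes was deleted.
        still_active = any(not (genes & deleted) for genes in enzymes)
        if not still_active:
            mutant[reaction] = (0.0, 0.0)
    return mutant


print(knockout(bounds, gpr, ["g1"]))  # R1 is blocked, R3 survives through g4
print(knockout(bounds, gpr, ["g3"]))  # R2 is blocked because the complex is broken
```

Running FBA again with the mutant bounds then gives the predicted growth of the knockout; a gene is called lethal in silico when that growth is zero or below a chosen threshold.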
You can knock out a gene experimentally, and you can also remove a gene by simulation, and then compare how the two perform. Here a gene essentiality calculation has been performed on glycerol minimal media using the E. coli strain MG1655, which has about 4400 genes, with a model that has 1003 genes. So the model has fewer genes. You do in silico gene knockouts one by one for all the genes, and you see that 182 of the 1003 genes are lethal and 821 are viable. In the experimental results, 119 genes are lethal and 3769 are viable; these are all single-gene knockouts. If you compare the model with the experiment, out of 896 mutants the model predicts 819 correctly. So the accuracy of the model is about 91%, which is very good for glycerol minimal media. These are all FBA mutants, single-gene knockouts using FBA, and they compare very well with experiment.

(Refer Slide Time: 34:43)

Now we will discuss MOMA (minimization of metabolic adjustment), another technique with which you can predict the fluxes of the mutant, the knockout fluxes. On the slide, the blue region is the wild-type solution space, with no mutation, and the green region is the knockout solution space. MOMA minimizes the distance between the wild-type and mutant flux distributions, each written as a list of fluxes.
You take the flux distribution for the wild type and the one for the mutant, a single knockout, each from FBA, and look at how much they differ. In this plot you choose any non-essential flux on one axis and the growth rate on the other, and you can see that the solution space of the mutant is smaller. As I said before, adding constraints reduces the solution space, so the wild type has the bigger solution space and the mutant the smaller one.

So what does MOMA do? FBA gives you a flux distribution for the wild type and one for the knockout. But MOMA says that the actual flux distribution of the knockout, the one closer to experiment, is not the one you get from FBA on the knockout. This point is the maximum growth rate of the wild type and this one is the maximum growth rate of the mutant strain, where its biomass is maximal. These are the two optimal solutions, for the wild type and for the mutant. MOMA says the mutant solution is not here, at the mutant's FBA optimum, but here.

How do you get this point? Draw a normal from the wild-type optimum to the mutant solution space; wherever it hits the solution space is the MOMA solution. That point is at the minimum distance from the wild-type flux distribution among all feasible mutant flux distributions, and it is the prediction you get from MOMA.
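The "draw a normal" picture can be reproduced in two dimensions with plain geometry. The Python sketch below is only a geometric illustration of the idea, not a MOMA implementation for a real network (which needs a quadratic programming solver). It assumes a hypothetical mutant solution space given as a convex polygon in the plane (one non-essential flux on the x axis, growth rate on the y axis) and a hypothetical wild-type optimum lying outside it; all coordinates are made up.

```python
def project_onto_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    # Parameter of the foot of the normal, clamped to stay on the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (a[0] + t * dx, a[1] + t * dy)


def closest_point_on_boundary(p, polygon):
    """Closest boundary point of a convex polygon to an outside point p."""
    candidates = [
        project_onto_segment(p, polygon[i], polygon[(i + 1) % len(polygon)])
        for i in range(len(polygon))
    ]
    return min(candidates, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


# Hypothetical mutant solution space: (flux, growth rate) corners.
mutant_space = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 4.0)]
wild_type_optimum = (5.0, 5.0)  # hypothetical, outside the mutant space

fba_point = max(mutant_space, key=lambda v: v[1])  # maximum-growth corner
moma_point = closest_point_on_boundary(wild_type_optimum, mutant_space)

print("FBA mutant optimum :", fba_point)
print("MOMA-like point    :", moma_point)
```

In this toy case the closest point has a lower growth rate than the FBA optimum of the mutant, which matches the lecture's remark that FBA predicts the higher growth rate.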
(Refer Slide Time: 37:38)

You can also calculate the MOMA prediction mathematically by minimizing the distance between the wild-type flux distribution and the mutant flux distribution. The mutant flux is V j and the wild-type flux is V j wild type. The wild-type flux distribution comes from running simple FBA. Then you set up a minimization problem: you take the difference between V j and V j wild type, square it and sum over all reactions, while the constraints remain the same, that is, S dot v = 0, the lower and upper bounds, and V k = 0 for the reaction you have removed. Minimizing this distance gives the MOMA-predicted fluxes, which are much closer to the experiment.

(Refer Slide Time: 38:36)

If you compare the mutant growth predictions given by FBA and MOMA, you will see cases where both FBA and MOMA correctly predict a lethal knockout phenotype, in agreement with the experimental data. These knockouts were performed experimentally, and the FBA and MOMA predictions both match. Then there are cases where FBA and MOMA both predict a non-lethal phenotype: for zwf, pgl, pts, ppc and gnd, both methods predict that the genes are non-lethal, and both predictions match the experimental data. But there are cases, shown in blue, where MOMA predicts a lethal phenotype in agreement with the experimental data and FBA does not. Here the MOMA prediction is better, because it identifies a lethal gene that is experimentally shown to be lethal.
(Refer Slide Time: 39:55)

Similarly, FBA and MOMA flux predictions are compared with experiment under different growth conditions: carbon-limited growth, high-carbon growth and nitrogen-limited growth. CC is the correlation coefficient and P is the P value. For carbon-limited growth, the correlation coefficient for the FBA prediction is low, 0.06 with a P value of 0.6, whereas for the MOMA knockout prediction the correlation coefficient is 0.56 and the P value is very small, so the result is much more significant. For high-carbon growth, the FBA correlation coefficient is 0.77 but the MOMA prediction is much better, 0.94. So in the carbon-limited case the FBA correlation is very low while MOMA's is much higher, and in the high-carbon case MOMA's is very close to 1. When the correlation coefficient is close to 1, the experimental and theoretical values match very well. For nitrogen-limited growth the FBA correlation coefficient is 0.86 while for MOMA it is 0.73, so here MOMA is a little lower, but the P value is very low for both. So FBA and MOMA are two optimization tools that can be used to predict lethal knockouts and be compared with experimental data.

(Refer Slide Time: 41:35)

Another prediction tool is ROOM (regulatory on/off minimization), which minimizes the number of fluxes that change significantly. Why the number of fluxes that change? Because when you knock out a gene, the cell adjusts within itself and makes the minimum adjustments.
The cell has multiple optimal states and a very well connected network, so when you remove a gene it adjusts internally such that the minimum number of fluxes change. The ROOM algorithm works on this principle: it changes the minimum number of fluxes. So these are the three prediction methods: FBA, MOMA and ROOM. Shown here is the number of significant flux changes between the flux distribution of the wild-type strain and the flux distributions predicted by FBA, MOMA and ROOM for the knockout organisms. For all nine knockouts, the number of significantly changed fluxes is largest for FBA. MOMA predicts considerably fewer changes than FBA. And with ROOM, in every case the number of fluxes that change relative to the wild type is very small. These three techniques are widely used to predict the flux profile of knockout strains, and each has its own importance.

(Refer Slide Time: 43:33)

Then ROOM, MOMA and FBA have been compared with experimental data for the nine knockouts. The correlation coefficient with the flux measurements is plotted here, and it is very high. In 8 out of the 9 mutants, ROOM performed better.
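The quantity compared on the earlier slide, the number of significant flux changes relative to the wild type, is easy to compute once two flux distributions are available. The Python sketch below shows one way to count them; it is not the ROOM optimization itself, which is a mixed-integer program. The tolerance rule (a relative part `delta` plus an absolute part `epsilon`) and its default values are assumptions chosen for illustration, and the flux values are hypothetical.

```python
def significant_changes(wild_type, mutant, delta=0.03, epsilon=0.001):
    """List reactions whose mutant flux differs significantly from wild type.

    A change counts as significant when it exceeds a tolerance made of a
    relative part (delta * |wild-type flux|) and an absolute part (epsilon).
    Both tolerance values are illustrative assumptions.
    """
    changed = []
    for reaction, w in wild_type.items():
        tolerance = delta * abs(w) + epsilon
        if abs(mutant[reaction] - w) > tolerance:
            changed.append(reaction)
    return changed


# Hypothetical flux distributions for a knockout of reaction R2.
wild_type = {"R1": 10.0, "R2": 5.0, "R3": 0.0, "R4": 2.0}
fba_like = {"R1": 7.0, "R2": 0.0, "R3": 3.0, "R4": 4.0}    # large rerouting
room_like = {"R1": 10.1, "R2": 0.0, "R3": 0.0, "R4": 2.0}  # minimal rerouting

print(len(significant_changes(wild_type, fba_like)))   # 4
print(len(significant_changes(wild_type, room_like)))  # 1
```

Counting changes this way for each method's predicted knockout fluxes gives the kind of bar-by-bar comparison described for the nine knockouts.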
The correlation coefficient is higher here, here and here; only in the one case marked with the red rectangular box is the ROOM correlation coefficient lower. So in all cases except one, ROOM gives the better prediction. If you compare the relative error in the growth rate, you can see that MOMA does not perform well: in all the cases MOMA is either over-predicting or under-predicting the growth rate. So apart from the correlation in the fluxes, MOMA is not very good for growth rate. In this way you can compare both the growth rate and the flux profile to find out which method performs better: ROOM has performed better here, whereas MOMA has not performed well in terms of relative growth rate.

(Refer Slide Time: 45:12)

In conclusion, we have seen today that dynamic flux balance analysis can be used to simulate growth in batch culture. You can calculate product formation, substrate consumption and biomass formation in a dynamic fashion and compare them with experimental data. FBA will always predict a growth rate at least as high as MOMA or ROOM; in all the cases we have seen, it predicts a higher growth rate than MOMA and ROOM. The MOMA solution is unique given a single wild-type flux distribution, whereas the ROOM solution is not unique: there are often multiple flux distributions with the same number of altered fluxes.

(Refer Slide Time: 46:06)

So these are the conclusions, and then we have the references. Hope you enjoyed the class, and thank you for listening.