Health Research | Validity of Epidemiological Studies

Hello. Today we are going to talk about validity in epidemiological studies.

What if, one day, you come across a headline in the newspaper saying that a study has found that coffee drinking doubles the risk of heart attack? What is going to be your reaction? To go into this study in depth, we need to look at how the study was done and how valid its results are.

The basic goal of any epidemiological study is twofold. The first is to obtain an accurate estimate of whatever is being studied, whether that is the frequency of a disease or the effect of an exposure on a health outcome, and all of this we study in a certain sample of the population. This aspect of any epidemiological study is known as its internal validity: how valid is the methodology used to estimate the frequency or to determine the effect of an exposure? The second is that, in the long run, we want this estimate to be generalizable to the relevant target population in which the study is being done. This aspect is known as external validity: the results of the study can be extrapolated to the whole population.

When we talk of the accuracy of an estimate, accuracy consists of two things: precision and validity. Think of a bull's-eye that we want to hit. We want to be precise as well as valid, so that we hit the bull's-eye as many times as possible. Similarly, an epidemiological study can have results that are both precise and valid, which is what we want in every study. However, there can be studies where the results are precise, meaning that every time the study is done you get similar results, but the methodology was not correct and so the results are not valid. It may happen that the results are not precise but are valid. Or, in the worst-case scenario, the results may be neither precise nor valid. So when we are looking at any epidemiological study, we need to be wary of both precision and validity.

As I told you, in epidemiological studies all that we are doing is basically estimating. We are estimating either the frequency of a disease or health outcome, or the effect of an exposure on an outcome. When we make these estimates, errors are bound to happen. There are two kinds of errors that we come across in epidemiological studies. The first are random errors, or errors due to chance. These are basically the variability due to unknown or uncontrollable causes, such as errors in sampling or errors in measurement. The more problematic errors are systematic errors, or biases. These are the errors that threaten the validity of any epidemiological study.

How do these errors happen? They come from the way we do the study, the methodology we use. If the study is done in a way that tends to produce results that are not the true results, that leads to errors which are called biases. Ultimately, the estimates or associations that we assess between the exposure and the outcome in the study sample may differ from the true causal association between the same exposure and outcome in the source population.

So let us look at the various kinds of biases, or threats to validity, in epidemiological studies. There are essentially three kinds of bias that may occur in any epidemiological study: selection bias, information bias, and confounding. Let's go through them one by one.

Coming to selection bias: selection bias arises from the procedures we use to select study populations. Remember that in an epidemiological study we are sampling a certain number of individuals to participate in the study. Are we sure that these study participants accurately represent the target population? If there is any problem in the way we select these people, the result is what we call selection bias.

How does this happen in epidemiological studies? Remember that we are selecting our cases and controls. We may be using a surveillance mechanism with systematic notification of cases, and if we take more exposed cases from that mechanism, that is one way in which selection bias can occur. We may be screening and diagnosing more systematically among those who are exposed, if we know their exposure history beforehand, and that can artificially create bias. Selection bias can also occur if we select our cases and controls from health care facilities or hospitals, where case patients who are exposed may be more likely to be admitted, or the other way around. Another common way in which selection bias occurs is when we select only those cases who are alive. Cases who have died are not part of our study, and the reason the surviving cases are alive may have to do with their exposure status, so selecting only surviving patients can lead to selection bias.

In cohort studies, selection bias usually occurs when there is loss to follow-up. Remember that in cohort studies we have to follow people up over a period of time. If people who are less exposed or more exposed are more likely to be lost, or if people at higher or lower risk are more likely to be lost to follow-up, that can eventually lead to biased results, and that would be attributable to selection bias.

How do we deal with selection bias? We can deal with it at any stage of our study. Ideally, we want to make sure that the way in which we design the study is free from selection bias. One way is to use incident cases and not prevalent cases, because prevalent cases have the issue of survival bias. Case-control studies are especially prone to selection bias. One way to deal with it in case-control studies is to use a population-based design rather than a hospital-based design, so that the cases and controls are selected from the community or the population and not from a few particular health care facilities. We need to apply the same eligibility criteria when selecting cases and controls, and not lean towards a particular exposure among them. Both cases and controls should also undergo the same diagnostic procedures and the same intensity of surveillance in order to be identified as cases and controls, so that we are not biased at the time of selection.

At the time of data collection, we need to minimize non-response, minimize non-participation, and make sure that we don't lose many people, especially in cohort studies with a long follow-up period. Even so, we should anticipate that we may lose people, so it is a good idea to keep a record of everyone who is lost, with at least some basic sociodemographic characteristics. Later, at the analysis stage, we can compare people who were lost to follow-up with those who remained in the study and see whether there are any major differences between these two populations that could lead to selection bias. We also need to make sure at the time of data collection that the diagnosis of disease is not affected by the exposure status. This means that the person who is selecting the cases and controls should not be aware of their exposure status. One way in which we do this is called blinding.

Even at the analysis stage, as I told you before, we can compare those who responded with those who didn't respond, and those who dropped out with those who remained in the study, with respect to the baseline variables, and see how large the differences between these two groups are. If we find large differences, it is suggestive of selection bias. However, small differences do not rule out selection bias, so we need to be wary of that. Another way to assess whether selection bias may have occurred in our study is to do what we call a sensitivity analysis. Here we repeat the analysis under assumptions about how much bias could have occurred and in which direction, and see how that affects the study results. If the results change in a major way, we cannot rule out that selection bias has materially distorted our findings.

Moving on to the next threat to validity, which is called information bias. Information bias is essentially a bias that can occur when we are measuring the characteristics of study participants. What do we measure? We measure exposures, we measure outcomes, and we measure other variables that may influence the exposures and the outcomes, which are called third factors, confounders, or modifiers. We need to make sure that our measurements accurately represent what is actually there: that the level of exposure is accurately measured, that the presence or absence of the outcome is accurately measured, and that other variables, such as sociodemographic variables like age, education, and income, are also appropriately measured.

How does information bias happen? In case-control studies, it can happen if the exposure information we collect leans towards a particular exposure status, for example if we try to collect more from people who are exposed than from the unexposed, or the other way around. One of the very common ways in which information bias occurs in case-control studies is through the process of recall.
Remember that we have cases and controls, and we are asking them to recall their past history of exposures. It may so happen that people who are diseased, or who have had a certain health event, are more likely to recall certain exposures than people who are healthy. This is what we call recall bias. It is also possible that better exposure data are available on cases than on controls, and that again can lead to information bias.

In cohort studies, information bias can happen if we collect information leaning towards a specific outcome status. If we follow the exposed population much more rigorously than the unexposed population, that can lead to information bias. It is also possible that better outcome data are available among the exposed than among the unexposed, which can again produce information bias in the study.

Information bias can be introduced by what the investigator does: the way in which the investigators collect information about the cases, about the controls, about the exposure, and about whether participants get the disease or not. If this is done in a systematically different way across groups, that can lead to information bias. And last but not least, remember that in observational epidemiological studies in general, we depend on what our study participants tell us. If there is any systematic distortion of the true facts by the study participants, that is going to lead to information bias.

How do we deal with information bias? First of all, because we are measuring the exposure variable, the outcome variable, and the other variables, we need to set up precise operational definitions of what we are going to measure and in what units. We need detailed measurement protocols for how we are going to measure each of these variables. Sometimes it is also a good idea to take repeated measurements of key variables. Take blood pressure, for example. We know that blood pressure can vary from time to time, so we may take more than one reading and then average the readings to say what the blood pressure of that individual is at that particular point in time.

It is very important that the investigators are trained and certified in following the study protocol and all the methodology needed to collect information. We can do data audits, both of the interviewers and, of course, of the data management centers where the data are stored, to make sure that the data are collected, retrieved, and stored correctly and that no information bias arises from these steps. Once the data are collected, we need to make sure that they are cleaned. We need to go through the data both visually and with computer programs, and make sure that we end up with clean data. It is also good practice to rerun all your analyses before you send your paper for publication, just to make sure that no errors have crept in because of the way the analysis was done.

Now we are going to look at the next threat to validity, which is called confounding. The word confounding comes, via French, from the Latin confundere, "to mix together", and here it means a confusion of effects. What effects are we talking about? Remember that what we are doing in an epidemiological study is looking at the effect of an exposure on an outcome: if you are more exposed, are you more likely to get the disease, or vice versa? What we want to know is the effect of this exposure on a particular outcome. This effect can be confused with the effect of a third factor which influences both the outcome and the exposure, and this is what leads to the phenomenon called confounding.

What does confounding actually do? Confounding is probably the biggest threat to validity in any epidemiological study. Confounding can simulate an association, showing you one even when it does not exist. Confounding may hide an association that is actually there. Confounding may increase or decrease the strength of the association, so that you conclude an exposure is more associated or less associated with the outcome than it actually is. And in the worst-case scenario, confounding can change the direction of an effect: an exposure causes an outcome, but because of confounding you see the exposure as preventing that outcome. That is the most dangerous threat to validity in any epidemiological study.

So how does confounding happen? Diagrammatically, a confounder is a third factor, a variable that influences both the exposure and the outcome. When we try to determine the association between the exposure and the outcome, this association is influenced by the third factor.

We can deal with confounding both at the design stage and at the analysis stage. It is always better to deal with it at the design stage than to leave it to the analysis stage. At the design stage we can do several things. First, we can do what is called restriction. We restrict our study participants to only those people who are in one stratum of the confounder, so that the confounder cannot play a role in the association between exposure and outcome. Second, we can do what is called matching. If we already know what the potential confounders could be in a particular study, we can match our cases and controls on those confounders. This negates the effect of the confounder, and the association that we see between the exposure and the outcome is then without the influence of the confounder. Of course, remember that if you do matching, you have to do what is called a matched analysis. In experimental studies, we do what is called randomization. Randomization takes care of confounders automatically by making the two arms of a randomized trial similar, on average, with respect to the confounding variables.

At the time of analysis, what we need to do in every study is first to test whether there is any confounding, that is, whether there are variables that could be acting as confounders and need to be taken care of in the analysis. This is where we do what is called stratified analysis. We stratify our data into the various strata of the confounder and then look at the associations within each stratum, which helps us identify whether there is confounding or not. To take care of these confounders, we can do what is called a multivariate analysis, in which we use regression techniques, whether logistic regression, linear regression, or other advanced methods, to take the effect of confounding into account. The associations that we then get between the exposure and the outcome are without the influence of the confounder, or, as we say, adjusted for the confounders.

So how do you evaluate associations? Whenever you see a study, when you see a risk ratio or an odds ratio, what you see is a crude association. How do we make sure that this crude association is actually the true or causal association, the true relationship between the exposure and the outcome? We need to go through this spiral. We need to make sure that the association is not because of chance. We need to ensure that there is no selection bias. We need to check whether there could be any information bias. We need to understand whether there could be confounding and whether that confounding has been taken care of. Only after going through this process are we able to say whether the crude association is actually the causal association or not.

So, coming back to our problem: does coffee really increase the risk of heart attack? Well,
let's analyze this. Ideally, what we want to do is look at the whole population, all adults in the population who were drinking coffee. In the study, what we get is a sample of people who agreed to take part. These people could be more likely to drink coffee, or less likely to drink coffee. They may be hospitalized patients. If you are doing the study in a hospital, it may be that these patients are hospitalized for, say, gastric ulcer, and that this is related to coffee drinking. So the way in which we select these participants can lead to a bias, and that is what is called selection bias.

The exposure that we are trying to assess here is the actual coffee intake of the study participants, and what we get from the study participants is what they report. Are they reporting their true coffee intake? Do they actually remember how many cups of coffee they have had in the past? What is the average number of cups they drink? Do they drink coffee with milk or without milk? What is the strength of the coffee? All of these issues can influence whether the coffee intake that we are measuring is the true coffee intake, and that can lead to information bias. Remember that we are also trying to see whether people really had a heart attack or not. It is possible that a heart attack was misdiagnosed. The chest pain that study participants report as a heart attack may actually have been due to other causes. So what we may be seeing is not heart attack but some other cause of chest pain, and again the study results would be influenced by information bias.

And then, of course, there is confounding. Could this association that we saw between coffee and heart attack, which we call myocardial infarction, or MI, in medical terminology, be confounded by smoking? We know that smoking is a known risk factor for heart attack. It is also known that smokers are more likely to be coffee drinkers. Because we may see more smokers among the coffee drinkers and more smokers among those who had a heart attack, it is possible that the association we are seeing between coffee and heart attack is not due to the coffee itself but to the effect of smoking on heart attack. So the association between coffee and MI could simply have been confounded by the effect of smoking.

What we need to understand is that there are various threats to validity in any epidemiological study. These biases can occur in all epidemiological studies, more so in observational studies such as case-control and cohort studies, and less so in randomized trials. Biases can occur during all stages of the study: if the study is not designed appropriately, if it is not conducted appropriately, or if the analysis is not done appropriately, any of these can lead to one bias or another. We know that biases threaten both internal and external validity. Remember that a study which has no internal validity cannot be generalized, and so it does not have any external validity.

So what we need to keep in mind is that when we are designing a research study, we need to think of all the possible ways in which these various biases could creep into our study, design it appropriately, and try to prevent as many biases as possible at the time of designing and implementing the study. However, we should also remember that there may be some biases that cannot be avoided. At the time of analysing the results, we need to be aware of what these biases could have been and state them as limitations of the study. So it is critical that whenever we look at the results of any epidemiological study, we are wary of the possible threats to the validity of the study and make sure that the investigators have taken care of these various threats.

Thank you.
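
The coffee, smoking, and heart attack example from the lecture can be worked through numerically. The sketch below is only an illustration: every count in it is made up for this purpose and does not come from any real study, and the names `TwoByTwo`, `collapse`, and `mantel_haenszel_or` are hypothetical helpers written for this example. It compares the crude odds ratio with the stratum-specific odds ratios and with a Mantel-Haenszel pooled odds ratio, which is one common way of carrying out the stratified analysis described in the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """One 2x2 table of exposure (coffee) by outcome (MI)."""
    exposed_cases: int       # coffee drinkers with MI
    exposed_controls: int    # coffee drinkers without MI
    unexposed_cases: int     # non-drinkers with MI
    unexposed_controls: int  # non-drinkers without MI

    @property
    def total(self) -> int:
        return (self.exposed_cases + self.exposed_controls
                + self.unexposed_cases + self.unexposed_controls)

    def odds_ratio(self) -> float:
        # OR = (a * d) / (b * c)
        return ((self.exposed_cases * self.unexposed_controls)
                / (self.exposed_controls * self.unexposed_cases))


def collapse(tables):
    """Ignore the stratifying variable by summing the strata cell by cell."""
    return TwoByTwo(
        sum(t.exposed_cases for t in tables),
        sum(t.exposed_controls for t in tables),
        sum(t.unexposed_cases for t in tables),
        sum(t.unexposed_controls for t in tables),
    )


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled OR: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(t.exposed_cases * t.unexposed_controls / t.total
                    for t in tables)
    denominator = sum(t.exposed_controls * t.unexposed_cases / t.total
                      for t in tables)
    return numerator / denominator


# Hypothetical counts, chosen so that smokers are both more likely to
# drink coffee and more likely to have had an MI.
strata = {
    "smokers": TwoByTwo(80, 120, 20, 30),
    "non-smokers": TwoByTwo(10, 90, 40, 360),
}

crude_or = collapse(strata.values()).odds_ratio()
adjusted_or = mantel_haenszel_or(strata.values())

print(f"Crude OR (smoking ignored): {crude_or:.2f}")
for name, table in strata.items():
    print(f"OR among {name}: {table.odds_ratio():.2f}")
print(f"Mantel-Haenszel OR (adjusted for smoking): {adjusted_or:.2f}")
```

With these made-up counts, the crude odds ratio is about 2.8, while the odds ratio within each smoking stratum and the pooled odds ratio are all 1.0. In this constructed data set the apparent effect of coffee is entirely a confusion with the effect of smoking.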

