Advanced Gene Mapping Course · math.ucdenver.edu/~spaul/empty/hostedfiles/...
Advanced Gene Mapping Course
Rockefeller University, NY
Subrata Paul
Genetics for Statistics is what Physics is for Mathematics
Genetics is a leading motivation for the development of new basic statistics.
Topics Covered
• Population- and family-based association studies
• Data QC
• Rare variant association analysis
• Detecting interaction
• Imputation
• Meta-analysis
• Linear mixed model
• eQTL mapping
• Evolutionary genetics
• Incorporating functionality in rare variant association
• Missing heritability

• PLINK
• VAT
• GenABEL
• BEAM3
• CASSI
• MACH
• MINIMAC
• METAL
• GCTA-MLMA
• GERP
• GenMAPP
• Etc.
Instructors
• Heather Cordell, Institute of Genetic Medicine, Newcastle University, UK
• Suzanne M. Leal, Baylor College of Medicine
• Goncalo Abecasis, University of Michigan School of Public Health
• Nancy J. Cox, Vanderbilt Genetics Institute
• Shamil Sunyaev, Department of Medicine, Harvard Medical School
GWAS: WTCCC
Wellcome Trust Case Control Consortium
• 7 different diseases: bipolar disorder, coronary artery disease, Crohn's disease, hypertension, rheumatoid arthritis, type 1 and type 2 diabetes
• 2000 cases for each disease
• Common population-based controls
• Found signals for 6 out of 7 diseases
• Expanded to WTCCC2 and WTCCC3 with 5200 common controls
Data QC:
• Low call rates; excess heterozygosity
• X chromosome markers useful for checking gender
• Checking relationship and ethnicity
• Mendelian misinheritances
• Hardy-Weinberg disequilibrium
• Minor Allele Frequency
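As an illustration of one of these checks, departure from Hardy-Weinberg proportions at a single SNP can be tested by comparing the observed genotype counts with those expected from the allele frequency. The sketch below is a minimal 1-df chi-square version; the function name and the example counts are hypothetical, and an exact test is usually preferred when counts are small.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for departure from Hardy-Weinberg proportions.

    Arguments are the observed counts of the three genotypes AA, AB, BB.
    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("monomorphic SNP: the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts with a strong deficit of heterozygotes.
chi2, p_value = hwe_chi_square(40, 20, 40)
```

SNPs with very small p-values in controls are typically flagged for genotyping problems; the cut-off is a study-specific choice and is not given on the slide.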
Data QC: Call Rates and Heterozygosity
Inbreeding
Sample Contamination
Assessing Sex
• Males with an excess of heterozygous SNPs on the X chromosome can denote
  • Males mislabeled as females
  • Males with Klinefelter syndrome
• Females with an excess of homozygous genotypes on the X chromosome can denote
  • Females mislabeled as males
  • Females with Turner syndrome
• Can be observed due to sample mix-ups
• Samples for which the sex is incorrect should be removed from the analysis (probably not the person you think it is)
Data QC:
• Ethnicity
QQ Plots (good)
QQ Plots (bad)
Genomic Inflation Factor
• The genomic inflation factor is the ratio of the median of the test statistics to the expected median, and is usually represented as $\lambda$
• No inflation of the test statistics: $\lambda = 1$
• Inflation: $\lambda > 1$
• Deflation: $\lambda < 1$
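The definition above translates directly into a few lines of code. This is a minimal sketch that assumes the test statistics are 1-df chi-square statistics, whose expected median under the null is about 0.4549; the function name is hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation_factor(chi2_stats):
    """Ratio of the observed median test statistic to its expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
```

Values noticeably above 1 across the whole genome suggest confounding such as population stratification or cryptic relatedness rather than true association.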
Population Stratification
• The population sampled actually consists of several sub-populations that do not intermix
• Can lead to spurious false positives (type I errors) in case/control studies
• Solutions:
  • PCA
  • MDS (Multidimensional Scaling), also known as principal coordinates analysis
Population Stratification (PCA)
• Compute the eigenvectors and eigenvalues of the matrix of correlations between individuals (based on IBD or IBS)
• Include principal component scores from the top 10 (say) eigenvectors as covariates in a logistic regression analysis
• Plotting the first two principal components lets you visualize ethnic outliers
• Linear Mixed Model
  • Estimate the kinship matrix (IBD sharing) between pairs of individuals using genome-wide genotype data
  • Use this to model their (extra) correlation in a linear-regression-type analysis
Population Stratification (Variance components models)
• An alternative approach based on variance components models has been proposed
  • Kang et al. (2010) Nat Genet 42:348-354
  • Zhang et al. (2010) Nat Genet 42:355-360
• Based on methods designed to test for genotype associations with quantitative traits: linear regression
$y = \mu + \beta x + \epsilon$
where
$y$ is the trait value
$x$ is a variable coding for genotype
$\epsilon \sim N(0, \sigma^2)$ is the residual error
Variance Components (mixed) models
• Linear mixed models allow this idea to be applied to related individuals
• $\epsilon \sim MVN(0, V)$, where the variance/covariance matrix $V$ follows the standard variance components model, accounting for known kinship
  • $V_{ij} = \sigma_a^2 + \sigma_e^2$ for $i = j$
  • $V_{ij} = 2\Phi_{ij}\sigma_a^2$ for $i \neq j$
• $\sigma_a^2, \sigma_e^2$ represent the additive polygenic variance (due to all loci) and the environmental (= error) variance, respectively
• $\Phi_{ij}$ is half the expected IBD sharing between individuals $i$ and $j$ (= their kinship coefficient)
• Closely related to QTDT (Abecasis et al. 2000a; b), which implements a slightly more general/complex model
• Software to implement: GenABEL, EMMAX, FaST-LMM, GEMMA, MMM
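To make the two-case definition of $V$ concrete, the sketch below builds the matrix from a kinship matrix and the two variance components. The function name and the numbers in the example are hypothetical; estimating $\sigma_a^2$ and $\sigma_e^2$ from data is what the listed software does and is not shown here.

```python
def covariance_matrix(kinship, sigma2_a, sigma2_e):
    """Build V with V_ii = sigma2_a + sigma2_e and V_ij = 2 * Phi_ij * sigma2_a.

    kinship[i][j] is the kinship coefficient Phi_ij between individuals i and j;
    its diagonal is not used.
    """
    n = len(kinship)
    v = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                v[i][j] = sigma2_a + sigma2_e
            else:
                v[i][j] = 2.0 * kinship[i][j] * sigma2_a
    return v

# Hypothetical example: individuals 0 and 1 are full siblings (Phi = 0.25),
# individual 2 is unrelated to both.
phi = [
    [0.5, 0.25, 0.0],
    [0.25, 0.5, 0.0],
    [0.0, 0.0, 0.5],
]
V = covariance_matrix(phi, sigma2_a=0.6, sigma2_e=0.4)
```

With these values the sibling pair has covariance 0.3 and the unrelated pairs have covariance 0, so only related individuals contribute extra correlation to the residuals.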
Linear Mixed Model (detailed)
$Y_i = \sum_j \beta_j X_{ij} + \epsilon$
$X_{ij}$: normalized genotype of individual $i$ at SNP $j$
In matrix form:
$\bar{y} = X\bar{\beta} + \epsilon$
Two important matrices:
$LD = \frac{1}{M} X^T X$
$GRM = \frac{1}{N} X X^T$
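The two matrix products can be sketched in plain Python as follows. The slide does not define $M$ and $N$, so the scaling here is an assumption: $X$ is taken to have one row per individual and one column per SNP, the SNP-by-SNP product is averaged over individuals, and the individual-by-individual product is averaged over SNPs, which is the usual convention and may not match the slide's symbols. All function names are hypothetical.

```python
import math

def standardize_columns(genotypes):
    """Center and scale each SNP column to mean 0 and variance 1."""
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    x = [[0.0] * n_snp for _ in range(n_ind)]
    for j in range(n_snp):
        col = [row[j] for row in genotypes]
        mean = sum(col) / n_ind
        sd = math.sqrt(sum((g - mean) ** 2 for g in col) / n_ind)
        for i in range(n_ind):
            x[i][j] = (col[i] - mean) / sd  # assumes the SNP is polymorphic
    return x

def ld_matrix(x):
    """SNP-by-SNP matrix X^T X, averaged over individuals (assumed scaling)."""
    n_ind, n_snp = len(x), len(x[0])
    return [[sum(x[i][a] * x[i][b] for i in range(n_ind)) / n_ind
             for b in range(n_snp)] for a in range(n_snp)]

def grm(x):
    """Individual-by-individual matrix X X^T, averaged over SNPs (assumed scaling)."""
    n_ind, n_snp = len(x), len(x[0])
    return [[sum(x[a][j] * x[b][j] for j in range(n_snp)) / n_snp
             for b in range(n_ind)] for a in range(n_ind)]

# Hypothetical genotypes (0/1/2 allele counts): 4 individuals, 2 SNPs.
G = [[0, 2], [1, 1], [2, 0], [1, 1]]
X = standardize_columns(G)
LD = ld_matrix(X)
K = grm(X)
```

With standardized columns the diagonal of the LD matrix is 1 and its off-diagonal entries are correlations between SNPs, while the GRM measures genotype similarity between pairs of individuals.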
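Linear Mixed Model (detailed)
Our model:
$Y_i = \sum_j \beta_j X_{ij} + \epsilon$
We have to fit markers individually:
$Y_i = \beta_1 X_{i1} + \sum_{j \ge 2} \beta_j X_{ij} + \epsilon \sim \beta_1 X_{i1} + \epsilon'$
For each SNP we can fit the model:
$Y_i = \beta X_i + u_i + \epsilon$, with $\epsilon \sim N(0, I\sigma^2)$ and $u \sim MVN(0, GRM)$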
ROADTRIPS
• Robust Association-Detection Test for Related Individuals with Population Substructure
• Thornton and McPeek (2010) AJHG 86:172-184
• Extension of MQLS (Maximum Quasi-Likelihood Statistic)
• Both methods construct an adjusted version of the case/control $\chi^2$ (or Armitage trend) test
• Using known pedigree relationships to correct for relatedness
• ROADTRIPS also uses a covariance matrix based on kinship/IBD sharing to correct for unknown relatedness/population stratification
Complex Trait: Rare Variants
• MRV (Multiple Rare Variant) hypothesis: complex traits are the result of multiple rare variants with a large phenotypic effect
• Large effect size compared to common variants
• Although these variants are rare, collectively they may be quite common
• Strong evidence that rare variants play an important role
Functional Rare Variants
Kiezun, Garimella, Do, Stitziel et al. Nature Genetics 2012
Analysis of Rare Variants
• Difficulties
  • Lack of a rare variant catalog with reference genotypes
  • Large sample size needed
    • Sampling an allele with frequency 0.5% or 0.05% with probability 99% needs 460 or 4600 individuals, respectively
  • Better analytical toolbox needed to gain power
  • Common variants have only a limited capacity to tag rare variants
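The sample sizes quoted above can be reproduced with a short calculation, assuming each individual contributes two independent chromosomes and "sampling the allele" means observing at least one copy. Under that assumption the requirement is $1 - (1 - f)^{2n} \ge 0.99$ for allele frequency $f$; the function name is hypothetical.

```python
import math

def individuals_needed(allele_freq, prob=0.99):
    """Smallest number of diploid individuals such that at least one copy of
    the allele is sampled with probability >= prob (independent chromosomes)."""
    chromosomes = math.log(1.0 - prob) / math.log(1.0 - allele_freq)
    return math.ceil(chromosomes / 2.0)

n_half_percent = individuals_needed(0.005)    # frequency 0.5%
n_twentieth_percent = individuals_needed(0.0005)  # frequency 0.05%
```

The first value agrees with the 460 on the slide and the second comes out close to the rounded 4600.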
• Single Marker Test
  • Chi-square test
  • Cochran-Armitage test for trend
• Multiple Marker Test
  • Hotelling's $T^2$
  • Logistic regression
  • Minimal p-value
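As a small example of a single marker test, the sketch below runs a 1-df chi-square test on a 2x2 table of allele counts in cases and controls. The function name and the counts are hypothetical, and this is the plain allelic test rather than the Cochran-Armitage trend test.

```python
import math

def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi-square statistic, p-value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical allele counts: 30 of 100 case alleles vs 10 of 100 control alleles.
chi2, p_value = allelic_chi_square(30, 70, 10, 90)
```

For rare variants the expected counts in some cells become small, which is why an exact test is often used instead of the chi-square approximation.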
Single Marker Test
• For case-control data, possible methods: chi-squared, Fisher's exact, Cochran-Armitage trend, logistic regression (linear regression)
• Fisher's exact test is recommended when there are small counts
• Regression analysis controlling for confounders
• Correction for multiple comparisons needed
• Controlling FWER results in a loss of power
• Obtain empirical p-values by random permutation or control FDR (sequential Bonferroni-type procedure)
• Sample size must be very large for sufficient power
• Need 6,400, 54,000 and 540,000 samples for MAF 0.1, 0.01 and 0.001 to get 80% power
• Success example: insulin processing; sample of 8000, variants in SGSM2 with MAF = 1.4%, 𝑝 = 8.7×10WKX and MADD with MAF = 3.7%, 𝑝 = 7.6×10WKZ
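The trade-off between FWER and FDR control can be seen on a small set of p-values. The sketch below compares a Bonferroni correction with the Benjamini-Hochberg step-up procedure; reading the slide's "sequential Bonferroni-type procedure" as Benjamini-Hochberg is an assumption, and the function names and p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Indices of tests rejected while controlling the FWER at level alpha."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]

def benjamini_hochberg(p_values, q=0.05):
    """Indices of tests rejected while controlling the FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its step-up threshold.
        if p_values[idx] <= rank * q / m:
            k_max = rank
    return sorted(order[:k_max])

# Hypothetical p-values from ten single-marker tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
fwer_hits = bonferroni(pvals)
fdr_hits = benjamini_hochberg(pvals)
```

On this example the FDR procedure rejects one more hypothesis than Bonferroni, which illustrates the loss of power that comes with FWER control.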
Multiple Marker Tests
• Multiple regression: reduced degrees of freedom
• Hotelling's two-sample $T^2$ test:
  • Reduction of power with number of variants
  • Greatly affected by MAF
  • Identified risk allele (direction) is needed
• MDMR (Multivariate Distance Matrix Regression)
  • Uses genetic similarity of individuals
  • No need to identify the risk allele at each variant
• KBAT (Kernel-Based Association Test)
  • Based on a genotype similarity score between individuals, measured by a kernel function
  • No assumption about direction
  • Can handle correlated and/or independent SNPs
Gene-based Aggregation Tests
• Regression-based tests
• Burden tests (collapsing)
• Adaptive burden tests
• Variance component tests
• Combination of the above
• Evaluate cumulative effects of multiple variants
• CMC (Combined Multivariate and Collapsing)
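The collapsing idea behind burden tests can be sketched as follows: each individual is reduced to an indicator of carrying at least one rare allele in the gene, and carrier status is then compared between cases and controls. This shows only the collapsing step with a simple chi-square comparison; the full CMC method additionally combines collapsed groups with other variants in a multivariate test. Function names and data are hypothetical.

```python
import math

def collapse(genotype_rows):
    """1 if the individual carries at least one rare allele in the region, else 0."""
    return [1 if any(g > 0 for g in row) else 0 for row in genotype_rows]

def collapsing_test(case_rows, control_rows):
    """Chi-square test (1 df) of carrier status versus case/control status."""
    case_carriers = sum(collapse(case_rows))
    control_carriers = sum(collapse(control_rows))
    a, b = case_carriers, len(case_rows) - case_carriers
    c, d = control_carriers, len(control_rows) - control_carriers
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical data: each row holds one individual's rare-allele counts
# at three variant sites in the same gene.
cases = [[1, 0, 0]] * 15 + [[0, 0, 1]] * 15 + [[0, 0, 0]] * 70
controls = [[0, 1, 0]] * 10 + [[0, 0, 0]] * 90
chi2, p_value = collapsing_test(cases, controls)
```

No single site in this example is common enough to test on its own, but pooling the three sites gives 30 carriers among cases against 10 among controls.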
CMC
Recent Methods and Summary
• CMC: jointly assesses the role of common and rare variants
• WSS: weighted sum statistic
• KBAC: kernel-based adaptive cluster test; weighting scheme
• SKAT: sequence kernel association test
• Power to detect association depends on:
  • The number and proportion of causal variants
  • Population frequency
  • Their effect sizes and directionality
  • Number of genes contributing to the trait
  • The fraction of causal variants located (by sequencing, e.g. exome seq)
Recent Methods and Summary
• Statistical tests are sensitive to disease architecture
• Different tests show strength for different effect size distributions:
  • WSS: $1/x(1 - x)$; $x$ is the population frequency
  • SKAT: $\beta(x; a_1, a_2)$ for pre-specified $a_1, a_2$
• Allow opposite effects on traits
  • Step-up, C-alpha, the replication-based test, SKAT
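Both weighting schemes up-weight rarer variants, which a short calculation makes visible. In the sketch below the WSS weight is coded exactly as written on the slide (Madsen and Browning's original weight involves a square root), and the SKAT weight is the Beta density evaluated at the allele frequency. The defaults $a_1 = 1$, $a_2 = 25$ are an assumption taken from common SKAT usage, not from the slide, and the function names are hypothetical.

```python
import math

def wss_weight(x):
    """Weight 1 / (x (1 - x)) for population allele frequency x, as on the slide."""
    return 1.0 / (x * (1.0 - x))

def beta_density_weight(x, a1=1.0, a2=25.0):
    """Beta(a1, a2) density at allele frequency x (SKAT-style weight).

    The default a1 and a2 are assumed values, not taken from the slide.
    """
    norm = math.gamma(a1 + a2) / (math.gamma(a1) * math.gamma(a2))
    return norm * x ** (a1 - 1.0) * (1.0 - x) ** (a2 - 1.0)

# Compare a rare variant (frequency 1%) with a more common one (frequency 10%).
rare, common = 0.01, 0.10
wss_ratio = wss_weight(rare) / wss_weight(common)
skat_ratio = beta_density_weight(rare) / beta_density_weight(common)
```

Both ratios are well above 1, so either scheme gives the 1% variant considerably more influence on the gene-level statistic than the 10% variant.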
Software Packages
Gene × Gene Interaction
Testing for Interaction
• Logistic (linear) regression for case/control data
  • --epistasis in PLINK
• More powerful: case-only analysis
  • Interaction ⟺ correlation between relevant predictors
  • Test the null hypothesis that the two loci are independent (no correlation)
  • Chi-square test of independence
  • Gains power with the assumption that the two loci are independent in the population
• Preferable to incorporate the case-only and case-control estimators into a single test (greater power than logistic); --fast-epistasis in PLINK performs such a test
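The case-only analysis described above reduces to a chi-square test of independence between the genotypes at the two loci, computed in cases only. The sketch below is a minimal version for a 3x3 genotype table with 4 degrees of freedom; the function name and the counts are hypothetical, and it is not a reimplementation of PLINK's --fast-epistasis.

```python
import math

def case_only_chi_square(table):
    """Chi-square test of independence (4 df) on a 3x3 table of case counts.

    table[i][j] is the number of cases with genotype i at locus A and
    genotype j at locus B. Returns (chi-square statistic, p-value).
    """
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    if len(row_tot) != 3 or len(col_tot) != 3 or 0 in row_tot or 0 in col_tot:
        raise ValueError("expects a 3x3 table with non-empty rows and columns")
    n = sum(row_tot)
    chi2 = 0.0
    for i in range(3):
        for j in range(3):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 4 degrees of freedom.
    p_value = math.exp(-chi2 / 2.0) * (1.0 + chi2 / 2.0)
    return chi2, p_value

# Hypothetical case genotype counts with strong correlation between the loci.
cases = [[30, 5, 5], [5, 30, 5], [5, 5, 30]]
chi2, p_value = case_only_chi_square(cases)
```

The result is only interpretable as interaction if the two loci are independent in the general population, for example because they are unlinked; correlation caused by LD or stratification would produce the same signal.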
PLINK --fast-epistasis
Other Techniques
• Exhaustive search: use GPUs; suffers from multiple testing
• Data mining approach: use cross-validation to avoid overfitting
• Multifactor Dimensionality Reduction (Ritchie et al. (2001) AJHG)
• Random Forest (CART)
• Penalized regression methods (Zhu et al. (2014))
• Entropy-based methods
• BEAM (Zhang et al. (2007))
• Bayesian model selection
• MCMC, MECPM (Jiang and Neapolitan (2015))
THANK YOU