RNA-seq
Quantification
Harm Nijveen
Differential expression
Which genes are higher/lower expressed between tissues, after treatment, etc.?
Differentially Expressed Genes (DEGs) have an expression level that is significantly different between different conditions.
Is the expression of gene X different between root and leaf?
[Plot: expression of gene X, root vs. leaf]
Based on one sample: perhaps…
[Plot: expression of gene X, root vs. leaf]
Based on these samples: NO!
[Plot: expression of gene X, root vs. leaf]
Based on these samples: YES!
Is a gene differentially expressed?
With only one measurement: impossible to say.
We have to know the within-treatment variation.
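To illustrate why the within-treatment variation matters, here is a minimal sketch that compares the same difference in means under high and low variation. The expression values are hypothetical, and Welch's t statistic is used only as a familiar example of scaling a difference by its variation; the slides do not prescribe this test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic: difference in means divided by its standard error."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical expression values for gene X (arbitrary units).
# Both pairs differ by 10 units on average; only the spread changes.
root_noisy = [10, 30, 50]
leaf_noisy = [20, 40, 60]
root_tight = [29, 30, 31]
leaf_tight = [39, 40, 41]

t_noisy = welch_t(leaf_noisy, root_noisy)
t_tight = welch_t(leaf_tight, root_tight)
print(f"noisy replicates: t = {t_noisy:.2f}")
print(f"tight replicates: t = {t_tight:.2f}")
```

The same 10-unit difference gives a t statistic of about 0.6 with noisy replicates and about 12 with tight ones, which mirrors the NO!/YES! contrast.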
Determining expression variation
Accurately determining the variation requires many biological samples (replicates).
Unfortunately, in most cases we only have two or three replicates.
Variation has to be estimated.
Read count distribution
Poisson distribution: variance = mean. Holds for technical replicates.
Negative binomial: variance > mean. Better fit for biological replicates.
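A small sketch of the two variance models. The mean/dispersion form of the negative binomial (variance = mean + dispersion x mean^2) and the dispersion value 0.1 are assumptions of this example; the slide only states that the variance exceeds the mean.

```python
def poisson_variance(mean):
    """Poisson: the variance equals the mean."""
    return mean


def negative_binomial_variance(mean, dispersion):
    """Negative binomial, mean/dispersion form: mean + dispersion * mean^2.

    The dispersion parameter is an assumption of this sketch.
    """
    return mean + dispersion * mean ** 2


# Compare both models at a few mean read counts (dispersion 0.1 is hypothetical).
for mu in (10, 100, 1000):
    nb = negative_binomial_variance(mu, dispersion=0.1)
    print(f"mean {mu}: Poisson variance {poisson_variance(mu)}, NB variance {nb:.0f}")
```

With a dispersion of zero the two models coincide; with any positive dispersion the extra variance grows with the square of the mean.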
https://intro-prog-bioinfo-2012.wikispaces.com/
Variance depends on the mean
Trapnell et al. 2012
Main assumption: variance depends on the mean.
Objective: find a function that best describes the relationship between the mean and variance.
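One possible way to find such a function, sketched under strong assumptions: take variance = mean + alpha x mean^2 as the functional form and fit a single alpha across genes by unweighted least squares. The read counts are hypothetical, and this is not the estimator used by any of the tools named in these slides, which use more elaborate procedures.

```python
from statistics import mean, variance


def fit_dispersion(counts_per_gene):
    """Least-squares fit of alpha in: variance = mean + alpha * mean^2.

    Minimises the squared error of (variance - mean) against alpha * mean^2
    over all genes; negative estimates are clipped to zero.
    """
    numerator = 0.0
    denominator = 0.0
    for counts in counts_per_gene:
        m = mean(counts)
        v = variance(counts)
        numerator += (v - m) * m ** 2
        denominator += m ** 4
    return max(numerator / denominator, 0.0)


# Hypothetical read counts: one list of replicates per gene.
counts_per_gene = [
    [8, 12, 10],
    [90, 120, 105],
    [800, 1300, 1050],
]
alpha = fit_dispersion(counts_per_gene)
print(f"fitted dispersion: {alpha:.4f}")
```

Pooling genes this way is what lets the variation be estimated from only two or three replicates per condition: each gene borrows information from genes with a similar mean.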
p-value
To find differentially expressed genes we can do a statistical test and determine a p-value.
p-value = 0.05 means that there is a 5% chance for a not-differentially expressed gene to show this kind of expression difference.
But: with 10,000 genes, i.e. 10,000 tests, you can expect 0.05 x 10,000 = 500 false positives!
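The false-positive arithmetic can be checked with a short simulation. It assumes that no gene is truly differentially expressed, in which case p-values are uniformly distributed between 0 and 1; the seed is arbitrary.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n_genes = 10_000
alpha = 0.05

# Assumption: every gene is NOT differentially expressed, so each
# p-value is a uniform random number between 0 and 1.
p_values = [random.random() for _ in range(n_genes)]
false_positives = sum(p < alpha for p in p_values)

expected = alpha * n_genes
print(f"expected about {expected:.0f} false positives, simulated {false_positives}")
```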
Multiple testing correction
We need to correct the p-value for doing a large number of tests.
We can use the False Discovery Rate (FDR), which produces an adjusted p-value called the q-value.
q-value = 0.05 means that about 5% of the genes called significant at that threshold are expected to be not differentially expressed.
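A minimal sketch of one common FDR method, the Benjamini-Hochberg procedure. The slides do not name a specific method, so this choice is an assumption, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * n / rank over this rank and all higher ranks.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


p_values = [0.001, 0.008, 0.039, 0.041, 0.6]  # hypothetical p-values
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
print(q_values)
print(significant)
```

In this example four of the five raw p-values are below 0.05, but only two genes remain significant after adjustment.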
Some tools…
DESeq / DESeq2
edgeR
Sleuth (kallisto)
HISAT2 / StringTie / Ballgown (can quantify isoforms)
Plotting DEGs
Volcano plot
x: log2(fold change)
y: -log10(p-value)
MA plot
x: mean expression
y: log2(fold change)
A dot represents one gene.
Red dots are significant.
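The axis definitions above translate into coordinates for one gene as follows. This is a sketch with hypothetical values; it assumes both mean expression values are non-zero and uses the plain average of the two conditions as the MA plot's mean expression.

```python
from math import log2, log10


def plot_coordinates(mean_a, mean_b, p_value):
    """Return the (x, y) positions of one gene in a volcano plot and an MA plot."""
    log_fold_change = log2(mean_b / mean_a)
    volcano = (log_fold_change, -log10(p_value))
    ma = ((mean_a + mean_b) / 2, log_fold_change)
    return volcano, ma


# Hypothetical gene: mean expression 100 in root, 400 in leaf, p-value 0.001.
volcano, ma = plot_coordinates(100, 400, 0.001)
print("volcano (x, y):", volcano)
print("MA (x, y):", ma)
```

A four-fold increase maps to a log2(fold change) of 2, and a four-fold decrease to -2, which is why both plots are symmetric around zero on that axis.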
[Figure: volcano plot and MA plot]
Red dots have p-adj < 0.01
Schurch, N. J., P. Schofield, M. Gierliński, C. Cole, A. Sherstnev, V. Singh, N. Wrobel, K. Gharbi, G. G. Simpson and T. Owen-Hughes (2015). "Evaluation of tools for differential gene expression analysis by RNA-seq on a 48 biological replicate experiment." arXiv preprint arXiv:1505.02017.
[Figure axis: log2(|fold change|)]
Which are the interesting genes?
Highest fold change?
Lowest p-value?
Other?
Comparing multiple treatments
Time series, multiple tissues, etc.
Look for genes with a similar expression pattern, using various kinds of clustering methods.
Co-expression
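As an illustration of "similar expression pattern", the sketch below uses Pearson correlation between expression profiles, one similarity measure that clustering methods can build on. The slides do not name a specific method; the gene names, profiles and the 0.9 threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def coexpressed_with(gene, profiles, threshold=0.9):
    """List the genes whose profile correlates with `gene` at or above the threshold."""
    return [
        other
        for other, values in profiles.items()
        if other != gene and pearson(profiles[gene], values) >= threshold
    ]


# Hypothetical expression profiles over four time points.
profiles = {
    "geneA": [1, 2, 4, 8],
    "geneB": [10, 20, 40, 80],
    "geneC": [8, 4, 2, 1],
}
partners = coexpressed_with("geneA", profiles)
print(partners)
```

geneB follows the same pattern as geneA at a ten-fold higher level and is reported as co-expressed; geneC runs in the opposite direction and is not.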
And now?
How do the ‘usual suspects’ behave?
Which biological processes are enriched?
Which pathways are enriched?
Depends on the biological question!
Continued this afternoon…