Random RNA interactions control protein expression in prokaryotes

Paul Gardner, University of Canterbury, Christchurch, New Zealand

Upload: paul-gardner

Post on 19-Jan-2017


TRANSCRIPT

Page 1: Random RNA interactions control protein expression in prokaryotes

Random RNA interactions control protein expression in prokaryotes

Paul Gardner

University of Canterbury, Christchurch, New Zealand

Page 2: Random RNA interactions control protein expression in prokaryotes

Feel free to share what you hear

These slides are available at: http://www.slideshare.net/ppgardne/presentations

Page 3: Random RNA interactions control protein expression in prokaryotes

The hard work of Sinan Umu, Ant Poole & Ren Dobson

Page 4: Random RNA interactions control protein expression in prokaryotes

mRNA levels are imperfectly correlated with protein levels

Lu et al. (2007) Nature biotechnology.

Page 5: Random RNA interactions control protein expression in prokaryotes

Determinants of protein concentration

Protein concentration depends on mRNA concentration and on translation and degradation rates

[Figure: DNA [D] → RNA [R] → Protein [P], labelled with the rate constants k_transcription, k_translation, k_mRNA degradation and k_protein degradation, drawn over a nucleotide sequence.]

de Sousa Abreu, Penalva, Marcotte & Vogel (2009) Global signatures of protein and mRNA expression levels. Molecular BioSystems.
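One common way to write the scheme on this slide, assuming simple first-order kinetics (an assumption; the slide only names the four rate constants), is:

```latex
\begin{align}
\frac{d[R]}{dt} &= k_{\text{transcription}}\,[D] - k_{\text{mRNA degradation}}\,[R] \\
\frac{d[P]}{dt} &= k_{\text{translation}}\,[R] - k_{\text{protein degradation}}\,[P]
\end{align}
```

At steady state the second equation gives [P] = (k_translation / k_protein degradation) [R], so in this sketch any gene-to-gene variation in translation or protein degradation rates weakens the correlation between mRNA and protein levels.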

Page 6: Random RNA interactions control protein expression in prokaryotes

Two general models describe variation in translation rate

1. Codon usage (Ikemura, 1981)

Figure from: Tuller & Zur (2015) Nucl. Acids Res.

Page 7: Random RNA interactions control protein expression in prokaryotes

Two general models describe variation in translation rate

2. mRNA structure (Pelletier & Sonenberg, 1987)

Figure from: Tuller & Zur (2015) Nucl. Acids Res.

Page 8: Random RNA interactions control protein expression in prokaryotes

We think we have a third general model...

http://dx.doi.org/10.7554/eLife.13479

http://dx.doi.org/10.7554/eLife.20686

Page 9: Random RNA interactions control protein expression in prokaryotes

Non-coding RNAs are abundant

[Figure: log10(mean read depth) for core ncRNA genes and core protein-coding genes.]

Lindgreen, Umu et al. (2014) PLOS Computational Biology.

Page 10: Random RNA interactions control protein expression in prokaryotes

Bacterial non-coding RNA function

[Figure: mechanisms of bacterial sRNA function. Panel titles: sequestration of ribosome binding site; induction of mRNA decay; anti-antisense mechanism; selective mRNA stabilisation. Labels: Hfq, sRNA, ribosome, AUG, SD, RNase E, RNase E recruitment. SD = Shine-Dalgarno sequence.]

Figure by Bethany Jose

Page 11: Random RNA interactions control protein expression in prokaryotes

Checking for mRNA:ncRNA interactions

- Looking for regulatory interactions, which are specific and small in number; off-targets are non-specific and large in number
- Compare 5′ ends of CDS & ncRNAs
- Looking for a bump on the left...

[Figure: density of binding energy (kcal/mol), x-axis from −15 to 0.]
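The comparison against negative controls can be sketched in a few lines. The helpers below are hypothetical and only illustrate the idea: `toy_pairing_score` is a crude stand-in for a real binding-energy calculation (it counts the longest run of perfect Watson-Crick pairs and ignores G-U wobble and thermodynamics), and the mononucleotide shuffle is one simple choice of control, not necessarily the one used in the study.

```python
import random

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def shuffle_sequence(seq, rng):
    """Return a mononucleotide shuffle of seq (same composition, random order)."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def reverse_complement(seq):
    """Reverse complement of an RNA sequence (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def toy_pairing_score(mrna, ncrna):
    """Longest run of consecutive Watson-Crick pairs between two RNAs.

    A perfectly complementary stretch of the mRNA reads the same as the
    reverse complement of the ncRNA, so this is a longest-common-substring
    search. It is a toy proxy for binding energy, not a thermodynamic model.
    """
    target = reverse_complement(ncrna)
    best = 0
    prev = [0] * (len(target) + 1)
    for a in mrna:
        cur = [0] * (len(target) + 1)
        for j, b in enumerate(target, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def native_vs_shuffled(mrna, ncrna, n_shuffles=100, seed=0):
    """Score the native pair and a set of shuffled-mRNA controls."""
    rng = random.Random(seed)
    native = toy_pairing_score(mrna, ncrna)
    controls = [
        toy_pairing_score(shuffle_sequence(mrna, rng), ncrna)
        for _ in range(n_shuffles)
    ]
    return native, controls
```

With real data the toy score would be replaced by predicted binding energies, and the native and control distributions compared with a statistical test.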

Page 12: Random RNA interactions control protein expression in prokaryotes

Checking for mRNA:ncRNA interactions

[Figure: density of binding energy (kcal/mol) for native and shuffled interactions; shuffled: P = 7.69 × 10^−52.]

Page 13: Random RNA interactions control protein expression in prokaryotes

Checking negative controls!

[Figure: density of binding energy (kcal/mol) for native interactions and five negative controls: shuffled (P = 7.69 × 10^−52), different phylum (P = 0), downstream (P = 2.66 × 10^−124), reverse complement (P = 6.51 × 10^−57), intergenic (P = 6.16 × 10^−93).]

Page 14: Random RNA interactions control protein expression in prokaryotes

Do ubiquitous and abundant RNAs influence translation?

- Given that ncRNAs are among the most abundant RNAs in the cell ([ncRNA] >> [mRNA])
- AND that RNAs frequently hybridise
- THEN maybe stochastic interactions with mRNAs inhibit translation

Corley & Laederach (2016) Bioinformatics: Selecting against accidental RNA interactions. eLife.

Page 15: Random RNA interactions control protein expression in prokaryotes

How can this hypothesis be tested?

- We predict that:
  1. There is selection against mRNA:ncRNA interactions
  2. Stochastic mRNA:ncRNA interactions influence [protein]:[mRNA] ratios
- For consistency: focus on 6 ncRNA families & 114 mRNAs/proteins that are highly conserved & expressed, and on the first 21 nts of the CDS.
- Tested 1,582 bacterial & 118 archaeal genomes

Page 16: Random RNA interactions control protein expression in prokaryotes

Are mRNA:ncRNA interactions selected against?

[Figure: density difference between native and shuffled interactions against binding energy (kcal/mol), with the region of more stable interactions marked. Inset: −log10 P for each group (Act, Bac, Chl, Cya, Fir, Pro, Spi, Arc). Legend:]

Actinobacteria (n = 163): P = 9.8 × 10^−69
Bacteroidetes (n = 60): P = 8.7 × 10^−148
Chlamydiae (n = 38): P = 1.4 × 10^−193
Cyanobacteria (n = 40): P = 3.8 × 10^−11
Firmicutes (n = 378): P = 0
Proteobacteria (n = 756): P = 0
Spirochaetes (n = 38): P = 1.6 × 10^−98
Archaea (n = 118): P = 4.2 × 10^−177
Background (n = 100)

Page 17: Random RNA interactions control protein expression in prokaryotes

Do mRNA:ncRNA interactions influence protein expression?

[Figure: log10(fluorescence) against avoidance (kcal/mol); Rs = 0.65.]

Expression data from: Kudla et al. (2009) Science.

Page 18: Random RNA interactions control protein expression in prokaryotes

Do mRNA:ncRNA interactions influence protein expression?

- Testing the relationship between protein abundance estimates and avoidance, mRNA secondary structure, codon usage and mRNA abundance

[Figure: correlation coefficients (axis from −0.2 to 0.6) between protein abundance and avoidance, secondary structure, codon usage and [mRNA] in each dataset; * P < 0.05.]

GFP datasets:
- E. coli (n = 52), GFP/qPCR
- E. coli (n = 154), GFP/Northern
- E. coli (n = 14,234), mCherry/RNAseq

Mass-Spec datasets:
- E. coli (n = 389), MS/microarray
- E. coli (n = 3,301), MS/microarray
- P. aeruginosa (n = 5,479), MS/microarray
- P. aeruginosa (n = 1,148), MS/microarray
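The Rs values on these slides appear to be Spearman rank correlations (a later slide labels its axis "Spearman's rho"). As a minimal, dependency-free sketch of that statistic, with hypothetical function names and average ranks used for ties:

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

In this setting `x` would hold a per-gene predictor (avoidance, folding energy, codon adaptation or mRNA level) and `y` the matching protein abundance estimates.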

Page 19: Random RNA interactions control protein expression in prokaryotes

Testing the extremes of expression

[Figure, panel A: histogram of log10([Protein]/[mRNA]) (frequency, 0 to 120), with the low expression (n = 10) and high expression (n = 10) genes marked. Panel B: Z scores (−2 to 2) for avoidance, codon usage, secondary structure and a null, shown for the low expression (n = 10) and high expression (n = 10) groups; * marks significant values.]

- E. coli genes (n = 389)

Page 20: Random RNA interactions control protein expression in prokaryotes

Designing mRNAs

- 239 aa GFP can be encoded by 7.62 × 10^111 synonymous mRNAs

- Extremes of avoidance have a stronger effect than codon usage or secondary structure

[Figure: log10(fluorescence) for designed GFP mRNAs against CAI (Rs = 0.29), folding energy in kcal/mol (Rs = 0.34) and binding energy in kcal/mol (Rs = 0.56). Legend: hi, low; Avoid, Fold, Codon, Optimal.]
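The size of the synonymous design space comes from multiplying the number of codons available at each position. A minimal sketch, assuming the standard genetic code and ignoring the stop codon (`count_synonymous_mrnas` is a hypothetical helper; the exact figure quoted on the slide depends on the GFP sequence used, which is not given here):

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_mrnas(protein):
    """Number of distinct coding sequences that encode `protein`."""
    return prod(CODON_DEGENERACY[aa] for aa in protein)
```

Even a short peptide such as `"MKL"` has 1 × 2 × 6 = 12 encodings, and the count grows multiplicatively with every additional residue.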

Page 21: Random RNA interactions control protein expression in prokaryotes

Avoidance in 3D on the ribosome

- Protein binds to regions with low avoidance (green) while exposed regions are high avoidance (blue): P = 9.3 × 10^−15, Fisher's exact test
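Fisher's exact test works on a 2x2 contingency table, here presumably protein-bound versus exposed regions against low versus high avoidance. A minimal two-sided sketch using only the standard library follows; the function name is hypothetical and no counts from the study are reproduced.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * tolerance)
```

For example, the fully separated table `[[3, 0], [0, 3]]` gives P = 0.1, the smallest value possible with only six observations.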

Page 22: Random RNA interactions control protein expression in prokaryotes

Further Work

- Further work:
  - Testing adaptation with experimental evolution experiments
  - Do mRNA:ncRNA interactions influence eukaryotic gene expression?
- Number of possible interactions increases quadratically with number of genes. May require spatial & temporal separation of genes
- Does avoidance drive compartmentalisation and increases in nucleotide binding proteins?
- Do mRNA:ncRNA interactions influence viral infection, hybridisation, HGT & transformation expts?
- Are protein, DNA and protein:nucleotide interactions also avoided?
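The quadratic growth mentioned on this slide follows from counting pairs. As a sketch, assuming n distinct RNA species, each of which could in principle pair with any other:

```latex
\[
\binom{n}{2} = \frac{n(n-1)}{2} \approx \frac{n^{2}}{2} \quad \text{for large } n
\]
```

Doubling the number of genes therefore roughly quadruples the number of potential pairwise interactions.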

Page 23: Random RNA interactions control protein expression in prokaryotes

And now for something completely different...

Page 24: Random RNA interactions control protein expression in prokaryotes

Bioinformaticians are horrible!

- Bioinformaticians are bad, impatient & intolerant
- Build a phylogenetic tree: which of the 172 methods do you use?

Parsimony: ARB, Bionumerics, BIRCH, Bosque, BPAnalysis, CAFCA, CRANN, DAMBE, EMBOSS, TNT, FootPrinter, Freqpars, Gambit, GAPars, GelCompar-II, GeneTree, gmaes, Hennig86, IDEA, LVB, MALIGN, MEGA, Mesquite, Murka, Network, NimbleTree, NONA, Notung, Parsimov, PAST, PAUP*, PAUPRat, PaupUp, phangorn, PHYLIP, PhyloNet, Phylo_win, POY, PRAP, PSODA, RA, SeaView, SeqState, Simplot, sog, TCS

Maximum Likelihood: ALIFRITZ, aLRT, ARB, Bio++, Bionumerics, BIRCH, BootPHYML, Bosque, CodeAxe, CoMET, Concaterpillar, CONSEL, Crux, DAMBE, DART, Darwin, dnarates, DPRML, DT-ModSel, EMBOSS, EREM, fastDNAml, fastDNAmlRev, FASTML, FastTree, GARLI, GZ-Gamma, HY-PHY, IQPNNI, Kakusan4, Leaphy, Mac5, McRate, Mesquite, MetaPIGA, MixtureTree, Modelfit, ModelGenerator, MOLPHY, MrAIC, MrModeltest, MrMTgui, MultiPhyl, NEPAL, NHML, nhPhyML, NimbleTree, p4, PAL, PAML, PARAT, PARBOOT, PASSML, PAUP*, PAUPRat, PaupUp, phangorn, PHYLLAB, PhyloCoCo, Phylo_win, PHYML, PhyML-Multi, PhyNav, PHYSIG, PLATO, Porn*, PRAP, PROCOV, ProtTest, PTP, r8s-bootstrap, Rate4Site, Rate-evolution, RAxML, raxmlGUI, RevDNArates, rRNA-phylogeny, SeaView, Segminator, SEMPHY, SeqPup, SeqState, SIMMAP, Simplot, SLR, Spectronet, Spectrum, SplitsTree, SSA, TipDate, Treefinder, TREE-PUZZLE, Vanilla

Bayesian: MBIOREANC-GENE, BAli-Phy, BAMBE, BayesPhylogenies, BEAST, BEST, Bio++, bms_runner, burntrees, Cadence, Crux, IMa2, Mesquite, MrBayes, MrBayesPlugin, MrBayes-tree-scanners, Multidivtime, p4, PAL, PAML, PHASE, PHYLLAB, PhyloBayes, SIMMAP, tracer, Vanilla

Page 25: Random RNA interactions control protein expression in prokaryotes

How can we choose software?

- Which methods do you use?

Page 26: Random RNA interactions control protein expression in prokaryotes

Approach software like a scientist

- Are any good controls available?
  - Positive: databases, publications, simulation, ...
  - Negative: randomized, select relevant negative data, ...
- Some common accuracy metrics:
  - Sensitivity (true positive rate)
  - Specificity (true negative rate)
  - Matthews correlation coefficient
  - Area under an ROC curve

[Figure: ROC curves (true positive rate against false positive rate, both from 0.0 to 1.0) for DBS (Pfam), DBS (Treefam), DBS (Custom), PROVEAN, PolyPhen-2, SIFT, FATHMM (weighted) and FATHMM (unweighted).]

Wheeler et al. (2016) A profile-based method for identifying functional divergence of orthologous genes in bacterial genomes. Bioinformatics.
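The accuracy metrics listed on this slide can all be computed from labelled positive and negative controls. A minimal sketch with hypothetical function names; the AUC here uses the rank interpretation (the probability that a random positive outscores a random negative, ties counting half), which equals the area under the ROC curve:

```python
from math import sqrt


def sensitivity(tp, fn):
    """True positive rate: fraction of real positives that were found."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: fraction of real negatives that were rejected."""
    return tn / (tn + fp)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient, from -1 (inverted) to +1 (perfect)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any margin of the table is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = sum(
        (p > q) + 0.5 * (p == q)
        for p in positive_scores
        for q in negative_scores
    )
    return wins / (len(positive_scores) * len(negative_scores))
```

The pairwise AUC is quadratic in the number of scores, which is fine for a small benchmark but would be replaced by a sort-based version for large ones.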

Page 27: Random RNA interactions control protein expression in prokaryotes

Benchmarks are useful, and fun...

Page 28: Random RNA interactions control protein expression in prokaryotes

Is there really a relationship between software speed & accuracy?

- Can we run a meta-analysis of bioinformatic benchmarks?
- If speed isn’t related to accuracy, then what is?
- Some possibilities:
  - Software age
  - Journal “impact” (IF & GoogleScholar H5)
  - Number of citations
  - Corresponding author’s H-index & M-index

Page 29: Random RNA interactions control protein expression in prokaryotes

After some literature mining...

I found 43 matching articles.

- 102 benchmarks
- Accuracy & speed ranks for 243 bioinformatic software tools
- Manually extracted IF, H, age, ...
- 65 journals (Bioinformatics, NAR, Genome Research, ...)
- 151 author GoogleScholar profiles

abyss antepiseeker apg barry bellerophontes bfast bismark biss boost bowtie bowtie2 bowtiestar bratbw bsmap

bsmooth bsseeker buckycon buckymrbayes buckymrbayesspa buckypop buckyraxml builder bwa bwasw caml camp carma

ce celera clark clc clustalomega clustalw comus coprarna coral cosine crisp cro cromwell cufflinks cwt dali

de dexseq dialign dialign22 dialignt dialigntx diffsplice diginormvelvet dima djigsaw downhillsimplex dsgseq

ebi echo edenanonstrict edenastrict edit epimode ericscript erpin fa fasta fasttree fisherexacttest

fusioncatcher fusionmap gassst gatk genometa gojobori goldman gossamer gottcha greedyft gsnap heidge hitec

hmmer hshrec idbaud igtpduplossft inchworm infernal intarna jaffa kalign kbsps kraken kthse leidnl limpic

lmat lms lofreq lsqman mafft mafftfftns mafftfftns2 mafftlinsi mapsplice maq mats megan metaphlan metaphyler

methylkit methylsig mgrast minia mira mirdeep mireap mirena mirexpress mlclustalw mlclustalwquicktree mlmafft

mlmafftparttree mlmuscle mlopal mlprankgt modellerv mosaik motu mpest mpjclustalw mpsclustalw mrfast mrpml

mrpmp mrsfast msinspect multalin muscle musclemaxiters mzmine nbc ncbiblast nest newbler nfuse novoalign

oases onecodex openms pairfold paralign pass perm phylonetft phylopythias phymmbl piler poa poy poystar

pragcz probalign probcons probtree process pso pt qiime qsra quake raiphy ravenna raxml raxmllimited

rdiffparam repeatfinder repeatgluer repeatscout reptile rmap rnacofold rnaduplex rnahybrid rnaplex rnaup

rsearch rsmatch sam sate scro scwrl scwrlcons segemehl segmodencad seqgsea seqman seqmap sga sharcgs shrimp

simulatedannealing sl smalt snap snpruler snver soap soap2 soapdenovo soapec soapstar spades sparse

sparseassembler spcomp specarray spt srmapper ssaha ssake ssap ssearch ssm sst st starbeast strcutal

swissmodel taipan targetrna targetrna2 taxatortk tcoffee team tmap tophatfusion transabyss trinity upmes

varscan vcake velvet wmrpmp woodhams wublast xalign xcmswithcorrection xcmswithoutretentiontime zema
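Benchmarks compare different numbers of tools, so raw ranks from separate benchmarks are not directly comparable. One way to put them on a common scale before pooling is sketched below; this is an assumption for illustration, not necessarily the procedure used in the meta-analysis, and `normalised_ranks` is a hypothetical helper.

```python
def normalised_ranks(ranks):
    """Map ranks 1..n (1 = best) onto [0, 1], with 1.0 best and 0.0 worst."""
    n = len(ranks)
    if n < 2:
        raise ValueError("a benchmark needs at least two tools to be ranked")
    return [1 - (rank - 1) / (n - 1) for rank in ranks]
```

After normalisation, a tool ranked first of three and a tool ranked first of twenty both score 1.0, so accuracy and speed ranks from all benchmarks can be correlated together.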

Page 30: Random RNA interactions control protein expression in prokaryotes

Nothing is correlated with accuracy!

[Figure, panel A: matrix of pairwise Spearman's rho (scale −1 to 1) between relative age, year, accuracy, speed, JH5, JIF, citations, relative citations, H-index and M-index; some cells are marked with an X. Panel B: correlates with accuracy rank, shown as Spearman's rho (axis −0.2 to 0.2) for relative age, year, speed, JH5, JIF, citations, relative citations, H-index and M-index.]

Page 31: Random RNA interactions control protein expression in prokaryotes

[Figure: Z-scores (scale −3 to 3) for combinations of speed and accuracy, with accompanying frequency histograms; some cells are marked with an X.]

Page 32: Random RNA interactions control protein expression in prokaryotes

Conclusions

- Speed is NOT reflective of accuracy
- Neither is author/journal reputation, software age & # citations
- The only reasonable way to select software is by benchmarking
- Publication bias is influencing software accuracy
- It doesn’t matter how famous you are, you can still write great software!

Page 33: Random RNA interactions control protein expression in prokaryotes

Thanks!

- Avoidance: Sinan Umu, Anthony Poole & Renwick Dobson
- Meta-benchmark: James Paterson, Fatemeh Ashari Ghomi, Sinan Umu, Stephanie McGimpsey, Aleksandra Pawlik

Umu, Poole, Dobson & Gardner (2016) Avoidance of stochastic RNA interactions can be harnessed to control protein expression levels in bacteria and archaea. eLife.

Gardner et al. (2017) A meta-analysis of bioinformatics software benchmarks reveals that publication-bias influences software accuracy. In preparation.

These slides are available at: http://www.slideshare.net/ppgardne/presentations