quantitation of cis-translational control by general mrna...
TRANSCRIPT
Quantitationofcis-translationalcontrolbygeneralmRNAfeaturesinfiveeukaryotes
SoCal SysBio Conference, Feb 9, 2019
Joint work with Drs. Guo-Liang Chew (Fred-Hutch) and Mark D. Biggin (LBNL)
Jingyi Jessica Li
University of California, Los Angeles
Translationisoneoffourstepsthatdetermineproteinabundances
transcriptionmRNA
degradationmRNA
translation
protein
proteindegradation
kdr x [mRNA]
kdr x [protein]
Vtxn
ktns x [mRNA]
Gene and condition specifice.g. miRNA
RBPs
Present in a minority of mRNAs, cell types and conditions
General featurese.g. RNA structure
uORFs, codon usage
Present in most or all mRNAs, cell types and conditions
AAAAAUG5’ cap
protein coding sequence
RBP
AAAAAUG5’ cap
protein coding sequence
miRNA
Twoclassesoftranslationalcontrol
AAAAAUG protein coding sequence
AAAAAUG protein coding sequence
efficient codon usage
uORF
inefficient codon usage5’ cap
uORFuORF5’ cap
Therearefivegeneralfeatures
AAAAAAAAAUGAUG
PROTEIN CODING REGION (CDS)
RNAstem loop
poly A tailAPEuORF
1. 5’ UTR RNA secondary structure
2. upstream open reading frames (uORFs)
3. AUG proximal element (APE)
5’ UTR 3’ UTR
5’ cap
4. protein coding sequence (CDS) length
5. CDS codon frequency
protein coding sequence
Amultivariatelinearmodelcombiningallfivefeatures
AAAAAAAAAUGAUG
PROTEIN CODING REGION (CDS)
RNAstem loop
poly A tailAPEuORF
1. 5’ UTR RNA secondary structure
2. upstream open reading frames (uORFs)
3. AUG proximal element (APE)
5’ UTR 3’ UTR
5’ cap
4. protein coding sequence (CDS) length
5. CDS codon frequency
protein coding sequence
Amultivariatelinearmodelcombiningallfivefeatures
S. cerevisiae
GeneralfeaturespredictTRinfiveeukaryotes
general features
0%S. cerevisae mouseArabidopsis
40%
80%
60%
20%
% v
aria
nce
oftra
nsla
tion
rate
100%
humanS. pombe
GeneralfeaturespredictTRinfiveeukaryotes
general features
0%S. cerevisae mouseArabidopsis
40%
80%
60%
20%
% v
aria
nce
oftra
nsla
tion
rate
100%
humanS. pombe
GeneralfeaturespredictTRinfiveeukaryotes
Unexplainedvariancesetsalimitongeneandconditionspecificcontrol
unexplained general features
0%S. cerevisae mouseArabidopsis
40%
80%
60%
20%
% v
aria
nce
oftra
nsla
tion
rate
100%
humanS. pombe
no miRNAs~100 RBPs
>1,800 miRNAs~ 380 RBPs
420 miRNAs559 RBPs
Controloftranslationby5’UTRRNAstructure
1 100755025 125 2001751500.0
-12.0
-8.0
-4.0
RN
A fo
ldin
g en
ergy
Location of 5’ end of 35 nucleotide sliding window
Controloftranslationby5’UTRRNAstructure
5’—C A A C U—3’A
CG
C
G
A
UG
C
U
5’—C A A G C A C C—3’
Themostfoldedregionswithin5’UTRsdeterminetranslationrates
S. cerevisae
H. sapiens
M. musculusArabidopsis
S. pombe
0%
20%
40%
60%
80%
100%
% R
2 f
ea
ture
vs lo
g1
0 T
R
min max90%75%25%10%
1 100755025 125 2001751500.0
-12.0
-8.0
-4.0
RN
A fo
ldin
g en
ergy
Location of 5’ end of 35 nucleotide sliding window
5’—C A A C U—3’A
CG
C
G
A
UG
C
U
5’—C A A G C A C C—3’
Themostfoldedregionswithin5’UTRsdeterminetranslationrates
Contiguousstemsaremosteffective,butothernucleotidepairscontribute
0.00
-0.10
-0.05
-0.25
-0.15
-0.20
S. cerevisae Arabidopsis M. musculus
contig stem max stem all
Pe
ars
on
co
effic
ien
t (r
)
S. pombe H. sapiens
35 25 40 55 60
Mostfoldedregionsaresimilaracrosseukaryotes
0.0
1.0
-1.0
-3.0
-2.0
2.0
-4.0
-5.0
-6.0
-7.0
S. cerevisae
Arabidopsis
M. musculus
low
TR
valu
e -
hig
h T
R v
alu
e
MFE loopTR
contig.
stem
max
stem
all
S. pombe
H. sapiens
Structureatthe5’capcontributeslittle
0%
% o
r TR
exp
lain
ed
mostfolded
leastfolded
90%75%25%10%
10%
8%
6%
4%
2%
16%
14%
12%
5’cap -1 whole allall -whole
sum<10%
0%
% o
r TR
exp
lain
ed
mostfolded
leastfolded
90%75%25%10%
10%
8%
6%
4%
2%
16%
14%
12%
5’cap -1 whole allall -whole
sum<10%
Longdistanceloopsmakelittlecontribution
Translationsequencemotifs
+1-60 +30-30
S. POMBE
01
bits
01
bits
+1-60 +30-30
S. CEREVISAE
ARABIDOPSIS
01
bits
+1-60 +30-30
H. SAPIENS
01
bits
+1-60 +30-30
M. MUSCULUS
01
bits
+1-60 +30-30
5’UTR CDSAUG
SimilarcontrolsequencesacrossEukarya
1.5
2.0
0.0
0.5
1.0
2.5
Yeas
t uA
PE
1.5 2.00.0 1.0 2.5
Arabidopsis uAPE0.5
r=0.88
GGGATG
AACCAA
2.5
1.5
2.0
0.0
0.5
1.0
Mou
se 5
’ of A
PE
1.5 2.00.0 1.0 2.5
Arabidopsis 5’ of APE0.5
r=0.36
GGG
ATG
AACCAA
Generalfeaturesactsimilarlyindifferenttissues,butdifferentlybetweenspecies
uAUG 5’motifscodonfreq.
CDSlength
5’UTRRNAfold
S.cerevisae
S.pombe
kidny
hela
liver
3T3
H. sapiens
M. musculus
Arabidopsis
100%80%60%40%20%0%
% contribution to sum of general feature control
leaf
root
shoot
1
2
R2 all vs TR0.0 0.4 0.8
a b
Gene and condition specifice.g. miRNA
RBPs
Present in a minority of mRNAs, cell types and conditions
General featurese.g. RNA structure
uORFs, codon usage
Present in most or all mRNAs, cell types and conditions
AAAAAUG5’ cap
protein coding sequence
RBP
AAAAAUG5’ cap
protein coding sequence
miRNA
AAAAAUG protein coding sequence
AAAAAUG protein coding sequence
efficient codon usage
uORF
inefficient codon usage5’ cap
uORFuORF5’ cap
Thepercentcontributionssumto>100%
200%80% 160%40% 120%0%
uAUG 5’motifscodonfreq.
CDSlength
5’UTRRNAfold
S.cerevisae
S.pombe
kidny
hela
liver
3T3
H. sapiens
M. musculus
Arabidopsis
% contribution to general feature control
leaf
root
shoot
1
2
Highlycorrelated(collinear)controlbetweenthegeneralfeatures
M. MUSCULUS
S. CEREVISAE
uAUG
RNAfold
CDSlength
RNAfold
CDSlength
codon
80% 27% 78%
47% 77%
51%
92% 40% 59%
89% 100%
59%
uAUG
RNAfold
CDSlength
RNAfold
CDSlength
codon
S.POMBE ARABIDOPSIS
78% 28% 47%
45% 40%
62%
uAUG
RNAfold
CDSlength
RNAfold
CDSlength
codon
H. SAPIENS
59% 37% 55%
45% 39%
52%
uAUG
RNAfold
CDSlength
RNAfold
CDSlength
codon
7% 22% 30%
35% 26%
45%
uAUG
RNAfold
CDSlength
RNAfold
CDSlength
codon
Controlbyseparatestemloopsishighlycorrelated
Controlbyseparatestemloopsishighlycorrelated
Controlbyseparatestemloopsishighlycorrelated
CollinearcontrolbyN- andC- terminalcodonusage
Collinearcontrolbyaminoacidfrequencyandsynonymouscodonpreferences
Correlatedcontrolgivesevenpacking
Uncorrelatedcontrolgivesunevenpacking
Ribosomesarelimitingforcellgrowth
>90% ribosomes bound to mRNA
50% reductions in ribosome copy slows cell growth(in Drosophila)
Correlatedcontrolbygeneralfeaturesaidsefficientuseofthetranslationmachinery
Uncorrelatedcontrolincreasesparticlecollisions,leadingtoRNAdegradation
Thegeneralfeatureshaveseveralroles
AAAAAAAAAUGAUG
PROTEIN CODING REGION (CDS)
RNAstem loop
poly A tailAPEuORF
1. 5’ UTR RNA secondary structure
2. upstream open reading frames (uORFs)
3. AUG proximal element (APE)
5’ UTR 3’ UTR
5’ cap
4. protein coding sequence (CDS) length
5. CDS codon frequency
protein coding sequence
Mark BigginLBNL
Guo-Liang ChewFred Hutchison Center
Quantitationofcis-translationalcontrolbygeneralmRNAfeaturesinfiveeukaryotes