expression operating unit -...

Post on 01-May-2018

Vivek K Mutalik

BioFAB Community meeting, July 19, 2010

biofab

Expression Operating Unit: Origins and Current Status

biofab Predictive Synthetic Biology

What can we build today? Toggle switches, band-pass filters, oscillators, complex circuits, etc.

Building these involves fine-tuning parts to operate together. How predictable is that?

Prediction at each stage of composition...

C-dogma Challenge

Parts Devices Systems

Design Characterize Standardize Fabricate

Making biology easy to engineer

BioFAB: C-Dog Goals

• Performance can be predictable
• Reliable functional composition

biofab C-Dog Goal and challenge

acgtcttaagacccactttcacatttaagttgtttttctaatccgcatatgatcaattcaaggccgaataagaaggctggctctgcaccttggtgatcaaataattcgatagcttgtcgtaataatggcggcatactatcagtagtaggtgtttccctttcttctttagcgacttgatgctcttgatcttccaatacgcaacctaaagtaaaatgccccacagcgctgagtgcatataatgcattctctagtgaaaaaccttgttggcataaaaaggctaattgattttcgagagtttcatactgtttttct

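[Figure: expression cassette (promoter, RBS, GFP*, terminator T) with two response plots: response vs. input (stimulus), with the 0.1 and 0.9 levels marked, and response vs. time]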

Design, Assemble & characterize

Prediction & rational design in place of tuning of parts performance

Current goal: Express any number of proteins to a desired mean and variance

Future goal: Set the leak, maximum, temporal response and dose response for the production of an arbitrary protein from an inducible promoter

biofab Part characterization

Transcription start site (TSS)

Relative Promoter strength

Transcription kinetics

Biochemical characterization

Terminator efficiency

Termination sites

Transcription pause

Kinetics

mRNA levels

mRNA Structure

mRNA stability

Transfer functions

Reporter levels

[Figure: test construct: promoter, RBS, reporter (GFP), terminator (T)]

Context characterization

How does a part behave in different contexts?

Host Context & Phylogenetic Distance
• Salmonella
• E. coli TG1
• E. coli K12
• Small genetic variants, deletion/overexpression libraries
• Tagged libraries

Genomic Context
• Genome (insertion point relative to ORI)
• BAC
• Plasmid

Environmental Context
• Defined minimal vs. rich media
• Different carbon sources
• Growth phase
• Stressors (T, NaCl, pH)
• 96-well plate, test tubes, flask

• Predicting a part’s performance is a non-trivial task.

• Example: Promoter strength prediction from sequence

• The lessons learnt here are generally applicable to other promoters as well.

biofab Lessons

Goals
• Improve promoter predictions across genomes
• Design promoters with specific strengths for use in genetic circuits

Outcome
• By correlating promoter strength with sequence, we have been able to model promoters for 2 alternative sigma factors, σE and σ32

Improving Promoter Models

Virgil Rhodius

In-vitro studies and Homology modeling

Promoters of alternative sigmas are more highly conserved

Housekeeping σ                                       Alternative σs
Regulates 1000s of promoters                         Regulate 10s – 100s of promoters
Most promoters regulated by transcription factors    Few promoters regulated by transcription factors
Promoters poorly conserved                           Promoters relatively well-conserved

As a test case, we have modeled promoters of two E. coli alternative sigmas: the membrane stress sigma factor σE and the heat shock sigma factor σ32

[Figure: RNA polymerase (αCTD, αNTD, β/β’, σ) bound to a promoter; labeled promoter elements: AT-rich UP element, -35, variable spacers, -10, discriminator, +1]

Weakly conserved motifs separated by variable length spacer sequences – difficult to identify motifs

Why is predicting/modeling promoters difficult?

Promoters encode a multistep process

• Each kinetic step is encoded in the promoter sequence
• Position weight matrix (PWM) models are used for predicting transcription factor binding sites
• PWMs are based on binding energies
• Promoters are more complex because they encode multiple steps
• Transcription initiation requires DNA binding, DNA melting and promoter escape

[Figure: kinetic scheme: RNA polymerase + promoter, promoter-bound complex (KB), open complex (kf), abortive initiation (NTPs), promoter escape, mRNA]

Approach – Refining Promoter Models

• Compare promoter strength with model promoter score

• Identify motifs that improve promoter strength

• Study outliers

[Figure: promoter prediction model compared with promoter strength]

Promoter Score = PWM model score of homologous promoter sequences

Promoter Strength = rate of mRNA transcript production
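The comparison of promoter score with promoter strength, and the search for outliers, can be sketched as follows. This is only an illustration under stated assumptions: the score and strength values are made up, and the Pearson correlation and the two-standard-deviation residual cutoff are generic choices, not the analysis used in the talk.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between model scores x and measured strengths y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def outlier_indices(x, y, n_sd=2.0):
    """Indices whose residual from the least-squares line exceeds n_sd residual SDs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - (slope * a + intercept) for a, b in zip(x, y)]
    sd = math.sqrt(sum(r * r for r in resid) / (n - 2))
    return [i for i, r in enumerate(resid) if abs(r) > n_sd * sd]

# Hypothetical data: PWM scores and measured strengths for eight promoters.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
strengths = [1.1, 2.0, 2.9, 4.2, 5.0, 0.5, 7.1, 8.0]

r = pearson_r(scores, strengths)
flagged = outlier_indices(scores, strengths)
print(f"r = {r:.2f}, outliers at indices {flagged}")
```

Promoters flagged this way (high score but low strength, or the reverse) are the ones worth inspecting for motifs the model does not yet capture.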

Measuring strength of σE promoters

Library of 60 natural σE promoters (E. coli and Salmonella)

[Figure: LONG promoters (-65 to +20, with UP element, -35 and -10) and SHORT promoters (-35 and -10) on a position axis from -60 to +1]

In vivo promoter strength measurement: GFP reporter assay (σE overexpression)

In vitro promoter strength measurement: transcription from linear DNA templates

[Figure: promoter strength for each promoter]

Rhodius & Mutalik, 2010a. PNAS 107:2854-9

In vivo and in vitro strength of σE promoters: first time for any sigma regulon

[Figure: strengths of the 60 natural σE promoters, grouped into active promoters and weak/inactive promoters; SHORT (-35, -10) promoters]

Build PWM models from active SHORT promoters

Test models by cross-validation

Test the models’ ability to distinguish weak promoters

Modeling Strategy for σE promoters

Promoter score using position weight matrix (PWM) models

Steps: align motifs, build frequency matrix, build position weight matrix, score motifs

For each promoter:
Short promoter score = PWM(-35) + PWM(-10) + PWM(+1) + spacer penalties

Aligned motifs:
G G A
G C A
G G T

Frequency matrix (counts per position):
     1     2     3
G    3     2     0
A    0     0     2
T    0     0     1
C    0     1     0

Position weight matrix:
     1     2     3
G   1.0   0.7  -0.9
A  -0.9  -0.9   0.7
T  -0.9  -0.9   0.2
C  -0.9   0.2  -0.9

weight = log (observed freq / expected freq)

Example: GGT = 1.0 + 0.7 + 0.2 = 1.9

Assumptions:
• Represents the binding energies for DNA-binding proteins
• Each position is additive
• Each position is independent
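The worked example above can be reproduced with a short sketch. Two details are assumptions rather than statements from the slides: a pseudocount of 0.5 per base and a uniform expected frequency of 0.25 with a natural logarithm. They were chosen because they reproduce the rounded weights shown (1.0, 0.7, 0.2, -0.9). The function names are illustrative.

```python
import math

BASES = "ACGT"

def count_matrix(sites):
    """Count each base at each position of the aligned motifs."""
    counts = [{b: 0 for b in BASES} for _ in range(len(sites[0]))]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos][base] += 1
    return counts

def weight_matrix(counts, n_sites, pseudocount=0.5, background=0.25):
    """weight = ln(observed freq / expected freq); pseudocount and background are assumed."""
    total = n_sites + pseudocount * len(BASES)
    return [
        {b: math.log(((col[b] + pseudocount) / total) / background) for b in BASES}
        for col in counts
    ]

def score(pwm, seq):
    """Additive, position-independent score of a sequence against the PWM."""
    return sum(col[base] for col, base in zip(pwm, seq))

sites = ["GGA", "GCA", "GGT"]
counts = count_matrix(sites)
pwm = weight_matrix(counts, len(sites))
print(round(score(pwm, "GGT"), 1))  # 1.9
```

The `score` function encodes the two PWM assumptions directly: contributions are summed (additive) and each position is looked up without reference to its neighbors (independent).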

Modeling Strategy for σE promoters

Sequence logo of 40 in vivo active σE promoters

Model for σE short promoters

Promoter module scores with total scores and strengths

Promoter strength summary

•Promoter strength can be modeled based on strong promoters

•Assumptions of PWMs generally true for core promoter sequences

•Minimum module scores required to distinguish promoter function

•Provides a strategy for improving promoter prediction models
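The point about minimum module scores can be illustrated with a small sketch. All numbers, cutoffs and names here are hypothetical; the only part taken from the slides is the form of the short promoter score (sum of the -35, -10 and +1 PWM scores plus spacer penalties).

```python
def short_promoter_score(pwm_35, pwm_10, pwm_start, spacer_penalty):
    """Total score = PWM(-35) + PWM(-10) + PWM(+1) + spacer penalties."""
    return pwm_35 + pwm_10 + pwm_start + spacer_penalty

def meets_module_minimums(module_scores, minimums):
    """True only if every module with a cutoff reaches its minimum score."""
    return all(module_scores[name] >= cutoff for name, cutoff in minimums.items())

# Hypothetical cutoffs and module scores, for illustration only.
minimums = {"-35": 2.0, "-10": 3.0}
candidate_a = {"-35": 4.1, "-10": 5.2, "+1": 0.3}
candidate_b = {"-35": 8.0, "-10": 1.0, "+1": 0.6}  # strong -35, weak -10

total_a = short_promoter_score(candidate_a["-35"], candidate_a["-10"], candidate_a["+1"], -0.5)
total_b = short_promoter_score(candidate_b["-35"], candidate_b["-10"], candidate_b["+1"], -0.5)

print(total_a, meets_module_minimums(candidate_a, minimums))
print(total_b, meets_module_minimums(candidate_b, minimums))
```

In this made-up case both candidates have essentially the same total score, but only the first passes the per-module minimums, which is why a total score alone may not separate functional from weak promoters.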

Rhodius & Mutalik, 2010a. PNAS 107:2854-9

Rhodius & Mutalik, 2010b. Stay tuned!

Promoter strength summary

1. Can we build reliable “strong, medium and weak” synthetic promoters?

2. Lack of good-quality data for building models and deriving general rules

3. We do not even have simple data for different promoter-RBS-CDS combinations.

biofab BioFAB: Pilot project

Widely used promoters and RBSs

1. What happens?

2. Are they independent?

3. Composition rules?

4. Predictive models?

biofab Promoter-RBS combinations

10 used RBSs, 10 famous promoters

[Figure: promoter (-35, -10, +1), RBS, GFP and terminator (T), flanked by context on both sides]

GFP = f(P, RBS)

biofab Promoter-RBS combinations

Assay strain: BW25113

Media: MOPS

Combinatorial library assembled

Total 144

Plate reader

Flow cytometry

Data analysis and Modeling

Data

Lance Martin

biofab Promoter-RBS combinations: Results

[Figure: activity of each promoter-RBS combination (axes: RBS, promoters)]

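biofab Strategy for data analysis

Black-box sequence-activity models: promoter (P1...N) and RBS (R1...N) sequences mapped directly to activity (A)

PWM, PLS regression, HMM, NN, SVM models

Feature-based sequence-activity models: sequences recoded as features before mapping to activity (A); feature labels in the figure: 35, 10, sp, T rate, dG, sp

Joao Guimaraes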

biofab Results

βp (P) * βR (R) = [Activity]

[Figure: recoding of promoter identity (P1, P2, P3, ...) and RBS identity (R1, R2, R3, ...) as 0/1 indicator matrices, alongside an activity column A (example values 100, 10, 50, 1)]

[Figure: predicted activity vs. observed activity, both axes 0 to 10; R² = 0.8239, Q² = 0.75]

Multivariate data analysis (partial least squares regression)

BioFAB et al., manuscript in preparation

biofab Results and Summary

• Varied promoters & RBS used as test case for studying junctions
• Promoter & RBS regions appear to be independent
• Simple model explains >70% of data variance
• We are testing the generality of the model for different reporters and building more sophisticated models
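One way to picture the independence claim: if activity is the product of a promoter term and an RBS term, then log(activity) is a promoter effect plus an RBS effect with no interaction. The sketch below fits that additive model to a full promoter-by-RBS grid using row and column means, and reports the fraction of variance explained. It is a plain two-way additive fit, not the partial least squares regression used in the talk, and the strengths are invented for illustration.

```python
import math

def additive_log_fit(activity):
    """Fit log(activity[i][j]) = promoter effect i + RBS effect j on a complete grid.

    Returns the predicted log activities and R^2 in log space.
    """
    logs = [[math.log(a) for a in row] for row in activity]
    n_p, n_r = len(logs), len(logs[0])
    grand = sum(sum(row) for row in logs) / (n_p * n_r)
    row_mean = [sum(row) / n_r for row in logs]
    col_mean = [sum(row[j] for row in logs) / n_p for j in range(n_r)]
    pred = [[row_mean[i] + col_mean[j] - grand for j in range(n_r)] for i in range(n_p)]
    ss_res = sum((logs[i][j] - pred[i][j]) ** 2 for i in range(n_p) for j in range(n_r))
    ss_tot = sum((logs[i][j] - grand) ** 2 for i in range(n_p) for j in range(n_r))
    return pred, 1.0 - ss_res / ss_tot

# Hypothetical part strengths (arbitrary units).
promoter_strength = [1.0, 5.0, 20.0]
rbs_strength = [0.5, 2.0, 8.0, 3.0]

# Perfectly independent parts: activity is the product of the two terms.
independent = [[p * r for r in rbs_strength] for p in promoter_strength]
_, r2_independent = additive_log_fit(independent)

# One promoter-RBS pair behaves differently from what its parts predict.
coupled = [row[:] for row in independent]
coupled[0][0] *= 10.0
_, r2_coupled = additive_log_fit(coupled)

print(f"independent parts: R^2 = {r2_independent:.3f}")
print(f"with one coupled junction: R^2 = {r2_coupled:.3f}")
```

The drop in R² for the second grid shows how context effects at a promoter-RBS junction would appear as variance that the independent model cannot explain.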

Predicting Promoter-RBS combination outputs

[Figure: promoter (P1...N) and RBS (R1...N) combinations mapped to reporter output; labels in the figure: GFP, RFP, mCher, Cat, LacZ, Copy, Environment, RNA]

Context characterization (revisited)

Host context & phylogenetic distance, genomic context, environmental context

How does a part behave in different contexts?

Expression operating unit (EOU)

How to insulate functionality of parts from context change?

How to improve predictability of performance?

biofab EOU: higher resolution

Specific restriction sites? BioBrick scars? No restriction sites?

Assembly methods & context change

3’ UTR

biofab EOU # 1.0: Components

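[Figure: EOU 1.0 construct: promoter, GFP reporter and terminator (T) flanked by context on both sides. Component labels in the figure: insulator, pTac, T7A1, D111, pause, SspB region, Bujard UTR, reporter, dbI terminator, rpoC terminator]

Reporters: GFP, mRFP, mCherry, Gemini, LacZ

Break EOU

Improvise

Davis et al., Sauer Lab

Mutalik et al., Arkin lab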

biofab Part Libraries

How do we design parts such that their performance is predictable?

[Figure: EOU 1.0 component diagram, repeated from the previous slide]

biofab Junction architectures

Statistical experimental design: extension

Libraries: promoter, insulator, RBS, reporter, terminator

Operon designs and optimization

EXPRESSION OPERATING UNIT

[Figure: EOU elements Ins, Up, P, 5’, Tran, CDS, 3’, T, Ins joined by junctions J1 to J8]

C-Dog Project scope

Work in progress

“C. Dog” Project Leadership

[Figure: EOU elements and junctions J1 to J8, divided between the teams]

Team Guillaume

Team Vivek

Vector & Chromosome

Data Analysis

biofab Open technology and contribution

• Defining our progress/success criteria

• We will work together with academic and industry communities to propose, adopt, implement and refine best practices in characterization, tools and methods

biofab Acknowledgements

UCB: Chris Anderson, John Dueber, Julius Lucks, Weston Whitaker, Stanley

JBEI: Nathan Hillson, Aindrila M, Masood Hadi, Hou Cheng Chu, Will Holtz, Jeff Dietrich, Greg Bokinsky, Adrienne Mckee

UCSF: Virgil Rhodius, Athanasios Typas, Chris Voigt

BioFAB team: Joao and Lance; Adam Arkin & Drew Endy

Stanford: Christina Smolke

SynBERC: Kevin Costa, Leonard Katz

Puzzles
• EOU parts
• Insulators/Junctions
• Integration sites

Opportunities
• Assembly
• HT characterization (e.g., microfluidic)
• Context studies
• Biophysical models
• Internal reference standards

biofab C-Dog-Puzzles & Opps
