COMPUTATIONAL ENGINEERING OF BIONANOSTRUCTURES
RAM SAMUDRALA
ASSOCIATE PROFESSOR
UNIVERSITY OF WASHINGTON
How can we design peptides and proteins capable of interacting with inorganic substrates with specific selectivity and affinity?
MOTIVATION
The functions necessary for life are undertaken by proteins.
Protein function is mediated by protein three-dimensional structure.
A number of semi-accurate computational methodologies have been developed for the analysis and modelling of the sequences and structures of naturally occurring proteins.
We can harness these knowledge- and biophysics-based computational methodologies to design peptides and proteins capable of interacting with inorganic substrates with specific affinity and selectivity.
The goal is to develop generalised computational techniques to construct molecular building blocks, based on peptides and proteins, that can be easily assembled into higher order structures.
Applications in the areas of medicine, nanotechnology, and biological computing.
KNOWLEDGE-BASED DESIGN
Proteins that are evolutionarily related generally have similar sequences, structures, and functions.
We hypothesised that this applies to experimentally discovered peptides capable of binding to inorganic substrates.
We then examined similarity of sequences between experimentally discovered peptides and random peptide sequences using standard sequence comparison tools.
Random peptide sequences most similar to a particular group of experimentally discovered peptides were considered to possess the same functional property.
Some examples of experimentally discovered peptides (from Mehmet Sarikaya’s group):
Quartz binders:
RLNPPSQMDPPF
QTWPPPLWFSTS
LTPHQTTMAHFL
Hydroxyapatite binders:
MLPHHGA
TTTPNRA
PVAMPHW
Oren/Tamerler/Sarikaya
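The similarity-based selection described above can be sketched in Python. This is a minimal sketch, not the tool chain used in the talk: it assumes a toy identity scoring scheme (match +1, mismatch 0, gap -1) in place of a real substitution matrix such as PAM 250, and the candidate peptides and the helper names `global_score` and `mean_similarity` are hypothetical.

```python
# Quartz binders quoted on the slide above.
QUARTZ_BINDERS = ["RLNPPSQMDPPF", "QTWPPPLWFSTS", "LTPHQTTMAHFL"]


def global_score(a, b, match=1, mismatch=0, gap=-1):
    """Needleman-Wunsch global alignment score with a toy scoring scheme.

    ASSUMPTION: identity scoring stands in for a real substitution matrix.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def mean_similarity(candidate, binders):
    """Average alignment score of one candidate against a set of binders."""
    return sum(global_score(candidate, b) for b in binders) / len(binders)


# HYPOTHETICAL candidates, for illustration only (not experimental data).
candidates = ["RLNPPSQMAPPF", "GGGGGGGGGGGG"]
ranked = sorted(candidates,
                key=lambda c: mean_similarity(c, QUARTZ_BINDERS),
                reverse=True)
```

Candidates at the top of `ranked` are the ones this toy score considers most similar to the known binder set; with a real matrix the ranking would of course differ.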
OPTIMISATION OF SCORING MATRICES (QUARTZ)
We perturbed the PAM 250 scoring matrix systematically so that strong binders score higher against one another (strong-strong self-similarity) and lower against weak binders (strong-weak cross-similarity). We then backtested the predictive power of the resulting QUARTZ I matrix.
Oren/Tamerler/Sarikaya
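One way to picture the matrix perturbation is as a simple hill climb over matrix entries. The sketch below is an illustration under stated assumptions, not the procedure that produced QUARTZ I: it starts from a toy identity matrix rather than PAM 250, scores equal-length peptides without gaps, makes a single sweep of ±1 changes, and uses made-up placeholder peptides for the strong and weak sets. All function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pair_score(a, b, matrix):
    """Ungapped score of two equal-length peptides (ASSUMPTION: no gaps)."""
    return sum(matrix[frozenset((x, y))] for x, y in zip(a, b))


def objective(matrix, strong, weak):
    """Mean strong-strong similarity minus mean strong-weak similarity."""
    self_scores = [pair_score(a, b, matrix) for a, b in combinations(strong, 2)]
    cross_scores = [pair_score(a, b, matrix) for a, b in product(strong, weak)]
    return (sum(self_scores) / len(self_scores)
            - sum(cross_scores) / len(cross_scores))


def perturb(matrix, strong, weak, step=1):
    """One systematic sweep: try +step, then -step, on every matrix entry
    and keep a change only if it raises the objective."""
    matrix = dict(matrix)  # work on a copy; the input is left untouched
    best = objective(matrix, strong, weak)
    for key in list(matrix):
        for delta in (step, -step):
            matrix[key] += delta
            score = objective(matrix, strong, weak)
            if score > best:
                best = score
                break               # keep this change, move to next entry
            matrix[key] -= delta    # revert
    return matrix, best


# ASSUMPTION: toy identity matrix (2 on the diagonal, 0 elsewhere), not PAM 250.
start = {frozenset((x, y)): (2 if x == y else 0)
         for x, y in combinations_with_replacement(AMINO_ACIDS, 2)}

# HYPOTHETICAL placeholder peptides, not experimental binders.
STRONG = ["HHKHHK", "HHRHHK", "HKKHHR"]
WEAK = ["GAGAGA", "GSGAGS"]

tuned, tuned_objective = perturb(start, STRONG, WEAK)
```

With so few sequences a sweep like this overfits easily, which is why the backtesting step mentioned above matters: the tuned matrix should be judged on binders that were held out of the optimisation.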
EXPERIMENTAL VERIFICATION (QUARTZ)
Three sets of experiments were performed by Mehmet Sarikaya’s group to validate the computationally designed sequences.
Oren/Tamerler/Sarikaya
DESIGN OF SECOND GENERATION MATRICES
Oren/Tamerler/Sarikaya
HA12 (12 aa linear)
49 peptides: 16 strong (S), 20 moderate (M), 13 weak (W)
HA7 (7 aa constrained)
56 peptides: 12 strong (S), 27 moderate (M), 17 weak (W)
HA7 sequences:
1. MLPHHGA
2. TTTPNRA
3. PVAMPHW
4. NNNYSRH
5. PKDAVPA
6. PSFDNGF
7. QLIPVSN
8. LTQSDHP
9. HSPSNPS
10. RTNQPQK
11. DPQYGQH
12. NSGSRHH
13. TPPHHQP
14. HQHNMKI
15. MHPTHTT
16. HPATIED
17. SGQISLL
18. SGSPVPN
19. DNTSDMV
20. SSWQRLR
21. QNKDFQK
22. HQESHPP
23. PHHHHQP
24. SNYFAEM
25. QSSHSFL
26. AINDTNQ
27. PTTPNEQ
28. SMKVPSS
29. SVEERGS
30. NESFTGA
31. YPTQTTD
32. IYEVNTE
33. SPQTPSR
34. SDNTVRY
35. SMIPPYR
36. VLTPTQS
37. RPIVHHQ
38. MWRDSKP
39. HQTHHPQ
40. TGLQNSS
41. LSPKPQL
42. NPGFAQA
43. GIGQPQA
44. MIFLRVV
45. TAHAMLY
46. HLPIPSA
47. MGAGRAA
48. SIHSRDT
49. TFHKWPS
50. STWIPEF
51. PSSPLQS
52. HLHQQNT
53. QLQLLQS
54. RTTPSYH
55. TTHQEAP
56. YPPRSNT
HA12 sequences:
1. SPTKPTPPRSSQ
2. TSTNYWLYSSES
3. VPFQFKVTGDPL
4. AFSQLKGFYSRY
5. EFYTPTGLPPGR
6. HTVNRSMDVPGV
7. NTPAHANADFFD
8. ASGAKPWTSDLH
9. IPMTPSYDSHIL
10. HAPYKSHVWTEQ
11. AFAYRDNLSMHP
12. LLADTTHHRPWT
13. HWGEIPSRLSLP
14. LDTQFIKPPQKS
15. SVAALFRHVPGH
16. NGWWTASPGVPM
17. WKWLYDLVTPTI
18. NEYYIHQVHPPT
19. GEELGNRLARIT
20. SQPFWMLSRVLA
21. DLFSVHWPPLKA
22. ATSHLHVRLPSR
23. TLVPKNETPLSS
24. LSAASHLHTSSS
25. IIPSQQQSLMAP
26. QIPSYWPRGPGG
27. SSLHALHPFGAV
28. QSTTVLHASPTL
29. KLPYALELSGTV
30. KFLSLPPPTRSG
31. VASPERTSPAFP
32. ESAQLNRTLQLP
33. IDMSRLESYTLP
34. NHQGVLSVHGSL
35. HYLPKNVRTSLQ
36. TLPSPLALLTVH
37. LSPLHQLNSSVN
38. SPSMLTSMWPNT
39. NLPSPLIPASSP
40. SLSPTRSLYEAT
41. NISDTLNRSRWK
42. QSYSSMLYPSPF
43. AQSQMMSAQFRP
44. ELLAPRGSLNTG
45. TTNSHEFPPGQS
46. YDEILGAAPSLK
47. TPGEYLRLATGR
48. GAQQLNSMHPEH
49. RPLESRTPLYLP
Oren/Tamerler/Sarikaya
KNOWLEDGE-BASED DESIGN (HA)
[Figure: HA12 I and HA7 I, for the HA12 (12 aa linear) and HA7 (7 aa constrained) sets]
Oren/Tamerler/Sarikaya
BACKTESTING (HA)
CASE STUDY: AMELOGENIN
Principal protein involved in enamel formation.
Multifunction protein
– Mineralization.
– Signaling.
– Adhesion to process matrix.
– Physical protein-protein interactions.
Never been crystallised (irregular / unstable?).
– Most proteins with non-repeating sequence are active in globular form.
– Many proteins fold into globular form upon interaction with substrate / interactor.
– Assumption of linear and globular forms.
– Start with protein structure prediction.
CASE STUDY: AMELOGENIN STRUCTURE
Predicted five models (typical for CASP).
Annotate structure with experimental and simulation evidence to find best predicted globular structure and infer function.
MGTWILFACLLGAAFAMPLPPHPGSPGYINLSYEKSHSQAINTDRTALVLTPLKWYQSMIRQPYPSYGYEPMGGWLHHQIIPVLSQQHPPSHTLQPHHHLPVVPAQQPVA (residues 1-110)
PQQPMMPVPGHHSMTPTQHHQPNIPPSAQQPFQQPFQPQAIPPQSHQPMQPQSPLHPMQPLAPQPPLPPLFSMQPLSPILPELPLEAWPATDKTKREEVD (residues 111-210)
Labelled regions: Signal Region, Exon 4
CASE STUDY: AMELOGENIN FUNCTION
Horst/Oren/Cheng/Wang
1. PV
2. HPPSHTLQPHHHLPVV
3. VPGHHSMTPTQH

1. LFACLLGAAFAMPLP
2. PGYINLSYEKSHSQAINTDRTA
3. LPPLFSMQPLSPILPELPLEAWPAT
MOUSE AMELOGENIN STRUCTURAL ANALYSIS
[Figure panels: Model 1, Model 2, Model 3, Model 4, Model 5]
CASE STUDY: AMELOGENIN – WHAT IT DOES
Horst/Oren/Cheng/Wang
CASE STUDY: AMELOGENIN INTERACTION
Horst
Sequences derived from amelogenin:
1. HTLQPHHHLPVV (12)
2. VPGHHSMTPTQH (12)
3. LFACLLGAAFAMPLP (15)
4. HPPSHTLQPHHHLPVV (16)
5. PGYINLSYEKSHSQAINTDRTA (22)
6. LPPLFSMQPLSPILPELPLEAWPAT (25)
7. HPPSHTLQPHHHLPVVPAQQPVAPQQPMMPVPGHHSMTPTQH (42)
Oren/Tamerler/Sarikaya
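The derived sequences listed above can be mapped back onto the full amelogenin sequence shown on the structure slide. The following is a minimal sketch using plain substring search; the function name `locate` is hypothetical, and the 210-residue sequence is copied from this transcript rather than from a curated database.

```python
# Amelogenin sequence as it appears in this transcript (two rows of the slide).
AMELOGENIN = (
    "MGTWILFACLLGAAFAMPLPPHPGSPGYINLSYEKSHSQAINTDRTALVLTPLKWYQSMIRQPYPSYGYEPMGGWLHHQIIPVLSQQHPPSHTLQPHHHLPVVPAQQPVA"
    "PQQPMMPVPGHHSMTPTQHHQPNIPPSAQQPFQQPFQPQAIPPQSHQPMQPQSPLHPMQPLAPQPPLPPLFSMQPLSPILPELPLEAWPATDKTKREEVD"
)

# The seven derived peptides listed on the slide.
PEPTIDES = [
    "HTLQPHHHLPVV",
    "VPGHHSMTPTQH",
    "LFACLLGAAFAMPLP",
    "HPPSHTLQPHHHLPVV",
    "PGYINLSYEKSHSQAINTDRTA",
    "LPPLFSMQPLSPILPELPLEAWPAT",
    "HPPSHTLQPHHHLPVVPAQQPVAPQQPMMPVPGHHSMTPTQH",
]


def locate(peptide, sequence=AMELOGENIN):
    """Return the 1-based (start, end) residue range of the first exact
    occurrence of `peptide` in `sequence`, or None if it is absent."""
    index = sequence.find(peptide)
    if index < 0:
        return None
    return index + 1, index + len(peptide)


for number, peptide in enumerate(PEPTIDES, start=1):
    print(number, peptide, len(peptide), locate(peptide))
```

Printing the ranges makes the overlaps visible: for example, peptides 1 and 4 are both contained in the span covered by peptide 7.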
CASE STUDY: AMELOGENIN – HA BINDING
BIOPHYSICS-BASED DESIGN
Characterise sequences and structures of naturally occurring proteins in terms of their total similarity scores using different scoring matrices. This will produce a database of sequences with predicted and known structures with specific selectivity and affinity to different inorganics.
This database can be analysed for atom-atom preferences, torsion angle preferences, and other characteristics to define energy functions and move sets for performing protein structure simulations.
We will combine this with our all-atom energy function capable of handling inorganics and our protein structure simulation software.
Design higher order protein-like scaffolds with specific functionalities:
Strong quartz binding region
Strong hydroxyapatite binding region
Active site
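As an illustration of how atom-atom preferences in such a database could be turned into an energy-like term, the sketch below computes a log-odds score per atom-type pair and distance bin. This is a generic statistical-potential sketch, not the group's all-atom energy function: the atom types, distance bins, counts, pseudocount, and the name `pair_potential` are all assumptions made for illustration.

```python
import math
from collections import Counter


def pair_potential(observations, pseudocount=1.0):
    """Log-odds score for each (atom-type pair, distance bin).

    observations: iterable of (type_a, type_b, distance_bin) contacts.
    score = -ln( P(bin | pair) / P(bin) ); negative means the pair is seen
    in that bin more often than contacts in general (a preferred distance).
    """
    pair_bin = Counter()
    pair_total = Counter()
    bin_total = Counter()
    for a, b, d in observations:
        pair = tuple(sorted((a, b)))  # treat (a, b) and (b, a) as the same pair
        pair_bin[(pair, d)] += 1
        pair_total[pair] += 1
        bin_total[d] += 1

    n = sum(bin_total.values())
    bins = sorted(bin_total)
    potential = {}
    for pair in pair_total:
        for d in bins:
            # Pseudocounts keep unseen (pair, bin) combinations finite.
            p_obs = ((pair_bin[(pair, d)] + pseudocount)
                     / (pair_total[pair] + pseudocount * len(bins)))
            p_ref = bin_total[d] / n
            potential[(pair, d)] = -math.log(p_obs / p_ref)
    return potential


# HYPOTHETICAL contact counts (bin 0 = close, bin 1 = far); not real data.
toy_contacts = ([("O", "Si", 0)] * 8 + [("O", "Si", 1)] * 2
                + [("C", "Si", 0)] * 2 + [("C", "Si", 1)] * 8)
toy_potential = pair_potential(toy_contacts)
```

In this toy set the O-Si pair scores favourably in the close bin and unfavourably in the far bin, while C-Si shows the opposite pattern. Torsion angle preferences could be tabulated the same way, with angle bins in place of distance bins.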
ACKNOWLEDGEMENTS
People:
Ersin Emre Oren
Jeremy Horst
Samudrala group
Mehmet Sarikaya and his group
Candan Tamerler-Behar and her group
Support from:
National Institutes of Health
National Science Foundation
Kinship Foundation (Searle Scholars Program)
Defense University Research Initiative on NanoTechnology
Genetically Engineered Materials Science and Engineering Center
Puget Sound Partners in Global Health (Gates Foundation)
UW Technologies Initiative
UW Technology Gap Research Fund
Washington Research Fund