
Page 1: Structural Genomics South East Asian Training Course on Bioinformatics Applied to Tropical Diseases R. Natesh, ICGEB, INDIA 1-October-2004 Structural Genomics

Structural Genomics

South East Asian Training Course on Bioinformatics Applied to Tropical Diseases

R. Natesh, ICGEB, INDIA

1-October-2004

Structural Genomics

Page 2

Organization of the talk

• Structural genomics and Bioinformatics

• Tools used in structural genomics

Page 3

Structural Genomics - Buzz word

• Structural genomics became a buzzword with the draft release of the Human Genome.

• The Human Genome consists of 3 billion base pairs.

• The Human Genome is estimated to contain 30 to 40 thousand genes that encode proteins.

• Structures are known for only ~10 % of the proteins known to us (27321 structures, of which ~4K are NMR structures; 24536 proteins, 7278 unique structures with < 70 % identity; 5037 from Homo sapiens, ~2000 structures with < 70 % identity).

Page 4

Structure to understand function

• Going from structure to function, with a structural basis for therapeutics, is the main goal of structural genomics. Three-dimensional structures can yield the knowledge needed to design newer and more efficient drugs to cure diseases. The work is done in high-throughput mode.

Page 6

• Classification based on sequence and structure similarities makes it possible to identify related proteins, e.g. SCOP and CATH. Structural genomics/biology uses bioinformatics as a tool to set up such databases.

• An example of a structural genomics consortium is the “Mycobacterium tuberculosis Structural Genomics Consortium”: http://www.doe-mbi.ucla.edu/TB/

Page 7

Structural Genomics and Bioinformatics

• Computational molecular biology created the rationale for structural genomics by deriving the general principles of protein structure organization and by providing a tentative upper bound on the total number of existing protein folds. (As the number of structures increases, the estimated number of folds increases, but with a damping factor, since the number of folds is assumed to be limited.)

• It also provides efficient ways to predict and classify protein folds.

• Comparative protein sequence and structure analysis is a major cost-saving factor in high-throughput structure determination, leading to the optimal, most economical selection of targets for X-ray crystallography or NMR studies.
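As a rough illustration of how comparative sequence analysis can prune a target list, the sketch below computes percent identity between pre-aligned sequences and skips candidates that are too similar to a protein of known structure. The 70 % cutoff echoes the "< 70 % identity" figure quoted earlier in the talk. The function names, the toy sequences, and the assumption that the sequences are already aligned to equal length are illustrative only; a real pipeline would use a proper alignment tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def select_targets(candidates: dict, solved: dict, cutoff: float = 70.0) -> list:
    """Keep candidates whose identity to every solved structure is below cutoff."""
    return [
        name
        for name, seq in candidates.items()
        if all(percent_identity(seq, s) < cutoff for s in solved.values())
    ]


# Hypothetical toy data (not real proteins), already aligned to length 10.
solved = {"known_1": "MKTAYIAKQR"}
candidates = {
    "cand_A": "MKTAYIAKQK",  # 9 of 10 positions match known_1, so it is skipped
    "cand_B": "MLSGWVPRQR",  # 3 of 10 positions match, so it stays a target
}
print(select_targets(candidates, solved))
```

Candidates that survive the filter are the ones most likely to add a new, non-redundant structure.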

Page 8

• Bioinformatics also plays a crucial role in the assessment and classification of the new structural data obtained.

• Bioinformatics research, in turn, benefits directly from the flood of data generated by structural genomics projects, resulting in improved algorithms, software, and databanks.

Page 9

PDB has search tools

Page 11

Example

Page 13

Status of unreleased entries

Page 14

Tools used in structural genomics

X-ray crystallography is one of the major tools used, alongside NMR and electron microscopy. Structural genomics is almost always addressed with X-ray crystallography.

Robotics in crystallography: for crystallisation, crystal mounting, etc. Automation in data collection, etc.

Bioinformatics tools: BLAST, FASTA, etc., to identify the proteins of interest for a particular disease, e.g. tropical diseases such as malaria and leishmaniasis. Also SCOP, CATH and DALI.
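BLAST and FASTA searches take their query sequences in FASTA format. As a minimal sketch, the format can be read as shown below before sequences are handed to a search program. The records are made-up placeholders, and `read_fasta` is a hypothetical helper, not part of either tool.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Placeholder records for illustration only.
example = io.StringIO(
    ">query_1 hypothetical protein\nMKTAYIAK\nQRQISF\n>query_2\nGGSDEHD\n"
)
records = list(read_fasta(example))
for name, seq in records:
    print(name, len(seq))
```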

Page 15

Gene → Clone → Overexpression → Purification → Crystallisation

Page 16

Agincourt

Agincourt™, Syrrx’s high-throughput Nanovolume Crystallization® robot. This robot has set up more crystallization experiments than any single organization in the world (over 5 million experiments to date) and has crystallized over 300 different proteins.

Page 17

Synchrotron light sources

The Diamond

ESRF

Page 18

Beamline Automation

Resolution: 1.8 Å (1.86-1.8 Å)
R-merge: 0.05 (0.11)
I/σ(I): 31.06 (11.73)

(Values in parentheses refer to the last resolution shell.)

Page 19

An example of the data sets used in structure solution.

Crystal            Wavelength (Å)   Resolution range (Å)

MAD phasing data
Zn peak            1.2825           50-2.00 (2.07-2.00)
Zn inflection      1.2832           50-2.01 (2.08-2.01)
Zn remote          0.9537           50-1.98 (2.05-1.98)

Long wavelength    1.7712           50-2.66 (2.76-2.66)

MIRAS phasing data
K2PtCl4            0.87             20-2.80 (2.9-2.8)
K2PtCl4            0.87             20-2.60 (2.69-2.6)
K2PdCl4            0.978            50-2.18 (2.26-2.18)
K2PdCl4            0.978            30-2.60 (2.69-2.6)
OsCl3              1.488            50-2.80 (2.9-2.8)
tACE-native        1.488            50-2.00 (2.07-2.0)

More than 50 heavy-atom soaking experiments were conducted.
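The wavelengths listed for these data sets map directly onto X-ray photon energies through E = hc/λ, which is how beamline settings are often quoted. A small sketch of the conversion follows, using hc ≈ 12.398 keV·Å (a standard physical constant, not a value taken from the slides).

```python
HC_KEV_ANGSTROM = 12.39842  # h*c expressed in keV * Angstrom


def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths quoted for the Zn MAD data sets on the slide.
mad_wavelengths = {"Zn peak": 1.2825, "Zn inflection": 1.2832, "Zn remote": 0.9537}
for label, lam in mad_wavelengths.items():
    print(f"{label}: {lam} A = {wavelength_to_kev(lam):.3f} keV")
```

The peak and inflection wavelengths come out near 9.66 keV, close to the zinc K absorption edge, which is where a MAD experiment on a zinc site would be tuned.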

Page 20

Space group 'P 21 21 21' (orthorhombic), symmetry operators:

X,Y,Z
1/2-X,-Y,1/2+Z
-X,1/2+Y,1/2-Z
1/2+X,1/2-Y,-Z
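To make the four P 21 21 21 operators concrete, the sketch below applies them to one fractional coordinate and wraps the results back into the unit cell. The atom position is a hypothetical example, and exact fractions are used only to keep the printed values tidy.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# The four symmetry operators of P 21 21 21 as listed on the slide.
P212121_OPS = [
    lambda x, y, z: (x, y, z),
    lambda x, y, z: (HALF - x, -y, HALF + z),
    lambda x, y, z: (-x, HALF + y, HALF - z),
    lambda x, y, z: (HALF + x, HALF - y, -z),
]


def symmetry_mates(xyz):
    """Return the symmetry-equivalent fractional positions, wrapped into [0, 1)."""
    x, y, z = (Fraction(c) for c in xyz)
    return [tuple(c % 1 for c in op(x, y, z)) for op in P212121_OPS]


# Hypothetical atom at fractional coordinates (1/10, 1/5, 3/10).
atom = (Fraction(1, 10), Fraction(1, 5), Fraction(3, 10))
mates = symmetry_mates(atom)
for mate in mates:
    print(", ".join(str(c) for c in mate))
```

Each atom in the asymmetric unit therefore appears four times in the unit cell of this space group.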

Page 22

Summary of data statistics and refinement statistics at 1.8 Å

Total number of reflections measured: 85,248
Multiplicity of data set: 3.7
Total number of unique reflections: 22,983
Total number of unique reflections with I > 0: 22,832
Number of unique reflections in the resolution range 10 – 1.8 Å with I > 2σ(I): 22,454
Completeness of data in resolution range 99.0 – 1.8 Å (%): 92.9
Completeness in the resolution range 10 – 1.8 Å in working set (%): 83.0
Completeness in the resolution range 10 – 1.8 Å in test set (%): 9.2
Completeness in the last resolution shell 1.86 – 1.8 Å (%): 86.1
Rmerge (%) *: 5.0
Rmerge in the last resolution shell 1.86 – 1.8 Å (%): 11.0
Initial R factor (%) † for xyna_strli model in xyna_theau cell (10 – 2) Å: 42.9
Initial Free R factor (%) † for xyna_strli model in xyna_theau cell (10 – 2) Å: 49.4
Final R factor (%) † for reflections with I > 2σ(I) in the resolution range (10 – 1.8) Å: 16.0
Final Free R factor (%) † for reflections with I > 2σ(I) in the resolution range (10 – 1.8) Å: 21.1

† $R = 100 \times \sum \big|\, |F_o| - k\,|F_c| \,\big| \,/\, \sum |F_o|$

* $R_{merge} = 100 \times \sum_h \sum_i |I_{h,i} - \langle I_h \rangle| \,/\, \sum_h \sum_i I_{h,i}$

MR (molecular replacement)
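The quantities in this table can be recomputed from reflection lists. The sketch below assumes the standard definitions of the R factor and Rmerge, which is what the slide's footnote appears to state. The function names and the toy reflections are illustrative, not data from the talk; only the reflection counts are taken from the table.

```python
def multiplicity(n_measured: int, n_unique: int) -> float:
    """Average number of measurements per unique reflection."""
    return n_measured / n_unique


def r_factor(f_obs, f_calc, k: float = 1.0) -> float:
    """R = 100 * sum| |Fo| - k|Fc| | / sum|Fo|, in percent."""
    num = sum(abs(abs(fo) - k * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return 100.0 * num / den


def r_merge(observations: dict) -> float:
    """Rmerge = 100 * sum_h sum_i |I_hi - <I_h>| / sum_h sum_i I_hi, in percent.

    `observations` maps each unique reflection h to its repeated intensities.
    """
    num = den = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        num += sum(abs(i - mean_i) for i in intensities)
        den += sum(intensities)
    return 100.0 * num / den


# Reflection counts from the table: 85,248 measured, 22,983 unique.
print(round(multiplicity(85248, 22983), 1))

# Toy structure factors and intensities, for illustration only.
print(r_factor([100.0, 50.0], [90.0, 55.0]))
print(round(r_merge({(1, 0, 0): [10.0, 12.0], (0, 1, 0): [20.0, 20.0]}), 2))
```

The first line reproduces the multiplicity quoted in the table from the two reflection counts.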

Page 25

Advantages of Ultrahigh or Atomic resolution

• The phase problem can be solved directly.

• The data allow comprehensive least-squares refinement of the structure with anisotropic ADPs (atomic displacement parameters). The final R factor can be < 10 %.

• Multiple (dual) conformations can be seen.

• Unrestrained refinement becomes possible, giving coordinate error estimates.

• The positions of many hydrogen atoms can be seen in density maps.

Page 28

Substrate Specificity

Page 29

Steric hindrance between Arg276 and Trp275 (seen in the 0.89 Å T. aurantiacus xylanase structure) may play a key role in substrate specificity.

Page 33

AMINO ACID COMPOSITIONS (%) FOR F/10 family xylanases.

          1BG4    CTUX    1XYZ    XAS     2EXO    1CLX
Ala (A)   10.92   11.55   6.81    12.04   14.42   10.95
Asx (B)   0       0       0       0       0       0
Cys (C)   0.66    0.66    1.85    1.33    1.28    0.57
Asp (D)   5.96    5.61    6.19    6.35    8.65    7.2
Glu (E)   2.64    2.31    4.95    3.34    4.48    3.45
Phe (F)   2.98    2.97    4.95    4.01    5.12    4.32
Gly (G)   8.6     6.93    6.81    7.69    7.37    7.2
His (H)   1.98    1.32    1.54    1.67    1.6     2.01
Ile (I)   6.95    5.61    7.12    3.67    2.56    4.61
Lys (K)   6.62    4.62    4.64    3.01    5.76    2.59
Leu (L)   7.28    7.26    5.88    5.01    6.08    6.62
Met (M)   0.99    0.99    4.02    2.34    1.6     1.72
Asn (N)   7.94    8.25    10.21   7.02    4.48    8.35
Pro (P)   2.98    4.29    4.95    2.67    3.84    4.89
Gln (Q)   2.64    5.94    3.71    5.68    4.48    5.18
Arg (R)   2.31    3.3     4.33    6.35    3.84    5.76
Ser (S)   8.6     6.6     4.95    8.69    5.44    6.91
Thr (T)   6.29    7.26    4.02    6.02    4.8     4.03
Val (V)   6.95    8.58    4.95    6.68    8.65    7.2
Trp (W)   2.31    2.64    2.16    3.01    2.24    2.01
Unk (X)   0       0.33    0       0       0       0
Tyr (Y)   4.3     2.97    5.88    3.34    3.2     4.32
Glx (Z)   0       0       0       0       0       0
Total     100     100     100     100     100     100

1BG4: Penicillium simplicissimum xylanase.
CTUX: Cryo-temperature Thermoascus aurantiacus xylanase (0.89 Å structure).
1XYZ: Clostridium thermocellum xylanase.
XAS: Streptomyces lividans xylanase.
2EXO: Cellulomonas fimi xylanase.
1CLX: Pseudomonas fluorescens xylanase.

Thermophile Mesophile

PEPTIDE SORT
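A composition table of this kind is a per-residue count divided by the sequence length. The sketch below is an independent illustration of that arithmetic, not the PEPTIDE SORT program named on the slide. The `composition` helper is hypothetical, and the input is just the first 20 residues of the sequence shown later in the talk, used as sample data.

```python
from collections import Counter


def composition(sequence: str) -> dict:
    """Return percent composition per one-letter residue code (2 decimals)."""
    seq = sequence.upper()
    if not seq:
        return {}
    counts = Counter(seq)
    total = len(seq)
    return {aa: round(100.0 * n / total, 2) for aa, n in sorted(counts.items())}


# Sample input only: a 20-residue fragment, not a full xylanase sequence.
fragment = "LVTDEAEASKFVEEYDRTSQ"
for aa, pct in composition(fragment).items():
    print(f"{aa} {pct}")
```

Running the same function over each full-length sequence would give one column of the table per protein.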

Page 34

Natesh et al., Nature, 421, 551-554 (30th Jan 2003).

Page 39

40 50 60 70 80
LVTDEAEASKFVEEYDRTSQVVWNEYAEANWNYNTNITTETSKILLQKNM

90 100 110 120 130
QIANHTLKYGTQARKFDVNQLQNTTIKRIIKKVQDLERAALPAQELEEYN

140 150 160 170 180
KILLDMETTYSVATVCHPNGSCLQLEPDLTNVMATSRKYEDLLWAWEGWR

190 200 210 220 230
DKAGRAILQFYPKYVELINQAARLNGYVDAGDSWRSMYETPSLEQDLERL

240 250 260 270 280
FQELQPLYLNLHAYVRRALHRHYGAQHINLEGPIPAHLLGNMWAQTWSNI

290 300 310 320 330
YDLVVPFPSAPSMDTTEAMLKQGWTPRRMFKEADDFFTSLGLLPVPPEFW

340 350 360 370 380
NKSMLEKPTDGREVVCHASAWDFYNGKDFRIKQCTTVNLEDLVVAHHEMG

390 400 410 420 430
HIQYFMQYKDLPVALREGANPGFHEAIGDVLALSVSTPKHLHSLNLLSSE

440 450 460 470 480
GGSDEHDINFLMKMALDKIAFIPFSYLVDQWRWRVFDGSITKENYNQEWW

490 500 510 520 530
SLRLKYQGLCPPVPRTQGDFDPGAKFHIPSSVPYIRYFVSFIIQFQFHEA

540 550 560 570 580
LCQAAGHTGPLHKCDIYQSKEAGQRLATAMKLGFSRPWPEAMQLITGQPN

590 600 610 620
MSASAMLSYFKPLLDWLRTENELHGEKLGWPQYNWTPNS

Highlighted on the slide: the six N (Asn) residues at positions 72, 90, 109, 155, 337 and 586.

Page 40

Inhibitors

• Captopril (the classic first ACE inhibitor)

• Lisinopril, Enalapril, Ramipril…

SIDE EFFECTS

• The most common is a persistent dry cough. Others include an initial big fall in blood pressure, kidney and liver problems, a type of swelling (angioedema), rash, inflammation of the pancreas, hay fever-like symptoms, sinusitis, sore throat, nausea, vomiting, indigestion, diarrhoea, constipation and blood cell changes.

Page 42

• Captopril: based on carboxypeptidase (Ondetti and Cushman)

• Lisinopril, Enalapril and their relatives: based on thermolysin (Patchett et al.)

(side effects).

The present ACE structure enables 2nd-generation SBDD (structure-based drug design).

Page 43

N- and C-terminal selectivity of known (current) ACE inhibitors

Page 44

HICUP

Molecular Modeling

Page 45

DALI

Page 46

Summary

• Structural Genomics - structure to function to drug discovery, in a high-throughput regime.

• Tools include X-ray crystallography, electron microscopy, NMR and molecular biology.

• Bioinformatics - used to select target proteins.