Ligand Building with ARP/wARP

Upload: hubert-webb

Post on 18-Jan-2016


Page 1: Ligand Building with ARP/wARP. Automated Model Building Given the native X-ray diffraction data and a phase-set To rapidly deliver a complete, accurate

Ligand Building with ARP/wARP

Page 2

Automated Model Building

Given the native X-ray diffraction data and a phase set

To rapidly deliver a complete, accurate and error-free model

Page 3

Building Ligands from Dummy Atoms / Seed Points

Back to about 2000: a side project for a PhD student

Page 4

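Nearest Neighbour Distance Distribution

Shake: given a coordinate error, the inter-atomic distances in a protein model change:

$$
f\left(d_{jk}^{obs}\right) = \frac{1}{\sigma_m \sqrt{\pi}} \, \frac{d_{jk}^{obs}}{d_{jk}^{tar}} \, \exp\left(-\frac{\left(d_{jk}^{obs}\right)^2 + \left(d_{jk}^{tar}\right)^2}{4\sigma_m^2}\right) \sinh\left(\frac{d_{jk}^{obs}\, d_{jk}^{tar}}{2\sigma_m^2}\right)
$$

[Figure: $f(d^{obs})$ plotted for $d^{obs}$ from 0 to 4 Å, together with the normal distribution $N(d_{jk}^{tar}, 2\sigma_m^2)$. The error-free distance $d^{tar}$ is 1.5 Å and the expected r.m.s.d. is 1.0 Å.]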
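As a rough numerical check of the distance distribution on this slide, the sketch below evaluates f(d_obs) and integrates it with a plain trapezoid rule; a proper density should integrate to 1. The function names and the value sigma_m = 1.0 Å are illustrative assumptions: the slide quotes an expected rmsd of 1.0 Å but does not say how that maps onto sigma_m.

```python
import math


def distance_pdf(d_obs, d_tar, sigma_m):
    """Density of an observed distance d_obs for an error-free distance d_tar.

    f = d_obs / (d_tar * sigma_m * sqrt(pi))
        * exp(-(d_obs**2 + d_tar**2) / (4 * sigma_m**2))
        * sinh(d_obs * d_tar / (2 * sigma_m**2))
    """
    if d_obs <= 0.0:
        return 0.0
    prefactor = d_obs / (d_tar * sigma_m * math.sqrt(math.pi))
    decay = math.exp(-(d_obs ** 2 + d_tar ** 2) / (4.0 * sigma_m ** 2))
    return prefactor * decay * math.sinh(d_obs * d_tar / (2.0 * sigma_m ** 2))


def trapezoid(func, lower, upper, steps=20000):
    """Plain trapezoid rule; adequate for a smooth, fast-decaying density."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * width)
    return total * width


# d_tar is taken from the slide; sigma_m is an assumed, illustrative value.
D_TAR, SIGMA_M = 1.5, 1.0
area = trapezoid(lambda d: distance_pdf(d, D_TAR, SIGMA_M), 0.0, 20.0)
print(f"integral of f over [0, 20] A: {area:.6f}")
```

The upper integration limit of 20 Å is arbitrary; it only needs to be far enough out that the tail of the density is negligible.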

Page 5

Building a Ligand into a Difference Map

[Figure: "Fit that into that!"]

Imagine:

• a ligand consisting of N atoms
• a density map containing M points

The only thing to do is to correctly select N out of M!

Page 6

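A Simple Example: Select 3 out of 4

• The task is to find an equilateral triangle
• Prior knowledge: edges should have a length of 1.0 Å
• Reliability: the error on the data (distances) is 0.01 Å

[Figure: four points a, b, c and d]

Upper triangle: observed distances. Lower triangle: deviation from 1.0 Å in units of the 0.01 Å error.

      a     b        c        d
a     0     1.07 Å   0.98 Å   1.01 Å
b     7     0        0.85 Å   2.10 Å
c     2     15       0        0.95 Å
d     1     110      5        0

Triangle   Log likelihood   Probability
abc        -278             2.0 × 10^-108

$$
f\left(d_{jk}^{obs}\right) = \frac{1}{\sigma_m \sqrt{\pi}} \, \frac{d_{jk}^{obs}}{d_{jk}^{tar}} \, \exp\left(-\frac{\left(d_{jk}^{obs}\right)^2 + \left(d_{jk}^{tar}\right)^2}{4\sigma_m^2}\right) \sinh\left(\frac{d_{jk}^{obs}\, d_{jk}^{tar}}{2\sigma_m^2}\right)
$$

[Figure: $f(d^{obs})$ plotted for $d^{obs}$ from 0 to 4 Å, together with the normal distribution $N(d_{jk}^{tar}, 2\sigma_m^2)$. The error-free distance $d^{tar}$ is 1.5 Å and the expected r.m.s.d. is 1.0 Å.]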

Page 9

A Simple Example: Select 3 out of 4

• The task is to find an equilateral triangle
• Prior knowledge: edges should have a length of 1.0 Å
• Reliability: the error on the data (distances) is 0.01 Å

[Figure: four points a, b, c and d]

Upper triangle: observed distances. Lower triangle: deviation from 1.0 Å in units of the 0.01 Å error.

      a     b        c        d
a     0     1.07 Å   0.98 Å   1.01 Å
b     7     0        0.85 Å   2.10 Å
c     2     15       0        0.95 Å
d     1     110      5        0

Triangle   Log likelihood   Probability
abc        -278             2.0 × 10^-108
abd        -12150           0
bcd        -12350           0
acd        -30              0.9999
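The numbers in this example can be reproduced with a few lines of code. The tabulated log likelihoods are consistent with minus the sum of squared deviations (in units of the 0.01 Å error), and the probabilities with normalising exp(score) over the four candidate triangles. That reading is inferred from the numbers, not stated on the slide, and all names below are illustrative.

```python
import math
from itertools import combinations

EXPECTED_EDGE = 1.0    # prior knowledge: edge length in A
DISTANCE_ERROR = 0.01  # error on the observed distances in A

# Observed distances from the upper triangle of the table.
OBSERVED = {
    ("a", "b"): 1.07, ("a", "c"): 0.98, ("a", "d"): 1.01,
    ("b", "c"): 0.85, ("b", "d"): 2.10, ("c", "d"): 0.95,
}


def triangle_score(points):
    """Minus the sum of squared deviations, in units of the distance error."""
    score = 0.0
    for pair in combinations(points, 2):
        z = (OBSERVED[pair] - EXPECTED_EDGE) / DISTANCE_ERROR
        score -= z * z
    return score


triangles = ["".join(t) for t in combinations("abcd", 3)]
scores = {name: triangle_score(name) for name in triangles}

# Normalise exp(score) over the four candidates. Subtracting the best score
# first keeps the winning term at exp(0) = 1 instead of underflowing.
best = max(scores.values())
weights = {name: math.exp(s - best) for name, s in scores.items()}
norm = sum(weights.values())
probabilities = {name: w / norm for name, w in weights.items()}

for name in triangles:
    print(f"{name}: score {scores[name]:.0f}, probability {probabilities[name]:.3g}")
```

Triangle acd wins because point b is the only one whose distances deviate strongly from 1.0 Å.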

Page 10

Ligand Building as a Label Swapping Problem

N atoms in the ligand molecule

M points in a density map

[Figure: labels W, X, Y, Z and A, B, C, D]

$$
Q_{assignment} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \log\left[ P\left(d_{ij}^{obs} \mid d_{ij}^{assigned}, \mathrm{error\_model}\right) \right]
$$

• Sources of possible prior information:
  – Chemical composition of a ligand
  – Bonding distances
  – Angle bonded distances
  – Chirality
  – VdW interactions
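To make the assignment score concrete, here is a minimal sketch that evaluates Q for every ordered selection of map points and keeps the best one. The error model is an assumption: a Gaussian on each distance with variance 2·sigma_m², used as a stand-in for whatever ARP/wARP applies. The ligand, the map points and all function names are hypothetical.

```python
import math
from itertools import permutations


def log_pair_probability(d_obs, d_assigned, sigma_m):
    """Gaussian stand-in for the error model: N(d_assigned, 2 * sigma_m**2)."""
    variance = 2.0 * sigma_m ** 2
    return (-0.5 * math.log(2.0 * math.pi * variance)
            - (d_obs - d_assigned) ** 2 / (2.0 * variance))


def q_assignment(assignment, points, ligand, sigma_m):
    """Sum of pairwise log probabilities for one atom-to-point assignment."""
    total = 0.0
    n_atoms = len(ligand)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            d_obs = math.dist(points[assignment[i]], points[assignment[j]])
            d_assigned = math.dist(ligand[i], ligand[j])
            total += log_pair_probability(d_obs, d_assigned, sigma_m)
    return total


def best_assignment(points, ligand, sigma_m=0.1):
    """Exhaustive search over all ordered selections of points."""
    candidates = permutations(range(len(points)), len(ligand))
    return max(candidates, key=lambda a: q_assignment(a, points, ligand, sigma_m))


# Hypothetical 3-atom ligand and 5 map points (two of them decoys).
LIGAND = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (1.9, 1.3, 0.0)]
POINTS = [
    (3.0, 3.0, 0.5),      # decoy
    (1.95, 1.28, 0.02),   # close to atom 2
    (0.03, -0.02, 0.0),   # close to atom 0
    (-1.0, 2.0, 1.0),     # decoy
    (1.18, 0.04, -0.03),  # close to atom 1
]

print(best_assignment(POINTS, LIGAND))
```

For these points the search returns (2, 4, 1): atom 0 goes to point 2, atom 1 to point 4 and atom 2 to point 1. Exhaustive search is only feasible for toy sizes like this one.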

Combinatorial Explosion

$$
\frac{N_{points}!}{\left(N_{points} - N_{atoms}\right)!}
$$
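The size of this search space is easy to evaluate exactly with integer arithmetic. The sketch below uses 58 points and 22 atoms, the sparse-map numbers quoted for retinoic acid in this deck; the function names are illustrative.

```python
import math


def ordered_selections(n_points, n_atoms):
    """N_points! / (N_points - N_atoms)!, computed exactly with integers."""
    return math.perm(n_points, n_atoms)


def order_of_magnitude(count):
    """Decimal exponent of a positive integer, rounded to the nearest integer."""
    return round(math.log10(count))


# 58 grid points and a 22-atom ligand.
print(order_of_magnitude(ordered_selections(58, 22)))
```

For 58 points and 22 atoms this gives roughly 10^37 ordered selections, which is why exhaustive label swapping is not an option for real ligands.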

Page 11

Label Swapping

22-atom molecule of retinoic acid

• Initial map: 349 grid points, complexity 10^59
• Sparse map: 58 grid points, complexity 10^37

Topological Extension (a branch and bound approach)
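The slide does not spell out how the topological extension works, so the following is only a generic branch-and-bound illustration, not ARP/wARP's actual procedure. It extends a partial atom-to-point assignment one atom at a time and abandons a branch as soon as its accumulated distance mismatch is no better than the best complete assignment found so far. The penalty function, the data and all names are assumptions.

```python
import math
from itertools import permutations


def pair_penalty(points, ligand, assignment, i, j):
    """Squared mismatch between an observed and a target distance."""
    d_obs = math.dist(points[assignment[i]], points[assignment[j]])
    d_tar = math.dist(ligand[i], ligand[j])
    return (d_obs - d_tar) ** 2


def branch_and_bound(points, ligand):
    """Extend a partial assignment atom by atom, pruning hopeless branches."""
    best = {"penalty": math.inf, "assignment": None}

    def extend(assignment, penalty):
        if penalty >= best["penalty"]:
            return  # bound: penalties only grow, so this branch cannot win
        atom = len(assignment)
        if atom == len(ligand):
            best["penalty"], best["assignment"] = penalty, tuple(assignment)
            return
        for point in range(len(points)):
            if point in assignment:
                continue
            assignment.append(point)
            added = sum(pair_penalty(points, ligand, assignment, i, atom)
                        for i in range(atom))
            extend(assignment, penalty + added)
            assignment.pop()

    extend([], 0.0)
    return best["assignment"], best["penalty"]


def brute_force(points, ligand):
    """Reference answer: score every ordered selection of points."""
    def total(assignment):
        n = len(ligand)
        return sum(pair_penalty(points, ligand, assignment, i, j)
                   for i in range(n) for j in range(i + 1, n))
    return min(permutations(range(len(points)), len(ligand)), key=total)


# Hypothetical 3-atom ligand and 5 map points (two of them decoys).
LIGAND = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (1.9, 1.3, 0.0)]
POINTS = [(3.0, 3.0, 0.5), (1.95, 1.28, 0.02), (0.03, -0.02, 0.0),
          (-1.0, 2.0, 1.0), (1.18, 0.04, -0.03)]

print(branch_and_bound(POINTS, LIGAND)[0], brute_force(POINTS, LIGAND))
```

The pruning is safe because every added pair contributes a non-negative penalty, so a partial sum is a lower bound on any completion of that branch.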

Page 12

Retinoic acid - topological extension

[Figure: topology of the sparse map and topology of the ligand]

Page 13

Real Space Fit for Final Selection of the Model

22-atom molecule of retinoic acid; among the 100 “top” models:

• 21 are less than 0.5 Å r.m.s.d. from the final model
• the “best” model is 0.14 Å r.m.s.d. from the final model
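The r.m.s.d. values quoted here compare candidate models with the final model. A minimal sketch of that comparison follows, assuming atoms are matched one-to-one and that all models share one coordinate frame, so no superposition is applied. The coordinates and names are hypothetical.

```python
import math


def rmsd(model, reference):
    """Root-mean-square deviation between matched atoms, without superposition."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(model, reference))
    return math.sqrt(squared / len(model))


# Hypothetical coordinates in A, for illustration only.
REFERENCE = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0)]
CANDIDATES = {
    "model_1": [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0), (2.2, 1.3, 0.1)],
    "model_2": [(0.6, 0.0, 0.0), (1.5, 0.8, 0.0), (2.2, 1.3, 0.9)],
}

# Keep the candidates that lie within 0.5 A r.m.s.d. of the reference.
close = [name for name, xyz in CANDIDATES.items() if rmsd(xyz, REFERENCE) < 0.5]
print(close)
```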

Page 14

Ligand Building Module in ARP/wARP 6.1

Input: MTZ file, protein without ligand, ligand

Take the largest object in the difference map

Build the ligand there (label assignment)

Real space refinement of the ligand

Page 15

Ligand Building Module in ARP/wARP 6.1

                                              Location unknown       Location known
Single known ligand                           Yes (if the largest)   No
A ligand out of the list of expected ligands  No                     No
Partially ordered ligand                      No                     No

Page 16

Large-Scale Test

Working sample (3821 structures):

- PDB and MTZ from the EDS
- Ligand PDB from HICUP
- Exclude DNA
- Exclude ligands covalently bound to the chain
- Exclude ligands with partial occupancies

Ligand building: run with default parameters

Performance assessment: assume the PDB structure to be correct

[Figure: atoms numbered 1 to 9, matched name-by-name and by nearest neighbour]

Page 17

Accuracy of Ligand Building Process

• Atomic scale (correctly built ligand in the correct site)
• Ligand scale (correct site, incorrectly built ligand)
• Protein scale (incorrect site)

Page 18

Size of the Largest Ligand in the Working Sample

2981 structures with ligand size 7

3821 structures

Page 19

Dependence on Resolution of the Data

Page 20

Dependence on Ligand Disorder: B factors

Page 21

Dependence on Ligand Disorder: R.m.s.d. (Ligand_Bfactors)

Page 22

Dependence on Ligand Size

Page 23

What is the Ligand Site / Largest Object?

Typically it is the largest set (cluster) of connected map points where the density is above a threshold.

In most cases, however, different thresholds give different (and even non-overlapping) clusters.
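A minimal sketch of the "largest object" idea: group the grid points above a threshold into face-connected clusters and keep the biggest one. The toy grid is made up, the six-neighbour connectivity is an assumption, and the periodic boundaries of a real crystallographic map are ignored.

```python
from collections import deque

NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def clusters_above(density, threshold):
    """Clusters (lists of grid indices) of face-connected points above threshold."""
    nx, ny, nz = len(density), len(density[0]), len(density[0][0])
    seen = set()
    clusters = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if (x, y, z) in seen or density[x][y][z] <= threshold:
                    continue
                # Breadth-first flood fill starting from this seed point.
                cluster, queue = [], deque([(x, y, z)])
                seen.add((x, y, z))
                while queue:
                    cx, cy, cz = queue.popleft()
                    cluster.append((cx, cy, cz))
                    for dx, dy, dz in NEIGHBOURS:
                        px, py, pz = cx + dx, cy + dy, cz + dz
                        inside = 0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                        if (inside and (px, py, pz) not in seen
                                and density[px][py][pz] > threshold):
                            seen.add((px, py, pz))
                            queue.append((px, py, pz))
                clusters.append(cluster)
    return clusters


def largest_cluster(density, threshold):
    """Biggest cluster above the threshold, or an empty list if there is none."""
    return max(clusters_above(density, threshold), key=len, default=[])


# Made-up 1 x 3 x 6 density grid, in arbitrary units.
DENSITY = [[
    [0.2, 2.5, 2.0, 0.1, 1.2, 0.3],
    [0.1, 1.8, 0.4, 0.2, 1.4, 1.6],
    [0.0, 0.3, 0.2, 0.1, 0.2, 1.1],
]]

for level in (1.0, 1.5):
    print(level, sorted(largest_cluster(DENSITY, level)))
```

In this toy grid the largest cluster at threshold 1.0 and the largest cluster at threshold 1.5 share no grid points, which is the situation described above.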

Take the largest object in the difference map

Build the ligand there (label assignment)

Real space refinement of the ligand

Page 24

Page 25

Density Clusters and a Fragmentation Tree

At each density threshold, count the number of clusters.

The maximum is typically reached at a density level of about 1.5 sigma.
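Counting clusters at each threshold can be sketched as follows. The connectivity rule (six face neighbours), the toy density values and the function names are assumptions; the values were chosen only to show the count rising and then falling as the threshold increases, and say nothing about real maps.

```python
STEPS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def count_clusters(density, threshold):
    """Count face-connected groups of grid points with density above threshold."""
    above = {
        (x, y, z)
        for x, plane in enumerate(density)
        for y, row in enumerate(plane)
        for z, value in enumerate(row)
        if value > threshold
    }
    count = 0
    while above:
        count += 1
        stack = [above.pop()]  # seed of a new cluster
        while stack:
            x, y, z = stack.pop()
            for dx, dy, dz in STEPS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in above:
                    above.remove(neighbour)
                    stack.append(neighbour)
    return count


def cluster_counts(density, thresholds):
    """Number of clusters at each threshold."""
    return {t: count_clusters(density, t) for t in thresholds}


def threshold_with_most_clusters(density, thresholds):
    """Threshold at which the cluster count peaks (first one on ties)."""
    counts = cluster_counts(density, thresholds)
    return max(thresholds, key=lambda t: counts[t])


# Made-up 1 x 1 x 8 density grid, in units of sigma.
DENSITY = [[[1.2, 2.6, 1.1, 2.1, 1.3, 3.0, 0.4, 1.7]]]
THRESHOLDS = [0.5, 1.0, 1.5, 2.0, 2.5]

print(cluster_counts(DENSITY, THRESHOLDS))
print(threshold_with_most_clusters(DENSITY, THRESHOLDS))
```

Recording which clusters at a higher threshold sit inside which cluster at a lower one would give the parent-child links of a fragmentation tree; that step is not shown here.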

Page 26

Fragmentation Tree: an Example

1ED5 (nitric oxide synthase), 1.8 Å resolution, R factor 21% (with CNS)

Ligands: 2 x HEM and NGR (N-omega-nitro-L-arginine)

Page 28

Scoring of Density Clusters

[Figure panels: "Looking for HEM, finding HEM"; "Looking for NGR, finding NGR"; "Looking for NGR, finding HEM"; "Looking for HEM, finding NGR"]

Page 29

Selection of Correct Density Cluster

Page 30

Other Lessons?

Take the largest object in the difference map

Build the ligand there (label assignment)

Real space refinement of the ligand

Page 31

Ligand Building: ARP/wARP 6.1 and perspectives

                                              Location unknown                     Location known
                                              6.1                    Perspective   6.1   Perspective
Single known ligand                           Yes (if the largest)   Yes           No    Yes
A ligand out of the list of expected ligands  No                     Yes           No    Yes
Partially ordered ligand                      No                     No            No    Maybe

Page 32

ARP/wARP - the people

Developers

EMBL Hamburg: Guillaume Evrard, Johan Hattne, Gerrit Langer, Venkat Parthasarathy, Tilo Strutz, Victor Lamzin and many in-house friends

NKI Amsterdam: Serge Cohen, Diederick De Vries, Marouane Jelloul, Krista Joosten, Tassos Perrakis

Former members and collaborators

Richard Morris, Peter Zwart, Francisco Fernandez, Olga Kirillova, Matheos Kakaris, Gleb Bourenkov, Garib Murshudov, Alexei Vagin, Andrey Lebedev, Peter Briggs, Eleanor Dodson, Keith Wilson, Zbyszek Dauter, Gerard Kleywegt