Iterative ACORN as a high throughput tool in structural genomics

S Selvanayagam(a), D Velmurugan(a),† and T Yamane(b)

(a) Department of Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai 600 025, India
(b) Department of Biotechnology and Biomaterial Science, Graduate School of Engineering, Nagoya University, Furo-Cho, Chikusa-Ku, Nagoya 464-8603, Japan

Indian Journal of Biochemistry & Biophysics, Vol. 43, August 2006, pp 211-216

Received 21 October 2005; revised 15 May 2006

High throughput macromolecular structure determination is essential in structural genomics, as the amount of available sequence information far exceeds the number of available 3D structures. ACORN, a freely available resource in the CCP4 suite of programs, is a comprehensive and efficient program for phasing in the determination of protein structures when atomic resolution data are available. ACORN with the automatic model-building program ARP/wARP and the refinement program REFMAC is a suitable combination for high throughput structural genomics. ACORN can also be run with secondary structural elements like helices and sheets as inputs with high resolution data. In situations where ACORN phasing is not sufficient for building the protein model, the fragments (incomplete model/dummy atoms) can again be used as a starting input. Iterative ACORN is shown to work efficiently in the subsequent model building stages for congerin (PDB-ID: 1is3) and catalase (PDB-ID: 1gwe), for which models are available.

Keywords: ACORN, Congerin, Catalase

†To whom the correspondence should be addressed. Phone: 91-44-22300122; Fax: 91-44-22352494; E-mail: [email protected]

X-ray crystallography has become a central tool in modern drug and target discovery, providing important insights into molecular interactions and biological function. The sequence information available for macromolecules far exceeds the number of 3D structures in the Protein Data Bank¹. Thus, 'high throughput' techniques are needed in all aspects of macromolecular crystallography if structural genomics projects, which aim at solving a large number of new structures in a short time, are to be successful. ACORN is one of the programs² useful in high throughput structure determination, and was developed for ab initio structure determination of macromolecules when atomic resolution (AR) data are available. The applications of ACORN to AR data have been outlined earlier. This paper deals with the extension of ACORN to non-atomic resolution data. The use of secondary structural elements like α-helices and β-strands as a seed input to ACORN has already been pointed out.

Dynamic density modification (DDM) is a fast and powerful approach incorporated in ACORN. ACORN, along with the automated model building program ARP/wARP¹², followed by the refinement program REFMAC¹³, can be an ab initio method for solving protein structures for which data are better than 1.2 Å resolution. Detailed papers on the description and application of the ACORN program have already appeared in the literature. This paper mainly deals with situations where the ACORN phases do not yield a more or less complete model. In such cases, the dummy atoms or the small number of modelled amino acids can again be used as a seed input to ACORN (iterative ACORN). Automatic model building and refinement can then be continued. These steps can be repeated, if needed, until the final complete model is obtained.

In our earlier study, it was concluded that the minimum model required as seed to ACORN is 27 a.a. for congerin and 76 a.a. for catalase (unpublished results). In the present work, it is shown that an even smaller number (14 a.a. for congerin and 66 a.a. for catalase) is adequate, provided one follows the iterative ACORN procedure as outlined.
Materials and Methods
The subsequent sections describe the two crystal structures (congerin II and catalase) used for iterative ACORN.

Congerin II
The structural details of conger eel galectin (congerin II) at 1.45 Å resolution (PDB-ID: 1is3; synchrotron data) have been presented earlier¹⁴. The structure consists of 135 residues; sheets containing a total of 14 residues from PDB-ID 1is3 were given as input to ACORN.

Catalase
The structural details of Micrococcus lysodeikticus catalase at 0.88 Å resolution (PDB-ID: 1gwe) have been presented earlier¹⁵. The structure consists of 503 residues and one iron cluster. The ab initio structure determination of this enzyme at atomic resolution using ACORN has also been demonstrated¹⁰. Here, we attempted the iterative ACORN procedure for this enzyme with the data truncated at 1.5 Å resolution.
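Truncating a data set at 1.5 Å amounts to discarding every reflection whose d-spacing is smaller than the cutoff. The following is a minimal sketch in Python, assuming a tetragonal cell (the catalase cell dimensions from Table 1) and a plain list of Miller indices. The function names and the sample indices are illustrative only and are not part of ACORN or the CCP4 suite.

```python
import math

def d_spacing_tetragonal(h, k, l, a, c):
    """d-spacing (in Å) of reflection (h, k, l) for a tetragonal cell (a = b)."""
    # 1/d^2 = (h^2 + k^2)/a^2 + l^2/c^2; undefined for (0, 0, 0)
    inv_d2 = (h * h + k * k) / (a * a) + (l * l) / (c * c)
    return 1.0 / math.sqrt(inv_d2)

def truncate(reflections, a, c, d_min):
    """Keep only the reflections with d-spacing >= d_min."""
    return [hkl for hkl in reflections
            if d_spacing_tetragonal(*hkl, a, c) >= d_min]

# Catalase cell dimensions from Table 1 (Å)
A, C = 105.785, 105.001

# Hypothetical Miller indices, chosen only to straddle the 1.5 Å cutoff
sample = [(10, 0, 0), (50, 30, 20), (70, 10, 5), (80, 60, 40)]
kept = truncate(sample, A, C, d_min=1.5)
print(kept)
```

In this sketch the last two sample reflections lie beyond 1.5 Å and are dropped, while the first two are retained.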

Results and Discussion
Fig. 1 shows the flowchart of the present work, wherein the partial structure built by ARP/wARP was fed into ACORN until ARP/wARP was able to build nearly the entire structure. ACORN developed the phases from this partial structure and phase refinement was carried out. The final phase set is much better than the initial one. The relevant crystal data (synchrotron data) used in the present study are presented in Table 1.

Table 1: Details of crystallographic data of congerin and catalase

Parameters              Congerin                Catalase
a = b (Å)               61.500                  105.785
c (Å)                   80.700                  105.001
α = β = γ (°)           90                      90.000
Space group             P42212                  P42212
Resolution range (Å)    20-1.45 (1.498-1.45)    20-1.5 (1.549-1.5)
Completeness (%)        93.58 (76.54)           98.7 (99.1)
I/σ(I)                  24.8 (6.21)             25.06 (19.18)

Congerin II
The ACORN PHASE option was selected for the structure solution. The R-factor and correlation coefficient for the medium reflections of the initial model were 51.8% and 0.0457, respectively. After 65 cycles of DDM, the correlation coefficient reached 0.0695. The phases were then fed to ARP/wARP and REFMAC, and the model-building program ARP/wARP was able to build 1526 dummy atoms only.

ACORN was run again with all these dummy atoms. Within 26 cycles of DDM, the correlation coefficient of the medium reflections reached 0.0891. This indicated a better solution than the first one. Using these phases, ARP/wARP was able to build 7 residues in 2 chains and also located 1594 dummy atoms, with a connectivity index of 0.56. Another cycle of iteration in ACORN was carried out with the above as input. In this run, within 24 cycles of DDM, the correlation coefficient of the medium reflections reached 0.1198. This indicated a good solution. Using these phases, ARP/wARP was able to build 132 out of 135 residues in one chain. At this stage, the Rw and Rf values were 17.0 and 20.9%, respectively. The map showed difference density for the missing region, and the remaining residues were modelled into it.
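The correlation coefficient and R-factor quoted above can be illustrated with a short sketch. It assumes that the correlation coefficient is the ordinary linear correlation between observed and calculated normalized structure-factor magnitudes and that the R-factor is the usual sum of absolute differences divided by the sum of observed magnitudes; the exact expressions used inside ACORN are given in the cited ACORN papers. The function names and sample values are hypothetical.

```python
import math

def correlation_coefficient(e_obs, e_calc):
    """Linear correlation between observed and calculated |E| values."""
    n = len(e_obs)
    mean_o = sum(e_obs) / n
    mean_c = sum(e_calc) / n
    cov = sum(o * c for o, c in zip(e_obs, e_calc)) / n - mean_o * mean_c
    var_o = sum(o * o for o in e_obs) / n - mean_o ** 2
    var_c = sum(c * c for c in e_calc) / n - mean_c ** 2
    # Undefined if either set has zero variance
    return cov / math.sqrt(var_o * var_c)

def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|), returned as a fraction."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

# Hypothetical magnitudes: e_calc is exactly proportional to e_obs
e_obs = [0.2, 0.5, 0.9, 1.1]
e_calc = [0.4, 1.0, 1.8, 2.2]
cc = correlation_coefficient(e_obs, e_calc)

r = r_factor([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
print(round(cc, 6), round(100 * r, 2))
```

A rising correlation coefficient over successive ACORN runs, as reported for congerin, is what signals that the phase set is improving.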

After manual model building, 20 cycles of maximum-likelihood refinement were performed using REFMAC, and solvent atoms were updated after the refinement using the ARP/wARP 'build solvent atoms' script. The final Rw and Rf values were 18.1 and 19.0%, respectively. The backbone of the final model obtained using the above procedure was superimposed on the structure solved conventionally by the molecular replacement method (PDB-ID: 1is3). The root-mean-square (RMS) deviation is 0.073 Å. All these details are shown in Table 2.
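For reference, the RMS deviation between two sets of paired backbone atoms can be computed as in the sketch below. It assumes that the two coordinate sets have already been superposed and that atoms are listed in the same order; the superposition step itself is not shown, and the coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between paired, pre-superposed atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates (Å) for two backbone atoms in each model
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.1), (1.0, 0.1, 0.0)]
value = rmsd(model, reference)
print(round(value, 3))
```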

Fig. 2a shows the cartoon diagram of the final model of iterative ACORN, with the black-shaded region corresponding to the input sheets. Fig. 2b shows the superposition of Cα atoms of the current model with the original PDB entry (1is3); the overall fold of the current model is similar to that of the original. Fig. 2c shows a section of the final model superposed with the corresponding electron density of each stage of the ACORN map and also the final 2|Fo|-|Fc| map. It can be clearly seen that the second iterative ACORN developed better phases than the initial and the first iterative ACORN.

Fig. 1: Flow chart of the present work
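The loop in the flow chart can be sketched in Python as follows. This is only an illustration of the control flow: `run_acorn` and `run_arpwarp` are hypothetical callables standing in for the real programs, the 95% completeness target is an assumed stopping rule (the text only says "nearly the entire structure"), and the stand-in functions simply replay the residue counts reported for congerin (0, 7 and 132 residues built in the three rounds).

```python
def iterate_acorn(seed, run_acorn, run_arpwarp, n_residues,
                  max_rounds=5, target=0.95):
    """Feed each ARP/wARP model back into ACORN until the model is nearly complete."""
    model = seed
    history = []
    for round_no in range(1, max_rounds + 1):
        phases = run_acorn(model)       # phasing from the current fragment
        model = run_arpwarp(phases)     # automated building and refinement
        history.append((round_no, model["residues_built"]))
        if model["residues_built"] / n_residues >= target:
            break                       # assumed stopping rule
    return model, history

# Stand-ins that replay the congerin residue counts; not real program calls
reported = iter([0, 7, 132])

def fake_acorn(model):
    return {"seed": model}

def fake_arpwarp(phases):
    return {"residues_built": next(reported)}

final, history = iterate_acorn({"residues_built": 0}, fake_acorn,
                               fake_arpwarp, n_residues=135)
print(history)
```

With the congerin numbers the loop stops after the third round, when 132 of 135 residues have been built.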


Catalase
For this procedure, only 66 residues (13% of the structure) were enough to build the whole structure. Helices containing a total of 66 residues were given as input to ACORN. The procedure described for congerin could also be followed here.

Table 2 lists the ACORN statistics and ARP/wARP details. With this minimum input, ARP/wARP initially gave 6490 dummy atoms only. With an iterative ACORN cycle, ARP/wARP was able to build 475 out of 503 residues in 8 chains. The first 60 residues were wrongly fitted in the automated model building. From the densities obtained, manual model building was carried out for 29 residues that had clear densities. An iterative cycle of ARP/wARP was again carried out with this model, which then revealed 480 out of 503 residues in 7 chains. The map also showed densities in the missing region, so manual model building was carried out for the remaining residues and the refinement was completed with Rw and Rf values of 13.9 and 15.5%, respectively. The backbone of the final model obtained using the above procedure was superimposed on the structure solved conventionally by the molecular replacement method (PDB-ID: 1gwe). The RMS deviation is 0.15 Å when the backbone atoms are superposed. All these details are shown in Table 2.

Table 2: Details of ACORN, ARP/wARP and REFMAC results for congerin and catalase

Program / parameter                                 Congerin            Catalase
Input                                               14 a.a.             66 a.a.
Resolution limit (Å)                                20-1.45             20-1.5
Total no. of reflections                            26163               431701
Strong reflections (E > 1.2)                        5940                101079
Medium reflections (0.1 < E < 1.2)                  20169               327165
Weak reflections (0.0 < E < 0.1)                    94                  3457

ACORN-I [phasing with known starting coordinates]
  R (%), initial / final                            51.8 / 52.0         54.5 / 54.3
  CC, initial / final                               0.0457 / 0.0695     0.0437 / 0.0632
  No. of DDM cycles                                 65                  66

ARP/wARP-I
  No. of auto building cycles                       10                  10
  No. of Refmac cycles in each auto building cycle  5                   5
  Rw / Rf (%), initial                              44.3 / 42.6         45.6 / 45.7
  Rw / Rf (%), final                                27.5 / 49.3         27.8 / 50.5
  Connectivity index                                0.00                0.00
  No. of chains                                     0                   0
  No. of residues built                             0                   0
  No. of dummy atoms                                1526                6490

ACORN-II
  R (%), initial / final                            41.5 / 51.3         41.5 / 53.3
  CC, initial / final                               0.3941 / 0.0891     0.4353 / 0.0838
  No. of DDM cycles                                 26                  32

ARP/wARP-II
  No. of auto building cycles                       10                  10
  No. of Refmac cycles in each auto building cycle  5                   5
  Rw / Rf (%), initial                              43.4 / 43.3         44.7 / 44.8
  Rw / Rf (%), final                                27.4 / 45.7         14.6 / 17.9
  Connectivity index                                0.56                0.96
  No. of chains                                     2                   8
  No. of residues built                             7                   475
  No. of dummy atoms                                1594                1243

ACORN-III
  R (%), initial / final                            41.3 / 50.5         -
  CC, initial / final                               0.3945 / 0.1198     -
  No. of DDM cycles                                 24                  -

ARP/wARP-III
  No. of auto building cycles                       10                  10
  No. of Refmac cycles in each auto building cycle  5                   5
  Rw / Rf (%), initial                              42.1 / 40.6         32.4 / 33.1
  Rw / Rf (%), final                                17.0 / 20.9         14.6 / 18.5
  Connectivity index                                0.98                0.97
  No. of chains                                     1                   7
  No. of residues built                             132                 480
  No. of dummy atoms                                301                 1303
  Rw / Rf (%) without dummy atoms                   24.7 / 24.8         28.1 / 28.9

Final model with solvent atoms [solvent building carried out using 20 cycles of ARP/wARP: building solvent atoms script]
  Rw / Rf (%)                                       18.1 / 19.0         13.9 / 15.5
  r.m.s. deviation of backbone atoms (Å)            0.073               0.150

Fig. 2: (A) Final model of congerin. Input: 14 a.a.; auto built: 132 a.a.; the black-shaded region corresponds to the input residues. (B) Superposition of the Cα atoms of the current model (congerin) with PDB i.d. 1is3 (red). (C) Final model superposed with (a) ACORN, (b) ACORN-ARP-ACORN and (c) ACORN-ARP-ACORN-ARP-ACORN maps at 1σ, and (d) final 2|Fo|-|Fc| map at 1σ
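Table 2 groups the reflections into strong, medium and weak classes by their normalized structure-factor magnitude E. A minimal sketch of that grouping is given below. The thresholds 1.2 and 0.1 are taken from the table; how values exactly equal to a threshold are assigned is not stated in the paper, so the choice made here (1.2 counts as medium, 0.1 counts as weak) is an assumption, as are the function name and the sample values.

```python
from collections import Counter

def classify_reflection(e):
    """Assign a reflection to a class from its normalized magnitude E."""
    if e > 1.2:
        return "strong"
    if e > 0.1:
        return "medium"   # assumption: E == 1.2 is treated as medium
    return "weak"         # assumption: E == 0.1 is treated as weak

# Hypothetical E values
e_values = [1.5, 0.8, 0.05, 1.2, 0.1, 2.0]
counts = Counter(classify_reflection(e) for e in e_values)
print(dict(counts))
```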

Fig. 3a shows the cartoon diagram of the final model of iterative ACORN, and Fig. 3b shows the superposition of Cα atoms of the current model with the original PDB entry (1gwe); the overall fold of the current model is similar to that of the original. Fig. 3c shows a section of the final model superposed with the corresponding electron density map at each stage of ACORN and also the final 2|Fo|-|Fc| map. It is clear from the figure that the iterative ACORN developed better phases than the initial ACORN.

Fig. 3: (A) Final model of catalase. Input: 66 a.a.; auto built: 480 a.a.; the magenta-shaded helices correspond to the input residues. (B) Superposition of the Cα atoms of the current model (catalase) with PDB i.d. 1gwe (red). (C) Final model superposed with (a) ACORN and (b) ACORN-ARP-ACORN maps at 1σ, and (c) final 2|Fo|-|Fc| map at 1σ

From these results, it is clear that very little information is needed to determine the structure of a protein using the iterative ACORN procedure, compared with a single run of ACORN. Partial-structure iterative ACORN phasing proves to be much more powerful than single-run ACORN phasing, which required more input. The output phases are more accurate and are beneficial to automatic model building and thus to the high throughput structure determination of proteins. Fig. 2c and Fig. 3c confirm this.
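How little input this represents can be checked with simple arithmetic on the residue counts given in the text. The sketch below compares the seed sizes used here (14 and 66 residues) with the minima from the earlier study (27 and 76 residues); the function and dictionary names are illustrative only.

```python
def seed_percentage(seed_residues, total_residues):
    """Seed size as a percentage of the full structure."""
    return 100.0 * seed_residues / total_residues

# (seed in earlier study, seed in present work, total residues), from the text
cases = {"congerin": (27, 14, 135), "catalase": (76, 66, 503)}

summary = {}
for name, (earlier, present, total) in cases.items():
    summary[name] = (round(seed_percentage(earlier, total), 1),
                     round(seed_percentage(present, total), 1))
    print(name, summary[name])
```

For catalase the present seed corresponds to about 13% of the structure, which is the figure quoted in the text; for congerin it is about 10%.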

Conclusion
The work presented above clearly demonstrates the superiority of iterative ACORN in situations where the model building is incomplete. It is also clear that with just 13% of the secondary structural information, iterative ACORN could give a successful model even with high-resolution data. Further work in this direction is in progress, with data mining of the PDB for feeding secondary structural information to ACORN for solving new macromolecules. This would be of use in situations where the conventional molecular replacement method fails due to such a low percentage of model.

Acknowledgement
SS thanks the Council of Scientific and Industrial Research (CSIR) for providing a Senior Research Fellowship. DV acknowledges the Bioinformatics Division of the Department of Biotechnology (DBT) and the University Grants Commission (UGC), Govt. of India, for supporting this work. Financial support to the Department under the UGC-SAP and DST-FIST programmes is also acknowledged. DV also thanks the Venture Business Laboratory authorities, Nagoya University, Nagoya, Japan, for short-term Visiting Professorship assignments.

References
1 Bernstein F C, Koetzle T F, Williams G J B, Meyer E F, Brice M D, Rodgers J R, Kennard O, Shimanouchi T & Tasumi M (1977) J Mol Biol 112, 535-542
2 Collaborative Computational Project, No. 4 (1994) Acta Cryst D50, 760-763
3 Foadi J, Woolfson M M, Dodson E J, Wilson K S, Jin-Xing Y & Chao-de Z (2000) Acta Cryst D56, 1137-1147
4 Foadi J (2003) Crystallogr Rev 9, 43-65
5 Yao J-X (2002) Acta Cryst D58, 1941-1947
6 Dodson E J & Yao J-X (2003) Crystallogr Rev 9, 67-72
7 McAuley K E, Yao J-X, Dodson E J, Lehmbeck J, Ostergaard P R & Wilson K S (2001) Acta Cryst D57, 1571-1578
8 Rajakannan V, Yamane T, Shirai T, Kobayashi T, Ito S & Velmurugan D (2004) J Synchrotron Rad 11, 64-67
9 Rajakannan V, Selvanayagam S, Yamane T, Shirai T, Kobayashi T, Ito S & Velmurugan D (2004) J Synchrotron Rad 11, 358-362
10 Selvanayagam S, Velmurugan D & Yamane T (2005) Acta Cryst A61, C151
11 Velmurugan D, Rajakannan V, Gayathri D, Banumathi S, Yamane T, Dauter Z, Dauter M & Sekar K (2006) Curr Sci 90, 1091-1099
12 Perrakis A, Morris R & Lamzin V S (1999) Nature Struct Biol 6, 458-463
13 Murshudov G N, Lebedev A, Vagin A A, Wilson K S & Dodson E J (1999) Acta Cryst D55, 247-255
14 Shirai T, Matsui Y, Shionyu-Mitsuyama C, Yamane T, Kamiya H, Ishii C, Ogawa T & Muramoto K (2002) J Mol Biol 321, 879-889
15 Murshudov G N, Grebenko A I, Brannigan J A, Antson A A, Barynin V V, Dodson G G, Dauter Z, Wilson K S & Melik-Adamyan W R (2002) Acta Cryst D58, 1972-1982
