

PREDICTION REPORT

Weighing in on a Timeless Controversy

Jason Perry*

Division of Biological Sciences, University of California at San Diego, La Jolla, California

ABSTRACT Timeless (Tim) and Period (Per) are coordinately synthesized interacting proteins that in response to positional/environmental cues comigrate to the nucleus as obligate heterodimers where they act to suppress their own gene expression as part of the circadian rhythm network in Drosophila. There has been considerable controversy about the structural nature of Tim because of somewhat conflicting results generated by the automated threading algorithm 3D-PSSM. With use of a computer-assisted but largely manual approach, it is shown here that Tim is composed of repetitive structural elements and that those elements comprise two ARM repeat domains, validating the essence of the original 3D-PSSM prediction. Eleven individual ARM structural units are assigned, and a phylogenetic analysis showing their relatedness to one another is performed. In addition, there appears to be a small domain of prenyltransferase-like repeats C-terminal to the second ARM domain, which went undetected in previous analyses. Although we cannot know the precise conformation it adopts until a structure is solved, Tim emerges here as clearly a member of the helical repeat protein superfamily. Given its role in periodic nuclear translocation, Tim may, therefore, have a functional analogy with the photoperiod-responsive protein Phor1 and other karyopherin-like molecules. Proteins 2005;61:699–703. © 2005 Wiley-Liss, Inc.

Key words: bioinformatics; helical repeat; ABRA (Alignment-Based Repeat Annotation)

INTRODUCTION

The simplest model of a circadian rhythm system is one with three components: (1) a sensory receptor or “input detector”; (2) an oscillator, which biochemically translates the input signal to regulate (3) the effector. The action of the effector component typically results in further spatiotemporal gene regulation, and ultimately, physiological effects. In Drosophila, the oscillator component comprises at least four proteins: Period (Per), Timeless (Tim), Clock (Clk), and Cycle (Cyc). Clk and Cyc are transcription factors that up-regulate the PER and TIM genes. Per is a cytosolic signal sensor protein that contains two PAS (Per-Arnt-Sim) domains and is a substrate for the DBT kinase, which promotes its degradation. Following the appropriate physiological cues, Per and Tim heterodimerize. This shields Per from DBT kinase phosphorylation and subsequent degradation, and then the Per-Tim complex translocates to the nucleus. Once inside the nucleus, Tim is degraded and Per strongly represses the activity of Clk and Cyc until it is again subject to DBT phosphorylation and degradation. This provides an efficient autoregulatory loop that coregulates transcription of the primary effector gene, which encodes the neuropeptide pigment dispersing factor (Pdf). A schematic for this network is provided in Supplemental Figure S1. (For recent reviews, see Refs. 1–3.)

The structural/biochemical nature of Tim has been the subject of a rather heated debate. Vodovar et al.4,5 used the threading algorithm 3D-PSSM to garner evidence that Tim was a member of the HEAT/ARM protein superfamily. In threading as first developed, amino acid side-chains are placed onto a peptide backbone fold, and the structural model is optimized by using a set of pair potentials plus a separate residue-specific solvation potential.6 3D-PSSM is an elegant adaptation of this idea that incorporates PSI-BLAST alignments and secondary structure information (rather than pair potentials) into the model along with the solvation potential.7 Application of certain fragments of Tim to 3D-PSSM produced marginally significant hits (E � 0.7) to a number of HEAT/ARM proteins including importin-α (ARM), importin-β (HEAT) and β-catenin (ARM).4 However, Kippert and Gerloff subsequently performed independent 3D-PSSM analyses with larger regions of Tim and found no significant homology to HEAT/ARM proteins, concluding that Tim could not be classified as such.8
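The original threading idea described above can be illustrated with a minimal sketch. It scores one sequence placed on one fixed fold as the sum of a pair term over residues in contact and a solvation term per residue. The function name, the contact list, the burial labels, and all numerical values are hypothetical and chosen only for illustration; they are not taken from Ref. 6 or from 3D-PSSM, which replaces the pair term with profile and secondary structure information.

```python
def threading_score(sequence, contacts, burial, pair_potential, solvation_potential):
    """Score a sequence placed on a fixed backbone fold (lower is better).

    sequence: residues assigned to fold positions 0..n-1
    contacts: (i, j) position pairs that are in contact in the fold
    burial: per-position environment label, e.g. "buried" or "exposed"
    pair_potential: {(res_a, res_b): energy}, keys sorted alphabetically
    solvation_potential: {(res, environment): energy}
    Missing table entries are treated as zero (an assumption of this sketch).
    """
    pair_term = 0.0
    for i, j in contacts:
        key = tuple(sorted((sequence[i], sequence[j])))
        pair_term += pair_potential.get(key, 0.0)

    solvation_term = 0.0
    for position, residue in enumerate(sequence):
        solvation_term += solvation_potential.get((residue, burial[position]), 0.0)

    return pair_term + solvation_term


# Toy, made-up potentials (illustrative only).
toy_pair = {("L", "L"): -1.0, ("K", "L"): 0.5}
toy_solvation = {("L", "buried"): -0.5, ("L", "exposed"): 0.4, ("K", "exposed"): -0.3}

score = threading_score(
    sequence="LKL",
    contacts=[(0, 2)],
    burial=["buried", "exposed", "buried"],
    pair_potential=toy_pair,
    solvation_potential=toy_solvation,
)
```

In a real threading run the same score would be evaluated for many alternative sequence-to-fold alignments and many folds, and the best-scoring combinations reported as hits.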

HEAT (Huntingtin-Elongation factor 3-PR65/A subunit of protein phosphatase 2A-TOR1) repeats are canonically defined by two antiparallel interacting α-helices connected by a disordered loop.9 A further defining feature of a HEAT repeat is a slight bend in the A helix, which is sometimes formed by an LLPXL amino acid sequence motif. ARM (Armadillo) repeats are evolutionary cousins of HEAT repeats. A principal difference is that the bend in the A helix is better represented by two separate helices, and ARM repeats emerge as three helix structural units.10 Both HEAT and ARM repeats occur in periodic arrays of multiple structural units, and the entire repeat-containing domain folds into a superhelix that serves as scaffolding to recruit other macromolecules. (See examples in Refs. 11 and 12.) Another characteristic of these and other types of structurally iterative proteins is that great degeneracy in amino acid sequence is easily accommodated. In some instances, sets of sequences of known functional orthologs cannot be properly aligned by even the newer generation algorithms such as T-coffee without significant manual adjustment.13 Furthermore, the loops both within repeats and connecting consecutive repeat units can vary greatly in both length and composition. Such elasticity is not a trivial parameter to incorporate into a search algorithm; therefore, members of the HEAT/ARM superfamily may slip through undetected as such,13 unless they are composed of very homogeneous repeat units, as is the case in the PR65/A subunit of protein phosphatase 2A.12 This means that even a marginally significant hit with a threading algorithm such as 3D-PSSM should not be dismissed on the basis of an unsatisfactory E-value without further analysis.

The Supplementary Materials referred to in this article can be found online at http://www.interscience.wiley.com/jpages.0887-3585/suppmat/

*Correspondence to: Jason Perry, Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093. E-mail: [email protected]

Received 27 April 2005; Accepted 1 July 2005

Published online 17 October 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.20706

PROTEINS: Structure, Function, and Bioinformatics 61:699–703 (2005)

© 2005 WILEY-LISS, INC.

Using an algorithm-assisted but largely manual methodology named ABRA (alignment-based repeat annotation), it is shown here that Tim is clearly composed of repetitive structural elements and that those elements comprise two ARM repeat domains, as suggested by the original 3D-PSSM automated threading analysis performed by Vodovar et al.4 In addition, there appears to be a small domain of prenyltransferase-like repeats C-terminal to the second ARM domain, which went undetected in previous analyses. The biological implications of these findings are discussed.

RESULTS AND DISCUSSION

ABRA is a more general extension of a method used previously to map divergent HEAT repeats.13 The basic premise is rather simple. One begins with either an automated T-coffee14 or ClustalW15 multiple-sequence alignment and then adjusts it manually (if necessary) after consideration of the following criteria: (1) all PSI-BLAST-generated pairwise alignments; (2) amino acid composition/position analysis (i.e., aliphatic hydrophobic in position 1, H-bond acceptor in position 2, etc.); and (3) secondary structure predictions (HNN, Jpred, etc.). In essence, this is simply a manual threading analysis before modeling the data onto a particular fold. With the optimized alignment in hand, repeating sequences are usually immediately evident in at least one sequence, and the repeats can then be inferred and delimited in all sequences based on the fewest number of structural units (α-helices, β-strands) required for one repeat iteration. Sometimes, as is the case for the Tim sequences analyzed here, the Radar algorithm (http://www.ebi.ac.uk/Radar/) can give clues as to where to begin looking for repeats, and hence nucleate the array, when it is not obvious from visual inspection.16 Sequential repetitive elements can then be aligned against one another for further refinement.
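Criterion (2) above, the composition/position analysis, is done by eye in ABRA, but the bookkeeping behind it can be sketched in a few lines. The example below takes repeat units that have already been aligned to one another and reports, for each column, the most common residue class and how many rows share it. The residue class groupings, the function names, and the three toy sequences are assumptions made for illustration; they are not the author's classes and are not Tim sequences.

```python
from collections import Counter

# Hypothetical coarse residue classes, chosen for illustration only.
RESIDUE_CLASSES = {
    "aliphatic": set("AVLIM"),
    "aromatic": set("FWY"),
    "acidic": set("DE"),
    "basic": set("KRH"),
    "polar": set("STNQ"),
    "special": set("GPC"),
}


def classify(residue):
    """Map a one-letter residue code to its coarse class."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    return "other"


def column_profile(aligned_repeats):
    """Return (dominant class, fraction of rows) for every alignment column.

    Gaps ("-") are ignored when picking the dominant class, but the fraction
    is taken over all rows, so gappy columns score lower.
    """
    if len({len(r) for r in aligned_repeats}) != 1:
        raise ValueError("aligned repeats must all have the same length")
    profile = []
    for column in zip(*aligned_repeats):
        classes = Counter(classify(res) for res in column if res != "-")
        if not classes:
            profile.append(("gap", 0.0))
            continue
        top_class, count = classes.most_common(1)[0]
        profile.append((top_class, count / len(column)))
    return profile


# Toy aligned repeat units (made up for the example).
toy_repeats = ["LDA", "IEG", "V-A"]
toy_profile = column_profile(toy_repeats)
```

A column in which nearly every repeat carries the same class (for example, aliphatic in position 1) supports the register of the alignment; a column with no dominant class is a candidate for manual adjustment.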

For Tim, there are several clues that suggest it is a helical repeat protein, and they are shown in Figure 1. First are the marginally significant hits to HEAT/ARM proteins that Vodovar et al. found by using the D. melanogaster sequence in the original 3D-PSSM analysis.4 Next, current submission of the same Tim sequence to the Polish metaserver (http://bioinfo.pl/meta/) finds all-α proteins as the leading candidates from several prediction servers including 3D-PSSM, mGenTHREADER, and FUGUE, with significant 3D-Jury scores.7,17–19 mGenTHREADER specifically finds HEAT/ARM proteins as its top candidates. Third, the Radar algorithm can clearly identify the presence of repetitive elements in the presumed ortholog from A. gambiae. Finally, there are at least two domains of high α-helicity as judged by the secondary structure algorithm HNN (http://us.expasy.org/tools/).

The ABRA methodology can be applied to an alignment of the Tim sequences from D. melanogaster (GenBank:AAN10371), D. virilis (GenBank:AAB94891), C. costata (DDBJ:BAB91179), A. gambiae (GenBank:EAA12266), and A. pernyi (GenBank:AAF66996), using the Radar-identified repetitive element in the A. gambiae sequence to nucleate the array (Supplemental Figure S2). The results of the analysis for the disputed D. melanogaster Tim protein are shown in Figure 2(A). The observed repetitive elements are assigned and aligned with respect to one another. Tim emerges with two clusters of three-helix ARM-like repeats, just as originally predicted by 3D-PSSM, followed by a set of at least three two-helix units, which appear most similar in sequence to the protein prenyltransferase α-subunit repeats (a TPR derivative) listed in the Pfam database (PF01239) (http://pfam.wustl.edu/hmmsearch.shtml). Repeat arrays are thought to arise by tandem duplication, and the relatedness of the defined ARM repeats to one another can be shown by automated phylogenetic analysis using the cluster algorithm (http://www.genebee.msu.su/services/phtree_reduced.html) [Fig. 2(B)]. The overall architecture of Tim is shown in Figure 2(C). Four ARM repeats are followed by a large low-complexity region, which itself contains repetitive elements, and is followed by a domain of seven more ARM repeats and then a low-complexity linker connecting to a domain of three prenyltransferase repeats followed by more low-complexity sequence. A possibility exists that there are additional divergent prenyltransferase repeats hidden within the C-terminal low-complexity region, but this region appears not to be conserved by A. gambiae. A number of stray helices and helical extensions to one end or the other of various repeat units are predicted. The large low-complexity loops and additional helices found in the Tims are common in these types of proteins,13,20,21 further complicating their analysis by a strictly computer algorithm-based approach.
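A tree of repeat units such as the one in Figure 2(B) is built from pairwise distances between the aligned repeats. The sketch below shows one simple way such distances could be computed, as one minus the fractional identity over positions where neither repeat has a gap. This is an assumed, simplified distance and not the measure used by the GeneBee cluster algorithm; the function names and the toy sequences are hypothetical.

```python
def fractional_identity(a, b):
    """Fraction of identical residues over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def distance_matrix(repeats):
    """Return {(name_a, name_b): distance} for every ordered pair of repeats."""
    return {
        (p, q): 1.0 - fractional_identity(repeats[p], repeats[q])
        for p in repeats
        for q in repeats
    }


# Toy aligned repeat units (made up; not Tim sequences).
toy_units = {"r1": "LDAL", "r2": "LDAV", "r3": "I-GV"}
toy_distances = distance_matrix(toy_units)
```

Any standard clustering or tree-building method can then be run on the matrix; repeats that arose from a recent tandem duplication would be expected to sit close together.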


Although we cannot know the exact conformation Tim will adopt in the absence of an experimentally determined structure, with the present information in hand, the biological role of Tim is inferred to be that of a conditional, unidirectional karyopherin. Karyopherins are HEAT/ARM molecules that are perpetually cycling in and out of the nucleus, shuttling cargo molecules in either direction. In the nuclear import cycle, a cargo molecule with affinity for the karyopherin is brought through the nuclear pore ensemble as a karyopherin-cargo complex. Once inside the nucleus, Ran-GTP binds to the karyopherin and the cargo is released. The Ran-GTP-karyopherin complex is either exported back to the cytosol as such or, in the case of an exportin, as a ternary complex with a different cargo molecule targeted for nuclear export. Once the complex reaches the cytosol, the GTP is hydrolyzed to Ran-GDP and the complex dissociates, leaving the karyopherin modules primed for another cycle. (For recent reviews on karyopherin biology, see Refs. 22–24.)

Tim apparently only moves on cue and in one direction; it brings Per into the nucleus at the appropriate time and then it is destroyed, liberating Per to have its maximal inhibitory effect on the Clk/Cyc transcription factors and down-regulating PDF. Karyopherins sometimes act as heterodimers; the classical case is the karyopherin-β1/karyopherin-α complex in which karyopherin-α serves as an adaptor module to recognize bipartite basic nuclear localization sequences found in many nuclear proteins.22 The finding of the prenyltransferase-like repeats in conjunction with the ARM domains in Tim raises the possibility that there could be another, yet unidentified, protein that binds to Tim to either facilitate the nuclear translocation of the Tim/Per complex or to actively sequester Tim in the cytosol until the appropriate signal is given to heterodimerize with Per.

Fig. 1. Algorithm-derived clues that Tim is a helical repeat protein. A: Original 3D-PSSM analysis of segments of Tim from D. melanogaster. B: Current fold assignments suggested by submission of Tim from D. melanogaster to the Polish metaserver. A 3D-Jury score of >50 is deemed significant. It is of interest that these folds were deemed significant matches in the absence of a significant PDB-Blast score. C: Radar analysis of a segment of Tim from A. gambiae. D: HNN secondary structure analysis of Tim from D. melanogaster.

The concept of an ARM repeat protein migrating to the nucleus in response to positional/environmental cues is not unprecedented. PHOR1 (photoperiod-responsive 1) is up-regulated in the short daylight cycles required for potato tuberization. Phor1 is an ARM repeat protein that is localized in the cytosol until a light-linked gibberellin signal results in its translocation to the nucleus.25 Phor1, or perhaps its yet unidentified cargo, then derepresses gene expression by interacting with the DNA-bound transcriptional repressors Ga1 and Rga. Thus, Phor1 is implicated in gene regulation in response to the plant hormone cycle and the photoperiod. This could be functionally analogous to Tim/Per nuclear translocation to regulate Clk/Cyc gene expression in response to circadian rhythms in Drosophila.

Tim emerges from this analysis clearly as a member of the helical repeat superfamily. There are two ARM domains composed of four structural units and seven structural units, respectively, which are linked by a large low-complexity loop. There also appears to be an additional C-terminal domain of two-helix structural units most similar to prenyltransferase repeats. Given the structural nature of Tim plus what is already known about its biological function, Tim emerges as a likely conditional and unidirectional karyopherin functional analog.
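The domain architecture summarized above can be written down as a small ordered data structure, which makes the unit counts easy to check. This is only a sketch: the class and field names are hypothetical, and no residue coordinates are included because none are given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One segment of the protein, listed from N- to C-terminus."""
    kind: str
    repeat_units: int = 0  # zero for regions without assigned repeat units


# Order and unit counts as described in the text for D. melanogaster Tim.
TIM_ARCHITECTURE = [
    Region("ARM", 4),
    Region("low-complexity"),
    Region("ARM", 7),
    Region("low-complexity linker"),
    Region("prenyltransferase-like", 3),
    Region("low-complexity"),
]


def count_units(architecture, kind):
    """Total number of repeat units of the given kind across all regions."""
    return sum(region.repeat_units for region in architecture if region.kind == kind)


arm_units = count_units(TIM_ARCHITECTURE, "ARM")
prenyl_units = count_units(TIM_ARCHITECTURE, "prenyltransferase-like")
```

The ARM total of four plus seven agrees with the eleven ARM structural units assigned in the analysis.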

ACKNOWLEDGMENTS

The author thanks the anonymous reviewers whose insights have strengthened this article and the members of the various genome sequencing and analysis consortia whose tireless efforts make this type of work possible.

Fig. 2. Results of an ABRA analysis of Timeless from D. melanogaster. A: The repetitive sequence elements found in Tim are assigned and aligned to one another (manually adjusted T-coffee alignment). B: Phylogenetic analysis (http://www.genebee.msu.su/services/phtree_reduced.html) (cluster algorithm) of the ARM repeats found in Tim. The numbers at the branching points represent the percentage of times each branch topology was found during bootstrap analysis. C: Overall domain architecture of Drosophila Tim. The diagram is approximately to scale.

REFERENCES

1. Young MW. Life’s 24-hour clock: molecular control of circadian rhythms in animal cells. Trends Biochem Sci 2000;25:601–606.
2. Levine JD. Sharing time on the fly. Curr Opin Cell Biol 2004;16:210–216.
3. Salome PA, McClung CR. The Arabidopsis thaliana clock. J Biol Rhythms 2004;19:425–435.
4. Vodovar N, Clayton JD, Costa R, Odell M, Kyriacou CP. The Drosophila clock protein Timeless is a member of the Arm/HEAT family. Curr Biol 2002;12:R610–R611.
5. Kyriacou CP, Odell M. No ARM in it? Curr Biol 2004;14:R652–R653.
6. Jones DT, Taylor WR, Thornton JM. A new approach to protein fold recognition. Nature 1992;358:86–89.
7. Kelley LA, MacCallum RM, Sternberg MJ. Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 2000;299:499–520.
8. Kippert F, Gerloff DL. Timeless and Armadillo: a link too far. Curr Biol 2004;14:R650–R651.
9. Andrade MA, Bork P. HEAT repeats in the Huntington’s disease protein. Nat Genet 1995;11:115–116.
10. Andrade MA, Petosa C, O’Donoghue SI, Muller CW, Bork P. Comparison of ARM and HEAT protein repeats. J Mol Biol 2001;309:1–18.
11. Graham TA, Weaver C, Mao F, Kimelman D, Xu W. Crystal structure of a beta-catenin/Tcf complex. Cell 2000;103:885–896.
12. Groves MR, Hanlon N, Turowski P, Hemmings BA, Barford D. The structure of the protein phosphatase 2A PR65/A subunit reveals the conformation of its 15 tandemly repeated HEAT motifs. Cell 1999;96:99–110.
13. Perry J, Kleckner N. The ATRs, ATMs, and TORs are giant HEAT repeat proteins. Cell 2003;112:151–155.
14. Notredame C, Higgins DG, Heringa J. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 2000;302:205–217.
15. Thompson JD, Higgins DG, Gibson TJ. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22:4673–4680.
16. Heger A, Holm L. Rapid automatic detection and alignment of repeats in protein sequences. Proteins 2000;41:224–237.
17. McGuffin LJ, Jones DT. Improvement of the GenTHREADER method for genomic fold recognition. Bioinformatics 2003;19:874–881.
18. Shi J, Blundell TL, Mizuguchi K. FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol 2001;310:243–257.
19. Ginalski K, Elofsson A, Fischer D, Rychlewski L. 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics 2003;19:1015–1018.
20. Choi J, Chen J, Schreiber SL, Clardy J. Structure of the FKBP12-rapamycin complex interacting with the binding domain of human FRAP. Science 1996;273:239–242.
21. Cingolani G, Petosa C, Weis K, Muller CW. Structure of importin-beta bound to the IBB domain of importin-alpha. Nature 1999;399:221–229.
22. Chook YM, Blobel G. Karyopherins and nuclear import. Curr Opin Struct Biol 2001;11:703–715.
23. Mosammaparast N, Pemberton LF. Karyopherins: from nuclear-transport mediators to nuclear-function regulators. Trends Cell Biol 2004;14:547–556.
24. Harel A, Forbes DJ. Importin beta: conducting a much larger cellular symphony. Mol Cell 2004;16:319–330.
25. Amador V, Monte E, Garcia-Martinez JL, Prat S. Gibberellins signal nuclear import of PHOR1, a photoperiod-responsive protein with homology to Drosophila armadillo. Cell 2001;106:343–354.