649-659, 2006 bioinformatic prediction of polymerase ... · bioinformatic prediction of polymerase...

12
Biol Res 39: 649-659, 2006 BR Bioinformatic prediction of polymerase elements in the rotavirus VP1 protein RODRIGO VÁSQUEZ-DEL CARPIO * , JAIME L MORALES, MARIO BARRO * , ALBA RICARDO, EUGENIO SPENCER * Laboratorio de Virología, Facultad de Química y Biología, Universidad de Santiago de Chile, Casilla 40 Correo 33, Santiago, Chile. * Present address: Laboratory of Infectious diseases, NIAID, National Institutes of Health, 50 South Drive MSC 8026, Room 6314, Bethesda, MD 20892. SUMMARY Rotaviruses are the major cause of acute gastroenteritis in infants world-wide. The genome consists of eleven double stranded RNA segments. The major segment encodes the structural protein VP1, the viral RNA- dependent RNA polymerase (RdRp), which is a minor component of the viral inner core. This study is a detailed bioinformatic assessment of the VP1 sequence. Using various methods we have identified canonical motifs within the VP1 sequence which correspond to motifs previously identified within RdRps of other positive strand, double-strand RNA viruses. The study also predicts an overall structural conservation in the middle region that may correspond to the palm subdomain and part of the fingers and thumb subdomains, which comprise the polymerase core of the protein. Based on this analysis, we suggest that the rotavirus replicase has the minimal elements to function as an RNA-dependent RNA polymerase. VP1, besides having common RdRp features, also contains large unique regions that might be responsible for characteristic features observed in the Reoviridae family. Key terms: Rotavirus, genome replication, RNA polymerase, bioinformatic analysis. *Corresponding author. Mailing address: Laboratorio de Virologia, Facultad de Química y Biología, Universidad de Santiago de Chile, Casilla 40 Correo 33, Santiago, Chile. Phone: (562) 681-0185. Email: [email protected] Received: April 3, 2006. Accepted: May 23, 2006 The abbreviations used are: RdRp, RNA-dependent RNA polymerase; dsRNA, double stranded RNA; mRNA, messenger RNA; aa, amino acid(s); kDa, kilo Dalton. INTRODUCTION Rotaviruses, members of the Reoviridae family, are the major world-wide cause of acute gastroenteritis in infants and young children (Parashar et al., 2003). The rotavirus virion is a non-enveloped icosahedron, consisting of three concentric protein layers and a viral genome composed of eleven double stranded RNA (dsRNA) segments (Kapikian et al., 2001). These segments encode six structural and six non- structural proteins (Estes, 2001). Segment one encodes the structural protein VP1, the rotavirus putative RNA-dependent RNA polymerase (RdRp) (Eiden & Hirshon, 1993, Gallegos & Patton, 1989, Valenzuela et al., 1991). This enzyme is proposed to possess transcriptase and replicase functions. These activities catalyze the formation of the viral mRNA (plus strand synthesis) and the dsRNA genome (minus strand synthesis), respectively (Eiden & Hirshon, 1993, Estes, 2001, Patton et al., 2003). VP1 together with VP3, the capping enzyme (Liu et al., 1992, Pizarro et al., 1991), are the minor components of the viral core (a single copy of each per five- fold lattice) (Prasad et al., 1996). VP1 specifically recognizes multiple signals contained in the last 60 nucleotides of the 3’ end of the viral mRNA of gene 8 (Patton, 1996, Tortorici et al., 2003). To date, the region(s) of VP1 responsible for this specific recognition of the 3’ end of the plus strand has not been identified. This is

Upload: buikhue

Post on 01-Oct-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

649VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659Biol Res 39: 649-659, 2006 BRBioinformatic prediction of polymerase elements in therotavirus VP1 protein

RODRIGO VÁSQUEZ-DEL CARPIO*, JAIME L MORALES, MARIO BARRO*,ALBA RICARDO, EUGENIO SPENCER*

Laboratorio de Virología, Facultad de Química y Biología, Universidad de Santiago de Chile, Casilla 40Correo 33, Santiago, Chile.* Present address: Laboratory of Infectious diseases, NIAID, National Institutes of Health, 50 South DriveMSC 8026, Room 6314, Bethesda, MD 20892.

SUMMARY

Rotaviruses are the major cause of acute gastroenteritis in infants world-wide. The genome consists of elevendouble stranded RNA segments. The major segment encodes the structural protein VP1, the viral RNA-dependent RNA polymerase (RdRp), which is a minor component of the viral inner core. This study is adetailed bioinformatic assessment of the VP1 sequence. Using various methods we have identified canonicalmotifs within the VP1 sequence which correspond to motifs previously identified within RdRps of otherpositive strand, double-strand RNA viruses. The study also predicts an overall structural conservation in themiddle region that may correspond to the palm subdomain and part of the fingers and thumb subdomains,which comprise the polymerase core of the protein. Based on this analysis, we suggest that the rotavirusreplicase has the minimal elements to function as an RNA-dependent RNA polymerase. VP1, besides havingcommon RdRp features, also contains large unique regions that might be responsible for characteristicfeatures observed in the Reoviridae family.

Key terms: Rotavirus, genome replication, RNA polymerase, bioinformatic analysis.

*Corresponding author. Mailing address: Laboratorio de Virologia, Facultad de Química y Biología, Universidad deSantiago de Chile, Casilla 40 Correo 33, Santiago, Chile. Phone: (562) 681-0185. Email: [email protected]

Received: April 3, 2006. Accepted: May 23, 2006

The abbreviations used are: RdRp, RNA-dependent RNA polymerase; dsRNA, double stranded RNA; mRNA,messenger RNA; aa, amino acid(s); kDa, kilo Dalton.

INTRODUCTION

Rotaviruses, members of the Reoviridaefamily, are the major world-wide cause ofacute gastroenteritis in infants and youngchildren (Parashar et al. , 2003). Therotavirus virion is a non-envelopedicosahedron, consisting of three concentricprotein layers and a viral genome composedof eleven double stranded RNA (dsRNA)segments (Kapikian et al., 2001). Thesesegments encode six structural and six non-structural proteins (Estes, 2001). Segmentone encodes the structural protein VP1, therotavirus putative RNA-dependent RNApolymerase (RdRp) (Eiden & Hirshon,1993, Gallegos & Patton, 1989, Valenzuelaet al., 1991). This enzyme is proposed to

possess transcriptase and replicasefunctions. These activities catalyze theformation of the viral mRNA (plus strandsynthesis) and the dsRNA genome (minusstrand synthesis), respectively (Eiden &Hirshon, 1993, Estes, 2001, Patton et al.,2003). VP1 together with VP3, the cappingenzyme (Liu et al., 1992, Pizarro et al.,1991), are the minor components of theviral core (a single copy of each per five-fold lattice) (Prasad et al., 1996).

VP1 specifically recognizes multiplesignals contained in the last 60 nucleotidesof the 3’ end of the viral mRNA of gene 8(Patton, 1996, Tortorici et al., 2003). Todate, the region(s) of VP1 responsible forthis specific recognition of the 3’ end of theplus strand has not been identified. This is

VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659650

not unexpected due to the lack ofinformation about the structural-functionalfeatures of this viral protein, i.e. thelocation of the nucleic acid-protein and theprotein-protein interacting regions in VP1are unknown. This can be attributed in partto its large size (1,088 aa, 125 kDa), to thelack of a solved structure and to therequirement of VP1 for the core latticeprotein (VP2) to form a competentreplication complex (Patton, 1996, Tortoriciet al., 2003). To date, the VP1 sequence hasbeen poorly studied; specifically, fewamino acids have been suggested to play animportant role in protein function. Previousstudies include the identification ofconserved residues by full sequencealignments of VP1 from group A, B and Crotaviruses (Bremont et al., 1992, Eiden &Hirshon, 1993, Mitchell & Both, 1990), andby short alignments using other viralreplicases of positive strand RNA viruses(Cohen et al., 1989, Mitchell & Both,1990). The function of these conservedamino acids has been described in otherviral RNA polymerases (i.e. phage φ6,HCV, Poliovirus RdRp), using biochemical(site-directed mutagenesis) and/orbiophysical methods (crystallography andX-ray diffraction) (Bressanelli et al., 1999,Butcher et al., 2001, Ribas & Wickner,1992). In such enzymes, these residues areorganized in typical motifs and are situatedin the catalytic core of the polymerases,playing an important role in catalysis. Inrotavirus, there is no experimental data toshow that these amino acids form part ofthe motifs and could be involved inpolymerization or interaction with thetemplate. Some early studies havesuggested the presence of these motifs inthe rotavirus VP1 protein by sequencecomparison of viral RdRps (Bruenn, 1991,Almanza et al., 1994). However, thesestudies did not clearly describe the classicalmotifs containing the conserved aminoacids, or analyze the VP1 sequence forfeatures characteristic for other viral RNApolymerases.

The vast information available aboutviral RNA polymerases has allowed us toperform a detailed study of the rotavirusVP1 sequence using a bioinformatic

approach. This study is based on theobserved conservation of the secondarystructure of the palm subdomain in otherreplicases of positive strand and doublestrand RNA viruses. This conservationhelped us first, to visualize and describe thepossible motifs and amino acids that maybe implicated in polymerization and sugarselection; and second, to delimit thepolymerase region in the VP1 sequenceguided by the predicted structuralconservation in the middle part of theprotein. Finally, the prediction of apolymerase region in VP1 enabled us toobserve, by a conserved domain search, thepossible tertiary structure of the centralregion of the protein.

METHODS

Cell culture, virus propagation and RNApurification

Rhesus monkey fetal kidney cells (MA104)were maintained in minimal essential media(MEM) supplemented with 10% fetalbovine serum and grown at 37 oC. SimianRotavirus strain SA11 was propagated andtitrated in these cells. The dsRNA waspurified from viral particles as previouslydescribed (Spencer & Arias, 1981). Thegene 1 was purified from the viral dsRNAgenome by gel electrophoresis in 0.8% lowmelt point (LMP) agarose followingpreviously described procedures (Sambrooket al., 1989); approximately 50 ng ofpurified gene one dsRNA was used for eachRT-PCR amplification reaction.

cDNA synthesis, cloning and sequencedetermination

The complete open reading frame of geneone, which corresponds to VP1, wasamplified using a two step RT–PCRreaction: AMV-RT (Promega) and Elongase(Invitrogen) were used for the cDNAsynthesis and amplification steps,respectively. Using combinations ofprimers that contain Sal I or Hind IIIrestriction sites, the amplified productswere cloned in pFastBac Hta (Invitrogen)

651VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659

and transformed into Escherichia coliDH5α following general protocolsdescribed elsewhere (Sambrook et al.,1989). A combination of the primers: (5’-TAGCGTCGACGAATGGGGAAGTACAATCTAATC-3’; 5’ GGGAAGCTTCTATGGTTTATCAACATTCACTGG-3’), (5’-TAGCGTCGACGACCAGTGAATGTTGATAAACCA-3’; 5’-GGGAAGCTTCTATCTCTTTTCATTATTAAGTAG-3’) and (5’-TAGCGTCGACGACTACTTAATAATGAAAAGAGA-3’; 5’-GGGAAGCTTCTAATCTTGAAAGAAGTTCGCGTT-3’), was used to amplifythe nucleic acid sequence corresponding tothe N-termini, middle and C-termini regionof VP1, respectively. Transformants wereselected on LB agar plates supplementedwith ampicillin (100 mg/ml). Plasmidpurification was realized by the commonalkaline lysis procedure (Sambrook et al.,1989). The presence of the correct insertwas confirmed by restriction enzymedigestion and PCR. The DNA wasvisualized by electrophoresis on agarosegels [0.9%] stained with ethidium bromide[0.5 μg/ml]. The insert was sequenced atleast three times by automated sequencingwith an ABI PRISM 3100 genetic analyzer(PE Applied Biosystems), the full sequencewas determined by contiguous assembly.

Sequence manipulation and analysis

The obtained VP1 sequence was comparedwith various RdRp sequences available inthe Genbank database (http://www.ncbi.nih.gov/Genbank/index.html).The alignments were made using ClustalW3.0 (http://www2.ebi.ac.uk/clustalw/) andedited in BOXSHADE 3.21 (http://w w w . c h . e m b n e t . o r g / s o f t w a r e /BOX_form.html). The secondary structureprediction of complete amino acidsequences was made using the PSIPREDprotein structure prediction server (http://insulin.brunel.ac.uk/psipred/). Thevisualization of the possible tertiarystructure of the palm subdomain wasinitially performed using the conserveddomain (CD) search option in the Blastpage (http://www.ncbi.nlm.nih.gov/BLAST/), where the initial output is analignment with various polymerases

including among them a protein with asolved structure. The alignment between thequery protein and the protein with a solvedstructure allows the modeling programCn3D 4.0 (NCBI/NIH), to show the regionsof similarity on the crystal structure of thesolved protein. The motif searcher programMEME (GCG/Wisconsin package) was usedto find and confirm the previous motifsidentified by visual determination. Theinput to the program consisted of anappropriate pool of sequences (i.e. shortsequences containing the described motifs).The sequences were of 457 aa in length andcorresponded to the middle region of theproteins including, for instance, the RdRpcanonical motifs.

Sequence accession numbers

The amino acid sequences of RdRps fromvarious viruses selected for this studywere: brome mosaic virus (BMV) (822 aa)accession No. CAA41362, tobacco mosaicvirus (TMV) (1,616 aa) No. AAD44327.1,tomato bushy stunt virus (TBSV) (818 aa)No. CAB56480, poliovirus (462 aa) No.NP_041277, encephalomyocarditis virus(EMCV) (459 aa) No. GNNYEB, foot andmouth disease virus (FMDV) (470 aa) No.AAG35699, hepatitis A virus (HAV) (415aa) No. CAA33490, hepatitis C virus(HCV) (591 aa) No. CAB10747,bacteriophage Qβ (589 aa) No.CAA32872, vesicular stomatitis virus(VSV) (2,109 aa) No. NP_041716,infectious haematopoietic necrotic virus(IHNV) (1,986 aa) No. CAA61500,measles virus (MV) (2,183 aa) No.AAD29097, avian rotavirus P013 strain(1,088 aa) No. BAA24146, humanrotavirus IDIR strain (1,159 aa) No.A44280, porcine rotavirus Cowden strain(1,082 aa) No. P1XRPC, grass carpreovirus VP2 (1,274 aa) No. AAG10436,blue tongue virus (BTV) (1,302 aa) No.RRXRBT, Saccharomyces cerevisiae L-Avirus (731 aa) No. AAA50508, P2bacteriophage φ6 (665 aa) No. AAA32355and P2 bacteriophage φ13 (713 aa) No.AAG00444. The accession number of therotavirus VP1 sequence originated in thisstudy is DQ457016.

VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659652

RESULTS

Comparison of VP1 sequences available inthe database.

The complete VP1 nucleic and amino acidsequence reported in this study was used as atemplate to search for other related rotavirusVP1 sequences in the database (http://www.ncbi.nlm.nih.gov/BLAST/). The VP1sequences were then aligned using theprogram ClustalW. Interestingly, when theamino acid sequence is aligned with VP1sequences from the same group (A) (i.e.bovine UK and RF, porcine Gottfried, avian)or with other rotavirus groups (IDIR-B,Cowden-C), the main differences were foundin the amino- and carboxi-termini of theprotein and not in the central region (datanot shown). This suggests that the middlesection of the protein represents a region ofhigh homology with respect to the aminoacid variations that are seen for the rest ofthe protein and for instance this zone couldhave an important role in enzyme function,for example, polymerization.

Localization of canonical motifs displayedby other viral RdRps in the rotavirus VP1sequence.

In different families of polymerases, or evenwithin the same family there are only a fewconserved amino acids, and these areessential for the catalytic function of theenzyme. These amino acids are in aparticular and strictly structural context inthe catalytic core of the enzyme (O’Reilly &Kao, 1998). Some of these amino acids areinvolved in the coordination of the bivalentcations needed for the nucleophilic attackmediated by a two-metal-ion mechanism(Steitz, 1998), while other residues areimportant for sugar selection (Brautigam &Steitz, 1998, O’Reilly & Kao, 1998). Themotifs that contain these conserved aminoacids are well described for different groupsof polymerases, in particular the motifinvolved in the coordination of the bivalentions (motif C). The nature of these motifs isdiverse, they vary depending on the sugarselection of the enzyme and according to theclass of polymerase that they belong to, for

example, reverse transcriptases (RTs),multimeric DNA-dependent RNApolymerases (RNA pols), RdRps (Joyce &Steitz, 1995). Several works previouslydescribed the nature of the motifs shared byviral RdRps (Koonin, 1991, Koonin et al.,1989, O’Reilly & Kao, 1998, Poch et al.,1989). This information was used to searchthe rotavirus VP1 sequence for the classicalmotifs described in other RdRps. In order toaccomplish this, our initial approach was tosearch for motifs in VP1 guided by thedefined structural context in which these aresituated. The most notorious of these is theubiquitous motif C, which is strictly locatedbetween two short beta strands usuallypresent in the middle portion of the protein(O’Reilly & Kao, 1998). Koonin et al, hadinitially described eight motifs in positivestrand RNA viruses, where three of them arepresent in almost every RdRp, with very fewexceptions (Shwed et al., 2002). We wereable to find, guided by a particularsecondary structure context, these threemotifs (A, B and C, or often called motifsIV, V and VI), as well as two of the otherfive motifs initially described in positivestrand RNA viruses, motifs D and F (Fig. 1)(Johnson et al., 2001, Koonin et al., 1989,O’Reilly & Kao, 1998). All of the motifsthat were identified are in the same classicalarrangement as is seen for other positivestrand, double-strand RNA viruses (O’Reilly& Kao, 1998). In the case of motif E, whichis involved in a hydrophobic interaction withthe thumb subdomain (O’Reilly & Kao,1998), we were unable to define the motifprecisely. This was due to the fact that themotif could not be located in a correctstructural boundary, which is usuallydescribed to have serine residues and to bein a rich beta strand region following themotif D (Johnson et al., 2001) (Fig. 1). Oncethe motifs were defined, short sequencealignments with other positive, and negativestrand, double-strand RNA viruses weremade (Fig. 2). The conserved amino acidsthat we found in VP1 are in a very definedstructural context as was described for otherviral RdRps. The results also show that themotifs belonging to positive strand anddsRNA viruses are much more similarbetween them than when compared with the

653VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659

representatives of the negative strand RNAviruses. This could give us an idea of thephylogeny of the RNA viruses, wherepossibly the minus strand RNA viruses couldhave had an early divergence or,alternatively, have a different origin. Thissuggestion is in agreement with aphylogenetic tree made in the MEGAprogram (Version 2.1) (data not shown) andwith more extended studies (Zanotto et al.,1996).

In a second approach, with the aim tofind novel motifs in the rotavirus sequence,we performed a motif search using theMEME program (GCG/Wisconsin package)(Bailey & Elkan, 1994). The results of theprogram show the presence of the canonicalmotifs in the VP1 sequence as previouslyidentified in this study (Fig. 1) and no novelmotifs. The results obtained using thisprogram serve to define similarities withthe motifs detected earlier by visualanalysis of the rotavirus VP1 sequence.

In order to find other amino acids ofinterest in the rotavirus replicase, weperformed a PSI-Blast protein search (non-redundant database), using small VP1sequence segments. Each segmentcorresponds to 128 aa of the full sequencewith an overlap of 64 aa between them. Thisapproach was made with the aim to get onlyshort nearly exact matches and for instance,to avoid the polymerase region bias during

Figure 1: Identification of the canonical motifs present in other viral RdRps in rotavirus VP1. Themotifs in the VP1 protein are in a particular structural context and in a specific arrangement, as isthe case for other viral RdRps. The identified motifs are shown by underlines of different colors. Inaddition a tentative location for motif E is shown. Alpha helix and beta strand predicted secondarystructures are represented as an arrow and a tubular structure, respectively. Numbers denote theamino acid at which the sequence shown starts or ends.

the search. The search was performed untilthe iterations converged. Only one site ofinterest with a value sufficiently higher thanthe threshold background level was found. Itis located towards the C-terminus, betweenthe amino acids 802 and 856. This sitepresents similarity with mitochondrialATPases/ATP synthases, and could be anovel ATP binding site in VP1, different tothe more common NTP binding loop usedfor polymerization, which is composed ofamino acids corresponding to the fingers andpalm subdomains (motif F) (Butcher et al.,2001).

Prediction of a polymerase region in VP1based on the RdRp structural conservationof other positive strand and dsRNA viruses

Despite their great structural heterogeneity,distinct types of polymerases share commonfeatures that are responsible for theircatalytic function of polymerization ofnucleic acids (Steitz, 1998, Steitz, 1999).Their domains have a similar overallarchitecture, with a conformation resemblinga right-hand (Brautigam & Steitz, 1998,Cramer, 2002). The polymerases can beclassified by their catalytic function, forexample, according to the nature of thetemplate that is recognized and to the type ofsubstrate (rNTPs or dNTPs) that is used forthe polymerization, and/or to the structural

VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659654

architecture of their subdomains (fingers,palm and thumb). The RdRps, as is the casefor other classes of polymerases, share aparticular overall structural conservation oftheir basic domains (fingers, palm andthumb) within the family (O’Reilly & Kao,1998). With the objective to determine if therotavirus polymerase adhered to thisstructural conservation, we performed thesecondary structure prediction of the VP1sequence together with RdRps from positivestrand and dsRNA viruses. The programPSIPRED was used to predict the secondarystructures of the different viral sequences.Results of the predictions suggest astructural conservation of the polymerasecore region in the RdRps secondarystructure, as was previously described (Fig.

3) (O’Reilly & Kao, 1998). Thisconservation is extended to the predictedrotavirus VP1 secondary structure, where theconserved part is in its middle region (Fig.3). This region, similar to the other viralRNA polymerases, is predicted to be locatedthe palm subdomain and part of the fingerssubdomain. The predicted conservation inthe middle part of the protein is inconcordance with the distribution of themotifs and amino acids implicated in thepolymerization reaction (located mostlywithin the palm subdomain) (Brautigam &Steitz, 1998). The observed degree ofconservation decreases towards the amino-and carboxy-termini, zones in which themain part of the fingers (the distal region tothe palm) and the complete thumb

Figure 2: Alignments of various viral RdRp sequences from positive, double and negative strandedRNA viruses. The three canonical motifs located in a precise structural context in the RdRps areshown. Grey highlighting denotes amino acid similarity. Black highlighting denotes identitybetween the sequences belonging to the same group. The amino acids highlighted with differentcolors (yellow, cyan, red or blue) also represent identity, and are the conserved amino acidsdescribed in several studies that comprise the active site and are critical for the polymerase activity.Numbers at the left of each sequence show the position at which the residue starts in the fullsequence. Graphics in the upper part of the alignment show the representative and conservedsecondary structure predicted for the motifs. The blue line shown in the double stranded groupindicates rotavirus sequences.

655VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659

subdomains are probably located. Finally,close to the N- and C-termini of the protein,the predicted structural similarity completelybreaks down. The polymerase region doesnot cover these zones of the protein, and ispredicted to end approximately 320 aa fromthe N-termini and 220 from to the C-terminiof the VP1 sequence, thus comprising almosthalf of the total protein sequence length (Fig.3). The N- and C-termini of the protein varygreatly from one polymerase to another,these unique regions may be involved in thespecific recognition of template by thepolymerase, or alternatively, may give somespecial catalytic properties to the protein (eg.a proofreading domain in the Klenowfragment, or a RNase H domain in the RT ofHIV) (O’Reilly & Kao, 1998).

Visualization of a possible three-dimensional structure of the VP1 middleregion based on the partial sequencesimilarity with the rabbit hemorrhagicdisease virus polymerase

The possible three-dimensional structure ofthe rotavirus polymerase middle region wasobserved by a conserved domain alignment(CD alignment) (Fig. 4). This was possibleusing the CD search option in the Blast

page (see methodology) (Marchler-Bauer etal., 2003). This program compares a proteinsequence against the conserved domaindatabase (Smart and Pfam) using the RPS-BLAST program. This allows the predictionof known functional and structural domainsin protein query sequences. Enoughsimilarity was found to make a partialalignment of the VP1 sequence with thesequence of a crystallized protein, in thiscase the RdRp of the Rabbit HemorrhagicDisease Virus (RHDV), a positive strandRNA virus belonging to the Caliciviridaefamily (Ng et al., 2002). The program wasable to perform an alignment of residues458-633 of the VP1 sequence and 186-356of the RHDV RdRp sequence (where mostof the motifs of both proteins are located),and to show this region of VP1 as a three-dimensional structure based on the structureof this region from the crystallized RHDVprotein (Fig. 4). Thus, this approach waspossible using the sequence correspondingto the middle region of the protein, fromresidues 438 to 776 of VP1, and not the restof the protein due to the lack of generalsequence similarity and structuralconservation that occurs outside of themiddle region of the replicases. The regionsof similarity obtained by the alignment and

Figure 3: Secondary structure conservation at the polymerase region of the rotavirus replicase. Thesuperimposition of the predicted secondary structures belonging to plus and double stranded viralRdRp sequences is shown. Motif C (GDD) of the palm subdomain was used as a signature to startthe superimposition of the secondary structures towards N- and C-termini of the protein. Thedashed lines in some structures denote a gap in the superimposed predicted structure. As was seenpreviously, the continuity of the palm subdomain is stopped by part of the fingers subdomains(proximal to the palm region) in these types of polymerase (RdRps or RTs). Some of the predictedsecondary structures do not have complete structural equivalence compared to the crystallized formof the protein. The predicted secondary structures were chosen for the structural superimposition inorder to have a consistent bias.

VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659656

observed in the crystal structure of theRHDV RdRp corresponds to the predictedpalm and part of the fingers subdomain inVP1, where the conserved residues(identical or similar) are those that lie in theactive site of the protein and are mainlylocated in the palm subdomain. The rest ofthe VP1 sequence in the alignment does nothave any similarity with RHDV RdRp andcould correspond to part of the putativefingers and part of the thumb subdomain,where the similarity decreases. The resultsobtained in this section provide us a basisto postulate that the central core of therotavirus replicase probably has a typical

right handed conformation described forother RdRps of positive strand and dsRNAviruses.

DISCUSSION

The gene one of the rotavirus genomeencodes for the structural protein VP1. Thisviral protein forms part of the rotavirusreplication machinery that also includes thecore lattice protein VP2 and the cappingenzyme VP3. VP1 needs the presence ofVP2 to form a competent replicationcomplex and display replicase activity. VP1

Figure 4: Region of similarity between the central core of the rotavirus VP1 and the middleregion of the crystallized RHDV polymerase. A, RHDV RdRp X-ray crystal structure (PDB:1KHV), determined by Ng et al. The pink tubular and the cyan arrow-shaped structures denotealpha helix and beta strand structures, respectively. The worm-shaped structure denotes the α-backbone of the protein. B, region of the RHDV replicase that has sequence similarity with therotavirus VP1 protein, shown in the same orientation as figure A. This region corresponds to themiddle region of the RHDV RdRp where most of the palm and part of the fingers subdomains arelocated, this region matches with the middle region of VP1. The residues forming part of theregion with similarity are shown in blue, while identical residues are shown in red. C, thestructure in figure B was rotated 90º towards the viewer in the y-axis from top to front, to showthe conserved amino acids of the different motifs (yellow). D, sequence alignment of the CDsearch results, the numbers denotes the residue positions in the RHDV RdRp sequence. The colorshave been maintained with respect to figures B and C. The corresponding motifs are shown underthe sequence alignment. Figures were prepared using the Cn3D modeling program (NCBI/NIH).

657VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659

has been assigned as the viral RdRp butlittle is known about its functional domains.This protein is not well studied, i.e. therehas been no information reported aboutwhich region of VP1 is interacting with theviral mRNA or dsRNA templates prior topolymerization, or with the structuralproteins VP2 and VP3, or the non-structuralprotein NSP2. Also, there have been noreports of which amino acids of therotavirus replicase have an important role inthe catalytic function of the protein. Withthe aim to provide a better understanding ofthis viral protein, we performed a detailedanalysis of its amino acid sequence.

Bioinformatic programs available on theworld wide web (internet), were used topredict motifs shared by viral RdRps in therotavirus VP1 sequence. We were able topredict five of the eight motifs initiallydescribed in plus stranded viruspolymerases using the structuralconservation in the middle part of the viralpolymerases as a guide. In addition, weidentified conserved amino acids that makeup these motifs, which could be involved inpolymerization and sugar selection duringnucleic acid synthesis. A predictedstructural conservation of the putative palmsubdomain in rotavirus VP1 by comparisonwith the predicted secondary structures ofother viral RNA polymerases was shown.Finally, this predicted conservation in themiddle part of the protein allowed us toperform a conserved domain search, andfurther, observe the possible three-dimensional structure of the VP1 middlepart based on the RHDV polymerase. Basedon this approach, we suggest that therotavirus replicase contains featurescommon to other positive strand, dsRNAviral polymerases, namely motifs andconserved amino acids, including an overallstructural conservation of the middleportion (proximal part of the predictedfingers and palm subdomains).

As was expected, the predictedstructural conservation was reducedtowards the distal fingers (fingertips) andthumb putative subdomains, and wasabsent towards the N- and C-termini,based on other viral RdRps. Since the palmsubdomain harbors nearly all the described

motifs (only one known motif is present inthe fingers subdomain, motif F, and thereare no classical motifs described for thethumb), it is reasonable to suggest that thisdomain is structurally conserved and thatthis structural conservation decreasescloser to the other subdomains, wherethere is almost no motif that comprises thecatalytic nuclei of the enzyme. We alsosuggest that in the case of a large sizeprotein, such as the rotavirus replicase(1,088 aa, 125 kDa), the polymeraseregion corresponds to only one half of thetotal protein, where the other half of theVP1 protein (distributed in the N- and C-termini of the protein) could be uniqueregions. This data is in concordance withthe size observed for other RdRpspolymerase domains (eg. in phage φ6 isapproximately 600 aa) (Butcher et al.,2001). This is even more evident in thesolved structure of the reovirus λ3 RdRp(Tao et al., 2002), where the polymerasedomain corresponds to only 509 of a totalof 1,267 aa. This protein has twoaccessory domains, an N-terminal domainand a C-terminal (bracelet) domain, thatgive it a three-dimensional cage-likestructure with other characteristic features.It is presumable that related virusesbelonging to the same Reoviridae family,like rotavirus, will have the similar overalldomain dispositions and arrangements,including some unique characteristics. It ispostulated that every polymerase musthave two types of interactions with itstemplate: one specific, the interaction thatis made prior to the formation of the pre-initiation complex (binding); and anotherunspecif ic , the interact ion that thepolymerase has while bound to thetemplate during elongation (Steitz, 1998).It will be interesting to see if the rotavirusRdRp has a defined domain for thespecific recognition of its template, andwhere this domain is located. This type ofbionformatic approach enabled us tosuggest unique regions in the protein,where one can look for novel proteinfunctions. For example, a novel site in therotavirus VP1 sequence was found(residues 802 to 856) that could be relatedto a possible ATPase activity and be an

VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659658

attractive target for mutagenesis. Such anactivity could be useful for a helicaseactivity during the transcriptase mode ofVP1, when the enzyme utilizes dsRNA asa template, and be related to the high invitro polymerase activity observed for therotavirus double-layered particles (Spencer& Arias, 1981). We are aware that theseare predictions and they will requirefurther experimental studies to corroboratethe data, particularly using biophysicalmethods such as crystallography/X-raydiffract ion or NMR techniques andfunctional methods such as mutagenesisand biochemical studies. Nonethelessusing this kind of approach we havestrengthened the notion that VP1 is therotavirus RNA-dependent RNApolymerase.

ACKNOWLEDGMENTS

We are grateful to Dr. John T. Patton(NIAID/NIH) for the resources kindlyprovided during the VP1 SA11 sequencedetermination and to Dr. F.D. Gonzalez-Nilo (U. de Talca) for the computationalresources and expert advice. We also wantto thank Dr. J. Chnaiderman for criticalreview of the manuscript. R.V.D was afellowship recipient from DAAD, Germanyand from CONICYT, Chile. This work waspartially supported by FONDECYT grant#1050002 to ES.

REFERENCES

ALMANZA L, ARIAS CF, LÓPEZ S (1994) Amino acidsequence of the porcine rotavirus YM VP1 protein. ResVirol 145:313-317

BAILEY TL, ELKAN C (1994) Fitting a mixture model byexpectation maximization to discover motifs inbiopolymers. Proc Int Conf Intell Syst Mol Biol 2: 28-36

BRAUTIGAM CA, STEITZ TA (1998) Structural andfunctional insights provided by crystal structures ofDNA polymerases and their substrate complexes. CurrOpin Struct Biol 8: 54-63

BREMONT M, JUSTE-LESAGE P, CHABANNE-VAUTHEROT D, CHARPILIENNE A & COHEN J(1992) Sequences of the four larger proteins of aporcine group C rotavirus and comparison with theequivalent group A rotavirus proteins. Virology 18:684-92

BRESSANELLI S, TOMEI L, ROUSSEL A, INCITTI I,

VITALE RL, MATHIEU M, DE FRANCESCO R,REY FA (1999) Crystal structure of the RNA-dependent RNA polymerase of hepatitis C virus. ProcNatl Acad Sci U S A 96: 13034-9

BRUENN JA (1991) Relationships among the positivestrand and double-strand RNA viruses as viewedthrough their RNA-dependent RNA polymerases. NuclAcids Res 19:217-226

BUTCHER SJ, GRIMES JM, MAKEYEV EV, BAMFORDDH, STUART DI (2001). A mechanism for initiatingRNA-dependent RNA polymerization. Nature 410:235-40

COHEN J, CHARPILIENNE A, CHILMONCZYK S,ESTES MK (1989) Nucleotide sequence of bovinerotavirus gene 1 and expression of the gene product inbaculovirus. Virology 171: 131-40

CRAMER P (2002) Common structural features of nucleicacid polymerases. Bioessays 24: 724-9

EIDEN JJ, HIRSHON C (1993) Sequence analysis of groupB rotavirus gene 1 and definition of a rotavirus-specificsequence motif within the RNA polymerase gene.Virology 192: 154-60

ESTES MK (2001) Rotavirus and their Replication. InFields Fundamental Virology, pp 1747-1785. Edited byDM Knipe, PM Howley. Philadelphia: LippincottWilliams & Wilkins Press

GALLEGOS CO, PATTON JT (1989) Characterization ofrotavirus replication intermediates: a model for theassembly of single-shelled particles. Virology 172:616-27

JOHNSON KN, JOHNSON KL, DASGUPTA R,GRATSCH T, BALL LA (2001) Comparisons amongthe larger genome segments of six nodaviruses andtheir encoded RNA replicases. J Gen Virol 82: 1855-66

JOYCE CM, STEITZ TA (1995) Polymerase structuresand function: variations on a theme? J Bacteriol 177:6321-9

KAPIKIAN AZ, HOSHINO Y, CHANOCK RM (2001)Rotaviruses. In Fields Fundamental Virology, 4thEdition edn, pp 1787-1833. Edited by DM Knipe, PMHowley. Philadelphia: Lippincott Williams & WilkinsPress

KOONIN EV (1991) The phylogeny of RNA-dependentRNA polymerases of positive-strand RNA viruses. JGen Virol 72: 2197-206

KOONIN EV, GORBALENYA AE, CHUMAKOV KM(1989) Tentative identification of RNA-dependentRNA polymerases of dsRNA viruses and theirrelationship to positive strand RNA viral polymerases.FEBS Lett 252: 42-6

LIU M, MATTION NM, ESTES MK (1992) Rotavirus VP3expressed in insect cells possesses guanylyltransferaseactivity. Virology 188: 77-84

MARCHLER-BAUER A, ANDERSON JB, DEWEESE-SCOTT C, FEDOROVA ND, GEER LY, HE S,HURWITZ DI, JACKSON JD, JACOBS AR,LANCZYCKI CJ, LIEBERT CA, LIU C, MADEJ T,MARCHLER GH, MAZUMDER R, NIKOLSKAYAAN, PANCHENKO AR, RAO BS, SHOEMAKER BA,SIMONYAN V, SONG JS, THIESSEN PA,VASUDEVAN S, WANG Y, YAMASHITA RA, YINJJ, BRYANT SH (2003) CDD: a curated Entrezdatabase of conserved domain alignments. NucleicAcids Res 31: 383-7

MITCHELL DB, BOTH GW (1990) Completion of thegenomic sequence of the simian rotavirus SA11:nucleotide sequences of segments 1, 2, and 3. Virology177: 324-31.

NG KK, CHERNEY MM, VAZQUEZ AL, MACHIN A,ALONSO JM, PARRA F, JAMES MN (2002) Crystal

659VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659

structures of active and inactive conformations of acaliciviral RNA-dependent RNA polymerase. J BiolChem 277: 1381-7

O’REILLY EK, KAO CC (1998) Analysis of RNA-dependent RNA polymerase structure and function asguided by known polymerase structures and computerpredictions of secondary structure. Virology 252: 287-303

PARASHAR UD, HUMMELMAN EG, BRESEE JS,MILLER MA, GLASS RI (2003) Global illness anddeaths caused by rotavirus disease in children. EmergInfect Dis 9: 565-72

PATTON JT (1996) Rotavirus VP1 alone specifically bindsto the 3’ end of viral mRNA, but the interaction is notsufficient to initiate minus-strand synthesis. J Virol 70:7940-7

PATTON JT, KEARNEY K, TARAPOREWALA ZF(2003) Rotavirus genome replication: role of RNA-bindign proteins. In Perspectives in Medical Virology:Viral Gastroenteritis, pp 165-183. Edited by J Gray, UDusselberger. Amsterdam: Elsevier Science B.V.

PIZARRO JL, SANDINO AM, PIZARRO JM,FERNANDEZ J, SPENCER E (1991) Characterizationof rotavirus guanylyltransferase activity associatedwith polypeptide VP3. J Gen Virol 72: 325-32

POCH O, SAUVAGET I, DELARUE M, TORDO N (1989)Identification of four conserved motifs among theRNA-dependent polymerase encoding elements. EmboJ 8: 3867-74

PRASAD BV, ROTHNAGEL R, ZENG CQ, JAKANA J,LAWTON JA, CHIU W, ESTES MK (1996)Visualization of ordered genomic RNA andlocalization of transcriptional complexes in rotavirus.Nature 382: 471-3

RIBAS JC, WICKNER RB (1992) RNA-dependent RNApolymerase consensus sequence of the L-A double-stranded RNA virus: definition of essential domains.Proc Natl Acad Sci U S A 89: 2185-9

SAMBROOK J, FRITSCH EF, MANIATIS T (1989)Molecular Cloning: A Laboratory Manual, 2 edn. NewYork: Cold Spring Harbor Laboratory Press

SHWED PS, DOBOS P, CAMERON LA, VAKHARIA VN,DUNCAN R (2002) Birnavirus VP1 proteins form adistinct subgroup of RNA-dependent RNA polymeraseslacking a GDD motif. Virology 296: 241-50

SPENCER E & ARIAS ML (1981) In vitro transcriptioncatalyzed by heat-treated human rotavirus. J Virol 40:1-10

STEITZ TA (1998) A mechanism for all polymerases.Nature 391: 231-2

STEITZ TA (1999) DNA polymerases: structural diversityand common mechanisms. J Biol Chem 274, 17395-8

TAO Y, FARSETTA DL, NIBERT ML, HARRISON SC(2002) RNA synthesis in a cage—structural studies ofreovirus polymerase lambda3. Cell 111: 733-45

TORTORICI MA, BROERING TJ, NIBERT ML, PATTONJT (2003) Template recognition and formation ofinitiation complexes by the replicase of a segmenteddouble-stranded RNA virus. J Biol Chem 278: 32673-82

VALENZUELA S, PIZARRO J, SANDINO AM,VÁSQUEZ M, FERNÁNDEZ J, HERNÁNDEZ O,PATTON J, SPENCER E (1991) Photoaffinity labelingof rotavirus VP1 with 8-azido-ATP: identification ofthe viral RNA polymerase. J Virol 65: 3964-7

ZANOTTO PM, GIBBS MJ, GOULD EA, HOLMES EC(1996) A reevaluation of the higher taxonomy ofviruses based on RNA polymerases. J Virol 70: 6083-96

VÁSQUEZ ET AL. Biol Res 39, 2006, 649-659660