

JOURNAL OF BACTERIOLOGY, May 2007, p. 3855–3867, Vol. 189, No. 10
0021-9193/07/$08.00+0 doi:10.1128/JB.00089-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Complete Nucleotide Sequence of the 113-Kilobase Linear Catabolic Plasmid pAL1 of Arthrobacter nitroguajacolicus Rü61a and Transcriptional Analysis of Genes Involved in Quinaldine Degradation▿†

Katja Parschat,1 Jörg Overhage,1‡ Axel W. Strittmatter,2 Anke Henne,2§ Gerhard Gottschalk,2 and Susanne Fetzner1*

Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, D-48149 Münster, Germany,1 and Laboratorium für Genomanalyse, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, D-37077 Göttingen, Germany2

Received 17 January 2007/Accepted 27 February 2007

The nucleotide sequence of the linear catabolic plasmid pAL1 from the 2-methylquinoline (quinaldine)-degrading strain Arthrobacter nitroguajacolicus Rü61a comprises 112,992 bp. A total of 103 open reading frames (ORFs) were identified on pAL1, 49 of which had no annotatable function. The ORFs were assigned to the following functional groups: (i) catabolism of quinaldine and anthranilate, (ii) conjugation, and (iii) plasmid maintenance and DNA replication and repair. The genes for conversion of quinaldine to anthranilate are organized in two operons that include ORFs presumed to code for proteins involved in assembly of the quinaldine-4-oxidase holoenzyme, namely, a MobA-like putative molybdopterin cytosine dinucleotide synthase and an XdhC-like protein that could be required for insertion of the molybdenum cofactor. Genes possibly coding for enzymes involved in anthranilate degradation via 2-aminobenzoyl coenzyme A form another operon. These operons were expressed when cells were grown on quinaldine or on aromatic compounds downstream in the catabolic pathway. Single-stranded 3′ overhangs of putative replication intermediates of pAL1 were predicted to form elaborate secondary structures due to palindromic and superpalindromic terminal sequences; however, the two telomeres appear to form different structures. Sequence analysis of ORFs 101 to 103 suggested that pAL1 codes for one or two putative terminal proteins, presumed to be covalently bound to the 5′ termini, and a multidomain telomere-associated protein (Tap) comprising 1,707 amino acids. Although the putative proteins encoded by ORFs 101 to 103 share motifs with the Tap and terminal proteins involved in telomere patching of Streptomyces linear replicons, their overall sequences and domain structures differ significantly.

Arthrobacter nitroguajacolicus Rü61a, formerly assigned to Arthrobacter ilicis, is able to utilize quinaldine (2-methylquinoline) as a sole source of carbon and energy (27; for a review, see reference 13). The “upper pathway” of quinaldine degradation, the conversion of quinaldine to anthranilate, is encoded by a gene cluster containing genes encoding quinaldine 4-oxidase (Qox) (41), 1H-4-oxoquinaldine 3-monooxygenase (Moq), a 2,4-dioxygenase (Hod) catalyzing heterocyclic ring cleavage of 1H-3-hydroxy-4-oxoquinaldine to carbon monoxide and N-acetylanthranilate (14, 15), and an aryl-acylamidase (Amq) that forms anthranilate and acetate (32) (Fig. 1A). The “lower pathway” (i.e., the mineralization of anthranilate) has been suggested to involve catechol formation and ortho ring cleavage (27) (Fig. 1B). Recently, we found that the ability to convert quinaldine to anthranilate is conferred by the conjugative plasmid pAL1, which was identified as a linear replicon with proteins attached to its 5′ ends (40).

* Corresponding author. Mailing address: Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, Corrensstraße 3, D-48149 Münster, Germany. Phone: 49 (0)251 83 39824. Fax: 49 (0)251 83 38388. E-mail: [email protected].

† Supplemental material for this article may be found at http://jb.asm.org/.

‡ Present address: Centre for Microbial Diseases and Immunity Research, University of British Columbia, 2259 Lower Mall, Vancouver, BC, V6T 1Z4, Canada.

§ Present address: Qiagen GmbH, Qiagen Strasse 1, 40724 Hilden, Germany.

▿ Published ahead of print on 2 March 2007.

Linear plasmids of gram-positive bacteria were first described in Streptomyces rochei in 1979 (20). Since then, they have been reported to occur in many Streptomyces spp., several rhodococci and mycobacteria, Planobispora rosea, the plant pathogen Clavibacter michiganensis, and a Terrabacter sp. The linear replicons of these actinobacteria belong to a class of genetic elements called invertrons, which are characterized by terminal inverted repeats and terminal proteins covalently bound to each 5′ end (50). Replication of linear Streptomyces DNA proceeds bidirectionally from an internal origin toward the telomeres (reference 65 and references therein). For linear plasmids of actinobacteria other than Streptomyces spp., centrally located origins have been detected in pCLP of Mycobacterium celatum (42) and pRHL3 of Rhodococcus sp. strain RHA1 (64); however, it is assumed that other actinomycete linear plasmids also replicate from an internal origin. Since this mode of linear DNA replication generates intermediates with 3′ overhangs, the recessed 5′ ends of the lagging strands have to be filled in to produce full-length duplex DNA molecules (“telomere patching”). The single-stranded 3′ overhangs are thought to “fold back” to form complex secondary structures that might provide a recognition site for binding of terminal proteins (Tps) and/or telomere-associated protein (Tap), might be a signal for a Tp-dependent polymerase to complete the 5′ strand, or both (22, 25, 26, 44). The Tp provides a hydroxyl group that acts as a primer for covalent attachment of the first deoxynucleotide and subsequent polymerase-catalyzed filling in at the telomere. However, despite seminal studies of Streptomyces invertrons (2, 3, 66–68), the detailed mechanism of telomere patching is not completely understood yet, and the possibility that in some linear plasmids replication starts at the telomere and proceeds via strand displacement also cannot be ruled out.
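At the sequence level, the terminal inverted repeats of an invertron and the palindromic stretches that allow a single-stranded overhang to fold back can be illustrated with a minimal Python sketch. The function names and the toy sequences below are hypothetical and serve only as an illustration; they are not taken from pAL1, and the sketch does not predict secondary structures.

```python
# Hypothetical helper functions; the sequences used below are toy examples, not pAL1 data.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement (can pair with itself)."""
    return seq.upper() == reverse_complement(seq).upper()


def terminal_inverted_repeat_length(seq: str) -> int:
    """Count leading bases that match the reverse complement of the opposite end."""
    top = seq.upper()
    bottom = reverse_complement(top)
    length = 0
    for a, b in zip(top, bottom):
        if a != b:
            break
        length += 1
    return length


# Toy linear replicon: 7-bp perfect terminal inverted repeat flanking a short core.
toy = "CCCGCGG" + "AAGAA" + "CCGCGGG"
print(is_palindrome("GAATTC"))               # True
print(terminal_inverted_repeat_length(toy))  # 7
```

A perfect palindrome returns its full length from `terminal_inverted_repeat_length`, so the two checks are consistent with each other.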

For the genus Arthrobacter, pAL1 was the first extrachromosomal DNA element shown to be a linear replicon (40). As a first approach to understand the function of pAL1, we determined its complete sequence in order to perform a functional annotation of its putative genes and to model secondary structures of putative single-stranded 3′ telomeric overhangs of this plasmid. We analyzed the operon organization of catabolic genes on pAL1 and the carbon source-dependent expression of these genes.

MATERIALS AND METHODS

Bacterial strains, media, and growth conditions. A. nitroguajacolicus Rü61a was grown at 30°C in mineral salts medium (61) containing 1 ml/liter of a vitamin stock solution containing (per liter) 2 mg biotin, 20 mg nicotinic acid, 10 mg thiamine-HCl · 2H2O, 5 mg 4-aminobenzoate, 10 mg calcium pantothenate, 50 mg pyridoxine-HCl, 10 mg vitamin B12, 10 mg riboflavin, and 1 mg folic acid. Carbon sources were added to the medium at concentrations of 2 mM for quinaldine, 1H-4-oxoquinaldine, 1H-3-hydroxy-4-oxoquinaldine, N-acetylanthranilate, and anthranilate, 3 mM for hypoxanthine, and 30 mM for succinate. Consumption of aromatic substrates was monitored spectrophotometrically. The concentration of quinaldine in culture supernatants was determined as described by Stephan et al. (61); an ε325 nm of 10.2 mM⁻¹ cm⁻¹ was used to estimate 1H-4-oxoquinaldine concentrations. When aromatic substrates were consumed, portions of substrate stock solutions were added to the cultures to obtain the appropriate final concentrations. A pAL1-deficient mutant of strain Rü61a (40) was grown in the presence of streptomycin (50 μg/ml) and rifampin (25 μg/ml). The involvement of a canonical molybdenum hydroxylase in hypoxanthine utilization was assessed by replacing the molybdate in the mineral salts medium by the same concentration of tungstate. To test the possibility that there was carbon catabolite repression of degradation of aromatic compounds, cells of A. nitroguajacolicus Rü61a grown for about 16 h on succinate were harvested by centrifugation, washed twice in saline, and resuspended in mineral salts medium with 10 mM succinate supplemented with either 2 mM quinaldine or 2 mM 1H-4-oxoquinaldine. Escherichia coli DH5α (17), which was used as a plasmid host, was grown at 37°C in lysogeny broth (LB) (52) supplemented with ampicillin (100 μg/ml) if appropriate. For amplification of cells carrying the shotgun library of the pAL1 plasmid, chemically competent E. coli One Shot TOP10 cells (Invitrogen, Karlsruhe, Germany) were transformed and were grown at 37°C and 350 rpm in 2× LB for 20 h.
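The spectrophotometric estimate of 1H-4-oxoquinaldine mentioned above presumably follows the Beer-Lambert relation, c = A / (ε · l). The following Python sketch shows that arithmetic with the stated extinction coefficient; the 1-cm path length, the optional dilution factor, and the function name are assumptions made for illustration and are not specified in the text.

```python
# Extinction coefficient at 325 nm stated in the text for 1H-4-oxoquinaldine.
EPSILON_325_PER_MM_CM = 10.2  # mM^-1 cm^-1


def oxoquinaldine_mM(absorbance_325: float,
                     path_length_cm: float = 1.0,   # assumed standard cuvette
                     dilution_factor: float = 1.0   # assumed; 1.0 means undiluted
                     ) -> float:
    """Beer-Lambert estimate: c = A * dilution / (epsilon * path length)."""
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    return absorbance_325 * dilution_factor / (EPSILON_325_PER_MM_CM * path_length_cm)


# An absorbance of 1.02 in an undiluted sample corresponds to about 0.1 mM.
print(round(oxoquinaldine_mM(1.02), 3))
```

Under these assumptions, a culture supplied with 2 mM substrate would have to be diluted before measurement to stay within the linear range of a typical spectrophotometer.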

DNA techniques. Genomic DNA of A. nitroguajacolicus Rü61a and of the pAL1-deficient mutant was isolated by using the method of Rainey et al. (46). Plasmid DNA was obtained from E. coli DH5α clones with an E.Z.N.A. plasmid mini kit I (peqlab, Erlangen, Germany). Competent E. coli cells were prepared as described by Hanahan (19). DNA restriction and agarose gel electrophoresis were carried out using standard procedures (52). PCR was performed using the Expand High Fidelity PCR system (Roche, Mannheim, Germany) or the TripleMaster PCR system (Eppendorf, Hamburg, Germany). Random-primed labeling of probes, blotting, hybridization, and colorimetric detection with nitroblue tetrazolium salt and 5-bromo-4-chloro-3-indolylphosphate were performed by using the methods recommended in the DIG System user’s guide for filter hybridization (Roche Molecular Biochemicals, 1995).

Preparation and subcloning of pAL1 DNA. Preparation of cell plugs, which always included proteinase K treatment, pulsed-field gel electrophoresis, and isolation of pAL1 from gels by electroelution were performed as described previously (40). For construction of shotgun libraries, purified pAL1 plasmid DNA was partially digested for 10 to 20 s using the blunt-cutting enzyme Bsp143I or AluI. DNA fragments were purified by gel electrophoresis (fragment size, 2.0 to 3.0 kb). After gel elution with Qiaquick (QIAGEN, Hilden, Germany), DNA fragments were filled in using T4 polymerase, 5′ adenylated using Taq polymerase, dephosphorylated by treatment with calf intestinal phosphatase in a buffer recommended by the supplier (all enzymes were obtained from MBI Fermentas, Vilnius, Lithuania), and cloned into the pCR4-TOPO vector (Invitrogen). Recombinant plasmids were transformed into chemically competent E. coli One Shot TOP10 cells. For cloning of the terminal fragments of pAL1, plasmid DNA was digested with PstI, and fragments were ligated into pBluescript II SK(�) digested with PstI and blunt-cutting EcoRV. E. coli DH5α was transformed with the ligation mixture, and transformants were selected on LB agar plates containing 100 μg/ml ampicillin, 40 μg/ml isopropyl-β-D-thiogalactopyranoside, and 400 μg/ml 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside. Plasmids pBSK5 and pBSK3, containing 3.0- and 2.0-kb inserts, respectively, were recovered from the transformants. To ensure coverage of the ends of pAL1, terminal fragments were cloned again from A. nitroguajacolicus Rü61a genomic DNA, which was isolated using a protocol (57) that in addition to proteolytic digestion includes treatment with alkali in order to cleave the alkali-labile ester linkage between the residual peptide of Tp and the DNA 5′ ends. PstI-digested genomic DNA was then hybridized with probes “lt” and “rt,” which were obtained by PCR amplification from pBSK5 and pBSK3, respectively (see Table S1 in the supplemental material for a description of the primers). Genomic DNA isolated from the pAL1-deficient mutant of A. nitroguajacolicus Rü61a did not hybridize with these probes. Hybridizing PstI fragments of DNA from the wild-type strain were extracted from agarose gels and ligated into PstI/EcoRV-digested pBluescript II SK(�) as described above. Four and five plasmids containing the 3.0- and 2.0-kb terminal fragments, respectively, were identified by colony blotting of E. coli DH5α transformants using probes “lt” and “rt,” and all inserts were sequenced.

FIG. 1. Quinaldine degradation by A. nitroguajacolicus Rü61a (13, 27, 40, 41). (A) Quinaldine conversion to anthranilate. 1, quinaldine (2-methylquinoline); 2, 1H-4-oxoquinaldine; 3, 1H-3-hydroxy-4-oxoquinaldine; 4, N-acetylanthranilic acid; 5, anthranilic acid; Qox, quinaldine 4-oxidase; Moq, 1H-4-oxoquinaldine 3-monooxygenase; Hod, 1H-3-hydroxy-4-oxoquinaldine 2,4-dioxygenase; Amq, N-acetylanthranilate amide hydrolase. (B) Hypothetical initial steps in anthranilate degradation by strain Rü61a. 6, catechol; 7, 2-aminobenzoyl-CoA; 8, 2-amino-5-oxo-cyclohex-1-ene-carbonyl-CoA. For details, see the text.

DNA sequencing. Sequencing of isolated plasmid DNA was performed as described previously (48), using BigDye Terminator 3.1 chemistry and a 3730XL capillary sequencer (Applied Biosystems, Darmstadt, Germany). For sequencing of pCR4 and pBluescript II SK(�) derivatives, standard vector primers were used (see Table S1 in the supplemental material). For direct sequencing of derived PCR fragments, custom-made PCR primers were used in standard sequencing reactions performed with recommended annealing temperatures.

RNA extraction and RT-PCR. For isolation of total RNA, A. nitroguajacolicus Rü61a was grown in mineral salts medium with different carbon sources to an optical density at 600 nm of about 1.0. Aliquots (6 ml) of the cultures were frozen in liquid nitrogen and stored at −80°C. After thawing on ice, bacteria were harvested by centrifugation at 4,000 × g and 4°C for 2 min. The cells were resuspended in 100 μl of Tris-EDTA buffer (pH 8.0) containing 14 mg lysozyme/ml and 20 U RNase inhibitor (RNasin Plus; Promega, Madison, WI) and incubated at 28°C for 3 h. Total RNA was isolated from the pretreated cells with an RNeasy kit (QIAGEN) by following the instructions of the supplier, including an on-column DNase digestion step. Residual DNA was removed by digestion with 1 U RNase-free DNase (Promega) per 1 μg RNA in the presence of 20 U RNase inhibitor. Samples were incubated for 45 min at 37°C, the DNase was inactivated, and the RNA was repurified. Reverse transcription (RT)-PCR was performed with a RevertAid H minus first-strand cDNA synthesis kit (MBI Fermentas). The cDNA synthesis reaction was carried out at 43°C with 1 μg of total RNA and random hexamer primers. For negative controls, reverse transcriptase was omitted from the reaction mixture. A PCR for amplification of cDNA was performed by using 10-μl assay mixtures containing 1 μl of cDNA, 20 pmol (each) of the forward primer and the reverse primer (see Table S1 in the supplemental material), and 0.75 U of GoTaq DNA polymerase (Promega).

Identification of transcriptional start sites. Transcriptional start sites were determined by rapid amplification of 5′ cDNA ends (5′-RACE) using a 5′/3′-RACE kit from Roche according to the manufacturer’s instructions. For cDNA synthesis, 1 μg of total RNA, isolated from quinaldine-grown cells, and specific primer SP1 (see Table S1 in the supplemental material) were added. In the next steps, nested primers SP2 and SP3 (see Table S1 in the supplemental material) were used to obtain specific products of the tailed cDNA, and PCR was carried out with the Expand High Fidelity PCR system (Roche). PCR products were purified by gel extraction (E.Z.N.A. gel extraction kit; peqlab) and were sequenced (MWG Biotech, Ebersberg, Germany).

Enzyme assays and polyacrylamide gel electrophoresis. A. nitroguajacolicus Rü61a cells grown on different carbon sources were suspended in 100 mM Tris-HCl buffer (pH 8.0), treated with 2 mg/ml lysozyme for 30 min at 37°C, and disrupted by sonication on ice. Crude extracts containing soluble proteins were obtained by centrifugation for 45 min at 36,000 × g. Qox activity was determined spectrophotometrically by measuring the quinaldine-dependent reduction of the artificial electron acceptor iodonitrotetrazolium chloride (INT) (41). Hod activity was measured as described previously (15), using 1H-3-hydroxy-4-oxoquinaldine as the substrate. Protein concentrations were estimated by the method of Bradford, as modified by Zor and Selinger (69), using bovine serum albumin as the standard protein. Nondenaturing polyacrylamide gel electrophoresis was performed in resolving gels with a final acrylamide concentration of 12% (wt/vol) or 7.5% (wt/vol), using the high-pH discontinuous system described by Hames (18). For activity staining of xanthine oxidoreductase, gels were immersed in 100 mM Tris-HCl buffer (pH 8.5) containing 0.3% (vol/vol) Triton X-100, 1.5 mM INT, and 1 mM hypoxanthine or xanthine. For activity staining of Qox, gels were soaked in the same buffer containing 1.25 mM INT and 50 μM quinaldine.

Sequence analysis. The sequence of pAL1 was determined using a standard shotgun library with an insert size of 2 to 3 kb. A total of 1,484 reads with an average read length of 616 bp were obtained. After the first assembly using the Phrap assembly tool (http://www.phrap.org), primer walking on plasmids and PCR-based techniques were used to close remaining gaps and to resolve misassembled regions. All manual editing steps were performed using the GAP4 software package (v4.5 and v4.6) (58, 59). After sequence polishing and finishing, the plasmid sequence had 7.0-fold sequence redundancy and assembled into a single contig. Coding regions of pAL1 were identified with the ARTEMIS DNA annotation tool (49), with the heuristic approach of GENEMARK (6), and with FRAMES at HUSAR 4.0 (http://genome.dkfz-heidelberg.de/). Sequences were analyzed with the BLAST family of programs (1) for database searches, GAP for binary alignments and calculation of similarities and identities, and ClustalW (21) and T-Coffee (39) for multiple alignments. Open reading frames (ORFs) with hypothetical ribosome binding sites but lacking BLAST hits were manually annotated. Hypothetical gene products were scanned with PROSITE (http://www.expasy.org/prosite/) to obtain functional information. The SOSUI program (24) was used to search for putative transmembrane helices. Possible secondary structures of putative terminal 3′-strand overhangs of replicative intermediates of pAL1 were computed with mfold (version 3.1) (53, 70), based on a folding temperature of 30°C and Na⁺ concentrations of 100 mM to 1 M. Theoretical pI values were calculated using the Compute pI/Mw tool at http://www.expasy.ch.
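The read statistics given above allow a rough plausibility check of the reported redundancy. The sketch below computes the raw fold coverage as (number of reads × mean read length) / replicon length; the function name is hypothetical. The raw figure is an upper bound, and the gap to the reported 7.0-fold value may reflect quality trimming or discarded reads, which is an assumption and is not stated in the text.

```python
def raw_fold_coverage(n_reads: int, mean_read_length_bp: float,
                      replicon_length_bp: int) -> float:
    """Total sequenced bases divided by the length of the target replicon."""
    if replicon_length_bp <= 0:
        raise ValueError("replicon length must be positive")
    return n_reads * mean_read_length_bp / replicon_length_bp


# Values from the text: 1,484 reads, 616-bp average read length, 112,992-bp plasmid.
coverage = raw_fold_coverage(1484, 616, 112_992)
print(f"{coverage:.2f}-fold")  # 8.09-fold before any trimming
```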

Nucleotide sequence accession number. The complete sequence of linear plasmid pAL1 from A. nitroguajacolicus Rü61a has been deposited in the EMBL nucleotide sequence database under accession number AM286278.

RESULTS AND DISCUSSION

Overview of the DNA sequence of linear plasmid pAL1. The calculated mean G+C content of pAL1 (112,992 bp) is 60.87%, which is within the range described for DNA of the genus Arthrobacter (59 to 66%) (28). In the left half of pAL1 (ORFs 15 to 45), there are significant variations in the G+C contents of the individual coding sequences, which range from 54.7% (ORF 27) to 67% (ORF 28). Long contiguous DNA regions with G+C contents deviating from the average content are not apparent; however, the first 100 bp of the left and right termini have an above-average G+C content (71%).
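The G+C values quoted above are simple base-composition percentages. A minimal Python sketch of the calculation for a whole sequence and for its two 100-bp termini follows; the function names are hypothetical, the demonstration uses a synthetic sequence rather than pAL1, and ignoring ambiguous bases is an assumption made here for simplicity.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def terminal_gc(seq: str, window: int = 100) -> tuple[float, float]:
    """G+C content of the first and last `window` bases of a linear replicon."""
    return gc_content(seq[:window]), gc_content(seq[-window:])


# Synthetic example: a GC-only left end and an AT-only right end.
synthetic = "GC" * 100 + "AT" * 100
print(gc_content(synthetic))   # 50.0
print(terminal_gc(synthetic))  # (100.0, 0.0)
```

Applied to the deposited sequence (accession number AM286278), the same functions would be expected to reproduce the whole-plasmid and terminal values reported in the text.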

A total of 103 ORFs were identified on pAL1, which cover 84.6% of the plasmid sequence (Fig. 2). ORFs which were functionally annotated are briefly described in Table 1. Table S2 in the supplemental material lists all ORFs identified on pAL1. For 49 of the putative genes, no function could be predicted. Genes coding for quinaldine degradation and genes presumed to code for reactions involved in anthranilate catabolism are clustered in a 31-kb region (ORFs 3 to 25), whereas a 53-kb region (ORFs 63 to 103) appears to code for proteins involved in DNA mobilization, plasmid maintenance, and DNA replication and repair. Genes presumed to have a role in telomere patching are localized at the end of pAL1 (ORFs 101 to 103).
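The coordinates and product sizes listed in Table 1 can be cross-checked against each other. The sketch below assumes that each coding-sequence span includes the stop codon, so that the product size equals span/3 − 1; this assumption is not stated in the table but is consistent with the rows tested here. The function names are hypothetical, and only a few rows are copied from Table 1.

```python
# Rows copied from Table 1: (ORF, product size in amino acids, start, stop).
# The trailing "c" that follows some coordinates in the table is omitted here,
# because only the length of the span matters for this check.
TABLE1_ROWS = [
    (1, 392, 1047, 2225),
    (4, 174, 5810, 6334),
    (5, 290, 6337, 7209),
    (6, 795, 7206, 9593),
    (7, 388, 10294, 11460),
]


def expected_protein_length(start: int, stop: int) -> int:
    """Amino acids encoded by a span that includes the stop codon (assumption)."""
    span = abs(stop - start) + 1
    if span % 3 != 0:
        raise ValueError("span is not a multiple of three")
    return span // 3 - 1


def overlap_bp(a_start: int, a_stop: int, b_start: int, b_stop: int) -> int:
    """Number of positions shared by two inclusive coordinate ranges."""
    return max(0, min(a_stop, b_stop) - max(a_start, b_start) + 1)


for orf, size, start, stop in TABLE1_ROWS:
    assert expected_protein_length(start, stop) == size, orf

# ORFs 5 and 6 (medium and large Qox subunits) share 4 bp in these coordinates.
print(overlap_bp(6337, 7209, 7206, 9593))  # 4
```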

Genes involved in conversion of quinaldine to anthranilate. The gene cluster comprising ORFs 4 to 9 of pAL1, which code for the enzymes catalyzing conversion of quinaldine to anthranilate (Table 1), has been characterized previously (41). A 23-kb segment of genomic DNA of A. nitroguajacolicus Rü61a was described, which on the basis of the pAL1 sequence resulted from the fusion of 10.8- and 12.2-kb HindIII fragments. The genes coding for conversion of quinaldine to anthranilate are located on the 10.8-kb fragment of pAL1, whereas the 12.2-kb fragment belongs to another replicon of strain Rü61a.

The functions of qoxLMS (ORFs 4 to 6) encoding quinaldine-4-oxidase (Qox), hod (ORF 8) coding for the 2,4-dioxygenase catalyzing heterocyclic ring cleavage, and amq (ORF 9) encoding N-acetylanthranilate amide hydrolase were determined by heterologous gene expression analysis (15, 32, 41). The physiological role of the 1H-4-oxoquinaldine 3-monooxygenase gene moq (ORF 7) was confirmed by interposon mutagenesis (K. Parschat, unpublished data). The presumed superoxide dismutase (SOD) encoded by ORF 14 belongs to the family of iron- or manganese-containing SODs. SOD activity might be particularly necessary when A. nitroguajacolicus Rü61a grows on quinaldine, since incomplete reduction of O2 by Qox, which is presumed to use dioxygen as its physiological electron acceptor, would produce superoxide anion radicals instead of H2O2. A. nitroguajacolicus Rü61a has another, probably chromosomal, SOD gene, since PCR performed with template DNA from the pAL1-deficient mutant and primers specific for ORF 14 (see Table S1 in the supplemental material) resulted in a 544-bp DNA fragment that exhibited 97% identity to the corresponding part of ORF 14.

ORF 3, which is localized directly downstream of qoxLMS, codes for an XdhC-like protein which, by analogy to the archetypal XdhC protein (38), may be involved in insertion of the molybdenum cofactor (Moco) into the Qox protein. Homologs of xdhC are often localized in the vicinity of genes encoding molybdenum hydroxylases; in the case of pucABCDE (55), xdhABC (36), and the quinoline catabolic gene cluster (7), they are even transcribed in an operon.

The ORF 11 protein (ORF11p) of pAL1 exhibited 24% identity to E. coli MobA (accession no. P32173). MobA catalyzes the condensation of Mo-molybdopterin and GTP, forming molybdopterin guanine dinucleotide. However, Qox, like many catabolic molybdenum hydroxylases, contains the molybdopterin cytosine dinucleotide form of the molybdenum cofactor. It is interesting that asparagine residue 53 and aspartate residue 71 of E. coli MobA, which have been proposed to determine its selectivity for GTP (33), are replaced in ORF11p by leucine and arginine, respectively. Remarkably, these residues are also found at corresponding positions in the MobA-like proteins of Arthrobacter nicotinovorans (ORF 204 of pAO1; accession no. AAK64261) and Pseudomonas putida 86 (ORF 4; accession no. CAE47360). Since the corresponding mobA-like genes also are clustered with genes encoding molybdoenzymes with the molybdopterin cytosine dinucleotide cofactor, it is tempting to speculate that the conserved L and R residues mediate specificity for CTP.

ORF12p is similar to MoaA-like proteins, which together with MoaC catalyze the first step in Moco biosynthesis, and ORF13p exhibits 36% identity to MoaE of E. coli (accession

FIG. 2. Schematic diagram of ORFs on the 112,992-bp linear plasmid pAL1. Genes and predicted ORFs are indicated by arrows, and the arrowheads indicate the directions of transcription. The ORFs are grouped on the basis of (hypothetical) function, as follows: blue, ORFs involved in quinaldine or anthranilate catabolism; yellow, ORFs involved in DNA mobilization by conjugation or transposition; red, ORFs involved in DNA replication, repair, and modification and in plasmid maintenance; green, ORFs involved in diverse functions; gray, ORFs encoding conserved hypothetical proteins; white, ORFs coding for hypothetical proteins.


TABLE 1. Summary of ORFs on pAL1^a

| ORF | Product size (no. of amino acids) | Coding sequence (position of start codon–position of stop codon)^b | Gene product description and conserved protein domains^c | Closest relative: accession no. | No. of identical amino acids/length of aligned region (%) |
|---|---|---|---|---|---|
| 1 | 392 | 1047–2225 | Putative IS21 family transposase, COG4584 | ABK01858 | 344/388 (88) |
| 2 | 102 | 3026–3334c | Conserved hypothetical protein, putatively involved in biosynthesis of extracellular polysaccharides, COG2329, pfam03992 | CAD47968 | 77/99 (77) |
| 3 | 385 | 4079–5236c | Putative XdhC homologue (41), COG1975, pfam02625 | CAD61048 | 380/380 (100) |
| 4 | 174 | 5810–6334c | Quinaldine 4-oxidase (Qox), small subunit (41) | CAD61047 | 174/174 (100) |
| 5 | 290 | 6337–7209c | Quinaldine 4-oxidase (Qox), medium subunit (41) | CAD61046 | 290/290 (100) |
| 6 | 795 | 7206–9593c | Quinaldine 4-oxidase (Qox), large subunit (41) | CAD61045 | 795/795 (100) |
| 7 | 388 | 10294–11460 | 1H-4-Oxoquinaldine 3-monooxygenase (Moq) (41) | CAD61044 | 388/388 (100) |
| 8 | 276 | 11551–12381 | 1H-3-Hydroxy-4-oxoquinaldine 2,4-dioxygenase (Hod) (14, 15, 41) | CAD61043 | 276/276 (100) |
| 9 | 293 | 12389–13270 | N-Acetylanthranilate amidase (Amq) (32, 41) | CAD61042 | 293/293 (100) |
| 10 | 423 | 13338–14609 | Putative MFS family transporter (41), COG2271, COG2814, COG2807 | CAD61041 | 423/423 (100) |
| 11 | 215 | 15400–16047 | Putative MobA-like protein, putative molybdopterin cytosine dinucleotide biosynthesis protein, COG2068 | ABK04790 | 131/194 (67) |
| 12 | 376 | 16554–17684c | Putative molybdopterin cofactor synthesis protein MoaA, smart00729, pfam06463, pfam04055, COG2896, COG0535 | ABK02834 | 306/345 (89) |
| 13 | 160 | 18088–18570 | Putative molybdopterin synthase subunit MoaE, pfam02391 | ABK02841 | 121/146 (82) |
| 14 | 208 | 18794–19420c | Putative iron/manganese-dependent SOD, pfam02777, pfam00081, COG0605 | ABM08788 | 200/207 (96) |
| 15 | 219 | 20080–20739 | Putative adenylate kinase, cd01428, pfam00406, COG0563 | ABK04334 | 125/192 (65) |
| 16 | 287 | 20736–21599c | Putative transcriptional regulator, PaaX family, COG3327 | BAD56096 | 129/270 (47) |
| 17 | 130 | 21747–22139 | Hypothetical protein^d | ABC30680 | 26/61 (40) |
| 18 | 132 | 22136–22534c | Putative translational inhibitor, cd02198, cd00448, pfam01042, COG0251 | ABM09192 | 110/129 (85) |
| 19 | 299 | 22653–23552 | Putative fumarylacetoacetate hydrolase family protein, pfam01557, COG0179, COG3970, COG3971 | ABM10066 | 198/275 (72) |
| 20 | 160 | 23559–24041 | Conserved hypothetical protein, cupin domain, pfam00190, pfam02311, COG4101, COG1917 | ABM09292 | 141/160 (88) |
| 21 | 301 | 24047–24952 | Putative acyl-CoA thioesterase, pfam02551, COG1946 | ABM06535 | 218/290 (75) |
| 22 | 551 | 24988–26643 | Putative acid-CoA ligase, pfam00501, COG0365, COG0318, COG1021, COG1022 | ABM09333 | 442/551 (80) |
| 23 | 797 | 26656–29049 | Putative flavoprotein monooxygenase/NADH:flavin oxidoreductase, cd02932, cd02930, cd00415, pfam00724, pfam01360, COG0654 | BAD56090 | 535/791 (67) |
| 24 | 288 | 29264–30130c | Putative transcriptional regulator, PaaX family, COG3327 | BAD56096 | 133/271 (49) |
| 25 | 162 | 30407–30895 | Putative thioesterase superfamily protein, pfam03061, COG0824 | ABM08276 | 84/128 (65) |
| 27 | 246 | 31796–32536c | Putative TIR domain protein, smart00255 | EAR22547 | 38/111 (34) |
| 28 | 373 | 33238–34359c | Putative TraG family protein (fragment), pfam02534, COG3505 | ABK05710 | 257/297 (86) |
| 29 | 100 | 34307–34609c | Putative TraG family protein (fragment) | ABK05710 | 31/47 (65) |
| 30 | 223 | 34725–35396c | Conserved hypothetical protein^e | BAD48028 | 26/85 (30) |
| 39 | 117 | 42607–42960 | Putative single-stranded DNA binding protein, pfam00436, COG0629 | ABK03799 | 45/115 (39) |
| 47 | 891 | 47779–50454 | Putative ATP binding protein, COG5635 | ABM10470 | 180/582 (30) |
| 49 | 143 | 51314–51745 | Putative SOS mutagenesis protein UmuD, pfam00717, COG1974, COG2932 | ABK07457 | 124/143 (86) |
| 50 | 432 | 51781–53079 | Putative SOS mutagenesis protein UmuC, pfam00817, COG0389 | ABM09597 | 351/400 (87) |
| 57 | 76 | 56056–56286 | Putative glutaredoxin-like protein, pfam00462, pfam03960, COG0695, COG1393 | CAE13581 | 26/71 (36) |
| 63 | 710 | 58940–61072c | Putative FtsK/SpoIIIE family protein, pfam01580, COG1674 | AAP73944 | 276/728 (37) |
| 64 | 338 | 61589–62605c | Putative zeta toxin protein, pfam06414 | ABD11715 | 85/335 (25) |
| 67 | 520 | 63785–65347c | Putative peptidase, M23 family, pfam01551, pfam00877, COG0791, COG0739 | BAE45967 | 167/520 (32) |
| 69 | 837 | 67339–69852c | Putative ATP/GTP binding protein, VirB4 related, COG3451 | ABH00219 | 314/840 (37) |
| 71 | 151 | 70424–70879c | Putative membrane protein | EAX50825 | 21/74 (28) |
| 74 | 713 | 73164–75305c | Putative ATP-dependent Lon protease, COG4930 | AAS96494 | 468/688 (68) |
| 76 | 1141 | 77804–81229c | Putative type II restriction enzyme, methylase subunit, COG1002 | AAS96497 | 530/1,178 (44) |
| 77 | 1166 | 81229–84729c | Putative ATP binding protein | AAS96500 | 522/1,194 (43) |
| 80 | 339 | 86814–87833 | Putative ParA-like ATPase involved in chromosome partitioning, pfam00991, COG1192, COG4963, COG0003 | ABM10428 | 118/312 (37) |
| 82 | 238 | 88411–89127 | Putative membrane protein | AAP73899 | 77/197 (39) |
| 84 | 521 | 89938–91503 | Putative type II/IV secretion system protein, VirB11 related, cd01130, COG4962, pfam00437, COG0630 | EAP98658 | 187/449 (41) |
| 85 | 290 | 91506–92378 | Putative type II/IV secretion system integral membrane protein, TadB related, COG4965, pfam00482 | EAP98659 | 125/269 (46) |
| 86 | 294 | 92378–93262 | Putative type II/IV secretion system integral membrane protein, TadC related, pfam00482 | ABL81730 | 119/296 (40) |
| 87 | 139 | 93662–94081 | Putative membrane protein, TadG/TadE related, COG4961, pfam07811 | BAC67898 | 42/123 (34) |
| 88 | 144 | 94078–94512 | Putative membrane protein, TadE related, pfam07811 | ABH00243 | 81/136 (37) |
| 89 | 133 | 94517–94918 | Putative membrane protein | BAC67900 | 50/126 (39) |
| 91 | 342 | 95534–96562 | Conserved hypothetical protein, predicted transmembrane helices | CAC36687 | 79/191 (41) |
| 92 | 982 | 96567–99515 | Putative membrane protein, LysM domain, COG1652 | BAE45055 | 338/1,066 (31) |
| 93 | 196 | 99532–100122 | Putative type IV leader peptidase, pfam01478 | ABD12567 | 54/155 (34) |
| 101 | 1707 | 105981–111104 | Putative telomere-associated protein, smart00400, COG5519 | BAE45951 | 614/1,800 (34) |
| 102 | 210 | 111108–111740 | Putative terminal protein | AAP73891 | 79/191 (41) |
| 103 | 207 | 111737–112360c | Putative DNA binding protein | AAP73890 | 63/204 (30) |

^a For details, see the text.

^b c, complementary strand.

^c The numbers in parentheses are reference numbers.

^d Other ORFs annotated as genes encoding hypothetical proteins (coding sequence): ORF 26 (positions 31024 to 31341c), ORF 33 (38219 to 38782), ORF 34 (38862 to 39440c), ORF 35 (40075 to 40503), ORF 36 (40445 to 40750), ORF 38 (41822 to 42295), ORF 40 (42993 to 43436), ORF 43 (45509 to 46102c), ORF 44 (45906 to 46238), ORF 45 (46560 to 46889), ORF 46 (46902 to 47597), ORF 48 (50435 to 51241), ORF 52 (53612 to 53941), ORF 53 (53938 to 54303c), ORF 55 (54906 to 55334c), ORF 56 (55445 to 55981), ORF 58 (56283 to 57080), ORF 59 (57117 to 57536), ORF 60 (57555 to 57992), ORF 62 (58656 to 58931c), ORF 66 (63108 to 63989c), ORF 90 (94935 to 95549), ORF 94 (100183 to 100698), ORF 96 (101422 to 101862c), ORF 97 (101862 to 102665c), ORF 99 (104586 to 105080c), and ORF 100 (105422 to 105892).

^e Other ORFs annotated as genes encoding conserved hypothetical proteins without detected conserved protein domains (coding sequence): ORF 31 (positions 35530 to 36495c), ORF 32 (36599 to 37780), ORF 37 (40816 to 41226), ORF 41 (44274 to 44576), ORF 42 (44708 to 45506), ORF 51 (53110 to 53427), ORF 54 (54639 to 54920), ORF 61 (57989 to 58528), ORF 65 (62602 to 63039c), ORF 68 (65357 to 67141c), ORF 70 (69872 to 70417c), ORF 72 (70771 to 71823c), ORF 73 (72324 to 73064), ORF 75 (75319 to 77811c), ORF 78 (84733 to 85332c), ORF 79 (85329 to 86048c), ORF 81 (87830 to 88444), ORF 83 (89131 to 89937), ORF 95 (100765 to 101412c), and ORF 98 (103056 to 104408).
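The coordinates and product sizes in Table 1 can be cross-checked against each other. The following is a minimal sketch, assuming (as the column header states) that each range runs from the first base of the start codon to the last base of the stop codon; the function name `orf_length_aa` is a hypothetical helper and not part of the original analysis.

```python
def orf_length_aa(start: int, stop: int) -> int:
    """Protein length (aa) for a coding sequence given inclusive coordinates.

    Assumes the range spans start codon through stop codon, so the
    stop codon is subtracted from the codon count.
    """
    low, high = sorted((start, stop))
    n_nt = high - low + 1            # inclusive range length in nucleotides
    if n_nt % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return n_nt // 3 - 1             # drop the stop codon


# Coordinates and product sizes copied from Table 1.
table1_rows = {
    "ORF 1": (1047, 2225, 392),
    "ORF 4": (5810, 6334, 174),
    "ORF 7": (10294, 11460, 388),
    "ORF 101": (105981, 111104, 1707),
}

for name, (start, stop, reported) in table1_rows.items():
    print(name, orf_length_aa(start, stop), reported)
```

For the four rows shown, the computed length equals the reported product size. The strand (the "c" suffix) does not affect the length, so it is ignored here.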


no. AAC73872), which is a subunit of molybdopterin synthase. However, homologs of moaC, moaD, moeB, mogA, moeA, and moaB, which are required for Moco biosynthesis in prokaryotes, and modABC genes encoding molybdate uptake proteins were not detected on pAL1. Growth of bacteria on hypoxanthine usually requires a Moco-dependent xanthine oxidoreductase. Replacement of molybdate in the mineral salts medium by tungstate completely suppressed growth and hypoxanthine utilization by strain Rü61a and its pAL1-deficient mutant (data not shown), confirming that these strains indeed contain a canonical xanthine oxidoreductase. In the presence of molybdate, the two strains grew equally well on hypoxanthine as a sole carbon source. Moreover, catalytically active xanthine oxidoreductase was detected in crude extracts of hypoxanthine-grown cells of both wild-type and plasmid-deficient A. nitroguajacolicus Rü61a (data not shown), indicating that the genome of the pAL1-deficient mutant contains all of the genes for Moco biosynthesis.

Genes presumed to be involved in metabolism of aromatic compounds. Based on the isolation of catechol as an intermediate in the quinaldine degradation pathway and the detection of catechol 1,2-dioxygenase activity (27), degradation of anthranilate by A. nitroguajacolicus Rü61a was proposed to proceed via the classical β-ketoadipate pathway (Fig. 1B). However, genes that might code for a catechol-forming anthranilate 1,2-dioxygenase and a catechol dioxygenase were not identified in pAL1. Since the pAL1-deficient mutant of A. nitroguajacolicus is able to utilize anthranilate and catechol as carbon sources (data not shown), such an ortho cleavage pathway may well be encoded on the chromosome. However, sequence analysis of ORFs 19 to 23 of pAL1 suggested that a second route for metabolism of anthranilate may be present in strain Rü61a.

The amino acid sequence deduced from the ORF 22 sequence exhibits 52% identity with the sequence of the 2-aminobenzoate-coenzyme A (2-aminobenzoate-CoA) ligase from Azoarcus evansii (accession no. AAL02069). ORF 23 codes for a protein that exhibits 50% identity to the natural fusion protein 2-aminobenzoyl-CoA monooxygenase/reductase (accession no. AAL02063), which in A. evansii catalyzes the formation of 2-amino-5-oxo-cyclohex-1-ene-carbonyl-CoA from 2-aminobenzoyl-CoA (54). Two putative thioesterases are encoded by ORFs 21 and 25; ORF 21 codes for a hypothetical acyl-CoA thioesterase II (pfam02551), while ORF25p comprises the cd00586 domain of 4-hydroxybenzoyl-CoA thioesterases. The products of ORF 22, ORF 23, and ORF 21 and/or ORF 25 may well be involved in anthranilate catabolism via 2-aminobenzoyl-CoA and 2-amino-5-oxo-cyclohex-1-ene-carbonyl-CoA (Fig. 1B). In A. evansii, the latter compound was proposed to be degraded in a β-oxidation-like pathway (54). If such a β-oxidation pathway were functional in strain Rü61a, it would require the involvement of enzymes encoded on the chromosome (or additional DNA elements), as genes that could code for enoyl-CoA hydratase/isomerase or acyl-CoA dehydrogenases were not detected on pAL1.

Carbon source-dependent transcription and operon structure of genes presumably involved in the degradation of quinaldine and aromatic compounds. RT-PCR analysis of RNA isolated from A. nitroguajacolicus Rü61a grown on different carbon sources revealed that the qoxM and amq genes and ORF 23 (Fig. 3B), as well as ORFs 3, 4, 6, 7, 8, 10, and 19 to 22 (not shown), were distinctly expressed when cells were grown on quinaldine or on aromatic intermediates of the pathway for conversion of quinaldine to anthranilate. Even if RT-PCR provided only a semiquantitative estimate of transcript formation, expression of qoxM and amq clearly was weaker in succinate-grown cells than in cells grown on quinaldine (Fig. 3B). Similar results were obtained for ORFs 3 and 10, for the qoxL, qoxS, moq, and hod genes (not shown), and for the intergenic regions of ORFs 3 to 6 and 7 to 11 (Fig. 3C; data not shown). However, the differences in transcript formation in quinaldine- and succinate-grown cells seemed to be less pronounced for the region comprising ORFs 19 to 23 (Fig. 3B and C). Transcription of ORF 16 (Fig. 3B) and ORF 24 (not shown), each of which codes for a putative PaaX-type transcriptional repressor, occurred independent of the carbon source.

The genes coding for the enzymes of the “upper pathway” of quinaldine degradation are organized in two operons. The qoxLMS genes are cotranscribed together with the xdhC-like gene (ORF 3) (Fig. 3C), suggesting that the product of the latter gene indeed may participate in maturation of the molybdenum enzyme, as discussed above. ORF 2 does not belong to this operon (data not shown). The genes coding for enzymes involved in conversion of 1H-4-oxoquinaldine to anthranilate (moq, hod, and amq) form another operon (not shown) that also contains ORF 10 (Fig. 3C) and ORF 11 (not shown). The genes hypothesized to be involved in the metabolism of anthranilate (ORFs 19 to 23) form a third operon (see Fig. 3C for cotranscription of ORFs 21 and 22; other data not shown).

The specific activities of Qox and Hod in extracts from cells grown on different carbon sources were determined. The level of Qox activity in the crude extract (soluble fraction) from succinate-grown cells was below the detection limit of the spectrophotometric standard assay, but minor catalytic activity was detected on a nondenaturing polyacrylamide gel stained to reveal Qox activity (data not shown). In contrast, enzymatic activity was readily detected in the soluble fractions of cell extracts from quinaldine- and 1H-4-oxoquinaldine-grown cells (Qox specific activity, ~0.05 U/mg protein). The activities of the ring cleavage dioxygenase Hod in the soluble fractions of extracts from quinaldine- and 1H-4-oxoquinaldine-grown cells were about 1 to 1.2 U/mg protein, compared to 0.01 U/mg in an extract from succinate-grown cells. Strain Rü61a exhibited diauxic growth in mineral media with succinate plus either quinaldine or 1H-4-oxoquinaldine, and only the second logarithmic growth phase correlated with consumption of the aromatic substrate (data not shown), suggesting that there was succinate-mediated repression of the degradation of heteroaromatic compounds.

Transcriptional start sites of catabolic operons and putative promoters. To determine the potential transcriptional start sites of the catabolic operons qoxLMS-ORF 3, moq-hod-amq-ORF 10-ORF 11, and ORF 19 to ORF 23, a 5′-RACE analysis was performed with RNA isolated from quinaldine-grown cells. The deduced transcriptional initiation site of the qoxLMS-ORF 3 operon was located 110 nucleotides upstream of the ATG start codon of qoxL; it was preceded by a putative −10 box (CATACT), as well as a putative −35 region (TTGACG) for the binding of σ70-dependent RNA polymerase (Fig. 4A). The transcriptional start site of the second operon was located 126 nucleotides upstream of the moq start codon (Fig. 4B). The TATATA and TTGACG sequences might represent a −10 region and a −35 region, respectively, of the Pmoq promoter. The putative transcriptional start site of the operon consisting of ORFs 19 to 23 was located 48 nucleotides upstream of the start codon of ORF 19; however, analysis of its upstream sequence did not reveal putative −10 and −35 elements of a σ70-dependent promoter. Notably, the −10 region exhibited similarity to the −10 consensus sequence recognized by the σ38 RNA polymerase subunits (−15-TGCTACCTT-−7; nucleotides important for σ38 recognition are underlined) (Fig. 4C) (4, 35). As in most promoters that bind the σ38 factor (4), a conserved −35 region was not obvious in Porf19.

FIG. 3. Operon structure of ORFs 3 to 24 (nucleotides 4,079 to 30,130) of pAL1 and RT-PCR analysis of transcripts generated from this region. (A) Schematic diagram of ORFs and RT-PCR strategy. For descriptions of ORFs 3 to 24, see Table 1 and the text. Bars I to XIV indicate the regions amplified in the RT-PCR analysis (for the primers and lengths of PCR products, see Table S1 in the supplemental material). Arrows with the same pattern indicate ORFs belonging to the same operon. (B) RT-PCR analysis of qoxM, amq, ORF 23, and ORF 16. Total RNA from A. nitroguajacolicus Rü61a grown on different carbon sources was used as a template for RT. The carbon sources for growth of cells were as follows: lanes 1 and 2, quinaldine; lanes 3 and 4, 1H-4-oxoquinaldine; lanes 5 and 6, 1H-3-hydroxy-4-oxoquinaldine; lanes 7 and 8, N-acetylanthranilate; lanes 9 and 10, anthranilate; and lanes 11 and 12, succinate. In negative controls (lanes 2, 4, 6, 8, 10, and 12), reverse transcriptase was omitted from the cDNA synthesis reaction mixture. As a positive control for the PCR, total DNA was used as a template for PCR (lanes 13). The roman numerals indicate the transcripts (see panel A), as follows: I, qoxM; II, amq; III, ORF 23; and IV, ORF 16. Lane M contained a size marker. (C) RT-PCR analysis using primer pairs to amplify intergenic regions between ORF 3 and qoxS (V), between amq and ORF 10 (IX), and between ORF 21 and ORF 22 (XIII). RNA used for cDNA synthesis was isolated from cells grown on quinaldine (lanes 1 and 2) and succinate (lanes 3 and 4). In the negative controls (lanes 2 and 4), reverse transcriptase was omitted from the reaction mixture. Lane 5 contained a positive control for PCR performed with total DNA as the template. Lane M contained a size marker. The results for amplification of regions VI (qoxSML), VII (moq-hod), VIII (hod-amq), and X (ORF 10 and ORF 11) were similar to the results for amplification of regions V and IX. Amplification of regions XI (ORF 19 and ORF 20), XII (ORF 20 and ORF 21), and XIV (ORF 22 and ORF 23) resulted in patterns and intensities of RT-PCR products corresponding to the patterns and intensities of RT-PCR products from region XIII.

Two putative PaaX-like transcriptional regulators (COG3327) exhibiting 67% identity to each other are encoded by ORFs 16 and 24. PaaX proteins in the absence of phenylacetyl-CoA repress transcription of phenylacetate catabolic gene clusters in E. coli and P. putida strains (11). The proximity on plasmid pAL1 of catabolic operons and paaX-like genes prompted us to search for potential PaaX binding sites in the upstream regions of qoxL, moq, and ORF 19. The sequence TGATTC-N25-GATTCA (Fig. 4A), which differs only slightly from the consensus sequence of the E. coli PaaX binding site (TGATTC-N26-28-GAATCA) (31), was detected downstream of the qoxL transcriptional start site between positions 12 and 48. The presence of a PaaX operator in the transcribed region has also been reported for the PstyA promoter of Pseudomonas sp. strain Y2, which controls genes required for oxidation of styrene to phenylacetate (11). In the upstream regions of moq and ORF 19, sequences resembling PaaX binding sites were not identified. However, the hypothesis that the qoxLMS-ORF 3 operon might be controlled by a PaaX-like repressor cannot explain transcriptional induction or derepression of all three catabolic operons (ORFs 3 to 6, ORFs 7 to 11, and ORFs 19 to 23) by quinaldine or by its downstream aromatic metabolites.
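A search for PaaX-like operators of the kind described above can be sketched as follows. This is an illustrative sketch only and not the procedure used in this study: the function name `find_paax_like_sites`, the tolerated spacer range (25 to 28 nucleotides), and the allowance of one mismatch in the right half-site are all assumptions chosen so that both the E. coli consensus (TGATTC-N26-28-GAATCA) and the pAL1 variant (TGATTC-N25-GATTCA) would be reported. The test sequence uses an artificial poly(A) spacer, not the real pAL1 sequence.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_paax_like_sites(seq, left="TGATTC", right="GAATCA",
                         spacers=range(25, 29), max_mismatches=1):
    """Return (start, spacer_length, right_half_site) for each candidate site.

    The left half-site must match exactly; the right half-site may differ
    from the consensus at up to `max_mismatches` positions (an assumption).
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(left) + 1):
        if seq[i:i + len(left)] != left:
            continue
        for n in spacers:
            j = i + len(left) + n
            half = seq[j:j + len(right)]
            if len(half) == len(right) and hamming(half, right) <= max_mismatches:
                hits.append((i, n, half))
    return hits


# Toy sequence: left half-site, 25-nt artificial spacer, pAL1-type right half-site.
toy = "CC" + "TGATTC" + "A" * 25 + "GATTCA" + "GG"
print(find_paax_like_sites(toy))
```

The pAL1 right half-site GATTCA differs from the consensus GAATCA at a single position, which is why one mismatch is tolerated in this sketch.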

Genes presumed to be involved in conjugation. Recently, we described conjugative transfer of pAL1 to the pAL1-deficient mutant of A. nitroguajacolicus Rü61a and to A. nicotinovorans DSM 420 (40). Bacterial conjugation is a complex process that involves DNA-processing enzymes, proteins involved in mating pair formation, and a coupling protein that guides the DNA-protein complex to a type IV secretion system (8, 37). The deduced gene product of ORF 28 of pAL1 exhibits similarity to about 30% of the pfam02534 domain conserved in coupling proteins belonging to the TraG/TraD family and the VirD4 family (COG3505). The predicted ORF 29, which overlaps ORF 28 for 52 bp, encodes a small 100-amino-acid (aa) protein. The N-terminal 47 residues of this protein exhibit 65% identity to residues 111 to 157 of a putative TraG protein of Arthrobacter sp. strain FB24 (accession no. ABK05710) and may comprise an ATP binding motif. Thus, ORFs 28 and 29 may have originally been derived from a single gene that was disrupted by a frameshift mutation.

ORF69p is related to nucleoside triphosphate binding proteins, and its sequence aligns with 52% of the sequence of COG3451 representing the conserved domain of VirB4, an ATPase crucial for assembly of the transenvelope channel and for induction of conformational changes of the translocation system required to drive transport of the DNA-protein substrate (10). ORF 84 may code for an ATP binding VirB11 component of a type IV secretion complex. The proteins encoded by ORFs 82, 85 to 89, and 91 to 92 were all predicted to contain transmembrane helices and therefore could be involved in formation of a translocation channel spanning the cell envelope. Similarities to TadB, TadC, and TadG-like proteins were found for the sequences of ORF85p, ORF86p, and ORF87p, respectively; the N-terminal regions of both ORF87p and ORF88p resemble TadE-like proteins. Tad (tight adherence) proteins are constituents of a system involved in the secretion and assembly of fimbriae (fibrils) (29, 63). The products of ORFs 82 and 85 to 89 of pAL1 thus might contribute to formation, assembly, or anchorage of a secretion pilus. The deduced ORF93p sequence exhibits 41% similarity to the sequence of TcpJ, the integral membrane type IV prepilin peptidase of Vibrio cholerae (accession no. AAK20796), and could be involved in processing of secreted type IV prepilins. In conclusion, gene clusters and individual genes presumed to code for proteins involved in conjugation and DNA transfer are somewhat scattered on pAL1, comprising ORFs 28 and 29, ORF 69, ORFs 82 and 84 to 89, and ORFs 91 to 93.

Conserved gene clusters. Conserved clustering of genes suggests a common evolutionary ancestry and perhaps a functional connection of the gene products. Three regions homologous to gene clusters previously described for other bacteria are apparent. (i) The first region includes ORFs 67 to 73, with best hits in BLAST searches with proteins encoded by linear plasmids pBD2, pREL1, pRHL2, and SCP1 of Rhodococcus erythropolis BD2, R. erythropolis PR4, Rhodococcus sp. strain RHA1, and Streptomyces coelicolor A3(2), respectively (Table 2). Additionally, three of these genes are also conserved in the chromosome of Streptomyces avermitilis MA-4680. (ii) The second region includes ORFs 82 to 89 and 91 to 92 (Table 2), presumed to code for proteins of a secretion system possibly involved in conjugation, interrupted by ORF 90 coding for a hypothetical protein without any database matches. Proteins encoded by corresponding genes on SCP1 of S. coelicolor A3(2) have been proposed to form a surface-located protein complex (5). (iii) The third region includes ORFs 74 to 79 (see Table S3 in the supplemental material); for most of the deduced gene products, functional classification was not possible. This region occurs in genomes of physiologically and phylogenetically diverse organisms.

Genes presumed to be involved in plasmid maintenance, DNA repair, and DNA replication. The putative protein encoded by ORF 80 is similar to an ATP-hydrolyzing ParA-like protein and thus may be involved in plasmid partitioning (12), but a protein exhibiting similarity to ParB (which binds to a specific DNA site and interacts with ParA) was not identified on pAL1. The presence of lone parA genes was previously reported for several linear plasmids of actinomycetes, including pCLP from M. celatum (34), pBD2 from R. erythropolis (60), and pREL1 from R. erythropolis PR4 (56). Since parAB genes are thought to be indispensable for plasmid partitioning, the functional significance of lone parA genes on linear plasmids is not clear. We speculate that either a ParB protein encoded by another replicon assists in plasmid partitioning or this protein function is provided by a still-unidentified gene on pAL1.

FIG. 4. Nucleotide sequences upstream of qoxL (A), moq (B), and ORF 19 (C) and putative promoters. Translation initiation codons for the qoxL and moq genes and for ORF 19 (indicated by arrows) are in bold type; putative ribosome binding sites (RBS) are indicated by italics. The transcriptional start sites (+1) of qoxL, moq, and ORF 19 are indicated by bold type and italics; the direction of transcription is indicated by bent arrows. Hypothetical −10 and −35 regions of each promoter are underlined. The dotted arrows indicate the sequence corresponding to the PaaX binding motif (see text).

The consecutive ORFs 49 and 50 encode homologs of SOS mutagenesis and repair proteins UmuD and UmuC, respectively. Residues 59 to 127 of ORF49p resemble a conserved domain of family 24 peptidases (pfam00717), which includes UmuD and its plasmid-encoded homolog MucA; aa 16 to 360 of ORF50p align with the pfam00817 domain conserved in the UmuC and MucB proteins representing DNA polymerase V (16, 47). Translesion replication in addition to UmuC and UmuD′ requires RecA and a single-stranded DNA binding protein (SSB). The product of ORF 39 exhibits 29% identity to the N-terminal fragment (aa 1 to 135) of SSB from E. coli (SSBC; accession no. 1EYG) carrying the single-stranded DNA binding site (45). Thus, it might be involved in SOS repair or even in regular replication and/or telomere patching of pAL1.

Genes presumed to be involved in “telomere patching.” Proteins encoded by ORFs 101 to 103 might be involved in reactions that are specific for replication at the telomeres (“telomere patching”). The closest BLAST matches for the putative gene products of ORFs 101 to 103 are rhodococcal proteins (Table 1). The large protein ORF101p appears to consist of multiple domains. Its C-terminal region (aa 1,446 to 1,663) exhibits 29% identity to residues 415 to 678 of both TapL of Streptomyces lividans (accession no. AAO73842) and TapC of S. coelicolor (accession no. AAO73843), which, however, consist of only 739 aa. Since the telomere-associated protein TapL of S. lividans binds to specific single-stranded DNA sequences of telomeric 3′ overhangs of Streptomyces plasmid pSLA2 and interacts with the Tp, it was suggested that it recruits Tp to the telomere termini of replication intermediates (3). Besides its “Tap domain” (aa 1,446 to 1,663), ORF101p has a zinc finger CHCC-type domain (smart00400) at the N terminus (aa 35 to 85), and its region comprising residues 598 to 868 resembles a domain of the superfamily II helicase (COG5519). Interestingly, the putative telomere-associated protein pRL2.4c encoded by linear plasmid pRL2 of Streptomyces strain 44414 (accession no. ABC67366), which consists of 1,100 aa, also has a superfamily II helicase domain in addition to a Tap domain. The Tap-like and superfamily II helicase domains of pRL2.4c and ORF101p of pAL1 exhibit 25 and 21% identity, respectively. An additional domain of ORF101p may be formed by the region covering aa 188 to 303, which matches the DNA primase core (i.e., its RNA polymerase domain) (SCOP accession no. SSF56731); this region seems not to be present in the two-domain pRL2.4c protein. Genes coding for large proteins (~1,700 aa) similar to the multidomain ORF101p protein of pAL1 have been described for rhodococcal linear replicons, including pBD2.007 of R. erythropolis pBD2 (60), pREL1_0008 coding for a putative telomere binding protein of R. erythropolis PR4 (56), and RHA1_ro10009 of pRHL2 of Rhodococcus sp. strain RHA1 (accession no. ABH00202). Like ORF101p, two of these proteins contain a DNA primase domain (RHA1_ro10009 [aa 213 to 288] and pBD2.007 [aa 200 to 277]). We suggest that ORF 101 of pAL1 and its rhodococcal orthologs code for proteins that perform functions comparable to those of Tap of Streptomyces spp.; however, the additional domains may broaden their roles in the telomere patching reaction.

The deduced protein ORF102p resembles pBD2.006, RHA1_ro10008, and pREL1_0007. Remarkably, it exhibits weak but significant similarity to Tps of Streptomyces linear replicons, including 24% identity to TpgCL1 from Streptomyces clavuligerus and 25% identity to the proposed terminal protein pRL2.3c of Streptomyces sp. strain 44414 (68), suggesting that it may represent a terminal protein of pAL1. A recent study of Tp of S. coelicolor demonstrated that the 5′-terminal nucleotide of a Streptomyces linear replicon is covalently linked to a threonine residue located in the C-terminal region of the Tp (66). Note that ORF102p, like the Streptomyces Tps, contains numerous threonine residues (i.e., 17 threonine residues); however, since the amino acid residues linked to the phosphodiester bond of DNA are different in the terminal proteins of different taxonomic groups (51), the hypothesis that a Thr residue participates in the deoxynucleotidylation reaction of Tp of pAL1 is highly speculative.

TABLE 2. Gene clusters of pAL1 conserved in plasmids or genomes of actinobacteria

Entries for corresponding ORFs are given as ORF; length of aligned region [nucleotides]/% amino acid identity.

| ORF of A. nitroguajacolicus Rü61a | R. erythropolis BD2 (pBD2) (accession no. AY223810) | Rhodococcus sp. strain RHA1 (pRHL2) (accession no. CP000433) | R. erythropolis PR4 (pREL1) (accession no. AP008931) | S. coelicolor A3(2) (SCP1) (accession no. AL590463) | S. avermitilis MA-4680 (accession no. BA000030) | Terrabacter sp. strain DBF63 (pDBF1) (accession no. AP008980) | Nocardioides sp. strain JS614 (accession no. CP000509) |
|---|---|---|---|---|---|---|---|
| 67 | 057; 150/40, 190/35^a | | 0024; 518/31 | 148; 308/30 | | | |
| 68 | 056; 343/24 | 10027; 404/23 | 0023; 841/37 | 147; 198/22 | 177; 175/25 | | |
| 69 | 055; 853/37 | 10026; 840/37 | 0022; 397/23 | 146; 862/30 | 175; 883/28 | | |
| 70 | 054; 193/32 | 10025; 174/31 | 0021; 139/34 | 145; 138/23 | | | |
| 71 | 053; 93/27 | | | | | | |
| 72 | 052; 318/26 | 10023; 335/28 | 0019; 307/27 | 143; 354/25 | 172; 317/28 | | |
| 73 | 050; 256/26 | 10021; 195/26 | 0017; 189/26 | | | | |
| 82 | 014; 159/40 | 10043; 159/35 | 0039; 159/35 | 167; 167/32 | 183; 172/28 | 27; 168/30 | 2220; 159/45 |
| 83 | 015; 223/29 | 10044; 239/28 | 0040; 239/28 | 168; 246/27 | 184; 252/30 | 28; 180/36 | 2221; 249/29 |
| 84 | 016; 409/36 | 10045; 409/31 | 0041; 383/30 | 169; 408/33 | 185; 455/31 | 29; 404/38 | 2222; 397/34 |
| 85 | 017; 237/32 | 10046; 235/31 | 0042; 233/32 | 170; 262/28 | 186; 232/37 | 30; 253/33 | 2223; 245/40 |
| 86 | 018; 292/30 | 10047; 292/31 | 0043; 293/30 | 171; 291/29 | 187; 294/30 | 31; 290/27 | 2224; 296/37 |
| 87 | 020; 122/31 | 10049; 65/33 | 0047; 70/30 | 173; 125/28 | 189; 123/28 | 33; 128/28 | 2226; 71/32 |
| 88 | 021; 141/36 | 10050; 136/28 | 0048; 136/27 | 174; 141/32 | 190; 136/36 | 34; 144/27 | 2227; 141/37 |
| 89 | 022; 35/51 | | | 175; 37/43 | | 35; 34/35 | 2228; 36/44 |
| 91 | 025; 168/27 | 10055; 125/32 | 0053; 178/26 | 182; 225/34 | 166; 158/44 | 38; 146/31 | 2232; 222/39 |
| 92 | 023; 728/26 | 10052; 709/29 | 0050; 883/27 | 176; 503/26 | 192; 263/33 | 36; 664/30 | 2233; 456/25 |

^a There are two regions of homology in the gene product.

FIG. 5. Termini of pAL1. (A) Predicted secondary structures formed by 3′ single-stranded DNA of the left and right ends of pAL1. (B) Nucleotide sequences of the left and right termini. Palindromic sequences are enclosed in boxes in panel B and are labeled with roman numerals in both panels; the palindromes that were predicted to form stem-loop structures are indicated by gray shading in both panels. Superpalindromic sequences indicated by different colors in panel A are indicated by arrows that are in the corresponding colors below the sequence in panel B. Note that sequences of 3′ overhangs shown in panel A are reverse complementary to the sequence in panel B. In panel B, the central motif 5′-GCTNCGC-3′ that is conserved in many actinomycete linear replicons (see text) is indicated by bold type and underlining; trinucleotides proposed to form hairpins with sheared G-A pairs in the corresponding 3′ overhang are indicated by white type with a black background; and asterisks indicate nonpairing bases within palindromes.

Both ORF102p and ORF103p, like Tps of Streptomyces linear replicons (67), have high theoretical pI values (pI 9.79 and 10.05), and both are predicted to contain DNA binding domains at the N terminus. ORF103p is similar to the hypothetical proteins pBD2.005, pREL1_0006, and RHA1_ro10007. It is presumed to be a DNA binding protein that could be involved in replication and/or telomere patching. Multiple alignment of ORF102p and ORF103p, their homologs from Rhodococcus strains, and Streptomyces Tps revealed a short common motif, (T/V)(X)3(A/S)(X)3(G/R)(V/I)(S/T)XRT(V/I)XR (where X is any amino acid), involving conserved serine and threonine residues (underlined), located in the N-terminal, DNA binding region of the proteins. Even if Tps of streptomycetes were not among the hits when a BLASTP search was performed, the possibility that ORF 103 might code for a second Tp should not be excluded.
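The common motif (T/V)(X)3(A/S)(X)3(G/R)(V/I)(S/T)XRT(V/I)XR can be written as a regular expression for scanning protein sequences. The sketch below is illustrative: the function name `find_tp_motif` is hypothetical, X is assumed to mean any single uppercase letter, and the test peptide is an artificial sequence constructed to contain one match; it is not a real Tp or ORF102p/ORF103p sequence.

```python
import re

# (T/V)(X)3(A/S)(X)3(G/R)(V/I)(S/T)XRT(V/I)XR, with X = any residue.
# The lookahead lets overlapping occurrences be reported as well.
TP_MOTIF = re.compile(
    r"(?=([TV][A-Z]{3}[AS][A-Z]{3}[GR][VI][ST][A-Z]RT[VI][A-Z]R))"
)


def find_tp_motif(protein: str):
    """Return (0-based start, matched segment) for every motif occurrence."""
    protein = protein.upper()
    return [(m.start(), m.group(1)) for m in TP_MOTIF.finditer(protein)]


# Artificial test peptide (not a real sequence): 2 flanking residues,
# one 17-residue motif instance, 2 flanking residues.
toy_peptide = "MK" + "TAAASAAAGVSARTVAR" + "LL"
print(find_tp_motif(toy_peptide))
```

The motif spans 17 residues, so a hit reports a 17-letter segment together with its start position in the query.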

In conclusion, based on sequence analysis of ORFs 101 to 103 of pAL1 and the corresponding orthologs in rhodococcal linear plasmids, we suggest that these actinomycetal linear replicons encode telomere patching proteins whose primary sequence and domain structure differ significantly from those of proteins encoded by Streptomyces replicons. The difference in protein architecture may indicate that there are subtle differences in the telomere patching mechanisms.

Analysis of the left and right termini of pAL1. The termini of different actinomycetal linear replicons have been found to contain palindromic sequences with the potential to “fold back” to form secondary structures presumed to be functionally important for telomere patching. Assuming that pAL1 has blunt-end termini, fragments of pAL1 generated by restriction with PstI were inserted into the PstI- and EcoRV-digested vector pBluescript II SK(−), resulting in a 2,014-bp insert (pBSK3) and a 3,092-bp insert (pBSK5) corresponding to the left and right termini of pAL1, respectively. However, since previous work on actinomycetal linear replicons showed that 5′ ends of protease-treated plasmid DNA may still be blocked by linkage of residual peptides to the DNA (23, 25), the blunt ends of the cloned inserts, instead of representing the end of pAL1, might have resulted from DNA shearing. Therefore, the PstI fragments were cloned again from DNA subjected to proteinase K digestion, as well as alkali treatment to hydrolyze the ester bond between the terminal nucleotide and any remnants of Tp. Four 3.0-kb inserts and five 2.0-kb inserts were identified and sequenced. Compared to the sequences of the inserts of pBSK3 and pBSK5, each of these new DNA fragments contained an additional nucleotide at the blunt-end terminus. Sequencing of these terminal fragments also revealed that the 5′ nucleotide of both ends of pAL1 is dCMP, as observed for all other actinomycetal linear replicons studied so far (25, 56, 66, 68).

The first 100 nucleotides of the two terminal sequences of pAL1 share only 53% sequence identity, but they contain three similar palindromic sequences, palindromes I to III (Fig. 5). Palindromes II and V of the left pAL1 terminus both have the central motif 5′-GCTGCGC-3′ (Fig. 5B), which in a single-stranded 3′ overhang may form a stable hairpin structure with a single C residue loop closed by sheared purine-purine (G-A) pairing (Fig. 5A). The 5′-GCTNCGC-3′ motif was found to be conserved in terminal sequences of several Streptomyces and Rhodococcus linear replicons, in the termini of pCLP of M. celatum, and in the 3′ ends of the genomes of autonomous (helper-independent) parvoviruses (9, 25, 30, 43, 56, 60), suggesting that it has some general relevance in protein-primed replication mechanisms. Remarkably, the two specific binding sites of single-stranded DNA on telomeric 3′ overhangs of Streptomyces plasmid pSLA2, which are recognized by the Streptomyces Tap protein, include this conserved 5′-GCTNCGC-3′ motif as a core sequence (3). However, the right end of pAL1 does not exhibit similarity to terminal sequences of Rhodococcus or Streptomyces replicons (56) and lacks the 5′-GCTNCGC-3′ motif (Fig. 5).
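A minimal sketch of how a terminal sequence could be screened for the 5′-GCTNCGC-3′ motif, and how the motif reads on the complementary 3′ overhang, is given below. It is illustrative only: the helper names and the test sequences are hypothetical, not taken from pAL1, and N is assumed to stand for any of A, C, G, or T.

```python
import re

# N = any base; the lookahead reports overlapping occurrences as well.
MOTIF = re.compile(r"(?=(GCT[ACGT]CGC))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_gctncgc(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 7-mer) for each GCTNCGC occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Hypothetical test sequence, NOT the pAL1 terminus.
print(find_gctncgc("AAGCTGCGCTTT"))      # [(2, 'GCTGCGC')]
# The motif as it would read on the complementary strand (3' overhang).
print(reverse_complement("GCTGCGC"))     # GCGCAGC
```

The reverse complement GCGCAGC contains the central GCA trinucleotide, which matches the description of a single C residue flanked by a G and an A that can form the sheared pair.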

Calculations using the energy minimization algorithm mfold for the left 3′ overhang of pAL1 predicted that nucleotides 6 to 11 and nucleotides 37 to 41 (Fig. 5B) may fold back to form a large stem-like secondary structure that encloses the hairpin with a GCA loop formed by palindrome II (Fig. 5A, left). A hairpin with a tight GCA loop may also be formed by palindrome V. In marked contrast, secondary structures that contain loops closed by sheared purine-purine pairing were not predicted for the potential 3′ overhang of the right terminus (Fig. 5A, right). The different secondary structures of the left and right 3′ overhangs might interact with different proteins, if there is specificity between telomeric sequences and telomere binding proteins, as suggested previously (62, 68).
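The following sketch is not mfold and performs no energy minimization. It is a naive scan, under simplifying assumptions, for short hairpin candidates of the kind discussed above: a trinucleotide loop flanked by a stem of Watson-Crick pairs. Sheared G-A pairs are not modeled explicitly; the G and A of a GCA trinucleotide are simply treated as part of the loop. The function name, parameters, and test sequence are hypothetical.

```python
# Watson-Crick partners only; any other character never pairs.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_hairpins(seq: str, loop_len: int = 3, min_stem: int = 2):
    """Return (stem start, stem length, loop sequence) for each candidate.

    For every possible loop position, the stem is extended outward while
    the flanking bases are Watson-Crick complementary.
    """
    seq = seq.upper()
    hits = []
    for loop_start in range(1, len(seq) - loop_len):
        left, right = loop_start - 1, loop_start + loop_len
        stem = 0
        while left >= 0 and right < len(seq) and PAIR.get(seq[left]) == seq[right]:
            stem += 1
            left -= 1
            right += 1
        if stem >= min_stem:
            hits.append((left + 1, stem, seq[loop_start:loop_start + loop_len]))
    return hits


# Reverse complement of GCTGCGC, i.e. the motif as read on a 3' overhang.
print(find_hairpins("GCGCAGC"))   # [(0, 2, 'GCA')]
```

A scan like this can only suggest where a stem-loop is sterically possible; stability and competing folds require a thermodynamic method such as the one used in the text.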

Concluding remarks. Since pAL1 confers the ability to degrade quinaldine to anthranilate and may code for enzymes involved in anthranilate conversion via 2-aminobenzoyl-CoA, it can be considered a catabolic plasmid. Despite the apparent lack of transposons and insertion sequences, it has a somewhat modular structure. A distinctive feature of pAL1, which apparently is shared by some rhodococcal plasmids, is the large putative telomere-associated protein that differs from streptomycetal Tap proteins, as it includes additional domains. It would be interesting to investigate the properties of this protein with respect to DNA binding, protein-protein interactions, and possible catalytic activities.

ACKNOWLEDGMENTS

The financial support provided by the Deutsche Forschungsgemeinschaft to S.F. (grant FE 383/11) is gratefully acknowledged.

We thank Stephan Kolkenbrock, Münster, Germany, for assistance with graphic representation.

REFERENCES

1. Altschul, S. F., T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.

2. Bao, K., and S. N. Cohen. 2001. Terminal proteins essential for the replication of linear plasmids and chromosomes in Streptomyces. Genes Dev. 15:1518–1527.

3. Bao, K., and S. N. Cohen. 2003. Recruitment of terminal protein to the ends of Streptomyces linear plasmids and chromosomes by a novel telomere-binding protein essential for linear DNA replication. Genes Dev. 17:774–785.

4. Becker, G., and R. Hengge-Aronis. 2001. What makes an Escherichia coli promoter σS dependent? Role of the −13/−14 nucleotide promoter positions and region 2.5 of σS. Mol. Microbiol. 39:1153–1165.

5. Bentley, S. D., S. Brown, L. D. Murphy, D. E. Harris, M. A. Quail, J. Parkhill, B. G. Barrell, J. R. McCormick, R. I. Santamaria, R. Losick, M. Yamasaki, H. Kinashi, C. W. Chen, G. Chandra, D. Jakimowicz, H. M. Kieser, T. Kieser, and K. F. Chater. 2004. SCP1, a 356,023 bp linear plasmid adapted to the ecology and developmental biology of its host, Streptomyces coelicolor A3(2). Mol. Microbiol. 51:1615–1628.

6. Besemer, J., and M. Borodovsky. 1999. Heuristic approach to deriving models for gene finding. Nucleic Acids Res. 27:3911–3920.

7. Carl, B., and S. Fetzner. 2005. Transcriptional activation of quinoline degradation operons of Pseudomonas putida 86 by the AraC/XylS-type regulator OxoS and cross-regulation of the PqorM promoter by XylS. Appl. Environ. Microbiol. 71:8618–8626.

8. Chen, I., P. J. Christie, and D. Dubnau. 2005. The ins and outs of DNA transfer in bacteria. Science 310:1456–1460.

9. Chou, S. H., L. Zhu, and B. R. Reid. 1997. Sheared purine·purine pairing in biology. J. Mol. Biol. 267:1055–1067.

10. Dang, T. A., X. R. Zhou, B. Graf, and P. J. Christie. 1999. Dimerization of the Agrobacterium tumefaciens VirB4 ATPase and the effect of ATP-binding cassette mutations on the assembly and function of the T-DNA transporter. Mol. Microbiol. 32:1239–1253.


11. del Peso-Santos, T., D. Bartolomé-Martín, C. Fernández, S. Alonso, J. L. García, E. Díaz, V. Shingler, and J. Perera. 2006. Coregulation by phenylacetyl-coenzyme A-responsive PaaX integrates control of the upper and lower pathways for catabolism of styrene by Pseudomonas sp. strain Y2. J. Bacteriol. 188:4812–4821.

12. Ebersbach, G., and K. Gerdes. 2005. Plasmid segregation mechanisms. Annu. Rev. Genet. 39:453–479.

13. Fetzner, S. 1998. Bacterial degradation of pyridine, indole, quinoline, and their derivatives under different redox conditions. Appl. Microbiol. Biotechnol. 49:237–250.

14. Frerichs-Deeken, U., and S. Fetzner. 2005. Dioxygenases without requirement for cofactors: identification of amino acid residues involved in substrate binding and catalysis, and testing for rate-limiting steps in the reaction of 1H-3-hydroxy-4-oxoquinaldine 2,4-dioxygenase. Curr. Microbiol. 51:344–352.

15. Frerichs-Deeken, U., K. Ranguelova, R. Kappl, J. Hüttermann, and S. Fetzner. 2004. Dioxygenases without requirement for cofactors and their chemical model reaction: compulsory order ternary complex mechanism of 1H-3-hydroxy-4-oxoquinaldine 2,4-dioxygenase involving general base catalysis by histidine 251 and single-electron oxidation of the substrate dianion. Biochemistry 43:14485–14499.

16. Goldsmith, M., L. Sarov-Blat, and Z. Livneh. 2000. Plasmid-encoded MucB protein is a DNA polymerase (pol RI) specialized for lesion bypass in the presence of MucA′, RecA, and SSB. Proc. Natl. Acad. Sci. USA 97:11227–11231.

17. Grant, S. G., J. Jessee, F. R. Bloom, and D. Hanahan. 1990. Differential plasmid rescue from transgenic mouse DNAs into Escherichia coli methylation-restriction mutants. Proc. Natl. Acad. Sci. USA 87:4645–4649.

18. Hames, B. D. 1990. One-dimensional polyacrylamide gel electrophoresis, p. 1–147. In B. D. Hames and D. Rickwood (ed.), Gel electrophoresis of proteins, 2nd ed. IRL Press, Oxford, United Kingdom.

19. Hanahan, D. 1983. Studies on transformation of Escherichia coli with plasmids. J. Mol. Biol. 166:557–580.

20. Hayakawa, T., T. Tanaka, K. Sakaguchi, N. Otake, and H. Yonehara. 1979. Linear plasmid-like DNA in Streptomyces sp. producing lankacidin group antibiotics. J. Gen. Appl. Microbiol. 25:255–260.

21. Higgins, D. G., J. D. Thompson, and T. J. Gibson. 1996. Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266:383–402.

22. Hiratsu, K., S. Mochizuki, and H. Kinashi. 2000. Cloning and analysis of the replication origin and the telomeres of the large linear plasmid pSLA2-L in Streptomyces rochei. Mol. Gen. Genet. 263:1015–1021.

23. Hirochika, H., K. Nakamura, and K. Sakaguchi. 1984. A linear DNA plasmid from Streptomyces rochei with an inverted terminal repetition of 614 base pairs. EMBO J. 3:761–766.

24. Hirokawa, T., S. Boon-Chieng, and S. Mitaku. 1998. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14:378–379.

25. Huang, C. H., Y. S. Lin, Y. L. Yang, S. W. Huang, and C. W. Chen. 1998. The telomeres of Streptomyces chromosomes contain conserved palindromic sequences with potential to form complex secondary structures. Mol. Microbiol. 28:905–916.

26. Huang, C. H., C. Y. Chen, H. H. Tsai, C. Chen, Y. S. Lin, and C. W. Chen. 2003. Linear plasmid SLP2 of Streptomyces lividans is a composite replicon. Mol. Microbiol. 47:1563–1576.

27. Hund, H.-K., A. de Beyer, and F. Lingens. 1990. Microbial metabolism of quinoline and related compounds. VI. Degradation of quinaldine by Arthrobacter sp. Biol. Chem. Hoppe-Seyler 371:1005–1008.

28. Jones, D., and R. M. Keddie. 1992. The genus Arthrobacter, p. 1283–1299. In A. Balows, H. G. Trüper, M. Dworkin, W. Harder, and K. H. Schleifer (ed.), The prokaryotes, vol. 2. Springer Verlag, Heidelberg, Germany.

29. Kachlany, S. C., P. J. Planet, M. K. Bhattacharjee, E. Kollia, R. DeSalle, D. H. Fine, and D. H. Figurski. 2000. Nonspecific adherence by Actinobacillus actinomycetemcomitans requires genes widespread in Bacteria and Archaea. J. Bacteriol. 182:6169–6176.

30. Kalkus, J., R. Menne, M. Reh, and H. G. Schlegel. 1998. The terminal structures of linear plasmids from Rhodococcus opacus. Microbiology 144:1271–1279.

31. Kim, H. S., T. S. Kang, J. S. Hyun, and H. S. Kang. 2004. Regulation of penicillin G acylase gene expression in Escherichia coli by repressor PaaX and the cAMP-cAMP receptor protein complex. J. Biol. Chem. 279:33253–33262.

32. Kolkenbrock, S., K. Parschat, B. Beermann, H.-J. Hinz, and S. Fetzner. 2006. N-Acetylanthranilate amidase from Arthrobacter nitroguajacolicus Rü61a, an α/β-hydrolase-fold protein active towards aryl-acylamides and -esters, and properties of its cysteine-deficient variant. J. Bacteriol. 188:8430–8440.

33. Lake, M. W., C. A. Temple, K. V. Rajagopalan, and H. Schindelin. 2000. The crystal structure of the Escherichia coli MobA protein provides insight into molybdopterin guanine dinucleotide biosynthesis. J. Biol. Chem. 275:40211–40217.

34. Le Dantec, C., N. Winter, B. Gicquel, V. Vincent, and M. Picardeau. 2001. Genomic sequence and transcriptional analysis of a 23-kilobase mycobacterial linear plasmid: evidence for horizontal transfer and identification of plasmid maintenance systems. J. Bacteriol. 183:2157–2164.

35. Lee, S. J., and J. D. Gralla. 2001. Sigma38 (rpoS) RNA polymerase promoter engagement via −10 region nucleotides. J. Biol. Chem. 276:30064–30071.

36. Leimkühler, S., and W. Klipp. 1999. Role of XDHC in molybdenum cofactor insertion into xanthine dehydrogenase of Rhodobacter capsulatus. J. Bacteriol. 181:2745–2751.

37. Llosa, M., F. X. Gomis-Rüth, M. Coll, and F. de la Cruz. 2002. Bacterial conjugation: a two-step mechanism for DNA transport. Mol. Microbiol. 45:1–8.

38. Neumann, M., M. Schulte, N. Jünemann, W. Stöcklein, and S. Leimkühler. 2006. Rhodobacter capsulatus XdhC is involved in molybdenum cofactor binding and insertion into xanthine dehydrogenase. J. Biol. Chem. 281:15701–15708.

39. Notredame, C., D. G. Higgins, and J. Heringa. 2000. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302:205–217.

40. Overhage, J., S. Sielker, S. Homburg, K. Parschat, and S. Fetzner. 2005. Identification of large linear plasmids in Arthrobacter spp. encoding the degradation of quinaldine to anthranilate. Microbiology 151:491–500.

41. Parschat, K., B. Hauer, R. Kappl, R. Kraft, J. Hüttermann, and S. Fetzner. 2003. Gene cluster of Arthrobacter ilicis Rü61a involved in the degradation of quinaldine to anthranilate: characterization and functional expression of the quinaldine 4-oxidase qoxLMS genes. J. Biol. Chem. 278:27483–27494.

42. Picardeau, M., C. Le Dantec, and V. Vincent. 2000. Analysis of the internal replication origin of a mycobacterial linear plasmid. Microbiology 146:305–313.

43. Picardeau, M., and V. Vincent. 1998. Mycobacterial linear plasmids have an invertron-like structure related to other linear replicons in actinomycetes. Microbiology 144:1981–1988.

44. Qin, Z., and S. N. Cohen. 1998. Replication at the telomeres of the Streptomyces linear plasmid pSLA2. Mol. Microbiol. 28:893–903.

45. Raghunathan, S., C. S. Ricard, T. M. Lohman, and G. Waksman. 1997. Crystal structure of the homo-tetrameric DNA binding domain of Escherichia coli single-stranded DNA-binding protein determined by multiwavelength X-ray diffraction on the selenomethionyl protein at 2.9-Å resolution. Proc. Natl. Acad. Sci. USA 94:6652–6657.

46. Rainey, F. A., N. Ward-Rainey, R. M. Kroppenstedt, and E. Stackebrandt. 1996. The genus Nocardiopsis represents a phylogenetically coherent taxon and a distinct actinomycete lineage: proposal of Nocardiopsaceae fam. nov. Int. J. Syst. Bacteriol. 46:1088–1092.

47. Reuven, N. B., G. Arad, A. Maor-Shoshani, and Z. Livneh. 1999. The mutagenesis protein UmuC is a DNA polymerase activated by UmuD′, RecA, and SSB and is specialized for translesion replication. J. Biol. Chem. 274:31763–31766.

48. Rosenblum, B. B., L. G. Lee, S. L. Spurgeon, S. H. Khan, S. M. Menchen, C. R. Heiner, and S. M. Chen. 1997. New dye-labeled terminators for improved DNA sequencing patterns. Nucleic Acids Res. 25:4500–4504.

49. Rutherford, K., J. Parkhill, J. Crook, T. Horsnell, P. Rice, M.-A. Rajandream, and B. Barrell. 2000. Artemis: sequence visualization and annotation. Bioinformatics 16:944–945.

50. Sakaguchi, K. 1990. Invertrons, a class of structurally and functionally related genetic elements that includes linear DNA plasmids, transposable elements, and genomes of adeno-type viruses. Microbiol. Rev. 54:66–74.

51. Salas, M. 1991. Protein-priming of DNA replication. Annu. Rev. Biochem. 60:39–71.

52. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

53. SantaLucia, J., Jr. 1998. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl. Acad. Sci. USA 95:1460–1465.

54. Schühle, K., M. Jahn, S. Ghisla, and G. Fuchs. 2001. Two similar gene clusters coding for enzymes of a new type of aerobic 2-aminobenzoate (anthranilate) metabolism in the bacterium Azoarcus evansii. J. Bacteriol. 183:5268–5278.

55. Schultz, A. C., P. Nygaard, and H. H. Saxild. 2001. Functional analysis of 14 genes that constitute the purine catabolic pathway in Bacillus subtilis and evidence for a novel regulon controlled by the PucR transcription activator. J. Bacteriol. 183:3293–3302.

56. Sekine, M., S. Tanikawa, S. Omata, M. Saito, T. Fujisawa, N. Tsukatani, T. Tajima, T. Sekigawa, H. Kosugi, Y. Matsuo, R. Nishiko, K. Imamura, M. Ito, H. Narita, S. Tago, N. Fujita, and S. Harayama. 2006. Sequence analysis of three plasmids harboured in Rhodococcus erythropolis strain PR4. Environ. Microbiol. 8:334–346.

57. Shiffman, D., and S. N. Cohen. 1992. Reconstruction of a Streptomyces linear replicon from separately cloned DNA fragments: existence of a cryptic origin of circular replication within the linear plasmid. Proc. Natl. Acad. Sci. USA 89:6129–6133.

58. Staden, R. 1996. The Staden sequence analysis package. Mol. Biotechnol. 5:233–241.


59. Staden, R., K. F. Beal, and J. K. Bonfield. 2000. The Staden package, 1998. Methods Mol. Biol. 132:115–130.

60. Stecker, C., A. Johann, C. Herzberg, B. Averhoff, and G. Gottschalk. 2003. Complete nucleotide sequence and genetic organization of the 210-kilobase linear plasmid of Rhodococcus erythropolis BD2. J. Bacteriol. 185:5269–5274.

61. Stephan, I., B. Tshisuaka, S. Fetzner, and F. Lingens. 1996. Quinaldine 4-oxidase from Arthrobacter sp. Rü61a, a versatile procaryotic molybdenum-containing hydroxylase active towards N-containing heterocyclic compounds and aromatic aldehydes. Eur. J. Biochem. 236:155–162.

62. Stoll, A., L.-I. Horvat, S. A. R. Lopes-Shikida, G. Padilla, and J. Cullum. 2000. Isolation and cloning of Streptomyces terminal fragments. Antonie Leeuwenhoek 78:223–226.

63. Wang, Y., and C. Chen. 2005. Mutation analysis of the flp operon in Actinobacillus actinomycetemcomitans. Gene 351:61–71.

64. Warren, R., W. W. Hsiao, H. Kudo, M. Myhre, M. Dosanjh, A. Petrescu, H. Kobayashi, S. Shimizu, K. Miyauchi, E. Masai, G. Yang, J. M. Stott, J. E. Schein, H. Shin, J. Khattra, D. Smailus, Y. S. Butterfield, A. Siddiqui, R. Holt, M. A. Marra, S. J. Jones, W. W. Mohn, F. S. Brinkman, M. Fukuda, J. Davies, and L. D. Eltis. 2004. Functional characterization of a catabolic plasmid from polychlorinated biphenyl-degrading Rhodococcus sp. strain RHA1. J. Bacteriol. 186:7783–7795.

65. Wu, W., S. K. D. Leblanc, J. Piktel, S. E. Jensen, and K. L. Roy. 2006. Prediction and functional analysis of the replication origin of the linear plasmid pSCL2 in Streptomyces clavuligerus. Can. J. Microbiol. 52:293–300.

66. Yang, C. C., Y. H. Chen, H. H. Tsai, C. H. Huang, T. W. Huang, and C. W. Chen. 2006. In vitro deoxynucleotidylation of terminal protein of Streptomyces linear chromosomes. Appl. Environ. Microbiol. 72:7959–7961.

67. Yang, C. C., C. H. Huang, C. Y. Li, Y. G. Tsay, S. C. Lee, and C. W. Chen. 2002. The terminal proteins of linear Streptomyces chromosomes and plasmids: a novel class of replication priming proteins. Mol. Microbiol. 43:297–305.

68. Zhang, R., Y. Yang, P. Fang, C. Jiang, L. Xu, Y. Zhu, M. Shen, H. Xia, J. Zhao, T. Chen, and Z. Qin. 2006. Diversity of telomere palindromic sequences and replication genes among Streptomyces linear plasmids. Appl. Environ. Microbiol. 72:5728–5733.

69. Zor, T., and Z. Selinger. 1996. Linearization of the Bradford protein assay increases its sensitivity: theoretical and experimental studies. Anal. Biochem. 236:302–308.

70. Zuker, M. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31:3406–3415.
