functional characterization and gene regulation of … guo.pdf · the thesis entitled `` functional...

KØBENHAVNS UNIVERSITET Supervisor: Xu Peng Submitted: 05/11/14 PhD Thesis Yang Guo Functional characterization and Gene regulation of the archaeal virus SIRV2

KØBENHAVNS UNIVERSITET

Supervisor: Xu Peng

Submitted: 05/11/14

PhD Thesis

Yang Guo

Functional characterization and Gene regulation of the archaeal virus SIRV2

Preface

The thesis entitled "Functional characterization and gene regulation of the archaeal virus SIRV2" was submitted to the Faculty of Science, University of Copenhagen, to obtain the PhD degree. I was financed by a scholarship from the China Scholarship Council and a stipend from the European Union.

Almost all the experimental work presented in this thesis was performed at the Danish Archaea Centre (DAC), Department of Biology, University of Copenhagen, Copenhagen, Denmark, under the supervision of Associate Professor Dr. Xu Peng. Protein circular dichroism (CD) spectroscopy was performed at the SBIN lab, Department of Biology, University of Copenhagen, and the high-throughput sequencing was carried out at the Department of Systems Biology, Technical University of Denmark, Copenhagen, Denmark.

The thesis starts with a brief summary of archaea and their viruses. Some typical viruses with unexpected morphotypes and genome structures are described, and the SIRV2 infection cycle is presented in detail. The second part summarizes the results, mainly the ssDNA binding, annealing and nuclease activities encoded by a conserved gene cluster of SIRV2, and a regulation map of two transcription regulators of Sulfolobus solfataricus P2 upon SIRV2 infection. Finally, the thesis ends with the conclusions and perspectives for future work. Two manuscripts are enclosed at the end.

Author: Yang Guo

Date: 05/11/14

Place: Copenhagen, Denmark

Acknowledgments

First and foremost, I would like to thank my academic supervisor Xu Peng for her endless support and always positive attitude, and for her patient guidance and fruitful discussions. I would also like to thank Xu for bringing me along on inspiring and exciting research trips from the very beginning of my biology studies.

Further, I would like to thank Qunxin She and Roger A. Garrett for their valuable suggestions and scientific discussions.

Moreover, I want to thank the lab technicians at the Danish Archaea Centre, Hein Phan and Mariana Awayez, for their great technical assistance in the laboratory.

Many thanks to all the members of the Danish Archaea Centre: Ling Deng, Fei He, Laura Alvarez, Daniel Jensen, Soley Gudbergsdottir, Wenyuan Han, Wenfang Peng, Guannan Liu, Carlos Sobrino, Kristine Uldahl and Marzieh Mousaei. You are inspiring and talented people working on various projects, and you brought me many fun times in the laboratory.

Last but not least, I would like to thank my beloved family and friends outside the scientific

environment for unreserved love and support throughout the duration of my Ph.D.

Table of Contents

Preface
Acknowledgments
Summary (English)
Resumé (Danish)
List of the Publications
Abbreviations
Thesis Objective
Introduction
1. Archaea
1.1 Classification of Archaea
1.2 Sulfolobus
2. Archaeal Viruses
2.1 Crenarchaeal virion morphotypes and their genomes
2.2 Sulfolobus islandicus rod-shaped virus 2
2.3 SIRV2 life cycle
2.3.1 Attachment and Entry
2.3.2 SIRV2 Gene Transcription and Regulation
2.3.3 SIRV2 Replication
2.3.4 SIRV2 Release Mechanism
Summary of Results
Future Perspectives
References
Manuscript I
Manuscript II

Summary (English)

Viruses infecting hyperthermophilic archaea have gained wide attention in recent years owing to their remarkable diversity in morphology and genome structure. Although substantial work has been done to decipher the functions of the unique proteins encoded by archaeal viruses and to characterize the relationship between the viruses and their host cells, knowledge of the biology of archaeal viruses is still limited. The crenarchaeal virus Sulfolobus islandicus rod-shaped virus 2 (SIRV2) is emerging as a promising model for genetic and biochemical studies as well as for the characterization of the different stages of the viral infection cycle. However, as with other archaeal viruses, most of the SIRV2 genome shows little similarity to sequences in public databases, which hinders functional research on the virus and raises challenges for protein comparison and prediction.

The results in this thesis fall into two parts. The first part describes the functional characterization of a highly conserved operon of SIRV2, revealing the unique structures and biochemical activities of its proteins as well as the biological processes in which they may participate. In the second part, the genome-wide regulation by two Sulfolobus solfataricus P2 transcription regulators upon SIRV2 infection was mapped for the first time.

A SIRV2 operon (gp17, gp18 and gp19) was found to be the only highly conserved gene cluster in rudiviruses and filamentous viruses, suggesting an important function in both viral families. The experimental results showed that ORF131b (gp17) is a novel ssDNA-binding protein that lacks a canonical ssDNA-binding domain. A few positively charged residues forming a U-shaped binding channel on the gp17 dimer are crucial for its ssDNA-binding activity. The intrinsically disordered C-terminus of gp17 was shown to be involved in the interaction with gp18, which was previously predicted to be a helicase but showed ssDNA annealing activity in this study. gp19 was shown to possess 5´ to 3´ ssDNA nuclease activity in addition to the previously demonstrated endonuclease activity, and a weak interaction between gp18 and gp19 was also detected. The functional characterization of the entire operon, together with the strand-displacement replication mode previously proposed for SIRV2, strongly points to a role of the operon in genome maturation and/or DNA recombination during viral DNA replication and repair.

Two transcription regulators from Sulfolobus solfataricus P2, sso2474 and sso10340, were differentially expressed upon SIRV2 infection. A method similar to, but simpler than, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) was applied in this study to gain insight into the gene composition of the two regulons in vivo. After mapping the sequence data to the genomes of Sulfolobus solfataricus P2 and SIRV2, sso2474 was found to bind the viral genome with high affinity by an unknown mechanism, whereas sso10340, or a protein interacting with it, preferentially bound and regulated host genes at several binding sites. A total of 27 enriched DNA fragments recovered from the sso10340 complex were selected as candidate binding targets in the host genome for further analysis by electrophoretic mobility shift assay (EMSA) and footprinting assay. A palindromic sequence motif was defined from the enriched sequences, and most of the target genes were involved in energy metabolism, transport and amino acid metabolism. The genome-wide binding profiles presented here reflect two different kinds of regulon and contribute to the understanding of transcription regulation upon virus infection in Sulfolobus.

Resumé (Danish)

Viruses that infect hyperthermophilic archaea have received great attention in recent years because of their remarkable diversity in morphology and genome structure. Although much work is being done to identify the functions of the unique proteins encoded by archaeal viruses and to describe the relationship between viruses and host cells, knowledge of the biology of archaeal viruses is still limited. The crenarchaeal virus Sulfolobus islandicus rod-shaped virus 2 (SIRV2) is a promising model for genetic and biochemical studies and for characterizing the different stages of the viral infection cycle. However, as with other archaeal viruses, most of the SIRV2 genome sequence has little similarity to sequences in public databases, which has hindered functional research on the virus and poses great challenges for the comparison and functional prediction of proteins.

This thesis consists of two parts. The first describes the functional characterization of a highly conserved operon from SIRV2, revealing the unique structures and biochemical activities of the operon's proteins and the biological processes in which they may take part. In the second part, the targets of two Sulfolobus solfataricus P2 transcription regulators are identified across the whole host genome for the first time.

A SIRV2 operon (gp17, gp18 and gp19) was considered to be the only highly conserved gene cluster in rudiviruses and filamentous viruses, which suggests an important function in both viral families. The experimental results showed that ORF131b (gp17) is a novel ssDNA-binding protein without a canonical ssDNA-binding domain. A few positively charged amino acids form a U-shaped substrate channel on the gp17 dimer, which is crucial for the ssDNA-binding activity of gp17. The intrinsically disordered C-terminal part of gp17 was shown to be involved in the interaction with gp18. Earlier predictions classified gp18 as a helicase, but in this study it showed ssDNA-binding activity. gp19 was shown to have 5' to 3' ssDNA nuclease activity in addition to the previously demonstrated endonuclease activity, and a weak interaction between gp18 and gp19 was also detected. The functional characterization of the whole operon, together with the strand-displacement replication mode previously proposed for SIRV2, strongly points to a role for the operon in genome maturation and/or recombination of viral DNA during replication and repair.

Two transcription regulators from Sulfolobus solfataricus P2, sso2474 and sso10340, were differentially expressed upon SIRV2 infection. A method similar to, but simpler than, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) was used in this study to gain insight into the genetic composition of the regulons of the two proteins in vivo. After mapping the sequence data against the genomes of Sulfolobus solfataricus P2 and SIRV2, the protein sso2474 was shown to have a high affinity for the viral genome via an unknown mechanism, whereas sso10340 or its interaction partner preferred to bind and regulate host genes at several sites in the genome. A total of 27 enriched DNA fragments recovered from the sso10340 complex were selected as possible binding targets in the host genome and analysed further by EMSA (electrophoretic mobility shift assay) and footprinting analysis. A palindromic motif was defined on the basis of the enriched sequences. Most of the genes associated with this motif were involved in metabolism and amino acid transport. The genome-wide DNA-binding profiles of the two proteins reflect two different types of regulon and help to expand knowledge of transcription regulation in relation to virus infection in Sulfolobus.

List of the Publications:

Article I

Guo, Y., Kragelund, B., White, M., and Peng, X. Single-strand DNA binding, annealing and nuclease activities encoded by a conserved archaeal viral gene cluster. Submitted to Nucleic Acids Research.

Article II

Guo, Y., and Peng, X. Genome-wide binding profile of two transcription regulators from Sulfolobus solfataricus. In prep.

Article III

Deng, L., He, F., Bhoobalan-Chitty, Y., Martinez-Alvarez, L., Guo, Y., and Peng, X. (2014) Unveiling cell surface and type IV secretion proteins responsible for archaeal rudivirus entry. J Virol 88: 10264-10268.

Abbreviations

ABV, Acidianus bottle-shaped virus

ACV, Aeropyrum coil-shaped virus

AFV1, Acidianus filamentous virus 1

APBV1, Aeropyrum pernix bacilliform virus 1

ARV1, Acidianus rod-shaped virus 1

ASV1, Acidianus spindle-shaped virus 1

ATP, Adenosine Triphosphate

ATV, Acidianus two-tailed virus

Ala, Alanine

Amp, Ampicillin

bp, Base pair

BSA, Bovine serum albumin

Cam, Chloramphenicol

CRISPR, Clustered Regularly Interspaced Short Palindromic Repeats

dsDNA, Double-stranded DNA

DTT, Dithiothreitol

EB, Elution Buffer

E. coli, Escherichia coli

EDTA, Ethylenediaminetetraacetic acid

EMSA, Electrophoretic mobility shift assay

GST, Glutathione S-transferase

Hjr, Holliday junction resolvases

ICTV, International Committee on Taxonomy of Viruses

IPTG, Isopropyl-beta-D-thiogalactopyranoside

ITR, Inverted terminal repeats

Kan, Kanamycin

kDa, Kilodalton

LB medium, Lysogeny Broth medium

Ni-NTA, Ni-nitrilotriacetic acid

MALDI-TOF, Matrix-assisted laser desorption/ionization-time of flight

MCP, Major capsid protein

M.O.I, Multiplicity of infection

mM, Millimolar

OD, Optical density

ORF, Open Reading Frame

PAV, Pyrococcus abyssi virus

PAGE, Polyacrylamide gel electrophoresis

PBS, Phosphate Buffered Saline

PCNA, Proliferating cell nuclear antigen

PCR, Polymerase Chain Reaction

PDB, Protein Data Bank

PMSF, Phenylmethylsulfonyl fluoride

PSV1, Pyrobaculum spherical virus 1

Pfu, Pyrococcus furiosus

RCR, Rolling-circle replication

rRNA, Ribosomal RNA

SIFV, Sulfolobus islandicus filamentous virus

SIRV1/2, Sulfolobus islandicus rod-shaped virus 1/2

SNDV, Sulfolobus neozealandicus droplet-shaped virus

SMRV1, Sulfolobales Mexican rudivirus 1

SRV, Stygiolobus rod-shaped virus

SSU, Small-subunit

SSV, Sulfolobus spindle-shaped virus

SSVK1, Sulfolobus spindle-shaped virus K1, Kamchatka

SSVrh, Sulfolobus spindle-shaped virus RH, Yellowstone

STIV1/2, Sulfolobus turreted icosahedral virus 1/2

STSV1, Sulfolobus tengchongensis spindle-shaped virus 1

TEMED, N, N, N', N'-tetramethylethylenediamine

TTSV1, Thermoproteus tenax spherical virus 1

TTV1, Thermoproteus tenax virus 1

VAP, Virus-associated pyramid

WT, Wild-type

Thesis Objective

The main objective of this PhD project is the functional characterization of a conserved archaeal viral gene cluster of SIRV2, comprising ORF131b (gp17), ORF436 (gp18) and ORF207 (gp19), in order to investigate the possible roles of these genes in the viral life cycle. In addition, the genome-wide regulation by two Sulfolobus solfataricus transcription regulators upon SIRV2 infection was studied to gain a better understanding of the regulatory network between virus and host cells.

Introduction

1 Archaea

Evolution is a process through which the genetic composition of a population changes over generations, and it seems to progress in a quantized way, from one level or domain of organization ultimately rising to a more complex one. In the early to mid 20th century, microbiologists tried to classify microorganisms based on the structure of their cell walls, their shapes and the substances they consume. Five decades ago, Zuckerkandl and Pauling argued that it is at the level of molecules (particularly molecular sequences) that one really becomes privy to the workings of the evolutionary process. Comparative analysis of molecular sequences then started to become a powerful approach for determining evolutionary relationships (Zuckerkandl and Pauling, 1965).

Ribosomal RNA was chosen as a candidate molecule for detecting relatedness among distant species because it is broadly distributed, its sequence changes slowly and it is a component of self-replicating systems (Zablen et al., 1975). In 1977, Woese and Fox digested the 16S (18S) ribosomal RNA of various organisms with T1 RNase and subjected the products to two-dimensional electrophoretic separation, producing oligonucleotide fingerprints from which the relationships among living systems could be inferred. They found that many of the prokaryotes once classified as bacteria form a group of their own, which was later recognized as a third domain, Archaea, a name derived from the ancient Greek word for ancient or primitive (Woese and Fox, 1977).
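The fingerprinting idea can be illustrated with a minimal Python sketch. It assumes only that RNase T1 cleaves single-stranded RNA after guanosine residues. The function names, the six-nucleotide length cutoff and the Dice-style overlap score are illustrative choices and are not taken from the cited papers, and the sequences are invented toy strings rather than real rRNA.

```python
from collections import Counter


def t1_digest(rna: str) -> list[str]:
    """Split an RNA string after every G, mimicking cleavage by RNase T1."""
    fragments = []
    current = ""
    for base in rna.upper():
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:
        # Trailing fragment that does not end in G
        fragments.append(current)
    return fragments


def oligo_catalog(rna: str, min_len: int = 6) -> Counter:
    """Count the digestion products of at least min_len nucleotides."""
    return Counter(f for f in t1_digest(rna) if len(f) >= min_len)


def catalog_similarity(rna_a: str, rna_b: str, min_len: int = 6) -> float:
    """Dice-style overlap of two catalogs, weighted by oligonucleotide length.

    Illustrative score only (an assumption of this sketch).
    """
    cat_a = oligo_catalog(rna_a, min_len)
    cat_b = oligo_catalog(rna_b, min_len)
    shared = cat_a & cat_b  # multiset intersection of the two catalogs

    def nucleotides(cat: Counter) -> int:
        return sum(len(oligo) * n for oligo, n in cat.items())

    total = nucleotides(cat_a) + nucleotides(cat_b)
    if total == 0:
        return 0.0
    return 2 * nucleotides(shared) / total


if __name__ == "__main__":
    print(t1_digest("AUGCCGAUUG"))  # ['AUG', 'CCG', 'AUUG']
    print(catalog_similarity("ACUUAAGCCAUUCAG", "ACUUAAGUUUUUUG"))
```

Short fragments are ignored in the score because they recur by chance in almost any sequence, so only the longer oligonucleotides carry useful information about relatedness.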

The discovery of this new microbial kingdom eventually led to the classification of all known life into three major domains, Eucarya (all eukaryotes), Archaea and Bacteria, which was a significant breakthrough in the history of biology (Forterre et al., 2002). By the early 1980s, it had already been recognized that Thermoplasma and Halobacterium have a close evolutionary affinity with the methanogens, and all of these were representatives of the known archaea (Woese et al., 1990). For a long time, archaea were seen as extremophiles that exist only in extreme habitats, such as hot springs and salt lakes, with high salt concentration (Oren, 2002), low pH (Johnson et al., 2008) or high temperature (Stetter, 2006). Towards the end of the last century, as new habitats were studied and more organisms were discovered, archaea were found in a wide variety of non-extreme environments, including marine waters (DeLong, 1992), freshwater sediments (Schleper et al., 1997) and all kinds of soil environments (Bintrim et al., 1997; Oline et al., 2006).

They are globally distributed in nature and are now regarded as common microbes in the environment. Because archaea can survive in harsh conditions, they are a source of enzymes that resist heat and/or acidity, which makes them valuable for industry (Breithaupt, 2001). The most familiar application of an archaeal enzyme is the thermostable Pfu DNA polymerase from Pyrococcus furiosus, which allows accurate polymerase chain reaction (PCR) and is widely used in biological research. Many hidden treasures in archaea are still waiting to be discovered by old and new archaea enthusiasts.

1.1 Classification of Archaea

Following the pioneering work of Carl Woese, the small-subunit ribosomal RNA (SSU rRNA) is widely used in molecular phylogenetic studies to investigate the relationships between organisms, rather like classification systems that try to group archaea on the basis of shared structural features and common ancestors (Gevers et al., 2006). Initially, the archaea were divided into two distinct groups: the methanogens and their relatives were named Euryarchaeota, and the sulfur-dependent thermoacidophiles were categorized as Crenarchaeota (Woese et al., 1990). Most of the cultivable and well-studied archaeal species belong to these two main phyla. In 2002, the peculiar species Nanoarchaeum equitans was discovered. It has a spherical cell shape and harbors the smallest archaeal genome, and it was assigned to a phylum of its own, Nanoarchaeota (Hohn et al., 2002). Another small group of thermophilic archaeal species, showing an apparent affinity to the Crenarchaeota but also sharing features with the Euryarchaeota, was identified as Korarchaeota (Elkins et al., 2008; Anderson et al., 2008). A fifth group, Thaumarchaeota, has been established in recent years (Guy and Ettema, 2011).

Euryarchaeota, one of the major phyla of Archaea, encompasses the most diverse phenotypes. The cultivated Euryarchaeota are subdivided into eight groups (Thermococci, Methanopyri, Methanococci, Methanobacteria, Thermoplasmata, Archaeoglobi, Halobacteria and Methanomicrobia). Methanogenesis was the main invention that arose in the euryarchaeal phylum, which also includes halophiles, some thermoacidophiles and some hyperthermophiles (Gribaldo and Brochier-Armanet, 2006). In contrast, most cultivable Crenarchaeota are thermophilic or hyperthermophilic species and show very limited phenotypic diversity (Forterre et al., 2002). However, since a marine archaeal group was discovered and identified as Crenarchaeota from environmental rRNA sequences, it has been thought that crenarchaea may be extremely abundant in the marine environment and could be a significant component of deep-sea metabolism (Fuhrman et al., 1992).

The orders Thermoproteales, Caldisphaerales, Desulfurococcales and Sulfolobales represent the four lineages of the crenarchaeal branch of Archaea. Thermoproteales are rod-shaped extreme thermophiles or hyperthermophiles. They are the only organisms known to lack canonical SSB proteins, possessing instead the protein ThermoDBP, which binds specifically to ssDNA (Paytubi et al., 2012). Sulfolobus species are relatively easy to cultivate owing to their aerobic lifestyle and relatively short doubling times. As the only genetically manipulable representatives of the Crenarchaeota, they have developed into model organisms for studying DNA repair, replication, transcription, chromosome integration, RNA processing, cell division, virus-host interactions and many other aspects of cell biology (Bernander, 2007).

Figure 1. Small-subunit ribosomal RNA-based phylogenetic tree. The thick lines represent hyperthermophiles (modified from Stetter, 2006).

1.2 Sulfolobus

Since T. Brock first described Sulfolobus acidocaldarius, isolated from a hot spring in Yellowstone National Park, in 1972, this new group of sulfur-oxidizing organisms has been of both evolutionary and geochemical interest (Brock et al., 1972). Sulfolobus species have been isolated from a wide variety of acidic thermal areas (in the USA, Italy, Iceland, Russia and elsewhere), with optimal growth at pH 2-3 and temperatures of 75-80 °C, which makes them both acidophiles and thermophiles. Most strains isolated so far can grow heterotrophically as well as autotrophically. Because Sulfolobus species have a wide geographic distribution, they are normally named after the location where they were first isolated: Sulfolobus islandicus strains were isolated in Iceland (`Island` is German for `Iceland`) (Zillig et al., 1994), Sulfolobus tengchongensis in Teng Chong, China (Xiang et al., 2003) and Sulfolobus solfataricus in volcanic hot springs at Pisciarelli Solfatara (Zillig et al., 1980). Among these species, S. solfataricus is one of the best characterized and most commonly used in laboratories.

Figure 2. Electron micrograph of a thin section of Sulfolobus solfataricus strain DSM 1617 (from Zillig et al., 1980).

The Sulfolobus strains DSM 1616 and DSM 1617 (Fig. 2) were first named Sulfolobus solfataricus by Zillig because they have a GC content similar to that of S. acidocaldarius but RNA polymerases of significantly different molecular weights (Zillig et al., 1980). They were later renamed S. solfataricus P1 and P2 and have become the main model strains, especially after the genome sequence of S. solfataricus P2 was published by She and the transcriptome map was drawn by Wurtzel, providing rich and detailed information for further work on the DNA replication mechanism, the cell cycle, transcription and the large number of unknown genes (She et al., 2001; Wurtzel et al., 2010). A wealth of data, standardized methods, a maturing genetic system, ease of cultivation in the laboratory and its use as a host for studying virus-host interactions have all contributed to establishing Sulfolobus solfataricus as a model organism (Albers et al., 2009; Worthington et al., 2003; Deng et al., 2009).

In cultures of the widely used strain S. solfataricus P2 (DSM 1617), only a small fraction of cells is susceptible to infection by Sulfolobus islandicus rod-shaped virus 2 (SIRV2) or Sulfolobus turreted icosahedral virus (STIV) (Okutan et al., 2013; Ortmann et al., 2008). In this work, the highly susceptible mutant strain S. solfataricus P2 5E.6 was selected as the host for SIRV2 infection studies. The strain carries a deletion of CRISPR (clustered regularly interspaced short palindromic repeats) clusters A-D but, upon SIRV2 infection, shows a phenotype similar to that of the natural host strain S. islandicus LAL14/1 with respect to chromosome degradation and the virus life cycle (Okutan et al., 2013; Bize et al., 2009).

2 Archaeal Viruses

Alongside archaeal communities, viruses thrive in extreme conditions and play an important role in ecosystem dynamics. Like members of the other domains of life, archaea are infected by viruses. The first thermophilic viruses isolated from the archaeal domain resembled bacteriophages, with a head-tail morphotype (Martin et al., 1984), and the euryarchaeal viruses discovered subsequently were also similar to bacterial viruses. In contrast, as more habitats were studied, viruses were found that infect members of the kingdom Crenarchaeota (Sulfolobus, Acidianus, Pyrobaculum and Thermoproteus) and exhibit highly diverse morphotypes and genomic properties. The extraordinary shapes of some of these viruses had never been observed before (Prangishvili, 2003). Most crenarchaeal viruses have been isolated from hot terrestrial habitats and, like their hosts, are adapted to extreme environments.

Because of their abundance and unique biology, archaeal viruses pose several challenges. The low level of sequence similarity to public databases, novel biochemical mechanisms and difficulties in cultivating virus and host all need to be addressed (Prangishvili and Garrett, 2005). After relatively intensive study of archaeal viruses over the last decade, about 100 viral species have been sequenced, and their genomic properties and relationships with host cells have been described. Only two of these viruses have single-stranded DNA genomes (Pietila et al., 2010; Mochizuki et al., 2012); all the others have double-stranded DNA genomes. According to the International Committee on Taxonomy of Viruses (ICTV), bacterial viruses comprise nine morphotypes belonging to ten families, whereas archaeal viruses exhibit 16 different morphotypes and are classified into 15 families (Ackermann and Prangishvili, 2012) (Fig. 3). Although archaeal viruses are few in number compared with the viruses infecting bacteria, their diverse and unique morphotypes have provided new insights into the viral world.

Figure 3. Virion morphotypes of prokaryotic viruses. Names of viral genera or families based on the International Committee on Taxonomy of Viruses (ICTV) are indicated below the schematic virus particles. If an archaeal virus has not been assigned to any genus or family, individual virus names are given. The virions are not drawn to scale (from Pietila et al., 2014).

2.1 Crenarchaeal virion morphotypes and their genomes

Thermophilic viruses infecting the crenarchaea have been classified into ten families based

on their morphology, eight families were approved by ICTV and the remaining two are

waiting for approval (Pietila et al., 2014). The ten crenarchaeal virus families are: One tail

spindle –shaped Fuselloviridae (SSV1-7, SSVK1, SSVrh, ASV1); two tail spindle–shaped

Bicaudaviridae (ATV); Bottle-shaped Ampullaviridae (ABV); Droplet-shaped Guttaviridae

(SNDV); Linear filamentous Lipothrixviridae (AFV1-9,SIFV,TTV1); Linear rod-shaped

rudiviridae (SIRV1-2,SRV,ARV1); Spherical Globulaviridae (PSV1,TTSV1); Bacilliform

Clavaviridae (APBV1), Tailless icosahedral `Turriviridae` (STIV,STIV2) and Coil-shaped

Spiravridae (ACV), the last two families are waiting for the approval. Some other viruses

such as Sulfolobus tengchongensis spindle-shaped virus (STSV1) and Pyrococcus abyssi virus
(PAV1) are awaiting assignment to a viral family. Some well-studied and intriguing viruses
are described in more detail below:

Sulfolobus spindle-shaped virus 1 (SSV1). The Sulfolobus spindle-shaped viruses (SSVs)
of the family Fuselloviridae were the first family of archaeal viruses to be discovered. Most
SSVs (except for SSV6 and ASV1) are spindle-shaped, 100 × 60 nm in size and carry tail
structures at one pole (Fig. 4).

Figure 4. Electron micrographs of virus particles. (A) Cell apparently extruding virus.
(B) Free virus and virus particles attached to cellular material. Two large particles are
arrowed. (C) Purified free virus particles exhibiting tail structures. Three bullet-shaped
particles are seen on the right. (D) Thin sections of cells sampled 6 h after u.v. irradiation
showing three cell-cell contacts. The bars represent 0.2 µm (from Martin et al., 1984).

SSV1 was isolated from UV-induced growing cultures of Sulfolobus shibatae (strain B12).
It contains a 15.5-kb positively supercoiled, circular, double-stranded DNA genome with a
GC content of 39.7%, resembling that of the host DNA (Palm et al., 1991). During infection,
SSV1 is stably carried by its lysogenic host S. shibatae and is found intracellularly either
in a covalently closed circular (plasmid) form or site-specifically integrated within an
arginine tRNA gene in the host chromosome (Yeats et al., 1982). The transcription pattern of
the SSV1 genome is relatively simple: some genes are significantly upregulated by UV
irradiation, and the genes can be clearly divided into early, late and UV-inducible
categories (Reiter et al., 1987; Frols et al., 2007).

Acidianus two-tailed virus (ATV). This archaeal virus was discovered in an acidic hot
spring (85–93 °C; pH 1.5) at Pozzuoli, Italy. ATV is the sole member of the virus family
Bicaudaviridae. The virion has a lemon-shaped central body when it exits the host cell
and then develops elongated tails protruding from both pointed ends, specifically at

temperatures above 75 °C, close to the temperature of the natural habitat of the host (Fig. 5).
The circular dsDNA genome comprises 62,730 bp and encodes 72 predicted proteins, 11 of
which are structural proteins with molecular masses in the range of 12 to 90 kDa. The
unique host-independent, extracellular tail development might be associated with an
88.7-kDa ATV virion protein, P800, which is rich in coiled-coil motifs and can generate
structures that resemble intermediate filaments (Haring et al., 2005c; Prangishvili et al.,
2006c). ATV was the first known virus from hot, acidic habitats to cause lysis of its host
cell, whereas most archaeal viruses maintain a stable relationship with their host.

Figure 5. Electron micrographs of Acidianus convivator and different forms of the Acidianus
two-tailed virus. a, Virions in an enriched sample taken from acidic hot springs in Pozzuoli,
Italy (pH 1.5, 85–93 °C). b, Extrusion of lemon-shaped virions from an ATV-infected A.
convivator cell. c, Virions in a growing culture of ATV-infected A. convivator, 2 days after
infection. d, Cultured virions after purification and incubation at 75 °C for 0, 2, 5, 6 and
7 days (panels from right to left, respectively) (from Haring et al., 2005c).

Acidianus bottle-shaped virus (ABV). The enveloped virion of ABV has a complex form
resembling a bottle (230 nm long, 4–75 nm wide; Fig. 6C). Its morphology is so unusual
that the virus has been assigned to a new family, Ampullaviridae. The narrow end of the
'bottle' is likely to be involved in cellular adsorption and in channeling viral DNA into the
host cell (Fig. 6A). The broad end carries 20 thin filaments, which are inserted into a disk
and interconnected at the base; the function of these filaments remains unclear but is
intriguing (Fig. 6B) (Haring et al., 2005a).

ABV was isolated from the same hot spring in Pozzuoli, Italy, as ATV. It infects strains of
the hyperthermophilic archaeal genus Acidianus and contains linear double-stranded DNA.
The viral genome has a length of 23,814 bp, with a G+C content of

35%, and a 590-bp inverted terminal repeat. It encodes 57 predicted ORFs and a putative
RNA molecule that was predicted to have notable secondary-structure similarity to a
bacteriophage RNA molecule implicated in DNA packaging. Moreover, in contrast to other
crenarchaeal viruses, ABV encodes a family B DNA polymerase (Peng et al., 2007).

Fig. 6. Electron micrographs of particles of ABV after negative staining with 3% uranyl acetate. (A) ABV

particles adsorbed with their pointed end toward a membrane vesicle of the host “A. convivator.” (B) ABV

particles attached to each other with their thin filaments at the broad end. Bars, 100 nm. (C) A scheme of the

structure of an ABV virion (from Haring et al., 2005a).

Acidianus filamentous virus 1 (AFV1). AFV1 is a lipothrixvirus that infects members of the
crenarchaeal genus Acidianus in a stable carrier state. It was observed in an enrichment
culture from a hot spring at 80 °C in the Crater Hills region of Yellowstone National Park
(Rachel et al., 2002). The AFV1 virion is composed of a protein core covered with a lipid
envelope and contains at least five different proteins with molecular masses in the range of
23-130 kDa. The 20.8-kb linear genome contains 40 ORFs, and AFV1 particles measure
900 × 24 nm.

AFV1 exhibits claw-like terminal structures, connected to the virion body by appendages at
both ends (Fig. 7A). These unusual termini apparently have a special function in adsorption:
they were observed to attach to the host pili, and the contact seems rather strong (Fig. 7B)
(Bettstetter et al., 2003). Crystal structures of the two major coat proteins, AFV1-132 and
AFV1-140, have been solved. Both carry a novel four-helix-bundle fold, and AFV1-140 also
carries an extra C-terminal domain that possibly interacts with the virion envelope (Goulet
et al., 2009b). Recently, a new replication model was proposed by analyzing the replicative
intermediates on

two-dimensional (2D) agarose gels, revealing that genome replication starts with D-loop
formation, proceeds via strand displacement and terminates by recombination. This process
resembles T4 DNA replication to some degree, but further studies are needed to support the
proposed model (Pina et al., 2014).

Fig 7. (A) Electron micrographs of particles of AFV1 with tail structures in their native
conformation. (B) Electron micrographs of particles of AFV1 adsorbed to pili of host A.
hospitalis CH10/1, stained with 3% uranyl acetate. Black arrows indicate pili; white arrows
show knots which are putative viral terminal structures separated from the virus body. Bars,
100 nm (from Bettstetter et al., 2003).

2.2 Sulfolobus islandicus rod-shaped virus 2 (SIRV2)

Rudiviridae and Lipothrixviridae are families of linear viruses that are ubiquitous in
high-temperature (>75 °C), low-pH (pH <3) terrestrial geothermal environments.
Comparative genomic analysis suggests a common evolutionary ancestry of the rudiviruses
and lipothrixviruses, based on the conservation of orthologous core genes and the similarity
of the major virion coat proteins (Prangishvili et al., 2006a). Together with Sulfolobus
islandicus rod-shaped virus 1 (SIRV1), Stygiolobus rod-shaped virus (SRV) (Vestergaard et
al., 2008b), Acidianus rod-shaped virus 1 (ARV1) (Vestergaard et al., 2005) and
Sulfolobales Mexican rudivirus 1 (SMRV1) (Servin-Garciduenas et al., 2013), SIRV2 is
classified as a rudivirus on the basis of its rod-shaped morphology, gene architecture and
sequence.

SIRV2 is one of the most extensively studied archaeal viruses and has developed into an
archaeal model virus thanks to structural, genomic and transcriptional studies. SIRV2 was
first isolated from the colony-cloned S. islandicus strain HVE 10/2, which originated from
solfataric fields in Hveragerdi, Iceland. This non-enveloped virus is a stiff rod 23 nm wide
and 900 nm long. As shown in the electron micrographs in Figure 8, the central cavity is
plugged at both ends by stoppers of approximately 50 nm, and both ends are decorated with
three short tail fibers. Sensitivity of the SIRV2 genome to BAL31, but not to λ exonuclease,
indicated the existence of covalently closed hairpin ends (Prangishvili et al., 1999).

The linear, double-stranded DNA genome of SIRV2 is 35,502 bp long, carries inverted
terminal repeats (ITRs) of 1628 bp at each end and has a low G+C content of 25%. The
virion body is a superhelix formed by genomic DNA and multiple copies of the highly
glycosylated 20-kDa capsid protein. The SIRV2 genome encodes 54 ORFs, 44 of which have
homologs in SIRV1, and approximately half of the encoded proteins have been characterized
by sequence, structural and biochemical analysis, which is the highest proportion of
functionally characterized genes among crenarchaeal viruses. Four SIRV2 proteins have been
identified as virion structural proteins. The major capsid protein (MCP), P134 (gp26), shares
a common fold with the MCPs of the lipothrixvirus AFV (Acidianus filamentous virus) (Goulet
et al., 2009a). The largest viral protein, P1070 (gp38), has a molecular weight of 105 kDa,
possesses a coiled-coil domain and is a component of the three tail fibers (Steinmetz et al.,
2008). In addition, the two structural proteins ORF488 (gp33) and ORF564 (gp39) are found
in SIRV2 virions, although in low amounts (Vestergaard et al., 2008b). The crystal structure
of SIRV1 P119, together with the experimental detection of nicking and joining activities,
suggests that this protein could be involved in the initiation of SIRV1 genome replication
(Oke et al., 2011). P121 has high sequence similarity to archaeal Holliday junction
resolvases (Hjrs), and its Hjr activity has been examined experimentally (Birkenbihl et al.,
2001). Further hypothetical proteins are predicted, from sequence analysis and experimental
evidence, to be involved in transcription, replication and nucleic acid metabolism. The
SIRV2 life cycle is discussed in more detail below.
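
The genome statistics quoted in this section (G+C content and inverted terminal repeats) are simple sequence properties, and a short sketch may help readers who want to check them on a sequence of their own. The Python code below is only an illustration and is not the analysis used in the cited studies. The function names and the toy sequence are hypothetical, and the ITR check assumes a perfect inverted repeat, which real viral termini need not be.

```python
# Illustrative sketch only: helper names and the toy sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Return the G+C fraction (0.0 to 1.0) of a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_inverted_terminal_repeat(seq: str, length: int) -> bool:
    """Check whether the first `length` bases equal the reverse complement
    of the last `length` bases, i.e. a perfect inverted terminal repeat."""
    if length <= 0 or 2 * length > len(seq):
        raise ValueError("repeat length does not fit the sequence")
    return seq[:length].upper() == reverse_complement(seq[-length:])


if __name__ == "__main__":
    # Made-up 24-nt sequence: 8-nt left terminus, AT-rich spacer, and the
    # reverse complement of the left terminus as the right terminus.
    left = "AATTGCAT"
    toy_genome = left + "TTTTTTTT" + reverse_complement(left)
    print(f"G+C content: {gc_content(toy_genome):.1%}")
    print("Perfect 8-nt ITR:", has_inverted_terminal_repeat(toy_genome, 8))
```

For a real genome, the sequence would be read from a file, and the repeat comparison would normally tolerate mismatches rather than require exact identity.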

Figure 8. Structure of SIRV2 particles. (A) TEM of negatively stained SIRV2 particles.
Inset shows a high-resolution image of the end structure of the capsid. The scale bar is
500 nm. (B) Schematic depiction of SIRV2 particles (from Steinmetz et al., 2008).

2.3 SIRV2 life cycle

2.3.1 Attachment and Entry

Archaeal viruses display unusual and diverse morphotypes, genome sequences and protein
structures (Krupovic et al., 2012). Recent research has revealed that the interactions
between archaeal viruses and their hosts also seem to be unique (Bize et al., 2009).
However, compared with the wealth of data available on bacterial and eukaryotic systems,
studies on archaeal viruses have mainly consisted of biochemical and genetic
characterization of their virions, while the attachment and entry processes remain elusive.

Virus infection is initiated by entry into the host cell, and the first step of the entry
process is recognition of receptors present on the host cell surface through a specific
interaction. The virus must then transport its genetic information to the cell compartment
where the genome is replicated (Poranen et al., 2002). The vast majority of known viruses
have a tail structure attached to one or both ends of the nucleocapsid, which facilitates the
attachment of virions to the host membrane. In the Lipothrixviridae family, each virion is
tapered and carries a specific terminal structure. These structures can be claws
(AFV1), T-bars (AFV9), mop-like structures (SIFV), three (AFV3) or six (SFV) short

filaments or tips resembling bottle brushes (AFV2), which are implicated in cellular

adsorption (Bettstetter et al., 2003;Bize et al., 2008;Haring et al., 2005b).

Each terminus of the SIRV2 virion carries three tail fibers composed of the minor
structural protein P1070. Transmission electron microscopy and whole-cell electron
cryotomography (cryo-ET) showed that these termini bind to the tips of the pilus-like
filaments that are abundant on the surface of host cells. Figure 9 shows the interaction
between the SIRV2 terminal fibers and purified host cellular filaments. Virus adsorption
was very fast and irreversible, and infected cells were no longer able to adsorb further
virus efficiently (Quemin et al., 2013). Many bacterial viruses, such as the Ff inoviruses,
use filamentous cellular appendages as primary receptors; retraction of the host pilus then
brings the virion close to the host cell surface, where it can bind to a secondary receptor
(Rakonjac et al., 2011). Although no retracting pili have been identified in archaea,
secondary receptors that adsorb the virus particles are expected to exist on the host cell
surface. Indeed, a Sulfolobus mutant strain lacking the gene clusters sso3138-sso3141 and
sso2386-sso2387 was resistant to SIRV2: unlike the wild-type strain, the mutant showed no
growth retardation when diluted and infected with SIRV2 at the same multiplicity of
infection (MOI). The proteins encoded by the first cluster, sso3138 to sso3141, were
predicted to possess transmembrane helices and to be located extracellularly, and probably
act as a receptor for SIRV2. The proteins encoded by the other gene cluster may be involved
in secretion of the receptor components. In addition, genetic complementation experiments
confirmed the involvement of the mutations in virus resistance and further support the
conclusion that these proteins are responsible for SIRV2 entry (Deng et al., 2014).

Figure 9. Transmission electron micrographs of SIRV2 interaction with purified cellular
filaments. The filaments were removed from S. islandicus LAL14/1 cells (from Quemin et
al., 2013).

However, how the virus traverses the 12.5-µm-long filament to reach the cell body, how
the coat protein is removed, and how the two identified gene clusters relate to the
structure of the pili are still poorly understood and need further study.

2.3.2 SIRV2 Gene Transcription and Regulation

After SIRV2 enters the cell and removes its coat protein, it is likely to recruit the host
RNA polymerase complex to transcribe the SIRV2 genes, as no viral gene has been shown to
encode an RNA polymerase. Viral transcription is generally temporally regulated, and genes
can be classified as early, middle or late according to when they are transcribed. These
classes encode, in chronological order, proteins involved in regulation, translation and
replication, and the structural proteins needed for assembly.

In bacterial and eukaryal virus-host systems, the modification of cellular transcription
upon virus infection is well studied, for example in bacteriophage T7. In addition to being
transcribed by the host E. coli RNA polymerase, bacteriophage T7 encodes its own RNA
polymerase, a single-subunit enzyme of 99 kDa. The DNA genome of T7 is transcribed entirely
from left to right, first in the early region by E. coli RNA polymerase and then from a
portion of the early region through the entire late region by newly made T7 RNA polymerase
(Dunn and Studier, 1983; Steitz, 2004). In the archaeal domain, by contrast, the mechanisms
and control of viral gene expression, as well as host gene regulation upon virus infection,
have not yet been elucidated. To date, several archaeal viruses have been developed as
models for studying the molecular details of the archaeal viral life cycle and host
responses, e.g. the temperate Sulfolobus spindle-shaped viruses (SSV) (Frols et al., 2007)
and the lytic Sulfolobus turreted icosahedral virus (STIV) (Ortmann et al., 2008; Maaty et
al., 2012).

Reminiscent of the life cycles of bacteriophages and eukaryotic viruses, SSV1 exhibits
tight temporal regulation of its own transcription after UV treatment, initiating from a
small UV-specific gene and then continuing with three distinct sets of genes representing
immediate-early, early and late transcripts. Very few host genes, however, were regulated
upon virus infection (Frols et al., 2007). The microarray study of transcription of the
lytic virus STIV told the opposite story. STIV transcription did not show typical temporal
regulation. In the virus life cycle, transcription signals of nine early viral

genes were detected at 8 h, after which all the remaining genes were transcribed.
Surprisingly, a total of 177 host genes were found to be differentially expressed during
the infection, of which two thirds were up-regulated and one third down-regulated
(Ortmann et al., 2008).

SIRV2 infects Sulfolobus spp. but does not integrate into the host chromosome. It was
originally thought to be a lysogenic virus existing in a stable carrier state in the host,
consistent with the uniform transcription pattern revealed in an earlier study (Kessler et
al., 2004). During characterization of the unusual release mechanism of SIRV2, however, it
was demonstrated to be a lytic virus (Bize et al., 2009). An independent microarray study
of infected S. solfataricus 5E6 cells and a transcriptomic analysis of infected S.
islandicus LAL14/1 cells showed that SIRV2 transcription starts from the terminal genes
located at both ends of the linear genome (Quax et al., 2013; Okutan et al., 2013).
Although SIRV2 transcription is not as tightly regulated chronologically as that of SSV1,
gene expression showed a temporal pattern. Some early genes, such as the two identical
genes ORF83a and ORF83b and ORF119C, which encodes the viral replication initiation
protein (Oke et al., 2011), were transcribed within 15-30 min and highly expressed at 1 h,
whereas genes for structural proteins such as the major capsid protein and the
virus-associated pyramid protein were most abundantly expressed at the late stage of the
infection cycle.

Despite the lack of strong temporal regulation of viral transcription, the host response
to SIRV2 infection was significant. More than one third of the host genes were
differentially regulated, with similar numbers of down- and upregulated genes. Most of the
host genes that are strongly activated upon infection are assumed to function in defense
against viruses, as well as in cellular collapse, energy metabolism and membrane transport,
which may suggest that the virus controls its replication phases less through its own
differential gene expression than through co-opted host genes.

2.3.3 SIRV2 Replication

The SIRV2 genome is a linear duplex with covalently closed hairpin termini and long ITRs
at both ends (Blum et al., 2001). Such terminal structures are normally involved in
replication initiation, in parallel with those of the Poxviridae and other large
cytoplasmic eukaryotic viruses. Both the DNA sequence and the structure within the termini
are important for template recognition; the template is nicked by a Rep initiation protein,
exposing a 3'-OH group that serves as a primer for DNA synthesis (Du and Traktman, 1996).

Sequence and structure analysis showed that ORF119 of SIRV2 is a member of the
replication initiator (Rep) family and shares a conserved key active-site motif with
rolling-circle replication (RCR) Rep proteins (Vega-Rocha et al., 2007). It forms a dimer
and sequence-specifically nicks one strand of the SIRV2 terminal hairpin, but only when the
substrate is in single-stranded form. Its joining activity, which ligates the fragments by
a strand-transfer (flip-flop) mechanism, was also confirmed (Oke et al., 2011). On the
basis of these features, along with the detection of head-to-head concatemers among the
replicative intermediates (Peng et al., 2001), a mechanism related to, but less restricted
than, rolling-circle replication (RCR) was proposed. The Rep protein recognizes and nicks
one ori site of the genome. One subunit of the Rep protein then becomes covalently linked
to the newly generated ori site, while the other subunit ligates the two old fragments,
forming a contiguous DNA circle. Displacement replication is then used to replicate the
rest of the genome, generating a double-stranded DNA circle with a nicked hairpin terminus
to which the Rep protein is attached. The next steps of replication are similar to those
of poxvirus replication. At the junctions between genome monomers, opposing inverted
terminal repeats can be extruded to form hairpin four-way junctions. A Holliday junction
resolving enzyme (Hjr) is therefore thought to resolve the concatemers, producing monomer
copies with linear hairpin ends (Culyba et al., 2006; Oke et al., 2011).

Holliday junction resolving enzymes are found in all domains of life, for example RuvC in
Bacteria, human GEN1 in Eukarya (Declais and Lilley, 2008), and two different Holliday
junction resolving enzymes (Hjr and Hje) in the crenarchaeon Sulfolobus solfataricus
(Kvaratskhelia and White, 2000). As expected, SIRV2 encodes a 14-kDa Holliday junction
resolving enzyme (SIRV2 Hjr), which is conserved among rudiviruses. Unlike the
bacteriophage resolving enzymes, which cleave a variety of branched DNA structures formed
during replication, SIRV2 Hjr has a very narrow substrate range and cleaves only four-way
DNA junctions. Its cleavage pattern is also unique in that it nicks only the exchange
strand pairs (Gardner et al., 2011a). This protein is presumed to be important for
processing replicative DNA intermediates late in the infection cycle, before packaging
into newly synthesized virus particles commences.

Unlike some bacterial viruses, which encode their own DNA replication proteins, most
archaeal viruses lack DNA polymerase genes, indicating that their replication probably
relies on the host replication machinery. This view is supported by recently published
work showing that two subunits of the heterotrimeric S. solfataricus sliding clamp, the
proliferating cell nuclear antigen (SsoPCNA1 to 3), interact with several SIRV2 proteins.
PCNA is a key cofactor of DNA polymerases and recruits a range of crucial DNA metabolism
proteins during DNA replication and repair (Moldovan et al., 2007). Most of the
interacting viral proteins have no assigned function. The exception is SIRV2 Hjr, in
agreement with previous research showing that SsoPCNA stimulates Hjr enzyme activity in
S. solfataricus (Dorazi et al., 2006). Intriguingly, the products of the early transcribed
genes ORF83a/b were also shown to interact with PCNA, suggesting that they play important
roles during the replication cycle of SIRV2 (Gardner et al., 2014).

Although the functions of some replicative viral proteins have been confirmed and a
preliminary model has been proposed, many further studies are still needed to uncover and
explain the virus replication mechanism.

2.3.4 SIRV2 Release Mechanism

The final step in the viral replication cycle is the release of virus particles. In the
bacterial domain, most lytic viruses cross the cell envelope and spread to the environment
with the assistance of holins, small phage-encoded integral membrane proteins (Krupovic
and Bamford, 2008). How archaeal viruses overcome the challenging task of rupturing the
cell membrane and escaping from the host cell has attracted a lot of attention in recent
years, especially after the discovery of a unique release mechanism (Bize et al., 2009).

Among archaeal viruses, SIRV2 and STIV are the best studied with respect to host cell
interactions. Both are lytic viruses and share the same extraordinary virion egress
mechanism. SIRV2 induces degradation of the host chromosome and assembles virus particles
in the cytoplasm (Fig. 10A). In the late stages of the infection cycle, numerous prominent
virus-associated pyramids (VAPs) form on the host cell surface (Fig. 10B and C). These
structures open outward at the end of the infection cycle, allowing the mature virions to
escape through the resulting apertures (Bize et al., 2009; Brumfield et al., 2009). This
release mechanism is apparently not universal among hyperthermophilic viruses. Although
they share the release mechanism, the two viruses differ dramatically in their morphology.
It is therefore possible that the morphogenetic and egress systems evolved independently.

Figure 10. (A) Schematic representation of the major stages of the SIRV2 infection cycle in
the Sulfolobus host cell. Times after infection are indicated in hours. The gradual opening
of VAPs (at time points between 10 and 14 h) is illustrated in more detail with fragments
from the TEM of thin sections. (B) Cells 10 h after infection. (C) Thin sections in a plane
perpendicular to the cell envelope. Arrows indicate VAPs (Bize et al., 2009). (D and E)
Negative-contrast electron micrographs of isolated VAPs. (D) Side view of intact VAPs.
(E) Top view of a VAP in the open conformation (scale bars: 100 nm). (F) Thin sections
through S. acidocaldarius expressing SIRV2-P98. Arrows indicate VAPs (scale bars: 200 nm)
(from Quax et al., 2010).

To investigate the specific structural protein components of SIRV2-infected cells, three
different fractions of virus-infected cells were collected, compared and analyzed: the
total cell lysate, the membrane fraction and the cytosol fraction. The 10-kDa protein P98
of SIRV2 was found to be the only protein appearing specifically in the membrane fraction
of infected cells. It is exposed on the cell surface, where it ruptures the S-layer, and
no other viral protein is involved in the assembly of the pyramids. When SIRV2-ORF98 was
overexpressed in E. coli and S. acidocaldarius, VAPs formed with the same size and shape
as those formed in S. islandicus infected with SIRV2 (Fig. 10F) (Quax et al., 2010; Quax et
al., 2011). Sequence alignment revealed that, among archaeal viruses, homologues of
SIRV2-ORF98 are carried only by STIV and the Rudiviridae (SIRV1/2, SRV). It is also
intriguing that the VAP is a separate structural unit that can be isolated and purified,
and that the protein SIRV2-ORF98 alone is capable of self-assembling into ordered
sevenfold isosceles

triangle-shaped pyramids (Fig. 10D and E), which appear to be baseless, hollow
structures (Quax et al., 2011).

Although similar VAPs were observed upon heterologous expression in E. coli and S.
acidocaldarius, they were present only on the surface of the inner membrane, and all of
them were in the closed state. At least one specific factor that induces VAP opening must
therefore be present in the native host cells but absent from S. acidocaldarius and E. coli.

Summary of Results

Sulfolobus islandicus rod-shaped virus 2 (SIRV2), a member of the family Rudiviridae, is
a promising candidate to become a general model for detailed studies of archaeal virus
biology because it is relatively easy to cultivate in the laboratory and gives sufficient
yields. To date, several important stages of its life cycle have been characterized, such
as viral entry, the transcription pattern, genome replication and its unique egress
mechanism, which together provide a much better understanding of the largely unknown
archaeal viral world.

Even so, as for the vast majority of other archaeal viruses, whose genes show little
sequence similarity to those in public databases, the functions of many SIRV2 proteins
remain to be identified. Only one fifth of the 54 ORFs encoded by the SIRV2 genome have an
experimentally confirmed function, and knowledge of basic molecular processes such as DNA
repair, recombination and genome maturation, as well as of the interaction with the host,
is still limited.

Although it shows limited sequence similarity to entries in public gene databases, the
CRISPR-associated Cas4-like protein ORF207 (gp19), previously identified as an ssDNA
endonuclease, has attracted a lot of interest (Gardner et al., 2011b). Its gene was found
to be transcribed from a single promoter together with two upstream genes (gp17 and gp18),
generating a polycistronic transcript (Kessler et al., 2004). This gene organization
suggests related functions. Moreover, bioinformatic analysis revealed that this operon
constitutes the most conserved gene cluster in archaeal linear viruses, including
rudiviruses and filamentous viruses. This raises questions about the functions of the
entire operon and the stages of virus infection in which its products may be involved.

The solved crystal structure of the gp17 homolog encoded by SIRV1 suggested DNA-binding
activity, although no obvious structural match was found in the Protein Data Bank. DNA
substrates with different structures were tested for binding, and the results demonstrated
that ssDNA, as well as dsDNA with a single or double flap, was shifted by gp17, whereas
blunt-ended dsDNA did not form a protein-DNA complex. This indicates that gp17 is an
ssDNA-binding protein. However, none of the documented classical ssDNA-binding domains was
found in the structure of gp17; this protein therefore constitutes a novel, non-canonical
ssDNA-binding protein. Sequence alignment of gp17 homologs revealed 3 highly conserved and
5 relatively conserved residues. Mutagenesis of a

Page 39: Functional characterization and Gene regulation of … Guo.pdf · The thesis entitled `` Functional characterization and gene regulation of the archaeal ... environment for unreserved

25

few conserved basic residues distributed in two adjacent loops within each monomer

suggested a U-shaped binding path for ssDNA.

Because gp18 could not be cloned into Sulfolobus cells owing to its toxicity, and because recombinant expression in E. coli resulted in the formation of inclusion bodies, a denaturation and refolding strategy was employed to purify His-tagged gp18 from E. coli. Both circular dichroism spectroscopy and gel-filtration chromatography indicated that the refolded gp18 was sufficiently stable to be used in further studies. A BlastP search with the gp18 sequence suggested weak similarity to the ATPase domains of bacterial Lon proteases, and tertiary structure prediction suggested a function as a hexameric helicase. However, neither protease nor helicase activity of gp18 could be detected under any of the conditions tested. Instead, gp18 increased the yield of dsDNA formed from two complementary oligonucleotides. The failure to detect helicase activity could be due to unsuitable experimental conditions or to masking of the helicase activity by a stronger annealing activity. Alternatively, gp18 may carry no helicase activity but only annealing activity, a feature it would share with the annealing helicases HARP and AH2 (Yusufzai and Kadonaga, 2008; Yusufzai and Kadonaga, 2010).

To better understand the function of the entire operon, the protein product of the third gene, gp19, was further characterized in this study. It was found to have a 5’-3’ ssDNA exonuclease activity in addition to the previously demonstrated ssDNA endonuclease activity.

The 38 C-terminal amino acid residues of gp17 are missing from the crystal structure, and this region was predicted to be disordered by two different programs, IUPred and PONDR. The disordered C-termini of bacterial SSB proteins are normally involved in protein-protein interactions. Since all three proteins of the operon act on the same type of substrate, ssDNA, the interactions among them were examined by GST affinity chromatography. The results demonstrated that gp17 interacts with gp18 and that the disordered C-terminal domain of gp17 is essential for this interaction. No interaction was detected between gp17 and gp19, but a weak interaction was observed between gp18 and gp19. To test whether gp17 can recruit ssDNA-processing proteins, as bacterial SSBs do, its gene was cloned into a plasmid and transformed into the host Sulfolobus solfataricus P2 for an in vivo pull-down assay. Two host proteins, sso2277 and reverse gyrase (sso0422), were found to interact with gp17. Structure prediction for sso2277 revealed high similarity to RecF and RecN proteins, which are involved in DNA replication/recombination.

The operonic or clustered organization of the three genes in rudiviruses and filamentous viruses, together with the observed interactions between their protein products, strongly suggests that they cooperate closely in the same process or processes involving ssDNA. This could be SIRV2 genome maturation, replication or recombination, and further evidence is needed to support this hypothesis.

In addition to the available information on the unusual morphological and genomic properties of the virus, the SIRV2 transcription pattern and the regulation of host genes during infection have been studied by microarray analysis and by deep transcriptome sequencing (Okutan et al., 2013; Quax et al., 2013). Although transcription of the viral genes showed no strong temporal regulation, the host response to SIRV2 infection was pronounced. More than one third of the host genes were differentially regulated, with similar numbers of downregulated and upregulated genes. Among the regulated genes, two that encode transcription regulators of Sulfolobus solfataricus P2, sso2474 and sso10340, responded differently. We therefore set out to determine whether any host or viral genes are regulated by these two regulators, and whether they act as local or global transcription factors. In this work, we investigated the binding targets of the two proteins in an in vivo context, using a method similar to chromatin immunoprecipitation combined with DNA sequencing.

To detect regulation of both host and viral genes, the proteins were overexpressed for 15 h and the cells were then infected with SIRV2 at a multiplicity of infection (MOI) of about 10 for 2.5 h. The His-tagged proteins were purified from the virus-infected cells. The protein sso2474 bound several hundred-fold more DNA than the control, and sso10340 exhibited a range of oligomeric states, a feature resembling that of Lrp/AsnC family proteins. The bound DNA was separated from the DNA-protein complex of each of the two purified proteins and subjected to high-throughput sequencing. Alignment of the sequence data to the viral and host genomes revealed that most of the DNA bound by sso2474 is viral DNA, whereas sso10340 is mainly associated with regulation of the host.
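As a rough illustration of what an MOI of about 10 implies, the sketch below estimates the fraction of cells that receive at least one virion. It assumes that virions adsorb to cells independently, so that the number of virions per cell follows a Poisson distribution with a mean equal to the MOI. This is a common textbook approximation and was not measured in this work; the function names are hypothetical.

```python
import math


def fraction_uninfected(moi: float) -> float:
    """Poisson probability that a cell receives no virion at the given MOI."""
    if moi < 0:
        raise ValueError("MOI must be non-negative")
    return math.exp(-moi)


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells receiving at least one virion."""
    return 1.0 - fraction_uninfected(moi)


if __name__ == "__main__":
    # Compare a low, an intermediate and a high MOI (illustrative values).
    for moi in (0.1, 1.0, 10.0):
        print(f"MOI {moi:>4}: {fraction_infected(moi):.5f} of cells infected")
```

Under this assumption, an MOI of 10 leaves only a negligible fraction of cells without virus, which is consistent with the aim of purifying the proteins from a culture in which essentially all cells are infected.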


However, the specific binding targets and the binding mechanism of sso2474 have not yet been identified, and experiments showed that the protein purified from E. coli binds dsDNA in preference to ssDNA. A total of 27 target regions bound by sso10340 or its interacting proteins were identified in S. solfataricus P2. Half of them are located upstream, or partially upstream, of the corresponding genes, while the other half fall within gene coding regions. An 11-bp palindromic binding motif was defined by analysis of the enriched sequences and was present in 96% of the binding targets. The associated genes were categorized by function, and most are involved in energy metabolism, transport and amino acid metabolism.
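The notion of a palindromic DNA motif can be made concrete with a short sketch. A site is palindromic when it equals its own reverse complement; for an odd length such as 11 bp the central base cannot pair with itself, so the sketch treats it as a spacer and ignores it. The code scans for any palindromic 11-bp window rather than for the specific motif defined in this work, and the example sequences and function names are invented for illustration only.

```python
# Assumes sequences contain only A, C, G and T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same on both strands.

    For odd lengths the central base is ignored and treated as a spacer.
    """
    site = site.upper()
    rc = reverse_complement(site)
    middle = len(site) // 2 if len(site) % 2 else -1
    return all(a == b for i, (a, b) in enumerate(zip(site, rc)) if i != middle)


def palindromic_sites(region: str, length: int = 11) -> list[tuple[int, str]]:
    """List (offset, site) for every palindromic window of the given length."""
    region = region.upper()
    windows = ((i, region[i:i + length]) for i in range(len(region) - length + 1))
    return [(i, w) for i, w in windows if is_palindromic(w)]


def fraction_with_site(regions: list[str], length: int = 11) -> float:
    """Fraction of regions containing at least one palindromic window."""
    if not regions:
        return 0.0
    hits = sum(1 for r in regions if palindromic_sites(r, length))
    return hits / len(regions)


if __name__ == "__main__":
    # Invented sequences, not data from the thesis.
    regions = ["GGTTGCACTGCAACC", "AAAAAAAAAAAAAAA"]
    print(palindromic_sites(regions[0]))
    print(fraction_with_site(regions))
```

Applied to real data, `fraction_with_site` would be given the sequences of the 27 target regions; if the reported 96% is taken literally, it would correspond to 26 of the 27 regions.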


Future Perspectives


This PhD work is the first study to provide a functional characterization of an entire operon conserved in archaeal rudiviruses and filamentous viruses, as well as a general regulation profile of two host regulators. Further work remains to be done to extend our knowledge of archaeal virus biology and virus-host interactions.

The toxicity of gp18 to Sulfolobus cells and its insolubility in E. coli hindered the characterization of the whole operon. Although introducing both the gp17 and gp18 genes into Sulfolobus solfataricus cells reduced the strong toxicity of gp18, almost no expressed gp18 could be purified (data not shown). One approach we would like to try is co-expressing two or all three proteins in E. coli in a suitable system, to test whether gp17 or gp19 can assist the folding of gp18, since both were shown to interact with gp18. If so, the next step would be to crystallize the complex with a synthetic oligonucleotide in order to investigate the interaction between the complex and ssDNA in detail.

The ssDNA binding, annealing and nuclease activities were all characterized in vitro in this study, and their possible functions in the viral infection cycle are discussed; however, more in vivo evidence is needed to complete the scenario we have constructed. Owing to the limited genetic tools available for the virus and its relatively large size, silencing these genes in the virus has not yet been possible. We attempted to overexpress the C-terminally truncated gp17 in host cells so that it would compete with the wild-type protein produced by the virus, in order to detect any influence on virus replication or genome maturation. However, the results were inconclusive for several reasons. New ideas and methods will be needed to determine the function of this operon during infection in vivo.

A striking feature of sso2474 is its particularly high affinity for viral DNA, which appears to be sequence-independent. The DNA-binding mechanism of sso2474 remains unclear. If the gene can be knocked out, the mutant is expected to show phenotypic changes and no growth retardation upon virus infection. Finally, the global regulatory role of sso10340 needs further validation. Whether other DNA-binding proteins or regulators interact with sso10340, and whether this regulator activates or represses transcription of the corresponding genes, remain to be determined.


References

Ackermann,H.W., and Prangishvili,D. (2012) Prokaryote viruses studied by electron

microscopy. Arch Virol 157: 1843-1849.

Albers,S.V., Birkeland,N.K., Driessen,A.J., Gertig,S., Haferkamp,P., Klenk,H.P. et al. (2009)

SulfoSYS (Sulfolobus Systems Biology): towards a silicon cell model for the central

carbohydrate metabolism of the archaeon Sulfolobus solfataricus under temperature

variation. Biochem Soc Trans 37: 58-64.

Anderson,I., Rodriguez,J., Susanti,D., Porat,I., Reich,C., Ulrich,L.E. et al. (2008) Genome

sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways

without genome reduction. J Bacteriol 190: 2957-2965.

Bernander,R. (2007) The cell cycle of Sulfolobus. Mol Microbiol 66: 557-562.

Bettstetter,M., Peng,X., Garrett,R.A., and Prangishvili,D. (2003) AFV1, a novel virus

infecting hyperthermophilic archaea of the genus Acidianus. Virology 315: 68-79.

Bintrim,S.B., Donohue,T.J., Handelsman,J., Roberts,G.P., and Goodman,R.M. (1997)

Molecular phylogeny of Archaea from soil. Proc Natl Acad Sci U S A 94: 277-282.

Birkenbihl,R.P., Neef,K., Prangishvili,D., and Kemper,B. (2001) Holliday junction resolving

enzymes of archaeal viruses SIRV1 and SIRV2. J Mol Biol 309: 1067-1076.

Bize,A., Karlsson,E.A., Ekefjard,K., Quax,T.E., Pina,M., Prevost,M.C. et al. (2009) A unique

virus release mechanism in the Archaea. Proc Natl Acad Sci U S A 106: 11306-11311.

Bize,A., Peng,X., Prokofeva,M., Maclellan,K., Lucas,S., Forterre,P. et al. (2008) Viruses in

acidic geothermal environments of the Kamchatka Peninsula. Res Microbiol 159: 358-366.

Blum,H., Zillig,W., Mallok,S., Domdey,H., and Prangishvili,D. (2001) The genome of the

archaeal virus SIRV1 has features in common with genomes of eukaryal viruses. Virology

281: 6-9.

Bochkarev,A., and Bochkareva,E. (2004) From RPA to BRCA2: lessons from single-stranded

DNA binding by the OB-fold. Curr Opin Struct Biol 14: 36-42.

Breithaupt,H. (2001) The hunt for living gold. The search for organisms in extreme

environments yields useful enzymes for industry. EMBO Rep 2: 968-971.

Brock,T.D., Brock,K.M., Belly,R.T., and Weiss,R.L. (1972) Sulfolobus: a new genus of

sulfur-oxidizing bacteria living at low pH and high temperature. Arch Mikrobiol 84: 54-68.


Brugger,K., Redder,P., and Skovgaard,M. (2003) MUTAGEN: multi-user tool for annotating

genomes. Bioinformatics 19: 2480-2481.

Brumfield,S.K., Ortmann,A.C., Ruigrok,V., Suci,P., Douglas,T., and Young,M.J. (2009) Particle

assembly and ultrastructural features associated with replication of the lytic archaeal virus

Sulfolobus turreted icosahedral virus. J Virol 83: 5964-5970.

Chong,J.P., Hayashi,M.K., Simon,M.N., Xu,R.M., and Stillman,B. (2000) A double-hexamer

archaeal minichromosome maintenance protein is an ATP-dependent DNA helicase. Proc

Natl Acad Sci U S A 97: 1530-1535.

Culyba,M.J., Harrison,J.E., Hwang,Y., and Bushman,F.D. (2006) DNA cleavage by the A22R

resolvase of vaccinia virus. Virology 352: 466-476.

Declais,A.C., and Lilley,D.M. (2008) New insight into the recognition of branched DNA

structure by junction-resolving enzymes. Curr Opin Struct Biol 18: 86-95.

DeLong,E.F. (1992) Archaea in coastal marine environments. Proc Natl Acad Sci U S A 89:

5685-5689.

Deng,L., He,F., Bhoobalan-Chitty,Y., Martinez-Alvarez,L., Guo,Y., and Peng,X. (2014)

Unveiling cell surface and type IV secretion proteins responsible for archaeal rudivirus

entry. J Virol 88: 10264-10268.

Deng,L., Zhu,H., Chen,Z., Liang,Y.X., and She,Q. (2009) Unmarked gene deletion and

host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus.

Extremophiles 13: 735-746.

Dickey,T.H., Altschuler,S.E., and Wuttke,D.S. (2013) Single-stranded DNA-binding proteins:

multiple domains for multiple functions. Structure 21: 1074-1084.

Dorazi,R., Parker,J.L., and White,M.F. (2006) PCNA activates the Holliday junction

endonuclease Hjc. J Mol Biol 364: 243-247.

Dosztanyi,Z., Csizmok,V., Tompa,P., and Simon,I. (2005) IUPred: web server for the

prediction of intrinsically unstructured regions of proteins based on estimated energy

content. Bioinformatics 21: 3433-3434.

Du,S., and Traktman,P. (1996) Vaccinia virus DNA replication: two hundred base pairs of

telomeric sequence confer optimal replication efficiency on minichromosome templates.

Proc Natl Acad Sci U S A 93: 9693-9698.

Dunn,J.J., and Studier,F.W. (1983) Complete nucleotide sequence of bacteriophage T7 DNA

and the locations of T7 genetic elements. J Mol Biol 166: 477-535.


Elkins,J.G., Podar,M., Graham,D.E., Makarova,K.S., Wolf,Y., Randau,L. et al. (2008) A

korarchaeal genome reveals insights into the evolution of the Archaea. Proc Natl Acad Sci

U S A 105: 8102-8107.

Forterre,P., Brochier,C., and Philippe,H. (2002) Evolution of the Archaea. Theor Popul Biol

61: 409-422.

Frols,S., Gordon,P.M., Panlilio,M.A., Schleper,C., and Sensen,C.W. (2007) Elucidating the

transcription cycle of the UV-inducible hyperthermophilic archaeal virus SSV1 by DNA

microarrays. Virology 365: 48-59.

Fuhrman,J.A., McCallum,K., and Davis,A.A. (1992) Novel major archaebacterial group from

marine plankton. Nature 356: 148-149.

Gardner,A.F., Bell,S.D., White,M.F., Prangishvili,D., and Krupovic,M. (2014) Protein-protein

interactions leading to recruitment of the host DNA sliding clamp by the hyperthermophilic

Sulfolobus islandicus rod-shaped virus 2. J Virol 88: 7105-7108.

Gardner,A.F., Guan,C., and Jack,W.E. (2011a) Biochemical characterization of a

structure-specific resolving enzyme from Sulfolobus islandicus rod-shaped virus 2. PLoS

One 6: e23668.

Gardner,A.F., Prangishvili,D., and Jack,W.E. (2011b) Characterization of Sulfolobus

islandicus rod-shaped virus 2 gp19, a single-strand specific endonuclease. Extremophiles 15:

619-624.

Gevers,D., Dawyndt,P., Vandamme,P., Willems,A., Vancanneyt,M., Swings,J., and De,V.P.

(2006) Stepping stones towards a new prokaryotic taxonomy. Philos Trans R Soc Lond B

Biol Sci 361: 1911-1916.

Goulet,A., Blangy,S., Redder,P., Prangishvili,D., Felisberto-Rodrigues,C., Forterre,P. et al.

(2009a) Acidianus filamentous virus 1 coat proteins display a helical fold spanning the

filamentous archaeal viruses lineage. Proc Natl Acad Sci U S A 106: 21155-21160.

Goulet,A., Spinelli,S., Blangy,S., van,T.H., Leulliot,N., Basta,T. et al. (2009b) The thermo-

and acido-stable ORF-99 from the archaeal virus AFV1. Protein Sci 18: 1316-1320.

Gribaldo,S., and Brochier-Armanet,C. (2006) The origin and evolution of Archaea: a state of

the art. Philos Trans R Soc Lond B Biol Sci 361: 1007-1022.

Gudbergsdottir,S., Deng,L., Chen,Z., Jensen,J.V., Jensen,L.R., She,Q., and Garrett,R.A. (2011)

Dynamic properties of the Sulfolobus CRISPR/Cas and CRISPR/Cmr systems when

challenged with vector-borne viral and plasmid genes and protospacers. Mol Microbiol 79:

35-49.


Guilliere,F., Peixeiro,N., Kessler,A., Raynal,B., Desnoues,N., Keller,J. et al. (2009) Structure,

function, and targets of the transcriptional regulator SvtR from the hyperthermophilic

archaeal virus SIRV1. J Biol Chem 284: 22222-22237.

Guy,L., and Ettema,T.J. (2011) The archaeal 'TACK' superphylum and the origin of

eukaryotes. Trends Microbiol 19: 580-587.

Haring,M., Rachel,R., Peng,X., Garrett,R.A., and Prangishvili,D. (2005a) Viral diversity in hot

springs of Pozzuoli, Italy, and characterization of a unique archaeal virus, Acidianus

bottle-shaped virus, from a new family, the Ampullaviridae. J Virol 79: 9904-9911.

Haring,M., Vestergaard,G., Brugger,K., Rachel,R., Garrett,R.A., and Prangishvili,D. (2005b)

Structure and genome organization of AFV2, a novel archaeal lipothrixvirus with unusual

terminal and core structures. J Bacteriol 187: 3855-3858.

Haring,M., Vestergaard,G., Rachel,R., Chen,L., Garrett,R.A., and Prangishvili,D. (2005c)

Virology: independent virus development outside a host. Nature 436: 1101-1102.

Hohn,M.J., Hedlund,B.P., and Huber,H. (2002) Detection of 16S rDNA sequences

representing the novel phylum "Nanoarchaeota": indication for a wide distribution in high

temperature biotopes. Syst Appl Microbiol 25: 551-554.

Howard,J.A., Delmas,S., Ivancic-Bace,I., and Bolt,E.L. (2011) Helicase dissociation and

annealing of RNA-DNA hybrids by Escherichia coli Cas3 protein. Biochem J 439: 85-95.

Johnson,D.B., Joulian,C., d'Hugues,P., and Hallberg,K.B. (2008) Sulfobacillus benefaciens sp.

nov., an acidophilic facultative anaerobic Firmicute isolated from mineral bioleaching

operations. Extremophiles 12: 789-798.

Kawano,S., Iyaguchi,D., Okada,C., Sasaki,Y., and Toyota,E. (2013) Expression, purification,

and refolding of active recombinant human E-selectin lectin and EGF domains in

Escherichia coli. Protein J 32: 386-391.

Kelley,L.A., and Sternberg,M.J. (2009) Protein structure prediction on the Web: a case

study using the Phyre server. Nat Protoc 4: 363-371.

Kessler,A., Brinkman,A.B., van der Oost,J., and Prangishvili,D. (2004) Transcription of the

rod-shaped viruses SIRV1 and SIRV2 of the hyperthermophilic archaeon Sulfolobus. J

Bacteriol 186: 7745-7753.

Kowalczykowski,S.C. (2000) Initiation of genetic recombination and

recombination-dependent replication. Trends Biochem Sci 25: 156-165.

Krupovic,M., and Bamford,D.H. (2008) Holin of bacteriophage lambda: structural insights

into a membrane lesion. Mol Microbiol 69: 781-783.


Krupovic,M., White,M.F., Forterre,P., and Prangishvili,D. (2012) Postcards from the edge:

structural genomics of archaeal viruses. Adv Virus Res 82: 33-62.

Kvaratskhelia,M., and White,M.F. (2000) Two Holliday junction resolving enzymes in

Sulfolobus solfataricus. J Mol Biol 297: 923-932.

Lemak,S., Beloglazova,N., Nocek,B., Skarina,T., Flick,R., Brown,G. et al. (2013) Toroidal

structure and DNA cleavage by the CRISPR-associated [4Fe-4S] cluster containing Cas4

nuclease SSO0001 from Sulfolobus solfataricus. J Am Chem Soc 135: 17476-17487.

Maaty,W.S., Steffens,J.D., Heinemann,J., Ortmann,A.C., Reeves,B.D., Biswas,S.K. et al.

(2012) Global analysis of viral infection in an archaeal model system. Front Microbiol 3:

411.

Martin,A., Yeats,S., Janekovic,D., Reiter,W.D., Aicher,W., and Zillig,W. (1984) SAV 1, a

temperate u.v.-inducible DNA virus-like particle from the archaebacterium Sulfolobus

acidocaldarius isolate B12. EMBO J 3: 2165-2168.

Mochizuki,T., Krupovic,M., Pehau-Arnaudet,G., Sako,Y., Forterre,P., and Prangishvili,D.

(2012) Archaeal virus with exceptional virion architecture and the largest single-stranded

DNA genome. Proc Natl Acad Sci U S A 109: 13386-13391.

Moldovan,G.L., Pfander,B., and Jentsch,S. (2007) PCNA, the maestro of the replication fork.

Cell 129: 665-679.

Mosig,G. (1998) Recombination and recombination-dependent DNA replication in

bacteriophage T4. Annu Rev Genet 32: 379-413.

Munoz,V., and Serrano,L. (1994) Elucidating the folding problem of helical peptides using

empirical parameters. Nat Struct Biol 1: 399-409.

Oke,M., Carter,L.G., Johnson,K.A., Liu,H., McMahon,S.A., Yan,X. et al. (2010) The Scottish

Structural Proteomics Facility: targets, methods and outputs. J Struct Funct Genomics 11:

167-180.

Oke,M., Kerou,M., Liu,H., Peng,X., Garrett,R.A., Prangishvili,D. et al. (2011) A dimeric Rep

protein initiates replication of a linear archaeal virus genome: implications for the Rep

mechanism and viral replication. J Virol 85: 925-931.

Okutan,E., Deng,L., Mirlashari,S., Uldahl,K., Halim,M., Liu,C. et al. (2013) Novel insights into

gene regulation of the rudivirus SIRV2 infecting Sulfolobus cells. RNA Biol 10: 875-885.

Oline,D.K., Schmidt,S.K., and Grant,M.C. (2006) Biogeography and landscape-scale diversity

of the dominant Crenarchaeota of soil. Microb Ecol 52: 480-490.


Oren,A. (2002) Molecular ecology of extremely halophilic Archaea and Bacteria. FEMS

Microbiol Ecol 39: 1-7.

Ortmann,A.C., Brumfield,S.K., Walther,J., McInnerney,K., Brouns,S.J., van de Werken,H.J. et

al. (2008) Transcriptome analysis of infection of the archaeon Sulfolobus solfataricus with

Sulfolobus turreted icosahedral virus. J Virol 82: 4874-4883.

Palm,P., Schleper,C., Grampp,B., Yeats,S., McWilliam,P., Reiter,W.D., and Zillig,W. (1991)

Complete nucleotide sequence of the virus SSV1 of the archaebacterium Sulfolobus

shibatae. Virology 185: 242-250.

Paytubi,S., McMahon,S.A., Graham,S., Liu,H., Botting,C.H., Makarova,K.S. et al. (2012)

Displacement of the canonical single-stranded DNA-binding protein in the

Thermoproteales. Proc Natl Acad Sci U S A 109: E398-E405.

Peng,X., Basta,T., Haring,M., Garrett,R.A., and Prangishvili,D. (2007) Genome of the

Acidianus bottle-shaped virus and insights into the replication and packaging mechanisms.

Virology 364: 237-243.

Peng,X., Blum,H., She,Q., Mallok,S., Brugger,K., Garrett,R.A. et al. (2001) Sequences and

replication of genomes of the archaeal rudiviruses SIRV1 and SIRV2: relationships to the

archaeal lipothrixvirus SIFV and some eukaryal viruses. Virology 291: 226-234.

Peng,X., Kessler,A., Phan,H., Garrett,R.A., and Prangishvili,D. (2004) Multiple variants of the

archaeal DNA rudivirus SIRV1 in a single host and a novel mechanism of genomic variation.

Mol Microbiol 54: 366-375.

Pietila,M.K., Demina,T.A., Atanasova,N.S., Oksanen,H.M., and Bamford,D.H. (2014)

Archaeal viruses and bacteriophages: comparisons and contrasts. Trends Microbiol 22:

334-344.

Pietila,M.K., Laurinavicius,S., Sund,J., Roine,E., and Bamford,D.H. (2010) The

single-stranded DNA genome of novel archaeal virus Halorubrum pleomorphic virus 1 is

enclosed in the envelope decorated with glycoprotein spikes. J Virol 84: 788-798.

Pietila,M.K., Roine,E., Paulin,L., Kalkkinen,N., and Bamford,D.H. (2009) An ssDNA virus

infecting archaea: a new lineage of viruses with a membrane envelope. Mol Microbiol 72:

307-319.

Pina,M., Basta,T., Quax,T.E., Joubert,A., Baconnais,S., Cortez,D. et al. (2014) Unique

genome replication mechanism of the archaeal virus AFV1. Mol Microbiol 92: 1313-1325.

Poranen,M.M., Daugelavicius,R., and Bamford,D.H. (2002) Common principles in viral entry.

Annu Rev Microbiol 56: 521-538.


Prangishvili,D. (2003) Evolutionary insights from studies on viruses of hyperthermophilic

archaea. Res Microbiol 154: 289-294.

Prangishvili,D., Arnold,H.P., Gotz,D., Ziese,U., Holz,I., Kristjansson,J.K., and Zillig,W. (1999)

A novel virus family, the Rudiviridae: Structure, virus-host interactions and genome

variability of the Sulfolobus viruses SIRV1 and SIRV2. Genetics 152: 1387-1396.

Prangishvili,D., Forterre,P., and Garrett,R.A. (2006a) Viruses of the Archaea: a unifying view.

Nat Rev Microbiol 4: 837-848.

Prangishvili,D., and Garrett,R.A. (2005) Viruses of hyperthermophilic Crenarchaea. Trends

Microbiol 13: 535-542.

Prangishvili,D., Garrett,R.A., and Koonin,E.V. (2006b) Evolutionary genomics of archaeal

viruses: unique viral genomes in the third domain of life. Virus Res 117: 52-67.

Prangishvili,D., Koonin,E.V., and Krupovic,M. (2013) Genomics and biology of Rudiviruses, a

model for the study of virus-host interactions in Archaea. Biochem Soc Trans 41: 443-450.

Prangishvili,D., Vestergaard,G., Haring,M., Aramayo,R., Basta,T., Rachel,R., and Garrett,R.A.

(2006c) Structural and genomic properties of the hyperthermophilic archaeal virus ATV

with an extracellular stage of the reproductive cycle. J Mol Biol 359: 1203-1216.

Quax,T.E., Krupovic,M., Lucas,S., Forterre,P., and Prangishvili,D. (2010) The Sulfolobus

rod-shaped virus 2 encodes a prominent structural component of the unique virion release

system in Archaea. Virology 404: 1-4.

Quax,T.E., Lucas,S., Reimann,J., Pehau-Arnaudet,G., Prevost,M.C., Forterre,P. et al. (2011)

Simple and elegant design of a virion egress structure in Archaea. Proc Natl Acad Sci U S A

108: 3354-3359.

Quax,T.E., Voet,M., Sismeiro,O., Dillies,M.A., Jagla,B., Coppee,J.Y. et al. (2013) Massive

activation of archaeal defense genes during viral infection. J Virol 87: 8419-8428.

Quemin,E.R., Lucas,S., Daum,B., Quax,T.E., Kuhlbrandt,W., Forterre,P. et al. (2013) First

insights into the entry process of hyperthermophilic archaeal viruses. J Virol 87:

13379-13385.

Rachel,R., Bettstetter,M., Hedlund,B.P., Haring,M., Kessler,A., Stetter,K.O., and

Prangishvili,D. (2002) Remarkable morphological diversity of viruses and virus-like particles

in hot terrestrial environments. Arch Virol 147: 2419-2429.

Rakonjac,J., Bennett,N.J., Spagnuolo,J., Gagic,D., and Russel,M. (2011) Filamentous

bacteriophage: biology, phage display and nanotechnology applications. Curr Issues Mol

Biol 13: 51-76.


Reiter,W.D., Palm,P., Yeats,S., and Zillig,W. (1987) Gene expression in archaebacteria:

physical mapping of constitutive and UV-inducible transcripts from the Sulfolobus virus-like

particle SSV1. Mol Gen Genet 209: 270-275.

Schleper,C., Holben,W., and Klenk,H.P. (1997) Recovery of crenarchaeotal ribosomal DNA

sequences from freshwater-lake sediments. Appl Environ Microbiol 63: 321-323.

Servin-Garciduenas,L.E., Peng,X., Garrett,R.A., and Martinez-Romero,E. (2013) Genome

sequence of a novel archaeal rudivirus recovered from a Mexican hot spring. Genome

Announc 1.

She,Q., Singh,R.K., Confalonieri,F., Zivanovic,Y., Allard,G., Awayez,M.J. et al. (2001) The

complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc Natl Acad Sci U S A

98: 7835-7840.

Shereda,R.D., Kozlov,A.G., Lohman,T.M., Cox,M.M., and Keck,J.L. (2008) SSB as an

organizer/mobilizer of genome maintenance complexes. Crit Rev Biochem Mol Biol 43:

289-318.

Steinmetz,N.F., Bize,A., Findlay,K.C., Lomonossoff,G.P., Manchester,M., Evans,D.J. et al.

(2008) Site-specific and spatially controlled addressability of a new viral nanobuilding block:

Sulfolobus islandicus rod-shaped virus 2. Adv Funct Mater 18: 3478-3486.

Steitz,T.A. (2004) The structural basis of the transition from initiation to elongation phases

of transcription, as well as translocation and strand separation, by T7 RNA polymerase.

Curr Opin Struct Biol 14: 4-9.

Stetter,K.O. (2006) Hyperthermophiles in the history of life. Philos Trans R Soc Lond B Biol

Sci 361: 1837-1842.

Suck,D. (1997) Common fold, common function, common origin? Nat Struct Biol 4:

161-165.

Theobald,D.L., Mitton-Fry,R.M., and Wuttke,D.S. (2003) Nucleic acid recognition by OB-fold

proteins. Annu Rev Biophys Biomol Struct 32: 115-133.

Vega-Rocha,S., Gronenborn,B., Gronenborn,A.M., and Campos-Olivas,R. (2007) Solution

structure of the endonuclease domain from the master replication initiator protein of the

nanovirus faba bean necrotic yellows virus and comparison with the corresponding

geminivirus and circovirus structures. Biochemistry 46: 6201-6212.

Vestergaard,G., Aramayo,R., Basta,T., Haring,M., Peng,X., Brugger,K. et al. (2008a)

Structure of the Acidianus filamentous virus 3 and comparative genomics of related

archaeal lipothrixviruses. J Virol 82: 371-381.


Vestergaard,G., Haring,M., Peng,X., Rachel,R., Garrett,R.A., and Prangishvili,D. (2005) A

novel rudivirus, ARV1, of the hyperthermophilic archaeal genus Acidianus. Virology 336:

83-92.

Vestergaard,G., Shah,S.A., Bize,A., Reitberger,W., Reuter,M., Phan,H. et al. (2008b)

Stygiolobus rod-shaped virus and the interplay of crenarchaeal rudiviruses with the CRISPR

antiviral system. J Bacteriol 190: 6837-6845.

Woese,C.R., and Fox,G.E. (1977) Phylogenetic structure of the prokaryotic domain: the

primary kingdoms. Proc Natl Acad Sci U S A 74: 5088-5090.

Woese,C.R., Kandler,O., and Wheelis,M.L. (1990) Towards a natural system of organisms:

proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A 87:

4576-4579.

Worthington,P., Hoang,V., Perez-Pomares,F., and Blum,P. (2003) Targeted disruption of

the alpha-amylase gene in the hyperthermophilic archaeon Sulfolobus solfataricus. J

Bacteriol 185: 482-488.

Wu,Y. (2012) Unwinding and rewinding: double faces of helicase? J Nucleic Acids 2012:

140601.

Wurtzel,O., Sapra,R., Chen,F., Zhu,Y., Simmons,B.A., and Sorek,R. (2010) A single-base

resolution map of an archaeal transcriptome. Genome Res 20: 133-141.

Xiang,X., Dong,X., and Huang,L. (2003) Sulfolobus tengchongensis sp. nov., a novel

thermoacidophilic archaeon isolated from a hot spring in Tengchong, China. Extremophiles

7: 493-498.

Xue,B., Dunbrack,R.L., Williams,R.W., Dunker,A.K., and Uversky,V.N. (2010) PONDR-FIT: a

meta-predictor of intrinsically disordered amino acids. Biochim Biophys Acta 1804:

996-1010.

Yeats,S., McWilliam,P., and Zillig,W. (1982) A plasmid in the archaebacterium Sulfolobus

acidocaldarius. EMBO J 1: 1035-1038.

Yusufzai,T., and Kadonaga,J.T. (2008) HARP is an ATP-driven annealing helicase. Science

322: 748-750.

Yusufzai,T., and Kadonaga,J.T. (2010) Annealing helicase 2 (AH2), a DNA-rewinding motor

with an HNH motif. Proc Natl Acad Sci U S A 107: 20970-20973.

Zablen,L.B., Kissil,M.S., Woese,C.R., and Buetow,D.E. (1975) Phylogenetic origin of the

chloroplast and prokaryotic nature of its ribosomal RNA. Proc Natl Acad Sci U S A 72:

2418-2422.

Zhang,J., Kasciukovic,T., and White,M.F. (2012) The CRISPR associated protein Cas4 Is a 5'

to 3' DNA exonuclease with an iron-sulfur cluster. PLoS One 7: e47232.

Zillig,W., Stetter,O.K., Wunderl,S., Schulz,W., Priess,H., Scholz,I. (1980) The

Sulfolobus-``Caldariella`` Group: Taxonomy on the Basis of the Structure of

DNA-Dependent RNA Polymerases. Archives of Microbiology 125: 259-269.

Zillig,W., Kletzin,A., Schleper,C., Holz,I., Janekovic,D., Hain,J. et al. (1994) Screening for

Sulfolobales, Their Plasmids and Their Viruses in Icelandic Solfataras. Systematic and

Applied Microbiology 16: 609-628.

Zuckerkandl,E., and Pauling,L. (1965) Molecules as documents of evolutionary history. J

Theor Biol 8: 357-366.

Manuscript I

Single-stranded DNA binding, annealing and nuclease activities encoded by a

conserved archaeal viral gene cluster

Yang Guo, Birthe B. Kragelund, Malcolm F. White and Xu Peng

Submitted to Nucleic Acids Research

Single-stranded DNA binding, annealing and nuclease activities

encoded by a conserved archaeal viral gene cluster

Yang Guo1, Birthe B. Kragelund1, Malcolm F. White2 and Xu Peng1*

1 Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 CPH N. Denmark

2 Biomedical Sciences Research Complex, University of St. Andrews, North Haugh, St. Andrews,

Fife, UK

* To whom correspondence should be addressed. Tel: +45 35322018; Fax: +45 35322128; Email:

[email protected]

ABSTRACT

Single-stranded DNA (ssDNA) occurs in various cellular and viral processes of DNA

metabolism, including DNA replication, homologous recombination and repair pathways.

Here, we describe a novel type of ssDNA binding protein, a novel ssDNA annealing protein

and a ssDNA nuclease encoded by an operon comprised of ORF131b (gp17), ORF436

(gp18) and ORF207 (gp19), respectively, of Sulfolobus islandicus rod-shaped virus 2

(SIRV2). Rather than comprising one of the canonical ssDNA binding domains, SIRV2 gp17

forms a dimer with each monomer containing two α-helices and three β-strands.

Mutagenesis of a few conserved basic residues distributed in two adjacent loops within each

monomer suggested a U-shaped binding path for ssDNA. Although predicted previously as

a helicase, the recombinant gp18 showed a ssDNA annealing activity often associated with

helicases and recombinases. Moreover, gp19 was shown to possess a 5´ to 3´ ssDNA

exonuclease activity, in addition to the previously demonstrated ssDNA endonuclease

activity. Further, in vitro pull-down assays demonstrated interactions between gp17 and gp18

and between gp18 and gp19 with the former being mediated by the intrinsically disordered

C-terminus of gp17. The strand-displacement replication mode proposed previously for

rudiviruses and the close interaction between the ssDNA binding, annealing and nuclease

activities strongly point to a role of the gene operon in genome maturation and/or DNA

recombination which may function in viral DNA replication/repair.

INTRODUCTION

Viruses that infect extreme hyperthermophilic archaea, the third domain of life, are unusual

in their morphology, genome structure and proteins. In the last decade, a major effort has

been undertaken to study the archaeal viruses, which have attracted intense interest as

model systems to understand the biochemistry and molecular biology required for life at

high temperatures. Based on their morphological and genomic characteristics, 15 viral

families have been classified and about 100 viral isolates described, all with either linear or

circular double-stranded (ds) DNA, except two species possessing a single-stranded (ss)

DNA genome (1-3).

The Sulfolobus islandicus rod-shaped virus 2 (SIRV2) (4), together with SIRV1 (5),

Stygiolobus rod-shaped virus, SRV (6), Acidianus rod-shaped virus 1, ARV1 (7) and

Sulfolobales Mexican rudivirus 1 (SMRV1) (8), belong to the family Rudiviridae. The

rudiviruses have linear dsDNA genomes (24.6 to 35.8 kbp) with inverted terminal repeats

and the two strands at the genomic termini are covalently linked. Recently SIRV2 has been

the focus of genomic, structural, genetic and transcriptional studies, which have provided

important insights into its entry, gene regulation and unique release mechanisms (5;9-13).

Even so, similar to the vast majority of other archaeal viruses showing little sequence

similarity to sequences in public databases (14), the functions of many SIRV2 proteins remain to be

identified.

Among the 54 ORFs encoded in the genome of SIRV2, only one fifth had been

experimentally assigned a function (15). The virus is coated with one major capsid protein

gp26 and three minor structural proteins gp33, gp38 and gp39 (16). Together with the genes

encoding the viral structural proteins, gp49, encoding the component of the pyramidal

egress structure (11), is repressed by the transcription regulator gp15 (SvtR) during the

early virus infection cycle (17;18). gp16, belonging to the replication initiator (Rep) family

and nicking one strand of the viral genomic termini, was proposed to be involved in the

initiation of DNA replication (19). The Holliday junction resolving enzyme (Hjr) gp35 was

suggested to resolve the concatemers of the replicative intermediates, producing

monomeric copies with linear hairpin ends (20). Taken together, the functions of many

SIRV2 genes remain unknown and the knowledge of its biology and basic molecular

processes such as DNA replication, recombination and maturation is still limited.

In this work we studied a SIRV2 operon containing three genes, gp17, gp18 and

gp19 that are highly conserved in rudiviruses and filamentous viruses. gp19 was previously

shown to be an endonuclease specifically cutting ssDNA (21). We demonstrate here that

gp17 is a ssDNA binding protein and interacts with gp18 while the latter stimulates

annealing of complementary oligonucleotides. In addition to the previously identified ssDNA

endonuclease activity, we detected the 5’ to 3’ ssDNA exonuclease activity of gp19, which

also interacts with gp18. Based on the data, the possible functions of the gene operon are

discussed.

MATERIALS AND METHODS

Cloning, expression and purification of C-terminally His-tagged recombinant proteins

The coding sequences of gp17, gp18 and gp19 were amplified by PCR from SIRV2 genome

using primers listed in Table S1, digested with NdeI and XhoI and subsequently inserted

into a similarly digested pET-30(a) (Novagen) expression vector. To introduce single or

multiple amino acid (aa) mutations into the recombinant gp17, fusion PCR using 4 primers

(Table S1) was performed for each mutant.

E. coli BL21 CodonPlus cells were transformed with the individual plasmid constructs and

a single transformant clone was inoculated in LB medium containing 30 µg/ml kanamycin

and 25 µg/ml chloramphenicol. At an optical density (OD600) of 0.4, IPTG (0.5 mM) was

added to the culture and the cells were further cultured at 25°C for 12 hours. The harvested cell

pellet was resuspended in lysis buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 1 mM EDTA,

1% Triton X-100 and 1 mM PMSF) and lysed by sonication. The lysate was cleared by

centrifugation at 10000 x g for 20 minutes and the supernatant was then incubated with

Ni-NTA-agarose beads (Qiagen, Germany) for 1 h at room temperature. The beads were

washed three times with washing buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 40 mM

imidazole) and the protein eluted with elution buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl,

250 mM imidazole). The purity of the proteins was evaluated by SDS-PAGE and the gel

(12.5%) was stained with PAGE blue (Sigma Aldrich, UK). In the case of gp18, a denaturing

and refolding method was applied (see below).

Cloning, expression and purification of N-terminally GST-tagged proteins

The wild-type gp17 (gp17-I), its truncated mutants gp17-II (1-121 aa) and gp17-III (1-111 aa)

and gp19 were amplified by PCR using primers listed in Table S1 and the PCR products

were digested with BamHI and XhoI and ligated to a BamHI and XhoI digested pGEX-6p-2

(GE Healthcare Life Science, Sweden ) expression vector. The constructs were introduced

individually into BL21 CodonPlus cells. Transformed cells were grown in LB medium

containing 100 µg/ml ampicillin and 25 µg/ml chloramphenicol and induced with 0.5 mM

IPTG at an OD600 of 0.4. After 12 hours incubation at 25°C, the cells were pelleted,

resuspended in lysis buffer (PBS buffer pH 8.0, 1 mM EDTA, 1% Triton X-100 and 1 mM

PMSF) and lysed by sonication. The supernatant was incubated with Glutathione

Sepharose 4 Fast Flow beads (GE Healthcare Life Science, Sweden) for 1 h at room

temperature. The beads were washed three times with PBS buffer and the proteins remained

bound on the beads for subsequent pull-down assays with His-tagged proteins.

Refolding of His-tagged gp18 from inclusion bodies

The cell lysate was centrifuged at 10000 × g for 20 min, and the inclusion bodies in the pellet

were solubilized in lysis buffer containing 8 M urea at room temperature for 1 h. The

purification of the denatured protein using Ni-NTA-agarose beads followed the same

procedure as described above for other His-tagged proteins, except that 8 M urea was

included in both washing and elution buffers. The 2 ml of protein eluted from 1 L of E. coli culture

was dialysed first against 200 ml of 0.5 M L-arginine buffer for 2 h and then against 2 L of PBS buffer for 3 h;

the latter dialysis was repeated three times.
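As a rough check on this refolding scheme, the sketch below estimates the residual urea after each dialysis step. It assumes that every step reaches full equilibrium (which 2-3 h of dialysis may not achieve) and that the PBS dialysis was performed three times in total; both are assumptions, and the function name is illustrative rather than part of the original protocol.

```python
def residual_concentration(start_molar, sample_ml, bath_volumes_ml):
    """Return the solute concentration after each equilibrium dialysis step."""
    conc = start_molar
    history = []
    for bath_ml in bath_volumes_ml:
        # At equilibrium the solute is spread over sample plus bath volume.
        conc *= sample_ml / (sample_ml + bath_ml)
        history.append(conc)
    return history


# 2 ml sample in 8 M urea; one 200 ml arginine bath, then three 2 L PBS baths (assumed).
steps = residual_concentration(8.0, 2.0, [200.0, 2000.0, 2000.0, 2000.0])
for number, conc in enumerate(steps, 1):
    print(f"after step {number}: {conc:.3e} M urea")
```

Under these assumptions the first step alone lowers the urea concentration about 100-fold, which is where refolding would be expected to begin.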

Preparation of substrates for DNA mobility shift and nuclease activity assays

To prepare DNA substrates for DNA mobility shift and exonuclease assays, oligonucleotide

1 (Table S2) was annealed to a series of fully or partially complementary ssDNA

oligonucleotides (Table S2) to generate either a 23-nucleotide (23-nt) 5´-ssDNA tailed

duplex (substrate B), a 23-nt 3´-ssDNA tailed duplex (substrate C), or a blunt-ended duplex

(substrate A) (Table 1). Oligonucleotide 4 (Table S2) was annealed to its partially

complementary ssDNA oligonucleotide 5 (Table S2) to generate a Y-shaped dsDNA

(substrate D). The annealing mixture was heated at 95°C for 2 min and then slowly cooled

to room temperature (25°C) over a period of 1 h. M13mp18 DNA (New England Biolabs,

America) was chosen as circular single-stranded DNA substrate for the endonuclease

assay.

Gel mobility shift analysis

50 nM of the ssDNA (oligo 4) or dsDNA (substrate A, B and D in Table 1) were incubated for

20 min at 50°C with increasing concentrations of gp17 or its mutant variants (0-2000 nM) in

20 µl DNA-binding buffer (10 mM Tris-Cl, pH 8.0, 100 mM KCl, 2 mM DTT, 10% [vol/vol]

glycerol). The samples were loaded onto a 12% acrylamide gel and electrophoresed in 0.5 ×

TBE buffer for 1 h 50 min. Following electrophoresis, the gels were stained with SYBR®

Gold (Life Technologies) and scanned by Typhoon FLA 7000 (GE Healthcare Life Science).

The bands were quantified using ImageQuant TL (GE Healthcare Life Science).
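The quantification step can be illustrated with a minimal sketch. It assumes that the bound fraction is taken as the shifted band intensity divided by the total lane intensity, and that the apparent affinity is read off as the protein concentration giving 50% bound DNA by linear interpolation. The text does not state how the affinities were derived, so this procedure, the function names and the band intensities are all hypothetical.

```python
def fraction_bound(free_intensity, shifted_intensity):
    """Fraction of DNA in the shifted (protein-bound) band."""
    total = free_intensity + shifted_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return shifted_intensity / total


def half_saturation(concs_nm, fractions):
    """Linearly interpolate the protein concentration giving 50% bound DNA."""
    points = list(zip(concs_nm, fractions))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 < 0.5 <= f1:
            return c0 + (0.5 - f0) * (c1 - c0) / (f1 - f0)
    return None  # 50% binding is not crossed within the titration


# Hypothetical (free, shifted) band intensities for a gp17 titration in nM.
concs = [0, 50, 100, 200, 500, 2000]
bands = [(100, 0), (80, 20), (60, 40), (30, 70), (10, 90), (2, 98)]
fracs = [fraction_bound(free, shifted) for free, shifted in bands]
print(half_saturation(concs, fracs))
```

Comparing the half-saturation concentrations of a mutant and the wild-type protein would give a fold change in apparent affinity of the kind reported for the gp17 variants.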

Gel-filtration chromatography

Gel-filtration chromatography was carried out using an ÄKTA–FPLC system. Briefly, purified

proteins in PBS buffer were applied individually to a Superdex 200 HR 10/300 GL column

(GE Healthcare Bio-Sciences, America) equilibrated with the same buffer. The column was

operated at a flow rate of 0.5 ml/min, and 0.5 ml fractions were collected. The proteins were

detected by measuring the absorbance at 280 nm, 254 nm and 215 nm. The column was

calibrated with proteins of known molecular weight: Thyroglobulin, Bovine (669 kDa),

Apoferritin, Horse Spleen (443 kDa), β-Amylase, Sweet Potato (200 kDa), Alcohol

Dehydrogenase, Yeast (150 kDa).

Circular dichroism (CD) spectroscopy

A far-UV CD spectrum was recorded on a Jasco 810 spectropolarimeter at a wavelength

range from 260 to 190 nm, a scan rate of 20 nm/min, 15 accumulations and 2 s response

time, at room temperature. Samples were recorded in a quartz cuvette with a 1 mm path

length. A corresponding spectrum of the buffer was recorded and subtracted and the

resulting spectrum smoothed (Jasco software). The spectrum was recorded with 3.85 µM

protein in PBS, pH 8.0, and the ellipticity is given as mean residue ellipticity [θ]MRW in

deg·cm²·dmol⁻¹. A temperature denaturation profile was recorded at 220 nm by heating the

sample from 25°C to 95°C at a rate of 1°C/min, and apparent melting temperatures (Tmapp) were

derived from fitting of the data to the following equation:

[Equation not legible in the source text]

Where ΔH(Tm) is the enthalpy change at Tm, and ΔS(Tm) the entropy change at Tm.
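As a point of reference for the two parameters just defined, the sketch below implements the simplest two-state description that uses only ΔH(Tm) and ΔS(Tm): ΔG(T) = ΔH(Tm) - T·ΔS(Tm), with ΔS(Tm) = ΔH(Tm)/Tm because ΔG vanishes at Tm. This is an assumed textbook form that neglects any heat-capacity change and uses flat baselines; it is not necessarily the expression fitted in this work, and the function names and numerical values are illustrative only.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(temp_c, tm_c, dh_kj):
    """Two-state unfolded fraction with dG(T) = dH(Tm) - T*dS(Tm)."""
    t = temp_c + 273.15
    tm = tm_c + 273.15
    ds = dh_kj / tm              # dG(Tm) = 0 fixes dS(Tm) = dH(Tm)/Tm
    dg = dh_kj - t * ds          # unfolding free energy at temperature T
    k = math.exp(-dg / (R * t))  # equilibrium constant [unfolded]/[native]
    return k / (1.0 + k)


def observed_signal(temp_c, tm_c, dh_kj, native, unfolded):
    """Signal as a population-weighted mix of flat native and unfolded baselines."""
    fu = fraction_unfolded(temp_c, tm_c, dh_kj)
    return native + (unfolded - native) * fu


# Illustrative values only: unfolding enthalpy of 400 kJ/mol and Tm of 80 degrees C.
for temp in (60, 75, 80, 85, 95):
    print(temp, round(fraction_unfolded(temp, 80.0, 400.0), 3))
```

In a fit, `observed_signal` would be compared with the ellipticity at 220 nm while Tm, ΔH(Tm) and the two baselines are varied.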

ssDNA annealing activity

For ssDNA annealing assay, the [32P] end-labelled 57-nt oligo 4 (1 nM) (Table S2) was

incubated in annealing buffer (30 mM Tris-HCl, pH 7.5, 5 mM MgCl2, 75 mM NaCl, 50 mM

KCl and 1 mM DTT) with increasing amounts of gp18 at 25˚C for 5 min. The reaction was

initiated by adding the unlabelled complementary oligonucleotide 5 (1.2 nM) (Table S2) and

incubated at 50˚C for 15 min. The reaction was then stopped by the addition of 20 nM cold

oligo 4, 0.5% [wt/vol] SDS and 1 mg/ml proteinase K. The deproteination was carried out at

25˚C for 10 min and the samples were loaded on a 10% native polyacrylamide gel and run

at 100V for 1 h 20 min in 0.5 × TBE buffer. Following electrophoresis, gels were dried and

exposed to X-ray film for documentation. DNA was quantified using ImageQuant TL (GE

Healthcare Life Science).

Nuclease activity assays

The nuclease activity assays (20 µl) were performed by mixing 0.08 µM DNA duplex

(substrate A, B or C) or 0.05 µM M13mp18 DNA in reaction buffer containing 20 mM

Tris-HCl, pH 7.5, 50 mM NaCl, 5 mM MgCl2, 1 mM DTT and 10% glycerol. Reactions were

initiated by the addition of 0.5 µM SIRV2 gp19, and the mixtures were incubated at 50°C for

the indicated length of time. Time course analyses were carried out by scaling up the

reaction volume to 150 µl and withdrawing 20 µl aliquots at the indicated times. Reactions

were terminated by the addition of 6 µl stop solution (0.1% [wt/vol] bromophenol blue, 0.1%

[wt/vol] xylene cyanol, 8% [vol/vol] glycerol, 1% [wt/vol] SDS, 50 mM EDTA and 2 mg/ml

proteinase K). As a negative control, the substrates were incubated in the reaction mix in the

absence of protein gp19. Samples for the exonuclease assay were analyzed by 12%

polyacrylamide gel electrophoresis in 0.5 x TBE buffer and stained with SYBR® Gold (Life

Technologies). Samples for the endonuclease assay were resolved in a 0.7% agarose gel.
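The quantities implied by these reaction conditions can be worked out with a short sketch. It assumes the stated concentrations are final concentrations in the 20 µl reaction and that the stop solution is added to a 20 µl aliquot, giving 26 µl; the helper names are illustrative.

```python
def picomoles(conc_um, volume_ul):
    """Amount in pmol, using 1 uM x 1 ul = 1 pmol."""
    return conc_um * volume_ul


def max_aliquots(total_ul, aliquot_ul):
    """Number of full aliquots that a scaled-up reaction can supply."""
    return int(total_ul // aliquot_ul)


def diluted(stock_conc, added_ul, final_ul):
    """Concentration of a component after dilution into the final volume."""
    return stock_conc * added_ul / final_ul


dna_pmol = picomoles(0.08, 20)     # duplex substrate per 20 ul reaction
enzyme_pmol = picomoles(0.5, 20)   # gp19 per 20 ul reaction
print(dna_pmol, enzyme_pmol, enzyme_pmol / dna_pmol)
print(max_aliquots(150, 20))       # time points available from 150 ul
print(diluted(50.0, 6.0, 26.0))    # mM EDTA after adding 6 ul stop solution
print(diluted(5.0, 20.0, 26.0))    # mM MgCl2 in the stopped sample
```

Under these assumptions gp19 is in several-fold molar excess over the duplex substrate, a 150 µl reaction supplies seven 20 µl aliquots, and the EDTA in the stopped sample exceeds the MgCl2.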

Western blot analysis

Proteins were separated in a 12.5% SDS polyacrylamide gel and transferred with transfer

buffer (39 mM Glycine, 50 mM Tris pH 8.7, 0.04% SDS, 20% Methanol) onto a nitrocellulose

membrane (Whatman, Germany) at 70 mA for 1 h 15 min. The membrane was blocked for 1

h with PBST buffer (PBS buffer containing 0.05% Tween® 20) containing 8% milk powder

and then incubated with anti-His antibody (Qiagen, Germany, 1:2000 dilution in PBS buffer

containing 3% BSA). The membrane was further incubated for 1 h with a

peroxidase-coupled secondary antibody (anti-mouse IgG, 1:10 000, Sigma-Aldrich, America)

followed by three washes with PBS buffer. An alkaline phosphatase (Sigma-Aldrich, America)

substrate was used for detection according to the instructions provided.

In vitro detection of protein-protein interactions

The GST-tagged gp17-I, gp17-II (1-121aa), gp17-III (1-111aa) or gp19 remaining on the

GSH beads (50 µg) was incubated at 25°C for 1 h with His-tagged prey proteins (90 µg) in

PBS buffer (0.1 µg/ml BSA, 0.1% Triton X-100). The mixture was centrifuged for 3 min at

3000 × g and the supernatant was stored for later analysis. The beads were washed several

times with PBS buffer to remove unbound components and then heated in SDS loading

buffer at 98°C for 10 min before SDS-PAGE in a 12.5% gel. The gel was stained with PAGE

blue (Sigma Aldrich, UK) and the presence of His-tagged interaction partners was detected

by Western blot using anti-His antibody, as described above.

Identification of Sulfolobus proteins interacting with gp17

The gp17 fragment amplified by PCR (Table S1) was inserted between the NdeI and NotI

restriction sites of the Sulfolobus/E. coli shuttle vector pEXA2 (22), allowing the expression

of His-tagged gp17 under the control of an arabinose promoter. The constructed plasmid, as

well as the empty plasmid pEXA2, were then electroporated individually into the

uracil-deficient competent cells (23). Single colonies of the transformants were inoculated into test

tubes containing 5 ml SCV (basal medium supplemented with 0.2% sucrose, 0.2%

casamino acids and 1% vitamin solution) (23), and incubated in an Innova 3100 oil-bath

shaker. Large-scale culturing was performed in ACV medium (0.2% D-arabinose was

substituted for sucrose) in long-necked Erlenmeyer flasks.

Purification of the His-tagged gp17 from the transformed Sulfolobus cells was

performed as described above, and the eluted proteins were evaluated by 12.5%

SDS-PAGE and stained with PAGE blue (Sigma Aldrich, UK). Protein bands present

exclusively in the gp17-containing transformant were sliced from the gel and subjected to

MALDI-TOF analysis (Alphalyse A/S, Odense, Denmark).

RESULTS

Bioinformatic analysis revealed high conservation of SIRV2 gp17, gp18 and gp19 in

archaeal rudiviruses and filamentous viruses

gp17, gp18 and gp19 occur as a gene cluster in the genome of SIRV2 and were shown

previously to be transcribed from a single promoter generating a polycistronic transcript (10).

The operon organization suggests the three genes are functionally related. However, except for

gp19, which was experimentally determined to possess ssDNA endonuclease activity (21),

very little is known about the functions of gp17 and gp18. The crystal structure of the gp17

homolog encoded by SIRV1 has been solved (24), but no functional insight is available.

Although a weak similarity between a limited part of gp18 and bacterial ATPase domains of

Lon proteases was described (15), a tertiary structure prediction using the threading

program Phyre2 (25) suggested a different function. About two thirds of the gp18 sequence

(residues 140 - 430) matched with high confidence (98.8%) to MCM homolog 2 (c3f8tA)

from Methanopyrus kandleri, suggesting that gp18 might be a hexameric helicase.

The three genes are conserved in all rudiviruses and in one of the filamentous

viruses, AFV1 (Fig. 1). The amino acid (aa) sequence similarities range between 36 - 96%

for gp17 homologs, 51 - 100% for gp18 homologs and 51 – 98% for gp19 homologs.

Although no significant sequence similarity was detected between SIRV2 gp19 and the

other filamentous viral genomes, a putative nuclease is clearly encoded by the latter (except

AFV2) and it belongs to the Cas4 superfamily similar to SIRV2 gp19 (6;26). Interestingly, a

highly conserved gene encoding a putative helicase is found upstream of the putative

nuclease gene in all filamentous viral genomes. Upstream of the putative helicase gene is

another highly conserved gene encoding a 79 aa hypothetical protein showing no sequence

similarity to SIRV2 gp17 (Fig. 1). It is not clear whether this small gene is functionally related

to SIRV2 gp17, while homologs or analogs of SIRV2 gp18 and gp19 are present in almost

all rudiviral and filamentous viral genomes. Genome comparison of all the rudiviruses and

filamentous viruses using the Mutagen program (27) revealed that SIRV2 gp17, gp18 and

gp19 constitute the only conserved gene cluster in the archaeal linear viruses (Fig. S1 and

Table S3). Thus, the three genes appear important for both viral families.

gp17 is a single-stranded DNA binding protein

Structural features of SIRV2 gp17. To gain insights into the functions of the gene cluster, we

first examined the structure of gp17, which forms a dimer in the crystals (Protein Data Bank

identifier [ID] 2X5T) (24) (Fig. S2A). Although it does not show obvious structural similarity to

any known domains present in the Protein Data Bank (PDB), the electrostatics and the shape of

the molecule indicate a DNA binding activity. It is dominated by basic (blue) residues on the

concave side (Fig. S2C) and by acidic (red) residues on the convex side (Fig. S2D). The

concave side fits well as a DNA straddling pocket. Two arginine side chains point down in

the middle of the arch. The dimer binds a sulphate group, which may reflect its affinity

towards phosphates.

The total length of gp17 is 131 residues, and structural information is missing for 38

residues at the C-terminus. In line with this lack of structural information, analysis of the

sequence with protein disorder predictors revealed a potential intrinsically disordered

C-terminus of about 35 residues (28-30) (Fig. S3). Analysis of gp17 homologs encoded by

other rudiviruses and AFV1 revealed intrinsically disordered C-termini of similar size (data

not shown), indicating the importance of the disorder in this domain for the function of the

protein.

ssDNA binding activity of gp17. gp17 was amplified from the SIRV2 genome and

cloned into E. coli vector pET30a. The C-terminally His-tagged protein was purified from

E. coli to homogeneity (Fig. S4) and tested for binding activity to different DNA substrates.

As shown in Fig. 2A, gp17 binds to substrates that are either ssDNA or dsDNA containing

one or two flaps, whereas no binding to the blunt-ended dsDNA was detected with the

same range of protein concentrations. The same result was obtained when ssDNA and

blunt-ended dsDNA were mixed at equal molar concentrations in the reaction, where

almost all ssDNA, but no or very little dsDNA, were shifted in the presence of 130 nM gp17

(Fig. 2B). At higher concentrations of gp17, no free ssDNA is available, and the protein

exhibited binding to the dsDNA, albeit with a much lower affinity. At a gp17 concentration of

3.9 µM, a significant amount of the dsDNA still remains unshifted, indicating that the affinity

of gp17 towards ssDNA is at least 30 times higher than to dsDNA.
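The lower bound quoted above follows directly from the two concentrations. The sketch below makes the arithmetic explicit; the function name is illustrative, and the result is only a bound because the dsDNA was still not fully shifted at the higher concentration.

```python
SSDNA_SHIFT_NM = 130.0       # nearly all ssDNA shifted at this gp17 concentration
DSDNA_UNSHIFTED_NM = 3900.0  # dsDNA still partly unshifted at 3.9 uM gp17


def min_fold_preference(conc_full_shift_a, conc_incomplete_shift_b):
    """Lower bound on how many times more tightly substrate A is bound than B."""
    return conc_incomplete_shift_b / conc_full_shift_a


print(min_fold_preference(SSDNA_SHIFT_NM, DSDNA_UNSHIFTED_NM))
```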

A few positively charged residues forming a U-shaped binding channel on the gp17

dimer are crucial for its ssDNA binding activity

To identify essential elements of the ssDNA-binding domain of gp17, we first aligned the

sequences of gp17 homologs and identified 3 fully conserved positively charged residues,

R60, K61 and K82 (Fig.S5). R60 and K61 are located in a loop at the central cleft of the

concave side, while K82 is found on the surface of the convex side (Fig. 3A). The three

residues were mutated individually into alanine and the binding affinity of the mutant

proteins to ssDNA was compared to that of the wild-type gp17. Whereas the K82A variant

exhibited a similar level of binding affinity as the WT gp17, 2-fold and 5-fold reductions in

binding affinity were observed for the R60A and the K61A variants, respectively.

Interestingly, simultaneous mutation of R60 and K61 to alanine abolished almost completely

its ssDNA binding activity (Fig. 3B).

Four other positively charged residues, R24, K27, K29 and R33, are relatively

conserved in the rudiviruses (Fig S5), and the corresponding residue of SIRV2 R33 in AFV1

ORF135 (K32) is also positively charged. Therefore the four residues were mutated to

alanine either individually or simultaneously. While the double mutant R24K27 and the triple

mutant R24K27K29 showed very little or only a mild reduction in the binding affinity (Fig. S6),

the R33 mutant demonstrated the most profound effect within the tested single mutants,

with an 8-fold drop of the binding activity observed (Fig. 3B and 3C).

The above experiments demonstrated that R33, R60 and K61 are important for the

ssDNA binding whereas R24, K27, K29 and K82 are less or not important. By examining the

location and the orientation of the residues, it is obvious that the side chains of the former

residues point to the central cleft and those of the latter residues point to the outer surface

(Fig. 3A). It appears that the residues important for DNA binding (R33, R60, K61) form a

positively charged and U-shaped structure in the gp17 dimer, thus straddling the ssDNA and

causing it to bend (Fig. 3D).

Within the U-shaped path another relatively conserved residue, H54, has a positive

charge (Fig. S5) and was thus mutated to alanine to test its possible contribution to ssDNA

binding. In line with its charge, conservation and location in the protein, H54 appears also

important for ssDNA binding, as the H54A mutant demonstrated a 4-fold drop of binding

activity compared with the WT protein (Fig. 3B, 3C and 3D).
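The single-mutant results reported in this section can be collected in a small table for comparison. The sketch below includes only the variants for which an explicit fold reduction is stated; the variable and function names are illustrative.

```python
# Fold reduction in ssDNA binding relative to WT gp17, as stated in the text.
fold_reduction = {"R33A": 8.0, "K61A": 5.0, "H54A": 4.0, "R60A": 2.0}


def rank_by_effect(effects):
    """Order variants from the strongest to the weakest binding defect."""
    return sorted(effects, key=effects.get, reverse=True)


def relative_affinity(fold):
    """Binding relative to wild type (1.0 means unchanged)."""
    return 1.0 / fold


for variant in rank_by_effect(fold_reduction):
    print(variant, fold_reduction[variant], relative_affinity(fold_reduction[variant]))
```

The R60A/K61A double mutant is left out because its binding was almost abolished and no fold value is given.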

Purification, refolding and stability of gp18

To characterize gp18 biochemically, a certain amount of soluble protein was needed. As

gp18 could not be cloned into Sulfolobus (see below) due to its toxicity and as recombinant

expression in E. coli resulted in the formation of inclusion bodies, a denaturation and

refolding strategy was employed to purify the His-tagged gp18 from E. coli. Following cell

lysis, the inclusion bodies were pelleted and dissolved in 8 M urea (Fig. 4A lane 3), and

gp18-His was purified using Ni-NTA-agarose beads. The denatured gp18-His was then

refolded in L-Arginine buffer (31).

The protein appeared to be properly refolded as it remained soluble in the solution after 20

min incubation at 70°C (Fig. 4A lane 4). To assess the fold integrity of gp18 after

recombinant production and refolding, the final preparation was subjected to structure

analyses by CD spectroscopy. The far-UV CD spectrum recorded at room temperature

revealed distinct negative molar ellipticity with minima at 218 and 208 nm, strongly

indicating that the protein is folded with content of both α-helices and β-strands (Fig. 4B).

Additionally, a temperature denaturation monitored at 220 nm showed that the protein was

stable with two cooperative transitions, one with an apparent melting temperature (Tmapp) of

~60°C and a major, highly cooperative transition with a Tmapp of 91°C, ΔH(Tm) = -469 ± 15

kJ/mol and ΔS(Tm) = -1.29 ± 0.04 kJ/(mol·K) (Fig. 4C). Because of the extreme stability of the

protein, the post-transition region was not fully resolved under the current experimental conditions,

and hence the thermodynamic parameters are approximations. Nevertheless, the

recombinantly produced gp18 was highly stable and cooperatively folded, with a content of

both α-helices and β-strands.
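The reported enthalpy and entropy values can be checked against each other. The sketch below assumes the two-state relation ΔG(Tm) = ΔH(Tm) - Tm·ΔS(Tm) = 0, so that ΔS(Tm) = ΔH(Tm)/Tm with Tm in kelvin; the function name is illustrative.

```python
def entropy_at_tm(dh_kj_per_mol, tm_celsius):
    """dS(Tm) = dH(Tm)/Tm, because dG = dH - T*dS vanishes at Tm."""
    return dh_kj_per_mol / (tm_celsius + 273.15)


ds = entropy_at_tm(-469.0, 91.0)
print(f"{ds:.2f} kJ/(mol*K)")  # prints -1.29 kJ/(mol*K)
```

The result reproduces the reported magnitude of ΔS(Tm), which indicates that the entropy value is expressed per kelvin.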

The oligomerization status of the refolded gp18 was analysed by gel filtration

chromatography, which resulted in the formation of a broad peak containing two “shoulders”

with a total elution volume of 23.6 ml at a flow rate of 0.5 ml/min (Fig. 4D). The elution volume

of the main peak (labelled 1 in Fig. 4D) was between 9.85 and 10.60 ml and those of the two

“shoulders” were 8.05 ml and 12.75 ml, respectively. Assuming that gp18 has a shape and

partial specific volume similar to those of standard proteins, the molecular mass of the main

peak is estimated to be between 685 kDa and 458 kDa, whereas the first “shoulder” eluted in the

void volume (Vo) and the second corresponds to 147 kDa, as calculated from a standard linear regression equation,

Kav=-0.2976(logMW)+1.8388 (Fig.S7). Since the molecular weight of the monomeric gp18 is

50.36 kDa, the protein was refolded as a series of oligomers ranging from trimers through nonamers to

dodecamers. The molecular mass at the apex of the main peak (560.6 kDa) lies between that of nonamers and

dodecamers, which might be the functional forms of gp18.
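The mass estimates can be related to subunit numbers with a short sketch based on the calibration line and the monomer mass given above. It assumes the usual definition Kav = (Ve - Vo)/(Vt - Vo) and that MW in the regression is expressed in Da; the void and total volumes used for the calibration are not stated explicitly, so the volumes in the example call are hypothetical, as are the function names.

```python
import math

SLOPE, INTERCEPT = -0.2976, 1.8388  # calibration: Kav = SLOPE*log10(MW) + INTERCEPT
MONOMER_KDA = 50.36                 # gp18 monomer mass from the text


def kav(elution_ml, void_ml, total_ml):
    """Partition coefficient from elution, void and total column volumes."""
    return (elution_ml - void_ml) / (total_ml - void_ml)


def mw_from_kav(kav_value):
    """Invert the calibration line; the result is in the MW units of the fit."""
    return 10 ** ((kav_value - INTERCEPT) / SLOPE)


def subunits(mass_kda, monomer_kda=MONOMER_KDA):
    """Nearest whole number of monomers for a given apparent mass."""
    return round(mass_kda / monomer_kda)


print(mw_from_kav(kav(10.0, 8.0, 24.0)))  # hypothetical volumes, for illustration
for mass in (147.0, 458.0, 560.6, 685.0):
    print(mass, subunits(mass))
```

With these numbers, 147 kDa and 458 kDa correspond to about three and nine subunits, whereas the upper estimate of 685 kDa lies somewhat above the mass of a dodecamer, so the assignments should be read as approximate.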

gp18 stimulates the annealing of complementary oligonucleotides

Since structural prediction suggested that gp18 could be a hexameric helicase such as

MCM, an essential protein for the initiation and elongation phases of DNA replication (32),

we performed helicase assays. However, no helicase activity was detected in spite of

multiple trials with different nucleic acid substrates and varied experimental conditions with

different metal ions, NTPs, and temperatures (data not shown). Surprisingly, during the

helicase assays, we found that instead of unwinding dsDNA, gp18 seemed to be able to

increase the dsDNA yield from two complementary oligonucleotides.

To determine whether gp18 catalyzes ssDNA annealing, a [32P]-labelled 57-nt

oligonucleotide (oligo-4 in Table S2) was first mixed with gp18. The annealing reactions

were initiated by the addition of the complementary strand (oligo-5) and stopped by a

20-fold excess of the unlabeled oligo-4. After deproteination, the reaction products were

resolved on a native PAGE gel. gp18-mediated annealing was dependent upon protein

concentration, with the reaction being most efficient at 200 nM of gp18 under the

experimental conditions (Fig. 5A, lane 5), also supporting the oligomeric structure of the

protein. Moreover, the efficiency of annealing was not changed when ATP was excluded

from the reaction (Fig. 5A, lane 6), suggesting that ATP hydrolysis is not needed in the

catalyzed process.

As shown in the left panel of Fig. 5B, spontaneous annealing between the two

oligonucleotides occurred slowly in a time-dependent manner with only 40% of the ssDNA

annealed after 4 minutes of incubation (Fig. 5C). The annealing process was drastically

accelerated by gp18 (right panel of Fig. 5B). The oligonucleotides were almost completely

annealed to form the slow-migrating dsDNA after 4 minutes of incubation in the presence of

gp18 (Fig. 5B and 5C).
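
The quantification in Fig. 5C can be sketched as a ratio of band intensities. The Python snippet below assumes that the annealed fraction is the dsDNA band signal divided by the total signal in the lane, which is a common convention but is not spelled out in the text. The function name and the intensity values are hypothetical placeholders, not measurements from this study.

```python
def percent_annealed(ds_intensity: float, ss_intensity: float) -> float:
    """Percentage of labelled oligonucleotide found in the dsDNA band of one lane."""
    total = ds_intensity + ss_intensity
    if total <= 0:
        raise ValueError("lane contains no signal")
    return 100.0 * ds_intensity / total


# Hypothetical band intensities (arbitrary units), NOT data from Fig. 5B.
lanes = {"lane_a": (120.0, 880.0), "lane_b": (400.0, 600.0)}
for name, (ds, ss) in lanes.items():
    print(name, f"{percent_annealed(ds, ss):.0f}% annealed")
```

Normalising to the total signal in each lane makes the value independent of how much labelled oligonucleotide was loaded.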

gp19 demonstrates both ssDNA endonuclease and 5´-3´ exonuclease activities

Although gp19 was previously shown to be an ssDNA endonuclease (21), it shares sequence

similarity with different CRISPR-associated Cas4 proteins (26), which possess

metal-dependent endonuclease and 5´→3´ exonuclease activities against ssDNA (33). We

therefore examined the possible exonuclease activity of gp19 using DNA substrates of

different structures. As shown in Fig. 6A, the migration of the blunt-end duplex DNA did not

change upon the addition of gp19, indicating that it is not a substrate of gp19. While the

3´-flap duplex DNA remained unchanged, like the blunt-ended DNA, the 5´-flap duplex DNA

was cleaved, with the final product having the same size as the blunt-ended DNA. It

appeared that gp19 initiated cleavage from the 5´ single-strand end and stopped at the

junction between single-stranded and double-stranded DNA (Fig. 6B). This indicates that gp19 has 5´→3´

ssDNA exonuclease activity.

To confirm the ssDNA endonuclease activity of gp19, the circular ssDNA M13mp18

was tested in the same reaction buffer. As shown in Fig. 6C, incubation with gp19 led

to slow degradation of the circular ssDNA. These results confirm that SIRV2 gp19

possesses both 5´→3´ exonuclease activity and endonuclease activity against ssDNA.

Interactions between gp17 and gp18 and between gp18 and gp19

Given that gp17, gp18 and gp19 all work on the same type of substrate, ssDNA, it appeared

possible that the three proteins interact with one another. Since the removal of the last 10

residues showed little effect on the DNA binding activity of gp17 (Fig. S6), the intrinsically

disordered C-terminus is possibly involved in other functions such as protein-protein

interactions as demonstrated for the disordered C-terminus of bacterial SSB proteins (34).

Therefore, we first tested its possible interactions with gp18 and gp19.

gp17 and its C-terminally truncated variants were expressed as GST fusion proteins

and purified separately (Fig. 7A). Individual GST fusion proteins were incubated with

His-tagged gp18 and immobilized on GSH beads. After centrifugation, the beads were

washed and boiled in SDS buffer before loading on an SDS gel for Western blotting. Western

blotting with an anti-His-tag antibody revealed the presence of gp18 on the GSH beads carrying

immobilized wild-type gp17, indicative of an interaction between gp17 and gp18 (Fig.

7B). However, no interaction was detected between gp18 and the two gp17 variants with 10

and 20 C-terminal residues removed, respectively (Fig. 7C). These results demonstrate

that gp17 interacts with gp18 and that the C-terminal disordered domain of gp17 is essential for

the interaction.

The same method was applied to test possible interactions between GST-tagged

gp17 and His-tagged gp19, and between GST-tagged gp19 and His-tagged gp17, neither of

which gave a positive result (data not shown). While no interaction was detected between

gp17 and gp19, the GST-tagged gp19 retained a small amount of His-tagged gp18 on the

GSH beads (Fig. 7B), demonstrating a weak interaction between gp18 and gp19.

gp17 binds to two Sulfolobus host proteins

ssDNA binding proteins are essential for protecting ssDNA and recruiting specific

ssDNA-processing proteins. In bacteria, SSBs were found to interact with more than a

dozen different proteins involved in DNA replication, recombination and repair (34). To

identify possible interactions with other proteins, gp17 was cloned into the E. coli/Sulfolobus

shuttle vector pEXA2 under the control of an arabinose promoter (22) and expressed in

Sulfolobus. By Ni-NTA-agarose chromatography, the His-tagged gp17 was co-purified with

two large proteins of about 60 and 150 kDa, respectively (Fig. 7D). The absence of the two

bands in proteins purified from control cells transformed with empty pEXA2 indicated

that they were pulled down specifically by gp17. Western blotting with an anti-His-tag

antibody revealed a single band of the size expected for gp17-His, and the two large bands

were thus not oligomers of gp17-His (Fig. 7E).

The two bands were sliced from the gel and identified by MALDI-TOF analysis. Band

1 contained a hypothetical protein encoded by SSO2277 with a theoretical mass of 57 kDa,

carrying an ATPase domain. Band 2 was identified as reverse gyrase from S. solfataricus

P2 (SSO0422) with a mass of 142 kDa (Table S4). The same procedure was repeated with

SIRV2-infected transformants and gave the same results (data not shown). No

viral proteins such as gp18 were co-purified with gp17, which could be due to low

expression of gp18, as demonstrated previously by microarray analysis (12).

We attempted to clone gp18 and gp19 individually into Sulfolobus using pEXA2 as

cloning vector. Whereas gp18 was shown to be highly toxic and could not be transformed into

Sulfolobus, overexpression of gp19 caused growth retardation of the transformant, and no

host proteins were identified to interact with gp19 (data not shown).

DISCUSSION

Single-stranded DNA binding proteins are ubiquitous across all three domains of life and are

found in many viruses, where they play essential roles in genome maintenance, DNA replication,

recombination, repair and transcription. They can coat and protect ssDNA intermediates and remove their secondary

structures. In addition, some specific ssDNA-processing proteins

are recruited and coordinated by ssDNA binding proteins during DNA metabolism pathways

(35-37). In spite of high sequence, structural and functional divergence, almost all classical

ssDNA binding proteins contain one of the following four structural topologies:

oligonucleotide/oligosaccharide/oligopeptide-binding (OB) folds, K homology (KH) domains,

RNA recognition motifs (RRMs), and whirly domains (38). Recently, a group of

hyperthermophilic archaea was found to lack a classical ssDNA binding protein

and instead to harbour a distinct ssDNA binding protein termed ThermoDBP (39). The

ssDNA binding protein encoded by SIRV2 gp17 differs in structure from the classical ssDNA

binding proteins as well as from the ThermoDBPs, and thus constitutes a novel

non-canonical ssDNA binding protein.

Single-strand annealing activity has been detected in different proteins, including

some helicases and recombinases encoded by cellular organisms and by some viruses (reviewed

in (40)). In many of the helicases with annealing activity, a separate protein domain

distinct from the helicase domain is responsible for the annealing activity (40). Remarkably,

a helicase domain-containing protein, HARP, was recently discovered to possess annealing,

but no unwinding, activity (41). HARP binds to the ssDNA binding protein RPA and anneals

RPA-coated complementary ssDNA. Mutations in HARP are associated with Schimke

Immuno-Osseous Dysplasia (SIOD), and the defects in the annealing activity of two

HARP mutants correlate with the severity of the disease (41). Together with AH2, another

protein with similar features (42), HARP was termed an annealing helicase. In this study,

annealing activity was clearly demonstrated for the SIRV2 gp18 protein. The failure to

detect the helicase activity predicted by structural modelling of the gp18

sequence could be due to unsuitable experimental conditions or to masking of

helicase activity by the stronger annealing activity. A third possibility is that gp18 carries no

helicase activity, as demonstrated for the annealing helicases. While only structural

modelling revealed a connection between SIRV2 gp18 and an MCM helicase, a high

sequence similarity to Cas3 and other helicases was clearly detected by BlastP searches of

the gp18 analogues encoded in the genomes of most filamentous viruses (Fig. 1).

Interestingly, the E. coli Cas3 was found to possess both helicase and annealing activities

(43).

To better understand the function of the entire gene operon, the protein product of

the third gene, gp19, was further characterized in this study, which revealed a 5´-3´ ssDNA

exonuclease activity in addition to the previously demonstrated ssDNA endonuclease

activity (Fig. 6 and (21)). The operonic or clustered organization of the three

genes in rudiviruses and filamentous viruses (Fig. 1) and the observed interactions between their

protein products (Fig. 7) strongly suggest that they cooperate closely in the same process(es)

involving ssDNA. The SIRV2 genome replication study by different approaches

demonstrated that SIRV2 forms ssDNA intermediates larger than a single genome size, and

large concatemers are abundant during the replication process (Martinez-Alvarez et al., in

preparation). This requires, first of all, an abundant ssDNA binding protein to protect the ssDNA

intermediates, and the highly expressed gp17 (12) may fulfil this requirement. To mature into

dsDNA monomers, the long ssDNA concatemers must anneal between the two

complementary strands, which could be facilitated by gp18. Subsequent nicking by an ssDNA

endonuclease and ligation by an unknown ligase would produce a mature dsDNA genome.

Through protein-protein interactions, gp17 can recruit gp18 to facilitate ssDNA annealing,

whereas gp19 can be recruited by gp18 to perform the final cleavage. In support of this

scenario, gp17 was found to be still present in high amounts at the late stage of the SIRV2 life

cycle, together with the tail-fiber protein of SIRV2 virions (11). Thus, it is very likely that the

gene operon is involved in genome maturation of SIRV2 replicative intermediates.

Another common and interesting feature shared by the rudiviruses and filamentous

viruses is the presence of multiple 12 bp insertions/deletions (indels) in their genomes,

revealed by sequence alignment between closely related viral genomes and between

homologous genes (6;44). In the latter case where the nucleotide sequences diverged too

much to be aligned, amino acid sequence alignment between homologs allowed the

detection of single or multiple 4-residue indels. Given the proposed function of Cas4 in

CRISPR spacer acquisition and the fact that gp19 belongs to the Cas4 nuclease

superfamily (26;33), it is possible that gp19 is involved in the generation of the 12 bp indels,

and the annealing activity of gp18 fits well with both insertion and deletion scenarios.

Following strand-displacement replication, as proposed for both AFV1 (45) and SIRV2 ((19);

unpublished data from Martinez-Alvarez et al.), ssDNA bubbles may arise frequently and

spontaneously during genome maturation. Repair of such structures involving ssDNA

binding protein (gp17), annealing protein (gp18) and ssDNA nuclease (gp19) could in

principle produce either insertions or deletions.

A third possible function of the gene operon is recombination involved in general

repair or replication initiation, which has been shown to be important for many viruses (e.g. T4

(46)). After dsDNA unwinding by a helicase, which remains to be identified in this case,

ssDNA nuclease, binding and annealing activities are all needed in the classical

recombination processes (47), and the identified functions of the three proteins fit well with

this scenario. In support of this, Phyre2 structural modelling of SSO2277,

a Sulfolobus protein interacting with gp17 (Fig. 7D) and annotated as hypothetical, revealed

a good match (99.9% confidence over half of the protein) with proteins of the family that includes RecF,

RecN and Rad50 (data not shown). The latter proteins are involved in recombination (48).

In conclusion, this is the first study providing the functional characterization of an

entire gene operon conserved in archaeal rudiviruses and filamentous viruses. Due to low or

no sequence homology with characterized proteins, the majority of archaeal viral genes

remain hypothetical. This has hindered the progress of the archaeal virology field. The

results from this study will therefore contribute to better understanding of the novel viruses

infecting Archaea, the third domain of life. More importantly, the sequence and/or structural

divergence of the three proteins from previously characterized ssDNA binding, annealing

and nuclease proteins not only adds novelty but also provides important information for

evolutionary studies of these proteins, which are nearly ubiquitous from bacteria and archaea to

eukaryotes including humans.

FUNDING

This work was supported by the European Union Framework 7 program 265933. Y.G.

received a stipend from the China Scholarship Council.

REFERENCES

1. Pietila,M.K., Roine,E., Paulin,L., Kalkkinen,N. and Bamford,D.H. (2009) An ssDNA virus

infecting archaea: a new lineage of viruses with a membrane envelope. Mol.

Microbiol., 72, 307-319.

2. Mochizuki,T., Krupovic,M., Pehau-Arnaudet,G., Sako,Y., Forterre,P. and Prangishvili,D.

(2012) Archaeal virus with exceptional virion architecture and the largest

single-stranded DNA genome. Proc. Natl. Acad. Sci. U. S. A, 109, 13386-13391.

3. Pietila,M.K., Demina,T.A., Atanasova,N.S., Oksanen,H.M. and Bamford,D.H. (2014)

Archaeal viruses and bacteriophages: comparisons and contrasts. Trends Microbiol.,

22, 334-344.

4. Prangishvili,D., Arnold,H.P., Gotz,D., Ziese,U., Holz,I., Kristjansson,J.K. and Zillig,W.

(1999) A novel virus family, the Rudiviridae: Structure, virus-host interactions and

genome variability of the Sulfolobus viruses SIRV1 and SIRV2. Genetics, 152,

1387-1396.

5. Blum,H., Zillig,W., Mallok,S., Domdey,H. and Prangishvili,D. (2001) The genome of the

archaeal virus SIRV1 has features in common with genomes of eukaryal viruses.

Virology, 281, 6-9.

6. Vestergaard,G., Aramayo,R., Basta,T., Haring,M., Peng,X., Brugger,K., Chen,L.,

Rachel,R., Boisset,N., Garrett,R.A. et al. (2008) Structure of the Acidianus filamentous

virus 3 and comparative genomics of related archaeal lipothrixviruses. J. Virol., 82,

371-381.

7. Vestergaard,G., Haring,M., Peng,X., Rachel,R., Garrett,R.A. and Prangishvili,D. (2005)

A novel rudivirus, ARV1, of the hyperthermophilic archaeal genus Acidianus. Virology,

336, 83-92.

8. Servin-Garciduenas,L.E., Peng,X., Garrett,R.A. and Martinez-Romero,E. (2013)

Genome sequence of a novel archaeal rudivirus recovered from a Mexican hot spring.

Genome Announc., 1.

9. Peng,X., Blum,H., She,Q., Mallok,S., Brugger,K., Garrett,R.A., Zillig,W. and

Prangishvili,D. (2001) Sequences and replication of genomes of the archaeal

rudiviruses SIRV1 and SIRV2: relationships to the archaeal lipothrixvirus SIFV and

some eukaryal viruses. Virology, 291, 226-234.

10. Kessler,A., Brinkman,A.B., van der Oost,J. and Prangishvili,D. (2004) Transcription of

the rod-shaped viruses SIRV1 and SIRV2 of the hyperthermophilic archaeon

Sulfolobus. J. Bacteriol., 186, 7745-7753.

11. Quax,T.E., Krupovic,M., Lucas,S., Forterre,P. and Prangishvili,D. (2010) The Sulfolobus

rod-shaped virus 2 encodes a prominent structural component of the unique virion

release system in Archaea. Virology, 404, 1-4.

12. Okutan,E., Deng,L., Mirlashari,S., Uldahl,K., Halim,M., Liu,C., Garrett,R.A., She,Q. and

Peng,X. (2013) Novel insights into gene regulation of the rudivirus SIRV2 infecting

Sulfolobus cells. RNA. Biol., 10, 875-885.

13. Deng,L., He,F., Bhoobalan-Chitty,Y., Martinez-Alvarez,L., Guo,Y. and Peng,X. (2014)

Unveiling cell surface and type IV secretion proteins responsible for archaeal

rudivirus entry. J. Virol., 88, 10264-10268.

14. Prangishvili,D., Garrett,R.A. and Koonin,E.V. (2006) Evolutionary genomics of archaeal

viruses: unique viral genomes in the third domain of life. Virus Res., 117, 52-67.

15. Prangishvili,D., Koonin,E.V. and Krupovic,M. (2013) Genomics and biology of

Rudiviruses, a model for the study of virus-host interactions in Archaea. Biochem. Soc.

Trans., 41, 443-450.

16. Vestergaard,G., Shah,S.A., Bize,A., Reitberger,W., Reuter,M., Phan,H., Briegel,A.,

Rachel,R., Garrett,R.A. and Prangishvili,D. (2008) Stygiolobus rod-shaped virus and

the interplay of crenarchaeal rudiviruses with the CRISPR antiviral system. J.

Bacteriol., 190, 6837-6845.

17. Guilliere,F., Peixeiro,N., Kessler,A., Raynal,B., Desnoues,N., Keller,J., Delepierre,M.,

Prangishvili,D., Sezonov,G. and Guijarro,J.I. (2009) Structure, function, and targets of

the transcriptional regulator SvtR from the hyperthermophilic archaeal virus SIRV1. J.

Biol. Chem., 284, 22222-22237.

18. Quax,T.E., Voet,M., Sismeiro,O., Dillies,M.A., Jagla,B., Coppee,J.Y., Sezonov,G.,

Forterre,P., van der Oost,J., Lavigne,R. et al. (2013) Massive activation of archaeal

defense genes during viral infection. J. Virol., 87, 8419-8428.

19. Oke,M., Kerou,M., Liu,H., Peng,X., Garrett,R.A., Prangishvili,D., Naismith,J.H. and

White,M.F. (2011) A dimeric Rep protein initiates replication of a linear archaeal virus

genome: implications for the Rep mechanism and viral replication. J. Virol., 85,

925-931.

20. Gardner,A.F., Guan,C. and Jack,W.E. (2011) Biochemical characterization of a

structure-specific resolving enzyme from Sulfolobus islandicus rod-shaped virus 2.

PLoS. One., 6, e23668.

21. Gardner,A.F., Prangishvili,D. and Jack,W.E. (2011) Characterization of Sulfolobus

islandicus rod-shaped virus 2 gp19, a single-strand specific endonuclease.

Extremophiles., 15, 619-624.

22. Gudbergsdottir,S., Deng,L., Chen,Z., Jensen,J.V., Jensen,L.R., She,Q. and Garrett,R.A.

(2011) Dynamic properties of the Sulfolobus CRISPR/Cas and CRISPR/Cmr systems

when challenged with vector-borne viral and plasmid genes and protospacers. Mol.

Microbiol., 79, 35-49.

23. Deng,L., Zhu,H., Chen,Z., Liang,Y.X. and She,Q. (2009) Unmarked gene deletion and

host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus.

Extremophiles., 13, 735-746.

24. Oke,M., Carter,L.G., Johnson,K.A., Liu,H., McMahon,S.A., Yan,X., Kerou,M.,

Weikart,N.D., Kadi,N., Sheikh,M.A. et al. (2010) The Scottish Structural Proteomics

Facility: targets, methods and outputs. J. Struct. Funct. Genomics, 11, 167-180.

25. Kelley,L.A. and Sternberg,M.J. (2009) Protein structure prediction on the Web: a case

study using the Phyre server. Nat. Protoc., 4, 363-371.

26. Zhang,J., Kasciukovic,T. and White,M.F. (2012) The CRISPR associated protein Cas4 Is

a 5' to 3' DNA exonuclease with an iron-sulfur cluster. PLoS. One., 7, e47232.

27. Brugger,K., Redder,P. and Skovgaard,M. (2003) MUTAGEN: multi-user tool for

annotating genomes. Bioinformatics., 19, 2480-2481.

28. Xue,B., Dunbrack,R.L., Williams,R.W., Dunker,A.K. and Uversky,V.N. (2010)

PONDR-FIT: a meta-predictor of intrinsically disordered amino acids. Biochim. Biophys.

Acta, 1804, 996-1010.

29. Dosztanyi,Z., Csizmok,V., Tompa,P. and Simon,I. (2005) IUPred: web server for the

prediction of intrinsically unstructured regions of proteins based on estimated energy

content. Bioinformatics., 21, 3433-3434.

30. Munoz,V. and Serrano,L. (1994) Elucidating the folding problem of helical peptides

using empirical parameters. Nat. Struct. Biol., 1, 399-409.

31. Kawano,S., Iyaguchi,D., Okada,C., Sasaki,Y. and Toyota,E. (2013) Expression,

purification, and refolding of active recombinant human E-selectin lectin and EGF

domains in Escherichia coli. Protein J., 32, 386-391.

32. Chong,J.P., Hayashi,M.K., Simon,M.N., Xu,R.M. and Stillman,B. (2000) A

double-hexamer archaeal minichromosome maintenance protein is an

ATP-dependent DNA helicase. Proc. Natl. Acad. Sci. U. S. A, 97, 1530-1535.

33. Lemak,S., Beloglazova,N., Nocek,B., Skarina,T., Flick,R., Brown,G., Popovic,A.,

Joachimiak,A., Savchenko,A. and Yakunin,A.F. (2013) Toroidal structure and DNA

cleavage by the CRISPR-associated [4Fe-4S] cluster containing Cas4 nuclease SSO0001

from Sulfolobus solfataricus. J. Am. Chem. Soc., 135, 17476-17487.

34. Shereda,R.D., Kozlov,A.G., Lohman,T.M., Cox,M.M. and Keck,J.L. (2008) SSB as an

organizer/mobilizer of genome maintenance complexes. Crit Rev. Biochem. Mol. Biol.,

43, 289-318.

35. Bochkarev,A. and Bochkareva,E. (2004) From RPA to BRCA2: lessons from

single-stranded DNA binding by the OB-fold. Curr. Opin. Struct. Biol., 14, 36-42.

36. Suck,D. (1997) Common fold, common function, common origin? Nat. Struct. Biol., 4,

161-165.

37. Theobald,D.L., Mitton-Fry,R.M. and Wuttke,D.S. (2003) Nucleic acid recognition by

OB-fold proteins. Annu. Rev. Biophys. Biomol. Struct., 32, 115-133.

38. Dickey,T.H., Altschuler,S.E. and Wuttke,D.S. (2013) Single-stranded DNA-binding

proteins: multiple domains for multiple functions. Structure., 21, 1074-1084.

39. Paytubi,S., McMahon,S.A., Graham,S., Liu,H., Botting,C.H., Makarova,K.S., Koonin,E.V.,

Naismith,J.H. and White,M.F. (2012) Displacement of the canonical single-stranded

DNA-binding protein in the Thermoproteales. Proc. Natl. Acad. Sci. U. S. A, 109,

E398-E405.

40. Wu,Y. (2012) Unwinding and rewinding: double faces of helicase? J. Nucleic Acids,

2012, 140601.

41. Yusufzai,T. and Kadonaga,J.T. (2008) HARP is an ATP-driven annealing helicase.

Science, 322, 748-750.

42. Yusufzai,T. and Kadonaga,J.T. (2010) Annealing helicase 2 (AH2), a DNA-rewinding

motor with an HNH motif. Proc. Natl. Acad. Sci. U. S. A, 107, 20970-20973.

43. Howard,J.A., Delmas,S., Ivancic-Bace,I. and Bolt,E.L. (2011) Helicase dissociation and

annealing of RNA-DNA hybrids by Escherichia coli Cas3 protein. Biochem. J., 439,

85-95.

44. Peng,X., Kessler,A., Phan,H., Garrett,R.A. and Prangishvili,D. (2004) Multiple variants

of the archaeal DNA rudivirus SIRV1 in a single host and a novel mechanism of

genomic variation. Mol. Microbiol., 54, 366-375.

45. Pina,M., Basta,T., Quax,T.E., Joubert,A., Baconnais,S., Cortez,D., Lambert,S., Le,C.E.,

Bell,S.D., Forterre,P. et al. (2014) Unique genome replication mechanism of the

archaeal virus AFV1. Mol. Microbiol., 92, 1313-1325.

46. Mosig,G. (1998) Recombination and recombination-dependent DNA replication in

bacteriophage T4. Annu. Rev. Genet., 32, 379-413.

47. Kowalczykowski,S.C. (2000) Initiation of genetic recombination and

recombination-dependent replication. Trends Biochem. Sci., 25, 156-165.

48. Kowalczykowski,S.C., Dixon,D.A., Eggleston,A.K., Lauder,S.D. and Rehrauer,W.M.

(1994) Biochemistry of homologous recombination in Escherichia coli. Microbiol. Rev.,

58, 401-465.

TABLE AND FIGURE LEGENDS

Table 1. Structures of the DNA substrates used in this study.

Figure 1. Organization of SIRV2 gp17, gp18, gp19 and their homologs in the genomes of

archaeal linear viruses. The pattern codes are as follows:

SIRV2 gp17 homolog; SIRV2 gp18 homolog or analog; SIRV2 gp19

homolog; conserved gene upstream of SIRV2 gp18 homolog in most

filamentous viruses.

Figure 2. gp17 binds to ssDNA. (A) gp17 binds to DNA substrates with either a 23-nt

5´-ssDNA flap or Y-shaped double-flaps (23 nt), but not to blunt-ended dsDNA. F, free DNA;

C, DNA-protein complex. (B) gp17 shows a strong preference for ssDNA over dsDNA.

The concentration (nM) of gp17 is indicated on the top of the gel.

Figure 3. Mutagenesis of gp17 revealed a U-shaped binding path for ssDNA. (A) The

structure of a monomer of the gp17 homolog with the conserved positively charged residues

shown in stick representation. (B) Gel retardation assays using gp17 WT and mutant proteins and

ssDNA. Protein concentrations are labeled on the top of each gel. DNA forms are indicated

by a short line (free) or a line covered with a circle (DNA-protein complex). (C) Quantification

of the ssDNA binding activity of different gp17 mutants based on the results shown in B. (D)

Binding path of ssDNA on gp17. The residues contributing to ssDNA binding are shown as

sticks.

Figure 4. Characterization of the refolded recombinant gp18. (A) Purification and refolding

of gp18 from inclusion bodies. Lane 1, supernatant of E. coli cells expressing gp18; lane 2,

pellet of the lysate; lane 3, pellet protein dissolved in 8 M urea; lane 4, supernatant of gp18

after purification, refolding, heating at 70°C for 20 min and centrifugation. (B) Far-UV CD

spectrum of refolded gp18. (C) Temperature denaturation of gp18 followed at 220 nm. (D)

Gel-filtration chromatographic analysis of the purified and refolded gp18 protein. The main

peak and the two shoulders are labeled.

Figure 5. gp18 stimulates annealing of the complementary oligonucleotides. (A)

Concentration-dependent enhancement of oligonucleotide annealing by gp18. The 32P-labeled

57-mer oligo-4 (1 nM) and the complementary oligo-5 (1.2 nM) were incubated in

the absence (lane 1) or presence (lanes 2 to 6) of gp18. gp18 concentrations are indicated

on the top of the gel. The presence (lanes 2 to 5) or absence (lane 6) of ATP is also

indicated. (B) Time course of gp18-enhanced single-strand annealing. Left panel, annealing

without gp18; right panel, annealing in the presence of 200 nM gp18. (C) Quantification of

annealed DNA in the absence or presence of gp18. The percentages were calculated based

on the intensities of bands in B.

Figure 6. Nuclease activities of SIRV2 gp19. (A) Selective cleavage of DNA substrate with a

5´ ssDNA flap by gp19. (B) Gradual cleavage of ssDNA. (C) Endonuclease activity of gp19.

The circular ssDNA of M13mp18 was incubated at 50 °C with or without the addition of 0.5

μM gp19, and the incubation time is given on top of the gel.

Figure 7. Interactions between gp17, gp18 and gp19 and between gp17 and Sulfolobus host

proteins. (A) Schematic presentation of GST-tagged gp17 mutants. (B) Pull-down assays by

GST affinity chromatography. The purified and refolded His-tagged gp18 (labeled as P for

prey) was incubated with GST-tagged gp17 or GST-tagged gp19 on a GSH column for 1 h.

After washing with PBS buffer, the GSH beads were boiled in the SDS loading buffer and

loaded for SDS-PAGE, and the interacting protein was detected with an anti-His antibody. GST

protein was used as negative control. Positive controls for Western blotting were carried out

using the input His-tagged gp18. (C) Interaction between GST-tagged gp17 mutants and

His-tagged gp18. Pull-down assays were performed as described in B. (D) Identification of

Sulfolobus proteins interacting with His-tagged gp17 overexpressed in Sulfolobus solfataricus

P2. Three fractions of eluted proteins from the negative control cells containing the empty

pEXA2 vector (lanes 1 to 3) and from the gp17 transformant (lanes 4 to 6) were tested. The

identified proteins are indicated at the right side. (E) Western blot hybridization of negative

control (lane 1) and gp17 protein elution (lane 2).

Table S1. Details of the primers used in this study.

Table S2. Sequences of the oligonucleotides used as substrates in this study.

Table S3. The location, gene length and functions of the conserved gene cluster in all linear

viruses.

Table S4. Mass spectrometric peptide mapping and sequencing analysis of the two pulled

down proteins.

Figure S1. Genome comparison of all the rudiviruses and filamentous viruses using the

Mutagen program. Conserved gene clusters are labeled with red squares. Homologs are

color-coded whereas white rectangles represent ORFs without homologues.

Figure S2. Structure of SIRV1 ORF131 (residues 2-96) (PDB identifier [ID] 2X5T) (24). (A) Dimer

structure of SIRV1 ORF131 (residues 2-96), coloured in deep teal and violet purple for the two monomers.

(B) Secondary structure elements of the monomer are labeled in different colors. (C) and (D)

Surface representations of the concave side and the convex side of the dimer, indicating

the electrostatic potential of the putative binding interface.

Figure S3. (A) Predicted probability of disorder for gp17 residues. Two different programs (IUPred and PONDR)

were used for the prediction, and both revealed disorder at the C-terminus. (B) Percentage of

helicity of gp17.

Figure S4. Purification of SIRV2 gp17. gp17 was expressed in E. coli and purified to homogeneity.

Lanes 1-4, four elution fractions from Ni-NTA-agarose beads.

Figure S5. Alignment of SIRV2 gp17 and its homologs. Identical residues are labeled as *,

and conserved positively charged residues are shaded red. Red arrows indicate the positions of

the β sheets, and blue bars indicate the positions of the α helices. Residues mutated to alanine in this

study are marked with black dots and numbered accordingly.

Figure S6. Gel retardation assays showing the binding of gp17 WT and some of the mutant

proteins to ssDNA.

Figure S7. Standard linear regression curve. The column Superdex 200 HR 10/30 was

calibrated with proteins of known molecular masses: Thyroglobulin, Bovine (669 kDa);

Apoferritin, Horse Spleen (443 kDa); β-Amylase, Sweet Potato (200 kDa) and Alcohol

Dehydrogenase, Yeast (150 kDa).

Name                                  Structure      Oligonucleotides*
Substrate A: Blunt-end duplex         [schematic]    1+2
Substrate B: 5´-ssDNA flap duplex     [schematic]    1+3
Substrate C: 3´-ssDNA flap duplex     [schematic]    1+4
Substrate D: Y-shaped duplex          [schematic]    4+5

Table 1. Structures of the substrates used in this study

*: The sequences of the oligonucleotides are provided in Table S2

66

Page 81: Functional characterization and Gene regulation of … Guo.pdf · The thesis entitled `` Functional characterization and gene regulation of the archaeal ... environment for unreserved

[Figure 1: image not reproduced. Recoverable row labels: SRV, ARV, SIRV1,2, AFV1, SIFV, AFV3-8, AFV9.]

[Figure 2: gel images not reproduced. Recoverable labels: panels A, B, C1 and C2; band labels C and F; gp17 titrations of 0, 33, 66, 130 and 260 nM, and of 0, 66, 130, 260, 1300, 2600, 3250 and 3900 nM.]

[Figure 3: images not reproduced. Recoverable labels: (A) residues R24, K27, R29, R33, H54, R60, K61 and K82; (B) titrations of gp17 wt, K82A and R60A (0, 16, 33, 66, 133, 266, 400, 533, 666 and 1200 nM), of K61A and H54A (0, 33, 66, 133, 266, 400, 533, 800, 1200 and 2000 nM), and of R33A and R60A K61A (0, 133, 266, 400, 533, 733, 933, 1200, 1466 and 2000 nM); (C) bound ratio (%) plotted against protein concentration (nM) for gp17 wt, K82A, R60A, H54A, K61A, R33A and R60A K61A; (D) residues R33, H54, R60 and K61.]

[Figure 4: images not reproduced. Recoverable labels: panels A to D; Vo, peaks 1 and 2; lanes M and 1-4; gp18; size markers 55, 35, 15 and 10 kDa.]

[Figure 5: images not reproduced. Recoverable labels: (A) lanes 1-6 with gp18 at 0, 50, 100, 150, 200 and 200 nM, ATP present in lanes 1-5 and absent in lane 6, 15 min each; (B) time courses of 0, 0.5, 1, 2 and 4 min without gp18 and with 200 nM gp18; (C) annealing (%) plotted against time (min) for the control and gp18.]

[Figure 6: gel images not reproduced. Recoverable labels: panels A, B and C; gp19 (given as "0.5M" in the extraction, the unit prefix appears to have been lost) absent or present; time points of 3 and 10 min (A), 2, 5, 8 and 10 min (B) and 15, 30, 45 and 60 min (C); substrate end labels 5 and 3; marker lane M with 5 kbp and 1 kbp bands.]

[Figure 7: images not reproduced. Recoverable labels: panels A to E; constructs GSTgp17-I (131 aa), GSTgp17-II (121 aa) and GSTgp17-III (111 aa) on a scale marked at 50 and 100 aa; lanes M and 1-6; size markers 170, 130, 70, 55, 35, 25 and 15 kDa; bands labelled gp17, Sso2277 and reverse gyrase; samples pEXA2-control and pEXA2-gp17.]

Expression host | Tag | Protein name | Oligo no. | Sequence 5´-3´
E. coli | C-His | gp17 wt | 1 | Fw: GGCGAAAACCATATGGCCTCATTAAAACAAATAATAG
E. coli | C-His | gp17 wt | 2 | Rv: GCTTCTCGAGAAACTCCTCCTCAACTGTTTTTT
E. coli | C-His | gp17(1-121aa) (a) | 3 | Rv: CGTACTCGAGTTATTTTTCTCTCGTTTTCTCTTCTT
E. coli | C-His | gp17 K82A (b) | 4 | 1Rv: GCATACGCCTCTAGAAATTCAG
E. coli | C-His | gp17 K82A (b) | 5 | 1Fw: GAATTTCTAGAGGCGTATGCAG
E. coli | C-His | gp17 K32A (b) | 6 | 1Rv: CAACTATTCTCGCTATACCTTTTATC
E. coli | C-His | gp17 K32A (b) | 7 | 1Fw: GGTATAGCGAGAATAGTTGTACAG
E. coli | C-His | gp17 R60A (b) | 8 | 1Rv: CCAATTTGTTTCGCGAAATTATTC
E. coli | C-His | gp17 R60A (b) | 9 | 1Fw: CGCGAAACAAATTGGAATAAC
E. coli | C-His | gp17 K61A (b) | 10 | 1Rv: CCAATTTGCGCTCTGAAATTATT
E. coli | C-His | gp17 K61A (b) | 11 | 1Fw: CAGAGCGCAAATTGGAATAAC
E. coli | C-His | gp17 H54A (b) | 12 | 1Rv: GAAATTATTCTGACTCGCTATCGTC
E. coli | C-His | gp17 H54A (b) | 13 | 1Fw: CATGACGATAGCGAGTCAGAA
E. coli | C-His | gp17 R33A (b) | 14 | 1Rv: GTACAACTATCGCTTTTATACCTTTTA
E. coli | C-His | gp17 R33A (b) | 15 | 1Fw: GCGATAGTTGTACAGTTAAATGC
E. coli | C-His | gp17 R60A K61A (b) | 16 | 1Rv: TTGCGCCGCGAAATTATTCTG
E. coli | C-His | gp17 R60A K61A (b) | 17 | 1Fw: TTCGCGGCGCAAATTGGAA
E. coli | C-His | gp17 R24A K27A (b) | 18 | 1Rv: CGCTAAAATCGCAGACGCTATTTTATTGTTCTCTTT
E. coli | C-His | gp17 R24A K27A (b) | 19 | 1Fw: GCGATTTTAGCGATAAAAGGTATAAAAAGAATAGTTGTAC
E. coli | C-His | gp17 R24A K27A K29A (b) | 20 | 1Rv: TCTTTTTATACCCGCTATCGCTAAAATCGCAGACGC
E. coli | C-His | gp17 R24A K27A K29A (b) | 21 | 1Fw: GCGATAGCGGGTATAAAAAGAATAGTTGTACAG
E. coli | C-His | gp18 | 22 | Fw: CATTTGTTCCATATGAGTGAAAACACACAACTATTTG
E. coli | C-His | gp18 | 23 | Rv: CGTACTCGAGCCATCCTCCTAAATTGCTAAATC
E. coli | C-His | gp19 | 24 | Fw: CTACCATTCATATGGTAAATATGAATTATGAAGATC
E. coli | C-His | gp19 | 25 | Rv: GCGCTCGAGAAAAAGTGATATAATGCATTTTTG
E. coli | N-GST | GSTgp17-I | 26 | Fw: ATCGGGATCCGCCTCATTAAAACAAATAATAG
E. coli | N-GST | GSTgp17-I | 27 | Rv: CGTACTCGAGTTAAAACTCCTCCTCAACTGTTTTTT
E. coli | N-GST | GSTgp17-II (c) | 28 | Rv: CGTACTCGAGTTATTTTTCTCTCGTTTTCTCTTCTT
E. coli | N-GST | GSTgp17-III (c) | 29 | Rv: CGTACTCGAGTTATTGTTCCATGTCTAGCTCTTC
E. coli | N-GST | GSTgp19 | 30 | Fw: CGCGGATCCGTAAATATGAATTATGAAGATCATATAAAAGAAAG
E. coli | N-GST | GSTgp19 | 31 | Rv: GCCGCTCGAGTTAAAAAAGTGATATAATGCATTTTTGTTTG
Sulfolobus solfataricus | C-His | gp17 | 34 | Fw: CATTTGTTCCATATGGCCTCATTAAAACAAATAATAG
Sulfolobus solfataricus | C-His | gp17 | 35 | Rv: CTCAACTAGCGGCCGCAAACTCCTCCTCAACTGTTT
Sulfolobus solfataricus | C-His | gp18 | 36 | Fw: CATTTGTTCCATATGAGTGAAAACACACAACTATTTG
Sulfolobus solfataricus | C-His | gp18 | 37 | Rv: TATTAATAGCGGCCGCCCATCCTCCTAAATTGCTAAAT
Sulfolobus solfataricus | C-His | gp19 | 38 | Fw: CATTTGTTCCATATGGTAAATATGAATTATGAAGATC
Sulfolobus solfataricus | C-His | gp19 | 39 | Rv: CTCTTCTAGCGGCCGCTTAAAAAAGTGATATAATGCATTT

(a) The forward primer used for this construct is oligo 1.

(b) For this construct, the upstream fragment was amplified using oligo 1 and the 1Rv primer, and the downstream fragment was amplified using the 1Fw primer and oligo 2. The two fragments were then used as template to amplify the whole construct with oligo 1 and oligo 2.

(c) The forward primer used for this construct is oligo 26.

Table S1. Details of the primers used in this study

Name | Sequence (5' to 3')
oligo-1 | CGTACTCGAGTTATTGTTCCATGTCTAGCTCTTC
oligo-2 | GAAGAGCTAGACATGGAACAATAACTCGAGTACG
oligo-3 | GTTATTGCATGAAAGCCCGGCTGGAAGAGCTAGACATGGAACAATAACTCGAGTACG
oligo-4 | GTCAGTCCAAAAGTACATTATTGCGTACTCGAGTTATTGTTCCATGTCTAGCTCTTC
oligo-5 | GAAGAGCTAGACATGGAACAATAACTCGAGTACGGTTATTGCATGAAAGCCCGGCTG

Table S2. Sequences of the oligonucleotides used as substrates in this study
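How these oligonucleotides pair into the substrates of Table 1 can be checked with a short sketch. The sequences are copied from Table S2; the helper names are hypothetical and only exact Watson-Crick pairing is considered, so this is an illustration rather than the procedure used in the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

OLIGO_1 = "CGTACTCGAGTTATTGTTCCATGTCTAGCTCTTC"
OLIGO_2 = "GAAGAGCTAGACATGGAACAATAACTCGAGTACG"
OLIGO_3 = "GTTATTGCATGAAAGCCCGGCTGGAAGAGCTAGACATGGAACAATAACTCGAGTACG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_blunt_duplex(strand_a, strand_b):
    """True if the two strands pair over their full length with no overhang."""
    return reverse_complement(strand_a) == strand_b


def five_prime_flap(long_strand, partner):
    """Return the unpaired 5' extension of long_strand when its 3' portion
    pairs fully with partner, or None if it does not pair that way."""
    paired = reverse_complement(partner)
    if not long_strand.endswith(paired):
        return None
    return long_strand[: len(long_strand) - len(paired)]
```

With these sequences, oligo-2 is the exact reverse complement of oligo-1 (substrate A, blunt end duplex), and oligo-3 pairs with oligo-1 over its 3' portion while leaving a 23-nt single-stranded 5' extension (substrate B).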


Strain | Gene1 | Length (aa) | Function | Gene2 | Length (aa) | Function | Gene3 | Length (aa) | Function
AFV3 | gp10 | 79 | Hypothetical protein | gp09 | 593 | Putative helicase | gp08 | 203 | Putative nuclease
AFV6 | gp10 | 79 | Hypothetical protein | gp09 | 593 | Putative helicase | gp08 | 203 | Putative nuclease
AFV7 | gp05 | 79 | Hypothetical protein | gp04 | 593 | Putative helicase | gp03 | 203 | Putative nuclease
AFV8 | gp07 | 79 | Hypothetical protein | gp06 | 593 | Putative helicase | gp05 | 203 | Putative nuclease
AFV9 | gp11 | 79 | Hypothetical protein | gp10 | 602 | Putative Holliday junction branch migration helicase | gp08 | 203 | Putative nuclease
SIFV | SIFV-08 | 79 | Hypothetical protein | SIFV-07 | 601 | Putative helicase | SIFV-06 | 232 | Hypothetical protein
AFV2 | - | - | - | gp15 | 425 | Hypothetical protein | - | - | -
AFV1 | gp14 | 135 | Hypothetical protein | gp15 | 426 | Hypothetical protein | gp17 | 223 | CRISPR-associated Cas4-like protein
ARV1 | gp12 | 134 | Hypothetical protein | gp16 | 443 | Hypothetical protein | gp17 | 207 | Hypothetical protein
SRV | SRV-ORF138 | 138 | Hypothetical protein | SRV-ORF440 | 440 | Hypothetical protein | SRV-ORF199 | 199 | Hypothetical protein
SIRV2 | gp17 | 131 | Hypothetical protein | gp18 | 436 | Hypothetical protein | gp19 | 207 | Single strand nuclease

Table S3. The location, gene length and functions of the conserved gene cluster in all linear viruses

Table S4. Mass spectrometric peptide mapping and sequencing analysis

of the two pulled down proteins

[Figures S1 to S7: images not reproduced. Recoverable labels: Figure S2, panels A to D; Figure S4, lanes M and 1-4, size markers 25, 15 and 10 kDa, band SIRV2 gp17; Figure S6, gp17 wt, gp17(R24A K27A), gp17(R24A K27A K29A) and gp17(1-121aa), each at 0, 33, 66, 130 and 260 nM; Figure S7, Kav plotted against log MW with standards marked at 660, 440, 200 and 150 kDa.]

Manuscript II

Genome-wide binding profile of two transcription regulators of

Sulfolobus solfataricus

Yang Guo, Xu Peng

In preparation


Genome-wide binding profile of two transcription regulators of

Sulfolobus solfataricus

Yang Guo and Xu Peng *

Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 CPH N. Denmark

* To whom correspondence should be addressed. Tel: +45 35322018; Fax: +45 35322128; Email:

[email protected]


Abstract

Two transcription regulators of Sulfolobus solfataricus P2, sso2474 and sso10340, were differentially regulated upon SIRV2 infection. A method similar to chromatin immunoprecipitation combined with high-throughput sequencing (ChIP-seq) was applied in this study to examine the gene composition of the regulons of the two proteins in vivo. Mapping of the sequencing data to the Sulfolobus solfataricus P2 and SIRV2 genomes demonstrated that sso2474 binds with high affinity to the viral genome, whereas sso10340 mainly binds to host DNA. A total of 27 enriched host DNA fragments extracted from the sso10340-DNA complex appeared as potential binding targets, most of which are genes involved in energy metabolism, transport, translation and amino acid metabolism. The genome-wide binding profiles presented here reveal two different kinds of regulon and contribute to our knowledge of transcription regulation upon virus infection.


Introduction

Viruses infecting members of the Archaea, the third domain of life, display highly diverse and previously unsuspected virion morphotypes (Pina et al., 2011). In the last few years, substantial effort has been made to explore the functions of their hypothetical proteins and the viral infection cycles (Kessler et al., 2004; Oke et al., 2011; Gardner et al., 2011; Bize et al., 2009). To date, several virus-host systems have become promising models for studying virus-host interactions.

One of the best-studied viruses of hyperthermophilic archaea is SIRV2 (Sulfolobus islandicus rod-shaped virus 2). It was isolated from an acidic hot spring in Iceland, belongs to the family Rudiviridae and shares a common ancestry with the family Lipothrixviridae. Transmission electron microscopy showed that SIRV2 virions specifically recognize pilus-like filaments on the host cell surface for adsorption (Quemin et al., 2013). In addition, two gene clusters identified in SIRV2-resistant Sulfolobus mutants, sso3138 to sso3141 and sso2386 and sso2387, were confirmed to be responsible for virus entry (Deng et al., 2014), providing first insights into the entry process. Unlike most archaeal viruses, which infect host cells in a `carrier state`, the linear, non-enveloped, double-stranded DNA (dsDNA) virus SIRV2 is lytic, as are TTV1 and STIV (Bize et al., 2009; Ortmann et al., 2008; Zillig et al., 1996). The virions are released from the host cell through a unique mechanism that involves the formation of pyramid-like protrusions transecting the cell envelope and S-layer. At the end of the infection cycle, these pyramids, each with seven isosceles triangular faces, open up and allow mature virions to escape from the cell (Bize et al., 2009; Quax et al., 2011). Microarray analysis of the transcriptional responses of host and virus during infection is an efficient way to gain insight into the viral life cycle and its effect on the host, and it has been applied successfully to three archaeal viruses: the fusellovirus SSV1, the icosahedral virus STIV and the rudivirus SIRV2 (Frols et al., 2007; Ortmann et al., 2008; Okutan et al., 2013).

In this work we investigated the regulation of host genes upon SIRV2 infection. A previous study revealed that a total of 148 host genes responded differentially; among them, the two transcription regulators sso2474 and sso10340 were up- and down-regulated, respectively. This raises the interesting question of how these two proteins regulate their target genes under the stress of virus infection. Are they global regulators, or do they regulate only their own promoters? A method similar to ChIP-seq was applied in this study to detect their DNA binding sites in vivo. Combined with gene expression analysis, this gives a first insight into the transcription regulation network between virus and host cells.

Materials and methods

Sulfolobus cultivation and plasmid construction

The sso2474 and sso10340 fragments were amplified from the Sulfolobus solfataricus P2 genome by PCR, digested with NdeI and NotI and inserted into the similarly digested Sulfolobus/E. coli shuttle vector pEXA3 (He et al., 2014), allowing expression of the His-tagged proteins under the control of the arabinose promoter. The constructed plasmids, as well as the empty plasmid pEXA3, were then electroporated individually into uracil-deficient competent cells (Deng et al., 2009). Single colonies of the transformants were inoculated into test tubes containing 5 ml SCV (basal medium supplemented with 0.2% sucrose, 0.2% casamino acids and 1% vitamin solution) (Deng et al., 2009) and incubated in an Innova 3100 oil-bath shaker. Large-scale culturing was performed in ACV medium (in which 0.2% D-arabinose was substituted for sucrose) in long-necked Erlenmeyer flasks. When the culture OD600 reached 0.8, it was infected with SIRV2 at an m.o.i. of about 10. The cells were collected 2.5 h post infection.

E. coli cells cultivation and plasmid construction

The coding sequence of sso2474 was amplified by PCR from the Sulfolobus solfataricus P2 genome, digested with NdeI and XhoI and subsequently inserted into a similarly digested pET-30a (Novagen) expression vector. E. coli BL21 CodonPlus cells were transformed with the plasmid construct, and a single transformant colony was inoculated into LB medium containing 30 µg/ml kanamycin and 25 µg/ml chloramphenicol. At an optical density (OD600) of 0.4, IPTG (0.5 mM) was added to the culture and the cells were cultured for a further 12 hours at 25 °C.

Protein purification

His-tagged sso2474 and sso10340, from either Sulfolobus or E. coli, were purified as follows. The harvested cell pellets were lysed by sonication in lysis buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 1 mM EDTA, 1% Triton X-100 and 1 mM PMSF); different sonication times (4, 6, 8 and 10 min) were tested to minimize the size of the DNA fragments bound by the proteins. The lysate was then cleared by centrifugation at 10000 × g for 20 min. The supernatant was incubated with Ni-NTA-agarose beads (Qiagen, Germany) for 1 h at room temperature. The beads were washed three times with washing buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 40 mM imidazole), and the protein-DNA complexes were eluted with elution buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 250 mM imidazole). The purity of the protein was evaluated by 12.5% SDS-PAGE and staining with PAGE blue (Sigma Aldrich, UK), and the amount of DNA in the samples was assessed on a 0.7% agarose gel stained with GelRed (Biotium).

DNA extraction and high-throughput sequencing

The eluted protein-DNA complexes in solution were diluted with one volume of water and treated with RNase A at room temperature for 1 h. Deproteination was carried out by incubation with 2 mg/ml proteinase K at 50 °C for 2 h and then at 65 °C for 8 h. The DNA was then extracted with a phenol/chloroform/isoamyl alcohol (25:24:1) mixture and finally precipitated and concentrated by ethanol precipitation. Sequencing libraries with an average fragment size of 350 bp were prepared according to the protocol of the Ion Plus Fragment Library Kit and sequenced on the Ion PGM™ Sequencer (Life Technologies).

Reads mapping and Peak detection

The quality-filtered reads were aligned to the Sulfolobus solfataricus P2 and SIRV2 genomes using Bowtie, an ultrafast, memory-efficient aligner for mapping sets of short DNA reads to large genomes (Satoh and Tabunoki, 2013), allowing up to two errors per read (insertion, deletion and/or mismatch). The enriched peaks were then visualized using Artemis (Carver et al., 2012).

Real-time Quantitative PCR

qPCR reactions were performed in 10 µl mixtures containing 5 µl iQ SYBR Green Supermix (Bio-Rad, Cat. No. 170-8880), 1 mM primers and around 1 ng total DNA. Separate reactions were prepared for detection of the reference gene and of the virus-specific and Sulfolobus solfataricus host-specific amplicons. The mixtures were prepared in duplicate in 96-well microtiter PCR plates (Bio-Rad Laboratories), sealed with an adhesive cover (Bio-Rad Laboratories) and run on the CFX96 Real-Time Detection System (Bio-Rad Laboratories) with the following uniform cycling parameters: initialization (95 °C for 3 min) was followed by denaturation of the strands (95 °C for 10 sec), annealing of the primers to the template (55 °C for 10 sec) and elongation of the primers by the DNA polymerase (72 °C for 15 sec). The cycle from denaturation to elongation was repeated 40 times. Thereafter, the final elongation step was performed at 95 °C for 10 sec. Finally, a melting temperature gradient from 65 to 95 °C, in 0.5 °C increments of 5 sec each, was used to confirm the specificity of the primer sets. In addition, the possibility of unspecific amplification products and contamination was checked using a non-template control (NTC), and a positive control with a known amount of template was included. qPCR data were analyzed with the Bio-Rad CFX Manager software, which allows immediate determination of the cycle threshold (Ct), melting curves and quantification of samples.

Motif analysis

For de novo motif discovery within the significantly enriched DNA fragments, their

genomic sequences were submitted to MEME (Bailey and Elkan, 1994). Parameters were

set to search for zero or one palindromic motif of 16 bp width per sequence.

DNA band shift assays

50 nM of the ssDNA (oligo-1, CGTACTCGAGTTATTGTTCCATGTCTAGCTCTTC) and blunt-ended dsDNA (formed by annealing oligo-1 with its complementary oligonucleotide) were incubated for 20 min at 50 °C with increasing concentrations of SSO2474 (0-2.0 µM) in 20 µl DNA-binding buffer (10 mM Tris-Cl, pH 8.0, 100 mM KCl, 2 mM DTT, 10% glycerol). The samples were loaded onto a 12% (v/v) acrylamide gel and electrophoresed in 0.5 × TBE buffer for 1 h 50 min. Following electrophoresis, the gels were stained with SYBR® Gold (Life Technologies) and scanned with a Typhoon FLA 7000 (GE Healthcare Life Science).

Results

Function prediction and high conservation of the two host transcription regulators

The transcription machinery of Archaea has attracted considerable interest because it combines bacterial-like regulators with eukaryote-like basal factors (Bell et al., 2001). Although archaea possess a unique genome structure, half of the transcription factors (TFs) identified in archaeal genomes share at least one homolog with bacterial genomes (Perez-Rueda and Janga, 2010). Structural or sequence similarities to well-studied proteins can therefore provide efficient and direct information for the study of unknown archaeal proteins.

A BlastP search with sso2474 revealed a putative HTH (helix-turn-helix) motif located between amino acid residues 20 and 70, and a conserved domain matching MarR-2 (multiple antibiotic resistance) family proteins was detected at the same location. The sequence similarities in the database suggest that sso2474 belongs to the TrmB-like (transcriptional regulator of the maltose system) family: it has 36% identity and 65% sequence similarity to the transcription regulator TrmB of Sulfolobus archaeon AZ1, and 36% identity and 61% similarity to the transcription regulator TrmB of Acidianus hospitalis. Most TrmB proteins bind DNA using an HTH motif as the DBD (DNA-binding domain). The DBD is located in the N-terminal region, and a mutational analysis revealed that it is essential for binding (Maruyama et al., 2011). However, tertiary structure prediction of sso2474 using the threading program Phyre2 (Kelley and Sternberg, 2009) matched a MarR-like family transcription regulator with high confidence (99.6%) and 93% alignment coverage. The crystal structures of many MarR family proteins from bacterial and archaeal species have been solved, and they reveal a common architecture with a characteristic winged helix domain for DNA binding. Although the sequence identity between these homologs is less than 20%, they all possess the same core fold (Nichols et al., 2009). The protein sso2474 is conserved in Sulfolobus, Acidianus and Metallosphaera species of the Sulfolobales as well as in euryarchaeotal Halobacteria species (Fig. S1).

The down-regulated gene, sso10340, encodes a 10 kDa protein with high identity to truncated variants of the Lrp/AsnC family. Sequence alignment between sso10340 and the E. coli Lrp (leucine-responsive regulatory) protein, one of the best characterized Lrp family proteins, revealed that the sso10340 amino acid sequence matched well with the C-terminal amino acid effector domain of the Lrp protein (Fig. S3). Structure prediction of sso10340 by Phyre2 also gave a high-confidence (99.9%) match to the STS042 protein from Sulfolobus tokodaii 7. STS042 was identified as a stand-alone RAM (regulation of amino acid metabolism) module protein, which is homologous to the C-terminal domain of Lrp/AsnC-family proteins (Miyazono et al., 2008). Database searches indicated that Lrp/AsnC-family proteins are distributed among many bacteria and most archaea. Sso10340 has homologues in crenarchaeotal Sulfolobales species as well as in bacterial species (Fig. S2).

DNA extraction from protein-DNA complex and high-throughput sequencing

A method similar to ChIP-seq was used to gain further insight into the regulons of the two proteins. sso2474 and sso10340 were each cloned into the E. coli-Sulfolobus shuttle vector pEXA2 under the control of the arabinose promoter, with a C-terminal His-tag (Gudbergsdottir et al., 2011). In order to detect their binding sites in both the host and the virus genome, the cells were infected with SIRV2 at an m.o.i. of 10 after expression of the target protein had been induced for 15 h. The cells were collected 2.5 h post infection. The protein-DNA complexes were purified by Ni-NTA-agarose chromatography. Proteins were detected by SDS-PAGE, and the DNA bound by these proteins was run on an agarose gel. As shown in Fig. 1A, sso2474 was purified to homogeneity, with a single band detected on the SDS gel. The DNA extracted from the sso2474 sample gave a yield hundreds of times higher than that of the control DNA, which was purified with Ni-NTA-agarose beads from cells transformed with an empty plasmid. This suggests that sso2474 has a very high affinity for DNA. SDS-PAGE and western blot analysis showed that the Lrp-like protein sso10340 exhibits a range of oligomeric states, including dimers, octamers and decamers, even after SDS treatment (Fig. 1B), resembling the Lrp/AsnC family proteins, which form a range of multimeric species in solution (Brinkman et al., 2003; Leonard et al., 2001). Significantly more DNA was also recovered from the sso10340-DNA complex than from the control.

The purified DNA-protein complex was first treated with RNase A to remove contaminating RNA, and deproteination was carried out by incubation with proteinase K. The target DNA fragments of each sample, with an average size of 300-500 bp, were finally recovered by phenol/chloroform extraction and ethanol precipitation. The prepared samples were then sequenced using Ion Torrent next-generation sequencing. Of the 363-393 thousand sequenced reads, 347-362 thousand were uniquely mapped to either the Sulfolobus solfataricus host genome or the SIRV2 genome (Table 1). Interestingly, 91.7% of the mapped reads from sso2474 aligned to the viral genome, whereas 92.04% of the reads from sso10340 belonged to the host genome, indicating that sso2474 has a high affinity for the viral genome and that sso10340 specifically regulates host genes.
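The read fractions above become more telling when genome size is taken into account, because the viral genome is far smaller than the host genome. The sketch below is illustrative only: the function names are hypothetical, and the genome lengths would have to be supplied by the reader (roughly 35.5 kb for SIRV2 and roughly 2.99 Mb for Sulfolobus solfataricus P2 are approximate values assumed here, not figures taken from this manuscript).

```python
def reads_per_kb(mapped_reads, genome_length_bp):
    """Mapped reads per kilobase of reference sequence."""
    return mapped_reads / (genome_length_bp / 1000)


def density_ratio(fraction_on_virus, virus_length_bp, host_length_bp):
    """How many times denser the read coverage is on the viral genome than
    on the host genome, given the fraction of mapped reads that fall on
    the virus (the remainder is assumed to fall on the host)."""
    virus_density = fraction_on_virus / virus_length_bp
    host_density = (1.0 - fraction_on_virus) / host_length_bp
    return virus_density / host_density
```

Under the assumed genome lengths, a call such as density_ratio(0.917, 35_500, 2_990_000) would express the sso2474 result as a per-base enrichment on the viral genome rather than as a raw percentage.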

It is surprising that almost all the DNA extracted from sso2474 aligned to the viral genome. To test whether this was due to a high amount of viral genome present in the cell, we determined the copy number ratio between host and viral genomes by real-time PCR (qPCR). The infected cells (the same culture as used for sequencing) were collected and washed three times to remove virus from the cell surface, and total DNA was extracted from the infected Sulfolobus cells. One set of primers targeting the Sulfolobus solfataricus TFB-II gene was designed to determine the host genome copy number, and primers amplifying the SIRV2 coat protein gene were designed to determine the viral copy number in the same DNA sample. The data in Table 2 showed that on average 0.6 virus had entered each host cell 2.5 h post infection, which excludes the possibility that the high coverage of viral sequence reads was due to a high copy number of the virus in the infected cells. Thus, we conclude that sso2474 preferentially binds the viral DNA.
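The manuscript does not state how the copy number ratio was derived from the Ct values. One common approach, sketched here under the assumptions that both primer pairs amplify with the same efficiency and that each target is present once per genome, is the delta-Ct calculation. The function names and the default efficiency of 2.0 (perfect doubling per cycle) are illustrative choices, not the authors' method.

```python
def copy_ratio(ct_virus, ct_host, efficiency=2.0):
    """Virus-to-host copy ratio from cycle thresholds.

    Assumes equal amplification efficiency for both amplicons, so that
    ratio = efficiency ** (Ct_host - Ct_virus).
    """
    return efficiency ** (ct_host - ct_virus)


def copy_ratio_from_replicates(virus_cts, host_cts, efficiency=2.0):
    """Average the replicate Ct values for each target, then take the ratio."""
    mean_virus = sum(virus_cts) / len(virus_cts)
    mean_host = sum(host_cts) / len(host_cts)
    return copy_ratio(mean_virus, mean_host, efficiency)
```

A viral amplicon that crosses the threshold one cycle later than the host amplicon corresponds to a ratio of 0.5 under these assumptions; primer efficiencies measured from standard curves would replace the default value in practice.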

Detection of the enriched DNA fragment and genome-wide binding profile of the two

proteins

To identify the DNA-enriched regions, we used Bowtie, an ultrafast, memory-efficient short read aligner, to align the sequenced sets of short DNA reads to the Sulfolobus solfataricus P2 and SIRV2 genomes (Satoh and Tabunoki, 2013). The genomic locations of the peaks were identified and visualized with Artemis, an integrated platform for analyzing high-throughput sequence-based experimental data (Carver et al., 2012).

Protein Sso2474 binds to virus DNA with low specificity

Although only 3.76% of the DNA extracted from protein sso2474 could be aligned to the host genome, a specific binding peak was still visible in the map (Fig. 2A). It is located upstream of and inside sso2474, suggesting that the protein is autoregulated (peak 10 in Fig. 2A). In contrast, an average of 5400 reads aligned to the SIRV2 genome, more than 500-fold higher than for the host genome. These reads formed a mountain-shaped profile with broad peaks covering the whole viral genome (Fig. 2B). Even so, some potential specific binding sites were marked with numbers, the binding regions were amplified by PCR, and gel mobility shift assays were carried out for validation.

Protein sso2474 was first expressed in and purified from Sulfolobus solfataricus P2 (Fig. 1A). However, the DNA specifically bound by the protein could not be removed by either DNase I or PEI (polyethyleneimine) and produced an extremely high background. We therefore expressed sso2474 in E. coli from the pET-30a vector with a C-terminal His tag and purified it by Ni2+-affinity chromatography. The protein was soluble and expressed in high amounts, and SDS-PAGE analysis of the purified protein revealed a single major band with a molecular weight of approximately 15 kDa (Fig. 3A).

First, the 11 enriched fragments and a negative control fragment were amplified by PCR from the SIRV2 genome, and electrophoretic mobility shift assays (EMSA) of the recombinant sso2474 from E. coli with the target fragments were carried out. The results showed that the protein bound all the DNA fragments with no specificity (data not shown). Since no specific binding region of sso2474 could be detected, another possibility is that the protein prefers to bind ssDNA rather than dsDNA, as some single-strand DNA binding proteins bind DNA in a non-sequence-specific way (Dickey et al., 2013). To test whether sso2474 is a single-strand binding protein, an EMSA experiment was performed with a mixture of ssDNA and dsDNA substrates at an equal molar ratio.

The concentration of the protein was increased from 0.1 µM to 2.0 µM. The samples were loaded onto a 12% acrylamide gel and run in 0.5 × TBE buffer. sso2474 preferred dsDNA to ssDNA: as shown in Fig. 3B, lane 3, the band representing dsDNA began to shift while the amount of free ssDNA remained unchanged. Once the protein had bound all the dsDNA in the sample and excess protein remained, it began to bind the ssDNA. When the protein concentration was increased to 1.6 µM (Fig. 3B, lane 7), almost all the substrate was in complex and no free DNA was left. The result clearly shows that sso2474 has a higher affinity for dsDNA than for ssDNA.


Protein sso10340 bound the host genome at several regions

In contrast to sso2474, the genome-wide binding profile of sso10340 on the host genome was well resolved and showed a dozen binding sites (Fig. 4A). However, the reads corresponding to viral sequences aligned randomly across the virus genome, similar to the control sample (Fig. 4B). Only regions exhibiting more than 2-fold enrichment in ChIP DNA versus input DNA were considered to be significantly bound by sso10340. A total of 27 genomic regions, scattered across the genome, were identified. The functions of the genes that these binding peaks overlapped or lay closest to are summarized in Table 3; most of them participate in amino acid metabolism, energy metabolism, biosynthesis and transport.
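The 2-fold enrichment criterion can be illustrated with a minimal sketch. It assumes that read counts per region are available for both the ChIP and the input library and that they are normalised by library size before the ratio is taken; the function names, the pseudocount and the example counts are hypothetical and are not taken from the thesis data or from the software actually used.

```python
import math

def log2_enrichment(chip_reads, input_reads, chip_total, input_total, pseudocount=1.0):
    """Return log2 of the library-size-normalised ChIP/input ratio for one region."""
    chip_norm = (chip_reads + pseudocount) / chip_total
    input_norm = (input_reads + pseudocount) / input_total
    return math.log2(chip_norm / input_norm)

def enriched_regions(regions, chip_total, input_total, min_fold=2.0):
    """Keep regions whose ChIP/input ratio exceeds min_fold (2-fold means log2 > 1)."""
    threshold = math.log2(min_fold)
    kept = []
    for name, chip_reads, input_reads in regions:
        score = log2_enrichment(chip_reads, input_reads, chip_total, input_total)
        if score > threshold:
            kept.append((name, round(score, 2)))
    return kept

# Hypothetical read counts per region (name, ChIP reads, input reads).
regions = [("regionA", 399, 99), ("regionB", 119, 99), ("regionC", 999, 99)]
kept = enriched_regions(regions, chip_total=100_000, input_total=100_000)
print(kept)
```

A 2-fold cut-off corresponds to a log2 ratio above 1, which is consistent with the log2 ChIP DNA/input DNA values listed in Table 3.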

Additionally, the fragments were grouped into four categories according to their location with respect to open reading frames (Fig. 4C). We observed that 41% of the 27 regions (upstream, and intragenic but upstream) localized to the upstream regulatory region of the corresponding gene, and 46% fell within the coding region. The remaining small peaks (13%) were found in regions located downstream of both neighbouring genes.
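The grouping rule given in the legend of Figure 4 (-500 bp to +100 bp relative to the translation start site) can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it assumes a single representative peak position per region, splits the window at the start codon into "upstream" and "intragenic but upstream", and treats everything outside the gene and its 500 bp upstream window as intergenic. The `Gene` class and the example coordinates are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    name: str
    start: int   # leftmost genomic coordinate (bp)
    end: int     # rightmost genomic coordinate (bp)
    strand: str  # "+" or "-"

def offset_from_start(pos, gene):
    """Signed distance from the translation start; positive values lie inside the gene."""
    if gene.strand == "+":
        return pos - gene.start
    return gene.end - pos

def classify(pos, gene):
    """Assign a peak position to one of the four categories of Fig. 4C."""
    offset = offset_from_start(pos, gene)
    length = gene.end - gene.start
    if -500 <= offset < 0:
        return "upstream"
    if 0 <= offset <= 100:
        return "intragenic but upstream"
    if 100 < offset <= length:
        return "intragenic"
    return "intergenic"

# Hypothetical gene on the forward strand.
gene = Gene("geneX", 1000, 2000, "+")
for pos in (800, 1050, 1500, 2300):
    print(pos, classify(pos, gene))
```

Measuring the offset in the orientation of the gene keeps the same rule valid for genes on either strand.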

As about half of the binding regions fell into the upstream area of the corresponding gene, the binding profiles of 14 genomic regions near promoter areas were examined in detail (Fig. S4). The bound genomic fragments, with an average size of 150 bp, were amplified by PCR. Protein sso10340 was purified from Sulfolobus, and an EMSA screen of these regions was performed to verify whether the target regions also interact with the purified protein in vitro. However, no binding by sso10340 was observed in vitro.

Motif analysis for sso10340 binding site

The sso10340 binding motif was defined from oligonucleotide sequences enriched within the bound regions. The sequences of these DNA fragments were submitted to the motif-based sequence analysis tool MEME-ChIP (Machanick & Bailey, 2011; http://meme.nbcr.net) to detect conserved DNA motifs. The top-ranked motif is shown in Fig. 5B.

This 11 bp imperfect palindromic sequence was present in 96% of all binding regions and is similar to the known motif of PPARG (peroxisome proliferator-activated receptor gamma; MA0066.1) (Fig. 5A). PPARgamma binds as a heterodimer with members of the retinoid X receptor (RXR) family to PPREs (PPAR response elements), which consist of a direct repeat of two half-sites of 5´-AGGTCA-3´ separated by one nucleotide (Fig. 5A).
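To make the structure of such a response element concrete, the sketch below scans a sequence on both strands for a perfect direct repeat of two AGGTCA half-sites separated by one nucleotide. It is only an illustration of the repeat architecture: the motif reported here is imperfect and was found with MEME-ChIP, so an exact-match scan like this one is an assumption for demonstration and would miss degenerate sites. The example sequence is invented.

```python
import re

HALF_SITE = "AGGTCA"
SITE_LENGTH = 2 * len(HALF_SITE) + 1
# Two half-sites in the same orientation separated by exactly one nucleotide.
# The lookahead allows overlapping sites to be reported.
DR1 = re.compile(f"(?=({HALF_SITE}[ACGT]{HALF_SITE}))")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_dr1(seq):
    """Return (position, strand, match) for every perfect repeat in seq."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DR1.finditer(seq)]
    for m in DR1.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((len(seq) - m.start() - SITE_LENGTH, "-", m.group(1)))
    return sorted(hits)

# Invented example sequence containing one site on the forward strand.
print(find_dr1("TTAGGTCAAAGGTCATT"))
```

Positions are 0-based and always refer to the leftmost base of the site on the forward strand.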

Discussion


Archaeal transcription combines a eukaryotic-like basal transcription machinery with bacterial-like regulators, which distinguishes archaea from the other two domains (Koonin and Galperin, 1997; Grabowski and Kelman, 2003). Binding of TBP (TATA-box binding protein) and TFB (transcription factor B) to the TATA box and the BRE (TFB recognition element) in the promoter region is critical for transcription initiation of archaeal genes (Bell et al., 1999). How bacterial-type transcriptional regulators control the eukaryotic-like transcription machinery in archaea, especially upon virus infection, still needs to be elucidated.

Protein sso2474 shows amino acid sequence similarity to TrmB family proteins, which are found in all three domains of life and include all three possible kinds of transcription factor: repressors, activators, or both. In archaea, most TrmB family proteins occur in the Euryarchaeota, and only some in the Crenarchaeota (Maruyama et al., 2011). Whether in the best-studied TrmB proteins of P. furiosus (Thermococcales, Euryarchaeota) or in the TrmB family protein MalR of S. acidocaldarius (Crenarchaeota), most documented TrmB proteins appear to control diverse sugar transporters or genes of sugar metabolism, such as maltose and glucose processing, as well as genes involved in other metabolic pathways (Kanai et al., 2007; Lee et al., 2008; Reichlen et al., 2012; Wagner et al., 2014). In this study, however, the binding map of sso2474 indicated that the protein does not notably target genes related to sugar metabolism (except sso2474 itself) to activate or repress them for healthy maintenance of the cell.

On the other hand, both the conserved domain and the structure prediction revealed that sso2474 belongs to the MarR (multiple antibiotic resistance) family of transcription regulators. MarR family proteins constitute a diverse group of transcription regulators that modulate the expression of genes encoding proteins involved in metabolic pathways, stress responses, virulence, and degradation or export of harmful chemicals such as antibiotics, organic solvents (White et al., 1997), oxidative stress agents (Ariza et al., 1994) and household disinfectants (McMurry et al., 1998). The mar locus thus appears to be part of the mechanism that strains use to resist the lethal effects of a wide range of toxic agents. E. coli MarR was the first MarR family regulator described, and its homologs are widely distributed in both bacteria and archaea. MarR, a component of the marRAB locus in E. coli, represses its own operon, whereas MarA is a transcription activator that activates the operon and regulates the expression of proteins important for multiple antibiotic resistance (Alekshun and Levy, 1997). In many strains, constitutive expression of MarA contributes to maintaining resistance to antibiotics and other environmental hazards, and deletion of marR or inactivation of MarR results in increased


expression of MarA (Alekshun and Levy, 1999; Barbosa and Levy, 2000). To date, no study has shown that TrmB family or MarR family proteins bind specifically to a virus genome upon virus infection. If sso2474 behaves like a MarR family protein, the strain might activate the expression of the corresponding proteins to resist virus infection, and sso2474 might bind the virus genome to hinder transcription. To test this hypothesis, growth retardation by SIRV2 was compared between an sso2474-overexpressing strain and the wild-type strain, and no difference was observed in the growth curves (data not shown). This indicates that the protein is probably not involved in inhibiting virus growth. Alternatively, there may be a difference between viral and host DNA, e.g. a modification, that allows sso2474 to bind specifically to viral DNA. The binding mechanism of this protein and its possible interaction partners remain to be identified. It will be intriguing to examine the phenotype of an sso2474 mutant strain upon SIRV2 infection in comparison with the wild-type strain.

The downregulated transcription regulator sso10340 shows similarity to the C-terminal domain of Lrp/AsnC family proteins (Lrp, leucine-responsive regulatory protein). Most of the experimentally characterized archaeal transcriptional regulators belong to this family, whose members act as either global or specific regulators. Family members are found in both bacteria and archaea but not in eukarya (Brinkman et al., 2003). Lrp family proteins typically have a monomer molecular weight of about 15 kDa, with an N-terminal wHTH (winged helix-turn-helix) domain and a C-terminal RAM (Regulation of Amino acid Metabolism) domain. The RAM domain has an αβ-sandwich fold and is possibly involved in effector recognition and oligomerization of the protein subunits (Thaw et al., 2006). Proteins that possess only the RAM domain are frequently observed in the genomes of many organisms. They are described as proteins with a novel ligand-binding domain, or stand-alone RAM-domain (SARD) proteins, involved in regulation of amino acid metabolism (Ettema et al., 2002). Although many of them have been crystallized and structurally determined, their functions remain unclear (Miyazono et al., 2008; Nakano et al., 2006). The failure to detect any DNA binding by sso10340 in vitro is possibly due to unsuitable binding conditions or to a lack of DNA-binding activity. It is possible that sso10340 recognizes an effector and interacts with a DNA-binding protein or a transcription regulator to achieve its regulatory role in vivo. Indeed, sequences close to or within the coding regions of a DNA-binding protein (sso2626) and a transcription regulator (sso2827) were detected in this study (Table 3). Whether sso10340 interacts with these two proteins remains to be confirmed.


References

Alekshun,M.N., and Levy,S.B. (1997) Regulation of chromosomally mediated multiple antibiotic resistance: the mar regulon. Antimicrob Agents Chemother 41: 2067-2075.

Alekshun,M.N., and Levy,S.B. (1999) Alteration of the repressor activity of MarR, the negative regulator of the Escherichia coli marRAB locus, by multiple chemicals in vitro. J Bacteriol 181: 4669-4672.

Ariza,R.R., Cohen,S.P., Bachhawat,N., Levy,S.B., and Demple,B. (1994) Repressor mutations in the marRAB operon that activate oxidative stress genes and multiple antibiotic resistance in Escherichia coli. J Bacteriol 176: 143-148.

Bailey,T.L., and Elkan,C. (1994) Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol 2: 28-36.

Barbosa,T.M., and Levy,S.B. (2000) Differential expression of over 60 chromosomal genes in Escherichia coli by constitutive expression of MarA. J Bacteriol 182: 3467-3474.

Bell,S.D., Kosa,P.L., Sigler,P.B., and Jackson,S.P. (1999) Orientation of the transcription preinitiation complex in archaea. Proc Natl Acad Sci U S A 96: 13662-13667.

Bell,S.D., Magill,C.P., and Jackson,S.P. (2001) Basal and regulated transcription in Archaea. Biochem Soc Trans 29: 392-395.

Bize,A., Karlsson,E.A., Ekefjard,K., Quax,T.E., Pina,M., Prevost,M.C. et al. (2009) A unique virus release mechanism in the Archaea. Proc Natl Acad Sci U S A 106: 11306-11311.

Brinkman,A.B., Ettema,T.J., de Vos,W.M., and van der Oost,J. (2003) The Lrp family of transcriptional regulators. Mol Microbiol 48: 287-294.

Carver,T., Harris,S.R., Berriman,M., Parkhill,J., and McQuillan,J.A. (2012) Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics 28: 464-469.

Deng,L., He,F., Bhoobalan-Chitty,Y., Martinez-Alvarez,L., Guo,Y., and Peng,X. (2014) Unveiling cell surface and type IV secretion proteins responsible for archaeal rudivirus entry. J Virol 88: 10264-10268.

Dickey,T.H., Altschuler,S.E., and Wuttke,D.S. (2013) Single-stranded DNA-binding proteins: multiple domains for multiple functions. Structure 21: 1074-1084.

Ettema,T.J., Brinkman,A.B., Tani,T.H., Rafferty,J.B., and Van Der Oost,J. (2002) A novel ligand-binding domain involved in regulation of amino acid metabolism in prokaryotes. J Biol Chem 277: 37464-37468.

Frols,S., Gordon,P.M., Panlilio,M.A., Schleper,C., and Sensen,C.W. (2007) Elucidating the transcription cycle of the UV-inducible hyperthermophilic archaeal virus SSV1 by DNA microarrays. Virology 365: 48-59.

Gardner,A.F., Guan,C., and Jack,W.E. (2011) Biochemical characterization of a structure-specific resolving enzyme from Sulfolobus islandicus rod-shaped virus 2. PLoS One 6: e23668.


Grabowski,B., and Kelman,Z. (2003) Archaeal DNA replication: eukaryal proteins in a bacterial context. Annu Rev Microbiol 57: 487-516.

Gudbergsdottir,S., Deng,L., Chen,Z., Jensen,J.V., Jensen,L.R., She,Q., and Garrett,R.A. (2011) Dynamic properties of the Sulfolobus CRISPR/Cas and CRISPR/Cmr systems when challenged with vector-borne viral and plasmid genes and protospacers. Mol Microbiol 79: 35-49.

He,F., Chen,L., and Peng,X. (2014) First Experimental Evidence for the Presence of a CRISPR Toxin in Sulfolobus. J Mol Biol.

Kanai,T., Akerboom,J., Takedomi,S., van de Werken,H.J., Blombach,F., van der Oost,J. et al. (2007) A global transcriptional regulator in Thermococcus kodakaraensis controls the expression levels of both glycolytic and gluconeogenic enzyme-encoding genes. J Biol Chem 282: 33659-33670.

Kelley,L.A., and Sternberg,M.J. (2009) Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc 4: 363-371.

Kessler,A., Brinkman,A.B., van der Oost,J., and Prangishvili,D. (2004) Transcription of the rod-shaped viruses SIRV1 and SIRV2 of the hyperthermophilic archaeon Sulfolobus. J Bacteriol 186: 7745-7753.

Koonin,E.V., and Galperin,M.Y. (1997) Prokaryotic genomes: the emerging paradigm of genome-based microbiology. Curr Opin Genet Dev 7: 757-763.

Lee,S.J., Surma,M., Hausner,W., Thomm,M., and Boos,W. (2008) The role of TrmB and TrmB-like transcriptional regulators for sugar transport and metabolism in the hyperthermophilic archaeon Pyrococcus furiosus. Arch Microbiol 190: 247-256.

Leonard,P.M., Smits,S.H., Sedelnikova,S.E., Brinkman,A.B., de Vos,W.M., van der Oost,J. et al. (2001) Crystal structure of the Lrp-like transcriptional regulator from the archaeon Pyrococcus furiosus. EMBO J 20: 990-997.

Maruyama,H., Shin,M., Oda,T., Matsumi,R., Ohniwa,R.L., Itoh,T. et al. (2011) Histone and TK0471/TrmBL2 form a novel heterogeneous genome architecture in the hyperthermophilic archaeon Thermococcus kodakarensis. Mol Biol Cell 22: 386-398.

McMurry,L.M., Oethinger,M., and Levy,S.B. (1998) Overexpression of marA, soxS, or acrAB produces resistance to triclosan in laboratory and clinical strains of Escherichia coli. FEMS Microbiol Lett 166: 305-309.

Miyazono,K., Tsujimura,M., Kawarabayasi,Y., and Tanokura,M. (2008) Crystal structure of STS042, a stand-alone RAM module protein, from hyperthermophilic archaeon Sulfolobus tokodaii strain 7. Proteins 71: 1557-1562.

Nakano,N., Okazaki,N., Satoh,S., Takio,K., Kuramitsu,S., Shinkai,A., and Yokoyama,S. (2006) Structure of the stand-alone RAM-domain protein from Thermus thermophilus HB8. Acta Crystallogr Sect F Struct Biol Cryst Commun 62: 855-860.

Nichols,C.E., Sainsbury,S., Ren,J., Walter,T.S., Verma,A., Stammers,D.K. et al. (2009) The structure of NMB1585, a MarR-family regulator from Neisseria meningitidis. Acta Crystallogr Sect F Struct Biol Cryst Commun 65: 204-209.


Oke,M., Kerou,M., Liu,H., Peng,X., Garrett,R.A., Prangishvili,D. et al. (2011) A dimeric Rep protein initiates replication of a linear archaeal virus genome: implications for the Rep mechanism and viral replication. J Virol 85: 925-931.

Okutan,E., Deng,L., Mirlashari,S., Uldahl,K., Halim,M., Liu,C. et al. (2013) Novel insights into gene regulation of the rudivirus SIRV2 infecting Sulfolobus cells. RNA Biol 10: 875-885.

Ortmann,A.C., Brumfield,S.K., Walther,J., McInnerney,K., Brouns,S.J., van de Werken,H.J. et al. (2008) Transcriptome analysis of infection of the archaeon Sulfolobus solfataricus with Sulfolobus turreted icosahedral virus. J Virol 82: 4874-4883.

Perez-Rueda,E., and Janga,S.C. (2010) Identification and genomic analysis of transcription factors in archaeal genomes exemplifies their functional architecture and evolutionary origin. Mol Biol Evol 27: 1449-1459.

Pina,M., Bize,A., Forterre,P., and Prangishvili,D. (2011) The archeoviruses. FEMS Microbiol Rev 35: 1035-1054.

Quax,T.E., Lucas,S., Reimann,J., Pehau-Arnaudet,G., Prevost,M.C., Forterre,P. et al. (2011) Simple and elegant design of a virion egress structure in Archaea. Proc Natl Acad Sci U S A 108: 3354-3359.

Quemin,E.R., Lucas,S., Daum,B., Quax,T.E., Kuhlbrandt,W., Forterre,P. et al. (2013) First insights into the entry process of hyperthermophilic archaeal viruses. J Virol 87: 13379-13385.

Reichlen,M.J., Vepachedu,V.R., Murakami,K.S., and Ferry,J.G. (2012) MreA functions in the global regulation of methanogenic pathways in Methanosarcina acetivorans. MBio 3: e00189-12.

Satoh,J., and Tabunoki,H. (2013) Molecular network of chromatin immunoprecipitation followed by deep sequencing-based vitamin D receptor target genes. Mult Scler 19: 1035-1045.

Thaw,P., Sedelnikova,S.E., Muranova,T., Wiese,S., Ayora,S., Alonso,J.C. et al. (2006) Structural insight into gene transcriptional regulation and effector binding by the Lrp/AsnC family. Nucleic Acids Res 34: 1439-1449.

Wagner,M., Wagner,A., Ma,X., Kort,J.C., Ghosh,A., Rauch,B. et al. (2014) Investigation of the malE promoter and MalR, a positive regulator of the maltose regulon, for an improved expression system in Sulfolobus acidocaldarius. Appl Environ Microbiol 80: 1072-1081.

White,D.G., Goldman,J.D., Demple,B., and Levy,S.B. (1997) Role of the acrAB locus in organic solvent tolerance mediated by expression of marA, soxS, or robA in Escherichia coli. J Bacteriol 179: 6122-6126.

Zillig,W., Prangishvili,D., Schleper,C., Elferink,M., Holz,I., Albers,S. et al. (1996) Viruses, plasmids and other genetic elements of thermophilic and hyperthermophilic Archaea. FEMS Microbiol Rev 18: 225-236.


Table and Figure Legends

Table 1. Sequencing and mapping data with Sulfolobus solfataricus P2 and SIRV2 genomes

Table 2. The average virus copy number in each host cell at 2.5 h post infection with SIRV2

Table 3. The products and COG functional categories of the target genes bound by sso10340

Figure 1. Purification of proteins sso2474 and sso10340, as well as the negative control, from Sulfolobus solfataricus P2. (A) Purified protein elution samples of sso2474 and the negative control were subjected to 12.5% SDS-PAGE to assess the purity of the protein (a) and to 0.7% agarose gel electrophoresis to detect the amount of DNA bound by the protein (b). (B) Purified protein elution samples of sso10340 and the negative control were subjected to SDS-PAGE to assess the purity of the protein (a) and to agarose gel electrophoresis to detect the amount of DNA (b). Western blot analysis of the purified sso10340 (c).

Figure 2. Genome-wide distribution of sso2474 binding regions on the Sulfolobus solfataricus P2 genome (A) and the SIRV2 virion genome (B). These experiments were performed using DNA extracted from purified sso2474 and from the negative control. Data were analyzed and visualized as described in the `Material and Methods` section. The genome coordinates (in bp) are given on the x-axis, and the y-axis represents the sequenced reads aligned to the genome. The sharp peak marked with an arrow in (A) is located in the region encoding protein sso2474.

Figure 3. (A) The purified sso2474 fractions (L1-L6) from E. coli were analyzed on an SDS-PAGE gel. The single major band around 15 kDa represents protein sso2474. (B) EMSA with a dsDNA and ssDNA mixture as substrate. The protein concentration was increased from 0.1 µM to 2.0 µM. The length of the oligos was 35 bp. The dsDNA and ssDNA were mixed in an equimolar ratio (25 nM : 25 nM).

Figure 4. Genome-wide distribution of sso10340 binding regions on the Sulfolobus solfataricus P2 genome (A) and the SIRV2 virion genome (B), and classification of sso10340 binding regions with respect to genomic organization (C). (A) and (B) These experiments were performed using DNA extracted from purified sso10340 (red bars) and from the input control (empty plasmid; green bars). The genome coordinates (in bp) are given on the x-axis, and the y-axis represents the sequenced reads aligned to the genome. Target genes detected in vivo are indicated. (C) ChIP-enriched regions are indicated by black horizontal bars, whereas ORFs are depicted by horizontal arrows. Binding regions ranging from -500 bp to +100 bp relative to the translation start site were considered to be upstream of a transcription unit or intragenic but upstream. ChIP-enriched regions located exclusively in gene coding regions, or partially downstream of the gene, were classified as intragenic. Binding regions lying downstream of both neighbouring genes were classified as intergenic.

Figure 5. WebLogo of the motif detected with MEME-ChIP. The motif logo discovered from the submitted binding-region sequences (B) is shown aligned with the most similar JASPAR motif logo (A); the similarity is significant (E≤0.05).


Table 1. Sequencing and mapping with Sulfolobus solfataricus P2 and SIRV2

| Sample | Sequenced reads | Mapped reads | Mapped reads with P2 genome | Alignment rate with P2 genome | Mapped reads with SIRV2 genome | Alignment rate with SIRV2 genome |
|---|---|---|---|---|---|---|
| Input control | 393537 | 362185 | 359559 | 91.37% | 2626 | 0.67% |
| sso2474 | 363814 | 347347 | 13677 | 3.76% | 333670 | 91.70% |
| sso10340 | 388533 | 359145 | 357598 | 92.04% | 1547 | 0.4% |
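The alignment rates in Table 1 appear to be the mapped reads for each genome divided by the total sequenced reads of the sample. The short sketch below recomputes them from the read counts in the table under that assumption; the dictionary layout and function name are illustrative only.

```python
# Read counts copied from Table 1.
table1 = {
    "Input control": {"sequenced": 393537, "mapped": 362185, "host": 359559, "virus": 2626},
    "sso2474": {"sequenced": 363814, "mapped": 347347, "host": 13677, "virus": 333670},
    "sso10340": {"sequenced": 388533, "mapped": 359145, "host": 357598, "virus": 1547},
}

def alignment_rates(counts):
    """Return (host %, virus %) relative to all sequenced reads of the sample."""
    total = counts["sequenced"]
    host_rate = 100 * counts["host"] / total
    virus_rate = 100 * counts["virus"] / total
    return round(host_rate, 2), round(virus_rate, 2)

for sample, counts in table1.items():
    host_rate, virus_rate = alignment_rates(counts)
    print(f"{sample}: P2 {host_rate}%, SIRV2 {virus_rate}%")
```

Under this assumption the recomputed values agree with the table to within rounding, and the host and virus reads of each sample sum to its total mapped reads.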

Table 2. The average virus copy number in each host cell at 2.5 h post infection with SIRV2

| Time (h) | Host SQ Mean | Host Std.Dev | Virus SQ Mean | Virus Std.Dev | Virus copy number/host chromosome |
|---|---|---|---|---|---|
| 2.5 | 5.287E+07 | 1.80E+06 | 3.158E+07 | 8.22E+06 | 0.6 |
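The copy number in Table 2 follows from dividing the virus SQ mean by the host SQ mean. The sketch below reproduces that ratio and, as an added assumption that is not part of the thesis, estimates its spread by first-order error propagation, treating the two measurements as independent.

```python
import math

def copies_per_chromosome(virus_sq, host_sq):
    """Ratio of the viral to the host quantity."""
    return virus_sq / host_sq

def ratio_std(virus_sq, virus_sd, host_sq, host_sd):
    """First-order error propagation for a ratio, assuming independent errors."""
    ratio = virus_sq / host_sq
    relative = math.sqrt((virus_sd / virus_sq) ** 2 + (host_sd / host_sq) ** 2)
    return ratio * relative

# Values copied from Table 2 (2.5 h post infection).
ratio = copies_per_chromosome(3.158e7, 5.287e7)
spread = ratio_std(3.158e7, 8.22e6, 5.287e7, 1.80e6)
print(f"{ratio:.2f} +/- {spread:.2f} virus copies per host chromosome")
```

Rounded to one decimal place, the ratio matches the 0.6 copies per host chromosome given in the table.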


Table 3. The products as well as their COG Functional Category of target genes binding by sso10340

| Target name | Genomic coordinate peak start | Genomic coordinate peak stop | Motif binding location | Log2 ChIP DNA/input DNA | Gene function | COG Functional Category |
|---|---|---|---|---|---|---|
| doxC | 40520 | 41150 | Inside the gene | 1.82 | Terminal oxidase | Energy production and conversion |
| sso1214 | 1053350 | 1054300 | Inside the gene | 3.41 | Carbonic anhydrases | Energy production and conversion |
| sso1580 | 1427200 | 1428050 | 80 bp inside the gene from start codon | 2.21 | Anaerobic dehydrogenases; molybdopterin oxidoreductase | Energy production and conversion |
| sso2826 | 2587400 | 2588210 | Inside the gene | 2.34 | Molybdopterin-binding protein | Energy production and conversion |
| pacS | 2411320 | 2412430 | 200 bp upstream | 3.39 | Cation transporting ATPase | Inorganic ion transport and metabolism |
| sso3189 | 2935150 | 2935710 | 75 bp inside the gene from start codon | 1.93 | Amino acid permease | Amino acid transport and metabolism |
| adh-11 | 2469950 | 2470580 | 70 bp upstream | 1.96 | Alcohol dehydrogenase | Amino acid transport and metabolism |
| sso2153 | 1977900 | 1978650 | Inside the gene | 2.66 | Archaeal putative transposase ISC1217; pfam04693 | Transposase |
| sso12210 | 2965830 | 2966580 | 175 bp upstream | 2.07 | Transposase | Transposase |
| sso10340 | 2189085 | 2189390 | Inside the gene | 2.31 | Conserved, truncated variant of Lrp/AsnC-family; RNA polymerase and transcription factors | Transcription |
| sso2626 | 2391100 | 2392050 | Start codon | 2.07 | DNA-binding protein | Transcription |
| sso2827 | 2588200 | 2589300 | Inside the gene | 3.06 | Transcription regulator | Transcription |
| rps26E | 502000 | 502750 | Inside the gene | 2.49 | 30S ribosomal protein S26e | Translation |
| rpl44e | 907520 | 908380 | 24 bp inside the gene from start codon | 1.88 | 50S ribosomal protein L44e | Translation |
| mrp | 395515 | 396010 | Inside the gene | 1.78 | ATPases involved in chromosome partitioning | Cell division and chromosome partitioning |
| sso2730 | 2485375 | 2485920 | Inside the gene | 1.87 | Possibly ATPase of the AAA superfamily | Cell division and chromosome partitioning |
| metS | 491920 | 492385 | Inside the gene | 2.48 | Methionine-tRNA ligase | RNA processing and modification |
| sso1044 | 903240 | 904070 | Inside the gene | 2.96 | RNA methyltransferase, DNA methyltransferase, tRNA | RNA processing and modification |
| sso1288 | 1114815 | 1115458 | Inside the gene | 2.16 | short last area hits to a zoo of molecules associated with cell wall/cell membrane | Cell envelope biogenesis, outer membrane |
| sso3150 | 2902540 | 2903225 | Inside the gene | 3.19 | Putative membrane protein | Cell envelope biogenesis, outer membrane |
| sso3178 | 2923630 | 2924485 | 280 bp upstream | 2.10 | Cell wall | Cell envelope biogenesis, outer membrane |
| sso2749 | 2502810 | 2503615 | Inside the gene | 2.37 | Ferritin; oxidoreductase, oxidative-stress response | Defense mechanisms |
| sso2037 | 1850715 | 1851580 | Inside the gene | 3.06 | Multicopy, similar to sso1886; proteases | Defense mechanisms |
| sso1758 | 1592730 | 1593280 | 202 bp upstream | 2.20 | Multicopy, similar to carboxy-end of sso1752, 1469 and 1606 | Unclassified |
| sso0927 | 787505 | 788400 | Inside the gene | 2.09 | Fe-S cluster assembly protein SufB | Unclassified |
| sso0438 | 380250 | 381045 | 43 bp upstream | 2.45 | - | Function unknown |
| sso1984 | 1796200 | 1796975 | 22 bp inside the gene from the start codon | 2.98 | - | Function unknown |


[Figure 1. Panels A, B and (c). Lane labels: M, PEXA3, SSO10340-1, SSO10340-2. Size markers: 10, 15, 27, 55 and 70 kDa. Band labels: sso10340, dimer, octamer.]


[Figure 2. Panels A and B.]


[Figure 3. Panel A: lanes M and 1-6; size markers 10 kDa and 15 kDa; band labeled sso2474. Panel B: lanes 1-8 with sso2474 at 0, 0.1, 0.2, 0.4, 0.8, 1.2, 1.6 and 2.0 µM; bands labeled dsDNA and ssDNA.]


[Figure 4. Panel A.]


[Figure 4, continued. Panels B and C. Panel C pie chart: "Distribution of the genomic locations of 10340 binding sites"; values 46%, 26%, 15%, 13%; legend: Intragenic, Intragenic but upstream, Upstream, Intergenic.]


Figure 5


SyntTax Report http://archaea.u-psud.fr/SyntTax

Genomic contexts

Sulfolobus solfataricus 98 2 uid167998. Score: 79.92
Sulfolobus islandicus HVE10 4 uid162067. Score: 70.47
Sulfolobus islandicus L S 2 15 uid58871. Score: 70.08
Sulfolobus islandicus L D 8 5 uid43679 C1. Score: 70.08
Sulfolobus solfataricus P2 uid57721. Score: 70.08
Sulfolobus islandicus M 14 25 uid58849. Score: 70.08
Sulfolobus islandicus Y N 15 51 uid58825 C1. Score: 70.08
Sulfolobus islandicus M 16 4 uid58841. Score: 70.08
Sulfolobus islandicus M 16 27 uid58851. Score: 70.08
Sulfolobus islandicus Y G 57 14 uid58923. Score: 70.08
Sulfolobus islandicus REY15A uid162071. Score: 69.69
Sulfolobus islandicus LAL14 1 uid197216. Score: 69.69
Acidianus hospitalis W1 uid66875. Score: 27.01
Sulfolobus tokodaii 7 uid57807. Score: 25.79
Metallosphaera cuprina Ar 4 uid66329. Score: 25.16
Metallosphaera sedula DSM 5348 uid58717. Score: 23.03
Sulfolobus acidocaldarius SUSAZ uid232254. Score: 20.31
Sulfolobus acidocaldarius N8 uid189027. Score: 20.16
Sulfolobus acidocaldarius Ron12 I uid189028. Score: 20.16


Sulfolobus acidocaldarius DSM 639 uid58379. Score: 20.16
Fervidicoccus fontis Kam940 uid162201. Score: 17.13
Pyrolobus fumarii 1A uid73415. Score: 15.31
Methanocaldococcus fervens AG86 uid59347 C1. Score: 11.54
Pyrobaculum oguniense TE7 uid84411 C1. Score: 11.06
Pyrobaculum arsenaticum DSM 13514 uid58409. Score: 11.06
Methanobacterium SWAN1 uid67359. Score: 10.75
Halopiger xanaduensis SH 6 uid68105 C1. Score: 10.75
Methanosarcina mazei Go1 uid57893. Score: 10.47
Methanosarcina mazei Tuc01 uid190185. Score: 10.47
Methanosarcina barkeri Fusaro uid57715 C1. Score: 10.47
Natrinema pellirubrum DSM 15624 uid74437 C1. Score: 10.47

63

12

64

12

65

12

66

12

67

12

68

12

69

12

70

12

71

12

72

12

73

12

74

12

75

Pyrococcus horikoshiiOT3 uid57753

Score: 10.31

10

80

10

81

10

82

10

83

10

85

10

86

10

87

10

88

10

89

10

90

10

91

10

93

10

95

10

97

10

98

Methanosarcina acetivoransC2A uid57879

Score: 10.31

32

66

32

67

rim

K

32

69

32

70

tru

D

32

72

32

73

32

74

32

75

32

76

32

77

32

78

Pyrobaculum calidifontisJCM 11548 uid58787

Score: 10.00

14

45

14

46

14

47

14

48

14

49

14

50

14

51

14

52

14

53

14

54

14

55

14

56

14

57

14

58

14

59

14

60

14

61

14

62

14

63

14

64

14

65

Thermofilum pendensHrk 5 uid58563 C1

Score: 10.00

00

35

00

36

00

37

00

38

00

39

00

40

00

41

00

42

00

43

00

44

00

45

00

46

00

47

00

48

00

49

00

50

R0

00

1

00

51

00

52

00

53

00

54

00

55


SyntTax Report http://archaea.u-psud.fr/SyntTax page 3/3

Query protein sequence: mqgkseismp dgrvadvfnv vkflyglsdr dieilkllik sqssltmeei sselnitksv vnksilnlek knivikekve sskkgrrayt yrvdvnyltr klvtdldqli kdlkvkiadv igiqiektas v

Genomes without synteny: Acidilobus_saccharovorans_345_15_uid51395, Caldisphaera_lagunensis_DSM_15908_uid183486, Aeropyrum_camini_SY1___JCM_12091_uid222311, Aeropyrum_pernix_K1_uid57757, Desulfurococcus_fermentans_DSM_16532_uid75119, Desulfurococcus_kamchatkensis_1221n_uid59133, Desulfurococcus_mucosus_DSM_2162_uid62227, Ignicoccus_hospitalis_KIN4_I_uid58365, Ignisphaera_aggregans_DSM_17230_uid51875, Staphylothermus_hellenicus_DSM_12710_uid45893, Staphylothermus_marinus_F1_uid58719, Thermogladius_1633_uid167488, Thermosphaera_aggregans_DSM_11486_uid48993, Hyperthermus_butylicus_DSM_5456_uid57755, Thermofilum_1910b_uid215374, Caldivirga_maquilingensis_IC_167_uid58711, Pyrobaculum_1860_uid82379, Pyrobaculum_aerophilum_IM2_uid57727, Pyrobaculum_islandicum_DSM_4184_uid58635, Pyrobaculum_neutrophilum_V24Sta_uid58421, Thermoproteus_tenax_Kra_1_uid74443, Thermoproteus_uzoniensis_768_20_uid65089, Vulcanisaeta_distributa_DSM_14429_uid52827, Vulcanisaeta_moutnovskia_768_28_uid63631, Archaeoglobus_fulgidus_DSM_4304_uid57717, Archaeoglobus_profundus_DSM_5631_uid43493_C1, Archaeoglobus_sulfaticallidus_PM70_1_uid201033, Archaeoglobus_veneficus_SNP6_uid65269, Ferroglobus_placidus_DSM_10642_uid40863, Halalkalicoccus_jeotgali_B3_uid50305_C1, Haloarcula_hispanica_ATCC_33960_uid72475_C1, Haloarcula_hispanica_N601_uid230920_C1, Haloarcula_marismortui_ATCC_43049_uid57719_C1, Halobacterium_NRC_1_uid57769_C1, Halobacterium_salinarum_R1_uid61571_C1, Haloferax_mediterranei_ATCC_33500_uid167315_C1, Haloferax_volcanii_DS2_uid46845_C1, Halogeometricum_borinquense_DSM_11551_uid54919_C1, Halomicrobium_mukohataei_DSM_12286_uid59107_C1, Haloquadratum_walsbyi_C23_uid162019_C1, Haloquadratum_walsbyi_DSM_16790_uid58673_C1, Halorhabdus_tiamatea_SARL4B_uid214082_C1, Halorhabdus_utahensis_DSM_12940_uid59189, Halorubrum_lacusprofundi_ATCC_49239_uid58807_C1, Haloterrigena_turkmenica_DSM_5511_uid43501_C1, Halovivax_ruber_XH_70_uid184819, Natrialba_magadii_ATCC_43099_uid46245_C1, Natrinema_J7_uid171337_C1, Natronobacterium_gregoryi_SP2_uid74439, Natronococcus_occultus_SP4_uid184863_C1, Natronomonas_moolapensis_8_8_11_uid190182, Natronomonas_pharaonis_DSM_2160_uid58435_C1, Salinarchaeum_laminariae_Harcht_Bsk1_uid207001, Methanobacterium_AL_21_uid63623, Methanobacterium_MB1_uid231690, Methanobrevibacter_AbM4_uid206516, Methanobrevibacter_ruminantium_M1_uid45857, Methanobrevibacter_smithii_ATCC_35061_uid58827, Methanosphaera_stadtmanae_DSM_3091_uid58407, Methanothermobacter_marburgensis_Marburg_uid51637_C1, Methanothermobacter_thermautotrophicus_Delta_H_uid57877, Methanothermus_fervidus_DSM_2088_uid60167, Methanocaldococcus_FS406_22_uid42499_C1, Methanocaldococcus_infernus_ME_uid48803, Methanocaldococcus_jannaschii_DSM_2661_uid57713_C1, Methanocaldococcus_vulcanius_M7_uid41131_C1, Methanotorris_igneus_Kol_5_uid67321, Methanococcus_aeolicus_Nankai_3_uid58823, Methanococcus_maripaludis_C5_uid58741_C1, Methanococcus_maripaludis_C6_uid58947, Methanococcus_maripaludis_C7_uid58847, Methanococcus_maripaludis_S2_uid58035, Methanococcus_maripaludis_X1_uid70729, Methanococcus_vannielii_SB_uid58767, Methanococcus_voltae_A3_uid49529, Methanothermococcus_okinawensis_IH1_uid51535_C1, Methanocella_arvoryzae_MRE50_uid61623, Methanocella_conradii_HZ254_uid157911, Methanocella_paludicola_SANAE_uid42887, Methanocorpusculum_labreanum_Z_uid58785, Methanoculleus_bourgensis_MS2_uid171377, Methanoculleus_marisnigri_JR1_uid58561, Methanoplanus_petrolearius_DSM_11571_uid52695, Methanoregula_boonei_6A8_uid58815, Methanoregula_formicicum_SMSP_uid184406, Methanosphaerula_palustris_E1_9c_uid59193, Methanospirillum_hungatei_JF_1_uid58181, Methanosaeta_concilii_GP6_uid66207_C1, Methanosaeta_harundinacea_6Ac_uid81199_C1, Methanosaeta_thermophila_PT_uid58469, Methanococcoides_burtonii_DSM_6242_uid58023, Methanohalobium_evestigatum_Z_7303_uid49857_C1, Methanohalophilus_mahii_DSM_5219_uid47313, Methanolobus_psychrophilus_R15_uid177925, Methanomethylovorans_hollandica_DSM_15978_uid184864_C1, Methanosalsum_zhilinae_DSM_4017_uid68249, Methanopyrus_kandleri_AV19_uid57883, Pyrococcus_abyssi_GE5_uid62903_C1, Pyrococcus_furiosus_COM1_uid169620, Pyrococcus_furiosus_DSM_3638_uid57873, Pyrococcus_NA2_uid66551, Pyrococcus_ST04_uid167261, Pyrococcus_yayanosii_CH1_uid68281, Thermococcus_4557_uid70841, Thermococcus_AM4_uid54735, Thermococcus_barophilus_MP_uid54733, Thermococcus_CL1_uid168259, Thermococcus_gammatolerans_EJ3_uid59389, Thermococcus_kodakarensis_KOD1_uid58225, Thermococcus_litoralis_DSM_5473_uid82997, Thermococcus_onnurineus_NA1_uid59043, Thermococcus_sibiricus_MM_739_uid59399, Ferroplasma_acidarmanus_fer1_uid54095, Picrophilus_torridus_DSM_9790_uid58041, Thermoplasma_acidophilum_DSM_1728_uid61573, Thermoplasma_volcanium_GSS1_uid57751


SyntTax Report http://archaea.u-psud.fr/SyntTax page 1/8

Genomic contexts

Sulfolobus solfataricus 98 2 uid167998, Score: 81.88
Sulfolobus islandicus Y N 15 51 uid58825 C1, Score: 81.88
Sulfolobus solfataricus P2 uid57721, Score: 81.88
Sulfolobus islandicus L S 2 15 uid58871, Score: 81.88
Sulfolobus islandicus L D 8 5 uid43679 C1, Score: 81.88
Sulfolobus islandicus LAL14 1 uid197216, Score: 81.21
Sulfolobus islandicus REY15A uid162071, Score: 81.21
Sulfolobus islandicus HVE10 4 uid162067, Score: 81.21
Sulfolobus islandicus M 14 25 uid58849, Score: 81.21
Sulfolobus islandicus M 16 27 uid58851, Score: 81.21
Sulfolobus islandicus M 16 4 uid58841, Score: 81.21
Sulfolobus islandicus Y G 57 14 uid58923, Score: 80.54
Sulfolobus acidocaldarius N8 uid189027, Score: 44.97
Sulfolobus acidocaldarius DSM 639 uid58379, Score: 44.97
Sulfolobus acidocaldarius Ron12 I uid189028, Score: 44.97
Sulfolobus acidocaldarius SUSAZ uid232254, Score: 44.97
Sulfolobus tokodaii 7 uid57807, Score: 44.97
Metallosphaera cuprina Ar 4 uid66329, Score: 43.69
Acidianus hospitalis W1 uid66875, Score: 39.53


SyntTax Report http://archaea.u-psud.fr/SyntTax page 2/8

Metallosphaera sedula DSM 5348 uid58717, Score: 39.53
Thermofilum 1910b uid215374, Score: 27.38
Thermococcus CL1 uid168259, Score: 26.38
Pyrococcus furiosus DSM 3638 uid57873, Score: 25.84
Thermococcus 4557 uid70841, Score: 25.84
Pyrococcus furiosus COM1 uid169620, Score: 25.84
Pyrococcus abyssi GE5 uid62903 C1, Score: 24.56
Pyrococcus NA2 uid66551, Score: 24.56
Pyrococcus yayanosii CH1 uid68281, Score: 24.30
Thermofilum pendens Hrk 5 uid58563 C1, Score: 24.30
Thermococcus onnurineus NA1 uid59043, Score: 24.30
Thermococcus kodakarensis KOD1 uid58225, Score: 24.03
Candidatus Nitrososphaera gargensis Ga9 2 uid176707, Score: 24.03
Pyrococcus horikoshii OT3 uid57753, Score: 23.76
Thermococcus gammatolerans EJ3 uid59389, Score: 23.76
Pyrococcus ST04 uid167261, Score: 23.49
Thermococcus sibiricus MM 739 uid59399, Score: 23.49
Methanoregula boonei 6A8 uid58815, Score: 23.49
Thermococcus AM4 uid54735, Score: 23.49


SyntTax Report http://archaea.u-psud.fr/SyntTax page 3/8

Candidatus Caldiarchaeum subterraneum uid227223, Score: 23.29
Thermoplasma volcanium GSS1 uid57751, Score: 23.29
Thermococcus litoralis DSM 5473 uid82997, Score: 23.29
Thermoplasma acidophilum DSM 1728 uid61573, Score: 23.29
Aeropyrum pernix K1 uid57757, Score: 23.29
Aeropyrum camini SY1 JCM 12091 uid222311, Score: 23.02
Pyrolobus fumarii 1A uid73415, Score: 22.75
Thermococcus barophilus MP uid54733, Score: 22.75
Hyperthermus butylicus DSM 5456 uid57755, Score: 21.95
Methanosphaerula palustris E1 9c uid59193, Score: 20.94
Aciduliprofundum boonei T469 uid43333, Score: 20.94
Ignisphaera aggregans DSM 17230 uid51875, Score: 20.94
Candidatus Nitrosopumilus AR2 uid176130, Score: 19.87
Nitrosopumilus maritimus SCM1 uid58903, Score: 19.87
Aciduliprofundum MAR08 339 uid184407, Score: 19.87
Candidatus Nitrosopumilus koreensis AR1 uid176129, Score: 19.66
Ignicoccus hospitalis KIN4 I uid58365, Score: 19.13
Haloarcula marismortui ATCC 43049 uid57719 C1, Score: 17.85
Thermosphaera aggregans DSM 11486 uid48993, Score: 17.85


SyntTax Report http://archaea.u-psud.fr/SyntTax page 4/8

Thermoproteus uzoniensis 768 20 uid65089, Score: 17.32
Methanomassiliicoccus Mx1 Issoire uid207287, Score: 17.05
Desulfurococcus 1221n uid59133, Score: 16.78
Methanobacterium AL 21 uid63623, Score: 16.78
Archaeoglobus profundus DSM 5631 uid43493 C1, Score: 16.31
Desulfurococcus fermentans DSM 16532 uid75119, Score: 16.31
Vulcanisaeta moutnovskia 768 28 uid63631, Score: 16.31
Pyrobaculum oguniense TE7 uid84411 C1, Score: 16.04
Pyrobaculum arsenaticum DSM 13514 uid58409, Score: 16.04
Methanosaeta concilii GP6 uid66207 C1, Score: 16.04
Halobacterium salinarum R1 uid61571 C1, Score: 15.77
Halobacterium NRC 1 uid57769 C1, Score: 15.77
Halorubrum lacusprofundi ATCC 49239 uid58807 C1, Score: 15.77
Haloquadratum walsbyi DSM 16790 uid58673 C1, Score: 15.77
Haloquadratum walsbyi C23 uid162019 C1, Score: 15.77
Halovivax ruber XH 70 uid184819, Score: 15.50
Methanoculleus marisnigri JR1 uid58561, Score: 15.50
Methanopyrus kandleri AV19 uid57883, Score: 15.50
Methanomethylovorans DSM 15978 uid184864 C1, Score: 15.50


SyntTax Report http://archaea.u-psud.fr/SyntTax page 5/8

Natronomonas pharaonis DSM 2160 uid58435 C1, Score: 15.50
Halorhabdus utahensis DSM 12940 uid59189, Score: 15.23
Methanocaldococcus infernus ME uid48803, Score: 15.23
Methanocaldococcus FS406 22 uid42499 C1, Score: 15.23
Methanosarcina barkeri Fusaro uid57715 C1, Score: 15.23
Thermoplasmatales archaeon BRNA1 uid195930, Score: 15.23
Natronomonas moolapensis 8 8 11 uid190182, Score: 15.23
Pyrobaculum 1860 uid82379, Score: 15.23
Methanococcus aeolicus Nankai 3 uid58823, Score: 15.23
Methanococcus maripaludis C5 uid58741 C1, Score: 15.23
Methanococcus maripaludis C6 uid58947, Score: 15.23
Methanococcus maripaludis C7 uid58847, Score: 15.23
Methanococcus maripaludis X1 uid70729, Score: 15.23
Haloferax volcanii DS2 uid46845 C1, Score: 15.23
Methanothermococcus IH1 uid51535 C1, Score: 15.23
Methanococcus maripaludis S2 uid58035, Score: 15.23
Natrialba magadii ATCC 43099 uid46245 C1, Score: 14.97
Methanocaldococcus DSM 2661 uid57713 C1, Score: 14.97
Methanoplanus petrolearius DSM 11571 uid52695, Score: 14.97


SyntTax Report http://archaea.u-psud.fr/SyntTax page 6/8

Ferroglobus placidus DSM 10642 uid40863, Score: 14.97
Methanocaldococcus fervens AG86 uid59347 C1, Score: 14.97
Methanocaldococcus vulcanius M7 uid41131 C1, Score: 14.97
Cenarchaeum symbiosum A uid61411, Score: 14.70
Acidilobus saccharovorans 345 15 uid51395, Score: 14.70
Caldivirga maquilingensis IC 167 uid58711, Score: 14.70
Archaeoglobus veneficus SNP6 uid65269, Score: 14.70
Pyrobaculum neutrophilum V24Sta uid58421, Score: 14.70
Methanohalobium evestigatum Z 7303 uid49857 C1, Score: 14.70
Methanosarcina mazei Go1 uid57893, Score: 14.70
Halorhabdus tiamatea SARL4B uid214082 C1, Score: 14.50
Pyrobaculum aerophilum IM2 uid57727, Score: 14.50
Halalkalicoccus jeotgali B3 uid50305 C1, Score: 14.50
Salinarchaeum laminariae Harcht Bsk1 uid207001, Score: 14.50
Staphylothermus marinus F1 uid58719, Score: 14.50
Methanothermobacter Delta H uid57877, Score: 14.50
Methanococcus voltae A3 uid49529, Score: 14.50
Methanothermobacter Marburg uid51637 C1, Score: 14.50
Methanosaeta thermophila PT uid58469, Score: 14.50


SyntTax Report http://archaea.u-psud.fr/SyntTax page 7/8

Methanosaeta harundinacea 6Ac uid81199 C1. Score: 14.50

Nanoarchaeum equitans Kin4 M uid58009. Score: 14.23

Methanohalophilus mahii DSM 5219 uid47313. Score: 14.23

Candidatus Korarchaeum cryptofilum OPF8 uid58601. Score: 14.23

Vulcanisaeta distributa DSM 14429 uid52827. Score: 14.23

Caldisphaera lagunensis DSM 15908 uid183486. Score: 13.96

Desulfurococcus mucosus DSM 2162 uid62227. Score: 13.96

Methanococcus vannielii SB uid58767. Score: 13.96. Labelled gene: rplX

Methanobrevibacter smithii ATCC 35061 uid58827. Score: 13.96

Ferroplasma acidarmanus fer1 uid54095. Score: 13.96

Thermogladius 1633 uid167488. Score: 13.69


Query protein sequence: maevvrayil vsttvgkeme vadmakkvsg viradpvyge ydvvveveak ssddlkkviy eirrnpniir tvtlivm

Genomes without synteny:
Staphylothermus_hellenicus_DSM_12710_uid45893
Fervidicoccus_fontis_Kam940_uid162201
Pyrobaculum_calidifontis_JCM_11548_uid58787
Pyrobaculum_islandicum_DSM_4184_uid58635
Thermoproteus_tenax_Kra_1_uid74443
Archaeoglobus_fulgidus_DSM_4304_uid57717
Archaeoglobus_sulfaticallidus_PM70_1_uid201033
Haloarcula_hispanica_ATCC_33960_uid72475_C1
Haloarcula_hispanica_N601_uid230920_C1
Haloferax_mediterranei_ATCC_33500_uid167315_C1
Halogeometricum_borinquense_DSM_11551_uid54919_C1
Halomicrobium_mukohataei_DSM_12286_uid59107_C1
Halopiger_xanaduensis_SH_6_uid68105_C1
Haloterrigena_turkmenica_DSM_5511_uid43501_C1
Natrinema_J7_uid171337_C1
Natrinema_pellirubrum_DSM_15624_uid74437_C1
Natronobacterium_gregoryi_SP2_uid74439
Natronococcus_occultus_SP4_uid184863_C1
Methanobacterium_MB1_uid231690
Methanobacterium_SWAN_1_uid67359
Methanobrevibacter_AbM4_uid206516
Methanobrevibacter_ruminantium_M1_uid45857
Methanosphaera_stadtmanae_DSM_3091_uid58407
Methanothermus_fervidus_DSM_2088_uid60167
Methanotorris_igneus_Kol_5_uid67321
Methanocella_arvoryzae_MRE50_uid61623
Methanocella_conradii_HZ254_uid157911
Methanocella_paludicola_SANAE_uid42887
Methanocorpusculum_labreanum_Z_uid58785
Methanoculleus_bourgensis_MS2_uid171377
Methanoregula_formicicum_SMSP_uid184406
Methanospirillum_hungatei_JF_1_uid58181
Methanococcoides_burtonii_DSM_6242_uid58023
Methanolobus_psychrophilus_R15_uid177925
Methanosalsum_zhilinae_DSM_4017_uid68249
Methanosarcina_acetivorans_C2A_uid57879
Methanosarcina_mazei_Tuc01_uid190185
Picrophilus_torridus_DSM_9790_uid58041


                      10         20         30         40         50
              ....|....| ....|....| ....|....| ....|....| ....|....|
Lrp         1 MVDSKKRPGK DLDRIDRNIL NELQKDGRIS NVELSKRVGL SPTPCLERVR
sso10340    1 ---------- ---------- ---------- ---------- ----------
Clustal C

                      60         70         80         90        100
              ....|....| ....|....| ....|....| ....|....| ....|....|
Lrp        51 RLERQGFIQG YTALLNPHYL DASLLVFVEI TLNRGAPDVF EQFNTAVQKL
sso10340    1 ---------- ---------M AEVVRAYILV STTVGKEMEV ADM---AKKV
Clustal C     : : .:: : : . * . :: .:*:

                     110        120        130        140        150
              ....|....| ....|....| ....|....| ....|....| ....|....|
Lrp       101 EEIQECHLVS GDFDYLLKTR VPDMSAYRKL LGETLLRLPG VNDTRTYVVM
sso10340   29 SGVIRADPVY GEYDVVVEVE AKSSDDLKKV IYE-IRRNPN IIRTVTLIVM
Clustal C  12 . : ... * *::* :::.. . . . :*: : * : * *. : * * :**

                     160
              ....|....| ....
Lrp       151 EEVKQSNRLV IKTR
sso10340   77 ---------- ----
Clustal C

Figure S3
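The alignment in Figure S3 can be summarised numerically. The following is a minimal sketch, not the procedure used in the thesis: it re-types the two aligned rows from the figure and counts identical residues over the columns where both sequences have a residue. The function name `identity_counts` and the choice of denominator (columns without a gap in either row) are assumptions made for illustration; other identity conventions exist.

```python
# Aligned rows re-typed from Figure S3 ('-' marks an alignment gap).
LRP = (
    "MVDSKKRPGKDLDRIDRNILNELQKDGRISNVELSKRVGLSPTPCLERVR"
    "RLERQGFIQGYTALLNPHYLDASLLVFVEITLNRGAPDVFEQFNTAVQKL"
    "EEIQECHLVSGDFDYLLKTRVPDMSAYRKLLGETLLRLPGVNDTRTYVVM"
    "EEVKQSNRLVIKTR"
)
SSO10340 = (
    "-" * 50
    + "-" * 19 + "MAEVVRAYILVSTTVGKEMEVADM---AKKV"
    + "SGVIRADPVYGEYDVVVEVEAKSSDDLKKVIYE-IRRNPNIIRTVTLIVM"
    + "-" * 14
)


def identity_counts(row_a: str, row_b: str) -> tuple[int, int]:
    """Return (identical columns, columns where neither row has a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


if __name__ == "__main__":
    same, total = identity_counts(LRP, SSO10340)
    print(f"{same} identical residues over {total} aligned columns "
          f"({100 * same / total:.1f}% identity)")
```

Under this convention the 77-residue sso10340 sequence aligns without gaps in Lrp, so the denominator equals the length of sso10340.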


Figure S4

1 Sso 0438 peak area (380600-380750) nr 220

AAATAGTTACACTTATCTCCTAAAGGCATTGGTTTCTGCTCATCAGTAAAGGAGGGCTGTCTTTCAAAGCCTCT

ATTTCC

GTTCCTCAAACTACTTATACCCCCAAACTTTTAGACAATCATATTAAACTTACATTTTGACCCATTTAAACATT

ATGTATCTTATACCTTCACTTTTACGATCCACTAGAGCCAATACTAATCTCTTTTTAGTGCTATGTGAGACTCT

ACCGAAACTCAATAGTTCATGAG

Primers (product 168 bp):
EMSASso0438 S: 5' CAAAGCCTCTATTTCCGTTC 3', Tm 53.8 °C
EMSAsso0438 A: 5' CGGTAGAGTCTCACATAGCA 3', Tm 55.7 °C


2 Sso 0570 peak area (502350-502600) nr192 (with rps26E together transcription)

GTTGTGATCAATGTGGTGCTAGAGTACCAGAGGATAAGGCAGTATGTGTAACAAAAATGTATAGCCCCGTGGAT

GCTTCT

CTAGCATCTGAATTAGAAAAGAAGGGTGCAATAATTGCTAGATATCCTGTAACTAAGTGTTACTGTGTGAATTG

TGCGGT

ATTTTTGGGTATTATTAAGATAAGAGCAGAAAATGAGAGAAAGCAAAAAGCTCGTTTAAGATAGGCTTTTAAAC

CTTTAG

TCAGAATATGTGATGAAATGAGACTTTATGAATTATCTTTTGCACAAATTGAAGATTTTTTCTATAAACTAGCA

GAAGTTAAAGATATTATAAAAGATCATGGTCTATTAG

Primers (product 158 bp):
EMSAsso0570 S: 5' GTTGTGATCAATGTGGTGCTAG 3', Tm 57.4 °C
EMSAsso0570 A: 5' CGCACAATTCACACAGTAACA 3', Tm 57.5 °C


3 Rp144e peak area (907700-908000) nr 195

TATATAATGATCACAGGGTTAATGAGGGAGATGTTTTGGTTTTACCTATGAGAGAAGCCTTGCCATTAATAATA

GCAAGT

TATTTAACTCCCTATAAGATAGATATTGAAGAACAATTATGAAAGTCCCTAAGGTCATCAGCACATATTGTCCA

AAGTGTAAGACTCATACAGATCACTCTGTATCACTATACAAGAGCGGTAAGAGAAGAAATCTCGCTGAAGGACA

GAGAAGATATGAGAGAAAGAATATTGGATATGGAAGTAAAAGAAAACCAGAACAGAAGAGATTTGCAAAAGTT

Primers (product 168 bp):
EMSArp144e S: 5' GCAAGTTATTTAACTCCCTATAAG 3', Tm 53.2 °C
EMSArp144e A: 5' CTCTCATATCTTCTCTGTCCTTC 3', Tm 55.2 °C


4 Sso 1758 peak area (1592900-1593150) nr 213

AGACGACCTCAAATGACCTGCCTTTCTCTATGATACCAACAGTATCAACGATCGTACCGTGTTCAACCCTCTTA

GGATCTACACCATGAAGTTCTAAAGCTTTCCTATAAAGTTTTAACATTGTCCTTTCAAACCCTTTACCGGCCCT

AGAAGTAAAACT

ACCTAACTCAACTGAAAACTTCTTTTGACTCCTAACTAGTCTAAGAATGATCCTGGAATGTCTTTCAATTCCTT

TTTGGGAGAGAGACTATTGCTTCTCCTTGCTTTTTGACTTCTTGCTGTAATGACGTAATAGCCTCACTGTGCCT

CTTAACCTCTTCTTGCAAGGATCTTATTGTCTCTTGTAAAAGCTGAATGCTTTTAGTATTTTCTTCCAATCTCT

TTATCACTATTTCATCC

Primers (product 171 bp):
EMSAsso1758 S: 5' GAATGATCCTGGAATGTCTTTC 3', Tm 54.5 °C
EMSAsso1758 A: 5' GGAAGAAAATACTAAAAGCATTCAG 3', Tm 55.2 °C


5 Sso 1879 peak area (1692550-1692750)

GGTTATCCACCTTATGGATACCTTGTAACGATAATATTAAATATTGAGGAGCTGTCAGACGCATTGCGAAGTTT

AGCTGAGTACTTACGCTCCATAATGAATTAACAGAATAAGGTTATAGAAGCGGAATGATAAATGGCTAACTTTA

TCACCTCAATAC

ATTAAATCTATATTGTGACATTTGACGACTTAAATAAGTTAATCAGAGAGAAACTTAGCGTAGAAACGTACCCT

TATCAAAAGTACATCAG

Primers (product 165 bp):
EMSAsso1879 S: 5' CAGACGCATTGCGAAGTTTA 3', Tm 56.5 °C
EMSAsso1879 A: 5' CGCTAAGTTTCTCTCTGATTAAC 3', Tm 54.9 °C


6 Rpom-2 peak area (1731850-1732040) nr 105.8 (with sso1913 together transcription)

TGAGCTACATTCTCCTTTAATATTACTTTCTCTACATCATGATCAGAATACCCACACTTACTACAAACCATTTT

GTTACCTTTTACCTTCAAGAAAGAACCACATTTAGGGCAAAAACGCATATCTAGAAATGAGTCTTTTAATCATA

AAAGCTTGCGGC

TAACATACTAACTCCGGGCCAAGGTGTGTCA

Primers (product 134 bp):
EMSArpom-2 S: 5' GATCAGAATACCCACACTTAC 3', Tm 53.3 °C
EMSArpom-2 A: 5' GAGTTAGTATGTTAGCCGCA 3', Tm 54.3 °C
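The primer pairs in this appendix can be checked against their peak sequences with a simple in-silico PCR. The sketch below is illustrative only and is not taken from the thesis: it uses the Rpom-2 peak sequence and primers listed above, locates the sense primer on the listed strand and the antisense primer via its reverse complement, and reports the length of the spanned product. The helper names `reverse_complement` and `amplicon_length` are hypothetical, and exact string matching (no mismatches) is assumed.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, sense: str, antisense: str) -> int:
    """Length of the product spanned by the primers, or -1 if a primer is not found."""
    start = template.find(sense)
    # The antisense primer appears on the listed strand as its reverse complement.
    rc_start = template.find(reverse_complement(antisense))
    if start < 0 or rc_start < 0 or rc_start < start:
        return -1
    return rc_start + len(antisense) - start


# Rpom-2 peak region, joined from the wrapped lines of the listing above.
TEMPLATE = (
    "TGAGCTACATTCTCCTTTAATATTACTTTCTCTACATCATGATCAGAATACCCACACTTACTACAAACCATTTT"
    "GTTACCTTTTACCTTCAAGAAAGAACCACATTTAGGGCAAAAACGCATATCTAGAAATGAGTCTTTTAATCATA"
    "AAAGCTTGCGGC"
    "TAACATACTAACTCCGGGCCAAGGTGTGTCA"
)
SENSE = "GATCAGAATACCCACACTTAC"      # EMSArpom-2 S
ANTISENSE = "GAGTTAGTATGTTAGCCGCA"   # EMSArpom-2 A

if __name__ == "__main__":
    print(amplicon_length(TEMPLATE, SENSE, ANTISENSE))  # 134, matching the listed product size
```

The same two functions can be applied to any other target in the list by substituting its joined sequence and primer pair.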


7 Sso1984 peak area (1796450-1796650) nr 363

TGTACCAAATGGTCAAGCAATAAATTATAATGGACATACAGATCCTGTGGTGATTTAATACTAAGTAATGGAAC

TATGATACAAAATGTGGTATGGGATGGACAATATGCAGGTACAATAATTCAAAATCATTACCAAATAGTTCAAT

TGAATGATGAATGGGTAGGAAGAACCGACCCAGTGAATAATCAACAATATGTA

Primers (product 157 bp):
EMSAsso1984 S: 5' GGTCAAGCAATAAATTATAATGGAC 3', Tm 55.0 °C
EMSAsso1984 A: 5' CCTACCCATTCATCATTCAATTG 3', Tm 55.9 °C

GCTTAAAGGGTTTGTTTGTTTTGGTGGTGTTATGGGTGCGTTGAAGTACGTGGCCATTGGCGTTGTGGTTTTCG

CCACCACTGTGTTTTACTACTACCGACACCGGGTGCCTGTGCAGTACGTGGGTTCACCCAGCGGTTATGAGGCA

TTTGTACCAAATGGTCAAGCAATAAATTATAATGGACATACAGATCCTGTGGTGATTTAATACTAAGTAATGGA

ACTATGATACAAAATGTGGTATGGGATGGACAATATGCAGGTACAATAATTCAAAATCATTACCAAATAGTTCA

ATTGAATGATGAATGGGTAGGAAGAACCGACCCAGTGAATAATCAACAATATGTAACTTTTCAAGATTTCTACG

TGATCAAGGGTCAAGTACCAATTGAGAATGTAACAATAAATGGACAGACGTATTACGTGATAGATGCGGATAAA

ATAAACCCAGCGGACATCGCGGGATTTTTCACATACTGGAGATGGGTAAACAACTTC

Primers (product 501 bp):
FPsso1984 S: 5' GCTTAAAGGGTTTGTTTGTTTTG 3', Tm 55.4 °C
FPsso1984 A: 5' GAAGTTGTTTACCCATCTCCAG 3', Tm 56.5 °C


8 Sso 2102 peak area (1923900-1924100) nr 143

GAATGCGAAATATGTAAAGCAATAGTTTCAGTATTATGTGGACTATTAGCTGAAGGAGTAGCTAAAAGTGTGGC

ATGTGACGAAGCTTGTGGAACAGTTTGCTTAATATTTGTTGAGGATCCCATTATTTATGATATTTGTGTGGTAA

TATGTATACCTTCTTGTGATGAACTACTTCAACTAATTATCTCAATAGGAGTAGCGACTGCATGTGGACTAGGT

GGTGAGTATCTATGTCAA

AAGGCTGGTCTGTGTTGCTAATAGATTTTTTTTAGATATGTGGTTATAAATAGCTAAGGAGGGAGGGATAGAA

ATGGAAAGAAAGGGGACAGAAATAGAAAGAAATAAGATATCTTTTTTAAAGTGGCTTGAGCTAACATTATTATT

TGTGATTTTGCCTT

Primers (product 171 bp):
EMSAsso2102 S: 5' GTGGACTATTAGCTGAAGGAG 3', Tm 54.6 °C
EMSAsso2102 A: 5' CAGTCGCTACTCCTATTGAGATA 3', Tm 56.5 °C


9 Sso 2626 peak area (2391350-2391550) nr 167.4

ATGAGATAACTAAACAATTGAGAGATGAAGCTGACAAATTACAACAACCCCTTAAGAAGTATATTGGGCTTGTT

CACAAT

GTAGGTGGCACAGGTCACTTTGCATATGTTATGATTCTAAGAAGGTGACCTTAGATGCAAATAGATGCAATACC

GTTATCAATAAAATATAAGATCAAATACCCGGACGAATTTATTGAAGCAGTTAAGAGGGGAGAAATTGTAGCTA

CCAAATGTAAAAATTGTGGTTCC

Primers (product 175 bp):
EMSAsso2626 S: 5' CAACAACCCCTTAAGAAGTATATT 3', Tm 54.5 °C
EMSAsso2626 A: 5' CTCCCCTCTTAACTGCTTCAAT 3', Tm 57.5 °C

10 Pacs peak area (2411750-2411950) nr 1080

CTTCTCAGTGGCCATGTGGCATTCCCTTAGGTCCATTCCTTAAATACTCTTCCGGATTTCTCTGAAACTCCCTT

AGACAA

TGAGATGAGCAGAAATAGTAGATTTTTCCCTTATACATTGTCTTATATTGACTTTTCTCATCTACTTCCATTCC

ACAAACCGGATCGATTATCATAATACTAATTATTGCTTTTATTATAAAGAACCTTTTCTTCTGTAAATTTGTAT

CTATATATTGTTGATTACAGTCAAAATGACTCTAAAACTAATGTAAGTGCAAGCCATTGTTGCGTGCAAATTTT

TCCTTTAATCCAGACAAGCAAGAATTGCAACAAGCATAATACGTTTTCC

Primers (product 201 bp):
EMSApacs S: 5' GTCTTATATTGACTTTTCTCATCTAC 3', Tm 54.0 °C
EMSApacs A: 5' GCTTGTCTGGATTAAAGGAAAA 3', Tm 54.4 °C

CCGGATTTCTCTGAAACTCCCTTAGACAATGAGATGAGCAGAAATAGTAGATTTTTCCCTTATACATTGTCTTATATTGACTTTTCTCATCTACTTCCATTCCACAAACCGGATCGATTATCATAATACTAATTATTGCTTTTATTATAAAGAACCTTTTCTTCTGTAAATTTGTATCTATATATTGTTGATTACAGTCAAAATGACTCTAAAACTAATGTAAGTGCAAGCCATTGTTGCGTGCAAATTTTTCCTTTAATCCAGACAAGCAAGAATTGCAACAAGCATAATACGTTTTCCTTCCTACCTTATAGGTCAAGGGATTATCATATATTTCCTTACCGCAGTAATCACATTTAACTACAAAACTGCTCTTGCCTATGAAAACTAAATCCTTTTCTTTTACATTTTTTAACTTATTTTCTAAATCTTCTAAAGTGTTAGCTCTTATCATGTTAATAAATCTACCATCCAACAGTTTGTAACACTCGTCACTTTGA

Primers (product 474 bp):
FPpacs S: 5' CTCTGAAACTCCCTTAGACAATG 3', Tm 56.4 °C
FPpacs A: 5' CAAACTGTTGGATGGTAGATTTA 3', Tm 54.3 °C


Adh-11

11 Adh-11 peak area (2470250-2470400) nr 120

ATAAAAAATACCTTCCACAATCTTAGCTCACCTTTATTTGCCTATAGTTAGTTATTATTACTGTTCCGATGAAG

TATATA

AGACCTATCCCAATTAAGATTTCCATTCCTAAGAGATATGCTTTGGTGACTTCCACAGTAGCACCAAATACTAT

TGGTAC

TATTATACCCCATAGAGTTTCCCAAAAGCCAATATGACCACCAAATTGACCAGCTAATTCAGTAGAAACTAGAA

AGGATGGTGCTGCCCATTGTATCCCCGACCATCTTAGGAAGAAGAAAGTTACCATTAGTATTTCAAC

Primers (product 160 bp):
EMSAadh-11 S: 5' GAGATATGCTTTGGTGACTTCC 3', Tm 56.0 °C
EMSAadh-11 A: 5' CCTAAGATGGTCGGGGATAC 3', Tm 55.8 °C


12 Sso 3178 peak area (2923900-2924050) nr 206

TTTCTTTAAAGCTCTATCACATGTAGTCCAGTTCTCATATAGTTCCTTATCCTCTTTTCTTAGCAGTCATATTG

ACTTATGAATTATAATCAACGACGTTTTTGGGAGTAAAGGTTAAGTTTGCTCACATTTCTACTTTAAGAGAGAG

TAAATTAAATTAACTTTCCTTTAAATTTTCTAATGCTTTGAGGTTCTATGCAAGAGCTGGTTTACAATGGTATA

TAGTTAAAAGAGAGGGCTGAAAATGAAACTTGGAGTGGAAAAATTGTTGCCAGCTTATTAAAAACCAAGTGGAG

CATTATCGTAGCTGAGAAAACATAGTCCAGTCCTTAATTCTAGCTTTAGTGCGACGTATTTTGTGAAAAAATCG

CTTGATTTTTGCATGTTTCAAAACATTTTTA

Primers (product 188 bp):
EMSAsso3178 S: 5' GCTCTATCACATGTAGTCCAG 3', Tm 55.1 °C
EMSAsso3178 A: 5' GCATAGAACCTCAAAGCATTAG 3', Tm 55.2 °C


13 Sso3189 peak area (2935250-2935550)

AAGTAAATTATGAACGATTATCTAATCATACTTTTGCAATGGATAAGTAAATTTAAATAGGGGAATTTAGAAGA

TAGCGT

GTGAGTAACAAAAATAGAATATTCGTAAGGGAGACTTCTGGTTTAATAAAGAACGTATCATTATGGGATGCAGT

TGCACTCAATATAGGCAATATGTCAGCTGGAGTAGCATTATTTGAATCAATATCACCATATGTACAACAAGGAG

GAGTATTGTGGCTGGCTTCATTAATAGGCTTCATCTTCGCTATACCACAACTGTTAATTTATGTATTTTTAAC

Primers (product 183 bp):
EMSAsso3189 S: 5' GGGGAATTTAGAAGATAGCGT 3', Tm 54.5 °C
EMSAsso3189 A: 5' CAGCCACAATACTCCTCCTT 3', Tm 56.2 °C


14 Sso 12210 peak area (2966200-2966400)

TTTGCACGCTAACTTCAACTGGGCTTTTAACGCCCTAAGGGTTTGTTCGTCAGTGTATGCACGGAAGCGAAACC

CTAAGGTGGGTATTGATGTGTATTTTGTTATTTTCCTATTTTTAACTTTTCTACAAAGGGATTCATCTCAAAGA

GGCGAAGTTTTCCGCCCCCTTTGAACCCCCGTCTGTTATAAACATAATACGCAATCATAGGTCAGATTGACTAC

AGATGATAGCTTATATGGCTGAAATGTAGTATAAAAATACCTAAGACGTACTGGTGTTTACCCGTGGCGTAACT

TCTGC

Primers (product 204 bp):
EMSAsso12210 S: 5' CTTTTCTACAAAGGGATTCATCTC 3', Tm 55.6 °C
EMSAsso12210 A: 5' GGTAAACACCAGTACGTCTT 3', Tm 54.5 °C


Area 505950-506100 151bp

CTCCTAAAAATAGCGAACCGCCACTCTTTCTTGCAAATTCTATAATACCTTCTGCTTGTTCCCTTCTAACATAA

ATTTTTCCCTCTATATCCTTTCTACTCTTCATCTCAATTAAAATAATAACGCCATTCTTTAAAGCGATAATATC

CGGTATAGGGTC

Primers (product 160 bp):
EMSA no binding S: 5' CTCCTAAAAATAGCGAACCG 3', Tm 53.5 °C
EMSA no binding A: 5' GACCCTATACCGGATATTATCG 3', Tm 54.5 °C


Ling Deng, Fei He, Yuvaraj Bhoobalan-Chitty, Laura Martinez-Alvarez, Yang Guo and Xu Peng. Unveiling Cell Surface and Type IV Secretion Proteins Responsible for Archaeal Rudivirus Entry. J. Virol. 2014, 88(17):10264. DOI: 10.1128/JVI.01495-14. Published Ahead of Print 25 June 2014.



Unveiling Cell Surface and Type IV Secretion Proteins Responsible for Archaeal Rudivirus Entry

Ling Deng, Fei He, Yuvaraj Bhoobalan-Chitty, Laura Martinez-Alvarez, Yang Guo, Xu Peng

Archaea Centre, Department of Biology, University of Copenhagen, Copenhagen, Denmark

Sulfolobus mutants resistant to archaeal lytic virus Sulfolobus islandicus rod-shaped virus 2 (SIRV2) were isolated, and mutations were identified in two gene clusters, cluster sso3138 to sso3141 and cluster sso2386 and sso2387, encoding cell surface and type IV secretion proteins, respectively. The involvement of the mutations in the resistance was confirmed by genetic complementation. Blocking of virus entry into the mutants was demonstrated by the lack of early gene transcription, strongly supporting the idea of a role of the proteins in SIRV2 entry.

To date, relatively few archaeal viruses have been characterized, and most of those that have been characterized infect acidothermophilic members of the order Sulfolobales. Despite their limited number of around 50 species, they exhibit considerably greater morphological diversity than the more extensively characterized bacteriophages, about 95% of which show head-tail morphologies. Archaeal viruses, in contrast, exhibit fusiform shapes, often with one or two tails, bottle shapes, bearded-globular forms, and a wide variety of rod-like and filamentous morphotypes which often carry small terminal appendages (1–3). This morphological diversity suggests that the archaeal viruses may employ a variety of mechanisms to enter their hosts, but current insights into entry mechanisms are limited to an OppA transporter protein, Sso1273, possibly providing a receptor site for the Acidianus two-tailed virus (ATV) in Sulfolobus solfataricus P2 (4). And very recently, microscopic studies suggested that Sulfolobus islandicus rod-shaped virus 2 (SIRV2) enters the host cell by attaching and moving through a pilus-like filament; however, the nature of the structure and the identity of the involved proteins remain elusive (5).

Sulfolobus solfataricus P2 is an acidothermophilic crenarchaeon that can host a wide range of archaeal viruses, many of which are propagated stably (1, 3, 6). Moreover, few of the viruses appear to induce cell lysis, possibly reflecting a need to minimize contact with the harsh hot acidic environment. However, recent studies have identified a few viruses that can enter a lytic phase, including the Sulfolobus turreted icosahedral virus (STIV), the two-tailed fusiform (ATV), and, more recently, the rudivirus SIRV2 (7–9).

SIRV2 is classified in the family Rudiviridae together with other well-characterized viruses, including SIRV1 (10, 11), ARV1 (12), and SRV1 (13), all of which are rod shaped and lack an envelope, and their genomes consist of linear double-stranded DNA with covalently closed ends (10, 14, 15). In a recent microarray analysis of S. solfataricus infected with SIRV2, we demonstrated that the viral genes were activated at different times and that mainly stress-response host genes and those implicated in vesicle formation were downregulated (16). The results also illustrated that SIRV2 infection at a multiplicity of infection (MOI) of 30 resulted in growth inhibition of S. solfataricus 5E6 (16). In the present experiment, the culture was infected at a lower MOI (~1) which also led to a growth retardation, but the infected culture could enter the exponential-growth phase at 80 h postinfection (p.i.) (Fig. 1A).

The surviving cells (named 5E6R) appeared to be resistant to SIRV2 because, in contrast to the sensitive 5E6 strain, no growth retardation was observed when 5E6R was diluted and infected with SIRV2 at the same MOI (Fig. 1).

FIG 1 (A) Growth retardation of S. solfataricus 5E6 upon SIRV2 infection. (B) Resistance of S. solfataricus 5E6R to SIRV2. OD600, optical density at 600 nm.

Received 24 May 2014 Accepted 16 June 2014

Published ahead of print 25 June 2014

Editor: A. Simon

Address correspondence to Xu Peng, [email protected].

Supplemental material for this article may be found at http://dx.doi.org/10.1128/JVI.01495-14.

Copyright © 2014, American Society for Microbiology. All Rights Reserved.

doi:10.1128/JVI.01495-14

Journal of Virology, September 2014, Volume 88, Number 17, p. 10264–10268

In order to manipulate the SIRV2-sensitive S. solfataricus 5E6 strain genetically (17), 10 pyrEF mutants, labeled Sens1 to Sens10, were isolated from Gelrite plates containing 5-fluoroorotic acid (5'-FOA). Their mutation sites in the pyrEF gene region were identified by a combination of PCR amplification, restriction digest analysis, and sequencing (17). All the mutations were shown to result from transposon insertions, either IS elements or miniature inverted terminal repeat elements (MITEs), and the insertions occurred in the coding sequences or within the single promoter (Fig. 2A). These results are consistent with the previous reports demonstrating high transposition activity in S. solfataricus and its contribution to chromosomal plasticity (18–20). Following the procedure described above, SIRV2-resistant cultures were generated for each of the pyrEF mutants. Single colonies were then produced from the cultures by streaking onto Gelrite plates to yield the purified resistant strains Res1 to Res10. The stability of the transposon insertions in the pyrEF genes was tested for each of the 10 pyrEF mutants (Sens1 to Sens10) and their corresponding SIRV2-resistant colonies (Res1 to Res10) by growing them in rich media containing uracil (17) for 3 days without transfer, prior to total DNA extraction and PCR amplification of the pyrEF regions. Each transposon insertion appeared to be stable, because no wild-type PCR bands were observed, except a weak wild-type band produced in Sens2, consistent with the undetectable reversion rates for Sulfolobus transposons recorded earlier (19, 20). Since Res2 did not generate the wild-type band, the extra PCR product in Sens2 was probably due to a minor contamination of the colony by wild-type cells (Fig. 2B).

Sens1, Sens3, Sens7, and Sens8 were selected for a transformation test because they carried different transposons located at different insertion sites (Fig. 2A). Shuttle vector pEXA was used for transformation (21), and water was used in the negative control. While Sens7 and Sens8 appeared unstable after electroporation, Sens1 and Sens3 yielded transformants without colony formation in the negative control. Thus, we focused on Sens1 and its resistant mutants for further studies of SIRV2 susceptibility.

The SIRV2-resistant cells were enriched directly from the SIRV2-sensitive culture; therefore, the only selective pressure appeared to occur either upon SIRV2 infection or during virus-induced cell lysis. Moreover, since the active clustered regularly interspaced short palindromic repeat (CRISPR) loci A, B, C, and D were all lost from the 5E6 host strain (16), the residual CRISPR loci E and F, which lack the spacer acquisition cas genes, were unlikely to be responsible for the resistance (21, 22). Therefore, we inferred that resistance arose as a result of mutated host genes that are important for the SIRV2 life cycle. To identify such mutations, the genomes of strains Sens1 and Res1 were resequenced by the use of a HiSeq 2000 sequencer, yielding about 200-fold coverage. The sequencing reads of both strains were aligned with the genome sequence of S. solfataricus P2 (23) using the R2R program (24) to identify mismatches as well as insertions and deletions. Mutations to the P2 genome in the resequenced genomes of Sens1 and Res1 were then compared manually. Only one mutation was detected and constituted a single insertion of ISC1078 into Res1 but not Sens1. The insertion was localized in sso3139, a gene encoding a conserved hypothetical protein lying within an operon (Fig. 3A).

Next we tested whether other resistant strains also carried mutations in sso3139 or in other genes of the same operon by employing a primer pair (5'-GCTACGCTTCTAACAAACCTAATCTG and 5'-CGAAACTTGCGAAACAACTACCT) designed to amplify the whole operon region. After PCR amplification, restriction digestion, and sequencing, another 5 strains were shown to contain mutations at different locations within sso3139 or the adjacent sso3140 (Fig. 3A). Interestingly, all the 6 mutations were produced by ISC1078 insertion (Fig. 3A) and appeared to be stably maintained (Fig. 3C).

To identify possible mutations in the other 4 resistant strains, genome resequencing followed by PCR analyses of relevant genes was performed (using primers 5'-GAGTCTGGGGAAAATCGGTAAAGTT and 5'-TGGCATTGTAACCCTAATTGCTTCT). These revealed IS element insertions in sso2387 of Res2 and Res10 (Fig. 3B). Sso2387 in Sens2 and Sens10 contains 577 amino acids (aa) but only 283 aa in the sequenced S. solfataricus P2 genome (23). An analysis of the sequences around sso2387 in S. solfataricus P2 revealed that it is a partial gene that resulted from an ISC1225 insertion (Fig. 3B), which could explain the resistance of the wild-type P2 strain to SIRV2 (13 and this work). Interestingly, an inversion in sso2386 was detected in Res7 whereas no mutations were identified in Res9 that could be linked to SIRV2 resistance (Fig. 3B).

The frequently observed mutations in cluster sso3139 and sso3140 and cluster sso2386 and sso2387 in the resistant strains strongly suggest that the two gene clusters are important for the SIRV2 life cycle. To confirm the implication of the mutations in the gained resistance, genetic complementation was performed for the mutated genes. As described above, Sens1 appeared stable during genetic manipulation, and we thus selected Res1 for complementation of the sso3139 mutation. For complementation of mutations in the other gene cluster, Res1B, carrying an ISC1234 insertion in sso2387 (Fig. 3B), was isolated from SIRV2-infected Sens1. Res1 cells were transformed with vector pEXA2 containing sso3139, and Res1B cells were transformed with vector pEXA2 containing sso2386 and sso2387. After SIRV2 was added into the cultures, growth retardation occurred in the complemented cells, while the noncomplemented culture, transformed with the empty vector, showed a growth rate similar to that of the uninfected culture (Fig. 4A and B). Further, Southern hybridization (17) using a probe derived from the SIRV2 inverted terminal repeats (ITR) detected signals only from the complemented cells (Fig. 4C and D) and the multiple hybridized bands were consistent with ongoing replication (Fig. 4E) (10, 25). The absence of SIRV2 signal in the resistant strains indicates a defect in the virus life cycle.

FIG 2 Analysis of the pyrEF mutants derived from S. solfataricus 5E6. (A) Types and insertion sites of transposons inserted in the pyrEF gene region of different pyrEF mutants. (B) PCR amplification of the pyrEF region from Sens1 to Sens10 and from Res1 to Res10. wt, wild type.

To gain insights into the functions of the two gene clusters, the protein sequences of the genes were first analyzed using the program TMHMM (http://www.cbs.dtu.dk/services/TMHMM-2.0/) for the possible presence of transmembrane helices. Sso3138, Sso3139, and Sso3140 were predicted to be primarily located extracellularly (see Fig. S1 in the supplemental material), correlating with a previous prediction of the presence of class III signal peptides at their N termini (26). Among these, Sso3140 was confirmed to be a membrane-associated protein in a proteomics study (27). In contrast to the other 3 proteins, Sso3141 was predicted to contain two transmembrane helices, one at the N terminus and the other at the C terminus, while the sequence between them was presumed to be located intracellularly. Therefore, it appears that the proteins encoded in the operon form a membrane-associated cell surface structure and may function as a receptor for SIRV2. Moreover, it was demonstrated recently that Sso2386 carries multiple transmembrane helices and that Sso2387 constitutes an ATPase associated with a type IV secretion system, and they were designated AapF and AapE, respectively (28). Further, homologs of both are essential for the formation of the adhesive type IV pilus of S. acidocaldarius (28). The association with the cell membrane of proteins encoded by both gene clusters strongly indicates their involvement in the entry process of SIRV2.

The failure of viral entry into Res1 and Res1B cells was further confirmed by reverse transcription-PCR analysis of one of the early genes, ORF131a (17). RNA extracted from cells taken at 15 min p.i. was DNase I treated and reverse transcribed (SuperScript II reverse transcriptase; Invitrogen). PCR performed on the cDNAs detected ORF131a only from Sens1 cells, while the positive-control sso0446 (tfb-1) gene was detected in all the 3 strains (Fig. 4F). This strongly supports the conclusion that the proteins encoded by the two gene clusters are involved in SIRV2 entry. A likely scenario is that gene cluster sso3138 to sso3141 encodes a surface receptor for SIRV2 and that gene cluster sso2386 and sso2387 is involved in the secretion of the receptor components.

Except in Escherichia coli, very few virus receptors are known in the domains of Bacteria and Archaea (29). The primary receptors for E. coli filamentous phages are pili which retract toward the cell surface, bringing the phages to the secondary receptor located in the periplasm (30). Linear archaeal viruses, including rudiviruses, have been observed to attach to pili (5, 31, 32). Future work is needed to determine the association of the two identified gene clusters with the structure of pili. To our knowledge, this is the first work providing genetic and biochemical evidence for a possible receptor system in archaeal virus entry.

FIG 3 Different mutations in the SIRV2-resistant strains and their stability. (A) Transposon insertions in sso3139 and sso3140. (B) Mutations in sso2386 and sso2387. (C) PCR amplification of the mutation region from different resistant strains.

Deng et al.

10266 jvi.asm.org Journal of Virology

ACKNOWLEDGMENTS

We thank Roger A. Garrett for critically reading the manuscript. This work is supported by a European Union FP7 grant (265933).

REFERENCES

1. Prangishvili D, Forterre P, Garrett RA. 2006. Viruses of the Archaea: a unifying view. Nat. Rev. Microbiol. 4:837–848. http://dx.doi.org/10.1038/nrmicro1527.

2. Pina M, Bize A, Forterre P, Prangishvili D. 2011. The archeoviruses. FEMS Microbiol. Rev. 35:1035–1054. http://dx.doi.org/10.1111/j.1574-6976.2011.00280.x.

3. Peng X, Garrett RA, She Q. 2012. Archaeal viruses–novel, diverse and enigmatic. Sci. China Life Sci. 55:422–433. http://dx.doi.org/10.1007/s11427-012-4325-8.

4. Erdmann S, Scheele U, Garrett RA. 2011. AAA ATPase p529 of Acidianus two-tailed virus ATV and host receptor recognition. Virology 421:61–66. http://dx.doi.org/10.1016/j.virol.2011.08.029.

5. Quemin ER, Lucas S, Daum B, Quax TE, Kuhlbrandt W, Forterre P, Albers SV, Prangishvili D, Krupovic M. 2013. First insights into the entry process of hyperthermophilic archaeal viruses. J. Virol. 87:13379–13385. http://dx.doi.org/10.1128/JVI.02742-13.

6. Zillig W, Arnold HP, Holz I, Prangishvili D, Schweier A, Stedman K, She Q, Phan H, Garrett R, Kristjansson JK. 1998. Genetic elements in the extremely thermophilic archaeon Sulfolobus. Extremophiles 2:131–140. http://dx.doi.org/10.1007/s007920050052.

7. Fu CY, Johnson JE. 2012. Structure and cell biology of archaeal virus STIV. Curr. Opin. Virol. 2:122–127. http://dx.doi.org/10.1016/j.coviro.2012.01.007.

8. Häring M, Vestergaard G, Rachel R, Chen L, Garrett RA, Prangishvili D. 2005. Virology: independent virus development outside a host. Nature 436:1101–1102. http://dx.doi.org/10.1038/4361101a.

9. Bize A, Karlsson EA, Ekefjard K, Quax TE, Pina M, Prevost MC, Forterre P, Tenaillon O, Bernander R, Prangishvili D. 2009. A unique virus release mechanism in the Archaea. Proc. Natl. Acad. Sci. U. S. A. 106:11306–11311. http://dx.doi.org/10.1073/pnas.0901238106.

FIG 4 Cluster sso3138 to sso3141 and cluster sso2386 and sso2387 are involved in SIRV2 entry. (A) Growth retardation of sso3139-complemented Res1 upon SIRV2 infection. Res1 (pEXA), Res1 transformed with expression vector pEXA; Res1 (pEXA3139), Res1 transformed with expression vector pEXA containing sso3139. (B) Growth retardation of sso2386-and-sso2387-complemented Res1B upon SIRV2 infection. Res1B (pEXA), Res1B transformed with expression vector pEXA; Res1B (pEXA2386–2387), Res1B transformed with expression vector pEXA containing sso2386 and sso2387. (C and D) Visualization of SIRV2 DNA replication in Res1 (C) and Res1B (D) transformants infected with SIRV2. Plasmid constructs contained in the transformants are labeled on top of each lane, and the sampling time p.i. is indicated as hours. L and R designate the left and right terminal fragments, respectively, after a double digestion with BamHI and HindIII (see panel E). (E) Schematic presentation of SIRV2 genomic map and the formation of terminal duplex replicative intermediates (2L and 2R), as described previously (12). The locations of the probe (filled rectangle) in the termini are indicated. ITR, inverted terminal repeat. (F) RT-PCR amplification of sso0446 (tfb-1) (left panel) and SIRV2 ORF131a transcript fragments (right panel). “+” and “−” indicate the presence and absence of reverse transcriptase (RT), respectively.

SIRV2 Entry in Sulfolobus

September 2014 Volume 88 Number 17 jvi.asm.org 10267

10. Peng X, Blum H, She Q, Mallok S, Brugger K, Garrett RA, Zillig W, Prangishvili D. 2001. Sequences and replication of genomes of the archaeal rudiviruses SIRV1 and SIRV2: relationships to the archaeal lipothrixvirus SIFV and some eukaryal viruses. Virology 291:226–234. http://dx.doi.org/10.1006/viro.2001.1190.

11. Prangishvili D, Arnold HP, Gotz D, Ziese U, Holz I, Kristjansson JK, Zillig W. 1999. A novel virus family, the Rudiviridae: structure, virus-host interactions and genome variability of the sulfolobus viruses SIRV1 and SIRV2. Genetics 152:1387–1396.

12. Vestergaard G, Haring M, Peng X, Rachel R, Garrett RA, Prangishvili D. 2005. A novel rudivirus, ARV1, of the hyperthermophilic archaeal genus Acidianus. Virology 336:83–92. http://dx.doi.org/10.1016/j.virol.2005.02.025.

13. Vestergaard G, Shah SA, Bize A, Reitberger W, Reuter M, Phan H, Briegel A, Rachel R, Garrett RA, Prangishvili D. 2008. Stygiolobus rod-shaped virus and the interplay of crenarchaeal rudiviruses with the CRISPR antiviral system. J. Bacteriol. 190:6837–6845. http://dx.doi.org/10.1128/JB.00795-08.

14. Blum H, Zillig W, Mallok S, Domdey H, Prangishvili D. 2001. The genome of the archaeal virus SIRV1 has features in common with genomes of eukaryal viruses. Virology 281:6–9. http://dx.doi.org/10.1006/viro.2000.0776.

15. Prangishvili D, Koonin EV, Krupovic M. 2013. Genomics and biology of Rudiviruses, a model for the study of virus-host interactions in Archaea. Biochem. Soc. Trans. 41:443–450. http://dx.doi.org/10.1042/BST20120313.

16. Okutan E, Deng L, Mirlashari S, Uldahl K, Halim M, Liu C, Garrett RA, She Q, Peng X. 2013. Novel insights into gene regulation of the rudivirus SIRV2 infecting Sulfolobus cells. RNA Biol. 10:875–885. http://dx.doi.org/10.4161/rna.24537.

17. Deng L, Zhu H, Chen Z, Liang YX, She Q. 2009. Unmarked gene deletion and host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus. Extremophiles 13:735–746. http://dx.doi.org/10.1007/s00792-009-0254-2.

18. Martusewitsch E, Sensen CW, Schleper C. 2000. High spontaneous mutation rate in the hyperthermophilic archaeon Sulfolobus solfataricus is mediated by transposable elements. J. Bacteriol. 182:2574–2581. http://dx.doi.org/10.1128/JB.182.9.2574-2581.2000.

19. Redder P, Garrett RA. 2006. Mutations and rearrangements in the genome of Sulfolobus solfataricus P2. J. Bacteriol. 188:4198–4206. http://dx.doi.org/10.1128/JB.00061-06.

20. Blount ZD, Grogan DW. 2005. New insertion sequences of Sulfolobus: functional properties and implications for genome evolution in hyperthermophilic archaea. Mol. Microbiol. 55:312–325. http://dx.doi.org/10.1111/j.1365-2958.2004.04391.x.

21. Gudbergsdottir S, Deng L, Chen Z, Jensen JV, Jensen LR, She Q, Garrett RA. 2011. Dynamic properties of the Sulfolobus CRISPR/Cas and CRISPR/Cmr systems when challenged with vector-borne viral and plasmid genes and protospacers. Mol. Microbiol. 79:35–49. http://dx.doi.org/10.1111/j.1365-2958.2010.07452.x.

22. Erdmann S, Garrett RA. 2012. Selective and hyperactive uptake of foreign DNA by adaptive immune systems of an archaeon via two distinct mechanisms. Mol. Microbiol. 85:1044–1056. http://dx.doi.org/10.1111/j.1365-2958.2012.08171.x.

23. She Q, Singh RK, Confalonieri F, Zivanovic Y, Allard G, Awayez MJ, Chan-Weiher CC, Clausen IG, Curtis BA, De Moors A, Erauso G, Fletcher C, Gordon PM, Heikamp-de Jong I, Jeffries AC, Kozera CJ, Medina N, Peng X, Thi-Ngoc HP, Redder P, Schenk ME, Theriault C, Tolstrup N, Charlebois RL, Doolittle WF, Duguet M, Gaasterland T, Garrett RA, Ragan MA, Sensen CW, Van der Oost J. 2001. The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci. U. S. A. 98:7835–7840. http://dx.doi.org/10.1073/pnas.141222098.

24. Skovgaard O, Bak M, Lobner-Olesen A, Tommerup N. 2011. Genome-wide detection of chromosomal rearrangements, indels, and mutations in circular chromosomes by short read sequencing. Genome Res. 21:1388–1393. http://dx.doi.org/10.1101/gr.117416.110.

25. Oke M, Kerou M, Liu H, Peng X, Garrett RA, Prangishvili D, Naismith JH, White MF. 2011. A dimeric Rep protein initiates replication of a linear archaeal virus genome: implications for the Rep mechanism and viral replication. J. Virol. 85:925–931. http://dx.doi.org/10.1128/JVI.01467-10.

26. Szabó Z, Stahl AO, Albers SV, Kissinger JC, Driessen AJ, Pohlschröder M. 2007. Identification of diverse archaeal proteins with class III signal peptides cleaved by distinct archaeal prepilin peptidases. J. Bacteriol. 189:772–778. http://dx.doi.org/10.1128/JB.01547-06.

27. Pham TK, Sierocinski P, van der Oost J, Wright PC. 2010. Quantitative proteomic analysis of Sulfolobus solfataricus membrane proteins. J. Proteome Res. 9:1165–1172. http://dx.doi.org/10.1021/pr9007688.

28. Henche AL, Ghosh A, Yu X, Jeske T, Egelman E, Albers SV. 2012. Structure and function of the adhesive type IV pilus of Sulfolobus acidocaldarius. Environ. Microbiol. 14:3188–3202. http://dx.doi.org/10.1111/j.1462-2920.2012.02898.x.

29. Labrie SJ, Samson JE, Moineau S. 2010. Bacteriophage resistance mechanisms. Nat. Rev. Microbiol. 8:317–327. http://dx.doi.org/10.1038/nrmicro2315.

30. Rakonjac J, Bennett NJ, Spagnuolo J, Gagic D, Russel M. 2011. Filamentous bacteriophage: biology, phage display and nanotechnology applications. Curr. Issues Mol. Biol. 13:51–76. http://www.horizonpress.com/cimb/v/v13/51.pdf.

31. Zillig W, Kletzin A, Schleper C, Holz I, Janekovic D, Hain J, Lanzendörfer M, Kristjansson JK. 1993. Screening for Sulfolobales, their plasmids and their viruses in Icelandic Solfataras. Syst. Appl. Microbiol. 16:609–628. http://dx.doi.org/10.1016/S0723-2020(11)80333-4.

32. Bettstetter M, Peng X, Garrett RA, Prangishvili D. 2003. AFV1, a novel virus infecting hyperthermophilic archaea of the genus acidianus. Virology 315:68–79. http://dx.doi.org/10.1016/S0042-6822(03)00481-1.
