  • UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl)

    UvA-DARE (Digital Academic Repository)

    Anomalous DNA in prokaryotic genomes

    van Passel, M.W.J.

    Publication date: 2006

    Link to publication

    Citation for published version (APA): van Passel, M. W. J. (2006). Anomalous DNA in prokaryotic genomes.

    General rights: It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).

    Disclaimer/Complaints regulations: If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.

    Download date:22 Jun 2021

    https://dare.uva.nl/personal/pure/en/publications/anomalous-dna-in-prokaryotic-genomes(0d6ef0e9-55f3-4ea8-8c4d-8a4aa5cdd53e).html

  • Table of Contents

    Chapter 1 Introduction
    Chapter 2 An in vitro strategy for the selective isolation of anomalous DNA from unsequenced prokaryotic genomes
    Chapter 3 δρ-web, an online tool to assess composition similarity of individual nucleic acid sequences
    Chapter 4 An acquisition account of genomic islands based on genome signature comparisons
    Chapter 5 Identification of anomalous sequences in Neisseria lactamica expands the neisserial gene pool
    Chapter 6 Plasmid diversity in neisseriae
    Chapter 7 Compositional discordance between prokaryotic plasmids and host chromosomes
    Chapter 8 Summary and discussion, summary in Dutch, list of publications and curriculum vitae

  • Chapter 1

    Introduction

    1.1 Prologue

    Infectious diseases remain one of the leading causes of death worldwide.

    Since the establishment of the germ theory of disease around 1870 by Louis Pasteur

    and Robert Koch, which states that infectious diseases are caused by

    microorganisms within the body, biomedical research has rigorously investigated

    potential remedies against microbial infections. This resulted in better hygiene, the

    production of vaccines (for example against anthrax, tetanus, polio), extensive

    classes of antimicrobial agents against infectious bacteria (which cause diseases

    such as gonorrhoea, pneumonia and meningitis), but also viral inhibitors. These

    combined measures brought down the percentage of deaths due to infectious

    diseases from 30% at the beginning of the 20th century to 1.5% at the beginning of

    this century in the Netherlands. Still, according to the World Health Report 1996,

    infectious diseases kill over 17 million people every year, of whom 9 million are young

    children.

    The field of genetics has contributed considerably to the quest for cures for

    infections by elucidating bacterial strategies for causing disease. Historically,

    genetics has been studied for about a century and a half. The Austrian monk

    Gregor Mendel started investigating the basics of genetics in the 19th century using

    peas, but it wasn’t until 1944 that Oswald Avery and co-workers discovered that DNA

    was the carrier on which hereditary information is stored. The structure of a DNA

    molecule was finally resolved in 1953 by James Watson, Francis Crick and Rosalind

    Franklin. In 1958, Matthew Meselson and Frederick Stahl showed that replication of

    the two DNA strands, which occurs with every cell division, is semiconservative; this

    explained how after every cell division the two daughter cells each contain an exact

    copy of the original genetic information of the parent cell. Finally, a central dogma of

    molecular biology was proposed concerning the flow of genetic information; DNA is

    transcribed into an intermediate, RNA, which in turn is translated into the main

    functional units of metabolism, namely proteins. Some of these proteins mediate

    DNA replication, which makes the reproduction of DNA come full circle.

    In 1995 the first complete genetic sequence of a free-living organism (the

    bacterium Haemophilus influenzae) was published, marking the start of the age

    of genomics. Since then, genome sequencing has resulted in the publication of over

    250 complete genome sequences, most of which originate from microbes (as the

    genomes of microbes are relatively small and manageable compared to plant and

    animal genome sequences). These genomes each contain the entire genetic data of

    the organism in question, and so give insight into the organisation and expression of

    genes, metabolic potential of a microbe, the formation of different species, and

    genome evolution. The final shape of all genomes results from hundreds of millions

    of years of evolution, and as such they each represent their own account of the

    recorded evolutionary history of life.

    Studying genome evolution provides insight into the pathogen's side of the arms

    race between the pathogen and the host (e.g. humans). Understanding genome

    evolution is therefore key in finding new vaccines, antibiotics and other therapeutics.

    This chapter aims to introduce the main topics of this thesis, which focuses on

    bacterial genome composition, in particular of Neisseria and especially on horizontal

    gene transfer (HGT), an important contributor to genome evolution. In order to

    explain the implications of horizontal gene transfer, a little background in genomics

    and bioinformatics is necessary, and is therefore included in this introduction.

    1.2.1 Neisseria meningitidis

    Neisseria meningitidis, or meningococcus, is a Gram-negative diplococcal

    bacterium belonging to the family Neisseriaceae, a subdivision of the β-Proteobacteria.

    It is an obligate human pathogen that inhabits the naso- and oropharynx. It has been

    estimated that in this niche it can be encountered in approximately 10% of the

    population (reviewed by [1]), although this may be a grave underestimation [2]. This

    bacterium was first isolated by Anton Weichselbaum from cerebrospinal fluid of a

    meningitis patient and identified as the causal agent of a case of meningitis in 1887,

    and initially named Diplococcus intracellularis [3]. In contrast with the closely related

    gonococcus (Neisseria gonorrhoeae), isolated almost a decade earlier by Albert

    Neisser [4], which invariably leads to disease in the infected host, the meningococcus

    only causes disease in a fraction of the carriers of this bacterium. Therefore, N.

    meningitidis could be regarded as a commensal bacterium [5]. However, sporadically

    meningococci cross the mucosal barrier and enter the bloodstream, leading to

    various clinical entities such as sepsis, meningitis or both simultaneously (reviewed

    by [6]).

    1.2.2 Bacterial typing

    Meningococcal identification is based on a few distinctive features such as

    shape, Gram-stain, and phenotypic characteristics. Meningococcal isolates are

    grouped according to a number of antigenic and genotypic characteristics. This

    enables intensive epidemiological surveillance, which is important for public health

    decisions and the development of vaccination strategies. Traditionally,

    meningococcal phenotypes were classified using polyclonal sera against surface

    exposed structures [7], but nowadays serological typing is carried out with various

    monoclonal antibodies more specifically aimed at the different neisserial antigenic

    structures. The capsular polysaccharide, although occasionally absent, designates

    the serogroups [8], of which the groups A, B, C, W-135 and Y are predominant

    among clinical isolates. The major outer membrane proteins PorB and PorA define

    the serotype and serosubtype, respectively, and finally, immunotypes are assigned

    according to the lipopolysaccharide (LPS) structure [9, 10]. This results in the

    following classification scheme for N. meningitidis:

    [serogroup]:[serotype]:[serosubtype]:[immunotype], e.g. B:4:P1.7,4:L3,8. Meanwhile,

    serological sero(sub)typing has been largely replaced by typing schemes based on

    sequence data of the two variable regions (VR1 and VR2) of the PorA encoding gene

    porA, and the partial nucleotide sequence of the outer-membrane protein encoding

    gene fetA.
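    The colon-separated designation described above can be read mechanically. The following is a minimal sketch, assuming that the four fields never contain a colon themselves and that the immunotype may be omitted (as in shorter designations of the form serogroup:serotype:serosubtype). The names `StrainDesignation` and `parse_designation` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Optional


class StrainDesignation(NamedTuple):
    """Fields of a serological designation, in the order used in the text."""
    serogroup: str
    serotype: str
    serosubtype: str
    immunotype: Optional[str]  # assumption: may be absent in short designations


def parse_designation(text: str) -> StrainDesignation:
    """Split 'serogroup:serotype:serosubtype[:immunotype]' on colons."""
    fields = [field.strip() for field in text.split(":")]
    if len(fields) not in (3, 4) or not all(fields):
        raise ValueError(
            f"expected 3 or 4 non-empty colon-separated fields: {text!r}"
        )
    immunotype = fields[3] if len(fields) == 4 else None
    return StrainDesignation(fields[0], fields[1], fields[2], immunotype)


# The example designation from the text.
example = parse_designation("B:4:P1.7,4:L3,8")
print(example)
```

    Commas are left untouched because they belong to the serosubtype and immunotype fields themselves (P1.7,4 and L3,8 in the example).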

    With respect to genotyping approaches, the emphasis has shifted recently

    from multilocus enzyme electrophoresis (MLEE, [11]), pulsed-field gel electrophoresis

    (PFGE, [12]), and random amplified polymorphic DNA (RAPD, [13, 14]) to multilocus

    sequence typing (MLST). The latter method allows a much higher resolution and is

    far more reproducible and unambiguously comparable (‘portable’) between

    different laboratories [15]. Current MLST is based on the sequences of seven

    housekeeping genes that contain sufficient variation to allow high resolution and

    congruence, resulting in different sequence types, and is used for typing different

    organisms (for details see http://www.mlst.net/). This MLST database is freely

    available for global epidemiology analyses and surveillance, and can identify clonal

    complexes [16] and even suggest the descent of isolates and/or clonal complexes

    [17, 18].
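    To make the idea of a sequence type concrete, the sketch below treats an isolate as a profile of seven allele numbers and looks that profile up in a table. It is an illustration under stated assumptions, not the mlst.net implementation: the locus names are the seven housekeeping loci commonly used for Neisseria, while the allele numbers, the sequence type numbers and the helper names `sequence_type` and `shared_loci` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Seven housekeeping loci commonly used in the Neisseria MLST scheme.
LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")

Profile = Tuple[int, ...]

# Hypothetical profile table: allele numbers per locus -> sequence type (ST).
PROFILES: Dict[Profile, int] = {
    (1, 1, 2, 1, 3, 2, 1): 101,
    (1, 1, 2, 1, 3, 2, 4): 102,
    (5, 3, 7, 2, 8, 6, 9): 103,
}


def sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Return the ST for an isolate, or None if the profile is not in the table."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(profile)


def shared_loci(a: Profile, b: Profile) -> int:
    """Count loci at which two profiles carry the same allele number."""
    return sum(x == y for x, y in zip(a, b))


isolate = dict(zip(LOCI, (1, 1, 2, 1, 3, 2, 4)))
print(sequence_type(isolate))
print(shared_loci((1, 1, 2, 1, 3, 2, 1), (1, 1, 2, 1, 3, 2, 4)))
```

    Profiles that share most of their alleles, like the first two rows of the table, are the kind of near-identical sequence types that are grouped into one clonal complex; the exact grouping threshold is not given in the text and is left out here.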

    However, Neisseria are naturally transformable [19], and recombination

    events between strains of different types can distort tree-like interpretations of

    phylogenetic analysis. For example, when using MLST, Neisseria meningitidis is

    depicted with a ‘fuzzy’ or unclear species definition, as species “…are not ideal

    entities with sharp and unambiguous boundaries” [20]. However, MLST analyses

    may support further modeling of how species may emerge.

    1.2.3 Epidemiology/Incidence

    The overall incidence of meningococcal disease varies considerably

    throughout the world. In the Netherlands, the frequency is around 2/100,000

    inhabitants per year [21], but during epidemics in sub-Saharan African countries the

    incidence of disease has reached numbers up to 500/100,000 [22, 23]. Industrialised

    countries have also experienced epidemics of N. meningitidis, for example Norway in

    1974-1975 [24] and New Zealand from 1990 onwards [25, 26].

    Widespread pandemics can be instigated by specific human migratory

    patterns, such as the Hajj, and outbreaks of the clonal complexes involved can be followed by

    global surveillance via the MLST database. A specific N. meningitidis clone with

    serogroup W-135 emerged after the Hajj pilgrimage of 2000, infecting a total of over

    40 Hajj pilgrims and their household contacts in the United Kingdom, France, the

    Netherlands, and Oman, with an additional number of meningococcal disease cases

    in Saudi Arabia related to this outbreak [27]. This serogroup W-135 clone also

    caused local epidemics such as in Burkina Faso in 2002 [28]. Recent vaccination

    strategies in this region, however, halted the epidemic [29].

    In the Netherlands, the National Reference Laboratory for Bacterial Meningitis

    (NRLBM, hosted by the Academic Medical Center and The National Institute for

    Public Health and the Environment (RIVM)) has been collecting isolates of N.

    meningitidis since 1959 and currently harbours over 37,000 meningitis and/or sepsis

    causing isolates available for epidemiological studies. The annual reports of the

    NRLBM show that the seasonal distribution of meningococcal disease peaks

    in the first quarter of each calendar year. Also, the distribution of patients suffering

    from meningococcal disease over the different age groups is not uniform; age-

    specific incidences per 100,000 inhabitants in the age groups younger than 5 years

    and between 15-19 years old are substantially higher than in other age groups [21].

    With the advent of molecular epidemiology new meningococcus variants have

    been identified, amongst which the so-called Lineage III cluster, first defined in the

    Netherlands. This hypervirulent B:4:P1.4 type cluster greatly increased in numbers in

    disease cases, until it finally comprised over half of all Dutch meningococcal disease

    isolates (figure 1 [21]). The emergence of this cluster halted in the mid-nineties, after

    which a gradual decline of Lineage III meningococci was observed. Reasons for the

    emergence and decline of this cluster remain obscure.

    Figure 1. Meningococcal disease in the Netherlands over the last 40 years (only the main serogroups

    are depicted). With hardly any serogroup A present, it is clear that serogroup B is the main cause of

    meningococcal disease in the Netherlands, although recently the incidence of serogroup B is

    diminishing. From 2001 onwards a substantial increase of serogroup C is visible, followed by a

    decrease (kindly provided by dr. A. van der Ende, the Netherlands Reference Laboratory for Bacterial

    Meningitis, Academic Medical Center, Amsterdam).

    Also visible in figure 1 is a steep increase in the number of cases of

    meningococcal disease in the beginning of 2002 in the Netherlands, primarily caused

    by N. meningitidis serogroup C [30]. This led the public health authorities to start a

    massive nationwide vaccination campaign by mid-2002; 3 million children aged

    between 12 months and 19 years were vaccinated in the first year after the campaign

    was started. Since then, a N. meningitidis serogroup C vaccine, based on a

    conjugate of the polysaccharide capsule, is included in the Dutch vaccination

    scheme. This policy almost immediately led to a sharp decrease of meningococcal

    disease caused by the serogroup C variant [31], which was predicted by the study on

    meningococcal carriage performed in the UK [32]. However, N. meningitidis

    serogroup B still causes many cases of meningococcal infection per year in the

    Netherlands. Its capsule, consisting of polysaccharides resembling those on host

    cells, is poorly immunogenic [33, 34]. This renders the development of a vaccine very

    difficult; hence no commercial vaccine is currently available against this serogroup.

    However, alternative epitopes are sought after for broad coverage meningococcal

    vaccines, such as outer-membrane proteins (Pizza et al., 2000).

    Interestingly, studies by Caugant and co-workers showed that among N.

    meningitidis carrier isolates few meningococci of disease-causing genotypes are

    found [11, 35], which emphasises the importance of epidemiological studies. The

    related non-pathogenic Neisseria lactamica [36] has been suggested to be a human

    coloniser causing natural immunity against the meningococcus [37, 38], and

    potentially, non-pathogenic N. meningitidis variants may cause natural immunity as

    well. Braun and co-workers reported the occurrence of cross-reactive epitopes that

    are shared by N. meningitidis and N. lactamica [39], and Gorringe and co-workers

    described initial research on a N. lactamica based vaccine [40]. Another longitudinal

    study showed high N. lactamica carriage rates among infants, which was interpreted

    as support for the aforementioned natural immunity hypothesis [41]. Since vaccines

    based on one or a few protein epitopes still lack sufficient coverage due to variation,

    alternative strategies for vaccine development, such as using commensal Neisseriae,

    are of interest.

    1.2.4 Pathogenesis of meningococcal disease

    Relatively little is known about the actual pathogenesis process of

    meningococcal disease, as this process exclusively takes place in humans.

    Moreover, this process usually occurs very suddenly and dramatically. However,

    many different putative virulence-associated factors have been described, including

    adhesion factors such as pili [42-45], the immunoglobulin protease IgA1 [46], the

    putative RTX toxins Frp [47-49], the capsular polysaccharide [50], outer membrane

    proteins Opa and Opc [51], and lipopolysaccharide (LPS or endotoxin) [52]. More

    recently, the formation of biofilms has been studied [53]. Most in vivo studies

    concerning invasive meningococcal disease have been performed with animal

    models [54]. These animal models are supposed to simulate the infection process in

    humans, but often these models require intraperitoneal injection of the virulent or

    attenuated bacteria [55], which is a poor imitation of the actual infection route in

    humans. However, intranasal immunisation studies in murine models, simulating a

    more comparable route of infection, have been performed as well [56, 57], and the

    recent introduction of transgenic mice which express human epitopes indicates

    improvements in meningococcal infection models [58].

    Next to animal models, some in vitro experiments have been conducted with

    human tissue [51], but epidemiological data, sequence comparisons and analogies

    with gonorrhoea can also give information about potential pathogenicity factors of the

    meningococcus [59-61]. Interestingly, with recently developed population dynamics

    models, it was shown that outbreaks of meningococcal disease are caused by

    diversity in the pathogenicity of meningococcal strains [62]. This is an incentive for

    performing genome comparisons between carrier strains of N. meningitidis and

    clinical isolates, in order to identify bacterial genetic factors underlying invasiveness.

    Our limited understanding of the factors that play an obligatory role in the

    pathogenesis of invasive disease is highlighted by the recent description of a single

    case of invasive disease caused by an unencapsulated strain in a presumably

    immunocompetent patient [63]. This contrasts with the general assumption that the

    presence of a capsule is pivotal to the virulence of N. meningitidis. Also, strains

    lacking PorA, the major outer membrane porin that is currently being considered and

    tested as a potential broad-range vaccine-candidate [64], have been found to be able

    to cause invasive disease [65], questioning the practicability of such single protein-

    based vaccines.

    Although meningococcal disease in humans is hard to study directly, the

    consequences of meningococcaemia are easily observed, and different studies

    showed the devastating effects. In addition to a case fatality rate of approximately 5%

    for meningitis, and up to 40% for meningococcal septicaemia [66], the sequelae of

    the disease in survivors include deafness, loss of limbs and mental retardation [67].

    This imposes a high burden for a prolonged time, due to the often relatively young

    age of the patient. Koomen and colleagues have recently presented a number of

    studies in which young patients that survived bacterial meningitis were tested for

    several years for mental sequelae [68-70]. Among other findings, children who had survived

    meningitis were more likely than ‘controls’ to underachieve at school [69].

    Although the precise disease process is still largely unknown, a number of

    predisposing factors for the human host have been determined. As pathogenesis is a

    complex interplay between host and pathogen, the role of the host should not be

    underestimated. Host risk factors include (passive) smoking [71], and reduced

    immunocompetence [72]. Recently, a higher attack rate of meningococcal disease

    has been observed in children with a pregnant mother [73], but the cause of this

    is still unknown. Also, genetic polymorphisms among components of the diverse

    cascades of the immune system are involved in heightened susceptibility for

    meningitis [74]. This clearly shows that factors beyond the intrinsic pathogenic

    potential of the microbe are important for disease to occur.

    1.2.5 Phase variation and antigenic variation

    Meningococci employ various strategies to evade the human immune system.

    One of these strategies is the variation in gene expression levels via length

    differences in simple sequence repeats, coined phase variation [75]. Differences in

    length of these sequence repeats, both homopolymeric tracts and other simple

    repeats, which may occur in coding and/or in promoter regions, are presumably the

    result of slipped-strand mispairing during replication and can thereby influence gene

    expression levels. Different studies suggested that various loci are responsible for

    phase variation frequencies in various genes, such as transformation associated

    genes [76], pili and iron transport genes [77], but also genome maintenance genes

    (such as mutS) may be involved [78].
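    The role of homopolymeric tracts can be illustrated with a short sketch. It scans a DNA string for runs of a single base and checks whether a change in tract length would preserve the reading frame, which is one way in which a length change inside a coding region can switch expression on or off. The minimum tract length of 5 and the function names `homopolymeric_tracts` and `keeps_reading_frame` are assumptions made for this example, not values or methods taken from the cited studies.

```python
import re
from typing import List, Tuple


def homopolymeric_tracts(seq: str, min_length: int = 5) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for every run of one base of at least min_length."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # One base followed by at least (min_length - 1) copies of that same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(seq.upper())
    ]


def keeps_reading_frame(original_length: int, new_length: int) -> bool:
    """A tract length change preserves the frame only if it is a multiple of 3."""
    return (new_length - original_length) % 3 == 0


# Toy sequence with a tract of seven G and a tract of five T.
toy = "ATC" + "G" * 7 + "CA" + "T" * 5 + "AA"
print(homopolymeric_tracts(toy))
print(keeps_reading_frame(7, 8))   # one extra G shifts the frame
print(keeps_reading_frame(7, 10))  # three extra G keep the frame
```

    Tracts in promoter regions act differently (by changing promoter spacing rather than the reading frame), so the frame check applies only to tracts inside coding regions.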

    This phase variation strategy gives the bacterium versatility in the expression of

    genes including, but not restricted to, immunologically important surface structures

    such as pili, outer membrane proteins or enzymes involved in capsular

    polysaccharide biosynthesis, as well as adhesin encoding genes such as opC [79]. It

    may also provide adaptation to different environmental niches, during colonisation

    and potentially also during the dissemination process. Whole genome analysis of the

    N. meningitidis MC58 genome sequence identified over 65 phase-variable genes

    [80], whereas the comparative analysis of two additional N. meningitidis genome

    sequences revealed a repertoire of over 100 putative phase variable genes [81].

    These studies showed that the meningococcal genome sequence contains the most

    extensive repertoire of phase variable genes described to date.

    A different strategy employed by the meningococcus is called antigenic

    variation. This strategy exploits differences in variants of a single surface component,

    such as the outer membrane porin PorA, which displays not only variation in

    expression levels via phase variation, but also variation in sequence composition,

    resulting in antigenically different PorA proteins. Antigenic variation can result from

    transformation-mediated recombination, point mutations or replacement of different

    alleles present in the genome sequence itself, as has been observed for pilin

    structures [82, 83]. Combined, phase variation and antigenic variation make the

    meningococcus a highly versatile bacterium.

    1.3 Genomics & Bioinformatics

    Since the publication of the first complete genome sequence of a free living

    organism in 1995, Haemophilus influenzae [84], over 231 prokaryotic and 33

    eukaryotic genome sequences have been annotated, and over 1000 genome

    sequencing projects are still ongoing (www.genomesonline.org, [85, 86]). This

    enormous amount of data has been amassed in order to mine for better vaccine

    candidates [87, 88], scan for virulence evolution [89], to fuel industrial interest in

    probiotics [90], or to examine adaptation strategies to extreme conditions [91, 92], to

    give only a few examples. The application area of prokaryotic genome sequencing

    projects is nonetheless biased towards biomedical research and industrial purposes,

    as 52% and 47% of the sequenced genomes, respectively, were selected for their

    relevance in these fields (www.genomesonline.org). Together, this results in a poor

    representation of biological diversity [93] and this bias in turn may have

    consequences for the interpretation of certain types of analyses (e.g. species

    diversity estimates due to sampling bias, protein function predictions).

    With genome sequencing being highly automated, large scale projects such

    as community sequencing are feasible and have been conducted [94, 95]. These

    metagenome projects analyse all (microbial) DNA present in a chosen biotope (an

    acid mine drainage pool in [94] and a seawater sample of the Sargasso Sea in [95]),

    including the DNA of uncultivable microbes, which are still thought to make up most

    of microbial life on earth (reviewed by [96]). This approach has proven to be of great

    value in estimating oceanic microbial diversity [95] and intraspecies genetic diversity

    and metabolic potential [94]. Also, datasets from these metagenomic projects can

    instigate new data mining schemes, such as the investigation for the selenoproteome

    in the Sargasso Sea environmental genome project [93]. Recently, metagenomic

    analyses by Tringe and co-workers have shown that vast amounts of sequences are

    needed to yield a complete genome of the predominant species in biologically

    complex populations [97]. These authors did however find environment-specific

    genes, which allow for habitat-specific fingerprinting.

    Furthermore, the release of a great many microbial genome sequences

    allowed a critical look at species definition and taxonomy, which until recently was

    based solely on phenotypical, morphological and limited genotypical data ([98],

    recently reviewed by [99]). Coenye and co-workers suggest a number of approaches

    for assessing taxonomic relationships [99]. Genomes can be compared according to

    their gene content and gene order (synteny), as well as their nucleotide composition,

    such as GC-content and dinucleotide frequencies or genome signature comparisons.

    These approaches were compared in a study comprising the lactic acid bacteria as a

    test case, and it was concluded that the different whole genome approaches that

    were used yielded very similar classification results [100].
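    The compositional measures mentioned above can be sketched in a few lines. The example computes GC-content and a dinucleotide genome signature, using one common formulation usually attributed to Karlin and co-workers: the relative abundance ρ*(XY) = f*(XY) / (f*(X) f*(Y)), with frequencies taken over both strands, and the difference δ* as the mean absolute difference between the 16 values of two sequences. The text does not spell out a formula, so this choice and the function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from typing import Dict

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in BASES)
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / counted


def genome_signature(seq: str) -> Dict[str, float]:
    """Dinucleotide relative abundances, counted over both strands."""
    seq = seq.upper()
    strands = (seq, seq.translate(COMPLEMENT)[::-1])  # forward and reverse complement
    mono: Counter = Counter()
    di: Counter = Counter()
    for strand in strands:
        mono.update(base for base in strand if base in BASES)
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair[0] in BASES and pair[1] in BASES:
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_di == 0:
        raise ValueError("sequence too short to contain a dinucleotide")
    signature = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        observed = di[x + y] / n_di
        # Dinucleotides whose bases are absent get 0.0 (a convention of this sketch).
        signature[x + y] = observed / expected if expected else 0.0
    return signature


def signature_difference(a: str, b: str) -> float:
    """Mean absolute difference between the 16 relative abundances of a and b."""
    sig_a, sig_b = genome_signature(a), genome_signature(b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16


print(gc_content("GGCCAATT"))
print(signature_difference("AAAA", "ACGTACGT"))
```

    On real data these functions would be applied to whole genomes or to windows of several kilobases; very short sequences like the ones shown give noisy values.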

    The release of these large amounts of data has fueled the information

    technology field considerably, as new and optimized computational approaches were

    necessary to manage these complex and extensive databases [101]. An outstanding

    example of the enormity of these databases and the potential of bioinformatics was

    the finding, by serendipity, of almost entire genome sequences of new

    endosymbionts in the raw sequence traces of various Drosophila genome-

    sequencing projects [102].

    The discipline of bioinformatics could be regarded as a science or a facilitative

    technology platform [103]. On the one hand, numerous applications have been

    developed that allow users to scan data for motifs, for instance promoter sequences

    or genome signatures (which are species specific oligonucleotide frequencies

    observed in whole genome sequences). Other applications have also been

    developed, such as phylogenomic tree builders, potential protein interaction partner

    search tools, models for operon prediction in whole genomes, and also visualisation

    software such as Artemis, Bugview, Plasmapper [104-107], which are mostly aimed

    at facilitating research. These applications are described amongst others in

    specialised web issues of renowned journals. On the other hand, bioinformaticians

    may directly develop hypotheses concerning biological phenomena, such as the turnover

    of gene content in Proteobacteria and Archaea [108], the assessment of

    functional modules by the determination of pairwise protein interaction [109] or the

    origins of gene repertoires in prokaryotes [110].

    Large-scale databases have also found their place in specialised sections of

    scientific journals. As a response to the maintenance of these large datasets, the

    American Society for Microbiology (ASM) recently published a colloquium report

    regarding maintenance and improvement of extensive genome sequencing

    databases, as this is often neglected due to lack of funds and scientific merit [111].

    1.4 Genome composition and evolution

    In the process of genome evolution, three major forces play an important role:

    gene genesis (e.g. via horizontal acquisition of DNA), gene loss, and genomic

    rearrangements (figure 2) ([112] and reviewed by [113]).

    The relation between gene content and genome size has been studied by

    Konstantinidis and Tiedje [114]. They suggest that prokaryotic species with large

    genome sizes have a more extensive metabolic potential, enabling survival in

    (different) environments where resources are scarce. The main contributors to

    genome expansion are acquisition of DNA and duplication events. As for gene

    acquisition or genesis, Daubin and Ochman suggest that a substantial part of new

    (small) genes is acquired via horizontal gene transfer from bacteriophages [115,

    116], whereas other acquisition events involve much larger gene clusters such as

    Genomic Islands (GIs) and Pathogenicity-Associated Islands (PAIs) [117]. The

    origins of these large gene clusters remain obscure, which may be explained by the

    relatively small number of different species that have been sequenced.

    However, although genomes do incorporate new DNA, they do not grow ever

    larger. Cellular processes that eliminate (excess) DNA from the genome must be

    present. The existence of species with small genomes, often endosymbionts, is

    suggestive of genome evolution leading to niche-specific organisms by the loss of

    many regulatory functions [118, 119]. These small-genome organisms are thought to

    be derived from larger genome-sized species [120], and may represent an illustration

    of the process of genome reduction.

    Figure 2. Depicted are the processes involved in genome size evolution in bacteria. Genome

    expansion takes place by duplication and acquisition events, whereas genome reduction is mainly

    driven by deletion, either by direct deletion or by slow erosion via gene inactivation and subsequent

    deletion (adapted from [121]).

    A fine example of recent massive gene decay is found in the genome

    sequence of the leprosy bacillus Mycobacterium leprae [122]. The bacterium is

    strictly intracellular, but it has a relatively large genome size (3.3 Mb). It has,

    presumably relatively recently, undergone massive reduction in genome size

    resulting in many pseudogenes, thought to result from extensive recombination.

    Pseudogenes are inactivated genes of which the remnants are still present, and

    recent analyses of pseudogene content across diverse prokaryote genomes indicate

    that pseudogenes are formed and eliminated rapidly from genome sequences [123].

    In M. leprae the total number of predicted functional genes is around 1,600

    (compared to almost 4,000 in the related M. tuberculosis with a similar genome size),

    which is indicative of a genome in flux.

    [Figure 2 diagram labels. Processes increasing genome size: gene acquisition, duplication.

    Processes reducing genome size: loss of fragments, deletional bias, gene inactivation.]

    Genomic rearrangements were first visualized when two different complete

    genome sequences of the same species were determined and compared [124]. With

    the exception of operons, in which functionally related genes are of necessity in

    close proximity to each other, the order of genes is often poorly conserved among

    bacteria during evolution [125]. This genomic flexibility may have a function in

    the genesis of new genes (such as through duplication of genomic regions), in the

    removal of unnecessary sequences, or in the alteration of transcription levels. Comparisons

    of whole genome sequences are best depicted in whole genome alignment graphs,

    as described by Eisen and co-workers [126]. The conservation of gene order is

    expressed with the synteny parameter, which is used when quantifying genome

    sequence similarity.
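
    As a rough illustration of how conservation of gene order could be quantified,
    the sketch below scores the fraction of neighbouring gene pairs in one genome
    that are also neighbours in a second genome. The function name, the
    adjacency-based score and the toy gene lists are assumptions made for this
    example only; it is not the alignment method of [126], and it ignores gene
    orientation and chromosome circularity.

```python
def adjacency_conservation(order_a, order_b):
    """Fraction of neighbouring gene pairs in order_a that are also neighbours in order_b.

    Genes are given as lists of ortholog identifiers along a linear map;
    orientation and circularity are ignored (simplifying assumptions).
    """
    def neighbour_pairs(order):
        # Unordered pairs, so an inverted segment keeps its internal adjacencies.
        return {frozenset(pair) for pair in zip(order, order[1:])}

    pairs_a = neighbour_pairs(order_a)
    pairs_b = neighbour_pairs(order_b)
    if not pairs_a:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a)


# Hypothetical gene orders: the segment C-D-E is inverted in the second genome.
genome_1 = ["A", "B", "C", "D", "E", "F"]
genome_2 = ["A", "B", "E", "D", "C", "F"]

score = adjacency_conservation(genome_1, genome_2)
print(f"conserved adjacencies: {score:.2f}")
```

    In this toy case only the two adjacencies at the inversion breakpoints are
    lost, so three of the five neighbouring pairs are retained.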

    1.5 Horizontal gene transfer

    Even before whole genome sequences were available, horizontal gene

    transfer (HGT) was recognized and acknowledged as a factor contributing to

    prokaryotic evolution, although the impact of HGT was not fully appreciated until the

    genomic era was well underway. One of the best-known examples of HGT

    is the spread of R-plasmids, which encode resistance against certain (types of) antibiotics

    [127].

    Horizontal gene transfer, sometimes referred to as lateral gene transfer,

    constitutes an alternative to the orthodox vertical inheritance of genetic traits. There

    are three distinct routes which can lead to horizontal acquisition of DNA: 1) direct

    uptake of DNA by the cell (transformation), 2) directed DNA transfer via conjugation

    and 3) bacteriophage-mediated DNA transfer (transduction) (see figure 3). These

    three different routes of horizontal acquisition of DNA have all been studied

    extensively; nowadays they constitute basic tools in molecular biology.

    The first genome-scale analysis of HGT was performed by Lawrence and

    Ochman [128], who found that approximately 18% of the Escherichia coli genome

    has been acquired via horizontal gene transfer. This study of a single genome

    sequence opened the door to further analyses, which seem to increase in accuracy

    with the availability of more genome sequences (and more readily available

    phylogenetic data) and of more parameters for identification procedures.

    Although no actual natural transfer events have been witnessed, the results of

    HGT can be recognized in different ways. The most obvious is the phylogenetic

    approach, in which incongruence in evolutionary relationships between different gene

    clusters hints at transfer events. Molecular phylogenetic analyses were originally

    performed using nucleotide sequences present in all organisms: those of ribosomal

    RNA [129]. Nowadays, with the availability of many different complete genome

    sequences, weighted genome trees can be constructed [130], and discrepancies in

    these analyses are often explained most parsimoniously by a horizontal transfer

    event at one of the branches, rather than by a large number of deletion events

    at a great many more branches of the phylogenetic tree.
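
    The parsimony argument can be made concrete with a small counting exercise.
    The sketch below assumes a hypothetical eight-species tree, written as nested
    tuples, in which a gene is found in only two distantly related species. It
    counts the minimum number of independent deletions needed if the gene had
    been present in the common ancestor and inherited strictly vertically. The
    tree, the species names and the function names are illustrative assumptions,
    not data or methods from the cited studies.

```python
def leaves(tree):
    """Set of leaf names in a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def losses_if_ancestral(tree, carriers):
    """Minimum number of independent losses if the gene was present at the root."""
    if not leaves(tree) & carriers:
        return 1  # one deletion on the branch leading to this gene-free clade
    if isinstance(tree, str):
        return 0  # a leaf that still carries the gene
    return sum(losses_if_ancestral(child, carriers) for child in tree)


# Hypothetical tree; only sp1 and sp5 carry the gene of interest.
tree = ((("sp1", "sp2"), ("sp3", "sp4")), (("sp5", "sp6"), ("sp7", "sp8")))
carriers = {"sp1", "sp5"}

deletions = losses_if_ancestral(tree, carriers)
print(f"vertical-only scenario: {deletions} deletion events")
print("transfer scenario: 1 transfer event between the sp1 and sp5 lineages")
```

    For this toy tree the vertical-only scenario requires four separate
    deletions, against a single transfer event in the alternative scenario.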

    The second approach for the detection of acquisition events consists of

    parametric analyses, and is more enthusiastically embraced by bioinformaticians, as

    it is relatively easily implemented in software. Different parameters have been

    proposed to detect horizontally acquired sequences, and the best-known

    approaches are based on codon-usage biases, GC percentage, and dinucleotide

    frequency (also called the genome-signature) deviations [131]. Parametric

    identification of horizontally transferred DNA is based on the genome hypothesis,

    which proposes that for a given prokaryotic genus genomic DNA is relatively constant

    in codon usage and GC content [132, 133]. Therefore, horizontally acquired

    sequences may differ in codon usage and/or GC composition from the recipient

    genome and can be identified in whole genome sequences. Improvements of these

    approaches that permit better resolution have been published recently [134, 135].
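
    A minimal sketch of two of these parameters is given below: the GC fraction
    and a dinucleotide-based genome signature. The signature is computed here as
    the ratio of observed to expected dinucleotide frequencies on both strands,
    and two sequences are compared by the mean absolute difference over the 16
    dinucleotides. This is one common formulation and is assumed for
    illustration; it is not necessarily the exact procedure used in the cited
    work, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def both_strands(seq):
    """Return the sequence and its reverse complement (for strand-symmetric counts)."""
    seq = seq.upper()
    return seq, seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / total if total else 0.0


def signature(seq):
    """Dinucleotide relative abundances: observed f_xy divided by f_x * f_y."""
    mono, di = Counter(), Counter()
    for strand in both_strands(seq):
        mono.update(base for base in strand if base in "ACGT")
        di.update(strand[i:i + 2] for i in range(len(strand) - 1)
                  if set(strand[i:i + 2]) <= set("ACGT"))
    n_mono, n_di = sum(mono.values()), sum(di.values())
    rho = {}
    for x, y in product("ACGT", repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = di[x + y] / n_di if n_di else 0.0
        rho[x + y] = observed / expected if expected else 0.0
    return rho


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two signatures over the 16 dinucleotides."""
    rho_a, rho_b = signature(seq_a), signature(seq_b)
    return sum(abs(rho_a[key] - rho_b[key]) for key in rho_a) / 16


# Toy sequences (assumptions): a GC-rich "host" and an AT-rich candidate fragment.
host = "ATGCGCGGCTAGCGCCGATCGGCGC" * 20
fragment = "ATTATAAATTTAATATTAAATATTT" * 4

print(f"GC host: {gc_fraction(host):.2f}, GC fragment: {gc_fraction(fragment):.2f}")
print(f"signature distance: {signature_distance(host, fragment):.3f}")
```

    In practice such a score would be compared against the distribution of
    scores for fragments of the same length drawn from the host genome itself,
    since short sequences vary considerably by chance.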

    Currently, with a great many genome sequences tested for putative horizontally

    acquired genes, the emphasis has shifted toward functional analyses of horizontally

    transferred sequences. Recent analyses of acquired genes suggest a bias towards

    three functional categories: cell-surface, DNA binding and pathogenicity-associated

    [136]. The observed bias in functional categories may, however, be the result of the

    aforementioned disproportionate availability of genome sequences of biomedical and

    industrially relevant strains, as compared to other strains (www.genomesonline.org).

    Now, with more genome sequences of environmental strains rapidly becoming

    available, an increasing variety of acquired gene clusters providing diverse metabolic

    capacities are being discovered, emphasising that horizontal gene transfer is not

    limited to virulence traits [117], but may also supply novel catabolic pathways

    involved in, for example, xenobiotics degradation [91]. It is of note, however, that the

    parametric approach to the identification of compositionally dissimilar sequences

    might not be sufficient to identify all horizontally acquired sequences. Exchange of

    DNA between closely related species can lead to the acquisition of non-anomalous

    sequences [137]. On the other hand, autochthonous sequences can sometimes be

    very different from the rest of the genome sequence [131], for instance because they are highly

    expressed or have distinct features such as a strong bias for certain amino acids;

    ribosomal RNA sequences are another example. Finally, there are genomic sequences that are

    suspected to have been acquired via HGT, but to which no definite history or origin can be assigned.

    This is a further drawback of parametric detection of anomalous sequences in

    genome sequences; no clear cut-off value for the number of putative horizontally

    acquired genes can be given without further phylogenetic support.

    Besides whole genome sequencing, several other techniques are available to

    selectively isolate putative horizontally acquired sequences. These include

    subtractive hybridization [138] and representational difference analysis [138, 139],

    both techniques with which the differences between two related strains are cloned and

    analysed. These approaches rely on the lack of hybridisation between sequences

    unique to one of the two strains. However, no dedicated tool is available to score

    individual sequences isolated with these techniques for composition dissimilarities

    compared to a genome sequence, although for many of these putative horizontally

    transferred sequences a representative genome sequence (i.e. a genomic context) is

    available. Currently, applications that can test whether a nucleotide sequence is

    atypical within a genomic context, and therefore putatively horizontally acquired,

    rely solely on whole genome sequences, and disregard all other sequences

    available in the databases [136, 140-143].

    Obviously, horizontal gene transfer does not create ever-larger prokaryotic

    genome sequences. Constraints on HGT must therefore exist and limit the number

    of acquired sequences that can be stably introduced into host genomes. For transfer

    routes such as conjugation or transduction, a limited host range of the transferring

    agents may restrict the horizontal spread of genes. Lawrence and Hendrickson

    propose a different potential constraint on HGT [144], based on specific

    oligonucleotide motifs, distributed asymmetrically between the two DNA strands,

    which may be involved in genome replication processes. The differential distribution

    of such specific oligonucleotide motifs between different species of microbes may

    limit sequence exchange, as the introduction of the DNA would incur a selective

    detriment that could potentially offset any benefits provided by the newly acquired

    gene products [144]. How this would relate to large-scale genome rearrangements is

    still unknown.

    Incompatibility may play a significant role in HGT. As sequences are thought

    to function optimally in the genomes where they originally belong, sequence

    establishment in a new host may not always result in compatibility of the encoded

    proteins with the host (intra- and extracellular) environment. Also, most proteins

    interact with and/or are dependent on other proteins in complex systems, and the

    absence of similar systems in a new host may restrict functionality of acquired

    sequences [145]. As a result, it is thought that sequences are removed quickly from

    the genome when they are not beneficial to the host [121].

    Finally, restriction modification systems may constrain HGT

    in a very different way. The defence hypothesis postulates that these

    systems play a role in maintaining species identity [146]. Restriction modification

    systems have two main activities: that of a restriction endonuclease and that of a

    methylase. Both recognize the same nucleotide motif. The endonuclease activity

    cleaves unmethylated DNA, which usually consists of acquired DNA that has not yet

    been methylated. Genomic DNA is safe from the harmful activity of the

    endonuclease, as the methylase protects all recognition sequences. However, the

    mobile nature of these restriction modification systems calls into question their supposed

    protective functionality towards species identity [147].
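
    The interplay of the two activities can be sketched with a toy model in which
    the methylase marks every recognition site in the host genome and the
    endonuclease cuts only at unmarked sites. The model is an assumption made for
    illustration: it treats DNA as a single strand, uses the EcoRI recognition
    sequence GAATTC purely as a familiar example, and the function names and
    sequences are hypothetical.

```python
def find_sites(dna, motif):
    """Start positions of every (possibly overlapping) occurrence of motif in dna."""
    sites, start = [], dna.find(motif)
    while start != -1:
        sites.append(start)
        start = dna.find(motif, start + 1)
    return sites


def methylate(dna, motif):
    """Methylase activity: return the set of recognition sites marked as protected."""
    return set(find_sites(dna, motif))


def restrict(dna, motif, protected=frozenset(), cut_offset=1):
    """Endonuclease activity: cut at each unprotected site and return the fragments."""
    cuts = [site + cut_offset for site in find_sites(dna, motif) if site not in protected]
    fragments, previous = [], 0
    for cut in cuts:
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


MOTIF = "GAATTC"  # EcoRI recognition sequence, cut after the first G
host_dna = "TTGAATTCAAGGAATTCTT"   # toy host genome with two recognition sites
incoming_dna = "CCGAATTCGG"        # toy acquired DNA, not yet methylated

host_marks = methylate(host_dna, MOTIF)
print(restrict(host_dna, MOTIF, protected=host_marks))  # host DNA remains one piece
print(restrict(incoming_dna, MOTIF))                    # incoming DNA is cleaved
```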

    Figure 3. Three different routes for horizontal gene transfer: transformation (uptake of naked DNA),

    conjugation (plasmid-directed DNA transfer) and transduction (bacteriophage-mediated transfer of

    DNA).

    1.6 Outline of this thesis

    The aim of this introduction was to draw the framework in which this thesis

    should be placed, as well as to formulate the research questions of this thesis. The

    initial aim of the project was to identify neisserial sequences that are responsible for

    the hypervirulent nature of certain N. meningitidis genotypes. However, the aim

    gradually shifted towards analyses of horizontal gene fluxes amongst prokaryotes in

    general, and acquired gene clusters in Neisseriae in particular. In other words, how

    can horizontally transferred sequences be isolated without strain-by-strain

    comparisons? Next, how can individual sequences, suspected to be acquired via

    HGT, be placed within a genomic context? Furthermore we aimed to analyse

    horizontal gene fluxes both intragenomically (with genomic islands) and between

    species (with plasmids). The first step would be the development of an in vitro

    strategy to isolate horizontally acquired sequences from Neisseriae. Different

    techniques exist to identify anomalous sequences in silico in completely sequenced

    genomes. However, any sequenced genome is merely a representative of a

    particular species and it is therefore difficult to extrapolate to unsequenced isolates of

    the same species. Bioinformatical approaches have facilitated horizontal gene

    transfer identification procedures in whole genome sequences as well as the

    mapping of horizontal gene transfer processes. There is a need to implement in vitro

    approaches to detect DNA acquisition events in order to unravel this important

    contributor to bacterial evolution and diversification.

    Chapter two deals with a novel in vitro strategy to selectively isolate

    anomalous sequences from unsequenced prokaryotic genomes. The availability of

    multiple genome sequences allows comparative genomics to test new strategies to

    isolate putative acquired sequences, without performing hybridizations. In this new

    strategy we focus on compositional attributes such as the genome signature, instead

    of specific sequences potentially involved in DNA transfer processes such as tRNA

    synthetase encoding genes. The availability of representative genome sequences is

    a prerequisite for this strategy, and fortunately the large number of ongoing whole

    genome sequencing projects safeguards a steady increase of such representative

    genome sequences, which could expand our technique to different prokaryotic

    genera.

    Chapter three describes the development of, and gives instructions for, a

    web application that permits nucleotide composition analyses of individual sequences

    in comparison to a suitable genomic context. The bioinformatical background of

    chapter two is explained extensively in this chapter.

    We hypothesized that different large putative horizontally acquired sequences

    (i.e. Genomic Islands) within the same genome sequence may be compared with

    each other, as it is reasonable to suggest that a single donor was responsible for

    multiple transfer events. A comparison of Genomic Islands is described in chapter

    four, and an application that allows users to investigate the acquisition account of

    given prokaryotes is described. This approach might facilitate donor identification of

    putative horizontally acquired genes.

    Chapter five comprises a study focussed on the anomalous DNA content of a

    Neisseria lactamica strain. N. lactamica is a human commensal residing in the

    nasopharynx. This species however does not cause disease, although it is closely

    related to the meningococcus. The strategy developed in chapter two to specifically

    isolate putative horizontally acquired sequences is applied to the unsequenced N.

    lactamica, and the analysis of the detected anomalous sequences in this species is

    described.

    A typical and well-known prokaryotic mobile element is the plasmid. Chapter

    six reports on a range of plasmids isolated from N. lactamica strains. Selfish genetic

    elements such as plasmids form a common vehicle for horizontal gene transfer

    processes. Not much is known about plasmid sequence diversity amongst different

    neisserial species. The (compositional) analyses of neisserial plasmids may reveal

    what sequences are being transferred via these mobile elements.

    Based on some of the results from chapter six, we perform a database

    analysis of a large number of prokaryotic plasmids, which is described in chapter

    seven. Compositional comparisons of plasmid sequences and their respective host

    chromosome sequences may confirm compositional compatibility. If compositional

    incompatibility would be detected, alternative selection pressures would be exerted

    on genomic and epigenetic sequences.

    Finally, chapter eight summarises the findings of this thesis and discusses

    these further within the context of current developments in the field of molecular

    microbiology.

    References

    1. Yazdankhah, S.P. and D.A. Caugant, Neisseria meningitidis: an overview of the carriage state. J Med Microbiol, 2004. 53(Pt 9): p. 821-32.

    2. Sim, R.J., et al., Underestimation of meningococci in tonsillar tissue by nasopharyngeal swabbing. Lancet, 2000. 356(9242): p. 1653-4.

    3. Weichselbaum, A., Ueber die aetiologie der akuten meningitis cerebro-spinalis. Fortschr Med, 1887. 5: p. 573-583.

    4. Neisser, A., Ueber eine der Gonorrhoe eigentuemliche Micrococcusform. Centralb. Med. Wissenschaften, 1879. 28: p. 497-500.

    5. Taha, M.K., et al., The duality of virulence and transmissibility in Neisseria meningitidis. Trends Microbiol, 2002. 10(8): p. 376-82.

    6. Tzeng, Y.L. and D.S. Stephens, Epidemiology and pathogenesis of Neisseria meningitidis. Microbes Infect, 2000. 2(6): p. 687-700.

    7. Branham, S.E., Serological relationships among meningococci. Bacteriol Rev, 1953. 17(3): p. 175-88.

    8. Frasch, C.E., W.D. Zollinger, and J.T. Poolman, Serotype antigens of Neisseria meningitidis and a proposed scheme for designation of serotypes. Rev Infect Dis, 1985. 7(4): p. 504-10.

    9. Zollinger, W.D. and R.E. Mandrell, Outer-membrane protein and lipopolysaccharide serotyping of Neisseria meningitidis by inhibition of a solid-phase radioimmunoassay. Infect Immun, 1977. 18(2): p. 424-33.

    10. Mandrell, R.E. and W.D. Zollinger, Lipopolysaccharide serotyping of Neisseria meningitidis by hemagglutination inhibition. Infect Immun, 1977. 16(2): p. 471-5.

    11. Caugant, D.A., et al., Multilocus genotypes determined by enzyme electrophoresis of Neisseria meningitidis isolated from patients with systemic disease and from healthy carriers. J Gen Microbiol, 1986. 132(3): p. 641-52.

    12. Bygraves, J.A. and M.C. Maiden, Analysis of the clonal relationships between strains of Neisseria meningitidis by pulsed field gel electrophoresis. J Gen Microbiol, 1992. 138(3): p. 523-31.

    13. Woods, J.P., et al., Use of arbitrarily primed polymerase chain reaction analysis to type disease and carrier strains of Neisseria meningitidis isolated during a university outbreak. J Infect Dis, 1994. 169(6): p. 1384-9.

    14. Bart, A., et al., Randomly amplified polymorphic DNA genotyping of serogroup A meningococci yields results similar to those obtained by multilocus enzyme electrophoresis and reveals new genotypes. J Clin Microbiol, 1998. 36(6): p. 1746-9.

    15. Maiden, M.C., et al., Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci U S A, 1998. 95(6): p. 3140-5.

    16. Jolley, K.A., M.S. Chan, and M.C. Maiden, mlstdbNet - distributed multi-locus sequence typing (MLST) databases. BMC Bioinformatics, 2004. 5: p. 86.

    17. Feil, E.J., et al., eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J Bacteriol, 2004. 186(5): p. 1518-30.

    18. Spratt, B.G., et al., Displaying the relatedness among isolates of bacterial species -- the eBURST approach. FEMS Microbiol Lett, 2004. 241(2): p. 129-34.

    19. Catlin, B.W., Transformation of Neisseria meningitidis by deoxyribonucleates from cells and from culture slime. J Bacteriol, 1960. 79: p. 579-90.

    20. Hanage, W.P., C. Fraser, and B.G. Spratt, Fuzzy species among recombinogenic bacteria. BMC Biol, 2005. 3(1): p. 6.

    21. Van der Ende, A., Spanjaard, L, Vandenbroucke-Grauls, C.M.J.E., Bacterial meningitis in the Netherlands; annual report 2003. 2003, Netherlands Reference Laboratory for Bacterial Meningitis (Academic Medical Center and the National Institute of Public Health and the Environment): Amsterdam.

    22. Achtman, M., Epidemic spread and antigenic variability of Neisseria meningitidis. Trends Microbiol, 1995. 3(5): p. 186-92.

    23. Hart, C.A. and L.E. Cuevas, Meningococcal disease in Africa. Ann Trop Med Parasitol, 1997. 91(7): p. 777-85.

    24. Bovre, K., et al., Neisseria meningitidis infections in Northern Norway: an epidemic in 1974-1975 due mainly to group B organisms. J Infect Dis, 1977. 135(4): p. 669-72.

    25. Martin, D.R., et al., New Zealand epidemic of meningococcal disease identified by a strain with phenotype B:4:P1.4. J Infect Dis, 1998. 177(2): p. 497-500.

    26. Sexton, K., et al., The New Zealand Meningococcal Vaccine Strategy: a tailor-made vaccine to combat a devastating epidemic. N Z Med J, 2004. 117(1200): p. U1015.

    27. Fine, A., Layton, M., Hakim, A., Smith, P., Serogroup W-135 meningococcal disease among travelers returning from Saudi Arabia--United States, 2000. MMWR Morb Mortal Wkly Rep, 2000. 49(16): p. 345-6.

    28. Decosas, J. and J.B. Koama, Chronicle of an outbreak foretold: meningococcal meningitis W135 in Burkina Faso. Lancet Infect Dis, 2002. 2(12): p. 763-5.

    29. Ahmad, K., Vaccination halts meningitis outbreak in Burkina Faso. Lancet, 2004. 363(9417): p. 1290.

    30. Van der Ende, A., Spanjaard, L, Vandenbroucke-Grauls, C.M.J.E., Bacterial meningitis in the Netherlands; annual report 2002. 2002, Netherlands Reference Laboratory for Bacterial Meningitis (Academic Medical Center and the National Institute of Public Health and the Environment): Amsterdam.

    31. de Greeff, S.C., et al., [The first effect of the national vaccination campaign against meningococcal-C disease: a rapid and sharp decrease in the number of patients]. Ned Tijdschr Geneeskd, 2003. 147(23): p. 1132-5.

    32. Maiden, M.C. and J.M. Stuart, Carriage of serogroup C meningococci 1 year after meningococcal C conjugate polysaccharide vaccination. Lancet, 2002. 359(9320): p. 1829-31.

    33. Finne, J., et al., An IgG monoclonal antibody to group B meningococci cross-reacts with developmentally regulated polysialic acid units of glycoproteins in neural and extraneural tissues. J Immunol, 1987. 138(12): p. 4402-7.

    34. Moe, G.R., S. Tan, and D.M. Granoff, Molecular mimetics of polysaccharide epitopes as vaccine candidates for prevention of Neisseria meningitidis serogroup B disease. FEMS Immunol Med Microbiol, 1999. 26(3-4): p. 209-26.

    35. Caugant, D.A., et al., Asymptomatic carriage of Neisseria meningitidis in a randomly sampled population. J Clin Microbiol, 1994. 32(2): p. 323-30.

    36. Hollis, D.G., G.L. Wiggins, and R.E. Weaver, Neisseria lactamicus sp. n., a lactose-fermenting species resembling Neisseria meningitidis. Appl Microbiol, 1969. 17(1): p. 71-7.

    37. Oliver, K.J., et al., Neisseria lactamica protects against experimental meningococcal infection. Infect Immun, 2002. 70(7): p. 3621-6.

    38. Cartwright, K.A., et al., The Stonehouse survey: nasopharyngeal carriage of meningococci and Neisseria lactamica. Epidemiol Infect, 1987. 99(3): p. 591-601.

    39. Braun, J.M., et al., Neisseria meningitidis, Neisseria lactamica and Moraxella catarrhalis share cross-reactive carbohydrate antigens. Vaccine, 2004. 22(7): p. 898-908.

    40. Gorringe, A., et al., The development of a meningococcal disease vaccine based on Neisseria lactamica outer membrane vesicles. Vaccine, 2005. 23(17-18): p. 2210-3.

    41. Bennett, J.S., et al., Genetic diversity and carriage dynamics of Neisseria lactamica in infants. Infect Immun, 2005. 73(4): p. 2424-32.

    42. DeVoe, I.W. and J.E. Gilchrist, Pili on meningococci from primary cultures of nasopharyngeal carriers and cerebrospinal fluid of patients with acute disease. J Exp Med, 1975. 141(2): p. 297-305.

    43. Virji, M., et al., The role of pili in the interactions of pathogenic Neisseria with cultured human endothelial cells. Mol Microbiol, 1991. 5(8): p. 1831-41.

    44. Virji, M., et al., Variations in the expression of pili: the effect on adherence of Neisseria meningitidis to human epithelial and endothelial cells. Mol Microbiol, 1992. 6(10): p. 1271-9.

    45. Nassif, X., et al., Antigenic variation of pilin regulates adhesion of Neisseria meningitidis to human epithelial cells. Mol Microbiol, 1993. 8(4): p. 719-25.

    46. Lomholt, H., et al., Molecular polymorphism and epidemiology of Neisseria meningitidis immunoglobulin A1 proteases. Proc Natl Acad Sci U S A, 1992. 89(6): p. 2120-4.

    47. Thompson, S.A., et al., Neisseria meningitidis produces iron-regulated proteins related to the RTX family of exoproteins. J Bacteriol, 1993. 175(3): p. 811-8.

    48. Thompson, S.A., L.L. Wang, and P.F. Sparling, Cloning and nucleotide sequence of frpC, a second gene from Neisseria meningitidis encoding a protein similar to RTX cytotoxins. Mol Microbiol, 1993. 9(1): p. 85-96.

    49. Thompson, S.A. and P.F. Sparling, The RTX cytotoxin-related FrpA protein of Neisseria meningitidis is secreted extracellularly by meningococci and by HlyBD+ Escherichia coli. Infect Immun, 1993. 61(7): p. 2906-11.

    50. Swartley, J.S., et al., Capsule switching of Neisseria meningitidis. Proc Natl Acad Sci U S A, 1997. 94(1): p. 271-6.

    51. de Vries, F.P., et al., Neisseria meningitidis producing the Opc adhesin binds epithelial cell proteoglycan receptors. Mol Microbiol, 1998. 27(6): p. 1203-12.

    52. Brandtzaeg, P., et al., Neisseria meningitidis lipopolysaccharides in human pathology. J Endotoxin Res, 2001. 7(6): p. 401-20.

    53. Yi, K., et al., Biofilm formation by Neisseria meningitidis. Infect Immun, 2004. 72(10): p. 6132-8.

    54. Yi, K., D.S. Stephens, and I. Stojiljkovic, Development and evaluation of an improved mouse model of meningococcal colonization. Infect Immun, 2003. 71(4): p. 1849-55.

    55. Gorringe, A.R., et al., Experimental disease models for the assessment of meningococcal vaccines. Vaccine, 2005. 23(17-18): p. 2214-7.

    56. Mackinnon, F.G., et al., Demonstration of lipooligosaccharide immunotype and capsule as virulence factors for Neisseria meningitidis using an infant mouse intranasal infection model. Microb Pathog, 1993. 15(5): p. 359-66.

    57. de Jonge, M.I., et al., Intranasal immunisation of mice with liposomes containing recombinant meningococcal OpaB and OpaJ proteins. Vaccine, 2004. 22(29-30): p. 4021-8.

    58. Johansson, L., et al., CD46 in meningococcal disease. Science, 2003. 301(5631): p. 373-5.

    59. Perrin, A., X. Nassif, and C. Tinsley, Identification of regions of the chromosome of Neisseria meningitidis and Neisseria gonorrhoeae which are specific to the pathogenic Neisseria species. Infect Immun, 1999. 67(11): p. 6119-29.

    60. Klee, S.R., et al., Molecular and biological analysis of eight genetic islands that distinguish Neisseria meningitidis from the closely related pathogen Neisseria gonorrhoeae. Infect Immun, 2000. 68(4): p. 2082-95.

    61. Tinsley, C.R. and X. Nassif, Analysis of the genetic differences between Neisseria meningitidis and Neisseria gonorrhoeae: two closely related bacteria expressing two different pathogenicities. Proc Natl Acad Sci U S A, 1996. 93(20): p. 11109-14.

    62. Stollenwerk, N., M.C. Maiden, and V.A. Jansen, Diversity in pathogenicity can cause outbreaks of meningococcal disease. Proc Natl Acad Sci U S A, 2004. 101(27): p. 10229-34.

    63. Hoang, L.M., et al., Rapid and fatal meningococcal disease due to a strain of Neisseria meningitidis containing the capsule null locus. Clin Infect Dis, 2005. 40(5): p. e38-42.

    64. Peeters, C.C., et al., Phase I clinical trial with a hexavalent PorA containing meningococcal outer membrane vesicle vaccine. Vaccine, 1996. 14(10): p. 1009-15.

    65. van der Ende, A., et al., Outbreak of meningococcal disease caused by PorA-deficient meningococci. J Infect Dis, 2003. 187(5): p. 869-71.

    66. Rosenstein, N.E. and B.A. Perkins, Update on Haemophilus influenzae serotype b and meningococcal vaccines. Pediatr Clin North Am, 2000. 47(2): p. 337-52, vi.

    67. Edwards, M.S. and C.J. Baker, Complications and sequelae of meningococcal infections in children. J Pediatr, 1981. 99(4): p. 540-5.

    68. Koomen, I., et al., Neuropsychology of academic and behavioural limitations in school-age survivors of bacterial meningitis. Dev Med Child Neurol, 2004. 46(11): p. 724-32.

    69. Koomen, I., et al., Parental perception of educational, behavioural and general health problems in school-age survivors of bacterial meningitis. Acta Paediatr, 2003. 92(2): p. 177-85.

    70. Koomen, I., et al., Hearing loss at school age in survivors of bacterial meningitis: assessment, incidence, and prediction. Pediatrics, 2003. 112(5): p. 1049-53.

    71. Stanwell-Smith, R.E., et al., Smoking, the environment and meningococcal disease: a case control study. Epidemiol Infect, 1994. 112(2): p. 315-28.

    72. Figueroa, J., J. Andreoni, and P. Densen, Complement deficiency states and meningococcal disease. Immunol Res, 1993. 12(3): p. 295-311.

    73. van Gils, E.J., et al., Increased attack rate of meningococcal disease in children with a pregnant mother. Pediatrics, 2005. 115(5): p. e590-3.

    74. Emonts, M., et al., Host genetic determinants of Neisseria meningitidis infections. Lancet Infect Dis, 2003. 3(9): p. 565-77.

    75. van der Ende, A., et al., Variable expression of class 1 outer membrane protein in Neisseria meningitidis is caused by variation in the spacing between the -10 and -35 regions of the promoter. J Bacteriol, 1995. 177(9): p. 2475-80.

    76. Alexander, H.L., A.R. Richardson, and I. Stojiljkovic, Natural transformation and phase variation modulation in Neisseria meningitidis. Mol Microbiol, 2004. 52(3): p. 771-83.

    77. Alexander, H.L., A.W. Rasmussen, and I. Stojiljkovic, Identification of Neisseria meningitidis genetic loci involved in the modulation of phase variation frequencies. Infect Immun, 2004. 72(11): p. 6743-7.

    78. Martin, P., et al., Involvement of genes of genome maintenance in the regulation of phase variation frequencies in Neisseria meningitidis. Microbiology, 2004. 150(Pt 9): p. 3001-12.

    79. Sarkari, J., et al., Variable expression of the Opc outer membrane protein in Neisseria meningitidis is caused by size variation of a promoter containing poly-cytidine. Mol Microbiol, 1994. 13(2): p. 207-17.

    80. Saunders, N.J., et al., Repeat-associated phase variable genes in the complete genome sequence of Neisseria meningitidis strain MC58. Mol Microbiol, 2000. 37(1): p. 207-15.

    81. Snyder, L.A., S.A. Butcher, and N.J. Saunders, Comparative whole-genome analyses reveal over 100 putative phase-variable genes in the pathogenic Neisseria spp. Microbiology, 2001. 147(Pt 8): p. 2321-32.

    82. Seifert, H.S., et al., DNA transformation leads to pilin antigenic variation in Neisseria gonorrhoeae. Nature, 1988. 336(6197): p. 392-5.

    83. Gibbs, C.P., et al., Reassortment of pilin genes in Neisseria gonorrhoeae occurs by two distinct mechanisms. Nature, 1989. 338(6217): p. 651-2.

    84. Fleischmann, R.D., et al., Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 1995. 269(5223): p. 496-512.

    85. Bernal, A., U. Ear, and N. Kyrpides, Genomes OnLine Database (GOLD): a monitor of genome projects world-wide. Nucleic Acids Res, 2001. 29(1): p. 126-7.

    86. Kyrpides, N.C., Genomes OnLine Database (GOLD 1.0): a monitor of complete and ongoing genome projects world-wide. Bioinformatics, 1999. 15(9): p. 773-4.

    87. Pizza, M., et al., Identification of vaccine candidates against serogroup B meningococcus by whole-genome sequencing. Science, 2000. 287(5459): p. 1816-20.

    88. Tettelin, H., et al., Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science, 2000. 287(5459): p. 1809-15.

    89. Holden, M.T., et al., Complete genomes of two clinical Staphylococcus aureus strains: evidence for the rapid evolution of virulence and drug resistance. Proc Natl Acad Sci U S A, 2004. 101(26): p. 9786-91.

    90. Altermann, E., et al., Complete genome sequence of the probiotic lactic acid bacterium Lactobacillus acidophilus NCFM. Proc Natl Acad Sci U S A, 2005. 102(11): p. 3906-12.

    91. Springael, D. and E.M. Top, Horizontal gene transfer and microbial adaptation to xenobiotics: new types of mobile genetic elements and lessons from ecological studies. Trends Microbiol, 2004. 12(2): p. 53-8.

    92. Futterer, O., et al., Genome sequence of Picrophilus torridus and its implications for life around pH 0. Proc Natl Acad Sci U S A, 2004. 101(24): p. 9091-6.

    93. Zhang, Y., D.E. Fomenko, and V.N. Gladyshev, The microbial selenoproteome of the Sargasso Sea. Genome Biol, 2005. 6(4): p. R37.

    94. Tyson, G.W., et al., Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature, 2004. 428(6978): p. 37-43.

    95. Venter, J.C., et al., Environmental genome shotgun sequencing of the Sargasso Sea. Science, 2004. 304(5667): p. 66-74.

    96. Hugenholtz, P., Exploring prokaryotic diversity in the genomic era. Genome Biol, 2002. 3(2): p. REVIEWS0003.

    97. Tringe, S.G., et al., Comparative metagenomics of microbial communities. Science, 2005. 308(5721): p. 554-7.

    98. Konstantinidis, K.T. and J.M. Tiedje, Genomic insights that advance the species definition for prokaryotes. Proc Natl Acad Sci U S A, 2005. 102(7): p. 2567-72.

    99. Coenye, T., et al., Towards a prokaryotic genomic taxonomy. FEMS Microbiol Rev, 2005. 29(2): p. 147-67.

    100. Coenye, T. and P. Vandamme, Extracting phylogenetic information from whole-genome sequencing projects: the lactic acid bacteria as a test case. Microbiology, 2003. 149(Pt 12): p. 3507-17.

    101. Kanehisa, M. and P. Bork, Bioinformatics in the post-sequence era. Nat Genet, 2003. 33 Suppl: p. 305-10.

    102. Salzberg, S.L., et al., Serendipitous discovery of Wolbachia genomes in multiple Drosophila species. Genome Biol, 2005. 6(3): p. R23.

    103. Ouzounis, C., Bioinformatics and the theoretical foundations of molecular biology. Bioinformatics, 2002. 18(3): p. 377-8.

    104. Price, M.N., et al., A novel method for accurate operon predictions in all sequenced prokaryotes. Nucleic Acids Res, 2005. 33(3): p. 880-92.

    105. Rutherford, K., et al., Artemis: sequence visualization and annotation. Bioinformatics, 2000. 16(10): p. 944-5.

    106. Leader, D.P., BugView: a browser for comparing genomes. Bioinformatics, 2004. 20(1): p. 129-30.

    107. Dong, X., et al., PlasMapper: a web server for drawing and auto-annotating plasmid maps. Nucleic Acids Res, 2004. 32(Web Server issue): p. W660-4.

    108. Snel, B., P. Bork, and M.A. Huynen, Genomes in flux: the evolution of archaeal and proteobacterial gene content. Genome Res, 2002. 12(1): p. 17-25.

    109. Snel, B., P. Bork, and M.A. Huynen, The identification of functional modules from the genomic association of genes. Proc Natl Acad Sci U S A, 2002. 99(9): p. 5890-5.

    110. Lerat, E., et al., Evolutionary origins of genomic repertoires in bacteria. PLoS Biol, 2005. 3(5): p. e130.

    111. Roberts, R.J., P. Karp, S. Kasif, S. Linn, and M.R. Buckley, An Experimental Approach to Genome Annotation. 2004, American Society for Microbiology: Washington DC.

    112. Kunin, V. and C.A. Ouzounis, The balance of driving forces during genome evolution in prokaryotes. Genome Res, 2003. 13(7): p. 1589-94.

    113. Cohan, F.M., What are bacterial species? Annu Rev Microbiol, 2002. 56: p. 457-87.

    114. Konstantinidis, K.T. and J.M. Tiedje, Trends between gene content and genome size in prokaryotic species with larger genomes. Proc Natl Acad Sci U S A, 2004. 101(9): p. 3160-5.

    115. Daubin, V. and H. Ochman, Bacterial genomes as new gene homes: the genealogy of ORFans in E. coli. Genome Res, 2004. 14(6): p. 1036-42.

    116. Daubin, V. and H. Ochman, Start-up entities in the origin of new genes. Curr Opin Genet Dev, 2004. 14(6): p. 616-9.

    117. Dobrindt, U., et al., Genomic islands in pathogenic and environmental microorganisms. Nat Rev Microbiol, 2004. 2(5): p. 414-24.

    118. Moran, N.A., Microbial minimalism: genome reduction in bacterial pathogens. Cell, 2002. 108(5): p. 583-6.

    119. Moran, N.A., Tracing the evolution of gene loss in obligate bacterial symbionts. Curr Opin Microbiol, 2003. 6(5): p. 512-8.

    120. Moran, N.A. and A. Mira, The process of genome shrinkage in the obligate symbiont Buchnera aphidicola. Genome Biol, 2001. 2(12): p. RESEARCH0054.

    121. Mira, A., H. Ochman, and N.A. Moran, Deletional bias and the evolution of bacterial genomes. Trends Genet, 2001. 17(10): p. 589-96.

    122. Cole, S.T., et al., Massive gene decay in the leprosy bacillus. Nature, 2001. 409(6823): p. 1007-11.

    123. Lerat, E. and H. Ochman, Recognizing the pseudogenes in bacterial genomes. Nucleic Acids Res, 2005. 33(10): p. 3125-32.

    124. Alm, R.A., et al., Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature, 1999. 397(6715): p. 176-80.

    125. Tillier, E.R. and R.A. Collins, Genome rearrangement by replication-directed translocation. Nat Genet, 2000. 26(2): p. 195-7.

    126. Eisen, J.A., et al., Evidence for symmetric chromosomal inversions around the replication origin in bacteria. Genome Biol, 2000. 1(6): p. RESEARCH0011.

    127. Leclercq, R., et al., Plasmid-mediated resistance to vancomycin and teicoplanin in Enterococcus faecium. N Engl J Med, 1988. 319(3): p. 157-61.

    128. Lawrence, J.G. and H. Ochman, Molecular archaeology of the Escherichia coli genome. Proc Natl Acad Sci U S A, 1998. 95(16): p. 9413-7.

    129. Woese, C.R. and G.E. Fox, Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A, 1977. 74(11): p. 5088-90.

    130. Gophna, U., W.F. Doolittle, and R.L. Charlebois, Weighted genome trees: refinements and applications. J Bacteriol, 2005. 187(4): p. 1305-16.

    131. Karlin, S., Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends Microbiol, 2001. 9(7): p. 335-43.

    132. Grantham, R., et al., Codon catalog usage and the genome hypothesis. Nucleic Acids Res, 1980. 8(1): p. r49-r62.

    133. Lawrence, J.G. and H. Ochman, Amelioration of bacterial genomes: rates of change and exchange. J Mol Evol, 1997. 44(4): p. 383-97.

    134. Zhang, R. and C.T. Zhang, A systematic method to identify genomic islands and its applications in analyzing the genomes of Corynebacterium glutamicum and Vibrio vulnificus CMCP6 chromosome I. Bioinformatics, 2004. 20(5): p. 612-22.

    135. Sandberg, R., et al., Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier. Genome Res, 2001. 11(8): p. 1404-9.

    136. Nakamura, Y., et al., Biased biological functions of horizontally transferred genes in prokaryotic genomes. Nat Genet, 2004. 36(7): p. 760-6.

    137. Linz, B., et al., Frequent interspecific genetic exchange between commensal Neisseriae and Neisseria meningitidis. Mol Microbiol, 2000. 36(5): p. 1049-58.

    138. Straus, D. and F.M. Ausubel, Genomic subtraction for cloning DNA corresponding to deletion mutations. Proc Natl Acad Sci U S A, 1990. 87(5): p. 1889-93.

    139. Lisitsyn, N., N. Lisitsyn, and M. Wigler, Cloning the differences between two complex genomes. Science, 1993. 259(5097): p. 946-51.

    140. Hsiao, W., et al., IslandPath: aiding detection of genomic islands in prokaryotes. Bioinformatics, 2003. 19(3): p. 418-20.

    141. Merkl, R., SIGI: score-based identification of genomic islands. BMC Bioinformatics, 2004. 5(1): p. 22.

    142. Dufraigne, C., et al., Detection and characterization of horizontal transfers in prokaryotes using genomic signature. Nucleic Acids Res, 2005. 33(1): p. e6.

    143. Tsirigos, A. and I. Rigoutsos, A new computational method for the detection of horizontal gene transfer events. Nucleic Acids Res, 2005. 33(3): p. 922-33.

    144. Lawrence, J.G. and H. Hendrickson, Lateral gene transfer: when will adolescence end? Mol Microbiol, 2003. 50(3): p. 739-49.

    145. Jain, R., M.C. Rivera, and J.A. Lake, Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci U S A, 1999. 96(7): p. 3801-6.

    146. Jeltsch, A., Maintenance of species identity and controlling speciation of bacteria: a new function for restriction/modification systems? Gene, 2003. 317(1-2): p. 13-6.

    147. Jeltsch, A. and A. Pingoud, Horizontal gene transfer contributes to the wide distribution and evolution of type II restriction-modification systems. J Mol Evol, 1996. 42(2): p. 91-6.

    Chapter 2

    An in vitro strategy for the selective isolation of

    anomalous DNA from prokaryotic genomes

    M. W. J. van Passel1, A. Bart1, R. J. A. Waaijer2, A. C. M. Luyf2, A. H. C. van

    Kampen2, A. van der Ende1,*

    1Department of Medical Microbiology, 2Bioinformatics Laboratory, Academic Medical

    Center, Amsterdam, the Netherlands

    Adapted from Nucleic Acids Research (2004), 32(14):e114

    Abstract

    In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be

    recognised, among other features, by atypical clustering of dinucleotides. We hypothesised

    that atypical clustering of hexameric endonuclease recognition sites in aDNA allows

    the specific isolation of anomalous sequences in vitro. Clustering of endonuclease

    recognition sites in aDNA regions of eight published prokaryotic genome sequences

    was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome,

    using four selected endonucleases, revealed that of 27 of the predicted small

    fragments (300 bp and

    Introduction

    Horizontal gene transfer (HGT) was already identified in 1944 by the same

    experiment that demonstrated the transformation of non-virulent to virulent

    Streptococcus pneumoniae [1]. The extent of HGT as an evolutionary phenomenon

    had not been addressed quantitatively on a genomic scale until Lawrence and Ochman

    calculated that approximately 18% of the genome of Escherichia coli MG1655 was

    horizontally transferred since its divergence from the Salmonella lineage 100 million

    years ago [2]. This identified HGT as a major factor in prokaryotic genome evolution.

    Recently, an extensive database of horizontally transferred genes based on complete

    bacterial and archaeal genomes has been made available [3].

    The rationale behind the computational identification of horizontally transferred

    DNA is the genome hypothesis, which proposes that for a given prokaryotic genus

    genomic DNA is relatively constant in codon usage and GC content [4, 5]. In contrast,

    horizontally acquired anomalous DNA differs in codon usage and/or GC composition

    from the recipient genome and can therefore be identified when substantial sequence

    information is available.

    An additional parameter in lateral genomics is based on oligonucleotide

    compositional extremes: the dinucleotide relative abundance values or genome

    signature ρ* [6, 7]. The genome signature is constant among members of a genus,

    but deviates substantially between members of different genera [8]. When used for

    intragenomic comparisons, ρ* makes an excellent parameter for the identification of

    anomalous DNA regions. Aberrant dinucleotide frequencies in aDNA are then

    expressed as the genome dissimilarity δ*, being the average dinucleotide relative

    abundance difference between the aDNA region and the whole genome [6-8].

    Although the genome signature is capable of identifying clusters of alien genes and

    acquired pathogenicity associated islands (PAI) with an atypical nucleotide

    composition, highly expressed regions such as ribosomal clusters can also display

    aberrant dinucleotide frequencies [8, 9].
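
    As a rough illustration of the two quantities described above, the following Python sketch computes ρ* and δ* from plain sequence strings. It assumes the usual definitions from Karlin and colleagues, which the text cites but does not spell out here: ρ*XY = f*XY / (f*X f*Y), with all frequencies taken over the sequence together with its reverse complement, and δ* as the mean absolute difference in ρ* over the 16 dinucleotides. The function names are illustrative and are not taken from any software used in this study.

```python
from collections import Counter
from itertools import product

ALPHABET = set("ACGT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (non-ACGT symbols are left as-is)."""
    return seq.translate(COMPLEMENT)[::-1]


def dinucleotide_signature(seq):
    """Return rho*_XY for all 16 dinucleotides.

    Counts are pooled over the sequence and its reverse complement,
    so the result is strand-symmetric. Dinucleotides containing
    ambiguous symbols (for example N) are skipped.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    mono, di = Counter(), Counter()
    for strand in (seq, reverse_complement(seq)):
        mono.update(base for base in strand if base in ALPHABET)
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if set(pair) <= ALPHABET:
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence contains no usable A/C/G/T dinucleotides")
    rho = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n_mono, mono[y] / n_mono
        fxy = di[x + y] / n_di
        # Assumption of this sketch: rho* is set to 0 when a base is absent.
        rho[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return rho


def genome_dissimilarity(fragment, genome):
    """delta*: mean absolute rho* difference between fragment and genome."""
    rho_f = dinucleotide_signature(fragment)
    rho_g = dinucleotide_signature(genome)
    return sum(abs(rho_f[k] - rho_g[k]) for k in rho_f) / 16
```

    A fragment compared with itself gives δ* = 0, and larger values indicate a dinucleotide composition that deviates more strongly from the genomic background. Very short fragments give noisy estimates, which is one reason to restrict such comparisons to fragments above a minimum length.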

    Still, to our knowledge, no method exists that uses (one of) these parameters

    and enables the selective isolation of anomalous DNA sequences from a microbial

    genome in vitro. In order to develop such a technique we investigated a special

    group of oligonucleotide composition extremes: the local overrepresentation in a

    genome of palindromic hexanucleotide sequences, specifically restriction

    endonuclease recognition sites, in aDNA regions. Like the genomic dinucleotide and

    tetranucleotide frequencies [10, 11], frequencies of restriction sites vary between the

    genomes of different microbial species [12]. Avoidance of cognate recognition

    sequences is probably the operating mechanism [13, 14]. An HGT event between

    different organisms may introduce clusters of certain restriction sites in the recipient’s

    genome. Therefore, digestion of the chromosomal DNA with such a restriction

    endonuclease can produce a limited number of small restriction fragments,

    comprising potential anomalous DNA, which can be selectively amplified by adaptor-

    linked PCR (ALP [15]). The resulting amplicons can subsequently be subcloned and

    identified by sequence analysis.
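
    The digestion step described above can be sketched in silico as follows. This is a minimal Python example under stated assumptions, not the TIGR tool used in this study: the recognition sequences are the standard ones for the four enzymes named in the Methods (ScaI, NheI, SpeI and BglII), the sequence is treated as linear, and fragments are reported by the start positions of consecutive sites. Because an enzyme always cuts at the same offset within its site, the distance between site starts equals the fragment length.

```python
import re

# Standard recognition sequences of the four enzymes used in this chapter.
SITES = {
    "ScaI": "AGTACT",
    "NheI": "GCTAGC",
    "SpeI": "ACTAGT",
    "BglII": "AGATCT",
}


def site_positions(genome, site):
    """0-based start positions of every occurrence of a recognition site."""
    # The lookahead lets overlapping occurrences be reported as well.
    return [m.start() for m in re.finditer(f"(?={site})", genome.upper())]


def small_fragments(genome, site, min_len=300, max_len=5000):
    """(start, end) pairs of fragments between consecutive sites.

    Only fragments with min_len <= length < max_len are kept.
    Simplifications of this sketch: the sequence is linear (no
    wrap-around for circular chromosomes) and the two terminal
    fragments are ignored.
    """
    pos = site_positions(genome, site)
    return [(a, b) for a, b in zip(pos, pos[1:]) if min_len <= b - a < max_len]
```

    For example, `small_fragments(genome, SITES["BglII"])` would list the candidate BglII fragments in the 300 bp to 5 kb range; the size limits are parameters so that other cut-offs can be tried.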

    Clustering of restriction endonuclease recognition sites in diverse aDNA

    regions in prokaryotic genomes was illustrated by the in silico assessment of seven

    genome sequences of five different species. The restriction enzymes whose

    hexameric recognition sites are underrepresented were identified for each genome,

    and restriction fragments between clustered sites, being smaller than 5 kb, were

    analysed for nucleotide composition concerning GC percentage and genomic

    dissimilarity.
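
    One simple way to judge whether a hexameric recognition site is underrepresented in a genome is to compare its observed count with the count expected from the base composition alone. The Python sketch below does this under a zero-order model (independent bases). This model is an assumption of the example; the frequency tables actually used in this study were obtained from an external resource and may rest on a different expectation.

```python
def observed_expected_ratio(genome, site):
    """Ratio of observed to expected occurrences of a recognition site.

    The expectation assumes independent bases with the genome's own
    A/C/G/T frequencies (a zero-order model, chosen for simplicity).
    Ratios well below 1 suggest that the site is underrepresented.
    """
    genome = genome.upper()
    site = site.upper()
    n_windows = len(genome) - len(site) + 1
    if n_windows < 1:
        raise ValueError("genome is shorter than the recognition site")
    freqs = {base: genome.count(base) / len(genome) for base in "ACGT"}
    expected = float(n_windows)
    for base in site:
        expected *= freqs[base]
    observed = sum(
        1 for i in range(n_windows) if genome.startswith(site, i)
    )
    return observed / expected if expected else float("nan")
```

    Only one strand is scanned, which is sufficient for palindromic sites because they read the same on both strands.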

    Next, the restriction fragments of N. meningitidis MC58 between 300 bp and 5

    kb were analysed in silico for both GC content and genome signature compared to

    the genomic values. Also, the restriction fragments obtained with the selected

    restriction endonucleases from N. meningitidis MC58 and Z2491 strains were

    compared.

    Finally, in order to demonstrate the applicability of this technique in vitro,

    adaptor-linked PCR was performed on chromosomal DNA from strain MC58 digested

    by each of the selected restriction endonucleases. The resulting amplicons were

    sequenced to verify the predicted sequence composition.

    Materials and Methods

    Bacterial strain and growth conditions

    N. meningitidis MC58 is a serogroup B:15:P1.7,16 strain isolated from a case

    of invasive infection in the UK [16]. This wild-type MC58 strain lacks the erythromycin

    resistance cassette insertion in the capsule gene locus in contrast to the sequenced

    strain MC58 [17]. Neisseriae were grown on heated blood (chocolate) agar plates or

    in liquid Tryptic Soy Broth (DIFCO) medium at 37°C in a humidified atmosphere of

    5% CO2.

    Chromosomal DNA preparation and digestion

    Chromosomal DNA was isolated with the Puregene DNA isolation kit (Biozym).

    Restriction digests and subsequent heat inactivation were carried out according to

    the manufacturer’s instructions (Roche).

    Adaptor-linked PCR and DNA sequencing

    Adaptor-linked PCR was performed as described in [18]. The adaptor and

    linker sets were MP19 (5’-ACG TCG ACT ATC CAT GAA CAG ATC-3’) and MP23 (5’-

    GAT CTG TTC ATG-3’) for the ScaI-digested genomic template, MP24 (5’-ACC GAC

    GTC GAC TAT CCA TGA ACA-3’) and MP20 (5’-CTA GTG TTC ATG-3’) for both

    the NheI- and SpeI-digested chromosomal DNA and MP24 and MP23 for the BglII-

    digested genomic template. PCR amplicons were purified by agarose gel extraction

    (Qiagen) and subcloned into a pCR2.1 vector (Invitrogen) according to the

    manufacturer’s instructions. Escherichia coli DH5α was transformed by standard heat

    shock procedure. The constructed plasmids were isolated with the Wizard Kit

    (Promega). Inserts were sequenced using standard M13 primers or primer walking

    on vector or genomic DNA according to the manufacturer’s instructions (ABI).

    Sequences were analyzed using the Staden Package

    (http://www.mrc-lmb.cam.ac.uk/pubseq/).

    Software

    The restriction site frequency tables from the various genomes were obtained

    from http://tools.neb.com/~posfai/FINISHED. The in silico digestions of the various

    sequenced genomes (for accession numbers see Table 1) were performed using the

    Restriction Digest tool from The Institute for Genomic Research (TIGR)

    (http://www.tigr.org). In silico retrieval and identification of the restriction fragments

    was performed with the Position Search/Segment Retrieval tool from TIGR

    (http://www.tigr.org). The different genomes of N. meningitidis were compared using

    the Artemis Comparison Tool (ACT) (http://www.sanger.ac.uk).

    Data analysis

    Fragments were designated anomalous in GC composition if the GC content

    of the fragment was below the 5th or above the 95th percentile of the genomic GC

    content distribution, calculated with a window and step size identical to the fragment

    length (http://www.tigr.org).
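
    The GC criterion above can be sketched in Python as follows. The example assumes non-overlapping windows (window and step size both equal to the fragment length, as stated), a linear sequence, and the "inclusive" interpolation method of `statistics.quantiles` for the percentiles; the percentile method of the original tool is not specified, so that choice is an assumption.

```python
import statistics


def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / total if total else 0.0


def is_gc_anomalous(fragment, genome):
    """True if the fragment's GC content falls outside the 5th to 95th
    percentile range of genome windows of the same length."""
    width = len(fragment)
    if width == 0:
        raise ValueError("fragment must not be empty")
    windows = [
        gc_fraction(genome[i:i + width])
        for i in range(0, len(genome) - width + 1, width)
    ]
    if len(windows) < 2:
        raise ValueError("genome too short for this window size")
    # n=20 yields 19 cut points; the first is the 5th percentile and
    # the last is the 95th percentile.
    cuts = statistics.quantiles(windows, n=20, method="inclusive")
    low, high = cuts[0], cuts[-1]
    gc = gc_fraction(fragment)
    return gc < low or gc > high
```

    Tying the window size to the fragment length matters because short windows show a wider spread of GC values than long ones, so a fixed window would misjudge fragments of other sizes.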

    The δ* value for each restriction fragment was calculated as described earlier

    by Karlin and colleagues [7]. In brief, the dinucleotide r