Genome-centric resolution of microbial diversity, metabolism and interactions
in anaerobic digestion
Running title: Genome-centric resolution through deep metagenomics
Inka Vanwonterghem1,2, Paul D Jensen1, Korneel Rabaey1,3 and Gene W Tyson1,2*
1Advanced Water Management Centre (AWMC), The University of Queensland, St Lucia, QLD
4072, Australia; 2Australian Centre for Ecogenomics (ACE), School of Chemistry and Molecular
Biosciences, The University of Queensland, St Lucia, QLD 4072, Australia; 3Laboratory for
Microbial Ecology and Technology (LabMET), Ghent University, Coupure Links 653, 9000 Ghent,
Belgium
*Corresponding author: Prof. Gene W. Tyson. Mailing address: Australian Centre for
Ecogenomics (ACE), School of Chemistry and Molecular Biosciences, The University of
Queensland, St Lucia, QLD 4072, Australia. Phone: +617 3365 3829 Fax: +617 336 54511 Email:
Keywords: metagenomics / genome-centric / functional redundancy / metabolic network / novel
diversity / anaerobic digestion
This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process which may lead to differences between this version and the Version of Record. Please cite this article as an ‘Accepted Article’, doi: 10.1111/1462-2920.13382
This article is protected by copyright. All rights reserved.
Abstract
Our understanding of the complex interconnected processes performed by microbial communities is
hindered by our inability to culture the vast majority of microorganisms. Metagenomics provides a
way to bypass this cultivation bottleneck and recent advances in this field now allow us to recover a
growing number of genomes representing previously uncultured populations from increasingly
complex environments. In this study, a temporal genome-centric metagenomic analysis was
performed of lab-scale anaerobic digesters that host complex microbial communities fulfilling a
series of interlinked metabolic processes to enable the conversion of cellulose to methane. In total,
101 population genomes that were moderate to near-complete were recovered based primarily on
differential coverage binning. These populations span 19 phyla, represent mostly novel species and
expand the genomic coverage of several rare phyla. Classification into functional guilds based on
their metabolic potential revealed metabolic networks with a high level of functional redundancy as
well as niche specialization, and allowed us to identify potential roles such as hydrolytic specialists
for several rare, uncultured populations. Genome-centric analyses of complex microbial
communities across diverse environments provide the key to understanding the phylogenetic and
metabolic diversity of these interactive communities.
Introduction
Microorganisms are ubiquitous in the environment and play key roles in global biogeochemical
cycles. As the majority of microbial life has eluded cultivation in the laboratory, culture-
independent techniques have been developed to study their diversity and functions (Tringe and
Rubin, 2005; Albertsen et al., 2013; Vanwonterghem et al., 2014a). Metagenomics, the sequencing
of bulk DNA extracted directly from environmental samples, provides direct access to the
metabolic potential of a microbial community. Advances in sequence throughput, read length and
quality, and bioinformatics tools have contributed to a more widespread application of
metagenomics to study natural and engineered systems.
Early metagenomic studies relied largely on gene-centric analyses (Venter et al., 2004; Tringe et al.,
2005) with the recovery of individual genomes limited to environments dominated by few distinct
populations (Tyson et al., 2004). These gene-centric approaches are biased towards existing
databases, thereby overlooking a significant fraction of the novel diversity (Jaenicke et al., 2010;
Wong et al., 2013). In addition, as only an overview of the metabolic potential of the community is
provided without assigning functions to individual populations, important metabolic interactions
may remain undetected. The development of new improved sequencing technologies and
population genome binning algorithms (Wrighton et al., 2012; Albertsen et al., 2013; Imelfort et al.,
2014) has allowed us to move beyond gene-centric approaches and recover population genomes
from increasingly complex environments. This has led to the discovery of novel lineages (Brown et
al., 2015; Castelle et al., 2015), and insight into the metabolic processes (Raghoebarsing et al.,
2006; Haroon et al., 2013) and microbial interactions (Wrighton et al., 2014; Baker et al., 2015)
taking place in these environments.
Engineered systems offer a controlled environment in which to study complex microbial
communities, test hypotheses and explore the efficacy of new metagenomic approaches. Anaerobic
digestion provides an interesting study environment as it consists of a series of metabolic processes
carried out by a consortium of interdependent microorganisms. This process is a critical component
of the global carbon cycle as well as industrially relevant as a waste management strategy and for
the production of bioenergy (Amani et al., 2010). Due to the complexity of the communities
involved, anaerobic digesters (ADs) remain genomically underexplored and most metagenomic
studies have relied on gene-centric approaches (Jaenicke et al., 2010; Hanreich et al., 2013; Wong
et al., 2013; Solli et al., 2014; Stolze et al., 2015). The recovery of population genomes from
various engineered systems has provided genomic insight into candidate phyla such as TM7
(Albertsen et al., 2013) and KSB3 (Sekiguchi et al., 2015), which is responsible for filamentous
bulking in anaerobic wastewater treatment, and microbial interactions such as synergistic networks
within terephthalate-degrading bioreactors (Nobu et al., 2014). Genome-centric approaches can thus
provide a powerful means to understanding the phylogenetic and metabolic diversity in anaerobic
digestion.
Here, a detailed genome-centric exploration of complex microbial communities in ADs was
performed to reconstruct the metabolic network by gaining access to the functional potential of
individual populations involved in the conversion of cellulose to methane. ADs were operated in
triplicate for a year and supplied with cellulose. Metagenomic sequencing was performed on
samples taken at two time points (spanning ~8 months), characterized by differences in
performance. Co-assembly of the six generated metagenomes followed by differential coverage-
based binning resulted in the recovery of 101 population genomes that constitute the majority of the
community. These genomes represent 19 phyla and expand the genomic diversity of several
lineages with few sequenced representatives. The metabolic reconstruction of individual
populations combined with their relative abundance estimates allowed us to study ecological
theories through the identification of a high level of functional redundancy, and construct an
interaction network for the flow of carbon through the community. These results demonstrate the
importance of genome-centric analyses when studying complex communities that harbor novel
diversity, and provide the foundation for further hypothesis-driven experiments.
Results
Metagenomic sequencing and assembly
The phylogenetic and metabolic diversity of microbial communities involved in anaerobic digestion
was studied using a genome-centric metagenomic approach. Three lab-scale ADs (designated AD1,
AD2 and AD3) were used as controlled systems in which to study the community dynamics and
reconstruct the metabolic network. The ADs were inoculated with a mixture of eight samples taken
from anaerobic environmental and engineered systems (Table S1). They were operated for 362 days
and supplied with cellulose as the sole carbon and energy source. Samples for metagenomic
sequencing were collected from the reactors at two time points (T1: Day 96; T2: Day 362) based on
differences in the structure and performance of the microbial communities, which are summarized
in Fig. S1 and Table S2, and have been described in detail previously (Vanwonterghem et al.,
2014b). Briefly, cellulose hydrolysis was stable at both time points at an average efficiency of 86 ±
4%. Accumulation of predominantly acetate and propionate was observed at T1, with highest
volatile fatty acid (VFA) concentrations measured for AD1 which correlated with lower methane
production. At T2, VFAs were efficiently converted to methane and only minor differences were
observed between the reactors. The six metagenomes (111 Gb total raw reads) from the triplicate
ADs at these two time points were co-assembled, generating 494,042 contigs with a combined
length of 908 Mb (Table S3). On average, >85% of the metagenomic reads from each dataset
mapped onto the contigs (>500 bp) from the combined assembly (Table S4).
Microbial community composition and population genome recovery
The community composition was determined by extracting the 16S rRNA gene sequences from the
metagenomes (Fig. 1) and compared to previously reported amplicon-based community profiles
(Fig. S1) (Vanwonterghem et al., 2014b). The most abundant populations belonged to the phyla
Euryarchaeota, Actinobacteria, Bacteroidetes, Fibrobacteres, Firmicutes, Spirochaetes and
Verrucomicrobia, which are commonly found in ADs (Jaenicke et al., 2010). The microbial
communities were highly similar to one another, but shifted in structure over time leading to
significantly different communities at the two time points (P < 0.001). Several differences could be
observed between the metagenome- and amplicon-based community profiles. Interestingly, a
Cellulomonas population was detected at 3-7% relative abundance in the metagenomes, but went
undetected in the amplicon dataset (Fig. S1) because of a mismatch with the forward primer (926F)
used for amplicon sequencing (Fig. S2). Conversely, the abundance of methanogens was
overestimated in the amplicon dataset compared to the metagenome dataset, which is likely due to
PCR primer and amplification biases. Amplicon-based studies using the 454 sequencing platform
also suffer from lower taxonomic resolution compared to metagenomics and may underestimate the
community diversity and dynamics. For example, two Fibrobacter populations were detected in the
metagenome dataset, each dominant at a different time point, yet were grouped together as one
phylotype in the amplicon dataset (Fig. 1 and Fig. S1). A similar observation was made for the
dominant Methanosaeta populations; such grouping influences our perception of the microbial
community dynamics.
Population genome binning of the co-assembled metagenomes enabled the recovery of 93 bacterial
and 8 archaeal population genomes with ≥50% completeness and ≤10% contamination (Table 1 and
Table S5). Of these genomes, 58 were substantially to near complete (≥80%) with low to medium
contamination, according to the CheckM classification (Table 1) (Parks et al., 2015). The 101
genomes ranged in size between 1.4 and 6.3 Mb, across a GC content range between 29 and 74%
(Table 1 and Table S5). They represent the majority of the community (62 ± 3% and 79 ± 4% at T1
and T2, respectively; based on percentage of reads mapping), with 58% representing relatively high
abundance populations (>0.5% in at least one of the samples) and the remaining 42% representing
low abundance populations (down to 0.09% maximum relative abundance in at least one of the
samples) (Table 1 and Table S5). In addition to recovering genomes for all the abundant populations
identified in the 16S rRNA gene profiles (Fig. 1, Fig. 2 and Fig. S3), a large number of low
abundance population genomes were recovered which highlights the strength of the binning
approach used in this study. The populations were phylogenetically diverse and belong to 19
different phyla (Fig. 2). Many of these genomes represent novel orders, families and/or genera, and
they significantly expand the genomic representation of phyla with relatively few sequenced
genomes such as Fibrobacteres (Rahman et al., 2015), Verrucomicrobia, Planctomycetes and
Candidate division WWE1 (Fig. 2).
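
The genome selection criteria described above can be illustrated with a minimal Python sketch. It assumes each bin is summarized by a completeness and contamination estimate (such as those reported by CheckM) and by its maximum relative abundance across samples. The `Bin` class, the function names and all bin values are hypothetical and are not taken from Table 1 or Table S5; only the thresholds (≥50% completeness, ≤10% contamination, ≥80% for substantially to near complete, >0.5% for high abundance) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    name: str
    completeness: float    # percent
    contamination: float   # percent
    max_abundance: float   # highest relative abundance (%) across samples


def passes_quality(b: Bin) -> bool:
    """Thresholds stated in the text: >=50% complete and <=10% contaminated."""
    return b.completeness >= 50.0 and b.contamination <= 10.0


def abundance_class(b: Bin) -> str:
    """'high' if >0.5% relative abundance in at least one sample, else 'low'."""
    return "high" if b.max_abundance > 0.5 else "low"


# Hypothetical bins, for illustration only.
bins = [
    Bin("bin_A", 95.2, 1.1, 6.3),
    Bin("bin_B", 62.0, 4.8, 0.2),
    Bin("bin_C", 48.9, 2.0, 1.5),   # rejected: completeness below 50%
    Bin("bin_D", 88.0, 12.5, 0.9),  # rejected: contamination above 10%
]

kept = [b for b in bins if passes_quality(b)]
near_complete = [b.name for b in kept if b.completeness >= 80.0]
classes = {b.name: abundance_class(b) for b in kept}

print([b.name for b in kept], near_complete, classes)
```

In this toy input, two of the four bins pass the quality filter, one of which also meets the ≥80% completeness cut-off.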
Classification into functional guilds based on metabolic potential
The metabolic potential of the microbial communities in these reactors was determined in order to
classify individual populations into functional guilds fulfilling the different steps in anaerobic
digestion (hydrolysis, fermentation, syntrophic oxidation and methanogenesis). Based on the
potential substrate utilization for the dominant populations and their relative abundance, the flow of
carbon from cellulose to methane in each community could be inferred, leading to the construction
of a metabolic network.
Hydrolysis. Firstly, a gene-centric approach was applied to examine the hydrolytic capacity of the
AD communities over time and relative to other environments. Glycoside hydrolase (GH) profiles
were generated for each individually assembled metagenome by calculating the total number of
enzymes within each GH family. Comparative analysis of these GH profiles showed no significant
differences between reactors and time points (P<0.05). The AD metagenomes were enriched in
genes belonging to GH5 (5.3 ± 0.4% of total GH) and GH9 (1.6 ± 0.6%), but also showed high
levels of other GH families, including GH2 (4.2 ± 0.3%), GH3 (3 ± 0.4%), GH31 (2.3 ± 0.2%),
GH43 (4.2 ± 0.7%), GH94 (2.0 ± 0.2%), GH78 (3.3 ± 0.3%), GH13 (4.9 ± 0.5%) and GH23 (3.1 ±
0.5%) (Table S6). Enzymes belonging to these GH families are predominantly involved in the
hydrolysis of cellulose, oligosaccharides, sugar side chains, amylose/maltose and peptidoglycan. A
comparison was made between the GH profiles of the ADs and those reported for soil ecosystems
(Tveit et al., 2013), switchgrass compost, termite hindgut and rumen (Allgaier et al., 2010) (Table
S7 and Fig. S4). Principal component analysis showed distinct clustering of the cellulose-degrading
reactor samples together with the wood-feeding termite hindgut community (Allgaier et al., 2010),
which were all enriched for cellulases predominantly belonging to GH5, reflecting the cellulosic
substrate. The soil environments clustered together despite differences in plant cover (moss versus
vascular plants), while the rumen sample was most different and showed a high abundance of
oligosaccharide degrading enzymes belonging to GH2, GH3 and GH51 (Table S7 and Fig. S4),
which is likely driven by the dominant grass hemicellulose found in this environment.
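
The GH profiles above express each family as a percentage of all GH genes detected in a metagenome. A minimal Python sketch of that calculation is given below, assuming the annotation step yields one GH family label per predicted enzyme. The function name `gh_profile` and the input list are hypothetical and do not reproduce the values in Table S6.

```python
from collections import Counter


def gh_profile(family_hits):
    """Return each GH family's share (%) of all GH genes in one metagenome."""
    counts = Counter(family_hits)
    total = sum(counts.values())
    return {family: 100.0 * n / total for family, n in counts.items()}


# Hypothetical annotation output: one GH family label per predicted enzyme.
hits = ["GH5"] * 5 + ["GH9"] * 2 + ["GH13"] * 3 + ["GH2"] * 10

profile = gh_profile(hits)
for family, share in sorted(profile.items()):
    print(f"{family}: {share:.1f}% of total GH")
```

Profiles expressed this way are comparable between metagenomes of different sequencing depth, which is what a cross-environment comparison such as the one above requires.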
Cellulose hydrolyzers were identified in the ADs by generating GH profiles for the individual
population genomes and correlating known activities for GH families with gene annotations to
determine the substrate profile (Fig. 3). The potential to degrade cellulose was a common feature
and present in 65% of the bacterial populations, including phyla commonly associated with
cellulose hydrolysis such as Fibrobacteres (Fibro_01-03), Firmicutes (Firm_03-06, Firm_10-11,
Firm_13-14 and Firm_16), Bacteroidetes (Bact_02-03, Bact_08-11, Bact_13 and Bact_24),
Spirochaetes (Spiro_07-10 and Spiro_12), and Actinobacteria (Actino_01-02) (Fig. 3, Fig. 4 and
Fig. 5) (Lynd et al., 2002; Bayer et al., 2008; Bekele et al., 2011; Suen et al., 2011; Naas et al.,
2014). A range of GH enzymes were also detected in the two Verrucomicrobia populations
(Verruco_01-02) (Fig. 3), and it has previously been speculated that certain populations within this
phylogenetically heterogeneous group can make a substantial contribution to polysaccharide
hydrolysis, even when present at low abundance (Martinez-Garcia et al., 2012). Similar to prior
studies, one of the Lentisphaerae genomes (Lenti_02) (Fig. 3) encoded a high abundance and
variety of GH enzymes (Kaoutari et al., 2013). However, only a very limited number of GH
enzymes were detected in the second Lentisphaerae population (Lenti_01), indicating that
polysaccharide hydrolysis is not a representative feature of the whole phylum. Although the
genome completeness of Lenti_01 is lower than that of Lenti_02, it is unlikely that this large
difference in GH abundance and diversity can be explained by the missing fraction of the genome. The largest
number of GH enzymes was observed for a Planctomycetes population (Planc_01) (Fig. 3), which
expands our understanding of the metabolic role of Phycisphaerae since only a limited number of
genomes within this class have been sequenced thus far, and this agrees with the recent finding of a
broad range of GH enzymes within Planctomycetes genomes recovered from estuary sediment
(Baker et al., 2015). The discovery of hydrolytic potential within novel species highlights the
importance of genome-centric approaches as these organisms play a crucial role in carbon cycling.
Microorganisms that could use cellobiose but not cellulose were identified in the reactors among
Proteobacteria (Alpha_01, Beta_02, Delta_01 and Epsilon_01), Bacteroidetes (Bact_22-23),
Spirochaetes (Spiro_02-03) and Synergistetes (Syner_01). By assigning functions to individual
populations, discrepancies could be observed between cellobiose opportunists and cellulose
degraders. In contrast to previous studies that reported a minimum ratio of 2:1 for these functional
groups (cellobiose:cellulose) (Berlemont and Martiny, 2013; Wrighton et al., 2014), the number of
cellobiose opportunists in the ADs was lower than cellulose degraders. When taking the relative
abundance into account it could be shown that this ratio was dynamic and became more even over
time (1:7 at T1, 1:3 at T2 of cellobiose:cellulose).
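
The difference between a ratio based on the number of populations and one that accounts for relative abundance can be made concrete with a short Python sketch. The function `guild_ratio` and the community below are hypothetical; the abundances were chosen only so that the weighted ratio works out to 1:7, and they are not measurements from this study.

```python
def guild_ratio(populations, weighted=False):
    """Ratio of cellobiose opportunists to cellulose degraders.

    populations: list of (guild, relative_abundance_percent) tuples, where
    guild is "cellobiose" or "cellulose". With weighted=False every
    population counts once; with weighted=True it counts by its abundance.
    """
    totals = {"cellobiose": 0.0, "cellulose": 0.0}
    for guild, abundance in populations:
        totals[guild] += abundance if weighted else 1.0
    return totals["cellobiose"] / totals["cellulose"]


# Hypothetical community: three cellulose degraders, two opportunists.
community = [
    ("cellulose", 12.0),
    ("cellulose", 6.0),
    ("cellulose", 3.0),
    ("cellobiose", 2.0),
    ("cellobiose", 1.0),
]

count_ratio = guild_ratio(community)                    # 2 populations : 3
weighted_ratio = guild_ratio(community, weighted=True)  # 3% : 21% = 1 : 7

print(f"by count: {count_ratio:.2f}, by abundance: {weighted_ratio:.3f}")
```

The same set of populations can therefore give quite different ratios depending on whether abundance is considered, which is why the weighted ratio can change over time even if guild membership does not.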
The GH profile for each genome was normalized by its relative abundance at each time point (Fig.
S5 and Fig. S6) and this showed a clear shift in the abundant cellulose degraders over time, i.e.
from Bacteroidetes (Bact_02-03) and Ruminococcus (Firm_04-06) populations at T1 (Fig. 4 and
Fig. S5), to Cellulomonas (Actino_01), Fibrobacter (Fibro_03) and Clostridiales (Firm_11)
populations at T2 (Fig. 5 and Fig. S6). Several Spirochaetes (Spiro_07-10 and Spiro_12) and
Verrucomicrobia (Verruco_01-02) were initially present at lower abundance (maximum 1.3%) but
increased over time (maximum 6.1%). Most of the dominant cellulolytic populations possessed a
plurality of genes with cellulase and cellobiosidase activity (Fig. 3), and it has been hypothesized
that higher GH diversity and copy number results in improved cellulose degrading ability
(Berlemont and Martiny, 2013).
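
As an illustration of the abundance normalization described above, the following Python sketch weights each genome's GH family counts by its relative abundance at a given time point. The exact normalization used in the study is not specified here, so multiplying counts by fractional abundance is an assumption, and the genome names, counts and abundances are hypothetical.

```python
def normalize_by_abundance(gh_counts, abundance_pct):
    """Weight one genome's GH family counts by its relative abundance (%).

    ASSUMPTION: weighted value = gene count x fractional abundance.
    """
    return {fam: n * abundance_pct / 100.0 for fam, n in gh_counts.items()}


# Hypothetical GH family counts per genome.
genomes = {
    "genome_X": {"GH5": 8, "GH9": 4},
    "genome_Y": {"GH5": 2, "GH94": 6},
}

# Hypothetical relative abundances (%) at two time points.
abundance = {
    "T1": {"genome_X": 10.0, "genome_Y": 1.0},
    "T2": {"genome_X": 0.5, "genome_Y": 5.0},
}

weighted = {
    t: {g: normalize_by_abundance(genomes[g], pct) for g, pct in ab.items()}
    for t, ab in abundance.items()
}


def top_contributor(timepoint, family):
    """Genome contributing most to a GH family at one time point."""
    scores = {g: prof.get(family, 0.0) for g, prof in weighted[timepoint].items()}
    return max(scores, key=scores.get)


print(top_contributor("T1", "GH5"), top_contributor("T2", "GH5"))
```

In this toy example the dominant GH5 contributor changes between time points even though the gene content of each genome is fixed, mirroring the kind of shift in abundant cellulose degraders reported above.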
The presence of multiple high abundance cellulose degraders at the same time within a community
may suggest there is a level of niche specialization. For example, a positive correlation in relative
abundance was observed between Fibro_03 and Firm_11 (Table 1 and Fig. S7). These populations
may utilize different strategies for attachment to cellulose particles since fibro-slime proteins (fsu)
and pili (pil) were identified in Fibro_03, similar to Fibrobacter succinogenes (Suen et al., 2011),
while dockerin and cohesin modules were detected in Firm_11, suggesting the presence of an
organized cellulosome apparatus similar to Clostridium thermocellum (Lynd et al., 2002; Bayer et
al., 2008). Their substrate specificity may also vary as multiple endoglucanases (GH5, GH8, GH9
and GH45) but only one cellobiose phosphorylase (GH94) for cellobiose utilization were found in
Fibro_03, while only few endoglucanases within the GH5 family but multiple cellobiose
phosphorylase (GH94) and beta-glucosidase genes (GH1 and GH3) were detected for Firm_11. In
addition, these populations potentially use different oligosaccharide, cellobiose and glucose
transport mechanisms, such as phosphotransferase systems (pts), non-specific sugar ABC
transporters (e.g. msmK, malK, sugC, and gguAB) and specific cellobiose transporters (cebEFG)
(Fig. S8). These differences in hydrolytic potential suggest that within the same environment and
functional guild, niche specialization may allow seemingly functionally redundant populations to
grow simultaneously and potentially work together.
Fermentation. The majority of the community showed a potential to convert glucose to acetate,
with 73% of the bacterial population genomes encoding the acetate kinase (ack) and phosphate
acetyltransferase (pta) genes required for acetate production (Fig. 4 and Fig. 5). An additional 16%
were missing only one of these genes. This indicates a high level of functional redundancy and
confirms acetate as one of the most important intermediates in these types of systems (Amani et al.,
2010).
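
The classification of genomes by the presence of ack and pta can be sketched in Python as follows. It assumes each genome is represented by the set of gene names found in its annotation. The genome names and gene sets are hypothetical, and the resulting percentages are not those reported in this study.

```python
from collections import Counter

REQUIRED = {"ack", "pta"}  # genes required for acetate production (from the text)


def acetate_status(genes):
    """Classify a genome by how many of the required genes it encodes."""
    missing = REQUIRED - set(genes)
    if not missing:
        return "complete"
    if len(missing) == 1:
        return "one_missing"
    return "absent"


# Hypothetical annotated gene sets per genome.
annotations = {
    "genome_A": {"ack", "pta", "buk"},
    "genome_B": {"ack"},
    "genome_C": {"pta", "ack"},
    "genome_D": {"buk"},
}

summary = Counter(acetate_status(g) for g in annotations.values())
percent = {k: 100.0 * v / len(annotations) for k, v in summary.items()}

print(percent)
```

Reporting the "one_missing" class separately is useful for population genomes, because a single absent gene may reflect an incomplete genome rather than a true lack of the pathway.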
Propionate production within these communities occurred via the methylmalonyl-CoA pathway by
populations within the Actinobacteria (Actino_02), Bacteroidetes (Bact_02-03, Bact_09-11,
Bact_13, Bact_19 and Bact_22-24), Rhodospirillum (Alpha_01) and Verrucomicrobia
(Verruco_01-02), which contained the key enzymes methylmalonyl-CoA mutase, methylmalonyl-
CoA epimerase and methylmalonyl-CoA carboxyltransferase. The higher propionate concentrations
observed in the reactors at T1 (Table S2) were likely related to the high relative abundance of
Bact_03 (10 ± 2%) and Actino_02 (4 ± 1%), a population closely related to Propionibacterium
(Fig. 4). The main propionate producers decreased in abundance over time and at T2 the dominant
populations of this functional guild shifted to members of the Bacteroidetes (Bact_19 and
Bact_22-24; 0-5%) and Verrucomicrobia (Verruco_01-02; 0-6%) (Fig. 5). A full complement of genes for
propionate production via the acrylate pathway or propanediol pathway was not detected in the
investigated genomes.
Multiple potential butyrate producers were detected within the phylum Bacteroidetes (Bact_08-11,
Bact_13, Bact_19 and Bact_22-24) (Fig. 4 and Fig. 5). These populations contained the key gene
butyrate kinase (buk) as well as most or all of the remaining genes in the butyrate fermentation
pathway. The alternative pathway using butyryl-CoA:acetate CoA-transferase (but) was not
detected in the studied population genomes. Although butyrate production genes were expected to
be found in the Clostridiales genomes based on what is known from cultured species and genome
representatives (Vital et al., 2016), a complete pathway for butyrate production was not detected in
any of the Clostridiales genomes from this study. Potential for amino acid fermentation to acetate
and butyrate was detected for Synergistetes (Syner_01 and Syner_03) and Treponema (Spiro_12)
populations, which has been observed for species belonging to these phyla previously (Tucci and
Martin, 2007; Ganesan et al., 2008; Chertkov et al., 2010). These populations may be scavengers
utilizing proteins that have been excreted or leaked from dead cells. Potential growth on
proteinaceous compounds and sugars with predominantly acetate and lower amounts of butyrate as
fermentation products may also be possible for the Thermotogae populations (Thermo_01-02),
similar to what has been suggested for Mesotoga prima (Nesbo et al., 2012). Only three mesophilic
Thermotogae genomes have been described so far, providing limited knowledge of their
metabolism. The populations within the reactors seem phylogenetically more closely related to
Mesotoga infera, however they lack the genes for utilization of sulfur as terminal electron acceptor,
a key feature for this species (Hanaia et al., 2013). Instead, they also contain a selection of
polysaccharide degrading enzymes, which can be related back to the environment in which they are
found.
Syntrophic VFA oxidation. Reduced compounds such as propionate and butyrate can be further
oxidized to acetate, CO2, H2 and formate by syntrophic bacteria when H2 partial pressures are low.
Two Syntrophobacterales genomes (Delta_01 and Delta_02; 47% amino acid identity (AAI))
contained the majority of genes for the methylmalonyl-CoA pathway, indicating a potential
involvement in propionate oxidation. Other members of this family are capable of syntrophic
propionate oxidation, i.e. Syntrophobacter fumaroxidans (Harmsen et al., 1998) (64% AAI to
Delta_01), and syntrophic oxidation of phenol and other aromatics to acetate, i.e. Syntrophorhabdus
aromaticivora (Qiu et al., 2008) (63% AAI to Delta_02). Delta_01 and Delta_02 were present at
<0.2% relative abundance at T1 and increased over time to 0.3-0.9% at T2 (Fig. 5). Although these
relative abundances are still low, syntrophic propionate oxidizers are capable of high substrate
turnover and this likely contributed to the low observed propionate concentrations at T2. It has been
suggested that Candidatus ‘Cloacamonas acidaminovorans’ is a hydrogen-producing syntroph
capable of oxidizing propionate based on its genome sequence combined with cultivation
experiments (Pelletier et al., 2008). Although the WWE1 genome recovered from the reactors
appears to have similar genes required for the utilization of amino acids, sugars and carboxylic
acids, as well as multiple putative Fe-only hydrogenases, the energy-conservation mechanism
required for syntrophic VFA oxidation remains to be elucidated.
Butyrate oxidation was likely performed via the beta-oxidation pathway by another
Syntrophobacterales population (Delta_03), which is most closely related to Syntrophus
aciditrophicus (60% AAI) (McInerney et al., 2007). The Delta_03 genome had a large number of
genes invested in butyrate oxidation and increased in abundance over time from <0.001% at T1 to
~1.4% at T2 (Fig. 5). The Delta_01 and Delta_02 genomes only encode part of the beta-oxidation
pathway, i.e. from butyryl-CoA or crotonyl-CoA to acetate, suggesting intermediates from other
oxidation pathways may feed into the butyrate oxidation pathway at this step.
Methanogenesis. Methane producing populations within these communities were related to
Methanocorpusculum (Methan_05), Methanospirillum (Methan_06), Methanoculleus (Methan_07)
and Methanosaeta (Methan_01-02 and Methan_04) (Fig. 4 and Fig. 5). Over time, there was an
overall increase in methanogen abundance associated with a shift from hydrogenotrophic to
acetoclastic methanogenesis. The presence of multiple populations capable of fulfilling the same
function shows that a level of functional redundancy remained within more specialized functional
guilds. Another interesting finding was the presence of a near complete complement of genes for
hydrogenotrophic methanogenesis within each of the three Methanosaeta genomes (Methan_01-02
and Methan_04), which showed little to no contamination and are reported to be strictly
acetoclastic. Various hypotheses have been developed to explain the potential role of this pathway
(Smith and Ingram-Smith, 2007; Rotaru et al., 2014) but functional assays are needed to determine
whether this pathway is active in these systems.
While methanogen abundance increased over time, the increase in methane production was
disproportional, and this was likely due to a shift in the rate-limiting step. The observed
accumulation of VFAs at T1 indicates syntrophic VFA oxidation and/or methanogenesis was rate-
limiting within the community at this time point. As all VFAs were efficiently converted to biogas
at T2, steps upstream in the metabolic network were more likely rate-limiting. When substrate
concentrations are low, methanogens can use internal storage compounds (e.g. glycogen) for
growth without methane production (Verhees et al., 2003). Also, enzymes for assimilatory and
dissimilatory sulfate reduction were encoded within several populations present at higher
abundance at T2 (Delta_01, Chlorobi_01 and Alpha_01-03), indicating potential competition with
methanogens for H2 and/or acetate (Oremland and Polcin, 1982).
Discussion
The widespread application of metagenomic sequencing has led to the discovery of novel species
and metabolic processes of global importance (Haroon et al., 2013; Wrighton et al., 2014; Baker et
al., 2015). Improved metagenome assembly and binning tools (Imelfort et al., 2014) now allow a
growing number of population genomes to be recovered from increasingly complex environments
(Albertsen et al., 2013; Baker et al., 2015; Brown et al., 2015). Here, a detailed genome-centric
analysis of microbial communities involved in the conversion of cellulose to methane led to the
recovery of 101 population genomes that could be classified into functional guilds based on their
potential substrate utilization. Through the recovery of population genomes for the majority of the
community, we were able to combine the metabolic potential of individual populations with their
relative abundance, and reconstruct a metabolic network for the dominant players in the
communities at two time points (T1: Fig. 4 and T2: Fig. 5). The networks revealed a high level of
functional redundancy, particularly among the hydrolyzers and fermenters, as changes in the
dominant players were observed over time while the overall functionality was maintained. Potential
niche specialization was also observed based on the variety and abundance of GH families. Various
microbial interactions could be inferred including competition for substrates and cellobiose- or
glucose-utilizing opportunists that depend on the activity of primary cellulose degraders. Metabolic
functions that could not have been predicted from known cultured or sequenced representatives
were also identified within each functional guild. By correlating the metabolic network with
performance parameters, observations such as the accumulation of propionate could be explained.
The genome-resolved network also enabled the proportion of the community represented by each
functional guild to be calculated, and this highlighted the importance of a diverse and well-balanced
community with functional flexibility to fulfill a complex multi-step process such as anaerobic
digestion.
The results presented here demonstrate the valuable insights that can be gained into complex
metabolic networks through genome-centric metagenomics. The approach described in this study
can be readily applied to other natural and engineered systems, which will undoubtedly reveal novel
microbial diversity and metabolic interactions. When genome-centric metagenomics is combined
with functional data derived from metatranscriptomics or -proteomics, we will be able to develop a
holistic understanding of the complex roles microorganisms play in these environments.
Experimental procedures
Sample collection and DNA extraction
Triplicate ADs (2 L working volume) were seeded with a diverse inoculum consisting of samples
taken from various anaerobic digesters, an anaerobic lagoon, rumen fluid and anoxic lake sediment
(Vanwonterghem et al., 2014b), and supplied with alpha cellulose (Sigma Aldrich, NSW Australia)
as the sole energy and carbon source. The reactors were designated AD1, AD2 and AD3, and were
run for 362 days at a 10 day sludge retention time (SRT), under mesophilic conditions and at
neutral pH. The medium contained 3 g L-1 Na2HPO4, 1 g L-1 NH4Cl, 0.5 g L-1 NaCl, 0.2465 g L-1
MgSO4.7H2O, 1.5 g L-1 KH2PO4, 14.7 mg L-1 CaCl2, 2.6 g L-1 NaHCO3, 0.5 g L-1 C3H7NO2S, 0.25 g L-1
Na2S.9H2O, and 1 mL of trace solution containing 1.5 g L-1 FeSO4.7H2O, 0.15 g L-1 H3BO3, 0.03 g L-1
CuSO4.5H2O, 0.18 g L-1 KI, 0.12 g L-1 MnCl2.4H2O, 0.06 g L-1 Na2MoO4.2H2O, 0.12 g L-1 ZnSO4.7H2O,
0.15 g L-1 CoCl2.6H2O, 10 g L-1 EDTA and 23 mg L-1 NiCl2.6H2O. It was sparged
with N2 and then autoclaved at 121°C for 60 min for oxygen removal and sterilisation. The reactors
were supplied with alpha cellulose at a concentration of 5 g cellulose L-1 medium semi-continuously,
i.e. at intervals of six hours resulting in 4 feed events per day. Reactor performance parameters and
microbial community composition were monitored over time as part of a previous study
(Vanwonterghem et al., 2014b). Samples for metagenomic sequencing were collected from the
three reactors (2 mL) at two time points (Day 96 and Day 362) based on differences in reactor
performance (Table S2). The samples were centrifuged at 14,000 g for 2 min to collect the biomass,
and the pellet was snap-frozen in liquid nitrogen and stored at -80°C until further processing. DNA
was extracted from these samples using the MP-Bio FastDNA Spin Kit for Soil (MP Biomedicals,
Australia) according to the manufacturer’s instructions.
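The feeding regime described above implies a fixed exchange volume per feed event. The following is a minimal sketch of that arithmetic, assuming the reactors were operated without biomass retention (so the volume exchanged per day equals the working volume divided by the SRT) and that each feed replaced an equal volume of reactor content. Both assumptions, and the function name `feed_schedule`, are illustrative and are not stated in the text.

```python
# Feed arithmetic for a semi-continuously fed digester.
# Assumption (not stated in the text): no biomass retention, so the
# volume exchanged per day equals working volume / SRT.

WORKING_VOLUME_L = 2.0    # reactor working volume
SRT_DAYS = 10.0           # sludge retention time
FEEDS_PER_DAY = 4         # one feed event every six hours
CELLULOSE_G_PER_L = 5.0   # cellulose concentration in the medium


def feed_schedule(volume_l, srt_days, feeds_per_day, substrate_g_per_l):
    """Return (litres per feed, grams substrate per feed, loading in g L-1 d-1)."""
    daily_exchange_l = volume_l / srt_days
    per_feed_l = daily_exchange_l / feeds_per_day
    per_feed_g = per_feed_l * substrate_g_per_l
    loading = daily_exchange_l * substrate_g_per_l / volume_l
    return per_feed_l, per_feed_g, loading


per_feed_l, per_feed_g, loading = feed_schedule(
    WORKING_VOLUME_L, SRT_DAYS, FEEDS_PER_DAY, CELLULOSE_G_PER_L
)
print(f"{per_feed_l * 1000:.0f} mL and {per_feed_g:.2f} g cellulose per feed; "
      f"loading {loading:.2f} g L-1 d-1")
```

Under these assumptions each feed event exchanges 50 mL of medium carrying 0.25 g of cellulose.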
Metagenome library preparation and sequencing
DNA libraries for samples from the first time point were prepared using the TruSeq DNA Sample
Preparation Kits v2 (Illumina, CA) with 2 µg of DNA from each sample, following the
manufacturer’s instructions. The DNA concentration of the libraries was measured using the
Quant-iT kit (Molecular Probes, CA). Paired-end sequencing (2 x 150 bp, average fragment size 250
bp) was performed on the Illumina HiSeq2000 using the TruSeq PE Cluster Kit v3-cBot-HS
(Illumina). The second set of samples was prepared for sequencing using the Nextera DNA
Sample Preparation Kit (Illumina) with 50 ng of DNA from each sample, following the
manufacturer’s instructions. Quantification and quality assessment of the libraries were performed
using the Agilent 2100 Bioanalyser (Agilent Technologies, CA). Paired-end sequencing (2 x 150 bp,
fragment size ranging from 300-800 bp) was performed on the Illumina HiSeq2000 platform using
the TruSeq SBS Kit v3 (Illumina). Each sample was sequenced on one third of a flowcell lane,
generating a combined total of 111 Gb of raw sequence data. Three additional large insert (2797 ±
83 bp) mate-pair libraries were generated from the same genomic DNA extracted from the three
reactors at day 362. The libraries were constructed using the Nextera Mate Pair Sample Preparation
Kit (Illumina) and sequenced on the Illumina MiSeq system (2 x 250 bp paired-end sequencing)
using the MiSeq Reagent kit v2. Each sample was sequenced on one quarter of a flowcell lane,
generating a combined total of 5 Gb of raw sequence data.
Community profiling
16S rRNA gene amplicon sequencing of all samples using the Roche 454 GS-FLX Titanium
platform (Roche Diagnostics, Australia) has been previously reported (Vanwonterghem et al.,
2014b). The microbial community composition was also determined by identifying and classifying
all 16S rRNA reads from the paired-end metagenomic datasets using the software CommunityM
v.1.2 with default parameters (https://github.com/dparks1134/CommunityM.git), which uses hidden
Markov models (HMMs) to identify the 16S rRNA gene sequences and classifies them using the
GreenGenes database (DeSantis et al., 2006) with clustering at 97% sequence similarity.
Metagenome assembly and population genome binning
Paired-end reads were quality trimmed using CLC workbench v.6 (CLC Bio, Taiwan) with a
quality score threshold of 0.01 and minimum read length of 100 bp. Illumina sequencing adapters at
the ends of reads were trimmed (if found) and reads containing ambiguous nucleotides were
removed from the dataset. Trimmed sequences were assembled using the CLC de novo assembly
algorithm with a kmer size of 63 and automatic bubble size. All six datasets were assembled
individually and also combined in a single large dataset co-assembly for population genome
binning. Only contigs larger than 500 bp were used in downstream analyses. The raw paired-end
reads from each individual dataset were mapped onto the combined assembly using BWA (Li,
2013) and SAMtools (Li et al., 2009) with default parameters. On average 87 ± 4% of all reads
mapped onto the co-assembly. Population genomes were recovered from the sequence data based
primarily on differential coverage profiles using GroopM v.0.2 (Imelfort et al., 2014), with initial
core formation set at 1500 bp.
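The idea behind differential coverage binning can be illustrated with a small sketch: contigs that pass the length filter are assigned a coverage profile across the samples, and contigs from the same population are expected to show profiles of the same shape. This is not GroopM's algorithm; the cosine similarity measure, the function names and the toy data are all assumptions made for illustration.

```python
from math import sqrt

MIN_CONTIG_LENGTH = 500  # bp; contigs must be longer than this


def filter_contigs(contigs, min_length=MIN_CONTIG_LENGTH):
    """Keep only contigs strictly longer than min_length."""
    return {name: seq for name, seq in contigs.items() if len(seq) > min_length}


def coverage_profile(mapped_bases, contig_length):
    """Mean per-base coverage of one contig in each sample."""
    return [bases / contig_length for bases in mapped_bases]


def cosine_similarity(a, b):
    """Cosine similarity between two coverage profiles (1.0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy data (hypothetical): three contigs and mapped bases in six samples.
contigs = {"contig_1": "A" * 1200, "contig_2": "C" * 800, "contig_3": "G" * 300}
mapped_bases = {
    "contig_1": [12000, 24000, 0, 6000, 0, 0],
    "contig_2": [8000, 16000, 0, 4000, 0, 0],
    "contig_3": [300, 0, 900, 0, 0, 600],
}

kept = filter_contigs(contigs)
profiles = {
    name: coverage_profile(mapped_bases[name], len(seq))
    for name, seq in kept.items()
}
similarity = cosine_similarity(profiles["contig_1"], profiles["contig_2"])
print(sorted(kept), round(similarity, 3))
```

In this toy case contig_3 is discarded by the length filter, and the two remaining contigs share an identical coverage profile, which is the signal a coverage-based binner would use to place them in the same bin.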
Population genome bin refinement and quality assessment
The completeness and level of contamination of the population genome bins were calculated with
CheckM v.0.9.4 (Parks et al., 2015), which uses lineage specific conserved marker gene sets for
each population genome. Manual refinement of the population genome bins was performed using
the GroopM refine function based on coverage profiles, kmer signatures and GC content, leading to
a significant increase in good quality population genomes (Table S8). The resulting population
genome bins were further refined using the mate-pair sequence data. Adapter sequences were
removed, trimmed reads shorter than 50 bp were discarded, and only valid mate pairs, i.e. reads
oriented in the reverse-forward direction, were retained. Scaffolding of the processed mate-pair
reads was performed using SSPACE v.2.0 (Boetzer et al., 2011) with a minimum number of links
set at 2. The population genome bins were improved by adding or removing linked contigs based on
coverage information, the number of connections between contigs and completeness/contamination
estimates (Table S8). The completeness estimates were also used to calculate the expected genome
size. The data has been submitted to the NCBI Short Read Archive under BioProject
PRJNA284316.
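The expected genome size calculation mentioned above can be sketched as a simple scaling of the bin size by the completeness estimate. The exact formula used in the study is not given, so the version below is an assumption; the optional contamination correction in particular is a hypothetical extension and is switched off by default.

```python
def expected_genome_size(bin_size_mb, completeness_pct, contamination_pct=0.0):
    """Estimate the full genome size (Mb) from a partial population genome bin.

    Assumed formula: bin size, optionally reduced by the contaminated
    fraction, divided by the fraction of the genome that was recovered.
    """
    if not 0 < completeness_pct <= 100:
        raise ValueError("completeness must be in (0, 100]")
    if not 0 <= contamination_pct < 100:
        raise ValueError("contamination must be in [0, 100)")
    corrected_size = bin_size_mb * (1 - contamination_pct / 100)
    return corrected_size / (completeness_pct / 100)


# Bin size and completeness taken from Table 1 (Methan_04: 2.3 Mb, 85.3%).
methan_04 = expected_genome_size(2.3, 85.3)
print(f"Methan_04 expected size: {methan_04:.2f} Mb")
```

For Methan_04 this gives an expected size of about 2.70 Mb without the contamination correction.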
Genome tree phylogeny
A genome tree was generated using 38 universal conserved marker genes (Darling et al., 2014)
from 2015 finished bacterial and archaeal genomes available from the Integrated Microbial
Genomes database (IMG) (Markowitz et al., 2012) and the recovered population genomes (Table
S9). Marker genes were identified using HMMs and the genome tree was generated with FastTree
(Price et al., 2009) using a concatenated alignment of the marker genes. The phylogenetic affiliation
of the population genomes was determined relative to the IMG genomes and compared to the
taxonomy of 16S rRNA sequences identified in the genome bins using CommunityM v.1.2 with
default parameters and the GreenGenes database clustered at 97% sequence similarity
(https://github.com/dparks1134/CommunityM.git).
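The concatenation step that precedes tree inference can be sketched as follows. Each marker gene is assumed to have been aligned separately, and genomes lacking a marker (common for incomplete population genomes) are padded with gap characters so that every genome ends up with a sequence of the same length. The function name and the toy alignments are hypothetical; the study does not describe how missing markers were handled.

```python
def concatenate_alignments(marker_alignments, genomes):
    """Concatenate per-marker alignments into one sequence per genome.

    marker_alignments: list of dicts mapping genome name -> aligned sequence.
    Genomes missing from a marker alignment are filled with gaps ('-').
    """
    parts = {genome: [] for genome in genomes}
    for alignment in marker_alignments:
        widths = {len(seq) for seq in alignment.values()}
        if len(widths) != 1:
            raise ValueError("sequences within one marker must share a length")
        width = widths.pop()
        for genome in genomes:
            parts[genome].append(alignment.get(genome, "-" * width))
    return {genome: "".join(chunks) for genome, chunks in parts.items()}


# Toy alignments (hypothetical) for three markers and two genomes.
markers = [
    {"bin_A": "MKV-L", "ref_1": "MKVAL"},
    {"bin_A": "GS", "ref_1": "GT"},
    {"ref_1": "PQR"},  # marker not detected in bin_A
]
supermatrix = concatenate_alignments(markers, ["bin_A", "ref_1"])
for name, seq in supermatrix.items():
    print(name, seq)
```

The resulting equal-length sequences are the kind of input a tree-building program such as FastTree expects.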
Functional annotation of the metagenomes
For each individually assembled metagenome, open reading frames (ORFs) were identified using
PROKKA v.1.8 (Seemann, 2014). Genes encoding carbohydrate active enzymes (CAZy) (Lombard
et al., 2014) were detected using hmmer v.3.1 (Finn et al., 2011) and the HMM-based database for
CAZy annotation (dbCAN v.3) (Yin et al., 2012), which classifies enzymes that degrade glycosidic
bonds into families based on structurally-related catalytic and carbohydrate-binding modules. For
each metagenome, the total number of hits to a glycoside hydrolase (GH) family was calculated for
comparative analysis.
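Counting hits per GH family, as described above, can be sketched once the HMM search results have been reduced to pairs of ORF identifier and matched HMM name. The sketch assumes HMM names of the form `GH5.hmm` and that subfamily suffixes should be collapsed to the parent family; both are assumptions about the input, and the hit list is invented for illustration.

```python
import re
from collections import Counter

# Matches the family part of names such as "GH5.hmm" or "GH13_20.hmm".
GH_PATTERN = re.compile(r"^(GH\d+)")


def count_gh_families(hits):
    """Count hits per glycoside hydrolase family, ignoring other CAZy classes."""
    counts = Counter()
    for _orf_id, hmm_name in hits:
        match = GH_PATTERN.match(hmm_name)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical hits: (ORF identifier, matched HMM name).
hits = [
    ("orf_0001", "GH5.hmm"),
    ("orf_0002", "GH9.hmm"),
    ("orf_0003", "CBM6.hmm"),
    ("orf_0004", "GH5.hmm"),
    ("orf_0005", "GT2.hmm"),
    ("orf_0006", "GH13_20.hmm"),
]
gh_counts = count_gh_families(hits)
print(dict(gh_counts))
```

The per-family totals from each metagenome can then be placed side by side for the comparative analysis.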
Functional annotation of the population genomes and metabolic network reconstruction
Population genomes recovered from the combined metagenome assembly were annotated using
PROKKA v.1.8 and validated based on homology search with BLASTP (Altschul et al., 1990)
using the IMG protein database (Markowitz et al., 2012) and KEGG Orthology database (Kanehisa
and Goto, 2000; Kanehisa et al., 2014). Carbohydrate active enzymes were detected for each
population genome using hmmer and dbCAN, similar to the individual metagenomes. These results
were combined with known activities of GH families (http://www.cazy.org;
https://www.cazypedia.org) (Allgaier et al., 2010) and the annotations based on PROKKA and the IMG
database, in order to determine the predominant substrate profile for each GH family. A full
reconstruction of the metabolic potential for each population genome was based on the consensus
of the different annotation methods used and metabolic pathways identified in KEGG and MetaCyc
(Caspi et al., 2008). A metabolic pathway comprising multiple genes was considered present if the
majority (>75%) of genes involved in this pathway were detected in the genome. The populations
could be classified into one or more functional guilds, namely hydrolysis (cellulose/cellobiose),
fermentation (acetate/propionate/butyrate), syntrophic VFA oxidation and methanogenesis
(hydrogenotrophic/acetoclastic), based primarily on their carbon metabolism. In order to reconstruct
the metabolic networks at each time point (Fig. 4 and Fig. 5), only those populations present at >
0.1% relative abundance in at least one of the reactors were considered, and their average relative
abundance across the reactors at each time point was calculated to determine the contribution of
each population to the flow of carbon (represented by the thickness of the lines in Fig. 4 and Fig. 5).
The combined (average) relative abundance of all populations within a functional guild was
calculated to assess the overall distribution of functions across the community and how this balance
shifts over time.
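The rules described in this section (pathway presence at more than 75% of genes, the 0.1% abundance filter, and summed average abundance per guild) can be sketched as below. The abundance values are taken from the Day 362 columns of Table 1, but the guild labels attached to each population and the gene names are hypothetical placeholders, and the function names are illustrative.

```python
def pathway_present(pathway_genes, detected_genes, threshold=0.75):
    """True if more than `threshold` of the pathway's genes were detected."""
    pathway_genes = set(pathway_genes)
    if not pathway_genes:
        return False
    found = len(pathway_genes & set(detected_genes))
    return found / len(pathway_genes) > threshold


def guild_totals(populations, min_abundance=0.1):
    """Sum the mean relative abundance (%) of populations within each guild.

    A population is kept only if it exceeds min_abundance in at least one
    reactor; populations assigned to several guilds count towards each.
    """
    totals = {}
    for info in populations.values():
        abundances = info["abundance"]
        if max(abundances) <= min_abundance:
            continue
        mean_abundance = sum(abundances) / len(abundances)
        for guild in info["guilds"]:
            totals[guild] = totals.get(guild, 0.0) + mean_abundance
    return totals


# Relative abundance (%) in AD1, AD2 and AD3 at Day 362 (Table 1).
# Guild assignments are placeholders for illustration only.
populations = {
    "Methan_01": {"abundance": [10.13, 19.67, 5.79], "guilds": ["methanogenesis"]},
    "Fibro_03": {"abundance": [5.83, 4.40, 12.69], "guilds": ["hydrolysis"]},
    "Bact_13": {"abundance": [0.00, 0.01, 0.00], "guilds": ["fermentation"]},
}
totals = guild_totals(populations)

pathway = ["geneA", "geneB", "geneC", "geneD", "geneE"]  # hypothetical genes
print(totals, pathway_present(pathway, ["geneA", "geneB", "geneC", "geneD"]))
```

Note that the threshold is strict: a pathway with exactly 75% of its genes detected is not counted as present in this sketch.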
Statistical analyses
All statistical analyses and construction of heatmaps were carried out in RStudio v.2.15.0 using the
R CRAN packages: vegan (Oksanen et al., 2013) and RColorBrewer (Neuwirth, 2011). Tukey’s
Honestly Significant Difference tests were used to statistically compare the datasets and principal
component analysis (PCA) was used to assess the variability between samples. Correlation analyses
were performed through linear regression of the relative abundance profiles and assessing the
respective R² values.
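The correlation analysis can be illustrated with a short calculation of the coefficient of determination for a simple linear regression of one abundance profile on another. The study performed these analyses in R; the Python version below is a substitute for illustration, and the pairing of the two populations is an arbitrary example using Table 1 values.

```python
def r_squared(x, y):
    """R² of the least-squares line y = a + b*x (squared Pearson correlation)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R² is undefined for a constant profile")
    return sxy ** 2 / (sxx * syy)


# Relative abundance (%) across the six samples, from Table 1.
methan_01 = [0.00, 0.00, 0.00, 10.13, 19.67, 5.79]
actino_01 = [0.16, 0.16, 0.22, 23.49, 7.59, 6.24]
print(f"R² = {r_squared(methan_01, actino_01):.3f}")
```

With only six samples per profile, R² values of this kind are best read as descriptive rather than as strong statistical evidence.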
Acknowledgements
This study was supported by the Commonwealth Scientific and Industrial Research Organisation
(CSIRO) Flagship Cluster “Biotechnological solutions to Australia’s transport, energy and
greenhouse gas challenges”. IV acknowledges support from the University of Queensland
International Scholarship, and PJ acknowledges support from the Australian Meat Processor
Corporation (2013/4008 Technology Fellowship). KR acknowledges support by the European
Research Council (Starter Grant Electrotalk), and GWT was supported by an Australian Research
Council Queen Elizabeth fellowship (DP1093175). The authors would like to thank Serene Low at
the Australian Centre for Ecogenomics for the metagenome library preparation, Mike Imelfort for
assistance with the bioinformatics analysis, Donovan Parks for assistance with the genome quality
assessment and Philip Hugenholtz for providing comments on the manuscript.
Competing financial interests
The authors declare no competing financial interests.
References
Albertsen, M., Hugenholtz, P., Skarshewski, A., Nielsen, K.A., Tyson, G.W., and Nielsen, P.H.
(2013) Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of
multiple metagenomes. Nature Biotechnology 31: 533-538.
Allgaier, M., Reddy, A., Park, J.I., Ivanova, N.N., D'Haeseleer, P., Lowry, P. et al. (2010) Targeted
discovery of glycoside hydrolases from a switchgrass-adapted compost community. Plos One 5: 1-9.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic local alignment
search tool. Journal of Molecular Biology 215: 403-410.
Amani, T., Nosrati, M., and Sreekrishnan, T.R. (2010) Anaerobic digestion from the viewpoint of
microbiological, chemical, and operational aspects - a review. Environmental Reviews 18: 255-278.
Baker, B.J., Lazar, C.S., Teske, A.P., and Dick, G.J. (2015) Genomic resolution of linkages in
carbon, nitrogen, and sulfur cycling among widespread estuary sediment bacteria. Microbiome 3: 1-12.
Bayer, E.A., Lamed, R., White, B.A., and Flint, H.J. (2008) From cellulosomes to cellulosomics.
The Chemical Record 8: 364-377.
Bekele, A.Z., Koike, S., and Kobayashi, Y. (2011) Phylogenetic diversity and dietary association of
rumen Treponema revealed using group-specific 16S rRNA gene-based analysis. FEMS
Microbiology Letters 316: 51-60.
Berlemont, R., and Martiny, A.C. (2013) Phylogenetic distribution of potential cellulases in
bacteria. Applied and Environmental Microbiology 79: 1545-1554.
Boetzer, M., Henkel, C.V., Jansen, H.J., Butler, D., and Pirovano, W. (2011) Scaffolding
pre-assembled contigs using SSPACE. Bioinformatics 27: 578-579.
Brown, C.T., Hug, L.A., Thomas, B.C., Sharon, I., Castelle, C.J., Singh, A. et al. (2015) Unusual
biology across a group comprising more than 15% of domain Bacteria. Nature: 1-18.
Caspi, R., Foerster, H., Fulcher, C.A., Kaipa, P., Krummenacker, M., Latendresse, M. et al. (2008)
The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of
pathway/genome databases. Nucleic Acids Research 36: 623-631.
Castelle, C.J., Wrighton, K.C., Thomas, B.C., Hug, L.A., Brown, C.T., Wilkins, M.J. et al. (2015)
Genomic expansion of domain Archaea highlights roles for organisms from new phyla in anaerobic
carbon cycling. Current Biology 25: 1-12.
Chertkov, O., Sikorski, J., Brambilla, E., Lapidus, A., Copeland, A., Glavina Del Rio, T. et al.
(2010) Complete genome sequence of Aminobacterium colombiense type strain (ALA-1T).
Standards in Genomic Sciences 2: 280-289.
Darling, A.E., Jospin, G., Lowe, E., Matsen, I.V., Bik, H.M., and Eisen, J.A. (2014) Phylosift:
phylogenetic analysis of genomes and metagenomes. PeerJ 2: 1-28.
DeSantis, T.Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E.L., Keller, K. et al. (2006)
Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB.
Applied and Environmental Microbiology 72: 5069-5072.
Finn, R.D., Clements, J., and Eddy, S.R. (2011) HMMER web server: interactive sequence
similarity searching. Nucleic Acids Research 39: 29-37.
Ganesan, A., Chaussonnerie, S., Tarrade, A., Dauga, C., Boucher, T., Pelletier, E. et al. (2008)
Cloacibacillus evryensis gen. nov., sp. nov., a novel asaccharolytic, mesophilic,
amino-acid-degrading bacterium within the phylum 'Synergistetes', isolated from an anaerobic sludge digester.
International Journal of Systematic and Evolutionary Microbiology 58: 2003-2012.
Hanaia, W.B., Postec, A., Aullo, T., Ranchou-Peyruse, A., Erauso, G., Brochier-Armanet, C. et al.
(2013) Mesotoga infera sp. nov., a mesophilic member of the order Thermotogales, isolated from an
underground gas storage aquifer. International Journal of Systematic and Evolutionary
Microbiology 63: 3003-3008.
Hanreich, A., Schimpf, U., Zakrzewski, M., Schluter, A., Benndorf, D., Heyer, R. et al. (2013)
Metagenome and metaproteome analyses of microbial communities in mesophilic biogas-producing
anaerobic batch fermentations indicate concerted plant carbohydrate degradation. Systematic and
Applied Microbiology 36: 330-338.
Harmsen, H.J.M., Van Kuijk, B.L.M., Plugge, C.M., Akkermans, A.D.L., De Vos, W.M., and
Stams, A.J.M. (1998) Syntrophobacter fumaroxidans sp. nov., a syntrophic propionate-degrading
sulfate-reducing bacterium. International Journal of Systematic Bacteriology 48: 1383-1387.
Haroon, M.F., Hu, S., Shi, Y., Imelfort, M., Keller, J., Hugenholtz, P. et al. (2013) Anaerobic
oxidation of methane coupled to nitrate reduction in a novel archaeal lineage. Nature 500: 567-570.
Imelfort, M., Parks, D.H., Woodcroft, B.J., Dennis, P.D., Hugenholtz, P., and Tyson, G.W. (2014)
GroopM: an automated tool for the recovery of population genomes from related metagenomes.
PeerJ 2: e603.
Jaenicke, S., Ander, C., Bekel, T., Bisdorf, R., Droge, M., Gartemann, K.-H. et al. (2010)
Comparative and joint analysis of two metagenomic datasets from a biogas fermenter obtained by
454-pyrosequencing. Plos One 6: 1-15.
Kanehisa, M., and Goto, S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic
Acids Research 28: 27-30.
Kanehisa, M., Goto, S., Sato, Y., Kawashima, M., Furumichi, M., and Tanabe, M. (2014) Data,
information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Research 42:
199-205.
Kaoutari, A.E., Armougom, F., Gordon, J.I., Raoult, D., and Henrissat, B. (2013) The abundance
and variety of carbohydrate-active enzymes in the human gut microbiota. Nature Reviews
Microbiology 11: 497-504.
Li, H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
arXiv:13033997v2 [q-bioGN].
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N. et al. (2009) The sequence
alignment/map (SAM) format and SAMtools. Bioinformatics 25: 2078-2079.
Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P.M., and Henrissat, B. (2014) The
carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Research 42: 490-495.
Lynd, L.R., Weimer, P.J., van Zyl, W.H., and Pretorius, I.S. (2002) Microbial cellulose utilization:
Fundamentals and biotechnology. Microbiology and Molecular Biology Reviews 66: 506-577.
Markowitz, V.M., Chen, I.-M.A., Palaniappan, K., Chu, K., Szeto, E., Grechkin, Y. et al. (2012)
IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids
Research 40: 115-122.
Martinez-Garcia, M., Brazel, D.M., Swan, B.K., Arnosti, C., Chain, P.S.G., Reitenga, K.G. et al.
(2012) Capturing single cell genomes of active polysaccharide degraders: An unexpected
contribution of Verrucomicrobia. Plos One 7: 1-11.
McInerney, M.J., Rohlin, L., Mouttaki, H., Kim, U., Krupp, R.S., Rios-Hernandez, L. et al. (2007)
The genome of Syntrophus aciditrophicus: Life at the thermodynamic limit of microbial growth.
PNAS 104: 7600-7605.
Naas, A.E., Mackenzie, A.K., Mravec, J., Schuckel, J., Willats, W.G.T., Eijsink, V.G.H., and Pope,
P.B. (2014) Do rumen Bacteroidetes utilize an alternative mechanism for cellulose degradation.
MBio 5: 1-6.
Nesbo, C.L., Bradman, D.M., Adebusuyi, A., Dlutek, M., Petrus, A.K., Foght, J. et al. (2012)
Mesotoga prima gen. nov., sp. nov., the first described mesophilic species of the Thermotogales.
Extremophiles 16: 387-393.
Neuwirth, E. (2011) RColorBrewer: ColorBrewer palettes.
Nobu, M.K., Narihiro, T., Rinke, C., Kamagata, Y., Tringe, S.G., Woyke, T., and Liu, W.-T. (2014)
Microbial dark matter ecogenomics reveals complex synergistic networks in a methanogenic
bioreactor. The ISME Journal: 1-13.
Oksanen, J., Blanchet, G., Kindt, R., Legendre, P., Minchin, P.R., O'Hara, R.B. et al. (2013) Vegan:
community ecology package.
Oremland, R.S., and Polcin, S. (1982) Methanogenesis and sulfate reduction: competitive and
noncompetitive substrates in estuarine sediments. Applied and Environmental Microbiology 44:
1270-1276.
Parks, D.H., Imelfort, M., Skennerton, C.T., Hugenholtz, P., and Tyson, G.W. (2015) CheckM:
assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.
PeerJ PrePrints 2.
Pelletier, E., Kreimeyer, A., Bocs, S., Rouy, Z., Gyapay, G., Chouari, R. et al. (2008) "Candidatus
Cloacamonas acidaminovorans": genome sequence reconstruction provides a first glimpse of a new
bacterial division. Journal of Bacteriology 190: 2572-2579.
Price, M.N., Dehal, P.S., and Arkin, A.P. (2009) FastTree: Computing large minimum-evolution
trees with profiles instead of a distance matrix. Molecular Biology and Evolution 26: 1641-1650.
Qiu, Y.-L., Hanada, S., Ohashi, A., Harada, H., Kamagata, Y., and Sekiguchi, Y. (2008)
Syntrophorhabdus aromaticivorans gen. nov., sp. nov., the first cultured anaerobe capable of
degrading phenol to acetate in obligate syntrophic associations with a hydrogenotrophic
methanogen. Applied and Environmental Microbiology 74: 2051-2058.
Raghoebarsing, A.A., Pol, A., van de Pas-Schoonen, K.T., Smolders, A.J.P., Ettwig, K.F., Rijpstra,
I.C. et al. (2006) A microbial consortium couples anaerobic methane oxidation to denitrification.
Nature 440: 918-921.
Rahman, N.A., Parks, D., Vanwonterghem, I., Morrison, M., Tyson, G.W., and Hugenholtz, P.
(2015) A phylogenomic analysis of the bacterial phylum Fibrobacteres. Frontiers in Microbiology.
Rotaru, A.-E., Shrestha, P.M., Liu, F., Shrestha, M., Shrestha, D., Embree, M. et al. (2014) A new
model for electron flow during anaerobic digestion: direct interspecies electron transfer to
Methanosaeta for the reduction of carbon dioxide to methane. Energy and Environmental Science 7:
408-415.
Seemann, T. (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics 30: 2068-2069.
Sekiguchi, Y., Ohashi, A., Parks, D.H., Yamauchi, T., Tyson, G.W., and Hugenholtz, P. (2015)
First genomic insights into members of a candidate bacterial phylum responsible for wastewater
bulking. PeerJ 3.
Smith, K.S., and Ingram-Smith, C. (2007) Methanosaeta, the forgotten methanogen? Trends in
Microbiology 15: 150-155.
Solli, L., Havelsrud, O.E., Horn, S.J., and Rike, A.G. (2014) A metagenomic study of the microbial
communities in four parallel biogas reactors. Biotechnology for Biofuels 7: 1-15.
Stolze, Y., Zakrzewski, M., Maus, I., Eikmeyer, F., Jaenicke, S., Rottmann, N. et al. (2015)
Comparative metagenomics of biogas-producing microbial communities from production-scale
biogas plants operating under wet or dry fermentation conditions. Biotechnology for Biofuels 8: 1-18.
Suen, G., Weimer, P.J., Stevenson, D.M., Aylward, F.O., Boyum, J., Deneke, J. et al. (2011) The
complete genome sequence of Fibrobacter succinogenes S85 reveals a cellulolytic and metabolic
specialist. PLoS ONE 6: 1-15.
Tringe, S.G., and Rubin, E.M. (2005) Metagenomics: DNA sequencing of environmental samples.
Nature Reviews Genetics 6: 805-814.
Tringe, S.G., Von Mering, C., Kobayashi, A., Salamov, A.A., Chen, K., Chang, H.W. et al. (2005)
Comparative metagenomics of microbial communities. Science 308: 554-557.
Tucci, S., and Martin, W. (2007) A novel prokaryotic trans-2-enoyl-CoA reductase from the
spirochete Treponema denticola. FEBS Letters 581: 1561-1566.
Tveit, A., Schwacke, R., Svenning, M.M., and Urich, T. (2013) Organic carbon transformations in
high-Arctic peat soils: key functions and microorganisms. The ISME Journal 7: 299-311.
Tyson, G.W., Chapman, J., Hugenholtz, P., Allen, E.E., Ram, R.J., Richardson, P.M. et al. (2004)
Community structure and metabolism through reconstruction of microbial genomes from the
environment. Nature 428: 37-43.
Vanwonterghem, I., Jensen, P.D., Ho, D.P., Batstone, D.J., and Tyson, G.W. (2014a) Linking
microbial community structure, interactions and function in anaerobic digesters using new
molecular techniques. Current Opinion in Biotechnology 27: 55-64.
Vanwonterghem, I., Jensen, P.D., Dennis, P.G., Hugenholtz, P., Rabaey, K., and Tyson, G.W.
(2014b) Deterministic processes guide long-term synchronised population dynamics in replicate
anaerobic digesters. The ISME Journal: 1-14.
Venter, J.C., Remington, K., Heidelberg, J.F., Halpern, A.L., Rusch, D., Eisen, J.A. et al. (2004)
Environmental genome shotgun sequencing of the Sargasso sea. Science 304: 66-74.
Verhees, C.H., Kengen, S.W.M., Tuininga, J.E., Schut, G.J., Adams, M.W.W., De Vos, W.M., and
Van der Oost, J. (2003) The unique features of glycolytic pathways in Archaea. Biochemical Journal
375: 231-246.
Vital, M., Howe, A.C., and Tiedje, J.M. (2016) Revealing the bacterial butyrate synthesis pathways
by analyzing (meta)genomic data. MBio 5: 1-11.
Wong, M.T., Zhang, D., Li, J., Hui, R.K.H., Tun, H.M., Brar, M.S. et al. (2013) Towards a
metagenomic understanding on enhanced biomethane production from waste activated sludge after
pH 10 pretreatment. Biotechnology for Biofuels 6: 1-14.
Wrighton, K.C., Thomas, B.C., Sharon, I., Miller, C.S., Castelle, C.J., Verberkmoes, N.C. et al.
(2012) Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla.
Science 337: 1661-1665.
Wrighton, K.C., Castelle, C.J., Wilkins, M.J., Hug, L.A., Sharon, I., Thomas, B.C. et al. (2014)
Metabolic interdependencies between phylogenetically novel fermenters and respiratory organisms
in an unconfined aquifer. The ISME Journal 8: 1452-1463.
Yin, Y., Mao, X., Yang, J.C., Chen, X., Mao, F., and Xu, Y. (2012) dbCAN: a web resource for
automated carbohydrate-active enzyme annotation. Nucleic Acids Research 40: 445-451.
Figure legends
Fig. 1. Metagenome-based microbial community composition. The community profiles are shown
for AD1, AD2 and AD3 on Days 96 (T1) and 362 (T2) based on the 16S rRNA genes extracted
from the metagenomes and clustered at 97% sequence similarity. All populations present at >0.5%
relative abundance in at least one of the samples are shown. The taxonomic classification based on
the 16S rRNA gene is shown at the phylum level (left-hand side) and lowest level of taxonomic
assignment (c: class, o: order, f: family and g: genus; right-hand side).
Fig. 2. Phylogeny of the population genomes. Genome tree based on a concatenated set of marker
genes showing the phylogenetic affiliation of the 101 recovered population genomes from the
anaerobic digesters relative to 2015 IMG genomes.
Fig. 3. Distribution of glycoside hydrolase (GH) families for 62 population genomes. The number
of open reading frames (ORFs) identified within each GH family is shown by the heatmap and GH
families are grouped by substrate activity. The phylum-level classification of the population
genomes is shown on the left-hand side of the panel.
Fig. 4. Metabolic network based on the functional classification of all populations present at >0.1%
relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 96. The
color of the edges corresponds to the substrate node and the thickness of the edges is representative
of the relative abundance of each population genome (average for the three reactors). The
percentages on the right-hand side of the panel show the fraction of the community (total relative
abundance) classified within each functional guild.
Fig. 5. Metabolic network based on the functional classification of all populations present at >0.1%
relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 362. The
color of the edges corresponds to the substrate node and the thickness of the edges is representative
of the relative abundance of each population genome (average for the three reactors). The
percentages on the right-hand side of the panel show the fraction of the community (total relative
abundance) classified within each functional guild.
Table
Table 1. Summary statistics (Compl.: completeness; Cont.: contamination) of 62 population genomes selected for metabolic analysis, which were most
complete and/or abundant in the reactors.
Bin_ID; Size (Mb); Scaffolds (#); Compl. (%); Cont. (%); GC (%); ORFs (#); Genome tree phylogeny; Relative abundance (%) in AD1_T1, AD2_T1, AD3_T1, AD1_T2, AD2_T2, AD3_T2; 16S rRNA gene
Methan_01 2.6 1 99.4 0.0 59.1 2557 Euryarchaeota 0.00 0.00 0.00 10.13 19.67 5.79 +
Methan_02 2.5 104 99.0 0.7 60.8 2633 Euryarchaeota 0.00 0.00 0.00 0.01 0.23 6.41 +
Methan_04 2.3 235 85.3 2.9 53.4 2083 Euryarchaeota 0.03 0.04 0.49 0.03 0.11 0.12 -
Methan_05 3.0 387 62.4 5.0 52.5 2034 Euryarchaeota 0.03 0.08 0.03 0.12 0.20 0.14 -
Methan_06 2.7 427 59.0 3.6 48.0 1864 Euryarchaeota 0.15 0.28 0.18 0.07 0.44 0.10 -
Methan_07 2.8 414 67.3 10.8 61.6 2182 Euryarchaeota 0.04 0.14 0.39 0.74 0.98 0.86 +
Cren_01 1.8 151 89.0 0.9 58.0 1863 Crenarchaeota 0.00 0.00 0.00 0.23 0.05 0.28 +
Actino_01 3.6 1 99.4 0.0 73.9 3175 Actinobacteria 0.16 0.16 0.22 23.49 7.59 6.24 +
Actino_02 3.2 27 95.2 0.3 67.6 2823 Actinobacteria 3.19 4.26 5.16 0.31 0.21 0.19 -
Alpha_01 3.9 55 99.0 0.5 68.1 3789 Alphaproteobacteria 0.00 0.00 0.00 2.18 0.28 0.26 +
Alpha_02 4.4 182 92.1 6.5 67.6 3768 Alphaproteobacteria 0.00 0.00 0.00 0.65 0.67 0.63 -
Alpha_03 6.4 53 98.9 8.5 60.6 5964 Alphaproteobacteria 0.00 0.00 0.00 1.93 1.01 1.22 +
Alpha_05 3.5 628 80.7 3.9 66.9 3160 Alphaproteobacteria 0.27 0.28 0.37 0.01 0.01 0.01 -
Bact_02 4.5 99 90.9 4.1 41.5 3385 Bacteroidetes 0.76 1.11 0.92 0.02 0.01 0.01 +
Bact_03 3.4 15 100.0 0.5 41.5 2882 Bacteroidetes 10.23 12.28 8.36 0.77 0.61 0.50 +
Bact_08 2.4 59 80.4 0.0 33.6 1628 Bacteroidetes 0.10 0.22 0.03 0.00 0.07 0.03 +
Bact_09 2.1 15 93.3 1.7 46.6 1707 Bacteroidetes 0.00 0.00 0.00 2.01 0.02 0.19 +
Bact_10 3.0 67 93.8 4.8 46.0 2385 Bacteroidetes 0.25 0.35 0.02 0.04 0.08 0.94 +
Bact_11 2.2 102 86.7 1.3 58.0 1670 Bacteroidetes 0.25 0.02 2.21 0.00 0.00 0.10 +
Bact_13 2.5 178 84.4 4.0 58.6 1907 Bacteroidetes 0.38 0.00 0.00 0.00 0.01 0.00 +
Bact_19 3.2 7 99.3 3.3 32.1 2464 Bacteroidetes 0.02 1.01 0.09 0.72 0.74 4.24 +
Bact_22 2.1 2 96.7 0.0 47.0 1711 Bacteroidetes 0.00 0.01 0.00 0.34 4.77 0.10 +
Bact_23 4.3 200 94.9 1.7 48.3 3135 Bacteroidetes 0.00 0.00 0.00 0.00 0.00 2.22 -
Bact_24 3.7 81 80.1 2.5 42.8 2413 Bacteroidetes 0.25 0.94 1.04 0.07 1.13 0.00 +
Beta_02 3.4 139 83.0 2.8 63.7 2793 Betaproteobacteria 0.20 0.17 0.23 0.16 0.12 0.11 -
WWEI_01 2.0 130 91.3 6.0 36.4 1532 WWEI 0.20 0.06 0.60 0.00 0.00 0.00 -
Clorobi_01 2.3 3 99.5 0.8 56.2 2119 Chlorobi 0.01 0.00 0.00 1.77 2.78 3.01 +
Chloro_02 2.6 237 70.8 0.2 52.5 1804 Chloroflexi 0.03 0.00 0.00 0.05 0.37 0.06 -
Deferri_01 2.9 14 98.2 0.9 44.4 2626 Deferribacterales 1.06 1.27 1.01 0.00 0.00 0.00 +
Delta_01 4.9 267 88.8 4.4 59.7 3909 Deltaproteobacteria 0.00 0.02 0.00 0.37 0.86 0.27 +
Delta_02 4.4 138 92.6 4.9 56.8 3832 Deltaproteobacteria 0.00 0.00 0.00 0.32 0.36 0.90 -
Delta_03 5.1 106 69.0 5.8 61.5 3387 Deltaproteobacteria 0.00 0.00 0.00 1.60 1.34 1.28 +
Epsilon_01 2.7 26 100.0 0.8 43.9 2690 Epsilonproteobacteria 1.42 1.46 1.19 0.01 0.03 0.01 -
Fibro_01 2.9 50 98.9 2.2 37.4 2362 Fibrobacteres 0.87 6.26 0.67 0.00 0.00 0.00 -
Fibro_02 3.5 122 93.1 0.7 51.4 2764 Fibrobacteres 4.00 4.16 2.62 0.05 0.01 0.02 +
Fibro_03 4.1 11 89.4 2.3 50.2 2968 Fibrobacteres 0.13 0.45 0.09 5.83 4.40 12.69 +
Firm_03 2.7 684 89.0 3.4 39.5 2573 Firmicutes 0.00 0.05 0.00 0.85 0.93 0.20 +
Firm_04 4.2 53 83.9 2.7 54.1 2959 Firmicutes 9.18 0.00 3.83 0.00 0.00 0.00 +
Firm_05 3.2 265 92.9 1.8 45.7 2806 Firmicutes 0.00 7.16 3.75 0.00 0.08 0.00 +
Firm_06 4.2 215 99.3 3.4 44.5 3763 Firmicutes 0.00 3.09 0.73 0.00 0.00 0.00 -
Firm_10 3.3 152 84.9 1.6 62.5 2482 Firmicutes 0.26 0.09 0.09 0.36 0.06 0.06 +
Firm_11 3.5 3 99.2 0.3 55.2 2847 Firmicutes 0.00 0.00 0.00 3.14 2.73 14.64 +
Firm_13 2.0 67 85.1 0.4 51.6 1514 Firmicutes 0.28 0.72 0.19 0.00 0.00 0.00 -
Firm_14 3.1 45 100.0 0.7 46.3 2824 Firmicutes 0.00 0.00 4.61 0.00 0.00 0.00 +
Firm_16 3.8 117 98.6 4.6 49.3 3097 Firmicutes 12.81 4.44 0.03 0.03 0.03 0.00 -
Lenti_01 3.9 118 70.3 0.4 60.9 2774 Lentisphaerae 2.43 2.08 2.84 2.99 3.63 1.54 +
Lenti_02 6.0 662 82.7 4.1 67.3 4114 Lentisphaerae 0.11 0.08 0.53 0.22 0.79 0.14 +
Planc_01 5.7 162 100.0 1.1 62.8 4512 Planctomycetes 0.00 0.00 0.00 0.40 1.79 1.14 +
Spiro_02 2.6 74 98.9 2.3 55.0 2311 Spirochaetes 2.93 1.23 0.07 0.46 0.19 0.01 -
Spiro_03 2.6 64 94.3 0.0 56.0 2219 Spirochaetes 0.00 0.00 0.00 0.00 0.08 0.81 -
Spiro_04 1.9 170 92.0 0.0 59.3 1727 Spirochaetes 0.24 0.05 0.09 0.09 0.19 0.01 -
Spiro_07 3.1 59 85.8 2.1 44.7 2326 Spirochaetes 0.33 0.00 0.00 0.10 0.00 0.00 -
Spiro_08 2.9 57 98.6 0.0 52.1 2443 Spirochaetes 0.16 0.13 0.00 0.44 0.25 1.27 -
Spiro_09 3.0 29 90.6 0.7 51.0 2397 Spirochaetes 0.01 0.00 0.00 0.65 1.39 0.16 -
Spiro_10 3.0 11 97.9 0.0 61.8 2527 Spirochaetes 0.33 0.36 1.29 6.10 2.61 1.55 +
Spiro_12 2.4 139 94.9 0.0 57.2 2129 Spirochaetes 0.95 0.86 0.75 2.64 0.50 0.79 +
Syner_01 3.7 683 83.8 5.7 58.9 3280 Synergistetes 0.01 0.01 0.00 0.44 1.26 0.86 +
Syner_03 1.9 218 100.0 2.4 52.0 1862 Synergistetes 0.48 0.79 1.44 0.32 1.35 1.85 +
Thermo_01 2.8 76 94.4 0.3 48.6 2472 Thermotogae 0.06 0.37 0.31 0.22 1.81 1.17 +
Thermo_02 3.5 643 93.8 1.9 47.0 3159 Thermotogae 0.00 0.00 0.00 0.33 0.27 0.01 -
Verruco_01 2.7 7 95.4 1.3 63.0 2261 Verrucomicrobia 0.00 0.01 0.19 5.88 0.51 0.00 +
Verruco_02 2.9 33 94.6 2.0 58.8 2263 Verrucomicrobia 0.00 0.00 0.06 0.01 0.65 0.00 +
Metagenome-based microbial community composition. The community profiles are shown for AD1, AD2 and AD3 on Days 96 (T1) and 362 (T2) based on the 16S rRNA genes extracted from the metagenomes and clustered at 97% sequence similarity. All populations present at >0.5% relative abundance in at least one of the samples are shown. The taxonomic classifications based on the 16S rRNA gene are shown at the phylum level (left-hand side) and at the lowest level of taxonomic assignment (c: class, o: order, f: family and g: genus; right-hand side).
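The inclusion rule stated in this caption (a population is shown if it exceeds 0.5% relative abundance in at least one sample) can be illustrated with a minimal Python sketch. The function name, the population labels and the abundance values below are hypothetical and are not taken from the figure; the sketch only assumes that each population has one relative abundance value (in %) per sample.

```python
from typing import Dict, List


def abundant_populations(profile: Dict[str, List[float]],
                         threshold: float = 0.5) -> List[str]:
    """Return populations whose relative abundance (%) exceeds
    `threshold` in at least one sample."""
    return [name for name, values in profile.items()
            if any(v > threshold for v in values)]


# Hypothetical values for six samples (three reactors, two time points).
example = {
    "otu_a": [0.10, 0.70, 0.00, 0.00, 0.00, 0.00],
    "otu_b": [0.20, 0.30, 0.40, 0.10, 0.00, 0.50],  # never strictly above 0.5
    "otu_c": [0.00, 0.00, 0.00, 0.00, 0.00, 2.10],
}

print(abundant_populations(example))  # ['otu_a', 'otu_c']
```

The comparison is strict, matching the ">0.5%" wording of the caption, so a population that only reaches exactly 0.5% is excluded.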
[Figure image not reproduced; its caption was not recovered. The surviving labels indicate a tree relating reference taxa to the population genomes: Cren_01; Methan_01 to Methan_07; Thermo_01 and Thermo_02; Syner_01 to Syner_03; Chloro_01 to Chloro_04; Actino_01 and Actino_02; Tener_01 and Tener_02; Firm_01 to Firm_18; Spiro_01 to Spiro_12; Planc_01; Lenti_01 to Lenti_03; Verruco_01 to Verruco_05; Fibro_01 to Fibro_03; Chlorobi_01; WWE1_01; Bact_01 to Bact_20 and Bact_22 to Bact_24; Epsilon_01; Deferri_01; Delta_01 to Delta_03; Beta_01 and Beta_02; Alpha_01 to Alpha_05. Scale bar: 0.001. Legend groups: Proteobacteria, Deferribacteres, Epsilonproteobacteria, Bacteroidetes, Chlorobi, Fibrobacteres, WWE1, Verrucomicrobia, Lentisphaerae, Planctomycetes, Spirochaetes, Firmicutes, Tenericutes, Actinobacteria, Chloroflexi, Synergistetes, Thermotogae, Euryarchaeota, Crenarchaeota.]
Distribution of glycoside hydrolase (GH) families for 62 population genomes. The number of open reading frames (ORFs) identified within each GH family is shown by the heatmap and GH families are grouped by substrate activity. The phylum-level classification of the population genomes is shown on the left-hand side of the panel.
Metabolic network based on the functional classification of all populations present at >0.1% relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 96. The color of the edges corresponds to the substrate node and the thickness of the edges is representative of the relative abundance of each population genome (average for the three reactors). The percentages on the right-hand side of the panel show the fraction of the community (total relative abundance) classified within each functional guild.
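The quantities described in this caption can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used to draw the figure: the genome names, guild labels and abundance values are hypothetical, each genome is assigned to a single guild for simplicity (a population could in principle contribute to several), and the guild percentage is taken as the sum of reactor-averaged abundances.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def edge_weights(abundance: Dict[str, List[float]],
                 threshold: float = 0.1) -> Dict[str, float]:
    """Average relative abundance (%) across reactors for every genome
    that exceeds `threshold` in at least one reactor."""
    return {genome: mean(values)
            for genome, values in abundance.items()
            if any(v > threshold for v in values)}


def guild_fractions(weights: Dict[str, float],
                    guilds: Dict[str, str]) -> Dict[str, float]:
    """Total averaged relative abundance (%) per functional guild.
    Assumption: each genome belongs to exactly one guild."""
    totals: Dict[str, float] = defaultdict(float)
    for genome, weight in weights.items():
        totals[guilds[genome]] += weight
    return dict(totals)


# Hypothetical abundances in AD1, AD2 and AD3 at one time point.
abundance = {
    "genome_x": [1.0, 2.0, 3.0],
    "genome_y": [0.05, 0.02, 0.08],  # below 0.1% everywhere, excluded
    "genome_z": [0.5, 0.0, 1.0],
}
guilds = {"genome_x": "guild_1", "genome_y": "guild_2", "genome_z": "guild_1"}

weights = edge_weights(abundance)
print(weights)                           # {'genome_x': 2.0, 'genome_z': 0.5}
print(guild_fractions(weights, guilds))  # {'guild_1': 2.5}
```

Filtering before averaging mirrors the caption's rule that a population only needs to pass the threshold in one reactor, so a genome abundant in a single digester still appears, with an edge weight reduced by the average.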
Metabolic network based on the functional classification of all populations present at >0.1% relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 362. The color of the edges corresponds to the substrate node and the thickness of the edges is representative of the relative abundance of each population genome (average for the three reactors). The percentages on the right-hand side of the panel show the fraction of the community (total relative abundance) classified within each functional guild.