ORIGINAL PAPER
Relative evolutionary rates of NBS-encoding genes revealed by soybean segmental duplication
Xiaohui Zhang • Ying Feng • Hao Cheng •
Dacheng Tian • Sihai Yang • Jian-Qun Chen
Received: 29 April 2010 / Accepted: 26 October 2010 / Published online: 16 November 2010
© Springer-Verlag 2010
Abstract It is well known that nucleotide binding site
(NBS)-encoding genes are duplicate-rich and fast-evolving
genes. However, there is little information on the relative
importance of tandem and segmental NBS duplicates and
their exact evolutionary rates. The two rounds of large-
scale duplication that have occurred in soybean provide a
unique opportunity to investigate these issues. Comparison
of NBS and non-NBS genes on segments of syntenic
homoeologs shows that NBS-encoding genes evolve at least
1.5-fold faster (~1.5-fold higher synonymous and ~2.3-fold
higher nonsynonymous substitution rates) and are lost
~twofold faster than the flanking non-NBS
genes. Compared with segmental duplicates, tandem NBS
duplicates are more abundant in soybean, suggesting that
tandem duplication is the major driving force in the
expansion of NBS genes. Notably, significant sequence
exchanges along with significantly positive selection were
detected in most tandem-duplicated NBS gene families.
The results suggest that the rapid evolution of NBS genes
may be due to the combined effects of diversifying selec-
tion and frequent sequence exchanges. Interestingly, TIR–
NBS–LRR genes (TNLs) have a higher nucleotide substi-
tution rate than non-TNLs, indicating that these types of
NBS genes may have a rather different evolutionary pat-
tern. It is important to determine the exact relative evolu-
tionary rates of TNL, non-TNL, and non-NBS genes in
order to understand how fast the host plant can adjust its
response to rapidly evolving pathogens in a coevolutionary
context.
Keywords NBS–LRR genes · Evolutionary rate · Tandem and segmental duplication · Soybean
Introduction
Over the long history of the battle between plants and
pathogens, plants have evolved sophisticated mechanisms
to perceive pathogen attack and subsequently produce a
highly localized and specific response, resulting in a visible
hypersensitive response (HR) (Dangl and Jones 2001). An
important and well-characterized perception mechanism is
based on resistance (R) genes in flowering plants whose
products confer recognition of cognate avirulence (Avr)
proteins in pathogens (Dangl and Jones 2001). More than
40 R genes have been cloned over the past two decades and
most of them belong to the nucleotide-binding site and
leucine-rich repeat (NBS–LRR) genes, which form one of
the largest plant gene families. NBS–LRR proteins have
been studied extensively to understand their evolution and
the molecular basis of their functions (Martin et al. 2003;
Deyoung and Innes 2006). The N-terminal region of
Communicated by Y. Van de Peer.
X. Zhang and Y. Feng contributed equally to this work.
Electronic supplementary material The online version of this article (doi:10.1007/s00438-010-0587-7) contains supplementary material, which is available to authorized users.
X. Zhang · Y. Feng · D. Tian · S. Yang (✉) · J.-Q. Chen (✉)
State Key Laboratory of Pharmaceutical Biotechnology, School
of Life Sciences, Nanjing University, Nanjing 210093, China
e-mail: [email protected]
J.-Q. Chen
e-mail: [email protected]
H. Cheng
National Center for Soybean Improvement, National Key
Laboratory of Crop Genetics and Germplasm Enhancement,
Nanjing Agricultural University, Nanjing 210095, China
Mol Genet Genomics (2011) 285:79–90
DOI 10.1007/s00438-010-0587-7
NBS–LRR genes is structurally diverse, including the Toll/
Interleukin-1 receptor domain (TIR–NBS–LRR, TNL) and
non-TIR–NBS–LRRs (non-TNLs) (Dangl and Jones 2001).
NBS-encoding genes are often observed in clusters of
tandem repeats, suggesting that gene duplication is a
common event in this family (Meyers et al. 2003; Leister
2004; Yang et al. 2006, 2008; Li et al. 2010). Several
evolutionary and population genetic models of the fate of
duplicate genes have been proposed, which provide theo-
retical and mechanistic explanations for gene retention
(Ohno 1970). In different plants, NBS genes are found as
both singletons and gene clusters. Compared with other
gene families, the NBS gene family contains a higher
proportion of duplicated genes (Yang et al. 2008; Li et al.
2010). These duplicated NBS genes are mostly derived
from segmental and tandem duplication events (Meyers
et al. 2003). Genomic analyses in the grapevine and poplar
genomes have shown that recent tandem duplication plays
a major role in the expansion of NBS-encoding genes
(Yang et al. 2008). In the Arabidopsis genome, only 22
NBS-encoding genes were detected in both members of the
duplication pair, indicating a high frequency of NBS gene
loss after whole-genome duplication (Nobuta et al. 2005).
Interestingly, although all NBS-LRR genes were presumed
to originate from a common ancestor, considerable varia-
tions in the numbers of these genes were detected in dif-
ferent species (Li et al. 2010). It has been proposed that
rapid expansion and/or contraction of genes may be a
fundamentally important strategy employed by plants to
adapt to the rapidly changing species-specific pathogen
spectrum (Chen et al. 2010; Li et al. 2010). However,
which duplication events dominate the expansion of NBS-
encoding genes and why these duplicates tend to be
retained are questions that remain unanswered.
Recent studies have shown that plant NBS–LRRs are
incredibly adaptive in their pathways of pathogen recog-
nition and defense initiation (Deyoung and Innes 2006).
Molecular genetic studies of related NBS–LRR proteins
with different recognition specificities indicate that the
LRR domain is the primary determinant of NBS–LRR
protein for recognition specificity, and this domain is
believed to adopt an arc-shaped conformation, forming a
protein–protein interaction surface (Dangl and Jones 2001).
Genome-wide surveys of NBS gene polymorphisms have
shown that LRR regions are highly polymorphic for protein
variants and are driven by balancing or diversifying
selection (Bakker et al. 2006; Yang et al. 2006; Ding et al.
2007). However, there is little comparative data on the
accurate evolutionary rates of NBS and non-NBS genes.
Legumes, including Lotus, Medicago, Glycine, and
Phaseolus, comprise one of the three largest families of
flowering plants and are important from an agricultural
perspective because they fix atmospheric nitrogen through
closely symbiotic interactions with microorganisms
(Schlueter et al. 2007). Soybean (Glycine max) is an
important crop because of the high protein and oil content
of its seeds. This plant has undergone at least two rounds of
large-scale duplication approximately 13 and 59 million
years ago (Mya), resulting in a highly duplicated genome
with nearly 75% of the genes present in multiple copies
(Schmutz et al. 2010). The recent duplication event in
soybean provides a unique opportunity to investigate the
different fates of tandem and segmental NBS duplicates
and to determine the type of gene duplication that influ-
ences the expansion and/or contraction of NBS genes and
how this happens.
Materials and methods
Identification of NBS-encoding genes
Soybean (Glycine max; Glyma1.01; Schmutz et al. 2010),
barrel medic (Medicago truncatula; Ameline-Torregrosa
et al. 2008), birdsfoot trefoil (Lotus japonicus; V1.0; Sato
et al. 2008), and castor bean (Ricinus communis) assemblies
and gene models were obtained from the Soybean Genome
Sequencing Consortium (http://www.phytozome.net/soybean.php),
the Medicago Genome Sequence Consortium
(http://medicago.org/genome/downloads/Mt2/; MGSC 2007),
the National BioResource Project (NBRP) Legume Base
(http://www.legumebase.agr.miyazaki-u.ac.jp/index.jsp), and TIGR
castorWGS release 0.1
(http://castorbean.jcvi.org/castorbean_downloads.shtml), respectively.
To identify NBS-encoding genes in the four plant genomes,
both BLAST and hidden Markov model (HMM)
searches were performed, as described in Yang et al.
(2008) and Li et al. (2010). First, possible homologs
encoded in the plant genomes were searched for with BLASTP, using
the amino acid sequence of the NB-ARC domain (Pfam: PF00931)
as the query and a threshold expectation value of 1E-4, a
value determined empirically to filter out most of the
spurious hits. Second, the nucleotide sequences of candi-
date NBS genes were used as queries to find homologs in
their genomes by BLASTn search. This step was crucial to
find the maximum number of candidate genes. All new
BLAST hits in the genomes, together with flanking regions
of 5,000–10,000 bp at both sides, were annotated using the
gene-finding programs FGENESH with the legume plant
training set (http://www.softberry.com/) and GENSCAN
with the Arabidopsis training set
(http://genes.mit.edu/GENSCAN.html) to obtain information on complete open
reading frames (ORFs). To exclude potentially redundant
candidate NBS genes, all sequences were orientated by
BLASTn, and sequences found in the same location were
eliminated. All non-redundant candidate NBS genes were
surveyed to further verify whether they encoded NBS or
LRR motifs using the Pfam database v23.0 (E value cut-off
of $10^{-4}$) (http://pfam.janelia.org/), SMART protein motif
analyses (http://smart.embl-heidelberg.de/), and Multiple
Expectation Maximization for Motif Elicitation (MEME)
(Bailey and Elkan 2005). CC motifs were detected using
COILS with a threshold of 0.9 (Lupas et al. 1991).
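The first screening step, keeping only BLASTP hits at or below the stated expectation value of 1E-4, can be illustrated with a minimal sketch. It assumes the hits are available in the standard 12-column BLAST tabular layout, in which the E value is the 11th column; the gene identifiers and the helper name `filter_hits` are hypothetical and are not taken from the authors' pipeline.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-4  # threshold expectation value stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return the subject IDs that have at least one hit with E value <= cutoff.

    Assumption: 12-column BLAST tabular output (query, subject, identity, ...,
    evalue, bitscore), so the E value sits in column index 10.
    """
    candidates = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id, e_value = row[1], float(row[10])
        if e_value <= cutoff:
            candidates.add(subject_id)
    return candidates


# Hypothetical hits: geneA passes the cutoff, geneB does not.
example = (
    "NB-ARC\tgeneA\t45.0\t250\t100\t5\t1\t250\t10\t260\t1e-30\t120\n"
    "NB-ARC\tgeneB\t28.0\t80\t50\t3\t1\t80\t5\t85\t0.5\t30\n"
)
print(sorted(filter_hits(example)))
```

Candidates retained at this stage would still need the BLASTn expansion, gene prediction, and Pfam/SMART/MEME domain checks described above.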
Determining the segmentally duplicated NBS-encoding
genes
Soybean is a well-documented paleopolyploid whose gen-
ome has undergone at least two rounds of polyploidy and
subsequent diploidization events (Schmutz et al. 2010). To
examine the evolutionary fate of duplicated NBS-encoding
genes, segmental duplication events containing NBS-
encoding genes in both segmentally duplicated regions
were identified in the soybean genome. First, all NBS-
encoding genes were oriented on the appropriate chromo-
somes as the original anchor points by using BLASTn.
Thirty-one non-repetitive-element genes of each anchor
point, including the NBS-encoding gene and 15 flanking
genes on each side, were then compared by pairwise
BLAST analysis to identify duplicated genes between two
independent segmental blocks. When more than five gene
pairs with syntenic relationships (BLAST E value $< 10^{-10}$)
were detected, the two blocks were defined as a segmentally
duplicated region pair (Fig. 1). In addition, for small gene
families between two duplicated blocks, each syntenic gene
from homoeologue 1 was paired only once with its least
divergent counterpart in homoeologue 2.
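The block-pair rule described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the pairwise BLAST results for the two 31-gene blocks are already available as records, it uses a generic `divergence` value (for example Ks) to choose the least divergent counterpart, and it does not check gene order or orientation. The `Hit` type and all gene names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    gene1: str         # gene in homoeologue 1
    gene2: str         # gene in homoeologue 2
    e_value: float     # BLAST E value of the pairwise comparison
    divergence: float  # assumed divergence measure, e.g. Ks


E_VALUE_CUTOFF = 1e-10  # syntenic pairs need an E value below this
MIN_PAIRS = 5           # "more than five gene pairs" defines a block pair


def least_divergent_pairs(hits):
    """Pair each gene of homoeologue 1 once, with its least divergent counterpart."""
    best = {}
    for hit in hits:
        if hit.e_value >= E_VALUE_CUTOFF:
            continue  # not significant enough to count as a syntenic pair
        current = best.get(hit.gene1)
        if current is None or hit.divergence < current.divergence:
            best[hit.gene1] = hit
    return {gene1: hit.gene2 for gene1, hit in best.items()}


def is_segmental_duplicate(hits):
    """True when more than MIN_PAIRS syntenic gene pairs link the two blocks."""
    return len(least_divergent_pairs(hits)) > MIN_PAIRS


# Hypothetical example: six clean pairs plus two hits that should be ignored.
hits = [Hit(f"a{i}", f"b{i}", 1e-50, 0.15) for i in range(1, 7)]
hits.append(Hit("a1", "b7", 1e-20, 0.40))  # more divergent paralog of a1
hits.append(Hit("a8", "b8", 1e-3, 0.90))   # fails the E-value cutoff
print(is_segmental_duplicate(hits))
```

Note that the sketch enforces one pairing per gene only on the homoeologue 1 side, as the text states; whether a gene in homoeologue 2 may be reused is left open here.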
To examine the relative ages of the segmental duplica-
tion events, the synonymous substitution rates (Ks) of
the full-length nucleotide coding sequences (CDSs) with
the alignable nucleotide sequence covering >80% of the
longer gene between duplicates were calculated using the
method developed by Nei and Gojobori (1986). For each
gene pair, the Ks value was translated into divergence time
in million years based on a rate of $6.1 \times 10^{-9}$ substitutions
per site per year. The divergence time (T) was calculated
as $T = K_s / (2 \times 6.1 \times 10^{-9}) \times 10^{-6}$ Mya (Lynch and
Conery 2000).
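As a worked illustration of the conversion above, the following sketch turns a Ks value into a divergence time in Mya using the stated rate. The helper names are hypothetical. The `jukes_cantor` function shows the standard Jukes-Cantor correction that the Nei and Gojobori method applies to the observed proportion of synonymous differences; it is included only to make the meaning of Ks concrete and is not a full Nei-Gojobori implementation.

```python
import math

SUBSTITUTION_RATE = 6.1e-9  # substitutions per site per year, as stated in the text


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differences."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


def divergence_time_mya(ks, rate=SUBSTITUTION_RATE):
    """T = Ks / (2 * rate) * 1e-6, i.e. divergence time in million years."""
    return ks / (2 * rate) * 1e-6


# Average Ks of flanking non-NBS duplicates reported in the Results (0.158).
print(round(divergence_time_mya(0.158), 2))  # 12.95
```

The factor of 2 reflects that substitutions accumulate independently along both duplicated lineages after the duplication event.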
Sequence alignment and phylogenetic analysis
Multiple alignments of amino acid sequences were per-
formed using ClustalW with default options (Thompson
et al. 1994) and then with MEGA v4.0 (Tamura et al. 2007)
to manually correct the alignments. The resulting amino
acid sequence alignments were then used to guide the
alignments of CDSs. Phylogenetic trees were constructed
by the bootstrap neighbor-joining (NJ) method with a
Kimura 2-parameter model by MEGA. The stability of
internal nodes was assessed by bootstrap analysis with
1,000 replicates. The time of the duplication event for each
node in the phylogenetic tree was determined from the
average Ks value of the node (Yang et al. 2008). Nucleo-
tide divergence among paralogs was estimated from Dxy
with the Jukes and Cantor correction (Lynch and Crease
1990) using MEGA v4.0. Using a sliding-window search
(15 ORFs as a window size), a gene cluster was defined as
a region in which two neighboring homologous genes
were ≤15 ORFs apart. Similar to the definition of a cluster
in Meyers et al. (2003), this is a useful operational definition
because the number of clusters changed only slightly
when the maximum number of intervening ORFs was
increased to 20 or even 30. Gene families were defined as
groups of NBS-encoding genes in which each gene dis-
plays at least 70% nucleotide identity to at least one other
member of the group.
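The two operational definitions above (gene clusters and gene families) can be sketched in a few lines. This is an illustrative reading, not the authors' code: it interprets "≤15 ORFs apart" as at most 15 intervening ORFs, it ignores the homology requirement between neighboring genes, and it treats the family rule as single-linkage grouping on pairwise nucleotide identity. All function names and inputs are hypothetical.

```python
def find_clusters(orf_positions, max_intervening=15):
    """Split NBS gene positions (ORF indices on one chromosome) into clusters.

    Assumption: two neighboring genes belong to the same cluster when no more
    than max_intervening ORFs lie between them.
    """
    groups, current = [], []
    for pos in sorted(orf_positions):
        if current and pos - current[-1] - 1 > max_intervening:
            groups.append(current)
            current = []
        current.append(pos)
    if current:
        groups.append(current)
    clusters = [g for g in groups if len(g) >= 2]
    singletons = [g[0] for g in groups if len(g) == 1]
    return clusters, singletons


def find_families(genes, identity, min_identity=70.0):
    """Group genes so each member has >= min_identity to at least one other member."""
    parent = {g: g for g in genes}

    def root(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for (a, b), pct in identity.items():
        if pct >= min_identity:
            parent[root(a)] = root(b)

    members = {}
    for g in genes:
        members.setdefault(root(g), []).append(g)
    # Genes with no partner above the threshold are not counted as a family.
    return [sorted(m) for m in members.values() if len(m) >= 2]


print(find_clusters([3, 10, 30, 100]))
print(find_families(
    ["a", "b", "c", "d"],
    {("a", "b"): 85.0, ("b", "c"): 72.0, ("a", "c"): 60.0, ("c", "d"): 40.0},
))
```

Under single linkage, genes a and c fall in the same family even though their direct identity is below 70%, because both are linked through b; this matches the "at least one other member" wording.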
The xxLxLxx motif of the LRR domain (L = Leu or
another aliphatic amino acid; x = any amino acid) was
regarded as the determinant of recognition specificity for Avr
factors. To determine the evolutionary dynamics of dupli-
cated genes, we estimated the ratios of nonsynonymous (Ka)
to synonymous (Ks) nucleotide substitutions of the NBS
domain, LRR region, and xxLxLxx motif between NBS
paralogs. Ka and Ks were calculated using DnaSp v4.0
(Rozas et al. 2003) on the basis of the Nei and Gojobori
(1986) method. A Ka/Ks ratio greater than 1 suggests posi-
tive selection, and a ratio less than 1 indicates purifying
selection. The average Ka/Ks ratio for a gene family was
calculated by taking the arithmetic mean of all possible
pairwise combinations. Recombination has a profound
impact on evolutionary rates and analysis of selection
Fig. 1 An example of the highly syntenic relationships of segmen-
tally duplicated non-repetitive-element genes between two blocks on
chromosomes 10 and 20. The least divergent counterparts are linked
by shadows. Colored arrows (red, NBS–LRR genes; black, other
genes) indicate the positions and orientations of the predicted genes.
Dotted lines indicate that no corresponding ORFs or homoeologs are
detected in the same position compared with its counterpart (color
figure online)
pressure may be confounded by the presence or absence of
recombination. Next, the genetic algorithm for recombination
detection (GARD) tool, as implemented on the Datamonkey
web server, was run with default settings to detect the presence or
absence of recombination and the location of recombination
breakpoints (http://www.datamonkey.org; Pond et al. 2006).
A NEXUS file with partition information output created by
GARD was subsequently used as the input for the positive
selection analysis; this method allows the analysis of
recombinant sequences for positive selection (Pond et al.
2006). The HyPhy package with the random effects likelihood
(REL) method run on the resulting GARD fragments
(Pond et al. 2005, 2006) was used to detect positively and
negatively selected sites with posterior probability >0.98.
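The family-level Ka/Ks summary described in this paragraph (the arithmetic mean over all pairwise combinations) can be sketched as below. It assumes that Ka and Ks have already been estimated for every pair, for example by DnaSP, and that pairs with Ks = 0 are skipped because the ratio is undefined; that handling is an assumption and is not stated in the text. The function name and gene labels are hypothetical.

```python
from itertools import combinations
from statistics import mean


def family_ka_ks(members, pairwise):
    """Summarize Ka/Ks over all pairwise combinations within one gene family.

    pairwise maps (gene_a, gene_b), with gene_a < gene_b, to a (Ka, Ks) tuple.
    Returns the arithmetic mean ratio and the fraction of pairs with Ka/Ks > 1,
    or None when no pair has a usable Ks.
    """
    ratios = []
    for a, b in combinations(sorted(members), 2):
        ka, ks = pairwise[(a, b)]
        if ks > 0:  # assumption: skip pairs where the ratio is undefined
            ratios.append(ka / ks)
    if not ratios:
        return None
    return {
        "mean": mean(ratios),
        "frac_gt1": sum(r > 1 for r in ratios) / len(ratios),
    }


# Hypothetical family of three paralogs.
summary = family_ka_ks(
    ["g1", "g2", "g3"],
    {("g1", "g2"): (0.30, 0.20), ("g1", "g3"): (0.10, 0.20), ("g2", "g3"): (0.24, 0.20)},
)
print(summary)
```

The `frac_gt1` value corresponds to the kind of criterion used in the Results, where families are counted when Ka/Ks exceeds 1.0 in more than half of the pairwise combinations.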
Sequence exchange was also investigated with GENE-
CONV 1.81 (http://www.math.wustl.edu/sawyer/geneconv/).
The default setting of 10,000 permutations was used for the
analysis. The statistical significance of gene conversion events
was defined as a global permutation P value <0.05.
Results
NBS-encoding gene families and their recent
duplication
We performed BLAST and HMM searches and identified
429 NBS-encoding genes in the soybean genome. Of these
genes, 154, 236 and 39 were identified as TIR–NBS–LRR
(TNL), non-TIR–NBS–LRR (non-TNL) and NBS genes
lacking the LRR domain, respectively (Table 1). In a pre-
vious study, 330 nonredundant NBS-encoding genes were
identified in the M. truncatula draft genome (Mt 1.0;
Ameline-Torregrosa et al. 2008). Using the same methods
and an updated database (Mt 2.0), 392 NBS-encoding
genes, including 143 TNLs, 193 non-TNLs, and 56 NBSs
were re-identified in the Medicago genome (Table 1). In
another model legume species, L. japonicus, 95 NBS-
encoding genes with the LRR domain and 63 NBS genes
lacking LRRs were identified in ~67% of the sequenced
genome. In castor bean (Ricinus communis), another oil-
seed plant, 111 NBS–LRR and 18 NBS genes were
identified.
When nucleotide identity >70% was used as the criterion
for predicting multigene families (see “Materials and
methods” for details), soybean was found to have 60 NBS
gene families, containing 76.7% of the NBS-encoding
genes, with 5.45 members per gene family (Table 2).
Similarly, Medicago had 81 NBS gene families containing
82.9% of the NBS-encoding genes, with 4.01 members per
gene family. However, when the same criterion was used
for predicting multigene families in the Arabidopsis and
rice genomes, only 48.5 and 37.0% of the R genes were
found to belong to multigene families, respectively.
Moreover, the multigene family of Arabidopsis had 3.23
members per family while that of rice had 2.63 members
per family. These results suggest that more NBSs were
retained in the two legume genomes. Additionally, in the
Table 1 Number of genes that encode domains similar to NBS genes in plant genomes
Predicted protein domains   Letter code   G. max   M. truncatula   L. japonicus   R. communis
Total NBS encoding genes                  429      392             158            129
NBS–LRR type genes
  TIR–NBS–LRR               TNL
    TIR–NBS–LRR             TNL′          126      58              26             28
    X-NBS–LRR               XNL           26       74              22             5
    X-cc–NBS–LRR            XCNL          2        11              1              0
    Total                                 154      143             49             32
  Non-TIR–NBS–LRR           Non-TNL
    CC–NBS–LRR              CNL           94       102             23             54
    X-NBS–LRR               XNL           142      85              20             25
    LRR–NBS–TIR             LNT           0        6               3              0
    Total                                 236      193             46             79
  Total                                   390      336             95             111
NBS type genes
    TIR–NBS                 TN            11       8               8              2
    X-NBS (TIR type)        XN′           9        24              23             4
    CC-NBS                  CN            3        10              6              7
    X-NBS (CC type)         XN″           16       12              16             5
    NBS–TIR (CC type)       NT            0        2               10             0
    Total                                 39       56              63             18
soybean genome, 87.2% (287/329) of the NBS paralogs
were located in clusters, indicating that tandem duplication
has contributed to the expansion of the recently duplicated
NBS genes.
When more stringent criteria were applied to identify
recently duplicated NBSs, approximately 14.2, 27.3, and
23.1% of the NBS duplicates in soybean and 15.3, 14.3, and
33.7% of the NBS duplicates in Medicago were identified as
having homologous members with ≥95, 90–95, and
80–90% nucleotide identities, respectively (Fig. 2). These
values were significantly higher than those in Arabidopsis
(0, 5.8, and 21.4%, respectively) and rice (1.9, 3.85, and
13.8%, respectively) (paired t test, P < 0.05; Fig. 2). In
other words, approximately 41.5% of the NBSs had
homologs with >90% nucleotide identity in the soybean
genome, while the corresponding values in the Arabidopsis
and rice genomes were only 5.8 and 5.75%, respectively.
The Medicago genome had the largest proportion of NBS
duplicates (33.7%), with identity levels ranging from 80 to
90%, while the rice genome had the lowest proportion of
NBS duplicates with an identity level exceeding 80%.
In phylogenetic trees, when the phylogenetic clade was
defined as having nucleotide identity >70% between paralogs
and bootstrap value >70% (Table S1), it was clear
that the rice genome had the largest number of R clades
(400), which was significantly higher than the value in
soybean (160), Medicago (148), and Arabidopsis (115).
Moreover, soybean and Medicago had more R paralogs per
clade (2.68 and 2.65, respectively) than Arabidopsis (1.50)
and rice (1.30).
Segmentally duplicated NBS-encoding genes and their
evolutionary rate
Gene expansions are common in NBS-encoding proteins
(Meyers et al. 2003; Yang et al. 2006, 2008; Li et al. 2010).
Approximately 72.7 and 81.9% of NBSs were tandemly
arrayed and physically clustered on chromosomes in the
soybean and Medicago genomes, respectively (Tables S2
and S3). In total, 70 and 56 NBS clusters were found,
averaging 4.51 and 5.73 NBS-encoding genes per cluster in
soybean and Medicago, respectively. The largest clusters
contained 18 and 19 NBSs in soybean (on chromosome 3)
and Medicago (on chromosome 6), respectively.
Soybean has undergone at least two rounds of large-
scale duplication approximately 13 and 59 Mya, and
approximately 50% of the duplicates in the segmental
regions have been retained (Schlueter et al. 2007). Our data
showed that a total of 183 NBS loci, consisting of 113
singletons and 70 R clusters, were present in soybean. To
examine the evolutionary fate of these duplicated NBSs,
segmentally duplicated blocks were identified pairwise (see
“Materials and methods” for details). Interestingly, only 30
segmentally duplicated block pairs with syntenically
homoeologous NBS–LRR genes and their flanking genes
were unambiguously detected (Fig. 1; Table S4), and these
included 89 NBS-encoding genes (34 TIR- and 55 non-
TIR-NBSs). In the remaining 123 NBS loci (67.2% of the
total NBS loci), no unambiguous syntenic region was
detected, and even when the syntenic region was detected,
its homoeologous NBS genes were not found in the seg-
mentally duplicated region, suggesting that most of the
segmentally duplicated NBS genes or loci were lost.
In the 30 unambiguous pairs of segmentally duplicated
blocks that contained NBSs in both regions, approximately
31–68% of the duplicated non-NBS genes (flanking genes)
were retained. In these cases, most retained homologs had
approximately 90% or greater nucleotide identity, with a
few extremes. To estimate the timing of the duplication
Table 2 Detection of NBS gene families in the genomes of soybean, Medicago, Arabidopsis, and rice
              Family    Gene      Proportion of NBSs   Average number
              numbers   numbers   in families (%)      per family
Soybean       60        329       76.7                 5.45
Medicago      81        325       82.9                 4.01
Arabidopsis   26        84        48.5                 3.23
Rice          73        192       37.0                 2.63
Fig. 2 Distribution of recently generated NBS duplicates in the
genomes of soybean, Medicago, Arabidopsis, and rice
event that created the two duplicated blocks, pairwise
synonymous substitutions (Ks) were calculated between
duplicated NBSs and non-NBSs. Interestingly, only a sin-
gle distinct peak of Ks distribution was detected in all pairs
of duplicated non-NBS genes (Fig. 3a). The average Ks
value was 0.158 ± 0.085 (range 0.021–0.552). This value
compares well with those reported in previous analyses of
gene duplicates derived from the most recent whole gen-
ome duplication event in soybean, where the genome-wide
computation of duplicated gene pairs showed an average
Ks of 0.13 (Schmutz et al. 2010) and 23 homologous gene
pairs showed an average Ks of 0.147 (Van et al. 2008). On
the basis of a substitution rate of $6.1 \times 10^{-9}$ substitutions
per site per year (Lynch and Conery 2000), the divergence
time of segmental duplication was approximately 12.95
Mya, which is consistent with previous findings (~13
Mya; Schlueter et al. 2007; Schmutz et al. 2010). These
results indicate that all of the retained segmentally duplicated
regions, including the duplicated NBS genes in both
regions, were products of the recent large-scale duplication.
In more ancient segmentally duplicated regions
(~59 Mya), NBS genes were lost in either one or both
regions, suggesting that NBS genes were lost more fre-
quently than non-NBS genes.
In comparison with the Ks distribution in segmental
non-NBS duplicates, the Ks distribution in segmental NBS
duplicates displayed a broader peak and higher average
value (0.268 ± 0.121, range 0.057–0.553; Fig. 3b). The
higher Ks value suggests that NBS genes may be evolving
more rapidly than non-NBS genes. However, there are at
least two other possible explanations for the higher Ks
values of the duplicated NBS genes than their flanking non-
NBS gene pairs other than higher evolutionary rate in these
genes: (1) frequent sequence exchange between NBS par-
alogs and (2) the deeper coalescence time of these dupli-
cated NBS gene pairs. More stringent standards were
required to identify the exact cause.
First, for a single duplicated NBS locus, three unam-
biguously syntenically duplicated NBS pairs were detected
using the syntenically homoeologous region of Medicago
as the outgroup (Glyma09g07020 and Glyma15g18290
vs. Med AC186731_16.6; Glyma01g37620 and Glyma11g07680
vs. Med CR955005_30.5; Glyma07g04140
and Glyma16g00860 vs. Med AC160841_18.4). In the
three duplicated NBS gene pairs, the average Ks (0.21) of
the NBS duplicates was ~1.51-fold higher than that
(0.139) of the flanking gene pairs in soybean. Second, for
these segmentally duplicated single NBS gene pairs for
which orthologous NBSs could not be found in Medicago,
if the nearest neighboring genes on both sides along with
the corresponding intergenic regions can be aligned and
there are no more than 10 bp indels throughout the align-
ment, these duplicated NBS gene pairs could also be
considered as candidate syntenically duplicated NBS pairs.
Six duplicated single NBS pairs met these criteria. Consistent
with the above results, the average Ks (0.199) of the six
NBS pairs was ~1.4-fold higher than that of the flanking
gene pairs (Ks = 0.142; t test, P < 0.05). Third, for
duplicated multi-NBS loci, each syntenic NBS gene from
homoeologue 1 was paired only once with its least diver-
gent counterpart in homoeologue 2. No sequence exchange
event was detected between the NBS paralogs in nine
segmentally duplicated block pairs, for which the average
Ks (0.229) of the duplicated NBS pairs was ~1.44-fold
higher than that of the flanking gene pairs (Ks = 0.159;
t test, P < 0.05). These results support the conclusion that
the relative evolutionary rate of duplicated NBS genes is
Fig. 3 Ks distribution of syntenically homoeologous genes. a–d Syn-
tenically homoeologous non-NBS, NBS, non-TNL, and TNL genes
whose average Ks values are 0.158, 0.268, 0.215, and 0.345,
respectively
significantly higher, at least by 1.4-fold, than that of the
flanking non-NBS genes (t test, P < 0.05).
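The fold differences quoted in this paragraph follow directly from the reported average Ks values. The short sketch below recomputes them; the Ks values come from the text, while the labels and the `fold_change` helper are only illustrative.

```python
def fold_change(ks_nbs, ks_flanking):
    """Ratio of the average Ks of NBS pairs to that of their flanking gene pairs."""
    return ks_nbs / ks_flanking


# (average Ks of NBS pairs, average Ks of flanking pairs), as reported in the text
comparisons = {
    "single NBS pairs with a Medicago outgroup": (0.21, 0.139),
    "single NBS pairs with alignable neighbors": (0.199, 0.142),
    "multi-NBS loci without sequence exchange": (0.229, 0.159),
}

for label, (nbs, flanking) in comparisons.items():
    print(f"{label}: {fold_change(nbs, flanking):.2f}-fold")
```

All three ratios fall between 1.40 and 1.51, which is the basis for the statement that NBS duplicates accumulate synonymous substitutions at least 1.4-fold faster than their flanking genes.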
On the other hand, for those multi-NBS loci with frequent
sequence exchange events, the average Ks (0.29) of
the least divergent NBS pairs was ~1.77-fold higher than
the Ks (0.164) of the flanking gene pairs (t test, P < 0.01),
which is also higher than the average Ks (0.215) of the
three aforementioned duplicated NBS pairs. These results
suggest that a deeper coalescence time and/or frequent
sequence exchanges between paralogs contribute to a cer-
tain extent to the diversification of complex NBS genes.
Interestingly, the single peak of Ks distribution of the
duplicated NBSs could be divided into two independent
peaks (Fig. 3c, d). One peak corresponded to the distribu-
tion of Ks from non-TNL duplicates, and the other was
attributed to TNL duplicates. The average Ks (0.345) of the
TNL duplicates was significantly higher than that of the
non-TNL duplicates (0.215; t test, P < 0.001), but no
significant difference (P = 0.503) was observed between
the duplicated non-NBS genes that flanked the NBSs in the
same duplicated block pairs (Ks = 0.162 for non-TNLs
and Ks = 0.153 for TNLs). This finding suggests that
different rates of birth and death, which would influence
the ability to identify true homoeologs, might result in the
different Ks distribution between TNL and non-TNL
paralogs.
Detecting positive selection and sequence exchange
in R duplicates
To detect positive selection between NBS paralogs, the
ratio of the non-synonymous substitution rate to synony-
mous substitution rate (Ka/Ks) was calculated for the
xxLxLxx motifs in the LRR region, which were regarded
as determinants of recognition specificity for Avr factors.
Ka/Ks ratios were determined for all pairwise comparisons
within each gene family. In soybean, 27 of 72 families (37.5%)
displayed a Ka/Ks > 1.0 in >50% of pairwise
combinations. In Medicago, 15 of 81 NBS families
(18.5%) met these criteria (Table 3; Tables S5 and S6).
The proportion of NBS families in which the average Ka
is significantly larger than Ks was twofold higher in
soybean (37.5%) than in Medicago (~18.5%). In addition,
the HyPhy package with the REL method run on the
resulting GARD fragments was used to detect positively
and negatively selected sites with posterior probability
>0.98 (Table 3; Fig. 4). The proportion of NBS families
with positively selected sites was approximately 85.7
and 58.3% in soybean and Medicago, respectively. The
proportion of NBS families in which the Ka/Ks ratio was
greater than 1.0 was higher in soybean than in Medicago.
Consistent with this, the number of sites under positive
selection for NBS families was greater in soybean than in
Medicago, suggesting that domestication might have
contributed to the rapid diversification of NBS genes in
soybean.
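The percentages in this paragraph can be recomputed from the family counts in Table 3. The sketch below does only that arithmetic; the counts are taken from the table, and the `percent` helper is illustrative.

```python
def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


# (families meeting the criterion, families examined), from Table 3
ka_gt_ks = {"soybean": (27, 72), "Medicago": (15, 81)}
with_selected_sites = {"soybean": (36, 42), "Medicago": (28, 48)}

for species in ("soybean", "Medicago"):
    print(
        species,
        percent(*ka_gt_ks[species]),
        percent(*with_selected_sites[species]),
    )
# soybean 37.5 85.7
# Medicago 18.5 58.3
```

The ratio 37.5/18.5 is about 2.0, which matches the twofold difference stated above.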
In tandem-duplicated NBS families of soybean, the
average Ka/Ks values in the CDS and xxLxLxx region
were 0.67 and 1.17, respectively, which were significantly
higher than those in segmentally duplicated NBS gene
pairs (0.489 in the CDS and 0.628 in the xxLxLxx region;
P < 0.001). This indicates that more amino acid substitutions
were selectively accumulated in tandem-duplicated
NBS paralogs than in segmentally duplicated NBS
pairs.
In the 30 unambiguously segmentally duplicated block
pairs in soybean, the average Ka/Ks values in syntenic
NBSs and the flanking gene pairs were 0.489 and 0.291,
respectively. This suggests that relaxation of negative
selection or enhancement of positive selection was signif-
icantly stronger in the NBS paralogs than in other genes
(P < 0.001). In the genes flanking the NBSs, only two
duplicated pairs showed evidence of positive selection—a
PMP22-like gene with a Ka/Ks ratio of 1.58 and a RAD-
like gene with a Ka/Ks ratio of 1.35. All other retained
gene pairs flanking the NBSs appeared to be under purifying
selection. In addition, six homoeologous NBS pairs with
significant Ka > Ks (P < 0.05) were observed. Interestingly,
the average Ka/Ks (0.555 in the CDS and 0.752 in
the LRR regions) in the TNL segmentally duplicated pairs
was significantly higher than that in the non-TNL duplicated
pairs (0.441 and 0.514, respectively; P < 0.05),
suggesting that positive selection might play an important
role in the rapid diversification of TNL paralogs.
Table 3 Detection of positive selection and positively selected sites in soybean NBS paralogs
             ≥2 members in a family      ≥3 members in a family
             Family    Families with     Family    Families with positively
             numbers   Ka > Ks (a)       numbers   selected sites (b)
Soybean      72        27                42        36 (average 38.6 sites per family)
Medicago     81        15                48        28 (average 35.6 sites per family)
(a) Average Ka > Ks between the xxLxLxx motifs in a family (Tables S5 and S6)
(b) Positively selected sites with posterior probability >0.98 were detected using the HyPhy package with the REL method
In families with ≥3 NBS paralogs, approximately
69.3% of positively selected sites and 71.1% of negatively
selected sites were clustered in the LRR and NBS regions,
respectively, and this finding is consistent with that of
previous studies (Mondragon-Palomino et al. 2002; Chen
et al. 2010). Interestingly, similar selection pressures were
detected in most NBS family pairs between the soybean
and Medicago genomes. For example, 5 Rpg1-b homologs
were detected in Medicago (Table S7). In the Rpg1-b locus,
there were 38 positively selected sites and no negatively
selected sites, and the average Ka/Ks in the xxLxLxx motif
region was 4.5 (Fig. 4a). Similarly, 128 positively selected
sites, 7 negatively selected sites, and an average Ka/Ks of
1.12 were found between 8 Rpg1-b homologs (Table S7) in
the soybean genome (Fig. 4b), suggesting that positive
selection was detected in the Rpg1-b locus of both soybean
and Medicago. In the Rps1 locus of soybean, 111 positively
selected sites and 66 negatively selected sites were detec-
ted, and Ka/Ks was found to be 1.41 between 17 Rps1-like
homologs (Fig. 4d). Three positively selected sites, no negatively
selected sites, and a Ka/Ks ratio of 1.12 were found
between 6 Rps1-like homologs in Medicago (figure not
shown; Table S7). Similar purifying selection was detected
in the RCT1 locus in both soybean and Medicago: 2 pos-
itively selected sites, 137 negatively selected sites and a
Ka/Ks value of 0.612 were detected between 5 RCT1-like
paralogs in Medicago (Fig. 4f), while 5 positively selected
sites, 4 negatively selected sites, and a Ka/Ks ratio of 0.557
were found between 6 RCT1-like paralogs in soybean
(figure not shown).
Fig. 4 The posterior probability of positively (abbreviated as P on
the Y axis) and negatively (abbreviated as N on the Y axis) selected
sites in four functional NBS–LRR loci. The X axis denotes the
position in the amino acid alignment. Sites with blue bars have
posterior probability >0.98, as determined by the HyPhy package
with the REL method run on the resulting GARD fragments. The
boxes under each graph denote the domain structures of the nucleotide
sequences in the group as identified using Pfam. a The sequence
group (including 5 Rpg1-b homologs) in Medicago. b, c The sequence
groups on chromosome 13 (including 8 Rpg1-b homologs) and its
segmentally duplicated region on chromosome 15 (including 11
Rpg1-b homologs) in soybean, respectively. d–f The sequence groups
of Rps1 (17 homologs) and RPP4 (3 homologs) from soybean and
RCT1 (5 homologs) from Medicago, respectively
Different selective pressures were detected between
segmentally duplicated NBS gene families which diversi-
fied rapidly. For example, Fig. 4b and c represent groups of
sequences of the Rpg1-b locus on chromosome 13 (8
homologs; Table S7) and its segmentally duplicated region
on chromosome 15 (11 homologs; Table S7) in soybean,
respectively. Between these two segmentally duplicated
regions, the collinearity is quite good for the flanking non-
NBSs, but not for most NBS genes. The cluster on chro-
mosome 13 that contains Rpg1-b appears to be completely
absent from chromosome 15, which was also observed by
Innes et al. (2008). Interestingly, positive selection was
detected in 8 Rpg1-b-like homologs on chromosome 13
(Fig. 4b). However, only 13 positively selected sites, 30
negatively selected sites, and a Ka/Ks value of 0.89 were
found between the homologs on the duplicated region on
chromosome 15 (Fig. 4c), suggesting the opposite selection
pressure (purifying selection) on these homologs compared
with their counterpart genes on chromosome 13.
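As an illustrative aside, the Ka/Ks values reported above for the Rpg1-b, Rps1, and RCT1 loci can be compared against the neutral expectation of 1. The minimal Python sketch below does only that. The dictionary layout and the function name selection_regime are assumptions made for illustration, and the labels rest on point estimates alone, not on the site-level REL tests used in this study.

```python
# Ka/Ks values as reported in the text for each locus and sequence group.
REPORTED_KA_KS = {
    ("Rpg1-b", "Medicago"): 4.5,          # xxLxLxx motif region
    ("Rpg1-b", "soybean chr13"): 1.12,
    ("Rpg1-b", "soybean chr15"): 0.89,
    ("Rps1", "soybean"): 1.41,
    ("Rps1", "Medicago"): 1.12,
    ("RCT1", "Medicago"): 0.612,
    ("RCT1", "soybean"): 0.557,
}


def selection_regime(ka_ks, neutral=1.0):
    """Label a Ka/Ks point estimate relative to the neutral expectation.

    This is a hypothetical helper: it applies no significance test.
    """
    if ka_ks > neutral:
        return "positive"
    if ka_ks < neutral:
        return "purifying"
    return "neutral"


regimes = {key: selection_regime(value) for key, value in REPORTED_KA_KS.items()}

for (locus, group), regime in regimes.items():
    print(f"{locus} ({group}): Ka/Ks = {REPORTED_KA_KS[(locus, group)]} -> {regime}")
```

Under this simple threshold, the two segmentally duplicated Rpg1-b regions in soybean fall on opposite sides of 1, which is the contrast drawn in the paragraph above.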
Sequence exchanges, including recombination, gene
conversion, and/or unequal crossing-over, were detected in
the NBS families. Interestingly, more events of sequence
exchange were found in families under positive selection
(Table S7; Fig. 4a, b, d, e) than in those under negative
selection (Table S7; Fig. 4f). Sequence exchanges were
also detected in other NBS families with ≥3 members.
Notably, significant sequence exchanges (P < 0.05) were
found in most NBS families (15 of 18 in soybean and 10 of
11 in Medicago) with significant Ka > Ks (Tables S5 and
S6). These results suggest that sequence exchanges play an
important role in rapidly fixing a beneficial mutation and
effectively removing a deleterious mutation.
Atypical domain combinations of NBS-encoding genes
A previous study has shown that there are nine domain
arrangements in the TIR–NBS subfamily in Medicago,
suggesting diverse domain combinations in the TIR–NBS
subfamily: N, NL, NT, NTNL, TN, TNL, TNLT, TNLTNL,
TNTNL, and TTNL (Ameline-Torregrosa et al. 2008).
Interestingly, 21 other atypical NBSs with unconven-
tional domain combinations were detected in this study,
including 6 LRR–NBS–TIRs (LNTs) and 2 NBS–TIRs
(NTs) in the Medicago genome and 3 LNTs and 10 NTs in
the L. japonicus genome (Table 1; Table S8). To avoid
errors in gene annotation, we compared these predicted
genes with available expressed sequence tags (ESTs) from
GenBank. By applying a high match stringency of at least
98% nucleotide identity between the ESTs and predicted
genes, we found that 7 of 8 NBS genes in Medicago had
EST support (Table S8). Previous studies have shown that
phylogenies calculated from the NBS domain robustly
distinguish TIR–NBS and non-TIR–NBS genes (Meyers
et al. 2003; Ameline-Torregrosa et al. 2008; Yang et al.
2008). However, all these NTs or LNTs were clustered in
one clade on the non-TIR–NBS phylogenetic branch, in
contrast to previous diverse domain combinations of the
TIR subfamily whose NBS domains were clustered in the
TIR–NBS phylogenetic branch (Ameline-Torregrosa et al.
2008; Yang et al. 2008). These findings suggest that these
atypical NBSs might result from recent lineage-specific
domain combinations between a non-TIR–NBS domain
and a TIR domain.
Discussion
Diversifying selection drives the higher evolutionary
rate of NBS genes than that of other genes
Plant populations are often extremely diverse in their
resistance to pathogens, and interactions at the molecular
level are often complex because of the coevolution of hosts
and pathogens, which are engaged in a never-ending con-
voluted battle of molecular one-upmanship (Deyoung and
Innes 2006). Thus, the coevolutionary arms race dynamics
in the gene-for-gene interaction is expected to result in a
high rate of turnover of resistance gene alleles (Flor 1956).
Previous studies have shown that most NBS–LRR genes
undergo rapid adaptive evolution with more rapid struc-
tural and functional divergence and genomic reorganiza-
tion than other genes (Leister et al. 1998; Ding et al. 2007;
Zhang et al. 2009; Chen et al. 2010). However, it is difficult
to accurately determine the differences in evolutionary
rates between NBS genes and other genes.
Comparison of NBS and non-NBS genes on segments
between syntenic homoeologs allowed investigation of
their relative evolutionary rates. Using Medicago as the
outgroup, we found that the average Ka, Ks, and Ka/Ks
values of the 3 unambiguously syntenically duplicated
NBS gene pairs were about 2.3-, 1.51-, and 1.6-fold higher
than those of the flanking non-NBS gene pairs, respectively,
suggesting a higher evolutionary rate and more amino
acid substitutions between the NBS duplicates. In addition,
the gene loss rate of the duplicated NBSs was ~twofold
higher than that of genes across the whole genome. Specifically,
approximately 65.0% of the 183 NBS gene loci had lost their
syntenic NBS homoeologs, while only 32.7% (15166/46430)
of gene pairs were missing across the whole soybean
genome (Schmutz et al. 2010). These results suggest that
the evolutionary rates of NBS genes, including the nucleotide
substitution and gene loss rates, were at least 1.5-fold
higher than those of other genes.
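The gene loss comparison above can be re-derived from the reported counts. The sketch below is plain arithmetic in Python. The variable names are illustrative, and the NBS loss fraction is taken as the reported 65.0% and is not recomputed from locus counts.

```python
# Whole-genome figures quoted in the text (Schmutz et al. 2010).
missing_pairs = 15166
total_pairs = 46430

# Fraction of gene pairs missing across the whole genome.
genome_loss = missing_pairs / total_pairs

# Reported fraction of NBS loci whose syntenic homoeolog was lost.
nbs_loss = 0.650

# Relative loss rate of NBS genes versus the genome-wide background.
fold = nbs_loss / genome_loss

print(f"genome-wide loss: {genome_loss:.1%}")
print(f"NBS loss relative to genome: {fold:.2f}-fold")
```

The ratio comes out just under 2, in line with the approximately twofold figure given in the text.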
Interestingly, among the NBS families present in the
soybean genome, 85.7% were observed to have significant
positively selected sites, and 37.5% were found to have
significant Ka > Ks (P < 0.05). Furthermore, the Ka of the
duplicated NBS gene pairs was ~2.3-fold greater than that
of the other gene pairs flanking the NBS genes. Collectively,
the results suggest that diversifying selection could be the
main driving force for the rapid diversification of NBS genes.
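The two percentages quoted above follow from the family counts in Table 3, and they use different denominators: families with at least 2 members for Ka > Ks, and families with at least 3 members for positively selected sites. A short Python sketch of that arithmetic follows. The dictionary keys are illustrative names, not identifiers from the original analysis.

```python
# Family counts copied from Table 3.
TABLE3 = {
    "soybean": {"families_ge2": 72, "ka_gt_ks": 27,
                "families_ge3": 42, "with_pos_sites": 36},
    "medicago": {"families_ge2": 81, "ka_gt_ks": 15,
                 "families_ge3": 48, "with_pos_sites": 28},
}


def proportions(row):
    """Return (fraction with Ka > Ks, fraction with positively selected sites)."""
    frac_ka = row["ka_gt_ks"] / row["families_ge2"]
    frac_sites = row["with_pos_sites"] / row["families_ge3"]
    return frac_ka, frac_sites


for species, row in TABLE3.items():
    frac_ka, frac_sites = proportions(row)
    print(f"{species}: Ka > Ks in {frac_ka:.1%} of families, "
          f"positively selected sites in {frac_sites:.1%} of families")
```

For soybean this reproduces the 37.5% and 85.7% given in the text. The Medicago values are derived here from Table 3 only and are not quoted in the text.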
On the other hand, significant sequence exchanges were
detected in most NBS families with significant Ka > Ks. In
contrast, gene conversion was not detected between the
non-NBS homoeologous gene pairs of the flanking regions
of the NBSs (Innes et al. 2008). Hayes et al. (2004) showed
that recombination within an NBS-encoding cluster produced
new variants that conferred resistance to soybean
mosaic virus in soybean, which indicated that sequence
exchange could play an important role in rapidly fixing a
beneficial mutation. Therefore, the combined effects of
frequent sequence exchanges and diversifying selection are
a possible mechanism by which plants obtain new R genes
to tackle rapidly changing pathogens.
TIR–NBS genes evolve faster than non-TIR–NBS
genes
NBS-encoding proteins can be categorized into two major
types, TNLs and non-TNLs, based on the structure of the
N-terminus domain that contains either a TIR or a coiled-
coil (CC) domain (Meyers et al. 2003; Yang et al. 2008).
TNLs are relatively homogeneous and form a single clade,
while non-TNLs form multiple clades (Yang et al. 2008; Li
et al. 2010). Our previous studies have shown that these
two types of NBS genes differ in terms of their evolu-
tionary pattern (Yang et al. 2008; Chen et al. 2010). In the
grapevine, poplar, Arabidopsis, and rice genomes, exten-
sive species-specific expansion was detected in TNLs,
while most non-TNLs within a clade of the phylogenetic
tree were poly- or paraphyletic (Yang et al. 2008). These
results indicate that these two types of NBSs might be
responsible for recognizing different types of pathogens.
Comparison of the closely related species A. thaliana and
A. lyrata shows that the branch length of the phylogenetic
tree between TNL homoeologs is significantly longer than
that between non-TNL homoeologs, suggesting that TNLs
might evolve faster than non-TNLs (Chen et al. 2010).
The recent segmental duplication of the soybean genome
also provides an opportunity for estimating the relative
evolutionary rate of TNL and non-TNL genes. Our study
shows that the synonymous substitution rate of segmental
TNL duplicates (average Ks = 0.345) was approximately
1.6-fold higher than that of non-TNLs (average Ks =
0.215). A similar result was obtained in our earlier study,
where we found that the Ks of homoeologous TNL pairs was
significantly higher (~1.4-fold; P < 0.05) than that of
homoeologous non-TNL pairs between A. thaliana and
A. lyrata (Chen et al. 2010).
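The fold difference quoted for soybean follows directly from the two reported average Ks values. The Python lines below simply restate that division. The variable names are illustrative, and no per-gene values are assumed.

```python
# Average synonymous substitution rates reported for segmental duplicates.
ks_tnl = 0.345       # TNL pairs
ks_non_tnl = 0.215   # non-TNL pairs

# Relative synonymous substitution rate of TNLs versus non-TNLs.
ks_fold = ks_tnl / ks_non_tnl

print(f"TNL / non-TNL Ks ratio: {ks_fold:.2f}")
```

The ratio rounds to 1.6, matching the value stated above.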
Notably, during plant evolution, TNLs and non-TNLs
exhibit different evolutionary fates. TNLs are found in
bryophytes (Akita and Valkonen 2002), both TNLs and
non-TNLs are found in gymnosperms and eudicots,
and only non-TNLs are found in monocots (Liu and
Ekramoddoullah 2003; Meyers et al. 2003; Li et al. 2010).
A recent study shows that TIR-type NBS sequences are not
present in monocots, but they do exist in basal angio-
sperms, suggesting that these sequences have been lost or
reduced significantly in monocots (Tarr and Alexander
2009). The results indicate that since different plant lin-
eages have different life histories, the nature of selective
pressure imposed by their environmental conditions has
driven the diversified evolution of TNLs and non-TNLs.
Tandem duplication plays a major role in the expansion
of NBS genes
Gene duplication is one of the major evolutionary mecha-
nisms for generating novel genes that help organisms adapt
to different environments (Ohno 1970). Segmental and
tandem duplications are well-known patterns of gene
duplication in plants. In the case of soybean, genomic
segmental duplications led to a substantial increase in the
gene number (Schmutz et al. 2010). However, only 20.7%
of NBS genes were retained in both segmentally duplicated
regions, which is significantly lower than the frequencies of
the other retained genes (40–50%; Schmutz et al. 2010).
Similar results were also observed in the maize and rice
genomes, in which <10% of the segmentally duplicated
NBS genes were retained (Li et al. 2010; Yang et al. 2006).
In contrast, ~73.7% of NBS genes were physically clustered
in the soybean genome, and 87.9% of these were
classified as family members (Figure S1), suggesting that
tandem duplication rather than segmental duplication plays
a major role in the expansion of NBS genes. Segmentally
duplicated NBSs appear to be preferentially lost, and
this loss following polyploidy may be a general
phenomenon, as these genes are also highly underrepresented
in duplicated regions of the Arabidopsis genome
(Nobuta et al. 2005; Innes et al. 2008).
Tandem-duplicated NBS genes, which are local
redundancies of R gene sequences, may act as a reservoir
of genetic variation from which new specificities can
evolve (Dangl and Jones 2001). Copy number variations
and sequence exchanges frequently occur in these NBS
loci, indicating that they are inherently unstable, fast
evolving, and complex (Noel et al. 1999; Yang et al. 2006;
Kuang et al. 2008; Yi and Richards 2008; Chen et al. 2010;
Li et al. 2010). In fact, individual clusters may confer
specific resistance to a wide range of pathogens and
pathogen genotypes. For example, the Rsv1 gene, which
confers resistance to Soybean mosaic virus, and the
Rpg1-b gene, which confers resistance to Pseudomonas
syringae pv glycinea, were mapped to the same locus in
different soybean lines (Innes et al. 2008).
It is clear that tandem duplication is one of the major
driving forces for the expansion of NBS genes. Why do these
tandem duplicates tend to be retained? An important feature
of tandem R genes is their high rate of duplication per gen-
eration. As a result, new tandem NBS paralogs are contin-
uously generated, probably providing a pool of highly
dynamic targets for selection. In addition, tandem NBS-
encoding duplicates are highly variable within species
(Yang et al. 2006; Ding et al. 2007; Kuang et al. 2008). This
high-level within-species variation among tandem genes
further increases the number of targets that can be selected
from ever-changing pathogen populations. Most NBS
duplicates have undergone lineage-specific selection and
rapid birth-and-death evolution, which means that these
genes are probably important for adaptive evolution to rap-
idly changing environments. However, soon after genomic
segmental duplication, the typical fate of NBS duplicates is
rapid loss, possibly because of selection. If the cost of
maintaining the duplicated R genes is higher than the cost of
disease threat, selection will act to remove one or both of the
duplicated R genes from the genome (Tian et al. 2003).
Further, the Ka/Ks ratios are significantly higher (aver-
age 0.67 vs. 0.49) and sequence exchange events are more
frequent (average 19.8 vs. 3.36) for tandem NBS duplicates
than for segmental NBS duplicates (t test, P < 0.05). This
suggests that diversifying selection is important for
increasing the sequence divergence between paralogs. On
the other hand, frequent sequence exchanges would spread
these beneficial mutations rapidly and result in homoge-
nization of paralogs, which are required for keeping pace
with changes in the pathogen populations. Therefore,
clusters of closely related NBS genes that have originated
from tandem gene duplication may provide a reservoir of
genetic variation. The subsequent combined effects of
frequent sequence exchanges and diversifying selection on
these NBS paralogs may allow plants to evolve resistance
to rapidly changing pathogens.
Acknowledgments This work was supported by the National
Natural Science Foundation of China (30930008, 30970198 and
J0730641) and National Key Project for Gene Transform in China
(2009ZX08009-27B). Two anonymous reviewers provided helpful
comments.
References
Akita M, Valkonen JP (2002) A novel gene family in moss
(Physcomitrella patens) shows sequence homology and a
phylogenetic relationship with the TIR–NBS class of plant
disease resistance genes. J Mol Evol 55:595–605
Ameline-Torregrosa C, Wang BB, O’Bleness MS, Deshpande S, Zhu
H, Roe B, Young ND, Cannon SB (2008) Identification and
characterization of NBS–LRR genes in the model plant Medi-
cago truncatula. Plant Physiol 146:5–21
Bailey TL, Elkan C (2005) The value of prior knowledge in
discovering motifs with MEME. Proc Int Conf Intell Syst Mol
Biol 3:21–29
Bakker EG, Toomajian C, Kreitman M, Bergelson J (2006) A
genome-wide survey of R gene polymorphisms in Arabidopsis.
Plant Cell 18:1803–1818
Chen Q, Han Z, Jiang H, Tian D, Yang S (2010) Strong positive
selection drives rapid diversification of R-genes in Arabidopsis
relatives. J Mol Evol 70:137–148
Dangl JL, Jones JD (2001) Plant pathogens and integrated defence
responses to infection. Nature 411:826–833
Deyoung BJ, Innes RW (2006) Plant NBS–LRR proteins in pathogen
sensing and host defense. Nat Immunol 7:1243–1249
Ding J, Zhang W, Jing Z, Chen J-Q, Tian D (2007) Unique pattern of
R-gene variation within populations in Arabidopsis. Mol Genet
Genomics 277:619–629
Flor HH (1956) Mutations in flax rust induced by ultraviolet radiation.
Science 124:888–889
Hayes AJ, Jeong SC, Gore MA, Yu YG, Buss GR, Tolin SA, Saghai
Maroof MA (2004) Recombination within a nucleotide-binding-
site/leucine-rich repeat gene cluster produces new variants
conditioning resistance to soybean mosaic virus in soybeans.
Genetics 166:493–503
Innes RW, Ameline-Torregrosa C, Ashfield T, Cannon E, Cannon SB
et al (2008) Differential accumulation of retroelements and
diversification of NB-LRR disease resistance genes in duplicated
regions following polyploidy in the ancestor of soybean. Plant
Physiol 148:1740–1759
Kuang H, Caldwell KS, Meyers BC, Michelmore RW (2008)
Frequent sequence exchanges between homologs of RPP8 in
Arabidopsis are not necessarily associated with genomic prox-
imity. Plant J 54:69–80
Leister D (2004) Tandem and segmental gene duplication and
recombination in the evolution of plant disease resistance genes.
Trends Genet 20:116–122
Leister D, Kurth J, Laurie DA, Yano M, Sasaki T et al (1998) Rapid
reorganization of resistance gene homologues in cereal genomes.
Proc Natl Acad Sci USA 95:370–375
Li J, Ding J, Peng C, Zhang Y, Tang P, Chen JQ, Tian D, Yang S
(2010) Unique evolutionary pattern of numbers of gramineous
NBS–LRR genes. Mol Genet Genomics 283:427–438
Liu JJ, Ekramoddoullah AK (2003) Root-specific expression of a
western white pine PR10 gene is mediated by different promoter
regions in transgenic tobacco. Plant Mol Biol 52:103–120
Lupas A, Van Dyke M, Stock J (1991) Predicting coiled coils from
protein sequences. Science 252:1162–1164
Lynch M, Conery JS (2000) The evolutionary fate and consequences
of duplicated genes. Science 290:1151–1155
Lynch M, Crease TJ (1990) The analysis of population survey data on
DNA sequence variation. Mol Biol Evol 7:377–394
Martin GB, Bogdanove AJ, Sessa G (2003) Understanding the
functions of plant disease resistance proteins. Annu Rev Plant
Biol 54:23–61
Meyers BC, Kozik A, Griego A, Kuang H, Michelmore RW (2003)
Genome-wide analysis of NBS–LRR-encoding genes in Arabid-
opsis. Plant Cell 15:809–834
MGSC (2007) Medicago truncatula genome ‘‘Mt2.0’’ release white-
paper available at http://medicago.org/genome/downloads/Mt2/
Mondragon-Palomino M, Meyers BC, Michelmore R, Gaut B (2002)
Patterns of positive selection in the complete NBS–LRR
gene family of Arabidopsis thaliana. Genome Res 12:1305–
1315
Nei M, Gojobori T (1986) Simple methods for estimating the numbers
of synonymous and nonsynonymous nucleotide substitutions.
Mol Biol Evol 3:418–426
Nobuta K, Ashfield T, Kim S, Innes RW (2005) Diversification of
non-TIR class NB-LRR genes in relation to whole-genome
duplication events in Arabidopsis. MPMI 18:103–109
Noel L, Moores TL, van Der Biezen EA, Parniske M, Daniels MJ,
Parker JE, Jones JD (1999) Pronounced intraspecific haplotype
divergence at the RPP5 complex disease resistance locus of
Arabidopsis. Plant Cell 11:2099–2112
Ohno S (1970) Evolution by gene duplication. Allen & Unwin;
Springer-Verlag, London
Pond SL, Frost SD, Muse SV (2005) HyPhy: hypothesis testing using
phylogenies. Bioinformatics 21:676–679
Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW (2006)
Automated phylogenetic detection of recombination using a
genetic algorithm. Mol Biol Evol 23:1891–1901
Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003)
DnaSP, DNA polymorphism analyses by the coalescent and
other methods. Bioinformatics 19:2496–2497
Sato S, Nakamura Y, Kaneko T, Asamizu E, Kato T et al (2008)
Genome structure of the legume, Lotus japonicus. DNA Res
15:227–239
Schlueter JA, Lin JY, Schlueter SD, Vasylenko-Sanders IF, Desh-
pande S et al (2007) Gene duplication and paleopolyploidy in
soybean and the implications for whole genome sequencing.
BMC Genomics 8:330
Schmutz J, Cannon SB, Schlueter J, Ma J, Hyten DL et al (2010)
Genome sequence of the palaeopolyploid soybean. Nature
463:178–183
Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular
Evolutionary Genetics Analysis (MEGA) software version 4.0.
Mol Biol Evol 24:1596–1599
Tarr DE, Alexander HM (2009) TIR–NBS–LRR genes are rare in
monocots: evidence from diverse monocot orders. BMC Res
Notes 2:197
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W:
improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Res 22:
4673–4680
Tian D, Traw MB, Chen JQ, Kreitman M, Bergelson J (2003) Fitness
costs of R-gene-mediated resistance in Arabidopsis thaliana.
Nature 423:74–77
Van K, Kim DH, Cai CM, Kim MY, Shin JH et al (2008) Sequence
level analysis of recently duplicated regions in soybean [Glycine max (L.) Merr.] genome. DNA Res 15:93–102
Yang S, Feng Z, Zhang X, Jiang K, Jin X, Hang Y, Chen JQ, Tian D
(2006) Genome-wide investigation on the genetic variations of
rice disease resistance genes. Plant Mol Biol 62:181–193
Yang S, Zhang X, Yue J-X, Tian D, Chen JQ (2008) Recent
duplications dominate NBS-encoding gene expansion in two
woody species. Mol Genet Genomics 280:187–198
Yi H, Richards EJ (2008) Phenotypic instability of Arabidopsis alleles
affecting a disease resistance gene cluster. BMC Plant Biol 8:36
Zhang Y, Wang J, Zhang X, Chen JQ, Tian D, Yang S (2009) Genetic
signature of rice domestication shown by a variety of genes.
J Mol Evol 68:393–402