
ORIGINAL PAPER

Relative evolutionary rates of NBS-encoding genes revealed by soybean segmental duplication

Xiaohui Zhang • Ying Feng • Hao Cheng •

Dacheng Tian • Sihai Yang • Jian-Qun Chen

Received: 29 April 2010 / Accepted: 26 October 2010 / Published online: 16 November 2010

© Springer-Verlag 2010

Abstract It is well known that nucleotide binding site

(NBS)-encoding genes are duplicate-rich and fast-evolving

genes. However, there is little information on the relative

importance of tandem and segmental NBS duplicates and

their exact evolutionary rates. The two rounds of large-

scale duplication that have occurred in soybean provide a

unique opportunity to investigate these issues. Comparison

of NBS and non-NBS genes on segments of syntenic ho-

moeologs shows that NBS-encoding genes evolve at least

1.5-fold faster (~1.5-fold higher synonymous and ~2.3-fold

higher nonsynonymous substitution rates) and are lost

~twofold faster than the flanking non-NBS

genes. Compared with segmental duplicates, tandem NBS

duplicates are more abundant in soybean, suggesting that

tandem duplication is the major driving force in the

expansion of NBS genes. Notably, significant sequence

exchanges along with significantly positive selection were

detected in most tandem-duplicated NBS gene families.

The results suggest that the rapid evolution of NBS genes

may be due to the combined effects of diversifying selec-

tion and frequent sequence exchanges. Interestingly, TIR–

NBS–LRR genes (TNLs) have a higher nucleotide substi-

tution rate than non-TNLs, indicating that these types of

NBS genes may have a rather different evolutionary pat-

tern. It is important to determine the exact relative evolu-

tionary rates of TNL, non-TNL, and non-NBS genes in

order to understand how fast the host plant can adjust its

response to rapidly evolving pathogens in a coevolutionary

context.

Keywords NBS–LRR genes · Evolutionary rate · Tandem and segmental duplication · Soybean

Introduction

Over the long history of the battle between plants and

pathogens, plants have evolved sophisticated mechanisms

to perceive pathogen attack and subsequently produce a

highly localized and specific response, resulting in a visible

hypersensitive response (HR) (Dangl and Jones 2001). An

important and well-characterized perception mechanism is

based on resistance (R) genes in flowering plants whose

products confer recognition of cognate avirulence (Avr)

proteins in pathogens (Dangl and Jones 2001). More than

40 R genes have been cloned over the past two decades and

most of them belong to the nucleotide binding site and

leucine-rich repeat (NBS–LRR) genes, which form one of

the largest plant gene families. NBS–LRR proteins have

been studied extensively to understand their evolution and

the molecular basis of their functions (Martin et al. 2003;

Deyoung and Innes 2006). The N-terminal region of

Communicated by Y. Van de Peer.

X. Zhang and Y. Feng contributed equally to this work.

Electronic supplementary material The online version of this article (doi:10.1007/s00438-010-0587-7) contains supplementary material, which is available to authorized users.

X. Zhang · Y. Feng · D. Tian · S. Yang (✉) · J.-Q. Chen (✉)

State Key Laboratory of Pharmaceutical Biotechnology, School

of Life Sciences, Nanjing University, Nanjing 210093, China

e-mail: [email protected]

J.-Q. Chen

e-mail: [email protected]

H. Cheng

National Center for Soybean Improvement, National Key

Laboratory of Crop Genetics and Germplasm Enhancement,

Nanjing Agricultural University, Nanjing 210095, China


Mol Genet Genomics (2011) 285:79–90

DOI 10.1007/s00438-010-0587-7


NBS–LRR genes is structurally diverse, including the Toll/

Interleukin-1 receptor domain (TIR–NBS–LRR, TNL) and

non-TIR–NBS–LRRs (non-TNLs) (Dangl and Jones 2001).

NBS-encoding genes are often observed in clusters of

tandem repeats, suggesting that gene duplication is a

common event in this family (Meyers et al. 2003; Leister

2004; Yang et al. 2006, 2008; Li et al. 2010). Several

evolutionary and population genetic models of the fate of

duplicate genes have been proposed, which provide theo-

retical and mechanistic explanations for gene retention

(Ohno 1970). In different plants, NBS genes are found as

both singletons and gene clusters. Compared with other

gene families, the NBS gene family contains a higher

proportion of duplicated genes (Yang et al. 2008; Li et al.

2010). These duplicated NBS genes are mostly derived

from segmental and tandem duplication events (Meyers

et al. 2003). Genomic analyses in the grapevine and poplar

genomes have shown that recent tandem duplication plays

a major role in the expansion of NBS-encoding genes

(Yang et al. 2008). In the Arabidopsis genome, only 22

NBS-encoding genes were detected in both members of the

duplication pair, indicating a high frequency of NBS gene

loss after whole-genome duplication (Nobuta et al. 2005).

Interestingly, although all NBS-LRR genes were presumed

to originate from a common ancestor, considerable varia-

tions in the numbers of these genes were detected in dif-

ferent species (Li et al. 2010). It has been proposed that

rapid expansion and/or contraction of genes may be a

fundamentally important strategy employed by plants to

adapt to the rapidly changing species-specific pathogen

spectrum (Chen et al. 2010; Li et al. 2010). However,

which duplication events dominate the expansion of NBS-

encoding genes and why these duplicates tend to be

retained are questions that remain unanswered.

Recent studies have shown that plant NBS–LRRs are

incredibly adaptive in their pathways of pathogen recog-

nition and defense initiation (Deyoung and Innes 2006).

Molecular genetic studies of related NBS–LRR proteins

with different recognition specificities indicate that the

LRR domain is the primary determinant of NBS–LRR

protein for recognition specificity, and this domain is

believed to adopt an arc-shaped conformation, forming a

protein–protein interaction surface (Dangl and Jones 2001).

Genome-wide surveys of NBS gene polymorphisms have

shown that LRR regions are highly polymorphic for protein

variants and are driven by balancing or diversifying

selection (Bakker et al. 2006; Yang et al. 2006; Ding et al.

2007). However, there is little comparative data on the

accurate evolutionary rates of NBS and non-NBS genes.

Legumes, including Lotus, Medicago, Glycine, and

Phaseolus, comprise one of the three largest families of

flowering plants and are important from an agricultural

perspective because they fix atmospheric nitrogen through

closely symbiotic interactions with microorganisms

(Schlueter et al. 2007). Soybean (Glycine max) is an

important crop because of the high protein and oil content

of its seeds. This plant has undergone at least two rounds of

large-scale duplication approximately 13 and 59 million

years ago (Mya), resulting in a highly duplicated genome

with nearly 75% of the genes present in multiple copies

(Schmutz et al. 2010). The recent duplication event in

soybean provides a unique opportunity to investigate the

different fates of tandem and segmental NBS duplicates

and to determine the type of gene duplication that influ-

ences the expansion and/or contraction of NBS genes and

how this happens.

Materials and methods

Identification of NBS-encoding genes

Soybean (Glycine max; Glyma1.01; Schmutz et al. 2010),

barrel medic (Medicago truncatula; Ameline-Torregrosa

et al. 2008), birdsfoot trefoil (Lotus japonicus; V1.0; Sato

et al. 2008), and castor bean (Ricinus communis) assemblies

and gene models were obtained from the Soybean Genome

Sequencing Consortium (http://www.phytozome.net/soybean.php),

the Medicago Genome Sequence Consortium (http://medicago.org/genome/downloads/Mt2/; MGSC 2007),

the National BioResource Project (NBRP) Legume Base (http://www.legumebase.agr.miyazaki-u.ac.jp/index.jsp),

and TIGR castorWGS release 0.1 (http://castorbean.jcvi.org/castorbean_downloads.shtml), respectively.

To identify NBS-encoding genes in the four plant genomes,

both BLAST and hidden Markov model (HMM)

searches were performed, as described in Yang et al.

(2008) and Li et al. (2010). First, possible homologs

encoded in the plant genomes were searched using BLASTP with

the amino acid sequence of the NB-ARC domain (Pfam: PF00931)

as a query with a threshold expectation value of 1E-4, a

value determined empirically to filter out most of the

spurious hits. Second, the nucleotide sequences of candi-

date NBS genes were used as queries to find homologs in

their genomes by BLASTn search. This step was crucial to

find the maximum number of candidate genes. All new

BLAST hits in the genomes, together with flanking regions

of 5,000–10,000 bp at both sides, were annotated using the

gene-finding programs FGENESH with the legume plant

training set (http://www.softberry.com/) and GENSCAN

with the Arabidopsis training set (http://genes.mit.edu/

GENSCAN.html) to obtain information on complete open

reading frames (ORFs). To exclude potentially redundant

candidate NBS genes, all sequences were orientated by

BLASTn, and sequences found in the same location were

eliminated. All non-redundant candidate NBS genes were


surveyed to further verify whether they encoded NBS or

LRR motifs using the Pfam database v23.0 (E value cut-off

of 1E-4) (http://pfam.janelia.org/), SMART protein motif

analyses (http://smart.embl-heidelberg.de/), and Multiple

Expectation Maximization for Motif Elicitation (MEME)

(Bailey and Elkan 2005). CC motifs were detected using

COILS with a threshold of 0.9 (Lupas et al. 1991).
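The first filtering step of this search (keeping BLASTP hits to the NB-ARC query at an expectation value of 1E-4 or better) can be sketched as follows. This is only an illustration under stated assumptions: it assumes the BLAST results were saved in the standard 12-column tab-separated format, which the text does not specify, and the function name `candidate_ids` is hypothetical.

```python
from typing import Iterable, List, Set

E_VALUE_CUTOFF = 1e-4  # threshold expectation value quoted in the text


def candidate_ids(blast_lines: Iterable[str],
                  cutoff: float = E_VALUE_CUTOFF) -> List[str]:
    """Return subject IDs with at least one hit at or below the cutoff.

    Assumes 12-column tabular BLAST output, where column 2 is the
    subject ID and column 11 is the E value (an assumption, not a
    detail given in the paper).
    """
    seen: Set[str] = set()
    kept: List[str] = []
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue <= cutoff and subject not in seen:
            seen.add(subject)
            kept.append(subject)  # keep first-seen order
    return kept
```

Candidates retained this way would still need the BLASTn extension, gene-model annotation, and domain verification steps described above.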

Determining the segmentally duplicated NBS-encoding

genes

Soybean is a well-documented paleopolyploid whose gen-

ome has undergone at least two rounds of polyploidy and

subsequent diploidization events (Schmutz et al. 2010). To

examine the evolutionary fate of duplicated NBS-encoding

genes, segmental duplication events containing NBS-

encoding genes in both segmentally duplicated regions

were identified in the soybean genome. First, all NBS-

encoding genes were oriented on the appropriate chromo-

somes as the original anchor points by using BLASTn.

Thirty-one non-repetitive-element genes of each anchor

point, including the NBS-encoding gene and 15 flanking

genes on each side, were then compared by pairwise

BLAST analysis to identify duplicated genes between two

independent segmental blocks. When more than five gene

pairs with syntenic relationships (BLAST E value <1E-10)

were detected, the two blocks were defined as a segmentally

duplicated region pair (Fig. 1). In addition, for small gene

families between two duplicated blocks, each syntenic gene

from homoeologue 1 was paired only once with its least

divergent counterpart in homoeologue 2.
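The block-pairing rule described in this paragraph can be sketched as below. It is a simplified illustration, not the authors' pipeline: the helper names are hypothetical, the lowest E value is used as a stand-in for "least divergent counterpart", and gene order (collinearity) within the blocks is not checked.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, str]  # (gene in block 1, gene in block 2)


def window(genes: List[str], anchor_index: int, flank: int = 15) -> List[str]:
    """Anchor gene plus up to `flank` genes on each side (31 genes at most)."""
    start = max(0, anchor_index - flank)
    return genes[start:anchor_index + flank + 1]


def one_to_one_pairs(block1: List[str], block2: List[str],
                     hits: Dict[Hit, float],
                     max_evalue: float = 1e-10) -> List[Hit]:
    """Pair each gene at most once, best (lowest) E value first."""
    set1, set2 = set(block1), set(block2)
    candidates = sorted(
        (evalue, g1, g2) for (g1, g2), evalue in hits.items()
        if g1 in set1 and g2 in set2 and evalue < max_evalue
    )
    used1, used2, pairs = set(), set(), []
    for _, g1, g2 in candidates:
        if g1 not in used1 and g2 not in used2:
            used1.add(g1)
            used2.add(g2)
            pairs.append((g1, g2))
    return pairs


def is_duplicated_block_pair(block1: List[str], block2: List[str],
                             hits: Dict[Hit, float]) -> bool:
    """True when more than five gene pairs link the two blocks."""
    return len(one_to_one_pairs(block1, block2, hits)) > 5
```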

To examine the relative ages of the segmental duplica-

tion events, the synonymous substitution rates (Ks) of

the full-length nucleotide coding sequences (CDSs) with

the alignable nucleotide sequence covering >80% of the

longer gene between duplicates were calculated using the

method developed by Nei and Gojobori (1986). For each

gene pair, the Ks value was translated into divergence time

in million years based on a rate of $6.1 \times 10^{-9}$ substitutions

per site per year. The divergence time (T) was calculated

as $T = K_s / (2 \times 6.1 \times 10^{-9}) \times 10^{-6}$ Mya (Lynch and

Conery 2000).
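As a worked illustration of this conversion, the short sketch below applies the formula to a single Ks value. The function name is hypothetical; the example input of 0.158 is the average Ks of the flanking non-NBS gene pairs reported in the Results.

```python
RATE = 6.1e-9  # synonymous substitutions per site per year (from the text)


def divergence_time_mya(ks: float, rate: float = RATE) -> float:
    """Convert a pairwise Ks into divergence time in million years.

    The factor 2 accounts for substitutions accumulating independently
    on both duplicate lineages since the duplication event.
    """
    years = ks / (2 * rate)
    return years / 1e6


# Average Ks of flanking non-NBS pairs reported in the Results.
print(round(divergence_time_mya(0.158), 2))
```

With Ks = 0.158 this gives about 12.95 Mya, matching the divergence time quoted in the Results.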

Sequence alignment and phylogenetic analysis

Multiple alignments of amino acid sequences were per-

formed using ClustalW with default options (Thompson

et al. 1994) and then with MEGA v4.0 (Tamura et al. 2007)

to manually correct the alignments. The resulting amino

acid sequence alignments were then used to guide the

alignments of CDSs. Phylogenetic trees were constructed

by the bootstrap neighbor-joining (NJ) method with a

Kimura 2-parameter model by MEGA. The stability of

internal nodes was assessed by bootstrap analysis with

1,000 replicates. The time of the duplication event for each

node in the phylogenetic tree was determined from the

average Ks value of the node (Yang et al. 2008). Nucleo-

tide divergence among paralogs was estimated from Dxy

with the Jukes and Cantor correction (Lynch and Crease

1990) using MEGA v4.0. Using a sliding-window search

(15 ORFs as a window size), a gene cluster was defined as

a region in which two neighboring homologous genes

were ≤15 ORFs apart. As with the definition of a cluster

in Meyers et al. (2003), this is a useful operational definition

because the number of clusters changed only slightly

when the maximum number of intervening ORFs was

increased to 20 or even 30. Gene families were defined as

groups of NBS-encoding genes in which each gene dis-

plays at least 70% nucleotide identity to at least one other

member of the group.
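One way to realize the gene-family definition above is single-linkage grouping over pairwise identities, sketched here with a small union-find. This is an assumption about implementation: the text states only the 70% criterion, and the function name and input layout are hypothetical.

```python
from typing import Dict, List, Tuple


def gene_families(genes: List[str],
                  identities: Dict[Tuple[str, str], float],
                  threshold: float = 70.0) -> List[List[str]]:
    """Group genes linked by >= threshold % nucleotide identity.

    Every member of a returned family has at least one other member
    at or above the threshold; ungrouped genes are omitted.
    """
    parent = {g: g for g in genes}

    def find(g: str) -> str:
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for (a, b), identity in identities.items():
        if identity >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups: Dict[str, List[str]] = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(m) for m in groups.values() if len(m) >= 2)
```

Note that single linkage can join two genes whose direct identity is below 70% if a third gene bridges them, which is consistent with the "at least one other member" wording.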

The xxLxLxx motif of the LRR domain (L = Leu or

another aliphatic amino acid; x = any amino acid) was

regarded as the determinant of recognition specificity for Avr

factors. To determine the evolutionary dynamics of dupli-

cated genes, we estimated the ratios of nonsynonymous (Ka)

to synonymous (Ks) nucleotide substitutions of the NBS

domain, LRR region, and xxLxLxx motif between NBS

paralogs. Ka and Ks were calculated using DnaSP v4.0

(Rozas et al. 2003) on the basis of the Nei and Gojobori

(1986) method. A Ka/Ks ratio greater than 1 suggests posi-

tive selection, and a ratio less than 1 indicates purifying

selection. The average Ka/Ks ratio for a gene family was

calculated by taking the arithmetic mean of all possible

pairwise combinations. Recombination has a profound

impact on evolutionary rates and analysis of selection

Fig. 1 An example of the highly syntenic relationships of segmen-

tally duplicated non-repetitive-element genes between two blocks on

chromosomes 10 and 20. The least divergent counterparts are linked

by shadows. Colored arrows (red, NBS–LRR genes; black, other

genes) indicate the positions and orientations of the predicted genes.

Dotted lines indicate that no corresponding ORFs or homoeologs are

detected in the same position compared with its counterpart (color

figure online)


pressure may be confounded by the presence or absence of

recombination. Next, the genetic algorithm for recombination

detection (GARD) tool was run on the Datamonkey

web server with default settings to detect the presence or

absence of recombination and the location of recombination

breakpoints (http://www.datamonkey.org; Pond et al. 2006).

A NEXUS file with partition information output created by

GARD was subsequently used as the input for the positive

selection analysis; this method allows the analysis of

recombinant sequences for positive selection (Pond et al.

2006). The HyPhy package with the random effects likeli-

hood (REL) method run on the resulting GARD fragments

(Pond et al. 2005, 2006) was used to detect positively and

negatively selected sites with posterior probability >0.98.

Sequence exchange was also investigated with GENE-

CONV 1.81 (http://www.math.wustl.edu/sawyer/geneconv/).

The default setting of 10,000 permutations was used for the

analysis. The statistical significance of gene conversion events

was defined as a global permutation P value <0.05.

Results

NBS-encoding gene families and their recent

duplication

We performed BLAST and HMM searches and identified

429 NBS-encoding genes in the soybean genome. Of these

genes, 154, 236 and 39 were identified as TIR–NBS–LRR

(TNL), non-TIR–NBS–LRR (non-TNL) and NBS genes

lacking the LRR domain, respectively (Table 1). In a pre-

vious study, 330 nonredundant NBS-encoding genes were

identified in the M. truncatula draft genome (Mt 1.0;

Ameline-Torregrosa et al. 2008). Using the same methods

and an updated database (Mt 2.0), 392 NBS-encoding

genes, including 143 TNLs, 193 non-TNLs, and 56 NBSs

were re-identified in the Medicago genome (Table 1). In

another model legume species, L. japonicus, 95 NBS-

encoding genes with the LRR domain and 63 NBS genes

lacking LRRs were identified in ~67% of the sequenced

genome. In castor bean (Ricinus communis), another oil-

seed plant, 111 NBS–LRR and 18 NBS genes were

identified.

When nucleotide identity >70% was used as the criterion

for predicting multigene families (see "Materials and

methods" for details), soybean was found to have 60 NBS

gene families, containing 76.7% of the NBS-encoding

genes, with 5.45 members per gene family (Table 2).

Similarly, Medicago had 81 NBS gene families containing

82.9% of the NBS-encoding genes, with 4.01 members per

gene family. However, when the same criterion was used

for predicting multigene families in the Arabidopsis and

rice genomes, only 48.5 and 37.0% of the R genes were

found to belong to multigene families, respectively.

Moreover, the multigene family of Arabidopsis had 3.23

members per family while that of rice had 2.63 members

per family. These results suggest that more NBSs were

retained in the two legume genomes. Additionally, in the

Table 1 Number of genes that encode domains similar to NBS genes in plant genomes

Predicted protein domains    Letter code   G. max   M. truncatula   L. japonicus   R. communis
Total NBS encoding genes                   429      392             158            129
NBS–LRR type genes
  TIR–NBS–LRR                TNL
    TIR–NBS–LRR              TNL′          126      58              26             28
    X-NBS–LRR                XNL           26       74              22             5
    X-CC–NBS–LRR             XCNL          2        11              1              0
    Total                                  154      143             49             32
  Non-TIR–NBS–LRR            Non-TNL
    CC–NBS–LRR               CNL           94       102             23             54
    X-NBS–LRR                XNL           142      85              20             25
    LRR–NBS–TIR              LNT           0        6               3              0
    Total                                  236      193             46             79
  Total                                    390      336             95             111
NBS type genes
  TIR–NBS                    TN            11       8               8              2
  X-NBS (TIR type)           XN′           9        24              23             4
  CC-NBS                     CN            3        10              6              7
  X-NBS (CC type)            XN″           16       12              16             5
  NBS–TIR (CC type)          NT            0        2               10             0
  Total                                    39       56              63             18



soybean genome, 87.2% (287/329) of the NBS paralogs

were located in clusters, indicating that tandem duplication

has contributed to the expansion of the recently duplicated

NBS genes.

When more stringent criteria were applied to identify

recently duplicated NBSs, approximately 14.2, 27.3, and

23.1% of the NBS duplicates in soybean and 15.3, 14.3, and

33.7% of the NBS duplicates in Medicago were identified as

having homologous members with ≥95, 90–95, and

80–90% nucleotide identities, respectively (Fig. 2). These

values were significantly higher than those in Arabidopsis

(0, 5.8, and 21.4%, respectively) and rice (1.9, 3.85, and

13.8%, respectively) (paired t test, P < 0.05; Fig. 2). In

other words, approximately 41.5% of the NBSs had

homologs with >90% nucleotide identity in the soybean

genome, while the corresponding values in the Arabidopsis

and rice genomes were only 5.8 and 5.75%, respectively.

The Medicago genome had the largest proportion of NBS

duplicates (33.7%), with identity levels ranging from 80 to

90%, while the rice genome had the lowest proportion of

NBS duplicates with an identity level exceeding 80%.

In phylogenetic trees, when the phylogenetic clade was

defined as having nucleotide identity >70% between paralogs

and bootstrap value >70% (Table S1), it was clear

that the rice genome had the largest number of R clades

(400), which was significantly higher than the value in

soybean (160), Medicago (148), and Arabidopsis (115).

Moreover, soybean and Medicago had more R paralogs per

clade (2.68 and 2.65, respectively) than Arabidopsis (1.50)

and rice (1.30).

Segmentally duplicated NBS-encoding genes and their

evolutionary rate

Gene expansions are common in NBS-encoding proteins

(Meyers et al. 2003; Yang et al. 2006, 2008; Li et al. 2010).

Approximately 72.7 and 81.9% of NBSs were tandemly

arrayed and physically clustered in chromosomes in the

soybean and Medicago genomes, respectively (Table S2

and S3). In total, 70 and 56 NBS clusters were found,

averaging 4.51 and 5.73 NBS-encoding genes per cluster in

soybean and Medicago, respectively. The largest clusters

contained 18 and 19 NBSs in soybean (on chromosome 3)

and Medicago (on chromosome 6), respectively.
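The operational cluster definition from Materials and methods (neighboring homologous genes no more than 15 ORFs apart) can be sketched as below for one chromosome. This is an illustrative reading, not the authors' code: it assumes each NBS gene is represented by its ordinal ORF position, that "apart" counts intervening ORFs, and that all input genes are homologous NBS genes.

```python
from typing import List, Tuple


def nbs_loci(orf_indices: List[int],
             max_intervening: int = 15) -> Tuple[List[List[int]], List[List[int]]]:
    """Split NBS gene positions on one chromosome into clusters and singletons.

    Two consecutive NBS genes join the same locus when the number of
    ORFs between them is <= max_intervening.
    """
    clusters: List[List[int]] = []
    singletons: List[List[int]] = []
    current: List[int] = []

    def close(locus: List[int]) -> None:
        (clusters if len(locus) >= 2 else singletons).append(locus)

    for idx in sorted(orf_indices):
        if current and idx - current[-1] - 1 > max_intervening:
            close(current)
            current = []
        current.append(idx)
    if current:
        close(current)
    return clusters, singletons
```

Rerunning such a function with `max_intervening` set to 20 or 30 would correspond to the robustness check mentioned in the methods.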

Soybean has undergone at least two rounds of large-

scale duplication approximately 13 and 59 Mya, and

approximately 50% of the duplicates in the segmental

regions have been retained (Schlueter et al. 2007). Our data

showed that a total of 183 NBS loci, consisting of 113

singletons and 70 R clusters, were present in soybean. To

examine the evolutionary fate of these duplicated NBSs,

segmentally duplicated blocks were identified pairwise (see

‘‘Materials and methods’’ for details). Interestingly, only 30

segmentally duplicated block pairs with syntenically ho-

moeologous NBS–LRR genes and their flanking genes

were unambiguously detected (Fig. 1; Table S4), and these

included 89 NBS-encoding genes (34 TIR- and 55 non-

TIR-NBSs). In the remaining 123 NBS loci (67.2% of the

total NBS loci), no unambiguous syntenic region was

detected, and even when the syntenic region was detected,

its homoeologous NBS genes were not found in the seg-

mentally duplicated region, suggesting that most of the

segmentally duplicated NBS genes or loci were lost.

In the 30 unambiguous pairs of segmentally duplicated

blocks that contained NBSs in both regions, approximately

31–68% of the duplicated non-NBS genes (flanking genes)

were retained. In these cases, most retained homologs had

approximately 90% or greater nucleotide identity, with a

few extremes. To estimate the timing of the duplication

Table 2 Detection of NBS gene families in the genomes of soybean, Medicago, Arabidopsis, and rice

              Family numbers   Gene numbers   Proportion of NBSs in families (%)   Average number per family
Soybean       60               329            76.7                                 5.45
Medicago      81               325            82.9                                 4.01
Arabidopsis   26               84             48.5                                 3.23
Rice          73               192            37.0                                 2.63

Fig. 2 Distribution of recently generated NBS duplicates in the

genomes of soybean, Medicago, Arabidopsis, and rice



event that created the two duplicated blocks, pairwise

synonymous substitutions (Ks) were calculated between

duplicated NBSs and non-NBSs. Interestingly, only a sin-

gle distinct peak of Ks distribution was detected in all pairs

of duplicated non-NBS genes (Fig. 3a). The average Ks

value was 0.158 ± 0.085 (range 0.021–0.552). This value

compares well with those reported in previous analyses of

gene duplicates derived from the most recent whole gen-

ome duplication event in soybean, where the genome-wide

computation of duplicated gene pairs showed an average

Ks of 0.13 (Schmutz et al. 2010) and 23 homologous gene

pairs showed an average Ks of 0.147 (Van et al. 2008). On

the basis of a substitution rate of $6.1 \times 10^{-9}$ substitutions

per site per year (Lynch and Conery 2000), the divergence

time of segmental duplication was approximately 12.95

Mya, which is consistent with previous findings (~13

Mya; Schlueter et al. 2007; Schmutz et al. 2010). These

results indicate that all of the retained segmentally dupli-

cated regions, including the duplicated NBS genes in both

regions, were products of the recent large-scale dupli-

cation. In more ancient segmentally duplicated regions

(~59 Mya), NBS genes were lost in either one or both

regions, suggesting that NBS genes were lost more fre-

quently than non-NBS genes.

In comparison with the Ks distribution in segmental

non-NBS duplicates, the Ks distribution in segmental NBS

duplicates displayed a broader peak and higher average

value (0.268 ± 0.121, range 0.057–0.553; Fig. 3b). The

higher Ks value suggests that NBS genes may be evolving

more rapidly than non-NBS genes. However, there are at

least two other possible explanations for the higher Ks

values of the duplicated NBS genes than their flanking non-

NBS gene pairs other than higher evolutionary rate in these

genes: (1) frequent sequence exchange between NBS par-

alogs and (2) the deeper coalescence time of these dupli-

cated NBS gene pairs. More stringent standards were

required to identify the exact cause.

First, for a single duplicated NBS locus, three unam-

biguously syntenically duplicated NBS pairs were detected

using the syntenically homoeologous region of Medicago

as the outgroup (Glyma09g07020 and Glyma15g18290

vs. Med AC186731_16.6; Glyma01g37620 and Gly-

ma11g07680 vs. Med CR955005_30.5; Glyma07g04140

and Glyma16g00860 vs. Med AC160841_18.4). In the

three duplicated NBS gene pairs, the average Ks (0.21) of

the NBS duplicates was ~1.51-fold higher than that

(0.139) of the flanking gene pairs in soybean. Second, for

these segmentally duplicated single NBS gene pairs for

which orthologous NBSs could not be found in Medicago,

if the nearest neighboring genes on both sides along with

the corresponding intergenic regions can be aligned and

there are no more than 10 bp indels throughout the align-

ment, these duplicated NBS gene pairs could also be

considered as candidate syntenically duplicated NBS pairs.

Six duplicated single NBS pairs met these criteria. Consistent

with the above results, the average Ks (0.199) of the six

NBS pairs was ~1.4-fold higher than that of the flanking

gene pairs (Ks = 0.142; t test, P < 0.05). Third, for

duplicated multi-NBS loci, each syntenic NBS gene from

homoeologue 1 was paired only once with its least diver-

gent counterpart in homoeologue 2. No sequence exchange

event was detected between the NBS paralogs in nine

segmentally duplicated block pairs, for which the average

Ks (0.229) of the duplicated NBS pairs was ~1.44-fold

higher than that of the flanking gene pairs (Ks = 0.159;

t test, P < 0.05). These results support the conclusion that

the relative evolutionary rate of duplicated NBS genes is

Fig. 3 Ks distribution of syntenically homoeologous genes. a–d Syn-

tenically homoeologous non-NBS, NBS, non-TNL, and TNL genes

whose average Ks values are 0.158, 0.268, 0.215, and 0.345,

respectively



significantly higher, by at least 1.4-fold, than that of the

flanking non-NBS genes (t test, P < 0.05).
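The fold differences quoted in this paragraph follow directly from the reported average Ks values. The small check below recomputes them; the dictionary labels are informal descriptions added for readability, not terms from the paper.

```python
# (average Ks of NBS pairs, average Ks of flanking gene pairs), as reported above
comparisons = {
    "single NBS loci, Medicago outgroup (3 pairs)": (0.21, 0.139),
    "single NBS loci, aligned flanks (6 pairs)": (0.199, 0.142),
    "multi-NBS loci, no sequence exchange (9 block pairs)": (0.229, 0.159),
}


def fold_difference(ks_nbs: float, ks_flanking: float) -> float:
    """Ratio of NBS Ks to flanking-gene Ks for the same duplicated blocks."""
    return ks_nbs / ks_flanking


ratios = {label: round(fold_difference(*pair), 2)
          for label, pair in comparisons.items()}
for label, ratio in ratios.items():
    print(f"{label}: {ratio}-fold")
```

Because NBS and flanking pairs in the same block share one duplication date, the Ks ratio serves as an estimate of the relative synonymous substitution rate.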

On the other hand, for these multi-NBS loci with fre-

quent sequence exchange events, the average Ks (0.29) of

the least divergent NBS pairs was ~1.77-fold higher than

the Ks (0.164) of the flanking gene pairs (t test, P < 0.01),

which is also higher than the average Ks (0.215) of the

three aforementioned duplicated NBS pairs. These results

suggest that a deeper coalescence time and/or frequent

sequence exchanges between paralogs contribute to a cer-

tain extent to the diversification of complex NBS genes.

Interestingly, the single peak of Ks distribution of the

duplicated NBSs could be divided into two independent

peaks (Fig. 3c, d). One peak corresponded to the distribu-

tion of Ks from non-TNL duplicates, and the other was

attributed to TNL duplicates. The average Ks (0.345) of the

TNL duplicates was significantly higher than that of the

non-TNL duplicates (0.215; t test, P < 0.001), but no

significant difference (P = 0.503) was observed between

the duplicated non-NBS genes that flanked the NBSs in the

same duplicated block pairs (Ks = 0.162 for non-TNLs

and Ks = 0.153 for TNLs). This finding suggests that

different rates of birth and death, which would influence

the ability to identify true homoeologs, might result in the

different Ks distribution between TNL and non-TNL

paralogs.

Detecting positive selection and sequence exchange

in R duplicates

To detect positive selection between NBS paralogs, the

ratio of the non-synonymous substitution rate to synony-

mous substitution rate (Ka/Ks) was calculated for the

xxLxLxx motifs in the LRR region, which were regarded

as determinants of recognition specificity for Avr factors.

Ka/Ks ratios were determined for all pairwise comparisons

within each gene family. In soybean, 27 of 72 families (37.5%)

displayed a Ka/Ks > 1.0 in >50% of pairwise

combinations. In Medicago, 15 of 81 NBS families

(18.5%) met these criteria (Table 3; Tables S5 and S6).

The proportion of NBS families in which the average Ka

is significantly larger than Ks was twofold higher in

soybean (37.5%) than in Medicago (~18.5%). In addition,

the HyPhy package with the REL method run on the

resulting GARD fragments was used to detect positively

and negatively selected sites with posterior probability

>0.98 (Table 3; Fig. 4). The proportion of NBS families

with positively selected sites was approximately 85.7

and 58.3% in soybean and Medicago, respectively. The

proportion of NBS families in which the Ka/Ks ratio was

greater than 1.0 was higher in soybean than in Medicago.

Consistent with this, the number of sites under positive

selection for NBS families was greater in soybean than in

Medicago, suggesting that domestication might have

contributed to the rapid diversification of NBS genes in

soybean.
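The family-level summary used in this subsection (the arithmetic mean of pairwise Ka/Ks, and whether Ka/Ks exceeds 1.0 in more than 50% of pairwise combinations) can be sketched as follows. This is a minimal illustration: the function name and input layout are hypothetical, and skipping pairs with Ks = 0 is an assumption, since the text does not say how such pairs were handled.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


def summarize_family(pairwise: List[Tuple[float, float]]) -> Optional[Dict[str, float]]:
    """Summarize pairwise (Ka, Ks) estimates for one gene family.

    Returns the arithmetic mean Ka/Ks and the fraction of pairs with
    Ka/Ks > 1; pairs with Ks == 0 are skipped (an assumption).
    """
    ratios = [ka / ks for ka, ks in pairwise if ks > 0]
    if not ratios:
        return None
    above_one = sum(r > 1.0 for r in ratios) / len(ratios)
    return {
        "mean_ka_ks": mean(ratios),
        "fraction_above_one": above_one,
        "meets_criterion": float(above_one > 0.5),
    }


# The family proportions reported above, recomputed from the counts.
print(round(100 * 27 / 72, 1), round(100 * 15 / 81, 1))
```

A mean Ka/Ks above 1 is suggestive of positive selection, but the site-level REL analysis reported in Table 3 is the more direct test.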

In tandem-duplicated NBS families of soybean, the

average Ka/Ks values in the CDS and xxLxLxx region

were 0.67 and 1.17, respectively, which were significantly

higher than those in segmentally duplicated NBS gene

pairs (0.489 in the CDS and 0.628 in the xxLxLxx region;

P < 0.001). This indicates that more amino acid substitutions

were selectively accumulated in tandem-duplicated

NBS paralogs than in segmentally duplicated NBS

pairs.

In the 30 unambiguously segmentally duplicated block

pairs in soybean, the average Ka/Ks values in syntenic

NBSs and the flanking gene pairs were 0.489 and 0.291,

respectively. This suggests that relaxation of negative

selection or enhancement of positive selection was signif-

icantly stronger in the NBS paralogs than in other genes

(P < 0.001). In the genes flanking the NBSs, only two

duplicated pairs showed evidence of positive selection—a

PMP22-like gene with a Ka/Ks ratio of 1.58 and a RAD-

like gene with a Ka/Ks ratio of 1.35. All other retained

gene pairs flanking the NBSs appeared to be under purifying

selection. In addition, six homoeologous NBS pairs with

significant Ka > Ks (P < 0.05) were observed. Interestingly,

the average Ka/Ks (0.555 in the CDS and 0.752 in

the LRR regions) in the TNL segmentally duplicated pairs

was significantly higher than that in the non-TNL duplicated

pairs (0.441 and 0.514, respectively; P < 0.05),

suggesting that positive selection might play an important

role in the rapid diversification of TNL paralogs.

Table 3 Detection of positive selection and positively selected sites in soybean NBS paralogs

            ≥2 members in a family                       ≥3 members in a family
            Family numbers   Families with Ka > Ks (a)   Family numbers   Families with positively selected sites (b)
Soybean     72               27                          42               36 (average 38.6 sites per family)
Medicago    81               15                          48               28 (average 35.6 sites per family)

a Average Ka > Ks between the xxLxLxx motifs in a family (Tables S5 and S6)

b Positively selected sites with posterior probability >0.98 were detected using the HyPhy package with the REL method



In families with ≥3 NBS paralogs, approximately

69.3% of positively selected sites and 71.1% of negatively

selected sites were clustered in the LRR and NBS regions,

respectively, and this finding is consistent with that of

previous studies (Mondragon-Palomino et al. 2002; Chen

et al. 2010). Interestingly, similar selection pressures were

detected in most NBS family pairs between the soybean

and Medicago genomes. For example, 5 Rpg1-b homologs

were detected in Medicago (Table S7). In the Rpg1-b locus,

there were 38 positively selected sites and no negatively

selected sites, and the average Ka/Ks in the xxLxLxx motif

region was 4.5 (Fig. 4a). Similarly, 128 positively selected

sites, 7 negatively selected sites, and an average Ka/Ks of

1.12 were found between 8 Rpg1-b homologs (Table S7) in

the soybean genome (Fig. 4b), suggesting that positive

selection was detected in the Rpg1-b locus of both soybean

and Medicago. In the Rps1 locus of soybean, 111 positively

selected sites and 66 negatively selected sites were detec-

ted, and Ka/Ks was found to be 1.41 between 17 Rps1-like

homologs (Fig. 4d). Three positively selected sites, no negatively

selected sites, and a Ka/Ks ratio of 1.12 were found

between 6 Rps1-like homologs in Medicago (figure not

shown; Table S7). Similar purifying selection was detected

in the RCT1 locus in both soybean and Medicago: 2 pos-

itively selected sites, 137 negatively selected sites and a

Ka/Ks value of 0.612 were detected between 5 RCT1-like

paralogs in Medicago (Fig. 4f), while 5 positively selected

sites, 4 negatively selected sites, and a Ka/Ks ratio of 0.557

were found between 6 RCT1-like paralogs in soybean

(figure not shown).

Fig. 4 The posterior probability of positively (abbreviated as P on

the Y axis) and negatively (abbreviated as N on the Y axis) selected

sites in four functional NBS–LRR loci. The X axis denotes the

position in the amino acid alignment. Sites with blue bars have

posterior probability >0.98, as determined by the HyPhy package

with the REL method run on the resulting GARD fragments. The

boxes under each graph denote the domain structures of the nucleotide

sequences in the group as identified using Pfam. a The sequence

group (including 5 Rpg1-b homologs) in Medicago. b, c The sequence

groups on chromosome 13 (including 8 Rpg1-b homologs) and its

segmentally duplicated region on chromosome 15 (including 11

Rpg1-b homologs) in soybean, respectively. d–f The sequence groups

of Rps1 (17 homologs) and RPP4 (3 homologs) from soybean and

RCT1 (5 homologs) from Medicago, respectively
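The site counts quoted for each locus come from applying the posterior probability cutoff of 0.98 to per-site results. A minimal sketch of that thresholding step is given below; the function name, the two input lists, and the toy values are hypothetical and do not reflect the output format of HyPhy or any data from the study.

```python
THRESHOLD = 0.98  # posterior probability cutoff stated in the caption


def classify_sites(post_positive, post_negative, threshold=THRESHOLD):
    """Return 1-based alignment positions whose posterior exceeds the cutoff.

    post_positive[i] and post_negative[i] are assumed to be the posterior
    probabilities that site i+1 is under positive or negative selection.
    """
    positive = [i + 1 for i, p in enumerate(post_positive) if p > threshold]
    negative = [i + 1 for i, p in enumerate(post_negative) if p > threshold]
    return positive, negative


# Toy posteriors for a four-site alignment (illustrative values only).
toy_positive = [0.99, 0.50, 0.981, 0.10]
toy_negative = [0.01, 0.995, 0.02, 0.98]

pos_sites, neg_sites = classify_sites(toy_positive, toy_negative)
print("positively selected sites:", pos_sites)
print("negatively selected sites:", neg_sites)
```

The comparison is strict, so a site with a posterior of exactly 0.98 is not counted.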



Different selective pressures were detected between

segmentally duplicated NBS gene families which diversi-

fied rapidly. For example, Fig. 4b and c represent groups of

sequences of the Rpg1-b locus on chromosome 13 (8

homologs; Table S7) and its segmentally duplicated region

on chromosome 15 (11 homologs; Table S7) in soybean,

respectively. Between these two segmentally duplicated

regions, the collinearity is quite good for the flanking non-

NBSs, but not for most NBS genes. The cluster on chro-

mosome 13 that contains Rpg1-b appears to be completely

absent from chromosome 15, which was also observed by

Innes et al. (2008). Interestingly, positive selection was

detected in 8 Rpg1-b-like homologs on chromosome 13

(Fig. 4b). However, only 13 positively selected sites, 30

negatively selected sites, and a Ka/Ks value of 0.89 were

found between the homologs on the duplicated region on

chromosome 15 (Fig. 4c), suggesting the opposite selection

pressure (purifying selection) on these homologs compared

with their counterpart genes on chromosome 13.

Sequence exchanges, including recombination, gene

conversion, and/or unequal crossing-over, were detected in

the NBS families. Interestingly, more events of sequence

exchange were found in families under positive selection

(Table S7; Fig. 4a, b, d, e) than in those under negative

selection (Table S7; Fig. 4f). Sequence exchanges were

also detected in other NBS families with ≥3 members.

Notably, significant sequence exchanges (P < 0.05) were

found in most NBS families (15 of 18 in soybean and 10 of

11 in Medicago) with significant Ka > Ks (Tables S5 and

S6). These results suggest that sequence exchanges play an

important role in rapidly fixing a beneficial mutation and

effectively removing a deleterious mutation.

Atypical domain combinations of NBS-encoding genes

A previous study has shown that there are nine domain

arrangements in the TIR–NBS subfamily in Medicago,

suggesting diverse domain combinations in the TIR–NBS

subfamily: N, NL, NT, NTNL, TN, TNL, TNLT, TNLTNL,

TNTNL, and TTNL (Ameline-Torregrosa et al. 2008).

Interestingly, 21 other atypical NBSs with unconven-

tional domain combinations were detected in this study,

including 6 LRR–NBS–TIRs (LNTs) and 2 NBS–TIRs

(NTs) in the Medicago genome and 3 LNTs and 10 NTs in

the L. japonicus genome (Table 1; Table S8). To avoid

errors in gene annotation, we compared these predicted

genes with available expressed sequence tags (ESTs) from

GenBank. By applying a high match stringency of at least

98% nucleotide identity between the ESTs and predicted

genes, we found that 7 of 8 NBS genes in Medicago had

EST support (Table S8). Previous studies have shown that

phylogenies calculated from the NBS domain robustly

distinguish TIR–NBS and non-TIR–NBS genes (Meyers

et al. 2003; Ameline-Torregrosa et al. 2008; Yang et al.

2008). However, all these NTs or LNTs were clustered in

one clade on the non-TIR–NBS phylogenetic branch, in

contrast to previous diverse domain combinations of the

TIR subfamily whose NBS domains were clustered in the

TIR–NBS phylogenetic branch (Ameline-Torregrosa et al.

2008; Yang et al. 2008). These findings suggest that these

atypical NBSs might result from recent lineage-specific

domain combinations between a non-TIR–NBS domain

and a TIR domain.

Discussion

Diversifying selection drives the higher evolutionary

rate of NBS genes than that of other genes

Plant populations are often extremely diverse in their

resistance to pathogens, and interactions at the molecular

level are often complex because of the coevolution of hosts

and pathogens, which are engaged in a never-ending con-

voluted battle of molecular one-upmanship (Deyoung and

Innes 2006). Thus, the coevolutionary arms race dynamics

in the gene-for-gene interaction is expected to result in a

high rate of turnover of resistance gene alleles (Flor 1956).

Previous studies have shown that most NBS–LRR genes

undergo rapid adaptive evolution with more rapid struc-

tural and functional divergence and genomic reorganiza-

tion than other genes (Leister et al. 1998; Ding et al. 2007;

Zhang et al. 2009; Chen et al. 2010). However, it is difficult

to accurately determine the differences in evolutionary

rates between NBS genes and other genes.

Comparison of NBS and non-NBS genes on segments

between syntenic homoeologs allowed investigation of

their relative evolutionary rates. Using Medicago as the

outgroup, we found that the average Ka, Ks, and Ka/Ks

values of the 3 unambiguously syntenically duplicated

NBS gene pairs were about 2.3-, 1.51-, and 1.6-fold higher

than those of the flanking non-NBS gene pairs, respectively,

suggesting a higher evolutionary rate and more amino

acid substitutions between the NBS duplicates. In addition,

the gene loss rate of the duplicated NBSs was ~twofold

higher than that of genes across the whole genome. Specifically,

approximately 65.0% of the 183 NBS gene loci had lost their

syntenic NBS homoeologs, whereas only 32.7% (15166/

46430) of gene pairs were missing across the whole soybean

genome (Schmutz et al. 2010). These results suggest that

the evolutionary rates of NBS genes, including the nucleotide

substitution and gene loss rates, were at least 1.5-fold

higher than those of other genes.
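The roughly twofold difference in gene loss can be checked directly from the figures quoted above. The short Python sketch below redoes that arithmetic; the variable names are ours, and the inputs are only the numbers stated in the text.

```python
# Figures quoted in the text (genome-wide counts from Schmutz et al. 2010).
nbs_loss_fraction = 0.650        # 65.0% of the 183 NBS loci lost their homoeolog
genome_missing_pairs = 15166     # gene pairs missing across the genome
genome_total_pairs = 46430       # gene pairs considered across the genome

genome_loss_fraction = genome_missing_pairs / genome_total_pairs
loss_fold = nbs_loss_fraction / genome_loss_fraction

print(f"genome-wide loss: {genome_loss_fraction:.1%}")
print(f"NBS loss relative to genome-wide loss: {loss_fold:.2f}-fold")
```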

Interestingly, among the NBS families present in the

soybean genome, 85.7% were observed to have significant

positively selected sites, and 37.5% were found to have



significant Ka > Ks (P < 0.05). Furthermore, the Ka of the

duplicated NBS gene pairs was ~2.3-fold greater than that

of the other gene pairs flanking the NBS genes. Collectively,

the results suggest that diversifying selection could be the

main driving force for the rapid diversification of NBS genes.
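The percentages cited for soybean follow from the family counts in Table 3. As a hedged illustration, the sketch below recomputes them for both species; the dictionary layout and helper name are assumptions, and the Medicago percentages are simply derived from the table rather than quoted from the text.

```python
# Counts taken from Table 3: (families examined, families meeting the criterion).
table3 = {
    "Soybean": {"ka_gt_ks": (72, 27), "selected_sites": (42, 36)},
    "Medicago": {"ka_gt_ks": (81, 15), "selected_sites": (48, 28)},
}


def percent(total: int, hits: int) -> float:
    """Share of families meeting a criterion, in percent."""
    return 100.0 * hits / total


summary = {
    species: {criterion: percent(*counts) for criterion, counts in rows.items()}
    for species, rows in table3.items()
}

for species, values in summary.items():
    print(
        f"{species}: Ka > Ks in {values['ka_gt_ks']:.1f}% of families, "
        f"positively selected sites in {values['selected_sites']:.1f}%"
    )
```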

On the other hand, significant sequence exchanges were

detected in most NBS families with significant Ka > Ks. In

contrast, gene conversion was not detected between the

non-NBS homoeologous gene pairs of the flanking regions

of the NBSs (Innes et al. 2008). Hayes et al. (2004) showed

that recombination within an NBS-encoding cluster produced

new variants that conferred resistance to soybean

mosaic virus in soybean, which indicated that sequence

exchange could play an important role in rapidly fixing a

beneficial mutation. Therefore, the combined effects of

frequent sequence exchanges and diversifying selection are

a possible mechanism by which plants obtain new R genes

to tackle rapidly changing pathogens.

TIR–NBS genes evolve faster than non-TIR–NBS

genes

NBS-encoding proteins can be categorized into two major

types, TNLs and non-TNLs, based on the structure of the

N-terminus domain that contains either a TIR or a coiled-

coil (CC) domain (Meyers et al. 2003; Yang et al. 2008).

TNLs are relatively homogeneous and form a single clade,

while non-TNLs form multiple clades (Yang et al. 2008; Li

et al. 2010). Our previous studies have shown that these

two types of NBS genes differ in terms of their evolu-

tionary pattern (Yang et al. 2008; Chen et al. 2010). In the

grapevine, poplar, Arabidopsis, and rice genomes, exten-

sive species-specific expansion was detected in TNLs,

while most non-TNLs within a clade of the phylogenetic

tree were poly- or paraphyletic (Yang et al. 2008). These

results indicate that these two types of NBSs might be

responsible for recognizing different types of pathogens.

Comparison of the closely related species A. thaliana and

A. lyrata shows that the branch length of the phylogenetic

tree between TNL homoeologs is significantly longer than

that between non-TNL homoeologs, suggesting that TNLs

might evolve faster than non-TNLs (Chen et al. 2010).

The recent segmental duplication of the soybean genome

also provides an opportunity for estimating the relative

evolutionary rate of TNL and non-TNL genes. Our study

shows that the synonymous substitution rate of segmental

TNL duplicates (average Ks = 0.345) was approximately

1.6-fold higher than that of non-TNLs (average Ks =

0.215). A similar result was obtained in our earlier study,

where we found that the Ks of homoeologous TNL pairs was

significantly higher (~1.4-fold; P < 0.05) than that of

homoeologous non-TNL pairs between A. thaliana and

A. lyrata (Chen et al. 2010).
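For reference, the TNL versus non-TNL comparisons reported in this paper can be expressed as simple ratios of the quoted averages. The sketch below is only that arithmetic; the labels are ours, and no statistical test is implied.

```python
# (TNL average, non-TNL average) for segmentally duplicated pairs, as quoted in the text.
reported_averages = {
    "Ks": (0.345, 0.215),
    "Ka/Ks, CDS": (0.555, 0.441),
    "Ka/Ks, LRR region": (0.752, 0.514),
}

# Ratio of the TNL average to the non-TNL average for each measure.
fold_differences = {
    measure: tnl / non_tnl for measure, (tnl, non_tnl) in reported_averages.items()
}

for measure, fold in fold_differences.items():
    print(f"{measure}: TNL / non-TNL = {fold:.2f}")
```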

Notably, during plant evolution, TNLs and non-TNLs

exhibit different evolutionary fates. TNLs are found in

bryophytes (Akita and Valkonen 2002), both TNLs and

non-TNLs are found in gymnosperms and eudicots,

and only non-TNLs are found in monocots (Liu and

Ekramoddoullah 2003; Meyers et al. 2003; Li et al. 2010).

A recent study shows that TIR-type NBS sequences are not

present in monocots, but they do exist in basal angio-

sperms, suggesting that these sequences have been lost or

reduced significantly in monocots (Tarr and Alexander

2009). The results indicate that since different plant lin-

eages have different life histories, the nature of selective

pressure imposed by their environmental conditions has

driven the diversified evolution of TNLs and non-TNLs.

Tandem duplication plays a major role in the expansion

of NBS genes

Gene duplication is one of the major evolutionary mecha-

nisms for generating novel genes that help organisms adapt

to different environments (Ohno 1970). Segmental and

tandem duplications are well-known patterns of gene

duplication in plants. In the case of soybean, genomic

segmental duplications led to a substantial increase in the

gene number (Schmutz et al. 2010). However, only 20.7%

of NBS genes were retained in both segmentally duplicated

regions, which is significantly lower than the frequencies of

the other retained genes (40–50%; Schmutz et al. 2010).

Similar results were also observed in the maize and rice

genomes in which <10% of the segmentally duplicated

NBS genes were retained (Li et al. 2010; Yang et al. 2006).

In contrast, ~73.7% of NBS genes were physically clustered

in the soybean genome, and 87.9% of these were

classified as family members (Figure S1), suggesting that

tandem duplication rather than segmental duplication plays

a major role in the expansion of NBS genes. Segmentally

duplicated NBSs appear to be preferentially lost, and

this loss following polyploidy appears to be a general

phenomenon, as these genes are also highly underrepre-

sented in duplicated regions of the Arabidopsis genome

(Nobuta et al. 2005; Innes et al. 2008).

Tandem-duplicated NBS genes, which are local

redundancies of R gene sequences, may act as a reservoir

of genetic variation from which new specificities can

evolve (Dangl and Jones 2001). Copy number variations

and sequence exchanges frequently occur in these NBS

loci, indicating that they are inherently unstable, fast

evolving, and complex (Noel et al. 1999; Yang et al. 2006;

Kuang et al. 2008; Yi and Richards 2008; Chen et al. 2010;

Li et al. 2010). In fact, individual clusters may confer

specific resistance to a wide range of pathogens and

pathogen genotypes. For example, the Rsv1 gene, which

confers resistance to Soybean mosaic virus, and the



Rpg1-b gene, which confers resistance to Pseudomonas

syringae pv glycinea, were mapped to the same locus in

different soybean lines (Innes et al. 2008).

It is clear that tandem duplication is one of the major

driving forces for the expansion of NBS genes. Why do these

tandem duplicates tend to be retained? An important feature

of tandem R genes is their high rate of duplication per gen-

eration. As a result, new tandem NBS paralogs are contin-

uously generated, probably providing a pool of highly

dynamic targets for selection. In addition, tandem NBS-

encoding duplicates are highly variable within species

(Yang et al. 2006; Ding et al. 2007; Kuang et al. 2008). This

high-level within-species variation among tandem genes

further increases the number of targets that can be selected

from ever-changing pathogen populations. Most NBS

duplicates have undergone lineage-specific selection and

rapid birth-and-death evolution, which means that these

genes are probably important for adaptive evolution to rap-

idly changing environments. However, soon after genomic

segmental duplication, the typical fate of NBS duplicates is

rapid loss, possibly because of selection. If the cost of

maintaining the duplicated R genes is higher than the cost of

disease threat, selection will act to remove one or both of the

duplicated R genes from the genome (Tian et al. 2003).

Further, the Ka/Ks ratios are significantly higher (aver-

age 0.67 vs. 0.49) and sequence exchange events are more

frequent (average 19.8 vs. 3.36) for tandem NBS duplicates

than for segmental NBS duplicates (t test, P < 0.05). This

suggests that diversifying selection is important for

increasing the sequence divergence between paralogs. On

the other hand, frequent sequence exchanges would spread

these beneficial mutations rapidly and result in homoge-

nization of paralogs, which are required for keeping pace

with changes in the pathogen populations. Therefore,

clusters of closely related NBS genes that have originated

from tandem gene duplication may provide a reservoir of

genetic variation. The subsequent combined effects of

frequent sequence exchanges and diversifying selection on

these NBS paralogs may allow plants to evolve resistance

to rapidly changing pathogens.
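The comparison of tandem and segmental duplicates above is reported as a t test, without stating which variant was used. The following Python sketch shows one possible form, Welch's unequal-variance t statistic, using only the standard library. The choice of Welch's test is an assumption, and the two input lists are toy values for illustration, not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample (sample variance / n).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof


# Toy Ka/Ks values (hypothetical, for illustration only).
toy_tandem = [0.5, 0.7, 0.8, 0.6, 0.9]
toy_segmental = [0.4, 0.5, 0.6, 0.3, 0.5]

t_stat, dof = welch_t(toy_tandem, toy_segmental)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.2f}")
```

A P value would then be obtained from the t distribution with the computed degrees of freedom, for example with a statistics package.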

Acknowledgments This work was supported by the National

Natural Science Foundation of China (30930008, 30970198 and

J0730641) and National Key Project for Gene Transform in China

(2009ZX08009-27B). Two anonymous reviewers provided helpful

comments.

References

Akita M, Valkonen JP (2002) A novel gene family in moss

(Physcomitrella patens) shows sequence homology and a

phylogenetic relationship with the TIR–NBS class of plant

disease resistance genes. J Mol Evol 55:595–605

Ameline-Torregrosa C, Wang BB, O’Bleness MS, Deshpande S, Zhu

H, Roe B, Young ND, Cannon SB (2008) Identification and

characterization of NBS–LRR genes in the model plant Medi-

cago truncatula. Plant Physiol 146:5–21

Bailey TL, Elkan C (2005) The value of prior knowledge in

discovering motifs with MEME. Proc Int Conf Intell Syst Mol

Biol 3:21–29

Bakker EG, Toomajian C, Kreitman M, Bergelson J (2006) A

genome-wide survey of R gene polymorphisms in Arabidopsis.

Plant Cell 18:1803–1818

Chen Q, Han Z, Jiang H, Tian D, Yang S (2010) Strong positive

selection drives rapid diversification of R-genes in Arabidopsis

relatives. J Mol Evol 70:137–148

Dangl JL, Jones JD (2001) Plant pathogens and integrated defence

responses to infection. Nature 411:826–833

Deyoung BJ, Innes RW (2006) Plant NBS–LRR proteins in pathogen

sensing and host defense. Nat Immunol 7:1243–1249

Ding J, Zhang W, Jing Z, Chen J-Q, Tian D (2007) Unique pattern of

R-gene variation within populations in Arabidopsis. Mol Genet

Genomics 277:619–629

Flor HH (1956) Mutations in flax rust induced by ultraviolet radiation.

Science 124:888–889

Hayes AJ, Jeong SC, Gore MA, Yu YG, Buss GR, Tolin SA, Saghai

Maroof MA (2004) Recombination within a nucleotide-binding-site/leucine-rich

repeat gene cluster produces new variants

conditioning resistance to soybean mosaic virus in soybeans.

Genetics 166:493–503

Innes RW, Ameline-Torregrosa C, Ashfield T, Cannon E, Cannon SB

et al (2008) Differential accumulation of retroelements and

diversification of NB-LRR disease resistance genes in duplicated

regions following polyploidy in the ancestor of soybean. Plant

Physiol 148:1740–1759

Kuang H, Caldwell KS, Meyers BC, Michelmore RW (2008)

Frequent sequence exchanges between homologs of RPP8 in

Arabidopsis are not necessarily associated with genomic prox-

imity. Plant J 54:69–80

Leister D (2004) Tandem and segmental gene duplication and

recombination in the evolution of plant disease resistance genes.

Trends Genet 20:116–122

Leister D, Kurth J, Laurie DA, Yano M, Sasaki T et al (1998) Rapid

reorganization of resistance gene homologues in cereal genomes.

Proc Natl Acad Sci USA 95:370–375

Li J, Ding J, Peng C, Zhang Y, Tang P, Chen JQ, Tian D, Yang S

(2010) Unique evolutionary pattern of numbers of gramineous

NBS–LRR genes. Mol Genet Genomics 283:427–438

Liu JJ, Ekramoddoullah AK (2003) Root-specific expression of a

western white pine PR10 gene is mediated by different promoter

regions in transgenic tobacco. Plant Mol Biol 52:103–120

Lupas A, Van Dyke M, Stock J (1991) Predicting coiled coils from

protein sequences. Science 252:1162–1164

Lynch M, Conery JS (2000) The evolutionary fate and consequences

of duplicated genes. Science 290:1151–1155

Lynch M, Crease TJ (1990) The analysis of population survey data on

DNA sequence variation. Mol Biol Evol 7:377–394

Martin GB, Bogdanove AJ, Sessa G (2003) Understanding the

functions of plant disease resistance proteins. Annu Rev Plant

Biol 54:23–61

Meyers BC, Kozik A, Griego A, Kuang H, Michelmore RW (2003)

Genome-wide analysis of NBS–LRR-encoding genes in Arabid-

opsis. Plant Cell 15:809–834

MGSC (2007) Medicago truncatula genome “Mt2.0” release whitepaper,

available at http://medicago.org/genome/downloads/Mt2/

Mondragon-Palomino M, Meyers BC, Michelmore R, Gaut B (2002)

Patterns of positive selection in the complete NBS–LRR

gene family of Arabidopsis thaliana. Genome Res 12:1305–

1315




Nei M, Gojobori T (1986) Simple methods for estimating the numbers

of synonymous and nonsynonymous nucleotide substitutions.

Mol Biol Evol 3:418–426

Nobuta K, Ashfield T, Kim S, Innes RW (2005) Diversification of

non-TIR class NB-LRR genes in relation to whole-genome

duplication events in Arabidopsis. MPMI 18:103–109

Noel L, Moores TL, van Der Biezen EA, Parniske M, Daniels MJ,

Parker JE, Jones JD (1999) Pronounced intraspecific haplotype

divergence at the RPP5 complex disease resistance locus of

Arabidopsis. Plant Cell 11:2099–2112

Ohno S (1970) Evolution by gene duplication. Allen & Unwin;

Springer-Verlag, London

Pond SL, Frost SD, Muse SV (2005) HyPhy: hypothesis testing using

phylogenies. Bioinformatics 21:676–679

Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW (2006)

Automated phylogenetic detection of recombination using a

genetic algorithm. Mol Biol Evol 23:1891–1901

Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003)

DnaSP, DNA polymorphism analyses by the coalescent and

other methods. Bioinformatics 19:2496–2497

Sato S, Nakamura Y, Kaneko T, Asamizu E, Kato T et al (2008)

Genome structure of the legume, Lotus japonicus. DNA Res

15:227–239

Schlueter JA, Lin JY, Schlueter SD, Vasylenko-Sanders IF, Desh-

pande S et al (2007) Gene duplication and paleopolyploidy in

soybean and the implications for whole genome sequencing.

BMC Genomics 8:330

Schmutz J, Cannon SB, Schlueter J, Ma J, Hyten DL et al (2010)

Genome sequence of the palaeopolyploid soybean. Nature

463:178–183

Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular

Evolutionary Genetics Analysis (MEGA) software version 4.0.

Mol Biol Evol 24:1596–1599

Tarr DE, Alexander HM (2009) TIR–NBS–LRR genes are rare in

monocots: evidence from diverse monocot orders. BMC Res

Notes 2:197

Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W:

improving the sensitivity of progressive multiple sequence

alignment through sequence weighting, position-specific gap

penalties and weight matrix choice. Nucleic Acids Res 22:

4673–4680

Tian D, Traw MB, Chen JQ, Kreitman M, Bergelson J (2003) Fitness

costs of R-gene-mediated resistance in Arabidopsis thaliana.

Nature 423:74–77

Van K, Kim DH, Cai CM, Kim MY, Shin JH et al (2008) Sequence

level analysis of recently duplicated regions in soybean [Glycine max (L.) Merr.] genome. DNA Res 15:93–102

Yang S, Feng Z, Zhang X, Jiang K, Jin X, Hang Y, Chen JQ, Tian D

(2006) Genome-wide investigation on the genetic variations of

rice disease resistance genes. Plant Mol Biol 62:181–193

Yang S, Zhang X, Yue J-X, Tian D, Chen JQ (2008) Recent

duplications dominate NBS-encoding gene expansion in two

woody species. Mol Genet Genomics 280:187–198

Yi H, Richards EJ (2008) Phenotypic instability of Arabidopsis alleles

affecting a disease resistance gene cluster. BMC Plant Biol 8:36

Zhang Y, Wang J, Zhang X, Chen JQ, Tian D, Yang S (2009) Genetic

signature of rice domestication shown by a variety of genes.

J Mol Evol 68:393–402


