Association mapping for improvement of agronomic traits in rice

Doctoral seminar on

Presented by Zuge Sopan Shivaji, Ph.D. Scholar, Dept. of Genetics and Plant Breeding, CoA, Raipur.

Introduction

• Polygenic inheritance of agronomic traits: these traits are controlled by multiple genes whose expression is affected by many factors. Hence phenotypic selection becomes a tedious job.

• Family mapping (limitations: biparental population, low resolution, analysis of only 2 alleles, time consuming).

• Population or association mapping offers (i) increased mapping resolution, (ii) reduced research time, and (iii) greater allele number (Yu and Buckler, 2006).

• Association mapping identifies quantitative trait loci (QTLs) by examining the marker-trait associations that can be attributed to the strength of linkage disequilibrium between markers and the functional polymorphisms underlying the phenotype across a set of diverse germplasm.

• Association mapping, also known as "linkage disequilibrium

mapping", is a method of mapping quantitative trait loci (QTLs)

that takes advantage of linkage disequilibrium to link phenotypes

to genotypes.

Offers greater precision in QTL location than family-based

linkage analysis.

Does not require family or pedigree information, so it can be applied to a range of experimental and non-experimental populations.

Association mapping (AM)

How does it work?

• Association studies are based on the assumption that a marker locus

is ‘sufficiently close’ to a trait locus so that some marker allele

would be ‘travelling’ along with the trait allele through many

generations during recombination.

Direct and Indirect Allelic Association

[Figure: Direct association: the causal variant (*) at trait locus D is measured directly, ignoring correlated markers nearby. Indirect association and LD: trait effects at D are assessed via correlated markers (M1, M2, ..., Mn) rather than the susceptibility/etiologic variants.]

• Direct: the allele of interest is itself involved in the phenotype.

• Indirect: the allele itself is not involved, but it is correlated with a nearby variant that changes the phenotype.

Linkage mapping

Alfred Sturtevant constructed the first (very small) genetic map in 1913.

A genetic map places genes/markers in order, indicates the relative genetic distances between them, and assigns them to their chromosomes.

Distance = recombination frequency = (number of recombinants / total progeny) × 100

Suppose the recombination between loci A and B is 6%, that between loci B and C is 20%, and that between A and C 24%, then we can order the loci along the chromosome as…

(Hartal et al., 2010)
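The recombination-frequency formula and the three-locus example above can be sketched in Python. This is only an illustration: the function names are hypothetical, and the ordering rule assumes that the pair of loci with the largest recombination frequency lies at the two ends.

```python
def recombination_frequency(recombinants: int, total_progeny: int) -> float:
    """Recombination frequency in percent: recombinants / total progeny x 100."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    return 100.0 * recombinants / total_progeny


def order_three_loci(pairwise_rf: dict) -> tuple:
    """Order three loci, assuming the pair with the largest RF is outermost.

    pairwise_rf maps a (locus, locus) tuple to its recombination frequency (%).
    """
    loci = sorted({locus for pair in pairwise_rf for locus in pair})
    if len(loci) != 3 or len(pairwise_rf) != 3:
        raise ValueError("expected exactly three loci and three pairwise values")
    outer_first, outer_second = max(pairwise_rf, key=pairwise_rf.get)
    middle = next(l for l in loci if l not in (outer_first, outer_second))
    return (outer_first, middle, outer_second)


# Values from the example: A-B = 6%, B-C = 20%, A-C = 24%
pairwise = {("A", "B"): 6.0, ("B", "C"): 20.0, ("A", "C"): 24.0}
locus_order = order_three_loci(pairwise)
print(locus_order)
```

Note that 6 + 20 = 26 is larger than the 24 observed between A and C: double crossovers between the outer loci go undetected, so longer intervals tend to be underestimated.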

Mapping resolution

(Braulio et al., 2012)

Marker-trait associations in experimental and natural populations

Experimental populations (e.g. F2, RIL): two parental alleles; small genetic variation; few meiotic cycles; low resolution

Natural populations: many alleles; large genetic variation; many meiotic cycles; high resolution

Approaches to mapping genes

(Yu and Buckler, 2006)

Advantages of AM over linkage mapping

Linkage mapping | Association mapping

Structured population (e.g. biparental population) | Unstructured population (e.g. germplasm lines)

Low resolution (few to several centimorgans away from gene/QTL) | High resolution (much closer than those by linkage mapping)

Only few alleles can be detected | Many alleles can be detected

Moderate marker density | High/moderate marker density

Feasible in annual and biennial species, not feasible in perennial species | Feasible in annual, biennial and perennial species

Narrow range | Wide range

Time consuming | Less time required

(Yu et al., 2006)

Types of association mapping

1. Genome wide association mapping: Search whole genome for causal

genetic variation. A large number of markers are tested for association

with various complex traits and it doesn’t require any prior information

on the candidate genes.

2. Candidate gene association mapping: Dissect out the genetic control

of complex traits, based on the available results from genetic,

biochemical, or physiology studies in model and non-model plant

species (Mackay, 2001). Requires identification of SNPs between lines

within specific genes.

(Zhu et al., 2009)

Steps in association mapping

Mapping population and Population structure

• Randomly or non-randomly mated germplasm

• Randomly mated populations represent a rather narrow group of

germplasm, likely to lower resolution and harbor only a narrow range of

alleles

• When non-randomly mated germplasm is used, population structure needs to be controlled in the statistical analysis.

• Cluster analysis is done to assess the variation in the population, and the most diverse individuals are selected from each cluster to represent that cluster.

(Yu et al., 2006)

Phenotyping


• Success of AM depends on accuracy and throughput of genotyping

• Replicate trials in randomized plots across multiple years, locations and environments.

• Field design: incomplete block design (lattice) or randomized block design (RBD) (Eskridge, 2003).

Should be done on the basis of:

• Diversity: on the basis of phenotype and genotype

• Population structure: systematic difference in allele frequencies between sub-populations…
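One way to put a number on "systematic difference in allele frequencies between sub-populations" is Wright's F_ST for a single biallelic marker. The slides do not name this statistic; it is swapped in here as a simple illustration, and the function name and input layout are assumptions.

```python
def fst_biallelic(subpops):
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic marker.

    subpops: list of (allele_frequency, sample_size) tuples, one per
    sub-population (a hypothetical input layout chosen for this sketch).
    """
    total_n = sum(n for _, n in subpops)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    # Pooled allele frequency, weighted by sub-population size
    p_bar = sum(p * n for p, n in subpops) / total_n
    h_total = 2.0 * p_bar * (1.0 - p_bar)      # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0                             # marker is monomorphic overall
    # Mean expected heterozygosity within sub-populations
    h_within = sum(2.0 * p * (1.0 - p) * n for p, n in subpops) / total_n
    return (h_total - h_within) / h_total


# Same frequency everywhere: no structure at this marker
print(fst_biallelic([(0.3, 40), (0.3, 60)]))
# Allele fixed in one group and absent in the other: complete differentiation
print(fst_biallelic([(1.0, 50), (0.0, 50)]))
```

Values near 0 indicate little structure at the marker; values near 1 indicate that the sub-populations carry almost entirely different alleles.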

Genotyping

• Mostly multiallelic, reproducible, PCR-based markers are used.

• Microsatellites or simple sequence repeats (SSRs) and SNPs are codominant, so they are more informative than dominant markers and, therefore, more powerful.

• Due to their higher genome density, lower mutation rate and wide distribution throughout the genome, SNPs are rapidly becoming the marker of choice for complex trait studies.

Linkage Disequilibrium

Linkage disequilibrium means that we don’t need to genotype the

exact causal variant, but only a variant that is correlated with it.

Linkage Disequilibrium Map & Allelic Association

Primary Aim of LD maps: To identify the relationship between marker and QTL or trait of interest.

[Figure: markers 1, 2, 3, ..., n in LD with the trait locus D]

Linkage disequilibrium (LD)

• LD refers to the non-random association of alleles at different loci.

• LD follows the fact that closely located genes are transmitted as a block, which only

rarely breaks up in meiosis.

• Closely located genes often show linkage disequilibrium with each other. As an example, consider two independently segregating genes A and B with two alleles each (A, a and B, b respectively).

• At equilibrium, the frequency of the AB haplotype should equal the product of the allele frequencies of A and B: $P_{AB} = p_A \, p_B$.

• Equivalently, $P_{AB} \, P_{ab} = P_{Ab} \, P_{aB}$ (1:1 ratio = no LD).

• Any deviation from these values implies LD.

      | A  | a  | Total
B     | AB | aB | B
b     | Ab | ab | b
Total | A  | a  |


(Zhu et al., 2009)
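The equilibrium condition above can be turned into the usual LD statistics. The sketch below computes D, D′ and r² from the four haplotype frequencies of the 2 × 2 table; the function name is hypothetical, and the formulas are the standard textbook definitions rather than anything specific to the cited source.

```python
def ld_statistics(p_AB: float, p_Ab: float, p_aB: float, p_ab: float):
    """Return (D, D_prime, r_squared) from the four haplotype frequencies."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab          # frequency of allele A
    p_B = p_AB + p_aB          # frequency of allele B
    # D = P_AB * P_ab - P_Ab * P_aB, which equals P_AB - p_A * p_B
    D = p_AB * p_ab - p_Ab * p_aB
    # D' scales D by its maximum possible value for these allele frequencies
    if D >= 0:
        d_max = min(p_A * (1.0 - p_B), (1.0 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1.0 - p_A) * (1.0 - p_B))
    d_prime = D / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between alleles at the two loci
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    r_squared = D * D / denom if denom > 0 else 0.0
    return D, d_prime, r_squared


print(ld_statistics(0.25, 0.25, 0.25, 0.25))  # equilibrium: D = 0
print(ld_statistics(0.5, 0.0, 0.0, 0.5))      # only AB and ab present: complete LD
```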

[Figure: LD decay with time for four different recombination fractions (θ)]

(Powell et al., 2006)
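The decay curves in the figure follow the standard recurrence D_t = D_0 (1 − θ)^t, which holds under random mating. A minimal sketch is given below; the four θ values and the starting value D_0 = 0.25 are illustrative assumptions, not the values plotted in the cited figure.

```python
import math


def ld_after_generations(d0: float, theta: float, generations: int) -> float:
    """D_t = D_0 * (1 - theta)^t, assuming random mating."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - theta) ** generations


def generations_to_halve(theta: float) -> float:
    """Number of generations after which D falls to half its starting value."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    return math.log(0.5) / math.log(1.0 - theta)


# Illustrative recombination fractions (assumed, not taken from the figure)
for theta in (0.5, 0.1, 0.01, 0.001):
    decay = [round(ld_after_generations(0.25, theta, t), 4) for t in (0, 10, 100)]
    print(theta, decay, round(generations_to_halve(theta), 1))
```

Unlinked loci (θ = 0.5) lose half of their LD every generation, whereas tightly linked loci keep it for hundreds of generations, which is why LD declines with both genetic distance and number of generations.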

Factors affecting LD

LD increases due to population structure, relatedness (kinship), small founder population size or genetic drift, and selection (natural, artificial), whereas factors like outcrossing, high recombination rate, high mutation rate and gene conversion lead to a decrease/disruption in LD. Thus, LD declines with (1) increase in genetic distance and (2) increase in number of generations.

(Huttley et al., 2005)

Evaluation of linkage disequilibrium and genotype-phenotype association

• TASSEL (http://www.maizegenetics.net) is used to measure the extent of LD as squared allele-frequency correlation estimates (r²; Weir, 1996) and to test the significance of r².

• Besides TASSEL, many other software packages such as DnaSP and Arlequin are used to calculate D′ and r².


Software used in AM

Sr. | Software | Focus | Description

1. | TASSEL | Association analysis | Free; LD statistics, sequence analysis, association mapping

2. | Haploview 4.2 | Haplotype analysis and LD | LD and haplotype block analysis, haplotype population frequency estimation, single SNP and haplotype association tests

3. | SVS 7 | Stratification, LD and AM | Estimates stratification, LD, haplotype blocks and multiple AM approaches for up to 1.8 million SNPs and 10,000 samples

4. | GenStat | Stratification, LD and AM | SSR markers, GLM and MLM-PCA methods

5. | JMP Genomics | Stratification, LD and structured AM | SNPs, CG and GWAS, analysis of common and rare variants

6. | GenAMap | Stratification, LD and structured AM | SNPs, tree of functional branches, multiple visualization tools

7. | PLINK | Stratification, LD and structured AM | SNPs, multiple AM approaches, IBD and IBS analyses

8. | STRUCTURE | Population structure | Computes an MCMC Bayesian analysis to estimate the proportion of the genome of an individual originating from the different inferred populations

9. | SPAGeDi | Relative kinship | Genetic relationship analysis

(Braulio et al., 2012)

Advantages of AM

1. Saves time, effort, and cost needed for the development of specific

mapping populations.

2. The QTL-linked markers identified by AM can be directly used for MAS

3. AM has high resolution

4. AM would assess the entire range of diversity in the trait of interest

5. Associated markers identified during AM can be used for either selection

of parents for hybridization or for selection of desirable segregants

Disadvantages of AM

1. The results from AM are affected by several factors like selection history, population structure and kinship, which may lead to false-positive associations.

2. Large number (hundreds of thousands or even millions) of markers

would be required to adequately cover the entire genome.

3. High-quality phenotypic data are required (multiple environments and multiple locations).

4. The rate of recombination is not uniform throughout the genome.

Need of Association mapping in Rice

• Rice (Oryza sativa) is a staple food that feeds 3 billion people.

• Rice has the largest germplasm variability and genomic database resources as compared to any other species.

• All the agronomic traits in rice (Grain yield, Days to maturity,

Height, etc.) have quantitative inheritance.

• The challenge and opportunity is to use this information to understand and predict how genotypic variation gives rise to the abundance of phenotypic variation, and to apply it in MAS.

Association mapping studies in Rice

Population | Sample size | Markers used | Trait | Reference

Germplasm | 523 | 5291 SNPs | 12 agronomic traits | (Qing et al., 2015)

Diverse accessions | 203 | 154 SSRs | Harvest index | (Li et al., 2012)

Diverse rice accessions | 383 | 44,000 SNPs | Aluminum tolerance | (Famoso et al., 2011)

Diverse accessions | 413 | 44K SNPs | Agronomic traits | (Zhao et al., 2011)

Diverse accessions | 210 | 86 SSRs | Yield and grain quality | (Borba et al., 2010)

Landraces | 517 | 3,625,200 SNPs | 14 agronomic traits | (Xuehui et al., 2010)

Mini core collection | 90 | 108 SSRs | Stigma and spikelet characteristics | (Yan et al., 2009)

Diverse accessions | 103 | 123 SSRs | Yield and its components | (Agrama et al., 2007)

Case Study

517 landraces were phenotyped and genotyped by sequencing using the Illumina Genome Analyzer II.

Sequence reads were aligned to the rice reference genome for SNP identification.

Discrepancies with the rice reference genome were called as candidate SNPs.

A total of 3,625,200 SNPs were identified, resulting in an average of 9.32

SNPs per kb, with 87.9% of the SNPs located within 0.2 kb of the nearest SNP

A total of 167,514 SNPs were found in the coding regions of 25,409 genes.

3,625 large-effect SNPs (representing mutations predicted to cause large

effects) were identified.
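As a quick arithmetic check on the figures above, the short Python sketch below derives the sequence length implied by the SNP count and density, and the share of SNPs in coding regions. The derived quantities are computed here for illustration and are not stated in the case study itself.

```python
# Figures quoted in the case study
total_snps = 3_625_200
snps_per_kb = 9.32
coding_snps = 167_514
large_effect_snps = 3_625

# Sequence length implied by "total SNPs / SNPs per kb" (derived, not quoted)
implied_length_kb = total_snps / snps_per_kb

# Share of all SNPs that fall in coding regions or have predicted large effects
coding_fraction = coding_snps / total_snps
large_effect_fraction = large_effect_snps / total_snps

print(f"Implied sequence length: {implied_length_kb / 1000:.0f} Mb")
print(f"SNPs in coding regions: {coding_fraction:.2%}")
print(f"Large-effect SNPs: {large_effect_fraction:.2%}")
```

Only a small fraction of the SNPs lies in coding sequence, and roughly one in a thousand is predicted to have a large effect.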

Principal-component analysis separated the rice germplasm into two groups, i.e. indica and japonica.

Further, both indica and japonica had three subgroups.

Because of strong population differentiation between the two subspecies of cultivated rice, GWAS was conducted only for the 373 indica lines using a mixed linear model (MLM).

80 associations for the 14 agronomic traits were identified.

Heading date strongly correlated with both population structure and

geographic distribution.
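Why population structure matters for a trait like heading date can be shown with a toy example. The sketch below is not the MLM used in the case study: it is a much simpler fixed-effect adjustment (centering marker and trait within each subgroup before regression), and all data values, group labels and function names are invented for illustration.

```python
from statistics import mean


def regression_slope(x, y):
    """Least-squares slope of y on x (the estimated marker effect)."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("marker shows no variation")
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx


def center_within_groups(values, groups):
    """Subtract each subgroup's mean, removing between-group differences."""
    group_means = {
        g: mean(v for v, gg in zip(values, groups) if gg == g) for g in set(groups)
    }
    return [v - group_means[g] for v, g in zip(values, groups)]


# Invented data: the marker allele (1) is common in group "I" and rare in
# group "J", and group "I" flowers about 20 days later for unrelated reasons.
groups = ["I", "I", "I", "I", "J", "J", "J", "J"]
marker = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0]
heading_days = [100.0, 102.0, 98.0, 100.0, 80.0, 82.0, 78.0, 80.0]

naive_effect = regression_slope(marker, heading_days)
adjusted_effect = regression_slope(
    center_within_groups(marker, groups),
    center_within_groups(heading_days, groups),
)
print(naive_effect, adjusted_effect)
```

Ignoring structure, the marker appears to add about 10 days to heading; once subgroup membership is accounted for, the apparent effect vanishes. A spurious association of this kind is what structure correction in the association model is meant to prevent.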

Conclusion

• The ultimate aim of plant breeding is prediction of phenotype from genotype.

• Major agricultural economic traits are of a complex nature.

• It is essential to dissect these complex traits and assign them function.

• Advanced genomic tools like association mapping are a valuable option that can be effectively and efficiently utilized to accelerate crop improvement.

• Association mapping is a long-term commitment, so secure all the required resources before starting.
