Biplot Analysis of MET Data IITA

Upload: karna2012

Post on 05-Apr-2018

TRANSCRIPT

  • 7/31/2019 Biplot Analysis of MET Data IITA

    1/68

    Contact: [email protected]

    Weikai Yan, May 2006

  • 7/31/2019 Biplot Analysis of MET Data IITA

    2/68

    Weikai Yan 2006

    Multi-Environment Trials (MET)

    MET are essential
    MET are expensive
    MET data are valuable
    MET data are not fully used

  • 7/31/2019 Biplot Analysis of MET Data IITA

    3/68

    Why biplot analysis?

    Biplot analysis can help understand MET data graphically, effectively, and conveniently

  • 7/31/2019 Biplot Analysis of MET Data IITA

    4/68

    Outline

    Multi-environment trial (MET) data
    Basics of biplot analysis
    Biplot analysis of G-by-E data
    Biplot analysis of G-by-T data
    Better understanding of MET data
    Conclusions

  • 7/31/2019 Biplot Analysis of MET Data IITA

    5/68

  • 7/31/2019 Biplot Analysis of MET Data IITA

    6/68

    MET data form a genotype-environment-trait (G-E-T) 3-way table

    Multiple genotypes
    Multiple environments
    Multiple traits

  • 7/31/2019 Biplot Analysis of MET Data IITA

    7/68

    A G-E-T 3-way table contains many 2-way tables

    G by E: for each trait
    G by T (trait): in each environment; across environments
    E by T: for each genotype; across genotypes

    G-E-T data >> G-E data

  • 7/31/2019 Biplot Analysis of MET Data IITA

    8/68

    A G-E-T 3-way table is an extended 2-way table

    G by V: each E-T combination as a variable (V)
    P by T: each G-E combination as a phenotype (P)
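The 2-way views of a G-E-T table can be sketched with array slicing and reshaping. This is a minimal illustration in Python with NumPy, not part of the original slides; the array sizes and values are hypothetical placeholders.

```python
import numpy as np

# Hypothetical sizes: 4 genotypes, 3 environments, 2 traits (placeholder values).
n_g, n_e, n_t = 4, 3, 2
get = np.arange(n_g * n_e * n_t, dtype=float).reshape(n_g, n_e, n_t)

# G by E: one table per trait (here trait 0).
g_by_e = get[:, :, 0]

# G by T across environments: average over the environment axis.
g_by_t = get.mean(axis=1)

# G by V: each environment-trait combination becomes one column (variable).
g_by_v = get.reshape(n_g, n_e * n_t)

# P by T: each genotype-environment combination becomes one row (phenotype).
p_by_t = get.reshape(n_g * n_e, n_t)

print(g_by_e.shape, g_by_t.shape, g_by_v.shape, p_by_t.shape)
```

With this layout, column `e * n_t + t` of `g_by_v` holds trait `t` in environment `e`, and row `g * n_e + e` of `p_by_t` holds genotype `g` in environment `e`.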

  • 7/31/2019 Biplot Analysis of MET Data IITA

    9/68

    A G-E-T 3-way table implies informative 2-way tables

    Association-by-environment 2-way tables
    Associations: among traits; between traits and genetic markers

  • 7/31/2019 Biplot Analysis of MET Data IITA

    10/68

    Goals of MET data analysis

    Short-term goals: variety evaluation
      Response to the environment (G x E)
      Trait profiles (G x T)
    Long-term goals: to understand
      the target environment (G x E)
      the test environments (G x E)
      the crop (G x T)
      the genotype x environment interaction (A x T)

  • 7/31/2019 Biplot Analysis of MET Data IITA

    11/68

    Most two-way tables can be visually studied using biplots

  • 7/31/2019 Biplot Analysis of MET Data IITA

    12/68

    Origin of biplot

    Gabriel (1971)
    One of the most important advances in data analysis in recent decades
    Currently > 50,000 web pages
    Numerous academic publications
    Included in most statistical analysis packages
    Still a very new technique to most scientists

    Prof. Ruben Gabriel, the founder of the biplot. Courtesy of Prof. Purificación Galindo, University of Salamanca, Spain

  • 7/31/2019 Biplot Analysis of MET Data IITA

    13/68

    What is a biplot?

    Biplot = "bi" + "plot"
    "plot": a scatter plot of two rows OR of two columns, or a scatter plot summarizing the rows OR the columns
    "bi": BOTH rows AND columns

    1 biplot >> 2 plots

  • 7/31/2019 Biplot Analysis of MET Data IITA

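    14/68

    Mathematical definition of a biplot

    Graphical display of matrix multiplication

    Inner-product property: $P_{ij} = |OA_i| \, |OB_j| \cos\alpha_{ij}$, where $\alpha_{ij}$ is the angle between the vectors $OA_i$ and $OB_j$. It implies the product matrix

    $$A_{(4 \times 2)} \, B_{(2 \times 3)} = P_{(4 \times 3)}$$

    $$\begin{bmatrix} 4 & 3 \\ -3 & 3 \\ 1 & -3 \\ 4 & 0 \end{bmatrix} \begin{bmatrix} 2 & -3 & 3 \\ 4 & 1 & -2 \end{bmatrix} = \begin{bmatrix} 20 & -9 & 6 \\ 6 & 12 & -15 \\ -10 & -6 & 9 \\ 8 & -12 & 12 \end{bmatrix}$$

    The rows of A are the points a1 to a4 and the columns of B are the points b1 to b3, each with coordinates (x, y).

    [Figure: matrix multiplication displayed as a biplot; points A1 to A4 and B1 to B3 plotted on X and Y axes around the origin O]

    Example: $|OA_1| = 5.0$, $|OB_1| = 4.472$, $\cos\alpha_{11} = 0.8944$, so $P_{11} = 5 \times 4.472 \times 0.8944 = 20$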
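The inner-product property can be checked numerically. The sketch below, in Python with NumPy, assumes the slide's matrices are as written in the code; the minus signs are inferred so that the products reproduce the listed magnitudes and the worked value P11 = 20. It rebuilds every element of P from vector lengths and angles alone and compares the result with ordinary matrix multiplication.

```python
import numpy as np

# Assumed reading of the slide's matrices (signs inferred, see lead-in).
A = np.array([[4, 3], [-3, 3], [1, -3], [4, 0]], dtype=float)  # rows: a1..a4 as (x, y)
B = np.array([[2, -3, 3], [4, 1, -2]], dtype=float)            # columns: b1..b3 as (x, y)

P = A @ B  # ordinary matrix multiplication

# Geometric route: vector lengths and the angle between each pair of vectors.
len_a = np.linalg.norm(A, axis=1)
len_b = np.linalg.norm(B, axis=0)
ang_a = np.arctan2(A[:, 1], A[:, 0])
ang_b = np.arctan2(B[1], B[0])
cos_ab = np.cos(ang_a[:, None] - ang_b[None, :])
P_geometric = np.outer(len_a, len_b) * cos_ab

print(P)
print(len_a[0], round(len_b[0], 3), round(cos_ab[0, 0], 4))
print(np.allclose(P, P_geometric))
```

The second printed line gives the three quantities used in the slide's worked example for P11.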

  • 7/31/2019 Biplot Analysis of MET Data IITA

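    15/68

    Practical definition of a biplot

    Any two-way table can be analyzed using a 2D biplot as soon as it can be sufficiently approximated by a rank-2 matrix (Gabriel, 1971).

    Matrix decomposition of a G-by-E table:

    $$P_{(4 \times 3)} = G_{(4 \times 2)} \, E_{(2 \times 3)}$$

    $$\begin{bmatrix} 20 & -9 & 6 \\ 6 & 12 & -15 \\ -10 & -6 & 9 \\ 8 & -12 & 12 \end{bmatrix} = \begin{bmatrix} 4 & 3 \\ -3 & 3 \\ 1 & -3 \\ 4 & 0 \end{bmatrix} \begin{bmatrix} 2 & -3 & 3 \\ 4 & 1 & -2 \end{bmatrix}$$

    The rows of G are the genotype points g1 to g4 and the columns of E are the environment points e1 to e3, each with coordinates (x, y).

    [Figure: matrix decomposition displayed as a biplot; genotypes G1 to G4 and environments E1 to E3 plotted on X and Y axes around the origin O]

    (3D biplots are now also possible)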
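Whether a table is "sufficiently approximated by a rank-2 matrix" can be checked from its singular values. The sketch below is an illustration in Python with NumPy, not the GGEbiplot implementation. The G-by-E table is the slide's example with minus signs inferred from the worked inner-product example, and the proportion of the total sum of squares captured by the first two components is used as the goodness-of-fit measure.

```python
import numpy as np

# Assumed reading of the slide's 4 x 3 G-by-E table (signs inferred).
P = np.array([[20, -9, 6],
              [6, 12, -15],
              [-10, -6, 9],
              [8, -12, 12]], dtype=float)

U, s, Vt = np.linalg.svd(P, full_matrices=False)

# Share of the total sum of squares explained by the first two components.
fit_2d = (s[:2] ** 2).sum() / (s ** 2).sum()

# Rank-2 reconstruction: genotype part times environment part.
G2 = U[:, :2] * s[:2]
E2 = Vt[:2]
P_rank2 = G2 @ E2

print(np.linalg.matrix_rank(P))
print(fit_2d)
print(np.allclose(P_rank2, P))
```

For real trial data the fit is below 100%, and the 2D biplot displays only the part of the table captured by `P_rank2`.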

  • 7/31/2019 Biplot Analysis of MET Data IITA

    16/68

    Singular Value Decomposition (SVD) and Singular Value Partitioning (SVP)

    SVD:

    $$Y_{ij} = \sum_{k=1}^{r} a_{ik} \lambda_k b_{kj}$$

    SVP:

    $$Y_{ij} = \sum_{k=1}^{r} (a_{ik} \lambda_k^{f}) (\lambda_k^{1-f} b_{kj}), \quad 0 \le f \le 1$$

    $a_{ik}$: matrix characterising the rows
    $\lambda_k$: singular values
    $b_{kj}$: matrix characterising the columns
    $r$: the rank of Y, i.e., the minimum number of PCs required to fully represent Y

    Row scores $a_{ik} \lambda_k^{f}$ (plot) + column scores $\lambda_k^{1-f} b_{kj}$ (plot) = biplot

    SVD = PCA?
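Singular value partitioning can be sketched in a few lines: the singular values are split between row scores and column scores by the exponent f, and the product of the two score sets is the same for every f. This is a minimal illustration in Python with NumPy; the function name `svp_scores` and the example table are assumptions for the sketch, not taken from the slides or from GGEbiplot.

```python
import numpy as np

def svp_scores(Y, f, n_pc=2):
    """Return (row_scores, col_scores) for the first n_pc components.

    row_scores[i, k] = a_ik * lambda_k**f
    col_scores[k, j] = lambda_k**(1 - f) * b_kj
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    row_scores = U[:, :n_pc] * s[:n_pc] ** f
    col_scores = (s[:n_pc] ** (1 - f))[:, None] * Vt[:n_pc]
    return row_scores, col_scores

# Illustrative rank-2 table, so two components reproduce it exactly.
Y = np.array([[4, 3], [-3, 3], [1, -3], [4, 0]], dtype=float) @ \
    np.array([[2, -3, 3], [4, 1, -2]], dtype=float)

for f in (0.0, 0.5, 1.0):
    rows, cols = svp_scores(Y, f)
    print(f, np.allclose(rows @ cols, Y))
```

Changing f moves the points on the plot but leaves the inner products, and therefore the approximated table, unchanged.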

  • 7/31/2019 Biplot Analysis of MET Data IITA

    17/68

    Biplot interpretations

    Inner-product property

    Interpretations based on biplots with f = 1: approximates $YY^T$, the distance matrix; similarity/dissimilarity among row (genotype) factors
    Interpretations based on biplots with f = 0: approximates $Y^TY$, the variance matrix; similarity/dissimilarity among column (environment) factors
    Combined use of f = 0 and f = 1

    (Gabriel, 2002, Biometrika; Yan, 2002, Agron J; built into the GGEbiplot software)

    $$Y_{ij} = \sum_{k=1}^{r} (a_{ik} \lambda_k^{f}) (\lambda_k^{1-f} b_{kj})$$
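The link between the partitioning exponent and the two cross-product matrices can be verified directly. The sketch below, in Python with NumPy, uses a random placeholder table and keeps all components, so the relations hold exactly; a 2D biplot keeps only the first two components and the relations then hold approximately.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 4))  # placeholder two-way table (rows x columns)

U, s, Vt = np.linalg.svd(Y, full_matrices=False)

# f = 1: all singular values go to the rows.
row_scores_f1 = U * s
# f = 0: all singular values go to the columns (one row of scores per column of Y).
col_scores_f0 = Vt.T * s

# Inner products among row scores reproduce Y Y^T.
print(np.allclose(row_scores_f1 @ row_scores_f1.T, Y @ Y.T))
# Inner products among column scores reproduce Y^T Y.
print(np.allclose(col_scores_f0 @ col_scores_f0.T, Y.T @ Y))
```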

  • 7/31/2019 Biplot Analysis of MET Data IITA

    18/68

    Biplot analysis is

    the use of biplots to display a two-way data table per se ($Y$), its distance matrix ($YY^T$), and its variance matrix ($Y^TY$)

    so that relationships among rows, relationships among columns, and interactions between rows and columns

    can be graphically visualized.

  • 7/31/2019 Biplot Analysis of MET Data IITA

    19/68

    Data centering prior to biplot analysis

    The general linear model for a G-by-E data set (P): P = M + G + E + GE

    Possible two-way tables (Y):
    Y = P = M + G + E + GE (original data: QQE biplot)
    Y = P - M = G + E + GE (global-centered: PCA)
    Y = P - M - E = G + GE (column-centered: GGE biplot)
    Y = P - M - G = E + GE (row-centered)
    Y = P - M - G - E = GE (double-centered: GE biplot)

    All models are useful, depending on the research objectives (built into GGEbiplot)
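The centering options can be written as one small function. This is a sketch in Python with NumPy under the usual assumptions that M is the grand mean, G the row (genotype) main effects, and E the column (environment) main effects; the function name, the model labels, and the example table are illustrative, not GGEbiplot's.

```python
import numpy as np

def center(P, model):
    """Return Y for one of the centering models listed on the slide."""
    M = P.mean()                                 # grand mean
    G = P.mean(axis=1, keepdims=True) - M        # genotype (row) main effects
    E = P.mean(axis=0, keepdims=True) - M        # environment (column) main effects
    if model == "none":      # Y = M + G + E + GE
        return P.copy()
    if model == "global":    # Y = G + E + GE
        return P - M
    if model == "column":    # Y = G + GE  (GGE biplot)
        return P - M - E
    if model == "row":       # Y = E + GE
        return P - M - G
    if model == "double":    # Y = GE
        return P - M - G - E
    raise ValueError(f"unknown centering model: {model}")

# Illustrative 4 x 3 table.
P = np.array([[20, -9, 6], [6, 12, -15], [-10, -6, 9], [8, -12, 12]], dtype=float)
print(center(P, "column").mean(axis=0))   # column means are zero
print(center(P, "double").mean(axis=1))   # row means are zero as well
```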

  • 7/31/2019 Biplot Analysis of MET Data IITA

    20/68

    Data scaling prior to biplot analysis

    Different GGE biplots: $Y_{ij} = (G_i + GE_{ij}) / s_j$

    $s_j = 1$: no scaling
    $s_j = (\text{s.d.})_j$: all environments are equally important
    $s_j = (\text{s.e.})_j$: heterogeneity among environments is removed

    (built into GGEbiplot)
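The scaling options can be sketched as a divisor applied to each column of the column-centered table. In this Python/NumPy illustration the function name and option labels are assumptions. The within-environment standard errors cannot be derived from a table of means, so they are taken as a caller-supplied argument.

```python
import numpy as np

def gge_table(P, scale="none", se=None):
    """Column-center P (leaving G + GE) and divide each column by s_j."""
    Y = P - P.mean(axis=0, keepdims=True)
    if scale == "none":                      # s_j = 1
        s = np.ones(P.shape[1])
    elif scale == "sd":                      # s_j = standard deviation of column j
        s = Y.std(axis=0, ddof=1)
    elif scale == "se":                      # s_j = standard error in environment j
        if se is None:
            raise ValueError("scale='se' needs one standard error per environment")
        s = np.asarray(se, dtype=float)
    else:
        raise ValueError(f"unknown scaling: {scale}")
    return Y / s

# Illustrative 4 x 3 table.
P = np.array([[20, -9, 6], [6, 12, -15], [-10, -6, 9], [8, -12, 12]], dtype=float)
print(gge_table(P, "sd").std(axis=0, ddof=1))   # every environment has unit s.d.
```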

  • 7/31/2019 Biplot Analysis of MET Data IITA

    21/68

    Four questions must be asked before trying to interpret a biplot

    1. What is the model? How were the data centered and scaled? What are we looking at?
    2. What is the goodness of fit? How confident are we about what we see? What if the data are fitted poorly?
    3. How are the singular values partitioned? What questions can be asked?
    4. Are the axes drawn to scale? Are the patterns artifacts?

    (All are addressed explicitly in GGEbiplot)

  • 7/31/2019 Biplot Analysis of MET Data IITA

    22/68

    MEGA-ENVIRONMENT ANALYSIS
    TEST ENVIRONMENT EVALUATION
    GENOTYPE EVALUATION

  • 7/31/2019 Biplot Analysis of MET Data IITA

    23/68

    Sample G-by-E data (yield data of 18 genotypes in 9 environments, 1993, Ontario, Canada)

  • 7/31/2019 Biplot Analysis of MET Data IITA

    24/68

    Before trying to interpret a biplot

    1. Model selection? Centering = 2 (G + GE); scaling = 0.
    2. Goodness of fit? 78%.
    3. Singular value partitioning? SVP = 2 (environment-metric).
    4. Draw to scale? Yes.

  • 7/31/2019 Biplot Analysis of MET Data IITA

    25/68

    G by E data analysis

    MEGA-ENVIRONMENT ANALYSIS
    TEST ENVIRONMENT EVALUATION
    GENOTYPE EVALUATION

    A mega-environment is a group of geographical locations that share the same (set of) best genotypes consistently across years.

  • 7/31/2019 Biplot Analysis of MET Data IITA

    26/68

    Relationships among environments: the environment-vector view

    Angle vs. correlation
    The angles among test environments
    Environment grouping
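The "angle vs. correlation" reading can be checked numerically: for column-centered data with the singular values assigned to the environments (f = 0), the cosine of the angle between two environment vectors equals their correlation coefficient when all components are kept. The sketch below, in Python with NumPy, uses random placeholder data shaped like the sample data set (18 genotypes, 9 environments); a 2D biplot shows the same relation only approximately.

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.normal(size=(18, 9))          # placeholder G-by-E table
Y = P - P.mean(axis=0)                # column-centered (G + GE)

U, s, Vt = np.linalg.svd(Y, full_matrices=False)
env = Vt.T * s                        # environment scores, f = 0, all PCs

# Cosine of the angle between every pair of environment vectors.
unit = env / np.linalg.norm(env, axis=1, keepdims=True)
cosines = unit @ unit.T

# Pearson correlations between environments, computed from the table itself.
corr = np.corrcoef(P, rowvar=False)
print(np.allclose(cosines, corr))

# The 2D biplot uses only the first two columns of `env`.
env_2d = env[:, :2]
unit_2d = env_2d / np.linalg.norm(env_2d, axis=1, keepdims=True)
print(np.abs(unit_2d @ unit_2d.T - corr).max())   # size of the 2D approximation error
```

An acute angle therefore suggests a positive correlation, a right angle no correlation, and an obtuse angle a negative correlation, subject to the goodness of fit of the biplot.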

  • 7/31/2019 Biplot Analysis of MET Data IITA

    27/68

    Which-won-where

    (Crossover GE is GE that caused genotype rank changes and different winners in different test environments)

    [Figure: which-won-where biplot; labelled genotypes G12, G7, G18, G8, G13]

  • 7/31/2019 Biplot Analysis of MET Data IITA

    28/68

    Are there meaningful crossover GE? The which-won-where view

    (Crossover GE is GE that caused genotype rank changes and different winners in different test environments)
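The definition of crossover GE can be illustrated at the level of the data table. The sketch below, in Python with NumPy, is not the polygon (which-won-where) view itself, which works on the biplot; it only looks up the winner in each environment and reports whether the winners differ. The table, the genotype and environment labels, and the assumption that larger values are better are all illustrative.

```python
import numpy as np

def winners_by_environment(P, genotypes, environments):
    """Map each environment to the genotype with the highest value in it."""
    best = P.argmax(axis=0)
    return {env: genotypes[i] for env, i in zip(environments, best)}

# Illustrative 4 x 3 table (rows: genotypes, columns: environments).
P = np.array([[20, -9, 6], [6, 12, -15], [-10, -6, 9], [8, -12, 12]], dtype=float)
genotypes = ["G1", "G2", "G3", "G4"]
environments = ["E1", "E2", "E3"]

winners = winners_by_environment(P, genotypes, environments)
has_different_winners = len(set(winners.values())) > 1

print(winners)
print(has_different_winners)
```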

  • 7/31/2019 Biplot Analysis of MET Data IITA

    29/68

    Are the crossover patterns* repeatable?

    If YES:
      The target environment can be divided into multiple mega-environments
      GE can be exploited by selecting for each mega-environment (GE → G)
    If NO:
      The target environment CANNOT be divided into multiple mega-environments
      GE CANNOT be exploited
      GE must be avoided by testing across locations and years

    *Not the environment-grouping patterns

    A mega-environment is a group of geographical locations that share the same (set of) best genotypes consistently across years.

    Multi-year data are needed

  • 7/31/2019 Biplot Analysis of MET Data IITA

    30/68

    Classify your target environment into one of three categories

    (1) No crossover GE: single simple ME. A single test location in a single year suffices to select a single best variety.
    (2) Crossover GE, repeatable: multiple MEs. Select for specifically adapted genotypes for each ME.
    (3) Crossover GE, not repeatable: single complex ME. Select for generally adapted genotypes across the whole region across multiple years.

    ME: mega-environment

  • 7/31/2019 Biplot Analysis of MET Data IITA

    31/68

    G by E data analysis

    MEGA-ENVIRONMENT ANALYSIS
    TEST ENVIRONMENT EVALUATION
    GENOTYPE EVALUATION

  • 7/31/2019 Biplot Analysis of MET Data IITA

    32/68

    Discriminating ability and representativeness

    Vector length: discriminating ability
    Angle to the average environment (AE): representativeness

    [Figure labels: average-environment axis; average environment]
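Both measures can be computed from the environment scores of a 2D biplot. The following Python/NumPy sketch assumes column-centered data, environment-metric partitioning (f = 0), and an average environment defined as the mean of the environment scores. The function name and the example table are illustrative and are not taken from GGEbiplot.

```python
import numpy as np

def evaluate_environments(P):
    """Return (vector_length, angle_to_average_environment_in_degrees)."""
    Y = P - P.mean(axis=0)                       # G + GE
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    env = (Vt[:2] * s[:2, None]).T               # 2D environment scores, f = 0
    ae = env.mean(axis=0)                        # average environment
    length = np.linalg.norm(env, axis=1)         # discriminating ability
    cos = env @ ae / (length * np.linalg.norm(ae))
    angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))  # representativeness
    return length, angle

# Illustrative 4 x 3 table (rows: genotypes, columns: environments).
P_demo = np.array([[20, -9, 6], [6, 12, -15], [-10, -6, 9], [8, -12, 12]], dtype=float)
length, angle = evaluate_environments(P_demo)
print(length)
print(angle)
```

A longer vector indicates a more discriminating environment, and a smaller angle to the average environment indicates a more representative one.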

  • 7/31/2019 Biplot Analysis of MET Data IITA

    33/68

    Ideal test environments: discriminating and representative

    [Figure label: ideal test environment]

  • 7/31/2019 Biplot Analysis of MET Data IITA

    34/68

    Classify each test environment into one of three categories

    (1) Not discriminative: useless
    (2) Discriminative and representative: good for selecting (more important)
    (3) Discriminative but not representative: useful for culling (less important)

    For each good or useful test environment: is it essential?

  • 7/31/2019 Biplot Analysis of MET Data IITA

    35/68

    Vector length = discrimination = GE = GE1 + GE2

    Contribution to proportionate GE
    Contribution to non-proportionate GE

  • 7/31/2019 Biplot Analysis of MET Data IITA

    36/68

    G by E data analysis

    MEGA-ENVIRONMENT ANALYSIS
    TEST ENVIRONMENT EVALUATION
    GENOTYPE EVALUATION

  • 7/31/2019 Biplot Analysis of MET Data IITA

    37/68

    Vector length = GGE = G + GE

    Contribution to GE (instability)
    Contribution to G (mean performance)

  • 7/31/2019 Biplot Analysis of MET Data IITA

    38/68

    Mean vs. Stability

  • 7/31/2019 Biplot Analysis of MET Data IITA

    39/68

    Genotype ranking on both MEAN and STABILITY

    The ideal genotype
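The mean-versus-stability reading can be sketched by projecting genotype scores onto the average-environment axis and onto the axis perpendicular to it. The Python/NumPy code below assumes column-centered data, genotype-metric partitioning (f = 1), and an average environment defined as the mean of the environment scores. The function name and the example table are illustrative, and this is not the GGEbiplot implementation.

```python
import numpy as np

def mean_and_stability(P):
    """Return (mean_coordinate, stability_coordinate) for each genotype."""
    Y = P - P.mean(axis=0)                       # G + GE
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    geno = U[:, :2] * s[:2]                      # 2D genotype scores, f = 1
    env = Vt[:2].T                               # 2D environment scores, f = 1
    ae = env.mean(axis=0)                        # average environment
    axis = ae / np.linalg.norm(ae)               # unit vector along the AE axis
    perp = np.array([-axis[1], axis[0]])         # unit vector perpendicular to it
    mean_coord = geno @ axis                     # larger = higher mean performance
    stability_coord = geno @ perp                # farther from 0 = less stable
    return mean_coord, stability_coord

# Illustrative 4 x 3 table (rows: genotypes, columns: environments).
P_demo = np.array([[20, -9, 6], [6, 12, -15], [-10, -6, 9], [8, -12, 12]], dtype=float)
mean_coord, stability_coord = mean_and_stability(P_demo)
print(np.argsort(-mean_coord))        # genotype indices ranked by mean performance
print(np.abs(stability_coord))        # instability of each genotype
```

A genotype with a large mean coordinate and a stability coordinate near zero lies closest to the ideal genotype position.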

  • 7/31/2019 Biplot Analysis of MET Data IITA

    40/68

    Genotype classification

    High mean performance, high stability: generally adapted (VERY GOOD)
    Low mean performance, high stability: bad everywhere (VERY BAD)
    High mean performance, low stability: specifically adapted (GOOD)
    Low mean performance, low stability: bad somewhere (BAD)

    Are there stability genes?!

  • 7/31/2019 Biplot Analysis of MET Data IITA

    41/68

    G x E data analysis summary

    1) Mega-environment analysis
    2) Test environment evaluation
    3) Genotype evaluation

    Important comments:
    (2) and (3) are meaningful only for a single mega-environment
    Any stability analysis is meaningful only for a single mega-environment
    Any stability index can be used only as a modifier to the ranking based on mean performance

  • 7/31/2019 Biplot Analysis of MET Data IITA

    42/68

  • 7/31/2019 Biplot Analysis of MET Data IITA

    43/68

    Inner-product property

  • 7/31/2019 Biplot Analysis of MET Data IITA

    44/68

    Ranking on a single environment

  • 7/31/2019 Biplot Analysis of MET Data IITA

    45/68

    Ranking on two environments

  • 7/31/2019 Biplot Analysis of MET Data IITA

    46/68

    Relative adaptation of a genotype

  • 7/31/2019 Biplot Analysis of MET Data IITA

    47/68

    Compare any two genotypes

  • 7/31/2019 Biplot Analysis of MET Data IITA

    48/68

  • 7/31/2019 Biplot Analysis of MET Data IITA

    49/68

    Objectives of G by T data analysis

    Genotype evaluation based on trait profiles
    Relationship among breeding objectives

  • 7/31/2019 Biplot Analysis of MET Data IITA

    50/68

    Data of 4 traits for 19 covered oat varieties (Ontario 2004)

    (Background info: High yield, high groat, high protein, and low oil are desirable for milling oats)

  • 7/31/2019 Biplot Analysis of MET Data IITA

    51/68

    Relationships among traits

  • 7/31/2019 Biplot Analysis of MET Data IITA

    52/68

    Trait profile of each genotype

  • 7/31/2019 Biplot Analysis of MET Data IITA

    53/68

    Trait profile of a genotype

  • 7/31/2019 Biplot Analysis of MET Data IITA

    54/68

    Trait profile comparison between two genotypes

  • 7/31/2019 Biplot Analysis of MET Data IITA

    55/68

    Genotype ranking based on a trait

  • 7/31/2019 Biplot Analysis of MET Data IITA

    56/68

    Parent selection based on trait profiles

  • 7/31/2019 Biplot Analysis of MET Data IITA

    57/68

    Independent culling

  • 7/31/2019 Biplot Analysis of MET Data IITA

    58/68

    MET data are more informative than you thought

  • 7/31/2019 Biplot Analysis of MET Data IITA

    59/68

    A G-E-T 3-way dataset contains various 2-way tables

    G by E data
    G by T data
    E by T data: for each genotype; all genotypes
    G by V data: each E-T as a variable (V)
    P by T data: each G-E as a phenotype (P)
    Genetic association by environment data
    Trait association by environment data

  • 7/31/2019 Biplot Analysis of MET Data IITA

    60/68

    Genetic-covariate by environment biplot (QTL by environment biplot)

    Barley genomics data

  • 7/31/2019 Biplot Analysis of MET Data IITA

    61/68

    Trait-association by environment biplot

    Oat MET data

  • 7/31/2019 Biplot Analysis of MET Data IITA

    62/68

    Four-way data analysis

    Year

  • 7/31/2019 Biplot Analysis of MET Data IITA

    63/68

  • 7/31/2019 Biplot Analysis of MET Data IITA

    64/68

    Conclusion (1)

    GGE biplot analysis is an effective tool for G by E data analysis to achieve understandings about:

    1. the target environment,
    2. the test environments, and
    3. the genotypes

    4. Stability analysis is useful only for a single mega-environment

  • 7/31/2019 Biplot Analysis of MET Data IITA

    65/68

    Conclusion (2)

    GGE biplot analysis is an effective tool for G by T data analysis to achieve understandings about:

    1. the interconnected plant system,
    2. positively correlated traits,
    3. negatively correlated traits, and
    4. the strengths and weaknesses of the genotypes

  • 7/31/2019 Biplot Analysis of MET Data IITA

    66/68

    Conclusion (3)

    Biplot analysis is an effective tool for other two-way table analysis

    Marker by environment
    QTL by environment
    Gene by treatment
    Diallel cross

  • 7/31/2019 Biplot Analysis of MET Data IITA

    67/68

    Conclusion (4)

    Biplot analysis can be VERY EASY

    From reading data to displaying the biplot: 2 seconds
    Displaying any of the perspectives of a biplot and changing from one to another: 1 second
    Displaying the biplot for any subset: 1 second
    Learning how to use the software and interpret biplots: 30 minutes

    Everything can be just one mouse-click away

  • 7/31/2019 Biplot Analysis of MET Data IITA

    68/68

    Contact: Weikai Yan, [email protected]; web: www.ggebiplot.com