Bayesian Approach for Animal Breeding Data Analysis

Upload: gopal-gowane

Post on 04-Feb-2018


  • 7/21/2019 Bayesian Approach for Animal Breeding Data Analysis

    1/42

    Application of Bayesian Model for Animal Breeding Data Analysis

    G. R. Gowane

    ICAR-CSWRI Avikanagar


    Why do we need to estimate genetic parameters?

    They are necessary to plan an efficient breeding program for the trait of interest.

    Knowledge of the genetic architecture of the population


    Accurate estimation of variance components (VC):

    Prediction error variances for predicted random effects (breeding values, BV) increase as differences between estimated and true values of VC increase (Henderson, 1975).


    Choice of method for estimating VC

    REML: DFREML, MTDFREML, VCE, Wombat

    Gibbs sampling (GS) algorithm for Bayesian analysis:

    MTGSAM (Van Tassell and Van Vleck, 1995)

    RRGibbs (Meyer)


    Why Bayes?

    Combines prior information with the data to obtain the posterior distribution.

    MORE ACCURACY!!

    Memory space required for estimating variance components. Does it really matter these days?

    Threshold traits analysis

    Several algorithms are being explored

    It is an alternative approach


    Bayesian statistics uses probability to express uncertainty about the unknowns that are being estimated.

    The use of probability is more efficient than any other method of expressing uncertainty.

    Unfortunately, to make it possible, inverse probability needs the knowledge of some prior information.


    Three problems of the Bayesian approach

    Difficulty of integrating prior information

    How to represent ignorance:

    because there is no prior information,

    because we do not like the way in which this prior is integrated,

    because we would like to assess the information provided by the data without prior considerations

    Use of probability to express uncertainty: this leads to multiple integrals that cannot be solved even by using approximate methods

    1990s: MCMC, a numerical method, came to the rescue


    In Bayesian theory we know that all problems are reduced to a single pathway: we should look for a posterior distribution, given the distribution of the data and the prior distribution.


    A brief history of Bayesian analysis

    Bayes (1763): Links statistics to probability

    Laplace (1800): Normal distribution. Many applications, including census [sampling models]

    Gauss (1800): Least squares. Applications to astronomy [measurement error models]

    Keynes, von Neumann, Savage (1920s-1950s): Link Bayesian statistics to decision theory

    Applied statisticians (1950s-1970s): Hierarchical linear models. Applications to animal breeding, education [data in groups]

    Daniel Gianola (Wang et al., 1994) and Daniel Sorensen (Sorensen et al., 1994) brought these techniques into the field of animal breeding.


    Prior???

    Litter Size of Landrace Pig ~ 10 Spanish Landrace?

    Mean = 5

    Prior mean = 10

    What to do?

    Prior information is the information about the parameters we want to estimate that exists before we perform our experiment.
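One way to see how a prior mean of 10 and a data mean of 5 can be reconciled is the normal-normal conjugate update, in which the posterior mean is a precision-weighted average of the two. The sketch below is illustrative only: the prior variance, the data variance and the sample sizes are assumed values that do not come from the slides, and `normal_posterior` is a hypothetical helper name.

```python
def normal_posterior(prior_mean, prior_var, sample_mean, sample_var, n):
    """Posterior mean and variance of a normal mean with known data variance."""
    prior_precision = 1.0 / prior_var      # weight given to the prior
    data_precision = n / sample_var        # weight given to the data (n records)
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return post_mean, post_var


# Prior mean 10 and observed mean 5 are taken from the slide; the variances
# (4.0 and 9.0) and the sample sizes are assumptions for illustration.
for n in (1, 10, 100):
    mean, var = normal_posterior(10.0, 4.0, 5.0, 9.0, n)
    print(f"n = {n:3d}: posterior mean = {mean:.2f}, posterior variance = {var:.3f}")
```

With few records the posterior mean stays close to the prior; as the number of records grows, the data dominate and the posterior mean moves towards 5.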


    Bayesian Variance Components Model: Prior Distributions

    "Flat" prior distribution for the "fixed" effects, that is, there is no prior knowledge about these effects.

    Next, the random effects are assumed to be normally distributed. For the genetic effects there will be an additional assumption of a known covariance structure among those random effects corresponding to the relationship matrix.

    Finally, the residual effects are assumed to be distributed normally. These assumptions are the same as those used with most likelihood based methods.

    Results in BLUE and BLUP solutions for fixed and random effects


    Gibbs Sampling Animal Model

    Gibbs sampling (GS) is a method of numerical integration that allows inferences to be made about joint or marginal densities, even when those densities cannot be evaluated directly.
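To make the idea concrete, here is a minimal Gibbs sampler for a much simpler problem than the animal model: the mean and variance of normally distributed records. It is a sketch under stated assumptions (a flat prior on the mean, a prior proportional to 1/variance, hypothetical data, and a hypothetical function name `gibbs_normal`), not the MTGSAM algorithm. Each round draws one unknown from its conditional distribution given the current value of the other.

```python
import random


def gibbs_normal(y, n_rounds, seed=1):
    """Gibbs sampler for the mean and variance of normally distributed data.

    Assumes a flat prior on the mean and a prior proportional to 1/variance.
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    var = sum((v - ybar) ** 2 for v in y) / (n - 1)  # starting value
    samples = []
    for _ in range(n_rounds):
        # mean | variance, y  ~  Normal(ybar, variance / n)
        mu = rng.gauss(ybar, (var / n) ** 0.5)
        # variance | mean, y  ~  scaled inverse chi-square:
        # sum of squares divided by a chi-square draw with n degrees of freedom
        ss = sum((v - mu) ** 2 for v in y)
        var = ss / rng.gammavariate(n / 2.0, 2.0)
        samples.append((mu, var))
    return samples


# Hypothetical records, for illustration only.
y = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 6.0, 4.0]
chain = gibbs_normal(y, n_rounds=5000)
kept = chain[500:]  # discard the first 500 rounds
post_mean_mu = sum(m for m, _ in kept) / len(kept)
post_mean_var = sum(v for _, v in kept) / len(kept)
print(f"posterior mean of mu = {post_mean_mu:.2f}, of variance = {post_mean_var:.2f}")
```

The marginal posterior of each unknown is never written down; it is approximated by the retained draws.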


    Posterior inference is affected by the specified prior density unless the information in the data analysed (likelihood) overwhelms the prior.

    With normality, the posterior distribution is simply the (frequentist) likelihood function scaled by prior distributions of the unknown parameters in the model (Van Tassell and Van Vleck, 1996).


    Burn-in: The number of rounds discarded before the values are considered samples from the posterior distribution is usually called the burn-in period (Raftery and Lewis, 1992).

    Gibanal (Van Kaam, 1997) can be used to define the burn-in period and convergence criteria for the problem.


    Defining the number of iterations, or length of the Gibbs chain

    The number of Gibbs sampling rounds run by the program should be large enough.

    Gibanal also suggests a chain length; however, one long chain is sufficient (Geyer, 1992).


    MTGSAM

    Van Tassell and Van Vleck (1995)

    Model assumptions: $y = X\beta + Zu + e$

    where $\beta$ is the vector of fixed effects associated with records in $y$ by $X$, $u$ is the vector of random effects associated with records in $y$ by $Z$, and $e$ is the vector of random residual effects.
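The roles of the two incidence matrices in the model y = X(beta) + Zu + e can be illustrated with a very small data set. Everything in the sketch below is hypothetical (the records, the effect values, and the helper names `incidence_matrix` and `mat_vec`); it only shows how X links records to fixed-effect levels and Z links records to animals.

```python
def incidence_matrix(labels, levels):
    """One row per record, one column per level; 1 marks the level of the record."""
    return [[1 if label == level else 0 for level in levels] for label in labels]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Hypothetical data: four records on three animals, with sex as the fixed effect.
sex = ["M", "F", "F", "M"]
animal = ["a1", "a2", "a3", "a1"]

X = incidence_matrix(sex, ["M", "F"])              # records x fixed-effect levels
Z = incidence_matrix(animal, ["a1", "a2", "a3"])   # records x animals

beta = [20.0, 18.0]          # assumed fixed-effect values
u = [1.5, -0.5, -1.0]        # assumed animal (random) effects
e = [0.2, -0.1, 0.3, -0.4]   # assumed residuals

# y = X*beta + Z*u + e, built record by record
y = [xb + zu + r for xb, zu, r in zip(mat_vec(X, beta), mat_vec(Z, u), e)]
print(y)
```

Animal a1 has two records, so its column in Z contains two ones; this is how repeated records share the same random effect.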


    Prior Distribution: The MTGSAM

    Flat prior distributions for the fixed effects.

    For the genetic effects there will be an additional assumption of a known covariance structure corresponding to the numerator relationship matrix.

    Inverted Wishart (IW) distributions are used as prior distributions for the (co)variance components.


    Variance Components:

    The MTGSAM posterior mean estimate for (co)variance components is based on the expected value of the IW random variable.

    The mean of a (co)variance component is calculated as the average of expected values over the length of the post burn-in chain.
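The averaging of expected values can be sketched in the univariate case, where the inverted Wishart reduces to a scaled inverse chi-square distribution. The slides do not give the parameterization MTGSAM uses, so the formula `ss / (df - 2)`, the example numbers and the function names below are assumptions for illustration rather than the program's own computation.

```python
def expected_inverse_chi_square(ss, df):
    """Expected value of ss / chi-square(df); defined only for df > 2."""
    if df <= 2:
        raise ValueError("the expected value exists only for df > 2")
    return ss / (df - 2)


def posterior_mean_from_parameters(parameters, burn_in):
    """Average the per-round expected values over the post burn-in chain."""
    kept = parameters[burn_in:]
    expectations = [expected_inverse_chi_square(ss, df) for ss, df in kept]
    return sum(expectations) / len(expectations)


# Hypothetical (sum of squares, degrees of freedom) pairs, one per Gibbs round.
rounds = [(150.0, 12), (118.0, 12), (122.0, 12), (120.0, 12), (125.0, 12)]
print(posterior_mean_from_parameters(rounds, burn_in=1))
```

Averaging the expected values, rather than the sampled values themselves, uses the parameters of each round's conditional distribution and so removes the extra noise of the individual draws.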


    The MTGSAM for Genetic Analysis

    Preparing a pedigree and data file


    This procedure completes the execution of the program. The results obtained are stored mainly in the MTGS60, MTGS61, MTGS62, MTGS63, MTGS81, MTGS82 and MTGS83 files.

    The unit MTGS61 file contains the observed values of the variance components.

    The unit 62 file contains the parameters used to generate the samples from the appropriate distribution.

    These values can be extracted by using the software PULLDAT.EXE (Annexure 2) for calculating the mean, SE or SD for the estimates obtained.


    MTGS81 is the log file of information from the execution of MTGSNRM. The information includes the number of animals in $A^{-1}$, the number of non-zero elements, and inbreeding information.


    Actual results are given in the MTGS83 file.


    [Figure: MTGS72 output, with labels "Animal effect" and "Second animal effect"]


    Pulldat.exe

    You need to have pulldat.exe in the same folder where MTGS61, MTGS62 and MTGS63 are present. Only after the files have been processed by pulldat can they be used for Gibanal or other analyses.

    Open the CMD window and change the directory to the folder containing pulldat.

    Give the name for the output file.

    Follow the options as given, except at the prompt "Enter the number of variables in each record to be read from MTGS61". In our case, 5 variables need to be extracted; however, the input will change according to the data in consideration and


    Gibanal.exe

    The observed values of the fixed and random effects are written to unit 61.

    We can use Gibanal to examine:

    Serial correlation

    Burn-in

    Chain length
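Serial correlation here means the correlation between sampled values a fixed number of rounds apart. The sketch below computes the usual sample autocorrelation at a given lag; it is a generic diagnostic written for illustration, not Gibanal's own procedure, and the chain values and the function name `autocorrelation` are hypothetical.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    denom = sum((x - mean) ** 2 for x in chain)
    numer = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return numer / denom


# Hypothetical sampled values of a variance component, for illustration only.
chain = [3.1, 3.3, 3.2, 3.6, 3.8, 3.7, 3.4, 3.2, 3.0, 3.3, 3.5, 3.6]
for lag in (1, 2, 5):
    print(f"lag {lag}: {autocorrelation(chain, lag):.3f}")
```

High autocorrelation at short lags means that successive samples carry little new information, which is the usual motivation for a longer chain or a larger thinning interval.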


    A Bayesian analysis produces many sampled values of the same (co)variance component. The number of these values for Va, Vm, etc. does not depend on the number of animals, but on the length of the chain you run.

    You can use 10,000 animals in your analysis: in the frequentist approach you will estimate only one Va, Vm, etc., but in the Bayesian approach you will have a sample of these variances, whose size depends on the sampling length. For example:

    1 - Iteration length = 1100

    2 - Burn-in = 100

    3 - Thinning interval = 10

    So we have: iteration length - burn-in = 1100 - 100 = 1000 (samples after burn-in).

    Then: samples after burn-in / thinning interval = 1000 / 10 = 100. This is the number of sampled observations for Va, Vm, etc., not the number of animals.
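The same arithmetic can be checked by listing which rounds are actually kept. In this sketch the function name `retained_rounds` is hypothetical, and the convention that the first retained round is one thinning interval after the burn-in is an assumption; a program may count from a different starting round, but the number of retained samples is the same.

```python
def retained_rounds(iteration_length, burn_in, thinning_interval):
    """Rounds kept after discarding the burn-in and keeping every k-th round."""
    first = burn_in + thinning_interval
    return list(range(first, iteration_length + 1, thinning_interval))


kept = retained_rounds(iteration_length=1100, burn_in=100, thinning_interval=10)
print(len(kept))          # number of samples available for Va, Vm, etc.
print(kept[0], kept[-1])  # first and last retained round
```

The count depends only on the chain settings; the number of animals in the data does not enter the calculation.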


    Bin     Frequency
    2.18    1
    2.46    3
    2.73    5
    3.01    10
    3.29    13
    3.57    22
    3.85    23
    4.13    9
    4.41    2


    Thank You