Bayesian Approach for Animal Breeding Data Analysis
TRANSCRIPT
-
7/21/2019 Bayesian Approach for Animal Breeding Data Analysis
1/42
Application of Bayesian Model for
Animal Breeding Data Analysis
G. R. Gowane
ICAR-CSWRI Avikanagar
-
Why do we need to estimate genetic
parameters?
They are necessary to plan an efficient breeding program for the trait of interest.
Knowledge of the genetic architecture of the population
-
Accurate estimation of VC:
Prediction error variances for
predicted random effects (BV)
increase as differences between
estimated and true values of VC
increase
Henderson, 1975
-
Choice of method for estimating VC
REML: DFREML, MTDFREML, VCE, WOMBAT
Gibbs Sampling (GS) algorithm for Bayesian
analysis
MTGSAM (Van Tassell and Van Vleck, 1995)
RRGibbs (Meyer)
-
Why Bayes?
Combine prior information with the data to obtain the posterior distribution.
MORE ACCURACY!!
Memory space required for estimating variance components.
Does it really matter these days?
Threshold trait analysis
Several algorithms are being explored
It is an alternative approach
-
Bayesian statistics uses probability to express
uncertainty about the unknowns that are
being estimated.
The use of probability is more efficient than
any other method of expressing uncertainty.
Unfortunately, to make it possible, inverse
probability needs the knowledge of some
prior information.
-
Three problems of the Bayesian approach
Difficulty of integrating prior information
How to represent ignorance:
because there is no prior information,
because we do not like the way in which this prior is integrated,
because we would like to assess the information provided by the data without prior considerations
Use of probability to express uncertainty: this leads to multiple integrals that cannot be solved even by using approximate methods
1990s: MCMC, a numerical method, came to the rescue
-
In Bayesian theory we
know that all problems
are reduced to a single
pathway: we should
look for a posterior
distribution, given the
distribution of the data and the prior
distribution.
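The single pathway described here can be written as Bayes' theorem. The notation is an assumption made for illustration: \theta stands for the unknown parameters and y for the data; neither symbol appears on the original slide.

```latex
p(\theta \mid y) \;=\; \frac{p(y \mid \theta)\, p(\theta)}{p(y)}
\;\propto\; p(y \mid \theta)\, p(\theta)
```

Here p(y \mid \theta) is the distribution of the data (the likelihood), p(\theta) is the prior distribution, and p(\theta \mid y) is the posterior distribution.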
-
A brief history of Bayesian analysis
Bayes (1763): links statistics to probability
Laplace (1800): normal distribution
Many applications, including census [sampling models]
Gauss (1800): least squares
Applications to astronomy [measurement error models]
Keynes, von Neumann, Savage (1920s-1950s): link Bayesian statistics to decision theory
Applied statisticians (1950s-1970s): hierarchical linear models
Applications to animal breeding, education [data in groups]
Daniel Gianola (Wang et al., 1994) and Daniel Sorensen (Sorensen et al., 1994) brought these techniques into the field of animal breeding.
-
Prior???
Litter Size of Landrace Pig ~ 10 Spanish Landrace?
Mean = 5
Prior mean = 10
What to do?
Prior information is the
information about the
parameters we want to estimate that
exists before we perform our
experiment.
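One possible way to think about the "What to do?" question is a conjugate normal update, sketched below. This is only an illustration under strong assumptions: the data variance is treated as known, the figure of 5 is read as the observed sample mean, and the variances and sample sizes are invented for the example. Only the prior mean of 10 and the mean of 5 come from the slide.

```python
def normal_posterior(prior_mean, prior_var, data_mean, data_var, n):
    """Posterior mean and variance of a normal mean with known data variance."""
    prior_precision = 1.0 / prior_var   # weight carried by the prior
    data_precision = n / data_var       # weight carried by n records
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * data_mean)
    return post_mean, post_var


# Prior mean 10 and observed mean 5 are from the slide.
# prior_var=4.0, data_var=9.0 and the values of n are hypothetical.
for n in (1, 10, 100):
    mean, var = normal_posterior(10.0, 4.0, 5.0, 9.0, n)
    print(n, round(mean, 2), round(var, 3))
```

The posterior mean is a precision-weighted average of the prior mean and the data mean, so it moves from 10 towards 5 as the number of records grows.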
-
Bayesian Variance Components Model: Prior Distributions
A "flat" prior distribution for the "fixed" effects, that is, there is no prior knowledge about these effects.
Next, the random effects are assumed to be normally distributed. For the genetic effects there is an additional assumption of a known covariance structure among those random effects, corresponding to the relationship matrix.
Finally, the residual effects are assumed to be normally distributed. These assumptions are the same as those used with most likelihood-based methods.
Results in BLUE and BLUP solutions for fixed and random effects
-
Gibbs Sampling Animal Model
Gibbs sampling (GS) is a method of numerical
integration that allows inferences to be made
about joint or marginal densities, even when
those densities cannot be evaluated directly.
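A minimal sketch of the idea, using only the Python standard library. It is not the animal model and not MTGSAM: it samples the mean and variance of a simple normal sample, assuming a flat prior on the mean and a prior proportional to 1/variance. The function name gibbs_normal and the simulated data are hypothetical.

```python
import random


def gibbs_normal(y, n_rounds, seed=1):
    """Gibbs sampler for the mean and variance of normal data.

    Assumed priors: flat on the mean, proportional to 1/variance.
    Each round draws each unknown from its full conditional distribution.
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    mu, s2 = ybar, 1.0          # starting values
    chain = []
    for _ in range(n_rounds):
        # mean | variance, data  ~  Normal(ybar, s2 / n)
        mu = rng.gauss(ybar, (s2 / n) ** 0.5)
        # variance | mean, data  ~  sum of squares / chi-square(n)
        ss = sum((yi - mu) ** 2 for yi in y)
        s2 = ss / rng.gammavariate(n / 2.0, 2.0)
        chain.append((mu, s2))
    return chain


# Simulated records (hypothetical): true mean 5, true SD 2.
data_rng = random.Random(0)
records = [data_rng.gauss(5.0, 2.0) for _ in range(200)]
samples = gibbs_normal(records, 2000)
kept = samples[200:]            # discard the first 200 rounds
print(sum(m for m, _ in kept) / len(kept))
print(sum(s for _, s in kept) / len(kept))
```

The marginal posterior of each unknown is never evaluated directly; it is approximated by the sampled values left in the chain.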
-
Posterior inference is affected by the specified prior density unless the information in the data analysed (likelihood) overwhelms the prior.
With normality, the posterior distribution is simply the (frequentist) likelihood function scaled by prior distributions of the unknown parameters in the model (Van Tassell and Van Vleck, 1996).
-
Burn-in: the number of rounds discarded before the values are considered samples from the posterior distribution is usually called the burn-in period (Raftery and Lewis, 1992).
Gibanal (Van Kaam, 1997) can be used to define the burn-in period and convergence criteria for the problem.
-
Defining the number of iterations, or length, of the Gibbs sampling chain
The number of Gibbs sampling rounds should be large enough.
Although Gibanal also dictates the length of the chain, one long chain suffices for the program (Geyer, 1992).
-
MTGSAM
Van Tassell and Van Vleck (1995)
Model assumptions: y = Xβ + Zu + e
where β is the vector of fixed effects associated with records in y by X, u is the vector of random effects associated with records in y by Z, and e is the vector of random residual effects.
-
Prior Distribution: The MTGSAM
flat prior distributions for the fixed effects.
For the genetic effects there will be an additional
assumption of a known covariance structure
corresponding to the numerator relationship
matrix.
Inverted Wishart (IW) distributions are used as prior distributions for the (co)variance components
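For a single trait with one genetic effect, these assumptions might be written as follows. This is a simplified sketch rather than MTGSAM's full multiple-trait specification. The symbols are assumptions made for illustration: \beta for the fixed effects, u for the genetic effects, e for the residuals, A for the numerator relationship matrix, and \sigma^2_a and \sigma^2_e for the additive genetic and residual variances.

```latex
p(\beta) \propto \text{constant}, \qquad
u \mid \sigma^2_a \sim N\!\left(0,\; A\,\sigma^2_a\right), \qquad
e \mid \sigma^2_e \sim N\!\left(0,\; I\,\sigma^2_e\right)
```

In the multiple-trait case the scalar variances become (co)variance matrices, which is where the inverted Wishart priors apply.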
-
Variance Components:
The MTGSAM posterior mean estimate for
(co)variance components is based on the
expected value of the inverted Wishart (IW) random variable.
The mean of a (co)variance component is
calculated as the average of expected values
over the length of the post burn-in chain.
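The averaging step can be sketched as below. This is an assumed illustration, not MTGSAM's code: the function simply averages whatever per-round values it is given (for example, the expected values written out each round) after dropping the burn-in rounds. The name summarize_chain and the toy numbers are hypothetical.

```python
import statistics


def summarize_chain(values, burn_in):
    """Mean and standard deviation of the post burn-in part of a chain."""
    kept = values[burn_in:]     # drop the first burn_in rounds
    return statistics.mean(kept), statistics.stdev(kept)


# Toy chain: the first two rounds are treated as burn-in.
mean, sd = summarize_chain([100.0, 100.0, 2.0, 4.0, 6.0], burn_in=2)
print(mean, sd)
```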
-
The MTGSAM for Genetic Analysis
Preparing a pedigree and data file
-
This procedure completes the execution of the program. The results obtained are stored mainly in the MTGS60, MTGS61, MTGS62, MTGS63, MTGS81, MTGS82 and MTGS83 files.
The unit MTGS61 file contains the observed values of the variance components.
The unit 62 file contains the parameters used to generate the samples from the appropriate distribution.
These values can be extracted by using the software PULLDAT.EXE (Annexure 2) for calculating the mean, SE or SD for the estimates obtained.
-
MTGS81 is the log file of information from the
execution of MTGSNRM. The information
includes the number of animals in A^-1, the number of
non-zero elements, and inbreeding information.
-
Actual results are given in the MTGS83 file.
-
MTGS72
Animal effect
Second
animal effect
-
Pulldat.exe
You need to have pulldat.exe in the same folder where MTGS61, MTGS62 and MTGS63 are present.
Only after the files are processed by pulldat can they be used further for Gibanal or other analyses.
Open the CMD window and change the directory to the pulldat folder.
Give the name for the output file.
Follow the options as given, except at the prompt "Enter the number of variables in each record to be read from MTGS61". Here, in our case, 5 variables need to be extracted; however, the input will change according to the data in consideration and
-
Gibanal.exe
The observed values of the fixed and random
effects are written to unit 61.
We can use Gibanal to examine:
Serial correlation
Burn in
Chain length
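As a rough illustration of what serial correlation in a chain means, the lag-k autocorrelation of the sampled values can be computed as below. This is a generic sketch and is not claimed to be how Gibanal itself computes it. The function name and the toy chain are hypothetical.

```python
def autocorrelation(chain, lag):
    """Lag-k serial correlation of a chain of sampled values."""
    n = len(chain)
    mean = sum(chain) / n
    # Sum of squared deviations (assumes the chain is not constant).
    var = sum((x - mean) ** 2 for x in chain)
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


# Toy chain that alternates sign, so neighbouring values are
# strongly negatively correlated.
toy = [1.0, -1.0] * 50
print(autocorrelation(toy, 1))
```

High serial correlation at small lags suggests that consecutive samples carry little independent information, which is one reason to thin the chain or run it longer.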
-
A Bayesian analysis produces many sampled values of the same (co)variance component.
The number of these values for Va, Vm, etc. does not depend on the number of animals, but on the length of the chain you run.
You can use 10,000 animals in your analysis: in a frequentist approach you will estimate only one Va or Vm etc., but in a Bayesian approach you will have a sample of these variances, whose size depends on the sampling length. For example:
1 - Iteration length = 1100
2 - Burn-in = 100
3 - Thinning interval = 10
So we have: iteration length - burn-in = 1100 - 100 = 1000 (samples after burn-in).
Then: samples after burn-in / thinning interval = 1000 / 10 = 100. This is the number of sampled observations for Va, Vm, etc., not the number of animals.
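The same arithmetic as a short sketch. The function names are hypothetical, and the chain is a stand-in list of round numbers rather than real sampled variances.

```python
def retained_samples(chain_length, burn_in, thin):
    """Number of values kept after dropping burn-in and thinning."""
    if burn_in >= chain_length or thin < 1:
        raise ValueError("need burn_in < chain_length and thin >= 1")
    return len(range(burn_in, chain_length, thin))


def thin_chain(chain, burn_in, thin):
    """Drop the first burn_in rounds, then keep every thin-th value."""
    return chain[burn_in::thin]


chain = list(range(1100))                     # stand-in for 1100 rounds
kept = thin_chain(chain, burn_in=100, thin=10)
print(retained_samples(1100, 100, 10), len(kept))  # prints "100 100"
```

When the post burn-in length is an exact multiple of the thinning interval, as here, the count equals (iteration length - burn-in) / thinning interval.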
-
Bin Frequency
2.18 1
2.46 3
2.73 5
3.01 10
3.29 13
3.57 22
3.85 23
4.13 9
4.41 2
-
Thank You