advanced methods of statistical analysis used in animal breeding


Upload: drbarada-mohanty

Post on 27-Jan-2017


Category:

Education



TRANSCRIPT

Page 1: Advanced Methods of Statistical Analysis used in Animal Breeding
Page 2: Advanced Methods of Statistical Analysis used in Animal Breeding

ADVANCED METHODS OF STATISTICAL ANALYSIS USED IN ANIMAL BREEDING

Dr. Barada Shankar Mohanty, M-5428

Biostatistics, Division of LES & IT

Page 3: Advanced Methods of Statistical Analysis used in Animal Breeding

CONTENTS

Types of Data

Nature of Data

Models Used

Types of Model (Linear , Nonlinear )

Different types of data analysis

Bayesian Method of Analysis

References

Page 4: Advanced Methods of Statistical Analysis used in Animal Breeding

DATA

Gathering or obtaining the desired information under study.

Primary source: data collected by the researcher himself, using experiments, surveys, questionnaires, interviews, and observations.

Secondary source: data that come from resources that have already been published, i.e. data collected, compiled or written by other researchers, e.g. books, journals, newspapers. Any reference must be acknowledged.

Page 5: Advanced Methods of Statistical Analysis used in Animal Breeding

Variables: any measurable characteristic or quantity which can assume a range of numerical values within certain limits, e.g. income, height, age, weight, price.

• Discrete: a discrete variable may assume only a countable number of values; intermediate values are not meaningful. Examples: mastitis, disease.

• Continuous: a continuous variable may assume any real value within some range; it takes fractional or integral values. Examples: milk yield, fat yield, body weight.

Page 6: Advanced Methods of Statistical Analysis used in Animal Breeding

• Independent: the variable that is controlled by the experimenter.

• Dependent: the variable that is measured.

Page 7: Advanced Methods of Statistical Analysis used in Animal Breeding

MODEL

A mathematical model is an equation or a set of equations which represents the behavior of a system.

• Linear model: a unit increase in the independent variable causes a proportionate increase in the dependent variable. More generally, a model is linear when its parameters enter linearly.

• $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + e$

A linear model spells out exactly which effects act on which observation. The different effects (such as breed and feeding regime) are estimated simultaneously, and in this process they are corrected for each other.

Linear models are the most common type of statistical model used in animal breeding to predict breeding values based on phenotypic observations.

• Non-linear model: if one or more parameters of a model do not appear linearly, the model is known as a nonlinear model.

• $Y = a X^{b} \exp(-cX) + e$

05/01/2023

Page 8: Advanced Methods of Statistical Analysis used in Animal Breeding

The model usually consists of factors:
• Discrete factors or class variables, such as sex, year, herd
• Continuous factors or covariables, such as age

A model contains 3 components:
• Predictor: the independent variable
• Predictand: the dependent variable
• Error term

Predictor → Model → Predictand (+ Error)

Page 9: Advanced Methods of Statistical Analysis used in Animal Breeding

Types of Analysis :

(A) Univariate Analysis: when we use one variable to describe a person, place, or thing; individuals are grouped on the basis of a single performance trait.

(B) Bivariate Analysis: when we use two variables.

(C) Multivariate Analysis: when we use more than two variables.

Page 10: Advanced Methods of Statistical Analysis used in Animal Breeding

Univariate Analysis:

1. Linear regression model
2. Least squares model
   a. Random effect model: heritability and repeatability estimation
   b. Fixed effect model
   c. Mixed effect model: BLUP

Page 11: Advanced Methods of Statistical Analysis used in Animal Breeding

Linear Regression Model:

$Y_i = \beta_0 + \beta_1 X_i + e_i$

where
$Y_i$ = dependent variable
$X_i$ = independent variable
$\beta_0$, $\beta_1$ = fixed parameters; their true values cannot be found exactly and must be estimated
$e_i$ = error term

The estimation of the regression coefficients is based on least squares analysis.
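The least squares estimation of the regression line can be sketched in a few lines of Python. This is only an illustration: the function name `fit_simple_regression` and the age and milk-yield figures are hypothetical and not taken from the slides.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Return (b0, b1) that minimise sum((y - b0 - b1*x)**2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-products divided by sum of squares of x
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Hypothetical records: age (months) and daily milk yield (kg)
age = np.array([24.0, 30.0, 36.0, 42.0, 48.0])
milk = np.array([10.2, 11.3, 13.1, 14.4, 16.0])

b0, b1 = fit_simple_regression(age, milk)
residuals = milk - (b0 + b1 * age)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

With an intercept in the model, the least squares residuals sum to zero, which is a quick sanity check on any fit.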

Page 12: Advanced Methods of Statistical Analysis used in Animal Breeding

Random Effects

Fixed Effects

Page 13: Advanced Methods of Statistical Analysis used in Animal Breeding

Random Effects

Effects whose levels are considered to be drawn from an infinitely large population of levels.

Animal effects are often random: in repeated experiments, other animals may be drawn from the population, e.g. sire and dam effects.

Fixed Effects

Effects for which the defined classes comprise all the possible levels of interest, e.g. herd, season and year effects. Effects can be considered fixed when the number of levels is relatively small and stays confined to this number under repeated sampling.

Page 14: Advanced Methods of Statistical Analysis used in Animal Breeding

Predictors: used for the prediction of random effects.

Estimators: used for the estimation of fixed effects.

Principle of least squares analysis: the sum of the squared differences between the observed and the estimated/predicted values of the dependent variable must be as small as possible (at best zero).

If $Y_i = b_0 + b_1 X_i + e_i$, then $\sum_i e_i^2 = \sum_i (Y_i - b_0 - b_1 X_i)^2$ is minimised, with lower bound 0.

Page 15: Advanced Methods of Statistical Analysis used in Animal Breeding

Random Effect Model:

$Y_{ij} = \mu + S_i + e_{ij}$

where
$Y_{ij}$ = jth observation of the ith sire
$\mu$ = general mean effect
$S_i$ = effect of the ith sire
$e_{ij}$ = error term
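As a rough illustration of how the sire model is used for heritability estimation, the sketch below estimates the sire and error variance components from a balanced one-way ANOVA. It then applies the paternal half-sib formula h² = 4σ²s / (σ²s + σ²e). The balanced design, the half-sib assumption, the function name and the records are all assumptions made for this example, not content from the slides.

```python
import numpy as np

def sire_model_heritability(records):
    """records: 2-D array, one row per sire, one column per progeny (balanced)."""
    y = np.asarray(records, dtype=float)
    s, k = y.shape                       # s sires, k progeny per sire
    sire_means = y.mean(axis=1)
    grand_mean = y.mean()
    # Mean squares between sires and within sires
    ms_between = k * np.sum((sire_means - grand_mean) ** 2) / (s - 1)
    ms_within = np.sum((y - sire_means[:, None]) ** 2) / (s * (k - 1))
    # ANOVA estimators of the variance components
    var_sire = (ms_between - ms_within) / k
    var_error = ms_within
    # Paternal half-sib assumption: the sire variance is 1/4 of the additive variance
    h2 = 4.0 * var_sire / (var_sire + var_error)
    return var_sire, var_error, h2

# Hypothetical records: 3 sires with 4 progeny each
records = [[10, 14, 11, 13],
           [12, 15, 11, 14],
           [ 9, 13, 12, 10]]
var_sire, var_error, h2 = sire_model_heritability(records)
print(var_sire, var_error, h2)
```

With unbalanced data the simple division by k no longer applies, which is one reason methods such as REML are preferred in practice.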

Page 16: Advanced Methods of Statistical Analysis used in Animal Breeding

Fixed Effect Model:

$Y_{ij} = \mu + F_i + e_{ij}$

where
$Y_{ij}$ = jth observation of the ith herd
$\mu$ = general mean effect
$F_i$ = effect of the ith herd
$e_{ij}$ = error term

Page 17: Advanced Methods of Statistical Analysis used in Animal Breeding

Mixed Effect Models

In 'mixed models', fixed effects and breeding values (treated as 'random effects') are estimated jointly.

Page 18: Advanced Methods of Statistical Analysis used in Animal Breeding

$Y_{ijk} = \mu + F_i + S_j + e_{ijk}$

where
$Y_{ijk}$ = kth observation of the jth sire in the ith farm
$\mu$ = general mean effect
$F_i$ = effect of the ith farm
$S_j$ = effect of the jth sire in the ith farm
$e_{ijk}$ = error term

Page 19: Advanced Methods of Statistical Analysis used in Animal Breeding

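Interaction effect:

$Y_{ijk} = \mu + S_i + F_j + (SF)_{ij} + e_{ijk}$

where, in this equation, i indexes sires and j indexes farms, and $(SF)_{ij}$ is the interaction between the ith sire and the jth farm.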

Page 20: Advanced Methods of Statistical Analysis used in Animal Breeding

BLUP

BLUP (Best Linear Unbiased Prediction) of breeding values is based on a linear mixed model.

Page 21: Advanced Methods of Statistical Analysis used in Animal Breeding

BLUP estimates of the realized values of the random variables u are:

• Linear, in the sense that they are linear functions of the data, y;
• Unbiased, in the sense that the average value of the estimate is equal to the average value of the quantity being estimated;
• Best, in the sense that they have minimum mean squared error within the class of linear unbiased estimators; and
• Predictors, to distinguish them from estimators of fixed effects.

The best prediction is that which minimises the prediction error.

(G.K. Robinson, 1991)

Page 22: Advanced Methods of Statistical Analysis used in Animal Breeding

Best: maximizes the correlation between the true and estimated values of effects by minimizing the error variance.

Linear: the estimates are linear functions of the observations.

Unbiased: estimates of fixed effects and estimable functions are such that the expected value of each estimate equals the quantity being estimated.

Page 23: Advanced Methods of Statistical Analysis used in Animal Breeding

3 different kinds of BLUP (Henderson, 1973):

• Henderson model-1: only random effects
• Henderson model-2: both random and fixed effects
• Henderson model-3: random + fixed + interaction effects

Page 24: Advanced Methods of Statistical Analysis used in Animal Breeding

The general linear model equation in matrix form is

$y = X\beta + Zu + e$

where

y is an n × 1 vector of n observed records

X is a known incidence matrix of order n × p, which relates the records in y to the fixed effects in β

β is a p × 1 vector of p levels of fixed effects (to be estimated)

Z is a known incidence matrix of order n × q, which relates the records in y to the random effects in u

u is a q × 1 vector of q levels of random effects, such as individual genetic values (to be predicted)

e is an n × 1 vector of random residual terms
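To make the incidence matrices concrete, here is a small sketch that builds X and Z from class labels. The helper `incidence_matrix` and the herd and sire labels are hypothetical, chosen only to show the n × p and n × q shapes described above.

```python
import numpy as np

def incidence_matrix(labels):
    """Return (M, levels): M[i, j] = 1 if record i belongs to level j, else 0."""
    levels = sorted(set(labels))
    column = {level: j for j, level in enumerate(levels)}
    M = np.zeros((len(labels), len(levels)))
    for i, label in enumerate(labels):
        M[i, column[label]] = 1.0
    return M, levels

# Hypothetical data: 5 records, each from one herd (fixed) and one sire (random)
herd = ["H1", "H1", "H2", "H2", "H2"]
sire = ["S1", "S2", "S1", "S3", "S2"]

X, herd_levels = incidence_matrix(herd)   # n x p, relates records to herd effects
Z, sire_levels = incidence_matrix(sire)   # n x q, relates records to sire effects
print(X.shape, Z.shape)
```

Each row contains a single 1 because every record belongs to exactly one herd and one sire in this toy layout.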

Page 25: Advanced Methods of Statistical Analysis used in Animal Breeding

Expectations and Variance-Covariance (VCV) Matrices

In general, the expectation of y is

$E(y) = X\beta$

which is also known as the 1st moment. The 2nd moments describe the variance-covariance structure of y:

$\mathrm{Var}(u) = G, \quad \mathrm{Var}(e) = R, \quad \mathrm{Cov}(u, e) = 0$

G is the dispersion matrix of the random effects other than errors, and R is the dispersion matrix of the error terms. Both are general square matrices assumed to be non-singular and positive definite, with elements that are assumed known.

We usually write $V = \mathrm{Var}(y) = ZGZ^T + R$.

Page 26: Advanced Methods of Statistical Analysis used in Animal Breeding

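Estimating Fixed Effects & Predicting Random Effects:

For a mixed model, y, X and Z are observed; β, u, R and G are generally unknown.

Two complementary estimation issues: estimation of β and of u.

• Estimation of fixed effects, BLUE = Best Linear Unbiased Estimator:

$\hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} y$

• Prediction of random effects, BLUP = Best Linear Unbiased Predictor:

$\hat{u} = G Z^T V^{-1} (y - X\hat{\beta})$

Here $V = ZGZ^T + R$. When $V = \sigma^2 I$, the BLUE reduces to the ordinary least squares estimator $(X^T X)^{-1} X^T y$.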
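A minimal numerical sketch of BLUE and BLUP follows, assuming G and R are known. It also solves Henderson's mixed model equations, a standard equivalent formulation that is not shown on the slide, to check that both routes agree. The records, the variance values and the function names are hypothetical.

```python
import numpy as np

def blue_blup(y, X, Z, G, R):
    """BLUE of beta and BLUP of u using V = Z G Z' + R."""
    V = Z @ G @ Z.T + R
    Vinv = np.linalg.inv(V)
    beta_hat = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    u_hat = G @ Z.T @ Vinv @ (y - X @ beta_hat)
    return beta_hat, u_hat

def mixed_model_equations(y, X, Z, G, R):
    """Same solutions from Henderson's mixed model equations (avoids inverting V)."""
    Rinv = np.linalg.inv(R)
    Ginv = np.linalg.inv(G)
    lhs = np.block([[X.T @ Rinv @ X, X.T @ Rinv @ Z],
                    [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ginv]])
    rhs = np.concatenate([X.T @ Rinv @ y, Z.T @ Rinv @ y])
    solution = np.linalg.solve(lhs, rhs)
    p = X.shape[1]
    return solution[:p], solution[p:]

# Hypothetical example: 5 records, 2 herds (fixed), 3 sires (random)
y = np.array([4.5, 2.9, 3.9, 3.5, 5.0])
X = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
G = 0.1 * np.eye(3)   # assumed sire variance
R = 1.0 * np.eye(5)   # assumed residual variance

beta_hat, u_hat = blue_blup(y, X, Z, G, R)
beta_mme, u_mme = mixed_model_equations(y, X, Z, G, R)
print(beta_hat, u_hat)
```

In real evaluations G would typically be built from a relationship matrix and the variances would be estimated, for example by REML, rather than assumed.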

Page 27: Advanced Methods of Statistical Analysis used in Animal Breeding

• BLUP eliminates the non-genetic biases in estimating breeding values.
• It also removes genetic biases by taking into account the effects of non-random mating, the genetic merit of dams, and selection.

Page 28: Advanced Methods of Statistical Analysis used in Animal Breeding


Page 29: Advanced Methods of Statistical Analysis used in Animal Breeding

Advantages:
– Handles unbalanced designs
– Uses information from all relatives measured to improve estimates

BLUP can be used to estimate a variety of genetic values:
– GCA, SCA, line values (i.e., genotypic values of pure lines)
– One can also use BLUP to estimate environmental effects and G x E

Page 30: Advanced Methods of Statistical Analysis used in Animal Breeding

REML Variance Component Estimation

REML = Restricted Maximum Likelihood. Depending on whom you ask, REML stands for Residual Maximum Likelihood or Restricted Maximum Likelihood.

Standard ML variance estimation assumes that the fixed factors are known without error. REML is an approach that produces unbiased estimators in certain special cases and less biased estimates than ML in general.

Page 31: Advanced Methods of Statistical Analysis used in Animal Breeding

Variance components are estimated by REML from the residuals calculated after fitting the fixed-effects part of the model by ordinary least squares.

It maximizes a marginal likelihood function, so it is also called Residual Maximum Likelihood or Marginal Maximum Likelihood.

For linear mixed effects models, when the data are balanced and the ANOVA-based estimates are positive, the REML estimators of variance components give the same estimates as the unbiased ANOVA-based estimators formed by taking appropriate linear combinations of mean squares.

Page 32: Advanced Methods of Statistical Analysis used in Animal Breeding


Page 33: Advanced Methods of Statistical Analysis used in Animal Breeding
Page 34: Advanced Methods of Statistical Analysis used in Animal Breeding

ESTIMATION

Page 35: Advanced Methods of Statistical Analysis used in Animal Breeding


Page 36: Advanced Methods of Statistical Analysis used in Animal Breeding

ML vs REML

Before getting onto iterative algorithms, it is helpful to review the difference between the log-likelihood function $\ell$ used to calculate maximum likelihood estimates and that ($\ell_R$) used for REML.

The term $\log|X^T V^{-1} X|$ in $\ell_R$ makes the adjustment for the degrees of freedom used in estimating treatment effects, so that REML estimates of variance components are less biased than ML estimates. The other major differences are:
• $\ell_R$ is not a function of the fixed effects
• The constant in $\ell_R$ is a function of the fixed design matrix X

Page 37: Advanced Methods of Statistical Analysis used in Animal Breeding


Page 38: Advanced Methods of Statistical Analysis used in Animal Breeding

ML vs REML

• ML estimates are biased because no account is taken of the degrees of freedom in estimating the variance components.
• REML takes care of the bias in the estimates and also avoids negative estimates of variance components. (Searle et al., 1992)
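The degrees-of-freedom point can be seen in the simplest case, a fixed-effects model with a single residual variance. There the ML estimate divides the residual sum of squares by n, while the REML estimate divides by n − p. The sketch below assumes X has full column rank; the function name and the records are hypothetical.

```python
import numpy as np

def residual_variance_ml_reml(y, X):
    """Return (ML, REML) estimates of the residual variance for y = X beta + e."""
    n, p = X.shape                      # assumes X has full column rank p
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta_hat) ** 2))
    sigma2_ml = rss / n                 # ignores the p estimated fixed effects
    sigma2_reml = rss / (n - p)         # adjusts for the degrees of freedom used
    return sigma2_ml, sigma2_reml

# Hypothetical records with only an overall mean as fixed effect (p = 1)
y = np.array([10.0, 12.0, 11.0, 13.0])
X = np.ones((4, 1))
sigma2_ml, sigma2_reml = residual_variance_ml_reml(y, X)
print(sigma2_ml, sigma2_reml)
```

The gap between the two estimates shrinks as n grows relative to p, which is why the bias matters most in models with many fixed-effect levels.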

Page 39: Advanced Methods of Statistical Analysis used in Animal Breeding

Genetic Evaluation

REML and BLUP applied to multi-trait mixed models have become the standard method for genetic evaluation in all terrestrial animal species. The main benefits of using this methodology include:
(1) increasing accuracy of selection;
(2) managing accumulation of inbreeding;
(3) estimating genetic trend without a control;
(4) the possibility of conducting large-scale genetic evaluation across populations.

(N.H. Nguyen and R.W. Ponzoni, Vol. 29 No. 3 & 4, Jul-Dec 2006)

Page 40: Advanced Methods of Statistical Analysis used in Animal Breeding

Bivariate Analysis :

• Tests of statistical significance
• Chi-square

Page 41: Advanced Methods of Statistical Analysis used in Animal Breeding


Multivariate analysis

Multivariate analysis consists of a collection of methods that can be used when several measurements are made on each individual or object in one or more samples.

Page 42: Advanced Methods of Statistical Analysis used in Animal Breeding


Page 43: Advanced Methods of Statistical Analysis used in Animal Breeding

Types:
1. Multivariate Regression
2. Discriminant Analysis
3. Principal Component Analysis
4. Genetic Divergence Analysis
5. Canonical Variate Analysis

Page 44: Advanced Methods of Statistical Analysis used in Animal Breeding

1. Multivariate Regression:

$Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + e_i$

$Y_i = \beta_0 + \sum_{j=1}^{n} \beta_j X_j + e_i$

2. Discriminant Analysis:

Used when the number of populations is more than one and each animal in the population has multiple characters.

Page 45: Advanced Methods of Statistical Analysis used in Animal Breeding

• Purpose: to find the discriminant function (D) that increases the differences among populations by minimising the variances within populations and maximising the mean differences between populations with respect to the characters.

• $D = \lambda_1 d_1 + \lambda_2 d_2 + \lambda_3 d_3 = \sum_i \lambda_i d_i$

where $d_i$ = mean difference between the populations for the ith character, and $\lambda_i$ = weighting coefficient attached to that difference.

Page 46: Advanced Methods of Statistical Analysis used in Animal Breeding

3. Principal Component (Z) Analysis:

Linear combinations of the independent characters are formed so as to maximise the variance accounted for in the original set of characters.

• $Z_1 = a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4$

• $Z_2 = b_1 x_1 + b_2 x_2 + b_3 x_3 + b_4 x_4$

• $Z_3 = c_1 x_1 + c_2 x_2 + c_3 x_3 + c_4 x_4$

• $Z_4 = d_1 x_1 + d_2 x_2 + d_3 x_3 + d_4 x_4$

$a_i$, $b_i$, $c_i$, $d_i$ are the relative weighting factors attached to each character, with $\sum a_i^2 = \sum b_i^2 = \sum c_i^2 = \sum d_i^2 = 1$.

It is mainly confined to a single population.

The principal component with the highest variance is the 1st principal component; the one with the next highest variance is the 2nd principal component.

In India, this analysis was first reported by Dhara and Chakravarty (1996) in large animals, for predicting the breeding value of milk production on the basis of a selected number of principal components.
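One common way to obtain the weighting factors is an eigen-decomposition of the covariance matrix of the characters, sketched below. The trait values are hypothetical and no scaling of the traits is applied; in practice traits measured in different units are often standardised first.

```python
import numpy as np

def principal_components(data):
    """Return (variances, weights, scores), sorted by decreasing variance."""
    X = np.asarray(data, dtype=float)
    centred = X - X.mean(axis=0)
    cov = np.cov(centred, rowvar=False)        # covariance matrix of the characters
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]          # largest variance first
    variances, weights = eigvals[order], eigvecs[:, order]
    scores = centred @ weights                 # column m holds Z_(m+1) for each animal
    return variances, weights, scores

# Hypothetical records: 6 animals x 4 characters
# (body weight kg, milk yield kg/day, fat %, height cm)
traits = [[450, 12.1, 3.8, 130],
          [480, 13.4, 3.6, 134],
          [430, 11.0, 4.1, 128],
          [500, 14.2, 3.5, 137],
          [465, 12.8, 3.9, 131],
          [445, 11.9, 4.0, 129]]

variances, weights, scores = principal_components(traits)
print(variances / variances.sum())   # share of total variance per component
```

Each column of `weights` has unit length, matching the constraint that the squared weighting factors sum to 1.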

Page 47: Advanced Methods of Statistical Analysis used in Animal Breeding

4. Genetic Divergence Analysis / Genetic Distance Analysis / D² Analysis (given by P.C. Mahalanobis, 1928):

D² is obtained by summing the squares of the deviations of the same transformed or untransformed traits between the two genotypes, in various combinations:

$D^2_{jk} = \sum_{i=1}^{p} (x_{ij} - x_{ik})^2$

where i indexes the traits, varying from 1 to p, and j, k are the genotypes.

• Follows a chi-square distribution with p degrees of freedom.
• The more critical the trait, the more distinctly different groups develop.

Page 48: Advanced Methods of Statistical Analysis used in Animal Breeding

Few Notes on GD:
• Used to maintain variation between different animals within a population; thus many groups can be formed within the population.
• Explains the extent and range of variability within the population.
• Explains the evolutionary divergence.
• A progeny testing programme needs GD for distinguishing unrelated sires.

Page 49: Advanced Methods of Statistical Analysis used in Animal Breeding

5. Canonical Variate Analysis

• Used in a single population.
• The traits of the animals can be divided into two sets, and the relationship between the two sets is to be evaluated.
• 2 sets of characters: Y set = response characters; X set = predictor characters.
• The Y set is maximally correlated with the X set.

• $M = p_1 y_1 + p_2 y_2$

• $N = q_1 x_1 + q_2 x_2 + q_3 x_3$

The canonical coefficients ($p_1$, $p_2$ and $q_1$, $q_2$, $q_3$) are chosen in such a way that the correlation between the two canonical variates (M and N) becomes maximum; that correlation is called the canonical correlation.

In India, first used on dairy buffaloes by Thomas and Chakravarty (1999).
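One way to compute canonical coefficients and the canonical correlation is to whiten both sets of characters with Cholesky factors and take a singular value decomposition of the cross-covariance. This is a sketch of one standard route, not necessarily the procedure used by the cited authors. The simulated data and function name are hypothetical.

```python
import numpy as np

def canonical_correlations(Y, X):
    """Return (correlations, p, q): p and q are the first pair of canonical coefficients."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = Y.shape[0]
    Yc, Xc = Y - Y.mean(axis=0), X - X.mean(axis=0)
    Syy = Yc.T @ Yc / (n - 1)
    Sxx = Xc.T @ Xc / (n - 1)
    Syx = Yc.T @ Xc / (n - 1)
    Ly = np.linalg.cholesky(Syy)               # Syy = Ly Ly'
    Lx = np.linalg.cholesky(Sxx)               # Sxx = Lx Lx'
    # Cross-covariance of the whitened sets: Ly^-1 Syx Lx^-T
    K = np.linalg.solve(Ly, Syx) @ np.linalg.inv(Lx).T
    U, s, Vt = np.linalg.svd(K)
    p = np.linalg.solve(Ly.T, U[:, 0])         # coefficients for the Y set
    q = np.linalg.solve(Lx.T, Vt[0])           # coefficients for the X set
    return s, p, q

# Simulated (hypothetical) data: 30 animals, 3 predictor and 2 response characters
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
Y = np.column_stack([X[:, 0] + 0.5 * rng.normal(size=30),
                     X[:, 1] - X[:, 2] + rng.normal(size=30)])

s, p, q = canonical_correlations(Y, X)
M = (Y - Y.mean(axis=0)) @ p                   # canonical variate of the Y set
N = (X - X.mean(axis=0)) @ q                   # canonical variate of the X set
print(s[0], np.corrcoef(M, N)[0, 1])
```

The first singular value is the canonical correlation, and the correlation between M and N computed directly from the data should reproduce it.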

Page 50: Advanced Methods of Statistical Analysis used in Animal Breeding

BAYESIAN METHODS OF ANALYSIS

Fundamentals of a Bayesian analysis:
• Formulate a probability model for the data.
• Decide on a prior distribution, which quantifies the uncertainty in the values of the unknown model parameters before the data are observed.
• Summarize important features of the posterior distribution, or calculate quantities of interest based on the posterior distribution. These quantities constitute statistical outputs, such as point estimates and intervals.

Page 51: Advanced Methods of Statistical Analysis used in Animal Breeding

Bayesian inference: A form of inference which regards parameters as being random variables possessed of prior distributions reflecting the accumulated state of knowledge.

Bayes estimation: The estimation of population parameters by the use of methods of inverse probability.

(A.L. Pretorius and A.J. van der Merwe, 2000)

Page 52: Advanced Methods of Statistical Analysis used in Animal Breeding

Bayes' theorem is based on conditional probability.

Conditional probability: the probability of event B occurring when it is known that some event A has already occurred; it is denoted by $P(B \mid A)$.

When A and B are dependent events:

$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$, with $P(A) \neq 0$

or

$P(A \cap B) = P(A) \cdot P(B \mid A)$

Page 53: Advanced Methods of Statistical Analysis used in Animal Breeding

If $B_1, B_2, \dots, B_K$ are mutually disjoint events with $P(B_i) \neq 0$ (i = 1, 2, …, K), then for any arbitrary event A which is a subset of $\bigcup_{i=1}^{K} B_i$ such that $P(A) \neq 0$, we have

$P(B_i \mid A) = \frac{P(B_i) \cdot P(A \mid B_i)}{\sum_{j=1}^{K} P(B_j) \cdot P(A \mid B_j)}$

Page 54: Advanced Methods of Statistical Analysis used in Animal Breeding

1. The probabilities $P(B_1), P(B_2), \dots, P(B_K)$ are called "a priori probabilities" because they exist before we get any information from the experiment itself.

2. The probabilities $P(A \mid B_i)$, i = 1, 2, …, K, are called "likelihoods" because they indicate how likely the event A under consideration is to occur given each a priori event $B_i$.

3. The probabilities $P(B_i \mid A)$ are called "posterior probabilities" because they are determined after the results of the experiment are known.

$P(B_i)$ = "a priori probability", $P(A \mid B_i)$ = "likelihood", $P(B_i \mid A)$ = "posterior probability"

$P(B_i \mid A) = \frac{P(B_i) \cdot P(A \mid B_i)}{\sum_{j=1}^{K} P(B_j) \cdot P(A \mid B_j)}$

Page 55: Advanced Methods of Statistical Analysis used in Animal Breeding

Few Important Notes:

The notions of "prior" and "posterior" in Bayes' theorem are relative to a given sample outcome. That is, if a posterior distribution has been determined from a particular sample, this posterior distribution would be considered the prior distribution relative to a new sample.

Prior → Posterior-1 (becomes the prior) → Posterior-2 (becomes the prior)
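The sequential use of Bayes' theorem can be sketched with a small discrete example. The scenario is hypothetical: B1 = "the sire carries a recessive allele", B2 = "the sire is not a carrier". The evidence A is a normal calf from a mating with a known carrier dam, so that P(A | B1) = 0.75 and P(A | B2) = 1. The starting prior of 0.5 each is also an assumption.

```python
def bayes_update(prior, likelihood):
    """Return posterior probabilities P(B_i | A) from P(B_i) and P(A | B_i)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)                  # P(A), by the law of total probability
    if total == 0:
        raise ValueError("the observed event has zero probability under the prior")
    return [j / total for j in joint]

prior = [0.5, 0.5]                      # P(B1), P(B2): assumed starting values
likelihood = [0.75, 1.0]                # P(A | B1), P(A | B2)

# First normal calf: the prior is updated to posterior-1
posterior_1 = bayes_update(prior, likelihood)
# Second normal calf: posterior-1 now plays the role of the prior
posterior_2 = bayes_update(posterior_1, likelihood)

print(posterior_1)
print(posterior_2)
```

Each additional normal calf lowers the probability that the sire is a carrier, which is the prior to posterior chain described in the note above.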

Page 56: Advanced Methods of Statistical Analysis used in Animal Breeding

References:

Robin Thompson and Esa Mantysaari (2004) Prospects for statistical methods in animal breeding. Jour. Ind. Soc. Ag. Statistics 57 (Special Volume): 15-25.

P. Narain, Statistics and Its Application to Agriculture and Genetics, IARI, New Delhi.

A.K. Chakravarty, Multivariate Analysis in Animal Breeding, NDRI, Karnal.

Verbyla, A. P. (1990) A conditional derivation of residual maximum likelihood. Australian Journal of Statistics, 32, 227-230.

Henderson CR (1975) Best linear unbiased estimation and prediction under a selection model. Biometrics 31:423–447.

Henderson CR (1976) A simple method for the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32:69–83.

Page 57: Advanced Methods of Statistical Analysis used in Animal Breeding

Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2003) Bayesian Data Analysis, 2nd ed. London, Chapman & Hall.

Carlin, B. P., and Louis, T. A. (2000) Bayes and Empirical Bayes Methods for Data Analysis, 2nd ed. Boca Raton, Chapman & Hall.

Duchateau, L., Janssen, P., and Rowlands, G.L. (1998) Linear Mixed Models. An introduction with applications in veterinary research. ILRI, Nairobi, Kenya, 159-170.

Page 58: Advanced Methods of Statistical Analysis used in Animal Breeding
