Genomic imputation and evaluation using 1074 high density Holstein genotypes
P. M. VanRaden1, D. J. Null1*, G. R. Wiggans1, T. S. Sonstegard2, E. E. Connor2, M. Winters3, and M. Sargolzaei4
1 Animal Improvement Programs Laboratory, ARS, USDA, Beltsville, MD
2 Bovine Functional Genomics Laboratory, ARS, USDA, Beltsville, MD
3 DairyCo, Agriculture and Horticulture Development Board, Warwickshire, UK
4 Centre for Genetic Improvement of Livestock, U. Guelph, ON, Canada
Abstr. W53
2011
Introduction
Data
• Four types of genotypes were used for this analysis: HD, 50K, 3K, and imputed dams.
• The data included 1,074 animals with HD genotypes, 66,540 with 50K, 33,119 with 3K, and 2,337
imputed dams.
• HD genotypes were from 356 influential USA and CAN sires, 398 GBR sires, 156 other sires,
138 Beltsville research cows, and 26 other females.
• To test imputation, an example simulated chromosome was used, with 1% of genotypes initially
missing and 0.02% initially incorrect on each chip. Across all animals, 94.4% of genotypes
were missing initially.
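The animal counts above can be cross-checked with a little arithmetic. The sketch below is a rough, genome-wide approximation rather than the authors' calculation: it takes the marker counts per chip from the Objectives (636,967, 42,495, 2,614, and 0) and assumes 1% of observed genotypes are set to missing. Because the poster's 94.4% figure refers to a single simulated chromosome, the result should be close to it (roughly 94.5 to 94.6%) but need not match exactly.

```python
# Animal counts from the Data bullets; marker counts from the Objectives.
animals = {"HD": 1074, "50K": 66540, "3K": 33119, "imputed dams": 2337}
markers = {"HD": 636967, "50K": 42495, "3K": 2614, "imputed dams": 0}

total_animals = sum(animals.values())
hd_markers = markers["HD"]

# Genotypes observed before imputation: each animal contributes
# only the markers present on its own chip.
observed = sum(animals[chip] * markers[chip] for chip in animals)
possible = total_animals * hd_markers

# Assumption: 1% of the observed genotypes are masked as missing,
# mirroring the simulation setup described above.
missing_pct = 100 * (1 - 0.99 * observed / possible)

print(total_animals)          # 103070, the animal total quoted for findhap
print(round(missing_pct, 1))  # share of the HD genotype matrix that is empty
```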
Conclusions
• Imputation from 50K to HD is accurate (98.9%).
• The 0.4% average increase in reliability is smaller than the 0.9% expected from
simulation.
• More animals with HD genotypes will improve imputation and reliability.
• Multi-breed evaluation could produce larger gains than the single-breed evaluation that was
investigated.
Software & Computing (cont.)
• A maximum length of 2,000 markers and a minimum of 200 markers yielded the best results when
findhap was run once.
• A maximum length of 1,500 markers and a minimum of 200 markers yielded the best results when
findhap was run twice and when findhap and FImpute were combined.
• Running FImpute followed by findhap yielded the best results, with an average of 96.76% correctly
called HD genotypes across all chip types including imputed dams (Table 1).
• The average reliability gain over all traits was 0.4% (Table 2).
Table 2. Gains in reliability.
• Three combinations of the programs were tested: findhap run once (imputing from 3K and 50K up to
HD), findhap run twice (first imputing 3K to 50K, then imputing 50K to HD), and running FImpute
(imputing 3K to 50K) before running findhap (imputing 50K to HD).
• Several combinations of segment lengths were tested in findhap.
• Imputation of 636,967 markers for 103,070 animals with findhap required 50 Gbytes of memory and
10 hours using 6 processors.
• Iteration for SNP effects for 29 traits required 2 days using 6 processors.
• August 2007 predictions were tested with April 2011 data.
Higher density genotypes can provide markers closer to QTL, but imputation is needed for
genotypes of less than highest density. Markers from multiple chips can then be combined in
genomic evaluation.
Results (cont.)
Objectives
• Determine the accuracy of imputing up to 636,967 markers (HD) from 42,495 markers (50K),
2,614 markers (3K), or 0 markers (imputed dams) using simulated data.
• Determine the gain in reliability from using more markers with actual data.
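As an illustration of the first objective, imputation accuracy can be expressed as the percentage of imputed genotype calls that match the true (simulated) calls. The sketch below is a minimal example under stated assumptions: genotypes are coded 0, 1, or 2, the function name `percent_correct` is hypothetical, and the toy data are invented for illustration. It is not taken from findhap or FImpute.

```python
def percent_correct(true_genotypes, imputed_genotypes):
    """Percentage of genotype calls that match, compared position by position."""
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have the same length")
    if not true_genotypes:
        raise ValueError("no genotypes to compare")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return 100.0 * matches / len(true_genotypes)


# Hypothetical toy data: 10 markers, genotypes coded as allele counts 0, 1, 2.
true_calls = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
imputed_calls = [0, 1, 2, 1, 0, 2, 1, 2, 0, 2]

accuracy = percent_correct(true_calls, imputed_calls)
print(accuracy)  # 90.0 (9 of 10 calls agree)
```

With simulated data the true genotypes are known for every marker, which is what makes this kind of direct comparison possible.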
Results
Table 1. Correctly imputed genotypes.
Software & Computing
• Both findhap.f90, developed at AIPL, and FImpute, developed at U. Guelph and Boviteq Alliance,
were tested in this analysis.
• The imputation rate with findhap version 2 is better than in earlier tests of version 1.
• Version 2 of findhap uses both long segments to improve haplotype matches for close relatives
and short segments to help detect matches from more remote ancestors.
Correctly called genotypes (%)
3K to 50K   50K to HD   Dams    HD      50K     3K      Average
-           Findhap     94.23   99.91   98.84   88.77   95.43
Findhap     Findhap     94.52   99.91   98.92   90.36   95.93
FImpute     Findhap     95.53   99.91   98.93   92.69   96.76
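The Average column of Table 1 appears to be the simple mean of the four genotype categories. The sketch below checks this from the printed values. It assumes the four numbers in each row are in the order Dams, HD, 50K, 3K, as the header suggests, and the strategy labels are shorthand introduced here.

```python
# Values copied from Table 1; assumed order per row: Dams, HD, 50K, 3K.
table1 = {
    "findhap once": [94.23, 99.91, 98.84, 88.77],
    "findhap twice": [94.52, 99.91, 98.92, 90.36],
    "FImpute then findhap": [95.53, 99.91, 98.93, 92.69],
}
reported_average = {
    "findhap once": 95.43,
    "findhap twice": 95.93,
    "FImpute then findhap": 96.76,
}

# Unweighted mean across the four genotype categories.
recomputed = {name: sum(vals) / len(vals) for name, vals in table1.items()}

for name in table1:
    print(f"{name}: recomputed {recomputed[name]:.3f}, reported {reported_average[name]:.2f}")

# Strategy with the highest recomputed average.
best = max(recomputed, key=recomputed.get)
print(best)
```

The recomputed means agree with the reported averages to within 0.01, and the gain from combining programs comes almost entirely from the 3K and imputed-dam columns.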
Trait                     50K Rel   HD Rel   HD Gain
Milk                      67.3      67.8      0.6
Fat                       69.9      70.3      0.4
Protein                   61.0      61.4      0.4
Fat %                     85.6      87.5      1.9
Protein %                 78.4      80.9      2.6
Net Merit                 52.4      52.4      0.0
Productive Life           52.9      53.1      0.2
SCS                       61.4      60.9     -0.5
Daughter Pregnancy Rate   50.8      50.5     -0.3
Sire Calving Ease         30.8      32.2      1.5
Daughter Calving Ease     38.9      37.0     -1.9
Sire Stillbirth           17.6      18.2      0.6
Daughter Stillbirth       28.5      28.8      0.3
Final Score               53.2      53.4      0.2
Stature                   63.9      65.4      1.4
Strength                  63.8      64.0      0.2
Udder Depth               73.8      74.2      0.4
Average                   57.0      57.4      0.4
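The HD Gain column is the HD reliability minus the 50K reliability. A few printed gains differ from the difference of the printed reliabilities by 0.1, presumably because the gains were computed before rounding. The sketch below recomputes the differences from the 17 traits listed. It does not attempt to reproduce the Average row, which may cover more traits than are shown here.

```python
# (trait, 50K reliability, HD reliability, reported HD gain), copied from Table 2.
table2 = [
    ("Milk", 67.3, 67.8, 0.6),
    ("Fat", 69.9, 70.3, 0.4),
    ("Protein", 61.0, 61.4, 0.4),
    ("Fat %", 85.6, 87.5, 1.9),
    ("Protein %", 78.4, 80.9, 2.6),
    ("Net Merit", 52.4, 52.4, 0.0),
    ("Productive Life", 52.9, 53.1, 0.2),
    ("SCS", 61.4, 60.9, -0.5),
    ("Daughter Pregnancy Rate", 50.8, 50.5, -0.3),
    ("Sire Calving Ease", 30.8, 32.2, 1.5),
    ("Daughter Calving Ease", 38.9, 37.0, -1.9),
    ("Sire Stillbirth", 17.6, 18.2, 0.6),
    ("Daughter Stillbirth", 28.5, 28.8, 0.3),
    ("Final Score", 53.2, 53.4, 0.2),
    ("Stature", 63.9, 65.4, 1.4),
    ("Strength", 63.8, 64.0, 0.2),
    ("Udder Depth", 73.8, 74.2, 0.4),
]

# Largest absolute gap between the reported gain and HD minus 50K.
max_gap = max(abs((hd - k50) - gain) for _, k50, hd, gain in table2)

# Traits where HD reliability was lower than 50K reliability.
losses = [trait for trait, _, _, gain in table2 if gain < 0]

# Trait with the largest reported gain.
largest_gain = max(table2, key=lambda row: row[3])[0]

print(round(max_gap, 1))
print(losses)
print(largest_gain)
```

The largest gains are for the component percentage traits, while three of the listed traits show lower reliability with HD than with 50K.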