Considerations for the routine implementation of genomic selection

J.M. Hickey and J. Crossa

Implementation of Genomic Selection

• I am an animal breeder!

• Genomic selection – how it works

• What do we need to consider when we implement genomic selection?

• How can we make it cheap and cost effective?

Industrial implementation

• Circa 95% of breeding decisions in dairy industry
– All developed world national breeding programs
– International genetic evaluation

• Pig sector
– Major companies = routine or advanced testing

• Poultry sector
– Major companies = routine or advanced testing

• Beef industry
– Sparse use

• Sheep industry
– Australia and New Zealand = routine

• Ask any of the companies = It just works

Genomic selection

• Complete coverage of genome with markers

• All QTL in linkage disequilibrium with at least 1 marker

• No QTL size thresholds needed

• Accurate breeding values of individuals at birth

• Individual marker effects are poorly estimated but the sum is well estimated (law of large numbers)
– GWAS results are implicit

Genomic selection

• Meuwissen, Hayes, Goddard (2001) Genetics

• Complete coverage of genome with markers

• Exploits linkage disequilibrium between markers and QTL

• No QTL size thresholds needed

• Accurate breeding values of individuals at birth

Oversimplified
– Was based on common QTL of large size
– Relatively easy to find
– Work across the population
– Easy

BUT real data results indicate that the true model is polygenic

Various ways of doing a genomic prediction

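Rather than normal BLUP

$$
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \lambda A^{-1} \end{bmatrix}
\begin{bmatrix} b \\ g \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
$$

…use GBLUP

$$
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \lambda G^{-1} \end{bmatrix}
\begin{bmatrix} b \\ g \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
$$

G = the genomic relationship matrix

G consists of estimated relationships between individuals based on all markers → genomic relationships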
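The slide does not say how G is computed from the markers. The sketch below assumes one common construction (VanRaden's first method): genotypes coded 0/1/2 are centred by twice the allele frequency and the cross-product is scaled by 2Σp(1−p). The function name and the random toy genotypes are illustrative only.

```python
import numpy as np

def genomic_relationship_matrix(genotypes):
    """Build G from an (individuals x markers) array of 0/1/2 allele counts.

    Assumption: VanRaden method 1, with allele frequencies estimated from
    the same individuals that are in the matrix.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0               # allele frequency per marker
    Z = M - 2.0 * p                        # centre each marker column
    scale = 2.0 * np.sum(p * (1.0 - p))    # puts G on the same scale as A
    return Z @ Z.T / scale

# Toy data: 6 individuals, 200 markers, random genotypes (not real data).
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(6, 200))
G = genomic_relationship_matrix(toy_genotypes)
print(np.round(G, 2))
```

Because the markers are centred with frequencies from the same sample, each row of this G sums to zero; in practice frequencies are often taken from a base population instead.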

Genomic Relationships

“some animals are more equal than others” … even if the additive genetic relationship is the same

[Figure: genomic relationships; x-axis: additive genetic relation (0 to 1); labels: HS, FS, m=4]

e.g. actual relationship between HS can vary between 0.2 and 0.3

Genomic Prediction: GBLUP

Example: Data on sire 1, sons 2 and 3, animal 4 is unrelated, want to predict 5

A-matrix (pedigree-based)

       1      2      3      4      5
1      1      0.5    0.5    0      0.5
2      0.5    1      0.25   0      0.25
3      0.5    0.25   1      0      0.25
4      0      0      0      1      0
5      0.5    0.25   0.25   0      1

G-matrix (DNA-based)

       1      2      3      4      5
1      1      0.5    0.5    0.02   0.5
2      0.5    1      0.20   0.015  0.20
3      0.5    0.20   1      0.025  0.30
4      0.02   0.015  0.025  1      0.025
5      0.5    0.20   0.30   0.025  1

BLUP uses:
– Family info

GBLUP uses:
– Family info
– Segregation within family
– Info on ‘unrelated’
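To make the contrast concrete, the sketch below computes the weight each of animals 1 to 4 would receive when predicting animal 5, first with the A-matrix and then with the G-matrix from the example. It is a simplification: it assumes the breeding values of animals 1 to 4 are known without error, so the prediction reduces to the regression K[5, 1:4] · K[1:4, 1:4]⁻¹, whereas the full mixed model equations also account for residual variance. The function and variable names are illustrative.

```python
import numpy as np

# Relationship matrices from the example (animals 1 to 5).
A = np.array([
    [1.0, 0.5,  0.5,  0.0, 0.5],
    [0.5, 1.0,  0.25, 0.0, 0.25],
    [0.5, 0.25, 1.0,  0.0, 0.25],
    [0.0, 0.0,  0.0,  1.0, 0.0],
    [0.5, 0.25, 0.25, 0.0, 1.0],
])
G = np.array([
    [1.0,  0.5,   0.5,   0.02,  0.5],
    [0.5,  1.0,   0.20,  0.015, 0.20],
    [0.5,  0.20,  1.0,   0.025, 0.30],
    [0.02, 0.015, 0.025, 1.0,   0.025],
    [0.5,  0.20,  0.30,  0.025, 1.0],
])

def weights_for_animal_5(K):
    """Regression weights of animal 5 on animals 1 to 4 for relationship matrix K."""
    K_train = K[:4, :4]    # relationships among animals with data
    k_target = K[4, :4]    # relationships between animal 5 and those animals
    return np.linalg.solve(K_train, k_target)

w_A = weights_for_animal_5(A)
w_G = weights_for_animal_5(G)
print("pedigree weights:", np.round(w_A, 3))
print("genomic weights: ", np.round(w_G, 3))
```

With A, all of the weight (0.5) falls on the sire, and the half sibs and animal 4 contribute nothing. With G, son 3 (genomic relationship 0.30 with animal 5) receives more weight than son 2 (0.20), and the ‘unrelated’ animal 4 receives a small positive weight.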

Practical implementation of genomic selection

• Identify where in the breeding program it can be used

• Simulation and modelling

• Proof of concept experiment
– Guarantee success
– Overspend on genotypes and phenotypes

• Make it cheap
– Genotyping and phenotyping

• Get the IT and hardware infrastructure in place

• Automated delivery
– No human interference
– Let a selection index drive the breeding program

Where in the breeding program?

Breeding value has a number of components

• 50% due to parent average component
– Best prediction at birth is the average of the two parents’ breeding values

• 50% due to Mendelian sampling component
– Sampling of parents’ genes
– Reason for differences in:
  • a pair of full sibs at birth
  • a pair of F2 in a bi-parental
– Can make no prediction at birth
– Response to selection driven by:
  • Accuracy of, and time taken to estimate, the Mendelian sampling term
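The 50/50 split can be written out as a short derivation. This is the standard quantitative-genetics decomposition rather than something shown on the slide; it assumes non-inbred, unrelated parents, and the symbols (a_i for the breeding value of individual i, a_s and a_d for its sire and dam, m_i for the Mendelian sampling term, σ_a² for the additive genetic variance) are introduced here for illustration.

```latex
a_i = \tfrac{1}{2}\,(a_s + a_d) + m_i
\\[4pt]
\mathrm{Var}\!\left(\tfrac{1}{2}(a_s + a_d)\right)
  = \tfrac{1}{4}\sigma_a^2 + \tfrac{1}{4}\sigma_a^2
  = \tfrac{1}{2}\sigma_a^2
\\[4pt]
\mathrm{Var}(m_i) = \sigma_a^2 - \tfrac{1}{2}\sigma_a^2 = \tfrac{1}{2}\sigma_a^2
```

Full sibs share the first term exactly, so any difference in breeding value between them comes from the second.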

How to get accurate evaluation of Mendelian sampling term?

• Animals = progeny test
– Evaluate 100 daughters of a bull
– Massive cost and time

• Plants = replicated field trials
– Cost and time?
– Some cases not possible

• Genomic selection offers us an alternative
– Can get accurate prediction of Mendelian sampling term at birth/seed

• TIME AND MONEY – two potential benefits
– Accurate estimates of breeding value available earlier in the cycle
– Cheaper than phenotyping (some traits)
– Cost and time per unit of genetic gain

Genomic selection – what has been achieved?

• Performance measured by correlation with true breeding value

• Response to selection is linear with respect to this correlation

• Accuracy ranges from 0.0 to 0.90 for total breeding value
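The linearity follows from the breeder's equation, a standard result that is not written on the slide: response per year = i · r · σ_a / L, where i is the selection intensity, r the accuracy (correlation with true breeding value), σ_a the additive genetic standard deviation and L the generation interval. The sketch below is illustrative only; the function name and every numeric input are hypothetical.

```python
def response_per_year(intensity, accuracy, sigma_a, generation_interval):
    """Expected genetic gain per year from the breeder's equation."""
    return intensity * accuracy * sigma_a / generation_interval

# Hypothetical inputs: a slow, accurate scheme versus a fast, less accurate one.
slow_accurate = response_per_year(
    intensity=2.0, accuracy=0.9, sigma_a=1.0, generation_interval=6.0
)
fast_less_accurate = response_per_year(
    intensity=2.0, accuracy=0.6, sigma_a=1.0, generation_interval=2.0
)
print(slow_accurate, fast_less_accurate)
```

Doubling the accuracy doubles the response, and a lower accuracy can still give more gain per year if it is available early enough to shorten the generation interval.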

Accuracy of breeding values in chickens

How well can we evaluate the accuracy of the Mendelian sampling term?

Family    Accuracy of Mendelian sampling term
Average   0.24
1         -0.37*
2         -0.23*
3         -0.20*
4         0.21
5         0.37
6         0.54
7         0.69
8         0.87

Relationships drive accuracy of genomic selection

[Figure: accuracy of GEBV (y-axis, 0 to 0.9) against mean of the top ten relationships (x-axis, 0 to 0.6); R² = 0.962]

Marker density is not important

The promise and problem of sequence data

Summarising all of this

• Genomic selection is an excellent tool

• It has delivered in the industry

• Marker density not important

• Most of the predictive power comes from linkage or relationship information

• Validation population design needs to validate what you think it is validating

Make it cheap and routine

Is genomic selection expensive or cheap?

• Genomic selection will not compete with cheap and early phenotypes

• Likely to be useful where early phenotyping is not possible or is expensive

• Cost and time per unit of genetic gain

Is genomic selection expensive or cheap?

• Its value to the breeding program depends on:
– The accuracy of the Mendelian sampling term
– Cost and time of delivery of that accuracy compared to yield trials

• Can cost as little as $9 per candidate

• Pig industry implementing at a cost of $21 per candidate

• Marginal benefit
– Unlikely to be economical compared to cheap phenotyping (e.g. disease resistance)
– Not worth it for simple traits controlled by a single gene
– Could be useful for generating polygenic disease resistance?
– What about an EBV accuracy of 0.5 for yield and quality traits for $9 at the F2 stage?
– Or $21 for an accuracy of 0.7?
– Or combine phenotypic selection with GS to drive costs down further? $3

• Economics needs to be evaluated when part of a whole breeding system
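One rough way to compare the two quoted price points is cost per unit of accuracy, since response to selection is linear in accuracy. The sketch below is a crude comparison only: it ignores time, selection intensity and the rest of the breeding system, and the function and variable names are hypothetical.

```python
# (cost per candidate in dollars, EBV accuracy) pairs quoted on the slide.
options = {"$9 option": (9.0, 0.5), "$21 option": (21.0, 0.7)}

def dollars_per_unit_accuracy(cost, accuracy):
    """Cost per candidate divided by the accuracy it buys."""
    return cost / accuracy

for name, (cost, accuracy) in options.items():
    print(name, round(dollars_per_unit_accuracy(cost, accuracy), 2))

# Marginal cost of the extra accuracy when moving from the cheap to the dear option.
(cost_low, acc_low), (cost_high, acc_high) = options.values()
marginal = (cost_high - cost_low) / (acc_high - acc_low)
print("marginal dollars per extra unit of accuracy:", round(marginal, 2))
```

The averages come to about $18 and $30 per unit of accuracy, while the step from 0.5 to 0.7 costs about $60 per extra unit, which is why the marginal value has to be judged within the whole breeding system.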

Genotyping platform and imputation

• Two most important decisions to get right when implementing genomic selection are the
– Genotyping technology
– Phenotyping strategy

• Why we need imputation
– We need higher density to bring phenotypes from other families into the family we are trying to make selection decisions about
– Imputation can do this cheaply

The cost and accuracy of sensible strategies

nSires = 480
nDams = 11884
nCandidates = 100000

60k chip = $120
6k chip = $48
3k chip = $35
384 chip = $20

Scenario  Other  MGS + PGS  MGD + PGD  Sire  Dam   Candidates  Individual cost  Accuracy of imputation (R²)
SC1       60k    60k        0          60k   0     384         -                0.878
SC2       60k    60k        384        60k   384   384         $20.58           0.929
SC3       60k    60k        3k         60k   3k    384         $24.74           0.950
SC4       60k    60k        6k         60k   6k    384         $26.28           0.944
SC5       60k    60k        60k        60k   60k   384         $34.84           0.964
SC6       60k    60k        0          60k   0     3k          -                0.968
SC7       60k    60k        384        60k   384   3k          -                0.972
SC8       60k    60k        3k         60k   3k    3k          $35.58           0.984
SC9       60k    60k        6k         60k   6k    3k          $41.28           0.983
SC10      60k    60k        60k        60k   60k   3k          $49.84           0.993
SC11      60k    60k        0          60k   0     6k          -                0.982
SC12      60k    60k        384        60k   384   6k          -                0.983
SC13      60k    60k        3k         60k   3k    6k          -                0.986
SC14      60k    60k        6k         60k   6k    6k          $48.58           0.991
SC15      60k    60k        60k        60k   60k   6k          $62.84           0.996
SC16      60k    60k        60k        60k   60k   60k         $120.00          1.000
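The slide does not state how the individual cost was calculated. The sketch below uses a cost model inferred from the numbers, which should be read as an assumption: every candidate is genotyped on the candidate chip, and sires or dams add cost only when they are re-genotyped on a denser chip than the candidates, charged at the full chip price and spread over all candidates. Grandparent and other ancestor genotypes are ignored, and scenarios without a listed cost are not covered. The function and constant names are illustrative.

```python
# Chip prices and population sizes as listed on the slide.
CHIP_PRICE = {"384": 20.0, "3k": 35.0, "6k": 48.0, "60k": 120.0}
DENSITY_ORDER = ["384", "3k", "6k", "60k"]  # lowest to highest density
N_SIRES, N_DAMS, N_CANDIDATES = 480, 11884, 100000

def cost_per_candidate(sire_chip, dam_chip, candidate_chip):
    """Average genotyping cost per candidate under the assumed cost model."""
    total = N_CANDIDATES * CHIP_PRICE[candidate_chip]
    candidate_rank = DENSITY_ORDER.index(candidate_chip)
    for n_parents, chip in ((N_SIRES, sire_chip), (N_DAMS, dam_chip)):
        # Assumption: parents were already genotyped as candidates, so only
        # an upgrade to a denser chip adds cost.
        if DENSITY_ORDER.index(chip) > candidate_rank:
            total += n_parents * CHIP_PRICE[chip]
    return total / N_CANDIDATES

# (sire chip, dam chip, candidate chip) for a few scenarios from the table.
scenarios = {
    "SC2": ("60k", "384", "384"),
    "SC3": ("60k", "3k", "384"),
    "SC9": ("60k", "6k", "3k"),
    "SC16": ("60k", "60k", "60k"),
}
for name, chips in scenarios.items():
    print(name, f"${cost_per_candidate(*chips):.2f}")
```

Under this assumed model the function gives $20.58, $24.74, $41.28 and $120.00 for these four scenarios, matching the table, and it also matches the other scenarios that list a cost.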

Imputation and GEBV accuracy

Genotyping scenario and imputation accuracy (450, 3k, 6k)

Scenario  Other  PGS+MGS  PGD+MGD  Sire  Dam   Progeny  450   3k    6k
          4220   74       108      70    107   184
S1        H      H        H        H     H     L        0.97  0.99  1.00
S2        H      0        0        H     H     L        0.95  0.98  0.99
S3        H      H        0        H     0     L        0.91  0.97  0.98
S4        H      H        L        H     L     L        0.94  0.99  0.99

Genotyping scenario and imputed gEBV accuracy (450, 3k, 6k)

Scenario  N HD Geno.  Other  PGS+MGS  PGD+MGD  Sire  Dam   Progeny  450   3k    6k
S1        2519        H      H        H        H     H     L        0.94  0.97  0.97
S2        2344        H      0        0        H     H     L        0.89  0.95  0.96
S3        2318        H      H        0        H     0     L        0.87  0.92  0.93
S4        2318        H      H        L        H     L     L        0.90  0.96  0.97

How can imputation be harnessed in a plant breeding program?

• Firstly we need a physical map
– Approximate map is probably sufficient
– Sheep map was initially a virtual map of cattle genome
– Replaced by real sheep map once genome was sequenced

• Genotyping costs
– $5 for 50 SNP
– $2 for every subsequent SNP added
– $9 for 200 SNP
– $21 for 450 SNP
– $100 to $150 (approx.) for 50,000 SNP

• Harness the pedigree and low density genotyping platforms to make low cost genomic selection work

• I like SNP chips for routine genomic selection
– Turnaround time
– Quality
– Consistency

You don’t need to genotype all individuals

• Scenario 1 - Train using the 3200 HD animals

• Scenario 2 - Train using the 3200 HD and 2764 completely ungenotyped animals

• Predict in the 509 testing animals

• Growth trait with h² of 0.61

• Results - correlation with progeny test EBV from BLUP

            Unrestricted  Restricted
Scenario 1  0.42          0.51
Scenario 2  0.49          0.62

Get the IT and hardware infrastructure in place and remove human intervention!!

Things to consider in a plant breeding program

How a pig breeding program tackled the implementation of genomic selection

• One approach cannot answer all questions cheaply and quickly

• Simulation study to understand the dynamics of GS in species and breeding program

• On basis of simulation design an experiment using a set of highly related individuals
– Cognizant of issues specific to pigs
– Retrospective experiment allowed results within months
– Experimental animals genotyped with the most expensive and highest quality genotyping technology available

• Once accurate results and clear understanding achieved
– Develop a low cost strategy to achieve THE SAME ACCURACY

• Routine implementation on the basis of the low cost strategy
– 100,000 candidates to selection at a cost of $22 per candidate

• 4 year project – cost?
– Genotypes – 1 million dollars
– Phenotypes free (proof of concept was done with historical data from database)
– External research and development contracts – circa $500,000
– Two internal staff full time / Some more part time
– Good database and IT infrastructure was the key

Similar approach to genomics in plant breeding program

• Simulate alternative strategies
– Economic appraisal
– Cost + time + accuracy
– What is the marginal value?

• Low cost strategy based on imputation works in livestock
– Evaluate possibility in plant breeding program (pilot genotyping experiment cost $5000)
– No physical map = problem = can it work without imputation?

• On the basis of simulation and pilot genotyping experiment set up a field trial
– Separation of issues
– GxE
– Genotyping platform
– Proof of concept
– Experiment needs to be focussed on specific questions
– Focus on design with high chance of success (e.g. within family selection at F2)

Conclusion

• Very, very, very careful validation experiment/design

• What selection decisions do we want to make with genomics?

• Unlikely to compete with cheap and early phenotyping

• How accurate is the estimate of the Mendelian sampling term?

• Cost and time of this accuracy

• IT and hardware infrastructure