The potential of advanced approaches for accurate genotyping in grapevines
N. ŠTAJNER1, L. BITZ2, B. ŠKRLJ1, J. JAKŠE1, R. BACILIERI3 and B. JAVORNIK1
1 University of Ljubljana, Biotechnical Faculty, Agronomy Department, Slovenia
2 Natural Resources Institute Finland, Alimentum – Myllytie 1, 31600 Jokioinen, Finland
3 UMR AGAP, Equipe Diversité et Adaptation de la Vigne et des Espèces Méditerranéennes, INRA, Montpellier, France
Objective: Genotyping
Variety identification
• grape growers
• winemakers
• regulatory authorities
• consumers
• germplasm collections
AREA Conference - Belgrade, 18-20 April 2016
• ampelography
• chemical and biochemical methods
• molecular methods
• RFLP
• RAPD
• AFLP
• SNP
• SSR
Microsatellite markers
• highly abundant in the genome
• distributed throughout the genome
• highly polymorphic (replication slippage once per 1,000 generations)
TATATATATA
GTCGTCGTCGTCGTC
High throughput analysis of grape genetic diversity
Laucou et al., 2011, TAG
• 4,370 accessions (Vitis vinifera subsp. sativa (3,727), V. vinifera subsp. sylvestris (80), interspecific hybrids (364), and rootstocks (199) - INRA collection)
• 20 SSR markers
→ 2,836 SSR unique genotypes
Cipriani et al., 2010, TAG
• 1,148 accessions (from the Italian national repositories)
• genotyped at 34 microsatellite loci
→ 745 unique genotypes
Emanuelli et al. 2013, BMC Plant Biology
• 2,273 accessions (from the San Michele all'Adige collection in Italy)
• 22 microsatellites and 384 SNPs
→ 1,085 unique genotypes
Towards the preservation of autochthonous grapevine varieties in WBC
COLLECTION of autochthonous varieties
• National grapevine collections
• Local vineyards
• Nurseries or individual farms
Sampling in WBC
• Slovenia (3 locations, 52 samples)
• BIH (9 locations, 80 samples)
• Montenegro (8 locations, 24 samples)
• Serbia (1 location, 62 samples)
• Macedonia (7 locations, 15 samples)
233 grapevine samples
autochthonous/local grapevines
MATERIAL
Main agronomic features
Photo 1: Bagrina
Photo 2: Bakator beli
Photo 3: Bela Dinka
Photo 4: Medenac beli
AIM OF THE PROJECT
MOLECULAR CHARACTERIZATION
• Establish microsatellite profiles
• Identify genotypes
• Determine synonyms and homonyms
• Evaluate genetic relationships
• Reveal genetic structuring
Comparative analysis (intra- and inter-group)
Intra-group analysis
22 microsatellites
233 samples / 147 unique
UNIQUE (63%)
REDUNDANCY (37%)
• Misnomers (planting mistakes) (6%)
• Synonyms (8%)
• Associations of duplicates/clones (23%)
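The split into unique and redundant samples can be illustrated by grouping samples whose multilocus SSR profiles are identical. The following is a minimal sketch, not the authors' pipeline: the function name `group_identical_profiles`, the sample names and the allele sizes are hypothetical, and real data would also need a rule for missing loci and small scoring differences.

```python
from collections import defaultdict


def group_identical_profiles(profiles):
    """Group sample names that share an identical multilocus SSR profile.

    profiles: dict sample -> dict locus -> (allele_a, allele_b) in bp.
    """
    groups = defaultdict(list)
    for sample, loci in profiles.items():
        # Sort alleles within each locus so (133, 143) and (143, 133) compare equal.
        key = tuple((locus, tuple(sorted(alleles)))
                    for locus, alleles in sorted(loci.items()))
        groups[key].append(sample)
    return list(groups.values())


# Hypothetical toy data, not values from the study.
toy = {
    "Sample A": {"VVS2": (133, 143), "VVMD5": (226, 238)},
    "Sample B": {"VVS2": (143, 133), "VVMD5": (226, 238)},
    "Sample C": {"VVS2": (133, 151), "VVMD5": (226, 226)},
}
groups = group_identical_profiles(toy)
unique_count = len(groups)                 # number of distinct genotypes
redundancy = 1 - unique_count / len(toy)   # share of samples that add no new genotype
```

Applied to the figures on the slide, 147 unique genotypes out of 233 samples gives the reported 63% unique and 37% redundant.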
Same name, same genotype, different location
Duplicates (23%)
Belo Zimsko MAK = Belo Zimsko SRB
Bena I BIH = Bena IX BIH
Blatina I BIH = Blatina SRB
Dobrogostina I BIH = Dobrogostina IX BIH
Kambuša I = Kambuša IX
Kratošija I MNE = Kratošija II MNE = Kratošija IV MNE
Kratošija MAK = Kratošija SRB
Krkošija I = Krkošija IX
Plavka I BIH = Plavka IX BIH
Plavka IV BIH = Plavka VI BIH
Prokupac IX BIH = Prokupac III MNE = Prokupac IV MNE
Radovača II BIH = Radovača III BIH
Refošk SLO = Star Refošk SLO = Refošk Istra SLO = Refošk Teranovka SLO
Smederevka BIH = Smederevka MAK = Smederevka SRB
Vranac I BIH = Vranac SRB = Vranac MAK
Vranac V MNE = Vranac VI MNE
Žilavka VII BIH = Žilavka VIII BIH = Žilavka MAK = Žilavka IX MNE = Žilavka SRB
Žlozder I BIH = Žlozder IX BIH
Župljanka I BIH = Župljanka SRB
Same name, different genotype, different location
Misnomers – planting/naming mistakes (6%)
Bena III BIH ≠ (Bena I BIH = Bena IX BIH)
Blatina IX BIH ≠ (Blatina I BIH = Blatina SRB)
Krkošija III BIH ≠ (Krkošija IX BIH = Krkošija SRB)
Menigovka I BIH ≠ Menigovka II BIH
Podbil I BIH ≠ Podbil IX BIH
Prokupac SRB ≠ (Prokupac III MNE = Prokupac IV MNE = Prokupac IX BIH)
Radovača IX BIH ≠ (Radovača II BIH = Radovača III BIH)
Razaklija IX BIH ≠ (Rezaklija V BIH = Rezaklija VI MNE)
Ružica V MNE ≠ Ružica VI MNE ≠ Ružica SRB
(Kratošija I MNE = Kratošija II MNE = Kratošija IV MNE) ≠ (Kratošija MAK = Kratošija SRB)
(Vranac I BIH = Vranac SRB = Vranac MAK) ≠ (Vranac V MNE = Vranac VI MNE)
Different name, same genotype
Synonyms (8%)
Begljerka Bela MAK = Bele Kozije Sise SRB = Bijela Prošip I BIH = Drenak Beli SRB
Bela Zgodnja SLO = Petrovka IX BIH
Bena III BIH = Krkošija I BIH = Krkošija IX BIH = Krkošija SRB
Blatina I BIH = Stara Blatina I BIH = Stara Blatina VII = Stara Blatina VIII
Blatina VII BIH = Blatina VIII BIH = Plavka I BIH = Plavka IX BIH
Blatina IX = Krkošija III BIH = Žilavka BIH I = Žilavka II = Žilavka III = Žilavka V = Žilavka VII = Žilavka VIII = Žilavka MAK = Žuta Žilavka I BIH
Crljenak IX BIH = Kratošija I MNE = Kratošija II MNE = Kratošija IV MNE = Vranac III MNE = Vranac IV MNE
Crna Prošip I BIH = Dobrogostina I BIH = Dobrogostina IX BIH
Čauš MAK = Čauš Crveni SRB
Dalmatinka VI BIH = Kambuša I BIH = Kambuša IX BIH = Podbil IX BIH
Frankovka SRB = Sura Lisičina SRB
Kadarun II BIH = Kadarun IV BIH = Podbil I BIH = Surac IV BIH
Kadarun Bijeli BIH = Smederevka BIH = Smederevka MAK = Smederevka SRB
Mala Blatina I BIH = Prošip I BIH
Mala Blatina VII BIH = Mala Blatina VIII BIH = Vranac I BIH = Vranec MAK = Vranac SRB
Ohridsko Crno MAK = Stanušina MAK
Plavka IV BIH = Plavka VI BIH
Prokupac IX BIH = Prokupac III MNE = Prokupac IV MNE = Surac VI
Rezaklija VI MNE = Rezaklija V BIH
Ružica IV MNE = Slankamenka Crvena SRB
SYNONYMS
e.g. Smederevka SRB = Kadarun Bijeli BIH
e.g. Stanušina = Ohridsko crno MKD
e.g. Žilavka BIH = Žuta Žilavka BIH
32 groups of SYNONYMS
TRUE-TO-TYPE
e.g. Smederevka SRB = Smederevka MNE = Smederevka BIH
e.g. Žilavka BIH = Žilavka MKD = Žilavka SRB
70 groups of TRUE-TO-TYPE
HOMONYMS
e.g. Prokupac SRB ≠ Prokupac SRB ≠ Prokupac (DVC)
3 groups of HOMONYMS
MISTAKES
e.g. Krkošija III BIH ≠ (Krkošija IX BIH = Krkošija SRB)
e.g. Menigovka I BIH ≠ Menigovka II BIH
10 groups of MISNOMERS
147 UNIQUE GENOTYPES
ANALYSIS OF GENETIC RELATEDNESS
Nei genetic distance:
• the highest genetic distance between SLO and MAK cultivars
• the lowest between SLO and SRB cultivars.
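As an illustration of how a Nei distance between two country groups can be computed from allele frequencies, here is a minimal sketch. It assumes Nei's (1972) standard genetic distance, D = -ln(Jxy / sqrt(Jx·Jy)); the slide does not say which Nei measure or software was used, and the function name and toy frequencies are hypothetical.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance between two populations.

    freqs_x, freqs_y: dict locus -> dict allele -> frequency within that population.
    Only loci present in both populations are used.
    """
    jx = jy = jxy = 0.0
    for locus in set(freqs_x) & set(freqs_y):
        px, py = freqs_x[locus], freqs_y[locus]
        alleles = set(px) | set(py)
        jx += sum(px.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(py.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(px.get(a, 0.0) * py.get(a, 0.0) for a in alleles)
    # Averaging each sum over loci would cancel in the ratio, so it is omitted.
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical toy frequencies, not values from the study.
pop_a = {"L1": {100: 0.5, 102: 0.5}}
pop_b = {"L1": {100: 1.0}}
d_ab = nei_standard_distance(pop_a, pop_b)
d_aa = nei_standard_distance(pop_a, pop_a)
```

Identical populations give a distance of zero, and the value grows as the two allele-frequency profiles diverge.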
[Figure: clustering of genotypes, with groups labelled I to IX]
Proportion of shared alleles among Balkan cultivars = 35 %
Genotypes clustered into nine groups:
MAK grapevines ≠ groups I, II, IV, VIII, IX
European cultivars (group IV) ≠ MAK, MNE
BIH, SRB, SLO cultivars equally dispersed among all nine clusters
MNE, MAK cultivars = separated gene pool
RUPESTRIS
Core collection – entire allelic richness (242 alleles)
• 63 genotypes out of 147 (43%)
• 50% genotypes from each country
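One common way to build a core collection that captures all alleles is a greedy selection that repeatedly adds the genotype contributing the most not-yet-covered alleles. The sketch below shows that idea only; it is an assumption, not the method used for the 63-genotype core, and it does not model the per-country constraint mentioned on the slide. Names and data are hypothetical.

```python
def greedy_core_collection(genotypes):
    """Pick genotypes until every observed allele is represented.

    genotypes: dict name -> set of (locus, allele) pairs carried by that genotype.
    Returns the selected names in order of selection.
    """
    all_alleles = set().union(*genotypes.values())
    covered, core = set(), []
    while covered != all_alleles:
        # Ties are broken alphabetically because max() keeps the first maximum.
        best = max(sorted(genotypes), key=lambda g: len(genotypes[g] - covered))
        core.append(best)
        covered |= genotypes[best]
    return core


# Hypothetical toy data, not values from the study.
toy_genotypes = {
    "A": {("L1", 100), ("L1", 102), ("L2", 200)},
    "B": {("L1", 100), ("L2", 200)},
    "C": {("L1", 104), ("L2", 202)},
}
core = greedy_core_collection(toy_genotypes)
```

Here genotype B adds no allele beyond those of A, so the core consists of A and C.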
STRUCTURE
• assigned to a cluster when 85% or more of their inferred genome belonged to the cluster
• groups identified according to the classification of Negrul:
• 1) representatives from 4 countries, excluding Slovenia
• 2) group lacks Macedonian and Montenegrin genotypes
• 3) grapevines from Montenegro, Macedonia and BiH
• 4) representatives from Western Europe and Slovenia
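The 85% assignment rule can be written as a small helper operating on the membership coefficients that STRUCTURE reports for each genotype. This is a sketch under the assumption that the coefficients for one genotype are available as a list summing to 1; the function name and the example values are hypothetical.

```python
def assign_cluster(membership, threshold=0.85):
    """Return the 1-based cluster index if its coefficient reaches the threshold.

    membership: list of inferred ancestry proportions, one per cluster.
    Returns None for genotypes that stay unassigned (admixed).
    """
    best = max(range(len(membership)), key=lambda k: membership[k])
    return best + 1 if membership[best] >= threshold else None


assigned = assign_cluster([0.90, 0.05, 0.03, 0.02])   # clearly cluster 1
borderline = assign_cluster([0.10, 0.85, 0.05, 0.00]) # exactly at the threshold
admixed = assign_cluster([0.50, 0.40, 0.10, 0.00])    # no cluster reaches 85%
```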
Pontica Balcanica Pontica Occidentalis Gallica Orientalis Caspica
Inter-group comparative analysis
• to standardize allele scoring by defining reference alleles
• well-known reference cultivars
• Barbera, Cabernet Sauvignon, Chardonnay, Merlot, Pinot Noir, Sultanine, and Touriga National
• Only some of the microsatellite markers were compared (VVS2, VVMD5, VVMD7, VVMD27, ssrVrZAG62, ssrVrZAG79)
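Standardizing allele scoring against reference cultivars amounts to finding, for each locus, the constant shift between the sizes scored locally for a reference cultivar and the published sizes for the same cultivar, then applying that shift to every sample. The sketch below assumes a constant per-locus shift; the function names and all allele sizes are hypothetical illustrations, not data from the study.

```python
def locus_offsets(observed_ref, published_ref):
    """Per-locus shift that maps locally scored sizes onto published sizes.

    Both arguments: dict locus -> (allele_a, allele_b) for the SAME reference cultivar.
    """
    offsets = {}
    for locus, observed in observed_ref.items():
        diffs = {pub - obs for obs, pub in
                 zip(sorted(observed), sorted(published_ref[locus]))}
        if len(diffs) != 1:
            # The two alleles disagree on the shift, so a simple offset is not enough.
            raise ValueError(f"inconsistent shift at {locus}: {sorted(diffs)}")
        offsets[locus] = diffs.pop()
    return offsets


def standardize(profile, offsets):
    """Apply the per-locus shifts to one sample profile."""
    return {locus: tuple(a + offsets[locus] for a in alleles)
            for locus, alleles in profile.items()}


# Hypothetical sizes for one reference cultivar at one locus.
offsets = locus_offsets({"VVS2": (135, 141)}, {"VVS2": (137, 143)})
standardized = standardize({"VVS2": (133, 151)}, offsets)
```

Using several reference cultivars per locus, as the slide lists, makes it possible to check that the shift really is constant across the allele size range.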
Standardization
Inter-group analysis
Bojanka MAK = Garnacho (or Morrastel-Bouschet) = Grand Noir (University of California Davis collection)
Crven Valandovski Drenok = Geisstutte (Sefc et al., 2000) = Cornichon Violet (Davis)
Frankovka SRB = Sura Lisičina SRB = Franconia (Grando, 2000) = Limberger (Davis) (Galet, 2000)
Kadarka Bela SRB = Kadarka Byala BU, Slankamenka Crvena SRB = Pamid BU and Smederevka = Dimyat BU (Bulgarian cultivars genotyped by Dzhambazova et al., 2009)
Furmint SRB = Knipperle (Davis) = Ortlieber (Sefc et al., 2000)
Alicante Bouchet = Kambuša/Dalmatinka = Alicante Henri Bouchet (Lopes et al., 2006), Garnacha Tinta (de Mattia et al., 2009) = Garnacha Tintorea (Sefc et al., 2000; Ibanez et al., 2003)
Menigovka = Italia = Perd’e Sali (Zecca et al., 2010)
Kratošija = Zinfandel and Primitivo (Davis) (Calo et al., 2008)
Muskat Krokan SRB = Muscat Fleur d’Oranger (Gianenetto et al., 2010)
Plemenka Nova SRB = Pirovano 2 = Angelo Pirovano
Portugizac SRB = Portugais Bleu (Davis)
Plavka BIH = Plavina HR (Maletić et al., 1999) = Brajdica = Plavina Crna (Davis)
Tamjanika Crna SRB = Muškat Ruža Porečki (Maletić et al., 1999) = Moscato Rosa (Crespan et al., 1999)
Variety | Country | Prime name | DNA reference / Putative synonyms
Bela Zgodnja | SLO | Perla von Csaba | Sefc et al. 1998, Ibanez et al. 2009, Santana et al. 2010, UCD
Cilibarka | SRB | Cilibarka |
Grk | BIH | ? | distinct from Grk in Pejic et al. (2000) and in CVD
NN | BIH | | Sljiva IX (below), except in VVMD28
NPS1 | BIH | Diretta Bianca | (Carimi et al. 2010)
Prosip Bijela | BIH | Vela Pergolla | Maletic et al. 1999; Sefc et al. 2001; Volovnik (Stajner et al. 2008)
Radovaca | MNE | Afus Ali | Dattier de Beyrouth (UCD); Mennavacca (Zulini et al. 2002); Regina (Crespan et al. 1999)
Ruzica VI | MNE | Pamid | Dzhambazova et al. 2009; BD; Slankamenka (below)
Ruzica V | MNE | Garvan | Dzhambazova et al. 2009
Slankamenka Crvena | SRB | Pamid | Dzhambazova et al. 2009; BD; Ruzica VI (above)
147 UNIQUE GENOTYPES
“NEW” IDENTITY
TRUE-TO-TYPE
HOMONYMS
DISCOVERIES OF IDENTITY
“NEW” IDENTITY
DISCOVERY OF MISTAKES
110 UNIQUE GENOTYPES
Ruzica VI MNE
Which is RUZICA ???
Ruzica SRB
Ruzica IV MNE
Slankamenka Crvena SRB
Gavran (SRB)
Pamid (BNC)
Garvan (BNC)
Unique Genotype
≠≠
=≠
Some new questions …
and answers ….
Linking of two sets of SSR data
1,130 grapevine genotypes held at the INRA Vassal, France
14 SSR markers
110 unique genotypes from Western Balkan region
Laucou et al., 2011, TAG
Lacombe et al., 2013, TAG
Bacilieri et al., 2013, BMC Plant Biol.
Synonyms/counterparts
Balkan sample | INRA sample (LACOMBE et al. 2013)
‘Bela Zgodnja’ (SVN) | ‘Perle de Csaba’ (#1069) (HUN)
‘Čauš Bel’ (MKD) | ‘Chaouch rose’ (#1674) (TUR)
‘Drenak’ (BIH) | ‘Rosa menna di vacca’ (#1662) (ROU)
‘Drenak Beli’ (SRB) | ‘Coarna alba’ (#749) (ROU)
‘Drenak Crni’ (SRB) | ‘Darkaia noir’ (#728) (MAR)
‘Elezovka’ (BIH) | ‘Chaouch blanc’ (#1673) (TUR)
‘Furmint’ (SRB) | ‘Knipperle’ (#283) (FRA)
‘Gnjet’ (SVN) | ‘Piccola nera’ (#2387) (ITA)
‘Harslevelu’ (SRB) | ‘Harslevelu’ (#1609) (HUN)
‘Kratosija’ (MNE, MKD) | ‘Primitivo’ = ‘Zinfandel’ (#1277) (ITA)
‘Marburger’ (SVN) | ‘Neuburger’ (#2172) (AUT)
‘Menigovka’ (BIH) | ‘Italia’ = ‘Pirovano’ 65 (#926) (ITA)
‘Muskat Ruza’ (SRB) | ‘Muscat rouge de Madère’ = ‘Moscato violetto’ (#576) (ITA)
‘Plavka’ (BIH) | ‘Plavina crna’ (#1843) (HRV)
‘Prokupac’ (BIH) | ‘Prokupac’ (#1630) (SCG)
‘Radovaca’ (MNE) | ‘Dattier de Beyrouth’ = ‘Afuz Ali’ (#634) (TUR)
‘Refosk’ (SVN) | ‘Terrano’ (#1293) (ITA)
‘Rezaklija’ (BIH) | ‘Razachie rosie’ (#1887) (ROU)
‘Ruzica’ (SRB) | ‘Kovidinka’ (#1578) (YUG)
‘Šipon’ (SVN) | ‘Furmint’ (#25) (HUN)
‘Smederevka’ (SRB) | ‘Dimiat’ (#1666) (BGR)
‘Srem Zelenika’ (SRB) | ‘Szeremi zöld’ (#1623) (HUN)
‘Trbljan Beli’ (SRB) | ‘Mostosa’ = ‘Empibotte bianco’ (#2054) (ITA)
‘Žilavka’ (BIH) | ‘Zilavka’ (#1637) (BIH)
Parentages
Offspring | First candidate | Second candidate | Trio loci compared
‘Dobrogostina’ (BIH) = ‘Stara Žilavka’ (BIH) | ‘Krkošija’ (BIH) | ‘Rosenmuskat’ (#3550) (DEU) | 14
‘Godominka’ (SRB) | ‘Muscat Ottonel’ (#280) (FRA) | ‘Smederevka’ (SRB) = ‘Dimiat’ (#1666) (BGR) | 14
‘Gročanka’ (SRB) | ‘Bela Zgodnja’ (SVN) = ‘Perle de Csaba’ (#1069) (HUN) | ‘Radovaca’ (MNE) = ‘Dattier de Beyrouth’ = ‘Afuz Ali’ (#634) (TUR) | 15
‘Krivaja’ (SRB) | ‘Drenak’ (BIH) = ‘Rosa menna di vacca’ (#1662) (ROU) | ‘Rezaklija’ (BIH) = ‘Razachie rosie’ (#1887) (ROU) | 14
‘Manastirsko Belo’ (MKD) | ‘Drenak’ (BIH) = ‘Rosa menna di vacca’ (#1662) (ROU) | ‘Heptakilo’ (#743) (GRC) | 14
‘Polšakica’ (SVN) | ‘Malvasia del Chianti’ (#1352) (ITA) | ‘Prošip Bijela’ (BIH) | 14
‘Prokupac’ (SRB) | ‘Prokupac’ (BIH) = ‘Prokupac’ (#1630) (SRB) | ‘Refošk’ (SVN) = ‘Terrano’ (#1293) (ITA) | 14
‘Šljiva’ (BIH) | ‘Drenak Beli’ (SRB) = ‘Coarna alba’ (#749) (ROU) | ‘Heptakilo’ (#743) (GRC) | 14
‘Žilavka’ (BIH) | ‘Albaimputotato’ (#44) (ROU) | ‘Dobrogostina’ (BIH) | 14
‘Bagrina’ (SRB) | ‘Beli Medenac’ (SRB) | ‘Braghina=Dinkavoros’ (#1670) (HUN) | 13
‘Kreaca’ (SRB) | ‘Drenak Beli’ (SRB) = ‘Coarna alba’ (#749) (ROU) | ‘Plavac Mali’ (#3144) (HRV) | 14
‘Prošip’ (BIH) | ‘Drenak Beli’ (SRB) = ‘Coarna alba’ (#749) (ROU) | ‘Gouaisblanc’ = ‘Heunischweiss’ (#211) (FRA) | 14
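The "trio loci compared" column refers to loci at which an offspring and both candidate parents were scored. A trio is compatible at a locus when one offspring allele can come from each parent. The sketch below illustrates only that exclusion check; it is not the likelihood-based analysis a dedicated parentage program would run, and the function name and allele sizes are hypothetical.

```python
def trio_compatible_loci(offspring, parent1, parent2):
    """Count loci where the offspring genotype fits Mendelian inheritance.

    Each argument: dict locus -> (allele_a, allele_b).
    Returns (compatible_loci, compared_loci); loci missing in either parent are skipped.
    """
    compared = compatible = 0
    for locus, (a, b) in offspring.items():
        if locus not in parent1 or locus not in parent2:
            continue
        compared += 1
        p1, p2 = set(parent1[locus]), set(parent2[locus])
        # One allele must be traceable to each parent, in either order.
        if (a in p1 and b in p2) or (b in p1 and a in p2):
            compatible += 1
    return compatible, compared


# Hypothetical toy genotypes, not values from the study.
child = {"VVS2": (133, 143), "VVMD5": (226, 238), "VVMD7": (239, 249)}
mother = {"VVS2": (133, 151), "VVMD5": (226, 232), "VVMD7": (239, 247)}
father = {"VVS2": (143, 143), "VVMD5": (238, 240), "VVMD7": (247, 255)}
result = trio_compatible_loci(child, mother, father)
```

In this toy case the third locus is incompatible because neither parent carries allele 249, so the trio matches at two of three loci.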
Origin/group 1 2 3 4
AFG 100.0 0.0 0.0 0.0
ARG 26.7 0.0 73.3 0.0
ARM 62.5 0.0 37.5 0.0
AUS 27.3 9.1 54.5 9.1
AUT 23.1 38.5 7.7 30.8
AZE 0.0 0.0 100.0 0.0
BEL 0.0 20.0 80.0 0.0
BGR 18.8 25.0 56.3 0.0
BIH 19.0 76.2 4.8 0.0
CHE 0.0 0.0 0.0 100.0
CYP 50.0 0.0 50.0 0.0
CZE 0.0 0.0 50.0 50.0
DEU 3.6 9.1 9.1 78.2
DZA 85.7 0.0 14.3 0.0
ESP 63.2 0.0 21.1 15.8
FRA 18.8 24.1 11.9 45.2
GBR 0.0 0.0 100.0 0.0
GEO 50.0 50.0 0.0 0.0
GRC 36.4 27.3 36.4 0.0
HRV 40.0 50.0 10.0 0.0
HUN 8.2 58.8 28.2 4.7
IRN 42.9 0.0 42.9 14.3
ISR 10.0 10.0 80.0 0.0
ITA 20.8 24.0 49.0 6.3
LBN 0.0 0.0 100.0 0.0
MAR 57.1 0.0 28.6 14.3
MDA 0.0 50.0 50.0 0.0
MEX 100.0 0.0 0.0 0.0
MKD 0.0 75.0 25.0 0.0
MNE 0.0 100.0 0.0 0.0
nd 43.8 6.3 37.5 12.5
NLD 0.0 0.0 100.0 0.0
PER 0.0 0.0 100.0 0.0
PRT 53.4 2.7 21.9 21.9
ROU 11.1 44.4 40.7 3.7
RUS 42.9 50.0 7.1 0.0
SCG 0.0 25.0 75.0 0.0
SRB 5.6 72.2 16.7 5.6
SVN 9.1 81.8 9.1 0.0
SYR 60.0 0.0 40.0 0.0
TCH 0.0 33.3 0.0 66.7
TJK 100.0 0.0 0.0 0.0
TKM 100.0 0.0 0.0 0.0
TUN 50.0 12.5 37.5 0.0
TUR 20.0 60.0 20.0 0.0
UKR 40.0 33.3 20.0 6.7
URS 16.7 25.0 50.0 8.3
USA 31.8 0.0 63.6 4.5
UZB 63.6 0.0 36.4 0.0
YUG 0.0 100.0 0.0 0.0
ZAF 8.3 0.0 75.0 16.7
MNE 100
SVN 82
BIH 76
MKD 75
SRB 72
YUG 100
TUR 60
HUN 59
RUS 50
HRV 50
Characterization of the STRUCTURE groups according to geography and use. Bacilieri et al. BMC Plant Biology 2013 13:25
Principal Component Analysis (PCA)
[Figure: PCA plots with groups labelled Far- and Middle-East, Western and Central Europe, Balkan and East Europe]
Ruzica VI MNE
Is this true-to-type RUZICA ???
Ruzica SRB
Ruzica IV MNE
Slankamenka Crvena SRB
Pamid (BNC)
Garvan (BNC)
Unique Genotype
≠≠
=
WHO IS RUZICA ?!
X Kovidinka (#1578)
NO!
≠
What about Gavran (SRB)
Conclusions
• 233 WB cultivars, 57 unique
• 40 accessions (83%) are structured within the group of Balkan and Eastern genotypes.
• valuable for conservation,
• future investigations and
• breeding purposes.
8%
9%
83%
A3-Western Europe group
B3-Far- and Middle-East group
C3-Balkan and Eastern Europe
CONCLUSION
NGS-SSR-genotyping protocol
• using Ion Proton or Ion PGM sequencers or the MiniSeq system of Illumina
1) Library preparation
2) Sequencing (Ion Plus Fragment Library kit)
3) Mapping of the reads to loci specific sequences;
4) Trimming of primers to obtain full length sequences of microsatellite amplicons,
5) Counting of repeats,
6) Analysis of the frequency of the discovered microsatellites to determine true alleles of the genotypes.
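Steps 5 and 6 of the protocol can be sketched as follows: count the longest uninterrupted run of the repeat motif in each trimmed read, tally the counts, and keep the one or two most frequent values as alleles. This is an illustrative simplification, not the authors' pipeline; the `min_ratio` cut-off used to separate a true second allele from stutter or sequencing noise is a hypothetical parameter.

```python
import re
from collections import Counter


def count_repeats(sequence, motif):
    """Number of repeat units in the longest uninterrupted run of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


def call_alleles(repeat_counts, min_ratio=0.3):
    """Call one or two alleles from per-read repeat counts.

    The second most frequent count is accepted as an allele only if it is
    supported by at least min_ratio times the reads of the most frequent one.
    """
    top = Counter(repeat_counts).most_common(2)
    if not top:
        return ()
    if len(top) == 1 or top[1][1] < min_ratio * top[0][1]:
        return (top[0][0],)                      # homozygous call
    return tuple(sorted(n for n, _ in top))      # heterozygous call


# The two example repeats shown on the microsatellite slide.
ta_units = count_repeats("TATATATATA", "TA")
gtc_units = count_repeats("GTCGTCGTCGTCGTC", "GTC")

# Hypothetical per-read counts for two samples.
heterozygous = call_alleles([5] * 40 + [7] * 35 + [4] * 6)
homozygous = call_alleles([5] * 50 + [4] * 5)
```

A real workflow would also have to handle interrupted or compound repeats and reads that do not span the whole amplicon.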
Output – Ion PGM
Chardonnay_sample | Counts | Percentages
Total reads raw | 286,214 | 100%
Total nucleotides | 44,624,468 | 100%
Total trimmed reads | 269,756 | 94.25%
Total trimmed nucleotides | 42,649,743 | 95.57%
Mapped reads | 269,150 | 94.04% (of total)
Not mapped reads | 606 | 0.21% (of total)
Loci
De-novo assembly
Reference/Loci name | Reference length | Mapped reads | Non-perfect matches | Average coverage | Total nucleotides in data set | Allele1 length | Allele1 count | Allele2 length | Allele2 count
VVS2 | 137 | 27,079 | 18255 | 19,740.32 | 2,754,490 | 144 | 3995 | 138 | 2274
VVMD5 | 227 | 9,887 | 9874 | 8,451.77 | 1,999,474 | 233 | 5054 | |
VVMD7 | 246 | 35,960 | 30730 | 26,393.76 | 6,371,400 | 242 | 2967 | 238 | 8864
VVMD25 | 241 | 33,369 | 23615 | 22,921.37 | 5,623,129 | 241 | 801 | 239 | 1500
VVMD27 | 180 | 16,291 | 14069 | 11,068.31 | 2,054,445 | 186 | 983 | 180 | 665
VVMD32 | 271 | 52,179 | 40658 | 28,917.14 | 8,689,772 | 250 | 25 | |
VrZAG62 | 214 | 34,700 | 25025 | 15,907.08 | 3,423,149 | 196 | 3004 | 186 | 887
VrZAG79 | 258 | 26,461 | 26248 | 17,798.42 | 4,623,334 | 242 | 4315 | |
Validation tests - CE vs. NGS
P values generated using a Chi-square test to compare the CE peak height ratios to the NGS sequence count ratios
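One way such a comparison can be set up for a heterozygous locus is a chi-square goodness-of-fit test: the CE peak heights define the expected proportion of each allele, and the NGS read counts are the observations. The sketch below makes that assumption explicit; the slide does not describe the exact test design, and the peak heights and read counts used here are hypothetical. The closed-form p-value applies only to two alleles (one degree of freedom).

```python
import math


def chi_square_ratio_test(ce_peak_heights, ngs_counts):
    """Compare NGS read counts of two alleles with the ratio of CE peak heights.

    Returns (chi2, p_value). The p-value formula holds for exactly two
    alleles, i.e. one degree of freedom.
    """
    if len(ce_peak_heights) != 2 or len(ngs_counts) != 2:
        raise ValueError("this sketch handles exactly two alleles")
    total_height = sum(ce_peak_heights)
    total_reads = sum(ngs_counts)
    expected = [h / total_height * total_reads for h in ce_peak_heights]
    chi2 = sum((obs - exp) ** 2 / exp for obs, exp in zip(ngs_counts, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical values: equal CE peaks, slightly unbalanced read counts.
chi2_same, p_same = chi_square_ratio_test((1000, 1000), (50, 50))
chi2_diff, p_diff = chi_square_ratio_test((1000, 1000), (60, 40))
```

With read depths in the thousands, even small ratio differences become statistically significant, so the size of the deviation is worth reporting alongside the p-value.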
VVMD 32
High-throughput sequencing technology
• The maximum output of the Ion PGM sequencer (~5 M) enables us to analyze simultaneously
• 96 genotypes at 15 loci with an average length of 250 bp
• resulting in coverage of 4 000 x for each locus specific sequence
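The coverage figure can be checked with simple arithmetic. The sketch assumes that "~5 M" means about five million reads per run, that each read covers one whole amplicon, and that reads are spread evenly over samples and loci; none of these assumptions is stated on the slide.

```python
def reads_per_amplicon(total_reads, n_genotypes, n_loci):
    """Average number of reads per genotype-locus combination, assuming an even spread."""
    return total_reads / (n_genotypes * n_loci)


depth = reads_per_amplicon(5_000_000, 96, 15)
```

Under these assumptions the result is roughly 3,500 reads per locus and genotype, the same order of magnitude as the approximate 4 000 x quoted above.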
Genotyping-by-Sequencing
• the next generation of molecular markers
• reduced representation sequencing method
• A shift from uniplex to multiplex assays that allow the simultaneous analysis of multiple markers
• variant discovery (SNPs, microsatellites, etc.)
• By-passing the entire classical marker assay development stage
Applications
• GWAS
• MAS
• Genetic mapping
• Genomic Selection – Breeding applications
• Conservation
• QTL analyses
• diversity studies
Genotyping by sequencing (GBS) in any large genome species requires reduction of genome complexity
I. – restriction site (technically less challenging)
• no prior seq data about genome
• no prior identified polymorphisms, SNP
• with RE reduce the complexity of the genome
• identification of markers 'de novo'
II. – target enrichment (AmpliSeq approach)
• known data about genome, transcriptomes etc.
• the polymorphisms were identified
• design of primers in target regions of polymorphic sites
• up to 6000 loci can be amplified simultaneously
Poland et al., 2012
Approach I: Restriction
1) restriction with RE
2) ligation of universal adapters
3) amplification and sequencing
By choosing appropriate REs, repetitive regions of genomes can be avoided, and lower copy regions can be targeted with two to three fold higher efficiency
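Step 1 of the restriction approach can be mimicked in silico by cutting a sequence at every occurrence of an enzyme's recognition site. The sketch below uses the PstI site CTGCAG (cut after the A) purely as an illustration; the slide does not name the enzymes used, and the function name and test sequence are hypothetical.

```python
import re


def digest(sequence, site, cut_offset):
    """Cut sequence at every occurrence of a recognition site.

    cut_offset: number of bases of the site that stay on the left-hand fragment.
    Returns the fragments in order along the sequence.
    """
    # The lookahead finds overlapping occurrences as well.
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={re.escape(site)})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical sequence containing two CTGCAG sites.
fragments = digest("AAACTGCAGTTTCTGCAGGG", "CTGCAG", 5)
fragment_lengths = [len(f) for f in fragments]
```

Listing the fragment lengths for a reference sequence gives a rough idea of how many fragments would fall into the size range that is amplified and sequenced.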
Transcriptomes of a few genotypes: Celeia, WyeTarget, Appolo, Nugget, Taurus, Magnum, Merkur, Tettnanger
Detect Polymorphic sites among mapped reads
CLC Genomics workbench
Tool 1
Tool 2
reference genome
Results
Genotype | Celeia-T | Celeia-V | WyeTarget | Taurus | Appolo | Nugget | Magnum | Merkur | Tettnanger
Technology | Illumina PE 100 | Illumina 50 bp | Illumina 50 bp | Illumina PE100 | Illumina PE100 | Illumina PE 100 | Illumina PE100 | 454 | 454
Cleaned Reads | 304,410,220 | 414,356,977 | 403,695,640 | 108,900,006 | 53,495,846 | 23,468,860 | 18,515,373 | 296,487 | 210,129
No. of bases | 29,582,070,478 | 20,717,848,850 | 20,184,782,000 | 10,998,900,606 | 5,403,080,446 | 2,370,354,860 | 1,870,052,673 | 84,603,345 | 56,456,211
Mapped reads | 161,119,954 | 227,880,067 | 160,481,003 | 70,319,435 | 37,343,753 | 16,689,802 | 13,023,120 | 149,763 | 210,129
No. of bases | 15599356755 | 11,394,003,350 | 8,024,050,150 | 7,102,262,935 | 3,771,719,053 | 1,685,670,002 | 1,315,335,120 | 40,620,826 | 56,456,211
Percentage of all | 52.93% | 55.00% | 39.75% | 64.57% | 69.81% | 71.11% | 70.34% | 50.51% | 49.77%
Not mapped | 143290266 | 186,476,910 | 243,214,637 | 38,580,571 | 16,152,093 | 6,779,058 | 5,492,253 | 146,724 | 190807
Percentage of all | 47.07% | 45.00% | 60.25% | 35.43% | 30.19% | 28.89% | 29.66% | 49.49% | 47.60%
Reads in pairs | 83569798 | / | / | 53,469,574 | 28,706,978 | 12,658,482 | 10,153,922 | / | /
Broken paired reads | 74958798 | / | / | 15,039,022 | 7,695,499 | 3,587,412 | 2,527,208 | / | /
Variants | 265,712 | 246,086 | 191,236 | 152,132 | 80,201 | 29,030 | 21,392 | 534 | 1012
variants without N | 264118 | 220781 | 183227 | 151990 | 80104 | 29030 | 21373 | 532 | 1010
Ns | 1594 | 25305 | 8009 | 142 | 97 | 0 | 19 | 2 | 2
Homozygous | 126886 | 121357 | 90503 | 52414 | 42304 | 12149 | 8928 | 236 | 595
Heterozygous | 137232 | 99424 | 92724 | 99576 | 37800 | 16881 | 12445 | 296 | 415
Unique references | 12385 | 14804 | 13708 | 10502/10495 | 7923/7921 | 4128 | 3240/3238 | 158 | 276
Cross-section of polymorphisms
• Highly polymorphic markers
– with high quality
– high coverage
– good allelic ratio
1992 heterozygous SNPs/variations
Reference Identifier Polymorphism Apollo Celeia Nugget Magnum Taurus WyeTarget
LD151021.1 10005 SNV Heterozygous Homo Heterozygous Heterozygous /
LD157464.1 10016 SNV Heterozygous / Heterozygous Heterozygous Heterozygous
LD163320.1 10025 SNV Heterozygous Homo Homo Homo Homo
LD133157.1 10064 SNV Heterozygous / / / /
LD154264.1 10065 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD140195.1 1006 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD257123.1 1006 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD154857.1 10071 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD157012.1 10073 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD170702.1 10078 Insertion Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD138133.1 10085 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD137100.1 10092 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD178254.1 10092 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD168001.1 10102 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD154646.1 10104 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD168565.1 10104 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD175361.1 10105 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD139992.1 10106 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD149849.1 10107 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
LD159214.1 10119 SNV Heterozygous Heterozygous Heterozygous Heterozygous Heterozygous
Preliminary testing with Sanger sequencing
1) Primers hetero (5 genotypes – heterozygous SNPs)
LD132484-F CGAAATTCCACCAAATCTGC
LD132484-R GTACCTGCTGCCAATGGAAC
LD132578-F TCACCACAGCCCAATTAACA
LD132578-R GGAGGAGATATTCCCGAACC
LD132627-F GCCCTGTTGACTTGCTTGAT
LD132627-R GCAATTAGCAAACACGAGGA
LD132933-F AAGGAGCCAAAGCATTCTCA
LD132933-R CAAGGGGTAACATCCCCATA
LD133023-F AATAGGAACCAGGAGCAGCA
LD133023-R GAACCAGCCCATGTGTTGTA
LD133025-F GGAGGTAAAAGCGCATAGCA
LD133025-R CCAGCCAGGGTATCTATTGC
LD133146-F GCTTTCTCAATTGTCCCCTGT
LD133146-R TGTTCCAGGAAACACAACTGA
LD133184-F TCATCAGGCTCCTCAGTACG
LD133184-R AGGTGTTAAGCACGGCTTTT
LD133310-F TGAAAGTTGAAGCCAAATAAGG
LD133310-R TTTATATATTTCTCTAAAATGGCTGGA
LD133373-F CCTGTCATGACCTCCACCTT
LD133373-R GCCAATGGGTCTGATGACTT
2) Primers SNPs in Celeia and WT
contig3-2732-F GACTGCGTGAAGATGGTGAA
contig3-2732-R TCAAGTAGGGACGGGTTACG
contig5-4227-F GGTGCCAAGATTTGATAGGG
contig5-4227-R CCTGGTGTCCAACAGGTTTAG
LD132744-F CCACCTGAGTTCTACACCTTCC
LD132744-R ATTGATCGGTTGAACCCTGA
LD132791-F TCAAGGTGGACTTTGGATCTTT
LD132791-R TCACAAAGGATTCCAGCTCA
LD132865-F CCGACGAAGACTGATGCTC
LD132865-R TACCTTCGAGCTCTGGCAAC
LD133125-F TCCCAAGGAGAAAGCTGAGA
LD133125-R CCTTGTTTTGAGCCGAGAAT
LD133184_CeWy 2-F TCCGGATATCAACACCATCA
LD133184_CeWy 2-R TCTCAGCAATTGCAGCTAACTC
LD134063-F TTCAGAGAAGATAGAGAGAAAGTGAAA
LD134063-R TCAACTTCAATAAACACTATTCAATCA
LD134219-F GCACGGGTTTCTTCTTTTGA
LD134219-R GGGTTTGGGTTTGGATTCAT
LD134913-F ATAAACCGGATTGGTGTGGA
LD134913-R CAGCTGTGAGTCTGATGTGTGA
LD135083-F AACTAAGATTAAAACCCATAATTGAAA
LD135083-R GCCTCTCAAGTGTTTGTTGG
3) Primers-control
LD151021-F GTTGCTGAGGCTTCTGGATT
LD151021-R GCGCTGAGTGCTCCATTAAG
LD157464-F CCCTCTTCAGTCTGGTCGAG
LD157464-R GATGGTCCTGCTCCATCTTG
LD134176-F GAGGAGGTATCCACGCCTTC
LD134176-R GCGACGCCGTTAGAATAATA
Take home message
• Microsatellites are considered the gold standard in the field, but there is still scope for improvement in speed and accuracy
• Given the amount of coverage, it would be possible to sequence around 1000 PCR-amplified loci x 100 samples in one Ion Proton sequencing run.
• The current prices (and the time needed) are approximately half those of the Sanger sequencing and fragment analysis (CE) approach
• Looking forward, high-density markers from NGS systems (like GBS) will soon be applied to almost every genomic question as they are getting more robust and economical each year
Acknowledgments
• SEE-ERA.NET Plus project
• Slovenian Research Agency
• FA COST Action FA1003
University of Ljubljana, Biotechnical Faculty, Center for Plant Breeding and Genetics
Extra slides …
N. ŠTAJNER1, L. BITZ1,3, R. BACILIERI2, B. JAVORNIK1 et al.
1 University of Ljubljana, Biotechnical Faculty, Agronomy Department, Slovenia
2 UMR AGAP, Equipe Diversité et Adaptation de la Vigne et des Espèces Méditerranéennes, INRA, Montpellier, France
3 Natural Resources Institute Finland, Alimentum – Myllytie 1, 31600 Jokioinen, Finland
STRUCTURE No. 1
• 112 ‘Balkan’ genotypes,
• the hierarchical levels of structure clusters could hardly be recognized.
• Clusters of admixed origin and usage (wine vs. table)
Long term impacts and practical implications
• Story of Žilavka – can it become successful outside borders?
Important for the economy of Balkan countries (particularly BIH)? If well characterized, it might become a local brand.
We obtained:
• trueness-to-genotype & trueness-to-type;
• intravarietal genetic polymorphism (AFLP);
• analysis of putative clonal variability (AFLP + SSR);
• we established close genetic relatedness with other local cultivars (Dobrogostina; Crna Prosip & Stara Žilavka) and strong differentiation from other cultivars: clones or siblings? evidence of local breeding efforts?
• Was Žilavka bred in Herzegovina?
• Parentage is finally elucidated: Žilavka = Albaimputotato (Romania) × Dobrogostina (BIH)
Activities and objectives
• Molecular characterization and evaluation of genetic relationships
– Microsatellite genotyping (22 loci) and data analysis
• Determination of synonyms and homonyms and establishment of SSR database
– Comparative analysis (intra- and inter-group)
How to solve Synonyms ?
• Grapevine scions were shared all over the world by different travelers with diverse intended uses, and obtained local names often completely different from the so-called prime name.
• A world database with standardized genetic profiles proved an excellent tool for solving these complicated puzzles generated over time.
• VIVC was being updated almost daily in 2015
• therefore our results might change accordingly
Accession | Origin | Prime name given in VIVC | Pending prime name | DNA reference | Country of collection | Berry color | Comment
Begljerka Crna | Macedonia | Begljerka Crna | | WBVD; Zulj Mihaljevic et al. 2013 | MAK | Noir | Confirmed by curator Beleski Klime
Bela Dinka | Serbia | Bela Dinka | | Dzhambazova et al. 2012; WBVD | SRB | Blanc | Unique genotype
Dolgi Grozdi | Slovenia | Pelena | | Stajner et al. 2014; WBVD; Crespan et al. 2011 | SLO | Blanc | Synonym
Kolana | Slovenia | | KOLANA | not found | SLO | | Unique genotype
Krivaja | ? | ? | KRIVAJA | Present paper | SRB | Blanc | Unique genotype
Kujundžuša | ? | ? | KUJUNDZUSA | Present paper | BIH | Blanc | Unique genotype
Stara Blatina | Bosnia and Herzegovina | Blatina | | Present paper; Zulj Mihaljevic et al. 2013; Tomic et al. 2012 | BIH | Noir | Synonym
microsatellite markers
• individual fingerprinting,
• revealing identities
• analyzing population structures
• parental and kinship analysis
• detection of genetic diversity
• Linkage mapping
• Geographical Origin - predominance of certain alleles
What are Duplicates telling us?
• In the case of non-standardized traditional varieties
• duplicates having an identical allelic profile are able to confirm
• that the cultivar is true-to-type.
How to solve Synonyms ?
• Grapevine material was shared all over the world
• obtaining local names often completely different from the so-called prime name
• World database - VIVC with standardized genetic profiles proved an excellent tool for solving these complicated puzzles
AmpliSeq™
• 'Custom design' – limit of 6144 primers or 3072 amplicons in one PCR reaction.
• we can perform two PCR reactions and join them subsequently
• panel for human exome – amplification of 25,000 amplicons in one reaction
Analysis
Mapping | Reference Position | Type | Length | Reference | Allele | Zygosity | Count | Coverage | Frequency | Average quality | Unique_references
LD132478.1 mapping | 9147 | MNV | 2 | TC | AA | Homozygous | 47 | 48 | 97.9166667 | 36.69767 | LD132478.1 mapping
LD132478.1 mapping | 10151 | SNV | 1 | T | N | Heterozygous | 29 | 92 | 31.5217391 | 37.65517 | LD132480.1 mapping
LD132478.1 mapping | 18308 | SNV | 1 | C | T | Homozygous | 45 | 45 | 100 | 39.28889 | LD132481.1 mapping
LD132478.1 mapping | 18592 | SNV | 1 | A | G | Homozygous | 82 | 85 | 96.4705882 | 37.86585 | LD132482.1 mapping
LD132478.1 mapping | 19198 | SNV | 1 | T | G | Homozygous | 77 | 77 | 100 | 39.01299 | LD132483.1 mapping
LD132478.1 mapping | 19919 | SNV | 1 | C | T | Homozygous | 75 | 85 | 88.2352941 | 39.14667 | LD132484.1 mapping
LD132478.1 mapping | 19939 | SNV | 1 | C | T | Homozygous | 102 | 102 | 100 | 37.98039 | LD132485.1 mapping
LD132478.1 mapping | 22634 | SNV | 1 | A | G | Homozygous | 154 | 154 | 100 | 38.38961 | LD132486.1 mapping
LD132478.1 mapping | 22649 | Replacement | 2 | GT | A | Homozygous | 25 | 41 | 60.9756098 | 38.36 | LD132487.1 mapping
LD132478.1 mapping | 23320 | SNV | 1 | C | T | Homozygous | 83 | 84 | 98.8095238 | 39.38554 | LD132488.1 mapping
LD132478.1 mapping | 23339 | Insertion | 3 | - | GAT | Heterozygous | 23 | 48 | 47.9166667 | 38.97101 | LD132489.1 mapping
LD132478.1 mapping | 23559 | Deletion | 1 | A | - | Homozygous | 32 | 43 | 74.4186047 | 33.53125 | LD132490.1 mapping
LD132478.1 mapping | 24361 | SNV | 1 | G | A | Homozygous | 153 | 153 | 100 | 38.94771 | LD132491.1 mapping
LD132478.1 mapping | 24373 | SNV | 1 | C | T | Homozygous | 114 | 114 | 100 | 38.59649 | LD132493.1 mapping
LD132478.1 mapping | 24548 | Deletion | 1 | T | - | Homozygous | 39 | 46 | 84.7826087 | 37.61538 | LD132494.1 mapping
LD132478.1 mapping | 26627 | SNV | 1 | T | G | Heterozygous | 161 | 342 | 47.0760234 | 36.86335 | LD132509.1 mapping
WyeTarget
Illumina 50 bp
403,695,640
20,184,782,000
160,481,003
Variants
191,236
183227
8009
90503
92724
13708
Custom reference
LD132478 9147 9148
LD132478 10151 10152
LD132478 18308 18309
LD132478 18592 18593
LD132478 19198 19199
LD132478 19919 19920
LD132478 19939 19940
Result
384Well_PlateID | 384Well_Row | 384Well_Col | Amplicon_ID | Ion_AmpliSeq_Fwd_Primer | Ion_AmpliSeq_Rev_Primer | Name | Chr | Amplicon_Start | Insert_Start | Insert_Stop | Amplicon_Stop
IAD88342_1_A | A | 1 | AMPL1029338 | AATGGTGAACAGAATTGTGCTATTGC | AAGGCCAGAGCAATACTTACCC | SNP1 | LD132478 | 8961 | 8987 | 9213 | 9235
IAD88342_1_A | A | 2 | Blank | . | .
IAD88342_1_A | A | 3 | AMPL1029339 | GAACAGCTTGTCGAAGCATCAC | CCAAGCCCAGCAAATGAATTGAC | SNP2 | LD132478 | 9933 | 9955 | 10184 | 10207
IAD88342_1_A | A | 4 | Blank | . | .
IAD88342_1_A | A | 5 | AMPL1029340 | CATTTTCTTCATGTTGTGTCATTTGTTGTG | CTGTACAAACGCAGCAGACACAAAG | SNP3 | LD132478 | 18075 | 18105 | 18324 | 18349
IAD88342_1_A | A | 6 | Blank | . | .
IAD88342_1_A | A | 7 | AMPL1029341 | TGATGTAGTCAATGTAGATGGATGGTATGT | CGCTCAAGTATTTTAACACATTTTGGAAGA | SNP4 | LD132478 | 18359 | 18389 | 18603 | 18633
IAD88342_1_A | A | 8 | Blank | . | .
IAD88342_1_A | A | 9 | AMPL1029342 | GCAATGAGGGCTCTTCAGTATTTTC | CTTACAAGAGCAAGGGCTTCAAAG | SNP5 | LD132478 | 18959 | 18984 | 19209 | 19233
IAD88342_1_A | A | 10 | Blank | . | .
IAD88342_1_A | A | 11 | AMPL1029343 | TGTTAATTAGGCATGTGGGATTAAGACTG | ACTTACCAACCCAAGATATCGAATGTTT | SNP6, SNP7 | LD132478 | 19704 | 19733 | 19950 | 19978