One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

Trouble with nomenclatures in personalized medicine Asst.-Prof. Mag. Dr. Matthias Samwald CeMSIIS, Medical University of Vienna SUMMER SCHOOL: GENOMIC MEDICINE – Bridging research and the clinic, May 6 2016, Portoroz, Slovenia One man's *1 is another man's *13? Funded by Austrian Science Fund (FWF): [P 25608-N1 This project has received funding from the European Union’s Horizon 2020 research and Innovation programme under grant agreement No 668353 (KB and MS).

Upload: matthias-samwald

Post on 14-Apr-2017


TRANSCRIPT

Page 1: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

Trouble with nomenclatures in personalized medicine

Asst.-Prof. Mag. Dr. Matthias Samwald
CeMSIIS, Medical University of Vienna

SUMMER SCHOOL: GENOMIC MEDICINE – Bridging research and the clinic, May 6 2016, Portoroz, Slovenia

One man's *1 is another man's *13?

Funded by Austrian Science Fund (FWF): [P 25608-N15]

This project has received funding from the European Union’s Horizon 2020 research and Innovation programme under grant agreement No 668353 (KB and MS).

Page 2: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

What’s the problem?

Page 3: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 4: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 5: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 6: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 7: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing

Venn diagram displaying the numbers and overlaps of polymorphisms covered by constrained views derived from four pharmacogenomic assays. DMET: derived from the Affymetrix DMET™ Plus assay, VERA: Illumina VeraCode® ADME Core Panel, TAQM: TaqMan® OpenArray® PGx Panel, FLOR: University of Florida and Stanford Custom Array.

Page 8: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing

Page 9: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing

Page 10: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 11: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing

Fraction of tested genes resulting in aberrations in haplotype calling with restricted assay compared to next-gen sequencing. Based on full genome sequences of 2504 persons. Manuscript currently under review at ‘Pharmacogenomics’.
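The comparison described above can be illustrated with a toy sketch. It is not the simulation used in the study: the star-allele definitions, variant IDs (`v1`, `v2`), the function names and the simple "largest matching definition wins" rule are all hypothetical assumptions, chosen only to show how an assay that does not cover a defining variant can report *1 where sequencing would report *13.

```python
# Hypothetical star-allele definitions: allele name -> set of defining variants.
# These are invented for illustration and do not describe any real gene.
DEFINITIONS = {
    "*1": frozenset(),          # reference allele, no defining variants
    "*2": frozenset({"v1"}),
    "*13": frozenset({"v2"}),
}


def call_haplotype(observed, assay_variants, definitions=DEFINITIONS):
    """Call the allele with the largest definition fully visible to the assay.

    `observed` is the set of variants the person actually carries;
    `assay_variants` is the set of variants the assay is able to detect.
    Ties are broken by the order of `definitions` (first match wins).
    """
    visible = set(observed) & set(assay_variants)
    candidates = [name for name, needed in definitions.items() if needed <= visible]
    return max(candidates, key=lambda name: len(definitions[name]))


def discordance(samples, restricted_assay, full_assay):
    """Fraction of samples whose call differs between the two assays."""
    differing = sum(
        call_haplotype(s, restricted_assay) != call_haplotype(s, full_assay)
        for s in samples
    )
    return differing / len(samples)


full = {"v1", "v2"}        # stands in for sequencing: sees every variant
restricted = {"v1"}        # stands in for a targeted panel that misses v2
samples = [{"v1"}, {"v2"}, set(), {"v2"}]

print(call_haplotype({"v2"}, full))        # call with full coverage
print(call_haplotype({"v2"}, restricted))  # call with the restricted panel
print(discordance(samples, restricted, full))
```

In this toy setup, a carrier of `v2` is called *13 with full coverage but falls back to *1 on the restricted panel, because the panel cannot see the defining variant.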

Page 12: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine


Page 13: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

Where to go from here?

Page 14: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 15: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 16: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 17: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

Allele Registry project

Page 18: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

From the lab: experimental mnemonic nomenclature

• Idea: Experiment with human-friendly nomenclature
  o No human committee
  o Less cryptic alphanumeric descriptors

Page 19: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

From the lab: experimental mnemonic nomenclature

• Synthetic pseudo-words can encode a lot of information

• CVCVCV pattern examples (C = consonant, V = vowel):
  o binoru
  o nivudi
  o pekuvo
  o jutoxu
  o hacifi
  o dejula

• CVCVCV tuple (Y as vowel) can denote: 20 * 6 * 20 * 6 * 20 * 6 = 1 728 000 variants
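The capacity figure can be checked with a short sketch that treats a pseudo-word as a number written in base 120 (20 consonants × 6 vowels per syllable). The slide gives only the counts, so the exact letter sets below are an assumption, as are the function names.

```python
# Assumed alphabets: the slide states only "20 consonants" and "6 vowels
# (Y as vowel)"; the concrete letters here are an illustrative guess.
CONSONANTS = "bcdfghjklmnpqrstvwxz"   # 20 letters
VOWELS = "aeiouy"                     # 6 letters
SYLLABLES = len(CONSONANTS) * len(VOWELS)   # 120 distinct CV syllables


def capacity(n_syllables):
    """Number of distinct pseudo-words made of n CV syllables."""
    return SYLLABLES ** n_syllables


def index_to_word(index, n_syllables=3):
    """Map an integer in [0, capacity) to a CV...CV pseudo-word."""
    if not 0 <= index < capacity(n_syllables):
        raise ValueError("index out of range for this word length")
    parts = []
    for _ in range(n_syllables):
        index, rest = divmod(index, SYLLABLES)
        c, v = divmod(rest, len(VOWELS))
        parts.append(CONSONANTS[c] + VOWELS[v])
    # Syllables were produced least significant first, so reverse them.
    return "".join(reversed(parts))


def word_to_index(word):
    """Inverse of index_to_word for a well-formed CV...CV pseudo-word."""
    if len(word) % 2 != 0:
        raise ValueError("pseudo-word must consist of CV pairs")
    index = 0
    for i in range(0, len(word), 2):
        c = CONSONANTS.index(word[i])
        v = VOWELS.index(word[i + 1])
        index = index * SYLLABLES + c * len(VOWELS) + v
    return index


print(capacity(3))                               # 1728000
print(index_to_word(word_to_index("binoru")))    # binoru
```

Because the mapping is a plain positional encoding, every integer below 1 728 000 corresponds to exactly one six-letter pseudo-word and vice versa.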

Page 20: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

Algorithm (no human curation / committee)

• Take a large dataset of variant data (e.g., 1000 Genomes, 100,000 Genomes, 1M genomes…) as the reference

• Create list of genome loci and variants observed there (some loci might have more than 2 possible variants)

• For each gene:
  o For each locus:
    ▪ Sort the observed variants by frequency and define the most frequently observed variant as ‘wild type’
    ▪ Remove these wild-type variants from the table used for constructing the mnemonics (they are considered to be the default)
  o Sort loci by the frequency of the most frequent non-wild-type variant at each locus
  o Assign mnemonics to each variant systematically, starting with shorter mnemonic strings (i.e., 2-character tuples)
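The steps above can be sketched for a single gene as follows. This is a minimal illustration, not the project's implementation. Several details are assumptions that the slide does not specify: the input layout (`{locus: {allele: frequency}}`), sorting loci in descending order so that common variants receive the shortest names, ordering variants within a locus by descending frequency, and the letter sets. The frequencies in `toy_gene` are made up.

```python
from itertools import count, product

# Assumed alphabets (the slides give only the counts: 20 consonants, 6 vowels).
CONSONANTS = "bcdfghjklmnpqrstvwxz"
VOWELS = "aeiouy"


def mnemonics():
    """Yield pseudo-words shortest first: CV, then CVCV, then CVCVCV, ..."""
    syllables = [c + v for c in CONSONANTS for v in VOWELS]
    for n in count(1):
        for combo in product(syllables, repeat=n):
            yield "".join(combo)


def assign_mnemonics(gene_loci):
    """Assign mnemonics to the non-wild-type variants of one gene.

    gene_loci: {locus_id: {allele: frequency}}
    Returns (wild_types, names), where wild_types maps locus -> wild-type
    allele and names maps (locus, allele) -> mnemonic.
    """
    wild_types, non_wild = {}, {}
    for locus, freqs in gene_loci.items():
        ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
        wild_types[locus] = ranked[0][0]      # most frequent = 'wild type'
        if len(ranked) > 1:
            non_wild[locus] = ranked[1:]      # wild type removed from table

    # Assumption: loci whose top non-wild-type variant is most common go first.
    ordered = sorted(non_wild, key=lambda loc: non_wild[loc][0][1], reverse=True)

    names, words = {}, mnemonics()
    for locus in ordered:
        for allele, _freq in non_wild[locus]:
            names[(locus, allele)] = next(words)
    return wild_types, names


# Made-up frequencies for three hypothetical loci of one gene.
toy_gene = {
    "locus_A": {"G": 0.60, "A": 0.38, "T": 0.02},
    "locus_B": {"C": 0.95, "T": 0.05},
    "locus_C": {"A": 0.55, "G": 0.45},
}
wild_types, names = assign_mnemonics(toy_gene)
print(wild_types)
print(names)
```

With these toy numbers, the variant at `locus_C` is named first because its non-wild-type allele is the most frequent, and `locus_A` contributes two names because it has three observed alleles.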

Page 21: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine


Page 22: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Page 23: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

Example mnemonic code sequences

VKORC1: cy-do-du | be-do-du
CYP2D6: nai / nai-pek
CYP2D6: nai / be-wi / nai-pek (copy number variation)
TPMT: be-fu-fy | ba-bi-fi-tek

Mnemonic code + reference to variants/regions covered by assay = automatically decompress to full sequence / genotype result
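A rough sketch of this decompression step, under stated assumptions: a single haplotype is written as hyphen-separated tokens, each token names one non-wild-type variant, and every covered locus without a token is taken to be wild type. The codebook entries, locus names and alleles below are invented for illustration; the meaning of the `|` and `/` separators is not defined on the slide and is not modelled here.

```python
def decompress(code, codebook, wild_types, covered_loci):
    """Expand a hyphen-separated mnemonic code into a per-locus genotype.

    code:          e.g. "be-do" (an empty string means all wild type)
    codebook:      {token: (locus, allele)} for non-wild-type variants
    wild_types:    {locus: wild-type allele}
    covered_loci:  loci that the assay actually interrogated
    """
    # Start from the default: every covered locus carries the wild type.
    genotype = {locus: wild_types[locus] for locus in covered_loci}
    for token in code.split("-"):
        if not token:
            continue
        locus, allele = codebook[token]
        if locus not in genotype:
            raise ValueError(f"{token!r} refers to a locus the assay does not cover")
        genotype[locus] = allele
    return genotype


# Hypothetical codebook and reference data (not taken from any real gene).
toy_codebook = {"be": ("locus_A", "A"), "do": ("locus_B", "T")}
toy_wild_types = {"locus_A": "G", "locus_B": "C", "locus_C": "A"}
toy_covered = ["locus_A", "locus_B", "locus_C"]

print(decompress("be-do", toy_codebook, toy_wild_types, toy_covered))
print(decompress("", toy_codebook, toy_wild_types, toy_covered))
```

Passing the covered loci explicitly keeps "wild type" distinct from "not tested": a locus outside the assay never appears in the result, and a token that points to such a locus is rejected.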

Sets of co-occurring SNP variants could automatically be assigned identifiers of their own and combined with individual SNP variant identifiers

Currently creating humble proof-of-concept based on 1000 Genomes data

Page 24: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

Local team (Medical University of Vienna)
Asst.-Prof. Mag. Dr. Matthias Samwald (PI)
Dr. Kathrin Blagec
Mag. Sebastian Hofer
Hong Xu, BSc
Wolfgang Kuch

Web
http://samwald.info/
http://safety-code.org/
http://upgx.eu

Thanks!

Page 25: One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

• Reference: Matthias Samwald, Kathrin Blagec, Sebastian Hofer and Robert R. Freimuth. “Analysing the potential for incorrect haplotype calls with different pharmacogenomic assays in different populations: a simulation based on 1000 Genomes data.” Pharmacogenomics, September 30, 2015. doi:10.2217/pgs.15.108

• Code availability: The curated resources and the IPython notebooks are available at https://gitlab.com/medication-safety/ms-ipython

Further info