Page 1: Introduction to the concept of functional genomics David Meyre, Associate Professor, McMaster University (meyred@mcmaster.ca) HRM 728 Graduate Course:

Introduction to the concept of functional genomics

David Meyre, Associate Professor, McMaster University (meyred@mcmaster.ca)

HRM 728 Graduate Course: Genetic Epidemiology – October 24th, 2014

Population Genomics Program

Page 2

Introduction to the concept of functional genomics

What Is Functional Genomics?

The goal of functional genomics is to understand the relationship between an organism’s genome and its phenotype.

Functional genomics is a field of molecular biology that is attempting to make use of the vast wealth of data produced by genome sequencing projects to describe genome function. Functional genomics uses high-throughput techniques like DNA microarrays, proteomics, epigenomics, metagenomics, metabolomics and mutation analysis to describe the function and interactions of genes.

Page 3

The genomic revolution

Human genome sequence

High-throughput technologies

Large human biobanks

Biostatistics & Bioinformatics

GENES

FUNCTIONAL GENOMICS

Page 4

Gene identification approaches

Genome-wide linkage

Candidate gene

Homozygosity mapping

Genome-wide association

GENES

Full exome / genome sequencing

Page 5

Classification of human genetic diseases

Syndromic disease (< 0.004%)

Monogenic disease (< 2%)

Polygenic disease (~ 20%)

OBESITY

Page 6

Genes and causality

SYNDROMIC / MONOGENIC DISEASE

Beyond co-segregation studies, additional arguments are needed to demonstrate the causal role of a mutation in the disease

functional genomics

Page 7

Genes and causality

POLYGENIC DISEASE (e.g. type 2 diabetes)

Beyond association studies, additional arguments are needed to demonstrate the causal role of a variant / gene in the disease

functional genomics

Sladek et al., Nature 2007

Page 8

Introduction to the concept of functional genomics

TRANS-ETHNIC FINE MAPPING APPROACH

Page 9

Variant identification : Functional prediction (in silico)

Gene / locus Identification

Functional validation (in vitro, in vivo)

Hypothesis-free approaches

Candidate gene

Fine mapping

We are here

Page 10

Trans-ethnic fine mapping approach

. Linkage disequilibrium is the non-random association of alleles at two or more loci

. The human genome is composed of blocks of linkage disequilibrium

. The extent of linkage disequilibrium blocks varies according to the ethnic background
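
The definition of linkage disequilibrium above can be made concrete with the usual pairwise statistics. The sketch below is an illustration only, not part of the original slides: it assumes two biallelic, polymorphic loci and takes haplotype and allele frequencies as inputs; the function name and example values are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    Assumes both loci are polymorphic (0 < p_a, p_b < 1).
    """
    # D measures the departure from random association of alleles.
    d = p_ab - p_a * p_b
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical example: allele A always travels with allele B (complete LD).
d, d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.3)
print(round(d_prime, 3), round(r2, 3))
```

With random association (p_ab equal to p_a × p_b) all three statistics are zero; within an LD block, pairs of SNPs typically show values of r² close to 1.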

Page 11

Trans-ethnic fine mapping approach

[Figure: disease-associated LD block spanning SNP1 to SNP5 (x-axis: distance in kb) in Icelandic, French, Asian and African populations, with the causal SNP marked]

Page 12

Trans-ethnic fine mapping approach

. Large-scale resequencing and case-control association studies in Icelandic, Danish, West African and African American subjects identified rs7903146 as the likely causal type 2 diabetes-associated SNP
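
The logic of trans-ethnic fine mapping can be sketched as a set intersection: a SNP stays a causal candidate only if it remains inside the disease-associated LD block in every population studied. The SNP labels follow the figure on the previous slide, but the membership of each set below is hypothetical and chosen only for illustration.

```python
def shared_candidates(block_by_population):
    """Return the SNPs present in the associated LD block of every population."""
    snp_sets = [set(snps) for snps in block_by_population.values()]
    return set.intersection(*snp_sets) if snp_sets else set()


# Hypothetical LD-block contents; shorter blocks narrow the candidate list.
block_by_population = {
    "Icelandic": {"SNP1", "SNP2", "SNP3", "SNP4", "SNP5"},
    "French": {"SNP2", "SNP3", "SNP4", "SNP5"},
    "Asian": {"SNP2", "SNP3", "SNP4"},
    "African": {"SNP3"},
}

print(sorted(shared_candidates(block_by_population)))  # ['SNP3']
```

Real analyses work with association statistics and LD estimates rather than fixed sets, so this is a simplification of the idea, not a description of the method used in the cited studies.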

Page 13

Variant identification : Functional prediction (in silico)

Gene / locus Identification

Functional validation (in vitro, in vivo)

Hypothesis-free approaches

Candidate gene

Fine mapping

We are here

Page 14

Gene candidacy

. Are genes in the disease-associated LD block involved in syndromic / monogenic forms of the same disease?

-loci associated with polygenic obesity: MC4R, BDNF, POMC, PCSK1, SIM1

-GWAS for complex traits: 20% of GWAS loci include genes involved in Mendelian disorders for the same trait

. Are genes in the disease-associated LD block involved in a corresponding phenotype in animal models (KO, Tg, siRNA)?

-loci associated with polygenic obesity: MC4R, BDNF, POMC, PCSK1, SIM1, FTO, GIPR, NPC1, SH2B1, TBC1D1, NEGR1

- > 170 genes induce a phenotype of severe obesity in genetic mouse models

. Gene function, biology

-function related to energy metabolism

Page 15

Gene candidacy

In order to find the causal gene in a disease-associated linkage disequilibrium block, mRNA expression studies can be useful (microarrays, RT-PCR):

1-Is the gene expressed in target tissues for the disease (obesity: brain, adipocytes; T2D: pancreas)?

2-Is the gene mRNA expression modulated by the disease status in a relevant tissue?

3-Is the gene mRNA expression modulated by the disease-associated SNP in a relevant tissue?
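
Question 3 above is an expression QTL test: does expression change with the number of disease-associated alleles carried? A minimal sketch, assuming an additive model and using hypothetical expression values, is a least-squares regression of expression on allele dosage. A real analysis would also test significance and adjust for covariates.

```python
def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on risk-allele dosage (0, 1 or 2)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    # Sum of squares of the genotype dosages around their mean.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    # Cross-product between dosage and expression deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    return sxy / sxx


# Hypothetical data: normalised mRNA level in a relevant tissue for six subjects.
dosages = [0, 0, 1, 1, 2, 2]
expression = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]

# A slope clearly different from zero suggests the SNP modulates expression.
print(round(eqtl_slope(dosages, expression), 3))
```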

Page 16

Gene candidacy

. ORMDL3 is one of the 19 genes located in the asthma-associated LD block

. ORMDL3 is expressed in the lung

. ORMDL3 mRNA level is modulated by asthma disease status in lymphoblastoid cell lines

. ORMDL3 mRNA level is strongly modulated by the asthma-associated SNP in lymphoblastoid cell lines

ORMDL3 is a highly relevant candidate gene at this locus

Moffatt et al., Nature 2007

Page 17

Gene candidacy

. Combination of mRNA expression and GWAS studies

. 27 genes differentially regulated in adipose tissue of monozygotic twins discordant for obesity

. ‘Hypothesis driven’ GWAS analysis for these 27 genes followed by a replication in a second independent sample identified a novel obesity gene: F13A1

Naukkarinen et al., PLOS Genet 2010

Page 18

Functional prediction (in silico)

Gene / locus Identification

Functional validation (in vitro, in vivo)

Hypothesis-free approaches

Candidate gene

Fine mapping

We are here

Page 19

Introduction to the concept of functional genomics

GENE VARIANT AND FUNCTION

Page 20

Gene variant and function

Two major types of variants :

1- Variants that affect the protein structure and function of the gene in which they occur :

Missense, nonsense, frameshift (indels) coding mutations: altered protein function

Intron / exon mutations, splicing branch points: exon skipping/adding
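
The coding categories listed above can be illustrated with the standard genetic code. The sketch below classifies a single-codon substitution as synonymous, missense or nonsense, and an indel as frameshift or in-frame from its length. The function names are hypothetical, and splice-site variants are outside its scope.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"   # premature stop codon
    if ref_aa == "*":
        return "stop-loss"  # stop codon replaced by an amino acid
    return "missense"


def classify_indel(length_bp):
    """Indels whose length is not a multiple of 3 shift the reading frame."""
    return "frameshift" if length_bp % 3 else "in-frame"


print(classify_substitution("AAA", "GAA"))  # Lys -> Glu: missense
print(classify_substitution("TGG", "TGA"))  # Trp -> stop: nonsense
print(classify_indel(2))                    # frameshift
```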

Page 21

Gene variant and function

2- Variants that affect expression and regulation of the gene in which they occur or other distal genes (eQTLs)

gene variant in the promoter (Transcription Factor Binding Site): change in gene expression

gene variant in 3’UTR: altered mRNA stability

gene variant in microRNA binding sites: change in expression

gene variant in enhancers / silencers/ insulators: change in expression in a distal gene (or a group of genes)

gene variant in a CpG methylation site: change in DNA methylation pattern

Copy Number Variants (CNV): modulation of gene expression, haplo-insufficiency

How to prove causality between a genetic variant and a biological effect?

Page 22

In silico prediction studies for coding variants

+: deleterious

-: neutral

Eight coding non-synonymous mutations in the PCSK1 gene have been identified in extremely obese patients: the PolyPhen-2 software (conservation of the amino acid across evolution + protein structure) is 100% concordant with in vitro studies

Mutation   PolyPhen-2   PANTHER   SIFT   SNAP   PMUT
K26E       -            -         -      -      -
M125I      +            -         -      -      -
T175M      +            +         +      +      +
N180S      +            +         +      +      -
Y181H      +            +         +      +      -
G226R      +            +         -      +      +
S325N      +            +         +      +      -
T558A      +            NA        -      -      -
G593R      +            NA        +      +      +

Creemers et al., Diabetes 2012
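
The comparison in the table above can be tallied directly. Because the slide states that PolyPhen-2 is fully concordant with the in vitro results, the sketch below uses the PolyPhen-2 column as a stand-in for the experimental outcome and counts how often each other tool agrees with it, skipping NA calls. The data are transcribed from the table; the function and variable names are hypothetical.

```python
TOOLS = ["PolyPhen-2", "PANTHER", "SIFT", "SNAP", "PMUT"]

# Calls transcribed from the table: "+" deleterious, "-" neutral, "NA" no call.
TABLE = {
    "K26E": "- - - - -",
    "M125I": "+ - - - -",
    "T175M": "+ + + + +",
    "N180S": "+ + + + -",
    "Y181H": "+ + + + -",
    "G226R": "+ + - + +",
    "S325N": "+ + + + -",
    "T558A": "+ NA - - -",
    "G593R": "+ NA + + +",
}


def agreement_with_reference(tool, reference="PolyPhen-2"):
    """Return (agreeing calls, comparable calls) between a tool and the reference."""
    tool_idx, ref_idx = TOOLS.index(tool), TOOLS.index(reference)
    agree = compared = 0
    for calls in TABLE.values():
        fields = calls.split()
        if fields[tool_idx] == "NA" or fields[ref_idx] == "NA":
            continue  # no prediction available for this mutation
        compared += 1
        agree += fields[tool_idx] == fields[ref_idx]
    return agree, compared


for tool in TOOLS[1:]:
    agree, compared = agreement_with_reference(tool)
    print(f"{tool}: {agree}/{compared}")
```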

Page 23

In silico prediction studies for regulatory variants

• Combine both in silico and indirect experimental data, e.g. ANNOVAR, FunciSNP, PMCA, GWAS3D

• These tools attribute a score to each variant in the LD block of the genomic region thought to cause the phenotype and predict its functionality based on its proximity to:

- transcription factor binding sites and cis-regulatory modules

- phylogenetically conserved sites

- specific epigenetic marks (e.g. enhancer / silencer / insulator-specific proteins, promoter proteins, DNA methylation, DNase hypersensitivity, …)
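
The scoring idea on this slide can be sketched as a weighted sum over annotation overlaps. This is a toy illustration under stated assumptions: the weights, annotation names and SNP labels are hypothetical, and none of the tools named on the slide is claimed to use this particular scheme.

```python
# Hypothetical weights for each class of evidence; real tools use their own models.
WEIGHTS = {
    "tfbs": 2.0,             # overlaps a transcription factor binding site
    "conserved": 1.5,        # lies in a phylogenetically conserved site
    "epigenetic_mark": 1.0,  # overlaps an epigenetic mark (e.g. DNase hypersensitivity)
}


def score_variant(annotations):
    """Sum the weights of the annotation classes a variant overlaps."""
    return sum(WEIGHTS[name] for name, present in annotations.items() if present)


def rank_variants(variants):
    """Return variant IDs ordered from highest to lowest score."""
    return sorted(variants, key=lambda v: score_variant(variants[v]), reverse=True)


# Hypothetical variants from one disease-associated LD block.
variants = {
    "SNP_A": {"tfbs": True, "conserved": True, "epigenetic_mark": True},
    "SNP_B": {"tfbs": False, "conserved": True, "epigenetic_mark": False},
    "SNP_C": {"tfbs": False, "conserved": False, "epigenetic_mark": False},
}

print(rank_variants(variants))  # ['SNP_A', 'SNP_B', 'SNP_C']
```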

Page 24

Introduction to the concept of functional genomics

EVOLUTIONARY GENETICS

We are here

Page 25

Evolutionary genetics

Natural selection is the gradual, non-random process by which biological traits become either more or less common in a population as a function of differential reproduction of their bearers. It is a key mechanism of evolution. The term "natural selection" was popularized by Charles Darwin.

Evolutionary genetics (Huxley 1942)

-advantageous mutations have been positively selected in human populations during recent evolution

-disadvantageous mutations have been negatively selected in human populations during recent evolution

Page 26

Evolutionary genetics

THRIFTY GENOTYPE HYPOTHESIS: the 'thrifty' genotype would have been advantageous for hunter-gatherer populations, especially child-bearing women, because it would allow them to fatten more quickly during times of abundance. Fatter individuals carrying the thrifty genes would thus better survive times of food scarcity.

Mutations predisposing to obesity and type 2 diabetes may show signatures of positive selection

Page 27

Evolutionary genetics

. The LCT rs4988235 T variant confers lactase persistence

. The LCT rs4988235 T variant is associated with higher consumption of milk / dairy products and increased body mass index

. The LCT rs4988235 T variant has a selective advantage in dairy farming populations and has undergone positive selection linked to cattle domestication

. The LCT rs4988235 T allele is more frequent in Northern (allele frequency: 0.7) than in Southern Europe (allele frequency: 0.1)
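
Large allele-frequency differences between populations are one of the signals used when scanning for selection. As a worked illustration, which is an addition and not taken from the slides, the sketch below computes a simple two-population fixation index (Fst) from the two frequencies quoted above, assuming equal population sizes and Hardy-Weinberg proportions.

```python
def fst_two_populations(p1, p2):
    """Fst = (H_T - H_S) / H_T for one biallelic SNP in two equally sized populations."""
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if the two populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    # Average expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Frequencies of the LCT rs4988235 T allele quoted on the slide.
print(round(fst_two_populations(0.7, 0.1), 3))  # 0.375
```

The function is undefined when the allele is fixed or absent in both populations, since the pooled heterozygosity is then zero.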

Page 28

Evolutionary genetics

[Figure: LCT rs4988235 T allele frequency in the UK]

Davey-Smith et al., EJHG 2009

Page 29

Evolutionary genetics

. Genome-wide approaches in diverse ethnic backgrounds have identified several hundred regions showing recent positive natural selection

. New methods are able to identify causal variants in regions with a signature of positive natural selection

. The amino-acid change Lys109Arg in the LEPR gene is a causal variant that has undergone positive selection

. The Lys109Arg variant is associated with body mass index variation

Grossman et al., Science 2010

Page 31

Introduction to the concept of functional genomics

Sources of data for variant functionality prediction

We are here

Page 32

ENCODE project (ENCyclopedia Of DNA Elements)

• https://www.encodeproject.org/

Integrative analysis of:

• 3545 biosamples (2441 in humans) from different cell lines / tissues

• 971 epigenetic marks

• 5194 assays (ChIP-seq, RNA-seq, IP, DNase-seq, transcription profiling, …)

Page 33

NIH Roadmap Epigenomics Mapping Consortium

• http://genomebrowser.wustl.edu/

Integrative analysis of:

• 111 reference human cells / tissues

• 40+ epigenetic marks

Page 34

Genotype-Tissue Expression (GTEx) project

Correlations between genotype and tissue-specific gene expression levels in

• 42 cell lines / tissues

• 100-200 RNA-seq and genotyped samples

• http://www.gtexportal.org/home/

Page 35

Examples of coding and regulatory variants

Page 36

Gene variation in the promoter and gene expression

. The -11391 G>A variant in the promoter of the ACDC/adiponectin gene is associated with higher in vitro promoter activity and with higher plasma adiponectin level in lean and in obese children

Bouatia-Naji et al., Diabetes 2006

Page 37

Gene variation and long-range enhancer

Smemo et al., Nature 2014

. The obesity-associated FTO intron 1 region directly interacts with the promoter of IRX3 gene (580 Kb downstream of FTO)

. The intron 1 SNP in FTO modulates IRX3 (but not FTO) expression

. Irx3-deficient mice display a leanness phenotype

Page 38

Gene variation at a CpG methylation site

. Gene variant rs1421085 in intron 1 of FTO is the main contributor to polygenic obesity (Dina et al., Nat Genet 2007)

. Gene variant rs7202116, in full linkage disequilibrium with rs1421085, creates a CpG methylation site and is associated with increased methylation of a 7.7 kb regulatory region within FTO

. The 7.7 kb regulatory region contains a highly conserved non-coding element that acts as a long-range gene expression enhancer

Bell et al., PLOS One 2010

Page 39

Intron / exon mutations and exon skipping

. Extreme obesity cosegregates with homozygosity for a G/A substitution in the splice donor site of exon 16 of the LEPR gene

. The intron / exon mutation induces skipping of exon 16 and a truncated inactive leptin receptor

Clement et al., Nature 1998

Page 40

CNVs are highly causal variants in Mendelian diseases

. A 600 kb heterozygous deletion (~30 genes) on chromosome 16p11.2 explains 0.7% of morbid hyperphagic obesity and is associated with developmental delays

. Duplications in the same chromosomal region are associated with underweight and restrictive eating disorders

. SH2B1, a key modulator of the response to the satiety hormone leptin and a Mendelian hyperphagic obesity gene, is located in the deleted interval

Walters et al., Nature 2010; Jacquemont et al., Nature 2012

Page 41

Gene variation in 3’UTR and mRNA stability

. A>G +1044 TGA SNP is included in the ENPP1 risk haplotype associated with higher ENPP1 plasma level and risk of obesity / T2D

. A>G +1044 TGA forms a linkage disequilibrium block in 3’UTR with A>C +1092 TGA and C>T +1157 TGA

. In HLA cells transfected with either 3’UTR variant or wild-type cDNA, specific ENPP1 mRNA half-life was increased for those transfected with 3’UTR variant cDNA (t1/2 = 4.35 vs. 2.55 h; p = 0.001)

Meyre et al., unpublished
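
To see what the two half-lives quoted above imply, the sketch below converts a half-life into a decay constant and into the fraction of transcript remaining after a given time. It assumes simple first-order (exponential) mRNA decay, which is a modelling assumption added here and not something stated on the slide.

```python
import math


def decay_constant(half_life_h):
    """First-order decay constant k (per hour), from t1/2 = ln(2) / k."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h, half_life_h):
    """Fraction of the initial mRNA left after t_h hours of exponential decay."""
    return 0.5 ** (t_h / half_life_h)


# Half-lives from the slide: 3'UTR variant 4.35 h, wild type 2.55 h.
for label, half_life in [("variant", 4.35), ("wild type", 2.55)]:
    print(label, round(decay_constant(half_life), 3),
          round(fraction_remaining(4.35, half_life), 2))
```

Under this model, after 4.35 h half of the variant transcript remains against roughly 0.31 of the wild-type transcript, consistent with the higher ENPP1 levels reported for the risk haplotype.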

Page 42

Cis versus Trans e-QTLs?

. The polymorphism rs9585056 is associated with T1D, modulates the expression of the cis-gene GPR183 and the expression of the IRF7 network genes

Heinig et al., Nature 2010

Page 43

Functional prediction (in silico)

Gene / locus Identification

Functional validation (in vitro, in vivo)

Hypothesis-free approaches

Candidate gene

Fine mapping

We are here

Page 44

In vitro functional studies

68% of non-synonymous MC4R mutations found in obese patients are deleterious (alpha-MSH test)

Stutzmann et al., Diabetes 2008

Page 45

In vitro / in vivo functional studies

CRISPR/Cas9 system, a powerful genetic tool for genome editing and the study of functional variants

The rs1421085 variant of FTO identified by PMCA has been proven to modulate the expression of the IRX3 and IRX5 genes using the CRISPR/Cas9 method

Page 46

Introduction to the concept of functional genomics

STUDY OF ENDOPHENOTYPES

Page 47

Study of endophenotypes

. rs17782313 near MC4R has been associated with BMI by GWAS

. Deleterious coding mutations in MC4R are the most common form of monogenic obesity, with hyperphagia and increased stature

. If the SNP modulates the expression / function of MC4R, we can predict associations with the same traits in the corresponding direction

. The obesity-predisposing allele of rs17782313 is associated with more snacking and overeating and with increased stature

MC4R is a highly relevant candidate gene at this locus

Stutzmann et al., Int J Obes 2009

Page 48

Functional prediction (in silico)

Gene / locus Identification

Functional validation (in vitro, in vivo)

Hypothesis-free approaches

Candidate gene

Fine mapping

Page 49

FTO, a good illustration of integrative approach

. Novel variants identified in African populations

. FTO SNP shows evidence of positive natural selection

. The SNP is associated with different patterns of methylation (demethylase)

. FTO SNPs in intron 1 affect the expression of other genes (IRX3 and IRX5) implicated in fat storage and energy expenditure

. FTO complete deficiency leads to a polymalformative lethal syndrome in humans

. FTO partial deficiency does not relate to leanness/obesity in humans

. FTO knock-out mice are lean, FTO transgenic mice are obese

. FTO is highly expressed in hypothalamus and is regulated by fasting and feeding

. FTO SNP is associated with food intake in humans

Page 50

Ichimura et al., Nature 2012

Page 51

Page 52

ANY QUESTIONS?

The French fair-play!