Insights from The Human Genome Project about Risk of Disease
Toby G. Rossman, Ph.D.
The Nelson Institute of Environmental Medicine
NYU-Langone School of Medicine
In 2003, scientists in the Human Genome Project obtained the DNA sequence of the 3 billion base pairs making up the human genome.
The significance of this work to our health is an ongoing project.
THE HOPE:
Genes and The Genome
Genes are made of DNA and provide the directions for building all of the proteins that make our bodies function. Genes are passed down by parents to their offspring.
The Genome is the sum total of all genetic material (DNA) in a cell.
Introduction to DNA
A double helix (2 strands) with a sugar-phosphate backbone.
Attached to each sugar is one of 4 “bases”: A, G, T, C
Base pairs: C always pairs with G; A always pairs with T
The sequence of the bases (e.g.: AATGCCCTGAACGTT) contains the information (genetic code).
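The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the reverse step reflects the fact that the two strands of the helix run in opposite directions.

```python
# Pairing rules from the slide: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner base for every position of a DNA strand."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written in its own reading direction."""
    return complement(strand)[::-1]


example = "AATGCCCTGAACGTT"  # the example sequence from the slide
print(complement(example))          # TTACGGGACTTGCAA
print(reverse_complement(example))  # AACGTTCAGGGCATT
```

Knowing one strand is enough to write down the other, which is why each strand can serve as a template when DNA is copied.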
DNA is the genetic material within the nucleus.
What DNA does
The process of transcription creates an RNA using DNA information.
The RNA leaves the nucleus.
[Diagram: in the nucleus, DNA is copied (replication) and transcribed into RNA (transcription); in the cytoplasm, RNA is translated into protein (translation)]
The process of translation takes place in the cytoplasm. It creates a protein using RNA information.
DNA is the genetic material within the nucleus. It is many genes long.
DNA is replicated before a cell divides. Each daughter cell receives an identical copy.
The DNA in a gene is transcribed to create an RNA copy, using information from one DNA strand.
The RNA leaves the nucleus.
In the cytoplasm, the RNA is translated to create a protein, using RNA information.
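As a rough sketch of the transcription step, the snippet below builds an RNA copy from one DNA strand. The template sequence and the function name are hypothetical, chosen only for illustration; the one real difference from DNA pairing is that RNA uses U where DNA would use T.

```python
# RNA pairing: like DNA pairing, except RNA carries U instead of T.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Build the RNA complementary to one DNA strand (the template).

    Input and output are both written in their own 5'->3' direction,
    so the template is read in reverse.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_strand))


template = "TTAGCCCAT"  # hypothetical template strand, for illustration only
print(transcribe(template))  # AUGGGCUAA
```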
The “Central Dogma” of biology
DNA (stores all information) → transcription → mRNA (carries the message) → translation → Protein
A protein is a string of amino acids
Codon = 3 bases
All known organisms use the same genetic code. (Unity of life on earth; we are all related).
The genetic code is universal.
The genetic code is degenerate. Some codons encode the same amino acid, e.g. GGU, GGC, GGA, and GGG all encode glycine. Degeneracy is mostly at the third base of the codon.
Some codons have additional functions. AUG encodes methionine, but is also a start codon.
UAA, UAG, and UGA do not encode an amino acid. These codons signal termination of the protein (stop codons).
The Genetic Code
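The properties listed on this slide (3-base codons, degeneracy, a start codon, three stop codons) can be seen in a small translation sketch. The table below is the standard genetic code written in one-letter amino acid symbols, with `*` marking a stop; the function name and the example mRNA are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon (or the end)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":  # UAA, UAG or UGA: stop
            break
        protein.append(amino)
    return "".join(protein)


# Four different glycine codons give the same amino acid (G = glycine).
print(translate("AUGGGUGGCGGAGGGUAA"))  # MGGGG
```

All four GG_ codons map to the same letter, which is the degeneracy at the third base described above.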
DNA is packed into chromosomes (highly condensed)
Genes are present both in the nucleus and in mitochondria
Nuclear DNA: 3.2 billion bp. Chromosomes located in cell nucleus: autosomes 1 to 22 plus the sex chromosomes X and Y.
Mitochondrial DNA (mtDNA): 16,569 bp. 1 chromosome in mitochondria (hundreds of copies in cell cytoplasm). Inherited only from mother.
An important technique: Hybridization
Heating the cellular DNA will separate the 2 strands.
You can then identify a piece of DNA by hybridization, using a “probe” of the complementary strand (at least 12 bases). This also works with RNA.
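A minimal sketch of the idea behind hybridization: a probe binds wherever the single-stranded DNA contains the probe's complementary sequence. The strand, the probe and the function names below are hypothetical; the 12-base minimum comes from the slide. Real hybridization also tolerates some mismatches, which this exact-match sketch ignores.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written in its own direction."""
    return "".join(PAIR[base] for base in reversed(strand))


def hybridization_sites(strand: str, probe: str, min_length: int = 12) -> list[int]:
    """Return every position in `strand` where the probe can pair exactly."""
    if len(probe) < min_length:
        raise ValueError(f"probe should be at least {min_length} bases long")
    target = reverse_complement(probe)  # the sequence the probe pairs with
    sites = []
    position = strand.find(target)
    while position != -1:
        sites.append(position)
        position = strand.find(target, position + 1)
    return sites


# Hypothetical single strand obtained after heating, and a 15-base probe.
strand = "GGCTC" + "AATGCCCTGAACGTT" + "CCGA"
probe = "AACGTTCAGGGCATT"
print(hybridization_sites(strand, probe))  # [5]
```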
[Figure labels: HOT (the 2 strands separate), COLD (the probe hybridizes)]
Painting the human chromosomes
Chromosome painting
A pair of homologous chromosomes (one from father and one from mother)
Gene – unit of DNA information about a trait
Alleles – slightly different versions of a gene
[Figure labels: p, Q]
Homozygous: both alleles are the same
Heterozygous: the alleles are different
Most genes are in the 23 pairs of chromosomes in the cell nucleus.
Chromosomes contain thousands of genes.
SOME DEFINITIONS
• The human genome is nearly the same in all people. We are >99.9% alike.
• Only about 1.5% of the human genome contains genes that are translated into proteins.
• Most of our DNA is transcribed into RNA, but not translated.
Some important lessons from the Human Genome Project
Only a small % gets translated into protein
The DNA in a gene is interrupted. It is not all expressed.
Exons are expressed; Introns are removed
Alternative splicing of exons allows more than 1 protein from each transcript
EXON = expressed
INTRON = not expressed
[Figure: a gene drawn as alternating segments: exon, intron, exon, intron, exon, intron, exon]
It turns out that 90% of the genome is actively transcribed into RNA.
Although initially dismissed as “junk DNA”, recent evidence suggests
that many of the RNAs transcribed from this DNA play major biological
roles in control of gene expression, cellular development and
metabolism.
For example:
The Human Genome encodes only ~20,000 protein-coding genes. But
we have about 100,000 different proteins! Alternative splicing yields
many more than 20,000 proteins. RNA molecules may help control
which splicing event occurs.
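The numbers above imply an average of about 5 proteins per gene (100,000 / 20,000). The sketch below shows how alternative splicing could produce that kind of multiplication. It assumes a deliberately simple model, not stated on the slide, in which certain exons may each be either kept or skipped; the exon names and the function are hypothetical.

```python
from itertools import combinations


def isoforms(exons, optional):
    """List every transcript obtained by skipping any subset of optional exons.

    Exon order is always preserved; exons not in `optional` are always kept.
    """
    optional_in_order = [exon for exon in exons if exon in optional]
    transcripts = []
    for count in range(len(optional_in_order) + 1):
        for skipped in combinations(optional_in_order, count):
            transcripts.append(tuple(e for e in exons if e not in skipped))
    return transcripts


# Slide figures: ~20,000 protein-coding genes, ~100,000 proteins.
print(100_000 / 20_000)  # 5.0 proteins per gene on average

# Hypothetical gene with 4 exons, of which E2 and E3 can be skipped.
for transcript in isoforms(["E1", "E2", "E3", "E4"], {"E2", "E3"}):
    print("-".join(transcript))
```

Under this model a gene with n skippable exons can give up to 2^n transcripts, so a modest number of genes can account for a much larger number of proteins.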
What is all that DNA for?
We are at the tip of a very large iceberg!
Genetic variation
Single nucleotide polymorphisms (SNPs)
• Our genes are >99.9% alike (unless we are identical twins).
• Gene sequences can differ at a single base (called single nucleotide polymorphism or SNP).
• The human genome has at least 10 million SNPs.
Understanding SNPs
• Most variations are meaningless and do not affect
our ability to survive or adapt.
• For example: silent mutations in DNA, which
change the DNA, but do not change the amino acid
the DNA codes for.
• Other mutations may change the amino acid
sequence of a protein, but not the overall function
of that protein.
• Some variation leads to disease. Single-gene
disorders include sickle cell anemia, cystic fibrosis
and Huntington disease.
There are other types of genetic variation besides SNPs
Eleanor Rigby picks up the rice in the church. (Original sequence)
Eleanor Rigby picks up the lice in the church. (SNP)
Eleanor Rigby picks up the church. (deletion)
Eleanor Rigby picks up the rice and beans in the church. (insertion)
Eleanor Rigby picku pt her icei nt hec hurch. (-1 frameshift)
Eleanor Rigby picks hcruhc eht ni ecir eht pu. (inversion)
Eleanor Rigby picks up the rice in the church. Eleanor Rigby picks up the rice in the church. (gene duplication)
Rearrangement: Gene gets transposed to another chromosome.
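The sentence examples above can be reproduced with simple string operations, treating letters as bases and words as codons. This is only an analogy in code; the function names are made up for the illustration.

```python
ORIGINAL = "Eleanor Rigby picks up the rice in the church."


def snp(text, position, letter):
    """Change a single letter (single nucleotide polymorphism)."""
    return text[:position] + letter + text[position + 1:]


def delete(text, start, length):
    """Remove a stretch of text (deletion)."""
    return text[:start] + text[start + length:]


def insert(text, position, extra):
    """Add a stretch of text (insertion)."""
    return text[:position] + extra + text[position:]


def invert(text, start, end):
    """Reverse the stretch between start and end (inversion)."""
    return text[:start] + text[start:end][::-1] + text[end:]


def duplicate(text):
    """Repeat the whole sentence (gene duplication)."""
    return text + " " + text


def frameshift(text, letter_index):
    """Drop one letter but keep reading in the original word lengths."""
    lengths = [len(word) for word in text.split(" ")]
    letters = text.replace(" ", "")
    letters = letters[:letter_index] + letters[letter_index + 1:]
    words, position = [], 0
    for length in lengths:
        words.append(letters[position:position + length])
        position += length
    return " ".join(word for word in words if word)


print(snp(ORIGINAL, ORIGINAL.index("rice"), "l"))
print(delete(ORIGINAL, ORIGINAL.index("the rice in "), len("the rice in ")))
print(insert(ORIGINAL, ORIGINAL.index(" in "), " and beans"))
print(invert(ORIGINAL, ORIGINAL.index("up"), ORIGINAL.index(".")))
# Delete the "s" of "picks" and re-read with the old word boundaries.
s_index = ORIGINAL.replace(" ", "").index("picks") + 4
print(frameshift(ORIGINAL, s_index))
```

The frameshift case shows why losing a single base can be far more damaging than changing one: every "word" after the deletion is misread.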
Gene duplication: A major event in evolution
Gene duplications produce copy number variations (CNV) in humans
Copy number for amylase-1 gene is higher in populations with a starchy diet.
Amylase digests starch.
Perry et al., Nature Genetics 2007
[Figure axis: gene copy number/cell]
Copy number variation (CNV)
How different are human beings from one another?
Differences in SNPs: about 10 million (~3/1000 bases)
Large deletions and insertions: about 15 million base differences
Copy number variations: can lead to very large numbers of base changes
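The "~3/1000 bases" figure follows directly from the numbers used in these slides. A quick check, using the 3.2 billion bp genome size and the 10 million SNP count quoted earlier (the variable names are just for this example):

```python
GENOME_BP = 3.2e9     # nuclear genome size from the slides
KNOWN_SNPS = 10e6     # "about 10 million" SNPs

# Known SNP sites per 1000 bases of genome.
snps_per_1000_bases = KNOWN_SNPS / GENOME_BP * 1000
print(round(snps_per_1000_bases, 1))  # 3.1

# ">99.9% alike" for any two people means at most about 0.1% of
# positions differ between them.
max_differences = GENOME_BP * (1 - 0.999)
print(round(max_differences / 1e6, 1), "million positions")  # 3.2 million positions
```

One way to read the two numbers together, offered here as an interpretation rather than something the slides state: the 10 million count covers sites that vary somewhere in the population, while any two individuals differ at only a fraction of those sites.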
Risk Factors for Disease
A risk factor increases your risk of developing a disease or health problem.
Behaviors and lifestyle (including diet) + “Environment” = Environment
Genes
How can you tell which is more important? Sometimes a single gene mutation leads to disease.
THE EASY CASES
SICKLE CELL ANEMIA
An example of a single gene disease
• Caused by a mutant allele h of a hemoglobin gene H.
• 1/500 black Americans have the disease. (homozygous hh)
• 1/10 are heterozygous (Hh) for the sickle cell gene
• The mutant gene is even more common among West African Blacks.
• In some parts of Africa, the fraction of individuals with this disease is 1/25
• The mutant allele confers some resistance to malaria.
Fig. 11.2
Sickle cell anemia is caused by a single base change (SNP)
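The disease and carrier frequencies quoted above can be roughly cross-checked against each other. The sketch below assumes Hardy-Weinberg proportions (random mating, no selection), which the slides do not state and which holds only approximately here, since the mutant allele is under selection through malaria resistance. The function name is hypothetical.

```python
import math


def carrier_frequency(disease_frequency: float) -> float:
    """Estimate the heterozygote (Hh) frequency from the hh frequency.

    Assumes Hardy-Weinberg proportions: hh = q*q, Hh = 2*p*q, p + q = 1.
    """
    q = math.sqrt(disease_frequency)  # frequency of the mutant allele h
    p = 1 - q                         # frequency of the normal allele H
    return 2 * p * q


us_carriers = carrier_frequency(1 / 500)
print(round(us_carriers, 3))      # 0.085
print(round(1 / us_carriers, 1))  # 11.7, i.e. about 1 carrier in 12

# Where 1/25 of individuals have the disease, the same model gives:
print(round(carrier_frequency(1 / 25), 2))  # 0.32
```

The estimate of about 1 in 12 is in the same range as the 1/10 quoted on the slide, which suggests the two figures are at least mutually plausible.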
Some other genetic disorders caused by single genes
Disorder | Incidence (US) | Symptoms
Cystic fibrosis | 1/2,000 Caucasians; 1/17,000 African Americans; 1/9,000 Hispanics | Obstructive lung disorder, infections, heart failure
Tay-Sachs disease | 1/300,000 in US; 1/3,500 in Ashkenazi Jews | Destroys nervous system at about 8 months. Patients rarely live past age 2.
Huntington’s disease | 1/20,000 (Western countries), less frequent in Africa and Asia | Gradual deterioration of nervous system
Maple syrup urine disease | 1/9,000 to 1/300,000; 1/176 in PA Mennonites | Vomiting, seizures, mental retardation, coma. Death in 2 years
Alpha-1-antitrypsin deficiency | 1/2,000 | Early lung disorder (by 40 years)
BRCA1 or BRCA2 breast cancer | Will discuss later along with other cancer risk genes | Increased risk for breast cancer
Phenylketonuria (PKU) | | Mental retardation, excretion of phenylalanine
Hemophilia B | X-linked recessive | Lack of clotting factor IX
Complexities in single gene diseases: Cystic Fibrosis
Cystic fibrosis (CF) is caused by an autosomal recessive mutation in the gene CFTR (cystic fibrosis trans-membrane conductance regulator).
However, it is not possible to predict the exact phenotype of this disease from analysis of CFTR mutations. Other genes modify the disease.
Polygenic inheritance
The Harder cases
• These underlie some of the more clinically important
human diseases including
• Heart disease
• Stroke
• Diabetes
• Schizophrenia
Body Mass Index (BMI)
• At least 17 genes interact to control body weight: Genes that affect how much we eat, metabolic rate and fat distribution.
• One of these genes encodes a protein hormone called leptin. Eating stimulates fat cells to secrete leptin. Leptin travels to the brain and signals it to suppress appetite and increase metabolism to digest the food.
• Low levels of leptin indicate starvation, which triggers hunger and decreases the metabolic rate.
• However, body weight is not entirely genetically determined.
Breast cancer risk differs in different countries: Heredity or Environment?
• Adult alcohol use is associated with risk (about a 10% increase for each drink per day)
• All studies have reported impact of early age alcohol use on breast cancer risk
• About twice the risk of breast cancer for women below 35 years
• Alcohol use increases estrogen levels
• Adequate folic acid (B vitamin) may decrease risk in women who have more than 1 drink per day
e.g. Alcohol Use
Breast cancer risk cannot be entirely genetic.
For example, diet, obesity, radiation exposure, and alcohol use influence risk.
High incidence in Japanese men in Japan, but lower in Hawaii
Another example: stomach cancer
Example of multifactor causation: breast cancer
Other inherited cancer syndromes
Syndrome | Primary tumors | Other tumors/traits
Dominant
Familial retinoblastoma | retinoblastoma | osteosarcoma
Hereditary non-polyposis colon cancer | colorectal | many
Familial adenomatous polyposis | colorectal | other GI, jaw, brain
Nevoid basal cell carcinoma | skin | jaw cysts, ovary, fibromas
Familial melanoma | skin | pancreas
Multiple endocrine neoplasia (1/2) | pancreas (1), thyroid (2) | other endocrine
Li-Fraumeni syndrome | sarcoma, breast | brain, leukemias
Recessive
Ataxia telangiectasia | lymphoma | cerebellar ataxia
Bloom’s syndrome | solid tumors | immunodeficiency, small stature
Xeroderma pigmentosum | skin | abnormal pigmentation
Fanconi’s anemia | AML | skeletal abnormalities
[Figure: exposure to mutagenic/carcinogenic compounds, then metabolic activation, then a DNA adduct (shown on G* in the sequence A, T, C, G*). DNA repair leads to no cancer; an unrepaired adduct leads to mutation and then cancer.]
Chemical carcinogens often cause DNA damage, which leads to mutations if the damage is not repaired.
DNA repair defect diseases
Proteins known to bind to BRCA1’s BRCT domains—such as Abraxas, BACH1, and CtIP—are all involved in the DNA repair process.
This meshes with the prevailing view that faulty DNA repair by a mutated BRCA1 results in genomic instability, which ultimately leads to tumorigenesis.
Involvement of BRCA1 in DNA repair
Pennisi E, Science 2007; 318:1842-43.
2007: The Year of GWA Studies
THE NEWEST METHOD FOR GENETIC ANALYSIS: Genome-Wide Association Study (GWAS)
GWAS is an approach that involves rapidly scanning genetic markers across the complete sets of DNA (genomes) of many people to find genetic variations associated with a particular disease.
Such studies are being carried out to find genetic variations that contribute to common, complex diseases, such as asthma, cancer, diabetes, heart disease and mental illnesses.
The Gene Chip (microarray)
Microarray chips
Oligonucleotide probes representing many genes are spotted in an array.
Types of microarrays
Comparative Genomic Hybridization (CGH): for genomic gains and losses or for a change in the number of copies of a particular gene.
Microarray expression analysis: to determine the level of expression of a gene. Reflects subject’s mRNA levels.
SNP or mutation analysis: In this case, gene sequences placed on any given spot within the array will differ from that of other spots, by only one (SNP) or a few specific nucleotides.
http://www.broad.mit.edu/diabetes/scandinavs/type2.html
Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort:
SNP results
Type 2 Diabetes
GWAS have accelerated the identification of type 2 diabetes susceptibility genes. There are now at least 19 loci containing genes that increase risk of type 2 diabetes.
Individually, most of these variants confer a modest risk (odds ratio [OR] = 1.1–1.25) of developing type 2 diabetes.
To date, these approaches have identified only two genes robustly implicated in type 2 diabetes susceptibility: PPARG (peroxisome proliferator-activated receptor-γ) and KCNJ11 (potassium inwardly-rectifying channel J11).
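To get a feel for what an odds ratio of 1.1 to 1.25 means for an individual, the sketch below converts an odds ratio into an absolute risk. The 10% baseline risk is a made-up number used only for illustration, and the function name is hypothetical.

```python
def risk_with_variant(baseline_risk: float, odds_ratio: float) -> float:
    """Convert a baseline risk and an odds ratio into the carrier's risk."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    variant_odds = baseline_odds * odds_ratio
    return variant_odds / (1 + variant_odds)


baseline = 0.10  # hypothetical risk without the variant (assumption)
for odds_ratio in (1.1, 1.25):
    risk = risk_with_variant(baseline, odds_ratio)
    print(f"OR {odds_ratio}: risk rises from {baseline:.1%} to {risk:.1%}")
```

With this assumed baseline, even the largest odds ratio in the quoted range moves the risk by only about two percentage points, which is why single variants of this kind predict little on their own.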
GWAS for SNP Associations with Myocardial Infarction shows a hot region on chromosome 9
Samani N et al., N Engl J Med 2007; 357:443-53.
Klein et al, Science 2005; 308:385-389.
GWAS for Age-Related Macular Degeneration
WTCCC, Nature 2007; 447:661-678.
Wellcome Trust GWAS of Seven Diseases
BIG SURPRISE: Most SNPs associated with disease susceptibility are in introns and intergenic positions!!
Unique Aspects and problems of GWAS
• GWAS permits examination of inherited genetic variability at unprecedented level of resolution.
• GWAS permits "agnostic" genome-wide evaluation.
• Once a genome is measured, it can be related to any trait.
• Most robust associations in GWA studies have not been with genes previously suspected of association with the disease.
• Many associations are in regions that do not harbor genes.
But with more than 500,000 comparisons per study, the potential for false positive results is unprecedented (and expensive!).
Most associations are NOT robust.
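The false positive problem can be made concrete with simple arithmetic. The sketch below uses the 500,000 comparisons mentioned above; the 0.05 significance level and the Bonferroni correction are common conventions assumed here for illustration, not something taken from the slides.

```python
COMPARISONS = 500_000  # SNPs tested in one study (from the slide)
ALPHA = 0.05           # conventional per-test significance level (assumption)

# If no SNP were truly associated, this many would still pass p < 0.05.
expected_false_positives = COMPARISONS * ALPHA
print(expected_false_positives)  # 25000.0

# Bonferroni correction: divide the significance level by the number of tests.
bonferroni_threshold = ALPHA / COMPARISONS
print(bonferroni_threshold)  # 1e-07
```

A per-SNP threshold this strict guards against chance findings, but it also means large studies are needed before a true, modest effect can pass it.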
Further Reading (and some criticisms)
Taft et al., Non-coding RNAs: regulators of disease. J. Pathology 220:126-139, 2010
Roberts et al., The predictive capacity of personal genome sequencing. Science Translational Medicine 2010 10.1126/scitranslmed.3003380 stm.sciencemag.org
Ioannidis et al., A compendium of genome-wide associations for cancer: Critical synopsis and reappraisal. J. Natl. Cancer Inst. 102:846-858, 2010
Bell, Our changing view of the genomic landscape of cancer. J. Pathology 220:231-243, 2010
Vineis and Pearce, Genome-wide association studies may be misinterpreted: genes versus heritability. Carcinogenesis 32:1295-1298, 2011
CDC, Office of Genomics and Disease Prevention www.cdc.gov/genomics/public/famhist.htm
"DNA Interactive" Site from Cold Spring Harbor Labs: http://www.dnai.org/index.htm
Howard Hughes Medical Institute's "Biointeractive" Site http://www.hhmi.org/biointeractive/genomics/microarray.html
Learn Genetics (Genetic Science Learning Center): http://learn.genetics.utah.edu/
National Center for Biotechnology Information: http://www.ncbi.nlm.nih.gov/About/primer/snps.html
http://hapmap.ncbi.nlm.nih.gov/
Animations On How DNA Microarrays Work: http://www.imagecyte.com/animations/array2.html and http://www.bio.davidson.edu/courses/genomics/chip/chip.html
Websites
Some other things besides genes are important
Founder Effects
Occur when a population is established by a small number of people. A mutation in one of the founders becomes prevalent in the resulting population.
Afrikaners (S Africa): Familial hypercholesterolemia, APC, BRCA1/2, Bloom’s syndrome
French Canadians: HED (skin disorder), congenital adrenal hyperplasia
Finns: hMLH1, diastrophic dysplasia
Icelanders: BRCA2
Dutch: BRCA1/2, melanoma
Norwegians: BRCA1
North Africans: Allgrove syndrome
Swedes: BRCA1/2
African Americans: BRCA1
Germans (Black Forest): Von Hippel-Lindau disease
France (Rhone Alps): Hemophilia B
Sicilians: Glycogen storage disease type II
South Italians: CDA-II (anemia)
Steps in Chemical Carcinogenesis (also radiation)
HAPMAP
Testing all of the 10 million common SNPs in a person's chromosomes would be extremely expensive. The development of the HapMap will enable geneticists to take advantage of how SNPs and other genetic variants are organized on chromosomes. Genetic variants that are near each other tend to be inherited together. For example, all of the people who have an A rather than a G at a particular location in a chromosome can have identical genetic variants at other SNPs in the chromosomal region surrounding the A. These regions of linked variants are known as haplotypes.
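The saving that haplotypes offer can be sketched with a toy example. The four haplotypes below are invented for illustration, and the greedy selection is just one simple way to pick "tag" SNPs; it is not presented as the HapMap project's actual method.

```python
def linked(column_a, column_b):
    """True if, in this sample, the allele at one SNP determines the other."""
    pairs = set(zip(column_a, column_b))
    return len(pairs) == len(set(column_a)) == len(set(column_b))


def choose_tag_snps(haplotypes):
    """Greedily pick SNP positions so every SNP is linked to a chosen one."""
    columns = list(zip(*haplotypes))  # one tuple of alleles per SNP position
    tags = []
    for position, column in enumerate(columns):
        if not any(linked(column, columns[tag]) for tag in tags):
            tags.append(position)
    return tags


# Hypothetical haplotypes: 4 chromosomes typed at 4 SNP positions.
haplotypes = ["ACGT", "ACGA", "GTAT", "GTAA"]
print(choose_tag_snps(haplotypes))  # [0, 3]
```

In this toy sample the first three SNPs always travel together, so typing position 0 reveals positions 1 and 2 as well; only 2 of the 4 SNPs need to be measured.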
Diseases and Traits with Published GWA Studies (n = 76, 11/17/08)
• Macular Degeneration
• Exfoliation Glaucoma
• Lung Cancer
• Prostate Cancer
• Breast Cancer
• Colorectal Cancer
• Bladder Cancer
• Neuroblastoma
• Melanoma
• TP53 Cancer Predispos’n
• Chr. Lymph. Leukemia
• Inflamm. Bowel Disease
• Celiac Disease
• Gallstones
• Irritable Bowel Syndrome
• QT Prolongation
• Coronary Disease
• Coronary Spasm
• Atrial Fibrillation/Flutter
• Stroke
• Subarachnoid Hemorrhage
• Intracranial Aneurysm
• Hypertension
• Hypt. Diuretic Response
• Peripheral Artery Disease
• Syst. Lupus Erythematosus
• Sarcoidosis
• Pulmonary Fibrosis
• Psoriasis
• HIV Viral Setpoint
• Childhood Asthma
• Type 1 Diabetes
• Type 2 Diabetes
• Diabetic Nephropathy
• End-St. Renal Disease
• Obesity, BMI, Waist, IR
• Height
• Osteoporosis
• Osteoarthritis
• Male Pattern Baldness
• F-Cell Distribution
• Fetal Hgb Levels
• C-Reactive Protein
• ICAM-1
• Total IgE Levels
• Uric Acid Levels, Gout
• Protein Levels
• Vitamin B12 Levels
• Recombination Rate
• Pigmentation
• Lipids and Lipoproteins
• Warfarin Dosing
• Ximelagatran Adv. Resp.
• Parkinson Disease
• Amyotrophic Lat. Sclerosis
• Multiple Sclerosis
• MS Interferon-β Response
• Prog. Supranuclear Palsy
• Alzheimer’s Disease in ε4+
• Cognitive Ability
• Memory
• Hearing
• Restless Legs Syndrome
• Nicotine Dependence
• Methamphetamine Depend.
• Neuroticism
• Schizophrenia
• Sz. Iloperidone Response
• Bipolar Disorder
• Family Chaos
• Narcolepsy
• Attention Deficit Hyperactivity
• Personality Traits
• Rheumatoid Arthritis
• RA Anti-TNF Response