human population genomics man, woman, birth, death , infinity, plus
![Page 1: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/1.jpg)
Human Population Genomics
Man, Woman, Birth, Death,
Infinity, Plus
Altruism, Cheap Talks,
Bad Behavior, ¥ Money, God and
Diversity on Steroids
![Page 2: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/2.jpg)
Jack Schwartz (1930 – 2009)
![Page 3: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/3.jpg)
“Damn the Human Genomes. Small populations; Genes too distant; Pestered with duplications; Feeble contrivance; Could make a better one myself!”
Lord Jeffrey (misattributed; badly paraphrased)
![Page 4: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/4.jpg)
Small Populations
• Non-equilibrium Models
• Population Bottlenecks
• Not Well-mixed
• Migration/Colonization Patterns
• Catastrophic Infections
  • Heterozygous Advantages
![Page 5: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/5.jpg)
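Wright-Fisher Process
Mickey (Coalescent talk)
[Figure: N individuals (vertical axis) followed across generations (horizontal axis). A mutation introduces a derived allele alongside the ancestral allele; the derived allele later goes extinct.]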
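The drift dynamics in the Wright-Fisher picture can be sketched in a few lines. This is a minimal haploid illustration, assuming no selection and no further mutation; the function name `wright_fisher` and its parameters are hypothetical and not taken from the talk.

```python
import random

def wright_fisher(n_individuals, generations, seed=0):
    """Follow the copy number of one new derived allele under pure drift."""
    rng = random.Random(seed)
    count = 1  # a single mutation appears in generation 0
    trajectory = [count]
    for _ in range(generations):
        freq = count / n_individuals
        # Every individual of the next generation draws its parent at random,
        # so it carries the derived allele with probability `freq`.
        count = sum(rng.random() < freq for _ in range(n_individuals))
        trajectory.append(count)
        if count == 0 or count == n_individuals:
            break  # extinction or fixation of the derived allele
    return trajectory

if __name__ == "__main__":
    print(wright_fisher(n_individuals=20, generations=200))
```

Most runs end in extinction of the derived allele within a few generations, which is the outcome drawn on the slide.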
![Page 6: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/6.jpg)
Moran Process
[Figure: individual births and deaths drawn along a time axis.]
• Overlapping generations
• Distribution of time to replication
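For contrast with the generation-by-generation picture, here is a minimal sketch of a Moran-style update, in which each event replaces one individual at a time so that generations overlap. It assumes a neutral haploid population and counts events rather than modeling the waiting-time distribution; the name `moran` and its parameters are illustrative only.

```python
import random

def moran(n_individuals, max_events, seed=0):
    """Follow one derived allele through single birth/death events."""
    rng = random.Random(seed)
    count = 1  # one derived copy to start with
    trajectory = [count]
    for _ in range(max_events):
        freq = count / n_individuals
        parent_is_derived = rng.random() < freq  # individual chosen to reproduce
        victim_is_derived = rng.random() < freq  # individual chosen to die
        count += int(parent_is_derived) - int(victim_is_derived)
        trajectory.append(count)
        if count == 0 or count == n_individuals:
            break  # extinction or fixation
    return trajectory

if __name__ == "__main__":
    print(moran(n_individuals=20, max_events=5000))
```

Each event changes the derived-allele count by at most one, unlike the Wright-Fisher update, which resamples the whole population at once.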
![Page 7: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/7.jpg)
Forces in Population Genetics
How to understand the forces that produce and maintain inherited genetic variation.
Forces:
• Mutation
• Recombination
• Natural Selection
• Population Structure/Migration
• Random birth/death (drift)
![Page 8: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/8.jpg)
![Page 9: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/9.jpg)
Genes Too Distant
• 20,000 Genes (estimate in the 1980s: 120,000)
  • Occurring about every 150 Kb
• Many more functional ncRNA
  • snoRNA, siRNA, piRNA, etc.
• Uncharacterized
![Page 10: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/10.jpg)
Y
“From a gene’s point of view, reshuffling is a great restorative…
“The Y, in its solitary state disapproves of such laxity. Apart from small parts near each tip which line up with a shared section of the X, it stands aloof from the great DNA swap. Its genes, such as they are, remain in purdah as the generations succeed. As a result, each Y is a genetic republic, insulated from the outside world. Like most closed societies it becomes both selfish and wasteful. Every lineage evolves an identity of its own which, quite often, collapses under the weight of its own inborn weaknesses.
“Celibacy has ruined man’s chromosome.” Steve Jones, Y: The Descent of Men, 2002.
![Page 11: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/11.jpg)
DAZ locus on Y Chromosome
![Page 12: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/12.jpg)
Optical Mapping
Cells gently lysed to extract genomic DNA
DNA captured in parallel arrays of long single DNA molecules using microfluidic device
Genomic DNA, captured as single DNA molecules produced by random breakage of intact chromosomes
1. Capture and immobilize whole genomes as massive collections of single DNA molecules
![Page 13: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/13.jpg)
Overlapping single molecule maps are aligned to produce a map assembly covering an entire chromosome
⌘⌘⌘
![Page 14: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/14.jpg)
⌘⌘⌘⌘
Error sources:
• Sizing Error (Bernoulli labeling, absorption cross-section, PSF)
• Partial Digestion
• False Optical Sites
• Orientation
• Spurious molecules, Optical chimerism, Calibration
Image of a restriction-enzyme-digested YAC clone: YAC clone 6H3, derived from human chromosome 11, digested with the restriction endonucleases Eag I and Mlu I, stained with a fluorochrome and imaged by fluorescence microscopy.
![Page 15: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/15.jpg)
⌘⌘⌘⌘⌘
Various combinations of error sources lead to NP-hard Problems
![Page 16: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/16.jpg)
Pestered with Duplications
• Complex Genome Structures
  • Segmental Duplications
  • Many types of Polymorphisms (SNPs, CNVs, SVs, etc.)
• Models of Genome Dynamics
  • GOD (Genome Organizing Devices)
• Models of Coalescence
![Page 17: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/17.jpg)
Segmental Duplications
Segmental duplications have been found to be associated with genomic disorders:
• Deletions: Williams-Beuren syndrome
• Duplications: Charcot-Marie-Tooth disease type 1A
• Inversions: Haemophilia A
• Translocations: Derivative 22 [der(22)] syndrome
Segmental duplications may be related to cancer development by causing copy number fluctuations:
• Duplication of MYC in lung cancer, and ERBB2 in breast cancer.
![Page 18: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/18.jpg)
Recent Segmental Duplications: Human
From [Bailey, et al. 2002]
• 3.5% ~ 5% of the human genome is found to contain segmental duplications, with length > 5 or 1 kb, identity > 90%.
  • August, 2001 assembly [Bailey, et al. 2002].
  • April, 2003 assembly [Cheung, et al. 2003].
• These duplications are estimated to have emerged about 40 Mya under the neutral assumption.
• The duplications are mostly interspersed (non-tandem), and happen both inter- and intra-chromosomally.
![Page 19: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/19.jpg)
Recent Segmental Duplications: Mouse
From [Cheung, et al. 2003]
• 1.2% of the mouse genome is found to contain segmental duplications, with length > 5 kb, identity > 90%.
  • February, 2003 mouse assembly [Cheung, et al. 2003].
• These duplications are estimated to have emerged about 25 Mya under the neutral assumption.
• The duplications happen both inter- and intra-chromosomally.
![Page 20: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/20.jpg)
Duplication Flanking Sequences
What are the molecular mechanisms that caused the recent segmental duplications in the human and mouse genomes?
• Thermodynamic instability in the DNA sequences;
• Recombination between homologous repeat elements;
• Other unknown mechanisms.
![Page 21: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/21.jpg)
Thermodynamics
[Figure: data set versus control set, over windows from -512 bp to +512 bp around the 5’-breakpoint and the 3’-breakpoint of the duplicated region.]
![Page 22: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/22.jpg)
⌘
[Figure: frequencies of the repeats in the data set versus the control set, by repeat family.]
SINE divergence: MIR 30%; FLAM/FRAM 20%; Alu-Jo 14%; Alu-Jb 14%; Alu-Sc~Sx 8%; Alu-Y 5%; Alu-Ya~Yb >1%.
LINE divergence: L2 30%; L1M4 22%; L1M3 21%; L1M2 19%; L1M1 18%; L1P5 12%; L1P4 11%; L1P3 7%; L1P2 4%; L1P1 2%; L1Hs <1%.
![Page 23: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/23.jpg)
The Model
[Figure: two routes to a duplication: (1) duplication by recombination between repeats; (2) duplication by recombination between other repeats or by other mechanisms. Insertion and deletion-or-mutation events are drawn between the flanking states f - -, f ++ and f + -, followed by mutation accumulation in the duplicated sequences.]
![Page 24: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/24.jpg)
The Mathematical Model
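[Figure: state diagram under the hypotheses H0 and H1, drawn against time after duplication. Entry states: h0 (split into h0++, h0+- and h0--) and h1 (split into h1++ and h1--). Flanking states f++, f+- and f-- are repeated across the divergence bins 0 ≤ d < ε, ε ≤ d < 2ε, …, (k-1)ε ≤ d < kε. Transition labels: α; 2β and γ; 2γ and β/2. Self-loop labels: 1-α-2β, 1-α-2γ and 1-α-β/2-γ.]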
h1: proportion of duplications by repeat recombination;
h1++: proportion of duplications by recombination of the specific repeat;
h1- - : proportion of duplications by recombination of other repeats;
h0: proportion of duplications by other repeat-unrelated mechanism;
h0++: proportion of h0 with common specific repeat in the flanking regions;
h0+-: proportion of h0 with no common specific repeat in the flanking regions;
h0- -: proportion of h0 with no specific repeat in the flanking regions;
α: mutation rate in duplicated sequences;
β: insertion rate of the specific repeat;
γ: mutation rate in the specific repeat;
d: divergence level of duplications;
ε: divergence interval of duplications.
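One way to read the parameter definitions is as a discrete-time Markov chain over flanking states and divergence bins. The sketch below is an assumption about how the diagram's labels fit together, not the authors' code: f-- moves to f+- with probability 2β, f+- moves to f++ with β/2 or back to f-- with γ, f++ moves to f+- with 2γ, and every state advances one divergence bin with probability α. The function names and the absorbing last bin are illustrative choices.

```python
import numpy as np

STATES = ("f--", "f+-", "f++")

def within_bin_matrix(alpha, beta, gamma):
    """Transitions that keep the duplication in its current divergence bin.

    Each row sums to 1 - alpha; the remaining alpha is the move to the next bin.
    """
    return np.array([
        [1 - alpha - 2 * beta, 2 * beta, 0.0],               # from f--
        [gamma, 1 - alpha - beta / 2 - gamma, beta / 2],     # from f+-
        [0.0, 2 * gamma, 1 - alpha - 2 * gamma],             # from f++
    ])

def evolve(initial, alpha, beta, gamma, k, steps):
    """Propagate a distribution over (divergence bin, flanking state)."""
    within = within_bin_matrix(alpha, beta, gamma)
    dist = np.zeros((k, 3))
    dist[0] = initial  # all duplications start in the bin 0 <= d < eps
    for _ in range(steps):
        new = dist @ within            # flanking-state changes, same bin
        advanced = alpha * dist        # mutation in the duplicated sequence
        new[1:] += advanced[:-1]       # move to the next divergence bin
        new[-1] += advanced[-1]        # assumption: the last bin absorbs
        dist = new
    return dist

if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the talk.
    result = evolve([0.0, 0.0, 1.0], alpha=0.05, beta=0.01, gamma=0.02, k=5, steps=50)
    for d, row in enumerate(result):
        print(d, dict(zip(STATES, row.round(4))))
```

Under this reading, the proportions f--, f+- and f++ observed per divergence bin are what the model fitting would compare against the data.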
![Page 25: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/25.jpg)
Model Fitting
[Figure: f - -, f ++ and f + - plotted against diversity, in one panel for Alu and one panel for L1.]
The model parameters (αAlu, βAlu, γAlu, αL1, βL1, γL1) are estimated from the reported mutation and insertion rates in the literature.
The relative strengths of the alternative hypotheses can be estimated by model fitting to the real data.
h1++Alu ≈ 0.3; h1++ L1 ≈ 0.35.
![Page 26: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/26.jpg)
Mer Frequencies
[Figure: tracks along Chr1: Ns, ATs, Reps, CDs, ΔG, DupCopy# and MerFreq; labeled repeats: MER57A, L1P.]
![Page 27: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/27.jpg)
Copy Number Variation Data
HapMap data:
• China: 46 people
• Utah (European origin): 90 people
• Yoruba: 89 people
• Japan: 45 people
Made available to us by Drs. Evan Eichler and Andy Sharp
![Page 28: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/28.jpg)
CNVs in Unique regions
![Page 29: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/29.jpg)
CNVs in Unique regions

| | Yoruba | Japanese | Chinese | CEPH |
|---|---|---|---|---|
| No polymorphism | 810 | 817 | 817 | 799 |
| Amplifications only | 43 | 43 | 46 | 55 |
| Deletions only | 46 | 37 | 36 | 44 |
| Mixed | 1* | 3=2+1* | 1* | 2=1+1* |
![Page 30: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/30.jpg)
CNVs in SD regions
![Page 31: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/31.jpg)
CNVs in SD regions

| | Yoruba | Japanese | Chinese | CEPH |
|---|---|---|---|---|
| No polymorphism | 786 | 794 | 785 | 741 |
| Amplifications only | 124 | 135 | 141 | 129 |
| Deletions only | 101 | 86 | 101 | 141 |
| Mixed | 43 | 40 | 27 | 44 |

Unique and SD regions show completely different behavior of CNVs!
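The contrast between the two tables is easier to see as fractions. The short sketch below recomputes, per population, the share of regions that are polymorphic and the share that are "mixed" (both amplifications and deletions). The counts are copied from the two slides, with the starred entries in the unique-region table entered as their totals; the variable and function names are illustrative.

```python
# Counts per population: (no polymorphism, amplifications only, deletions only, mixed)
UNIQUE = {
    "Yoruba": (810, 43, 46, 1),
    "Japanese": (817, 43, 37, 3),
    "Chinese": (817, 46, 36, 1),
    "CEPH": (799, 55, 44, 2),
}
SD = {
    "Yoruba": (786, 124, 101, 43),
    "Japanese": (794, 135, 86, 40),
    "Chinese": (785, 141, 101, 27),
    "CEPH": (741, 129, 141, 44),
}

def fractions(counts):
    """Return (polymorphic fraction, mixed fraction) for one population."""
    none, _amp, _dele, mixed = counts
    total = sum(counts)
    return (total - none) / total, mixed / total

if __name__ == "__main__":
    for pop in UNIQUE:
        u_poly, u_mixed = fractions(UNIQUE[pop])
        s_poly, s_mixed = fractions(SD[pop])
        print(f"{pop:9s} unique: {u_poly:.1%} polymorphic, {u_mixed:.2%} mixed | "
              f"SD: {s_poly:.1%} polymorphic, {s_mixed:.2%} mixed")
```

In every population the SD regions are polymorphic roughly two to three times as often as the unique regions, and the mixed category is at least an order of magnitude more frequent.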
![Page 32: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/32.jpg)
Distance-dependent recombination
The chance of recombination depends on the distance between Allele A and its copy
![Page 33: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/33.jpg)
Simulation (probabilistic model)
![Page 34: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/34.jpg)
Observations & Conclusions
A mutation rate of 0.0001 and a recombination rate of 0.001 in SD regions give the best fit to the observed data.
Single mutations cannot explain the observed data; convergence via recombination can.
Evolution-by-Duplication (EBD) appears to play a crucial role in evolution and molds the genetic circuitry in a rather constrained way, before it is subject to selection pressure.
![Page 35: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/35.jpg)
Feeble Contrivance
• GWAS (Genome-Wide Association Studies)
  • Common Variants vs. Rare Variants
  • Haplotype Phasing/Linkage Analysis
• Poor Experiment Design
  • Reference Sequences
  • Genotypic vs. Haplotypic References
• Weak Technologies
![Page 36: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/36.jpg)
Common vs. Rare Disease Variants
From Ionita-Laza (2009)
There are two disease models:
• CDCV: common disease, common variants
• CDRV: common disease, rare variants
The current genome-wide association studies only consider common variants (frequency at least 5%).
• Feasible with available resources
• The common loci identified so far have small effects (ORs 1.1-1.5) and only explain a small percentage of the estimated heritability.
Rare susceptibility variants are expected to play an important role:
• population genetics theory (Pritchard, 2001)
• empirical evidence (BMI, blood pressure, autism, Mendelian diseases, etc.)
![Page 37: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/37.jpg)
Effect Size Distribution
![Page 38: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/38.jpg)
Capture-Recapture Model
Suppose we have sequence data on Nind individuals in a genomic region.
• An individual shows variation at a position if the corresponding allele is different from the ancestral one.
• A position is variable (is a variant) if at least one individual in the dataset shows variation at that position.
• Let xs be the number of individuals with variation at position s; for an observed variant, xs > 0.
The question: what is N, the total (unknown) number of variants in the region?
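To make the question about N concrete, here is a small sketch using the bias-corrected Chao1 estimator, a classical capture-recapture lower bound on the total number of classes. It is a stand-in chosen for illustration and is not claimed to be the estimator used in the cited work; the function name `chao1` and the toy counts are hypothetical.

```python
from collections import Counter

def chao1(x):
    """Bias-corrected Chao1 estimate of the total number of variants.

    x holds, for each observed variant position s, the number x_s of
    individuals carrying the variation (so every entry is > 0).
    """
    counts = [c for c in x if c > 0]
    freq = Counter(counts)
    observed = len(counts)         # variants seen at least once
    f1, f2 = freq[1], freq[2]      # singletons and doubletons
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))

if __name__ == "__main__":
    toy_counts = [1, 1, 1, 2, 2, 3, 5]  # seven observed variants
    print(chao1(toy_counts))            # 8.0: about one variant still unseen
```

The idea is that many singletons relative to doubletons signal many variants not yet observed.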
![Page 39: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/39.jpg)
One can estimate the following:
• Δ(t) = number of NEW variants expected to be found in a FUTURE dataset of size t · Nind, where t is a multiplier of the initial dataset size Nind.
• Δf(t) = number of new variants with frequency at least f.
. . .
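A classical way to estimate a quantity of the form Δ(t) is the Good-Toulmin series, Δ(t) = Σ_k (-1)^(k+1) t^k f_k, where f_k is the number of variants seen in exactly k individuals. The sketch below implements that series as an illustration; it is not claimed to be the estimator behind the results shown in the talk, it is only reliable for 0 ≤ t ≤ 1, and the function name and toy counts are hypothetical.

```python
def good_toulmin(freq_of_freq, t):
    """Expected number of new variants in a further sample of t * Nind individuals.

    freq_of_freq maps k -> f_k, the number of variants observed in exactly
    k individuals of the current dataset. Intended for 0 <= t <= 1.
    """
    return sum((-1) ** (k + 1) * t ** k * f_k for k, f_k in freq_of_freq.items())

if __name__ == "__main__":
    toy = {1: 4, 2: 1}             # four singletons, one doubleton
    print(good_toulmin(toy, 0.5))  # 1.75 new variants expected
    print(good_toulmin(toy, 1.0))  # 3.0 new variants expected
```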
![Page 40: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/40.jpg)
ENCODE dataset
Ten 500 Kb genomic regions were sequenced in several unrelated DNA samples:
• 8 Yoruba (YRI)
• 16 CEPH European (CEPH)
• 7 Han Chinese (CHB)
• 8 Japanese (JPT)
To make results comparable across the four populations (YRI, CEPH, CHB and JPT), they considered only 7 of the sequenced individuals for each dataset.
![Page 41: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/41.jpg)
ENCODE - Δf(t)
From Ionita-Laza et al. 2009
![Page 42: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/42.jpg)
How to Make a Better Human?
• Debugging a human better
• Sequencing a genome
• Sequencing a population
![Page 43: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/43.jpg)
S ★ M ★ A ★ S ★ H
Single Molecule Approach to Sequencing-by-Hybridization
![Page 44: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/44.jpg)
S*M*A*S*H
Sequence a human-size genome of about 6 Gb, including both haplotypes.
Integrate:
• Optical Mapping (Ordered Restriction Maps),
• Hybridization (of short nucleobase probes [PNA or LNA oligomers] with dsDNA on a surface), and
• Positional Sequencing by Hybridization (efficient polynomial time algorithms to solve “localized versions” of the PSBH problems)
![Page 45: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/45.jpg)
⌘
Genomic DNA is carefully extracted
Fig 1
![Page 46: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/46.jpg)
⌘⌘
LNA probes of length 6 – 8 nucleotides are hybridized to dsDNA (double-stranded genomic DNA)
The modified DNA is stretched on a 1” x 1” chip.
Fig 2
![Page 47: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/47.jpg)
⌘⌘⌘
DNA adheres to the surface along the channels and stretches out.
The molecules range from 0.3 to 3 million base pairs in length.
Bright emitters are attached to the probes and imaged (Fig 3).
Fig 3
![Page 48: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/48.jpg)
⌘⌘⌘⌘
A restriction enzyme breaks the DNA at specific sites.
The cut fragments of DNA relax like entropic springs, leaving small visible gaps.
Fig 4
![Page 49: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/49.jpg)
⌘⌘⌘⌘⌘
The DNA is then stained with a fluorogen (Fig 5) and reimaged.
The two images are combined in a composite image suggesting the locations of a specific short word (the probe sequence) within the context of a pattern of restriction sites.
Fig 5
![Page 50: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/50.jpg)
⌘⌘⌘⌘⌘⌘
The integrated intensity measures the length of the DNA fragments.
The bright emitters on the probes provide a profile for the locations of the probes.
Fig 6
The restriction sites are represented by tall rectangles and the probe sites by small circles.
![Page 51: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/51.jpg)
⌘⌘⌘⌘⌘⌘⌘
These steps are repeated for all possible probe compositions (modulo reverse complementarity).
Software assembles the haplotypic ordered restriction maps with approximate probe locations superimposed on the map.
[Fig 7: the overlapping words ATAT, TATC, ATCA, TCAT and CATA tile the sequence ATATCATAT.]
![Page 52: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/52.jpg)
S*M*A*S*H
Local clusters of overlapping words are combined by our PSBH (positional sequencing by hybridization) algorithm.
[Fig 8: ATAT, TATC, ATCA, TCAT, CATA → ATATCATAT]
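The figure's toy example can be reproduced with a few lines of code. This is only an illustration of why positions help, not the PSBH algorithm itself: it assumes error-free words with known relative order, and the function name `assemble` and the position values are hypothetical.

```python
def assemble(located_words):
    """Merge k-letter words, ordered by map position, into one sequence.

    located_words: iterable of (position, word) pairs, where consecutive
    words are expected to overlap by k - 1 letters.
    """
    ordered = [word for _pos, word in sorted(located_words)]
    sequence = ordered[0]
    for word in ordered[1:]:
        overlap = len(word) - 1
        if sequence[-overlap:] != word[:overlap]:
            raise ValueError(f"{word} does not overlap the current assembly")
        sequence += word[-1]  # each new word contributes one new letter
    return sequence

if __name__ == "__main__":
    # ATAT occurs twice; its two map positions tell the occurrences apart.
    words = [(3, "TCAT"), (0, "ATAT"), (5, "ATAT"), (1, "TATC"), (4, "CATA"), (2, "ATCA")]
    print(assemble(words))  # ATATCATAT
```

Without positions, the repeated word ATAT would make the word set alone ambiguous; the approximate locations on the restriction map are what localize the problem.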
![Page 53: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/53.jpg)
Probe Map (lambda DNA)
![Page 54: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/54.jpg)
Final Probe Map
Consensus map with 2 probe locations, at 14.8% and 52.4% of the DNA length.
In close agreement with the correct map, 50.2% and 85.7% (known from the sequence), which read from the opposite end of the molecule is 14.3% and 49.8%.
Implied probe hybridization rate = 42%, significantly better than the needed 30%.
![Page 55: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/55.jpg)
Four AFM images of lambda DNA with PNA probes (scale bar: 500 nm)
![Page 56: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/56.jpg)
Combinatorial Structure
![Page 57: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/57.jpg)
Discretization
![Page 58: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/58.jpg)
Prediction
The probability of successfully computing the correct restriction map as a function of the number of cuts in the map and number of molecules used in creating the map…
![Page 59: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/59.jpg)
Gentig: Bayesian Approach
![Page 60: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/60.jpg)
Bayesian Model
![Page 61: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/61.jpg)
Robustness
BAC clones with 6-cutters: average clone size = 160 Kb; average fragment size = 4 Kb; average number of cut sites = 40.
Parameters:
• Digestion rate can be as low as 10%
• Orientation of DNA need not be known
• 40% foreign DNA
• 85% DNA partially broken
• Relative sizing error up to 30%
• 30% spurious randomly located cuts…
![Page 62: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/62.jpg)
Single Molecule Haplotyping: Candida albicans
The left end of chromosome 1 of the common fungus Candida albicans (being sequenced by Stanford).
Three polymorphisms:
(A) Fragment 2 is of size 41.19 kb (top) vs 38.73 kb (bottom).
(B) The 3rd fragment, of size 7.76 kb, is missing from the top haplotype.
(C) The large fragment in the middle is of size 61.78 kb vs 59.66 kb.
![Page 63: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/63.jpg)
Problem to Solve…
Given probe maps of some small region of the genome for all N-bp hybridization probes (e.g., all 2080 probes of 6 bp),
with known error rates (false positives, false negatives and sizing errors),
can we reconstruct the complete sequence?
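The figure of 2080 probes follows from counting 6-mers modulo reverse complementarity: of the 4^6 = 4096 words, 64 are their own reverse complement, so (4096 + 64) / 2 = 2080 distinct probes remain. The short check below enumerates them directly; the helper names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(word):
    return word.translate(COMPLEMENT)[::-1]

def canonical_probes(k):
    """All k-mers, keeping one representative per reverse-complement pair."""
    return {
        min(word, reverse_complement(word))
        for word in map("".join, product("ACGT", repeat=k))
    }

if __name__ == "__main__":
    print(len(canonical_probes(6)))  # 2080
```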
![Page 64: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/64.jpg)
Basic reconstruction algorithm
• Keep track of multiple sequence assemblies.
• Initialize with all possible 5-bp sequences.
• Try all 4 possible extensions of each sequence.
• Check if the probe is present in the corresponding map; if not, add a penalty score to the sequence involved.
• Periodically delete sequences with high penalty.
• Stop when the missing probe rate jumps significantly, from the False Negative rate (2%) to (100% - false extension rate) = 55%.
• Return the highest scoring sequence.
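The steps above can be sketched as a small beam search. This is a toy version under simplifying assumptions that are not in the talk: the probe map is error-free, strands are not folded together, the search stops at a known target length instead of watching the missing-probe rate, and pruning happens after every extension. The names `probe_map`, `reconstruct`, `tol` and `beam` are hypothetical.

```python
from itertools import product

def probe_map(sequence, k=6):
    """Toy probe map: each k-mer mapped to the positions where it starts."""
    sites = {}
    for i in range(len(sequence) - k + 1):
        sites.setdefault(sequence[i:i + k], []).append(i)
    return sites

def reconstruct(sites, length, k=6, tol=0, beam=64):
    """Extend candidate assemblies base by base, penalizing missing probes."""
    # Initialize with all possible (k-1)-bp sequences, each with zero penalty.
    assemblies = [(0, "".join(p)) for p in product("ACGT", repeat=k - 1)]
    while len(assemblies[0][1]) < length:
        extended = []
        for penalty, seq in assemblies:
            for base in "ACGT":          # try all 4 possible extensions
                cand = seq + base
                pos = len(cand) - k      # where the newest k-mer starts
                hit = any(abs(p - pos) <= tol for p in sites.get(cand[-k:], ()))
                extended.append((penalty + (0 if hit else 1), cand))
        extended.sort(key=lambda item: item[0])
        assemblies = extended[:beam]     # drop high-penalty assemblies
    return assemblies[0]                 # lowest-penalty assembly

if __name__ == "__main__":
    truth = "ACGTTGCAAGGCTTACCGATAGCTAGGATC"
    penalty, sequence = reconstruct(probe_map(truth), len(truth))
    print(penalty, sequence == truth)
```

With exact positions (`tol=0`) only the true sequence keeps a zero penalty; a nonzero `tol` mimics sizing error and is where ambiguities start to appear.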
![Page 65: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/65.jpg)
Anomalies
Irresolvable ambiguities: from assemblies based on 6-bp probes
Error pattern: s w sRC
Correct pattern: s wRC sRC
s = tcgcc (any 5 bases)
sRC = ggcga (reverse complement of s)
w = CCCCTAAC (any short sequence under 50 bp)
wRC = GTTAGGGG (reverse complement of w)
Assembly: …tcgccCCCCTAACggcga…
           |||||        |||||
Correct:  …tcgccGTTAGGGGggcga…
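Why this ambiguity cannot be resolved from 6-bp probes alone can be checked directly: the erroneous pattern s w sRC is exactly the reverse complement of the correct pattern s wRC sRC, so the two contain the same probes once strands are not distinguished. The check below uses the slide's example strings; the helper names are illustrative.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def canonical_spectrum(seq, k=6):
    """Multiset of k-mers, each folded together with its reverse complement."""
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(min(w, reverse_complement(w)) for w in words)

s, w = "TCGCC", "CCCCTAAC"
assembly = s + w + reverse_complement(s)                     # error pattern s w sRC
correct = s + reverse_complement(w) + reverse_complement(s)  # correct pattern s wRC sRC

if __name__ == "__main__":
    print(reverse_complement(assembly) == correct)                        # True
    print(canonical_spectrum(assembly) == canonical_spectrum(correct))    # True
```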
![Page 66: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/66.jpg)
Directed Eulerian Graph
[Figure: directed graph whose labels appear as complementary strand pairs: AATT/TTAA, ATTC/TAAG, TTCG/AAGC, ATCG/TAGC.]
![Page 67: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/67.jpg)
⌘
Mixing ‘solid’ bases with ‘wild-card’ bases:
• E.g., xx-x-x-xx (9-mers) or xxx--x--x--xxx (14-mers)
An ‘inert’ base:
• Universal: in terms of its ability to form base pairs with the other natural DNA/RNA bases.
• Examples: the naturally occurring base hypoxanthine, as its ribo- or 2'-deoxyribonucleoside; 2'-deoxyisoinosine; 7-deaza-2'-deoxyinosine; 2-aza-2'-deoxyinosine
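A gapped probe reads only the solid positions of its pattern and ignores the wild-card positions. The sketch below applies a pattern such as xx-x-x-xx to a sequence to show which words such probes would report; it is an illustration of the pattern notation only, and the function name `gapped_words` is hypothetical.

```python
def gapped_words(sequence, pattern):
    """Slide a gapped pattern along a sequence.

    'x' marks a solid base that is read; '-' marks a wild-card position
    that pairs with any base and is reported as '-'.
    """
    span = len(pattern)
    return [
        "".join(
            sequence[start + j] if mark == "x" else "-"
            for j, mark in enumerate(pattern)
        )
        for start in range(len(sequence) - span + 1)
    ]

if __name__ == "__main__":
    for pattern in ("xx-x-x-xx", "xxx--x--x--xxx"):
        print(pattern, "span:", len(pattern), "solid bases:", pattern.count("x"))
    print(gapped_words("ACGTACGTAC", "xx-x-x-xx"))  # ['AC-T-C-TA', 'CG-A-G-AC']
```

The 9-mer pattern spans nine positions with only six solid bases, so the number of distinct probes stays at the 6-mer level while each probe covers a longer stretch of sequence.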
![Page 68: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/68.jpg)
Simulation Results
[Figure, two panels with logarithmic vertical axes. UNGAPPED: errors per 10 kb sequence (1 to 10000) versus bases per probe (5 to 8). GAPPED: errors per 10 kb sequence (0.01 to 1000) versus gapped bases per probe (0 to 5), with 6 solid bases.]
![Page 69: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/69.jpg)
1000 Rupees Genome
• 22.67 US$ for 6 billion bases
• 135 billion US$ for the entire human population
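The two figures on this slide are consistent with each other, as a quick back-of-the-envelope check shows. The implied exchange rate and population below are derived from the slide's numbers only and are not stated in the talk.

```python
cost_per_genome_usd = 22.67     # slide: price of one 6-billion-base genome
total_cost_usd = 135e9          # slide: cost for the entire human population
price_in_rupees = 1000          # slide title

implied_population = total_cost_usd / cost_per_genome_usd
implied_rupees_per_usd = price_in_rupees / cost_per_genome_usd

if __name__ == "__main__":
    print(f"implied population: {implied_population / 1e9:.2f} billion")
    print(f"implied exchange rate: {implied_rupees_per_usd:.1f} rupees per US$")
```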
![Page 70: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/70.jpg)
Who we are…
Population: David Albers (Columbia), Eric Aslakson (NYU), Mickey Atwal (CSHL), Ivan Iossifov (CSHL), Hossein Khiabanian (Columbia), Samantha Kleinberg (NYU), Partha Mitra (CSHL), Michaela Oswald (CSHL), Raul Rabadan (Columbia), Vladimir Trifonov (Columbia), Daniel Valente (CSHL), Chris Wiggins (Columbia)
Polymorphisms: Iuliana Ionita-Laza (Harvard), Antonina Mitrofanova (NYU), Joey Zhao (Princeton)
SMASH: TS Anantharaman (OpGen), Charles Cantor (Sequenom), Vladimir Demidov (BU), Pierre Franquin (NYU), Alex Lim (Ex-NYU), Toto Paxia (Ex-NYU), Jason Reed (UCLA), Andrew Sundstrom (NYU)
SUTTA: Giuseppe Narzisi (NYU), Alessio Narzisi (NYU/Catania)
![Page 71: Human Population Genomics Man, Woman, Birth, Death , Infinity, Plus](https://reader036.vdocument.in/reader036/viewer/2022062423/56814914550346895db64bd6/html5/thumbnails/71.jpg)
“Beware prejudices.
“They are like rats, and men's minds are like traps; prejudices get in easily, but it is doubtful if they ever get out.”
Lord Jeffrey