Post on 19-Dec-2015
Evaluation of Placement Techniques for DNA Probe Array Layout
Andrew B. Kahng (1), Ion I. Mandoiu (2), Sherief Reda (1), Xu Xu (1), Alex Zelikovsky (3)
(1) CSE Department, University of California at San Diego
(2) CSE Department, University of Connecticut
(3) CS Department, Georgia State University
Outline
Introduction to DNA microarrays and border minimization challenges
Previous probe placement algorithm
Partitioning-based probe placement
Comparison of probe placement heuristics
Quantified sub-optimality of placement
Conclusions and future research directions
Introduction to DNA Probe Arrays
DNA arrays are composed of probes, where each probe is a sequence of 25 nucleotides
Images courtesy of Affymetrix.
Optical scanning
Laser activation
Tagged fragments flushed over array
Probe Synthesis
[Figure, three slides: a 3 x 3 array with probes CG, AC, G / AC, ACG, AG / CG, AG, C is synthesized using the nucleotide deposition sequence ACG. Mask 1 exposes the sites that receive A, Mask 2 the sites that receive C, and Mask 3 the sites that receive G.]
A Nucleotide Deposition Sequence defines the order in which nucleotides are deposited
A Probe Embedding specifies which steps of the deposition sequence a probe uses to be synthesized
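The mask-by-mask synthesis in the figure can be sketched as follows. This is a minimal illustration, assuming each probe takes its next missing nucleotide at the first deposition step that offers it; the function name `synthesis_masks` is hypothetical and not from the slides.

```python
def synthesis_masks(array, deposition_sequence):
    """Return one mask per deposition step as a grid of booleans (True = exposed)."""
    # Number of nucleotides already synthesized at each site.
    grown = [[0] * len(row) for row in array]
    masks = []
    for nucleotide in deposition_sequence:
        mask = []
        for i, row in enumerate(array):
            mask_row = []
            for j, probe in enumerate(row):
                # Assumption: a site is exposed as soon as its next missing
                # nucleotide matches the one being deposited.
                exposed = grown[i][j] < len(probe) and probe[grown[i][j]] == nucleotide
                if exposed:
                    grown[i][j] += 1
                mask_row.append(exposed)
            mask.append(mask_row)
        masks.append(mask)
    return masks


# The 3 x 3 array and deposition sequence from the slide.
array = [["CG", "AC", "G"],
         ["AC", "ACG", "AG"],
         ["CG", "AG", "C"]]
masks = synthesis_masks(array, "ACG")
# Number of exposed sites in Mask 1 (A), Mask 2 (C), Mask 3 (G).
exposed_counts = [sum(map(sum, mask)) for mask in masks]
```

For the slide's array, `exposed_counts` lists how many sites each of the three masks opens, which can be compared with the A, C, and G sites drawn in the figure.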
Border Minimization Challenges
[Figure: lamp, mask, and array, marking the intentionally exposed sites, the unwanted illumination, and the border.]
Problem: Diffraction, internal reflection, scattering, internal illumination
Occurs at sites near intentionally exposed sites
Design objective: Minimize the border
Reduce border -> increase yield, reduce cost
Border Reduction with Probe Placement
Similar probes should be placed close together
[Figure: the same probes placed in two ways under one deposition sequence; optimizing the placement reduces the border from 8 to 4.]
Border Reduction in Probe Embedding
Synchronous embedding: deposit one nucleotide in each group of “ACGT”
Asynchronous embedding: no restriction
[Figure: probes CT and TA embedded in a deposition sequence made of ACGT groups. Synchronous embedding gives Border = 4; asynchronous embedding gives Border = 2.]
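The two embedding styles can be compared with a short sketch. It assumes that the border between two adjacent probes is the number of deposition steps at which exactly one of them is exposed, and it uses a leftmost (as-early-as-possible) embedding as one example of an asynchronous embedding. All function names are illustrative.

```python
def synchronous_embedding(probe, group="ACGT"):
    """Embed the i-th nucleotide of the probe inside the i-th ACGT group."""
    parts = []
    for nucleotide in probe:
        parts.append("".join(c if c == nucleotide else "-" for c in group))
    return "".join(parts)


def leftmost_embedding(probe, deposition_sequence):
    """One possible asynchronous embedding: use the earliest matching step."""
    out, k = [], 0
    for c in deposition_sequence:
        if k < len(probe) and probe[k] == c:
            out.append(c)
            k += 1
        else:
            out.append("-")
    if k < len(probe):
        raise ValueError("probe does not fit in the deposition sequence")
    return "".join(out)


def border(embedding_a, embedding_b):
    """Count steps where exactly one of the two adjacent sites is exposed."""
    return sum((a == "-") != (b == "-") for a, b in zip(embedding_a, embedding_b))


sequence = "ACGT" * 2
sync_border = border(synchronous_embedding("CT"), synchronous_embedding("TA"))
async_border = border(leftmost_embedding("CT", sequence),
                      leftmost_embedding("TA", sequence))
```

Here `synchronous_embedding("CT")` is `-C-----T` and `leftmost_embedding("CT", sequence)` is `-C-T----`, so the second embedding shares the T step with probe TA and removes two conflicts.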
Basic DNA Array Design Flow
DNA array flow: Probe Selection -> Design of Test Probes -> Physical Design (Probe Placement, Probe Embedding) -> Lithography -> DNA Array
VLSI chip flow (analogy): Logic Synthesis -> BIST and DFT -> Physical Design (Placement, Routing) -> Lithography -> VLSI Chip
DNA Microarrays Physical Design Problem
Given: n^2 probes
Find: placement of the probes in n x n sites; embedding of the probes
Minimize: total border cost
Previous Work
Border minimization was first introduced by Feldman and Pevzner, “Gray Code masks for sequencing by hybridization,” Genomics, 1994, pp. 233-235
Hannenhalli et al. gave heuristics for the placement problem using a TSP formulation
Kahng et al., “Border length minimization in DNA Array Design,” WABI02, suggested constructive methods for placement and embedding
Kahng et al., “Engineering a Scalable Placement Heuristic for DNA Probe Arrays,” RECOMB03, suggested scalable placement improvement and embedding techniques
1-D Probe Placement (TSP)
How to place the 1-D ordering of probes onto the 2-D chip?
[Figure: four probes, Probe 1 ACGACG, Probe 2 CTTTTC, Probe 3 ACGATC, Probe 4 CCTATC, reordered by the tour as Probe 1, Probe 3, Probe 4, Probe 2; example Hamming distance = 4.]
Hamming distance (P1, P2) = number of positions at which the two probes differ; under synchronous embedding it determines the border between the two probes
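A 1-D ordering of this kind can be sketched with a greedy nearest-neighbor tour over Hamming distances. The nearest-neighbor rule is a simple stand-in chosen for illustration; it is not claimed to be the TSP heuristic used in the cited work, and the function names are hypothetical.

```python
def hamming(p, q):
    """Number of positions at which two equal-length probes differ."""
    if len(p) != len(q):
        raise ValueError("probes must have equal length")
    return sum(a != b for a, b in zip(p, q))


def nearest_neighbor_order(probes, start=0):
    """Greedy tour: repeatedly append the unvisited probe closest to the last one."""
    order = [start]
    remaining = set(range(len(probes))) - {start}
    while remaining:
        last = probes[order[-1]]
        # sorted() makes tie-breaking deterministic (lowest index wins).
        nxt = min(sorted(remaining), key=lambda k: hamming(last, probes[k]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Probes 1 to 4 from the slide (indices 0 to 3).
probes = ["ACGACG", "CTTTTC", "ACGATC", "CCTATC"]
order = nearest_neighbor_order(probes)
```

On the four probes of the slide, the greedy tour starting from Probe 1 visits the probes in the same order as the figure: Probe 1, Probe 3, Probe 4, Probe 2.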
Placement By Threading
Thread the 1-D ordering onto the chip
[Figure: Probe 1 ACGACG, Probe 2 ACGATC, Probe 3 CCTATC, Probe 4 CTTTTC threaded onto the chip in tour order. Edges along the thread are optimized; the other edges are not.]
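The idea of threading can be illustrated with the simplest serpentine pattern, which fills rows alternately left to right and right to left. This is an assumption made for the sketch; the threading pattern used in the cited work may differ.

```python
def thread_serpentine(order, n):
    """Lay a 1-D ordering of n*n items onto an n x n grid in a snake pattern."""
    if len(order) != n * n:
        raise ValueError("ordering must contain exactly n*n items")
    grid = []
    for r in range(n):
        row = list(order[r * n:(r + 1) * n])
        if r % 2 == 1:
            # Reverse every second row so consecutive items stay adjacent.
            row.reverse()
        grid.append(row)
    return grid


grid = thread_serpentine(list(range(9)), 3)
```

Consecutive items of the ordering always land on neighboring sites, which corresponds to the optimized edges in the figure. Vertical neighbors that are far apart in the ordering correspond to the edges that are not optimized.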
Row-Epitaxial Placement
For each site position (i, j):
Find the best probe, i.e. the one that minimizes the border at (i, j)
Move the best probe to (i, j) by switching it with the probe currently there, and lock it in this position
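The loop above can be sketched as follows. The sketch makes two simplifying assumptions that the slide does not state: the border at (i, j) is measured as the Hamming distance to the already locked left and upper neighbors, and every probe that is not yet locked is a candidate.

```python
def hamming(p, q):
    """Number of positions at which two equal-length probes differ."""
    return sum(a != b for a, b in zip(p, q))


def row_epitaxial(probes, n):
    """Greedy row-by-row placement of n*n probes given in row-major order."""
    cells = list(probes)
    for pos in range(n * n):
        i, j = divmod(pos, n)
        # Already locked neighbors: the site to the left and the site above.
        neighbors = []
        if j > 0:
            neighbors.append(cells[pos - 1])
        if i > 0:
            neighbors.append(cells[pos - n])
        # Best unlocked probe for this site (first one wins on ties).
        best = min(range(pos, n * n),
                   key=lambda k: sum(hamming(cells[k], nb) for nb in neighbors))
        # Switch it into position; sites before pos + 1 are now locked.
        cells[pos], cells[best] = cells[best], cells[pos]
    return [cells[r * n:(r + 1) * n] for r in range(n)]


placed = row_epitaxial(["AA", "CC", "CC", "AA"], 2)
```

In the small example the checkerboard of AA and CC probes is rearranged so that equal probes sit side by side.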
Basic DNA Array Design Flow
[Flow diagram repeated from the earlier slide, with Partitioning shown alongside Placement in the VLSI flow.]
Question: Shall we use partitioning in probe placement?
Single Nucleotide Placement
[Figure: 64 single-nucleotide probes (16 each of A, C, G, T). Row-Epitaxial Placement arranges them in bands of equal nucleotides: Border = 48. Partitioning Based Placement arranges them in four square blocks: Border = 32.]
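The two border values in the figure can be reproduced with a short calculation. The sketch assumes an 8 x 8 array, synchronous embedding, and a cost of two conflicts per differing position between adjacent probes; the layouts `bands` and `blocks` are reconstructions of the two panels, not data taken from the slide.

```python
def hamming(p, q):
    """Number of positions at which two equal-length probes differ."""
    return sum(a != b for a, b in zip(p, q))


def total_border(grid):
    """Sum the border over all horizontally and vertically adjacent sites."""
    total = 0
    for i, row in enumerate(grid):
        for j, probe in enumerate(row):
            if j + 1 < len(row):
                total += 2 * hamming(probe, row[j + 1])
            if i + 1 < len(grid):
                total += 2 * hamming(probe, grid[i + 1][j])
    return total


# Bands: two full rows of each nucleotide.
bands = [[c] * 8 for c in "AACCGGTT"]
# Blocks: one 4 x 4 quadrant per nucleotide.
blocks = ([["A"] * 4 + ["C"] * 4 for _ in range(4)]
          + [["G"] * 4 + ["T"] * 4 for _ in range(4)])

band_border = total_border(bands)
block_border = total_border(blocks)
```

The banded layout has three seams of eight site pairs each, while the block layout has only two, which is where the reduction comes from.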
Can partitioning based placement achieve improvement for 25-nucleotide probes?
Partitioning Based Placement
Randomly choose a probe as seed 1.
Choose as seed 2 the probe with the largest Hamming distance to seed 1.
Choose as seed 3 the probe with the largest total Hamming distance to seed 1 and seed 2.
Choose as seed 4 the probe with the largest total Hamming distance to seed 1, seed 2 and seed 3.
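The seed selection steps above can be sketched as follows. To keep the example deterministic, the first seed is passed in as an index instead of being drawn at random, and ties are broken in favor of the lowest index; both choices are assumptions of this sketch.

```python
def hamming(p, q):
    """Number of positions at which two equal-length probes differ."""
    return sum(a != b for a, b in zip(p, q))


def choose_seeds(probes, first, count=4):
    """Pick seeds that are far apart in total Hamming distance."""
    if len(probes) < count:
        raise ValueError("need at least as many probes as seeds")
    seeds = [first]
    while len(seeds) < count:
        candidates = [k for k in range(len(probes)) if k not in seeds]
        # Next seed: largest total distance to all seeds chosen so far.
        best = max(candidates,
                   key=lambda k: sum(hamming(probes[k], probes[s]) for s in seeds))
        seeds.append(best)
    return seeds


probes = ["AAAA", "AAAT", "TTTT", "CCCC", "GGGG"]
seeds = choose_seeds(probes, first=0)
```

In the example, probe AAAT is never chosen because it is close to the first seed AAAA.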
Partitioning Based Placement
Level 1 Partition
Level 2 Partition
Row epitaxial one by one
“Border aware”
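2-D Gray code Placement
For synchronous embedding, Border = 2 for any two neighboring probes.
[Figure: 2-D Gray code placements for n=2 (nucleotides C, G / A, T) and n=4 (the 16 two-nucleotide probes AC, TC, CC, GC, TG, AG, GG, CG, AA, TA, TT, AT, CA, GA, GT, CT).]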
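One way to build such a placement is a reflected construction, sketched below. It is offered as an illustration only and is not necessarily the construction used for the figure: each round mirrors the current grid into four quadrants and appends a different nucleotide in each quadrant, so any two neighboring probes differ in exactly one position.

```python
def hamming(p, q):
    """Number of positions at which two equal-length probes differ."""
    return sum(a != b for a, b in zip(p, q))


def gray_2d(k):
    """Return a 2^k x 2^k grid holding all 4^k probes of length k."""
    grid = [[""]]
    for _ in range(k):
        # Top half: the grid with A appended, then its mirror image with C.
        top = [[p + "A" for p in row] + [p + "C" for p in reversed(row)]
               for row in grid]
        # Bottom half: the same, flipped vertically, with T and G.
        bottom = [[p + "T" for p in row] + [p + "G" for p in reversed(row)]
                  for row in reversed(grid)]
        grid = top + bottom
    return grid


grid = gray_2d(3)
size = len(grid)
horizontal = [hamming(grid[i][j], grid[i][j + 1])
              for i in range(size) for j in range(size - 1)]
vertical = [hamming(grid[i][j], grid[i + 1][j])
            for i in range(size - 1) for j in range(size)]
```

Because distinct probes must differ in at least one position, a placement in which every adjacent pair differs in exactly one position has the minimum possible border under synchronous embedding, so its optimal cost is known.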
Scaling Construction
n x n real chip
Four isomorphic copies with the same border
Ratio = new border / (4 x old border)
Ratio < 1: solution quality scales well
[Figure: a scaled chip assembled from four isomorphic copies of the original chip.]
Experiments Setup
Chip size range: between 100x100 and 500x500
Type of instances: randomly generated
Suboptimality test cases: 2-D Gray code, scaled
Embedding methods: synchronous, asynchronous
Quality measures: total border cost, gap from lower bound, normalized cost, CPU
All tests were run on a 2.4 GHz Xeon CPU.
Comparison of Synchronous Placement Results
[Figure, four panels plotted against chip size (100, 200, 300, 500): Borders, CPU, Gap from lower bound, and Normalized cost, for TSP + Threading, Row Epitaxial, and Partitioning Based (Level=2).]
Compared with row epitaxial, the new method reduces the border cost by 3.7% and is 3 times faster.
Results on 2-D Gray code Test cases
[Figure, two panels plotted against chip size (16, 32, 64, 128, 256, 512): Borders and Gap from Optimal solution, for TSP + Threading, Row Epitaxial, and Recursive Partitioning. Callout: 5.6%.]
Suboptimality Experiments Results
[Figure, two panels plotted against chip size (100, 200, 300, 400, 500): Borders and Scaling ratio (axis from 0.66 to 0.86), for Row Epitaxial and Partitioning Based (Level=2). Callout: 2.5%.]
Placement Polishing Using Re-Embedding
Use a polishing algorithm to re-embed each probe with respect to its neighbors
Polish the probes one by one
[Figure: re-embedding a probe against its neighbors within the deposition sequence reduces the border from 8 to 4.]
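Re-embedding a single probe against fixed neighbors can be sketched as a dynamic program over deposition steps. The sketch assumes that neighbor embeddings are strings aligned with the deposition sequence, with "-" marking skipped steps, and that a conflict is a step at which the probe and a neighbor are not both exposed or both masked. The function name `reembed` is hypothetical.

```python
def reembed(probe, neighbors, deposition_sequence):
    """Return (embedding, conflicts) minimizing conflicts with fixed neighbors."""
    inf = float("inf")
    m, steps = len(probe), len(deposition_sequence)
    k = len(neighbors)
    # exposed[t] = how many neighbors receive a nucleotide at step t.
    exposed = [sum(nb[t] != "-" for nb in neighbors) for t in range(steps)]
    # cost[t][j] = fewest conflicts after t steps with j nucleotides placed.
    cost = [[inf] * (m + 1) for _ in range(steps + 1)]
    used = [[False] * (m + 1) for _ in range(steps + 1)]
    cost[0][0] = 0
    for t in range(steps):
        for j in range(m + 1):
            if cost[t][j] == inf:
                continue
            # Option 1: skip step t; conflicts with every exposed neighbor.
            c = cost[t][j] + exposed[t]
            if c < cost[t + 1][j]:
                cost[t + 1][j], used[t + 1][j] = c, False
            # Option 2: deposit at step t; conflicts with every masked neighbor.
            if j < m and probe[j] == deposition_sequence[t]:
                c = cost[t][j] + (k - exposed[t])
                if c < cost[t + 1][j + 1]:
                    cost[t + 1][j + 1], used[t + 1][j + 1] = c, True
    if cost[steps][m] == inf:
        raise ValueError("probe does not fit in the deposition sequence")
    # Walk back through the recorded choices to rebuild the embedding.
    out, j = [], m
    for t in range(steps, 0, -1):
        if used[t][j]:
            out.append(deposition_sequence[t - 1])
            j -= 1
        else:
            out.append("-")
    return "".join(reversed(out)), cost[steps][m]


embedding, conflicts = reembed("CT", ["---TA---"], "ACGTACGT")
```

In the example, probe CT is re-embedded next to a neighbor TA that keeps its embedding `---TA---`. Sharing the T step with the neighbor lowers the conflicts from 4, the value obtained when CT uses its synchronous embedding `-C-----T`, to 2.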
Comparison of Asynchronous Placement Results
[Figure, four panels plotted against chip size (100, 200, 300, 500): Borders, CPU, Gap from lower bound, and Normalized cost, for TSP + Threading, Row Epitaxial, and Partitioning Based (Level=2).]
Compared with row epitaxial, the new method reduces the border cost by 4% and is 2.65 times faster.
Conclusions
We draw a fertile analogy between DNA array design and VLSI design automation
We propose a new recursive partitioning-based placement algorithm and a new embedding algorithm, which achieve a 4% improvement
We study and quantify the performance of existing and newly proposed algorithms on benchmarks with known optimal cost, as well as in scaling suboptimality experiments