Introduction to DNA Computing
Russell Deaton, Elec. & Comp. Engr., The University of Memphis, Memphis, TN [email protected]
Junghuei Chen, Department of Chem & Biochem, University of Delaware, Newark, DE [email protected]
What is DNA Computing (DNAC)?
The use of biological molecules, primarily DNA, DNA analogs, and RNA, for computational purposes.
Why Nucleic Acids?
• Density (Adleman, Baum):
– DNA: 1 bit per nm^3, 10^20 molecules
– Video: 1 bit per 10^12 nm^3
• Efficiency (Adleman):
– DNA: 10^19 ops / J
– Supercomputer: 10^9 ops / J
• Speed (Adleman):
– DNA: 10^14 ops per s
– Supercomputer: 10^12 ops per s
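As a quick check on the orders of magnitude quoted above, the ratios can be worked out directly. This is only arithmetic on the figures from this slide; the variable names are illustrative.

```python
# Orders of magnitude quoted on the slide, kept as exact integers.
dna_nm3_per_bit = 1              # DNA: 1 bit per nm^3
video_nm3_per_bit = 10**12       # Video: 1 bit per 10^12 nm^3

dna_ops_per_joule = 10**19
supercomputer_ops_per_joule = 10**9

dna_ops_per_second = 10**14
supercomputer_ops_per_second = 10**12

# How many times better DNA is on each axis.
density_ratio = video_nm3_per_bit // dna_nm3_per_bit
efficiency_ratio = dna_ops_per_joule // supercomputer_ops_per_joule
speed_ratio = dna_ops_per_second // supercomputer_ops_per_second

print(f"density:    {density_ratio:.0e} times denser")
print(f"efficiency: {efficiency_ratio:.0e} times more ops per joule")
print(f"speed:      {speed_ratio} times more ops per second")
```

On these figures the advantage is largest in storage density and energy efficiency, and smallest in raw speed.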
What makes DNAC possible?
• Great advances in molecular biology
– PCR (Polymerase Chain Reaction)
– DNA Microarrays
– New enzymes and proteins
– Better understanding of biological molecules
• Ability to produce massive numbers of DNA molecules with specified sequence and size
• DNA molecules interact through template matching reactions
What are the basics from molecular biology that I need to know to understand DNA computing?
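PHYSICAL STRUCTURE OF DNA
[Figure: DNA double helix showing the nitrogenous bases, the sugar-phosphate backbone, the central axis, and the major and minor grooves. Labeled dimensions: 34 Å (helical pitch) and 20 Å (helix diameter). The two strands run in opposite directions, with ends marked 5' and 3' (3' OH).]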
INTER-STRAND HYDROGEN BONDING
[Figure: hydrogen bonds between paired bases, each base attached to the sugar-phosphate backbone. Partial charges (+) and (-) mark the bonding sites: two hydrogen bonds for Adenine-Thymine, three for Guanine-Cytosine.]
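STRAND HYBRIDIZATION
[Figure: strands A and B with their complements a and b. HEAT (100° C) separates the duplexes into single strands; COOL lets complementary strands anneal again (A with a, B with b), in one of the alternative arrangements shown.]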
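Hybridization can be modeled in software as a test for Watson-Crick complementarity. The sketch below is a minimal illustration, assuming perfect matches only and strands written 5' to 3'; the function names are hypothetical, and the example sequence is the one used in the affinity slide of this deck.

```python
# Watson-Crick base pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner of `strand`, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hybridizes(strand_a: str, strand_b: str) -> bool:
    """True if the two 5'-to-3' strands form a perfect duplex."""
    return strand_b == reverse_complement(strand_a)


target = "CACCATGTGAC"
probe = reverse_complement(target)
print(probe, hybridizes(target, probe))
```

Real hybridization also tolerates partial matches, depending on temperature and salt conditions; this model ignores that.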
DNA LIGATION
[Figure: Ligase joins 5' phosphate to 3' hydroxyl.]
RESTRICTION ENDONUCLEASES
[Figure: cleavage by the restriction endonucleases EcoRI, HindIII, AluI, and HaeIII. The cut ends carry a 5' phosphate (5' P) and a 3' hydroxyl (3' OH).]
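A restriction digest can be sketched as cutting a string at every occurrence of a recognition site. The recognition sequences and cut positions below are the standard ones for these four enzymes, but they do not appear on the slide. The `digest` helper is hypothetical and models only the top strand, so it does not show the sticky or blunt overhangs.

```python
# Recognition site and cut offset within the site (top strand, 5' to 3').
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC, leaves sticky ends
    "HindIII": ("AAGCTT", 1),  # A^AGCTT, leaves sticky ends
    "AluI": ("AGCT", 2),       # AG^CT, leaves blunt ends
    "HaeIII": ("GGCC", 2),     # GG^CC, leaves blunt ends
}


def digest(strand: str, enzyme: str) -> list[str]:
    """Cut `strand` at every occurrence of the enzyme's recognition site."""
    site, offset = ENZYMES[enzyme]
    fragments = []
    start = 0
    pos = strand.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(strand[start:cut])
        start = cut
        pos = strand.find(site, pos + 1)
    fragments.append(strand[start:])  # remainder after the last cut
    return fragments


print(digest("TTGAATTCAA", "EcoRI"))
print(digest("AAGGCCTT", "HaeIII"))
```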
DNA Polymerase
DNA Sequencing
GEL ELECTROPHORESIS - SIZE SORTING
[Figure: gel in buffer between two electrodes. Samples loaded at one end separate by size into faster- and slower-moving bands.]
ANTIBODY AFFINITY
[Figure: affinity separation of the strand CACCATGTGAC using its complement GTGGTACACTG, which carries a biotin label (B), and paramagnetic-streptavidin particles (PMP).]
• Add oligo with Biotin label
• Heat and cool (anneal)
• Add Paramagnetic-Streptavidin Particles (bind)
• Isolate with Magnet
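In software, the affinity step amounts to splitting a tube into strands that contain a target subsequence and strands that do not. The sketch below is a simplification: substring matching stands in for annealing to the biotin-labeled probe and capture on the magnet. `affinity_extract` and the example tube contents are hypothetical.

```python
def affinity_extract(tube, target):
    """Split `tube` into strands that contain `target` and those that do not."""
    captured = [strand for strand in tube if target in strand]
    washed_away = [strand for strand in tube if target not in strand]
    return captured, washed_away


# Example tube: two strands contain the target sequence, one does not.
tube = ["TTCACCATGTGACAA", "GGGGCCCCAAAA", "CACCATGTGAC"]
captured, washed_away = affinity_extract(tube, "CACCATGTGAC")
print(captured)
print(washed_away)
```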
POLYMERASE CHAIN REACTION
What is the typical methodology?
• Encoding: Map problem instance onto set of biological molecules and molecular biology protocols
• Molecular Operations: Let molecules react to form potential solutions
• Extraction/Detection: Use protocols to extract result in molecular form
What is an example?
• “Molecular Computation of Solutions to Combinatorial Problems”
• Adleman, Science, v. 266, p. 1021.
Algorithm
• Generate Random Paths through the graph.
• Keep only those paths that begin with v_in and end with v_out.
• If graph has n vertices, then keep only those paths that enter exactly n vertices.
• Keep only those paths that enter all the vertices at least once.
• If any paths remain, say “Yes”; otherwise, say “No”.
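The five steps can be sketched in software as a generate-and-filter loop over random walks. This is a minimal simulation and not the laboratory protocol. The example graph, the sample count, the cap of 2n vertices per walk, and the function names are all assumptions made for illustration.

```python
import random


def random_path(edges, n, rng):
    """Grow one random walk along directed edges, stopping at a dead end
    or after 2*n vertices (walks are allowed to overshoot n)."""
    path = [rng.randrange(n)]
    while len(path) < 2 * n:
        successors = [v for (u, v) in edges if u == path[-1]]
        if not successors:
            break
        path.append(rng.choice(successors))
    return tuple(path)


def adleman_filter(edges, n, v_in, v_out, samples=5000, seed=0):
    """Return the set of sampled paths that survive every filtering step."""
    rng = random.Random(seed)
    # Step 1: generate random paths through the graph.
    tube = {random_path(edges, n, rng) for _ in range(samples)}
    # Step 2: keep paths that begin with v_in and end with v_out.
    tube = {p for p in tube if p[0] == v_in and p[-1] == v_out}
    # Step 3: keep paths that enter exactly n vertices.
    tube = {p for p in tube if len(p) == n}
    # Step 4: keep paths that enter every vertex at least once.
    tube = {p for p in tube if set(p) == set(range(n))}
    return tube


edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (2, 1)]
survivors = adleman_filter(edges, n=4, v_in=0, v_out=3)
# Step 5: if any paths remain, say "Yes"; otherwise, say "No".
print("Yes" if survivors else "No", sorted(survivors))
```

As in the molecular version, the answer depends on sampling: a "No" is only reliable if enough random paths were generated for a solution, had one existed, to appear.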
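Encoding
[Figure: vertices 0, 1, and 2 are encoded as the strands GCATGGCC, AGCTTAGG, and ATGGCATG. The strands CCGGTCGA and CCGGTACC are complementary to the junctions between vertex strands. CCGGTCGA pairs with the end of vertex 0 and the start of vertex 1, holding together GCATGGCCAGCTTAGG (path 0 → 1). CCGGTACC pairs with the end of vertex 0 and the start of vertex 2, holding together GCATGGCCATGGCATG (path 0 → 2).]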
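The linking strands in this encoding can be derived from the vertex strands. The sketch below assumes that the three vertex sequences belong to vertices 0, 1, and 2 in the order they appear in the figure, and that each linker is the base-by-base complement of the junction, written in the same left-to-right order as the figure. `splint` is a hypothetical helper name.

```python
# Watson-Crick base pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Vertex strands; the vertex numbering is read off the figure (an assumption).
VERTEX = {0: "GCATGGCC", 1: "AGCTTAGG", 2: "ATGGCATG"}


def splint(src: int, dst: int) -> str:
    """Base-by-base complement of the junction between two vertex strands:
    the second half of `src` followed by the first half of `dst`."""
    half = len(VERTEX[src]) // 2
    junction = VERTEX[src][half:] + VERTEX[dst][:half]
    return "".join(COMPLEMENT[base] for base in junction)


print(splint(0, 1))  # CCGGTCGA
print(splint(0, 2))  # CCGGTACC
```

Because each linker spans two vertex strands, it can only anneal where the corresponding edge exists, which is what restricts ligation to valid paths.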
What are the success stories?
• Self-Assembling Computations Demonstrated (Winfree and Seeman)
• New Approaches and Protocols Developed
– Surface-based (Wisconsin-Madison, DIMACS II)
– Evolutionary Approaches (Wood and Chen, GECCO-99, DNA-5)
• How do cells and nature compute? (Kari and Landweber, DIMACS IV)
Source: http://seemanlab4.chem.nyu.edu/
Source: Winfree, DIMACS IV
Source: http://corninfo.chem.wisc.edu/writings/dnatalk/dna01.html
Source: http://www.princeton.edu/~lfl/washpost.html
What are the challenges?
• Error: Molecular operations are not perfect.
• Reversible and Irreversible Error
• Efficiency: How many molecules contribute?
• Encoding a problem in molecules is difficult.
• Scaling to larger problems
• Applications
Mismatches
DNA Word Design
• Design of DNA Sequences that hybridize as planned (that is, minimize mismatches)
• Reliability: False Positives and Negatives
• Efficiency: Hybridizations that Contribute to Solution
• Hybridizations are Templates for Subsequent Enzymatic Steps
DNA Word Design
• Minimum Distance Codes to Prevent Hybridization Error
• Distance Measure
– Combinatoric (Hamming)
– Energetic (Base Stacking Energy)
• Design DNA Words with Evolutionary Algorithms
• Good Codes Achievable
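A minimum-distance code under the combinatoric (Hamming) measure can be sketched as follows. The slide proposes evolutionary algorithms for the search; this example swaps in a plain greedy search to stay short and deterministic, and it ignores the energetic (base stacking) measure entirely. All function names are illustrative.

```python
from itertools import product

# Watson-Crick base pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))


def reverse_complement(word: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(word))


def greedy_code(length: int, min_dist: int) -> list[str]:
    """Collect words that differ in at least `min_dist` positions from every
    accepted word and from every accepted word's reverse complement."""
    code = []
    for letters in product("ACGT", repeat=length):
        word = "".join(letters)
        if hamming(word, reverse_complement(word)) < min_dist:
            continue  # word would tend to bind to copies of itself
        if all(
            hamming(word, other) >= min_dist
            and hamming(word, reverse_complement(other)) >= min_dist
            for other in code
        ):
            code.append(word)
    return code


code = greedy_code(length=4, min_dist=3)
print(len(code), code)
```

Checking against reverse complements as well as the words themselves matters because a word hybridizes with strands close to its reverse complement, not with strands close to itself.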
[Figure: two examples of code word hybridization, with base stacking indicated.]
What are the possible applications?
• DNAC and Conventional Computers
• DNAC and Evolutionary Computation
• DNAC and Biotechnology
DNAC and Electronic Computing
• Solution versus solid state
• Individual molecules versus ensembles of charge carriers
• The importance of shape in biological molecules
• Programmability/Evolvability Trade-off (Conrad)
Edna
• Electronic DNA
• Virtual Test Tube for Design and Simulation of DNA Computations
• Molecules as Cellular Automata
• Solve Adleman and Other Problems
• Distributed Edna to Solve Large Problems
• New Paradigm
In Vitro Evolutionary Computation
• Randomness and Uncertainty Inherent in Biomolecular Reactions
• Never the Level of Control that EE Has over Solid State Devices
• Use Nature’s Toolbox: Enzymes, Reaction/Diffusion, Adaptability, and Robustness
• Evolved, Not Designed
DNAC and Biotechnology
• “Computationally Inspired Biotechnology”
• DNA2DNA “killer app”
• Automation of protocols
• DNA Word Design (Gene Expression Chips)
• Exquisite Detection of Biomaterials
• Bio-engineered Materials
What developments can we expect in the near-term (1999)?
• Increased use of molecules other than DNA
• Evolutionary approaches
• Continued impact by advances in molecular biology
• Some impact on molecular biology by DNA computation
• Increased error avoidance and detection
What are the long-term prospects?
• Cross-fertilization among evolutionary computing, DNA computing, molecular biology, and computational biology
• Niche uses of DNA computers for problems that are difficult for electronic computers
Where can I learn more?
• Web Sites:
– http://www.wi.leidenuniv.nl/~jdassen/dna.html
– http://dope.caltech.edu/winfree/DNA.html
– http://www.msci.memphis.edu/~garzonm/bmc.html
– (Conrad) http://www.cs.wayne.edu/biolab/index.html
• DIMACS Proceedings: DNA Based Computers I (#27), II (#44), III (#48), IV (Special Issue of Biosystems), V (MIT, June 1999), VI (Leiden, June 2000)
• Other: Genetic Programming 1 (Stanford, 1997), Genetic Programming 2 (Wisconsin-Madison, 1998), GECCO-1999, IEEE International Conference on Evolutionary Computation (Indianapolis, 1997)
• G. Paun (ed.), Computing with Biomolecules: Theory and Experiment, Springer-Verlag, Singapore, 1998.
• “DNA Computing: A Review,” Fundamenta Informaticae, vol. 35, pp. 231-245.
• M. H. Garzon and R. J. Deaton, “Biomolecular Computing and Programming,” IEEE Transactions on Evolutionary Computation, vol. 3, pp. 236-250, 1999.