Maximum clique

Upload: stewart-rogers

Post on 28-Dec-2015


Page 1: Maximum clique. 1Introduction 2Theoretical background Biochemistry/molecular biology 3Theoretical background computer science 4History of the field 5Splicing

Maximum clique

Course outline
1 Introduction
2 Theoretical background Biochemistry/molecular biology
3 Theoretical background computer science
4 History of the field
5 Splicing systems
6 P systems
7 Hairpins
8 Detection techniques
9 Micro technology introduction
10 Microchips and fluidics
11 Self assembly
12 Regulatory networks
13 Molecular motors
14 DNA nanowires
15 Protein computers
16 DNA computing - summary
17 Presentation of essay and discussion


NP complete continued

Tractability
Some problems are undecidable: no computer can solve them
  e.g. Turing’s “Halting Problem”
Other problems are decidable, but intractable: as they grow large, we are unable to solve them in reasonable time
What constitutes “reasonable time”?

P and NP summary
P = set of problems that can be solved in polynomial time
NP = set of problems for which a solution can be verified in polynomial time
P ⊆ NP
The big question: Does P = NP?

NP-complete problems
The NP-Complete problems are an interesting class of problems whose status is unknown
  No polynomial-time algorithm has been discovered for an NP-Complete problem
  No superpolynomial lower bound has been proved for any NP-Complete problem, either
Intuitively and informally, what does it mean for a problem to be NP-Complete?

Reduction
A problem P can be reduced to another problem Q if any instance of P can be rephrased as an instance of Q, the solution to which provides a solution to the instance of P. This rephrasing is called a transformation
Intuitively: If P reduces in polynomial time to Q, P is “no harder to solve” than Q

Why prove NP-completeness
Though nobody has proven that P ≠ NP, if you prove a problem NP-Complete, most people accept that it is probably intractable
Therefore it can be important to prove that a problem is NP-Complete
  Don’t need to come up with an efficient algorithm
  Can instead work on approximation algorithms

Clique
What is a clique of a graph G?
  Answer: a subset of vertices fully connected to each other, i.e. a complete subgraph of G
The clique problem: how large is the maximum-size clique in a graph?
Can we turn this into a decision problem?
  Answer: Yes, we call this the k-clique problem
Is the k-clique problem within NP?

Clique
What should the reduction do?
  Answer: Transform a 3-CNF formula into a graph for which a k-clique will exist (for some k) iff the 3-CNF formula is satisfiable

Clique
The reduction:
  Let B = C1 ∧ C2 ∧ … ∧ Ck be a 3-CNF formula with k clauses, each of which has 3 distinct literals
  For each clause put a triple of vertices in the graph, one for each literal
  Put an edge between two vertices if they are in different triples and their literals are consistent, meaning not each other’s negation
Run an example:
  B = (x ∨ y ∨ z) ∧ (x ∨ y ∨ z) ∧ (x ∨ y ∨ z), where the negation marks on individual literals were lost in transcription
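
The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example formula, the tuple encoding of literals, and the helper names `consistent`, `build_graph` and `has_k_clique` are hypothetical choices, and the formula is not necessarily the one shown on the slide. The clique search is plain brute force, which is fine for nine vertices but exponential in general.

```python
from itertools import combinations

# A literal is (variable, is_positive); a clause is a tuple of three literals.
# Hypothetical example formula (an assumption, not recovered from the slide):
# (x or not y or not z) and (not x or y or z) and (x or y or z)
FORMULA = [
    (("x", True), ("y", False), ("z", False)),
    (("x", False), ("y", True), ("z", True)),
    (("x", True), ("y", True), ("z", True)),
]

def consistent(lit_a, lit_b):
    """Two literals are consistent unless one is the negation of the other."""
    return not (lit_a[0] == lit_b[0] and lit_a[1] != lit_b[1])

def build_graph(formula):
    """Return (vertices, edges); a vertex is (clause_index, literal)."""
    vertices = [(i, lit) for i, clause in enumerate(formula) for lit in clause]
    edges = {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[0] != v[0] and consistent(u[1], v[1])  # different triples only
    }
    return vertices, edges

def has_k_clique(vertices, edges, k):
    """Brute-force test: is there a set of k pairwise adjacent vertices?"""
    return any(
        all(frozenset(pair) in edges for pair in combinations(subset, 2))
        for subset in combinations(vertices, k)
    )

vertices, edges = build_graph(FORMULA)
k = len(FORMULA)
print(len(vertices), "vertices;", "k-clique exists:", has_k_clique(vertices, edges, k))
```

Because no edges are placed inside a triple, a clique can never contain more than one vertex per clause, so k (the number of clauses) is the largest clique size worth asking about.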

Clique
Prove the reduction works:
  If B has a satisfying assignment, then each clause has at least one literal (vertex) that evaluates to 1
  Picking one such “true” literal from each clause gives a set V’ of k vertices. V’ is a clique (Why?)
  If G has a clique V’ of size k, it must contain one vertex in each triple (Why?)
  We can assign 1 to each literal corresponding to a vertex in V’, without fear of contradiction
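
The second direction of the proof (clique to satisfying assignment) can be made concrete as follows. This is a minimal sketch: the literal encoding, the example formula and the example clique are assumptions chosen for illustration, and `assignment_from_clique` and `satisfies` are hypothetical helper names.

```python
# A literal is (variable, is_positive); a clique vertex is (clause_index, literal).
# Hypothetical formula: (x or not y or not z) and (not x or y or z) and (x or y or z)
FORMULA = [
    (("x", True), ("y", False), ("z", False)),
    (("x", False), ("y", True), ("z", True)),
    (("x", True), ("y", True), ("z", True)),
]

def assignment_from_clique(clique):
    """Set every literal that appears in the clique to true."""
    assignment = {}
    for _clause_index, (variable, is_positive) in clique:
        # Edges only join consistent literals, so this never fires for a real clique.
        if assignment.setdefault(variable, is_positive) != is_positive:
            raise ValueError("clique contains a literal and its negation")
    return assignment

def satisfies(formula, assignment):
    """Variables not fixed by the clique default to False; any value would do."""
    return all(
        any(assignment.get(var, False) == positive for var, positive in clause)
        for clause in formula
    )

# One vertex per clause, pairwise consistent: a k-clique for k = 3.
clique = [(0, ("x", True)), (1, ("y", True)), (2, ("x", True))]
assignment = assignment_from_clique(clique)
print(assignment, satisfies(FORMULA, assignment))
```

The clique supplies one true literal per clause, which is exactly what a satisfying assignment needs; variables that the clique never mentions can be set arbitrarily.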

Clique problem, summary
A clique of a graph G=(V,E) is a subgraph C that is fully connected (every pair in C has an edge).
CLIQUE: Given a graph G and an integer K, is there a clique in G of size at least K?
CLIQUE is in NP: non-deterministically choose a subset C of size K and check that every pair in C has an edge in G.
(figure) This graph has a clique of size 5
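
The "check" half of the NP membership argument is easy to write down. The sketch below assumes an undirected graph given as a list of vertex pairs; the example graph and the function names are hypothetical. The point is only that verifying a proposed subset takes a number of edge lookups quadratic in its size.

```python
from itertools import combinations

def is_clique(edges, subset):
    """True if every pair of vertices in `subset` is joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(subset, 2))

def verify_clique_certificate(edges, candidate, k):
    """Polynomial-time check of a claimed clique of size at least k."""
    members = set(candidate)
    return len(members) >= k and is_clique(edges, members)

# Hypothetical example graph: vertices 0-3 form a complete subgraph, 4 hangs off 3.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]

print(verify_clique_certificate(EDGES, [0, 1, 2, 3], 4))
print(verify_clique_certificate(EDGES, [1, 2, 3, 4], 4))
```

Finding a good candidate is the hard part; the verifier says nothing about how the subset was chosen.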


Maximum clique with DNA

Introduction
Clique
  defined as a set of vertices in which every vertex is connected to every other vertex by an edge
Maximal clique problem
  Given a network containing N vertices and M edges, how many vertices are in the largest clique?
  Finding the size of the largest clique has been proven to be an NP-complete problem

Algorithm
Step 1: Make the complete data pool
  For a graph with N vertices, each possible clique is represented by an N-digit binary number
  1: a vertex in the clique
  0: a vertex out of the clique
  e.g. clique (4,1,0) → binary number 010011
Step 2: Find pairs of vertices in the graph that are not connected by an edge
  (0,2) (0,5) (1,5) (1,3)
  These pairs form the complementary graph

Algorithm
Step 3: Eliminate from the complete data pool all numbers containing connections in the complementary graph
  xxx1x1 or 1xxxx1 or 1xxx1x or xx1x1x
Step 4: Sort the remaining data pool to find the data containing the largest number of 1’s
  the clique with the largest number of 1’s tells us the size of the maximal clique
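
Steps 1 to 4 can be mimicked in software on ordinary bit strings. The sketch below is an in-silico analogue, not the wet-lab protocol: it uses the complementary-graph pairs listed in Step 2 and assumes, as in the 010011 example, that the rightmost digit stands for vertex 0. The names `bit`, `pool` and `survivors` are hypothetical.

```python
from itertools import product

N = 6
# Non-adjacent vertex pairs (edges of the complementary graph) from Step 2.
COMPLEMENT_EDGES = [(0, 2), (0, 5), (1, 5), (1, 3)]

def bit(string, vertex):
    """Digit for `vertex`; the rightmost character is vertex 0."""
    return string[N - 1 - vertex]

# Step 1: the complete data pool, all 2**N candidate vertex sets.
pool = ["".join(bits) for bits in product("01", repeat=N)]

# Step 3: drop every string that contains both ends of a complementary edge.
survivors = [
    s for s in pool
    if not any(bit(s, u) == "1" and bit(s, v) == "1" for u, v in COMPLEMENT_EDGES)
]

# Step 4: the surviving string with the most 1's is a maximum clique.
best = max(survivors, key=lambda s: s.count("1"))
print(best, best.count("1"))  # prints: 111100 4
```

The software loop visits the 64 strings one after another; the appeal of the DNA version is that the elimination in Step 3 acts on all strings in the tube at once.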

Construction of DNA molecules
Each bit is represented by two DNA sections (dsDNA)
  bit’s value (Vi), V0~V5: 0 bp when Vi = 1, 10 bp when Vi = 0
  position value (Pi), P0~P6: 20 bp
Longest = 6×10 + 7×20 = 200 bp (000000)
Shortest = 6×0 + 7×20 = 140 bp (111111)
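
The length arithmetic on this slide generalises to any bit string. A small sketch, assuming only the section lengths stated above (20 bp per position section, 0 bp for a value of 1, 10 bp for a value of 0); the function name `strand_length` is hypothetical.

```python
POSITION_BP = 20               # length of each position section Pi
VALUE_BP = {"1": 0, "0": 10}   # length of the value section Vi

def strand_length(bits):
    """Length in bp of the dsDNA encoding `bits` (n value and n + 1 position sections)."""
    return (len(bits) + 1) * POSITION_BP + sum(VALUE_BP[b] for b in bits)

for example in ("000000", "111111", "111100"):
    print(example, strand_length(example), "bp")
```

Since each 0 adds 10 bp and each 1 adds nothing, strings with more vertices in the clique are shorter, which is what lets gel electrophoresis rank them by clique size.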

Construction of DNA molecules
sequence construction - randomly generated
  to avoid mispairing, avoid accidental homologies longer than 4 bp
embedded restriction sequences within each Vi = 1
POA (parallel overlap assembly)

POA
POA (parallel overlap assembly) with 12 oligonucleotides
  PiViPi+1 for even i: P0V0P1, P2V2P3, P4V4P5
  <Pi+1ViPi> for odd i: <P2V1P1>, <P4V3P3>, <P6V5P5>
PCR with P0 and <P6> as primers (lane 2 in Fig 3)
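
As a bookkeeping aid, the oligonucleotide layout can be enumerated programmatically. The sketch below only generates labels, not sequences. It assumes that the count of 12 arises from six bit positions times two possible values per position, which is a reading of the slide rather than something it states; the bracketed label format and the name `poa_oligos` are hypothetical.

```python
def poa_oligos(n_bits=6):
    """Label the oligonucleotides: one per (bit position, bit value) pair."""
    oligos = []
    for i in range(n_bits):
        for value in (0, 1):
            section = f"V{i}={value}"
            if i % 2 == 0:
                # Even positions follow the PiViPi+1 layout.
                oligos.append(f"P{i} [{section}] P{i + 1}")
            else:
                # Odd positions follow the <Pi+1ViPi> layout, angle brackets as on the slide.
                oligos.append(f"<P{i + 1} [{section}] P{i}>")
    return oligos

for name in poa_oligos():
    print(name)
```

Neighbouring labels share a position section (for example P1 appears in both the V0 and V1 oligonucleotides), which is the overlap that lets the fragments assemble into full-length strings.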


Digestion with restriction enzymes
Break DNA at the internal restriction sequence of Vi = 1
PCR with P0 and <P6> as primers
  broken strings were not amplified
Division of the data pool into two test tubes
  t0: Afl II cuts V0 = 1
  t1: Spe I cuts V2 = 1
Combine t0 and t1 into test tube t, which did not contain xxx1x1
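
The split, digest and recombine step for one complementary edge can be simulated on bit strings. This is a sketch under simplifying assumptions: every string is present in both tubes (in the tube each string exists in many copies), cutting is perfectly efficient, and the rightmost digit is vertex 0. The function name `digest` is hypothetical.

```python
from itertools import product

def digest(pool, vertex, n=6):
    """Keep only strings whose digit for `vertex` is 0; strings with a 1 are cut."""
    return [s for s in pool if s[n - 1 - vertex] == "0"]

pool = ["".join(bits) for bits in product("01", repeat=6)]

t0 = digest(pool, 0)   # tube t0: strings with V0 = 1 are destroyed
t1 = digest(pool, 2)   # tube t1: strings with V2 = 1 are destroyed
t = sorted(set(t0) | set(t1))

# A string survives if it lacks vertex 0 or lacks vertex 2, so none matches xxx1x1.
print(len(t), any(s[3] == "1" and s[5] == "1" for s in t))  # prints: 48 False
```

Repeating the same procedure for the other three complementary edges removes 1xxxx1, 1xxx1x and xx1x1x in turn.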

Digestion and PCR amplification
Elimination of all strings containing a pair of vertices connected by an edge in the complementary graph
  xxx1x1, 1xxxx1, 1xxx1x, xx1x1x
PCR amplification of remaining data DNA (Fig 3)
  Lane 5: digestion result
  Lane 6: PCR result

Readout
Reading the size of the largest clique(s)
  shortest length: 160 bp, i.e. four vertices
What is the maximal clique?
  6C4 = 15, so 15 different strings have four 1’s
Read the answer by molecular cloning
  1 insertion of the DNA into M13 bacteriophage through site-directed mutagenesis
  2 transfection of the mutagenized M13 phage DNA into E. coli
  3 cloning
  4 DNA extraction and sequencing
Correct answer: 111100
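
The readout arithmetic can be checked directly. The sketch assumes the encoding lengths given earlier in the deck (seven 20 bp position sections, plus 10 bp for each 0); the function name `ones_from_length` is hypothetical.

```python
from math import comb

def ones_from_length(length_bp, n_bits=6):
    """Number of 1's (clique size) implied by a band of the given length."""
    zeros = (length_bp - (n_bits + 1) * 20) // 10
    return n_bits - zeros

size = ones_from_length(160)
candidates = comb(6, size)
print(size, candidates)  # prints: 4 15
```

The band length fixes only the size of the clique. Fifteen different six-bit strings contain four 1's, so sequencing a clone is still needed to learn which vertices form it.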

Discussion - major errors
Production of ssDNA during PCR
  ssDNA cannot be cut by restriction enzymes
  solution: digestion of the ssDNA with S1 nuclease before restriction digestion
Incomplete cutting by restriction enzymes
  solution: repetition of the digestion-PCR process to increase the signal-to-noise ratio

Discussion - strengths and weaknesses
Strengths
  high parallelism
Weaknesses
  limitation on the number of vertices that this algorithm can handle
    maximum number of vertices with picomole operations = 27 (36 vertices with nanomole)
  exponential increase in the size of the pool with the size of the problem
Further scale-up becomes impractical
New algorithms are needed
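
The exponential growth of the pool can be put into numbers. The sketch below is back-of-the-envelope arithmetic, not a figure from the slides: it divides the number of molecules in a given amount of DNA by the 2^N strings in the pool to estimate how many copies of each string are available. The function name `copies_per_string` is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def copies_per_string(moles, n_vertices):
    """Average number of copies of each candidate string in the pool."""
    return moles * AVOGADRO / 2 ** n_vertices

for label, moles, n in (("picomole", 1e-12, 27), ("nanomole", 1e-9, 36)):
    print(f"{label}: N = {n}, pool size = {2 ** n}, "
          f"about {copies_per_string(moles, n):.0f} copies per string")
```

Under this estimate both quoted limits leave a few thousand copies of each string, and every additional vertex halves that number.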

Discussion – future direction
Rapid and accurate data access is needed
Current readout methods are too slow / too noisy
  biotin-avidin purification
  electrophoresis
  DNA cloning
A biochip is needed to accelerate readout


Clique in microreactors

Selection principle
all possible solutions: {000} {001} {010} {011} {100} {101} {110} {111}
clauses: (x=1) ∧ (y=0) ∧ (z=1)
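
The selection principle amounts to filtering the full candidate set one clause at a time. A minimal sketch, assuming the three digits of each string are ordered x, y, z; the variable names are hypothetical.

```python
from itertools import product

# All 2**3 candidate solutions, digits ordered as x, y, z (an assumption).
candidates = ["".join(bits) for bits in product("01", repeat=3)]

# Each clause fixes one position to one value: (x=1) and (y=0) and (z=1).
clauses = [(0, "1"), (1, "0"), (2, "1")]

pool = candidates
for position, value in clauses:
    # One selection step: keep only strings that satisfy this clause.
    pool = [s for s in pool if s[position] == value]

print(pool)  # prints: ['101']
```

Each pass of the loop corresponds to one selection step applied to the whole pool at once.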


Selection principle


Positive selection


Negative selection

Logical operations
logical NOT operations
logical AND operations (figure labels: a, b)
logical OR operations (figure labels: a, b)

Microreactor structure
(figure label: magnet)


Selection principle


DNA input and transport principle

Maximal cliques
6 nodes, 2^6 initial answers
Max: SABCDEF = 101001

Maximal cliques – connectivity matrix
  A B C D E F
A 1 0 1 0 0 1
B 0 1 1 1 0 0
C 1 1 1 0 1 1
D 0 1 0 1 1 0
E 0 0 1 1 1 0
F 1 0 1 0 0 1
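
The connectivity matrix can be checked against the stated maximum 101001 with a small brute-force search. This is a software sketch, not the microreactor procedure: it simply tries vertex subsets from largest to smallest, and the function name `maximum_clique` is hypothetical.

```python
from itertools import combinations

NODES = "ABCDEF"
MATRIX = [
    [1, 0, 1, 0, 0, 1],  # A
    [0, 1, 1, 1, 0, 0],  # B
    [1, 1, 1, 0, 1, 1],  # C
    [0, 1, 0, 1, 1, 0],  # D
    [0, 0, 1, 1, 1, 0],  # E
    [1, 0, 1, 0, 0, 1],  # F
]

def maximum_clique(matrix):
    """Return the first largest vertex subset in which every pair is connected."""
    n = len(matrix)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(matrix[i][j] for i, j in combinations(subset, 2)):
                return subset
    return ()

best = maximum_clique(MATRIX)
bits = "".join("1" if i in best else "0" for i in range(len(NODES)))
print("".join(NODES[i] for i in best), bits)  # prints: ACF 101001
```

Here the digits are read left to right as nodes A to F, so 101001 selects A, C and F.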

Maximal cliques – flow diagram
(flow diagram; recoverable labels only)
Initial pool: xxxxxx with x = {0,1}
Pool patterns at successive stages:
  0xxxxx xxxxxx
  0xxxxx x0xxxx 00xxxx
  0xxxxx 00xxxx 0xx0xx x0x0xx 00x0xx
  0xxxxx 00xxxx 0xx0xx 00x0xx 0xxx0x 00xx0x 0xx00x x0x00x 00x00x
Selection labels: SA=0, SA=1, SB=0, SC=0, SC=1, SD=0, SE=0, SF=0, SF=1

DNA computer design
(figure labels: DNA in, DNA out, optical control)

DNA computer design – selection modules
         node A       node B       node C       node D       node E
node B   A0 B0 B1/Ø
node C   A0 C0 C1/Ø   B0 C0 C1/Ø
node D   A0 D0 D1/Ø   B0 D0 D1/Ø   C0 D0 D1/Ø
node E   A0 E0 E1/Ø   B0 E0 E1/Ø   C0 E0 E1/Ø   D0 E0 E1/Ø
node F   A0 F0 F1/Ø   B0 F0 F1/Ø   C0 F0 F1/Ø   D0 F0 F1/Ø   E0 F0 F1/Ø


DNA information flow

Flow separation – laminar flow
(micrographs; scale bar: 100 µm)


Micro fabrication



DNA computer design – 20 nodes


DNA sequence handling
word codes
optical programmability
usage of masks to programme
immobilisation of DNA to paramagnetic beads
hybridisation of DNA-strands

The DNA library
(figure) Bead with capture probe (Vn = 1): 3'-ATCGTCGAAGGAATGC-5'
  Vn = 1 word: 5'-TAGCAGCTTCCTTACG-3' (complementary to the capture probe)
  Vn = 0 word: 5'-ACACTGTGCTGATCTC-3'
  each word is shown within a strand alongside its neighbours Vn-2, Vn-1, Vn+1, Vn+2, Vn+3

The DNA library
PBS1: 5'-GCCCTAAAGGATCCACGTAAGGTCCTATGC
V0-1:  5'-AACCACCAACCAAACC   V0-0:  5'-AAAACGCGGCAACAAG
V1-1:  5'-TCAGTCAGGAGAAGTC   V1-0:  5'-TCTTGGGTTTCCTGCA
V2-1:  5'-TTTTCCCCCACACACA   V2-0:  5'-TTGGACCATACGAGGA
V3-1:  5'-CGTTCATCTCGATAGC   V3-0:  5'-AGAGTCTCACACGACA
V4-1:  5'-AAGGACGTACCATTGG   V4-0:  5'-CTCTAGTCCCATCTAC
V5-1:  5'-CAACGGTTTTATGGCG   V5-0:  5'-GCGCAATTTGGTAACC
V6-1:  5'-TAGCAGCTTCCTTACG   V6-0:  5'-ACACTGTGCTGATCTC
V7-1:  5'-CACATGTGTCAGCACT   V7-0:  5'-TGTGTGTGCCTACTTG
V8-1:  5'-GATGGGATAGAGAGAG   V8-0:  5'-AATCCCACCAGTTGAC
V9-1:  5'-ATGCAGGAGCGAATCA   V9-0:  5'-GCTTGTTCAACCTGGT
V10-1: 5'-CCCAGTATGAGATCAG   V10-0: 5'-CTGTCCAAGTACGCTA
V11-1: 5'-ATCGAGCTTCTCAGAG   V11-0: 5'-TGTAGAGGCTAGCGAT
PBS2: 5'-TGGTTTGGCGGCTTTAGAATTCTGTGACAC
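
How a capture probe relates to a library word can be checked with a reverse-complement calculation. The sketch below uses the V6-1 and V6-0 words from the library and the bead-bound probe sequence 3'-ATCGTCGAAGGAATGC-5' given with the library figure. The helper names are hypothetical, and the mismatch count is only a crude position-by-position comparison, not a model of hybridisation thermodynamics.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

V6_1 = "TAGCAGCTTCCTTACG"  # word for V6 = 1, written 5'->3'
V6_0 = "ACACTGTGCTGATCTC"  # word for V6 = 0, written 5'->3'

probe = reverse_complement(V6_1)  # capture probe for V6 = 1, written 5'->3'
print("probe 5'->3':", probe)
print("probe 3'->5':", probe[::-1])
print("V6-0 differs from V6-1 at", mismatches(V6_0, V6_1), "of", len(V6_1), "positions")
```

Read 3' to 5', the computed probe matches the sequence attached to the bead, so a strand carrying the V6-1 word can base-pair with it along its full length while a strand carrying V6-0 cannot.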

DNA hybridisation
(micrographs; scale bar: 100 µm)

DNA computer control
liquid handling DNA computer
robotics
detection system
sorting module
computer control
