single molecule mechanical sequencing of...



ABSTRACT

We have developed a novel method for extracting sequence information from DNA that can be used for the phylogenetic analysis and characterization of complex microbial populations. Our method is based on the mechanical opening and closing of DNA hairpins and the detection of roadblocks to re-hybridization caused by complementary DNA fragments in solution. Our technology is called SIMDEQ™ (short for Single-Molecule Magnetic DEtection & Quantification) and is being developed commercially by PicoSeq. Using SIMDEQ™ it is possible to ‘fingerprint’, fully sequence, and detect the methylation status of individual long fragments of DNA with very high accuracy. The technique is simple and low cost, has a low error rate, and has the potential to be run at very high throughput. Early results indicate that it will be possible to use SIMDEQ™ to rapidly identify, characterize, and quantify many thousands of microbial species within complex samples such as those involved in microbiome studies. Due to its high throughput, low cost and tunable phylogenetic resolution, we expect SIMDEQ™ to become a powerful platform for the detailed analysis of complex microbial populations.

SINGLE MOLECULE MECHANICAL SEQUENCING OF DNA ENABLING RAPID CHARACTERISATION OF COMPLEX MICROBIAL POPULATIONS

Gordon Hamilton (1), Charles André (2), Jimmy Ouellet (1), Jean-Francois Allemand (1), David Bensimon (1), Vincent Croquette (1)

(1) Laboratoire de Physique Statistique, ENS, 24 rue Lhomond, 75005 Paris; (2) PicoSeq SAS, 82 rue Fondary, 75015 Paris

THE SIMDEQ™ APPROACH

1. Sample preparation: DNA fragments (e.g. 16S rRNA) are incubated with synthetic DNA components and streptavidin-coated paramagnetic beads to form hairpins linked by one arm to the beads.

2. Sample interrogation

THE SIMDEQ™ PROTOTYPE

Experiments on these small bench-top prototype instruments are run at ambient room temperature, with data sent to a PC for analysis.

ANALYSIS OF 16S RRNA – METHODS & RESULTS

Materials & Methods

Sample preparation: Bacterial colonies were re-suspended in 100 µl of water and lysed at 98 °C for 10 minutes. PCR was performed with the primers PS014 (TGGTCTTTCTGGTGCTCTTCAAAAGAGTTTGATCATGGCTCAG) and PS015 (GCCGGCGTTTTCGCCGGCAAAAAGGAGGTGATCCANCCRCA) using Phusion DNA polymerase (Thermo Scientific) with the GC buffer and the addition of 3% DMSO. PCR conditions were as follows: 98 °C for 3 min; 35 cycles of 98 °C for 15 s, 56 °C for 15 s and 72 °C for 1 min; and a final extension of 3 min at 72 °C. The PCR products were purified on an agarose gel and treated with T4 DNA polymerase to produce the ssDNA 5’ overhang required to ligate the adaptor and the loop. The resulting hairpins were purified on a gel and attached to a streptavidin-coated flow-cell floor. Oligos for hybridisation were used at 100 nM.

SIMDEQ™ methodology: for a complete description of the process, please refer to Ding et al., Nature Methods Vol. 9 (4), April 2012.
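The reverse primer PS015 contains the ambiguity codes N and R, so it stands for a small pool of concrete sequences rather than a single one. The sketch below, which is only an illustration and not part of the published protocol, expands a degenerate primer using the standard IUPAC nucleotide codes. The function name `expand_degenerate` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (a subset that covers these primers).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

PS014 = "TGGTCTTTCTGGTGCTCTTCAAAAGAGTTTGATCATGGCTCAG"
PS015 = "GCCGGCGTTTTCGCCGGCAAAAAGGAGGTGATCCANCCRCA"

variants = expand_degenerate(PS015)
print(len(variants))  # 8: N contributes 4 options, R contributes 2
```

PS014 contains no ambiguity codes, so the same function returns it unchanged as a single sequence.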

Results

We have performed proof-of-principle experiments using prototype instruments capable of analysing individual DNA molecules of up to 25 kb in length. We tested a 1.6 kb region of the 16S rRNA gene with a panel of 14 oligos that allows species identification.

CONCLUSIONS & FUTURE DIRECTIONS

Future experiments will extend this study in several aspects:

1. Larger molecules containing the entire rDNA operon will be amplified and analysed using larger numbers of oligos, in order to increase the phylogenetic resolution of the system.

2. In addition to the binary readout of the simple presence or absence of a particular oligo, future analyses will take into account the exact position of the oligo binding sites, with single-base resolution. This will be particularly useful for intergenic regions and other more variable sections of the genes.

3. We are currently refining our sequencing strategies to allow an initial fingerprinting experiment to be followed up by rapid full sequencing (minutes to hours) of particular molecules that have been identified as being of interest.

Our sequence detection requires precise calculation of where a paramagnetic bead is located in 3D space. This is achieved by shining red LED light onto the beads (left panel), which generates a pattern of diffraction rings that can be imaged on a video camera (top right). A mere 2-3 nm of movement of the bead along the vertical axis changes the diffraction ring pattern. This can be detected in real time with image analysis (lower right).

[Figure: bead extension (nm) versus time, showing the open and closed hairpin states]

Unzipping the hairpin. Lowering the magnet increases the force pulling on the paramagnetic beads, and the double-stranded portion of the DNA is “unzipped”, typically at a force of 10-15 pN. This reaction is reversible and can be repeated many thousands of times. The opening and closing of the hairpin can be monitored by tracking the Z position of the bead in real time (as shown in panel B). The bead starts in the closed position (extension = 0 nm), is subsequently opened (extension ~80 nm) and is then allowed to close again. The length of the open hairpin allows the total length of the DNA molecule to be determined (in this case about 80 nucleotides).

[Figure: bead extension (nm) versus time, with a pause marked at 40 nm above the closed hairpin position]

Blocking the re-zipping with a hybridising oligo. In its unzipped state, the hairpin can bind to complementary oligos. A bound oligo then blocks the re-zipping of the hairpin when the force is reduced, which is detectable as a pause in the retraction of the bead from the open to the closed position. The length of this pause depends largely on the size of the oligo, which can be designed to block for just a few seconds before falling off and allowing the hairpin to re-zip completely. The binding position of the oligo can be determined precisely (in the figure above, at 40 nm), and from this the sequence of that region of the hairpin can be inferred. Bioinformatic analysis suggests that only 14 oligonucleotides would be required to generate unique fingerprints for several thousand microbiological species.
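The pause described above can be picked out of an extension trace with a simple threshold rule. The following is a minimal sketch under stated assumptions, not the analysis used on the instrument: the function name `find_blockages`, the 5 nm margin, the 0.5 s minimum duration and the synthetic trace are all hypothetical, and a real trace would be noisy and need filtering first.

```python
def find_blockages(times, extensions, closed=0.0, opened=80.0,
                   margin=5.0, min_duration=0.5):
    """Return (mean_extension_nm, duration_s) for each pause between states.

    A sample counts as blocked when it lies more than `margin` nm away from
    both the closed and the open level. Consecutive blocked samples form a
    run; runs shorter than `min_duration` seconds are treated as ordinary
    transit between the two states and dropped.
    """
    blockages = []
    start = None
    n = len(extensions)
    for i in range(n + 1):  # one extra step so a run ending at the last sample is closed
        blocked = i < n and closed + margin < extensions[i] < opened - margin
        if blocked and start is None:
            start = i
        elif not blocked and start is not None:
            duration = times[i - 1] - times[start]
            if duration >= min_duration:
                segment = extensions[start:i]
                blockages.append((sum(segment) / len(segment), duration))
            start = None
    return blockages

# Synthetic trace sampled every 0.1 s (assumed rate): closed, one transit
# sample, open, a pause at 40 nm, then closed again.
extensions = [0.0] * 5 + [40.0] + [80.0] * 10 + [40.0] * 20 + [0.0] * 5
times = [0.1 * k for k in range(len(extensions))]

blockages = find_blockages(times, extensions)
print(blockages)  # one blockage at 40 nm lasting roughly 1.9 s
```

The single transit sample at 40 nm during opening is ignored because its duration is below the minimum, which is the point of the duration filter.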

[Figure: 16S rDNA gene, showing the 16S PCR primers and example oligo positions]

Example species      Presence/absence of oligos             Value
                     (typically 14 oligos, 7 shown here)
Escherichia          1 1 1 0 1 0 0 …                        232
Bifidobacterium      1 0 0 0 1 1 0 …                        70
Streptococcus        0 1 1 1 1 0 1 …                        61
Bacteroides          1 1 1 1 0 0 0 …                        120
Prevotella           1 0 0 1 0 1 1 …                        75
Eubacterium          1 1 1 1 1 0 0 …                        124
Faecalibacterium     0 0 1 1 1 0 1 …                        29
Parabacteroides      1 0 1 1 0 1 1 …                        91
Roseburia            1 1 1 1 0 1 1 …                        123

Converting hybridization data into a binary value. In the case of the 16S rDNA gene, the approximate location of each oligo is the same for all species being tested, and thus the presence or absence of a number of oligos can be determined simultaneously and the binary data converted to a unique decimal value. More complex schemes (for example taking into account the precise distance between oligos) could be used to extract more information if required. This could be useful, for example, when using the more variable intergenic regions or the rDNA operon.
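The conversion from presence/absence calls to a decimal value can be sketched as follows. The bit order (first oligo as the most significant bit) is an assumption inferred from the table rows, several of which reproduce their listed value from the seven bits shown; the function name `fingerprint_value` is hypothetical, and values computed over all 14 oligos would differ from these seven-bit examples.

```python
def fingerprint_value(bits):
    """Convert presence/absence calls to an integer, first oligo = most significant bit."""
    value = 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("each call must be 0 or 1")
        value = value * 2 + bit
    return value

# Seven-oligo calls copied from three rows of the table.
calls = {
    "Bifidobacterium": [1, 0, 0, 0, 1, 1, 0],
    "Streptococcus":   [0, 1, 1, 1, 1, 0, 1],
    "Bacteroides":     [1, 1, 1, 1, 0, 0, 0],
}
values = {species: fingerprint_value(bits) for species, bits in calls.items()}
print(values)  # {'Bifidobacterium': 70, 'Streptococcus': 61, 'Bacteroides': 120}
```

Because each distinct pattern maps to a distinct integer, looking up a measured pattern reduces to a dictionary lookup on the value.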

[Figure: histogram of test oligo binding positions along the 16S rDNA gene, between the closed hairpin and open hairpin positions]

Comparing expected results with hybridization data. Example of an actual hybridisation pattern on a 1.5 kb hairpin containing 16S rDNA from an unknown species. For clarity, data are shown for only 7 of the 14 oligos used in the experiment. The histogram shows the positions of the hybridisation events, with the height of the peaks approximately proportional to the total blockage time measured over 20 cycles of opening and closing the hairpin. Each binding position (in nt) is calculated relative to the positions of the open and closed states of the hairpin. The data indicate the presence of a Neisseria species (awaiting confirmation with full sequence data).
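One way to picture how such a histogram could be built is sketched below. It assumes a simple linear mapping between bead extension and position along the hairpin, which is an illustrative simplification rather than the calibration actually used; the function names, the 10 nt bin width and the example numbers are all hypothetical.

```python
from collections import defaultdict

def extension_to_nt(z, z_closed, z_open, hairpin_length_nt):
    """Linearly map a bead extension onto a position along the hairpin (in nt)."""
    return (z - z_closed) * hairpin_length_nt / (z_open - z_closed)

def blockage_histogram(events, z_closed, z_open, hairpin_length_nt, bin_nt=10):
    """Sum blockage time per position bin.

    `events` is a list of (extension, duration_s) pairs pooled over all
    open/close cycles. Returns {bin_start_nt: total_blockage_time_s}.
    """
    hist = defaultdict(float)
    for z, duration in events:
        pos = extension_to_nt(z, z_closed, z_open, hairpin_length_nt)
        bin_start = int(pos // bin_nt) * bin_nt
        hist[bin_start] += duration
    return dict(hist)

# Made-up events: two pauses near one site and one pause at another.
events = [(320.0, 1.0), (324.0, 1.5), (720.0, 0.5)]
hist = blockage_histogram(events, z_closed=0.0, z_open=1200.0,
                          hairpin_length_nt=1500)
print(hist)  # {400: 2.5, 900: 0.5}
```

Peaks in the resulting dictionary correspond to candidate binding positions, and their heights to the total blockage time at each position.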