Genome Sequencing: Harmonia axyridis
Isabel Risch
University of Memphis
W. Harry Feinstone Center for Genomic Research
May 28, 2013 - June 14, 2013

Ladybugs (Ladybirds)
Tennessee state insect
Coccinella septempunctata (seven-spotted lady beetle)
  Native to Europe and Asia (introduced to North America); being outcompeted by Harmonia

Harmonia axyridis
Asian/harlequin lady beetle
Large coccinellid beetle
Dome-shaped; smooth transition between head and thorax/abdomen
Adults range in color from yellow to bright red
Anywhere from zero to twenty spots on the back
Native to Asia; introduced to North America and Europe to control aphid populations; now invasive and crowding out other species
Carries a fungus that kills other species of ladybugs
Harmonia produces the chemical ‘harmonine’, which prevents the fungus from infecting it

Genome: What Is It?
An organism’s hereditary information, coded in DNA and organized into chromosomes; in eukaryotes, includes introns and exons
Chromosomes: DNA wrapped around histones
Image: Human Chromosome Painting

Genome: What Is It Made Of?
DNA (deoxyribonucleic acid)
  Called the “molecule of life”
  Codes for all proteins that make cells (life) possible
  Each nucleotide is made of deoxyribose, a phosphate group, and a nitrogenous base
  Double-stranded molecule; covalent bonds along the deoxyribose/phosphate backbone on the outside; hydrogen bonds between nitrogenous bases on the inside
    The hydrogen bonds can be broken, which allows replication and expression through RNA
  Bases: adenine, thymine, guanine, cytosine; pairs are A-T and G-C
  Order of nitrogenous bases codes for specific amino acids → polypeptide chains → protein
  In eukaryotes, contains both introns (non-coding sections) and exons (coding sections)
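
The pairing and coding rules on this slide can be sketched in a few lines of Python. This is only an illustration: the codon table is a small excerpt of the standard genetic code, and the function names and example sequences are made up for this sketch.

```python
# Watson-Crick pairing rules from the slide: A-T and G-C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Small illustrative excerpt of the standard genetic code (DNA codons);
# the full table has 64 entries.
CODONS = {"ATG": "M", "GCC": "A", "TGG": "W", "AAA": "K",
          "TTT": "F", "GGC": "G", "TAA": "*", "TAG": "*", "TGA": "*"}


def reverse_complement(strand):
    """Return the partner strand, read 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(strand))


def translate(coding_strand):
    """Translate codon by codon until a stop codon ('*') is reached."""
    protein = []
    for i in range(0, len(coding_strand) - 2, 3):
        amino_acid = CODONS[coding_strand[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(reverse_complement("ATGGCC"))   # GGCCAT
print(translate("ATGGCCAAATAA"))      # MAK
```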

Genome: What Is It Made Of?
DNA and Heredity
  Heredity: the passing of traits from one generation to the next; the basis of genetics and evolution
  Determined by genes on chromosomes; variations of a gene are alleles
  Sexually reproducing animals get two alleles of each gene (one from each parent): Mendel’s Law of Segregation
  Alleles are expressed as phenotypic traits; thus, DNA determines heredity
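
As a rough illustration of the Law of Segregation, the sketch below crosses two parents that each pass on one of their two alleles. The allele symbols "A" and "a" and the function name are hypothetical.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Count offspring genotypes when each parent passes on one allele."""
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that 'aA' and 'Aa' are counted as the same genotype.
        offspring["".join(sorted((allele1, allele2)))] += 1
    return offspring


print(dict(cross("Aa", "Aa")))
```

For two heterozygous parents, the four equally likely allele combinations give the familiar 1 : 2 : 1 genotype ratio.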

Genome: What Can We Do With This Information?
By determining the sequence of genomes, we can…
  Compare them to other genomes
    Study phylogeny and evolution
  Use them to understand diseases and design better potential treatments; also better predict the body’s response to certain treatments
    Genetic diseases
    Somatic diseases
  Use them for forensic science
  Research deeper into genetic engineering of plants and animals (biotechnology)

Genome Mapping
Can be done once a genome is sequenced
Determines the physical order of the sequence features of an individual’s entire DNA
Places DNA fragments onto chromosomes by identifying the fragments
  Fragments are identified by certain markers or by their exact base pair sequence
Traditional maps covered millions of base pairs at once (low resolution), but modern ones can map single-nucleotide polymorphisms (SNPs, single-base differences) for higher resolution
Can be used to associate a certain genetic marker with a certain disease
  Somatic diseases
    Ex: cancer can occur when a tumor-suppressing gene is inactivated or blocked; genome mapping can be used to identify the genes and research ways to reactivate them
  Genetic diseases
    Ex: sickle cell anemia is related to a mutation in the beta hemoglobin gene
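
Finding a single-base difference between two aligned sequences can be sketched as below. The two fragments are short illustrative stretches around the sickle cell mutation in the beta hemoglobin gene (GAG to GTG); the function name is hypothetical, and real SNP calling works on aligned reads rather than on two clean strings.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, r, s)
            for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


# Illustrative fragments: GAG (glutamate) in the common allele,
# GTG (valine) in the sickle allele.
normal = "CCTGAGGAG"
sickle = "CCTGTGGAG"
print(find_snps(normal, sickle))   # [(4, 'A', 'T')]
```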

DNA Sequencing: Background
Sanger Method
  Used to determine nucleotide order in DNA
  Rapid DNA sequencing
  Uses modified, labeled nucleotides to stop DNA strand elongation at specific bases
  Each DNA sample is treated with one labeled base
  DNA can then be run on a gel and tracked to where it was terminated; fragments are separated by size and by terminating nucleotide
  Results are photographed on X-ray film or as a gel image
  Dye-terminator sequencing: revised method
    Uses fluorescent dyes to visualize all bases in one lane
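
The chain-termination idea can be mimicked with a toy model: each of four reactions produces fragments that end wherever its terminating base occurs, and reading the fragments from shortest to longest recovers the sequence. This is a simplified sketch with made-up names, not a model of the real chemistry.

```python
def chain_termination(new_strand):
    """For each base, list the fragment lengths that end in that base.

    Mimics four reactions, each with one chain-terminating nucleotide.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence from the shortest fragment to the longest."""
    bands = sorted((length, base)
                   for base, lengths in lanes.items()
                   for length in lengths)
    return "".join(base for _, base in bands)


lanes = chain_termination("ATGCGTAC")
print(lanes["G"])        # [3, 5]
print(read_gel(lanes))   # ATGCGTAC
```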

DNA Sequencing: Background
Illumina Technologies
  Next-generation sequencing
  A single strand of a DNA fragment provides a template for the DNA to be re-synthesized
  Signals are emitted and interpreted by the sequencing machine
  Unlike Sanger, next-gen can be applied to millions of base pairs at once via a flow cell
  Fragmented reads are then re-assembled by alignment → whole genome
MiSeq
  “Personal” tabletop sequencer
  Capable of many of the functions of a large sequencer
  Uses fluorescence and LED light, while previous machines used lasers
  Cheaper; now many universities can afford sequencers
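
Re-assembling fragmented reads relies on overlaps between them. The sketch below joins two reads on their longest suffix-prefix overlap. It is a toy illustration with hypothetical names and an arbitrary minimum overlap of 3 bases; real assemblers handle sequencing errors, repeats, and millions of reads.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge(left, right, min_len=3):
    """Join two reads on their overlap, or return None if there is none."""
    k = overlap(left, right, min_len)
    return left + right[k:] if k else None


print(merge("ACGTAC", "TACGGA"))   # ACGTACGGA
```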

DNA Extraction
The process of separating pure genomic DNA from the rest of the contents of cells and tissues
Steps:
  Lysing cells (breaking them to get to DNA)
  Removing contaminants from DNA (proteins, RNA, lipids, etc.)
  Pelleting DNA (precipitating and compacting it to separate it from everything else)
  Washing away solutions used to purify DNA

Genome Sequencing
The process of determining the nucleotide order of a specific genome
  DNA extraction
  DNA prep
    Tagmentation, amplification, etc.
  Run on a sequencer
  Alignment and re-assembly

Genome Sequencing
Harmonia
  Why we sequenced it:
    To better understand the insect and other beetles close to it
  What we used to sequence it:
    G Biosciences DNA extraction/prep kits
    Illumina sequencer (MiSeq)
    Blue Pippin to run gels and size selections
    QuBit to measure DNA concentrations in samples
  Genome had very low diversity; difficult to sequence
    May be due to transposon activity/repetitive elements in the genome

Steps of Sequencing
DNA Extraction
  Harmonia pupa homogenized
    Lyses cells to reach the DNA inside
  Proteinase K added
    Breaks down proteins surrounding the DNA (purifies)
  Chloroform added
    Precipitates waste from DNA
  DNA Stripping Solution added
    Strips DNA of any more waste
  Precipitation Solution added
    Precipitates waste
  Isopropanol added
    Precipitates DNA so it can be separated from other parts of the mixture
  Ethanol wash
    Washes DNA to further purify it (removes excess salt)

Steps of Sequencing
Paired End Prep
  Followed the Nextera XT DNA Prep Kit (Illumina, San Diego, CA)
  Tagmentation
    DNA is fragmented and “tagged” (adapters added to DNA ends), which allows the DNA to be PCR amplified
  PCR Amplification
    DNA is “amplified” in a polymerase chain reaction
    Amplification: DNA is replicated many times over so the sequencer can read it
  PCR Clean-up
    DNA is purified using AMPure Beads (unusable bits of DNA are washed out)
  Library Normalization
    Makes sure that the DNA quantities from each sample are equal in the final pooled library

http://res.illumina.com/images/technology/paired-end_sequencing.jpg

Steps of Sequencing
Mate Pair Prep
  Followed the Nextera Mate Pair DNA Prep Kit (Illumina, San Diego, CA)
  Two versions of the mate pair prep were run
    Gel-plus/size selection
      Used a Blue Pippin Prep machine (Sage Science, Beverly, MA)
      Yielded 10kb-17kb fragments
    Gel-free
      Yielded 3kb-15kb fragments

Steps of Sequencing
Mate Pair Prep: Gel-Free
  Tagmentation
  Strand Displacement Reaction
    Polymerase is used to fill gaps in DNA caused by tagmentation
  AMPure Purification
    Usable DNA binds to AMPure Beads; anything unwanted in the solution, including small DNA fragments, is washed away
  Circularization
    Fragments are circularized with blunt-ended ligation
  Exonuclease Digestion
    Any remaining linear DNA is broken down and removed from the circularized fragments
  Fragmentation of Circularized Fragments
    Circularized DNA is sheared to smaller fragments by sonication
  Purification of Mate Pair Fragments
    Usable DNA fragments bind to streptavidin beads; everything else is washed away
    Usable DNA = fragments containing biotinylated adapters
  End Repair/A-Tailing
    Overhangs from DNA shearing are blunted
      3’ overhangs are removed; 5’ overhangs are filled in with polymerase
    An ‘A’ nucleotide is added to the 3’ ends
  Adapter Ligation
    Indexing adapters are added to the ends of the fragments
      Contain a ‘T’ nucleotide that ligates to the ‘A’ tail
    Prepares the fragments for amplification and flow cell hybridization
  PCR Amplification
  PCR Clean-up
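
The point of circularization is that it brings the two far ends of a long fragment next to each other, so a short sheared piece spanning the junction carries sequence from both ends. The toy model below illustrates that idea only. The junction marker, function name, and example fragment are hypothetical, and strand orientation is ignored.

```python
JUNCTION = "[biotin]"  # placeholder for the biotinylated junction adapter


def mate_pair(fragment, read_len):
    """Return the two end reads brought together by circularization."""
    # Circularizing joins the fragment's end to its start at the junction.
    circle = fragment + JUNCTION
    # Shearing: keep the piece that spans the junction (the piece that
    # streptavidin beads would capture), wrapping around the circle.
    left = circle[-(read_len + len(JUNCTION)):-len(JUNCTION)]
    right = circle[:read_len]
    return left, right, len(fragment)


fragment = "AAAACCCC" + "N" * 20 + "GGGGTTTT"
print(mate_pair(fragment, 4))   # ('TTTT', 'AAAA', 36)
```

The third value is the original fragment length, which is the known distance between the two reads that later helps with scaffolding.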

Steps of Sequencing
Mate Pair Prep: Gel-Plus
  Tagmentation
  Strand Displacement Reaction
  AMPure Purification
  Size Selection
    Used a Blue Pippin Prep machine (Sage Sciences, Beverly, MA)
    A specific range of DNA fragment sizes is chosen and separated from the rest of the DNA
      10-17kb
  Circularization
  Exonuclease Digestion
  Fragmentation of Circularized Fragments
  Purification of Mate Pair Fragments
  End Repair/A-Tailing
  Adapter Ligation
  PCR Amplification
  PCR Clean-up

Steps of Sequencing
Sequencing Paired Ends
  Sample was diluted with hybridization buffer and paired-end sequenced on the MiSeq
  2x250 run
    Sequencer reads 250 bp from each end of a fragment
  Run yielded poor-quality data (low diversity)
    Spiked with PhiX and re-run
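
Low diversity here refers to many reads showing the same base in the same sequencing cycle; spiking in PhiX adds a balanced library. One rough way to see the problem is to measure, cycle by cycle, how dominant the most common base is. The sketch below uses made-up reads and a hypothetical function name.

```python
from collections import Counter


def dominant_base_fraction(reads):
    """For each cycle, the fraction of reads that share the top base."""
    fractions = []
    for cycle_bases in zip(*reads):          # one column of bases per cycle
        top_count = Counter(cycle_bases).most_common(1)[0][1]
        fractions.append(top_count / len(cycle_bases))
    return fractions


balanced = ["ACGT", "CGTA", "GTAC", "TACG"]
low_diversity = ["AAAC", "AAAG", "AAAT", "AAAA"]
print(dominant_base_fraction(balanced))       # [0.25, 0.25, 0.25, 0.25]
print(dominant_base_fraction(low_diversity))  # [1.0, 1.0, 1.0, 0.25]
```

Values near 1.0 mark cycles where almost every read shows the same base.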

Steps of Sequencing
Sequencing Mate Pairs
  Gel and non-gel libraries were diluted to 2 nM with 10 mM Tris-Cl, pH 8.5, with 0.1% Tween 20
  The 2 nM DNA from each library was pooled
  Pooled library was diluted with 0.2 N NaOH and hybridization buffer
  Mixture was diluted again with hybridization buffer
  Placed on the MiSeq for mate pair sequencing
  Run yielded poor-quality data (low diversity)
    Sample was spiked with PhiX and re-run
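
Diluting a library to 2 nM usually starts from a mass concentration. The sketch below shows the arithmetic, assuming an average mass of about 660 g/mol per base pair of double-stranded DNA and the relation C1 × V1 = C2 × V2. The example concentration, fragment length, and volumes are hypothetical and are not values from this project.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, mean_length_bp):
    """Convert a dsDNA concentration to nM, assuming ~660 g/mol per bp."""
    return conc_ng_per_ul * 1e6 / (660 * mean_length_bp)


def dilution_volumes(stock_nM, target_nM, final_ul):
    """Return (stock_ul, diluent_ul) from C1 * V1 = C2 * V2."""
    if target_nM > stock_nM:
        raise ValueError("cannot dilute to a higher concentration")
    stock_ul = target_nM * final_ul / stock_nM
    return stock_ul, final_ul - stock_ul


# Hypothetical library: 1.32 ng/uL with a mean fragment length of 500 bp.
stock = ng_per_ul_to_nM(1.32, 500)          # about 4 nM
print(dilution_volumes(stock, 2.0, 20.0))   # about 10 uL stock + 10 uL diluent
```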

Assembly
First, DNA quality is charted and basic stats are reviewed (FastQC)
  Use charts to find which bases to trim
Trim first and last bases (bad quality; unusable)
Align reads to a reference genome (or a similar genome in de novo assembly) in BWA (Burrows-Wheeler Aligner)
BWA output files are imported into the Integrative Genomics Viewer (IGV)
Overlaps in read sequences allow the whole genome to be re-assembled
IGV: viewing depth of coverage and fragment lengths
Paired ends give 100x coverage
Mate pairs provide scaffold
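
Trimming low-quality bases from the ends of a read can be sketched as follows. This is a minimal illustration, not the tool used in the project. It assumes FASTQ quality strings in Phred+33 encoding and an arbitrary cutoff of Q20, and the function names are made up.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 by default) to integers."""
    return [ord(ch) - offset for ch in quality_string]


def trim_ends(sequence, quality_string, min_q=20):
    """Drop leading and trailing bases whose quality is below min_q."""
    scores = phred_scores(quality_string)
    keep = [i for i, q in enumerate(scores) if q >= min_q]
    if not keep:
        return "", ""
    start, end = keep[0], keep[-1] + 1
    return sequence[start:end], quality_string[start:end]


# '#' encodes Q2 and 'I' encodes Q40 in Phred+33.
print(trim_ends("ACGTACGT", "##IIII##"))   # ('GTAC', 'IIII')
```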

Results
H. axyridis genome is about 300 million bp long
After trimming, we ended with…
  628,908 paired end reads
  4,038,064 singletons
  1,454,689 mate pair reads (non-gel)
  199,700 mate pair reads (gel)
Low diversity suggests transposon activity in genome
Genome full of long ‘A’ sequences
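
Long runs of ‘A’ can be located with a regular expression. The sketch below is illustrative only: the example sequence and the minimum run length are made up, not taken from the Harmonia data.

```python
import re


def poly_a_runs(sequence, min_len=10):
    """Return (start, length) for every run of at least min_len 'A' bases."""
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(sequence)]


print(poly_a_runs("GCAAAAAAATTAAAACG", min_len=5))   # [(2, 7)]
```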

Acknowledgements
Thanks to the W. Harry Feinstone Center for Genomic Research and especially the Sutter Lab for allowing me to intern with them.
Special thanks to Dr. Shirlean Goodwin, Dr. Thomas Sutter, and Dr. Michael Dickens for their help during my time in the lab.

Disclaimer
This is an informal presentation; information taken from various print and Internet sources
Images are not mine
  Google, Illumina Technologies
