Whole genome comparison Kelley Crouse And Greg Matuszek

Upload: marylou-andrews

Post on 30-Dec-2015

Page 1: Whole genome comparison Kelley Crouse And Greg Matuszek

Whole genome comparison

Kelley Crouse and Greg Matuszek

Objective

• Implement a parallel program for genome and chromosome comparisons

Background

• MUMmer: serial implementation using a suffix tree

• Parallel implementation using a variant of the Smith-Waterman local alignment algorithm.

Disadvantages

• Neither handles larger genomes and chromosomes quickly

• Parallel version hindered by data structure

How we plan to implement

• A suffix tree will be created using one sequence

• The second sequence will be fragmented and sent out to the workers.

• Each worker will compare its fragment against the suffix tree and report back to the farmer with the location(s) of similarity
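The farmer/worker plan above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`find_all`, `worker`, `farmer`) are hypothetical, a plain string search stands in for the suffix tree lookup, and a thread pool stands in for the real worker processes or nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def find_all(reference, fragment):
    """Return every start position of fragment in reference (overlaps included).

    Stand-in for the suffix tree lookup; a real version would walk the tree.
    """
    positions = []
    start = reference.find(fragment)
    while start != -1:
        positions.append(start)
        start = reference.find(fragment, start + 1)
    return positions


def worker(reference, offset, fragment):
    """Worker: compare one fragment and report (offset in query, hits in reference)."""
    return offset, find_all(reference, fragment)


def farmer(reference, query, fragment_len, n_workers=4):
    """Farmer: fragment the second sequence, hand fragments out, collect reports."""
    fragments = [(i, query[i:i + fragment_len])
                 for i in range(0, len(query), fragment_len)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        reports = pool.map(lambda job: worker(reference, *job), fragments)
        # Keep only the fragments that matched somewhere in the reference.
        return {offset: hits for offset, hits in reports if hits}


if __name__ == "__main__":
    print(farmer("ACGTACGTTTGA", "ACGTTTGAGGGG", fragment_len=4))
```

Each report carries the fragment's offset in the second sequence, so the farmer can relate every hit back to a position in both sequences.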

What is a Suffix Tree?

• The tree represents every suffix of a given string

• Used to search for a sub-string within a string

• By matching a test string, T, against the suffix tree of a string, S, it is possible to locate every position at which T, or part of T, occurs in S
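As a minimal sketch of the idea, the code below builds an uncompressed suffix trie (a simpler relative of the suffix tree that uses quadratic space, so it only suits short strings) and looks up a test string in it. The names `Node`, `build_suffix_trie`, and `find` are hypothetical, chosen for illustration.

```python
class Node:
    def __init__(self):
        self.children = {}  # next character -> child Node
        self.starts = []    # start positions in S of suffixes passing through here


def build_suffix_trie(s):
    """Insert every suffix of s, one character per edge."""
    root = Node()
    for start in range(len(s)):
        node = root
        for ch in s[start:]:
            node = node.children.setdefault(ch, Node())
            node.starts.append(start)
    return root


def find(root, t):
    """Return all start positions of t in S, or [] if t does not occur."""
    node = root
    for ch in t:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.starts


if __name__ == "__main__":
    trie = build_suffix_trie("GATTACA")
    print(find(trie, "TA"))   # positions where "TA" occurs
    print(find(trie, "CAT"))  # empty list: no occurrence
```

A true suffix tree compresses chains of single-child nodes into one edge, which is what makes it practical for genome-sized input.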

Suffix Tree - Bananas

• Each suffix of “Bananas” is represented within the suffix tree

• A test string, T, can be compared to “Bananas” by following a path of matching characters from the root toward the leaves
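To make the example concrete, the short sketch below lists the suffixes of "bananas" (lowercased here for simplicity) and finds where a test string occurs by checking which suffixes begin with it. That is the same answer a walk down the tree would give, computed the slow way; the helper name `occurrences` is hypothetical.

```python
word = "bananas"

# One suffix per start position: bananas, ananas, nanas, anas, nas, as, s
suffixes = [(i, word[i:]) for i in range(len(word))]


def occurrences(pattern):
    """Start positions whose suffix begins with pattern."""
    return [i for i, suffix in suffixes if suffix.startswith(pattern)]


if __name__ == "__main__":
    for i, suffix in suffixes:
        print(i, suffix)
    print(occurrences("ana"))
```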

Fragmenting the Second Sequence

Random fragmenting

• Difficult to assemble alignment

• Allows for small and large fragments

Specific length fragments

• Restricted to one fragment size

• Alignment is easier to assemble
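The two fragmenting strategies might look like the sketch below. It assumes that "random" means contiguous fragments of random length, which is one possible reading of the slide; the function names and the `min_len`, `max_len`, and `seed` parameters are hypothetical.

```python
import random


def fixed_fragments(seq, length):
    """Cut seq into consecutive fragments of one fixed length (the last may be shorter)."""
    return [(i, seq[i:i + length]) for i in range(0, len(seq), length)]


def random_fragments(seq, min_len, max_len, seed=0):
    """Cut seq into consecutive fragments whose lengths vary randomly."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    fragments, i = [], 0
    while i < len(seq):
        length = rng.randint(min_len, max_len)
        fragments.append((i, seq[i:i + length]))
        i += length
    return fragments


if __name__ == "__main__":
    seq = "ACGTACGTAC"
    print(fixed_fragments(seq, 4))
    print(random_fragments(seq, 2, 5))
```

With a fixed length, a fragment's offset follows from its index alone, which is part of why the alignment is easier to assemble. With random lengths, each offset has to be stored and carried along with the fragment.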

What we hope to gain

• Ability to identify conserved regions between genomes (and chromosomes)

• Conduct comparison between large genomes and chromosomes quickly and accurately