SOAP3-dp Workflow

Upload: dustin-holland

Post on 24-Dec-2015


Page 1: SOAP3-dp Workflow

Page 2: SOAP3-dp workflow for paired-end alignment

Step 1: Use SOAP3 to align paired-end reads.
Paired-end reads -> SOAP3 (2-mismatch) -> paired alignments
Example paired alignment: chr 6, +4,059, -4,369

Step 2: For reads with one end mapped but the other not, use Default-DP to align the unmapped ends.
One end's alignments + the unmapped ends -> candidate region for the unmapped end, next to the mapped region of one end -> Default-DP (use DP to align) -> paired alignments
Example: one end's alignment chr 9, +49,538 gives the paired alignment chr 9, +49,538, -49,829

Step 3: For reads with both ends unaligned, use SOAP3 to align the seeds and then use Deep-DP to align both ends.
Seeds -> SOAP3 (1-mismatch) -> seed alignments of first end + seed alignments of second end -> pair up the seed alignments -> paired seed alignments -> candidate region -> Deep-DP (use DP to align) -> paired alignments
Example: the seed alignments chr 18, +349,683 (first end) and chr 18, -349,998 (second end) pair up as chr 18, +349,683, -349,998; Deep-DP on the chr 18 candidate region marked at 349,683 and 349,998 reports the paired alignment chr 18, +349,664, -349,923

Page 3: Step 1: SOAP3

SOAP3 (2-mismatch) puts each read pair into one of the following outcomes:

Both ends can be mapped and paired properly: report the alignments.
Only one end can be mapped, with not too many hits (i.e. <= 30): store the readID (of the aligned end) and hits to ARRAY A.
Only one end can be mapped, with too many hits (i.e. > 30): store the readID (of the aligned end) and hits to ARRAY B.
Both ends cannot be mapped: store the readID (of the first read of the pair) and hits to ARRAY C.
Both ends can be mapped but not paired properly: store the readID and hits to ARRAY A or B (described further on the next slide).

A read pair is paired properly if:
1. Both ends are mapped within the insert size (i.e. a range of distances between the two ends, supplied by the user).
2. The ends are in proper orientation (for Illumina reads, the end aligned to the left side is on the forward strand, while the end aligned to the right side is on the reverse strand).
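
The two pairing conditions can be sketched as a small check. This is a minimal illustration, not SOAP3-dp's own code: the `Hit` record, the function name, the same-chromosome requirement, and the convention that the insert size runs from the leftmost base of the left end to the rightmost base of the right end are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical alignment record (not a SOAP3-dp data structure)."""
    chrom: str
    pos: int      # leftmost reference coordinate of the aligned end
    strand: str   # "+" for forward, "-" for reverse
    length: int   # number of reference bases covered


def is_properly_paired(a: Hit, b: Hit, min_insert: int, max_insert: int) -> bool:
    """Check insert size and Illumina-style orientation for two ends."""
    # Assumption: ends on different chromosomes are never a proper pair.
    if a.chrom != b.chrom:
        return False
    # Order the two ends by their leftmost coordinate.
    left, right = (a, b) if a.pos <= b.pos else (b, a)
    # Condition 2: left end on the forward strand, right end on the reverse.
    if left.strand != "+" or right.strand != "-":
        return False
    # Condition 1: outer distance must fall inside the user-supplied range.
    insert = right.pos + right.length - left.pos
    return min_insert <= insert <= max_insert
```

With 100-base ends, the example alignment chr 6, +4,059, -4,369 has an outer distance of 410 under this convention, so it passes for a range such as 200 to 600.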

Page 4: Step 1: SOAP3 -- Both ends can be mapped but not paired properly

Let x = # of all valid hits of read 1.
Let y = # of all valid hits of read 2.

If x > 30, retain only the best hits of read 1 and reset x = # of best hits of read 1.
If y > 30, retain only the best hits of read 2 and reset y = # of best hits of read 2.

Both reads are mapped (YES, YES) but not paired properly. Each case below marks one read YES and the other NO:

Case              read 1   read 2   Stored in
a) x, y <= 30     YES      NO       ARRAY A
                  NO       YES      ARRAY A
b) x <= 30 < y    YES      NO       ARRAY A
c) y <= 30 < x    NO       YES      ARRAY A
d) 30 < x < y     YES      NO       ARRAY B
e) 30 < y <= x    NO       YES      ARRAY B

Store the read ID and hits of the YES read to ARRAY A or B.
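
The five cases can be written as one routing function. This is a sketch under stated assumptions: the function name and return shape are hypothetical, x and y are taken to be the hit counts after any reset to best hits, and case a) is read as keeping both reads in turn as the YES read.

```python
from typing import List, Tuple

MAX_HITS = 30  # threshold for "too many hits" given on the slides


def route_unpaired(x: int, y: int) -> List[Tuple[int, str]]:
    """Return (read number kept as YES, target array) entries for a pair
    whose two ends are both mapped but not properly paired.

    x, y: hit counts of read 1 and read 2, assumed already reset to the
    number of best hits wherever the original count exceeded MAX_HITS.
    """
    if x <= MAX_HITS and y <= MAX_HITS:
        # Case a): both reads have few hits, so each is kept in turn.
        return [(1, "A"), (2, "A")]
    if x <= MAX_HITS < y:
        return [(1, "A")]   # case b)
    if y <= MAX_HITS < x:
        return [(2, "A")]   # case c)
    if x < y:
        return [(1, "B")]   # case d): 30 < x < y
    return [(2, "B")]       # case e): 30 < y <= x
```

In every case the read with fewer hits is the one kept as YES; ARRAY B is chosen only when both counts still exceed 30.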

Page 5: Step 2 and Step 3: Default DP and New Default DP

Array A -> Default DP
Both ends can be mapped and paired properly: report the alignments.
Otherwise: store the readID of the first read of the pair to ARRAY C.

Array B -> New Default DP
Both ends can be mapped and paired properly: report the alignments.
Otherwise: store the readID of the first read of the pair to ARRAY C.

Page 6: Detailed picture of Default DP and New Default DP

Default-DP: for reads with one end mapped but the other not, AND the number of hits is not too many, use Default-DP to align the unmapped ends.
One end's alignments + the unmapped ends -> candidate region for the unmapped end, next to the mapped region of one end -> Default-DP -> paired alignments
Example: one end's alignment chr 9, +49538 gives the paired alignment chr 9, +49538, -49829

New-Default-DP: for reads with one end mapped but the other not, AND the number of hits is too many, use New-Default-DP to align the unmapped ends.
Seeds of the unmapped ends -> SOAP3 (1-mismatch) -> seed alignments of the unmapped end -> pair up the seed alignments with the alignments of the other end -> candidate region -> New-Default-DP (use DP to align) -> paired alignments
Example: one end's alignment chr 18, +349683 and the unmapped end's seed alignment chr 18, -349998 pair up as chr 18, +349683, -349998; New-Default-DP on the chr 18 candidate region marked at 349683 and 349998 reports the paired alignment chr 18, +349683, -349923
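
One way to picture the candidate region for the unmapped end is as the window where its mate could lie, given the mapped end and the insert-size range. The slides do not give the exact formula, so the sketch below is an assumption throughout: the function name, the 0-based half-open window, and the outer-distance definition of insert size are all hypothetical.

```python
from typing import Tuple


def mate_candidate_region(mapped_pos: int, mapped_strand: str, mapped_len: int,
                          mate_len: int, min_insert: int, max_insert: int,
                          chrom_len: int) -> Tuple[int, int, str]:
    """Return (start, end, expected mate strand) for the unmapped end.

    Assumes Illumina orientation (left end forward, right end reverse) and
    an insert size measured between the outer ends of the fragment.
    """
    if mapped_strand == "+":
        # Mapped end is the left end; the mate ends where the fragment ends.
        start = mapped_pos + min_insert - mate_len
        end = mapped_pos + max_insert
        mate_strand = "-"
    else:
        # Mapped end is the right end; the mate starts where the fragment starts.
        frag_end = mapped_pos + mapped_len
        start = frag_end - max_insert
        end = frag_end - min_insert + mate_len
        mate_strand = "+"
    # Clamp the window to the chromosome.
    return max(0, start), min(chrom_len, end), mate_strand
```

For the chr 9 example with 100-base ends and an assumed insert range of 200 to 500, the mapped end at +49538 gives the window 49638 to 50038 on the reverse strand, which contains the reported mate position 49829.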

Page 7: Step 4: 2-level Deep DP

Input: ARRAY C

ROUND 1 SEEDING for both ends
Seed length: 26
Sample rate: 1/13
Max # of hits allowed: 100

If there exist pairs of hits within the insert size, perform DP for those pairs of hits.
If (1) there exists a seed with too many hits AND (2) there are no pairs of hits within the insert size, go to round 2.

ROUND 2 SEEDING for both ends
Seed length: 30
Sample rate: 1/15
Max # of hits allowed: 1000

If there exist pairs of hits within the insert size, perform DP for those pairs of hits.

Case 1: Valid paired alignments found. Report the alignments.
Case 2: No valid paired alignment found. Store the readID of both ends to ARRAY D.
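
The two seeding rounds can be sketched as follows. The slides do not define "sample rate", so reading 1/13 as "one seed start every 13 bases" is an assumption, as is treating "too many hits" as exceeding the round's maximum; the function and field names are hypothetical.

```python
from typing import Dict, List, Tuple

# Parameters as given on the slide; "step" encodes the assumed meaning
# of the sample rate (1/13 -> a seed start every 13 bases).
ROUNDS: List[Dict[str, int]] = [
    {"seed_len": 26, "step": 13, "max_hits": 100},
    {"seed_len": 30, "step": 15, "max_hits": 1000},
]


def sample_seeds(read: str, seed_len: int, step: int) -> List[Tuple[int, str]]:
    """Return (offset, seed) for every full-length seed starting at a
    multiple of `step` within the read."""
    last_start = len(read) - seed_len
    return [(i, read[i:i + seed_len]) for i in range(0, last_start + 1, step)]


def needs_second_round(seed_hit_counts: List[int], pairs_in_insert: int,
                       max_hits: int) -> bool:
    """Round 2 runs only if (1) some seed has too many hits AND
    (2) no pair of hits lies within the insert size."""
    too_many = any(count > max_hits for count in seed_hit_counts)
    return too_many and pairs_in_insert == 0
```

Under this reading a 100-base read yields six 26-base seeds in round 1 and five 30-base seeds in round 2, so consecutive seeds overlap by half their length in both rounds.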

Page 8: Step 5: Single DP

Array D -> Single DP
The end can be mapped: report the alignments.
Otherwise: report that the ends cannot be aligned.

Page 9: Detailed picture of Single DP

Seeds -> SOAP3 (1-mismatch) -> seed alignments -> candidate region -> Single-DP (use DP to align) -> report the alignments
Example: the seed alignment chr 18, +349,683 marks the chr 18 candidate region; Single-DP reports the alignment chr 18, +349,664
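
A seed alignment can be turned into a candidate region by projecting the whole read around the seed. The slides do not say how the region is sized, so this sketch is an assumption: the function name, the `pad` margin for indels, and the forward-strand-only handling are hypothetical.

```python
from typing import Tuple


def seed_candidate_region(seed_pos: int, seed_offset: int, read_len: int,
                          pad: int, chrom_len: int) -> Tuple[int, int]:
    """Return a (start, end) window for DP around a forward-strand seed hit.

    seed_pos:    reference coordinate where the seed aligned
    seed_offset: offset of the seed inside the read
    pad:         extra bases on each side to leave room for indels (assumed)
    """
    # If the read aligned without gaps, it would start here.
    read_start = seed_pos - seed_offset
    start = max(0, read_start - pad)
    end = min(chrom_len, read_start + read_len + pad)
    return start, end
```

As an illustration only, a seed taken at a hypothetical offset of 19 in the read and aligned at 349,683 implies a read start of 349,664, which matches the position reported in the chr 18 example.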

Page 10: Paired-end alignment (overall workflow)

Load 6M reads (3M pairs)
Create a new CPU thread to load the next 6M reads
SOAP3 (2-mismatch)
New default DP
Default DP
2-level deep DP
Single DP
More reads to process? Yes: repeat with the next 6M reads. No: END.

Note: New-default DP needs the 2BWT index in the GPU, while default DP does not. We therefore run new-default DP before default DP, because after SOAP3 the 2BWT index is already inside the GPU.
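
The load-while-processing loop can be sketched with a background thread that fetches the next batch while the current one is aligned. This is a minimal illustration, not SOAP3-dp's implementation: `run_pipeline`, `batches`, and the `process_batch` callable (standing in for the SOAP3 and DP stages) are hypothetical names.

```python
import threading
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batches(reads: Iterable, batch_size: int) -> Iterator[list]:
    """Yield consecutive lists of up to batch_size reads."""
    it = iter(reads)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def run_pipeline(reads: Iterable, batch_size: int,
                 process_batch: Callable[[list], object]) -> List[object]:
    """Process batches in order while a second thread loads the next one."""
    gen = batches(reads, batch_size)
    current = next(gen, None)          # load the first batch up front
    results = []
    while current is not None:         # "More reads to process?"
        box = {}

        def load_next(box=box):
            # Runs on the loader thread; only this thread touches gen now.
            box["batch"] = next(gen, None)

        loader = threading.Thread(target=load_next)
        loader.start()                 # start loading the next batch
        results.append(process_batch(current))  # align the current batch
        loader.join()                  # wait until the next batch is ready
        current = box["batch"]
    return results
```

A call such as `run_pipeline(reads, 6_000_000, process_batch)` would mirror the batch size on the slide; the stages inside `process_batch` would run in the order listed there.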

Page 11: SOAP3 Architecture

Host (CPU)
Execution: process round 3 alignment & report results (repeated for each batch)
Memory-resident data structures: 2BWT + SA

Device (GPU)
Execution: process 1M reads for round 1 and round 2 alignments (repeated for each batch)
Memory-resident data structures: 2BWT

Page 12: DP with seeding

Host (CPU)
Memory-resident data structures: 2BWT + SA

Device (GPU)
Memory-resident data structures: 2BWT / DP tables

Execution:
1. Copy 2BWT index to GPU & extract seeds of reads in Array C.
2. SOAP3 (1-mismatch): the device processes 1M seeds at a time for round 1 and round 2 alignments; the host processes round 3 alignment.
3. Pair up the seed alignments, clear 2BWT index in GPU & create DP tables in GPU.
4. Perform DP between the reads and the candidate regions.

Page 13: Default DP

Host (CPU)
Memory-resident data structures: 2BWT + SA

Device (GPU)
Memory-resident data structures: DP tables

Execution:
1. Create DP tables in GPU.
2. Perform DP between the reads and the candidate regions.

Page 14: Single-end alignment (overall workflow)

Load 6M single-end reads
Create a new CPU thread to load the next 6M reads
SOAP3 (2-mismatch)
Single DP
More reads to process? Yes: repeat with the next 6M reads. No: END.

Page 15: Paired-end alignment (for read length > 150)

Load 6M reads (3M pairs)
Create a new CPU thread to load the next 6M reads
2-level deep DP
Single DP
More reads to process? Yes: repeat with the next 6M reads. No: END.