Parallel Algorithm for Multiple Genome Alignment Using Multiple Clusters
Nova Ahmed, Yi Pan, Art Vandenberg (Georgia State University)
SURA Cyberinfrastructure Workshop: Grid Application Planning & Implementation
January 5-7, 2005
Southeastern Universities Research Association
Discussion Topics…
• Sequence alignment problem
• Memory efficient algorithm
• Convergence toward collaboration
• System configurations
• Results (part 1, part 2)
• Conclusions
• Future work
Sequence alignment problem
• Sequences used to find biologically meaningful relationships among organisms
  • Evolutionary information
  • Determining diseases, causes, and cures
  • Finding out information about proteins
• Problem is especially compute intensive for long sequences
  • Needleman and Wunsch (1970) - optimal global alignment
  • Smith and Waterman (1981) - optimal local alignment
  • Taylor (1987) - multiple sequence alignment by pairwise alignment
  • BLAST trades off optimal results for faster computation
• Challenge - achieve optimal results without sacrificing speed
Memory efficient algorithm
• Based on pairwise algorithm
  • Similarity Matrix generated to compare all sequence positions
  • Observation that many "alignment scores" are zero
• Similarity Matrix reduced by storing only non-zero elements
  • Row-column information stored along with value
  • Block of memory dynamically allocated as each non-zero element is found
  • Data structure used to access allocated blocks
• Parallelism introduced to reduce computation
Similarity Matrix Generation
• Alignment of DNA sequences:
  Sequence X: TGATGGAGGT
  Sequence Y: GATAGG
• 1 = matching; 0 = non-matching
• ss = substitution score; gp = gap score
• Generate Similarity Matrix: each cell takes the max score with respect to its neighbors
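A minimal sketch of the generation step, in Python, assuming the Smith-Waterman-style local-alignment recurrence (the max is clamped at zero, which fits the observation that many alignment scores are zero); the gap value of -1 is an illustrative assumption, not taken from the slides:

    # Similarity-matrix generation (assumed Smith-Waterman-style local form).
    # ss: 1 = matching, 0 = non-matching, per the slide; gap = -1 is illustrative.
    def similarity_matrix(x, y, gap=-1):
        rows, cols = len(x) + 1, len(y) + 1
        h = [[0] * cols for _ in range(rows)]
        for i in range(1, rows):
            for j in range(1, cols):
                ss = 1 if x[i - 1] == y[j - 1] else 0
                h[i][j] = max(0,                       # scores never go negative
                              h[i - 1][j - 1] + ss,    # substitution (ss)
                              h[i - 1][j] + gap,       # gap in Y (gp)
                              h[i][j - 1] + gap)       # gap in X (gp)
        return h

    h = similarity_matrix("TGATGGAGGT", "GATAGG")      # the slide's example pair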
Trace sequences
• Back trace matrix to find sequence matches
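A sketch of one plausible back trace, assuming the standard local-alignment walk: start from the highest-scoring cell and step back through whichever neighbor produced each score, stopping at a zero cell; the tie-breaking order is an illustrative choice:

    # Back trace over the similarity matrix from the previous sketch
    # (gap = -1 to match it). Tie-breaking order is an illustrative choice.
    def trace_back(h, x, y, gap=-1):
        i, j = max(((r, c) for r in range(len(h)) for c in range(len(h[0]))),
                   key=lambda rc: h[rc[0]][rc[1]])     # best-scoring cell
        ax, ay = [], []
        while i > 0 and j > 0 and h[i][j] > 0:
            ss = 1 if x[i - 1] == y[j - 1] else 0
            if h[i][j] == h[i - 1][j - 1] + ss:        # diagonal: (mis)match
                ax.append(x[i - 1]); ay.append(y[j - 1])
                i, j = i - 1, j - 1
            elif h[i][j] == h[i - 1][j] + gap:         # up: gap in Y
                ax.append(x[i - 1]); ay.append("-")
                i -= 1
            else:                                      # left: gap in X
                ax.append("-"); ay.append(y[j - 1])
                j -= 1
        return "".join(reversed(ax)), "".join(reversed(ay))

Applied to the matrix from the previous sketch, this returns the aligned substrings of X and Y.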
Data structure
• Algorithm calculates only non-zero values
• Memory dynamically allocated as needed
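A minimal sketch of the reduced structure described above: only non-zero scores are stored, each keyed by its row-column information; a Python dict stands in for the dynamically allocated blocks, whose actual layout the slides do not specify:

    # Memory-efficient variant: store only non-zero cells with their
    # row-column information. A dict stands in for the allocated blocks.
    def sparse_similarity_matrix(x, y, gap=-1):
        h = {}                                    # (row, col) -> non-zero score
        get = lambda i, j: h.get((i, j), 0)       # absent cells read as zero
        for i in range(1, len(x) + 1):
            for j in range(1, len(y) + 1):
                ss = 1 if x[i - 1] == y[j - 1] else 0
                score = max(0,
                            get(i - 1, j - 1) + ss,
                            get(i - 1, j) + gap,
                            get(i, j - 1) + gap)
                if score > 0:                     # allocate only when non-zero
                    h[(i, j)] = score
        return h

Because most cells of the full matrix are zero, memory use scales with the number of non-zero entries rather than with len(X) × len(Y).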
Parallel distribution of multiple sequences
[Figure: sequences 1-6 and sequences 7-12 distributed across nodes; within a node, subsets Seq 1-2, Seq 3-4, Seq 5-6 are assigned to separate processors]
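A sketch of how the pairwise work might be split across processes, mirroring the Seq 1-2 / 3-4 / 5-6 split in the figure; mpi4py is used here only as a stand-in for the MPICH/MPICH-G2 setup the project actually ran, and the sequence list and round-robin assignment are illustrative:

    # Distribute pairwise alignments across MPI ranks (mpi4py stand-in for
    # the MPICH / MPICH-G2 environment actually used).
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    sequences = ["TGATGGAGGT", "GATAGG", "GGAGGT"]    # placeholder input set
    pairs = [(a, b) for n, a in enumerate(sequences)
             for b in sequences[n + 1:]]              # all pairwise combinations

    my_pairs = pairs[rank::size]                      # round-robin assignment
    my_results = [sparse_similarity_matrix(a, b)      # from the sketch above
                  for a, b in my_pairs]

    all_results = comm.gather(my_results, root=0)     # collect on rank 0

Launched with, e.g., mpiexec -n 3, each rank aligns its own block of pairs and rank 0 gathers the results.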
Convergence toward collaboration
• Algorithm implementation
  • Nova Ahmed, Master's CS student
  • Dr. Yi Pan, CS, graduate advisor
• Shared memory system - Georgia State
  • Algorithm implementation and initial validation results
• NMI Integration Testbed program
  • Georgia State
    - Art Vandenberg, Victor Bolet, et al.
  • University of Alabama at Birmingham
    - Jill Gemmill, John-Paul Robinson, Pravin Joshi
• SURA NMI Testbed Grid
  • Looking for applications to demonstrate value
System configurations
• Shared memory - Georgia State
  • SGI Origin 2000
    - 24 × 250 MHz MIPS R10000 processors; 4 GB total RAM
• Clusters - University of Alabama at Birmingham
  • Single Cluster
    - 8-node Beowulf cluster (each node: 4 × 550 MHz Pentium III, 512 MB RAM)
  • Single Cluster Grid
    - Same 8-node Beowulf cluster with Globus Toolkit 3.0
  • Multi-Cluster
    - 2 additional grid-enabled clusters (small SMP systems)
  • Multi-Cluster interconnect speed essentially 100 Mb/s
Results, part 1
• Initial validation of algorithm on shared memory
• UAB Cluster
  • As "relative comparison" to shared-memory performance
• UAB grid-enabled cluster
  • To evaluate impact of grid middleware layer
Initial Validation: Shared Memory Machine
Performance validates algorithm: computation time decreases with increased number of processors.
[Chart: Computation Time (Shared Memory) vs. Number of Processors (2-12)]
Limitations
• Memory: max sequence is 2000 × 2000
• Processors: policy limits student to 12 processors
• Not scalable
Results: UAB Clusters; Shared Memory*
• Genome lengths increased to 3000 (student limit removed on shared memory)
* NB: Results comparing clusters with shared memory are relative; the systems are distinctly different.
[Chart: Computation Time (seconds) vs. Number of Processors (2-26), genome length 3000, for Grid, Cluster, and Shared Memory]
Results: Grid-enabled cluster (Globus, MPICH)
Advantages of grid-enabled cluster:
• Longer sequences - up to 10,000 length tested
• Scalable - can add new cluster nodes to the grid
• Easier job submission - don't need an account on every node
• Scheduling is easier - can submit multiple jobs at one time
[Chart: Computation Time (seconds) vs. Number of Processors (2-26), genome length 10,000, for Grid and Cluster]
Results, part 2
• Focus on clusters
  • UAB Cluster
  • UAB grid-enabled cluster
  • Multi-clusters at UAB
• Multiple genome alignment - not just pairwise
  • Sequence set from sequence library
  • Approx. 150 sequences ranging from 80,000 to 1,000,000 in length
• Globus Toolkit 3.0, MPICH-G2
Computation Time vs. Number of Elements per Processor
Using 9 processors in each configuration (cluster, grid cluster, multi-grid cluster)
[Chart: Computation Time (sec, 0-160) vs. Number of Elements per Processor (0-50) for Single Cluster, Single Clustered Grid, and Multi Clustered Grid]
Computation Time
9 processors available in multi-cluster; 32 processors for other configs.
[Chart: Computation Time (sec, 0-500) vs. Number of Processors (0-30) for Single Cluster, Single Clustered Grid, and Multi Clustered Grid]
Speed up (time on 1 CPU / time on n CPUs)
[Chart: Speed up (0-9) vs. Number of Processors (0-30) for Single Cluster, Single Clustered Grid, and Multi Clustered Grid]
9 processors available in multi-cluster; 32 processors for other configs.
Some Conclusions
• Having cluster nodes available via the Testbed is beneficial
  • Enables access where the resource is not available locally
  • Empowers student investigation
• Grid capability demonstrated
  • Provides an awareness and outreach vector
  • Nova Ahmed's thesis defense engages other graduate students
  • Concrete "take away" that engages faculty/IT/student discussion
• Some interesting results
  • Hypothesis: multi-cluster may provide better results than one cluster
  • Research leads to understanding and learning, whatever the hypothesis result
• Ahmed et al., "Memory Efficient Pair-Wise Genome Alignment Algorithm - A Small-Scale Application with Grid Potential," Proceedings of Grid and Cooperative Computing (GCC 2004), Lecture Notes in Computer Science
Future Work
• Running across clusters at different sites
• Intelligent agent: submit to mixed environment
– shared memory and/or clusters and/or …
• Using BridgeCA for transparent access
• Optically connected clusters?
• Analysis of network factors
  • cf. Warren Matthews, GaTech, et al., end-to-end performance
Questions / Contacts
Georgia State University
Nova Ahmed [email protected]
Yi Pan [email protected]
Art Vandenberg [email protected]
Acknowledgement
• This work is supported in part by the NSF Middleware Initiative Cooperative Agreement No. ANI-0123937. Any opinions, findings, conclusions or recommendations expressed herein are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.