A Massively Parallel Architecture for Bioinformatics
Presented by Md Jamiul Jahid
Introduction
• Bioinformatics algorithms are among the demanding workloads in scientific computing
• In general, most bioinformatics algorithms are fairly simple
• They deal with huge amounts of data
• The size of the DNA sequence databases doubles every year
Introduction
• A typical DNA sequence contains 3.4 billion base pairs
• Most algorithms apply only simple operations to the input data, such as
  – Arithmetic operations
  – String matching
  – String comparison
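To make the "simple operations" above concrete, here is a minimal sketch of exact string matching and string comparison on DNA strings. The function names `find_occurrences` and `hamming_distance` are illustrative assumptions and do not come from the paper; the paper's actual kernels may differ.

```python
def hamming_distance(a: str, b: str) -> int:
    """String comparison: count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_occurrences(pattern: str, sequence: str) -> list[int]:
    """String matching: return every start index where pattern occurs exactly."""
    m = len(pattern)
    return [
        i
        for i in range(len(sequence) - m + 1)
        if sequence[i:i + m] == pattern
    ]


if __name__ == "__main__":
    genome = "ACGTACGTTAGC"  # toy input, not real data
    print(find_occurrences("ACG", genome))   # [0, 4]
    print(hamming_distance("ACGT", "ACCT"))  # 1
```

Both functions use nothing beyond character comparisons and counting, which is the kind of operation that maps well onto many small hardware units working side by side.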
Introduction
• Standard CPUs are designed to provide a good instruction mix for almost all commonly used algorithms
• For one specific target class of algorithms they are not efficient
• Results
  – High runtime
  – High energy consumption
  – High cost
Contribution
• The paper presents a massively parallel architecture
• It is built from low-cost FPGAs (Field Programmable Gate Arrays)
• The authors call it COPACOBANA 5000
  – The name stands for Cost-Optimized Parallel Code Breaker ANd Analyzer
COPACOBANA 1000
• This machine is for cryptanalysis: fast code breaking
• 120 low-cost FPGAs
• 20 subunits
• Each subunit carries Xilinx Spartan-3 XC3S1000 FPGAs
COPACOBANA 1000
• Assumptions
  – Programs are parallelizable
  – Demand for data transfer is low
  – Every node needs very little local memory, which can be served from the on-chip RAM of the FPGAs
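The three assumptions above describe a workload that splits into independent pieces. The following sketch illustrates that pattern in software only: a host deals database entries out to nodes, each node works on its own small chunk, and only a single count travels back. The names `partition`, `node_count_hits`, and `host_search` are hypothetical and are not taken from the paper.

```python
def partition(items: list[str], n_nodes: int) -> list[list[str]]:
    """Deal items round-robin into n_nodes independent work lists."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return [items[i::n_nodes] for i in range(n_nodes)]


def node_count_hits(chunk: list[str], pattern: str) -> int:
    """Work of one node: it needs only its own chunk and the pattern."""
    return sum(pattern in seq for seq in chunk)


def host_search(database: list[str], pattern: str, n_nodes: int) -> int:
    """Host side: distribute the chunks, then add up the per-node counts."""
    chunks = partition(database, n_nodes)
    # On parallel hardware these calls would run at the same time;
    # here they run one after another for illustration.
    partial_counts = [node_count_hits(chunk, pattern) for chunk in chunks]
    return sum(partial_counts)


if __name__ == "__main__":
    database = ["ACGT", "TTTT", "GACG", "CCCC", "ACGA"]  # toy data
    print(host_search(database, "ACG", n_nodes=4))  # 3
```

Each node returns one integer, so the data sent back to the host stays small, and no node needs to see another node's chunk.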
COPACOBANA 5000
• Bus concept
  – Point-to-point connections between neighboring FPGA cards
  – Each point-to-point connection contains 8 pairs of wires
  – Each pair runs at 250 MHz, for a total of 2 Gbit/s
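The 2 Gbit/s total follows from the per-pair rate, as the short calculation below shows. It assumes each wire pair carries one bit per clock cycle (so 250 MHz corresponds to 250 Mbit/s). The transfer-time estimate further assumes 2 bits per base (A, C, G, T), which is an illustrative encoding and not one stated in the slides.

```python
WIRE_PAIRS = 8
RATE_PER_PAIR_BIT_S = 250_000_000  # assumption: 1 bit per cycle at 250 MHz


def link_bandwidth_bit_s(pairs: int = WIRE_PAIRS,
                         rate_per_pair: int = RATE_PER_PAIR_BIT_S) -> int:
    """Aggregate bandwidth of one point-to-point link in bit/s."""
    return pairs * rate_per_pair


def transfer_time_s(n_bases: int, bits_per_base: int = 2) -> float:
    """Seconds needed to push n_bases over one link (hypothetical encoding)."""
    return n_bases * bits_per_base / link_bandwidth_bit_s()


if __name__ == "__main__":
    print(link_bandwidth_bit_s())          # 2000000000, i.e. 2 Gbit/s
    print(transfer_time_s(3_400_000_000))  # about 3.4 seconds
```

Under these assumptions, streaming a 3.4-billion-base sequence across a single link takes a few seconds, which gives a rough feel for why low data-transfer demand matters for this design.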
COPACOBANA 5000
• Controller
  – The root entity of control runs on a remote host computer
  – The host is connected to COPACOBANA 5000 by LAN
  – Two scenarios
    • Data on the remote host
    • Data on COPACOBANA 5000
COPACOBANA 5000
• FPGA card
  – Xilinx Spartan-3 5000 FPGAs are used
  – Each card contains 8 FPGAs
  – All FPGAs are globally clocked
Performance Estimation
• Comparison between
  – PC
  – COPACOBANA 1000
  – COPACOBANA 5000
Performance Estimation
Conclusion
• The paper proposes new hardware for running bioinformatics algorithms
• The hardware is
  – Cheap
  – Low in power consumption
  – Efficient
Questions?
Thank You
Reference
• Gerd Pfeiffer, Stefan Baumgart, Jan Schröder, and Manfred Schimmler, "A Massively Parallel Architecture for Bioinformatics," 9th International Conference on Computational Science (ICCS 2009).