introduction
Post on 30-Dec-2015
BioPerf is a suite of representative applications that we have assembled from the computational biology community. The codes were selected to span a breadth of algorithms and performance characteristics. We have analyzed the complexity of these codes at the instruction and memory level, using “live” and aggregate data, on a contemporary high-performance architecture (Apple G5 with the IBM PowerPC 970) and on Mambo, the IBM cycle-accurate simulator previously used to design supercomputers such as IBM p-Series and BlueGene and currently being used to model future systems. Our work is therefore one of the first efforts to incorporate life science application performance into the optimization of high-end computer system architectures. Through dual-platform performance analysis, we offer system design parameters for machine configurations that may improve the performance of these codes. We intend this suite to be useful to both biologists and computer scientists evaluating systems that run bioinformatics applications.
Incorporating Life Science Applications into the Architectural Optimizations of Next-Generation Petaflops Systems
Introduction Performance Analysis
BioPerf
David A. Bader, Georgia Institute of Technology
Vipin Sachdeva, University of New Mexico
The following codes are included in BioPerf:
Package     Executable
BLAST       blastn, blastp
FASTA       fasta34_t, ssearch34_t
CLUSTALW    clustalw, clustalw_smp
HMMER       hmmsearch, hmmpfam
T-COFFEE    tcoffee
GLIMMER     glimmer, glimmer-package
PHYLIP      dnapenny, promlk
GRAPPA      grappa
CE          ce
PREDATOR    predator
[Figure: three panels, each plotting instructions completed per cycle against the L1 data cache miss rate over the course of the run.]
Blast, hmmpfam, and tcoffee performance graphs (from left to right)
Instructions per cycle rises over the same intervals in which the L1 data miss rate falls. We can thus correlate the application's performance, as it varies during the run, with the system metrics that affect it.
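As an illustration of this kind of correlation, the sketch below derives per-interval instructions per cycle and L1 data miss rate from raw counter samples and computes their Pearson correlation. The function names and the sample values are hypothetical and are not BioPerf data or tooling; the only assumption is that the counters are sampled over the same intervals.

```python
from math import sqrt

def per_interval_rates(instr, cycles, l1d_misses, l1d_accesses):
    """Turn raw per-interval counter samples into IPC and L1D miss rate."""
    ipc = [i / c for i, c in zip(instr, cycles)]
    miss_rate = [m / a for m, a in zip(l1d_misses, l1d_accesses)]
    return ipc, miss_rate

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up samples for four intervals (illustrative only).
instr = [900, 1100, 600, 1200]
cycles = [1000, 1000, 1000, 1000]
l1d_misses = [30, 10, 50, 5]
l1d_accesses = [1000, 1000, 1000, 1000]

ipc, miss_rate = per_interval_rates(instr, cycles, l1d_misses, l1d_accesses)
r = pearson(ipc, miss_rate)
print(f"correlation between IPC and L1D miss rate: {r:.3f}")
```

A strongly negative coefficient would be consistent with the pattern described above, though correlation alone does not show that cache misses cause the slowdown.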
Methodology Performance Analysis through live data
Cumulative Metrics
Separated Regions of Performance
[Figure: three panels, one per ClustalW region, each plotting the INST_TYPE_ARITH and L1D_CACHE_MISS_USER counters.]
ClustalW region I (top), II (bottom left) and III (bottom right) showing differences in algorithmic complexity and memory access pattern.
Clustalw’s performance can be roughly divided into three regions:
• Every sequence is compared against every other sequence using Smith-Waterman, a dynamic programming algorithm with quadratic time complexity.
• The neighbor-joining method uses the pairwise comparison scores to build a guide tree with the sequences at its leaves.
• The sequences are combined into a multiple sequence alignment according to the guide tree.
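The first region can be sketched as follows: a minimal Smith-Waterman local alignment score with a linear gap penalty, applied to every pair of sequences. This is not ClustalW's own implementation; the function names and the scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty).

    Fills a len(a) x len(b) dynamic programming table row by row,
    so time is proportional to len(a) * len(b).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def all_pairs_scores(seqs):
    """Score every unordered pair of sequences: n * (n - 1) / 2 comparisons."""
    return {
        (i, j): smith_waterman_score(seqs[i], seqs[j])
        for i in range(len(seqs))
        for j in range(i + 1, len(seqs))
    }

# Toy sequences (illustrative only).
scores = all_pairs_scores(["ACGT", "ACGA", "TTTT"])
print(scores)
```

Only two rows of the table are kept at a time because the sketch returns the score alone; recovering the alignment itself would require the full table or a traceback structure.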
Instruction Profiling
[Figure: two panels plotting instructions completed per cycle against branch mispredicts per instruction and against the L1 data miss rate.]
Clustalw’s livegraphs with L1d miss rate and branch mispredicts (top left and right).
Performance of the last phase of clustalw is more closely related to branch mispredicts than L1 data miss rate.
www.bioperf.org