Evaluating the Imagine Stream Processor. Jung Ho Ahn, William J. Dally, Brucek Khailany, Ujval J. Kapasi, and Abhishek Das. ISCA 2004.


Page 1: Evaluating the Imagine Stream Processor

Evaluating the Imagine Stream Processor

Jung Ho Ahn, William J. Dally, Brucek Khailany, Ujval J. Kapasi, and Abhishek Das

ISCA 2004

Page 2: Evaluating the Imagine Stream Processor

Motivation

• Provide efficiency of an ASIC
• Provide flexibility of a programmable processor
• Simplify special-purpose processor design
• Lower special-purpose processor design cost
• Provide better applicability
• Target media applications

Page 3: Evaluating the Imagine Stream Processor

Stream Architecture

Page 4: Evaluating the Imagine Stream Processor

Development Board

PowerPC, 150 MHz
2 x Imagine, 200 MHz
FPGA Bridge, 66 MHz
256 MB of SDRAM per Imagine, 100 MHz

Page 5: Evaluating the Imagine Stream Processor

Applications

Page 6: Evaluating the Imagine Stream Processor

Mapping

Page 7: Evaluating the Imagine Stream Processor

Execution on a Single Stream

[Figure: Kernel 1 and the SRF, with an input stream and an output stream, shown over iterations 1 through n]

Page 8: Evaluating the Imagine Stream Processor

Execution of Multiple Kernels

[Figure: the SRF, Streams 1 to 4, and Kernels 1 to 3, each kernel marked as processing]

Page 9: Evaluating the Imagine Stream Processor

Application Performance

GOPS: 18%

GFLOPS: 60%

Page 10: Evaluating the Imagine Stream Processor

Sources of Overhead

Page 11: Evaluating the Imagine Stream Processor

Stream Length Effects

Page 12: Evaluating the Imagine Stream Processor

Access Pattern Effects

Page 13: Evaluating the Imagine Stream Processor

Energy Efficiency

Energy consumption per FLOP (normalized to a 0.13 µm, 1.2 V process):

Imagine @ 200 MHz: 277 pJ/FLOP

TI C67x DSP @ 225 MHz: 889 pJ/FLOP (3.2x more)

Intel Pentium M @ 1200 MHz: 3600 pJ/FLOP (13x more)
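The "3.2x" and "13x" figures follow directly from the per-FLOP numbers on this slide. A minimal Python sketch that reproduces the arithmetic; the dictionary and function names are hypothetical and chosen only for illustration:

```python
# Energy per FLOP as quoted on the slide, in pJ/FLOP
# (normalized to a 0.13 um, 1.2 V process).
ENERGY_PJ_PER_FLOP = {
    "Imagine": 277,
    "TI C67x DSP": 889,
    "Intel Pentium M": 3600,
}


def ratio_to_imagine(name):
    """Return how many times more energy per FLOP `name` uses than Imagine."""
    return ENERGY_PJ_PER_FLOP[name] / ENERGY_PJ_PER_FLOP["Imagine"]


for name in ENERGY_PJ_PER_FLOP:
    print(f"{name}: {ratio_to_imagine(name):.1f}x Imagine's energy per FLOP")
```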

Page 14: Evaluating the Imagine Stream Processor

Memory Bandwidth Requirement

Page 15: Evaluating the Imagine Stream Processor

Host Processor Bandwidth Requirement

Page 16: Evaluating the Imagine Stream Processor

Programming Model

Page 17: Evaluating the Imagine Stream Processor

Compiler Optimizations: Stream Ordering

Page 18: Evaluating the Imagine Stream Processor

Compiler Optimizations: SRF Overlapping and Packing

Page 19: Evaluating the Imagine Stream Processor

Compiler Optimizations: Strip-mining
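In the general compiler sense, strip-mining splits a long stream into strips small enough to fit in on-chip storage and runs the kernel once per strip. The slide itself carries no detail, so the following is only a sketch of the generic technique in Python; `strip_mine`, `run_kernel_strip_mined`, `capacity`, and `double_kernel` are hypothetical names and do not describe Imagine's actual compiler or API:

```python
def strip_mine(stream, capacity):
    """Yield consecutive strips of at most `capacity` elements from `stream`."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    for start in range(0, len(stream), capacity):
        yield stream[start:start + capacity]


def run_kernel_strip_mined(kernel, stream, capacity):
    """Apply `kernel` to one strip at a time and concatenate the outputs."""
    output = []
    for strip in strip_mine(stream, capacity):
        # Each strip stands in for a batch small enough for on-chip storage.
        output.extend(kernel(strip))
    return output


def double_kernel(strip):
    """Toy kernel (an assumption for illustration): double every element."""
    return [2 * x for x in strip]


result = run_kernel_strip_mined(double_kernel, list(range(10)), capacity=4)
print(result)
```

The last strip may be shorter than `capacity`; the sketch handles that through ordinary slice semantics.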

Page 20: Evaluating the Imagine Stream Processor

Compiler Optimizations: Loop Unrolling and Software Pipelining

Page 21: Evaluating the Imagine Stream Processor

Conclusions

• Provides performance close to that of an ASIC, with flexibility through programming

• Can sustain between 16% and 60% of the peak arithmetic performance

• Exposed 2-level register file allows compiler to exploit locality

• Broader applicability

• Requires considerable programming effort

• Limited to media applications with regular control flow

Page 22: Evaluating the Imagine Stream Processor

Collab Questions

• How does the performance compare to other processors? (Dan, Marko, Jason, Prateeksha, Chris)

• What is the compiler efficiency? (Mario, Liang)

• How were the design decisions motivated? (Jing, Marisabel)

• How does the programming model compare to that of GPUs? (Greg)

Page 23: Evaluating the Imagine Stream Processor
Page 24: Evaluating the Imagine Stream Processor

Kernels