mit opencourseware 6.189 multicore programming primer ... · csr-1 c is the common intel tflops...

104
MIT OpenCourseWare http://ocw.mit.edu 6.189 Multicore Programming Primer, January (IAP) 2007 Please use the following citation format: Bill Thies, 6.189 Multicore Programming Primer, January (IAP) 2007. (Massachusetts Institute of Technology: MIT OpenCourseWare). http://ocw.mit.edu (accessed MM DD, YYYY). License: Creative Commons Attribution-Noncommercial-Share Alike. Note: Please use the actual date you accessed this material in your citation. For more information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms

Upload: others

Post on 04-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

MIT OpenCourseWare http://ocw.mit.edu

6.189 Multicore Programming Primer, January (IAP) 2007

Please use the following citation format:

Bill Thies, 6.189 Multicore Programming Primer, January (IAP) 2007. (Massachusetts Institute of Technology: MIT OpenCourseWare). http://ocw.mit.edu (accessed MM DD, YYYY). License: Creative Commons Attribution-Noncommercial-Share Alike.

Note: Please use the actual date you accessed this material in your citation.

For more information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms

Page 2: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

6.189 IAP 2007

Lecture 8

The StreamIt Language

Bill Thies, MIT. 1 6.189 IAP 2007 MIT

Page 3: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Languages Have Not Kept Up

C von-Neumann Modern machine architecture

�� Develop cool architecture withDevelop cool architecture with complicated, ad-hoc languagecomplicated, ad-hoc language

� Bend over backwards to supportold languages like C/C++

Images by MIT OpenCourseW

Two choices:●● Two choices:Two choices:

Bill Thies, MIT. 2 6.189 IAP 2007 MIT

Images by MIT OpenCourseWare.

are.

Page 4: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Why a New Language?

512

256

128

64

# of 32

16cores 8

4

2

For uniprocessors, PicoPi chipcochip AmbricbriAm cAMA 2045M2045C was: PC102PC102

CiscoCisco

IntelIntelTflopsTflops•Portable

CSR-1CSR-1

•High Performance RazRa aza CaviCav umium•Composable RawRa XLRw XLR OcteonOcteon

•Malleable Niagara CellCeNiagara ll

•Maintainable Boardcom 1480 Opteron 4PBoardcom 1480 Opteron 4PXeX on MPeon MP

Xbox36Xb 0ox360PA-8800PA-8800 OpteronOpteron TanglewoodTanglewood

Power4Power4

4004 80868080

8008

286 386 486 Pentium P2 P3 ItaniumP4

Itanium 2

Yonah

1 Yonah

PExtremPE e

Athlon

xtreme Power6Power6

1970 1975 1980 1985 1990 1995 2000 2005 20??

Bill Thies, MIT. 3 6.189 IAP 2007 MIT

Page 5: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Why a New Language?

512

256

128

64

# of 32

16cores 8

4

2

PC102PC102 AMA 2045M2045Uniprocessors: PicoPi chipcochip AmbricbriAm c

CiscoCiscoCSR-1CSR-1

IntelIntelC is the common TflopsTflops

machine language RazRa aza CaviCav umium

RawRa XLRw XLR OcteonOcteon

Niagara CellCeNiagara ll

Boardcom 1480 Opteron 4PBoardcom 1480 Opteron 4PXeX on MPeon MP

Xbox36Xb 0ox360PA-8800PA-8800 OpteronOpteron TanglewoodTanglewood

Power4Power4

4004 80868080

8008

286 386 486 Pentium P2 P3 ItaniumP4

Itanium 2

Yonah

1 Yonah

PExtremPE e

Athlon

xtreme Power6Power6

1970 1975 1980 1985 1990 1995 2000 2005 20??

Bill Thies, MIT. 4 6.189 IAP 2007 MIT

Page 6: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Why a New Language?

512

256

128

64

# of 32

16cores 8

4

2

1

What is the common machine language for multicores?

4004

8008

80868080 286 386 486 Pentium P2 P3 P4 Itanium

Itanium 2

Raw

Power4 Opteron

Power6

Niagara

Yonah PExtreme

Tanglewood

Cell

Intel Tflops

teonRa

Xbox360Xb

Cavium OcCavRaza

XLR Ra

PA-8800PA

Cisco CSR-1 Ci

Picochip PC102 Pi

Boardcom 1480B Opteron 4P Xeon MP

Opteron

Athlon

Ambric AM2045 Am

w

Power4 Opteron

Power6

Niagara

YonahPExtreme

Tanglewood

Cell

IntelTflops

ox360

iumOcteon

zaXLR

-8800

scoCSR-1

cochipPC102

oardcom 1480 4PXeon MP

bricAM2045

1970 1975 1980 1985 1990 1995 2000 2005 20?? Bill Thies, MIT. 5 6.189 IAP 2007 MIT

Page 7: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Common Machine Languages

Uniprocessors: Multicores:Common PropertiesCommon Properties Multiple flows of control

Single memory image

Single flow of control

Multiple local memories

Differences:Differences: Number and capabilities of cores

ISA

Register File

Communication Model

Functional Units Synchronization Model

von-Neumann languages represent the Need common machine language(s) common properties and abstract away for multicores the differences

Bill Thies, MIT. 6 6.189 IAP 2007 MIT

Page 8: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

6.189 IAP 2007 MIT

Streaming as a Common Machine Language

Adder

Speaker

AtoD

FMDemod

LPF1

Scatter

Gather

LPF2 LPF3

HPF1 HPF2 HPF3

● For programs based on streams of data � Audio, video, DSP, networking, and

cryptographic processing kernels � Examples: HDTV editing, radar

tracking, microphone arrays, cell phone base stations, graphics

● Several attractive properties � Regular and repeating computation � Independent filters

with explicit communication � Task, data, and pipeline parallelism

Bill Thies, MIT. 7

Page 9: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Streaming Models of Computation

● Many different ways to represent streaming � Do senders/receivers block? � How much buffering is allowed on channels? � Is computation deterministic? � Can you avoid deadlock?

● Three common models: 1. Kahn Process Networks 2. Synchronous Dataflow 3. Communicating Sequential Processes

Bill Thies, MIT. 8 6.189 IAP 2007 MIT

Page 10: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Streaming Models of Computation

Communication Pattern

Buffering Notes

Kahn process networks (KPN)

Data-dependent, but deterministic

Conceptually unbounded

- UNIX pipes - Ambric (startup)

Synchronous dataflow (SDF)

Static Fixed by compiler

- Static scheduling - Deadlock freedom

Communicating Sequential Processes (CSP)

Data-dependent, allows non-determinism

None (Rendesvouz)

- Rich synchronization primitives

- Occam language

SDF

KPN CSP space of program behaviors

Bill Thies, MIT. 9 6.189 IAP 2007 MIT

Page 11: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

The StreamIt Language

● A high-level, architecture-independent language for streaming applications � Improves programmer productivity (vs. Java, C) � Offers scalable performance on multicores

● Based on synchronous dataflow, with dynamic extensions � Compiler determines execution order of filters � Many aggressive optimizations possible

Bill Thies, MIT. 10 6.189 IAP 2007 MIT

Page 12: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

The StreamIt Project

StreamIt Program

Front-end

Stream-Aware Optimizations

Uniprocessor backend

Cluster backend

Raw backend

IBM X10 backend

C/C++ C per tile + msg code

Streaming X10 runtime

Annotated Java

MPI-like C/C++

Simulator (Java Library)

● Applications � DES and Serpent [PLDI 05]� MPEG-2 [IPDPS 06]� SAR, DSP benchmarks, JPEG, …

● Programmability � StreamIt Language (CC 02) � Teleport Messaging (PPOPP 05)� Programming Environment in Eclipse (P-PHEC 05)

● Domain Specific Optimizations � Linear Analysis and Optimization (PLDI 03) � Optimizations for bit streaming (PLDI 05) � Linear State Space Analysis (CASES 05)

● Architecture Specific Optimizations � Compiling for Communication-Exposed

Architectures (ASPLOS 02)� Phased Scheduling (LCTES 03)� Cache Aware Optimization (LCTES 05)� Load-Balanced Rendering

(Graphics Hardware 05)

Bill Thies, MIT. 11 6.189 IAP 2007 MIT

Page 13: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Example: A Simple Counter

void->void pipeline Counter() { add IntSource(); add IntPrinter();

} void->int filter IntSource() {

int x; init { x = 0; }

Counter

IntSource

IntPrinter

work push 1 { push (x++); } % strc Counter.str –o counter % ./counter –i 4

} 0

int->void filter IntPrinter() { 1 work pop 1 { print(pop()); } 2

} 3

Bill Thies, MIT. 12 6.189 IAP 2007 MIT

Page 14: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Representing Streams

● Conventional wisdom: streams are graphs � Graphs have no simple textual representation � Graphs are difficult to analyze and optimize

● Insight: stream programs have structure

unstructured structuredBill Thies, MIT. 13 6.189 IAP 2007 MIT

Page 15: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Structured Streams

parallel computation

may be any StreamIt language construct

joinersplitter

pipeline

feedback loop

joiner splitter

splitjoin

filter ● Each structure is single-input, single-output

● Hierarchical and composable

Bill Thies, MIT. 14 6.189 IAP 2007 MIT

Page 16: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Filter Example: Low Pass Filter

float->float filter LowPassFilter (int N, float freq) { float[N] weights;

init {

weights = calcWeights(freq); N }

work peek N push 1 pop 1 { float result = 0; for (int i=0; i<weights.length; i++) {

result += weights[i] * peek(i); } push(result); pop();

} }

filter

Bill Thies, MIT. 15 6.189 IAP 2007 MIT

Page 17: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Low Pass Filter in C

void FIR(int* src, ● FIR functionality obscured by int* dest, buffer management detailsint* srcIndex,int* destIndex, ● Programmer must commit to aint srcBufferSize, int destBufferSize, particular buffer implementationint N) { strategyfloat result = 0.0;for (int i = 0; i < N; i++) {result += weights[i] * src[(*srcIndex + i) % srcBufferSize];

}dest[*destIndex] = result;*srcIndex = (*srcIndex + 1) % srcBufferSize;*destIndex = (*destIndex + 1) % destBufferSize;

}

Bill Thies, MIT. 16 6.189 IAP 2007 MIT

Page 18: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Pipeline Example: Band Pass Filter

float→float pipeline BandPassFilter (int N, float low, float high) {

add LowPassFilter(N, low); add HighPassFilter(N, high);

}

HighPassFilter

LowPassFilter

Bill Thies, MIT. 17 6.189 IAP 2007 MIT

Page 19: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoin Example: Equalizer

duplicate

BPF BPF BPF

Adder

roundrobin

Equalizerfloat→float pipeline Equalizer (int N, float lo, float hi) {

add splitjoin {

split duplicate;

for (int i=0; i<N; i++)

add BandPassFilter(64, lo + i*(hi - lo)/N);

join roundrobin(1);

}

add Adder(N);

}

Bill Thies, MIT. 18 6.189 IAP 2007 MIT

Page 20: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Building Larger Programs: FMRadio

void->void pipeline FMRadio(int N, float lo, float hi) {

add AtoD();

add FMDemod();

add splitjoin { split duplicate; for (int i=0; i<N; i++) {

add pipeline {

add LowPassFilter(lo + i*(hi - lo)/N);

add HighPassFilter(lo + i*(hi - lo)/N); }

}join roundrobin();

} add Adder();

add Speaker();

Adder

Speaker

AtoD

FMDemod

LPF1

Duplicate

RoundRobin

LPF2 LPF3

HPF1 HPF2 HPF3

} Bill Thies, MIT. 19 6.189 IAP 2007 MIT

Page 21: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

The Beauty of Streaming

“Some programs are elegant, some are exquisite, some are sparkling. My claim is that it is possible to write grand programs, noble programs, truly magnificient ones!”

— Don Knuth, ACM Turing Award Lecture

Bill Thies, MIT. 20 6.189 IAP 2007 MIT

Image removed due to copyright restrictions.

Page 22: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 21 6.189 IAP 2007 MIT

Page 23: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 22 6.189 IAP 2007 MIT

Page 24: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 23 6.189 IAP 2007 MIT

Page 25: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 24 6.189 IAP 2007 MIT

Page 26: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 25 6.189 IAP 2007 MIT

Page 27: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 26 6.189 IAP 2007 MIT

Page 28: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 27 6.189 IAP 2007 MIT

Page 29: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 28 6.189 IAP 2007 MIT

Page 30: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 29 6.189 IAP 2007 MIT

Page 31: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 30 6.189 IAP 2007 MIT

Page 32: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 31 6.189 IAP 2007 MIT

Page 33: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 32 6.189 IAP 2007 MIT

Page 34: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 33 6.189 IAP 2007 MIT

Page 35: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(N) join roundrobin(N)

Bill Thies, MIT. 34 6.189 IAP 2007 MIT

Page 36: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 35 6.189 IAP 2007 MIT

Page 37: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 36 6.189 IAP 2007 MIT

Page 38: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 37 6.189 IAP 2007 MIT

Page 39: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 38 6.189 IAP 2007 MIT

Page 40: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 39 6.189 IAP 2007 MIT

Page 41: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 40 6.189 IAP 2007 MIT

Page 42: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 41 6.189 IAP 2007 MIT

Page 43: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 42 6.189 IAP 2007 MIT

Page 44: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 43 6.189 IAP 2007 MIT

Page 45: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 44 6.189 IAP 2007 MIT

Page 46: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 45 6.189 IAP 2007 MIT

Page 47: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 46 6.189 IAP 2007 MIT

Page 48: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 47 6.189 IAP 2007 MIT

Page 49: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 48 6.189 IAP 2007 MIT

Page 50: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 49 6.189 IAP 2007 MIT

Page 51: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 50 6.189 IAP 2007 MIT

Page 52: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 51 6.189 IAP 2007 MIT

Page 53: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 52 6.189 IAP 2007 MIT

Page 54: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 53 6.189 IAP 2007 MIT

Page 55: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 54 6.189 IAP 2007 MIT

Page 56: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 55 6.189 IAP 2007 MIT

Page 57: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 56 6.189 IAP 2007 MIT

Page 58: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 57 6.189 IAP 2007 MIT

Page 59: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 58 6.189 IAP 2007 MIT

Page 60: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 59 6.189 IAP 2007 MIT

Page 61: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 60 6.189 IAP 2007 MIT

Page 62: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(1) join roundrobin(1)

Bill Thies, MIT. 61 6.189 IAP 2007 MIT

Page 63: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 62 6.189 IAP 2007 MIT

Page 64: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 63 6.189 IAP 2007 MIT

Page 65: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 64 6.189 IAP 2007 MIT

Page 66: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 65 6.189 IAP 2007 MIT

Page 67: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 66 6.189 IAP 2007 MIT

Page 68: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 67 6.189 IAP 2007 MIT

Page 69: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 68 6.189 IAP 2007 MIT

Page 70: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 69 6.189 IAP 2007 MIT

Page 71: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2) join roundrobin(1)

Bill Thies, MIT. 70 6.189 IAP 2007 MIT

Page 72: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 71 6.189 IAP 2007 MIT

Page 73: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 72 6.189 IAP 2007 MIT

Page 74: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 73 6.189 IAP 2007 MIT

Page 75: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 74 6.189 IAP 2007 MIT

Page 76: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 75 6.189 IAP 2007 MIT

Page 77: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 76 6.189 IAP 2007 MIT

Page 78: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 77 6.189 IAP 2007 MIT

Page 79: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

SplitJoins are Beautiful

split duplicate split roundrobin(2)

join roundrobin(1,2,3)

Bill Thies, MIT. 78 6.189 IAP 2007 MIT

Page 80: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Matrix TransposeN

Transpose

M

M

NBill Thies, MIT. 79 6.189 IAP 2007 MIT

Page 81: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Matrix TransposeN

M

M

?

roundrobin(?)

roundrobin(?)

NBill Thies, MIT. 80 6.189 IAP 2007 MIT

Page 82: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Matrix TransposeN

M

M

?

roundrobin(?)

roundrobin(?)

NBill Thies, MIT. 81 6.189 IAP 2007 MIT

Page 83: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Matrix TransposeN

M

M

N

roundrobin(M)

roundrobin(1)

NBill Thies, MIT. 82 6.189 IAP 2007 MIT

Page 84: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Matrix TransposeN

M

M

N

roundrobin(M)

roundrobin(1)

NBill Thies, MIT. 83 6.189 IAP 2007 MIT

Page 85: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Matrix Transpose

M

N

N

M

N

roundrobin(M)

roundrobin(1) float->float splitjoin Transpose (int M,

int N) { split roundrobin(1); for (int i = 0; i<N; i++) {

add Identity<float>; } join roundrobin(M);

}

Bill Thies, MIT. 84 6.189 IAP 2007 MIT

Page 86: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Bit-reversed ordering

● Many FFT algorithms require a bit-reversal stage

● If item is at index n (with binary digits b0 b1 … bk), then it is transferred to reversed index bk … b1 b0

● For 3-digit binary numbers:

000011110011001101010101

000011110011001101010101

Bill Thies, MIT. 85 6.189 IAP 2007 MIT

Page 87: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Bit-reversed ordering

● Many FFT algorithms require a bit-reversal stage

● If item is at index n (with binary digits b0 b1 … bk), then it is transferred to reversed index bk … b1 b0

● For 3-digit binary numbers:

000011110011001101010101 RR(w1)

2) RR(w2)

RR(w1)

RR(w1)

00001111 00110011 01010101 RR(w3)

RR(w

Bill Thies, MIT. 86 6.189 IAP 2007 MIT

Page 88: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Bit-reversed ordering

● Many FFT algorithms require a bit-reversal stage

● If item is at index n (with binary digits b0 b1 … bk), then it is transferred to reversed index bk … b1 b0

● For 3-digit binary numbers:

000011110011001101010101 RR(1)

RR(2)

RR(1)

RR(1)

00001111 00110011 01010101 RR(4)

RR(2)

Bill Thies, MIT. 87 6.189 IAP 2007 MIT

Page 89: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Bit-reversed ordering

complex->complex pipeline BitReverse (int N) { if (N==2) {

add Identity<complex>; } else {

add splitjoin {

split roundrobin(1);add BitReverse(N/2);add BitReverse(N/2);join roundrobin(N/2);

4 4

2 2 2 2

1 1 1 1

1 1 rr rr

rrrr

4 4

2 2 2 2

1 1 1 1

1 1 rr rr

rrrr

1 1rr

rrrr

8 8 rr

rrrr

} }

}

Bill Thies, MIT. 88 6.189 IAP 2007 MIT

Page 90: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

89 6.189 IAP 2007 MITBill Thies, MIT.

N-Element Merge Sort

int->int pipeline MergeSort (int N) {if (N==2) {

add Sort(N);} else {

add splitjoin {split roundrobin(N/2);add MergeSort(N/2);add MergeSort(N/2);join roundrobin(N/2);

} }add Merge(N);

}

Sort Sort SortSort

Sort Sort Sort Sort

MergeMerge Merge Merge

Merge Merge

Merge

N/2 N/2

N/4 N/4 N/4 N/4

N/8 N/8N/8 N/8 N/8 N/8 N/8 N/8

N

Page 91: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

N-Element Merge Sort (3-level)

90 6.189 IAP 2007 MITBill Thies, MIT.

Sort Sort Sort Sort Sort Sort Sort Sort

MergeMerge Merge Merge

Merge Merge

Merge

N/2 N/2

N/4 N/4 N/4 N/4

N/8 N/8 N/8 N/8 N/8 N/8 N/8 N/8

N

Page 92: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

91

Bitonic Sort

Bill Thies, MIT. 6.189 IAP 2007 MIT

Courtesy of William Thies.Used with permission.

Page 93: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

FFT

Bill Thies, MIT. 92 6.189 IAP 2007 MIT

Courtesy of William Thies.Used with permission.

Page 94: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Block Matrix Multiply

93 6.189 IAP 2007 MITBill Thies, MIT.

Courtesy of William Thies.Used with permission.

Page 95: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Filterbank

Bill Thies, MIT. 94 6.189 IAP 2007 MIT

Courtesy of William Thies.Used with permission.

Page 96: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

FM Radio with Equalizer

Bill Thies, MIT. 95 6.189 IAP 2007 MIT

Courtesy of William Thies.Used with permission.

Page 97: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Radar-Array Front End

Bill Thies, MIT. 96 6.189 IAP 2007 MIT

Courtesy of William Thies.Used with permission.

Page 98: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

MP3 Decoder

Bill Thies, MIT. 97 6.189 IAP 2007 MIT

Courtesy of William Thies.Used with permission.

Page 99: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Case Study:MPEG-2 Decoder in StreamIt

Bill Thies, MIT. 98 6.189 IAP 2007 MIT

Page 100: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

MPEG-2 Decoder in StreamIt

macroblocks, motion vectors

frequency encodedmacroblocks motion vectors

motion vectors spatially encoded macroblocks

quantization coefficients

picture type

<QC>

<PT1, PT2>

add VLD(QC, PT1, PT2);

add splitjoin {

split roundrobin(N∗B, V);

add pipeline {add ZigZag(B);add IQuantization(B) to QC;add IDCT(B); add Saturation(B);

}add pipeline {

add MotionVectorDecode(); add Repeat(V, N);

}

join roundrobin(B, V); }

add splitjoin {split roundrobin(4∗(B+V), B+V, B+V);

add MotionCompensation(4∗(B+V)) to PT1;for (int i = 0; i < 2; i++) {

add pipeline {add MotionCompensation(B+V) to PT1; add ChannelUpsample(B);

}}

join roundrobin(1, 1, 1); }

add PictureReorder(3∗W∗H) to PT2;

add ColorSpaceConversion(3∗W∗H);

MPEG bit stream

Picture Reorder

joiner

joiner

IDCT

IQuantization

splitter

splitter

VLD

differentially coded

recovered picture

ZigZag

Saturation

Channel Upsample Channel Upsample

Motion Vector Decode

Y Cb Cr

<QC>

reference picture

Motion Compensation

<PT1> reference picture

Motion Compensation

<PT1> reference picture

Motion Compensation

<PT1>

<PT2>

Repeat

Color Space Conversion

Bill Thies, MIT. output to player 99 6.189 IAP 2007 MIT

Page 101: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Teleport Messaging in MPEG-2

joiner

splitter

recovered picture

Channel Upsample

Channel Upsample

Cr

Motion Compensation

Motion Compensation

Motion Compensation

Y Cb

picture type

Order

IQ

VLD

MC MCMC

Bill Thies, MIT. 100 6.189 IAP 2007 MIT

Page 102: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Messaging Equivalent in CThe MPEG Bitstream

File Parsing

Decode Picture

Decode Macroblock

Decode Motion Vectors

Decode Block

Motion Compensation

ZigZagUnordering InverseQuantization

Global VariableSpace

Saturate

IDCT

Motion Compensation For Single Channel

Frame Reordering

Output Video

Bill Thies, MIT. 101 6.189 IAP 2007 MIT

Page 103: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

MPEG-2 Implementation

● Fully-functional MPEG-2 decoder and encoder

● Developed by 1 programmer in 8 weeks

● 2257 lines of code � Vs. 3477 lines of C code in MPEG-2 reference

● 48 static streams, 643 instantiated filters

Bill Thies, MIT. 102 6.189 IAP 2007 MIT

Page 104: MIT OpenCourseWare 6.189 Multicore Programming Primer ... · CSR-1 C is the common Intel Tflops machine language Raza Cavium Raw XLR Octeon Niagara a Cell Boardcom 1480 Opteron 4P

Conclusions

● StreamIt language preserves program structure � Natural for programmers

● Parallelism and communication naturally exposed � Compiler managed buffers, and portable parallelization

technology

● StreamIt increases programmer productivity, enables parallel performance

Bill Thies, MIT. 103 6.189 IAP 2007 MIT