Post on 15-Dec-2015

CONFIDENTIAL

Cell Technology & ‘Many-Core’

I’ve got more than you…

Cell/B.E. Recap

What is the Cell Broadband Engine?

• Pioneered by Sony, Toshiba, IBM
• ‘Many-Core’ processor
• 1 PPE Processor
• 8 SPE ‘SIMD Monsters’

PPE (PPU) is a scalar processor

SPEs (SPUs) are vector or array processors

Cell/B.E. Recap

230 GFLOPS from 9 processing elements

Synergistic Processor Element

Jointly developed by Sony, Toshiba, and IBM

FlexIO: 20 GB/s
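
As a rough cross-check of the 230 GFLOPS figure, the sketch below multiplies the nine processing elements by a per-element peak. The 3.2 GHz clock and the 8 single-precision FLOPs per cycle (one 4-wide SIMD fused multiply-add) are assumptions that do not appear on the slides.

```python
# Back-of-envelope single-precision peak for one Cell/B.E. chip.
# ASSUMPTIONS (not from the slides): 3.2 GHz clock, and every element
# retiring one 4-wide fused multiply-add per cycle (4 lanes * 2 FLOPs).
CLOCK_GHZ = 3.2
FLOPS_PER_CYCLE = 4 * 2

NUM_PPE = 1  # from the slides
NUM_SPE = 8  # from the slides


def peak_gflops(elements: int) -> float:
    """Peak GFLOPS for a number of identical processing elements."""
    return elements * CLOCK_GHZ * FLOPS_PER_CYCLE


spe_total = peak_gflops(NUM_SPE)
chip_total = peak_gflops(NUM_PPE + NUM_SPE)

print(f"8 SPEs only:    {spe_total:.1f} GFLOPS")
print(f"All 9 elements: {chip_total:.1f} GFLOPS")
```

Under these assumptions the nine elements together land at roughly the quoted 230 GFLOPS, with the SPEs contributing most of it.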

Where is the Cell?

Look in your kids’ rooms

…or next to your HDTV!

PS3 clusters are also in high demand for their compelling price/performance: Folding@home, black hole research, ray tracing, modelling, etc.

The Fastest Supercomputer

Los Alamos National Laboratory: Roadrunner

Hybrid Many-Core Architecture: Cell and AMD

116,640 Cell cores

12,960 AMD cores
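
The two core counts can be related with a little arithmetic. This sketch assumes the 116,640 figure counts all nine processing elements of each Cell chip (1 PPE + 8 SPEs); the slides do not state how the cores were counted.

```python
# Relating the Roadrunner core counts quoted on the slide.
CELL_CORES = 116_640  # from the slide
AMD_CORES = 12_960    # from the slide

# ASSUMPTION: each Cell chip is counted as 9 cores (1 PPE + 8 SPEs).
CORES_PER_CELL_CHIP = 1 + 8

cell_chips, leftover = divmod(CELL_CORES, CORES_PER_CELL_CHIP)
total_cores = CELL_CORES + AMD_CORES

print(f"Cell chips:            {cell_chips:,} (leftover cores: {leftover})")
print(f"Cell chips per AMD core: {cell_chips / AMD_CORES:.0f}")
print(f"Total cores:           {total_cores:,}")
```

Under that assumption the division is exact and yields one Cell chip per AMD core.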

Roadrunner: 116,640 Cell Cores

1st Supercomputer to Sustain 1 petaflop/s

“The Los Alamos system, nicknamed Roadrunner… fended off a challenge by the Cray XT5 supercomputer at Oak Ridge National Laboratory called Jaguar.

The system, only the second to break the petaflop/s barrier, posted a top performance of 1.059 petaflop/s in running the Linpack benchmark application. One petaflop/s represents one quadrillion floating point operations per second.”

1 petaFLOPS = 10^15 FLOPS

Sony Cell+RSX Appliance

• RSX graphics coprocessor for hardware OpenGL operations: scaling, colour space conversion (CSC), etc.
• 20 GB/s FlexIO interconnect between Cell & RSX
• Low power, 1RU, Cell+RSX on motherboard
• Example application: proprietary mathematically lossless CODEC, ideal for DPX WAN transfers
• 230 GigaFLOPS

New Platform & Potential Apps

Ingest, Transcode, Processing, etc.

• Fast: 3.7 TeraFLOPS
• Flexible: Cell + Intel
• Small: 3U rack mount

[Diagram labels: 8 SPUs; Up to 16 Cell chips; Intel motherboard; 8x HD-SDI]
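
The 3.7 TeraFLOPS headline is consistent with the per-chip figure quoted earlier in the deck. A minimal sketch of the arithmetic, assuming the total counts only the Cell chips and ignores the Intel host processors:

```python
# Aggregate peak of a fully populated unit, using figures from the slides.
GFLOPS_PER_CELL = 230  # per-chip figure from the Cell/B.E. recap slide
MAX_CELL_CHIPS = 16    # "Up to 16 Cell chips"

# ASSUMPTION: the Intel host CPUs are not included in the headline number.
total_gflops = GFLOPS_PER_CELL * MAX_CELL_CHIPS
total_tflops = total_gflops / 1000

print(f"{MAX_CELL_CHIPS} x {GFLOPS_PER_CELL} GFLOPS = {total_tflops:.2f} TFLOPS")
```

Rounded to one decimal place, this gives the 3.7 TeraFLOPS shown on the slide.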

Core i7 - Nehalem & X58 (Tylersburg)

• Dual, Quad, or Octo-core on one die
• QPI interconnect (no more FSB)
• Integrated memory controller
• IOH replaces MCH functions
• Triple-channel DDR3 memory
• Hyper-Threading
• Built-in power management
• Expanded PCIe support (8 GB/s on x16)

Legacy w/ FSB vs. New (Nehalem) w/ QPI

[Diagram: Legacy w/ FSB (cores, Northbridge, RAM) beside New (Nehalem) w/ QPI (cores, RAM, IOH)]

For a single-core application…

Legacy w/ FSB, RAM:
• Far from cores (poor latency)
• Goes through the Northbridge (low bandwidth)

New (Nehalem) w/ QPI, RAM:
• Next to cores (better latency)
• Directly to RAM (better bandwidth)

AMD

The Nehalem design is almost the same as AMD's HyperTransport (HT) architecture:
• Wider bandwidth
• Lower latency

This was the reason why many people selected AMD Opteron.

Legacy w/ FSB vs. New (Nehalem) w/ QPI

• QPI is much wider than the legacy FSB (>24 GB/s)
• Latency inside the CPU is much lower than through the Northbridge (NB)

Much Better Solution
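
To put the ">24 GB/s" figure in context, here is a small sketch of how link bandwidth follows from transfer rate and width. The specific rates and widths used (a 6.4 GT/s QPI link carrying 2 bytes per transfer in each direction, and a 1600 MT/s, 64-bit front-side bus) are assumptions for illustration and are not given on the slides.

```python
# Rough peak-bandwidth comparison between a QPI link and a front-side bus.
# ASSUMPTIONS (not from the slides):
#   QPI: 6.4 GT/s, 2 data bytes per transfer, two directions
#   FSB: 1600 MT/s, 8 data bytes per transfer, one shared bus

def link_gb_per_s(transfers_per_s: float, bytes_per_transfer: int,
                  directions: int = 1) -> float:
    """Peak bandwidth in GB/s (decimal gigabytes)."""
    return transfers_per_s * bytes_per_transfer * directions / 1e9


qpi_gb_s = link_gb_per_s(6.4e9, 2, directions=2)
fsb_gb_s = link_gb_per_s(1.6e9, 8)

print(f"QPI link (both directions): {qpi_gb_s:.1f} GB/s")
print(f"Legacy FSB (shared):        {fsb_gb_s:.1f} GB/s")
```

With these assumed parameters the QPI link comes out above the 24 GB/s quoted on the slide, and the FSB at about half of that, shared between all cores and memory traffic.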

Multi-Port Ingest

[Diagram: VTR 1 to VTR 4, each connected over HD-SDI and RS422/9-pin to a separate ingest device (Ingest 1 to Ingest 4)]

Multi-Port Ingest

[Diagram: VTR 1 to VTR 4, each connected over HD-SDI and RS422/9-pin]

• Multi-Ingest eliminates separate ingest devices
• Future support planned for 2x ingest via a single 3Gig HD-SDI connection
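
The "2x ingest via a single 3Gig connection" point follows from the nominal SDI line rates. The sketch below uses 1.485 Gbit/s for HD-SDI and 2.970 Gbit/s for 3G-SDI; these rates are not stated on the slides and are brought in here as assumptions.

```python
# How many HD-SDI streams fit in one 3G-SDI link, by nominal line rate.
# ASSUMPTION (not from the slides): nominal rates in Mbit/s.
HD_SDI_MBPS = 1485
SDI_3G_MBPS = 2970

HD_SDI_PORTS = 8  # "8x HD-SDI" from the platform slide

streams_per_3g_link = SDI_3G_MBPS // HD_SDI_MBPS
aggregate_gbps = HD_SDI_PORTS * HD_SDI_MBPS / 1000

print(f"HD streams per 3G link: {streams_per_3g_link}")
print(f"Aggregate over {HD_SDI_PORTS} HD-SDI ports: {aggregate_gbps:.2f} Gbit/s")
```

Integer rates in Mbit/s keep the division exact, so the result is not affected by floating-point rounding.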

Distributed Processing

[Diagram: Tasks are distributed across Many-Core Engines #1 to #4; a Many-Core Engine is shown as two rows of 8 Cell chips]
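
One way to picture the diagram is as a scheduler that spreads tasks over every Cell chip in every engine. The slides do not say which scheduling policy is used, so the round-robin policy, the `distribute` function, and the `frame-N` task names below are all hypothetical.

```python
from collections import defaultdict

NUM_ENGINES = 4        # Many-Core Engines #1 to #4 in the diagram
CELLS_PER_ENGINE = 16  # two rows of 8 Cell chips in the diagram


def distribute(tasks, num_engines=NUM_ENGINES, cells_per_engine=CELLS_PER_ENGINE):
    """Round-robin tasks over all (engine, cell) slots.

    Hypothetical policy: consecutive tasks go to different engines first,
    then to the next Cell chip within each engine.
    Returns {(engine_number, cell_index): [task, ...]}.
    """
    slots = [(engine, cell)
             for cell in range(cells_per_engine)
             for engine in range(1, num_engines + 1)]
    plan = defaultdict(list)
    for i, task in enumerate(tasks):
        plan[slots[i % len(slots)]].append(task)
    return dict(plan)


# Hypothetical workload: 100 independent frames to process.
plan = distribute([f"frame-{n}" for n in range(100)])
loads = [len(assigned) for assigned in plan.values()]
print(f"slots used: {len(plan)}, busiest: {max(loads)}, idlest: {min(loads)}")
```

Round-robin keeps the per-chip load within one task of even, which suits independent work items such as frames; a real system would also need to account for task size and data locality.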

Hybrid ‘Many-Core’ Architecture: A Compelling Future

• Best of both worlds - Cell+Intel
• Latest Nehalem / X58 design
• Dramatic scalability and flexibility
• Open to new technologies
• Ultimate in throughput and performance
• Attention to power and efficiency concerns

Thank you…

Contact: Lance Kelson, Lance.Kelson@am.sony.com
