HPC Memory Subsystem: Beyond Myths & Hype

Josh Fryman, Intel Corp.

Shekhar Borkar

ISC, June 20, 2016

Acknowledgment: Dinesh Somasekhar, Dave Dunning

This research was, in part, funded by the U.S. Government, DOE. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government.

Page 2: Outline

System level value metric

Disparate memory technologies

Comparison of memory technologies

Summary

Page 3: System Level Value Metric

The information on this page is subject to the use and disclosure restrictions provided on the cover page to this document.

Attribute                Symbol   Units   Desire
Bandwidth                BW       GB/s    Higher
Latency, cycle time      T        ns      Lower
Energy                   E        pJ/b    Lower
Non-volatility           NV
Endurance, Reliability   R        10^n    Higher
Cost/bit                 C        $/b     Lower

Page 4: Energy Consumption, DRAM Example

[Figure: energy along the access path between the CPU and the DRAM. Segment labels: CPU walk, MC, address walk, crossing, DRAM walk, WL energy, BL energy, data walk. Dimensions shown: 22 mm x 22 mm and 13 mm x 7 mm. Energy labels, in extraction order: 10 fJ, 25-100 fJ, 25 fJ, 500-2000 fJ, ~90 fJ, 550 fJ, 25 fJ, 325 fJ, 500 fJ.]

2KB page, 512-bit access, 60% page hit

DRAM Energy = 4pJ

On CPU Energy = 0.8pJ

NOTE: Approx. 320fJ to read SRAM

Energy determined by interconnects, not Memory Technology
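One way to read the two totals on this slide is to convert energy per bit into power at a given bandwidth. The sketch below assumes the 4 pJ and 0.8 pJ figures are per bit (matching the pJ/b unit of the value metric); the 100 GB/s bandwidth is a hypothetical illustration, not a number from the talk.

```python
def memory_power_watts(energy_pj_per_bit: float, bandwidth_gb_per_s: float) -> float:
    """Power in watts = energy per bit (J/b) * bits transferred per second."""
    joules_per_bit = energy_pj_per_bit * 1e-12
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8  # GB/s -> bits/s
    return joules_per_bit * bits_per_second


# Slide totals, ASSUMED here to be per-bit energies.
DRAM_PJ_PER_BIT = 4.0      # "DRAM Energy = 4pJ"
ON_CPU_PJ_PER_BIT = 0.8    # "On CPU Energy = 0.8pJ"

# Hypothetical sustained bandwidth, chosen only for illustration.
BANDWIDTH_GB_PER_S = 100.0

if __name__ == "__main__":
    for name, energy in (("DRAM", DRAM_PJ_PER_BIT), ("On CPU", ON_CPU_PJ_PER_BIT)):
        watts = memory_power_watts(energy, BANDWIDTH_GB_PER_S)
        print(f"{name}: {watts:.2f} W at {BANDWIDTH_GB_PER_S:.0f} GB/s")
```

Because power scales linearly with both terms, any reduction in energy per bit translates directly into either lower power or more bandwidth within the same power budget.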

Page 5: Access Time, DRAM Example

[Figure: access time along the path between the CPU and the DRAM. Segment labels: CPU walk, MC, address walk, crossing, DRAM walk, WL decode, ramp up, BL read time, 16 burst, data walk, serialize/deserialize. Dimensions shown: 22 mm x 22 mm and 13 mm x 7 mm (LP DDR dimensions). Time labels, in extraction order: 8, 1, 2 (eDRAM), 20 (Commodity), 5, 10, 4, 4, 8, 1, 2.]

Numbers in terms of ns OR clocks @1GHz

Walk is pipelined on CPU

~40ns, 18ns, 6ns

Access time determined by interconnects, not Memory Technology
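The slide quotes its numbers "in terms of ns OR clocks @1GHz", which holds because one cycle at 1 GHz lasts exactly 1 ns. A minimal sketch of the same conversion at other clock rates follows; the 2.5 GHz clock is a hypothetical value, and the three latencies are the ones printed at the bottom of the slide.

```python
import math


def ns_to_cycles(latency_ns: float, clock_ghz: float) -> int:
    """Whole clock cycles needed to cover latency_ns at a clock of clock_ghz."""
    # cycles = time * frequency; ns * GHz cancels to a plain cycle count.
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(latency_ns * clock_ghz, 9))


SLIDE_LATENCIES_NS = (40, 18, 6)   # "~40ns", "18ns", "6ns" on the slide
HYPOTHETICAL_CLOCK_GHZ = 2.5       # assumption, not from the talk

if __name__ == "__main__":
    for ns in SLIDE_LATENCIES_NS:
        at_1ghz = ns_to_cycles(ns, 1.0)
        at_fast = ns_to_cycles(ns, HYPOTHETICAL_CLOCK_GHZ)
        print(f"{ns} ns = {at_1ghz} cycles @1GHz, {at_fast} cycles @{HYPOTHETICAL_CLOCK_GHZ}GHz")
```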

Page 6: Conventional Memories

6T-SRAM
  Power dominated by leakage
  Logic integration friendly

1T-DRAM
  Commodity, cost sensitive
  Logic integration possible

NAND Flash
  Commodity, cost sensitive
  NV, slow, limited endurance

PCM (Phase Change)
  Commodity, cost sensitive
  NV, slow, limited endurance

Page 7: Emerging Memories

Spin Torque
  High currents
  NV, high endurance

FeRAM
  DRAM like, charge based
  NV, high endurance

MRAM
  Scalability issues
  NV, high endurance

ReRAM
  Cross-point memory
  NV, limited endurance

Page 8: Compare Memory Technologies (1)

[Figure panels: Maturity, Memory Cell Size, Array Efficiency, Memory Density]

Page 9: Compare Memory Technologies (2)

[Figure panels: Performance, Read/Write Energy, Endurance, Soft-Error Immunity]

Page 10: Memory Technology Score Card

             SRAM     DRAM     Flash    PCM      STT      FeRAM    MRAM     ReRAM
Capacity     Low      High     V High   V High   Low      Low      Low      High
Performance  High     Med      Low      Low      Med      Low      Low      Low
Energy       Low      Med      High     High     Med      ?        High     ?
Endurance    High     High     Low      Low      High     Med      High     Med
NV           No       No       Yes      Yes      Yes      Yes      Yes      Yes
Scalable     Yes      Yes      Yes      Yes      Yes      Limited  Limited  Limited
Maturity     High     High     High     Med      Low      High     Med      Low

Annotations: Small and fast; Balanced (capacity, speed, energy); High capacity with less activity; ???
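The score card can be treated as a small lookup table and filtered by the attributes that matter for a design. The sketch below transcribes the rows of the card; the column names are spelled out (FeRAM, MRAM, ReRAM), and the `select` helper is a hypothetical name introduced here for illustration.

```python
TECHNOLOGIES = ["SRAM", "DRAM", "Flash", "PCM", "STT", "FeRAM", "MRAM", "ReRAM"]

# One list per score-card row, in the same column order as TECHNOLOGIES.
SCORE_CARD = {
    "Capacity":    ["Low", "High", "V High", "V High", "Low", "Low", "Low", "High"],
    "Performance": ["High", "Med", "Low", "Low", "Med", "Low", "Low", "Low"],
    "Energy":      ["Low", "Med", "High", "High", "Med", "?", "High", "?"],
    "Endurance":   ["High", "High", "Low", "Low", "High", "Med", "High", "Med"],
    "NV":          ["No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"],
    "Scalable":    ["Yes", "Yes", "Yes", "Yes", "Yes", "Limited", "Limited", "Limited"],
    "Maturity":    ["High", "High", "High", "Med", "Low", "High", "Med", "Low"],
}


def select(**required: str) -> list[str]:
    """Return the technologies whose score-card entries equal every requirement."""
    matches = []
    for column, tech in enumerate(TECHNOLOGIES):
        if all(SCORE_CARD[attr][column] == value for attr, value in required.items()):
            matches.append(tech)
    return matches


if __name__ == "__main__":
    print(select(NV="Yes", Endurance="High"))
    print(select(Capacity="V High"))
    print(select(NV="No"))
```

Exact matching keeps the sketch faithful to the card; ranking the ratings (for example, treating "V High" as satisfying "High") would be a further assumption that the slide does not state.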

Page 11: Choose your value(s)…

Attribute                Symbol   Units   Desire   Primary Dependence
Bandwidth                BW       GB/s    Higher   M Architecture
Latency, cycle time      T        ns      Lower    M Architecture
Energy                   E        pJ/b    Lower    M Arch, M Tech (some)
Non-volatility           NV                        M Technology
Endurance, Reliability   R        10^n    Higher   M Technology
Cost/bit                 C        $/b     Lower    M Arch, M Tech (some)

…architect, and choose the memory technology

Hard constraint: If Cost is your value, then you are stuck with commodity
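The table reads as a decision aid: pick the attributes you value, then see whether architecture or technology is the lever. A minimal sketch under stated assumptions: the `plan` function and its return shape are hypothetical, and the commodity list is inferred from the technologies labeled "Commodity" on the Conventional Memories slide rather than stated on this one.

```python
# Primary dependence per attribute symbol, transcribed from the table.
PRIMARY_DEPENDENCE = {
    "BW": {"architecture"},
    "T":  {"architecture"},
    "E":  {"architecture", "technology"},   # "M Arch, M Tech (some)"
    "NV": {"technology"},
    "R":  {"technology"},
    "C":  {"architecture", "technology"},   # "M Arch, M Tech (some)"
}

# ASSUMPTION: technologies marked "Commodity" on the Conventional Memories slide.
COMMODITY_TECHNOLOGIES = ("1T-DRAM", "NAND Flash", "PCM")


def plan(values: list[str]) -> dict:
    """Summarize which levers the chosen values depend on."""
    unknown = [v for v in values if v not in PRIMARY_DEPENDENCE]
    if unknown:
        raise ValueError(f"unknown attribute symbols: {unknown}")
    levers = set()
    for value in values:
        levers |= PRIMARY_DEPENDENCE[value]
    # Hard constraint from the slide: valuing cost restricts you to commodity.
    return {"levers": sorted(levers), "commodity_only": "C" in values}


if __name__ == "__main__":
    print(plan(["BW", "T"]))
    choice = plan(["NV", "C"])
    print(choice)
    if choice["commodity_only"]:
        print("Candidates limited to:", ", ".join(COMMODITY_TECHNOLOGIES))
```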

Page 12: Summary

SRAM, DRAM, NAND, and PCM will be the only mature technologies (10 years)

The primary high-endurance memory technologies remain SRAM and DRAM

DRAM is likely to remain the best choice for capacity memory

Advances in platform architecture are needed for NAND- and PCM-based storage class memories