
Phase Change Memory
An Architecture and Systems Perspective

Benjamin C. Lee
Stanford University
[email protected]
Fall 2010, Assistant Professor @ Duke University

Benjamin C. Lee 1

Memory Scaling

◦ ⇑ density, capacity; ⇓ cost-capability ratio
◦ Emerging challenges for prevalent technologies

“Process integration, devices, and structures,” ITRS 2009.

Memory in Transition

• Charge Memory
  ◦ Write data by capturing charge Q
  ◦ Read data by detecting voltage V
  ◦ Examples: Flash, DRAM
• Resistive Memory
  ◦ Write data by pulsing current dQ/dt
  ◦ Read data by detecting resistance R
  ◦ Examples: PCM, STT-MRAM, memristor

Limits of Charge Memory

◦ Difficult charge placement and control

◦ Flash: floating gate charge

◦ DRAM: capacitor charge, transistor leakage


Towards Resistive Memory

• PCM
  ◦ Inject current to change material phase
  ◦ Resistance determined by phase
• STT-MRAM
  ◦ Inject current to change magnet polarity
  ◦ Resistance determined by polarity
• Memristors
  ◦ Inject current to change atomic structure
  ◦ Resistance determined by atom distribution

Benefits of Resistive Memory

• Scalable
  ◦ Program cell with scalable current
  ◦ Map resistance to logical state
• Non-Volatile
  ◦ Alter structure of storage element
  ◦ Incur activation cost to alter properties
• Competitive
  ◦ Achieve viable latency, power, endurance
  ◦ Scale to improve performance metrics

Technology

Benjamin Lee, Engin Ipek, Onur Mutlu, Doug Burger. “Architecting phase change memory as a scalable DRAM alternative.” ISCA 2009.

Phase Change Memory

◦ Store data within phase change material

◦ Set phase via current pulse

◦ Detect phase via resistance (amorphous/crystalline)


PCM Scalability

◦ Program with current pulses, which scale linearly

◦ PCM demonstration at 30nm, DRAM roadmap to 40nm

[1] Raoux et al., IBM J Res & Dev 2008; [2] Lai, IEDM 2003; [3] Pirovano et al., IEDM 2003.


PCM Non-Volatility

• Joule Effect
  ◦ Program with current pulses
  ◦ Melt material by heating material (e.g., 650 °C)
  ◦ Cool material to desired phase
• Activation Cost
  ◦ Isolate thermal effects to target cell
  ◦ Retain data for >10 years at 85 °C

Technology Parameters

◦ Survey prototypes from 2003-2008 (ITRS, IEDM, VLSI, ISSCC)

◦ Derive PCM parameters for F=90nm

• Cell Size
  ◦ 9-12F² using BJT
  ◦ 1.5× DRAM, 2-3× NAND
• Endurance
  ◦ 10⁸ writes per cell
  ◦ 10⁻⁸× DRAM, 10³× NAND

[1] Lee et al., ISCA 2009; [2] Russo, Workshop Emerging Memory Technologies 2010

Technology Parameters

◦ Survey prototypes from 2003-2008 (ITRS, IEDM, VLSI, ISSCC)

◦ Derive PCM parameters for F=90nm

• Read Latency
  ◦ 50 ns Rd
  ◦ 4× DRAM, 10⁻³× NAND
• Write Bandwidth
  ◦ 5-10 MB/s
  ◦ 0.1× DRAM, 1.0× NAND
• Dynamic Energy
  ◦ 40 µA Rd, 150 µA Wr
  ◦ 2-43× DRAM, 1.0× NAND

[1] Lee et al., ISCA 2009; [2] Russo, Workshop Emerging Memory Technologies 2010
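The relative figures on the Technology Parameters slides can be turned back into rough absolute baselines. The sketch below is only arithmetic on the slide values. It assumes that each "N× DRAM" or "N× NAND" figure means the PCM value divided by the baseline value, and the helper name `implied_baseline` is made up for illustration.

```python
def implied_baseline(pcm_value, ratio):
    """Baseline value implied by a PCM value quoted as `ratio` x baseline (assumed reading)."""
    return pcm_value / ratio

# Slide values: 50 ns PCM read latency, quoted as 4x DRAM and 10^-3 x NAND.
dram_read_ns = implied_baseline(50, 4)
nand_read_ns = implied_baseline(50, 1e-3)

# Slide values: 10^8 writes per PCM cell, quoted as 10^-8 x DRAM and 10^3 x NAND.
dram_endurance = implied_baseline(1e8, 1e-8)
nand_endurance = implied_baseline(1e8, 1e3)

print(f"DRAM read ~{dram_read_ns:.1f} ns, NAND read ~{nand_read_ns / 1000:.0f} us")
print(f"DRAM endurance ~{dram_endurance:.0e}, NAND endurance ~{nand_endurance:.0e} writes")
```

These implied baselines are order-of-magnitude estimates, not measured values from the cited surveys.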

PCM Comparison

• vs. Flash
  ◦ Advantage in endurance, performance
  ◦ Comparable in energy
  ◦ Scalable Flash alternative
• vs. DRAM
  ◦ Disadvantage in endurance, performance, energy
  ◦ Within competitive range of DRAM
  ◦ Scalable DRAM alternative

PCM as DRAM Alternative

◦ Deploy PCM on memory bus

◦ Begin by co-locating PCM, DRAM


Price of Scalability

◦ Replace DRAM with PCM in present architectures

◦ 1.6× delay, 2.2× energy, 500-hour lifetime


Architecture and Scalability

Benjamin Lee, Engin Ipek, Onur Mutlu, Doug Burger. “Architecting phase change memory as a scalable DRAM alternative.” ISCA 2009.

Architecture Objectives

• DRAM-Competitive
  ◦ Reorganize row buffer to mitigate delay, energy
  ◦ Implement partial writes to mitigate wear mechanism
• Area-Efficient
  ◦ Minimize disruption to density trends
  ◦ Impacts row buffer organization
• Complexity-Effective
  ◦ Encourage adoption with modest mechanisms
  ◦ Impacts partial writes

Buffer Organization

• On-Chip Buffers
  ◦ Use DRAM-like buffer and interface
  ◦ Evict modified rows into array
• Narrow Rows
  ◦ Reduce write energy ∝ buffer width
  ◦ Reduce peripheral circuitry, associated area
• Multiple Rows
  ◦ Reduce eviction frequency
  ◦ Improve locality, write coalescing

Buffer Area Strategy

◦ Narrow rows :: fewer expensive sense amplifiers (S/As, 44T each)
◦ Multiple rows :: additional inexpensive latches (8T each)
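The area trade-off between sense amplifiers and latches can be sketched as a transistor count. This is a simplified model, not the area model from the ISCA 2009 paper: it assumes one sense amplifier per bit of buffer width and one latch per bit per buffered row. The function name and the two design points are hypothetical, chosen only for illustration.

```python
SENSE_AMP_T = 44  # transistors per sense amplifier (from the slide)
LATCH_T = 8       # transistors per latch (from the slide)

def buffer_transistors(row_bytes, num_rows):
    """Transistor count, assuming one S/A per bit and one latch per bit per row."""
    bits = row_bytes * 8
    sense_amps = bits * SENSE_AMP_T
    latches = bits * num_rows * LATCH_T
    return sense_amps + latches

# Hypothetical design points, not taken from the slides.
wide_single = buffer_transistors(row_bytes=2048, num_rows=1)
narrow_multi = buffer_transistors(row_bytes=512, num_rows=4)
print(wide_single, narrow_multi)
```

Under these assumptions, four narrow rows hold the same number of buffered bits as one wide row while needing far fewer sense amplifiers, which is the intuition behind the slide.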

Buffering and Locality

◦ Read locality: number of array reads

◦ Write coalescing: number of array writes per buffer write

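How multiple row buffers reduce array traffic can be illustrated with a toy simulation. The sketch assumes LRU replacement, a full row fetch on every buffer miss, and one array write per evicted dirty row. The class name `RowBuffers` and the access trace are illustrative and not from the talk.

```python
from collections import OrderedDict

class RowBuffers:
    """Toy model of a set of row buffers with LRU eviction (assumed policy)."""

    def __init__(self, num_buffers, row_bytes):
        self.num_buffers = num_buffers
        self.row_bytes = row_bytes
        self.buffers = OrderedDict()  # row id -> dirty flag, oldest first
        self.array_reads = 0
        self.array_writes = 0

    def access(self, addr, is_write):
        row = addr // self.row_bytes
        if row in self.buffers:
            self.buffers.move_to_end(row)      # buffer hit: refresh LRU position
        else:
            if len(self.buffers) == self.num_buffers:
                _, dirty = self.buffers.popitem(last=False)
                if dirty:
                    self.array_writes += 1     # evict modified row into the array
            self.buffers[row] = False
            self.array_reads += 1              # fetch the row from the array
        if is_write:
            self.buffers[row] = True           # coalesce the write in the buffer

# Writes that alternate between two rows.
trace = [(0, True), (512, True)] * 4
for n in (1, 2):
    rb = RowBuffers(num_buffers=n, row_bytes=512)
    for addr, is_write in trace:
        rb.access(addr, is_write)
    print(n, rb.array_reads, rb.array_writes)
```

With one buffer, every access in this trace misses and evicts a dirty row. With two buffers, both rows stay resident and the writes coalesce until a later eviction.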

Buffer Design Space

◦ Derive DRAM, PCM area model
◦ Explore space of area-neutral buffer designs

Wear Reduction

• Wear Mechanism
  ◦ Writes induce phase change at 650 °C
  ◦ Contacts degrade from thermal expansion/contraction
  ◦ Current injection is less reliable after 1E+08 writes
• Partial Writes
  ◦ Reduce writes to PCM array
  ◦ Write only stored lines (64B), words (4B)
  ◦ Add cache line state with 0.2%, 3.1% overhead, respectively
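The overhead figures are consistent with one state bit per tracked unit: 1 bit per 64B line is 1/512, and 1 bit per 4B word is 1/32. The sketch below checks that arithmetic and models a buffered row that writes back only its dirty words. The one-bit-per-unit accounting is an assumption, and the names `state_overhead` and `BufferedRow` and the 512B row size are hypothetical.

```python
def state_overhead(unit_bytes, bits_per_unit=1):
    """Fraction of extra state, assuming `bits_per_unit` dirty bits per tracked unit."""
    return bits_per_unit / (unit_bytes * 8)

class BufferedRow:
    """Row buffer that tracks dirty words and writes only those on eviction."""

    def __init__(self, row_bytes=512, word_bytes=4):
        self.word_bytes = word_bytes
        self.dirty = [False] * (row_bytes // word_bytes)

    def store(self, offset, length):
        # Mark every word touched by the byte range [offset, offset + length).
        first = offset // self.word_bytes
        last = (offset + length - 1) // self.word_bytes
        for w in range(first, last + 1):
            self.dirty[w] = True

    def evict(self):
        # Return the number of bytes written to the array, then clear the state.
        written = sum(self.dirty) * self.word_bytes
        self.dirty = [False] * len(self.dirty)
        return written

print(f"{state_overhead(64):.1%} per line, {state_overhead(4):.1%} per word")

row = BufferedRow()
row.store(offset=8, length=4)
print(row.evict(), "bytes written instead of 512")
```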

Partial Writes

◦ Derive PCM lifetime model
◦ Quantify eliminated writes during buffer eviction
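A first-order lifetime estimate shows why eliminating writes matters. This is not the lifetime model derived in the paper. It is a common back-of-the-envelope model that assumes writes are spread perfectly evenly over all cells, and every input value below is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def lifetime_years(endurance, capacity_bytes, write_bytes_per_s, eliminated_fraction=0.0):
    """Years until cells wear out, assuming perfectly uniform wear (an idealization)."""
    effective_rate = write_bytes_per_s * (1.0 - eliminated_fraction)
    total_writable_bytes = endurance * capacity_bytes
    return total_writable_bytes / effective_rate / SECONDS_PER_YEAR

# Illustrative inputs: 10^8 writes per cell, 1 GiB capacity, 1 GiB/s of array writes.
base = lifetime_years(1e8, 2**30, 2**30)
halved = lifetime_years(1e8, 2**30, 2**30, eliminated_fraction=0.5)
print(f"{base:.2f} years without partial writes, {halved:.2f} years with half eliminated")
```

In this model, eliminating a fraction f of array writes scales lifetime by 1/(1 - f).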

Scalable Performance

◦ 1.2× delay, 1.0× energy, 5.6-year lifetime

◦ Scaling improves energy, endurance


Systems and Non-Volatility

Jeremy Condit, Edmund Nightingale, Christopher Frost, Engin Ipek, Benjamin Lee, Doug Burger, Derrick Coetzee. “Better I/O through byte-addressable, persistent memory.” SOSP 2009.

Storage Systems

◦ Persistent data in slow, non-volatile memory

◦ Buffered data in fast, volatile memory


Storage System Trade-offs

• Design Objectives
  ◦ Safety :: security against crashes
  ◦ Consistency :: accurate description of file state
  ◦ Performance :: buffering in volatile memory
• Byte-addressable Persistence (BPRAM)
  ◦ Narrows gap between volatile/non-volatile memory
  ◦ Byte-addressable like DRAM
  ◦ Persistent like disk, Flash

Byte-addressable Persistent File System (BPFS)

• Safety
  ◦ Use PCM as DRAM alternative
  ◦ Reflect writes to PCM in O(ms), not O(s)
• Consistency
  ◦ Enforce atomicity, ordering in hardware
  ◦ Support shadow paging, copy-on-write
• Performance
  ◦ Exploit byte-addressability for small, in-place writes
  ◦ Propose short-circuit shadow paging

Tree-Based File System


File System (FS) Consistency

• Computer Crash
  ◦ Update to FS may require multiple writes
  ◦ Crash between writes leaves FS invalid
  ◦ State of files not as described by FS
• Consistency Checker
  ◦ Traverse FS
  ◦ Compare directory structure, data blocks
  ◦ Depends on allocator, free-space manager

Disks & Journaling

◦ Write to the journal before writing to the file system
◦ Requires twice the writes
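The doubled write cost of journaling can be seen in a minimal sketch. It assumes a simple redo journal with a commit marker, counts only data block writes, and uses a hypothetical class name `JournaledStore`. Real journaling file systems differ in many details.

```python
class JournaledStore:
    """Toy redo journal: log each block first, then apply it to the file system."""

    def __init__(self):
        self.blocks = {}        # file system blocks: block id -> data
        self.journal = []       # pending (block id, data) records
        self.committed = False  # stands in for a commit record
        self.journal_writes = 0
        self.fs_writes = 0

    def log(self, updates):
        for block_id, data in updates.items():
            self.journal.append((block_id, data))
            self.journal_writes += 1
        self.committed = True   # the group of updates is now recoverable

    def checkpoint(self):
        """Apply committed records in place. Also serves as crash recovery."""
        if self.committed:
            for block_id, data in self.journal:
                self.blocks[block_id] = data
                self.fs_writes += 1
        self.journal.clear()    # uncommitted records are simply discarded
        self.committed = False

    def update(self, updates):
        self.log(updates)
        self.checkpoint()

store = JournaledStore()
store.update({"inode": b"size=2", "data0": b"hello", "data1": b"world"})
print(store.journal_writes, store.fs_writes)
```

Each block is written once to the journal and once to its final location, so a crash between the two phases can be repaired by replaying the journal.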


Disks & Shadow Paging

◦ Copy-on-write propagates up to the file system root
◦ Incurs copying overhead
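The copying overhead of shadow paging can be sketched on a small immutable tree. Updating one leaf copies every node on the path to the root, and the old root stays valid until the new root is installed. The `Node` type, the `cow_update` helper, and the example tree are illustrative assumptions rather than the file system layout used in the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    data: bytes = b""
    children: tuple = ()

def cow_update(node, path, new_data):
    """Return (new_node, copies): a copied path with new_data at the target node."""
    if not path:
        return Node(new_data, node.children), 1
    i = path[0]
    new_child, copies = cow_update(node.children[i], path[1:], new_data)
    children = node.children[:i] + (new_child,) + node.children[i + 1:]
    return Node(node.data, children), copies + 1

leaf0, leaf1 = Node(b"old0"), Node(b"old1")
inode_a = Node(children=(leaf0, leaf1))
inode_b = Node(children=(Node(b"other"),))
root = Node(children=(inode_a, inode_b))

# Update the second leaf under the first inode.
new_root, copies = cow_update(root, (0, 1), b"new1")
print(copies, "nodes copied; old root still sees", root.children[0].children[1].data)
```

Subtrees that are off the update path are shared between the old and new trees rather than copied.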


PCM & Short-Circuit Shadow Paging (1)

◦ Exploit PCM byte-addressability

◦ Write in-place when possible (e.g., 64b updates)

◦ Ex: In-place write


PCM & Short-Circuit Shadow Paging (2)

◦ Exploit PCM byte-addressability

◦ Write in-place when possible (e.g., 64b updates)

◦ Ex: In-place append


PCM & Short-Circuit Shadow Paging (3)

◦ Exploit PCM byte-addressability

◦ Write in-place when possible (e.g., 64b updates)

◦ Ex: Partial copy-on-write

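The choice between writing in place and a partial copy-on-write can be sketched as follows. This is a simplified illustration, not the BPFS implementation: it assumes that a write is atomic when it fits inside one aligned 8-byte word, and that a parent pointer can be swapped with one such write. The names `Block`, `fits_atomic_word`, and `update` are hypothetical, and the in-place append case is omitted.

```python
ATOMIC_BYTES = 8  # assumed atomic write size (64 bits)

class Block:
    def __init__(self, data=b"", children=None):
        self.data = bytearray(data)
        self.children = list(children or [])

def fits_atomic_word(offset, length):
    """True if [offset, offset + length) lies within one aligned 8-byte word."""
    if not 0 < length <= ATOMIC_BYTES:
        return False
    return offset // ATOMIC_BYTES == (offset + length - 1) // ATOMIC_BYTES

def update(parent, index, offset, new_bytes):
    """Update a child block of `parent`; return the strategy that was used."""
    block = parent.children[index]
    end = offset + len(new_bytes)
    if fits_atomic_word(offset, len(new_bytes)):
        block.data[offset:end] = new_bytes   # small update: write in place
        return "in-place"
    copy = Block(block.data, block.children)
    copy.data[offset:end] = new_bytes        # large update: modify a private copy
    parent.children[index] = copy            # one pointer write commits the copy
    return "copy-on-write"

leaf = Block(bytes(32))
root = Block(children=[leaf])
print(update(root, 0, 8, b"A" * 8))
print(update(root, 0, 4, b"B" * 16))
```

Because the commit is a single pointer write in the parent, the copy does not need to propagate further up the tree, which is the "short-circuit" in this sketch.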

Hardware Support for Atomicity

• Volatile Buffers
  ◦ PCM row buffers are volatile 8T latches
  ◦ BPFS updates 64b pointer to buffer
  ◦ BPFS requires atomic 64b eviction to array
• Atomicity
  ◦ PCM writes atomically into memory array
  ◦ PCM writes complete in O(100ns)
  ◦ Capacitors guard against power failures

Hardware Support for Ordering

• Modern Performance Optimizations
  ◦ Controller for write-back caches re-orders stores
  ◦ Controller for memory bus re-orders transactions
• Ordering
  ◦ Epochs define barrier-delimited BPFS writes
  ◦ Copy-on-write updates may be re-ordered
  ◦ Pointer update must follow barrier
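The ordering rule can be illustrated with a toy model of epochs. Writes inside one epoch may reach memory in any order, but every write of an earlier epoch must be persistent before any write of a later epoch. The function `persist_order` and the tuple encoding of writes are assumptions for illustration, and the shuffle only stands in for hardware re-ordering.

```python
import random
from itertools import groupby

def persist_order(writes, seed=0):
    """Return one legal persist order for a list of (epoch, description) writes."""
    rng = random.Random(seed)
    ordered = []
    by_epoch = sorted(writes, key=lambda w: w[0])  # stable sort by epoch id
    for _, group in groupby(by_epoch, key=lambda w: w[0]):
        batch = list(group)
        rng.shuffle(batch)        # free re-ordering within an epoch
        ordered.extend(batch)     # epochs themselves stay in order
    return ordered

writes = [
    (0, "copy block A"),
    (0, "copy block B"),
    (1, "update pointer"),        # issued after the barrier
]
for epoch, what in persist_order(writes):
    print(epoch, what)
```

Whatever order the copies take, the pointer update is always persisted last, so a crash never exposes a pointer to a partially written copy.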

BPFS Evaluation

◦ NTFS-Disk: baseline

◦ NTFS-RAM: current FS on DRAM, proxy for PCM

◦ BPFS-RAM: proposed FS on DRAM, proxy for PCM


BPFS Evaluation

◦ Safety: no DRAM buffer

◦ Consistency: shadow paging

◦ Performance: short-circuit shadow paging


For more information...

• Workshop on Emerging Memory Technologies 2010
  ◦ Survey of emerging memories
  ◦ www.stanford.edu/~bcclee/emt.html
• Phase Change Memory
  ◦ Lee et al., “Architecting phase change memory as a scalable DRAM alternative.” ISCA 2009.
  ◦ Condit et al., “Better I/O through byte-addressable, persistent memory.” SOSP 2009.

Conclusions

• Scaling Challenges
  ◦ Fundamental limits in charge memory
  ◦ Transition towards resistive memory
• Architecture and Scalability
  ◦ PCM positioned as Flash replacement
  ◦ PCM viable as DRAM alternative
  ◦ Architect buffers, partial writes
• Systems and Non-Volatility
  ◦ Apply non-volatility for new capabilities
  ◦ Change storage system trade-offs
  ◦ Improve durability, performance
