Introduction to Software Distributed Shared Memory Systems
Chang-Yi Lin
2004/02/26
Outline
- What is a software DSM system?
- Message-passing vs. Shared-memory
- How does it work?
- Memory Consistency Models
- Cache Coherence
- Implementation Levels
- Granularity
What is a software DSM system?
- A distributed-memory system (often called a multicomputer) consists of multiple independent processing nodes, each with local memory modules, connected by a general interconnection network.
- A software DSM system builds a global shared memory on top of these separate local memories.
What is a software DSM system?
- A DSM system logically implements the shared-memory model on a physically distributed-memory system.
- The DSM system hides the remote communication mechanism from the application writer, preserving the programming ease and portability typical of shared-memory systems.
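To illustrate how a DSM layer can hide remote communication, here is a minimal sketch. The class `ToyDSM`, its round-robin page placement, and the `remote_fetches` counter are all assumptions made for illustration; they are not taken from any system named in these slides, and the sketch does no caching or coherence.

```python
class ToyDSM:
    """Toy model: one flat address space whose pages live on different nodes."""

    def __init__(self, num_nodes=4, page_size=4096):
        self.num_nodes = num_nodes
        self.page_size = page_size
        # One page store per node: page number -> bytearray.
        self.stores = [dict() for _ in range(num_nodes)]
        self.remote_fetches = 0  # stands in for network messages

    def _page(self, node_id, addr):
        page = addr // self.page_size
        home = page % self.num_nodes  # assumed round-robin placement
        if home != node_id:
            self.remote_fetches += 1  # the caller never sees this step
        return self.stores[home].setdefault(page, bytearray(self.page_size))

    def read(self, node_id, addr):
        return self._page(node_id, addr)[addr % self.page_size]

    def write(self, node_id, addr, value):
        self._page(node_id, addr)[addr % self.page_size] = value


dsm = ToyDSM(num_nodes=2, page_size=16)
dsm.write(0, 20, 7)        # address 20 is on page 1, which lives on node 1
value = dsm.read(1, 20)    # node 1 reads its own page, no remote access
```

The application only calls `read` and `write` on addresses; whether the data is local or remote is decided inside the DSM layer.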
Message-passing vs. Shared-memory
- Two different programming models.
- Shared-memory programming is generally easier.
Message-passing vs. Shared-memory
- Message-passing
  - Point-to-point communication
Message-passing vs. Shared-memory
- Message-passing
  - Buffers and data types
  - Blocking and nonblocking operations
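The message-passing points above can be sketched with two threads that share nothing except explicit message queues. This is only an illustration using Python's standard `queue` module, not an MPI program; `worker` and `run_exchange` are hypothetical names.

```python
import queue
import threading


def worker(inbox, outbox):
    # Blocking receive: waits until a message arrives.
    data = inbox.get()
    # Explicit send of the reply.
    outbox.put(sum(data))


def run_exchange(values):
    to_worker = queue.Queue()
    to_main = queue.Queue()
    t = threading.Thread(target=worker, args=(to_worker, to_main))
    t.start()

    # Nonblocking receive: nothing has been sent to the worker yet,
    # so no reply can exist and this call returns immediately.
    try:
        to_main.get_nowait()
        got_early_reply = True
    except queue.Empty:
        got_early_reply = False

    to_worker.put(list(values))  # point-to-point send (the buffer is copied)
    result = to_main.get()       # blocking receive of the reply
    t.join()
    return got_early_reply, result
```

Every data exchange is spelled out as a send and a matching receive, which is the bookkeeping that the shared-memory model removes.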
Message-passing vs. Shared-memory
Message-passing vs. Shared-memory
- Shared-memory (Pthread): each thread updates the shared variable A under a lock.
  Thread 1: lock; A++; unlock
  Thread 2: lock; A++; unlock
  Thread 3: lock; A++; unlock
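The three-thread pattern above can be sketched as follows. Python's `threading` module is used here in place of Pthreads; the function name `count_with_lock` and the iteration count are assumptions made for illustration.

```python
import threading


def count_with_lock(num_threads=3, increments=1000):
    state = {"A": 0}          # the shared variable A
    lock = threading.Lock()

    def body():
        for _ in range(increments):
            with lock:        # lock ... unlock around the update
                state["A"] += 1

    threads = [threading.Thread(target=body) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["A"]
```

No thread sends anything to another thread: all of them simply read and write `A`, and the lock orders the updates.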
How does it work?
[Figure: logical view with four labeled parts: shared-memory applications, DSM system, memory, interconnection network]
How does it work?
[Figure: physical view of four nodes, each with its own App, DSM layer, and Mem, joined by an interconnection network]
How does it work?
Memory Consistency Models
- What is memory consistency?
- On a single machine:
  P1: W(x)1  R(x)1
  ----------------------------------> time
- On two machines:
  P1: W(x)1
  P2:        R(x)?
  ----------------------------------> time
Memory Consistency Models
- A consistency model is essentially a contract between the software and the memory.
- If the software agrees to obey certain rules, the memory promises to work correctly.
- If the software violates these rules, correctness of memory operation is no longer guaranteed.
Memory Consistency Models
- Strict consistency
- Sequential consistency (SC)
- Release consistency (RC)
- Scope consistency (ScC)
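Memory Consistency Models
- Strict consistency
  - Definition: any read of memory location x returns the value stored by the most recent write to x.
  - Impossible to implement in a DSM system: "most recent" presumes a global clock, and a write on one node cannot become visible on another node instantaneously.
  P1: W(x)1
  P2:        R(x)1
  ----------------------------------> time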
Memory Consistency Models
- Sequential consistency (SC)
  - Definition: the result of any execution is the same as if the operations of all processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.
  - Any valid interleaving is acceptable behavior, but all processes must see the same sequence of memory references.
Memory Consistency Models
- Sequential consistency (SC)
  P1: (A) a=1; (B) Print(b,c);
  P2: (C) b=1; (D) Print(a,c);
  P3: (E) c=1; (F) Print(a,b);
- Four of the valid interleavings:
  (A)(B)(C)(D)(E)(F)
  (A)(C)(D)(B)(E)(F)
  (C)(E)(F)(D)(A)(B)
  (A)(C)(E)(B)(D)(F)
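The interleavings above can be checked mechanically. The sketch below enumerates every ordering of the six statements that keeps each process's program order and runs it against a single memory. It assumes a, b, and c all start at 0, which the slide does not state; the helper names are hypothetical.

```python
from itertools import permutations

PROGRAM_ORDER = [("A", "B"), ("C", "D"), ("E", "F")]   # P1, P2, P3
WRITES = {"A": "a", "C": "b", "E": "c"}                # statement -> variable set to 1
PRINTS = {"B": ("b", "c"), "D": ("a", "c"), "F": ("a", "b")}


def respects_program_order(order):
    return all(order.index(x) < order.index(y) for x, y in PROGRAM_ORDER)


def run(order):
    """Execute one interleaving; return the printed digits in time order."""
    mem = {"a": 0, "b": 0, "c": 0}   # assumed initial values
    out = []
    for step in order:
        if step in WRITES:
            mem[WRITES[step]] = 1
        else:
            out.append("".join(str(mem[v]) for v in PRINTS[step]))
    return "".join(out)


def valid_interleavings():
    return [p for p in permutations("ABCDEF") if respects_program_order(p)]
```

Under this assumption, `run("ABCDEF")` gives `001011` and `run("ACEBDF")` gives `111111`. An output of `000000` never appears, because some write always precedes one of the prints that reads it.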
Memory Consistency Models
- Release consistency (RC)
  - Two types of access
    - Ordinary access: read and write
    - Synchronization access: acquire lock, release lock, and barrier
  - Rules:
    1. Before an ordinary access to a shared variable is performed, all previous acquires done by the process must have completed successfully.
    2. Before a release is allowed to be performed, all previous reads and writes done by the process must have completed.
Memory Consistency Models
- Release consistency (RC)
  - Rule 1: … Acq(L1) … Rel(L1) … Acq(L5) … Acq(L3) … R(x)
  - Rule 2: Acq(L2) W(x) R(y) R(z) Rel(L2)
  - Example:
    P1: Acq(L) W(x)1 W(x)2 Rel(L)
    P2: Acq(L) R(x)2 Rel(L)
    P3: R(x)?
  - P3 reads x without acquiring L, so RC does not guarantee which value it sees.
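The example above can be replayed in a toy model of release consistency. In this sketch, writes stay in a private copy until the release publishes them, and an acquire refreshes the private copy. The class `ToyRCMemory` is an assumption made for illustration; it does not model the lock's mutual exclusion or any real protocol.

```python
class ToyRCMemory:
    """Toy model: one published copy plus a private copy per process."""

    def __init__(self, processes, initial):
        self.published = dict(initial)
        self.local = {p: dict(initial) for p in processes}
        self.dirty = {p: set() for p in processes}

    def acquire(self, p):
        # After an acquire, p sees every write published by earlier releases.
        self.local[p].update(self.published)

    def write(self, p, var, value):
        self.local[p][var] = value
        self.dirty[p].add(var)

    def read(self, p, var):
        return self.local[p][var]

    def release(self, p):
        # All of p's earlier writes complete before the release does.
        for var in self.dirty[p]:
            self.published[var] = self.local[p][var]
        self.dirty[p].clear()


def rc_example():
    mem = ToyRCMemory(["P1", "P2", "P3"], {"x": 0})
    mem.acquire("P1")
    mem.write("P1", "x", 1)
    mem.write("P1", "x", 2)
    mem.release("P1")
    mem.acquire("P2")
    p2_value = mem.read("P2", "x")
    mem.release("P2")
    p3_value = mem.read("P3", "x")   # no acquire before this read
    return p2_value, p3_value
```

In this toy, P2 reads 2 and P3 reads the stale initial value 0. The stale read is a property of the sketch, not a promise of RC: a real system could just as well hand P3 a newer value.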
Memory Consistency Models
- Scope consistency (ScC)
Memory Consistency Models
- Relaxing consistency permits temporary inconsistencies (delayed updates)
  - Lazy release consistency (LRC): TreadMarks, CVM
  - Scope consistency (ScC): JIAJIA, JUMP
Cache Coherence
- Write invalidate
  - Suffers from false sharing
- Write update
  - Too expensive when there are many replicas
  - Works best in applications with tight sharing
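The trade-off between the two protocols can be made concrete by counting messages on a single shared block. The cost model below is an assumption made for illustration: one message per invalidation, per update, and per refetch, with acknowledgements and message sizes ignored. The function `simulate` and the traces are hypothetical.

```python
def simulate(protocol, num_nodes, trace):
    """Count coherence messages for one shared block.

    trace is a list of (node, op) pairs, with op being "r" or "w".
    Every node is assumed to start with a valid replica.
    """
    valid = set(range(num_nodes))
    messages = 0
    for node, op in trace:
        if node not in valid:
            messages += 1            # refetch a copy that was invalidated
            valid.add(node)
        if op == "w":
            others = valid - {node}
            messages += len(others)  # one invalidation or update per replica
            if protocol == "invalidate":
                valid = {node}       # other replicas are discarded
    return messages


NODES = 4
# One node writes repeatedly; nobody else touches the block.
lone_writer = [(0, "w")] * 10
# One node writes, then every other node reads, ten times over.
tight_sharing = [(0, "w"), (1, "r"), (2, "r"), (3, "r")] * 10
```

With these traces, write update sends 30 messages for the lone writer against 3 for write invalidate, because it keeps refreshing replicas that nobody reads. Under tight sharing the result flips: 30 messages for write update against 60 for write invalidate, which pays for an invalidation and a refetch on every round.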
Implementation Levels
- Modifying the OS kernel
  - IVY (SC): modifies the OS memory management (MMU handling) to map between the shared virtual memory address space and the local memory.
- Language level
  - Linda, Orca
- User-level runtime library
  - TreadMarks, CVM, JIAJIA, JUMP, Brazos
- Combination of multiple implementation levels, possibly with hardware support
  - Munin, Midway, NCP2
Granularity
- The choice of block size depends on
  - The cost of communication: a 1-byte message vs. a 1024-byte message
  - Locality of reference in the application
- Most DSM systems use page-based granularity, with pages of 1 KB to 8 KB.
  - A larger page exploits locality of reference better (one fetch brings in more nearby data), at the cost of more false sharing.
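The 1-byte vs. 1024-byte comparison can be made concrete with a simple linear cost model: transfer time = fixed latency + size / bandwidth. The default latency (100 microseconds) and bandwidth (100 MB/s) below are assumed values chosen only for illustration, not measurements from any system in these slides.

```python
def transfer_time(num_bytes, latency_s=100e-6, bandwidth_bytes_per_s=100e6):
    """Seconds to send one message under the assumed linear cost model."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def cost_per_byte(num_bytes, **link):
    return transfer_time(num_bytes, **link) / num_bytes


small = transfer_time(1)       # dominated by the fixed latency
large = transfer_time(1024)    # about 10% slower, yet carries 1024x the data
```

When latency dominates, as it does with these assumed numbers, sending a whole page costs little more than sending one byte. That is one reason page-sized blocks are attractive when the application has good locality.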
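| System | Developer | Implementation level | Granularity | Consistency model | Coherence protocol |
|---|---|---|---|---|---|
| IVY | Yale | Library + OS | Page (1 KB) | SC | WI |
| Munin | Rice | Library + OS | Variable | Eager RC | WU/WI |
| TreadMarks | Rice | Library | Page (4 KB) | LRC | MW, WI |
| CVM | Maryland | Library | Page | LRC-MW, LRC-SW, SC | WU |
| Midway | CMU | Library + compiler | Variable | EC, PC, RC | WU |
| NCP2 | UFRJ, Brazil | Library + hardware support | Page (4 KB) | EC, RC | WU/WI |
| Quarks | Utah | Library | Region, page | RC, SC | WU/WI, MW |
| softFLASH | Stanford | OS | Page (16 KB) | RC, DIRC | FLASH-like |
| Cashmere-2L | Rochester | Library | Page (8 KB) | HLRC | WU |
| Brazos | Rice | Library | Page | ScC | Early update, WU |
| Shasta | DEC WRL | Compiler | Variable | SC | WI |
| Mermaid | Toronto | Library + OS | Page (1 KB, 8 KB) | SC | WI |
| DSM6K | IBM Research | OS | Page (4 KB) | SC | WI |
| Mirage | UCLA | OS | 512 bytes | SC | WI |
| JIAJIA | Chinese Academy of Sciences | Library | Page (4 KB) | ScC | WI |
| Simple-COMA | SICS (Sweden) and SUN | OS | Page | SC | WI |
| Blizzard-S | Wisconsin | Library | Cache line | SC | WI |
| Shrimp | Princeton | OS + hardware support | Page | AURC, SC | WU/WI |
| Linda | Yale | Language | Variable | SC | Implementation-dependent |
| Orca | Vrije Univ., Netherlands | Language | Variable | Synchronization-dependent | WU |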