simulation of the fresh breeze storage system xiaoxuan mengjack dennis mit csailuniversity of...
TRANSCRIPT
Simulation of the Fresh BreezeStorage System
Xiaoxuan MengJack Dennis
MIT CSAIL University of Delaware
Funded by NSF HECURA Grant CCF-0937832
Project Goals
• Implement fine-grain management of tasks and memory.• Provide a base for cpability-based access control and security.• Support composability of arbitrary parallel program components.• Achieve high efficiency in operation.
2
Computer Systems that:
The Fresh Breeze Memory Model
• Fan-out as large as 16
Data
Chunkse.g. 128 Bytes
Master
Chunk
• Write-Once then Read Only
Cycle-Free Heap Arrays as Trees of Chunks
3
• Arrays: Three levels yields 4096 elements (longs)
Storage System Architecture
Execution System
Top Storage
Level
4
AD
Main Storage
Level
CUSU
SUAD CU
SU
SU
AD CUSU
SU
AD – Associative Directory
CU - Control Unit
SU – Storage Unit
Spawn and Join
5
Master Task
spawn (n)
join_fetch Join
Worker 0 Worker n-1
join_update join_update
Spawned Tasks
Continuation Task
Spawn - Join Implementation
Results from
Spawned Tasks
6
Join TicketJoin Data
Count/Limit
Continuation
Task
System for Simulation
L1P
MainMemory
Switch
Processors
40Up to 64IndependentStorage Units
EUs TSUs MSUs
L1P
Fresh Breeze Simulation
X-Bar Switch
EUs – Task Execution TSUs – CacheImplementation
MSUs – Main Storage Implementation
Cyclops TUs
The Dot Product
A
B
* Sum
A B
5 levels:Vector length =165 = 1,048,576
* +
scalar result
* *
Each Task:Dot Product of two16-element vectors:16 multiplies; 15 adds
Spawn – Join Programming
void DotProd ( a, b ) {join_init (16, Accumulate ( ) );for ( i = 0; i < 16; i++)
spawn_one ( i, DotProd ( a[i], b[i] )quit ();
}
void Accumulate ( ) {handle = join_fetch ();sum = 0;for ( j = 0; j < 16; j++ )
sum += handle [j];join_update ( sum );
}
Findings
• Using fine-grain tasks and a hardware scheduler can effectively distribute concurrent tasks over a large number of processors.
• Task stealing can achieve high utilization of many processors.
11
Further Work
• Model a Three-Level Storage System: L1, L2, Main.• Task distribution to exploit data locality.• Further benchmark applications for testing: MM, FFT, LUD.• FunJava Compiler: DFG transformations and code
generation.
12