disksim with ssd_extension

Disksim with SSD extension -- A developer's perspective. Jiannan Ouyang, PhD CS@PITT, 2011/04/07

DESCRIPTION

Analyzed the source code of disk simulator Disksim, and its SSD extension from Microsoft

TRANSCRIPT

Page 1: Disksim with SSD_extension

Disksim with SSD extension -- A developer's perspective

Jiannan Ouyang, PhD CS@PITT

2011/04/07

Page 2: Disksim with SSD_extension

Outline

Overview

Disksim implementation

SSD extension

Page 3: Disksim with SSD_extension

Disksim

Disksim: An open source disk simulator originally developed at UMich. and enhanced at CMU.

Page 4: Disksim with SSD_extension

Disksim features

Various device models, including disk, simpledisk, and memsmodel

Controller models: simple, smart (with cache)

Trace synthesis and support for different trace file formats

DIXtrac: automatic disk characterization

Page 5: Disksim with SSD_extension

ssdmodel

Developed by Microsoft.

NOT for any specific SSD Device

For an idealized SSD that is parameterized by the properties of NAND flash chips

Cache is NOT natively supported

Page 6: Disksim with SSD_extension

Source Dir

src/ disksim source (disksim_*.c/h)

ssdmodel/ ssd extension source (ssd_*.c/h)

diskmodel/ disk model layout and mechanics

memsmodel/ MEMS device model

libparam/ parameter processing lib

...

Page 7: Disksim with SSD_extension

Outline

Overview

Disksim implementation

SSD extension

Page 8: Disksim with SSD_extension

Disksim source: src/

disksim_main*: main entry point, main()

disksim_iodriver*: I/O driver, iodriver_send_event_down_path()

disksim_bus*: bus, bus_deliver_event()

disksim_controller*: controller, controller_event_arrive()

disksim_diskctlr*: disk controller, disk_event_arrive()

...

Page 9: Disksim with SSD_extension

Disksim Control Path

Event-based system:

Various types of events: io, interrupt, timer, ...

All events are stored in a global queue in time order.

addtointq() and removefromintq() are used to access the global queue.

Equivalent code:

while ((curr = getnextevent())) {
    switch (curr->type) {
    case IO_REQUEST_ARRIVE:
        iodriver_request(curr);
        break;
    }
}
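The time-ordered global queue and the dispatch loop can be sketched in a few lines of C. This is only an illustration of the idea: the `event` struct, the `EV_*` codes, and the `sketch_*` functions are hypothetical names, not DiskSim's own types or its addtointq()/removefromintq() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record; DiskSim's real event struct has more fields. */
typedef struct event {
    double time;          /* simulated time at which the event fires */
    int type;             /* illustrative event code, see EV_* below */
    struct event *next;
} event;

enum { EV_IO_REQUEST_ARRIVE = 1, EV_IO_COMPLETE = 2 };  /* made-up codes */

static event *queue_head = NULL;  /* global queue, kept sorted by time */

/* Insert an event so the list stays in nondecreasing time order.
   An event is placed after any queued events with the same time. */
void sketch_add_event(event *ev)
{
    assert(ev != NULL);
    event **link = &queue_head;
    while (*link != NULL && (*link)->time <= ev->time)
        link = &(*link)->next;
    ev->next = *link;
    *link = ev;
}

/* Detach and return the earliest event, or NULL when the queue is empty. */
event *sketch_next_event(void)
{
    event *ev = queue_head;
    if (ev != NULL) {
        queue_head = ev->next;
        ev->next = NULL;
    }
    return ev;
}

/* Drain the queue, dispatching on event type.
   Returns the number of events handled and records the last event time. */
int sketch_run(double *last_time)
{
    int handled = 0;
    event *curr;
    while ((curr = sketch_next_event()) != NULL) {
        switch (curr->type) {
        case EV_IO_REQUEST_ARRIVE:  /* a real simulator would call the driver */
        case EV_IO_COMPLETE:
            handled++;
            break;
        default:                    /* unknown event types are ignored */
            break;
        }
        *last_time = curr->time;
    }
    return handled;
}
```

A sorted linked list keeps the sketch short; insertion is linear in the queue length, which is acceptable for illustration.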

Page 10: Disksim with SSD_extension

Example

src/disksim_iosim.c, io_internal_event():
    case IO_ACCESS_ARRIVE:
        iodriver_schedule(0, curr);
        break;

src/disksim_iodriver.c, iodriver_schedule():
    iodriver_send_event_down_path(curr);

src/disksim_iodriver.c, iodriver_send_event_down_path():
    bus_deliver_event(busno.byte[0], slotno.byte[0], curr);

Page 11: Disksim with SSD_extension

Example (cont.)

src/disksim_bus.c, bus_deliver_event():
    case CONTROLLER:
        controller_event_arrive(devno, curr);
        break;
    case DEVICE:
        ASSERT(devno == curr->devno);
        device_event_arrive(curr);
        break;

This control flow is a simulation of an event.

Page 12: Disksim with SSD_extension

Disksim & Device Interface

INLINE void device_event_arrive (ioreq_event *curr)
{
    ASSERT1 ((curr->devno >= 0) && (curr->devno < numdevices), "curr->devno", curr->devno);
    return disksim->deviceinfo->devices[curr->devno]->event_arrive(curr);
}

Function pointer! Dynamic tracing with gdb shows that:

For a disk, it jumps to disk_event_arrive().

For an SSD, it jumps to ssd_event_arrive().
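The function-pointer dispatch can be illustrated with a small stand-alone C sketch. All `sketch_*` names, the request struct, and the two-slot device table are assumptions made for this example; they are not DiskSim's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified request: only the device number matters here. */
typedef struct { int devno; int handled_by; } sketch_req;

/* Each device exposes its entry point through a function pointer. */
typedef struct { void (*event_arrive)(sketch_req *); } sketch_device;

enum { HANDLER_DISK = 1, HANDLER_SSD = 2 };

/* Stand-ins for the device-specific handlers; they only record who ran. */
static void sketch_disk_event_arrive(sketch_req *r) { r->handled_by = HANDLER_DISK; }
static void sketch_ssd_event_arrive(sketch_req *r)  { r->handled_by = HANDLER_SSD; }

/* Device table: slot 0 is a disk, slot 1 is an SSD (arbitrary choice). */
static sketch_device sketch_devices[] = {
    { sketch_disk_event_arrive },
    { sketch_ssd_event_arrive },
};
static const int sketch_numdevices = 2;

/* Generic entry point: validate the device number, then jump through the
   pointer. The caller never needs to know which device model it reaches. */
void sketch_device_event_arrive(sketch_req *r)
{
    assert(r != NULL && r->devno >= 0 && r->devno < sketch_numdevices);
    sketch_devices[r->devno].event_arrive(r);
}
```

With this layout, adding a new device model only requires registering another handler in the table; the generic delivery code stays unchanged.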

Page 13: Disksim with SSD_extension

event_arrive: disk vs. ssd

disk_event_arrive():
    case IO_ACCESS_ARRIVE: disk_request_arrive(curr);
    case DEVICE_OVERHEAD_COMPLETE: disk_request_arrive(curr);
    case DEVICE_BUFFER_SEEKDONE: disk_buffer_seekdone(currdisk, curr);
    case DEVICE_BUFFER_SECTOR_DONE: disk_buffer_sector_done(currdisk, curr);
    case DEVICE_GOTO_REMAPPED_SECTOR: disk_goto_remapped_sector(currdisk, curr);
    case DEVICE_GOT_REMAPPED_SECTOR: disk_got_remapped_sector(currdisk, curr);
    case DEVICE_PREPARE_FOR_DATA_TRANSFER: disk_prepare_for_data_transfer(curr);
    case DEVICE_DATA_TRANSFER_COMPLETE: disk_reconnection_or_transfer_complete(curr);
    case IO_INTERRUPT_COMPLETE: disk_interrupt_complete(curr);

ssd_event_arrive():
    case DEVICE_OVERHEAD_COMPLETE: ssd_request_arrive(curr);
    case DEVICE_ACCESS_COMPLETE: ssd_access_complete(curr);
    case DEVICE_DATA_TRANSFER_COMPLETE: ssd_bustransfer_complete(curr);
    case IO_INTERRUPT_COMPLETE: ssd_interrupt_complete(curr);
    case SSD_CLEAN_GANG: ssd_clean_gang_complete(curr);
    case SSD_CLEAN_ELEMENT: ssd_clean_element_complete(curr);

"buffer" events are cache related. "remapped sector" events seem to be related to data layout (not sure).

"clean" events are related to garbage collection and wear leveling. "Gang" and "Element" specify the allocation and reclamation unit.

Page 14: Disksim with SSD_extension

Outline

Overview

Disksim implementation

SSD extension

Page 15: Disksim with SSD_extension

ssdmodel features

Add an auxiliary level of parallel elements, each with a closed queue, to represent flash elements or gangs

Add logic to serialize request completions from these parallel elements

For each element, maintain data structures to represent SSD logical block maps, cleaning state, and wear-leveling state

Delay is introduced when a request is processed

Parameters include background cleaning, gang size, gang organization, interleaving, and overprovisioning

Page 16: Disksim with SSD_extension

Flash Package Internal

Page 17: Disksim with SSD_extension

Flash Chip Performance

1. Latency

bus <-> data register: 100 us

media -> register (read): 25 us

register -> media (write): 200 us

erase: 1.5 ms

2. Two-plane commands can be executed on plane pairs 0 & 1 or 2 & 3

3. Background copy is supported within the same plane

4. Bandwidth and interleaving

source plane -> destination plane: 4-page copy (100 us per page)
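As a worked example of the latency figures above, the following C sketch adds up the per-page cost of a read and a write. It assumes the bus transfer and the media access of a page are strictly serialized with no overlap or interleaving, and that a block is cleaned by copying its valid pages inside the same plane without using the bus; both are simplifications for illustration, and the `sketch_*` names are hypothetical.

```c
#include <assert.h>

/* Latencies from the slide, in microseconds. */
enum {
    BUS_XFER_US    = 100,   /* bus <-> data register, per page */
    MEDIA_READ_US  = 25,    /* media -> register */
    MEDIA_PROG_US  = 200,   /* register -> media */
    BLOCK_ERASE_US = 1500   /* block erase, 1.5 ms */
};

/* Read: fetch each page into the register, then move it over the bus. */
int sketch_read_us(int pages)
{
    assert(pages >= 0);
    return pages * (MEDIA_READ_US + BUS_XFER_US);
}

/* Write: move each page over the bus, then program it into the media. */
int sketch_write_us(int pages)
{
    assert(pages >= 0);
    return pages * (BUS_XFER_US + MEDIA_PROG_US);
}

/* Cleaning one block (assumption: valid pages are copied within the plane,
   so each costs one read plus one program, followed by a single erase). */
int sketch_clean_block_us(int valid_pages)
{
    assert(valid_pages >= 0);
    return valid_pages * (MEDIA_READ_US + MEDIA_PROG_US) + BLOCK_ERASE_US;
}
```

Under these assumptions a single page read costs 25 + 100 = 125 us and a single page write costs 100 + 200 = 300 us, which shows why the bus transfer dominates reads while programming dominates writes.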

Page 18: Disksim with SSD_extension

SSD Simulation

Logical block map: allocation pool

Cleaning: greedy or wear-leveling aware

Parallelism and interconnect density: ganging, interleaving, background cleaning

Persistence: saving mapping information per block in DRAM

Page 19: Disksim with SSD_extension

Interconnection - Ganging

A gang of flash packages can be used in synchrony to optimize a multi-page request.

Ganging allows multiple packages to be used in parallel while sharing one request queue.

A request queue can be associated with each gang or with each element (full interconnection mode).

Page 20: Disksim with SSD_extension

Logical Block Map

Use the notion of an allocation pool to think about how an SSD allocates flash blocks to service write requests

An allocation pool can be a flash package or a gang

Static: a portion of each LBA constitutes a fixed mapping to a specific allocation pool

Dynamic: the non-static portion of a LBA is the lookup key for a mapping within a pool
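The static/dynamic split can be sketched as follows. Which part of the LBA forms the static portion is a design parameter; this sketch assumes the pool is chosen by `lba % num_pools` and the quotient is the lookup key. The pool count, table size, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>

enum { SKETCH_POOLS = 4, SKETCH_KEYS = 8 };  /* tiny sizes for illustration */

typedef struct { unsigned pool; unsigned long key; } sketch_lba_split;

/* Static part: a fixed function of the LBA picks the allocation pool.
   Dynamic part: the rest of the LBA is the lookup key inside that pool. */
sketch_lba_split sketch_split_lba(unsigned long lba, unsigned num_pools)
{
    assert(num_pools > 0);
    sketch_lba_split s;
    s.pool = (unsigned)(lba % num_pools);
    s.key = lba / num_pools;
    return s;
}

/* Per-pool dynamic map. Entries store (physical page + 1) so that the
   zero-initialized table means "unmapped". */
static long sketch_table[SKETCH_POOLS][SKETCH_KEYS];

/* A write may place the data anywhere in the pool; only the map changes. */
void sketch_map_write(unsigned long lba, long phys_page)
{
    sketch_lba_split s = sketch_split_lba(lba, SKETCH_POOLS);
    assert(s.key < SKETCH_KEYS && phys_page >= 0);
    sketch_table[s.pool][s.key] = phys_page + 1;
}

/* Returns the mapped physical page, or -1 if the LBA was never written. */
long sketch_map_lookup(unsigned long lba)
{
    sketch_lba_split s = sketch_split_lba(lba, SKETCH_POOLS);
    assert(s.key < SKETCH_KEYS);
    return sketch_table[s.pool][s.key] - 1;
}
```

Rewriting an LBA simply overwrites its map entry with the new physical location; the old page becomes out of date and is left for cleaning.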

Page 21: Disksim with SSD_extension

Garbage Collection (Cleaning)

active block: a block available to hold incoming writes in a pool

superseded page: out-of-date page

cleaning efficiency: (superseded / total pages) in a block

a pure greedy approach: choosing blocks to clean based on potential cleaning efficiency
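A minimal C sketch of the greedy choice, assuming each block tracks its superseded and total page counts. The struct and function names are hypothetical and are not taken from ssd_clean.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-block bookkeeping. */
typedef struct {
    int superseded_pages;  /* out-of-date pages in the block */
    int total_pages;       /* pages per block */
} sketch_block;

/* Cleaning efficiency = superseded pages / total pages. */
double sketch_cleaning_efficiency(const sketch_block *b)
{
    assert(b != NULL && b->total_pages > 0);
    return (double)b->superseded_pages / (double)b->total_pages;
}

/* Pure greedy choice: return the index of the block with the highest
   cleaning efficiency. Ties go to the lowest index; returns -1 if n == 0. */
int sketch_pick_greedy(const sketch_block *blocks, size_t n)
{
    int best = -1;
    double best_eff = -1.0;
    for (size_t i = 0; i < n; i++) {
        double eff = sketch_cleaning_efficiency(&blocks[i]);
        if (eff > best_eff) {
            best_eff = eff;
            best = (int)i;
        }
    }
    return best;
}
```

Picking the block with the most superseded pages minimizes the number of still-valid pages that must be copied out before the erase.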

Page 22: Disksim with SSD_extension

Wear-Leveling

average remaining lifetime (ARL) of a block

age variance (say 20%) of the ARL

retirement age (say 85%) of the ARL

Wear-aware garbage collection:

1. If the block's remaining lifetime < retirement age, migrate cold data into this block from a migration-candidate queue, and recycle the head block of the queue. Populate the queue with new blocks holding cold data.

2. Otherwise, if the block's remaining lifetime < age variance, restrict recycling of the block with a probability that increases linearly as the remaining lifetime drops to 0 (at 80% of the average, probability of recycling = 1; at 0% of the average, probability of recycling = 0).
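The linear ramp in the second rule can be written down directly. This sketch covers only that rule (not the cold-data migration step) and reads "age variance of 20%" as a threshold at 80% of the average remaining lifetime, following the endpoints given on the slide. The function name and parameters are hypothetical.

```c
#include <assert.h>

/* Probability that a block is allowed to be recycled.
   remaining:    remaining lifetime of the block (e.g. erase cycles left)
   average:      average remaining lifetime over all blocks
   age_variance: fraction such as 0.20; the ramp starts at
                 (1 - age_variance) * average and reaches 0 at remaining == 0 */
double sketch_recycle_probability(double remaining, double average,
                                  double age_variance)
{
    assert(average > 0.0 && remaining >= 0.0);
    assert(age_variance >= 0.0 && age_variance < 1.0);
    double threshold = (1.0 - age_variance) * average;
    if (remaining >= threshold)
        return 1.0;                 /* young enough: never restricted */
    return remaining / threshold;   /* linear drop toward 0 */
}
```

For example, with an average of 100 and an age variance of 25%, a block with 37.5 remaining sits halfway down the ramp and is recycled with probability 0.5; the probability of restricting it is the complement.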

Page 23: Disksim with SSD_extension

Source: ssdmodel/

ssdmodel is very simple; all of its C files are listed below:

ssd.c: main, ssd_event_arrive()

ssd_clean.c: garbage collection and wear leveling, ssd_clean_blocks_greedy()

ssd_gang.c: several flash packages organized as a gang, ssd_activate_gang()

ssd_timing.c: timing model, ssd_compute_access_time()

ssd_utils.c: utilities

ssd_init.c: initialization

Page 24: Disksim with SSD_extension

Example

Event sequence for one request:

ssd_request_arrive
  -> ssd_interrupt_complete (reconnect)
  -> ssd_bustransfer_complete
  -> ssd_access_complete
  -> ssd_interrupt_complete (completion)

ssd_bustransfer_complete() -> ssd_media_access_request();

ssdmodel/ssd.c, ssd_media_access_request():
    case SSD_ALLOC_POOL_PLANE:
    case SSD_ALLOC_POOL_CHIP:
        ssd_media_access_request_element(curr);
        break;
    case SSD_ALLOC_POOL_GANG:
#if SYNC_GANG
        ssd_media_access_request_gang_sync(curr);
#else
        ssd_media_access_request_gang(curr);
#endif
        break;

Page 25: Disksim with SSD_extension

Example (cont.)

ssd_media_access_request_element()
  -> ssd_activate_element()
  -> ssd_invoke_element_cleaning()
  -> ssd_compute_access_time(currdisk, elem_num, read_reqs, read_total);
  -> add completion events to the global event queue
  -> ssd_compute_access_time(currdisk, elem_num, write_reqs, write_total);
  -> add completion events to the global event queue

Parallel processing with sequential completion is achieved by processing a batch of requests in parallel while generating the ACCESS_COMPLETE events sequentially.

Page 26: Disksim with SSD_extension

References

Disksim: http://www.pdl.cmu.edu/DiskSim/

Disksim Manual: http://www.pdl.cmu.edu/PDL-FTP/DriveChar/CMU-PDL-08-101.pdf

Disksim implementation doc: src/doc/Outline.txt

SSD Extension: http://research.microsoft.com/en-us/downloads/b41019e2-1d2b-44d8-b512-ba35ab814cd4/

SSD Extension paper: Design Tradeoffs for SSD Performance, N. Agrawal et al., 2008

Cache over SSD project: Group 6 on http://www-users.cselabs.umn.edu/classes/Spring-2009/csci8980-ass/

Page 27: Disksim with SSD_extension

Thanks

Q & A ?

Page 28: Disksim with SSD_extension

Block striping

// blocks can be concatenated (chained) from each plane
//
//  plane 0     plane 1     plane 2     plane 3
// ------------------------------------------
//  blk 0       blk 2048    blk 4096    blk 6144
//  blk 1       blk 2049    blk 4097    blk 6145
//  ...         ...
//  blk 2047    blk 4095    blk 6143    blk 8191

// blocks can be striped across all the planes
//
//  plane 0     plane 1     plane 2     plane 3
// ------------------------------------------
//  blk 0       blk 1       blk 2       blk 3
//  blk 4       blk 5       blk 6       blk 7
//  ...         ...
//  blk 8188    blk 8189    blk 8190    blk 8191
//
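The two layouts reduce to a division and a modulo. The sketch below uses the sizes from the tables (4 planes, 2048 blocks per plane); the function and constant names are hypothetical.

```c
#include <assert.h>

enum { SKETCH_PLANES = 4, SKETCH_BLOCKS_PER_PLANE = 2048 };

/* Chained layout: consecutive block numbers fill one plane before
   moving on to the next, so the plane is the quotient. */
int sketch_plane_chained(int blk)
{
    assert(blk >= 0 && blk < SKETCH_PLANES * SKETCH_BLOCKS_PER_PLANE);
    return blk / SKETCH_BLOCKS_PER_PLANE;
}

/* Striped layout: consecutive block numbers rotate across the planes,
   so the plane is the remainder. */
int sketch_plane_striped(int blk)
{
    assert(blk >= 0 && blk < SKETCH_PLANES * SKETCH_BLOCKS_PER_PLANE);
    return blk % SKETCH_PLANES;
}
```

In the striped layout neighbouring block numbers always land on different planes, which is what makes parallel plane operations on consecutive blocks possible.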