Juniors: Prototypes for novel exascale I/O concepts

Upload: heiko-joerg-schick

Post on 05-Apr-2018



    Prototype II: Based on Blue Gene/Q Technology

    Setup description

    4 Blue Gene/Q I/O drawers

    8 I/O nodes per drawer

    Flash card devices: TMS RamSan 450 GByte

    10 GbE adapter (for external connectivity)

    Status: Initial unit tests successful

    I/O Benchmarks

    IOR: Parallel benchmark supporting different I/O interfaces like POSIX, MPI-IO, HDF5

    FIO: Parallel benchmark with multiple engines for generating synchronous and asynchronous I/O requests

    Others: IOzone, mdtest and application benchmarks

    H. El-Harake, S. El Sayed, U. Fischer, S. Graf, M. Hennecke, W. Homberg, K. Kutzer, J. Lauritsen, O. Mextorf, P. Morjan, D. Pleiter, H. Schick, G. Schwarz, M. Stephan

    To meet the future demands of exascale systems, new I/O concepts are required, since compute power and the performance of storage technologies are developing at different speeds. With PCIe flash cards, a new fast, persistent storage technology is emerging that can bridge the performance gap between volatile main memory and persistent disk-based storage devices.

    Prototype I: Servers with x86 Processors

    Cluster of 8 IBM xSeries servers

    10 GbE interconnect

    Different PCIe flash devices:

    Fusion-IO Duo 320 GByte

    TMS RamSan 450 and 900 GByte

    Status: Operational

    Preliminary performance results using FIO

    Bandwidth and IOPS rates at or close to vendor specification

    Performance of GPFS using 2 Fusion-IO cards similar to raw device access

    Performance on lower-clocked many-core devices will improve using interfaces and devices which enable highly concurrent access to flash memory

    Flash Storage Managed by Filesystem

    Concept verified on prototype I:

    GPFS pools on flash storage and disk

    Use GPFS policy engine to manage placement and migration

    Example: Migration of oldest files triggered by LOW_SPACE event:

    RULE MIGRATE FROM POOL 'flash'

    THRESHOLD(60,20) WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME)

    TO POOL 'disk'
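The effect of the rule above can be illustrated with a small simulation. This is only a sketch of the selection logic, not GPFS code: it assumes that THRESHOLD(60,20) means "start migrating once the flash pool is at least 60% full and stop once occupancy is at or below 20%", and that files with the largest weight (longest time since last access) are moved first. The names `FileEntry` and `select_for_migration`, the pool capacity, and the file sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    size_gb: float
    access_time: float  # time of last access, in seconds (hypothetical values)


def select_for_migration(files, capacity_gb, now, high_pct=60.0, low_pct=20.0):
    """Pick files to move from 'flash' to 'disk' under a THRESHOLD(high, low) rule."""
    used = sum(f.size_gb for f in files)
    if 100.0 * used / capacity_gb < high_pct:
        return []  # occupancy below the high threshold: rule does not fire

    # WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME): larger weight = accessed longer ago.
    by_weight = sorted(files, key=lambda f: now - f.access_time, reverse=True)

    selected = []
    for f in by_weight:
        if 100.0 * used / capacity_gb <= low_pct:
            break  # low threshold reached: stop migrating
        selected.append(f)
        used -= f.size_gb
    return selected


# Hypothetical flash pool of 100 GB holding three files (65% full).
files = [
    FileEntry("a.dat", 30.0, access_time=100.0),
    FileEntry("b.dat", 20.0, access_time=400.0),
    FileEntry("c.dat", 15.0, access_time=900.0),
]
moved = select_for_migration(files, capacity_gb=100.0, now=1000.0)
print([f.name for f in moved])
```

In this toy setup the two least recently accessed files are selected, which brings the pool from 65% down to 15% occupancy; the most recently accessed file stays on flash.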

    Prototype I cluster layout (figure)

    Management net: 134.94.115.64/27

    Login: ssh juniors.fz-juelich.de

    Interconnect: 2 Nexus 7k switches

    Juniorsm (.115.66): x3650 M3

    Juniors1 (.115.67): x3650 M3 / FusionIO

    Juniors2 (.115.68): x3650 M3 / FusionIO

    Juniors3 (.115.69): x3650 M3 / FusionIO

    Juniors4 (.115.70): x3650 M3 / FusionIO

    Juniors5 (.115.71): x3650 M3 / FC

    Juniors6 (.115.72): x3650 M3 / FC

    Juniors7 (.115.73): x3650 M3 / SAS

    Juniors8 (.115.74): x3650 M3 / SAS

    Storage: DCS3700 / FC, 2 TB HDDs (RAID6)

    Storage: DCS3700 / SAS, 2 TB HDDs (RAID6)
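The node addresses in the layout above can be checked against the management subnet with Python's standard `ipaddress` module. This is a small sanity check, assuming that the abbreviated addresses such as `.115.66` share the `134.94` prefix of the management net; the lowercase host names in the dictionary are illustrative.

```python
import ipaddress

# Management network as given in the cluster layout.
mgmt_net = ipaddress.ip_network("134.94.115.64/27")

# Last octet per node; the full address assumes the 134.94.115 prefix.
last_octets = {"juniorsm": 66, **{f"juniors{i}": 66 + i for i in range(1, 9)}}
addresses = {
    name: ipaddress.ip_address(f"134.94.115.{octet}")
    for name, octet in last_octets.items()
}

hosts = list(mgmt_net.hosts())
print(f"usable host range: {hosts[0]} to {hosts[-1]} ({len(hosts)} hosts)")
for name, addr in addresses.items():
    print(f"{name}: {addr} in management net: {addr in mgmt_net}")
```

A /27 prefix leaves 5 host bits, so the nine nodes fit comfortably within the 30 usable addresses.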


                             Fusion-IO Duo 320 GByte   TMS RamSan 450 GByte
    Read BW [GByte/s]        1.5                       1.25
    Write BW [GByte/s]       1.5                       0.9
    Read IOPS                261,000                   300,000
    Write IOPS               262,000                   220,000
    PCIe Gen2 bus width      x4                        x8
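To put the table in context, the following sketch relates the listed read bandwidth to the theoretical PCIe Gen2 link bandwidth and converts the read IOPS into an equivalent data rate. PCIe Gen2 runs at 5 GT/s per lane with 8b/10b encoding, which gives 0.5 GByte/s per lane and direction before protocol overhead. The 4 KiB request size is an assumption; the table itself does not state the request size behind the IOPS figures.

```python
# PCIe Gen2: 5 GT/s per lane, 8b/10b encoding, 8 bits per byte.
GEN2_GBYTE_PER_LANE = 5.0 * 8 / 10 / 8  # 0.5 GByte/s per lane and direction

BLOCK_BYTES = 4096  # assumption: IOPS figures refer to 4 KiB requests

# Read figures copied from the table above.
devices = {
    "Fusion-IO Duo 320 GByte": {"lanes": 4, "read_bw": 1.5, "read_iops": 261_000},
    "TMS RamSan 450 GByte": {"lanes": 8, "read_bw": 1.25, "read_iops": 300_000},
}


def summarize(spec):
    """Derive link bandwidth, link utilization and IOPS-equivalent bandwidth."""
    link_bw = spec["lanes"] * GEN2_GBYTE_PER_LANE
    return {
        "link_bw": link_bw,
        "link_utilization": spec["read_bw"] / link_bw,
        "iops_bw": spec["read_iops"] * BLOCK_BYTES / 1e9,
    }


summary = {name: summarize(spec) for name, spec in devices.items()}
for name, s in summary.items():
    print(
        f"{name}: link {s['link_bw']:.1f} GByte/s, "
        f"read bandwidth uses {s['link_utilization']:.0%} of it, "
        f"read IOPS at 4 KiB correspond to {s['iops_bw']:.2f} GByte/s"
    )
```

Under these assumptions neither card is limited by the raw link rate: the x4 card uses about three quarters of its link and the x8 card less than a third.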

    Figure panels: FIO results versus number of threads (Nthread = 1, 8, 64)

    Prototype II: read and write bandwidth [GByte/s]; ioengine=libaio, bs=1M, iodepth=64

    Prototype II: random read and random write rate [kIOPS]; ioengine=libaio, bs=4K, iodepth=512

    Prototype I: read and write bandwidth [GByte/s]; ioengine=libaio, bs=1M, iodepth=64

    Prototype I: random read and random write rate [kIOPS]; ioengine=libaio, bs=4K, iodepth=512
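The FIO parameters quoted with the plots (ioengine=libaio, bs=1M with iodepth=64 for bandwidth, bs=4K with iodepth=512 for IOPS, and 1, 8 and 64 threads) could be turned into a job file along the following lines. This is a sketch, not the job file used for the poster: `direct=1`, `size=1g`, `group_reporting`, `stonewall` and the target path are assumptions added to make the jobs complete, and mapping the thread count to `numjobs` is a guess at how the measurements were run.

```python
THREAD_COUNTS = (1, 8, 64)

# rw mode -> (block size, queue depth), taken from the plot annotations.
WORKLOADS = {
    "read": ("1M", 64),
    "write": ("1M", 64),
    "randread": ("4K", 512),
    "randwrite": ("4K", 512),
}


def fio_job(rw, bs, iodepth, numjobs):
    """Render one fio job section as INI-style text."""
    return "\n".join([
        f"[{rw}-{numjobs}]",
        "stonewall",  # assumption: run the jobs one after another
        f"rw={rw}",
        f"bs={bs}",
        f"iodepth={iodepth}",
        f"numjobs={numjobs}",
    ])


def build_job_file(filename="/path/to/testfile"):
    """Build a job file covering all workloads and thread counts."""
    sections = ["\n".join([
        "[global]",
        "ioengine=libaio",
        "direct=1",              # assumption: bypass the page cache
        f"filename={filename}",  # placeholder path, not from the poster
        "size=1g",               # assumption: test size is not given
        "group_reporting",
    ])]
    for rw, (bs, iodepth) in WORKLOADS.items():
        for numjobs in THREAD_COUNTS:
            sections.append(fio_job(rw, bs, iodepth, numjobs))
    return "\n\n".join(sections) + "\n"


job_file = build_job_file()
print(job_file)
```

Writing the result to a file and passing that file to `fio` would run the twelve jobs in sequence; the placeholder path has to point at a file or device on the flash card under test.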