socl - opencl frontend for starpu · socl opencl frontend for starpu sylvain henry...

21

Upload: others

Post on 29-Jan-2021

9 views

Category:

Documents


0 download

TRANSCRIPT

  • SOCL

    OpenCL Frontend for StarPU

    Sylvain [email protected]

    University of Bordeaux - LaBRI - Inria

    April 5th, 2013

  • StarPU

    How to use the StarPU runtime system:

    1. Low-level API

    s t a r p u d a t a h a n d l e t v e c t o r h a n d l e ;

    s t a r p u v e c t o r d a t a r e g i s t e r (& ve c t o r h and l e , 0 , ( u i n t p t r t ) v ec to r , NX,s i z e o f ( v e c t o r [ 0 ] ) ) ;

    s t r u c t s t a r p u t a s k � t a s k = s t a r p u t a s k c r e a t e ( ) ;task�>s ynch ronous = 1 ;task�>c l = &c l ;task�>hand l e s [ 0 ] = v e c t o r h a n d l e ;task�>c l a r g = &f a c t o r ;task�>c l a r g s i z e = s i z e o f ( f a c t o r ) ;

    s t a r p u t a s k s u bm i t ( t a s k ) ;

    s t a r p u d a t a u n r e g i s t e r ( v e c t o r h a n d l e ) ;

  • StarPU

    How to use the StarPU runtime system:

    1. Low-level API

    2. Insert Task

    s t a r p u i n s e r t t a s k (&mande l b r o t c l ,STARPU VALUE, &iby , s i z e o f ( i b y ) ,STARPU VALUE, &b l o c k s i z e , s i z e o f ( b l o c k s i z e ) ,STARPU VALUE, &stepX , s i z e o f ( stepX ) ,STARPU VALUE, &stepY , s i z e o f ( stepY ) ,STARPU W, b l o c k h a nd l e s [ i b y ] ,STARPU VALUE, &pcnt , s i z e o f ( i n t �) ,0) ;

  • StarPU

    How to use the StarPU runtime system:

    1. Low-level API

    2. Insert Task

    3. GCC Plugin

    s t a t i c vo id matmul ( const f l o a t �A, const f l o a t �B, f l o a t �C, s i z e t nx , s i z e tny , s i z e t nz )

    a t t r i b u t e ( ( t a s k ) ) ;

    s t a t i c vo id matmul cpu ( const f l o a t �A, const f l o a t �B, f l o a t �C, s i z e t nx ,s i z e t ny , s i z e t nz )

    a t t r i b u t e ( ( t a s k imp l emen t a t i o n ( "cpu" , matmul ) ) ) ;

    #pragma s t a r pu r e g i s t e r &A[ i�zdim�bydim + j�bzdim�bydim ] ( bzdim � bydim )#pragma s t a r pu r e g i s t e r &B[ i�xdim�bzdim + j�bxdim�bzdim ] ( bxdim � bzdim )#pragma s t a r pu r e g i s t e r &C[ i�xdim�bydim + j�bxdim�bydim ] ( bxdim � bydim )

    matmul (&A[ i � zdim � bydim + k � bzdim � bydim ] ,&B[ k � xdim � bzdim + j � bxdim � bzdim ] ,&C [ i � xdim � bydim + j � bxdim � bydim ] , bxdim ,bydim , bzdim ) ;

    #pragma s t a r pu wa i t

  • StarPU

    How to use the StarPU runtime system:

    1. Low-level API

    2. Insert Task

    3. GCC Plugin

    4. OpenCL API

  • Layers

    StarPU specificapplication

    StarPU

    CUDABackend

    CPUBackend

    OpenCLBackend

    CPUcores

    NvidiaGPUs

    OpenCLdevices

  • Layers

    StarPU

    CUDABackend

    CPUBackend

    OpenCLBackend

    CPUcores

    NvidiaGPUs

    OpenCLdevices

    SOCL

    Generic OpenCLapplication

    StarPU specificapplication

    StarPU

    CUDABackend

    CPUBackend

    OpenCLBackend

    CPUcores

    NvidiaGPUs

    OpenCLdevices

  • Layers

    StarPU

    CUDABackend

    CPUBackend

    OpenCLBackend

    CPUcores

    NvidiaGPUs

    OpenCLdevices

    SOCL

    Generic OpenCLapplication

  • Installable Client Driver

    ...Vendor AOpenCL

    Vendor ZOpenCLSOCL

    Installable Client Driver (libOpenCL)

    Application

    Installable Client Driver (libOpenCL)

    ...Vendor AOpenCL

    Vendor ZOpenCLSOCL

  • SOCL Platform

    I SOCL Platform exposes devices of every other platformsI GPUs, CPUs, accelerators. . .

    I Synchronizations (events. . . ) can be used between all devices

    I Bu�ers can be shared by all devicesI Automatic transfers between devices handled by StarPU

    I Static scheduling (�a-la OpenCL) can be used as described inthe speci�cation

    I One command queue per device

  • SOCL Platform

    I SOCL Platform exposes devices of every other platformsI GPUs, CPUs, accelerators. . .

    I Synchronizations (events. . . ) can be used between all devices

    I Bu�ers can be shared by all devicesI Automatic transfers between devices handled by StarPU

    I Static scheduling (�a-la OpenCL) can be used as described inthe speci�cation

    I One command queue per device

  • SOCL Platform

    I SOCL Platform exposes devices of every other platformsI GPUs, CPUs, accelerators. . .

    I Synchronizations (events. . . ) can be used between all devices

    I Bu�ers can be shared by all devicesI Automatic transfers between devices handled by StarPU

    I Static scheduling (�a-la OpenCL) can be used as described inthe speci�cation

    I One command queue per device

  • SOCL Platform

    I SOCL Platform exposes devices of every other platformsI GPUs, CPUs, accelerators. . .

    I Synchronizations (events. . . ) can be used between all devices

    I Bu�ers can be shared by all devicesI Automatic transfers between devices handled by StarPU

    I Static scheduling (�a-la OpenCL) can be used as described inthe speci�cation

    I One command queue per device

  • Scheduling Contexts

    GPU GPU CPU

    clEnqueue*

    Context 0Context 1

    I Command queues can now be associated to contexts

    I Automatic scheduling for commands submitted in these

    queues!

    I Scheduler can be chosen for each context

  • Scheduling Contexts

    GPU GPU CPU

    clEnqueue*

    Context 0Context 1

    I Command queues can now be associated to contexts

    I Automatic scheduling for commands submitted in these

    queues!

    I Scheduler can be chosen for each context

  • Scheduling Contexts

    GPU GPU CPU

    clEnqueue*

    Context 0Context 1

    I Command queues can now be associated to contexts

    I Automatic scheduling for commands submitted in these

    queues!

    I Scheduler can be chosen for each context

  • Scheduling Contexts

    GPU GPU CPU

    clEnqueue*

    Context 0Context 1

    I Command queues can now be associated to contexts

    I Automatic scheduling for commands submitted in these

    queues!

    I Scheduler can be chosen for each context

  • Scheduling Contexts

    c l c o n t e x t p r o p e r t i e s g p u p r o p e r t i e s [ ] = fCL CONTEXT PLATFORM, ( c l c o n t e x t p r o p e r t i e s ) p l a t f o rm s [ p l a t f o rm i d x ] ,CL CONTEXT SCHEDULER SOCL , "heft" ,CL CONTEXT NAME SOCL, "GPUs" ,0

    g ;gpu con t ex t = c lCreateContextFromType ( g pu p r o p e r t i e s , CL DEVICE TYPE GPU , NULL ,

    NULL , &e r r ) ;

  • LuxRender

    CPU CPU1 GPU

    CPU2 GPUs

    2 GPUs 1 GPU

    AMD APPSOCLIntel SDK

    Sam

    ples

    /sec

    (av

    erag

    e in

    mill

    ions

    )

    0

    1

    2

    3

    4

    Intel Xeon E5-2650 2.00GHz with 32GB, 2 AMD Radeon HD 7900

  • LuxRender

    CPU CPU1 GPU

    CPU2 GPUs

    2 GPUs 1 GPU

    Nvidia OpenCLSOCLIntel SDK

    Sam

    ples

    /sec

    (av

    erag

    e in

    mill

    ions

    )

    0.0

    0.5

    1.0

    1.5

    2.0

    2.5

    Intel Xeon E5-2650 2.00GHz with 64GB, 2 NVidia Tesla M2075

  • Thank you

    Introduction