GPU Virtualization Support in Cloud System

Ching-Chi Lin
Institute of Information Science, Academia Sinica
Department of Computer Science and Information Engineering, National Taiwan University

Chih-Yuan Yeh, Chung-Yao Kao, Wei-Shu Hung
Department of Computer Science and Information Engineering, National Taiwan University

Pangfeng Liu
Department of Computer Science and Information Engineering, National Taiwan University
Graduate Institute of Networking and Multimedia, National Taiwan University

Jan-Jan Wu
Institute of Information Science, Academia Sinica
Research Center for Information Technology Innovation, Academia Sinica

Kuang-Chih Liu
Cloud Computing Center for Mobile Applications, Industrial Technology Research Institute, Hsinchu, Taiwan

Introduction

Cloud computing is very popular.

Virtualization
◦Shares hardware resources: CPU, memory.

What about the GPU?

GPU

Graphics Processing Unit (GPU)
◦A specialized microprocessor that accelerates image rendering and display.
◦Hundreds of cores.

Advantages
◦Better performance/cost ratio.
◦Powerful parallel computing capability.

GPGPU

“General-purpose computing on graphics processing units”.
◦Supercomputing, molecular dynamics, protein folding, planetary system simulation, etc.

Various programming environments support GPGPU.
◦CUDA
◦OpenCL

Our Goal

Virtualize the GPU
◦Cloud users can rent VMs to execute CUDA programs.

Difficulties
◦No built-in time-sharing mechanism.
◦Information about the GPU (e.g. driver source code) is not available.

Main Idea

Gather and pack GPU kernels into a batch.

Concurrent kernel execution
◦A technique that executes CUDA kernels concurrently.

Concurrent Kernel Execution

Uses the NVIDIA Fermi architecture with CUDA v4.0.

Architecture

Combiner

Runs in domain-0.

Packing kernels
◦Parses the source code.
◦Creates different CUDA streams.
◦Prepares the combined kernel for concurrent execution.

Combining Policies

Chooses kernels in FIFO order.

Sends a batch to the Executor if any of the following holds:
◦The combined kernel will use at least 90% of the GPU resources.
◦There are 16 kernels in the batch.
◦There have been no incoming kernels within the last 10 seconds.
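
The three dispatch conditions can be sketched as a small queue model. This is only an illustrative sketch, not the authors' Combiner: the `Kernel` and `Combiner` classes, the `resource_fraction` field (a per-kernel estimate of the share of GPU resources it needs), and the explicit `now` timestamps are all assumptions introduced here. The slides do not say how resource usage is measured.

```python
from collections import deque
from dataclasses import dataclass

# Thresholds taken from the slide.
RESOURCE_THRESHOLD = 0.90   # combined kernel uses at least 90% of the GPU
MAX_BATCH_SIZE = 16         # at most 16 kernels per batch
IDLE_TIMEOUT_S = 10.0       # no incoming kernels within the last 10 seconds


@dataclass
class Kernel:
    name: str
    # Hypothetical estimate of the fraction of GPU resources this kernel uses.
    resource_fraction: float


class Combiner:
    """Toy model of FIFO batching; names and structure are assumptions."""

    def __init__(self):
        self.pending = deque()      # kernels waiting, oldest first
        self.last_arrival = None    # time of the most recent submission

    def submit(self, kernel, now):
        """Queue a kernel and remember when it arrived."""
        self.pending.append(kernel)
        self.last_arrival = now

    def poll(self, now):
        """Return a FIFO batch if any dispatch condition holds, else None."""
        if not self.pending:
            return None

        # Take kernels in arrival order until a size or resource limit is hit.
        batch = []
        used = 0.0
        for kernel in self.pending:
            batch.append(kernel)
            used += kernel.resource_fraction
            if used >= RESOURCE_THRESHOLD or len(batch) >= MAX_BATCH_SIZE:
                break

        ready = (
            used >= RESOURCE_THRESHOLD
            or len(batch) >= MAX_BATCH_SIZE
            or now - self.last_arrival >= IDLE_TIMEOUT_S
        )
        if not ready:
            return None

        # Remove the dispatched kernels from the front of the queue.
        for _ in batch:
            self.pending.popleft()
        return batch
```

In this sketch a caller would invoke `submit` whenever a VM issues a kernel and call `poll` periodically; a returned list stands for the batch handed to the Executor. Passing `now` explicitly keeps the model deterministic; a real implementation would presumably read a clock instead.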

System Flow

Experiment Setting

Xen 4.1.2 hypervisor

Physical machine
◦Intel Core i5-2400 processor with 4 cores running at 3.40 GHz
◦8 GB memory
◦NVIDIA GTX 560 Ti GPU
◦CUDA 4.0

Virtual machine
◦Dual-core CPU, 1 GB memory, 20 GB disk
◦Ubuntu 12.04 with Linux 3.5 kernel

Performance Evaluation

The ratio decreases as the concurrency increases.
◦However, the decrease is not linear, due to overhead.

Dispatch Overhead

Different Programs Mixture

Conclusion

This paper proposes a GPU virtualization architecture using the NVIDIA Fermi GPU.
◦Reduces execution time.
◦Increases system throughput.

Future work
◦Apply to the NVIDIA Kepler GPU.
◦Better kernel packing policies.
