
Page 1: Assessment of  Data Path Implementations for  Download and Streaming

Assessment of Data Path Implementations for Download and Streaming

Pål Halvorsen

Page 2: Assessment of  Data Path Implementations for  Download and Streaming

Visit at Technische Universität Braunschweig, March 2007

Overview

RELAY overview???

Existing mechanisms in Linux

Tested enhancements

Ongoing

Summary and Conclusions

Page 3: Assessment of  Data Path Implementations for  Download and Streaming

RELAY: Resource Utilization in Large-Scale Time-Dependent Systems

Page 4: Assessment of  Data Path Implementations for  Download and Streaming

Picture Today

[Figure: networks interconnecting VoD, WWW, P2P and live event services; performance??]

Page 5: Assessment of  Data Path Implementations for  Download and Streaming

RELAY

[Figure labels: Application, Transport, Network, Link, Phys; User Space, Kernel, Drivers, Hardware]

• System support for improved resource utilization & QoS
• Multimedia (game and video) servers
• …

Some current areas

• protocols for interactive applications
• multicast group maintenance
• latency hiding
• resource availability adaptation
• hybrid P2P streaming / streaming to mobile devices
• asymmetric multiprocessor scheduling
• …

Page 6: Assessment of  Data Path Implementations for  Download and Streaming

Linux Data Path Implementations

Page 7: Assessment of  Data Path Implementations for  Download and Streaming

Delivery Systems

Network

bus(es)

Page 8: Assessment of  Data Path Implementations for  Download and Streaming

Delivery Systems

[Figure: application in user space; file system and communication system in kernel space; bus(es) underneath]

Page 9: Assessment of  Data Path Implementations for  Download and Streaming

Intel Hub Architecture

[Figure: Pentium 4 processor (registers, cache(s)); memory controller hub with RDRAM; I/O controller hub with PCI slots, network card and disk. Data moves between disk, file system, application, communication system and network card.]

Several in-memory data movements and context switches

Page 10: Assessment of  Data Path Implementations for  Download and Streaming

Cost of Data Transfers

Data copy operations are expensive
− consume CPU, memory, hub, bus and interface resources (proportional to size)
− profiling shows that ~40% of CPU time is consumed by copying data in a disk-network scenario
− the speed gap between memory and CPU is increasing
− different access times to different banks

System calls cause a lot of switches between user and kernel space
− ~450 ns on 933 MHz Pentium III
− ~920 ns on 1.7 GHz Pentium 4

Page 11: Assessment of  Data Path Implementations for  Download and Streaming

Observation and Question

A lot of research has been performed in this area!!!!

IO-Lite, splice, MMBUF, stream, sendfile, …

BUT, what is the status today of commodity OSes?

Page 12: Assessment of  Data Path Implementations for  Download and Streaming

Content Download

[Figure: application in user space; file system and communication system in kernel space; bus(es) underneath]

Page 13: Assessment of  Data Path Implementations for  Download and Streaming

Content Download: read / send

[Figure: disk → (DMA transfer) → page cache → (read, copy) → application buffer → (send, copy) → socket buffer → (DMA transfer) → network card]

2n copy operations
2n system calls
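The read / send path can be sketched in a few lines of Python. This is only an illustration, not the code used for the measurements; the function name `send_with_read` and the 4096-byte block size are assumptions. Each loop iteration issues one read and one send, which gives the 2n system calls (and two copies per block) counted above.

```python
import os
import socket


def send_with_read(path, sock, block_size=4096):
    """Send a file with the classic read/send loop.

    Returns (read_calls, send_calls), counting only calls that moved data;
    the final read that reports end of file is not counted.
    """
    read_calls = 0
    send_calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            # Copy 1: page cache -> application buffer.
            block = os.read(fd, block_size)
            if not block:
                break  # end of file
            read_calls += 1
            # Copy 2: application buffer -> socket buffer.
            # sendall normally maps to a single send for small blocks.
            sock.sendall(block)
            send_calls += 1
    finally:
        os.close(fd)
    return read_calls, send_calls
```

For a file of n blocks the function returns (n, n), matching the 2n system calls on the slide.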

Page 14: Assessment of  Data Path Implementations for  Download and Streaming

Content Download: mmap / send

[Figure: disk → (DMA transfer) → page cache, mapped into the application with mmap; send copies from the page cache to the socket buffer → (DMA transfer) → network card]

n copy operations
1 + n system calls
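A minimal Python sketch of the mmap / send variant follows; the function name `send_with_mmap` and the block size are assumptions made for illustration. The file is mapped once, and each send then copies straight from the mapped pages into the socket buffer, so there is one mapping call plus one send per block.

```python
import mmap
import socket


def send_with_mmap(path, sock, block_size=4096):
    """Map the file once, then send it block by block.

    Returns the number of send calls; the single mmap call is extra.
    The file must not be empty, since an empty file cannot be mapped.
    """
    send_calls = 0
    with open(path, "rb") as f:
        # One mmap call for the whole file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            with memoryview(mapped) as view:
                for offset in range(0, len(view), block_size):
                    # Single copy: mapped page cache pages -> socket buffer.
                    sock.sendall(view[offset:offset + block_size])
                    send_calls += 1
    return send_calls
```

The memoryview slices avoid creating an intermediate bytes object in user space, which is the point of this variant.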

Page 15: Assessment of  Data Path Implementations for  Download and Streaming

Content Download: sendfile

[Figure: disk → (DMA transfer) → page cache; sendfile appends a descriptor to the socket buffer; gather DMA transfer → network card]

0 copy operations
1 system call
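From user space, the sendfile path looks roughly like the sketch below, which uses Python's `os.sendfile` wrapper on Linux. The function name `send_with_sendfile` is an assumption. The loop only exists because the call may transfer fewer bytes than requested; in the ideal case on the slide it runs once.

```python
import os
import socket


def send_with_sendfile(path, sock):
    """Send a whole file with sendfile; returns the number of sendfile calls."""
    calls = 0
    with open(path, "rb") as f:
        remaining = os.fstat(f.fileno()).st_size
        offset = 0
        while remaining > 0:
            # The kernel moves data from the page cache to the socket;
            # nothing is copied through a user-space buffer.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, remaining)
            calls += 1
            if sent == 0:
                break  # file shrank underneath us; stop instead of spinning
            offset += sent
            remaining -= sent
    return calls
```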

Page 16: Assessment of  Data Path Implementations for  Download and Streaming

Content Download: Results

Tested transfer of 1 GB file on Linux 2.6
Both UDP (with enhancements) and TCP

[Figure: result plots for UDP and TCP]

Page 17: Assessment of  Data Path Implementations for  Download and Streaming

Streaming

[Figure: application in user space; file system and communication system in kernel space; bus(es) underneath]

Page 18: Assessment of  Data Path Implementations for  Download and Streaming

Streaming: read / send

[Figure: disk → (DMA transfer) → page cache → (read, copy) → application buffer → (send, copy) → socket buffer → (DMA transfer) → network card]

2n (3n) copy operations
2n system calls

Page 19: Assessment of  Data Path Implementations for  Download and Streaming

Streaming: read / writev

[Figure: disk → (DMA transfer) → page cache → (read, copy) → application buffer → (writev, copy) → socket buffer → (DMA transfer) → network card; one additional copy is marked in the diagram]

3n copy operations
2n system calls
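The read / writev path for streaming can be sketched as below. The helper names, the 1024-byte payload size, the SSRC value and the way the timestamp is derived are all assumptions for illustration; only the 12-byte fixed RTP header layout (version 2, no CSRC entries) follows the RTP specification. Each packet costs one read and one writev, where writev gathers the header and the payload into a single datagram.

```python
import os
import socket
import struct


def rtp_header(seq, timestamp, ssrc, payload_type=96):
    """Build a minimal 12-byte RTP header: version 2, no padding,
    no extension, no CSRC entries, marker bit cleared."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def stream_read_writev(path, sock, payload_size=1024, ssrc=0x1234):
    """Stream a file as RTP-style packets; returns the number of packets sent."""
    packets = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            payload = os.read(fd, payload_size)   # read: page cache -> user buffer
            if not payload:
                break
            # Hypothetical timestamp: byte offset of the payload in the file.
            header = rtp_header(packets, packets * payload_size, ssrc)
            # writev: header and payload gathered into one packet.
            os.writev(sock.fileno(), [header, payload])
            packets += 1
    finally:
        os.close(fd)
    return packets
```

On a datagram socket each writev call produces exactly one packet, so the receiver sees the header and payload together.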

Page 20: Assessment of  Data Path Implementations for  Download and Streaming

Streaming: mmap / send

[Figure: file mapped with mmap; per packet: cork, send, send, uncork; two copies into the socket buffer (one from the application buffer, one from the page cache); DMA transfers from disk and to network card]

2n copy operations
1 + 4n system calls

Page 21: Assessment of  Data Path Implementations for  Download and Streaming

Streaming: mmap / writev

[Figure: file mapped with mmap; per packet: one writev copying from the application buffer and from the page cache into the socket buffer; DMA transfers from disk and to network card]

2n copy operations
1 + n system calls

Page 22: Assessment of  Data Path Implementations for  Download and Streaming

Streaming: sendfile

[Figure: disk → (DMA transfer) → page cache; per packet: cork, send (copy from application buffer to socket buffer), sendfile (append descriptor), uncork; gather DMA transfer → network card]

n copy operations
4n system calls
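The cork / send / sendfile / uncork sequence can be sketched as follows. The slides stream RTP over UDP; this sketch uses a TCP socket with `TCP_CORK` only because Python's socket module exposes that option directly (Linux has an analogous `UDP_CORK` option for UDP sockets). The function name, payload size, SSRC and timestamp choice are assumptions.

```python
import os
import socket
import struct


def stream_cork_sendfile(path, sock, payload_size=1024, ssrc=0x1234):
    """Per packet: cork, send header, sendfile payload, uncork.

    Returns the number of system calls issued for the packets.
    Linux only, and `sock` must be a connected TCP socket.
    """
    syscalls = 0
    seq = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        for offset in range(0, size, payload_size):
            count = min(payload_size, size - offset)
            # Minimal RTP header; the timestamp is a hypothetical byte offset.
            header = struct.pack("!BBHII", 0x80, 96, seq & 0xFFFF,
                                 offset & 0xFFFFFFFF, ssrc)
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)  # cork
            sock.sendall(header)        # send: the one copy per packet
            syscalls += 2
            sent = 0
            while sent < count:         # sendfile: payload, no user-space copy
                n = os.sendfile(sock.fileno(), f.fileno(), offset + sent, count - sent)
                syscalls += 1
                if n == 0:
                    raise OSError("file shrank while streaming")
                sent += n
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)  # uncork: flush
            syscalls += 1
            seq += 1
    return syscalls
```

When every sendfile call completes in one go, the function returns 4n for n packets, which is the count on the slide.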

Page 23: Assessment of  Data Path Implementations for  Download and Streaming

Streaming: Results

Tested streaming of 1 GB file on Linux 2.6
RTP over UDP

[Figure: result plot; TCP sendfile (content download) shown for comparison]

Compared to not sending an RTP header over UDP, we get an increase of 29% (additional send call)

More copy operations and system calls are required, so there is potential for improvements
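To put the per-packet counts from the streaming slides side by side, the small helper below tabulates them for a given number of packets n. The counts are taken from the slides; the function names and the 1024-byte payload size used to derive n for a 1 GB file are assumptions, since the slides do not state the packet size.

```python
def existing_streaming_costs(n):
    """Copy operations and system calls for n packets, as listed on the slides."""
    return {
        # The slide lists "2n (3n)" copies for read/send; 2n is used here.
        "read/send":   {"copies": 2 * n, "syscalls": 2 * n},
        "read/writev": {"copies": 3 * n, "syscalls": 2 * n},
        "mmap/send":   {"copies": 2 * n, "syscalls": 1 + 4 * n},
        "mmap/writev": {"copies": 2 * n, "syscalls": 1 + n},
        "sendfile":    {"copies": n,     "syscalls": 4 * n},
    }


def packets_for(file_size, payload_size):
    """Number of packets needed to carry file_size bytes (rounded up)."""
    return (file_size + payload_size - 1) // payload_size


if __name__ == "__main__":
    n = packets_for(2**30, 1024)  # 1 GB file, hypothetical 1024-byte payloads
    for name, cost in existing_streaming_costs(n).items():
        print(f"{name:12s} copies={cost['copies']:>9d} syscalls={cost['syscalls']:>9d}")
```

No variant reaches the 0 copies and 1 system call of sendfile in the download case, which is the gap the results point at.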

Page 24: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming Data Paths

Page 25: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: mmap / msend

[Figure: file mapped with mmap; per packet: cork, send (copy from application buffer), msend (append descriptor), uncork; msend replaces the second send and its copy; gather DMA transfer from the page cache to the network card]

msend allows sending data from an mmap'ed file without a copy

n copy operations
1 + 4n system calls

Page 26: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: mmap / rtpmsend

[Figure: file mapped with mmap; rtpmsend replaces the cork, send, msend, uncork sequence; one copy from the application buffer; descriptor appended to the socket buffer; gather DMA transfer to the network card]

RTP header copy integrated into msend system call

n copy operations
1 + n system calls

Page 27: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: mmap / krtpmsend

[Figure: krtpmsend replaces rtpmsend; an RTP engine in the kernel feeds the socket buffer; descriptor appended; gather DMA transfer to the network card]

An RTP engine in the kernel adds RTP headers

0 copy operations
1 system call

Page 28: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: rtpsendfile

[Figure: rtpsendfile replaces the cork, send, sendfile, uncork sequence; one copy from the application buffer; descriptor appended to the socket buffer; gather DMA transfer to the network card]

RTP header copy integrated into sendfile system call

n copy operations
n system calls

Page 29: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: krtpsendfile

[Figure: krtpsendfile replaces rtpsendfile; an RTP engine in the kernel feeds the socket buffer; descriptor appended; gather DMA transfer to the network card]

An RTP engine in the kernel adds RTP headers

0 copy operations
1 system call

Page 30: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: Results

Tested streaming of 1 GB file on Linux 2.6
RTP over UDP

[Figure: result plot comparing TCP sendfile (content download), the existing mechanism (streaming), mmap based mechanisms and sendfile based mechanisms; annotations: ~27% improvement and ~25% improvement]

Page 31: Assessment of  Data Path Implementations for  Download and Streaming

Ongoing Work

Page 32: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: rtpsendfile

[Figure: rtpsendfile; one copy from the application buffer; descriptor appended to the socket buffer; gather DMA transfer to the network card]

n copy operations
n system calls

Calls like writev, sendfilev, … exist

Page 33: Assessment of  Data Path Implementations for  Download and Streaming

Enhanced Streaming: sendfilew

[Figure: sendfilew takes a list of block descriptors (len, off, src_fd, flags); one copy from the application buffer; descriptors appended to the socket buffer; gather DMA transfer to the network card]

Batched system call enabling an arbitrary interleaving of blocks from files and user-space buffers to be sent as one or more packets
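The interface idea can be illustrated with a user-space emulation. This is not the proposed system call: it copies data and issues several system calls, so it shows only the shape of the descriptor list. The field names follow the slide (len, off, src_fd, flags); the `END_OF_PACKET` flag, the convention that `src_fd = -1` marks a user-space buffer, and the `buffer` field are assumptions.

```python
import os
import socket
from dataclasses import dataclass
from typing import Optional

END_OF_PACKET = 0x1  # hypothetical flag: this block closes the current packet


@dataclass
class Block:
    length: int                     # len
    offset: int                     # off
    src_fd: int                     # file descriptor, or -1 for a user buffer (assumption)
    flags: int = 0                  # flags
    buffer: Optional[bytes] = None  # user-space data when src_fd is -1


def emulate_sendfilew(sock, blocks):
    """Send interleaved file and buffer blocks as one or more packets.

    Returns the number of packets sent.
    """
    packets = 0
    parts = []
    for block in blocks:
        if block.src_fd < 0:
            # Block taken from a user-space buffer, e.g. a packet header.
            parts.append(block.buffer[block.offset:block.offset + block.length])
        else:
            # Block taken from a file at the given offset.
            parts.append(os.pread(block.src_fd, block.length, block.offset))
        if block.flags & END_OF_PACKET:
            sock.sendmsg(parts)  # gather all parts into one packet
            parts = []
            packets += 1
    if parts:                    # flush a trailing packet without the flag
        sock.sendmsg(parts)
        packets += 1
    return packets
```

A caller would describe a whole batch of packets, alternating a header block from a buffer with a payload block from the file, and hand the list over in one call.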

Page 34: Assessment of  Data Path Implementations for  Download and Streaming

Conclusions

sendfile works well for download scenarios

Current commodity operating systems still pay a high price for streaming services

However, small changes in the system call layer might be sufficient to remove most of the overhead

In conclusion, commodity operating systems still have potential for improvement with respect to streaming support

What can we hope to be supported?

Page 35: Assessment of  Data Path Implementations for  Download and Streaming

Questions??