PASTE: Network Stacks Must Integrate with NVMM Abstractions

Michio Honda (NEC Laboratories Europe/NetApp*), Lars Eggert and Douglas Santry (NetApp)
ACM HotNets 2016, November 9-10, Atlanta, GA
*work done



https://www.hpe.com/us/en/servers/persistent-memory.html

Motivation

• Non-Volatile Main Memories (NVMMs)
  • Persistent
  • Byte-addressable
  • Low latency
    • 10s-1000s of ns
• Shift from block- to byte-granularity persistency
  • OS abstractions
    • Direct access to mmap()-ed files
  • Data structures
    • File systems and databases

What are the implications for networking?

Case Study: Write-Ahead Logging

• Persist the client’s request prior to acknowledgment
  • Mask the overhead of updating the primary database (e.g., a B-tree) from the client
• 1 KB commit
  • 2030 us
  • Networking takes 40 us

[Figure: request path, steps (1)-(5), between client, NIC, network stack, app, storage stack, DRAM and SSD/disk. The app receives with read() and persists with write()/memcpy() followed by fsync()/msync().]
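The write()-then-fsync() path on this slide can be sketched as follows. This is a minimal illustration, not the authors' benchmark code: the length-prefixed record format and the names `log_then_ack` and `replay` are assumptions made for the example.

```python
import os
import tempfile


def log_then_ack(log_fd, request):
    """Append one request to the write-ahead log, make it durable, then ack."""
    # Length-prefixed record; this framing is an assumption for the sketch.
    record = len(request).to_bytes(4, "little") + request
    written = 0
    while written < len(record):      # os.write() may write fewer bytes than asked
        written += os.write(log_fd, record[written:])
    os.fsync(log_fd)                  # block until the record reaches stable storage
    return b"ACK"                     # only now is it safe to acknowledge the client


def replay(path):
    """Read back every logged request, e.g. after a crash."""
    requests = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break
            size = int.from_bytes(header, "little")
            requests.append(f.read(size))
    return requests


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "wal.log")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        for req in (b"SET k1 v1", b"SET k2 v2"):
            log_then_ack(fd, req)
        os.close(fd)
        print(replay(path))
```

The acknowledgment is produced only after fsync() returns, which is why the storage latency sits directly on the client-visible path.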

Case Study: Write-Ahead Logging

• 1 KB commit
  • 2000 us → 42 us
  • Networking takes 40 us
  • This 2 us is not small

[Figure: request path, steps (1)-(5), between client, NIC, network stack, app (read()), storage stack, DRAM and NVMM. NVMM is emulated using a reserved region of DRAM.]
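With NVMM exposed as an mmap()-ed file, the persist step becomes a memory copy followed by a flush. The sketch below uses an ordinary temporary file in place of an NVMM-backed file; `LOG_SIZE`, `open_log` and `append` are hypothetical names chosen for the example.

```python
import mmap
import os
import tempfile

LOG_SIZE = 4 * mmap.ALLOCATIONGRANULARITY   # hypothetical fixed log size


def open_log(path):
    """Create a fixed-size log file and map it read/write (shared mapping)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, LOG_SIZE)
        return mmap.mmap(fd, LOG_SIZE)
    finally:
        os.close(fd)                  # the mapping stays valid after close


def append(log, offset, payload):
    """Copy payload into the mapped log at offset and flush the touched range."""
    end = offset + len(payload)
    if end > len(log):
        raise ValueError("log full")
    log[offset:end] = payload         # the memcpy() step
    # flush() needs an aligned start offset, so round down.
    start = offset - offset % mmap.ALLOCATIONGRANULARITY
    log.flush(start, end - start)     # the msync() step
    return end


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = open_log(os.path.join(d, "log.bin"))
        tail = append(log, 0, b"SET k1 v1")
        print(tail, log[:tail])
        log.close()
```

Even in this form the payload is still copied once into the log, which is the cost the following slides measure.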

Case Study: Write-Ahead Logging

[Figure: two plots, throughput and latency [µs], versus number of concurrent connections (1 to 25). Series: Net. only; Net. + read()/msync() on NVMM (emul.); Net. + memcpy()/msync() on NVMM (emul.)]

• Parallel requests are serialized on each core

33% throughput decrease, 50% latency increase

Data Copies Matter

• Cache misses
  • Copy to a tmp buffer (e.g., read()) is cheap
  • Logging always happens to a different destination

[Figure: read() copies from the kernel buffer to the app buffer; memcpy() copies from the app buffer to the mmap()-ed log file.]

Configuration                                      Overall cache misses   Largest contributor
Networking only                                    0.0004%                net_rx_action() (84%)
Networking + NVMM (read() + memcpy() + msync())    4.4121%                memcpy() (98%)
Networking + NVMM (read() + msync())               8.3451%                sys_read() (99%)

We must avoid data copy for logging!
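As a small user-space analogy only (it does not reproduce the kernel paths measured above), the difference between copying a payload and referencing it in place can be seen with Python's buffer types. The `packet` contents are made up for the example.

```python
packet = bytearray(b"HDRpayload-bytes")   # stand-in for a received packet buffer

copied = bytes(packet[3:])                # slicing a bytearray copies the payload
view = memoryview(packet)[3:]             # a memoryview slice references it in place

packet[3:10] = b"PAYLOAD"                 # overwrite the payload in the buffer
# `copied` still holds the old bytes; `view` sees the change because it
# shares storage with `packet`.
print(copied, bytes(view))
```

A zero-copy logging design relies on the second behaviour: the log refers to the bytes where they already are instead of writing them to a new destination.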

Packet Store (PASTE) Overview

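• Static packet buffers on a named NVMM region
  • DMA to NVMM
• Zero-copy APIs
  • Fast logging

[Figure: request path, steps (1)-(4), between client, NIC, network stack, app, storage stack and NVMM. Packet buffers live in /mnt/nvmm/pktbufs; the app writes metadata only (e.g., buffer index) to /mnt/pmem/appmd.]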

Fast Logging with PASTE

[Figure: in the kernel, the NIC ring and TCP/IP input and output operate on static packet buffers in /mnt/nvmm/pktbufs; the application in user space reaches them through the netmap API and mmap(). Packet buffers are marked unread, read or flushed.]

/mnt/nvmm/myapp_metadata
  metadata header: /mnt/nvmm/pktbufs, buf_ofs: 123
  metadata entries:
    buf_idx  off  len
    0        100  1135
    1        100  932
    3        100  1024

Application steps:
(1) Read data (zero copy)
(2) Write metadata entry
(3) Flush (buffer and metadata)

Because of DDIO (DMA to L3 cache), NVMM latency only arises on flushing data (i.e., step (3)), not on every packet.
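The metadata entries on this slide (buf_idx, off, len) can be sketched as fixed-size records that point into the packet-buffer region. This is an illustration under stated assumptions, not the PASTE implementation: the field widths, `BUF_SIZE`, and the names `write_entry` and `recover` are hypothetical, and plain bytearrays stand in for the mmap()-ed NVMM files, so the flush of step (3) is not shown.

```python
import struct

ENTRY = struct.Struct("<IHH")   # buf_idx, off, len; field widths are assumptions
BUF_SIZE = 2048                 # hypothetical size of one static packet buffer


def write_entry(meta, slot, buf_idx, off, length):
    """Step (2): record where the payload lives instead of copying it."""
    ENTRY.pack_into(meta, slot * ENTRY.size, buf_idx, off, length)


def recover(pktbufs, meta, n_entries):
    """Rebuild the logged payloads from the metadata entries, without copying."""
    bufs = memoryview(pktbufs)
    payloads = []
    for slot in range(n_entries):
        buf_idx, off, length = ENTRY.unpack_from(meta, slot * ENTRY.size)
        start = buf_idx * BUF_SIZE + off
        payloads.append(bufs[start:start + length])
    return payloads


if __name__ == "__main__":
    pktbufs = bytearray(4 * BUF_SIZE)     # stand-in for /mnt/nvmm/pktbufs
    meta = bytearray(16 * ENTRY.size)     # stand-in for the metadata entries
    pktbufs[3 * BUF_SIZE + 100:3 * BUF_SIZE + 105] = b"hello"
    write_entry(meta, 0, 3, 100, 5)
    print(bytes(recover(pktbufs, meta, 1)[0]))
```

Each logged request costs one small metadata write rather than a copy of the full payload, which is the point of keeping packet buffers in a named, persistent region.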

[Figure annotation in a later build of the slide above: idempotent request]


Preliminary Results

• Implementation
  • Extend the netmap framework
  • StackMap for TCP/IP

10-88% throughput increase, 9-46% latency reduction

Related Work

• Enhanced network stacks
  • MegaPipe (OSDI’12), StackMap (ATC’16), Fastsocket (ASPLOS’16)
  • IX and Arrakis (OSDI’14), mTCP (NSDI’14), Sandstorm (SIGCOMM’14)
• NVMM file systems
  • BPFS (SOSP’09), NOVA (FAST’16)
• NVMM databases
  • NVWAL (ASPLOS’16), REWIND (VLDB’15), NV-Tree (FAST’15)

The network stacks are not NVMM-aware; the NVMM file systems and databases are not networking-aware.

Conclusion

• Implications
  • Network stacks are now a bottleneck for durably storing data
  • Improving network and storage stacks in isolation is not enough
  • We need a new stack design

PASTE: Fast logging with named packet buffers on NVMM and zero-copy APIs