LIO AND TCMU1
LIO and the TCMU Userspace Passthrough: The Best of Both Worlds
Andy Grover <[email protected]>
@iamagrover
March 11, 2015
What is LIO?
Multi-protocol in-kernel SCSI target
Unlike other targets such as IET, tgt, and SCST, LIO is entirely kernel code.
[Diagram: LIO architecture. Kernel: fabrics (iscsi, tcm_fc) receive iSCSI commands and feed the LIO core, which uses backstores (block, file, pscsi) backed by /dev/vg0/vol0, disk.img, and /dev/sdc. User: targetcli and rtslib handle configuration through /sys/kernel/config/target.]
Why add userspace command handling?
● Enable wider variety of backstores without kernel code
● Clustered network storage filesystems like Ceph, GlusterFS, & other things that have shared libraries available
● File formats beyond .img, such as qcow2 & vmdk for more interesting features & compatibility
● SCSI devices beyond mass storage
● Enable experimentation just like FUSE did for filesystems
Userspace handling challenges: perf
● I/O latency
● I/O throughput
● Parallelism within a dev (OoO cmd completion)
● Parallelism across devs, we're good.
Userspace handling challenges: usability
● Configuration as simple as existing backstores
● Userspace daemon failure
● Userspace daemon activation/restart
● Balance ultimate flexibility with common use
● Avoid “multiple personalities”
● Reasonable resource usage
[Diagram: LIO architecture (fabrics iscsi and tcm_fc, LIO core, backstores block, file, and pscsi, with targetcli and rtslib configuring through /sys/kernel/config/target) plus a tcm-user backstore marked NEW!. tcm-user exposes uio0 and uio1 to a userspace cmd handling daemon, which does the SCSI command processing.]
User/Kernel Communication
● tcm-user (from LIO core) exposes uio0 to the cmd handling daemon
● /sys/class/uio/uio0: read to get configuration
● /dev/uio0: read (wait for cmd), write (cmds done), mmap (get SMR)
● dev add/remove (netlink)
Shared Memory Region Layout (not to scale)
● Mailbox: cmdr_off, cmdr_size, cmd_head, cmd_tail
● Command Ring: cmd_entry, ... (opcode, iovec[])
● Data Area: in/out data
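The head/tail handshake on the command ring can be illustrated with a small in-memory simulation. This is only a sketch: the field widths, the fixed 16-byte entry format, and the helper names (`make_region`, `kernel_post`, `daemon_poll`) are hypothetical and do not reproduce the kernel's actual structures. It assumes the kernel advances cmd_head when it queues a command, the daemon advances cmd_tail when it finishes one, and both offsets wrap at cmdr_size.

```python
import struct

# Hypothetical layout for illustration only; the real mailbox and
# command entry structures are defined by the kernel headers.
MAILBOX_FMT = "<IIII"            # cmdr_off, cmdr_size, cmd_head, cmd_tail
MAILBOX_SIZE = struct.calcsize(MAILBOX_FMT)
ENTRY_FMT = "<BxxxIQ"            # opcode, data offset, data length
ENTRY_SIZE = struct.calcsize(ENTRY_FMT)


def make_region(ring_entries):
    """Build a region holding a mailbox followed by an empty command ring."""
    cmdr_size = ring_entries * ENTRY_SIZE
    region = bytearray(MAILBOX_SIZE + cmdr_size)
    struct.pack_into(MAILBOX_FMT, region, 0, MAILBOX_SIZE, cmdr_size, 0, 0)
    return region


def kernel_post(region, opcode, data_off, data_len):
    """Simulated kernel side: write an entry at cmd_head, then advance it."""
    cmdr_off, cmdr_size, head, tail = struct.unpack_from(MAILBOX_FMT, region, 0)
    new_head = (head + ENTRY_SIZE) % cmdr_size
    if new_head == tail:
        return False             # ring full (one slot is always kept empty)
    struct.pack_into(ENTRY_FMT, region, cmdr_off + head, opcode, data_off, data_len)
    struct.pack_into("<I", region, 8, new_head)    # cmd_head is at byte 8
    return True


def daemon_poll(region):
    """Simulated daemon side: consume entries from cmd_tail up to cmd_head."""
    cmdr_off, cmdr_size, head, tail = struct.unpack_from(MAILBOX_FMT, region, 0)
    done = []
    while tail != head:
        done.append(struct.unpack_from(ENTRY_FMT, region, cmdr_off + tail))
        tail = (tail + ENTRY_SIZE) % cmdr_size
    struct.pack_into("<I", region, 12, tail)       # cmd_tail is at byte 12
    return done
```

In a real daemon the region would come from mmap on the uio device, and the read and write calls on that device would take the place of polling and of publishing cmd_tail by hand.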
tcm-user merged in 3.18
● Initial patch as simple as possible
● Performance tuning not done, BUT an interface flexible enough to not block likely perf optimization strategies
● Acceptance enabled phase 2: userspace usability pieces
Performance Opportunities for Later
● Larger vmalloc()ed shared mem region
● -> Demand-allocate pages to avoid bloat
● Block size == PAGE_SIZE preferred
● Complete commands out of order
● Must handle data area fragmentation
● Fabrics copy into already-mapped pages
● Just moves the cache misses?
● Fabrics don't have per-lun device queues anyway. Just IB reimplemented poorly?
● Userspace busywait on ring cmd_head
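To see why out-of-order completion means the data area has to cope with fragmentation, consider a toy allocator that treats the data area as a set of fixed-size blocks. The `DataArea` class and its methods are hypothetical and exist only for this illustration; nothing here reflects how the kernel actually manages the region.

```python
class DataArea:
    """Toy block allocator for a data area; all names are hypothetical."""

    def __init__(self, num_blocks):
        self.used = [False] * num_blocks

    def alloc(self, count):
        """Return `count` free block indices, not necessarily contiguous."""
        free = [i for i, in_use in enumerate(self.used) if not in_use]
        if len(free) < count:
            return None
        blocks = free[:count]
        for i in blocks:
            self.used[i] = True
        return blocks

    def free(self, blocks):
        """Release the blocks held by a completed command."""
        for i in blocks:
            self.used[i] = False

    def largest_free_run(self):
        """Length of the longest contiguous run of free blocks."""
        best = run = 0
        for in_use in self.used:
            run = 0 if in_use else run + 1
            best = max(best, run)
        return best


area = DataArea(8)
first = area.alloc(2)     # blocks 0-1
second = area.alloc(2)    # blocks 2-3
third = area.alloc(2)     # blocks 4-5
area.free(second)         # the middle command completes first
```

At this point four blocks are free but the longest contiguous run is two, so a four-block command can only be served from scattered blocks. Describing each command's data with an iovec[] is one way to make that workable.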
Our user-kernel API ended up flexible, but fraught with danger!
● Ring operations easy to mess up
● Make every handler write daemon boilerplate?
● And support Netlink?
● And maybe D-Bus???
tcmu-runner: A standard handler daemon
● Handle the messy ring bits
● Expose a C plugin API
● Implement library routines for common handler code, e.g. mandatory SCSI commands
● Permissively licensed: Apache 2.0
● Doubles as sample code for do-it-yourself-ers
● Needed a prototype daemon in any case
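As an illustration of the kind of shared routine meant by "library routines for common handler code", the sketch below answers two commands that every block device has to support. The function names are hypothetical and are not tcmu-runner's API (which is C); the opcode values and the 8-byte READ CAPACITY (10) payload layout follow the SCSI specifications.

```python
import struct

TEST_UNIT_READY = 0x00
READ_CAPACITY_10 = 0x25
GOOD = 0x00   # SCSI status code for success


def emulate_read_capacity_10(num_lbas, block_size):
    """Return the 8-byte payload: last LBA and block length, big-endian."""
    # Devices too large for 32 bits report 0xFFFFFFFF here, and the
    # initiator is expected to follow up with READ CAPACITY (16).
    last_lba = min(num_lbas - 1, 0xFFFFFFFF)
    return struct.pack(">II", last_lba, block_size)


def handle_common(cdb, num_lbas, block_size):
    """Handle commands shared by all handlers.

    Returns (status, payload), or None when the command should be
    passed on to the device-specific handler.
    """
    opcode = cdb[0]
    if opcode == TEST_UNIT_READY:
        return GOOD, b""
    if opcode == READ_CAPACITY_10:
        return GOOD, emulate_read_capacity_10(num_lbas, block_size)
    return None
```

A handler for a new backing format would then only need to implement the commands that actually touch its data, such as reads and writes.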
[Diagram: LIO architecture (fabrics iscsi and tcm_fc, LIO core, backstores block, file, and pscsi, with targetcli and rtslib configuring through /sys/kernel/config/target) plus the tcm-user backstore. tcm-user exposes uio0 and uio1 to tcmu-runner, marked NEW!, which hosts handlers: tape, vmdk, mmc, smc, glfs.]
Handlers Need Config Info
● All configuration should still be through standard LIO mechanisms (i.e. targetcli)
● User backstore create includes a URL-like “configstring” that gives handler and per-device handler-specific stuff
● This is published in uio sysfs, and netlink add_device message
● Also has info to allow going from uio dev back to matching LIO backstore
● Handler needs attribs, block size, etc.
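A handler-side sketch of using that published information follows. It assumes the uio name has the form `tcm-user/<hba_num>/<device_name>/<subtype>/<path>`, as described in the kernel's TCMU design document rather than on these slides, and that configfs is mounted in the usual place. The function names and the sample name string are hypothetical.

```python
from collections import namedtuple

TcmuName = namedtuple("TcmuName", "hba_num device_name subtype path")


def parse_uio_name(name):
    """Split an assumed 'tcm-user/<hba_num>/<device_name>/<subtype>/<path>' name."""
    parts = name.strip().split("/", 4)
    if len(parts) < 4 or parts[0] != "tcm-user" or not parts[1].isdigit():
        return None   # not a TCMU-backed uio device
    path = parts[4] if len(parts) == 5 else ""
    return TcmuName(int(parts[1]), parts[2], parts[3], path)


def configfs_dir(info, root="/sys/kernel/config/target"):
    """Map a uio device back to its LIO backstore directory (assumed layout)."""
    return f"{root}/core/user_{info.hba_num}/{info.device_name}"


# Hypothetical contents of /sys/class/uio/uio0/name
info = parse_uio_name("tcm-user/1/vol0/file//var/images/disk.img\n")
```

The subtype selects the handler, the remaining path is passed to it untouched, and the configfs directory is where attributes such as block size would be looked up.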
Config tools need Info on Handlers Too!
● Users should call “backstores/foo create x y z” not “backstores/user <configstring>”
● targetcli needs param and help strings for xyz
● Verify backstore params are correct before creating the device
● Must be loosely coupled – no hard dependencies on targetcli or tcmu-handler
● Solution: D-Bus!
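The loose coupling asked for here can be sketched without committing to any IPC mechanism: each handler publishes its parameter names, help strings, and a check function, and the config tool works only from that description. Everything in this sketch (`HandlerInfo`, `registry`, the `file` handler and its `path` parameter, `create_backstore`) is hypothetical. In the design on this slide the same information would travel over D-Bus rather than through a shared Python dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class HandlerInfo:
    subtype: str
    params: List[Tuple[str, str]]             # (name, help string) pairs
    check: Callable[[Dict[str, str]], str]    # returns "" if OK, else a reason


def check_file(cfg):
    """Hypothetical validation for a file-backed handler."""
    if not cfg.get("path", "").startswith("/"):
        return "path must be absolute"
    return ""


# Handlers register themselves; the config tool never imports handler code.
registry = {
    "file": HandlerInfo("file", [("path", "absolute path to the image")], check_file),
}


def create_backstore(subtype, **cfg):
    """Validate parameters, then build the configstring for the user backstore."""
    info = registry.get(subtype)
    if info is None:
        raise ValueError(f"no handler registered for {subtype!r}")
    missing = [name for name, _ in info.params if name not in cfg]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    reason = info.check(cfg)
    if reason:
        raise ValueError(reason)
    return subtype + "/" + "/".join(cfg[name] for name, _ in info.params)
```

With this shape the user types per-handler parameters, the tool can print help for each one, and a bad value is rejected before any device is created.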
[Diagram: LIO architecture (fabrics iscsi and tcm_fc, LIO core, backstores block, file, and pscsi, with targetcli and rtslib configuring through /sys/kernel/config/target) plus the tcm-user backstore, which exposes uio0 and uio1 to tcmu-runner and its handlers (tape, vmdk, mmc, smc, glfs). A DBus link between the configuration tools and tcmu-runner is marked NEW!.]
The QEMU Question
● QEMU has great support for many image formats and other backstores
● Can we reuse or integrate somehow?
● How?
● Build qemu handler code separately and integrate as a tcmu-runner handler?
● Extend qemu to implement TCMU directly, and possibly also enable it to configure LIO exports?
● Deferred for now
Getting Involved
● Give feedback!
● [email protected], [email protected]
● Check out tcmu-runner: https://github.com/agrover/tcmu-runner and its included sample handlers.
● Use GitHub PRs and issue tracking.
● Much help needed, esp. QEMU hackers!
● Start doing some TCMU performance benchmarking
● Start thinking of interesting device types, userspace libraries to use, or weird things to do with a SCSI command sandbox
Thanks! Questions?
[Diagram: LIO architecture (fabrics iscsi and tcm_fc, LIO core, backstores block, file, pscsi, and tcm-user with tcmu-runner, configured through /sys/kernel/config/target) together with the management tools rtslib, targetcli, targetd, liblvm, and lsmcli (libstoragemgmt), split into Local and Remote. One element is marked NEW!.]
[Diagram: User/Kernel Communication, repeated with tcmu-runner in place of the cmd handling daemon. tcmu-runner reads configuration from /sys/class/uio/uio0 and uses read (wait for cmd), write (cmds done), and mmap (get SMR) on /dev/uio0. The shared memory region provided by tcm-user holds the mailbox (cmdr_off, cmdr_size, cmd_head, cmd_tail), the command ring (cmd_entry with opcode and iovec[]), and the data area (in/out data).]
SCSI
A set of standards for physically connecting and transferring data between computers and peripheral devices*
* http://en.wikipedia.org/wiki/SCSI
SCSI target
Initiator sends commands, target handles them
Multi-protocol SCSI target
SCSI commands & data can be sent over many types of physical links and protocols, e.g.:
● SCSI Parallel Interface (original)
● iSCSI (over TCP/IP)
● SAS (over SATA cables)
● FCP (over Fibre Channel)
● FCoE (over Ethernet)
● SRP (over InfiniBand)
● SBP-2 (over FireWire)
[Diagram: tgtd architecture. tgtd runs in userspace, receives iSCSI commands from the initiator, and is configured with tgtadm. It is backed by local files or block devices (disk.img, /dev/vg0/vol0), with libglfs and librbd marked NEW!.]
[Diagram: LIO architecture (fabrics iscsi and tcm_fc, LIO core, backstores block, file, pscsi, and tcm-user, configured through /sys/kernel/config/target) with tcmu-runner in userspace. Shown alongside tcmu-runner: librbd, libglfs, and qemu-lio-tcmu. Formats and backends listed: qcow2, vmdk, vdi, rbd*, glfs*, others. Marked Coming Soon!.]