
High Performance Storage Service Virtualization

Scott Baker

University of Arizona


What is Virtualization?

Consider an existing client/server scenario:

[Diagram: Client ↔ Server, which provides a Service]

A virtualizer is inserted between the client and the server to provide a better service:

[Diagram: Client ↔ Virtualizer ↔ Server, now providing Service']


Why Virtualization?

• Why not create a newer/better service?
  – Modified clients/servers
  – Lengthy standardization process
  – Slow to integrate into existing infrastructure

• Virtualization offers:
  – Unmodified clients/servers
  – No standardization requirements
  – Rapid integration


Types of Virtualization

• Mutation – change a service into something different
• Aggregation – 2+ servers → 1 big server
• Replication – 2+ servers → 1 more reliable server
• Fortification – vulnerable server → more secure server


Mutation (Gecko)

• The Web uses the HTTP protocol
• Traditional programs use file system semantics
  – open, close, read, write, etc.

• Inconvenient to modify the existing body of applications to use the web

• WWW8, SP&E papers

[Diagram: Linux or Windows client ↔ (NFS) ↔ Gecko ↔ (HTTP) ↔ World Wide Web]
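As a rough illustration of the mutation idea (not Gecko's actual code), a file-style read can be expressed as an HTTP range request. All names and the handle-to-URL mapping below are hypothetical:

#include <stdio.h>
#include <stddef.h>

/* Hypothetical mapping from an opaque file handle to a host and path;
 * a real proxy such as Gecko maintains this mapping internally. */
struct web_object { const char *host; const char *path; };

/* Build an HTTP request that fetches the same bytes a file-system style
 * read(handle, offset, count) call would return. */
int build_http_read(const struct web_object *obj, long offset, long count,
                    char *buf, size_t buflen)
{
    return snprintf(buf, buflen,
                    "GET %s HTTP/1.1\r\n"
                    "Host: %s\r\n"
                    "Range: bytes=%ld-%ld\r\n"
                    "Connection: close\r\n\r\n",
                    obj->path, obj->host, offset, offset + count - 1);
}

The unmodified client keeps issuing open/read calls over NFS; the proxy turns them into web requests like this behind the scenes.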


Aggregation (Mirage)

• Combine 2+ NFS file systems to create one big file system

• Clients are unaware that multiple servers exist
• IEEE LCN paper

[Diagram: Client ↔ (NFS) ↔ Mirage ↔ (NFS) ↔ two NFS servers]


Replication (Mirage)

• Unmodified primary server
• Specialized backup server
• Asymmetric design
  – Commercial primary, commodity backup

• Logging

[Diagram: Client ↔ (NFS) ↔ Mirage; Mirage ↔ (NFS) ↔ Primary Server and Mirage ↔ (NFS + BP) ↔ Backup Server]
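A minimal sketch of the asymmetric idea, assuming Mirage forwards every request to the unmodified primary and logs mutating operations for the specialized backup; all types and helpers below are hypothetical, as the slide does not give implementation details:

#include <stdbool.h>
#include <stdint.h>

struct nfs_request { uint32_t proc; /* NFS procedure number */ };

/* Assumed helpers: pass the request on to the primary server, and append
 * it to the log that the backup server (BP) consumes. */
void forward_to_primary(const struct nfs_request *req);
void append_to_backup_log(const struct nfs_request *req);

/* NFSv2 procedures that modify state: SETATTR(2), WRITE(8), CREATE(9),
 * REMOVE(10), RENAME(11), LINK(12), SYMLINK(13), MKDIR(14), RMDIR(15). */
static bool is_mutating(uint32_t proc)
{
    return proc == 2 || (proc >= 8 && proc <= 15);
}

void handle_request(const struct nfs_request *req)
{
    forward_to_primary(req);             /* primary stays unmodified */
    if (is_mutating(req->proc))
        append_to_backup_log(req);       /* replayed on the backup   */
}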


Fortification (Mirage)

• Fortify a server against DoS attacks

• Several ideas
  – Prevent faulty requests from reaching servers
  – Use scheduling to ensure fairness
  – Push authentication to the border of the network

• Currently work in progress


Mirage in the Network

• Mirage could be located in:
  – Client (patch OS, or user-mode daemon)
  – Server (patch app, or user-mode daemon)
  – Router (unmodified clients, servers)
• Mirage is a router, not an application
  – Rewrite packets on-the-fly and forward
  – Benefits: stateless, low overhead


NFS Basics

• NFS uses “handles” to identify objects
  – Lookup(par_handle, “name”) → chi_handle
  – Read(chi_handle) → data
• Handles are opaque to clients
• Handles are 32 bytes (NFSv2) or 64 bytes (NFSv3)
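For illustration, a minimal C sketch of handle-based access; the wrapper signatures are hypothetical (the real protocol is defined over Sun RPC/XDR):

#include <stdint.h>
#include <stddef.h>

/* NFSv2 file handles are opaque 32-byte blobs; NFSv3 allows up to 64 bytes. */
typedef struct { uint8_t data[32]; } nfs_handle;

/* Hypothetical client-side wrappers for the LOOKUP and READ procedures. */
int nfs_lookup(const nfs_handle *par_handle, const char *name,
               nfs_handle *chi_handle);               /* -> child handle */
int nfs_read(const nfs_handle *chi_handle, uint64_t offset,
             void *buf, size_t count);                /* -> file data    */

/* Typical use (hypothetical names): walk from a directory handle to a
 * child, then read it:
 *   nfs_lookup(&dir_handle, "report.txt", &file_handle);
 *   nfs_read(&file_handle, 0, buf, sizeof buf);
 */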


Aggregation Issues

• Make 2+ servers look like 1 big server

• Key problem
  – 2 servers may generate the same handle for different objects
  – Client will be confused
• Solution
  – Virtual handles


Virtual and Physical Handles

• Virtual handles exist between clients and Mirage
• Physical handles exist between Mirage and servers

[Diagram: Client ↔ Mirage ↔ Server #1 and Server #2; virtual handles are used between Client and Mirage, physical handles between Mirage and the servers]


VFH Contents

• Mirage decides what to put in the VFH
• The VFH is composed of:
  – PIN (Physical Inode Number)
  – PFS (Physical File System Number)
  – SID (Server ID)
  – VIN (Virtual Inode Number)
  – HVC (Handle Verification Checksum)
  – MCH (Mount Checksum)

• (PIN, PFS, SID) uniquely identifies a file
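A minimal sketch of how these fields could be packed into a 32-byte (NFSv2-sized) virtual handle; the field widths and ordering are assumptions, not Mirage's actual layout:

#include <stdint.h>

struct vfh {
    uint32_t pin;     /* Physical Inode Number        */
    uint32_t pfs;     /* Physical File System Number  */
    uint32_t sid;     /* Server ID                    */
    uint32_t vin;     /* Virtual Inode Number         */
    uint32_t hvc;     /* Handle Verification Checksum */
    uint32_t mch;     /* Mount Checksum               */
    uint8_t  pad[8];  /* pad out to the 32-byte NFSv2 handle size */
};

Because (PIN, PFS, SID) travels inside every virtual handle, a lost mapping can be reconstructed later (see the recovery slide).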


Data Structures

• Transaction Table (TT)
  – Entry created during request
  – Entry deleted during reply
  – Remembers NFS proc number, client ID
• Handle Table (HT)
  – VFH ↔ PFH mappings

• Tables are Soft State
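A minimal sketch of the two soft-state tables; the keys and field sizes are assumptions, not the actual Mirage data structures:

#include <stdint.h>

struct tt_entry {          /* Transaction Table (TT): one entry per in-flight request */
    uint32_t xid;          /* RPC transaction ID, assumed to be the lookup key        */
    uint32_t nfs_proc;     /* NFS procedure number of the request                     */
    uint32_t client_id;    /* identifies the client that issued the request           */
};

struct ht_entry {          /* Handle Table (HT): VFH <-> PFH mapping                  */
    uint8_t  vfh[32];      /* virtual handle handed out to clients                    */
    uint8_t  pfh[32];      /* physical handle used by the backend server              */
    uint32_t sid;          /* which server the physical handle belongs to             */
};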


Request / Reply Processing

• On requests:
  – Lookup VFH in HT → yields PFH
  – Rewrite VFH in request with PFH
  – Forward to server (SID tells which one)
• On replies:
  – Lookup PFH in HT → yields VFH (create a new mapping if necessary)
  – Rewrite PFH in reply with VFH
  – Forward to client
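A minimal sketch of the rewrite path described above; the packet view, table entries, and helper functions are hypothetical (error handling omitted), not Mirage's real API:

#include <string.h>
#include <stdint.h>

struct nfs_packet { uint32_t xid, nfs_proc, client_id; uint8_t handle[32]; };
struct tt_entry { uint32_t xid, nfs_proc, client_id; };
struct ht_entry { uint8_t vfh[32], pfh[32]; uint32_t sid; };

struct ht_entry *ht_lookup_by_vfh(const uint8_t vfh[32]);
struct ht_entry *ht_lookup_by_pfh(const uint8_t pfh[32]);
struct ht_entry *ht_insert_new_mapping(const uint8_t pfh[32]);
void tt_insert(uint32_t xid, uint32_t proc, uint32_t client);
struct tt_entry *tt_remove(uint32_t xid);
void forward_to_server(struct nfs_packet *pkt, uint32_t sid);
void forward_to_client(struct nfs_packet *pkt, uint32_t client_id);

/* Request path: translate the client's virtual handle into the server's
 * physical handle and remember the transaction. */
void on_request(struct nfs_packet *pkt)
{
    struct ht_entry *e = ht_lookup_by_vfh(pkt->handle);   /* VFH -> PFH       */
    tt_insert(pkt->xid, pkt->nfs_proc, pkt->client_id);   /* TT soft state    */
    memcpy(pkt->handle, e->pfh, sizeof e->pfh);           /* rewrite handle   */
    forward_to_server(pkt, e->sid);                       /* SID picks server */
}

/* Reply path: translate the physical handle back into a virtual one,
 * creating a new mapping if this object has not been seen before. */
void on_reply(struct nfs_packet *pkt)
{
    struct tt_entry *t = tt_remove(pkt->xid);             /* match to request */
    struct ht_entry *e = ht_lookup_by_pfh(pkt->handle);   /* PFH -> VFH       */
    if (e == NULL)
        e = ht_insert_new_mapping(pkt->handle);           /* new mapping      */
    memcpy(pkt->handle, e->vfh, sizeof e->vfh);           /* rewrite handle   */
    forward_to_client(pkt, t->client_id);
}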


Router Failure / Recovery

• If the router fails, TT and HT are lost
• Clients will retry any ops in progress
  – TT state is regenerated automatically
• Recover HT state from fields in the VFH
  – Extract (PIN, PFS, SID)
  – Search servers for (PIN, PFS, SID) to get the PFH
  – Similar to BASE

• Periodically checkpoint HT to servers
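A minimal sketch of rebuilding one Handle Table entry after a crash, assuming the vfh layout sketched earlier and hypothetical helpers for the per-server search and the table insert:

#include <stdint.h>

/* Assumed helpers: search_server() asks server 'sid' for the physical handle
 * of inode 'pin' in file system 'pfs'; ht_insert() re-creates the mapping. */
int  search_server(uint32_t sid, uint32_t pin, uint32_t pfs, uint8_t pfh_out[32]);
void ht_insert(const struct vfh *v, const uint8_t pfh[32], uint32_t sid);

/* Rebuild a lost VFH -> PFH mapping from the fields embedded in the VFH. */
int recover_mapping(const struct vfh *v)
{
    uint8_t pfh[32];
    if (search_server(v->sid, v->pin, v->pfs, pfh) != 0)
        return -1;                 /* object no longer exists on the server */
    ht_insert(v, pfh, v->sid);     /* re-create the Handle Table entry      */
    return 0;
}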


Prototypes

• User-mode process
  – Linux operating system / commodity HW
  – Proof of concept
  – Demonstrates aggregation & replication
  – UDP sockets
• IXP2400 network processor
  – High performance
  – Possible production system
  – Subject of ongoing/future work


IXP2400 Overview

• 1 StrongARM CPU (general purpose, Linux OS)
• 8 microengine CPUs (packet processing)

[Diagram: an internal bus connects the StrongARM CPU, microengines me0–me7, two gigabit Ethernet interfaces, DRAM and SRAM memory, the scratchpad, and the hash and CAP units]


Microengine CPU Properties

• Lots of registers
  – 256 GPRs, 128 NN registers, 512 memory-I/O registers

• Special packet-processing instruction set

• Multithreading support
  – 8 threads per microengine
  – Zero context-switch overhead

• Asynchronous memory I/O

• Fast-path processing


Memory

• DRAM: 64 MB / 300 cycles
  – Direct I/O to and from the network interface
• SRAM: 8 MB / 150 cycles
  – Supports atomic operations
  – Built-in “queues” with atomic dequeue, get, put
• Scratchpad: 16 KB / 60 cycles
  – Supports atomic operations
  – Built-in “rings” with atomic get/put

• Local per-microengine: 2560 B / 3 cycles


IXP Issues

• Divide Mirage functionality across microengines
• Control interface between the StrongARM and the microengines
• Optimize microengine code

[Diagram: packets in → Receiver → Classifier → NfsReq / NfsRep microengines → Transmitter → packets out, with the StrongARM CPU attached for control]


Benchmark Configuration

• Two IXP boards: Benchmark and Mirage
• Attempt a given throughput and measure the actual throughput achieved

[Diagram: the Benchmark IXP board runs Flood, FakeServer, and Count; the Mirage IXP board runs NfsReq and NfsRep. Requests from Flood are rewritten by Mirage, answered by FakeServer, and the rewritten replies are counted by Count]

(note: transmit, receive, classifier microengines not shown)


Loopback Configuration

• Simulates a router without Mirage

[Diagram: Flood, FakeServer, and Count on the Benchmark IXP board; the Mirage IXP board simply loops requests and replies back unmodified]


IXP Performance

[Graph: actual throughput vs. attempted throughput (packets/sec, up to 600,000) for getattr-loop, getattr-mirage, write-loop, and write-mirage]


Analysis

• User-mode Mirage
  – 40,000 packets/second
  – Read/write bandwidth of about 320 Mbps
• IXP Mirage
  – 290,000 packets/second
  – Read/write bandwidth exceeds gigabit line speed (in theory, approx. 2.4 Gbps)
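These figures are consistent with an assumed payload of roughly 1 KB of read/write data per packet (an assumption; the slides do not state the packet size):

  40,000 packets/s  × ~1 KB ≈ 40 MB/s  ≈ 320 Mbps
  290,000 packets/s × ~1 KB ≈ 290 MB/s ≈ 2.3 Gbps (close to the quoted 2.4 Gbps)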


Status

• Completed
  – User-mode Mutation (Gecko), Aggregation, Replication, Fortification
  – IXP Aggregation
• To-do
  – IXP performance tuning
  – Finish IXP benchmarks
  – IXP Replication?
  – IXP Gecko?
  – SOSP paper


Publications

• Scott Baker and John Hartman, “The Gecko NFS Web Proxy,” Proceedings of the Eighth International Conference on World Wide Web, 1999.
• Scott Baker and Bongki Moon, “Distributed Cooperative Web Servers,” Proceedings of the Eighth International Conference on World Wide Web, 1999.
• Scott Baker and John Hartman, “The design and implementation of the Gecko NFS Web Proxy,” Software: Practice and Experience, June 2001.
• Scott Baker, John Hartman, and Ian Murdock, “Swarm: Agent-Based Storage,” The 2004 International Conference on Software Engineering Research and Practice, Las Vegas, Nevada, June 2004.
• Scott Baker and John Hartman, “The Mirage NFS Router,” The 29th IEEE Conference on Local Computer Networks, Tampa, FL, November 2004.