remote procedure call an effective primitive for distributed computing seth james nielson
TRANSCRIPT
Remote Procedure Call
An Effective Primitive for Distributed Computing
Seth James Nielson
What is RPC?
Procedure calls transfer control within local memory
RPCs transfer control to remote machines
[Figure: call stack holding Main, Proc A, and Proc B, with unused space above]
Why RPC?
RPC is an effective primitive for distributed systems because of:
– Clean/simple semantics
– Communication efficiency
– Generality
How it Works (Idealized Example)
[Figure: the CLIENT runs localCall(), then calls c = encrypt(msg) and waits. A Request carries the call to the SERVER, which has specialized hardware and the encryption key and runs the encrypt(msg) implementation. A Response returns the result, and the client continues with localCall().]
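The idealized exchange above can be sketched in a few lines of Python. This is only an illustration, not the mechanism of any real RPC system: the names `encrypt_impl`, `server_dispatch`, and `client_encrypt`, the JSON message format, and the toy XOR "cipher" are all assumptions made for the example, and the network is replaced by a direct function call.

```python
import json

SERVER_KEY = 0x5A  # hypothetical key that only the server holds


def encrypt_impl(msg: str) -> str:
    """Server-side implementation (a toy XOR cipher, for illustration only)."""
    return bytes(b ^ SERVER_KEY for b in msg.encode("utf-8")).hex()


# Server side: maps procedure names to implementations.
PROCEDURES = {"encrypt": encrypt_impl}


def server_dispatch(request: bytes) -> bytes:
    """Unpack a request, run the named procedure, and pack the response."""
    call = json.loads(request.decode("utf-8"))
    result = PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode("utf-8")


def client_encrypt(msg: str, transport=server_dispatch) -> str:
    """Client stub: reads like a local call, but packs a request and waits."""
    request = json.dumps({"proc": "encrypt", "args": [msg]}).encode("utf-8")
    response = transport(request)  # stands in for send + wait on the network
    return json.loads(response.decode("utf-8"))["result"]


c = client_encrypt("hi")  # the caller's view: c = encrypt(msg)
```

The caller never sees the request or response; only the stub and the dispatcher know that the call crossed a machine boundary.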
Early History of RPC
1976: early reference in literature
1976-1984: few full implementations
Feb 1984: Cedar RPC
– A. Birrell, B. Nelson at Xerox
– “Implementing Remote Procedure Calls”
Imagine our Surprise…
“In practice, … several areas [of RPC] were inadequately understood”
RPC Design Issues
1. Machine/communication failures
2. Address-containing arguments
3. Integration into existing systems
4. Binding
5. Suitable protocols
6. Data integrity/security
Birrell and Nelson Aims
Primary Aim
– Easy distributed computation
Secondary Aims
– Efficient (with powerful semantics)
– Secure
Fundamental Decisions
1. No shared address space among computers
2. Semantics of remote procedure calls should be as close as possible to local procedure calls
Note that the first decision partially violates the second…
Binding
Binds an importer to exporter
Interface name: type/instance
Uses Grapevine DB to locate appropriate exporter
Bindings (based on unique ID) break if exporter crashes and restarts
Unique ID
At binding, importer learns of exported interface’s Unique ID (UID)
The UID is initialized by a real-time clock on system start-up
If the system crashes and restarts, the UID will be a new unique number
The change in UID breaks existing connections
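A minimal sketch of how a clock-derived UID can invalidate old bindings follows. The `Exporter` and `Importer` classes, the `restart` method, and the injected fake clock are hypothetical names chosen for this example; they are not taken from Cedar.

```python
import time


class Exporter:
    """Hypothetical exporter whose UID is drawn from a clock at start-up."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock
        self.uid = clock()

    def restart(self):
        # A crash and restart re-reads the clock, so the UID changes.
        self.uid = self._clock()

    def handle_call(self, binding_uid, proc, *args):
        # Reject calls made through a binding to a previous incarnation.
        if binding_uid != self.uid:
            raise ConnectionError("stale binding: exporter has restarted")
        return proc(*args)


class Importer:
    """Hypothetical importer that remembers the UID it learned at binding."""

    def __init__(self, exporter):
        self.exporter = exporter
        self.uid = exporter.uid

    def call(self, proc, *args):
        return self.exporter.handle_call(self.uid, proc, *args)


ticks = iter([1000, 2000])  # fake clock, so the example is deterministic
server = Exporter(clock=lambda: next(ticks))
client = Importer(server)
first = client.call(len, "abc")  # succeeds: the UIDs match

server.restart()  # simulate a crash and restart
try:
    client.call(len, "abc")
    stale = False
except ConnectionError:
    stale = True  # the old binding no longer works
```

The importer has to bind again to learn the new UID, which is how a restart becomes visible to callers instead of being silently ignored.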
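How Cedar RPC works
[Figure: binding and call sequence across the Caller Machine (User, User Stub, RPCRuntime), Grapevine, and the Callee Machine (RPCRuntime, Server Stub, Server).
Export: the Server calls export, the Server Stub passes it to RPCRuntime, which records it and updates Grapevine (setConnect, addmember).
Import: the User calls import, the User Stub passes it to RPCRuntime, which queries Grapevine (getConnect, lookup), then binds (bind(A,B)); the callee looks up its record and returns.
Call: x=F(y) becomes F=>3 in the User Stub and is transmitted; the callee checks 3, maps 3=>F, and runs F(y).]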
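The export, import, and call steps in the figure can be sketched as below. This is an in-memory illustration under stated assumptions: `Registry` merely stands in for the Grapevine database, and `ServerRuntime`, `import_interface`, and the `"Demo"` interface are hypothetical names, not Cedar's API.

```python
class Registry:
    """In-memory stand-in for the Grapevine database (hypothetical)."""

    def __init__(self):
        self._exports = {}

    def update(self, interface_type, instance, address):
        self._exports[(interface_type, instance)] = address

    def lookup(self, interface_type, instance):
        return self._exports[(interface_type, instance)]


class ServerRuntime:
    def __init__(self, registry):
        self.registry = registry
        self.table = []  # entry-point number -> procedure

    def export(self, interface_type, instance, procedures):
        self.table = list(procedures)
        self.registry.update(interface_type, instance, self)

    def receive(self, entry, args):
        if not 0 <= entry < len(self.table):  # the "check 3" step
            raise LookupError("unknown entry point")
        return self.table[entry](*args)  # 3 => F, then F(y)


def import_interface(registry, interface_type, instance, names):
    """Bind to an exporter and return a stub that maps names to numbers."""
    server = registry.lookup(interface_type, instance)
    index = {name: i for i, name in enumerate(names)}

    def call(name, *args):
        return server.receive(index[name], args)  # F => 3, then transmit

    return call


def F(y):
    return y * 2


registry = Registry()
ServerRuntime(registry).export("Demo", "instance-1", [abs, len, str, F])
call = import_interface(registry, "Demo", "instance-1", ["abs", "len", "str", "F"])
x = call("F", 21)  # x = F(y), dispatched as entry point 3
```

Sending a small integer instead of a procedure name keeps the call packet short and makes dispatch on the callee a single table lookup.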
Packet-Level Transport Protocol
Primary goal: minimize time between initiating the call and getting results
NOT general – designed for RPC
Why? possible 10X performance gain
No upper bound on waiting for results
Error Semantics: User does not know if machine crashed or network failed
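Creating RPC-enabled Software
[Figure: the Developer writes User Code, Server Code, and Interface Modules. Lupine generates the User Stub and Server Stub from the Interface Modules. User Code, User Stub, and RPCRuntime form the Client Program on the Client Machine; Server Code, Server Stub, and RPCRuntime form the Server Program on the Server Machine.]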
Making it Faster
Simple Calls (common case): all of the arguments fit in a single packet
A server reply and a 2nd RPC each operate as an implicit ACK
Explicit ACKs are required if a call lasts longer or there is a longer interval between calls
Simple Calls
[Figure: message sequence between CLIENT and SERVER]
CLIENT → SERVER: Call
SERVER → CLIENT: Response/ACK
CLIENT → SERVER: Call/ACK
SERVER → CLIENT: Response/ACK
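The implicit acknowledgements in the simple-call exchange can be sketched as a small piece of server-side state. The `ServerConnection` class and its per-caller sequence numbers are assumptions made for this illustration; the sketch omits timers, explicit ACKs, and everything else a real transport needs.

```python
class ServerConnection:
    """Per-caller shared state (hypothetical sketch of implicit ACKs)."""

    def __init__(self, procedure):
        self.procedure = procedure
        self.last_seq = 0
        self.saved_result = None  # kept until the caller acknowledges it

    def on_call(self, seq, arg):
        if seq == self.last_seq:
            # Retransmitted call: the result was probably lost, so resend
            # the saved copy instead of running the procedure again.
            return ("result", seq, self.saved_result)
        if seq < self.last_seq:
            return None  # old duplicate, ignore
        # A new call implicitly ACKs the previous result, which is replaced.
        self.last_seq = seq
        self.saved_result = self.procedure(arg)
        # The result packet in turn implicitly ACKs this call packet.
        return ("result", seq, self.saved_result)


executions = []


def increment(n):
    executions.append(n)
    return n + 1


conn = ServerConnection(increment)
r1 = conn.on_call(1, 10)   # Call -> Response/ACK
dup = conn.on_call(1, 10)  # retransmission: answered from saved state
r2 = conn.on_call(2, 20)   # Call/ACK -> Response/ACK
```

In the common case only two packets cross the network per call, because each packet doubles as the acknowledgement of the one before it.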
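Complex Calls
[Figure: message sequence between CLIENT and SERVER]
CLIENT → SERVER: Call (pkt 0)
SERVER → CLIENT: ACK pkt 0
CLIENT → SERVER: Data (pkt 1)
SERVER → CLIENT: ACK pkt 1
CLIENT → SERVER: Data (pkt 2)
SERVER → CLIENT: Response/ACK
CLIENT → SERVER: ACK or New Call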
Keeping it Light
A connection is just shared state
Reduce process creation/swapping
– Maintain idle server processes
– Each packet has a process identifier to reduce swap
– Full scheme results in no processes created/four process swaps per call
RPC directly on top of Ethernet
Elapsed Time Performance
Number of Args/Results    Time
0                         1097 µs
100                       1278 µs
100 word array            2926 µs
THE NEED FOR SPEED
RPC performance cost is a barrier (Cedar RPC requires about 1.1 ms for a 0-arg call!)
Peregrine RPC (about nine years later) manages a 0-arg call in 573 µs!
A Few Definitions
Hardware latency – Sum of call/result network penalty
Network penalty – Time to transmit (greater than…)
Network transmission time – Raw Network Speed
Network RPC – RPC between two machines
Local RPC – RPC between separate threads
Peregrine RPC
Supports full functionality of RPC
Network RPC performance close to HW latency
Also supports efficient local RPC
Messing with the Guts
Three General Optimizations
Three RPC-Specific Optimizations
General Optimizations
1. Transmitted arguments avoid copies
2. No conversion for client/server with the same data representation
3. Use of packet header templates that avoid recomputation per call
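The third optimization can be illustrated with Python's `struct` module. The header layout here (source id, destination id, call sequence number, payload length) is a made-up example, not Peregrine's actual packet format, and `make_template` and `fill_header` are hypothetical helper names.

```python
import struct

# Hypothetical layout: source id, destination id, call sequence number,
# and payload length, packed big-endian with no padding (10 bytes).
HEADER = struct.Struct("!HHIH")
PER_CALL_OFFSET = 4  # the first two fields (2 + 2 bytes) never change


def make_template(src: int, dst: int) -> bytearray:
    """Build the header once per connection; per-call fields start at zero."""
    return bytearray(HEADER.pack(src, dst, 0, 0))


def fill_header(template: bytearray, seq: int, length: int) -> bytearray:
    """Patch only the fields that change from one call to the next."""
    struct.pack_into("!IH", template, PER_CALL_OFFSET, seq, length)
    return template


template = make_template(src=1, dst=2)
header = fill_header(template, seq=7, length=100)
```

The constant fields are written once when the connection is set up, so each call pays only for the sequence number and length.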
RPC-Specific Optimizations
1. No thread-specific state is saved between calls in the server
2. Server arguments are mapped (not copied)
3. No copying in the critical path of multi-packet arguments
I think this is COOL
To avoid copying arguments from a single-packet RPC, Peregrine arranges instead to use the packet buffer itself as the server thread’s stack
Any pointers are replaced with server-appropriate pointers (Cedar RPC didn’t support this…)
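Python cannot turn a packet buffer into a thread's stack, but the underlying idea of reading arguments in place rather than copying them out can be shown loosely with a `memoryview`. This is an analogy only: the 10-byte header size and the `args_in_place` helper are assumptions for the example and say nothing about how Peregrine lays out its packets.

```python
import struct

HEADER_SIZE = 10  # hypothetical fixed header length in bytes


def args_in_place(packet: bytearray) -> memoryview:
    """Return a view of the argument bytes without copying them."""
    return memoryview(packet)[HEADER_SIZE:]


# A received packet: a zeroed header followed by two 32-bit integers.
packet = bytearray(HEADER_SIZE) + struct.pack("!ii", 6, 7)
args = args_in_place(packet)
a, b = struct.unpack_from("!ii", args)  # read straight from the packet
shares_storage = args.obj is packet     # the view refers to the same buffer
```

Because the view shares storage with the packet, no second copy of the arguments ever exists, which is the property the slide describes.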
This is cool too
Multi-packet RPCs use blast protocol (selective retransmission)
Data is transmitted in parallel with data copy
Last packet is mapped into place
Fast Multi-Packet Receive
[Figure: four packets, each with a header (Header 0 to Header 3) and data (Data 0 to Data 3). Packets 1-3 data are copied into the buffer at the server. The packet 0 buffer (sent last) is remapped at the server, aligned on a page boundary.]
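Selective retransmission, the core of a blast protocol, can be sketched as follows. The `blast_send` function and the `channel` callback are hypothetical; the sketch does not model Peregrine's packet ordering, its overlap of transmission with copying, or the page remapping shown in the figure.

```python
def blast_send(packets, channel, max_rounds=10):
    """Send all packets, then resend only those reported missing.

    `channel(seq, data)` is a hypothetical delivery function that returns
    True if the packet arrived and False if it was lost.
    """
    received = {}
    missing = list(range(len(packets)))
    for _ in range(max_rounds):
        for seq in missing:  # blast: no waiting for per-packet ACKs
            if channel(seq, packets[seq]):
                received[seq] = packets[seq]
        # The receiver reports the gaps; only those are retransmitted.
        missing = [s for s in range(len(packets)) if s not in received]
        if not missing:
            return b"".join(received[s] for s in range(len(packets)))
    raise TimeoutError("transfer did not complete")


attempts = []


def lossy(seq, data):
    """Test channel that drops packet 2 the first time it is sent."""
    attempts.append(seq)
    return not (seq == 2 and attempts.count(2) == 1)


data = blast_send([b"aa", b"bb", b"cc", b"dd"], lossy)
```

In this run packet 2 is lost once, so the second round resends that packet alone rather than the whole sequence.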
Peregrine 0-Arg Performance
System            Latency (µsec)   Throughput (Mbps)
Cedar             1097             2.0
Amoeba**          1100             6.4
x-kernel          1730             7.1
V-System          2540             4.4
Firefly (5 CPU)   2660             4.6
Sprite            2800             5.7
Firefly (1 CPU)   4800             2.5
SunRPC**          6700             2.7
Peregrine         573              8.9
Peregrine Multi-Packet Performance
Procedure (Bytes)        Network Penalty (ms)   Latency (ms)   Throughput (Mbps)
3000 byte in RPC         2.71                   3.20           7.50
3000 byte in-out RPC     5.16                   6.04           7.95
48000 byte in RPC        40.96                  43.33          8.86
48000 byte in-out RPC    81.66                  86.29          8.90
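The throughput column can be reproduced from the byte counts and latencies in the table. The short check below assumes that an in-out call moves the stated number of bytes in each direction, so twice the bytes in total; that reading is an assumption, though the figures are consistent with it.

```python
def throughput_mbps(total_bytes: int, latency_ms: float) -> float:
    """Bits moved divided by elapsed time, in megabits per second."""
    return total_bytes * 8 / (latency_ms / 1000) / 1_000_000


# (procedure, total bytes moved, latency in ms) taken from the table
rows = [
    ("3000 byte in RPC", 3000, 3.20),
    ("3000 byte in-out RPC", 6000, 6.04),
    ("48000 byte in RPC", 48000, 43.33),
    ("48000 byte in-out RPC", 96000, 86.29),
]

computed = {name: round(throughput_mbps(n, ms), 2) for name, n, ms in rows}
```

Rounded to two decimals, the results are 7.5, 7.95, 8.86, and 8.9 megabits per second, matching the table's throughput column.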
Cedar RPC Summary
Cedar RPC introduced practical RPC
– Demonstrated easy semantics
– Identified major design issues
– Established RPC as effective primitive
Peregrine RPC Summary
Same RPC semantics (with addition of pointers)
Significantly faster than Cedar RPC and others
General optimizations (e.g., pre-computed headers)
RPC-Specific (e.g., no copying in multipacket critical path)
Observations
RPC is a very “transparent” mechanism – it acts like a local call
However, RPC requires a deep understanding of hardware to tune
In short, RPC requires sophistication in its presentation as well as its operation to be viable