Reliable and Scalable Internet Telephony

Kundan Singh and Henning Schulzrinne

Internet Real Time Lab – Internal Talk

Sept 24, 2004

Telephone reliability
(PSTN: Public Switched Telephone Network)

[Figure: PSTN structure. Labels: “bearer” network; telephone switch (SSP); database (SCP) for freephone, calling card, …; signaling network (SS7); signaling router (STP).]

- Local telephone switch (class 5 switch): 10,000 customers, 20,000 calls/hour
- Regional telephone switch (class 4 switch): 100,000 customers, 150,000 calls/hour
- Signaling router (STP): 1 million customers, 1.5 million calls/hour
- Database (SCP): 10 million customers, 2 million lookups/hour

Internet telephony
(SIP: Session Initiation Protocol)

[Figure: SIP call flow. Labels: [email protected]@yahoo.com, yahoo.com, example.com, DNS, DB; messages: REGISTER, INVITE; addresses: 192.1.2.4, 129.1.2.3.]

SIP network architecture
Scalability requirement depends on role

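[Figure: SIP network architecture. Labels: carrier network with SIP/PSTN gateways (GW) and SIP/MGC controllers with media gateways (MG) between the IP network and the PSTN; ISPs and a cybercafe on the IP side; a PBX with a PSTN gateway (PSTN GW) serving IP phones and PSTN phones over T1 PRI/BRI.]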
Reliability and scalability
for call routing, registration, conferencing, voicemails

- Requirements
  - Reliable: Mean Time Between Failures (MTBF), Mean Time To Recover (MTTR)
  - Scalable: registration rate, call rate, #requests/s
- Proposed solutions
  - Server redundancy
    - Apply existing web-redundancy designs
    - Evaluate quantitatively (future work)
  - Peer-to-peer
    - Novel P2P-SIP architecture
    - Evaluate quantitatively (future work)
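The MTBF and MTTR requirements are commonly combined into a steady-state availability figure, MTBF / (MTBF + MTTR). The sketch below is not from the talk: the function names and the example figures (1,000 h MTBF, 1 h MTTR) are assumptions for illustration, and the redundancy formula assumes independent failures.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one server: uptime / (uptime + downtime)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, servers: int) -> float:
    """Availability of `servers` replicas when any one of them suffices.

    Assumes failures are independent, which real deployments only approximate.
    """
    return 1.0 - (1.0 - single) ** servers


if __name__ == "__main__":
    a = availability(mtbf_hours=1000.0, mttr_hours=1.0)  # hypothetical figures
    print(f"one server : {a:.6f}")
    print(f"two servers: {redundant_availability(a, 2):.6f}")
```

Shortening MTTR (faster failover) and adding replicas both raise the result, which is why the later slides look at recovery time and the number of servers.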

Server redundancy
The problem: failure or overload

[Figure: REGISTER and INVITE message flows.]

Server redundancy
Replicate registration or search on call

[Figure: REGISTER and INVITE message flows for the two options.]

Server redundancy
Known techniques

- Client-based
  - Cisco phones: primary and backup proxy
- DNS
  - NAPTR, SRV
- IP address takeover
- Database redundancy
- . . .

High availability
Failover in CINEMA

[Figure: two replicated server sets, (P1, D1, web scripts, master/slave) and (P2, D2, web scripts, slave/master), on phone.cs.columbia.edu and sip2.cs.columbia.edu, with database replication between them; the client sends REGISTER.]

Client configuration:
proxy1 = phone.cs
backup = sip2.cs

DNS:
_sip._udp SRV 0 0 5060 phone.cs.columbia.edu
          SRV 1 0 5060 sip2.cs.columbia.edu
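A minimal sketch of how a client could order the two SRV targets for failover: the lowest priority value is tried first, so phone.cs.columbia.edu precedes sip2.cs.columbia.edu. The record values come from the slide; the `SrvRecord` type and `failover_order` helper are hypothetical names, and the weighted random choice among equal-priority records described in RFC 2782 is left out.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def failover_order(records: List[SrvRecord]) -> List[SrvRecord]:
    """Return records in the order a client would try them: lowest priority value first.

    Weighted random selection among equal-priority records is omitted here.
    """
    return sorted(records, key=lambda r: r.priority)


records = [
    SrvRecord(1, 0, 5060, "sip2.cs.columbia.edu"),
    SrvRecord(0, 0, 5060, "phone.cs.columbia.edu"),
]

for r in failover_order(records):
    print(r.target, r.port)
```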

High availability
Time to recover

- Client re-sends INVITE to P2
  - Immediately on ICMP error
  - Or after 10 s otherwise
- sipd has in-memory cache
  - Refresh registration much before expiry
  - Registrations are additive
- Measurement of recovery time
- Optimal #servers

Scalability
Load sharing: redundant proxies and databases

- REGISTER
  - Write to D1 & D2
- INVITE
  - Read from D1 or D2
- Database write/synchronization traffic becomes bottleneck

[Figure: proxies P1, P2, P3 and databases D1, D2, with REGISTER and INVITE flows.]

Scalability
Load sharing: divide the user space

- Proxy and database on the same host
- Stateless proxy can become overloaded
- Hashing
  - Static vs dynamic

[Figure: proxies P1, P2, P3 and databases D1, D2, D3, with the user space split into a-h, i-q and r-z.]
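Dividing the user space can be sketched in two ways: the static alphabetical split shown in the figure (a-h, i-q, r-z) and a hash of the user name. This is an illustration under assumptions, not the CINEMA implementation; the function names, the choice of SHA-1, and the example user names are hypothetical.

```python
import hashlib

# Static split copied from the slide: first letter of the user name picks the server.
RANGES = [("a", "h", "D1"), ("i", "q", "D2"), ("r", "z", "D3")]


def by_first_letter(user: str) -> str:
    """Static partitioning by the first letter of the user name."""
    first = user[0].lower()
    for low, high, server in RANGES:
        if low <= first <= high:
            return server
    raise ValueError(f"no partition for user {user!r}")


def by_hash(user: str, servers: list) -> str:
    """Hash partitioning: spreads users roughly evenly regardless of name distribution."""
    digest = hashlib.sha1(user.lower().encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


print(by_first_letter("alice"))   # D1
print(by_first_letter("kundan"))  # D2
print(by_hash("alice", ["D1", "D2", "D3"]))
```

Modulo hashing remaps most users when the number of servers changes; a dynamic scheme would need to limit that movement.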

Scalability
Comparison of the two designs

Total time per DB

Redundant proxies and databases (every write goes to all databases):
((tr/D) + 1) TN = (A/D) + B

Divided user space (a-h, i-q, r-z):
((tr + 1)/D) TN = (A/D) + (B/D)

Matching terms gives A = trTN and B = TN.

D = number of database servers
N = number of writes (REGISTER)
r = #reads/#writes = (INV+REG)/REG ~ 2
T = write latency
t = read latency/write latency

[Figure: the two designs side by side, each with proxies P1, P2, P3 and their databases.]
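The two expressions can be evaluated side by side to see how each design scales with D. This is a sketch using the slide's symbols; the function names and the input values (1,000 writes, r = 2, 10 ms write latency, t = 0.5) are hypothetical.

```python
def replicated_time(D: int, N: int, r: float, T: float, t: float) -> float:
    """Total time per DB when every write goes to all D databases: ((t*r/D) + 1) * T * N."""
    return ((t * r / D) + 1) * T * N


def partitioned_time(D: int, N: int, r: float, T: float, t: float) -> float:
    """Total time per DB when users are divided among D databases: ((t*r + 1)/D) * T * N."""
    return ((t * r + 1) / D) * T * N


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 writes, r = 2, 10 ms write latency, reads half as costly.
    N, r, T, t = 1000, 2.0, 0.010, 0.5
    for D in (1, 2, 4, 8):
        print(D,
              round(replicated_time(D, N, r, T, t), 3),
              round(partitioned_time(D, N, r, T, t), 3))
```

With D = 1 the two designs cost the same. As D grows, the replicated design cannot drop below B = TN because every database still performs every write, while the partitioned design keeps shrinking.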

Reliability and scalability
Two stage architecture for CINEMA

[Figure: first stage of servers s1, s2, s3; second stage of master/slave pairs a1, a2 (serving a*@example.com) and b1, b2 (serving b*@example.com). Request URIs shown: sip:[email protected], sip:[email protected].]

example.com
_sip._udp SRV 0 0 s1.example.com
          SRV 0 0 s2.example.com
          SRV 0 0 s3.example.com
          SRV 1 0 ex.backup.com

a.example.com
_sip._udp SRV 0 0 a1.example.com
          SRV 1 0 a2.example.com

b.example.com
_sip._udp SRV 0 0 b1.example.com
          SRV 1 0 b2.example.com

Request-rate = f(#stateless, #groups)

Bottleneck: CPU, memory, bandwidth?

Failover latency: ?
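A minimal sketch of the first-stage decision: the proxy looks only at the user part of the request URI and picks the second-stage domain (a.example.com or b.example.com), which is then resolved through its own SRV records. Only groups a and b appear on the slide; the function name and the example URIs are hypothetical.

```python
GROUPS = {"a": "a.example.com", "b": "b.example.com"}  # groups shown on the slide


def second_stage_domain(sip_uri: str) -> str:
    """Pick the second-stage cluster from the first letter of the user part.

    The decision needs no per-call state, so a stateless proxy can make it.
    """
    if not sip_uri.startswith("sip:") or "@" not in sip_uri:
        raise ValueError(f"not a SIP user URI: {sip_uri!r}")
    user = sip_uri[len("sip:"):].split("@", 1)[0]
    try:
        return GROUPS[user[0].lower()]
    except (KeyError, IndexError):
        raise ValueError(f"no second-stage group for {sip_uri!r}") from None


print(second_stage_domain("sip:alice@example.com"))  # a.example.com
print(second_stage_domain("sip:bob@example.com"))    # b.example.com
```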

Server-based vs peer-to-peer

- Server-based
  - Cost: maintenance, configuration
  - Central points of failure
  - Controlled infrastructure (e.g., DNS)
- Peer-to-peer
  - Robust: no central dependency
  - Self organizing, no configuration
  - Scalability ?

[Figure: clients (C) around a server (S), contrasted with interconnected peers (P).]

Related work: Skype
From the KaZaA community

- Host cache of some super nodes
- Bootstrap IP addresses
- Auto-detect NAT/firewall settings
  - STUN and TURN
- Protocol among super nodes – ??
- Allows searching a user (e.g., kun*)
- History of known buddies
- All communication is encrypted
- Promote to super node
  - Based on availability, capacity
- Conferencing

[Figure: network of interconnected peers (P).]

We propose: P2P-SIP

- Unlike server-based SIP architecture
- Unlike proprietary Skype architecture
  - Robust and efficient lookup using DHT
- Interoperability
  - DHT algorithm uses SIP communication
- Hybrid architecture
  - Lookup in SIP+P2P
- Unlike file-sharing applications
  - Data storage, caching, delay, reliability
- Disadvantages
  - Lookup delay and security

P2P-SIP
Background: DHT (Chord)

- Identifier circle
- Keys assigned to successor
- Evenly distributed keys and nodes
- Finger table: log N
  - ith finger points to first node that succeeds n by at least 2^(i-1)
- Stabilization for join/leave

[Figure: Chord identifier circle. Node identifiers shown: 1, 8, 14, 21, 32, 38, 42, 47, 58; key identifiers shown: 10, 24, 30, 38, 54.]

Finger table of node 8:

Key           Node
8+1  = 9      14
8+2  = 10     14
8+4  = 12     14
8+8  = 16     21
8+16 = 24     32
8+32 = 40     42
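The finger table for node 8 can be reproduced with a few lines of code. This is a sketch under assumptions: a 6-bit identifier circle, and a node set read off the garbled figure (1, 8, 14, 21, 32, 38, 42, 47, 58); the function names are illustrative.

```python
from bisect import bisect_left

M = 6  # identifier bits; the circle has 2**M = 64 positions
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 47, 58])  # assumed node set


def successor(key: int) -> int:
    """First node whose identifier is >= key, wrapping around the circle."""
    key %= 2 ** M
    idx = bisect_left(NODES, key)
    return NODES[idx % len(NODES)]


def finger_table(n: int):
    """Finger i (1-based) is successor(n + 2**(i-1)), taken modulo 2**M."""
    return [(n + 2 ** (i - 1), successor(n + 2 ** (i - 1))) for i in range(1, M + 1)]


for start, node in finger_table(8):
    print(f"8+{start - 8:<2} = {start:<2} -> {node}")
```

The six fingers produced for node 8 are 14, 14, 14, 21, 32 and 42, matching the table on the slide.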

P2P-SIP
Design alternatives

- Use DHT in server farm
- Use DHT for all clients; but some are resource limited
- Use DHT among super-nodes
  1. Hierarchy
  2. Dynamically adapt

[Figures: prefix-routing example with identifiers 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1, d46a1c and Route(d46a1c); the Chord identifier circle from the previous slide; servers and clients.]

P2P-SIP
Node architecture: registrar, proxy, user agent

- DHT communication using SIP REGISTER
- Known node: sip:[email protected]
- Unknown node: sip:[email protected]
- User: sip:[email protected]

[Figure: node block diagram. Components: user interface (buddy list, etc.), user location, discover, DHT (Chord), SIP, ICE, RTP/RTCP, codecs, audio devices. Events and messages: on startup; multicast REG; peer found/detect NAT; REG; REG, INVITE, MESSAGE; signup, find buddies; join; find; leave; on reset; signout, transfer; IM, call.]

P2P-SIP
Node startup

- SIP
  - REGISTER with SIP registrar
- DHT
  - Discover peers: multicast REGISTER
  - Join DHT using node-key=Hash(ip)
  - REGISTER with DHT using user-key=Hash([email protected])
- Dialing out
  - Call, instant message, etc.
    INVITE sip:[email protected]
    MESSAGE sip:[email protected]
  - Last seen, SIP NAPTR/SRV, DHT

[Figure: [email protected] sends REGISTER to sipd (with DB) at columbia.edu and detects peers; DHT ring with identifiers 12, 14, 32, 42, 58; REGISTER alice=42; REGISTER bob=12.]
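The two keys used at startup, node-key=Hash(ip) and user-key=Hash(user address), can be sketched as follows. The talk does not name the hash function, so SHA-1 is an assumption here, as are the 6-bit identifier space, the IP addresses, the user address, and the function names. The resulting keys will not match the 42 and 12 shown in the figure.

```python
import hashlib
from bisect import bisect_left

M = 6  # small identifier space for illustration only


def chord_key(value: str, bits: int = M) -> int:
    """Hash a string (IP address or user address) onto the identifier circle."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def responsible_node(key: int, node_keys: list) -> int:
    """Successor of key among the joined nodes, wrapping around the circle."""
    ring = sorted(node_keys)
    return ring[bisect_left(ring, key) % len(ring)]


node_keys = [chord_key(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]  # hypothetical IPs
user_key = chord_key("alice@example.com")  # hypothetical user address
print(user_key, "stored at node", responsible_node(user_key, node_keys))
```

A REGISTER for the user is routed to the node returned by `responsible_node`, and a later INVITE or MESSAGE for the same address hashes to the same key and reaches the same node.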

P2P-SIP
Node leaves

- Graceful leave
  - Un-REGISTER
  - Transfer registrations
- Failure
  - Attached nodes detect and re-REGISTER
  - New REGISTER goes to new super-nodes
  - Super-nodes adjust DHT accordingly

[Figure: DHT with node 42; messages REGISTER key=42, OPTIONS, REGISTER.]
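The difference between a graceful leave and a failure can be shown with a toy in-memory ring. This is a sketch under assumptions, not sippeer's code: the `Ring` class and its methods are hypothetical, the node identifiers are taken from the startup figure, and real nodes would exchange SIP REGISTER messages instead of sharing a dictionary.

```python
from bisect import bisect_left


class Ring:
    """Toy DHT: each node stores the registrations whose key it succeeds."""

    def __init__(self, node_keys):
        self.store = {n: {} for n in sorted(node_keys)}

    def _successor(self, key):
        ring = sorted(self.store)
        return ring[bisect_left(ring, key) % len(ring)]

    def register(self, key, contact):
        self.store[self._successor(key)][key] = contact

    def lookup(self, key):
        return self.store[self._successor(key)].get(key)

    def leave(self, node):
        """Graceful leave: hand all registrations to the next node on the ring."""
        records = self.store.pop(node)
        if self.store:
            self.store[self._successor(node)].update(records)

    def fail(self, node):
        """Failure: registrations are lost until users re-REGISTER."""
        self.store.pop(node)


ring = Ring([12, 14, 32, 42, 58])
ring.register(42, "alice's contact")
ring.leave(42)
print(ring.lookup(42))  # alice's contact, now held by node 58
```

After `fail(42)` the same lookup would return `None` until the user's next REGISTER lands on the new responsible node, which is why recovery time depends on the registration refresh interval.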

P2P-SIP
Implementation

- sippeer: C++, Unix (Linux), Chord
  - Nodes join and form the DHT
  - Node failure is detected and DHT updated
  - Registrations transferred on node shutdown
  - Co-located sipc can use sippeer service

[Figure: DHT ring. Identifiers shown: 1, 9, 11, 15, 19, 25, 26, 29, 30, 31.]

P2P-SIP
Evaluation

- #super-nodes needed depends on
  - Registration refresh rate, replication
  - Join/leave rate, uptime
  - Call arrival rate
  - CPU, memory, bandwidth limits
- Other metrics
  - Call setup latency
  - Recovery time after super-node failure

P2P-SIP
Advanced services and open issues

- Offline messages
  - INVITE or MESSAGE fails => responsible node stores voicemail, instant message
- Conferencing
  - Mixer, full mesh, multicast
- Open issues
  - P2P reputation system
  - Motivation to become super node
  - Security (SPAM, DOS, spy, …)
  - . . .

Server-based vs peer-to-peer

Reliability, failover latency
- Server-based: DNS-based. Depends on client retry timeout, DB replication latency, registration refresh interval.
- Peer-to-peer: DHT self organization and periodic registration refresh. Depends on client timeout, registration refresh interval.

Scalability, number of users
- Server-based: Depends on number of servers in the two stages.
- Peer-to-peer: Depends on refresh rate, join/leave rate, uptime.

Call setup latency
- Server-based: One or two steps.
- Peer-to-peer: O(log(N)) steps.

Security
- Server-based: TLS, digest authentication, S/MIME.
- Peer-to-peer: Additionally needs a reputation system, working around spy nodes.

Maintenance, configuration
- Server-based: Administrator: DNS, database, middle-box.
- Peer-to-peer: Automatic: one time bootstrap node addresses.

PSTN interoperability
- Server-based: Gateways, TRIP, ENUM.
- Peer-to-peer: Interact with server-based infrastructure or co-locate peer node with the gateway.

Summary

- Motivation
  - PSTN is reliable and scalable
  - Can IP telephony do better?
- Server-based
  - DNS, stateless, DB replication, two stage
- Peer-to-peer
  - SIP, DHT, soft state, self organizing

Beyond proxy/registrar
CINEMA: Columbia InterNet Extensible Multimedia Architecture

[Figure: CINEMA servers and the systems around them.]

- sipd: proxy, redirect, registrar server
- sipconf: conference server
- sipum: unified messaging
- siph323: SIP-H.323 translator (H.323, NetMeeting)
- rtspd: media server (RTSP; RTSP clients, Quicktime)
- vxmlcgi (VXML)
- SQL database
- Web server: web based configuration
- SIP/PSTN gateway, department PBX, telephone switch
- Internal telephone extn: 7040; 713x
- PSTN: local/long distance 1-212-5551212

Communication to collaboration

- Synchronous (tightly coupled): video conference, IM, screen sharing, …
- Asynchronous (loosely coupled): file sharing, message board, …
- Messaging and notifications
- Personalized view: per-user calendar, access control, address book

Goal: provide personalized access, alternate between synchronous and asynchronous communication, and access from different devices and clients.
