Scaling SIP Servers. Sankaran Narayanan, joint work with CINEMA team. IRT Group Meeting – April 17, 2002


Page 1: Scaling SIP Servers

Scaling SIP Servers

Sankaran Narayanan
Joint work with CINEMA team

IRT Group Meeting – April 17, 2002

Page 2: Scaling SIP Servers

Agenda
• Introduction
• Issues in scaling
• Facets of sipd architecture
• Some results
• Conclusion and future work

Page 3: Scaling SIP Servers

Introduction – SIP servers
• SIP signaling: proxy, redirect
• Proxies
  • Call routing by contact location
  • UDP/TCP/TLS
  • Stateful or stateless
  • Programmable scripts
• User location: registrars

[Figure: SQL database]

Page 4: Scaling SIP Servers

What is scale?
• Large call volumes, commodity hardware [Schu0012:Industrial]
• Response times (mean, deviation), turnaround time
• Goals
  • Delay budget [SIPstone]: R2 < 2 s, R1 < 500 ms
  • Class-5 switches handle > 750K BHCA

[Figure: message flow diagrams for REGISTER / 200 OK and for a proxied INVITE / 180 / 200 / ACK call, with the measured intervals R1 and R2 marked.]
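
To put the BHCA (busy-hour call attempts) figure into per-second terms, the conversion is a plain division by 3600. A minimal sketch; the function name is only illustrative:

```python
SECONDS_PER_HOUR = 3600


def bhca_to_calls_per_second(bhca: float) -> float:
    """Convert busy-hour call attempts into an average per-second rate."""
    return bhca / SECONDS_PER_HOUR


# The class-5 switch figure quoted on the slide.
class5_rate = bhca_to_calls_per_second(750_000)
print(f"{class5_rate:.1f} call attempts/s")  # prints "208.3 call attempts/s"
```

This is an average over the busy hour; it says nothing about burstiness within that hour.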

Page 5: Scaling SIP Servers

Limits to scaling
• Not CPU bound
  • Network I/O – blocking
  • Wait for responses
  • Latency: contact, DNS lookups
• OS resource limits
  • Open files (<= 1024 on Unix)
  • LWPs (Solaris) vs. user-kernel threads (Linux, Windows)
• Try not to…
  • Customize and recompile the OS
  • Move (parts of) the server into the kernel (khttpd, AFPA, …)
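
The open-files limit can be inspected from inside a process. The sketch below uses Python's standard `resource` module (Unix only) to read the soft and hard limits; it is an illustration of the limit the slide mentions, not something taken from sipd.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, which may be the 'no limit' sentinel."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = open_file_limits()
print("soft limit:", describe(soft))
print("hard limit:", describe(hard))
```

Every socket counts against this limit, which is why a one-socket-per-transaction design runs out of descriptors long before it runs out of CPU.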

Page 6: Scaling SIP Servers

The problem
• Scaling CPU-bound jobs (throughput = 1/delay)
  • Hardware: CPU speed, RAM, …
  • Software: better OS, scheduler, …
  • Algorithm: optimize protocol processing
• Blocking (network, disk I/O) is expensive
• Hypothesis
  • Turn the I/O-bound server into a CPU-bound one; reduce blocking
  • Optimized resource usage – stability at high loads

Page 7: Scaling SIP Servers

Facets of sipd architecture
• Blocking
• Process models
• Socket management
• Protocol processing

Page 8: Scaling SIP Servers

Blocking
• Mutex, event (socket, timeout), fread
  • Queue builds up
  • Potentially high variability
  • Tandem queue system
• Easy to fix
  • Non-blocking calls (event driven, later!)
  • Move queue to different thread (lazy logger)

[Figure: Logger { lock; write; unlock; }]
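
The "lazy logger" idea can be sketched as follows: request threads only enqueue a line, and a single background thread does the slow write. This is a minimal illustration, not sipd's code; the class name `LazyLogger` and its methods are hypothetical.

```python
import io
import queue
import threading


class LazyLogger:
    """Request threads enqueue log lines; one background thread writes them."""

    _STOP = object()  # sentinel that tells the writer thread to exit

    def __init__(self, sink):
        self._sink = sink              # any object with a write() method
        self._lines = queue.Queue()    # unbounded, so log() does not block
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def log(self, line):
        """Called on the request path: only an enqueue, no file I/O."""
        self._lines.put(line)

    def _drain(self):
        """Writer thread: the only place that touches the sink."""
        while True:
            item = self._lines.get()
            if item is self._STOP:
                break
            self._sink.write(item + "\n")

    def close(self):
        """Flush everything queued so far, then stop the writer thread."""
        self._lines.put(self._STOP)
        self._writer.join()


sink = io.StringIO()
logger = LazyLogger(sink)
for i in range(3):
    logger.log(f"request {i} handled")
logger.close()
print(sink.getvalue(), end="")
```

The queue has not disappeared; it has moved off the request path, so a slow disk delays log output rather than call processing.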

Page 9: Scaling SIP Servers

Blocking (2)
• Call routing involves (≥ 1) contact lookups
  • 10 ms per query (approx)
• Cache
  • Works well for sipd style servers
  • Fetch-on-demand with replacement (harder)
  • Loading entire database is easy
  • Need for refresh – long lived servers
  • Potentially useful for DNS SRV lookups (?)

[Figure: SQL database feeding a cache through periodic refresh; cache access labeled < 1 ms.]
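
The "load the entire database, refresh periodically" variant might look like the sketch below. The `load_all` callable stands in for the SQL query and is an assumption, as are the class and method names; the demo uses an in-memory dict in place of a database.

```python
import threading


class ContactCache:
    """In-memory snapshot of the contact table, reloaded on a timer."""

    def __init__(self, load_all, refresh_seconds):
        self._load_all = load_all            # hypothetical: returns {user: [contacts]}
        self._refresh_seconds = refresh_seconds
        self._lock = threading.Lock()
        self._table = {}
        self._stop = threading.Event()
        self.refresh()                       # initial full load
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def refresh(self):
        fresh = self._load_all()             # slow query runs outside the lock
        with self._lock:
            self._table = fresh              # swap in the new snapshot

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._refresh_seconds):
            self.refresh()

    def lookup(self, user):
        """Fast path used by call routing: no database round trip."""
        with self._lock:
            return list(self._table.get(user, ()))

    def stop(self):
        self._stop.set()
        self._thread.join()


database = {"alice": ["sip:alice@host1.example.com"]}
cache = ContactCache(lambda: {u: list(c) for u, c in database.items()},
                     refresh_seconds=3600)
print(cache.lookup("alice"))
database["bob"] = ["sip:bob@host2.example.com"]
stale = cache.lookup("bob")      # not visible until the next refresh
cache.refresh()
fresh = cache.lookup("bob")
cache.stop()
print(stale, fresh)
```

The trade-off is visible in the demo: a new registration is invisible to lookups until the next refresh, which is the "need for refresh" the slide flags for long-lived servers.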

Page 10: Scaling SIP Servers

REGISTER performance

Single CPU Sun Ultra10

Response time is constant for Cache (FastSQL)

Page 11: Scaling SIP Servers

Process models (1)
• One thread per request
  • Doesn’t scale
    • Too many threads over a short timescale
    • Stateless proxy: 2-4 threads per transaction
  • High load affects throughput

[Figure: incoming requests R1-R4, each handled by its own thread; plot of throughput versus load.]

Page 12: Scaling SIP Servers

Process models (2)
• Thread pool + queue
  • Thread overhead less; more useful processing
  • Overload management: drop requests over responses, drop tail
• Not enough if holding time is high
  • Each request holds (blocks) a thread

[Figure: incoming requests R1-R4 queued for a fixed number of threads; plot of throughput versus load.]
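
A thread pool with a bounded queue and a simple overload policy can be sketched as below. The two thresholds are one possible reading of "drop requests over responses, drop tail": new requests are refused earlier than responses, and anything arriving at a full queue is dropped. All names and limits are illustrative assumptions, not sipd's implementation.

```python
import queue
import threading


class WorkerPool:
    """Fixed number of worker threads fed from one bounded queue."""

    def __init__(self, handler, num_threads=4, queue_limit=100, request_limit=80):
        self._handler = handler
        self._queue = queue.Queue(maxsize=queue_limit)
        self._request_limit = request_limit   # requests are refused above this depth
        self.dropped = 0                      # submit() is assumed single-threaded
        self._threads = [threading.Thread(target=self._work, daemon=True)
                         for _ in range(num_threads)]
        for thread in self._threads:
            thread.start()

    def submit(self, message, is_response=False):
        """Queue a message; return False if it was dropped (drop tail)."""
        if not is_response and self._queue.qsize() >= self._request_limit:
            self.dropped += 1                 # shed new work before responses
            return False
        try:
            self._queue.put_nowait(message)
        except queue.Full:
            self.dropped += 1
            return False
        return True

    def _work(self):
        while True:
            message = self._queue.get()
            try:
                if message is None:           # shutdown marker
                    return
                self._handler(message)        # a blocking handler holds this thread
            finally:
                self._queue.task_done()

    def shutdown(self):
        self._queue.join()                    # wait for queued work to finish
        for _ in self._threads:
            self._queue.put(None)
        for thread in self._threads:
            thread.join()


gate = threading.Event()
handled = []


def handle(message):
    gate.wait()                               # simulate a long holding time
    handled.append(message)


pool = WorkerPool(handle, num_threads=2, queue_limit=4, request_limit=2)
results = [pool.submit(f"INVITE {i}") for i in range(10)]
results += [pool.submit(f"200 OK {i}", is_response=True) for i in range(2)]
gate.set()
pool.shutdown()
print("accepted:", sum(results), "dropped:", pool.dropped)
```

While the gate is closed, both workers are held by a single request each, so most of the ten requests are shed; this is the "not enough if holding time is high" case in miniature. The exact counts depend on thread timing.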

Page 13: Scaling SIP Servers

Stateless proxy (Solaris)

Turnaround time is almost constant for stateless proxy

• The sudden increase in response time is a client problem

• UDP losses on Ultra10 @ (120 * 6 * 500 * 8) bps = 2.88 Mb/s

Page 14: Scaling SIP Servers

Stateless proxy (Linux)

• Request turnaround time breaks down
• Response turnaround time is constant
• Effect of high holding times and thread scheduling
• How to set queue size – investigate?

Page 15: Scaling SIP Servers

Queue evolution for sipd

Number of requests (y-axis) waiting in the queue for a free thread on Solaris (left) and Linux (right) over a period of up-time (x-axis).

Page 16: Scaling SIP Servers

Process models (3)
• Blocking thread model needs “too many” threads
  • Stateful transaction stays for 30 s
• Return thread to free pool instead of blocking
• Event-driven architectures
  • State transition triggered by a global event scheduler
  • OnIncoming1xx(), OnInviteTimeout(), …
• SIP-CGI: pre-forked multiple processes
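
The event-driven model can be sketched as transactions that hold no thread while waiting: a scheduler fires callbacks such as `on_incoming_1xx()` or `on_invite_timeout()`, and each callback only updates state and returns. The states below are deliberately simplified and are not the RFC 3261 transaction state machine; the scheduler uses a simulated clock, and all names are illustrative.

```python
import heapq
import itertools


class EventScheduler:
    """Single-threaded scheduler: callbacks run to completion and never block."""

    def __init__(self):
        self._now = 0.0
        self._timers = []                  # heap of (due_time, seq, callback)
        self._seq = itertools.count()      # tie-breaker for equal due times

    def call_later(self, delay, callback):
        heapq.heappush(self._timers, (self._now + delay, next(self._seq), callback))

    def advance(self, seconds):
        """Move the simulated clock forward and fire every timer that is due."""
        self._now += seconds
        while self._timers and self._timers[0][0] <= self._now:
            _, _, callback = heapq.heappop(self._timers)
            callback()


class InviteTransaction:
    """Per-call state only; no thread is parked while the call is pending."""

    def __init__(self, scheduler, timeout=30.0):
        self.state = "calling"
        scheduler.call_later(timeout, self.on_invite_timeout)

    def on_incoming_1xx(self):
        if self.state == "calling":
            self.state = "proceeding"

    def on_incoming_2xx(self):
        if self.state in ("calling", "proceeding"):
            self.state = "completed"

    def on_invite_timeout(self):
        # The timer is not cancelled on completion; it just becomes a no-op.
        if self.state != "completed":
            self.state = "timed_out"


scheduler = EventScheduler()
answered = InviteTransaction(scheduler)
unanswered = InviteTransaction(scheduler)
answered.on_incoming_1xx()
answered.on_incoming_2xx()
scheduler.advance(30.0)
print(answered.state, unanswered.state)  # prints "completed timed_out"
```

A transaction that stays around for 30 s costs one small object and one heap entry here, rather than one blocked thread.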

Page 17: Scaling SIP Servers

Socket management
• Problem: open sockets limit (1024), “liveness” detection, retransmission
• One socket per transaction does not scale
• Global socket if downstream server is alive, soft state – works for UDP
• Hard for TCP/TLS – connections
• Worse for Java servers – no select, poll
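
For UDP, the global-socket approach amounts to one socket, one `select()` loop, and a table that maps each datagram back to its transaction. The sketch below keys on the `branch=` token of a Via header and runs over loopback; the class, the helper, and the branch values are illustrative assumptions, and liveness detection and retransmission are left out.

```python
import select
import socket


def branch_of(datagram: bytes) -> str:
    """Extract the first 'branch=' token from the payload, or '' if absent."""
    text = datagram.decode("utf-8", errors="replace")
    start = text.find("branch=")
    if start < 0:
        return ""
    start += len("branch=")
    stop = start
    while stop < len(text) and text[stop] not in ";, \r\n":
        stop += 1
    return text[start:stop]


class SharedUdpEndpoint:
    """One UDP socket shared by all transactions, demultiplexed by branch."""

    def __init__(self, host="127.0.0.1", port=0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind((host, port))
        self.sock.setblocking(False)
        self.transactions = {}             # branch -> callback(datagram, peer)

    def register(self, branch, callback):
        self.transactions[branch] = callback

    def poll(self, timeout):
        """Wait up to timeout seconds, then dispatch every pending datagram."""
        readable, _, _ = select.select([self.sock], [], [], timeout)
        dispatched = 0
        while readable:
            try:
                datagram, peer = self.sock.recvfrom(65535)
            except BlockingIOError:
                break                      # socket drained
            callback = self.transactions.get(branch_of(datagram))
            if callback is not None:
                callback(datagram, peer)
                dispatched += 1
        return dispatched

    def close(self):
        self.sock.close()


endpoint = SharedUdpEndpoint()
seen = []
for branch in ("z9hG4bK-a", "z9hG4bK-b"):
    endpoint.register(branch, lambda data, peer, b=branch: seen.append(b))
    message = f"SIP/2.0 180 Ringing\r\nVia: SIP/2.0/UDP 127.0.0.1;branch={branch}\r\n\r\n"
    endpoint.sock.sendto(message.encode(), endpoint.sock.getsockname())

dispatched = 0
for _ in range(10):                        # loopback delivery is fast, not instant
    dispatched += endpoint.poll(0.5)
    if dispatched >= 2:
        break
endpoint.close()
print(sorted(seen))
```

The descriptor count stays at one regardless of how many transactions are in flight. The same trick does not carry over directly to TCP or TLS, where each downstream connection needs its own socket.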

Page 18: Scaling SIP Servers

Optimizing protocol processing
• Not too useful if CPU is not the bottleneck
• Text protocol: parsing, formatting overheads
• Order of headers matters (Via)
• Other optimizations (parse-on-demand, date formatting) …
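
Parse-on-demand can be illustrated with a message object that keeps the raw header lines and parses a header only the first time something asks for it. This is a simplified sketch: it ignores folded header lines and compact header names, and the class name and `parse_count` counter are only there for illustration.

```python
class LazySipMessage:
    """Keep the raw text; parse a header only when it is first requested."""

    def __init__(self, raw: str):
        head, _, self.body = raw.partition("\r\n\r\n")
        lines = head.split("\r\n")
        self.start_line = lines[0]
        self._raw_headers = lines[1:]      # untouched until needed
        self._parsed = {}                  # lower-cased name -> list of values
        self.parse_count = 0               # how many headers were actually parsed

    def header(self, name):
        """Return all values of a header, in the order they appear."""
        key = name.lower()
        if key not in self._parsed:
            self.parse_count += 1
            values = []
            for line in self._raw_headers:
                field, sep, value = line.partition(":")
                if sep and field.strip().lower() == key:
                    values.append(value.strip())
            self._parsed[key] = values
        return self._parsed[key]


raw = ("INVITE sip:bob@example.com SIP/2.0\r\n"
       "Via: SIP/2.0/UDP proxy.example.com;branch=z9hG4bK-1\r\n"
       "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK-2\r\n"
       "To: <sip:bob@example.com>\r\n"
       "Date: Wed, 17 Apr 2002 10:00:00 GMT\r\n"
       "\r\n")
msg = LazySipMessage(raw)
top_via = msg.header("Via")[0]
msg.header("via")                          # second access hits the cache
print(top_via)
print("headers parsed:", msg.parse_count)
```

A stateless proxy that only needs the top Via never pays for parsing To or Date. As the slide notes, the gain matters only when CPU is the bottleneck.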

Page 19: Scaling SIP Servers

Conclusion
• Unlike web servers: can be stateful, less disk I/O, lesser impact of TCP stack/behavior, …
• Pros: UDP, stateless routing, load-balancing using DNS, …
• Challenges: scaling state machine, towards 2.5M BHCA (3600 messages/s)
  • Event driven architecture (SEDA?)
  • Resource management (file limits, threads)
  • Tuning operating system (scheduler, …)

Page 20: Scaling SIP Servers

Future work
• Stateful proxy performance
• Evaluate event driven architecture
• Effect of request forking (> 1 contacts) on server behavior
• Programmable scripts
• Queue management and overload control
• Other types of servers (conference servers, media servers, etc.)

Page 21: Scaling SIP Servers

References
• CINEMA web page. http://www.cs.columbia.edu/IRT/cinema
• [Schu0012:Industrial] H. Schulzrinne. “Industrial strength internet telephony,” Presentation at 6th SIP bakeoff, Dec. 2000.
• [SIPstone] H. Schulzrinne et al. “SIPstone – Benchmarking SIP server performance,” CS Technical report, Columbia University.