advmultimedia2k7_01h

8/3/2019 AdvMultiMedia2k7_01H

1/70

11/2/2011

1

11/2/2011 Nguyen Chan Hung Hanoi University of Technology 1

Advanced Multimedia Technology

Roadmap

Introduction

Chapter 1: Multimedia Network

RTP & RTCP

QoS for multimedia network

Chapter 2: Voice and Video Over IP

SIP protocol

VoIP

VideoIP

Chapter 3: MPEG-4 & H264


Introduction

Targets:

State-of-the-art knowledge on multimedia technology & applications

Information on real-world multimedia systems

How-to tutorials / hands-on exercises / technological demonstrations which inspire innovations

Some smart students can set up practical systems for themselves

Show examples:

VoIP using ADSL

VoWiFi Mobile (Nokia, PDA)

Video streaming of TV programs

Internet radio broadcast station

Remote home monitor

RDLAB Project (see next slide)

Prerequisites: Multimedia, Computer networks





Introduction (2)

More information:

The Internet (anything you need)

http://Rdlab410.dyndns.org/rdlab/ Introduction to the Research and Development Laboratory of Multimedia Technology (C9 Room 410, FET-HUT)

http://Rdlab410.dyndns.org/4SV/ Documents for students


Chapter 1: Multimedia networks

Overview:

Challenges of multimedia networks & applications

RTP (Real-time Transport Protocol) & RTCP

QoS mechanisms

Traffic scheduling & policing: Int-serv, Diff-serv





Multimedia requirements App classes

Typically sensitive to delay, but can tolerate packet loss (which would cause minor glitches that can be concealed)

Data contains audio and video content (continuous media); three classes of applications:

Streaming stored contents

Unidirectional Real-Time

Interactive Real-Time


Application Classes

Streaming stored contents

Clients request audio/video files from servers and pipeline reception over the network and display

Interactive: user can control operation (similar to a VCR: pause, resume, fast forward, rewind, etc.)

Delay: from client request until display start can be 1 to 10 seconds

Unidirectional Real-Time:

Similar to existing TV and radio stations, but delivered over the Internet

Non-interactive, just listen/view

Interactive Real-Time :

Phone or video conference

More stringent delay requirement than Streaming & Unidirectional because of its real-time nature

Video: < 150 msec acceptable

Audio: < 150 msec good,




Challenges

The TCP/UDP/IP suite provides best-effort service, with no guarantees on the expectation or variance of packet delay

Streaming applications: a delay of 5 to 10 seconds is typical and has been acceptable, but performance deteriorates if links are congested (transoceanic)

Real-time interactive applications: requirements on delay and its jitter have been satisfied by over-provisioning (providing plenty of bandwidth); what will happen when the load increases?

Most router implementations use only First-Come-First-Serve (FCFS) packet processing and transmission scheduling


Making the best of best effort Internet

To mitigate impact of best-effort Internet, we can:

Use UDP to avoid TCP and its slow-start phase

Buffer content at the client and control playback to remedy jitter

We can timestamp packets, so that the receiver knows when the packets should be played back.

Adapt the compression level to the available bandwidth

We can send redundant packets to mitigate the effects of packet loss.

We will discuss all these solutions.





Internet evolution to better support multimedia

Integrated services philosophy:

Change Internet protocols so that applications can reserve end-to-end bandwidth

Need to deploy a protocol that reserves bandwidth

Must modify scheduling policies in routers to honor reservations

The application must provide the network with a description of its traffic, and must further abide by this description.

Requires new, complex software in hosts & routers

Differentiated services philosophy:

Fewer changes to the Internet infrastructure, yet provides 1st and 2nd class service.

Datagrams are marked.

Users pay more to send/receive 1st class packets.

ISPs pay more to backbones to send/receive 1st class packets.


Internet evolution to better support multimedia (2)

Laissez-faire philosophy

No reservations, no datagram marking

As demand increases, provision more bandwidth

Place stored content at the edge of the network:

ISPs & backbones add caches

Content providers put content in CDN nodes

P2P: choose a nearby peer with the content

Virtual private networks (VPNs)

Reserve permanent blocks of bandwidth for enterprises.

Routers distinguish VPN traffic using IP addresses

Routers use special scheduling policies to provide the reserved bandwidth.



RTP & RTCP


Real-Time Transport Protocol (RTP)

RTP specifies a packet structure for packets carrying audio and video data: RFC 1889.

An RTP packet provides:

payload type identification

packet sequence numbering

timestamping

RTP runs in the end systems.

RTP packets are encapsulated in UDP segments or, optionally, in TCP.

Interoperability: if two Internet phone applications run RTP, then they may be able to work together.





Fundamental Design Philosophies of RTP

To build a mechanism for robust, real-time media delivery above an unreliable transport layer.

RTP design follows 2 philosophies:

application-level framing

end-to-end principle.


Application-Level Framing

Only the application has sufficient knowledge of its data to make an informed decision about how that data should be transported.

Transport protocols should expose the details of their delivery as much as possible, so that the application can make an appropriate response if an error occurs.

RTP differs from the TCP design!

The application cooperates with the transport to achieve reliable delivery.

Real-time audio and visual media is:

loss tolerant

BUT has strict timing bounds.

By using application-level framing with UDP-based transport, multimedia applications can:

Accept losses where necessary,

Have the flexibility to use the full spectrum of recovery techniques, such as retransmission and forward error correction, where appropriate.





The End-to-End Principle

Used to design a system that must communicate reliably across a network.

Similar to the TCP principle

Implies that intelligence is at the endpoints, not within the network.

Case studies:

Internet: smart endpoints, dumb network

Telephony: smart network, dumb endpoints (or terminals)

MPEG: smart sender, dumb receiver


The RTP Specifications

RTP was published as an IETF proposed standard (RFC 1889) in January 1996; that RFC has since been obsoleted by RFC 3550.

The first revision of ITU recommendation H.323 included a verbatim copy of the RTP specification; later revisions reference the current IETF standard.

Two parts of RTP:

the data transfer protocol

an associated control protocol (RTCP)




RTP and OSI model

RTP mostly performs tasks typical of a transport-layer protocol

RTP libraries provide a transport-layer interface that extends UDP:

port numbers, IP addresses

error checking across the segment

payload type identification

packet sequence numbering

time-stamping

RTP performs some tasks of the session layer (i.e. spanning disparate transport connections and managing participant identification in a transport-neutral manner)

RTP also performs some tasks of the presentation layer (i.e. defining standard representations for media data).


RTP and related standards




RTP Sessions

Definition: An RTP session consists of a group of participants who are communicating using RTP.

A participant may be active in multiple RTP sessions, e.g. one session for exchanging audio data and another session for exchanging video data.

For each participant, the session is identified by a network address and port pair to which data should be sent, and a port pair on which data is received.

The send and receive ports may be the same.

Each port pair comprises two adjacent ports:

an even-numbered port for RTP data packets,

the next higher (odd-numbered) port for RTCP control packets.

The default port pair is 5004 and 5005 for UDP/IP, but many applications dynamically allocate ports during session setup and ignore the default.

RTP sessions are designed to transport a single type of media; each media type should be carried in a separate RTP session.


Types of RTP Sessions




RTP Example

Consider sending 64 kbps PCM-encoded voice over RTP.

The application collects the encoded data in chunks, e.g., every 20 msec = 160 bytes in a chunk (8000 bytes/sec divided by 50 chunks/sec).

The audio chunk along with the RTP header forms the RTP packet, which is encapsulated into a UDP segment.

The RTP header indicates the type of audio encoding in each packet;

senders can change the encoding during a conference.

The RTP header also contains sequence numbers and timestamps.
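The arithmetic in this example can be checked with a short sketch. The header sizes below (12-byte fixed RTP header, 8-byte UDP header, 20-byte IPv4 header without options) are assumptions added for illustration; the slide only gives the payload figures.

```python
# Packetization arithmetic for 64 kbps PCM voice sent in 20 ms chunks.
BIT_RATE_BPS = 64_000        # from the slide
CHUNK_MS = 20                # from the slide
RTP_HEADER_BYTES = 12        # assumption: fixed RTP header, no CSRC entries
UDP_HEADER_BYTES = 8         # assumption: standard UDP header
IPV4_HEADER_BYTES = 20       # assumption: IPv4 header without options

bytes_per_second = BIT_RATE_BPS // 8                   # 8000 bytes/sec
chunks_per_second = 1000 // CHUNK_MS                   # 50 chunks/sec
payload_bytes = bytes_per_second // chunks_per_second  # bytes per chunk

# Size of one packet on the wire (ignoring link-layer framing) and the
# resulting bit rate including the per-packet headers.
packet_bytes = (payload_bytes + RTP_HEADER_BYTES
                + UDP_HEADER_BYTES + IPV4_HEADER_BYTES)
wire_rate_bps = packet_bytes * 8 * chunks_per_second

print(payload_bytes, packet_bytes, wire_rate_bps)
```

Under these assumptions the headers add 40 bytes to every 160-byte chunk, which is why shorter chunks lower the delay but raise the overhead.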


RTP Implementations

RTP sender:

Uncompressed media data (audio or video) is captured into a buffer, from which compressed frames are produced.

Frames may be encoded in several ways depending on the compression algorithm used (e.g. H.264, MPEG-4).

Compressed frames are loaded into RTP packets for sending.

If frames are large, they may be fragmented into several RTP packets;

if frames are small, several frames may be bundled into a single RTP packet.

A channel coder may be used to generate error correction packets or to reorder packets before transmission.

After the RTP packets have been sent, the buffered media for those packets is eventually freed.

The sender may have to keep data buffered for some time after the corresponding packets have been sent, depending on the codec and error correction scheme used.

The sender is responsible for generating periodic status reports for the media streams it is generating, including the information needed for lip synchronization.

It also receives reception quality feedback from other participants and may use that information to adapt its transmission.




RTP Implementations (2)

RTP receiver

Receiver is responsible for:

Collecting RTP packets from the network,

Correcting any losses,

Recovering the timing,

Decompressing the media,

Presenting the result to the user.

Sends reception quality feedback, allowing the sender to adapt the transmission to the receiver,

Maintains a database of participants in the session.


RTP and QoS

RTP does NOT provide any mechanism to ensure timely delivery of data or provide other quality of service guarantees.

RTP encapsulation is only seen at the end systems; it is NOT seen by intermediate routers.

Routers do not make any special effort to ensure that RTP packets arrive at the destination in a timely manner.

In order to provide QoS to an application, the Internet must provide a mechanism, such as RSVP, for the application to reserve network resources.




RTP Streams

RTP allows each source (for example, a camera or a microphone) to be assigned its own independent RTP stream of packets.

For example, for a videoconference between two participants, four RTP streams could be opened:

2 streams for transmitting the audio (one in each direction)

2 streams for the video (one in each direction).

Some popular encoding techniques (e.g. MPEG1 and MPEG2) bundle the audio and video into a single stream during the encoding process; in that case only one RTP stream is generated in each direction.

For a many-to-many multicast session, all of the senders and sources typically send their RTP streams into the same multicast tree with the same multicast address.


Translators

Definition: A translator is an intermediate system that operates on RTP data while maintaining the synchronization source and timeline of a stream.

Examples: systems that

Convert between media-encoding formats without mixing,

Bridge between different transport protocols,

Add or remove encryption,

Filter media streams.

A translator is invisible to the RTP end systems

There are a few classes of translators:

Bridges are one-to-one translators that don't change the media encoding, e.g. gateways between different transport protocols, like RTP/UDP/IP and RTP/ATM, or RTP/UDP/IPv4 and RTP/UDP/IPv6.

Bridges are the simplest class of translator.

They cause no changes to the RTP or RTCP data.




Translator (2)

Transcoders are one-to-one translators that change the media encoding

E.g., decoding the compressed data and re-encoding it with a different payload format to better suit the characteristics of the output network.

The payload type usually changes, as may the padding, but other RTP header fields generally remain unchanged.

Exploders are one-to-many translators, which take in a single packet and produce multiple packets.

Mergers are many-to-one translators, combining multiple packets into one. This is the inverse of the previous category.


Mixers

Definition: A mixer is an intermediate system that receives RTP packets from a group of sources and combines them into a single output, possibly changing the encoding, before forwarding the result.

Examples include the networked equivalent of an audio mixing deck, or a video picture-in-picture (PIP) device.

Because the timing of the input streams generally will not be synchronized, the mixer will have to make its own adjustments to synchronize the media before combining them, and hence it becomes the synchronization source of the output media stream.

A mixer may use playout buffers for each arriving media stream to help maintain the timing relationships between streams.

A mixer has its own SSRC, which is inserted into the data packets it generates. The SSRC identifiers from the input data packets are copied into the CSRC list of the output packet.




Mixers (2)

A mixer has a unique view of the session: it sees all sources as synchronization sources, whereas the other participants see some synchronization sources and some contributing sources.

In the figure above, participant X receives data from three synchronization sources (Y, Z, and M), with A and B as contributing sources in the mixed packets coming from M.

Participant A sees B and M as synchronization sources, with X, Y, and Z contributing to M.

The mixer generates RTCP sender and receiver reports separately for each half of the session, and it does not forward them between the two halves.

It forwards RTCP source description and BYE packets so that all participants can be identified.


RTP packet format




RTP packet format (2)

Payload Type (7 bits): used to indicate the type of encoding that is currently being used.

If a sender changes the encoding in the middle of a conference, the sender informs the receiver through this payload type field.

Payload type 0: PCM mu-law, 64 kbps

Payload type 3: GSM, 13 kbps

Payload type 7: LPC, 2.4 kbps

Payload type 26: Motion JPEG

Payload type 31: H.261

Payload type 33: MPEG2 video

Sequence Number (16 bits): the sequence number increments by one for each RTP packet sent; it may be used to detect packet loss and to restore the packet sequence.
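Because the sequence number is only 16 bits wide, a receiver has to allow for wrap-around when it uses the field to detect loss. The following is a minimal sketch of that idea; the function names are hypothetical, and it assumes the first packet received is also the first one counted.

```python
def seq_gap(prev_seq: int, seq: int) -> int:
    """Signed distance from prev_seq to seq, modulo 2**16."""
    delta = (seq - prev_seq) & 0xFFFF
    return delta - 0x10000 if delta >= 0x8000 else delta


def count_lost(sequence_numbers):
    """Count packets missing between the first and the highest one seen.

    Reordered packets are not counted as lost; duplicates are ignored.
    """
    if not sequence_numbers:
        return 0
    first = sequence_numbers[0]
    highest = first              # extended (unwrapped) sequence number
    received = {first}
    for seq in sequence_numbers[1:]:
        extended = highest + seq_gap(highest & 0xFFFF, seq)
        if extended >= first:    # ignore packets older than the first one
            received.add(extended)
        if extended > highest:
            highest = extended
    expected = highest - first + 1
    return expected - len(received)


# Sequence wraps from 65535 to 0, packet 0 arrives late, packet 2 is lost.
print(count_lost([65534, 65535, 1, 0, 3]))
```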


RTP packet format (3)

Timestamp field (32 bits long): reflects the sampling instant of the first byte in the RTP data packet.

The receiver can use the timestamps to remove packet jitter and provide synchronous playout.

The timestamp is derived from a sampling clock at the sender.

Example: for audio, the timestamp clock increments by one for each sampling period (for example, every 125 μs for an 8 kHz sampling clock);

if the audio application generates chunks consisting of 160 encoded samples, then the timestamp increases by 160 for each RTP packet when the source is active.

The timestamp clock continues to increase at a constant rate even if the source is inactive.

SSRC field (32 bits long): identifies the source of the RTP stream. Each stream in an RTP session should have a distinct SSRC.

Definition: The synchronization source (SSRC) identifies participants within an RTP session. It is a per-session identifier that is mapped to a long-lived canonical name, CNAME (of the form user@host), through the RTP control protocol.

It should be chosen randomly to minimize the probability of collision.

RTP participants must resolve any SSRC collision (send BYE and choose another SSRC).
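The random choice and the collision rule can be sketched as follows. This is an illustration only: the helper names are hypothetical, and a real implementation would also have to send the BYE packet and distinguish true collisions from looped packets.

```python
import secrets


def new_ssrc(in_use):
    """Pick a random 32-bit SSRC that is not already known to be in use."""
    while True:
        candidate = secrets.randbits(32)
        if candidate not in in_use:
            return candidate


def resolve_collision(my_ssrc, seen_ssrcs):
    """Return (ssrc_to_use, must_send_bye).

    If another participant is already using our SSRC, a BYE is due for
    the old identifier and a fresh one is chosen.
    """
    if my_ssrc in seen_ssrcs:
        return new_ssrc(set(seen_ssrcs) | {my_ssrc}), True
    return my_ssrc, False


ssrc, send_bye = resolve_collision(0x1234, {0x1234, 0x5678})
print(hex(ssrc), send_bye)
```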




RTP packet format (4)

Contributing sources (CSRCs)

Under normal circumstances, RTP data is generated by a single source,

but when multiple RTP streams pass through a mixer or translator, multiple data sources may have contributed to an RTP data packet.

The list of contributing sources (CSRCs) identifies participants who have contributed to an RTP packet but were not responsible for its timing and synchronization.

Each contributing source identifier is a 32-bit integer, corresponding to the SSRC of the participant who contributed to this packet.

The length of the CSRC list is indicated by the CC field in the RTP header.

Payload headers

The mandatory RTP header provides information that is common to all payload formats.

Sometimes a payload format will need more information for optimal operation; this information forms an additional header that is defined as part of the payload format specification.

The payload header is included in an RTP packet following the fixed header and any CSRC list and header extension.

The definition of the payload header constitutes the majority of a payload format specification. Example: the payload header for H.261 video is defined in RFC 2032; RFC 2736 gives guidelines for writers of payload format specifications.
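The header fields described on these slides can be packed and parsed with Python's `struct` module. The sketch below assumes the standard fixed-header layout (version, padding, extension and CC in the first octet; marker and payload type in the second; then sequence number, timestamp, SSRC and the CSRC list), since the header figure itself is not reproduced here. The function names are illustrative, and padding and header extensions are left at zero.

```python
import struct


def pack_rtp_header(payload_type, seq, timestamp, ssrc, csrcs=(), marker=False):
    """Build an RTP header: 12 fixed bytes plus 4 bytes per CSRC."""
    if len(csrcs) > 15:
        raise ValueError("the 4-bit CC field allows at most 15 CSRCs")
    byte0 = (2 << 6) | len(csrcs)                 # V=2, P=0, X=0, CC
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + b"".join(struct.pack("!I", c) for c in csrcs)


def unpack_rtp_header(data):
    """Parse the fixed header and the CSRC list from raw packet bytes."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    cc = byte0 & 0x0F
    csrcs = struct.unpack(f"!{cc}I", data[12:12 + 4 * cc])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": list(csrcs),
        "header_length": 12 + 4 * cc,   # the payload starts here
    }


# A mixer (SSRC 0x11223344) forwarding PCM mu-law audio from two sources.
raw = pack_rtp_header(0, 1, 160, 0x11223344, csrcs=[0xA, 0xB])
print(unpack_rtp_header(raw))
```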


RTP Control Protocol (RTCP)

Works in conjunction with RTP.

Each participant in an RTP session periodically transmits RTCP control packets to all other participants.

Each RTCP packet contains sender and/or receiver reports that report statistics useful to the application.

Statistics include:

number of packets sent,

number of packets lost,

interarrival jitter,

etc.

This feedback to the application can be used to control performance and for diagnostic purposes.

The sender may modify its transmissions based on the feedback.




RTCP (2)

For an RTP session there is typically a single multicast address; all RTP and RTCP packets belonging to the session use the multicast address.

RTP and RTCP packets are distinguished from each other through the use of distinct port numbers.

To limit traffic, each participant reduces its RTCP traffic as the number of conference participants increases.


RTCP packet format

Five types of RTCP packets are defined in the RTP specification:

receiver report (RR),

sender report (SR),

source description (SDES),

membership management (BYE),

and application-defined (APP).

They all follow a common structure: (see figure)




RTCP packet format (2)

Version number (V). The version number is always 2 for the current version of RTP.

Padding (P). The padding bit indicates that the packet has been padded out beyond its natural size. If this bit is set, one or more octets of padding have been added to the end of this packet, and the last octet contains a count of the number of padding octets added.

Item count (IC). Some packet types contain a list of items, perhaps in addition to some fixed, type-specific information.

The item count field is used by these packet types to indicate the number of items included in the packet (the field has different names in different packet types depending on its use).

Up to 31 items may be included in each RTCP packet, limited also by the maximum transmission unit of the network.

If more than 31 items are needed, the application must generate multiple RTCP packets.

Packet type (PT). The packet type identifies the type of information carried in the packet. Five standard packet types are defined in the RTP specification; other types may be defined in the future.

Length. The length field denotes the length of the packet contents following the common header.

It is measured in units of 32-bit words because all RTCP packets are multiples of 32 bits in length.
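The common header is four octets long and can be decoded as sketched below. The bit positions (2-bit version, 1-bit padding, 5-bit item count, 8-bit packet type, 16-bit length) and the packet type numbers 200 to 204 follow the RTP specification; the function name and the returned dictionary are illustrative choices.

```python
import struct

# Packet type values assigned by the RTP specification.
RTCP_TYPES = {200: "SR", 201: "RR", 202: "SDES", 203: "BYE", 204: "APP"}


def parse_rtcp_common_header(data):
    """Decode the 4-byte header shared by all RTCP packet types."""
    byte0, packet_type, length_words = struct.unpack("!BBH", data[:4])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "item_count": byte0 & 0x1F,            # 5 bits, hence at most 31
        "packet_type": RTCP_TYPES.get(packet_type, packet_type),
        # The length field counts the 32-bit words after the common header.
        "total_bytes": 4 * (length_words + 1),
    }


# An empty receiver report: the header plus the reporter's SSRC (one word).
packet = bytes([0x80, 201, 0x00, 0x01]) + (0xDEADBEEF).to_bytes(4, "big")
print(parse_rtcp_common_header(packet))
```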


RTCP Packets - Overview

Receiver report packets (RR):

Fraction of packets lost,

Last sequence number,

Average interarrival jitter.

Sender report packets: (SR)

SSRC of the RTP stream,

The current time,

The number of packets sent,

The number of bytes sent.

Source description packets (SDES)

e-mail address of the sender,

The sender's name,

The SSRC of the associated RTP stream.

Packets provide a mapping between the SSRC and the user/host name.

BYE: Membership Control

A BYE packet is generated when a participant leaves the session,

or when it changes its SSRC, for example because of a collision.

APP: Application-Defined RTCP Packets

The final class of RTCP packet (APP) allows for application-defined extensions.




Receiver Report

The reception quality feedback in RR packets is useful not only for the sender, but also for other participants and third-party monitoring tools.

The RR feedback allows the sender to adapt its transmissions according to the feedback.

Other participants can determine whether problems are local or common to several receivers.

Network managers may use monitors that receive only the RTCP packets to evaluate the performance of their networks.


Sender report

From the SR, an application can calculate the average payload data rate and the average packet rate over an interval without receiving the data.

The ratio of the two is the average payload size.

If it can be assumed that packet loss is independent of packet size, then:

Receiver throughput = number of packets received * average payload size

The timestamps are used to generate a correspondence between the media clocks and NTP (wall-clock) time, which is used for lip-sync.
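As a rough illustration of these calculations, the sketch below compares two successive sender reports. The dictionary keys are hypothetical names for the SR fields mentioned on the slides (report time, packets sent, bytes sent), not an actual library structure.

```python
def rates_from_sender_reports(sr_old, sr_new):
    """Average packet rate, payload data rate and payload size between two SRs."""
    dt = sr_new["ntp_seconds"] - sr_old["ntp_seconds"]
    packets = sr_new["packet_count"] - sr_old["packet_count"]
    octets = sr_new["octet_count"] - sr_old["octet_count"]
    if dt <= 0 or packets <= 0:
        raise ValueError("reports must be in order and cover some traffic")
    return {
        "packet_rate": packets / dt,            # packets per second
        "data_rate": octets / dt,               # payload bytes per second
        "avg_payload_size": octets / packets,   # ratio of the two rates
    }


def receiver_throughput(packets_received, avg_payload_size):
    """Apparent throughput, assuming loss is independent of packet size."""
    return packets_received * avg_payload_size


old = {"ntp_seconds": 100.0, "packet_count": 1000, "octet_count": 160_000}
new = {"ntp_seconds": 110.0, "packet_count": 1500, "octet_count": 240_000}
stats = rates_from_sender_reports(old, new)
print(stats, receiver_throughput(480, stats["avg_payload_size"]))
```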




SDES

Source DEScription (SDES) provides participant identification and supplementary details, such as location, e-mail address, and telephone number.

The information in SDES packets is typically entered by the user and is often displayed in the graphical user interface of an application.

Each list of SDES items starts with the SSRC of the source being described, followed by one or more entries with the format shown in the figure.

Each entry starts with a type and a length field, then the item text itself in UTF-8 format.

The length field indicates how many octets of text are present; the text is not null-terminated.
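The type/length/text layout makes SDES items easy to walk through. The sketch below assumes one-octet type and length fields and the item type numbers from the RTP specification (1 = CNAME, 2 = NAME, and so on), with a zero octet ending the list; the function name and the sample values are made up for the example.

```python
SDES_TYPES = {1: "CNAME", 2: "NAME", 3: "EMAIL", 4: "PHONE",
              5: "LOC", 6: "TOOL", 7: "NOTE", 8: "PRIV"}


def parse_sdes_chunk(chunk: bytes):
    """Return (ssrc, [(item_name, text), ...]) for one SDES chunk."""
    ssrc = int.from_bytes(chunk[:4], "big")
    items = []
    pos = 4
    while pos < len(chunk) and chunk[pos] != 0:   # type 0 ends the list
        item_type = chunk[pos]
        length = chunk[pos + 1]                   # octets of text that follow
        text = chunk[pos + 2:pos + 2 + length].decode("utf-8")
        items.append((SDES_TYPES.get(item_type, item_type), text))
        pos += 2 + length                         # no null terminator to skip
    return ssrc, items


# Hypothetical chunk: SSRC 0x1234 with a CNAME and a NAME item.
chunk = ((0x1234).to_bytes(4, "big")
         + bytes([1, 9]) + b"user@host"
         + bytes([2, 5]) + b"Alice"
         + b"\x00")
print(parse_sdes_chunk(chunk))
```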


BYE

The RC field in the common RTCP header indicates the number of SSRC identifiers in the packet.

On receiving a BYE packet, an implementation should assume that the listed sources have left the session and ignore any further RTP and RTCP packets from those sources.

A BYE packet may also contain text indicating the reason for leaving a session, suitable for display in the user interface.




RTCP APP: Application-Defined RTCP Packets

The application-defined packet name is a four-character prefix intended to uniquely identify this extension, with each character being chosen from the ASCII character set.

Application-defined packets are used for nonstandard extensions to RTCP, and for experimentation with new features.

Experimenters use APP to try new features, and then register new packet types if the features have wider use.

Several applications generate APP packets, so implementations should be prepared to ignore unrecognized APP packets.


Synchronization of Streams

RTCP can be used to synchronize different media streams within an RTP session.

Consider a videoconferencing application for which each sender generates one RTP stream for video and one for audio.

The timestamps in these RTP packets are tied to the video and audio sampling clocks, and are NOT tied to the wall-clock time (i.e., to real time).

Each RTCP sender-report packet contains, for the most recently generated packet in the associated RTP stream:

the timestamp of the RTP packet

the wall-clock time at which the packet was created.

Thus the RTCP sender-report packets associate the sampling clock with the real-time clock.

Receivers can use this association to synchronize the playout of audio and video.
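One way a receiver might apply this association is sketched below: each RTP timestamp is converted to sender wall-clock time using the (RTP timestamp, wall-clock time) pair from the latest sender report. The clock rates (8 kHz audio, 90 kHz video) and the function name are assumptions for the example.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_wallclock, clock_rate):
    """Map an RTP timestamp to sender wall-clock seconds via a sender report."""
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:          # packet was sampled before the report
        diff -= 0x100000000
    return sr_wallclock + diff / clock_rate


# Assumed example: both sender reports were generated at wall-clock 100.0 s.
audio_time = rtp_to_wallclock(1800, sr_rtp_ts=1000,
                              sr_wallclock=100.0, clock_rate=8_000)
video_time = rtp_to_wallclock(59_000, sr_rtp_ts=50_000,
                              sr_wallclock=100.0, clock_rate=90_000)

# Frames whose wall-clock times match should be presented together.
print(audio_time, video_time)
```

With both streams on a common timeline, the receiver can delay whichever stream is ahead so that matching audio and video are played out together.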




RTCP Bandwidth Scaling

RTCP attempts to limit its traffic to 5% of the session bandwidth.

For example, with one sender sending video at 2 Mbps, RTCP limits its traffic to 100 kbps.

75% of this rate, or 75 kbps, goes to the receivers;

the remaining 25% of the rate, or 25 kbps, goes to the sender.

The 75 kbps devoted to the receivers is shared equally among the receivers:

if there are R receivers, then each receiver gets to send RTCP traffic at a rate of 75/R kbps, and the sender gets to send RTCP traffic at a rate of 25 kbps.

A participant (a sender or receiver) determines the RTCP packet transmission period by dynamically calculating the average RTCP packet size (across the entire session) and dividing the average RTCP packet size by its allocated rate.
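The rule above can be written out as a short calculation. This sketch covers only the single-sender case from the slide; the function name and the 100-byte average packet size are assumptions, and the minimum interval and randomization that the RTP specification also applies are left out.

```python
def rtcp_interval_seconds(session_bw_bps, avg_rtcp_packet_bytes,
                          receivers, is_sender):
    """Transmission period = average RTCP packet size / allocated rate."""
    rtcp_bw_bps = 0.05 * session_bw_bps               # 5% of the session
    if is_sender:
        allocated = 0.25 * rtcp_bw_bps                # single sender's share
    else:
        allocated = 0.75 * rtcp_bw_bps / receivers    # equal receiver shares
    return avg_rtcp_packet_bytes * 8 / allocated


# 2 Mbps video session, 10 receivers, assumed 100-byte average RTCP packet.
print(rtcp_interval_seconds(2_000_000, 100, 10, is_sender=False))
print(rtcp_interval_seconds(2_000_000, 100, 10, is_sender=True))
```

Because each receiver's share shrinks as R grows, the reporting interval lengthens with the size of the session, which keeps the total RTCP traffic roughly constant.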


Audio Capture, Digitization, and Framing

Audio capture devices can produce samples with 8-, 16-, or 24-bit resolution,

Linear, μ-law, or A-law quantization,

Rates between 8,000 and 96,000 samples per second, mono or stereo.

It may be necessary to convert the media to an alternative format before the media can be used,

for example, changing the sample rate or converting from linear to μ-law quantization.

Many speech codecs perform voice activity detection with silence suppression.




Video Capture

Most video codecs use inter-frame compression, which introduces delay

YUV to RGB conversion


Use of Prerecorded Content

RTP makes no distinction between live and prerecorded media, and senders generate data packets from compressed frames in the same way.

First, the sender must generate a new SSRC and choose random initial values for the RTP timestamp and sequence number.

During the streaming process, the sender must be prepared to handle SSRC collisions and should generate and respond to RTCP packets for the stream.

Also, if the sender implements a control protocol, such as RTSP, that allows the receiver to pause or seek within the media stream, the sender must keep track of such interactions so that it can insert the correct sequence number and timestamp into RTP data packets.




Fragmentation of a Media Frame into RTP Packets

The fragmentation process is critical to the quality of the media in the presence of packet loss.

The ability to decode each fragment independently is desirable;

otherwise the loss of a single fragment will result in the entire frame being discarded.

When multiple RTP packets are generated for each frame, the sender must choose between sending the packets in a single burst and spreading their transmission across the framing interval.

Sending the packets in a single burst reduces the end-to-end delay but may overwhelm the limited buffering capacity of the network or receiving host.

It is recommended that the sender spread the packets out in time across the framing interval.
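A minimal sketch of the recommended pacing follows. It simply cuts a frame at fixed byte boundaries and spaces the fragments evenly across the framing interval; a real payload format would usually cut at codec-defined boundaries so that fragments can be decoded independently. The function names and the 1400-byte limit are assumptions.

```python
def fragment_frame(frame: bytes, max_payload: int):
    """Split one compressed frame into payloads of at most max_payload bytes."""
    return [frame[i:i + max_payload] for i in range(0, len(frame), max_payload)]


def paced_send_times(frame_start, framing_interval, n_fragments):
    """Spread the fragment send times evenly across the framing interval."""
    spacing = framing_interval / n_fragments
    return [frame_start + i * spacing for i in range(n_fragments)]


# A 3000-byte video frame at 25 frames/sec (40 ms framing interval).
fragments = fragment_frame(bytes(3000), max_payload=1400)
times = paced_send_times(0.0, 0.040, len(fragments))
for t, payload in zip(times, fragments):
    print(f"send {len(payload)} bytes at t = {t * 1000:.1f} ms")
```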


Packet Reception

Input queues

The packet reception and playout routines are separated by input queues (see figure).

It is important to store the exact arrival time, M, of RTP data packets in order to calculate interarrival jitter.

The arrival time should be measured according to a local reference wall clock, T, converted to the media clock rate, R.

Since the receiver usually does not have such a clock, the arrival time is calculated by sampling the reference clock (typically the system wall-clock time) and converting it to the local timeline:

M = T × R + offset

where the offset is used to map from the reference clock to the media timeline, in the process correcting for skew between the media clock and the reference clock.
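To show why the arrival time matters, the sketch below converts a wall-clock reading to media clock units, assuming the conversion M = T × R + offset, and feeds it to the running interarrival jitter estimate defined in the RTP specification (the jitter moves 1/16 of the way toward each new transit-time difference). The class and function names are illustrative.

```python
def arrival_in_media_units(wallclock_seconds, clock_rate, offset=0):
    """M = T * R + offset, in units of the media clock."""
    return round(wallclock_seconds * clock_rate) + offset


class JitterEstimator:
    """Running interarrival jitter estimate, in media clock units."""

    def __init__(self):
        self.jitter = 0.0
        self._prev_transit = None

    def update(self, arrival, rtp_timestamp):
        transit = arrival - rtp_timestamp       # relative transit time
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter += (d - self.jitter) / 16
        self._prev_transit = transit
        return self.jitter


est = JitterEstimator()
# 8 kHz audio in 20 ms packets; the third packet arrives 2 ms late.
for wallclock, ts in [(0.000, 0), (0.020, 160), (0.042, 320)]:
    print(est.update(arrival_in_media_units(wallclock, 8000), ts))
```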




Disruption of Interpacket Timing during Network Transit

There are bursts when several packets arrive at once

Gaps when no packets arrive

Packets may even arrive out of order.

The receiver does not know when data packets are going to arrive, so it should be prepared to accept packets in bursts, and in any order.


The Playout Buffer

Data packets are extracted from their input queue and inserted into a source-specific playout buffer sorted by their RTP timestamps.

Frames are held in the playout buffer for a period of time to smooth timing variations caused by the network.

Holding the data in a playout buffer also allows the pieces of fragmented frames to be received and grouped, and it allows any error correction data to arrive.

The frames are then decompressed, any remaining errors are concealed, and the media is rendered for the user.

A single buffer may be used to compensate for network timing variability and as a decode buffer for the media codec.

It is also possible to separate these functions, using separate buffers for jitter removal and decoding.




The Playout Buffer Data Structures

The playout buffer comprises a time-ordered linked list of nodes.

Each node represents a frame of media data, with associated timing information.

The data structure for each node contains:

pointers to the adjacent nodes,

the arrival time,

the RTP timestamp,

the desired playout time for the frame,

and pointers to both

the compressed fragments of the frame (the data received in RTP packets)

and

the uncompressed media data.
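The node layout described above might look like the following in Python. This is a sketch under simplifying assumptions: the class and method names are hypothetical, timestamp wrap-around is ignored, and fragments are stored in arrival order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlayoutNode:
    rtp_timestamp: int
    arrival_time: float
    playout_time: Optional[float] = None                  # filled in later
    fragments: List[bytes] = field(default_factory=list)  # compressed data
    decoded: Optional[bytes] = None                       # uncompressed media
    prev: Optional["PlayoutNode"] = field(default=None, repr=False, compare=False)
    next: Optional["PlayoutNode"] = field(default=None, repr=False, compare=False)


class PlayoutBuffer:
    """Linked list of frames ordered by RTP timestamp."""

    def __init__(self):
        self.head: Optional[PlayoutNode] = None

    def insert(self, rtp_timestamp, arrival_time, payload):
        prev, node = None, self.head
        while node is not None and node.rtp_timestamp < rtp_timestamp:
            prev, node = node, node.next
        if node is not None and node.rtp_timestamp == rtp_timestamp:
            node.fragments.append(payload)   # another fragment of this frame
            return node
        new = PlayoutNode(rtp_timestamp, arrival_time,
                          fragments=[payload], prev=prev, next=node)
        if prev is None:
            self.head = new
        else:
            prev.next = new
        if node is not None:
            node.prev = new
        return new

    def timestamps(self):
        out, node = [], self.head
        while node is not None:
            out.append(node.rtp_timestamp)
            node = node.next
        return out


buf = PlayoutBuffer()
for ts, payload in [(320, b"b"), (160, b"a1"), (480, b"c"), (160, b"a2")]:
    buf.insert(ts, arrival_time=0.0, payload=payload)
print(buf.timestamps(), buf.head.fragments)
```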


The Playout Buffer Data Structures (2)

When the first RTP packet in a frame arrives, it is removed from the input queue and positioned in the playout buffer in order of its RTP timestamp.

This involves creating a new playout buffer node, which is inserted into the linked list of the playout buffer.

The compressed data from the recently arrived packet is linked from the playout buffer node, for later decoding. The frame's playout time is then calculated.

The newly created node resides in the playout buffer until its playout time is reached.

During this waiting period, packets containing other fragments of the frame may arrive and are linked from the node.

Once all the fragments of a frame have been received, the decoder is invoked and the resulting uncompressed frame is linked from the playout buffer node.

Determining that a complete frame has been received depends on the codec:

Audio codecs typically do not fragment frames, and they have a single packet per frame (MPEG Audio Layer-3, MP3, is a common exception);

Video codecs often generate multiple packets per video frame, with the RTP marker bit being set to indicate the RTP packet containing the last fragment.




Playout buffer processing

The decision of when to invoke the decoder depends on the receiver and is not specified by RTP.

Frames can be decoded as soon as they arrive or kept compressed until the last possible moment.

The choice depends on the relative availability of processing cycles and storage space for uncompressed frames, and perhaps on the receiver's estimate of future resource availability.

For example, a receiver may wish to decode data early if it knows that an index frame is due and it will shortly be busy.

When the playout time for a frame arrives, it is queued for playout.

The receiver must make its best effort to decode the frame, even if some fragments are missing, because this is the last chance before the frame is needed.

Error concealment may be invoked to hide any uncorrected packet loss.

Once the frame has been played out, the corresponding playout buffer node and its linked data should be destroyed or recycled.

If error concealment is used, it may be desirable to delay this process until the surrounding frames have also been played out, because the linked media data may be useful for the concealment operation.

RTP packets arriving late and corresponding to frames that have missed their playout point should be discarded.

The timeliness of a packet can be determined by comparing its RTP timestamp with the timestamp of the oldest packet in the playout buffer.


Clock skew

Calculation of clock skew:

observe the rate of the sender clock (the RTP timestamp) and compare it with the local clock.

If TR(n) is the RTP timestamp of the n-th packet received, and TL(n) is the value of the local clock at that time, then the clock skew can be estimated as follows:
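One plausible estimate, given here as an assumption rather than as the slide's own formula, is the ratio of elapsed sender time to elapsed local time between the first packet and the n-th packet, i.e. (TR(n) - TR(0)) / (TL(n) - TL(0)) with both clocks in the same units. A minimal sketch, with the local clock taken in seconds and a hypothetical function name:

```python
def estimate_skew(tr0, tl0, trn, tln, clock_rate):
    """Ratio of elapsed sender time to elapsed local time.

    tr0, trn: RTP timestamps of the first and n-th packets (media clock units)
    tl0, tln: local clock readings at those arrivals (seconds)
    A value above 1 suggests the sender clock runs faster than the local
    clock; a value below 1 suggests it runs slower.
    """
    sender_elapsed = (trn - tr0) / clock_rate
    local_elapsed = tln - tl0
    if local_elapsed <= 0:
        raise ValueError("need two packets received at different times")
    return sender_elapsed / local_elapsed


# 8 kHz stream: 80,080 timestamp units elapsed during 10 s of local time.
print(estimate_skew(0, 50.0, 80_080, 60.0, clock_rate=8_000))
```

A single pair of packets is noisy because network jitter affects TL(n), so the estimate is only meaningful over a long observation period.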




The Playout calculation

5 steps:

1. The sender timeline is mapped to the local playout timeline, compensating for the relative offset between sender and receiver clocks, to derive a base time for the playout calculation.

2. If necessary, the receiver compensates for clock skew relative to the sender by adding a periodically adjusted skew compensation offset to the base time.

3. The playout delay on the local timeline is calculated from a sender-related component and a jitter-related component.

4. The playout delay is adjusted if the route has changed, if packets have been reordered, if the chosen playout delay causes frames to overlap, or in response to other changes in the media.

5. Finally, the playout delay is added to the base time to derive the actual playout time for the frame.
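The five steps can be sketched as a single function. This is only an illustration under simplifying assumptions: all inputs (clock offset, skew adjustment, the two delay components and the extra adjustment of step 4) are taken as already estimated, and the parameter names are hypothetical.

```python
def playout_time(rtp_ts: int, clock_rate: int, clock_offset: float,
                 skew_adjust: float, sender_delay: float,
                 jitter_delay: float, extra_adjust: float = 0.0) -> float:
    """Return the local playout time (seconds) for a frame."""
    # Step 1: map the sender timeline to the local timeline (base time).
    base_time = rtp_ts / clock_rate + clock_offset
    # Step 2: compensate for clock skew relative to the sender.
    base_time += skew_adjust
    # Step 3: playout delay = sender-related + jitter-related component.
    delay = sender_delay + jitter_delay
    # Step 4: adjust for route changes, reordering, overlapping frames, etc.
    delay += extra_adjust
    # Step 5: add the playout delay to the base time.
    return base_time + delay
```

For example, with an 8 kHz clock, `playout_time(16000, 8000, 100.0, 0.01, 0.02, 0.04)` maps the RTP timestamp to 2 s on the sender timeline and then adds the offset, skew adjustment and delay components.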


Review questions

1. Compare RTP and TCP. What are their main differences and similarities?

2. What is an RTP stream? Find an example of an RTP stream in real-world applications.

3. What is an RTP session? Find an example of an RTP session in real-world applications.

4. Define SSRC and CSRC. Describe their roles in RTP.

5. What will happen if, in a video conferencing session, a host finds out that it uses the same SSRC as one of the other participants?

6. Find some examples of RTP mixers/translators in real-world applications.

7. How often does an RTP end-point send audio packets? Why?

8. What are the purposes of the sequence number and timestamp in the RTP header?

9. An RTP session and an FTP session share a congested ADSL link. Will the RTP session be affected?

Describe the interaction between PC applications, the ADSL modem and the ISP router.

Which traffic will be prioritized? Why? How?




Review questions (2)

1. What are the main roles of the playout buffer?

2. Describe the operation of the linked list of buffer nodes.

3. List the main causes of clock skew.

4. How many steps are involved in calculating the playout time? Describe these steps.

5. What happens to RTP packets while traversing the network?

6. What happens to RTP packets that arrive later than packets that have already been displayed?

7. Assume a multipoint video and audio conference of 4 participants, where all participants can see and talk to one another. How many input queues should an application maintain?

8. Describe the relationship between RTP packet size and network MTU.



Quality of Service in Multimedia Networks




Traffic Scheduling & Policing: Int-Serv, Diff-Serv, RSVP

IETF groups are working on proposals to improve QoS control in IP networks, i.e., going beyond best effort to provide some assurance for QoS

Work in progress includes RSVP, Differentiated Services, and Integrated Services

Simple model for sharing and congestion studies


Principles for QOS Guarantees

Consider a phone application at 1 Mbps and an FTP application sharing a 1.5 Mbps link.

Bursts of FTP can congest the router and cause audio packets to be dropped; we want to give priority to audio over FTP.

PRINCIPLE 1: Marking of packets is needed for the router to distinguish between different classes, along with a new router policy to treat packets accordingly




Principles for QOS Guarantees (2)

Applications may misbehave (e.g., audio sends packets at a rate higher than the 1 Mbps assumed above);

PRINCIPLE 2: Provide protection (isolation) for one class from other classes

Policing mechanisms are required to ensure that sources adhere to their bandwidth requirements; marking and policing need to be done at the edges.

Alternative to marking and policing: allocate a set portion of bandwidth to each application flow; this can lead to inefficient use of bandwidth if one of the flows does not use its allocation

PRINCIPLE 3: While providing isolation, it is desirable to use resources as efficiently as possible





Cannot support traffic beyond link capacity

PRINCIPLE 4: Need a Call Admission Process; the application flow declares its needs, and the network may block the call if it cannot satisfy them


Summary




Scheduling And Policing Mechanisms

Scheduling: choosing the next packet for transmission on a link; this can be done following a number of policies.

FIFO: in order of arrival to the queue; packets that arrive to a full buffer are either discarded, or a discard policy is used to determine which packet to discard among the arrival and those already queued


Scheduling Policies

Priority Queuing: classes have different priorities; the class may depend on explicit marking or other header info, e.g. IP source or destination address, TCP port numbers, etc.

Transmit a packet from the highest priority classwith a non-empty queue

Preemptive and non-preemptive versions
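A non-preemptive priority scheduler can be sketched as below. This is a minimal illustration; the class name `PriorityScheduler` and the convention that priority 0 is the highest are assumptions made for the example.

```python
from collections import deque


class PriorityScheduler:
    """One FIFO queue per class; class 0 has the highest priority."""

    def __init__(self, num_classes: int) -> None:
        self.queues = [deque() for _ in range(num_classes)]

    def enqueue(self, packet, priority: int) -> None:
        self.queues[priority].append(packet)

    def dequeue(self):
        # Serve the highest-priority class that has a non-empty queue.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing to send
```

With audio in class 0 and FTP in class 1, an audio packet queued behind FTP packets is still transmitted first.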




Scheduling Policies (2)

Round Robin: scan the class queues, serving one packet from each class that has a non-empty queue


Scheduling Policies (3)

Weighted Fair Queuing: a generalized Round Robin in which an attempt is made to provide each class with a differentiated amount of service over a given period of time
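The idea of differentiated service per class can be illustrated with weighted round robin, a simpler packet-count approximation. Note that this is not full WFQ, which also takes packet lengths into account; the function name and the weights are assumptions for the example.

```python
from collections import deque


def weighted_round_robin(queues, weights, rounds):
    """Serve up to weights[name] packets from each class queue per round."""
    sent = []
    for _ in range(rounds):
        for name, weight in weights.items():
            queue = queues[name]
            for _ in range(weight):
                if not queue:
                    break  # class has nothing to send; move on (work-conserving)
                sent.append(queue.popleft())
    return sent
```

With weights of 2 for audio and 1 for FTP, audio receives roughly two thirds of the transmission opportunities while both queues are backlogged.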




Policing Mechanisms

Three criteria:

(Long term) Average Rate (100 packets per sec or 6000 packets per min?); the crucial aspect is the interval length

Peak Rate: e.g., 6000 packets per minute average and 1500 packets per sec peak

(Max.) Burst Size: max. number of packets sent consecutively, i.e. over a short period of time


Policing Mechanisms

Token Bucket mechanism: provides a means for limiting input to a specified Burst Size and Average Rate.




Policing Mechanisms (2)

The bucket can hold b tokens; tokens are generated at a rate of r tokens/sec unless the bucket is full of tokens.

Over an interval of length t, the number of packets that are admitted is less than or equal to (r t + b).

Token bucket and WFQ can be combined to provide an upper bound on delay.
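A token bucket policer with parameters r and b can be sketched as follows. This is a minimal illustration, assuming one token per packet and a caller-supplied clock; the class name `TokenBucket` is hypothetical.

```python
class TokenBucket:
    """Admit a packet only if a token is available (one token per packet)."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate          # r: tokens generated per second
        self.capacity = capacity  # b: maximum number of tokens in the bucket
        self.tokens = capacity    # start with a full bucket
        self.last = 0.0           # time of the previous update, in seconds

    def allow(self, now: float) -> bool:
        # Add the tokens generated since the last call, capped at b.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # non-conforming packet: drop, delay or mark it
```

With r = 2 and b = 3, a burst of five packets at time 0 has three packets admitted, and one more packet is admitted half a second later, which matches the bound r t + b = 4 for t = 0.5.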


Integrated Services

An architecture for providing QOS guarantees in IP networks for individual application sessions

Relies on resource reservation, and routers need to:

Maintain state info (as in a Virtual Circuit), keeping records of allocated resources

Respond to new call setup requests on that basis




Call Admission

A session must first declare its QOS requirement and characterize the traffic it will send through the network

R-spec: defines the QOS being requested

T-spec: defines the traffic characteristics

A signaling protocol is needed to carry the R-spec and T-spec to the routers where reservation is required; RSVP is a leading candidate for such a signaling protocol


Call Admission

Call Admission: routers will admit calls based on their R-spec and T-spec and on the resources currently allocated at the routers to other calls.
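The admission decision can be sketched in a heavily simplified form. The assumption here is that the R-spec and T-spec collapse into a single requested rate in kbps, and that the router tracks only reserved bandwidth; the class name `Router` and its fields are hypothetical.

```python
class Router:
    """Tracks bandwidth reserved by admitted calls on one outgoing link."""

    def __init__(self, capacity_kbps: float) -> None:
        self.capacity_kbps = capacity_kbps
        self.reserved_kbps = 0.0

    def admit(self, requested_kbps: float) -> bool:
        # Block the call if the request does not fit in the remaining capacity.
        if self.reserved_kbps + requested_kbps > self.capacity_kbps:
            return False
        self.reserved_kbps += requested_kbps
        return True
```

On a 1.5 Mbps link, a 1 Mbps call is admitted, a following 600 kbps call is blocked, and a 500 kbps call still fits.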




Integrated Services: Classes

Guaranteed QOS: this class is provided with firm bounds on queuing delay at a router; envisioned for hard real-time applications that are highly sensitive to end-to-end delay expectation and variance

Controlled Load: this class is provided a QOS closely approximating that provided by an unloaded router; envisioned for today's IP network real-time applications, which perform well in an unloaded network


Differentiated Services

Intended to address the following difficulties with Intserv and RSVP:

Scalability: maintaining state in routers in high-speed networks is difficult due to the very large number of flows

Flexible Service Models: Intserv has only two classes; we want to provide more qualitative service classes and relative service distinction (Platinum, Gold, Silver, ...)

Simpler signaling (than RSVP): many applications and users may only want to specify a more qualitative notion of service




Differentiated Services

Approach:

Only simple functions in the core, and relatively complex functions at edge routers (or hosts)

Do not define service classes; instead, provide functional components with which service classes can be built


Edge Functions

At the DS-capable host or the first DS-capable router

Classification: the edge node marks packets according to classification rules to be specified (manually by an admin, or by some TBD protocol)

Traffic Conditioning: the edge node may delay and then forward, or may discard




Core Functions

Forwarding: according to the Per-Hop Behavior (PHB) specified for the particular packet class; the PHB is strictly based on class marking (no other header fields can be used to influence the PHB)

BIG ADVANTAGE:

No state info to be maintained by routers!


Classification and Conditioning

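The packet is marked in the Type of Service (TOS) field in IPv4, and in the Traffic Class field in IPv6

6 bits are used for the Differentiated Services Code Point (DSCP) and determine the PHB that the packet will receive

The remaining 2 bits were left unused by DiffServ; they have since been assigned to Explicit Congestion Notification (ECN, RFC 3168)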
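The split of the TOS / Traffic Class byte into a 6-bit DSCP and 2 remaining bits can be sketched with simple bit operations. The function names are hypothetical; the only assumption about the layout is that the DSCP occupies the six most significant bits.

```python
from typing import Tuple


def split_tos(tos_byte: int) -> Tuple[int, int]:
    """Return (dscp, low2) from the 8-bit TOS / Traffic Class value."""
    dscp = (tos_byte >> 2) & 0x3F  # upper 6 bits
    low2 = tos_byte & 0x03         # lower 2 bits
    return dscp, low2


def make_tos(dscp: int, low2: int = 0) -> int:
    """Build the 8-bit field from a DSCP value and the 2 low-order bits."""
    return ((dscp & 0x3F) << 2) | (low2 & 0x03)
```

For instance, the commonly used Expedited Forwarding code point 46 corresponds to a TOS byte of 0xB8 when the two low-order bits are zero.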




Classification and Conditioning

It may be desirable to limit the traffic injection rate of some class; the user declares a traffic profile (e.g., rate and burst size); traffic is metered and shaped if non-conforming


Forwarding (PHB)

A PHB results in a different observable (measurable) forwarding performance behavior

A PHB does not specify what mechanisms to use to ensure the required PHB performance behavior

Examples:

Class A gets x% of outgoing link bandwidth over time intervals of a specified length

Class A packets leave before packets from class B




Forwarding (PHB)

PHBs under consideration:

Expedited Forwarding (EF): the departure rate of packets from a class equals or exceeds a specified rate (a logical link with a minimum guaranteed rate)

Assured Forwarding (AF): 4 classes, each guaranteed a minimum amount of bandwidth and buffering; each with three drop preference partitions


Differentiated Services Issues

AF and EF were not yet on the standards track when this material was first prepared; they have since been specified in RFC 2597 (AF) and RFC 3246 (EF)

Virtual Leased Lines and Olympic services are being discussed

Impact of crossing multiple ASs and routers that are not DS-capable




Key points of Chapter 1

RTP & RTCP

Media transmission and reception over network

Translator & Mixer

RTP Session

RTP Stream

RTP Packet format

SSRC & CSRC

RTCP packet format

SR/RR/SDES/BYE/APP

QoS:

Scheduling

WFQ

Policing:

Token Bucket

Packet Classifications

DSCP/TOS

Call Admission

T-Spec/R-Spec

Int-Serv RSVP

Diff-Serv Forwarding PHB

AF/EF

DSCP

Int-Serv vs. Diff-Serv


Technical terms

RTCP

SSRC

CSRC

SDES

RSVP

DiffServ

IntServ

WFQ = Weighted Fair Queuing

PHB = Per-Hop Behavior

DSCP = Differentiated Services Code Point

TOS = Type of Service




Chapter 2: Multimedia applications - Voice and Video over IP

Roadmap

Streaming applications

VoIP architecture

VoIP issues

SIP protocol

SIP applications


Streaming applications

Important and growing application due to:

Reduction of storage costs,

Broadband Internet access

Enhancements of caching

More QoS support in IP-based networks

Applications:

IPTV, VoD (Video on Demand)

Internet Radio

The audio/video file is segmented and sent over either TCP or UDP; public segmentation protocol: Real-time Transport Protocol (RTP)




Streaming Stored Audio & Video

Streaming stored media:

Audio/video file is stored in a server

Users request audio/video file on demand.

Audio/video is rendered within, say, 10 s after request.

Interactivity (pause, re-positioning, etc.) is allowed.

Media player:

removes jitter

decompresses

error correction

graphical user interface with controls for interactivity

Plug-ins may be used to embed the media player into the browser window.


Streaming

User interactive control is provided, e.g. via the public protocol Real Time Streaming Protocol (RTSP)

Helper Application: displays content, which is typically requested via a Web browser; e.g. RealPlayer; typical functions:

Decompression

Jitter removal

Error correction: use redundant packets for reconstruction of the original stream

GUI for user control




Streaming From Web Servers

Audio: in files sent as HTTP objects

Video (interleaved audio and images in one file, or two separate files with the client synchronizing the display) sent as HTTP object(s)

A simple architecture is to have the browser request the object(s) and, after their reception, pass them to the player for display

- No pipelining


Streaming From Web Server (2)

Alternative: set up a connection between server and player, then download

The Web browser requests and receives a Meta File (a file describing the object) instead of receiving the file itself;

The browser launches the appropriate Player and passes it the Meta File;

The Player sets up a TCP connection with the Web Server and downloads the file




Meta file requests


Using a Streaming Server

This gets us around HTTP and allows a choice of UDP vs. TCP, and the application layer protocol can be better tailored to streaming; many enhancement options are possible (see next slide)




Options When Using a Streaming Server

Use UDP, and the server sends at a rate (compression and transmission) appropriate for the client; to reduce jitter, the Player buffers initially for 2-5 seconds, then starts display

Use TCP, and the sender sends at the maximum possible rate under TCP; retransmit when an error is encountered; the Player uses a much larger buffer to smooth the delivery rate of TCP


Real Time Streaming Protocol (RTSP)

For the user to control display: rewind, fast forward, pause, resume, etc.

Out-of-band protocol (uses two connections, one for control messages (Port 554) and one for the media stream)

RFC 2326 permits use of either TCP or UDP for the control messages connection, sometimes called the RTSP Channel

As before, the meta file is communicated to the web browser, which then launches the Player; the Player sets up an RTSP connection for control messages in addition to the connection for the streaming media




Meta File Example

Twister


SDP: Session Description Protocol

RFC 4566




RTSP Operation


RTSP Exchange Example

C: SETUP rtsp://audio.example.com/twister/audio RTSP/1.0
   Transport: rtp/udp; compression; port=3056; mode=PLAY

S: RTSP/1.0 200 1 OK
   Session 4231

C: PLAY rtsp://audio.example.com/twister/audio.en/lofi RTSP/1.0
   Session: 4231
   Range: npt=0-

C: PAUSE rtsp://audio.example.com/twister/audio.en/lofi RTSP/1.0
   Session: 4231
   Range: npt=37

C: TEARDOWN rtsp://audio.example.com/twister/audio.en/lofi RTSP/1.0
   Session: 4231

S: 200 3 OK
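The client requests in this exchange are plain text: a request line followed by header lines. A minimal sketch of how such a request could be assembled is shown below. The helper name `build_rtsp_request` is hypothetical, and the sketch adds a CSeq header, which RFC 2326 requires on every request even though the simplified exchange above leaves it out.

```python
def build_rtsp_request(method: str, url: str, cseq: int, headers=None) -> str:
    """Format an RTSP/1.0 request as text with CRLF line endings."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # An empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"
```

A PLAY request for the session above could then be built with `build_rtsp_request("PLAY", url, 2, {"Session": "4231", "Range": "npt=0-"})` and written to the control connection.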




Real-World Streaming Applications

What material can be streamed?

Video/Audio (File or Live source)

Presentation slides

Combinations

Real (most OSes)

Server: Helix Universal Server + Helix Producer

Client: Real Player

Microsoft (Windows only)

Server: Windows Media Server/Encoder

Client: Windows Media Player

Apple (MacOS/Win32/some UNIXes)

Server: Darwin Server

Client: QuickTime

Adobe Flash

Server: Flash Communication Server

Client: Flash plug-in in most Web browsers

Hands-on exercises: Set up your own streaming server to serve movies, music and real-time TV programs for your friends on an ADSL access network


Technology Demo

Darwin streaming server

SDP

Playlist

Ethereal capture

Client QuickTime/ Realplayer (Mobile Phone)





Internet Telephony: VoIP & VideoIP


Internet telephony




VoIP architecture


Protocols needed




Protocol







Session Initiation Protocol (SIP)




What is SIP?

Session Initiation Protocol: an application-layer signaling protocol that defines initiation, modification and termination of interactive, multimedia communication sessions between users.

IETF RFC 2543 Session Initiation Protocol (since obsoleted by RFC 3261)


SIP end-devices




SIP Framework

Session initiation.

Multiple users.

Interactive multimedia applications.


SIP Distributed Architecture

SIP Components (figure): User Agent, Proxy Server, Redirect Server, Location Server, Registrar Server, Gateway, PSTN




Proxy Server

An intermediary program that acts as both a server and a client to make requests on behalf of other clients.

Requests are serviced internally or by passing them on, possibly after translation, to other servers.

Interprets, rewrites or translates a request message before forwarding it.


Location Server

A location server is used by a SIP redirect or proxy server to obtain information about a called party's possible location(s).




Redirect Server

A server that accepts a SIP request, maps the address into zero or more new addresses and returns these addresses to the client.

Unlike a proxy server, the redirect server does not initiate its own SIP request.

Unlike a user agent server, the redirect server does not accept or terminate calls.


Registrar Server

A server that accepts REGISTER requests.

The registrar server may support authentication.

A registrar server is typically co-located with a proxy or redirect server and may offer location services.




Structure of SIP

Request line: Method, Request-URI, SIP version

* Uses the ISO 10646 character set in UTF-8 encoding

Status line: SIP version, Status code, Reason-phrase

* The format of the Response message header


SIP Messages Methods and Responses

SIP components communicate by exchanging SIP messages:

SIP Methods:

INVITE - Initiates a call by inviting a user to participate in a session.

ACK - Confirms that the client has received a final response to an INVITE request.

BYE - Indicates termination of the call.

CANCEL - Cancels a pending request.

REGISTER - Registers the user agent.

OPTIONS - Used to query the capabilities of a server.

INFO - Used to carry out-of-band information, such as DTMF digits.

SIP Responses:

1xx - Informational Messages.

2xx - Successful Responses.

3xx - Redirection Responses.

4xx - Request Failure Responses.

5xx - Server Failure Responses.

6xx - Global Failure Responses.
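The response classes are determined by the first digit of the status code, which a small lookup can illustrate. This is a sketch only; the function name `response_class` is hypothetical.

```python
SIP_RESPONSE_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Failure",
    6: "Global Failure",
}


def response_class(status_code: int) -> str:
    """Map a SIP status code (100-699) to its response class."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a valid SIP status code: {status_code}")
    return SIP_RESPONSE_CLASSES[status_code // 100]
```

For example, 180 (Ringing) is informational, 200 (OK) is successful and 486 (Busy Here) is a request failure.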




SIP Headers

SIP borrows much of its syntax and semantics from HTTP.

A SIP message looks like an HTTP message: message formatting, headers and MIME support.

An example SIP header:

INVITE sip:[email protected] SIP/2.0

Via: SIP/2.0/UDP 192.168.6.21:5060

From: sip:[email protected]

To:

Call-ID: [email protected]

CSeq: 100 INVITE

Expires: 180

User-Agent: Cisco IP Phone/ Rev. 1/ SIP enabled

Accept: application/sdp

Contact: sip:[email protected]:5060

Content-Type: application/sdp
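Because the message is HTTP-like text, the request line and headers can be split with simple string operations. The sketch below is illustrative only: the function name `parse_sip_request` is hypothetical, it ignores multi-line and repeated headers, and the address used in the usage note is a made-up placeholder because the addresses in the example above are obscured.

```python
def parse_sip_request(message: str):
    """Return (method, request_uri, version, headers) from a SIP request."""
    lines = message.replace("\r\n", "\n").split("\n")
    # Request line: Method, Request-URI, SIP version separated by spaces.
    method, request_uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # an empty line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, request_uri, version, headers
```

Parsing a request that starts with `INVITE sip:bob@example.com SIP/2.0` and carries `CSeq: 100 INVITE` yields the method `INVITE` and a header dictionary in which `CSeq` maps to `100 INVITE`.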


SIP Addressing

The SIP address is identified by a SIP URL, in the format: user@host.

Examples of SIP URLs:

sip:[email protected]






Process for Establishing Communication

Establishing communication using SIP usually occurs in 6 steps:

1. Registering, initiating and locating the user.

2. Determining the media to use: involves delivering a description of the session that the user is invited to.

3. Determining the willingness of the called party to communicate: the called party must send a response message to indicate willingness to communicate (accept or reject).

4. Call setup.

5. Call modification or handling, for example call transfer (optional).

6. Call termination.


Registration

Each time a user turns on the SIP user client (SIP IP phone, PC, or other SIP device), the client registers with the proxy/registration server.

Registration can also occur when the SIP user client needs to inform the proxy/registration server of its location.

The registration information is periodically refreshed, and each user client must re-register with the proxy/registration server.

Typically the proxy/registration server will forward this information to be saved in the location/redirect server.

SIP Messages:

REGISTER - Registers the address listed in the To header field.

200 - OK.

(Figure: the SIP phone user sends REGISTER to the proxy/registration server, which forwards REGISTER to the location/redirect server; each request is answered with 200.)




Simplified SIP Call Setup and Teardown


SIP Design Framework

SIP was designed for:

Integration with existing IETF protocols.

Scalability and simplicity.

Mobility.

Easy feature and service creation.





Call flow SIP to PSTN




Technology Demo

Asterisk VoIP system Linux

SIP config

SIP calls

Asterisk CLI

Client

X-Lite

Mobile Phone, SIP Client Nokia


VoIP issues

Internet phone applications generate packets during talk spurts

Bit rate is 8 KBytes per second (64 kbps), and every 20 msec the sender forms a packet of 160 Bytes + a header

The coded voice information is encapsulated into a UDP packet and sent out

Some packets may be lost; up to 20% loss is tolerable

Using TCP eliminates loss but causes variance in delay

FEC is sometimes used to fix errors and make up for losses
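The packet size on this slide follows from the bit rate and the packet interval: 8000 bytes/s x 0.020 s = 160 bytes. The sketch below reproduces that arithmetic and also estimates the rate at the IP layer. The slide only says "a header", so the header sizes used here (RTP over UDP over IPv4, without options or CSRC entries) are an assumption.

```python
RTP_HEADER = 12   # bytes, fixed RTP header without CSRC entries
UDP_HEADER = 8    # bytes
IPV4_HEADER = 20  # bytes, without options


def payload_bytes(rate_bytes_per_sec: int, interval_ms: int) -> int:
    """Voice bytes collected during one packet interval."""
    return rate_bytes_per_sec * interval_ms // 1000


def ip_rate_bps(rate_bytes_per_sec: int, interval_ms: int) -> float:
    """Bit rate at the IP layer, including RTP/UDP/IPv4 headers."""
    packet = (payload_bytes(rate_bytes_per_sec, interval_ms)
              + RTP_HEADER + UDP_HEADER + IPV4_HEADER)
    packets_per_sec = 1000 / interval_ms
    return packet * 8 * packets_per_sec
```

Under these assumptions, each 160-byte payload becomes a 200-byte IP packet, and 50 packets per second give 80 kbps on the wire for a 64 kbps voice stream.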




Real-Time (Phone) Over IP's Best-Effort

End-to-end delays above 400 msec cannot be tolerated; packets delayed by more than that are ignored at the receiver

Delay jitter is handled by using

timestamps,

sequence numbers,

delaying playout at receivers by either a fixed or a variable amount

With fixed playout delay, the delay should be as small as possible without missing too many packets; the delay cannot exceed 400 msec


Internet Phone with Fixed Playout Delay




Adaptive Playout Delay

The objective is to use a value for p-r that tracks the network delay performance as it varies during a phone call

The playout delay is computed for each talk spurt based on the observed average delay and the observed deviation from this average delay

The estimated average delay and the deviation from the average delay are computed in a manner similar to the estimates of RTT and deviation in TCP

The beginning of a talk spurt is identified by examining the timestamps and/or sequence numbers of successive chunks
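The TCP-like estimation can be sketched with exponentially weighted moving averages. This is one common formulation and is offered as an assumption, not as the exact method intended by the slide; the smoothing factor `u`, the safety factor `k` and the class name `AdaptivePlayout` are all illustrative choices.

```python
from typing import Optional


class AdaptivePlayout:
    """Track average network delay and its deviation, as TCP does for RTT."""

    def __init__(self, u: float = 0.01, k: float = 4.0) -> None:
        self.u = u                      # smoothing factor (assumed value)
        self.k = k                      # safety multiplier for the deviation
        self.avg_delay: Optional[float] = None
        self.deviation = 0.0

    def update(self, sent_ts: float, recv_ts: float) -> None:
        """Feed the sender timestamp and receive time of one packet."""
        delay = recv_ts - sent_ts
        if self.avg_delay is None:
            self.avg_delay = delay      # first sample initializes the average
            return
        self.avg_delay = (1 - self.u) * self.avg_delay + self.u * delay
        self.deviation = ((1 - self.u) * self.deviation
                          + self.u * abs(delay - self.avg_delay))

    def playout_time(self, sent_ts: float) -> float:
        """Playout time for the first packet of a new talk spurt."""
        if self.avg_delay is None:
            raise ValueError("no delay samples observed yet")
        return sent_ts + self.avg_delay + self.k * self.deviation
```

The estimates are updated on every packet, but the resulting playout delay would only be applied at the start of a talk spurt, so that packets within a spurt keep their spacing.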


Techniques to deal with packet Loss

Loss is in a broader sense: packet never arrives or arrives later than its scheduled playout time

Since retransmission is inappropriate for real-time applications, FEC (Forward Error Correction) or interleaving is used to reduce the impact of loss.

The simplest FEC scheme adds a redundant chunk made up of the exclusive OR of a group of n chunks; redundancy is 1/n; can reconstruct if at most one chunk in the group is lost; the playout time schedule assumes a loss per group
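The XOR scheme can be sketched in a few lines. The sketch assumes that all chunks in a group have the same length and that a lost chunk is represented by `None`; the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive OR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(chunks):
    """Redundant chunk: XOR of all n chunks in the group."""
    return reduce(xor_bytes, chunks)


def recover(received, parity):
    """Rebuild the single missing chunk (marked None) from the others."""
    missing = [i for i, chunk in enumerate(received) if chunk is None]
    if len(missing) != 1:
        raise ValueError("can only recover exactly one lost chunk per group")
    present = [chunk for chunk in received if chunk is not None]
    # XOR of the parity with all surviving chunks leaves the lost chunk.
    return reduce(xor_bytes, present, parity)
```

The receiver has to wait for the whole group before it can repair a loss, which is why the playout schedule must allow for one loss per group.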




Techniques to deal with packet Loss (2)

Mixed quality streams are used to include redundant duplicates of chunks; upon loss, play out the available redundant chunk, albeit a lower quality one

With one redundant chunk per chunk, can recover from single losses


Piggybacking Lower Quality Stream




Interleaving

Has no redundancy, but can cause delay in playout beyond real-time requirements

Divide 20 msec of audio data into smaller units of 5 msec each and interleave

Upon loss, have a set of partially filled chunks
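Interleaving can be sketched as follows, assuming 16 units of 5 msec that would normally form four 20 msec chunks. Each transmitted packet carries every fourth unit, so losing one packet removes only one unit from each original chunk. The function names and the use of `None` for lost data are assumptions for the example.

```python
def interleave(units, depth):
    """Packet j carries units j, j + depth, j + 2 * depth, ..."""
    return [units[j::depth] for j in range(depth)]


def deinterleave(packets, depth, units_per_packet):
    """Restore the original order; units of a lost packet (None) stay None."""
    units = [None] * (depth * units_per_packet)
    for j, packet in enumerate(packets):
        if packet is None:
            continue  # lost packet: leaves small gaps spread across chunks
        for k, unit in enumerate(packet):
            units[j + k * depth] = unit
    return units
```

After losing the second packet, the first restored chunk is `[0, None, 2, 3]`: a 5 msec gap that is easier to conceal than a missing 20 msec chunk.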
