
Error Performance Aspects of IP Multicast over Satellite

Michael Philip Howarth

Thesis submitted for the degree of Master of Science

at the University of Surrey

School of Electronics, Computing and Mathematics

August 2001

Abstract

The last few years have seen an explosive growth in the use of the Internet. In addition to email and browsing the world-wide web, it is now used for interactive videoconferencing, information distribution, audio and video streaming, and real time applications. Satellites have significant potential to support these applications at a large scale by using IP multicast, and satellites with on-board processing and switching are currently being developed to carry IP-based traffic. In this context the characteristics of IP multicast in networks that include satellite links are of significant interest.

This thesis begins by describing the principles of multicast routing protocols and reliable multicast protocols, and illustrates how the introduction of a satellite link into a network results in a number of aspects where the behaviour of multicast protocols differs from that in a terrestrial network.

Satellite communication links also have a number of disadvantages compared to terrestrial communications networks (including high round-trip delay times and significant error characteristics), and this thesis focuses on one of these areas, namely error performance, and considers error correction mechanisms in the context of multicast protocols. On a satellite link, errors tend to occur in bursts due to the Viterbi decoding normally employed in satellite modems. In the case of satellites that use ATM, this error characteristic is significantly different from the random single bit errors that occur in the fibre optic links for which ATM was originally designed. A theoretical model is developed in this thesis for the unicast and multicast error performance of IP over a satellite ATM link. The network simulation tool Opnet is then used to develop a multicast model which includes a burst error model of satellite link behaviour. Results obtained from the simulation of the error performance of the satellite link show excellent agreement with the theoretical model.

The thesis goes on to describe the development of a simulation of a reliable multicast protocol based on one named protocol, PGM, which is currently an Internet draft document. This protocol supports two modes of error correction: one (“Selective Nack”) is a continuous RQ selective repeat protocol, the other (“Parity Nack”) is a hybrid ARQ code that encodes repair data based on groups of packets. A file transfer application is also developed in Opnet as an example of multicast information distribution, and results are obtained for a typical multicast scenario. The behaviour of the two error correction modes (Selective Nack and Parity Nack) is compared. It is shown that the Parity Nack mode has superior performance in that it uses less network traffic on both the forward and reverse links, whereas the Selective Nack mode has the better performance in the sense that reduced jitter is observed in the delivery of ordered data to the application.

Copyright © 2001 Michael Philip Howarth

Acknowledgements

I would like to thank my supervisor Zhili Sun, for his support during my work on this thesis and his well-focused comments during our many conversations. I am also grateful to Haitham Cruickshank for a number of useful discussions.

I am pleased to acknowledge the support of an EPSRC Studentship.

Table of contents

1 Introduction
1.1 Context of the study
1.2 Objectives
1.3 Progress and achievements
1.4 Content of this thesis

2 Background: satellite and multicast technology
2.1 The potential of satellite communication systems
2.2 Satellite IP network architectures
2.3 Multicast: principles
2.4 IP multicast
2.5 IP multicast over ATM
2.6 Summary

3 Error behaviour of multicast protocols
3.1 Introduction
3.2 Error correction: general
3.3 Error correction in multicast protocols
3.4 Error correction in PGM

4 Error performance model
4.1 Introduction
4.2 Theoretical model
4.3 Analytical results

5 Simulation software development
5.1 Simulation tool: Opnet
5.2 Approach to simulation work
5.3 Simple Continuous RQ Go-Back-N protocol
5.4 Layered protocol model
5.5 Burst error model
5.6 Reliable multicast protocol
5.7 Summary

6 Results
6.1 Introduction
6.2 Simulation
6.3 Metrics
6.4 File transfer results

7 Conclusions and further work

References

Appendix A: Theory of Continuous RQ protocols

Appendix B: Unicast error performance simulation results

Appendix C: Software listings

List of abbreviations

AAL: ATM Adaptation Layer (Section 4.2.3)

ARQ: Automatic Retransmission reQuest (Section 3.2)

ATM: Asynchronous Transfer Mode (Section 2.5)

BER: Bit Error Rate

CBT: Core Based Tree (Section 2.4.2)

CER: Cell Error Ratio (Section 4.1)

CLR: Cell Loss Ratio (Section 4.1)

CMR: Cell Mis-insertion Ratio (Section 4.1)

CPCS: Common Part Convergence Sublayer (Section 4.2.3)

CS: Convergence Sublayer (Section 4.2.3)

DLR: Designated Local Repairer (Section 2.4.4)

DVB-S: Digital Video Broadcasting – Satellite (Section 2.2)

DVMRP: Distance Vector Multicast Routing Protocol (Section 2.4.2)

FEC: Forward Error Correction (Section 3.2)

HEC: Header Error Control (Section 4.1)

IGMP: Internet Group Management Protocol (Section 2.4.1)

IP: Internet Protocol

LAN: Local Area Network

LLC: Logical Link Control (Section 4.3)

MARS: Multicast Address Resolution Server (Section 2.5)

MFTP: Multicast File Transfer Protocol (Section 3.3)

MOSPF: Multicast Open Shortest Path First (Section 2.4.2)

MPLS: MultiProtocol Label Switching (Section 2.2)

MTU: Maximum Transmission Unit

NCF: Nak Confirmation (Section 2.4.4)

NE: Network Element (Section 2.4.4)

NORM: Nack-Oriented Reliable Multicast Protocol (Section 3.3)

ODATA: Original Data (Section 2.4.4)

OSPF: Open Shortest Path First (Section 2.4.2)

PDU: Protocol Data Unit (Section 4.2.3)

PGM: Pragmatic General Multicast (Section 2.4.4)

PIM-DM: Protocol Independent Multicast – Dense Mode (Section 2.4.2)

PIM-SM: Protocol Independent Multicast – Sparse Mode (Section 2.4.2)

PT: Payload Type (Section 4.2.3)

RDATA: Repair Data (Section 2.4.4)

RP: Rendezvous Point (Section 2.4.2)

RPF: Reverse Path Forwarding (Section 2.4.2)

RRMP: Restricted Reliable Multicast Protocol (Section 3.3)

SAR: Segmentation And Reassembly sublayer (Section 4.2.3)

SDU: Service Data Unit (Section 4.2.3)

SNAP: SubNetwork Access Protocol (Section 4.3)

SPM: Source Path Message (Section 5.6.4)

SSCS: Service-Specific Convergence Sublayer (Section 4.2.3)

TCP: Transmission Control Protocol

UDP: User Datagram Protocol

VC: Virtual Circuit or Virtual Channel

VCI: Virtual Channel Identifier (Section 5.4)

VPI: Virtual Path Identifier (Section 5.4)

List of principal symbols

$A_S$: Number of packets transmitted by application layer

$b$: Length of an error burst, units: bits

$b_{mean}$: Mean length of an error burst from a Viterbi decoder, units: bits

$k$: Number of blocks or packets of uncoded data

$N$: Number of ATM cells used to carry an IP datagram

$n$: Number of blocks or packets of encoded data

$p$: Bit error rate

$P_f$: Probability of loss of a transmitted packet

$P_{cell\_loss}$: Probability of loss of an ATM cell

$R$: Number of multicast receivers / sinks

$S$: Number of transmission slots used per packet delivered to an end application

1 Introduction

1.1 Context of the study

The last few years have seen an explosive growth in the use of the Internet. In addition to email and browsing the world-wide web, it is now used for interactive videoconferencing, information distribution, audio and video streaming, and real time applications. Satellites have significant potential to support these applications at a large scale by using IP (Internet Protocol) multicast, and satellites with on-board processing and switching are currently being developed to carry IP-based traffic. In this context, the characteristics of IP in networks that include satellite links are of significant interest, and the area of multicast IP is the particular focus of the work described in this thesis.

This work is associated with GEOCAST, an EU IST (Information Society Technologies) project in which the Centre for Communication Systems Research (CCSR) at the University of Surrey is participating. The objective of GEOCAST is to address the issues raised by the deployment of multicast services over existing and next-generation geostationary broadband satellites.

1.2 Objectives

A significant body of work has been developed that addresses the behaviour of TCP over satellite. However, little work has been conducted to date on IP multicast, and how this is affected by the conditions which exist on satellite links. Consequently, the overall objective of the work described in this thesis is to investigate IP multicast protocols over satellite networks, to identify key issues in their operation, and to develop, evaluate and compare techniques to resolve these issues. In line with the GEOCAST project, the work addresses geostationary earth orbit (GEO) satellite systems.

Multicast IP over satellite raises a number of issues which are discussed later in this thesis. In particular, the work described here focuses on the use of error correction techniques to improve the behaviour of reliable multicast protocols. The technical approach adopted to resolve these issues is to develop theoretical models of network protocol performance and to conduct computer simulations of networks that include satellite links.

1.3 Progress and achievements

The following work has been accomplished in this study:

• Review conducted in the areas of satellite architectures, IP multicast technologies and protocols, ATM, satellite link error performance and error correction algorithms;

• Analysis conducted of the impact of the satellite environment on multicast behaviour;

• Theoretical model developed of IP error performance over an ATM satellite link in the presence of burst errors. The results obtained have been accepted for publication [Howarth,01];

• Layered network model of IP, AAL and ATM developed using the simulation tool Opnet, which shows excellent agreement with the theoretical model of IP error performance;

• The Opnet model further developed to include a reliable multicast protocol and file transfer application;

• Results obtained and compared for two modes of error correction in the reliable multicast protocol.

1.4 Content of the thesis

Section 2 provides an introductory background describing the use of satellites and IP multicast. Section 3 describes work conducted in the area of error correction mechanisms for reliable multicast protocols, and describes one particular protocol, PGM. Section 4 presents a theoretical model of the error performance of IP over a satellite link.

Section 5 proceeds to describe the work conducted in developing and testing a simulation model in Opnet of IP multicast over a satellite link, and Section 6 presents results from the simulation. Conclusions and next steps are offered in Section 7.

2 Background: satellite and multicast technology

Next-generation satellite communication systems with on-board processing are currently under development. In parallel with this, the explosive growth in the terrestrial Internet for commercial and private use means that there is interest in the provision of Internet-based services over satellite. These are likely to include information distribution and multimedia services such as videoconferencing and streaming video and audio, delivered using both multicast and unicast technologies.

2.1 The potential of satellite communication systems

The potential advantages of satellite-based IP services have been well-rehearsed (for example, [Akylidiz,97]), and include the following:

• Satellites have a potentially global reach, including all geographical areas whether urban, rural, remote, sea-based or otherwise inaccessible by conventional terrestrial communications methods. This wide reach is particularly of benefit to countries with underdeveloped terrestrial communications infrastructures, where satellite services can be cheaper than the cost of installing an equivalent telecommunications network.

• Satellites are well-known for providing cost-effective broadcast facilities, and they have significant potential for multicast services, whether point-to-multipoint or multipoint-to-multipoint.

• New users can be added to a system relatively easily, with equipment installation only required at the customer’s premises.

• Satellites can provide alternative protection paths for existing network connections. However, as terrestrial link data rates increase (particularly for fibre optic links), satellites are not capable of supporting all the required traffic should a main circuit fail.

• Bandwidth can be assigned on demand.

Against these advantages must be considered the drawbacks of satellites. The principal disadvantages compared to terrestrial communications networks are as follows:

• High round-trip delay times, especially for satellites in geostationary orbits. For a geostationary satellite, the round trip delay to the satellite and back varies from 240 ms for a ground user at the equator to 280 ms at 80° latitude. This clearly has an adverse impact on two-way real-time communications (for example, telephone conversations or videoconferences), and also affects the behaviour of network protocols such as TCP.

• The significant error characteristics of satellite transmission channels: these have higher bit error rates than fibre optic links. The design of the channel coding typically used on satellites to maintain a low bit error rate means that errors tend to occur in bursts. As we will see later in this thesis, this affects the channel performance.

• Limited transmission power: large ground stations can transmit at high power, but the satellite is constrained by the power available from its solar cells. The power that can be transmitted from user ground stations may also be limited, either by lack of mains electricity or, in urban areas, by the need to avoid excessive microwave radiation levels. This limits the signal to noise ratio and hence the channel capacity.

• Limited data rates: the data rates (“bandwidth”) available on satellite links are lower than for terrestrial links.
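The geostationary delay figures quoted in the first bullet above can be checked with a short calculation. The following Python sketch assumes a spherical Earth of radius 6378 km, a geostationary orbit radius of 42164 km, propagation at the free-space speed of light, and a user at the same longitude as the satellite. These constants and the function name `geo_round_trip_ms` are illustrative assumptions and are not taken from the thesis.

```python
import math

# Assumed constants (not from the thesis): spherical Earth, ideal GEO orbit.
C_KM_PER_S = 299_792.458     # speed of light in free space
EARTH_RADIUS_KM = 6_378.0    # equatorial radius of the Earth
GEO_RADIUS_KM = 42_164.0     # geostationary orbit radius from the Earth's centre


def geo_round_trip_ms(latitude_deg: float) -> float:
    """Delay in ms for ground -> satellite -> ground, user on the satellite's meridian."""
    lat = math.radians(latitude_deg)
    # Law of cosines gives the slant range between the user and the satellite.
    slant_km = math.sqrt(
        GEO_RADIUS_KM ** 2
        + EARTH_RADIUS_KM ** 2
        - 2 * GEO_RADIUS_KM * EARTH_RADIUS_KM * math.cos(lat)
    )
    return 2 * slant_km / C_KM_PER_S * 1000.0


if __name__ == "__main__":
    for lat in (0, 80):
        print(f"latitude {lat:2d} deg: {geo_round_trip_ms(lat):.0f} ms")
```

Under these assumptions the result is roughly 239 ms at the equator and roughly 277 ms at 80° latitude, consistent with the rounded figures of 240 ms and 280 ms given above.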

2.2 Satellite IP network architectures

Two key technologies which enable IP networking over satellites are Digital Video Broadcasting - Satellite (DVB-S) and Asynchronous Transfer Mode (ATM). These two technologies are described by [Bem,00], together with descriptions of three satellite systems: Astra (geostationary satellite, IP transmitted on the forward link using DVB multi-protocol encapsulation, with a reverse channel of IP over ATM via the satellite); the Teledesic low earth orbit (LEO) constellation, which uses a proprietary packet switching technology similar to ATM and which is interfaced to standard networking protocols in the user terminals; and the Skybridge LEO constellation, which will support IP using ground-based ATM switches.

GEOCAST [Iyengar,01] will provide an end-to-end emulator of a satellite system, with two principal scenarios:

• The satellite acting as a transit provider in an ISP’s edge network, located between the Internet backbone and the ISP’s PoP (point of presence);

• The satellite providing connectivity between end users and ISPs, who are located in either the same or different spotbeams. This scenario will also consider direct connections between user earth stations via satellite.

An example architecture for an IP / ATM satellite system is described in detail by [Yegenoglu,00], where an ATM switch is located on board the satellite, with IP routing and IP/ATM address translation controlled by a ground based “satellite route server”.

The long lead times involved in the design and build of satellite systems mean that current satellite designs use slightly older technologies (e.g. ATM and DVB-S) rather than technologies such as MPLS (Multi-Protocol Label Switching). Consequently MPLS has not been considered further in this thesis.

Much work has been performed in studying, from a communications perspective, both geostationary satellites and low earth orbit (LEO) satellite systems. The latter have the advantages of significantly reduced round-trip delay times and lower transmission power (due to the reduced free space loss), but the satellites move relative to an earth-based user. An individual LEO satellite may only be visible to a user for 20 minutes, and therefore a constellation of multiple satellites must be used, with handover of a communications connection between satellites. The satellite system also needs to maintain inter-satellite links with dynamic routing between satellites. With current technologies it seems that geostationary systems are more favoured by industry, and in this thesis only geostationary satellites are considered.

A significant body of work has been developed that addresses the behaviour of TCP over satellite (for example, [Henderson,99; RFC2760]). The principal concerns in the case of TCP are that the high satellite round trip delay affects flow control and throughput, and that satellite link transmission errors are interpreted by TCP as network congestion, thus also affecting throughput. However, in this thesis we will consider the behaviour of IP multicast, and how it is affected by the conditions which exist on satellite links.

2.3 Multicast: principles

Multicast allows a communications network source to send data to multiple destinations simultaneously whilst transmitting only a single copy of the data on to the network. The network then replicates packets and fans them out to recipients as necessary. Multicast can be considered as part of a spectrum of three types of communications:

• Unicast: transmitting data from a single source to a single destination (for example, downloading a web page from a server to a user’s browser; copying a file from one server to another);

• Multicast: transmitting data from a single source to multiple destinations. An example of this is streaming audio from a server to multiple workstations. The definition also encompasses communications where there may be more than one source: videoconferences provide an example of this, where each participant can be regarded as a single source multicasting to the other participants in the videoconference.

• Broadcast: transmitting data from a single source to all receivers within a domain (for example within a LAN; or from a satellite to all receivers within the satellite beam).

The advantages of multicast are as follows:

• Reduced network bandwidth: for example, if data packets are being multicast to 100 recipients the source only sends a single copy of each packet. The network forwards this to the destinations, only making multiple copies of the packet when it needs to send packets on different network links to reach all destinations. Thus only a single copy of each packet is transmitted over any link in the network, and the total network load is reduced compared to 100 separate unicast connections.

• Reduced source host processing: the source host does not need to maintain state information about the communications link to each individual recipient.
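The bandwidth saving described in the first bullet above can be illustrated by counting link transmissions per packet. The sketch below is a toy example: the topology, the node names and the function `link_loads` are hypothetical and chosen only for illustration. With unicast, each receiver's path is traversed separately; with multicast, each link in the union of the paths carries the packet once.

```python
def link_loads(paths):
    """Return (unicast, multicast) link transmissions needed to deliver one packet.

    paths: one hop sequence per receiver, each starting at the source.
    """
    # Unicast: every hop of every path carries its own copy of the packet.
    unicast = sum(len(path) - 1 for path in paths)
    # Multicast: each distinct link carries exactly one copy.
    distinct_links = {(a, b) for path in paths for a, b in zip(path, path[1:])}
    return unicast, len(distinct_links)


# Hypothetical topology: source S, routers R1 to R3, receivers A, B, C.
example_paths = [
    ["S", "R1", "R2", "A"],
    ["S", "R1", "R2", "B"],
    ["S", "R1", "R3", "C"],
]

if __name__ == "__main__":
    unicast, multicast = link_loads(example_paths)
    print(f"unicast: {unicast} link transmissions, multicast: {multicast}")
```

In this small example three unicast connections need 9 link transmissions per packet, whereas multicast needs 6, and the saving grows with the number of receivers that share links near the source.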

Multicast can be either best-effort or reliable. “Best effort” means that there is no guarantee that the data sent by any multicast source is received by all or any receivers, and is usually implemented by a source transmitting UDP packets on a multicast address (the addressing mechanism is described in further detail in Section 2.4 below). “Reliable” means that mechanisms are implemented to ensure that all receivers of a multicast transmission receive all the data that is sent by a source: this requires a multicast protocol.

2.4 IP multicast

IP Version 4 implements three different types of address, corresponding to the three types of communication described in Section 2.3. Unicast addresses (Classes A, B and C, Figure 2.1) are used to transmit an IP datagram to a single destination; they are split into the network and host ids. Broadcast addresses are used to transmit a datagram to a subnetwork, and a host address of all ones is reserved for broadcast in Classes A, B and C. Multicast employs a Class D destination address (Figure 2.1).

IP Version 4 address formats (bit positions 0 to 31):

Class A address: leading bit 0, followed by network and host fields. Range: 0.0.0.0 – 127.255.255.255. Unicast and Broadcast.

Class B address: leading bits 1 0, followed by network and host fields. Range: 128.0.0.0 – 191.255.255.255. Unicast and Broadcast.

Class C address: leading bits 1 1 0, followed by network and host fields. Range: 192.0.0.0 – 223.255.255.255. Unicast and Broadcast.

Class D address: leading bits 1 1 1 0, followed by the multicast group. Range: 224.0.0.0 – 239.255.255.255. Used for IP Multicast.

Class E address: leading bits 1 1 1 1 0. Range: 240.0.0.0 – 255.255.255.255. Reserved for future use.

Figure 2.1: IP multicast addresses
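The address ranges in Figure 2.1 translate directly into a test on the first octet of an address. The Python sketch below is illustrative only: the function name `ipv4_class` is an assumption, and the standard library `ipaddress` module is used just to parse the dotted-decimal string and to cross-check the Class D result.

```python
import ipaddress


def ipv4_class(address: str) -> str:
    """Return the classful address class (A to E) of a dotted-decimal IPv4 address."""
    # The most significant octet determines the class.
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return "A"   # 0.0.0.0   to 127.255.255.255
    if first_octet < 192:
        return "B"   # 128.0.0.0 to 191.255.255.255
    if first_octet < 224:
        return "C"   # 192.0.0.0 to 223.255.255.255
    if first_octet < 240:
        return "D"   # 224.0.0.0 to 239.255.255.255 (multicast)
    return "E"       # 240.0.0.0 to 255.255.255.255 (reserved)


if __name__ == "__main__":
    for addr in ("10.1.2.3", "172.16.0.1", "192.168.1.1", "224.0.0.1", "240.0.0.1"):
        is_mcast = ipaddress.IPv4Address(addr).is_multicast
        print(f"{addr}: class {ipv4_class(addr)}, multicast={is_mcast}")
```

Only addresses that fall in Class D are reported as multicast by the library check, which matches the range given in the figure.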

2.4.1 IGMP

The Internet Group Management Protocol (IGMP, [RFC2236]) allows hosts to declare an interest in receiving a multicast transmission. IGMP supports three main types of message: Report, Query and Leave.

A host wishing to receive a multicast transmission issues a join Report, which is received by the nearest router. This Report specifies the IP multicast class D address of the group being joined. The router then uses a multicast routing protocol (described below) to determine a path to the source. To confirm the state of multicast hosts, a router occasionally issues an IGMP Query to hosts on its network. When a host receives such a query, it sets a separate timer for each of its (potentially many) group memberships. When each timer expires, the host issues an IGMP Report to confirm that it still wishes to receive the multicast transmission (see footnote 1). When a host wishes to finish receiving the multicast transmission it issues a Leave request using IGMP (see footnote 2).

Footnote 1: However, in order to suppress duplicate reports for the same Class D group address, if the host has already heard a report for that group from another host it stops its timer and does not send a Report.

Footnote 2: The Leave message is supported in IGMP Version 2. In Version 1, a host quietly changes state to non-member, and no message is sent to the router.

IGMP behaviour in a satellite environment

In a conventional LAN environment, an IGMP Report is heard by other multicast receivers, and this prevents flooding of the LAN with multiple reports. In a satellite system, individual ground stations cannot hear each other, and therefore either the satellite must retransmit the IGMP message or multiple reports will be transmitted by receivers. For a satellite with an on-board ATM switch which wishes to retransmit the IGMP messages, separate point-to-multipoint virtual circuits (VCs) need to be established, sourced at each ground station within a satellite spotbeam.
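The effect of report suppression, and of its absence on a satellite link, can be sketched with a simple timer model. The Python sketch below is a simplification under stated assumptions: every host belongs to one group, each host draws a uniformly distributed response timer after a Query, and a Report sent at time t is heard by every other host at time t plus `hear_delay_s`. The function name `reports_sent`, the 10 second maximum response time and the 0.25 second rebroadcast delay are illustrative assumptions and are not results from this thesis.

```python
import math
import random


def reports_sent(num_hosts, hear_delay_s, max_response_s=10.0, seed=1):
    """Count the IGMP Reports sent in response to one Query for one group.

    hear_delay_s: time for a Report to reach the other hosts
                  (0 on a LAN, math.inf if hosts cannot hear each other).
    """
    rng = random.Random(seed)
    timers = [rng.uniform(0.0, max_response_s) for _ in range(num_hosts)]
    first = min(timers)  # the earliest timer always results in a Report
    # A host is suppressed only if the first Report is heard before its own timer expires.
    return sum(1 for t in timers if t <= first + hear_delay_s)


if __name__ == "__main__":
    print("LAN, reports heard immediately:", reports_sent(50, 0.0))
    print("Satellite, rebroadcast after 0.25 s:", reports_sent(50, 0.25))
    print("Satellite, no rebroadcast:", reports_sent(50, math.inf))
```

In this model a LAN produces a single Report, a satellite that does not rebroadcast produces one Report per host, and a satellite that rebroadcasts falls in between, because some timers expire before the rebroadcast Report arrives.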

2.4.2 Multicast routing protocols

Multicast routing protocols address the issue of identifying a route for data to be transmitted across a network from a source to multiple destinations, while minimising the network resources required to do this.

A number of multicast routing protocols have been developed [Sahasrab,00; Crowcroft], and the following protocols are briefly described here:

• DVMRP [RFC1075] and PIM-DM [Deering,96]: these are “flood and prune” algorithms. When a source starts sending data, the protocols flood the network with the data. All routers that have no multicast recipients attached send a prune message back towards the source (they know they have no receivers because they have received no IGMP join Reports). These protocols have the disadvantage that a “prune” state is required in all routers (i.e. “I have pruned on this multicast address”), including those routers with no multicast recipients downstream.

Flood and prune protocols use Reverse Path Forwarding (RPF) to forward multicast packets from a source to the recipients: the RPF interface for any packet is the interface that the router would use to send unicast packets to the packet source. If a packet arrives on the RPF interface it is flooded to all other interfaces, but if the packet arrives on any other interface it is silently discarded. This ensures efficient flooding and prevents packet looping.

DVMRP uses its own routing table to compute the best path to the source, whereas PIM-DM uses an underlying unicast routing protocol.

• CBT [RFC2189]: this protocol uses a “core”, through which all multicast messages pass. This core can be considered to be the root of a tree. When a receiver joins a group, the router looks up the address of the core, and issues a Join message to the next router in the direction of the core. As this join ripples along the routers, a bi-directional forwarding state is set up, and an acknowledgement is sent to the previous router. When a sender transmits, routers forward the data to the core or until the data hits a router on the multicast tree. The tree then propagates the data both out to its downstream leaves, and back up to the core.

• PIM-SM [RFC2362; Deering,96]: this protocol relies on a Rendezvous Point (RP), a single node through which all multicast messages pass. Local routers know about the locations of RPs for each given multicast group. When a receiver joins a group, the router unicasts a packet to the RP. Each router that the packet passes through on the way sets up a unidirectional shared tree. Thereafter, data only flows from the RP out to the leaves of the tree. When a sender starts to transmit, its IP packets are tunnelled (by its local router) and unicast to the RP, and from there the data is multicast to all receivers. Clearly the RP can become a point of congestion; also, the need for data to pass through the RP means that traffic is not necessarily taking the shortest route from source to any destination.

• MOSPF [RFC1584]: this is an extension to the OSPF unicast routing protocol. Each router notes the multicast groups for which it has direct attached receivers. When routers flood their link state information (to enable a conventional OSPF Dijkstra calculation) this multicast information is also passed on. Each router can then build a multicast forwarding table for each Class D address. In order to reduce the computational load, the building of these tables is only performed the first time data is sent from a source to a multicast group (data driven).
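The Reverse Path Forwarding rule used by the flood and prune protocols above can be written as a short forwarding decision. The Python sketch below is a minimal illustration, not an implementation of DVMRP or PIM-DM: the function name `rpf_forward`, the interface names and the example routing table are hypothetical.

```python
def rpf_forward(unicast_iface_for, interfaces, source, arrival_iface):
    """Return the interfaces on which to flood a multicast packet ([] means discard).

    unicast_iface_for: maps a source address to the interface this router
                       would use to send unicast packets to that source.
    """
    rpf_iface = unicast_iface_for[source]
    if arrival_iface != rpf_iface:
        # Packet did not arrive on the RPF interface: silently discard it.
        return []
    # Packet arrived on the RPF interface: flood to all other interfaces.
    return [iface for iface in interfaces if iface != arrival_iface]


# Hypothetical router with three interfaces; the source is reached via if0.
routes = {"10.0.0.1": "if0"}
ifaces = ["if0", "if1", "if2"]

if __name__ == "__main__":
    print(rpf_forward(routes, ifaces, "10.0.0.1", "if0"))  # flooded
    print(rpf_forward(routes, ifaces, "10.0.0.1", "if1"))  # discarded
```

Note that under this rule the arrival interface never appears in the list of outgoing interfaces, so a packet is never sent back out of the interface on which it was received.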

Multicast routing protocols in a satellite environment

Consider a flood and prune algorithm (DVMRP or PIM-DM). When a source at a ground station starts to transmit, the data is flooded across the network. If the satellite contains an ATM switch, then for other receivers within the same spotbeam as the source to hear the flooded data, a point-to-multipoint VC needs to be established from each ground station within the spotbeam. The satellite is then retransmitting the data out through the same spotbeam on which it received the data in order to transmit it to other routers.

Conversely, if the satellite contains an on-board router then the RPF mechanism described above does not work (Figure 2.2). This is because in a typical terrestrial WAN a single router can flood its data directly to all other routers. However, with a satellite-based router the router will effectively have to rebroadcast the data through its RPF interface so that other receivers within the satellite spotbeam can hear the data. This is in contravention of the normal RPF algorithm.

[Figure 2.2, left panel, terrestrial network. Labels: RS, R1A, R1B, R1C, R2A, R2B, WAN; interfaces w, x, y, z. R1A starts to multicast. R1A floods to RS, R1B, R1C. If x, y, z are the RPF interfaces on each router then the data will flood correctly, else each router will silently discard the packets. RS then floods the packets to R2A and R2B.]

[Figure 2.2, right panel, satellite network. Labels: RS, R1A, R1B, R1C, R2A, R2B, Spotbeam 1, Spotbeam 2. R1A starts to multicast. R1A can only flood to the satellite, RS. RS needs to explicitly re-broadcast the packets back to R1B, R1C, in addition to correctly forwarding them to R2A and R2B.]

Figure 2.2: Multicast routing flooding: terrestrial (left) and satellite (right) networks compared. R1A multicasts data to R1B, R1C and R2A, R2B.

2.4.3 Reliable multicast protocols

Reliable multicast protocols address the issue of ensuring that data is multicast from a source to all the multicast recipients and that each packet sent by the source is successfully received by all recipients. Reliable multicast protocols usually also ensure ordered and non-duplicated delivery of packets. Since they provide an end-to-end service they are conventionally regarded as transport layer protocols in the context of the OSI Reference Model.

A wide range of reliable multicast protocols have been developed and described in the literature. One reason for this is that efficient multicast is a much more complex problem than efficient unicast, and consequently many multicast protocols have been developed for specific classes of application. Two examples of different application classes are delay-sensitive real-time applications and multicast file transfer, each of which has its own specific multicast requirements. A taxonomy of multicast protocols is described in [Obraczka,98], where they are referred to as “multicast transport protocols”. This work has been taken forward in the context of satellite networks by [Koyabe,01].

Following the structure used by Koyabe, some of the key features of multicast protocols can be described as follows:

• Data propagation: this covers a number of basic parameters describing the protocol capability. These are: (a) whether the propagation is one-to-many, many-to-one, or many-to-many; (b) whether data transfers are one-way (outbound only) or two-way (return path required).

• Scalability: the number of data recipients (for example, of the order of a few tens or a few hundred thousand); and their geographical spread. A multicast protocol that has to cover a wide geographical range will probably have to deal with a wide range of round-trip times and is therefore unable to optimise its transmission rates for all users (assuming all users are kept in synchronism). This is particularly exacerbated if some recipients are connected via terrestrial links and some are connected by geostationary satellite links.

• Reliability required of the protocol: some applications such as file transfer require guaranteed delivery of data to all destinations, whereas other applications such as video streaming can tolerate a certain loss rate. Closely allied to this issue are mechanisms for reducing the loss of data, either using forward error correction (FEC) techniques, or automatic repeat request (ARQ). The use of ARQ means that acknowledgements (or, in some multicast protocols, “negative acknowledgements” when the recipient determines that some data has not been received) are sent back to the sender. Where there are a large number of recipients, the volume of acknowledgements (“Acks”) or negative acknowledgements (“Nacks”) sent back to the sender can overload either the network or the sender; this is the well known “implosion” problem in multicast protocols.

• Flow and congestion control: flow control is the managing of the data transfer rate so as not to overload the recipient. In a multicast environment, different receivers may be connected to the network by different bandwidth links and may have different data processing capabilities. The multicast protocol needs to be able to scale so that these differing needs can be taken into account. Congestion control is the managing of data flow once an overload has occurred (either on the network or at the recipient), and the multicast protocol needs to be able to respond sensitively to overload conditions.

2.4.4 Reliable multicast protocol: PGM

In this Section, one specific reliable multicast protocol is described, namely Pragmatic General Multicast, PGM [PGM,01]. This protocol has been selected for a more detailed review in this thesis because it is an Internet draft, and the protocol includes forward error correction capabilities. It is consequently used as the basis for a reliable multicast protocol model in the simulation work that is described later in this thesis. Here, we give a brief summary of the protocol. This does not describe all the features of the protocol, but gives sufficient understanding of its key operations.

To quote from the draft specification, PGM is suitable “for applications that require ordered or unordered, duplicate-free, multicast data delivery from multiple sources to multiple receivers. PGM guarantees that a receiver in the group either receives all data packets from transmissions and repairs, or is able to detect unrecoverable data packet loss.”

The operation of PGM can be explained using Figure 2.3 as an illustration. The Figure shows a data source and multiple sinks, together with network elements (NEs) that are “PGM-aware” and devices called Designated Local Repairers (DLRs). These latter devices maintain a copy of data transmitted by the source so that they can respond to requests for retransmission of lost data.

The normal flow of data is from a single source to multiple sinks and is carried as original data or ODATA. If missing data is detected by a sink, it issues multiple Negative Acknowledgements (Nacks) until it receives a Network Confirmation (NCF) from a network element. The purpose of the NCF is to notify the sink that the network elements are now responsible for returning the Nack to the source. Consequently the network element in turn issues a Nack back “upstream” toward the source and in turn receives a NCF from the next element. Nack suppression procedures are included in the protocol to avoid a Nack implosion when a large number of sinks fail to receive any given data. When the Nack arrives either at the source or a DLR, repair data (RDATA) is transmitted.

The protocol includes a transmit window system, and the specification provides guidance on window advance strategies. It also supports error correction schemes that can either be used proactively or on demand, and these are described in more detail in Section 3.4.

[Figure 2.3: diagram not reproduced. It shows a source, Designated Local Repairers (DLRs), PGM-aware network elements (NEs) and multiple sinks, with the message flows numbered 1. ODATA, 2. Nack, 3. NCF, 4. Nack, 5. NCF and 6. RDATA; ODATA is also forwarded to other sinks.]

1.ODATA is sent from the source. If a sink detects missing data it issues a 2.Nack which is confirmed 3.NCF by the nearest PGM-aware network element. This NE in turn issues a 4.Nack and receives a 5.NCF. On receipt of the 4.Nack the DLR can then transmit 6.RDATA to the sink.

Figure 2.3: PGM summary
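The hop-by-hop Nack and NCF exchange summarised in Figure 2.3 can be written out as a short trace generator. This is only an illustrative sketch: the function name and the node labels are hypothetical, Nack suppression and timers are ignored, and RDATA is shown as a single step from the repairer to the sink.

```python
def pgm_repair_trace(path):
    """List the messages exchanged to repair one lost packet.

    path: nodes from the sink up to the repairer, e.g. ["Sink", "NE", "DLR"].
    Each element is a (sender, receiver, message) tuple.
    """
    trace = []
    for downstream, upstream in zip(path, path[1:]):
        trace.append((downstream, upstream, "Nack"))  # request travels upstream
        trace.append((upstream, downstream, "NCF"))   # confirmed hop by hop
    trace.append((path[-1], path[0], "RDATA"))        # repair data to the sink
    return trace


for sender, receiver, message in pgm_repair_trace(["Sink", "NE", "DLR"]):
    print(f"{sender} -> {receiver}: {message}")
```

For the three-node path the trace reproduces steps 2 to 6 of the figure: Nack, NCF, Nack, NCF, RDATA.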

2.5 IP multicast over ATM

The work described in this thesis focuses on the use of ATM (Asynchronous Transfer Mode) as the underlying transmission mechanism for IP. This approach has been taken since a number of satellite systems, including GEOCAST, intend to use ATM for satellite switching. In this Section, issues of transmitting IP multicast over an ATM fabric are briefly introduced.

IP multicast uses a receiver-controlled model, where the receiver joins a multicast group by issuing an IGMP join Request for a specific multicast IP Class D address. ATM also supports a point-to-multipoint service, but in contrast this is a sender-controlled model, with all connections initiated by the source. In order to support IP multicast over ATM, the Internet Engineering Task Force (IETF) developed the MARS model (Multicast Address Resolution Server). This is defined in [RFC2022] and described by [Armitage,97].

When ATM sets up a point-to-multipoint connection for n receivers, it needs to know the actual unicast ATM address of each of the n “leaves” on the multicast tree. The source then creates the point-to-multipoint virtual circuit by first issuing a SETUP call for one leaf, and then (n-1) ADD_PARTY requests for the remaining leaf nodes.

In order to establish the point-to-multipoint connection, ATM uses the information contained in the Multicast Address Resolution Server (MARS). This contains a mapping of each multicast group Class D IP address to n ATM addresses, being the n recipients that have subscribed to the multicast group. Armitage recommends that in keeping with IP’s subnet model, a MARS should only list hosts within the same logical IP subnet (LIS, the set of IP/ATM addresses that are part of the same network as defined by the network portion of the host’s Class A/B/C address). Multicast routers are considered to remain responsible for IP multicast forwarding to other nets or subnets. RFC2022 allows either of two models for setting up the ATM point-to-multipoint connections for any-to-any connectivity: these are a fully meshed network, or a multicast server-centric model (Figure 2.4). The former is more general but involves far more virtual circuits; while the latter results in the multicast server being a point of congestion, and a single point of failure for the entire multicast network.

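[Figure 2.4: diagram not reproduced. Two panels each show four hosts numbered 1 to 4: in the "Multicast mesh" panel the hosts are linked directly to one another; in the "Multicast server" panel each host is linked to a central multicast server.]

Figure 2.4: MARS full multipoint connectivity between n hosts, all links unidirectional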
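The scaling difference between the two models can be made concrete by counting unidirectional links. The sketch below rests on simplifying assumptions that are not stated in the text: every one of the n hosts is both a sender and a receiver, a mesh needs one link from each sender to each other host, and the server model needs one link from each host to the server plus one link from the server back to each host. The function names are hypothetical.

```python
def mesh_links(n):
    """Unidirectional links in a full mesh: each host sends to the other n - 1."""
    return n * (n - 1)


def server_links(n):
    """Unidirectional links with a multicast server: n inbound plus n outbound."""
    return 2 * n


for n in (4, 10, 100):
    print(n, mesh_links(n), server_links(n))
```

Under these assumptions the mesh grows quadratically with the group size while the server model grows linearly, which is the trade-off against the server becoming a point of congestion.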

2.6 Summary

We can see that IP multicast in a satellite environment raises a number of issues which do not occur in a terrestrial network. These can be summarised as follows:

• Refinements may be needed to either the multicast routing protocols or the satellite router design: in a satellite network not all routers within a single spotbeam can see each other, and so the on-board router has to re-broadcast some traffic within each spotbeam;

• The behaviour of IGMP affects the operation of the satellite system: each router in a spotbeam is unaware of any other ground stations that are receiving multicast transmissions, unless IGMP Reports are re-broadcast by the satellite;

• The round trip delay of a geostationary satellite has an impact on the scalability of multicast protocols, since it increases the transmission delays experienced by some receivers. The round trip delay may also affect the flow control and congestion control algorithms;

• Reliability and error correction mechanisms need to take into account the satellite’s high error rate;

• Satellite links with terrestrial return paths have different forward and return path routes and so are not suitable for some bi-directional multicast routing protocols such as CBT.

The issues described above are separate from and in addition to those which occur with non-multicast satellite traffic. An example of the latter is security concerns caused by the satellite transmission (broadcast) of data, which can easily be picked up by unauthorised hosts.

In this thesis, we now go on to consider one of these issues, namely error behaviour and error correction in multicast protocols. This is addressed in the following Section.

3 Error behaviour of multicast protocols

3.1 Introduction

The work to be described in the remainder of this thesis covers the effect of errors on multicast protocol behaviour. In Section 4, a theoretical model is presented of the error performance of an ATM satellite link. Before we look at the modelling work, in this Section the theoretical results are anticipated by describing some work conducted by other authors in the area of IP multicast error performance. We begin with a background discussion on error correction.

3.2 Error correction: general

There are two principal error correction approaches: forward error correction (FEC) and automatic retransmission request (ARQ). In the case of FEC, an original message is transmitted together with some redundant (parity) information to form a codeword, so that if part of the codeword is lost or corrupted the receiver can both detect and correct the error. The original message can thus be reconstructed from the redundant information in the codeword, provided that the number of errors is below a certain level. FEC is not by itself able to guarantee delivery of data, and has the further disadvantages of a coding overhead that is not needed when the channel error rate is low, reduced effective bandwidth of the channel and possibly an encoding or decoding delay. ARQ on the other hand is only able to detect errors in the original data, but if such errors occur the receiver requests a further copy of the data from the transmitter. ARQ has the advantage that it can guarantee data delivery, but can also suffer from significant delays when data has to be retransmitted.

ARQ can be divided into two categories: these are Idle RQ and Continuous RQ. The latter can employ either a Selective Repeat or a Go-Back-N strategy. In the case of Selective Repeat, if a packet of data is lost or errored, the receiver requests and is sent a copy of only the errored packet. By contrast, with Go-Back-N, the receiver requests the transmitter to retransmit packets starting from the errored packet. The Selective Repeat strategy uses less network resources to transmit the data, but the Go-Back-N strategy can employ a simpler receiver architecture.
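The difference in network resources used by the two Continuous RQ strategies can be illustrated by counting transmissions. The sketch below is a deliberately simplified model and not a protocol implementation: the sender transmits one window of packets at a time, a packet listed in `lost` fails the first time the receiver is ready to accept it, and every retransmission succeeds. The function names and the batch model are assumptions made for illustration.

```python
def selective_repeat_sends(n, lost):
    """Total transmissions for n packets: each lost packet is resent once."""
    return n + len(set(lost))


def go_back_n_sends(n, lost, window):
    """Total transmissions when the sender restarts from the first lost packet."""
    lost = set(lost)
    sends = 0
    base = 0  # next packet the receiver is waiting for
    while base < n:
        end = min(base + window, n)
        sends += end - base  # the whole window is transmitted
        first_loss = next((s for s in range(base, end) if s in lost), None)
        if first_loss is None:
            base = end  # window delivered in order
        else:
            lost.discard(first_loss)  # its retransmission is assumed to succeed
            base = first_loss  # everything from the loss onward is sent again
    return sends


print(selective_repeat_sends(10, {3}))
print(go_back_n_sends(10, {3}, window=5))
```

In this example Selective Repeat resends only packet 3, whereas Go-Back-N also resends packet 4, which had already been transmitted once but was discarded by the receiver.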

A combination of FEC and ARQ can also be used, and this is called hybrid ARQ. In a Type-I hybrid ARQ scheme, parity data is transmitted with the original message so that errors can be both detected and corrected. If the number of errors is too high, so that they can not be corrected, the receiver requests retransmission of the same codeword. In a Type-II hybrid ARQ scheme, if the receiver is unable to correct the errors in the received codeword then it requests transmission of further parity data until it has received sufficient to allow it to decode the original codeword.

In summary, FEC provides a statistical approach by which the majority of errors can (in principle at least) be detected and corrected by the recipient without further reference to the sender. FEC schemes are therefore suitable in cases where the return channel is non-existent, has a very low data rate or incurs a significant time delay. However, they cannot correct all errors, and ARQ (or hybrid ARQ) may be required to ensure complete data integrity.

3.3 Error correction in multicast protocols

The use of FEC in reliable multicast is discussed in an IETF draft memo [RMTWG,00], and this paper compares and contrasts a number of FEC codes. The paper points out how at the data link layer error correction typically needs to deal with individual errored bits. However, at the network and transport layers, lower layer protocols will have either accepted or rejected each packet. Error correction at these layers therefore tends to deal with packets that have been discarded in their entirety.

The analysis to be presented in Section 4 shows specifically how IP datagrams can be discarded at the receiver due to errors. Such a discarded datagram is known as an erasure, and in general an erasure is a missing segment, packet, datagram or cell whose location in a stream is known. In the case of the satellite link we are considering erasure of an IP datagram and hence the transport layer packet which it carried. Some reliable multicast protocols such as RRMP3 [Clausen,99], PGM [PGM,01] and NORM4 [NORM,01] use FEC to recover from erasures, but FEC has not been implemented in other protocols. RRMP calculates parity packets that can be used to reconstruct lost packets. PGM uses a matrix inversion technique to recover lost blocks of data in one of its two modes of error correction [Rizzo,97a]. Fcast is a reliable multicast file transfer protocol that uses Reed-Solomon based codes [Gemmell,00].

As an alternative, a scheme similar to that developed for ATM cells [Ohta,91], described in outline below, could also be applied to IP datagrams as part of a multicast protocol.

Hanle [Hanle,98] describes simulation work performed using the network simulation software package ns-2 on a file transfer reliable multicast protocol, MFTP. A FEC mechanism based on Reed-Muller codes was developed for the simulation, and results were obtained for a terrestrial network.

FEC schemes can also be implemented at the data link layer. One option [Ohta,91] is a mechanism in which a block of ATM cells is transmitted together with both “cell loss detection cells” (CLD cells) which allow cell erasures to be identified and parity cells which allow erased cells to be recovered. This scheme could be implemented in a way which is transparent to higher layers. This FEC mechanism involves a decoding delay since an errored or erased cell can not be passed to the higher protocol layers until the matching CLD cell and parity cell have been received.

Interleaving, either at bit or byte level, spreads an error burst across multiple ATM cells [Akyildiz,97; Chitre,94; Hamouda,98]. If the error burst is spread out so that only one error occurs in each cell header then the HEC correction algorithm can be used to correct each errored bit. The dual mode operation of the algorithm (Figure 3.1) means that if two consecutive cells contain single bit errors in their headers then the second cell is discarded; consequently the interleaving process must separate errored bits by at least two cells. However, interleaving ATM cells would not be effective in correcting error bursts in the payload: an error burst would be spread over multiple ATM cells, but in general would still result in multiple errors within a single service data unit (SDU) at the AAL layer. The CRC check used by AAL would detect these errors and discard the entire SDU.

3 Restricted Reliable Multicast Protocol. 4 Nack-Oriented Reliable Multicast protocol.

[Figure 3.1: state diagram not reproduced. It shows two states, correction mode and detection mode, with transitions labelled: no error detected (cell accepted); single-bit error detected (cell corrected & accepted); multi-bit error detected (cell discarded); error detected (cell discarded).]

Figure 3.1: ATM HEC dual mode algorithm (used for bit errors in the ATM header)
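The dual mode behaviour can be sketched as a two-state machine. The function name and the input representation (a count of errored header bits per cell) are hypothetical, and the sketch assumes that every multi-bit header error is detected, which a real HEC code does not guarantee.

```python
def hec_receive(header_error_bits, mode="correction"):
    """Classify a sequence of cells by the HEC dual mode algorithm.

    header_error_bits[i] is the number of errored bits in the header of cell i.
    Returns one outcome string per cell.
    """
    outcomes = []
    for errs in header_error_bits:
        if mode == "correction":
            if errs == 0:
                outcomes.append("accepted")
            elif errs == 1:
                outcomes.append("corrected")
                mode = "detection"  # any detected error switches mode
            else:
                outcomes.append("discarded")
                mode = "detection"
        else:  # detection mode: no correction is attempted
            if errs == 0:
                outcomes.append("accepted")
                mode = "correction"  # an error-free header restores correction
            else:
                outcomes.append("discarded")
    return outcomes


print(hec_receive([1, 1, 0, 1]))
```

The second of two consecutive cells with single bit header errors is discarded, whereas the same errors separated by an error-free cell are both corrected; this is why interleaving must separate errored bits by at least two cells.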

3.4 Error correction in PGM

PGM was introduced in Section 2.4.4.

From an error correction perspective, PGM provides for either of two mechanisms to be implemented:

• A Continuous RQ Selective Repeat mechanism (in the PGM specification this mechanism is called “Selective NAKs” and is hereafter referred to as “Selective Nacks”);

• A Type II hybrid ARQ mechanism, in which a receiver can request parity packets that allow lost packets of the original message to be reconstructed. The parity packets are built for each block of k packets sent by the source (in the PGM specification this mechanism is called “Parity NAKs” and is hereafter referred to as “Parity Nacks”). Using an erasure code [Rizzo,97a] the original k packets can be recovered if the sink receives say (k-m) original data packets and m repair packets.

This thesis will in Section 6 describe and compare the performance of these two mechanisms.

PGM additionally allows Parity Nacks to be either proactive (i.e. parity packets are calculated when data is originally transmitted and the parity packets are sent at the same time as the original data) or on demand (i.e. parity packets are only transmitted when requested by sinks). In this work, only the latter option has been considered.

3.4.1 Introduction to erasure codes used in Parity Nack mode

In the erasure code proposed for PGM [Rizzo,97a] k blocks of source data are encoded systematically5 so as to produce a total of n blocks of encoded data. In PGM, each block of source data is one packet. All packets are required to have the same length when Parity Nack mode is used.

In PGM the k packets of original data are transmitted first. If a sink detects loss of say m packets it sends a negative acknowledgement which requests a corresponding m parity packets. The source (or DLR, see Section 2.4.4) then transmits m of the (n-k) parity packets (assuming m < (n-k)). So long as the sink receives a total of k packets (made up of k-m original packets and m parity, or repair, packets) it can decode the data to reconstruct the k original packets (Figure 3.2).

If the encoded packets are a n by 1 vector y that is obtained by multiplying a generator matrix G by a k by 1 vector of source data x, that is: $y = G x$, then the receiver can recover the source data by calculating $x = G'^{-1} y'$, where $G'$ is the subset of rows from G corresponding to the components of the received data $y'$.

The codes are linear block codes and use finite field arithmetic. For example, a GF(16) field represents numbers in 4 bits. This means that k successive multicast packets each of fixed length l bytes can be split into 2l nibbles, and the encoding applied to each group of nibbles separately. Thus, so long as all the multicast packets are of the same length, the erasure code can be applied to packets of any length.

In PGM, the group of k original blocks (or packets) are called a transmission group, of size TGSIZE.
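The encoding and decoding steps described above can be illustrated with a small systematic erasure code. This is a sketch under stated assumptions rather than the code used by PGM: it works in the prime field GF(257) instead of a GF(2^m) field, builds the generator from a Vandermonde matrix, encodes a single symbol per packet, and all function names are hypothetical.

```python
P = 257  # prime modulus; using GF(257) is an assumption of this sketch


def mat_inv(m):
    """Invert a square matrix over GF(P) by Gauss-Jordan elimination."""
    k = len(m)
    aug = [list(row) + [int(i == j) for j in range(k)] for i, row in enumerate(m)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], P - 2, P)  # multiplicative inverse (Fermat)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def mat_mul(a, b):
    """Matrix product over GF(P)."""
    return [[sum(x * y for x, y in zip(row, col)) % P for col in zip(*b)] for row in a]


def mat_vec(m, v):
    """Matrix-vector product over GF(P)."""
    return [sum(a * b for a, b in zip(row, v)) % P for row in m]


def generator(n, k):
    """Systematic n by k generator G: the top k rows form the identity."""
    vand = [[pow(i, j, P) for j in range(k)] for i in range(1, n + 1)]
    return mat_mul(vand, mat_inv(vand[:k]))


def decode(g, received):
    """Recover x from any k received symbols, given as {row index: symbol}."""
    k = len(g[0])
    rows = sorted(received)[:k]
    g_sub = [g[r] for r in rows]  # G': rows of G matching the received data
    return mat_vec(mat_inv(g_sub), [received[r] for r in rows])


k, n = 4, 6
G = generator(n, k)
x = [10, 20, 30, 40]  # one symbol from each of the k original packets
y = mat_vec(G, x)     # y = G x; y[:k] equals x because the code is systematic
received = {0: y[0], 3: y[3], 4: y[4], 5: y[5]}  # packets 1 and 2 erased
print(decode(G, received))
```

Here m = 2 original packets are erased and the two parity symbols allow the receiver to solve for all k original symbols. In a real packet code the same operation would be applied to each symbol position across the k packets.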

[Figure 3.2: diagram not reproduced. Encoding process: the k packets of original data pass through the encoder (y = G x), giving the k packets of original data together with n-k parity packets. Transmission: k' (>= k) packets are received by each sink. Decoding process: the decoder (x = G'^{-1} y') recovers the k packets of original data.]

Figure 3.2: Graphical representation of the encoding / decoding process (based on [Rizzo,97a])

5 A systematic code is one in which the n blocks of encoded data comprise k blocks of data in the original unencoded form together with n-k blocks of coded data.

4 Error performance model

4.1 Introduction

This Section considers the error performance of IP over an ATM-based satellite link. This provides a theoretical model, which can be used to validate the simulation that will be conducted in the next stage of this work. The results described in this Section are also reported in [Howarth,01]. The protocol model considered here is illustrated in Figure 4.1.

Two key ATM performance measures are Cell Loss Ratio (CLR), the fraction of cells that are transmitted but not delivered, and Cell Error Ratio (CER), the fraction of delivered cells that have an error in the payload (information field). It is assumed here that cells are not lost due to buffer overflow, so header errors are the main source of cell loss.

[Figure 4.1: protocol stack diagram not reproduced. Layers, from top to bottom: higher level protocols (TCP, UDP, reliable multicast etc); IP layer; ATM Adaptation Layer, comprising the CS (convergence sublayer) and the SAR (segmentation and reassembly sublayer); ATM layer; satellite modem and physical layer.]

Figure 4.1: Satellite protocol model

ATM was optimised for operation primarily over fibre optic links. The error performance of this medium is characterised by a low bit error rate, comprising random single bit errors, and in which error bursts do not generally occur. To protect against single bit errors, each ATM cell includes a header error control (HEC) field which protects the header against corruption. This minimises the cell loss ratio (CLR) and the misaddressing of cells (measured by the cell misinsertion rate, CMR). Errors which occur in the cell payload are not corrected by ATM but are left either to the ATM Adaptation Layer (AAL) to detect, or are passed on to higher-level protocols.

On a satellite link, errors occur more frequently than in fibre optic media, and tend to occur in bursts due to the nature of the channel encoding typically used. The result of these burst errors is a higher ATM cell loss rate on satellite links than would be expected for the same bit error rate (BER) on fibre optic links. For a satellite link that is being used to carry IP-based traffic this results in the loss of the IP datagram. For unicast communication this can cause TCP congestion control and avoidance algorithms to be invoked (even though the cause was a bit error, not network congestion), slowing down the effective data transfer rate. For reliable multicast applications the datagram loss can also have significant impact if this requires retransmission by the multicast protocol.

4.2 Theoretical model

4.2.1 The satellite modem

A typical satellite communications link employs a convolutional encoder with Viterbi decoding. This reduces the link’s effective bit error rate, but the nature of the Viterbi decoding means that this reduction in the error rate is at the cost of residual errors occurring in bursts. However, the burst length is limited: Heissler [Heissler,99] developed a satellite terminal model with an additive white Gaussian noise channel, and derived statistics for the probability of a given error burst size in an ATM header and payload as a function of signal to noise ratio. The satellite terminal model employed a typical ½ rate encoder with a constraint length of 7. The results showed for example that at an Eb/N0 of 7dB, no bursts were of greater than 20 bits in length and most were less than 10 bits long.

4.2.2 The ATM layer

Many authors have considered the effect of satellite link errors on ATM. Cell loss and cell errors have been considered in a bursty error environment [Ramseier,95], assuming a Neyman A contagious distribution. This distribution [Neyman,39] reflects the behaviour of the Viterbi decoding process, because a burst means that the presence of a single erroneous bit is likely to be accompanied by other errors. Ramseier derives expressions which are a function of the mean error burst length, b. Brandão et al [Brandão,99] note that the Neyman A distribution implies a large observation interval relative to the mean burst length. They derive more complex expressions for cell loss ratio (CLR) and cell error ratio (CER) which are a function of both the mean burst length, L, and the mean number of errors per burst. However the values of the parameters used by both sets of authors give similar results for CLR and CER as a function of the bit error rate. These results in turn are similar to results presented by other authors [Cuevas,99].

[Franchi,93] notes that a burst is defined by CCITT [ITU-T] as being “a group of bits in which two successive erroneous bits are always separated by less than a given number (X) of correct bits”. Franchi then assumes that a Viterbi decoder error burst will start and end with an error and that between these errors correct and incorrect decisions are made with equal probability. So for bursts of length ~ 6-10 bits we get a bit error density within the burst of ~0.6 to 0.7. [Ramseier,95] assumes that the number of errors within a burst is Poisson distributed. The analysis presented below explicitly assumes the errored bits are consecutive, although some degree of spreading of the errored bits (for example to give a bit error density within the burst similar to that of [Franchi,93]) can occur without affecting the results significantly6.

6 For example, anticipating the analysis below which derives expressions for cell loss and cell error, introducing one or two non-errored bits in a burst of length b=6 would make negligible difference to (4.2) and (4.3) and would increase (4.4), whose value is small compared to (4.2) and (4.3), by ~50%.

Expressions can be derived for the probability of cell loss and cell error. Assume the error burst length to be b consecutive errored bits. Further assume a low BER so that not more than one error burst arrives per ATM cell, and let the burst length be in the range $2 \le b \le 40$. The first errored bit of the burst can be in any of the 424 bits of the ATM cell (Figure 4.2). If the burst starts in any of the bits from 1 to 39 then a cell loss will occur. If the burst starts in bit 40, no cell loss occurs if the HEC algorithm is in correction mode, but the rest of the burst produces a cell error. A burst starting in any of bits 41 to $(424 - b + 1)$ results in a cell error. A burst starting in bit $(424 - b + 2)$ causes a cell error and also corrupts one bit of the following cell header; that cell is not lost provided the HEC algorithm is in correction mode. If the burst starts in bits $(424 - b + 3)$ to 424 then we get a cell error, together with loss of the following cell.

Assuming that the low bit error rate means that the HEC algorithm is in correction mode, the error probabilities are therefore as follows:

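$$P(\text{no errors}) = P(\overline{\text{loss}} \cap \overline{\text{error}}) = 1 - \frac{424\,p}{b} \qquad (4.1)$$

$$P(\text{loss} \cap \overline{\text{error}}) = \frac{39\,p}{b} \qquad (4.2)$$

$$P(\overline{\text{loss}} \cap \text{error}) = \frac{(387 - b)\,p}{b} \qquad (4.3)$$

$$P(\text{loss} \cap \text{error}) = \frac{(b - 2)\,p}{b} \qquad (4.4)$$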

where p is the overall bit error rate.
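As a numerical illustration, expressions (4.1) to (4.4) can be evaluated directly. The function and key names below are hypothetical; the only modelling step is that a burst of length b starts at any given bit with probability p/b, as in the expressions above.

```python
def cell_probabilities(p, b):
    """Per-cell probabilities (4.1)-(4.4) for bit error rate p and burst length b."""
    if not 2 <= b <= 40:
        raise ValueError("the analysis assumes 2 <= b <= 40")
    rate = p / b  # probability that a burst starts at any given bit
    return {
        "no_errors": 1 - 424 * rate,            # (4.1)
        "loss_only": 39 * rate,                 # (4.2) burst starts in bits 1 to 39
        "error_only": (387 - b) * rate,         # (4.3) bits 40 to 424 - b + 2
        "error_and_next_lost": (b - 2) * rate,  # (4.4) bits 424 - b + 3 to 424
    }


probs = cell_probabilities(p=1e-6, b=6)
for name, value in probs.items():
    print(f"{name}: {value:.3e}")
print("sum:", sum(probs.values()))
```

The four cases cover all 424 possible starting positions (39 + (387 - b) + (b - 2) = 424), so the probabilities sum to one.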

We may further assume the length of each error burst follows a Poisson distribution with a mean length $b_{mean}$ and an exponentially distributed burst inter-arrival time. In this case, the expressions for the loss and error probabilities should be weighted to reflect the probability distribution of burst lengths b, but it can be shown using a simple spreadsheet model that it is an adequate approximation to assume b is the mean burst length $b_{mean}$. This observation will also be demonstrated in Section 5 using the simulation tool, Opnet.

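[Figure 4.2: diagram not reproduced. It shows an ATM cell with the header occupying bits 1 to 40 and the payload bits 41 to 424, together with example error bursts labelled: cell loss; cell loss; cell error; cell error and loss of following cell.]

Figure 4.2: The effect of error bursts on ATM cells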

4.2.3 The ATM Adaptation Layer

Classical IP over ATM is carried using the AAL5 service class [RFC2225]. AAL comprises two sublayers, the Convergence Sublayer (CS) and the Segmentation And Reassembly sublayer (SAR) (Figure 4.1). For AAL5, the CS is in turn split into the service-specific convergence sublayer (SSCS) and the common part convergence sublayer (CPCS). For the analysis here a null SSCS layer has been assumed. The CPCS provides AAL5 with error detection, by appending an 8 octet trailer to the service data unit (SDU) passed down from the layer above. This trailer contains a 32-bit CRC checksum (Figure 4.3). The CPCS PDU also contains padding to bring the data being transferred up to a multiple of 48 octets so that it can be carried exactly in an integral number of ATM cells. The CRC is capable of detecting all error bursts of less than 32 bits in length, all odd length error bursts, and most error bursts of 32 bits or greater. In the event that an error is detected, the SDU and hence the IP datagram is discarded by AAL. Figure 4.4 illustrates how the AAL PDU is divided into multiple ATM cells. All ATM cells except the final one carrying an AAL5 PDU have their header PT field set to the value zero, and the final cell has the PT field set to one.

In the unlikely event that the error is not detected by the CPCS CRC, the error burst should either cause the IP header to fail its checksum (if the errors occur in the IP header) or the TCP or UDP header (or other protocol header) to fail its checksum (if the errors occur in the IP payload).

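[Figure 4.3: diagram not reproduced. The CPCS PDU consists of the SDU (1 to 65,535 octets), padding and an 8 octet trailer, for a total length of N * 48 octets. The trailer comprises a 4 octet CRC and 4 octets of length and other fields.]

Figure 4.3: AAL Type 5 CPCS protocol data unit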

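[Figure 4.4: diagram not reproduced. It shows an AAL5 PDU at the AAL divided into ATM cells, with PT=0 in every cell except the final one, which has PT=1.]

Figure 4.4: AAL packetisation into ATM cells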

4.2.4 The IP layer

From the discussion of the ATM Adaptation Layer, it can be seen that an IP datagram will be lost if either a cell loss or a cell error occurs in any ATM cell which is carrying part of the datagram. Furthermore the datagram will also be lost if the final cell of the preceding AAL5 PDU is lost (but not if it is errored), since the PT=1 flag will not be received and the two AAL PDUs will be processed as one. This error will be detected by the AAL5 CRC, causing loss of the IP datagram. If the IP datagram is transmitted in N ATM cells, using (4.1), (4.2) and (4.4), the probability of loss is therefore given by:

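$$P_{IPloss} = 1 - P(\text{no\_errors})^{N}\,\bigl(1 - P(\text{loss})\bigr) = 1 - \left(1 - \frac{424\,p}{b}\right)^{N} \left(1 - \frac{(39 + b - 2)\,p}{b}\right) \qquad (4.5)$$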
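Expression (4.5) is straightforward to evaluate numerically. The sketch below uses a hypothetical function name and takes the burst length b equal to the mean burst length; the example values of p, b and N are chosen for illustration only.

```python
def ip_loss_probability(p, b, n_cells):
    """Probability (4.5) that an IP datagram carried in n_cells ATM cells is lost.

    p: overall bit error rate; b: burst length in bits (2 <= b <= 40).
    """
    rate = p / b                              # probability a burst starts at a given bit
    p_no_errors = 1 - 424 * rate              # (4.1), required for each of the N cells
    p_prev_final_lost = (39 + b - 2) * rate   # (4.2) + (4.4), final cell of preceding PDU
    return 1 - p_no_errors ** n_cells * (1 - p_prev_final_lost)


for n_cells in (192, 32, 7):
    print(n_cells, ip_loss_probability(p=1e-6, b=6, n_cells=n_cells))

# Doubling the burst length at a fixed bit error rate roughly halves the loss.
ratio = ip_loss_probability(1e-6, 12, 32) / ip_loss_probability(1e-6, 6, 32)
print("ratio for doubled burst length:", ratio)
```

At low bit error rates the result is close to $(424N + 37 + b)\,p/b$, so the loss probability scales almost linearly with both the bit error rate and the datagram length.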

4.3 Analytical results

Figure 4.5 shows what may be referred to as the unicast error performance of the satellite link, i.e. the probability of IP datagram loss as a function of BER. The graph is calculated using (4.5) and assumes a mean burst length of $b_{mean} = 6$ bits [Ramseier,95]. As the mean burst length increases, errored bits are grouped together and affect fewer datagrams, so if $b_{mean}$ is doubled the probability of datagram loss is approximately halved.

[Figure 4.5: plot not reproduced. Title: Unicast error performance. Horizontal axis: bit error rate, from 10^-7 to 10^-3. Vertical axis: P(single IP datagram loss), from 10^-5 to 10^0. Curves: N=192 (IP datagram length 9180 octets, default AAL5 MTU); N=32 (IP datagram length 1492 octets, IEEE 802.3 maximum); N=7 (IP datagram length 320 octets, illustrative small datagram).]

Figure 4.5: Unicast error performance

Curves have been shown for the following IP datagram lengths:

• 9180 octets, the default IP MTU for use with ATM AAL5, specified in RFC 2225;

• 1492 octets, the maximum IP datagram length in an Ethernet frame;

• 320 octets, an illustrative small IP datagram.

The datagram lengths assume standard LLC/SNAP headers.
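The values of N quoted for the three datagram lengths follow from the AAL5 encapsulation described in Section 4.2.3. The sketch below assumes an 8 octet LLC/SNAP header and the 8 octet CPCS trailer, with padding up to a multiple of 48 octets; the function name is hypothetical.

```python
import math


def aal5_cells(ip_datagram_octets, llc_snap_octets=8, trailer_octets=8):
    """Number of ATM cells N needed to carry one IP datagram over AAL5.

    The LLC/SNAP header and CPCS trailer are added to the datagram and the
    total is padded up to a whole number of 48 octet cell payloads.
    """
    total = ip_datagram_octets + llc_snap_octets + trailer_octets
    return math.ceil(total / 48)


for length in (9180, 1492, 320):
    print(length, aal5_cells(length))
```

This gives N = 192, 32 and 7 for datagrams of 9180, 1492 and 320 octets respectively, matching the curves in Figure 4.5.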

Figure 4.6 shows what may be considered to be the multicast error performance of the satellite link. This is the probability that in a multicast transfer at least one of the recipients does not correctly receive the datagram. For a reliable multicast protocol, this is also effectively the percentage of datagrams that will need to be retransmitted (each retransmission will of course carry with it a further probability of loss which is not considered here). If there are R multicast receivers per satellite spotbeam then assuming independent losses on the transmission paths the multicast probability of loss is given by:

$$P_{MulticastIPloss} = 1 - (1 - P_{IPloss})^R \tag{4.6}$$

Figure 4.6 shows that at a bit error rate of $10^{-6}$, for receiver populations greater than a few tens per spotbeam, there is a high probability that any individual multicast datagram will not be received by at least one recipient. For a bit error rate of $10^{-8}$ a receiver population approximately one hundred times larger can be supported for a given error performance.
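The scaling with receiver population can be illustrated numerically. The following sketch evaluates (4.6) on top of (4.5), assuming independent losses, a fixed burst length of 6 bits and N=32 cells per datagram. The function names are hypothetical.

```python
def p_ip_loss(p: float, n_cells: int, b: float = 6.0) -> float:
    """Unicast IP datagram loss probability, equation (4.5)."""
    return 1.0 - (1.0 - 424.0 * p / b) ** n_cells * (1.0 - (39.0 + b - 2.0) * p / b)


def p_multicast_loss(p_unicast: float, receivers: int) -> float:
    """Probability that at least one of R receivers loses the datagram, equation (4.6)."""
    return 1.0 - (1.0 - p_unicast) ** receivers


if __name__ == "__main__":
    for ber in (1e-6, 1e-8):
        for receivers in (10, 100, 1000, 10000):
            loss = p_multicast_loss(p_ip_loss(ber, 32), receivers)
            print(f"BER={ber:.0e}  R={receivers:5d}  P(multicast loss)={loss:.4f}")
```

Under these assumptions, 100 receivers at a BER of $10^{-6}$ and 10,000 receivers at a BER of $10^{-8}$ give almost the same multicast loss probability, consistent with the factor of one hundred noted above.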

[Figure 4.6 plot, "Multicast error performance", BER = $10^{-6}$: P(loss of datagram by 1 or more receivers), from $10^{-3}$ to $10^{0}$, against number of receivers per spotbeam, from $10^{1}$ to $10^{4}$. Curves: N=192 (IP datagram length 9180 octets, default AAL5 MTU); N=32 (IP datagram length 1492 octets, IEEE 802.3 maximum); N=7 (IP datagram length 320 octets, illustrative small datagram).]

Figure 4.6: Multicast error performance ($10^{-6}$ BER)

5 Simulation software development

5.1 Simulation tool: Opnet

Opnet Modeler is a discrete event simulation tool for modelling networks and protocols. It allows a user to model a network as a set of nodes connected by links. The nodes can represent end hosts or network components such as routers, switches and satellites.

Each node consists of a set of process models that define the node’s behaviour. Each process is represented by a set of states, and the process changes states according to a series of events which act on the process. The operations performed by Opnet within any state are written in C code. An example of an event might be the arrival of a packet at a node (or, more strictly, at a process within the node): this event causes the process to change state and execute some code. This code might for example un-encapsulate the contents of the packet and pass the contents up to a higher layer of software, which would be implemented in Opnet as a different process. These operations are achieved by using Opnet function calls which retrieve the packet from the head of a queue, read or set fields within the packet, and send the packet out on a stream.

Three basic link types are provided: point-to-point, bus and radio. Each link is modelled in Opnet as a pipeline: that is, as a series of stages. Each packet that passes over a link is processed through the link’s pipeline. For the simplest, the point-to-point link, these stages are:

• Stage 0: transmission delay: calculation of the time required for the packet to leave the transmitter (i.e. the time to transmit the number of bits in the packet);

• Stage 1: propagation delay: calculation of the time required for the first bit of the packet to travel from the transmitter along the link to the receiver;

• Stage 2: error allocation: calculation of the number of bit errors which affect the packet while it travels along the link;

• Stage 3: error correction model: calculation of whether the packet with its bit errors is deemed to be received by the receiver or lost.
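The four stages listed above can be pictured as a function applied to each packet. The sketch below is a generic illustration in Python and does not use or reproduce Opnet's pipeline interface. The names `LinkResult` and `point_to_point_pipeline` are hypothetical, and the acceptance rule (fraction of errored bits not exceeding a threshold) is an assumption about how a Stage 3 threshold might be applied.

```python
import random
from dataclasses import dataclass


@dataclass
class LinkResult:
    tx_delay_s: float    # Stage 0: transmission delay
    prop_delay_s: float  # Stage 1: propagation delay
    bit_errors: int      # Stage 2: errored bits allocated to the packet
    accepted: bool       # Stage 3: outcome of the error correction model


def point_to_point_pipeline(packet_bits: int, data_rate_bps: float,
                            prop_delay_s: float, ber: float,
                            ecc_threshold: float, rng: random.Random) -> LinkResult:
    # Stage 0: time for all bits of the packet to leave the transmitter.
    tx_delay_s = packet_bits / data_rate_bps
    # Stage 1: fixed propagation delay, passed through unchanged here.
    # Stage 2: independent random bit errors, one trial per bit.
    bit_errors = sum(1 for _ in range(packet_bits) if rng.random() < ber)
    # Stage 3: accept if the errored fraction does not exceed the threshold.
    accepted = bit_errors / packet_bits <= ecc_threshold
    return LinkResult(tx_delay_s, prop_delay_s, bit_errors, accepted)


if __name__ == "__main__":
    result = point_to_point_pipeline(424, 9600, 0.12, 1e-3, 0.0047, random.Random(1))
    print(result)
```

Keeping the stages separate in this way mirrors the point that an error model can be replaced in Stage 2 without changing the acceptance decision in Stage 3.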

The bus link provides additional stages which deal with replication of a packet to multiple receivers on the link and with any collisions (as would happen for example on an Ethernet link). The radio link provides stages which deal with transmitting and receiving antennas, their pointing direction and gain, and calculation of noise and interference and hence signal-to-noise ratio and thus bit error rate.

Opnet also allows statistics to be defined by the programmer and collected during the simulation. An example of this might be a count of the number of packets received by a process, with the time at which the statistic was incremented also being recorded.

5.2 Approach to simulation work

Although Opnet supplies a number of protocol libraries, the decision was taken by the author to build the simulation models from scratch. This approach was adopted for the following reasons:

• The libraries have the advantage that they model protocols exhaustively, but this has the corresponding drawback that they are complex, with a considerable learning curve to become proficient in their use, and this time was not considered available in conducting the work described in this thesis.

• The intention of the simulation was to consider new protocols and other conditions which are not included in the Opnet standard libraries, and for which custom models would therefore have to be developed anyway.

• The experience of building models from scratch was considered to give the author considerable insight into both the operation of Opnet and the behaviour of the system being modelled.

The following terminology is used in the description of the modelling work: a source is an originator of application layer data, and a sink is a recipient of application layer data. In this Section we also distinguish between a cell or packet which arrives at a sink and is said to be received, and a cell or packet which arrives at a sink and has the correct sequence number to allow it to be (either in concept or as part of the simulation model) passed to the layer above, which is said to be delivered. Opnet uses the generic term packet to describe units of data sent across links and processed by nodes, and the term is adopted here with the same meaning. Terms such as cell and datagram are also used in this Section where they refer specifically to ATM cells or IP datagrams respectively.

The following Sections of this thesis describe the development of a reliable multicast IP over ATM satellite model. The development proceeded in a number of phases, illustrated in Figure 5.1 and described together with their testing and validation in the following Sections:

• Section 5.3: a basic illustrative protocol between a single source and multiple sinks, connected via a simple satellite model, with a random bit error on the links and a back channel for the return of negative acknowledgements, transferring data using a Continuous RQ Go-Back-N protocol (Figure 5.1(a)).

• Section 5.4: development of a layered protocol model, with layers to represent IP, AAL and ATM (Figure 5.1(b)).

• Section 5.5: development of a burst error model: Opnet only provides a simple random bit error model in its link model (Stage 2 of the pipeline) and this was not adequate to model the burst errors described in Section 4 of this thesis.

• Section 5.6: development of a reliable multicast IP protocol, based on PGM, together with an application layer to model a file transfer application (Figure 5.1(c)).

All the simulation work described in this thesis assumes that any connections required between hosts are established and terminated outside the simulation period. This includes, for example, ATM connection setup and any reliable multicast protocol negotiation.

[Figure 5.1 diagrams: (a) Simple Go-Back-N model: a constant rate source and multiple sinks. (b) Layered protocol model: a constant rate source and sinks, each above an IP / AAL5 / ATM stack. (c) File transfer and reliable multicast model: file transfer applications, each above a Reliable Multicast / IP / AAL5 / ATM stack.]

Figure 5.1: Phases in the development of the reliable multicast model

5.3 Simple Continuous RQ Go-Back-N protocol

5.3.1 Summary of model

In order to gain experience with the Opnet tool, a simple model of a Continuous RQ Go-Back-N protocol was developed. The model included a simple satellite capable of multicasting packets to a small number of sinks. The Go-Back-N protocol was selected for this initial exercise since it is relatively simple to implement.

An overview of the model is shown in Figure 5.1(a), and the key features of the design are described below.

The source generates cells (that is, Opnet packets of length 53 octets) at a constant rate of one every 0.5 seconds. The payload of the cell is filled with a sequence number that starts at one and is incremented by one for every cell sent. This sequence number is also modified when the source receives a negative acknowledgement (Nack), as described below.

The satellite forward link receives an incoming packet from the uplink, and makes multiple copies of it. A separate copy is then sent out on each downlink. This effectively simulates an ATM point-to-multipoint call: in this simple model every downlink is assumed to be a party associated with the call (see for example, [ATMForum,93]). Conversely, when a packet is received on the reverse link, it is simply copied to the downlink to which the constant rate source is attached, and the packet is not copied to any of the other sinks. The reverse link may therefore be considered as a set of simple point-to-point ATM connections.

The behaviour of each sink is determined by the sequence number of the cells it receives. Cells received in order are “delivered” to the layer above – in the case of this simple single layer model they are simply recorded as having been delivered, and then destroyed. However, if the sink detects a missing sequence number, a Nack for that sequence number is transmitted back via the satellite link to the source. In this simple implementation, a Nack is simply a 53 octet cell travelling from a sink back to the source. When the source receives the Nack it “reissues” the missing sequence number, by setting the sequence number of the next outgoing cell to the value specified in the Nack provided this is less than the current sequence number (footnote 7). The source then continues incrementing the sequence number from that value at 0.5 second intervals.

Each sink maintains two state variables in order to implement the Go-Back-N protocol:

• delivered_seq: the number of the highest contiguous cell received;

• prev_seq_num: the sequence number of the preceding packet.

The latter state variable is required to ensure that Nacks are only issued once for each missing sequence number; without it a Nack would be issued for every packet received once a packet has been lost until the missing sequence number is received, and this would result in an unstable protocol.

7 If the Nack requests a sequence number higher than the source’s current sequence number then this packet will automatically be transmitted as the sequence number increases, so no explicit action is required by the source. This situation can arise when data is multicast to several sinks, and two sinks fail to receive different packets.

A summary of the Go-Back-N algorithm implemented in the sinks is shown in Figure 5.2.

[Timing diagram: cell sequence numbers transmitted by the source and received by the sink, with the sink returning Nack(4) for a missing cell.]

• Normal operation. Condition: seq_num_rxd == delivered_seq + 1. Actions: deliver packet to layer above; delivered_seq++.

• Missing cell detected (1). Condition: seq_num_rxd != prev_seq_num + 1 and seq_num_rxd > delivered_seq + 1. Action: transmit Nack and set Nack timer.

• Missing sequence number received. Condition: seq_num_rxd == delivered_seq + 1. Actions: deliver packet to layer above; clear any outstanding Nack timer; delivered_seq++.

• Missing cell detected (2) (only arises in a multicast network). Condition: seq_num_rxd != prev_seq_num + 1 and seq_num_rxd < delivered_seq + 1. Action: duplicate packet received, so ignore.

• Awaiting missing cell. Condition: seq_num_rxd == prev_seq_num + 1 and seq_num_rxd != delivered_seq + 1. Action: out of order or duplicate packet received, so ignore.

Figure 5.2: Summary of Go-Back-N algorithm as implemented using Opnet
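The sink rules summarised in Figure 5.2 can be expressed compactly as follows. This is a minimal sketch in Python rather than the C code of the Opnet process model. It keeps the two state variables `delivered_seq` and `prev_seq_num`, omits the Nack timer, and assumes that a Nack carries the first missing sequence number (`delivered_seq + 1`). The class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GoBackNSink:
    delivered_seq: int = 0   # highest contiguous sequence number delivered
    prev_seq_num: int = 0    # sequence number of the preceding packet
    delivered: List[int] = field(default_factory=list)
    nacks_sent: List[int] = field(default_factory=list)

    def receive(self, seq_num_rxd: int) -> str:
        if seq_num_rxd == self.delivered_seq + 1:
            # Normal operation, or the missing sequence number has arrived.
            self.delivered.append(seq_num_rxd)
            self.delivered_seq += 1
            action = "deliver"
        elif (seq_num_rxd != self.prev_seq_num + 1
              and seq_num_rxd > self.delivered_seq + 1):
            # Missing cell detected (1): a gap has just opened, so Nack once.
            self.nacks_sent.append(self.delivered_seq + 1)
            action = "nack"
        else:
            # Duplicate, or still awaiting the missing cell: ignore.
            action = "ignore"
        self.prev_seq_num = seq_num_rxd
        return action


if __name__ == "__main__":
    sink = GoBackNSink()
    for seq in (1, 2, 3, 5, 6, 4, 5, 6, 7):   # cell 4 lost on first transmission
        print(seq, sink.receive(seq))
```

In this example the sink issues a single Nack for cell 4 when cell 5 arrives, ignores cell 6 while waiting, and resumes delivery once the source has gone back to 4.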

Each sink maintains a Nack timer: this is set whenever a Nack is issued, and if it expires before the missing sequence number is received the Nack is re-issued.

All links were modelled using Opnet’s point-to-point links. Although the radio link may seem more suitable for traffic between the satellite and ground stations, the Opnet radio link model (which performs a link budget calculation to determine the signal-to-noise ratio and hence BER as described in Section 5.1) is inappropriate when the focus of the work is the behaviour of the multicast protocol as a function of bit error rate. All the simulation work described in this thesis therefore uses point-to-point links (footnote 8).

Errors on the link were modelled using Opnet’s built-in random bit error model. As was described above, in Opnet error processing takes place in pipeline Stages 2 and 3 of a point-to-point link. Stage 2 calculates the number of random bit errors that occur in each packet, and Stage 3 compares this number of errors with a threshold. If the number of bit errors is less than the threshold then the packet is accepted, whereas if the number of bit errors exceeds the threshold then the packet is deemed to be lost and is discarded silently by Opnet.

8 The advantage of the radio link is that it would easily allow multiple ground stations to lie within a single spotbeam of a satellite, and is therefore suitable for modelling large numbers of sinks. This may be a useful extension of the work.

Although the simulation could have been run with bit errors on all links, only the forward downlink had a non-zero bit error rate. The simulation thus assumed a perfect reverse channel, and a lossless forward uplink.

For the satellite access protocol it is assumed that connections are already established. The mean access delay is included in the propagation delay (set to an illustrative 120ms in these simulations).

5.3.2 Theory

A standard formula exists for the efficiency of a unicast Go-Back-N protocol (see for example, [Halsall,96, pp. 207-210]), but this expression seems to omit the behaviour of the protocol in the presence of a timeout mechanism. The formula also assumes a window of size K packets and continuous transmission of packets while the window is open, neither of which is applicable to the model used in the current Opnet simulation. Consequently a different equation has been used here. It is shown in Appendix A that for a Go-Back-N Continuous RQ protocol which transmits packets at a constant rate between a source and a single sink the number of slots used per cell delivered to the end application is:

$$S = \frac{1 + 2P_f + 2P_f^2}{1 - P_f} \tag{5.1}$$

where Pf is the probability of loss of a transmitted packet.
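As a quick numerical illustration of (5.1), the following sketch evaluates S for a few loss probabilities. The function name is hypothetical, and the expression is simply the closed form above rather than a derivation of it.

```python
def slots_per_delivered_cell(p_f: float) -> float:
    """Slots used per cell delivered, equation (5.1), for packet loss probability p_f."""
    if not 0.0 <= p_f < 1.0:
        raise ValueError("p_f must satisfy 0 <= p_f < 1")
    return (1.0 + 2.0 * p_f + 2.0 * p_f ** 2) / (1.0 - p_f)


if __name__ == "__main__":
    for p_f in (0.0, 0.01, 0.1, 0.2, 0.5):
        print(f"P_f={p_f:<5} S={slots_per_delivered_cell(p_f):.3f}")
```

With no loss the expression reduces to S=1, i.e. one slot per delivered cell, and S grows rapidly as the loss probability approaches one.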

5.3.3 Testing and validation of model

A number of quantitative tests can be applied to this model to confirm that the simulation results are in line with theoretical predictions. These tests are as follows, and are described in detail below:

• Validation of link end-to-end delay;

• Validation of cell loss rate assuming a random bit error on the link;

• Link utilisation assuming a unicast system (i.e. a single sink) with a Go-Back-N protocol.

The end-to-end delay is simply given by the sum of the transmission delay and the propagation delay on the satellite uplink and downlink. Since these are symmetrical, assuming transmission of a 53 octet ATM cell at 9600 bit/s and a fixed propagation delay of 0.12 seconds, we get an end-to-end delay of $2\left(0.12 + \frac{424}{9600}\right) = 0.32833$ seconds, which was the value observed in the Opnet simulations.

The cell loss rate can also be calculated assuming a random bit error model. The Stage 3 pipeline threshold was set to a value equivalent to allowing cells with either zero or one bit error, but rejecting cells with two or more bit errors (footnote 9). If $P(x)$ means the probability that x bits in the cell are errored and p is the bit error rate, then the fraction of cells which are accepted is given by:

$$P_{accept} = 1 - P_{cell\_loss} = P(0) + P(1) = (1 - p)^{424} + 424\,p\,(1 - p)^{423} \tag{5.2}$$

and, in this model, $P_{cell\_loss} = P_f$.

9 In Opnet terms, ecc_threshold was set to the value 0.0047 at the receiver of each sink. ecc_threshold is the fraction of errored bits in the packet which is accepted, and corresponds to allowing one bit in 424 (=0.00236), while rejecting two bits in 424 (=0.00472).

Table 5.1 shows how the fraction of cells received on the downlink varies with the bit error rate, and shows excellent agreement between theory and Opnet simulation.

| Link BER p | No. cells sent (simulation) | No. cells received (simulation) | Fraction of cells received (simulation) | Theoretical value $P_{accept}$ |
|---|---|---|---|---|
| $10^{-4}$ | 999 | 998 | 0.999 | 0.999 |
| $10^{-3}$ | 999 | 925 | 0.926 | 0.932 |
| $3 \cdot 10^{-3}$ | 999 | 637 | 0.638 | 0.637 |
| $10^{-2}$ | 999 | 71 | 0.071 | 0.075 |

Table 5.1: Forward downlink cell loss rate in the presence of random bit errors
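The theoretical column of Table 5.1 follows directly from (5.2). The sketch below recomputes it for the four bit error rates in the table. The function name is hypothetical and the 424 bit cell length is the only parameter.

```python
def p_accept(p: float, cell_bits: int = 424) -> float:
    """Probability that a cell has zero or one errored bit, equation (5.2)."""
    p_zero = (1.0 - p) ** cell_bits
    p_one = cell_bits * p * (1.0 - p) ** (cell_bits - 1)
    return p_zero + p_one


if __name__ == "__main__":
    for ber in (1e-4, 1e-3, 3e-3, 1e-2):
        print(f"BER={ber:.0e}  P_accept={p_accept(ber):.4f}")
```

The computed values agree with the theoretical column of the table to within rounding.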

To validate the model against the theoretical unicast link utilisation only one sink was modelled. Opnet simulations for 500 seconds were run for a variety of bit error rates on the forward downlink. The value of S in the simulation is then given by:

Ssim = (Number of cells sent by source) / (Number of cells delivered by sink) (5.3)

and the cell loss probability in the simulation is given by:

Pf sim = 1 - (Number of cells received by sink) / (Number of cells sent by source) (5.4)

Figure 5.3 compares the theoretical and simulation results and shows that they agree to a very high accuracy.

5.3.4 Extension of model to multicast case

It would be expected that if the packets are multicast to several different sinks the overall performance will be reduced as described in Section 4, assuming the errors on the forward downlinks to be independent. This is because each sink will lose some packets which cause the source to implement its Go-Back-N algorithm, even though the other sinks have successfully received the packet. The overall network utilisation will be reduced compared to a system which only has one sink.

Figure 5.4 shows the multicast behaviour of the Go-Back-N model. Here, the same bit error rate was specified on each of the three downlinks. The simulation values of Ssim and Pf sim were calculated using equations (5.3) and (5.4); for each simulation run with 3 multicast sinks there are of course three such pairs of values. It will be observed that for example at a BER of $2 \times 10^{-3}$ (corresponding to Pf ~0.2) the three sink multicast system uses approximately 25% more bandwidth to transmit the same data than the single sink unicast system.

Figure 5.3: Continuous RQ Go-Back-N model validation: unicast performance

Figure 5.4: Continuous RQ Go-Back-N model: multicast performance

5.3.5 Summary

This simple model showed that a satellite model that emulates an ATM point-to-multipoint call could be developed. Theoretical models for random bit error performance and the unicast Go-Back-N protocol efficiency have been developed and excellent agreement has been shown between the theoretical models and the Opnet simulations. The model has been extended to illustrate behaviour in a simple multicast case.

This model provided the author with considerable confidence in the development of Opnet models. The next stage was to extend the model to consider multiple layers of protocols.

5.4 Layered protocol model

This Section describes how the single layer model of the previous Section was extended. The overall model architecture is shown in Figure 5.1(b), and the key features of the design are now described.

At the application layer the model kept the same Continuous RQ Go-Back-N protocol, with packets transmitted at a constant rate of one packet every 0.5 seconds. However, here the packet was changed so that it did not represent a 53 octet ATM cell, but instead was a packet of fixed length (this length typically had a value of a few hundred octets, and could be changed easily between different simulation runs). Together with the headers and trailers of the other protocols in the stack, this meant that a single application layer packet would in general be transmitted as a number of ATM cells. For the purposes of this model the return Nacks were given the same size as the application data packets.

The IP layer was implemented with very simple functionality. When it receives a packet from the layer above, the packet is encapsulated in a datagram with a 20 octet header (i.e. assuming IP Version 4) and is passed to the layer below. Conversely, when receiving a packet from the layer below, the header is stripped off the packet and the encapsulated packet is passed to the layer above. In this simple model, no value was assigned to any of the fields in the header: in particular, no IP addressing system was implemented.

The AAL5 layer was similar, adding or stripping an 8 octet trailer to or from the IP datagram. The model again did not assign any value to any of the fields in the trailer, and in particular it did not generate or check the CRC field: error detection relied instead on a mechanism implemented in the ATM layer as described below. AAL5 padding is also effectively added by the ATM layer.

The ATM layer receives an AAL5 PDU, calculates the number of 48 octet payloads required to hold the PDU, and creates this number of cells. ATM header fields are assigned values as follows:

• The PT field is set to zero for all cells except the final cell of the AAL5 PDU, for which it is set to one.

• As a rudimentary VPI/VCI (Virtual Path Identifier and Virtual Channel Identifier) addressing mechanism, an address unique to each ATM process is generated at simulation start time, and is inserted in the VCI field of the header of each cell. This effectively assigns a unique VPI/VCI to each flow from any ATM process in a node to the ATM process in any other node. This was required since on the return path from the sinks to the source, Nacks could be transmitted simultaneously by sinks, and their ATM cells become interleaved on the reverse downlink back to the source. The addressing mechanism allowed these cells to be de-interleaved.

• A sequence number is assigned to each ATM cell transmitted by any given ATM process, and is carried in a dummy field in the cell in a way which does not affect the ATM cell’s apparent length (footnote 11). When the ATM cells are received at the destination their sequence numbers are checked. If any is missing, a status flag is set to ‘fail’ and the AAL5 PDU (i.e. all ATM cells up to and including the cell with PT=1) is destroyed, and not passed up to the receiver’s AAL5 layer. This mechanism effectively emulates from a network perspective the function of the AAL5 CRC, at less implementation effort.
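The segmentation and sequence-number check described in the list above can be sketched as follows. This is an illustrative Python model for a single flow, not the Opnet process code. The names `Cell`, `segment`, `Reassembler` and `reassemble` are hypothetical, the first cell seen is accepted whatever its sequence number, and de-interleaving by VCI is omitted.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    vci: int   # rudimentary per-process address
    pt: int    # 1 on the final cell of an AAL5 PDU, otherwise 0
    seq: int   # per-process cell sequence number (dummy field)


def segment(pdu_octets: int, vci: int, next_seq: int) -> List[Cell]:
    """Split an AAL5 PDU into the number of 48 octet payloads required."""
    n = math.ceil(pdu_octets / 48)
    return [Cell(vci, 1 if i == n - 1 else 0, next_seq + i) for i in range(n)]


class Reassembler:
    def __init__(self) -> None:
        self.expected_seq: Optional[int] = None
        self.failed = False

    def receive(self, cell: Cell) -> Optional[bool]:
        """Return True or False when a PDU ends (passed up or destroyed), else None."""
        if self.expected_seq is not None and cell.seq != self.expected_seq:
            self.failed = True           # a cell is missing: fail the current PDU
        self.expected_seq = cell.seq + 1
        if cell.pt == 1:
            ok = not self.failed
            self.failed = False          # start afresh for the next PDU
            return ok
        return None


def reassemble(cells: List[Cell]) -> List[bool]:
    r = Reassembler()
    return [x for x in (r.receive(c) for c in cells) if x is not None]


if __name__ == "__main__":
    cells = segment(128, vci=7, next_seq=0) + segment(128, vci=7, next_seq=3)
    print(reassemble(cells))                 # no loss
    print(reassemble(cells[:2] + cells[3:]))  # final cell of first PDU lost
```

Dropping the final cell of the first PDU causes both PDUs to be destroyed as one, because the PT=1 boundary is never seen.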

5.4.1 Validation

The following equations describe the behaviour of the layered protocol model:

• The random bit error model provided by Opnet is still used, so equation (5.2) applies when we allow zero or one bit errors.

• Due to the use of the value of the PT field to determine the end of an AAL5 PDU, we see that successful transmission of an AAL5 PDU in N ATM cells requires all N cells to be transmitted successfully together with the last ATM cell of the preceding AAL5 PDU. If Pf is now the probability of loss of an AAL5 PDU then:

$$1 - P_f = (1 - P_{cell\_loss})^{N+1} \tag{5.5}$$

where $P_{cell\_loss}$ is defined by (5.2). Note that this equation is not the same as equation (4.5), since the random bit error model is unable to distinguish between ATM cell error and ATM cell loss.

• No losses occur in the IP layer of the model, so the value of Pf is also the probability of loss of an IP datagram.

• The Go-Back-N protocol is still used in the application layer, so equation (5.1) above applies.

Figure 5.5 shows the unicast behaviour of the layered protocol model, illustrating good agreement between theory and simulation. As before, Opnet simulations for 500 seconds were run for a variety of bit error rates on the forward downlink, and (5.3) and (5.4) give the simulation values for S and Pf.

11 In Opnet the function call op_pk_total_size_set() can be used to set the cell length to 424 bits irrespective of the data actually being carried.

Figure 5.5: Layered protocol model validation: unicast performance (footnote 12)

5.4.2 Summary

The layered protocol model worked well and provided a base from which the theoretical model for unicast and multicast error performance could be tested. This however first required the development of a burst error model, which is described in the following Section.

5.5 Burst error model

5.5.1 Approach

This Section describes the development of a burst error simulation model. This phase of the work proceeded in two stages:

• A model was first developed which tested each bit of every ATM cell on the link, determining whether or not it was errored, and summing the bit losses for each cell: this model was used to validate the theory of Section 4;

• Once the theoretical model had been validated, a second simulation model was developed which used the formulae of Section 4 to calculate the cell loss and error statistics. The advantage of this was that this second model ran faster: the simulations described in this Section ran in approximately 70% of the time required for the bit-by-bit simulation.

12 Application packet size=100 octets (i.e. 3 ATM cells per application packet), link data rate=9600 bit/s.

The burst error models described here were used in the link’s pipeline Stage 2 (as defined in Section 5.1 above). The Stage 3 error correction model was not changed: Section 5.5.3 below explains how the burst error model distinguishes between cell loss and cell error.

5.5.2 Theory of bit-by-bit burst model

We continue our assumption of Section 4 that a burst of length b means that b consecutive bits are errored. We therefore simulate the bursty link by testing each bit to determine if it is the start of a burst, and if so regarding that bit and the succeeding (b-1) bits as errored. The simulation deals with bursts of fixed length. For our theoretical analysis we first consider all the bits of the bit stream except those that are bits 2, 3, .. b of a burst. That is, we are considering bits which either are not errored, or are the first bit of a burst. Let the probability of any of these bits being in error be y. We then seek an expression for y in terms of the link bit error rate p.

If such a bit is errored then it is followed by (b-1) other errored bits. On average, therefore, a set of 1/y such bits will contain one bit which is the start of an error burst, and this is followed by (b-1) further errored bits. The mean bit error rate, which we require to be p, is thus:

$$p = \frac{b}{\frac{1}{y} + (b - 1)}$$

On rearranging this we obtain an expression for y in terms of p:

$$y = \frac{p}{p + b\,(1 - p)} \tag{5.6}$$

For example, to simulate a BER of p=0.5 with burst lengths b=6, we set y=1/7 (this is illustrated in Figure 5.6).

[Figure 5.6 diagram: 24 bits containing two bursts each of length b=6, with errored and non-errored bits marked and the bits that are not bits 2..b of a burst indicated. The probability that any bit other than bits 2..6 of a burst is the start of a burst is y=2/14=1/7.]

Figure 5.6: Bit-by-bit burst error model theory
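The worked example of Figure 5.6 can be verified with exact arithmetic. The sketch below implements (5.6) and the preceding expression for the mean bit error rate using Python's `fractions` module, so the values 1/7 and 1/2 are reproduced without rounding. The function names are hypothetical.

```python
from fractions import Fraction


def burst_start_probability(p, b):
    """Equation (5.6): probability y that a non-burst bit starts a burst."""
    return p / (p + b * (1 - p))


def mean_ber(y, b):
    """Mean bit error rate produced by start probability y and burst length b."""
    return b / (1 / y + (b - 1))


if __name__ == "__main__":
    y = burst_start_probability(Fraction(1, 2), 6)
    print("y =", y)
    print("BER recovered from y =", mean_ber(y, 6))
```

Applying `mean_ber` to the result of `burst_start_probability` returns the original p, which is a convenient check that the two expressions are inverses of each other.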

5.5.3 Software design of bit-by-bit burst model

Since the model is only applicable to ATM cells, it rejects any Opnet packet that is other than 424 bits long, and terminates the simulation. The burst model software then performs two functions:

• Calculate the number of errored bits in the ATM cell header and payload;

• Determine whether the cell is accepted or rejected. Specifically, if a cell loss event (footnote 13) has occurred, then the packet is discarded and never reaches the receiver at the far end of the link (footnote 14). However, if only a cell error event (footnote 15) has occurred (and not a cell loss), then a parameter (footnote 16) is set indicating that the cell is errored, and the cell is forwarded to the receiver.

The ATM process that receives packets from the link was modified so that if a cell which is errored is received, the flag that indicates whether the AAL5 PDU is correctly received is set to fail the PDU. However, the ATM layer can still detect from the header the correct value of the PT field. It can thus be seen that equation (5.5) is now replaced with equation (4.5), the correct expression for the probability of AAL5 PDU loss:

$$1 - P_f = \left(1 - \frac{424\,p}{b}\right)^N \left(1 - \frac{(39 + b - 2)\,p}{b}\right) \tag{5.7}$$

The link burst error model maintains the following state variables:

• burst: whether or not there is currently an error burst on the link (true/false);

• burst_bit_num: the sequence number of a bit within a burst (only meaningful if burst=true);

• hec_mode: flag indicating whether the ATM HEC dual mode algorithm (Figure 3.1) is in correction mode or detection mode.

The burst_bit_num state variable allows a burst to extend over two ATM cells. This assumption that bursts extend between adjacent cells is only true if the cells are continuously transmitted over the link. This is not the case in the Opnet models when the link utilisation is less than 100% (or when part of the link load is caused by background traffic as we consider later in this thesis). However this does not materially affect the validity of the results presented here (footnote 17).

13 In accordance with the ATM HEC algorithm (Figure 3.1), a cell loss event means two or more errored bits in the header if the HEC algorithm is in Correction mode, or one or more errored bits in the header if the HEC algorithm is in Detection mode.

14 This is achieved in the Opnet simulation by setting the number of errored bits (i.e. the Stage 2 pipeline output required by Opnet) to be 423. The Stage 3 pipeline code compares this value with the ecc_threshold, and rejects the cell.

15 A cell error event means one or more errors in the ATM information field (payload).

16 The parameter is an Opnet Transmission Data Attribute, which can be read by the process that receives the cell.

17 This is because the Go-Back-N application generates cells at a constant rate over the duration of the simulation, and the absence or presence of burst errors does not affect the number of cells that are transmitted per unit time. Even where the traffic volume is a function of the error rate (as for example

Figure 5.7 illustrates the two state mechanism used in the burst error model.

[Figure 5.7 diagram: two states. Non-burst mode (burst=false): mark each bit as unerrored; transition to burst mode with probability y. Burst mode (burst=true): mark each bit as errored; return to non-burst mode after b errored bits.]

Figure 5.7: Link state transition diagram
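The two state mechanism of Figure 5.7 can be sketched as a loop over the bits of successive cells. The Python below is a simplified illustration, not the Opnet pipeline code: it uses the `burst` and `burst_bit_num` state variables with a fixed integer burst length, lets a burst run on into the next cell, and omits the HEC mode and the distinction between header and payload errors. The function name is hypothetical.

```python
import random
from typing import List


def errored_bits_per_cell(num_cells: int, p: float, b: int,
                          rng: random.Random, cell_bits: int = 424) -> List[int]:
    """Count errored bits in each of num_cells cells for link BER p, burst length b."""
    y = p / (p + b * (1.0 - p))   # equation (5.6)
    burst = False                 # is an error burst in progress?
    burst_bit_num = 0             # errored bits emitted so far in this burst
    counts = []
    for _ in range(num_cells):
        errors = 0
        for _ in range(cell_bits):
            if not burst and rng.random() < y:
                burst = True      # this bit is the first bit of a new burst
                burst_bit_num = 0
            if burst:
                errors += 1
                burst_bit_num += 1
                if burst_bit_num == b:
                    burst = False  # burst complete after b errored bits
        counts.append(errors)
    return counts


if __name__ == "__main__":
    counts = errored_bits_per_cell(2000, 0.01, 6, random.Random(1))
    print("measured BER:", sum(counts) / (len(counts) * 424))
```

Because the state persists between cells, a burst that starts in the last few bits of one cell continues into the next, as the model description requires when cells are transmitted continuously.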

5.5.4 Results of bit-by-bit burst model

Appendix B shows simulation results for a range of bit error rates and application packet lengths. These results are compared with the theoretical values for unicast error performance of Section 4 (Figure 4.5) in Figure 5.8. Figure 5.9 compares the ATM CER and CLR from the simulation with the theoretical values from (4.1) to (4.4):

$$CLR = \frac{(39 + (b - 2))\,p}{b} \tag{5.8}$$

$$CER = \frac{((387 - b) + (b - 2))\,p}{b} = \frac{385\,p}{b} \tag{5.9}$$

The results show excellent agreement between theory and simulation.

5.5.5 Effect of variable burst length

In a real Viterbi decoder output the error bursts are not of fixed length, but are variable. In Section 4 we assumed that the burst length b was constant, and it was suggested that it is adequate to assume that b is the same as the mean burst length $b_{mean}$ of the Viterbi decoder output.

This hypothesis can be tested in Opnet. Using the bit-by-bit model the length of each individual burst can be randomly generated using a Poisson distribution of mean b. A sample set of results is shown in Table 5.2 where the results are compared with the equivalent simulation results for a fixed burst length. The table shows excellent agreement in the simulation results between the two methods, and supports the hypothesis that the assumption of Section 4 is valid.

17 (continued) with the multicast protocol described later, where an increase in the number of Nacks results in more traffic on the forward link), the simulation results are not affected, for reasons given in the following paragraph.

This point is discussed in more detail by [Alburquerque,00], who considers the effect of a two state Markov error model on a link carrying TCP. He states that in a simulation errors on the link cause TCP to implement its congestion control mechanism and reduce the TCP traffic. This in turn means that errors which are some multiple (>1) of the number of packet transmissions appear to be spread over a longer period of time. This problem does not apply in our case since we assume that all error bursts last for much less time than one cell period. The time duration of error bursts is therefore in our case not affected by the protocol’s feedback mechanism.

Figure 5.8: Unicast error performance using bit-by-bit burst error & layered protocol models

Figure 5.9: ATM CER and CLR using bit-by-bit burst error & layered protocol models

| Parameter | Symbol / expression | Value |
|---|---|---|
| App packet length | | 1492 octets |
| ∴ N | $N$ | 32 ATM cells per app pkt |
| Notional downlink BER | $p$ | $3 \times 10^{-4}$ |
| Simulation run time | runtime | 5,001 seconds |
| Link data rate | | 512 kbit/s |
| $(1 - P_{cell\_loss})$ | $1 - 424\,p/b$ | 0.97880 |
| $(1 - P_f)$ | $(1 - P_{cell\_loss})^{N+1}$ | 0.50266 |
| $S$ | $\dfrac{1 + 2P_f + 2P_f^2}{1 - P_f}$ | 4.95 |

| Quantity | Expression | Theory: fixed burst length | Simulation: Poisson distribution |
|---|---|---|---|
| No. application pkts sent | $A_S = 2 \times \mathrm{runtime} - 1$ | 10,001 | 10,001 |
| No. application pkts delivered | $A_S / S$ | 2,019 | 2,239 |
| No. ATM cells sent | $N \cdot A_S$ | 320,032 | 320,032 |
| No. ATM cells received | $(1 - P_{cell\_loss})\,N A_S$ | 313,247 | 313,377 |
| No. IP datagrams received | $(1 - P_f)\,A_S$ | 5,027 | 5,112 |
| P(single IP d’gram loss) | $1 - (1 - 424\,p/b)^{N}(1 - 43\,p/b)$ | 0.497 | 0.489 |
| No. errored bits | | - | 40,101 |
| Actual downlink BER | | - | $2.95 \times 10^{-4}$ |

Table 5.2: Comparison of constant burst length theory (fixed length = 6) and Poisson-distributed burst length model (mean length = 6)
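The theory column of Table 5.2 can be reproduced with a few lines of arithmetic. The following is a minimal sketch rather than the thesis software: the function and variable names are illustrative assumptions, and the frame success probability is taken from the expression in the "P(single IP d'gram loss)" row of the table.

```python
def theory_column(p, b, n_cells, runtime_s):
    """Recompute the theoretical values of Table 5.2 (names are illustrative)."""
    one_minus_p_cell_loss = 1 - 424 * p / b
    # Frame success probability, read from the P(single IP d'gram loss) row:
    # P_f = 1 - (1 - 424 p/b)^N (1 - 43 p/b)
    one_minus_pf = one_minus_p_cell_loss ** n_cells * (1 - 43 * p / b)
    p_f = 1 - one_minus_pf
    # Transmissions per delivered packet, as given in the S row of the table
    s = (1 + 2 * p_f + 2 * p_f ** 2) / (1 - p_f)
    app_sent = 2 * runtime_s - 1
    atm_sent = n_cells * app_sent
    return {
        "one_minus_p_cell_loss": one_minus_p_cell_loss,
        "one_minus_pf": one_minus_pf,
        "S": s,
        "app_sent": app_sent,
        "app_delivered": app_sent / s,
        "atm_sent": atm_sent,
        "atm_received": one_minus_p_cell_loss * atm_sent,
        "ip_received": one_minus_pf * app_sent,
    }


if __name__ == "__main__":
    # Parameters of Table 5.2: p = 3e-4, b = 6, N = 32, runtime = 5,001 s
    for name, value in theory_column(3e-4, 6, 32, 5001).items():
        print(f"{name}: {value:.5f}")
```

Rounding the printed values to the nearest integer gives the packet, cell and datagram counts quoted in the theory column.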

5.5.6 Development of formula-based model

Having shown with a bit-by-bit error model that the theoretical results of Section 4 match the simulation results, a new error model based on the formulae (equations (4.1) to (4.4)) was developed.

The core of the formula model is as follows. For each cell that is carried on the link, a random number between 0 (inclusive) and 1 (exclusive) is generated using Opnet’s uniformly distributed random value generator. Two parameters are then determined (Figure 5.10):

• A cell loss event occurs if the random value lies between 0 and $(39 + (b-2))\,p/b$;

• A cell error event occurs if the random value lies between $39\,p/b$ and $424\,p/b$.

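[Figure 5.10: the interval from 0 to 1 is divided, starting at 0, into segments of width 39 p/b, (b-2) p/b and (387-b) p/b. The cell loss event covers the first two segments and the cell error event covers the last two.]

Figure 5.10: Model of cell loss and cell error (formula-based burst error model)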
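The per-cell test of the formula-based model can be sketched as follows. This is an illustrative Python version, not the Opnet code of Appendix C: the function names are assumptions, Python's `random` module stands in for Opnet's random value generator, and the treatment of the interval end points (lower bound inclusive, upper bound exclusive) is also an assumption.

```python
import random


def classify_cell(u, p, b):
    """Return (lost, errored) for one cell given a uniform draw u in [0, 1)."""
    loss_upper = (39 + (b - 2)) * p / b   # upper bound of the cell loss region
    error_lower = 39 * p / b              # lower bound of the cell error region
    error_upper = 424 * p / b             # upper bound of the cell error region
    lost = 0 <= u < loss_upper
    errored = error_lower <= u < error_upper
    return lost, errored


def simulate_link(num_cells, p, b, seed=1):
    """Estimate the cell loss and cell error ratios over num_cells cells."""
    rng = random.Random(seed)
    lost_count = 0
    errored_count = 0
    for _ in range(num_cells):
        lost, errored = classify_cell(rng.random(), p, b)
        lost_count += lost
        errored_count += errored
    return lost_count / num_cells, errored_count / num_cells


if __name__ == "__main__":
    clr, cer = simulate_link(200_000, p=3e-4, b=6)
    print(f"CLR ~ {clr:.5f}, CER ~ {cer:.5f}")
```

Because only one random draw is needed per cell rather than one test per bit, the cost per cell is constant, which is the motivation for replacing the bit-by-bit model.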

5.5.7 Testing and validation of formula-based model

Table 5.3 compares a sample pair of runs conducted using the bit-by-bit error model and the formula-based model. The table shows that the models agree on the number of application packets delivered to within 5% (and to within 2-3% for the numbers of IP datagrams and ATM cells received).

5.5.8 Summary

In this Section the burst error model has been described, and results from simulations using the model presented. They show excellent agreement with the theoretical model of Section 4, and this gives significant confidence in the validity of both theory and simulation software.

This Section now finally addresses the development of the reliable multicast protocol model and file transfer application.

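| Parameter | Symbol / expression | Value |
|---|---|---|
| App packet length | | 1492 octets |
| ∴ N | $N$ | 32 ATM cells per app pkt |
| Notional downlink BER | $p$ | $3 \times 10^{-4}$ |
| Simulation run time | runtime | 5,001 seconds |
| Link data rate | | 512 kbit/s |
| $(1 - P_{cell\_loss})$ | $1 - 424\,p/b$ | 0.97880 |
| $(1 - P_f)$ | $(1 - 424\,p/b)^{N}(1 - 43\,p/b)$ | 0.50266 |
| $S$ | $\dfrac{1 + 2P_f + 2P_f^2}{1 - P_f}$ | 4.95 |

| Quantity | Expression | Theory | Simulation: bit-by-bit model | Simulation: formula model |
|---|---|---|---|---|
| No. application pkts sent | $A_S = 2 \times \mathrm{runtime} - 1$ | 10,001 | 10,001 | 10,001 |
| No. application pkts delivered | $A_S / S$ | 2,019 | 2,209 | 2,078 |
| No. ATM cells sent | $N \cdot A_S$ | 320,032 | 320,032 | 320,032 |
| No. ATM cells received | $(1 - P_{cell\_loss})\,N A_S$ | 313,247 | 313,386 | 313,176 |
| No. IP datagrams received | $(1 - P_f)\,A_S$ | 5,027 | 5,116 | 4,954 |
| P(single IP datagram loss) | $1 - (1 - 424\,p/b)^{N}(1 - 43\,p/b)$ | 0.497 | 0.488 | 0.505 |
| No. errored bits | | - | 39,942 | Not modelled |
| Actual downlink BER | | - | $2.94 \times 10^{-4}$ | N/A |

Table 5.3: Comparison of theory, bit-by-bit error model and formula-based model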

5.6 Reliable multicast protocol

This Section describes the reliable multicast protocol that has been implemented in Opnet. The protocol is based on PGM, described in Sections 2.4.4 and 3.4. The software is not a fully compliant implementation of PGM: instead, the code focuses on those areas of the protocol that deal with error correction (ARQ and FEC). Furthermore, in a number of areas the protocol specification [PGM,01] does not mandate specific algorithms to be implemented, and also allows considerable latitude in interpretation. In these areas the author has attempted to make sensible assumptions for the implementation described here.

An overview of the model is shown in Figure 5.1(c), and the Opnet implementation of this as a set of processes within the source node is illustrated in Figure 5.11. PGM regards itself as an end-to-end transport protocol [PGM,01: Section 2], and indeed it is responsible for end-to-end ordered non-duplicated delivery of data. Consequently it is valid to regard the reliable multicast protocol described in this thesis as a transport layer protocol.

Figure 5.11: Opnet model: processes within ‘source’ node

5.6.1 Transmit window and window advance strategy

PGM uses a sliding window protocol [PGM,01: Section 16]. The window mechanism uses both a transmit window and an increment window (Figure 5.12), which are defined by the following parameters:

• TXW_TRAIL: the sequence number of the trailing edge of the transmit window;

• TXW_LEAD: the sequence number of the leading edge of the window (i.e. the most recently transmitted packet);

• TXW_INC: the sequence number of the leading edge of the increment window, i.e. the most recently transmitted packet among those that will expire on the next increment of the transmit window;

• TXW_SQNS: the transmit window size, in sequence numbers.

The window is advanced by increasing TXW_TRAIL to the value of TXW_INC. At this point, data packets less than the new value of TXW_TRAIL are no longer available for repair. PGM does not state how or when to advance the window, but it does introduce the concept of a timer which runs for a period TXW_ADV_IVL. One option in the PGM specification [PGM,01, Section 16.2] which has been adopted here in the Opnet software is as follows:

• The timer is reset whenever a Nack is received for a packet that is in the increment window;

• When the timer expires after period TXW_ADV_IVL the window is advanced to TXW_INC (and a new value of TXW_INC is calculated).

[Figure 5.12: diagram labels: order of original data transmission; TXW_TRAIL; TXW_INC; TXW_LEAD; highest seq num packet that can be transmitted; transmit window (size: TXW_SQNS); increment window.]

Figure 5.12: PGM transmit window: sequence numbers

In other words, the window is advanced once no Nacks have been received for a period TXW_ADV_IVL for sequence numbers within the increment window.

PGM also suggests the following expression to determine the window’s parameters (Section 16 of the specification):

$$\mathrm{TXW\_SQNS} = \frac{\mathrm{TXW\_MAX\_RTE} \times \mathrm{TXW\_SECS}}{\mathrm{PACKET\_SIZE}} \qquad (5.10)$$

where:

• TXW_MAX_RTE is the maximum transmit rate in bytes per second;

• PACKET_SIZE is the number of bytes in each PGM packet;

• TXW_SECS is the period of time for which data is retained by the source as being available for repair.

In addition, TXW_INC needs to be defined. Section 16 of the PGM specification suggests the following definition, which appears suitable for Selective Nack mode:

$$\mathrm{TXW\_INC} - \mathrm{TXW\_TRAIL} = \frac{\mathrm{TXW\_MAX\_RTE} \times \mathrm{TXW\_ADV\_SECS}}{\mathrm{PACKET\_SIZE}} \qquad (5.11)$$

where TXW_ADV_SECS is the period of time represented by the size of the increment window. However, the PGM specification is not clear on how the window increment is determined when Parity Nack mode is used. We recall from Section 3.4 that in Parity Nack mode a transmission group of k=TGSIZE packets is encoded to form k packets of original data and (n-k) packets of repair data. Consequently, either all of the transmission group is available for repair or it is not. Here therefore the following assumption has been made:

$$\mathrm{TXW\_INC} - \mathrm{TXW\_TRAIL} = \mathrm{TGSIZE} \qquad (5.12)$$

A further sensible approach may be to ensure that TXW_SQNS is an integer multiple of TGSIZE (although this has not been implemented in the results described in the next Section).
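As an illustration of equations (5.10) to (5.12), the window sizes can be computed as in the sketch below. The function name is an assumption, and so is the rounding: the text does not state how a fractional number of sequence numbers is handled, so the sketch simply rounds down.

```python
def window_parameters(txw_max_rte, txw_secs, txw_adv_secs, packet_size, tgsize=None):
    """Return (TXW_SQNS, TXW_INC - TXW_TRAIL) in sequence numbers.

    Rounding down to a whole number of packets is an assumption of this sketch.
    """
    # Equation (5.10): transmit window size
    txw_sqns = (txw_max_rte * txw_secs) // packet_size
    if tgsize is None:
        # Equation (5.11): increment window size in Selective Nack mode
        increment = (txw_max_rte * txw_adv_secs) // packet_size
    else:
        # Equation (5.12): in Parity Nack mode the increment is one transmission group
        increment = tgsize
    return txw_sqns, increment


if __name__ == "__main__":
    # Example values: 51,200 bytes/s, 6 s window, 2 s increment, 1480-byte packets
    print(window_parameters(51_200, 6, 2, 1480))             # Selective Nack mode
    print(window_parameters(51_200, 6, 2, 1480, tgsize=64))  # Parity Nack mode
```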

5.6.2 Error correction: Selective Nack mode

In Selective Nack mode, the sink issues a Nack for each missing ODATA (original data) sequence number it detects. A timer is set by the receiver (NACK_TIMERVALUE) for every Nack issued so that the Nack can be re-issued if the appropriate RDATA (repair data) is not received within the timer period.

At the source, a RDATA packet containing the requested packet of data is sent for each Nack received. A timeout parameter has been implemented (RDATA_RESEND_TIMER) so that if two Nacks for the same packet are received within the timer period only one packet is transmitted by the source.
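The source-side suppression of duplicate Nacks can be sketched as below. This is a simplified illustration, not the Opnet process model: the class and method names are hypothetical, and it is assumed that the timer for a sequence number starts when RDATA for that sequence number is sent.

```python
class SelectiveNackSource:
    """Decides whether a received Nack should trigger an RDATA transmission."""

    def __init__(self, rdata_resend_timer):
        self.rdata_resend_timer = rdata_resend_timer
        self._last_rdata_time = {}  # sequence number -> time RDATA was last sent

    def on_nack(self, seq, now):
        """Return True if RDATA for seq should be sent at time now."""
        last = self._last_rdata_time.get(seq)
        if last is not None and now - last < self.rdata_resend_timer:
            # A repair for this packet was sent within the timer period: suppress
            return False
        self._last_rdata_time[seq] = now
        return True
```

With a timer of 1 second, two Nacks for the same packet arriving 0.5 seconds apart produce a single RDATA packet, while a Nack re-issued by a sink after a longer timeout produces a second one.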

5.6.3 Error correction: Parity Nack mode

A transmission group comprises TGSIZE packets. When a sink has received the final packet in a group (or, if the last packet is lost, when it first receives a packet in the following group) it counts the number of packets missing from the group. If this is greater than zero it issues a Nack indicating the number of RDATA packets to be sent. In general the source therefore receives a number of Nacks from different sinks each specifying a number of RDATA packets. The source then transmits the largest number of packets requested. For example, if it receives two Nacks (within the timeout interval RDATA_RESEND_TIMER) requesting respectively 2 repair packets and 3 repair packets, it transmits only three.
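The rule that the source transmits only the largest number of repair packets requested can be sketched as follows. The names are hypothetical, and the sketch assumes one particular reading of the text: the first Nack for a group opens an interval of length RDATA_RESEND_TIMER, and each later Nack in that interval triggers only the repair packets not already sent.

```python
class ParityNackSource:
    """Tracks how many parity RDATA packets to send per transmission group."""

    def __init__(self, rdata_resend_timer):
        self.rdata_resend_timer = rdata_resend_timer
        self._state = {}  # group -> (interval start time, packets sent in interval)

    def on_nack(self, group, requested, now):
        """Return the number of additional RDATA packets to send for this Nack."""
        start, sent = self._state.get(group, (None, 0))
        if start is None or now - start >= self.rdata_resend_timer:
            start, sent = now, 0  # open a new interval for this group
        extra = max(0, requested - sent)
        self._state[group] = (start, sent + extra)
        return extra
```

For the example in the text, Nacks requesting 2 and then 3 repair packets within the interval lead to 2 + 1 = 3 packets in total.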

5.6.4 Simplifications to the PGM specification

The use of NCFs (network confirmations) has not been built in to the model. This is equivalent to modelling a system in which there are no PGM-aware network elements. It is assumed that if the source receives a Nack it will immediately return RDATA (unless RDATA_RESEND_TIMER has not expired) and a NCF is therefore unnecessary. It may be interesting to conjecture a PGM-aware satellite node, but this has been left for further study.

PGM also supports source path messages (SPMs) that notify PGM-aware network elements of the locations of adjacent PGM-aware devices. For the same reason as the simplified modelling of NCFs, this component has been omitted from the work described here.

The PGM specification provides detailed packet formats for ODATA, RDATA and Nacks. These have been simplified for the purposes of this Opnet model.

PGM specifies a token bucket scheme or other equivalent traffic management scheme to avoid flooding the network. In this work, a simpler approximation has been adopted of issuing packets from the file transfer application at a constant rate. This does not emulate a token bucket scheme accurately however, since if there is a high level of repair data then the data rate transmitted by the source rises: this would not happen if a token bucket were managing the flow rate.

5.6.5 File transfer application

The other component added at this stage was a file transfer application, used to replace the constant rate source used for the modelling work to date. The application takes a file of fixed size (FILE_SIZE bytes), divides it into packets of constant size and sends the packets at fixed intervals of TIME_INCREMENT seconds. If the multicast window is not full (i.e. TXW_LEAD < TXW_TRAIL+ TXW_SQNS, Figure 5.12) then the packet is passed to the multicast layer for transmission. If the window is full the application effectively blocks until the trailing edge of the window is incremented to TXW_INC and the window is therefore opened again by the multicast protocol.

File transfer completion is deemed to occur when the application has sent all packets of application layer data and the multicast layer records that a period TX_SUCCESS has passed since any Nack was received for any packet of the application18.

5.6.6 Testing and validation

For the selective repeat protocol implemented in the Selective Nack mode it can be easily shown (Appendix A) that the ratio of transmissions required per packet delivered to the sink is given by $S = 1/(1 - P_f)$. This can be used to test the protocol in a unicast case by predicting the number of IP datagrams required to transmit a given number of packets. Table 5.4 compares theoretical and simulation results, showing the validity of the file transfer and reliable multicast protocol behaviour.
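One standard route to this ratio is sketched here, under the assumption that each transmission of a packet fails independently with probability $P_f$; the derivation in Appendix A may be presented differently. A packet is delivered on its $k$-th transmission with probability $P_f^{\,k-1}(1 - P_f)$, so the mean number of transmissions per delivered packet is a geometric series:

```latex
S = \sum_{k=1}^{\infty} k \, P_f^{\,k-1} (1 - P_f)
  = (1 - P_f) \cdot \frac{1}{(1 - P_f)^2}
  = \frac{1}{1 - P_f}
```

As a worked check, with $(1 - P_f) = 0.59440$ and $A_S = 690$ application packets, the predicted number of IP datagrams sent is $690 / 0.59440 \approx 1161$, which is the theoretical value listed in Table 5.4.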

18 This leaves a subtle fault in the application, namely that if the last packet of the application is not received by any sink, the sink is unaware that further data remains to be received and no Nack is generated. In a practical protocol it would be necessary to use a mechanism such as an end-of-file marker so that the error could be identified and corrected. However, in the simulations here a satisfactory result is obtained even with this small error.

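| Parameter | Symbol / expression | Case 1 | Case 2 |
|---|---|---|---|
| App packet length | | 290 octets | 290 octets |
| ∴ N | $N$ | 7 ATM cells per app pkt | 7 ATM cells per app pkt |
| Notional downlink BER | $p$ | $10^{-3}$ | $10^{-4}$ |
| File size | | 200,000 bytes | 200,000 bytes |
| $(1 - P_{cell\_loss})$ | $1 - 424\,p/b$ | 0.92933 | 0.99293 |
| $(1 - P_f)$ | $(1 - 424\,p/b)^{N}(1 - 43\,p/b)$ | 0.59440 | 0.95089 |
| $S$ | $\dfrac{1}{1 - P_f}$ | 1.6824 | 1.0516 |

| Quantity | Expression | Case 1: Theory | Case 1: Simulation | Case 2: Theory | Case 2: Simulation |
|---|---|---|---|---|---|
| No. application pkts sent | $A_S$ | 690 | 690 | 690 | 690 |
| No. application pkts delivered | $A_S$ | 690 | 690 | 690 | 690 |
| No. IP datagrams sent | $A_S \cdot \dfrac{1}{1 - P_f}$ | 1161 | 1170 | 726 | 722 |
| No. IP d’grams received | $A_S$ | 690 | 690 | 690 | 690 |
| No. ATM cells sent | $N \cdot A_S$ | 8127 | 8190 | 5082 | 5054 |
| No. ATM cells received | $(1 - P_{cell\_loss})\,N A_S$ | 7553 | 7530 | 5046 | 5021 |

Table 5.4: Comparison of theory and simulation for reliable multicast protocol model and file transfer application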

The behaviour of the protocol can be used to test and verify the software. Figure 5.13(a) is a graph from Opnet that shows the ODATA and RDATA sequence numbers transmitted by the source as a function of time when the protocol is in Selective Nack mode. The lower graph shows the ODATA and RDATA sequence numbers received by the sink, together with the Nack sequence numbers that it returns to the source. The vertical scales show the sequence number of each packet, represented by a point on the graph. For this Figure, TXW_SQNS=20 and TXW_INC - TXW_TRAIL=6. We can see the multicast protocol transmitting 20 packets, of which four are not received by the sink (sequence numbers 3, 4, 7 and 10) and so cause Nacks to be sent by the sink, with the packets being re-transmitted by the source as RDATA. One of these packets (sequence number 7) is again not received by the sink, and so when the sink’s timeout expires, a Nack is again returned to the source and the packet is retransmitted as RDATA, this time successfully.

Packets 3 and 4 are in the increment window, and so the window is not advanced until period TXW_ADV_IVL (here set to 3 seconds) has expired. Since these packets were retransmitted at t = 1.58 seconds, the window does not open until t = 4.58 seconds, whereupon TXW_INC – TXW_TRAIL = 6 packets are sent. The window can then only advance 3 seconds after sequence number 7 (which has been transmitted twice as RDATA) has been sent: the second retransmission of packet 7 occurred at t = 4.18 seconds, and so the window is advanced to TXW_INC at t = 7.18 seconds.

Figure 5.13: Illustration of reliable multicast protocol behaviour (Selective Nacks mode): source (top graph) and sink (bottom graph)

Figure 5.14 on the other hand shows the behaviour of the protocol when using Parity Nack mode. Here, the transmission group size (and therefore also the value of TXW_INC – TXW_TRAIL) has been set to 8. As before, packets 3, 4 and 7 are not received by the sink: however, in this case no Nack is issued until the entire transmission group has been received. When the sink receives packet 8 at t = 1.89 seconds it issues a Nack requesting three repair packets (this is represented on the Figure by a single point with the value 3). These three repair packets (with repair sequence numbers 1, 2 and 3) are transmitted by the source at t = 2.18 seconds and received by the sink at about t = 2.5 seconds. This then allows the sink to calculate the values of packets 3, 4 and 7, and thus deliver the entire transmission group to the layer above.

Similarly, in the second transmission group, packets 9, 12, 14 and 15 are not received by the sink and so a request for four repair packets is sent back to the source once packet 16 (the final packet of the transmission group) is received.

Figure 5.14: Illustration of reliable multicast protocol behaviour (Parity Nack mode): source (top graph) and sink (bottom graph)

5.7 Summary

This Section has presented the following stages in the development of an Opnet simulation model:

• A basic illustrative Continuous RQ Go-Back-N protocol;

• A layered protocol model, including IP, AAL and ATM;

• A burst error model that represents the behaviour of the satellite modem;

• A reliable multicast protocol, based on PGM, with two error correction modes (Selective Nack and Parity Nack) and a file transfer application.

The burst error model has been validated against the theoretical analysis of Section 4 by testing the error status of each bit transmitted over the link. Once this had been validated, a second version of the burst error model was developed which uses the equations derived in Section 4. This reduced simulation run times by approximately 30%.

Software printouts of the burst error model and the reliable multicast transport layer protocol are attached in Appendix C.

6 Results

6.1 Introduction

The previous Section described the steps in the development of a reliable multicast model running over IP and ATM, and capable of supporting a file transfer application. In this Section results from this model are presented. The relative performance of the reliable multicast model’s Selective Nack and Parity Nack modes is illustrated and compared.

6.2 Simulation

The model described in Section 5 has here been extended to provide (Figure 6.1):

• A satellite capable of multicasting data to up to 20 receivers;

• Data rates of 2Mbit/s on both the forward and reverse links;

• Background traffic on all links.

In the scenario described in this Section, a 1MB file is to be transferred from the source to 20 receivers. For the values of parameters used in this simulation, the file transfer would take 17.2 seconds in the absence of any ATM cell losses or errors.

Figure 6.1: Opnet model: satellite with data source and 20 receivers

Opnet supports a background utilisation on each link. This is traffic that does not explicitly originate from sources in the scenario, but can nonetheless be included in the link’s traffic and therefore affects the delay seen by the explicitly modelled packets. Opnet models the background traffic on any link as a stream of packets with size equal to the mean packet size of the foreground (explicitly modelled) traffic and an arrival rate given by the required background utilisation (expressed as a number in the range 0 to 1) divided by the mean transmission delay of the foreground traffic19.
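The arrival rate described above is a single division, sketched below. The function name is an assumption and this is not an Opnet API call; it only restates the rule given in the text.

```python
def background_arrival_rate(link_rate_bps, mean_packet_bits, utilisation):
    """Background packet arrival rate (packets per second) for a link.

    utilisation is the required background utilisation in the range 0 to 1.
    """
    mean_tx_delay = mean_packet_bits / link_rate_bps  # seconds per foreground packet
    return utilisation / mean_tx_delay


if __name__ == "__main__":
    # A 19.2 kbit/s link with 1920-bit foreground packets and 50% background utilisation
    print(background_arrival_rate(19_200, 1920, 0.5))
```

For a 19.2 kbit/s link carrying 1920-bit foreground packets, the mean transmission delay is 0.1 seconds, so a 50% background utilisation corresponds to 5 background packets per second.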

6.2.1 Tuning of simulation parameters

Section 5.6 described a number of parameters defined in PGM whose values can be altered. Optimum settings for these parameters depend on the general network state including in particular the round trip time, and a detailed analysis of the optimisation of these parameters is beyond the scope of this thesis. Nonetheless it is necessary to ensure that the parameters are tuned at least sensibly. For example, in Selective Nack mode it is inappropriate to have a source Nack timer value (RDATA_RESEND_TIMER) that is longer than the sink’s Nack re-request timer (NACK_TIMERVALUE). This is because such a set of values would mean that a Nack sent by the sink following a re-request timeout would be ignored by the source, since the source would consider that it had already responded to the preceding Nack.

Except where otherwise stated, the parameters used in the simulations of this Section are listed in Table 6.1.

The value of TXW_MAX_RTE corresponds to a maximum transmission rate of 409.6 kbit/s, or 20% of the 2Mbit/s link. This determines the window size as defined in Section 5.6 by equations (5.10), (5.11) and (5.12). However, since a token bucket scheme has not been implemented in this simulation of PGM, the transmission rate rises above this value when there is a significant level of repair data requested due to high error rates on the links.

6.3 Metrics

Results are presented for the following measurements:

• File transfer time: the total time taken from the initialisation of the file transfer application to the time when the last packet is delivered to all sinks.

• Forward link network traffic: the total number of bytes in all the ATM cells transmitted by the source when sending the file. This therefore includes both original data and repair data transmitted in response to Nacks received by the source. The forward link network traffic in bytes is then $\mathrm{traffic} = 53\,N\,(odata + rdata)$, where odata is the number of original data packets, rdata is the number of repair data packets transmitted by the source and N is the number of ATM cells per data packet20.

19 Thus if the link data rate is 19.2kbit/s and the foreground packet size is say 1920 bits then the mean transmission delay of the foreground packets is 0.1 seconds. For a background utilisation of 50% Opnet models the background traffic as a set of packets of size 1920 bits with an arrival rate of 5 per second (= 0.5 / 0.1 ).

Other measurements that could be used are the mean packet delay, the total network traffic (i.e. summing up the traffic on each separate link), and the volume of return Nack traffic.

| Layer | Parameter | Value |
|---|---|---|
| File transfer application | FILE_SIZE | 1,000,000 bytes |
| | TIME_INCREMENT | 0.025 seconds |
| Reliable multicast: source | TXW_SECS | 6 seconds |
| | TXW_MAX_RTE | 51,200 bytes/second |
| | PACKET_SIZE | 1480 bytes |
| | TXW_ADV_SECS | 2 seconds |
| | TXW_ADV_IVL | 3 seconds |
| | TGSIZE | 64 |
| | TX_SUCCESS | 10 seconds |
| | RDATA_RESEND_TIMER | 1 second |
| Reliable multicast: sink | NACK_TIMERVALUE | 2 seconds |

Table 6.1: Simulation parameters

6.4 File transfer results

We first consider the multicast protocol in Selective Nack mode. Figure 6.2 shows the file transfer time as a function of the forward downlink bit error rate. As modelled in Section 5, here a perfect reverse channel is assumed, together with a zero bit error rate on the forward uplink. Two separate sets of results are shown, for 10 receivers and 20 receivers. As expected, the larger number of receivers results in more Nacks being issued and therefore a longer time to transfer the file.

The forward link network traffic is illustrated in Figure 6.3. Again the impact of increasing the bit error rate is to require more retransmissions of data and hence increase the link traffic. For example, at a bit error rate of $10^{-5}$, the 1MB file requires approximately 1.57MB of network traffic on the forward link compared to only 1.146MB on a lossless network (the excess over 1MB represents the overheads of the reliable multicast protocol, IP, AAL5 and ATM).
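The lossless figure quoted above follows directly from the forward link traffic metric of Section 6.3. The sketch below is illustrative only: the function name is an assumption, and rounding the file up to a whole number of 1480-byte packets is assumed, which reproduces the odata value of 676 used in these results.

```python
import math

ATM_CELL_BYTES = 53  # size of one ATM cell in bytes


def forward_link_traffic_bytes(n_cells_per_packet, odata, rdata):
    """Forward link traffic: 53 * N * (odata + rdata) bytes."""
    return ATM_CELL_BYTES * n_cells_per_packet * (odata + rdata)


if __name__ == "__main__":
    odata = math.ceil(1_000_000 / 1480)  # packets needed for the 1MB file
    lossless = forward_link_traffic_bytes(32, odata, rdata=0)
    print(odata, lossless)
```

With N = 32 and no repair data this gives 1,146,496 bytes, i.e. the 1.146MB lossless traffic.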

Figures 6.4 and 6.5 show the corresponding results when the reliable multicast protocol is in Parity Nack mode. Here Nacks are not issued by a sink until it has received packets from a complete transmission group (set to 64 multicast packets in these simulations) or until it first receives a packet from a successive transmission group. This means that only a single Nack is issued on the reverse channel even if multiple packet losses have occurred. Furthermore, a small number of coded repair packets will enable different packets lost by the sinks to be recovered. Consequently, at a bit error rate of $10^{-5}$, the forward link traffic is only 1.22MB compared to the 1.57MB required by the Selective Nack mode.

20 For the 1MB file used in these results, odata = 676 packets (each of 1480 bytes) and N=32. The value of rdata depends on the bit error rates of the links and the number of sinks.

Figures 6.6 and 6.7 compare the relative performance of the Selective Nack and Parity Nack modes (both for the 20 receiver scenario). The file transfer time is slightly shorter in the case of Parity Nack mode, and the forward link network traffic is significantly reduced. The total number of Nacks issued by the receivers is also significantly reduced in the case of Parity Nack mode. For example, at a bit error rate of $10^{-5}$, the reverse downlink traffic due to Nacks is a total of 16.1kB for Selective Nack mode and 8.5kB for Parity Nack mode21. Both metrics (i.e. file transfer time and forward link network traffic) show the Parity Nack mode has a greater advantage at higher bit error rates.

However, Parity Nack mode has some shortcomings. One of these, the level of jitter, is illustrated in Figure 6.8. This figure shows the ordered packet sequence numbers delivered to the sink’s file transfer application (for sink_0) as a function of time. If there are no packet losses then the packets are delivered to the file transfer sink at an approximately constant rate. However, if a packet is lost then no delivery occurs until the error correction mechanism has recovered the packet: at this point the packet is delivered together with all other successive contiguous packets received at that time. Thus the packet delivery to the file transfer sink is ordered and non-duplicated. Figure 6.8(a) shows the reference case of a system with no link errors and consequently packet delivery occurring at a constant rate. Figures 6.8(b) and (c) show the Selective Nack and Parity Nack results for a BER of $10^{-5}$ with 20 receivers. In case (b) when the receiver detects packet loss the delay in retrieving the packet is of the order of the round trip time (~0.6 seconds) plus any queueing delays. However, in case (c) a Nack is not issued until the entire transmission group is received: for a transmission group size of 64 packets, this is a mean value of the time to transmit 32 packets (i.e. ~0.8 seconds) plus the round trip time (~0.6 seconds) and queueing delays.

Jitter is not an important metric for a file transfer application, but would be a significant measure of performance for a realtime application such as a streaming multimedia transmission. For such an application it would be beneficial to develop a quantitative measure of jitter. An example of such a metric would be the mean square of the difference between the actual packet delivery time and the delivery time in the absence of errors. Such a quantitative measure is also of benefit in the case of Figure 6.8, where a qualitative observation of (b) and (c) suggests that Parity Nack mode, (c), has the higher jitter.

21 The Nack is only a few bytes long and including overheads from lower layer protocols fits into a single ATM cell.

The measure of jitter adopted here is to take a straight line joining the first and last delivered sequence numbers, and measure the sum of the squares of the differences between the actual packet delivery times and the nominal straight line approximation. That is, if the sequence numbers run from 1 to odata, the actual arrival time of packet with sequence number i is $\mathrm{arrival\_time}[i]$, and the nominal arrival time for the packet is $T_{nom}[i]$, then the sum of the squares of the deviations is:

$$\sum_{i=1}^{odata} \bigl(\mathrm{arrival\_time}[i] - T_{nom}[i]\bigr)^2$$

where $T_{nom}[i]$ is linearly interpolated between the first and last delivered sequence numbers:

$$T_{nom}[i] = \mathrm{arrival\_time}[1] + \frac{(i-1)}{(odata-1)}\bigl(\mathrm{arrival\_time}[odata] - \mathrm{arrival\_time}[1]\bigr)$$

and therefore the jitter metric is:

$$jitter = \frac{1}{odata}\sum_{i=1}^{odata} \bigl(\mathrm{arrival\_time}[i] - T_{nom}[i]\bigr)^2 \qquad (6.1)$$

This metric (6.1) gives the following results for the cases illustrated in Figures 6.8(b) and (c):

• Selective Nack mode: mean jitter = 0.695 seconds ± 0.1;

• Parity Nack mode: mean jitter = 0.851 seconds ± 0.09.

This confirms the hypothesis that the Selective Nack mode has less jitter than the Parity Nack mode.
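The jitter metric of equation (6.1) can be computed from a list of delivery times as in the sketch below. The function name is an assumption, and the list is taken to hold the delivery times of sequence numbers 1 to odata in order (so list index 0 corresponds to sequence number 1).

```python
def jitter_metric(arrival_time):
    """Mean squared deviation of delivery times from the nominal straight line."""
    odata = len(arrival_time)
    if odata < 2:
        raise ValueError("at least two delivered packets are needed")
    first, last = arrival_time[0], arrival_time[-1]
    total = 0.0
    for idx, actual in enumerate(arrival_time):  # idx equals i - 1
        # Nominal time: linear interpolation between first and last delivery
        t_nom = first + idx / (odata - 1) * (last - first)
        total += (actual - t_nom) ** 2
    return total / odata


if __name__ == "__main__":
    print(jitter_metric([0.0, 1.0, 2.0, 3.0]))  # evenly spaced deliveries
    print(jitter_metric([0.0, 2.0, 2.0]))       # middle packet delivered late
```

Evenly spaced deliveries lie on the nominal line and give a metric of zero; any packet held back while a repair is awaited, and then released together with its successors, moves points off the line and increases the value.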

All the results presented above assume zero background utilisation on the satellite links. Figure 6.9 provides an illustration of how the file transfer time varies with the level of background traffic. The Figure shows a weak increase in file transfer time as the background utilisation increases. At other BERs the transfer time shows the same characteristic, decreasing at some traffic levels and increasing at others.

Figure 6.2: Selective Nack mode: 1MB file transfer time as function of BER and number of receivers

Figure 6.3: Selective Nack mode: forward link total network traffic for 1MB file transfer as function of BER and number of receivers

Figure 6.4: Parity Nack mode: 1MB file transfer time as function of BER and number of receivers

Figure 6.5: Parity Nack mode: forward link total network traffic for 1MB file transfer as function of BER and number of receivers

Figure 6.6: Comparison of Selective Nack and Parity Nack modes: 1MB file transfer time as function of BER

Figure 6.7: Comparison of Selective Nack and Parity Nack mode: forward link total network traffic for 1MB file transfer as function of BER

(a) Zero bit error rate: constant packet delivery rate

(b) Selective Nack mode: 20 receivers, BER=$10^{-5}$

(c) Parity Nack mode: 20 receivers, BER=$10^{-5}$

Figure 6.8: Comparison of jitter: file transfer application packet delivery as function of time

Figure 6.9: Effect of background traffic on typical file transfer time (Selective Nack mode, BER=$10^{-5}$)

7 Conclusions and further work

7.1 Summary

This thesis began by reviewing the advantages and drawbacks of satellite-based communication systems and outlined the proposed architecture of some systems currently under development. The thesis went on to review the principles of multicast routing protocols and reliable multicast protocols, and described in outline a selection of these protocols, including in particular one protocol, Pragmatic General Multicast (PGM).

The thesis then focused on one of the significant features of satellite communications links, namely error performance, and considered error correction mechanisms in the context of multicast protocols. A theoretical model was developed for the unicast and multicast error performance of IP over a satellite ATM link subject to error bursts resulting from the Viterbi decoding normally employed in satellite modems. The network simulation tool Opnet was then used to develop a model of IP, AAL5 and ATM running over a satellite link subject to burst errors, and the theoretical and simulation results compared. The theoretical equations were validated using an Opnet burst error simulation model built by the author that tests each bit on the link for its error condition. Following validation of the equations, the link model was replaced with one that used the formulae (5.8) and (5.9) to calculate ATM cell loss and cell error conditions. This resulted in an approximately 30% reduction in simulation run times.

The simulation model was then extended to include the error correction components of the reliable multicast protocol PGM. This protocol supports two modes of error correction: one (“Selective Nack”) is a continuous RQ selective repeat protocol, the other (“Parity Nack”) is a Type II hybrid ARQ code that encodes repair data based on groups of packets. A file transfer application was also developed and used to provide a source of data for the simulation. The behaviour of the two error correction modes was compared in the example case of multicasting a 1MB file to a number of receivers, and quantitative measures were obtained of the performance of the two error correction modes as a function of satellite link bit error rate.

7.2 Conclusions

A theoretical model of the error performance of IP over a satellite ATM link has been developed in this thesis. A model of this link has been developed in Opnet, and excellent agreement has been obtained between the theory and the simulation results. The theoretical expressions derived assumed a constant burst length b, but simulations have been used to show that the equations also remain valid when the burst length has a Poisson distribution of mean b.

The results presented in Section 6 show that the use of Parity Nacks as an error correction mechanism in a reliable multicast protocol can result in a significant saving of network traffic and reduced transfer time for a file transfer application. However, this mechanism suffers from some disadvantages:

• Increased jitter in the delivery of correctly sequenced packets to the end application, due to the delay in sending Nacks to request repair data;

• Significant processing power is required in the receiver to decode the received packets when losses occur. [Rizzo,97a] and [Rizzo,97b] show the decoding time to be of the order of milliseconds (see footnote 22).

7.3 Further work

7.3.1 Analysis of other protocols

The Opnet model developed during this work and described in this thesis provides a basis for further investigation of the behaviour of applications and reliable multicast protocols over satellite links. Some areas which are likely to prove fruitful include the following:

• Other applications: real-time applications such as multimedia streaming and videoconferencing are examples where the error behaviour of the network can significantly affect the quality of service seen by the end-user. Quantitative measures of their performance using reliable multicast protocols over a satellite link would be very useful. Web browsing (effectively a transfer of a small number of files) could also be of interest, although this is more usually an example of unicast network traffic.

• Other error performance aspects of multicast protocols: proactive FEC, supported in PGM, could provide a useful mechanism for improved performance under moderate packet loss rates. Here, a small number of repair packets would be sent with each transmission group (Parity Nack mode) so that a sink could recover from a small amount of packet loss without needing to explicitly request data from the source. An analysis of the precise amount of repair data needed to minimise parameters such as total network traffic and jitter in application packet delivery would be useful. This mechanism could be particularly appropriate for multimedia streaming applications (see footnote 23), where the high satellite round trip time means that ARQ does not provide benefit, and may be beneficial for web browsing.

• Behaviour using other link layers: this work has focused exclusively on ATM. However, DVB-S has been considered for a number of satellite systems such as Astra and GEOCAST. MPLS could also in the future provide a switching fabric which could allow satellite connections to be integrated into ISPs’ network architectures. In addition, more sophisticated physical layer implementations (for example with Reed-Solomon outer codes) are likely to have superior error characteristics (see footnote 24).

22 [Rizzo,97a] reports a decoding time of 3.5 ms per packet for packets of length 1024 bytes with TGSIZE=k=32 and n-k=28, on a 133 MHz Pentium processor running FreeBSD. In [Rizzo,97b] the total decoding time for the same values of n and k is given as 79 ms.

23 We can distinguish further between bi-directional applications such as videoconferencing and uni-directional multimedia streaming (e.g. TV or radio broadcasts). In the former case real-time delays are significant and ARQ over a satellite link would result in unacceptable time lags. In the latter case the receiver could build in a delay of a few seconds, which then allows time to request additional packets if required during high error periods.
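The trade-off raised in the proactive FEC item of the list above can be explored with a simple calculation. The sketch below assumes independent packet losses with probability p_loss and an ideal erasure code (any k of the k + h packets sent suffice to rebuild the group). The function name and the parameter values are hypothetical, and the calculation is not taken from the PGM specification or from the Opnet model.

```python
from math import comb


def p_group_recovered(k, h, p_loss):
    """Probability that a sink receives at least k of the k + h packets sent,
    i.e. that it loses at most h of them (independent losses assumed)."""
    n = k + h
    return sum(comb(n, j) * p_loss**j * (1 - p_loss)**(n - j)
               for j in range(h + 1))


# Hypothetical example: groups of 32 data packets, 5% packet loss,
# with an increasing number h of proactive parity packets.
for h in (0, 1, 2, 4):
    print(h, round(p_group_recovered(32, h, 0.05), 4))
```

Plotting this probability against h for a given loss rate would give a first estimate of how much proactive repair data is worthwhile before Nacks become rare.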

7.3.2 Opnet model improvements

The following improvements could be made to the Opnet model:

• Satellite links: the current point-to-point links could be replaced with radio links. As described in Section 5 this would allow link budgets to be calculated for each receiver, and also has the advantage of allowing multiple receivers to lie within a single spotbeam. This would allow greater accuracy in the modelling of multicast scenarios.

• The PGM model could be improved by implementing a token bucket scheme as identified in the PGM specification. This would have the effect of increasing the file transfer delay presented in Section 6.4, since at higher bit error rates more repair data would slow down the rate at which original data is transmitted by the source.

• Thus far, results have only been obtained for up to 20 sinks. The Opnet model could be expanded relatively simply to allow more receivers to be modelled, to obtain results for larger multicast populations.
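The token bucket improvement suggested in the list above can be illustrated with a generic rate limiter. The sketch below is a textbook token bucket and does not reproduce the rate regulation details of the PGM specification; the class name, parameter names and numerical values are hypothetical.

```python
class TokenBucket:
    """Generic token bucket: tokens accrue at `rate` per second up to `depth`;
    a packet may be sent only if enough tokens are available."""

    def __init__(self, rate, depth):
        self.rate = rate          # tokens (e.g. bytes) added per second
        self.depth = depth        # maximum number of stored tokens
        self.tokens = depth       # start with a full bucket
        self.last_time = 0.0      # simulation time of the last update

    def try_send(self, now, size):
        """Return True and consume tokens if a packet of `size` can go at `now`."""
        elapsed = now - self.last_time
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last_time = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate=1000.0, depth=1500.0)
print(bucket.try_send(0.0, 1500))   # True: the bucket starts full
print(bucket.try_send(0.5, 1500))   # False: only 500 tokens have accrued
print(bucket.try_send(1.5, 1500))   # True: 1500 tokens are available again
```

If original and repair data were to draw on the same bucket, each repair packet would consume tokens that original data could otherwise use, which is the slowing effect described above.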

7.3.3 Wider areas of investigation

Section 2 of this thesis briefly showed how the introduction of a satellite link into a network affects the behaviour both of IGMP and of multicast routing protocols. Refinements to the protocols may be needed to support satellite links, and the dynamics of the protocols are likely to be affected, particularly by the long round trip delay. These aspects could be investigated further.

24 For comparison, a Geocast satellite model employs de-interleaving at the Viterbi decoder output to spread the bursts, followed by a Reed-Solomon outer code [Geocast,01]. This would make it much easier for the ATM cell header error correction to correct single bit errors, giving an ATM cell loss rate similar to the random bit error model described by [Ramseier,95].

References

[Akylidiz,97] “Satellite ATM networks: a survey”, Akyildiz, I.F. and S.H. Jeong, IEEE Communications Magazine, July 1997, pp.30-43.

[Alburqueque,00] “A cautionary note on simulation error models for TCP analysis: discrete vs. continuous time”, Albuquerque, M, A.A. Abouzeid and S. Roy, University of Washington, Seattle USA, http://students.washington.edu/marcelo/errormodels.pdf.

[Armitage,97] “IP multicasting over ATM networks”, Armitage, G.J., IEEE Journal on Selected Areas in Communications, 15, No. 3, 1997 pp.445-457.

[ATMForum,93] “ATM user-network interface specification Version 3.0”, PTR Prentice Hall, 1993.

[Bem,00] “Broadband satellite systems”, Bem, D.J., T.W. Wieckowski et al., IEEE Communications, 3, No. 1, pp.2-14.

[Brandao,99] “A review of error performance models for satellite ATM networks”, Brandão, J.C., E.L. Pinto and M.A.G. Maia, IEEE Communications Magazine, July 1999, pp.80-85.

[Chitre,94] “Asynchronous transfer mode (ATM) operation via satellite: issues, challenges and resolutions”, Chitre D.M. et al, International Journal of Satellite Communications., 12, 1994, pp.211-222.

[Clausen,99] “Internet over direct broadcast satellites”, Clausen H.D. and B. Collini-Nocker, IEEE Communications Magazine, June 1999, pp.146-151.

[Crowcroft] “Internetworking multimedia”, Crowcroft J., M. Handley and I. Wakeman, http://www.cs.ucl.ac.uk/staff/jon/mmbook/book/book.html

[Cuevas,99] “The development of performance and availability standards for satellite ATM networks”, Cuevas, E.G., IEEE Communications Magazine, July 1999, pp.74-79.

[Deering,96] “The PIM architecture for wide-area multicast routing”, Deering, S., D.L. Estrin et al., IEEE Transactions on Networking, 4, No. 2, April 1996, pp.153-162.

[Franchi,93] “On the error burst properties of Viterbi decoding”, Franchi A., and R.A. Harris, Proc. IEEE Int. Conference on Communications, 23-26 May 1993, pp.1086-1091.

[Gemmell,00] “Fcast multicast file distribution”, Gemmell, J., J. Gray and E. Schooler, IEEE Network, Jan/Feb 2000, pp.58.

[Geocast,01] “Radio layer analysis for Geocast network simulations”, Geocast document GEOC-ASPI-TN-40, Issue 3, 28 June 2001.

[Halsall,96] “Data communications, computer networks and open systems”, Halsall, F., 4th ed., Addison-Wesley, 1996.

[Hamouda,98] “Performance of ATM cell transmission via regenerative satellite links”, Hamouda, W.A. and P.J. Mclane, IEEE International Conference on Communications, 1998, pp.1436-1442.

[Hanle,98] “Feasibility study of erasure correction for multicast file distribution using the network simulator ns-2”, Hänle, C., Proc. IEEE Milcomm, 19-21 Oct 1998, pp.1060-1066.

[Heissler,99] “An analysis of the Viterbi decoder error statistics for ATM and TCP/IP over satellite communication”, Heissler, J.R., Y.A. Barsoum and R. Condello, Proc. IEEE Milcomm 1999, pp.359-363.

[Henderson,99] “Transport protocols for Internet compatible satellite networks”, Henderson, T.R. and R.H. Katz, IEEE Journal on Selected Areas in Communications, 17, No.2, Feb 1999, pp.326-344.

[Howarth,01] “Unicast and multicast IP error performance over an ATM satellite link”, Howarth, M.P., H. Cruickshank and Z. Sun, IEEE Communications Letters, to be published.

[Iyengar,01] “Security issues in IP multicast over GEO satellites”, Iyengar, S., H. Cruickshank and Z. Sun, 19th AIAA International Communications Satellite Systems Conference and Exhibit, 17-20 April 2001, Toulouse, France, Paper No. 117.

[Koyabe,01] “Reliable multicast via satellite: a comparison survey and taxonomy”, Koyabe, M. and G. Fairhurst, International Journal of Satellite Communications, 19, No. 1, Jan 2001, pp.3-28.

[Neyman,39] “On a new class of contagious distributions applicable in entomology and bacteriology”, Neyman, J., Annals of Mathematical Statistics, 10, 1939, pp.35-57.

[NORM,01] “Nack-oriented reliable multicast protocol (NORM)”, IETF Internet Draft, Jul 2001 expires Jan 2002, http://search.ietf.org/internet-drafts/draft-ietf-rmt-pi-norm-02.txt.

[Obraczka,98] “Multicast transport protocols: a comparison survey and taxonomy”, Obraczka, L., IEEE Communications Magazine, Jan 1998, pp.94-102.

[Ohta,91] “A cell loss recovery method using FEC in ATM networks”, Ohta, H., and T. Kitami, IEEE Journal on Selected Areas in Communications, 9, No. 9, Dec 1991, pp.1471-1483.

[PGM,01] “PGM reliable transport protocol specification”, IETF Internet draft, 13 Feb 2001, expires 13 Aug 2001, http://search.ietf.org/internet-drafts/draft-speakman-pgm-spec-06.txt

[Ramseier,95] “Impact of burst errors on ATM over satellite – analysis and experimental results”, Ramseier, S. and T. Kaltenschnee, COST226 final symposium, Budapest Hungary, 10-12 May 1995, pp.99-108.

[Rizzo,97a] “Effective erasure codes for reliable computer communication protocols”, Rizzo, L., ACM Computer Communication Review, April 1997, pp.24-36.

[Rizzo,97b] “On the feasibility of software FEC”, DEIT Technical Report LR-970131, available at http://www.iet.unipi.it/~luigi/softfec.ps.

[RFC1075] “Distance vector multicast routing protocol”, Waitzman, D., C. Partridge and S. Deering, IETF RFC1075, Nov 1988.

[RFC1584] “Multicast extensions to OSPF”, Moy, J., IETF RFC1584, March 1994.

[RFC2022] “Support for multicast over UNI 3.0/3.1 based ATM networks”, Armitage, G., IETF RFC2022, Nov 1996.

[RFC2189] “Core based trees (CBT version 2) multicast routing”, Ballardie, A., IETF RFC2189, Sep 1997.

[RFC2225] “Classical IP and ARP over ATM”, Laubach, M. and J.Halpern, IETF RFC2225, Apr 1998.

[RFC2236] “Internet group management protocol, version 2”, Fenner, W., IETF RFC2236, Nov 1997.

[RFC2362] “Protocol independent multicast - sparse mode (PIM-SM): protocol specification”, Estrin, D. et al, IETF RFC2362, Jun 1998.

[RFC2760] “Ongoing TCP research related to satellites”, Allman, M., IETF RFC2760, Feb 2000.

[RMTWG,00] “The use of forward error correction in reliable multicast”, Reliable Multicast Transport Working Group, draft-ietf-rmt-info-fec-00.txt, 17 Nov 2000, IETF Internet draft, expires May 2001.

[Saharasrab,00] “Multicast routing algorithms and protocols: a tutorial”, Sahasrabuddhe, L.H. and B. Mukherjee, IEEE Network, Jan/Feb 2000, pp.90-102.

[Yegenoglu,00] “An IP transport and routing architecture for next-generation satellite networks”, Yegenoglu, F., R. Alexander and D. Gokhale, IEEE Network, Sep/Oct 2000, pp.32-38.

Appendix A: Theory of Continuous RQ protocols

A.1 Go-Back-N protocol

We consider a protocol with the following specification:

• Packets are transmitted with strictly monotonic increasing integer sequence numbers (except when Nacks are received, as described below);

• Transmit slots occur at a constant rate;

• One packet is transmitted per slot;

• The receiver issues a Nack back to the transmitter if it detects that a sequence number has not been received;

• If the transmitter receives a Nack it resets the sequence number of the packets back to the value specified in the Nack (provided this is less than the current value of the source’s sequence number), and continues numbering packets from this value.

• The window is assumed to be infinite in size.

• The reverse channel is assumed to be error-free.

Assume the processing time at source and sink is negligible. Let the round trip time (including data packet transmission time, Nack transmission time and propagation time in both directions) lie between D and D+1 slot periods, and let the sink’s timer expire so that the retransmitted Nack is received E slot periods after the previous Nack. For the cases shown in Figure A.1 we have D=1 and E=4.

Let Pf be the probability of packet loss.

Then the expected number of slots used to successfully transmit one packet to the receiver is the sum of the following components:

• An original transmission;

• Loss of the original transmission, and successful retransmission of the packet on receipt of a Nack;

• Loss of the original transmission and the first retransmission; single timeout by the receiver and successful receipt following the second retransmission;

• And in general, loss of the original transmission and n retransmissions; n timeouts by the receiver, and successful receipt following the (n+1)th retransmission.

Hence we have:

$$S_{GBN} = 1 + (D+2)P_f(1-P_f)^2 + (D+3)P_f^2(1-P_f)^2 + \dots + (D+n+1)P_f^n(1-P_f)^2 + \dots$$

… (other terms for no timeouts) …

$$+\,(D+2+E)P_f^2(1-P_f)^2 + (D+3+E)P_f^3(1-P_f)^2 + \dots + (D+n+1+E)P_f^{n+1}(1-P_f)^2 + \dots$$

… (other terms for one timeout) …

$$+\,(D+2+Ei)P_f^{1+i}(1-P_f)^2 + (D+3+Ei)P_f^{2+i}(1-P_f)^2 + \dots + (D+n+1+Ei)P_f^{n+i}(1-P_f)^2 + \dots$$

… (other terms for i timeouts, etc) …

And this reduces to:

$$S_{GBN} = 1 + (1-P_f)^2 \sum_{i=0}^{\infty} \sum_{n=1}^{\infty} (D+1+Ei+n)\,P_f^{\,n+i}$$

To simplify this expression for $S_{GBN}$, we first consider $\sum_{n=1}^{\infty} (K+n)\,P_f^{\,n+i}$, where K and i are constants (K=D+1+Ei). This can be simplified to

$$\frac{P_f^{\,i}\,\{(K+1)P_f - K P_f^2\}}{(1-P_f)^2}.$$

The expression for $S_{GBN}$ therefore reduces to:

$$S_{GBN} = 1 + \sum_{i=0}^{\infty} P_f^{\,i}\,\{(D+2+Ei)P_f - (D+1+Ei)P_f^2\}$$

Which can be written as:

$$S_{GBN} = 1 + \{(D+2)P_f - (D+1)P_f^2\} \sum_{i=0}^{\infty} P_f^{\,i} + E\,(P_f - P_f^2) \sum_{i=0}^{\infty} i\,P_f^{\,i}$$

Noting that $\sum_{i=0}^{\infty} P_f^{\,i} = \frac{1}{1-P_f}$ and that $\sum_{i=0}^{\infty} i\,P_f^{\,i} = \frac{P_f}{(1-P_f)^2}$, we finally get that:

$$S_{GBN} = \frac{1 + (D+1)P_f + (E-D-1)P_f^2}{1-P_f}$$

For the case considered in this thesis, D=1 and E=4, so the expression for SGBN reduces to:

$$S_{GBN} = \frac{1 + 2P_f + 2P_f^2}{1-P_f}$$
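As a numerical cross-check of the derivation, the closed-form expression for the expected number of slots can be compared with a truncated evaluation of the double sum over timeouts i and consecutive losses n. The sketch below is an illustration added for this purpose; the function names and the truncation limit are hypothetical choices, and the defaults D=1, E=4 are the values used in this thesis.

```python
def s_gbn_closed(pf, d=1, e=4):
    """Closed-form expected slots per delivered packet for Go-Back-N."""
    return (1 + (d + 1) * pf + (e - d - 1) * pf**2) / (1 - pf)


def s_gbn_series(pf, d=1, e=4, terms=300):
    """Truncated double sum over i timeouts and n consecutive losses."""
    total = 0.0
    for i in range(terms):
        for n in range(1, terms):
            total += (d + 1 + e * i + n) * pf ** (n + i)
    return 1 + (1 - pf) ** 2 * total


for pf in (0.01, 0.127, 0.5):
    print(pf, s_gbn_closed(pf), s_gbn_series(pf))
```

For the loss probabilities shown, the two evaluations should agree to within floating-point rounding, since the neglected tail of the series is vanishingly small.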

A.2 Selective repeat protocol

The case of a selective repeat protocol is much simpler, since the round trip time D and the timer expiry period E do not affect the overall efficiency of the protocol. In this case we obtain:

$$S_{SR} = 1 + P_f + P_f^2 + \dots = \frac{1}{1-P_f}$$

We note in passing that for $P_f > 0$, $S_{SR} < S_{GBN}$, and so the selective repeat protocol is the superior protocol in terms of the metric S.
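The inequality follows directly from the two closed-form expressions. As a short verification, assuming $0 < P_f < 1$, $D \ge 0$ and $E \ge 0$:

$$S_{GBN} - S_{SR} = \frac{(D+1)P_f + (E-D-1)P_f^2}{1-P_f} = \frac{P_f\,\{(D+1)(1-P_f) + E\,P_f\}}{1-P_f} > 0$$

For the case D=1 and E=4 considered in this thesis, the difference is $2P_f(1+P_f)/(1-P_f)$.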

[Timing diagrams not reproduced: each panel shows the packet sequence numbers transmitted by the Source and received by the Sink in successive slots, with the Nack(2) messages and, in panels (c) and (d), the running and expiry of the Sink's timer.]

(a) Single packet loss, no timeout, 3 slots wasted. Probability of this scenario = $P_f(1-P_f)^2$ (since packet 2 is lost; packet 3 succeeds; packet 4 is a don’t care; retransmitted packet 2 succeeds). So the expected number of slots used by this scenario = $(D+2)\,P_f(1-P_f)^2$.

(b) Two consecutive packets lost, no timeout, 4 slots wasted. Probability of this scenario = $P_f^2(1-P_f)^2$ (since packets 2 and 3 are lost; packet 4 succeeds; packet 5 is a don’t care; retransmitted packet 2 succeeds). So the expected number of slots used by this scenario = $(D+3)\,P_f^2(1-P_f)^2$.

(c) Retransmitted packet lost, single timeout, 7 slots wasted. Probability of this scenario = $P_f^2(1-P_f)^2$ (since packet 2 is lost; packet 3 succeeds; packet 4 is a don’t care; retransmitted packet 2 is lost; retransmitted packets 3, 4 and 5 are don’t cares; retransmitted packet 2 succeeds). So the expected number of slots used by this scenario = $(D+2+E)\,P_f^2(1-P_f)^2$.

(d) Two initial packets lost, retransmitted packet lost, single timeout, 8 slots wasted. Probability of this scenario = $P_f^3(1-P_f)^2$ (since packets 2 and 3 are lost; packet 4 succeeds; packet 5 is a don’t care; retransmitted packet 2 is lost; retransmitted packets 3, 4 and 5 are don’t cares; retransmitted packet 2 succeeds). So the expected number of slots used by this scenario = $(D+3+E)\,P_f^3(1-P_f)^2$.

Figure A.1: Example cases of Go-Back-N behaviour

Appendix B: Unicast error performance simulation results

The table overleaf gives simulation run results from the bit-by-bit burst error model described in Section 5.5 for a unicast (single sink) simulation. The results are compared with the theoretical values for unicast error performance of Section 4 (Figure 4.5) in Figure 5.8.

Notes

Burst length b = 6 bits;

For all simulation runs, seed=128;

“Actual downlink BER” is the BER measured during the Opnet simulations.

Unicast error performance (burst error channel), layered protocol model (1)

| Application packet length | 9180 octets | 9180 octets | 1492 octets |
|---|---|---|---|
| ∴ N (ATM cells per app pkt) | 192 | 192 | 32 |
| Notional downlink BER, p | $10^{-5}$ | $10^{-4}$ | $10^{-6}$ |
| Simulation run time, runtime | 2,001 seconds | 501 seconds | 20,001 seconds |
| Link data rate | 512 kbit/s | 512 kbit/s | 512 kbit/s |
| $(1 - P_{cell\_loss}) = 1 - 424p/b$ | 0.999293 | 0.99293 | 0.9999293 |
| $(1 - P_f) = (1 - P_{cell\_loss})^{N+1}$ | 0.8730 | 0.2561 | 0.99773 |
| $S = (1 + 2P_f + 2P_f^2)/(1 - P_f)$ | 1.473 | 14.04 | 1.007 |

| Quantity | Theory ($p = 10^{-5}$) | Simulation ($p = 10^{-5}$) | Theory ($p = 10^{-4}$) | Simulation ($p = 10^{-4}$) | Theory ($p = 10^{-6}$) | Simulation ($p = 10^{-6}$) |
|---|---|---|---|---|---|---|
| No. application pkts sent, $A_S = 2 \times \text{runtime} - 1$ | 4,001 | 4,001 | 1,001 | 1,001 | 40,001 | 40,001 |
| No. application pkts delivered, $A_S / S$ | 2,716 | 2,794 | 71 | 91 | 39,730 | 39,695 |
| No. ATM cells sent, $N \cdot A_S$ | 768,192 | 768,192 | 192,192 | 192,192 | 1,280,032 | 1,280,032 |
| No. ATM cells received, $(1 - P_{cell\_loss})\,N A_S$ | 767,649 | 767,673 | 190,833 | 190,877 | 1,279,941 | 1,279,929 |
| No. IP datagrams received, $(1 - P_f)\,A_S$ | 3,493 | 3,512 | 256 | 268 | 39,910 | 39,899 |
| P(single IP datagram loss), $1 - (1 - 424p/b)^N (1 - 43p/b)$ | 0.127 | 0.122 | 0.744 | 0.732 | $2.27 \times 10^{-3}$ | $2.55 \times 10^{-3}$ |
| No. errored bits | - | 3,090 | - | 7,830 | - | 612 |
| Actual downlink BER | - | $9.49 \times 10^{-6}$ | - | $9.61 \times 10^{-5}$ | - | $1.13 \times 10^{-6}$ |

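Unicast error performance (burst error channel), layered protocol model (2)

| Application packet length | 1492 octets | 300 octets | 300 octets |
|---|---|---|---|
| ∴ N (ATM cells per app pkt) | 32 | 7 | 7 |
| Notional downlink BER, p | $3 \times 10^{-4}$ | $10^{-5}$ | $10^{-4}$ |
| Simulation run time, runtime | 5,001 seconds | 20,001 seconds | 10,001 seconds |
| Link data rate | 512 kbit/s | 19.2 kbit/s | 19.2 kbit/s |
| $(1 - P_{cell\_loss}) = 1 - 424p/b$ | 0.97880 | 0.999293 | 0.99293 |
| $(1 - P_f) = (1 - P_{cell\_loss})^{N+1}$ | 0.50266 | 0.99499 | 0.95088 |
| $S = (1 + 2P_f + 2P_f^2)/(1 - P_f)$ | 4.95 | 1.0152 | 1.160 |

| Quantity | Theory ($p = 3 \times 10^{-4}$) | Simulation ($p = 3 \times 10^{-4}$) | Theory ($p = 10^{-5}$) | Simulation ($p = 10^{-5}$) | Theory ($p = 10^{-4}$) | Simulation ($p = 10^{-4}$) |
|---|---|---|---|---|---|---|
| No. application pkts sent, $A_S = 2 \times \text{runtime} - 1$ | 10,001 | 10,001 | 40,001 | 40,001 | 20,001 | 20,001 |
| No. application pkts delivered, $A_S / S$ | 2,019 | 2,209 | 39,402 | 39,422 | 17,242 | 17,366 |
| No. ATM cells sent, $N \cdot A_S$ | 320,032 | 320,032 | 280,007 | 280,007 | 140,007 | 140,007 |
| No. ATM cells received, $(1 - P_{cell\_loss})\,N A_S$ | 313,247 | 313,386 | 279,809 | 279,812 | 139,017 | 139,045 |
| No. IP datagrams received, $(1 - P_f)\,A_S$ | 5,027 | 5,116 | 39,801 | 39,808 | 19,020 | 19,052 |
| P(single IP datagram loss), $1 - (1 - 424p/b)^N (1 - 43p/b)$ | 0.497 | 0.488 | $5.00 \times 10^{-3}$ | $4.82 \times 10^{-3}$ | 0.0490 | 0.0474 |
| No. errored bits | - | 39,942 | - | 1,164 | - | 5,718 |
| Actual downlink BER | - | $2.94 \times 10^{-4}$ | - | $9.80 \times 10^{-6}$ | - | $9.63 \times 10^{-5}$ |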
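The theoretical entries in the two unicast tables can be regenerated from the tabulated expressions. The sketch below is an illustration only: the function and key names are hypothetical, the burst length defaults to the b = 6 bits given in the Notes, and it assumes that $(1 - P_f)$ is evaluated as the complement of the P(single IP datagram loss) expression, which reproduces the tabulated figures to the precision shown.

```python
def unicast_theory(p, n_cells, runtime_s, b=6):
    """Theoretical values for one column of the unicast tables.

    p: notional bit error rate; n_cells: ATM cells per application packet (N);
    runtime_s: simulation run time in seconds; b: burst length in bits.
    """
    cell_ok = 1 - 424 * p / b                        # 1 - P_cell_loss
    pkt_ok = cell_ok ** n_cells * (1 - 43 * p / b)   # 1 - P_f (assumed form)
    pf = 1 - pkt_ok
    s = (1 + 2 * pf + 2 * pf ** 2) / (1 - pf)        # S_GBN with D=1, E=4
    sent = 2 * runtime_s - 1                         # A_S
    return {
        "cell_ok": cell_ok,
        "pkt_ok": pkt_ok,
        "s": s,
        "sent": sent,
        "delivered": sent / s,
        "cells_sent": n_cells * sent,
        "cells_received": cell_ok * n_cells * sent,
        "datagrams_received": pkt_ok * sent,
    }


# First column of table (1): 9180-octet packets, N = 192, p = 1e-5.
for name, value in unicast_theory(1e-5, 192, 2001).items():
    print(f"{name}: {value:.6g}")
```

Calling the function with the parameters of the other columns allows the remaining theoretical entries to be checked in the same way.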

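Comparison of ATM cell loss and cell error probabilities (1)

| Applic’n packet length | 9180 octets | 9180 octets | 1492 octets |
|---|---|---|---|
| Notional downlink BER, p | $10^{-5}$ | $10^{-4}$ | $10^{-6}$ |
| Simulation run time, runtime | 2,001 seconds | 501 seconds | 20,001 seconds |
| Actual downlink BER | $9.49 \times 10^{-6}$ | $9.61 \times 10^{-5}$ | $1.13 \times 10^{-6}$ |

| Quantity | Theory ($p = 10^{-5}$) | Simulation ($p = 10^{-5}$) | Theory ($p = 10^{-4}$) | Simulation ($p = 10^{-4}$) | Theory ($p = 10^{-6}$) | Simulation ($p = 10^{-6}$) |
|---|---|---|---|---|---|---|
| P(no errors), $1 - 424p/b$ | 0.999293 | 0.999324 | 0.99293 | 0.99316 | 0.9999293 | 0.9999195 |
| P(loss … error), $39p/b$ | $6.50 \times 10^{-5}$ | $6.90 \times 10^{-5}$ | $6.50 \times 10^{-4}$ | $7.39 \times 10^{-4}$ | $6.50 \times 10^{-6}$ | $5.47 \times 10^{-6}$ |
| P(!loss … error), $(387 - b)p/b$ | $6.35 \times 10^{-4}$ | $6.00 \times 10^{-4}$ | $6.35 \times 10^{-3}$ | $6.01 \times 10^{-3}$ | $6.35 \times 10^{-5}$ | $7.50 \times 10^{-5}$ |
| (label missing), $(b - 2)p/b$ | $6.7 \times 10^{-6}$ | $6.5 \times 10^{-6}$ | $6.7 \times 10^{-5}$ | $9.4 \times 10^{-5}$ | $6.7 \times 10^{-7}$ | 0 |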

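Comparison of ATM cell loss and cell error probabilities (2)

| Applic’n packet length | 1492 octets | 300 octets | 300 octets |
|---|---|---|---|
| Notional downlink BER, p | $3 \times 10^{-4}$ | $10^{-5}$ | $10^{-4}$ |
| Simulation run time, runtime | 5,001 seconds | 20,001 seconds | 10,001 seconds |
| Actual downlink BER | $2.94 \times 10^{-4}$ | $9.80 \times 10^{-6}$ | $9.63 \times 10^{-5}$ |

| Quantity | Theory ($p = 3 \times 10^{-4}$) | Simulation ($p = 3 \times 10^{-4}$) | Theory ($p = 10^{-5}$) | Simulation ($p = 10^{-5}$) | Theory ($p = 10^{-4}$) | Simulation ($p = 10^{-4}$) |
|---|---|---|---|---|---|---|
| P(no errors), $1 - 424p/b$ | 0.97880 | 0.97923 | 0.999293 | 0.999304 | 0.99293 | 0.99313 |
| P(loss … error), $39p/b$ | $1.95 \times 10^{-3}$ | $2.04 \times 10^{-3}$ | $6.50 \times 10^{-5}$ | $3.57 \times 10^{-5}$ | $6.50 \times 10^{-4}$ | $6.50 \times 10^{-4}$ |
| P(!loss … error), $(387 - b)p/b$ | $1.91 \times 10^{-2}$ | $1.84 \times 10^{-2}$ | $6.35 \times 10^{-4}$ | $6.57 \times 10^{-4}$ | $6.35 \times 10^{-3}$ | $6.13 \times 10^{-3}$ |
| (label missing), $(b - 2)p/b$ | $2.0 \times 10^{-4}$ | $3.2 \times 10^{-4}$ | $6.7 \times 10^{-6}$ | $3.6 \times 10^{-6}$ | $6.7 \times 10^{-5}$ | $9.3 \times 10^{-5}$ |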
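As a consistency check on the theoretical columns of the two comparison tables, using only the expressions tabulated there, the three error-case probabilities sum to the complement of P(no errors):

$$\frac{39p}{b} + \frac{(387-b)\,p}{b} + \frac{(b-2)\,p}{b} = \frac{424p}{b} = 1 - P(\text{no errors})$$

For example, with b = 6 and $p = 10^{-5}$: $6.50 \times 10^{-5} + 6.35 \times 10^{-4} + 6.7 \times 10^{-6} \approx 7.07 \times 10^{-4} = 1 - 0.999293$.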

Appendix C: Software listings

The following software listings are attached:

1. Pipeline Stage 2 burst error model (formula-based version).

2. Reliable multicast process software listings (see Figure C.1):

• State variables, temporary variables and header block: declarations;

• Function block: functions called by states;

• init: process initialisation;

• transmit: source-side, transmits ODATA;

• pdu_arrival: both source and sink side, determines whether a packet arriving from the layer below is ODATA/RDATA (in which case hand the packet to state receive) or is a Nack (in which case hand the packet to state nack);

• receive: sink-side, processes ODATA and RDATA;

• nack: source-side, receives Nack and transmits RDATA;

• nack_timeout: sink-side, if RDATA has not been received in response to a Nack, retransmits the Nack;

• txw_ivl_timeout: source-side, confirms whether the TXW_ADV_IVL timer has expired, and advances the increment window if it has;

• tx_success: source-side, reports success to the application layer if no Nacks are received.
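The dispatch performed by the pdu_arrival state, as described in the list above, can be sketched as follows. This is an illustrative Python rendering rather than the attached Opnet listing; the PduType enumeration and the returned state names are hypothetical stand-ins for the process model's packet formats and state transitions.

```python
from enum import Enum


class PduType(Enum):
    ODATA = "odata"   # original data
    RDATA = "rdata"   # repair data
    NACK = "nack"     # negative acknowledgement


def pdu_arrival(pdu_type):
    """Return the name of the state that should handle an arriving packet."""
    if pdu_type in (PduType.ODATA, PduType.RDATA):
        return "receive"      # sink-side processing of data packets
    if pdu_type is PduType.NACK:
        return "nack"         # source-side: answer the Nack with RDATA
    raise ValueError(f"unexpected PDU type: {pdu_type!r}")


print(pdu_arrival(PduType.RDATA))
print(pdu_arrival(PduType.NACK))
```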

Other software developed during this work (file transfer application, IP, AAL5, ATM, constant rate source, satellite switch, ATM switch) is omitted for the sake of brevity.

Figure C.1: Multicast process
