analysis methodology - university of california,...

_____________________________________________________________________________________________________________ 45

CHAPTER 3

Analysis Methodology

In this chapter, we explain the methods and tools we used to obtain the results presented inChapter 4 and Chapter 5. Our analysis approach is strongly based on measurements. This ismainly motivated by the fact that simulators for wireless networking were not sufficientlydeveloped when we started our work. Also, with the globally deployed GSM-CSD systems wehad a real (wide-area) wireless network available for our study, about which little was knownwith respect to inefficient interactions with end-to-end protocols. Hence, we used the GSM-CSD system as a case study of a wireless link. Our measurement-based approach gave us theunique opportunity to capture the aggregate of real-world effects such as noise, interference,fading, and shadowing. This is a key advantage over simulations as unrealistic assumptionsabout the error characteristics of a wireless channel can completely change the results of a per-formance analysis. This often leads to inadequate design decisions as we demonstrate inChapter 4.

Furthermore, we believe that results obtained by measurement are often more convincing thanthose obtained by simulations. The reason is that it is much easier to make mistakes in simula-tions that lead to wrong conclusions than it is when performing measurements. Experimentalmeasurements often expose effects that may not be visible using simulations alone. This maybe due to implementation errors, or the fact that a simulator has abstracted to many details, i.e.,does not implement all the relevant features of a real system. In Section 3.2.5 and Section 4.2.3we present examples of problems that would have been difficult to detect by simulations.

Certainly, a measurement-based analysis approach has a number of drawbacks. First, itrequires the availability of a real system. Measuring the performance of new features of a net-work that is still in the design phase is not possible. Also, the process of performing measure-ments is often a time intensive task while a large base of measurements is required to drawgeneral conclusions when the error characteristics of a wireless link are crucial. Another prob-lem for measurements is that real systems often only allow a limited parameterization.

46 ________________________________________________________________________________________________ CHAPTER 3

3.1 Evaluating Error Recovery Strategies

In this section, we explain the methodology we use in Section 4.3 to evaluate the benefit oflink layer error recovery for reliable flows. With this analysis we address the problems of“underestimation of available bandwidth” (see Section 2.5.1), “inefficiency of end-to-end errorcontrol” (see Section 2.5.2), and also the problem of “failure of link layer differential encod-ings” (see Section 2.5.5). This work has been published in [LKJ99] and [LKJK00]. InSection 3.1.1 we provide general information about how we collected measurements in GSM-CSD that applies to both this section and Section 3.2.

3.1.1 Collecting Link Layer Traces in GSM-CSD

Performance measurements involving wireless links add a complex dimension to the charac-teristics with which links are usually described. In addition to the simpler parameters of link bitrate and link latency, the error characteristics play a crucial role as motivated in the next sub-section. The error characteristics of a wireless channel over a certain period of time can be cap-tured by a bit error trace. A bit error trace contains information about whether a particular bitwas transmitted correctly. The average Bit Error Rate (BER) is the first-order metric com-monly used to describe such a trace. The same approach can be applied at the block level (seeSection 2.4.4). Hence, a block erasure trace contains information about whether a particulardata block was correctly transmitted and the BLock Erasure Rate (BLER) denotes the averagerate at which block erasures, i.e., FEC decoding failures, occur in such a trace.

All our measurements involving a wireless link were carried out in commercially deployedGSM-CSD systems. Most of the measurements were carried out in the San Francisco BayArea. In addition, we have collected traces at other places in the U.S. and also in Sweden andGermany. Nevertheless, apart from the effects mentioned in Section 3.2.5, we did not find anydifferences between the various countries, or more precisely, between the manufacturers of theGSM network components and the frequencies used for operation. It is important to point outthat the error characteristics we have captured in the form of block erasure traces are only validfor the particular FEC and interleaving scheme implemented in GSM-CSD (see Section 2.4.4).Nevertheless, we believe that the results presented in Chapter 4 provide new insights into howto optimize this widely deployed system. These suggest techniques that can be used to designfuture wireless links, e.g., the GSM packet-switched data service which implements a similarFEC scheme [GSM05.03] and similar link layer error recovery [GSM04.60].

We are not interested in identifying physical link factors that cause measured block erasures.Rather, we are interested in the aggregate result captured by block erasure traces. This is simi-lar to the approach of trace-based mobile network emulation as proposed in [NSNK97]. Tocollect block erasure traces, we ported the RLP (see Section 2.4.3) implementation of a com-

Analysis Methodology _____________________________________________________________________________________ 47

mercially available GSM data PC-Card to the BSD/386 Version 3.0 operating system. We alsodeveloped a protocol monitor for RLP that we call rlpdump 1. It logs whether a receivedblock could be correctly reconstructed by the FEC decoder. This is possible because everyRLP frame corresponds to an FEC encoded data block. Thus, a received block suffered an era-sure whenever the corresponding RLP frame has a frame checksum error. In addition,rlpdump logs time/sequence information, i.e., which frame number was sent at which time,and also exceptional events, like selective rejects, retransmissions, flow control signals (XON/XOFF), and RLP link resets in both the send and the receive direction. For a given RLP con-nection such information makes up what we refer to as an RLP trace. Unfortunately, we werenot able to log internal receiver signal strength measurements from the mobile phone to corre-late them with the block erasure traces. Instead, we read the mobile phone's visual receiver sig-nal strength indicator ranging from 1 - 5. The receiver signal strength is used in Section 3.1.3and Section 3.2.3 to categorize measurements.

3.1.2 Analysis Goals, Assumptions, and Approach

Our goal is to evaluate the performance of the following two error recovery strategies. Withouta PEP in the network, these are the only alternatives that exist for reliable data transfer over apath that includes a wireless link.

• End-to-end error recovery complemented with link layer error recovery running over thewireless link.

• Pure end-to-end error recovery.

In Section 2.6.1, “pure end-to-end” implied that no assumptions are made about the existenceof dedicated support from the link layer, e.g., error recovery. Nevertheless, throughout the restof this dissertation, when we use the term “pure end-to-end error recovery” we imply that thewireless link is not protected by link layer error recovery.

In Section 4.3, we perform the evaluation of the two error recovery strategies through a casestudy of the GSM-CSD wireless link. We first investigate the impact of changing the (fixed)RLP frame size on application layer throughput and the consumption of radio resources (e.g.,spectrum and transmission power). We then quantify the benefits of link layer error recoveryby comparing it against the performance of pure end-to-end error recovery. There we show thatat least on some wireless links, e.g., a GSM-CSD link, the end-to-end performance that a reli-able flow can provide can only by optimized by complementing end-to-end with link layererror recovery.

1. rlpdump was implemented by Bela Rathonyi at Ericsson Mobile Communications AB, Sweden. Keith Sklower at U.C. Berkeley assisted in porting the RLP code to the BSD system.

48 ________________________________________________________________________________________________ CHAPTER 3

The performance difference between the two protocol design alternatives depends on the wire-

less channel’s time varying error characteristics versus the channel’s packet transmissiondelay. This is sketched in Figure 3-1, where “burst error” denotes time intervals during which

data in transit is corrupted to the extent that it cannot be recovered at the receiving link layer

(FEC decoder). With respect to GSM-CSD, a burst error corresponds to a series of back-to-back block erasures where the channel is error-free before and after that series. A wireless

channel’s error characteristic can be described by the length of burst errors and their correla-

tion expressing the degree of clustering. Link layer error recovery is less effective on wirelesslinks where the length of burst errors is large compared to the packet transmission delay (see

“Channel 1” in Figure 3-1). In this case, pure end-to-end error recovery often yields higher

throughput results by saving link layer protocol overhead. Another case is sketched with“Channel 2” in Figure 3-1 where the length of burst errors is small compared to the packet

transmission delay and the burst errors often occur isolated. In this case, the link layer over-

head is likely to be amortized when the “right” frame size is chosen. Studying this trade-offrequires a realistic error characterization of the wireless channel which motivates our measure-

ment-based analysis approach.

The key premise for our analysis is a model of a bulk data flow based on a fully-reliable proto-col such as TCP. As pointed out in Section 2.1, the main QoS requirement of bulk data flows is

to maximize throughput. Fully-reliable flows have the additional QoS requirement that the

transfer must be reliable, i.e., the transfer fails if the data is corrupted or incomplete when

received by the destination. To compare throughput among the two error recovery strategies,we assume that the GSM-CSD wireless link is the path’s bottleneck link, and that the bulk data

flow is the only flow that utilizes the bottleneck link. Using the ReTracer tool explained in

Section 3.1.4, we perform a best-case analysis on the basis of block erasure traces we had col-lected a priori as described in Section 3.1.3. The best-case analysis assumes that the bulk data

transfer always fully utilizes the wireless bottleneck link, i.e., utilizes the link 100 percent.

Channel 1

Channel 2

Legend:

IP Layer

Link Layer

Burst Error; Length represents the duration of this condition.

Error-free Channel; Length represents the duration of this condition.

Packet; Length represents the packet transmission delay.

Frame; Length represents the frame transmission delay.

Figure 3-1: Two different channel error characteristics.


We redefine the term utilization for our purposes as follows.

• Given a time period of length T, the utilization of a link is the fraction of T during whichuseful data, i.e., excluding packets/frames which had already been successfully transmit-ted1, is transmitted over the link, divided by T.

For link layer error recovery, the best-case analysis implies (1) the use of a selective rejectbased protocol, like RLP; and (2) an “infinite” error recovery persistency2. It also requires theuse of large enough windows to allow the link layer sender to always fully utilize the link. Thisavoids the stalled window condition, where the sender must interrupt transmission due to flowcontrol, i.e., when the receive buffer of the link layer receiver is exhausted to buffer additionalframes. For a bulk data flow that implements congestion control similar to TCP, the best-caseanalysis implies that the flow’s maximum load must exceeds two times (or more) the flow’spipe capacity as explained in Section 2.5.1.

The best-case assumption ignores inefficient interactions with end-to-end congestion controlschemes that may lead to an underestimation of the available bandwidth. For TCP over RLP inGSM-CSD, this is valid as we show in Section 4.2. For pure end-to-end error recovery, how-ever, this is often not the case as discussed in Section 2.5.1. Nevertheless, a best-case studyindicates the theoretical maximum application layer throughput that pure end-to-end errorrecovery could provide. Moreover, the application layer throughput that we determine inSection 4.3 under the given assumptions, directly translates into radio resource consumption.For example, if transport layer sender A only achieves half the throughput that sender Bachieves, it is using twice as much radio resources, i.e., it needs to transmit twice as many datablocks. This may happen if sender A has to rely on pure end-to-end error recovery, and has toretransmit packets of which only a small fraction of the corresponding original transmissionwas corrupted on the unreliable wireless link3.

3.1.3 Measurement Platform

Our measurement platform is depicted in Figure 3-2 (simplified from Figure 2-7). A single-hop path connects the mobile to a fixed host which terminates the GSM-CSD connection. Asexplained in the preceding subsection, we were only interested in capturing block erasuretraces, not in studying protocol interactions. We therefore used ping described in [Ste94] as atraffic generation tool because the underlying end-to-end protocol (ICMP) is unresponsive to

1. This can, e.g., happen in TCP which exhibits go-back-N behavior after spurious timeouts as explained in Section 5.1.

2. Throughout our measurements the highest number of retransmissions for a single RLP frame was 12. Thus, in GSM-CSD an “infinite” error recovery persistency (the RLP parameter N2) can be approximated with a maximum number of retransmissions of 12 + n for some small value of n.

3. E.g., if only a single byte of a 1500 bytes packet gets corrupted during transmission over an unreliable link, then still the entire packet has to be retransmitted.

50 ________________________________________________________________________________________________ CHAPTER 3

packet losses. The ping sender sends an ICMP packet every second that is echoed by theping receiver. Thus, when the ICMP packet size is configured so that the correspondingpacket transmission delay exceeds one second, ping can be used as an infinite and uninter-rupted traffic source1.

We then generated continuous traffic with ping and used rlpdump to capture the corre-sponding block erasure traces. That way we have collected block erasure traces for over500 minutes of “air-time” and distinguish between measurements where the host was station-ary versus mobile when driving in a car. All stationary measurements were taken in the exactsame location. We categorized the measurements as follows.

A.Stationary in an area with good receiver signal strength (3 - 4): 258 minutes.

B.Stationary in an area with poor receiver signal strength (1 - 2): 215 minutes.

C.Mobile in an area with mediocre receiver signal strength (2 - 4): 44 minutes.

3.1.4 The ReTracer Tool

Clearly, the size of an RLP frame does not need to match the size of an unencoded data block.Any multiple of the size of an unencoded data block would have been a valid design choice. In

1. This causes the sending host’s outbound interface buffer to constantly overflow leading to many dropped ping packets, but that did not matter in this case.

PSTN GSM

LoggingDatabase

RLP

RLPDUMP

Mobile HostUNIX (BSDi 3.0)

FEC/Interleaving

Fixed Host

PPP

Figure 3-2: The measurement platform.


fact, a multiple of 2 has been chosen for the new version of RLP [GSM04.22b] in the next gen-eration of GSM-CSD, which also uses a weaker FEC scheme [GSM04.21]. Larger framesintroduce less relative overhead per frame, but an entire frame has to be retransmitted even ifonly a single data block incurs an erasure. Applying our technique of retrace analysis, westudy this trade-off based on the block erasure traces we had collected a priori in environmentsA - C (see above). For that purpose we developed a tool we call ReTracer1 that automaticallyperforms the retrace analysis. Based on a given block erasure trace and a given bulk data trans-fer size, ReTracer reverse-engineers the value of target metrics (e.g., channel throughput ornumber of retransmissions). It emulates RLP while assuming a particular fixed frame size andfixed per frame overhead. We then iterate the retrace analysis over a range of RLP frame sizes,defined as multiples of the data block size. We can thereby find the frame size that maximizesthe bulk data throughput for a particular block erasure trace.

We use different block erasure traces for our analysis. trace_A is a concatenation of all blockerasure traces we collected in environment A. Likewise, trace_B and trace_C are concatena-tions of all block erasure traces we collected in environment B and C, respectively. We thenchoose an appropriate bulk data size to cover the entire trace (e.g., for trace_B a size corre-sponding to a transmission time of 215 min was chosen). Once the retrace analysis reaches theend of a trace, it wraps around to its beginning. In addition, we investigate the impact of errorburstiness, i.e., the extent to which the distribution of block erasures within a trace influencesour results. For that purpose, we artificially generated three more “non-bursty” block erasuretraces, trace_A_even, trace_B_even and trace_C_even. These have the same BLER as the cor-responding real traces, but with an even block erasure distribution, i.e., those traces have singleand isolated block erasures with a constant distance from each other.

3.2 Detecting Inefficient Cross-Layer Interactions

In this section, we explain the methodology we use in Section 4.2 to study in general the inef-ficient cross-layer interactions that may occur when running TCP-based bulk data transfersover RLP in GSM-CSD. This work has been published in [LRKOJ99]. Also, in Section 3.2.1we explain how to interpret TCP trace plots. In Chapter 4 and Chapter 5 we often use TCPtrace plots to illustrate certain effects, problems, or solutions.

1. ReTracer was implemented by Almudena Konrad at U.C. Berkeley.

52 ________________________________________________________________________________________________ CHAPTER 3

3.2.1 How to Read TCP Trace Plots

A trace is a series of events that was measured over time for a particular connection of a givenprotocol layer at the sender (called a sender trace), the receiver (called a receiver trace), or anode in the connection’s path. A trace plot is a graphical representation of a trace. Trace plotsprovide an excellent means to visualize a protocol’s operation over time correlated with effectsoccurring in the network, such as (excessive) packet delay or packet re-ordering. We mostlydeal with TCP traces, but in some cases correlate them with RLP traces that we captured usingrlpdump as described in Section 3.2.4. A TCP trace captures the times (timestamps) when asegment or an ACK is transmitted or received. In a trace this is represented by the tuple<timestamp, sequence number> or <timestamp, ACK number>, respectively (seeSection 2.2.1 for the definition of sequence number and ACK number). In the plots we labelthe graphs comprising points corresponding to

• segments sent by the TCP sender as Snd_Data (or TcpSnd_Data),

• ACKs received by the TCP sender as Snd_Ack (or TcpSnd_Ack),

• segments received by the TCP receiver as Rcv_Data (or TcpRcv_Data), and

• ACKs sent by the TCP receiver as Rcv_Ack (or TcpRcv_Ack).

To avoid that the sender and receiver plots overlap when shown in the same plot we offset thesequence number space of the TCP receiver trace by 10,000 bytes. In our measurements, theclocks of the sending and the receiving host were not synchronized. The exact timing of eventswas not necessary for our study. Instead, we loosely synchronized the sender and the receivertraces by defining as “time zero” the time when the sender sends the connect request (SYN)and when it arrives at the receiver. Thus, apart from clock drifts on both hosts, the receivertrace is offset by the one-way delay of the initial SYN.

Capturing TCP traces requires an extension of the operating system kernel that does the log-ging of relevant information, and a user level process to control the kernel extension and totransfer the logged data into user space. For the BSD system these two functions had alreadybeen developed1: the BSD Packet Filter (BPF) [MJ93], and tcpdump [JLM], respectively.The output generated by tcpdump can then be reformatted (we used our own scripts for thatpurpose) according to the input format required by standard plotting tools such as Xgraph[Xg].

A number of characteristics can be read off TCP trace plots. As an example, Figure 3-3 showsa section of a TCP sender trace. As each point in the Snd_Data graph corresponds to a

1. Those tools are publicly available and have been extensively used, tested, and enhanced by the Internet research commu-nity.


sequence number, the difference between two succeeding segments is the segment size of thefirst segment. During bulk data transfer it is usually the connection’s Maximum Segment Size(MSS). Also the flow’s load (usually the number of bytes outstanding divided by the MSS) andthe flow’s RTT can be read off the plot as indicated in Figure 3-3. The rate at which ACKsreturn to the sender can be determined by linear regression of the Snd_Ack trace. It corre-sponds to the flow’s available bandwidth at that time due to the self-clocking property of TCP(see Section 2.2.1). The ACK clock itself can be seen from the fact that no segment is sentbetween arrivals of ACKs, i.e., each ACK clocks out one or more segments.

Figure 3-3 shows a special case. The TCP connection has just been established and the senderis in the slow start phase where every ACK clocks out two segments1. One because the ACKadvanced the window and another one because the congestion window was increased by one.10.5 s into the connection the sending host’s interface buffer overflows and one packet isdropped. In response the tcp_quench() function [WS95] resets the TCP sender’s congestionwindow to . The following eight ACKs grow the congestion window back to

allowing that another segment is sent at about 15 s into the connection. Shortly afterthe 18th second the DUPACKs for the dropped segment return to the sender. The thirdDUPACK triggers a fast retransmit in the 20th second.

Figure 3-4 shows an example where both the sender and the receiver traces are correlated inthe same plot. This measurement was collected in the simple network shown in Figure 3-8 thatwe explain in Section 3.3.2. The plot shows the typical sender and receiver traces of a network-

1. In this case, the TCP receiver acknowledges the receipt of every segment because the ACK interarrival time is larger then the delayed-ACK timer of 200 ms used in TCP-Lite.

10000

12000

14000

16000

18000

20000

22000

24000

26000

28000

30000

6 8 10 12 14 16 18 20 22 24

Sequence Number

Time of Day (s)

Snd_Ack

Snd_Data

Linear regression of returning ACKsdetermines the connection's throughput.

MSS = Differencebetween 2 "dots"

Bytes outstanding (in terms of MSS = Load)

RTT

Fast Retransmiton 3rd DUPACK

Figure 3-3: A TCP sender trace plot.

1 MSS×17 MSS×

54 ________________________________________________________________________________________________ CHAPTER 3

limited TCP connection. The sender periodically: grows its load linearly during the congestionavoidance phase, drops a single packet, goes into fast recovery (triggered after a fast retrans-mit), and then goes into congestion avoidance again. This plot should be compared withFigure 2-6 which is an alternative representation of a network-limited TCP connection. In the35th second a relatively rare event happens that we captured by chance. A segment with achecksum error arrives at the TCP receiver1. This can be seen from the fact that the TCPreceiver does not provide any feedback, i.e., neither sends an ACK nor a DUPACK, upon itsarrival (see arrow in the plot). The lost segment triggers the fast retransmit 37.5 s into the con-nection. The segment following the one that was received in error was dropped due to conges-tion. However, because at that time not enough packets are in flight to generate threeDUPACKs, that segment has to be recovered by a timeout that occurs in the 43th second.


The main focus of our analysis is to study inefficient cross-layer interactions that may occurwhen running TCP-based bulk data transfers over RLP in GSM-CSD. The results of this studyare described in Section 4.2. We were only interested in “stable” connections that lasted longenough to allow for all TCP sender state variables (e.g., retransmission timer, slow-startthreshold, etc.) to converge from their initialization values to a stable range of operation. Wetherefore performed a series of large bulk data transfers ranging in size from 230 KBytes to

1. Apparently this error had not been detected by PPP’s error detection function.

0

10000

20000

30000

40000

50000

0 10 20 30 40 50Time of Day (s)

Seq

uenc

e N

umbe

r

Snd_Data

Snd_Ack

Rcv_Data

Rcv_Ack

Figure 3-4: A TCP sender and receiver trace plot.


1.5 MBytes. In Section 4.2.1 we report on the throughput that TCP achieved in those measure-ments. However, throughput itself is not sufficient information to determine whether TCP andRLP interacted in an inefficient way. For example, a throughput of one half of the theoreticalmaximum could either mean that the radio conditions were so poor that RLP had to retransmitevery other frame or it could indicate competing error recovery between TCP and RLP.

Utilization as defined in Section 3.1.2 is the key performance metric that can be used to deter-mine whether a data transfer suffered from inefficient TCP/RLP interactions or not. If the TCPsender fully utilizes the bandwidth provided by RLP (which may vary over time due to RLPretransmissions) then this indicates optimal performance and rules out inefficient interactionsbetween the two protocols. There are only two ways that utilization may not be optimal: (1) theTCP sender leaves the link (RLP) idle, or (2) the TCP sender sends spurious retransmissions.We used the MultiTracer tool explained in Section 3.2.4 to check for these two cases in all themeasurements we had collected a priori as described in Section 3.2.3. That way we isolated thetraces where utilization was 95 percent or less, and further investigated those to identify thecauses of the degraded performance.

Note that utilization can never be exactly 100 percent because of TCP’s initial slow-start phaseand the 3-way handshake required for both the TCP connection establishment and the discon-nection phase [Ste94]. In our measurement platform, however, the effect of slow-start is negli-gible because the pipe capacity of a TCP flow over a GSM-CSD link is already reached with2 - 3 segments, even when using a small MSS. Also, these effects are amortized when per-forming large bulk data transfers (as done here). Measuring utilization has the added advantagethat it is independent of protocol overhead. Thus, parameters like the Maximum TransmissionUnit (MTU) configured for PPP, the PPP framing overhead, and whether TCP/IP header com-pression was used or not, do not affect utilization as defined here.


The platform that we developed for measurement collection is depicted in Figure 3-5. The grayshaded area indicates a possible extension to the setup that we have not implemented. It wouldgenerate input for trace replay in a simulation environment allowing to reproduce variouseffects that were measured in reality. The measurement platform extends the setup shown inFigure 3-2 by the capability to capture TCP traces with tcpdump in addition to capturingRLP traces with rlpdump , and to correlate all traces onto the same time axis. Since wewanted to isolate the TCP/RLP interactions we continued using a single-hop path. Although itmight in some cases be reverse-engineered, tcpdump does not provide information about theTCP sender state variables, such as the congestion window, the slow start threshold, and theretransmission timeout value. We therefore used tcpstats [Pad98], a UNIX kernel instru-mentation tool that traces these TCP sender state variables. As for the measurements described

56 ________________________________________________________________________________________________ CHAPTER 3

in Section 3.1.3, we needed a traffic generation tool for bulk data transfers. Only this time weneeded one that was based on TCP. For that purpose we used the sock tool described in[Ste94].

Overall, we captured six hours of traces that we used for our analysis. Four hours were mea-sured in environments with good and two hours in environments with poor receiver signalstrength. Although in most of our measurements the mobile host was stationary, we also mea-sured while walking (indoor and outdoor) or driving in a car. We categorized the measurementsas follows.

D.Environments with good receiver signal strength (3 - 4): 4 hours.

E.Environments with poor receiver signal strength (1 - 2): 2 hours.

It is important to point out that, as reported in [KRLKA97], we also had situations where theGSM call, i.e., the physical connection, was dropped during a measurement. In almost allcases, this happened when the receiver signal was very low. Apparently, radio coverage wasinsufficient in those environments. As this data would have introduced an unrealistic bias intoour analysis, we excluded those traces from the analysis in Section 4.2.

Fixed HostUNIX (BSDi 3.0)

TCP

Mult iTracer

Trace Replayin Simulator

(e.g. ns, BONeS)

RLP

R L P D U M P

T C P D U M P

Mobi le HostUNIX (BSDi 3.0)

T C P D U M P

TCPSTATSTCPSTATS

Plott ingTool

(e.g. xgraph)

TrafficSource/Sink(e.g. sock)

TrafficSource/Sink(e.g. sock)

PSTN G S M

Figure 3-5: Measurement platform and tools.


3.2.4 The MultiTracer Tool

Altogether tcpdump , tcpstats and rlpdump generate a total of up to 300 bytes/s of tracedata for a GSM-CSD connection that is running at about 10 Kbit/s. It was therefore essential todevelop a post-processing tool that enabled the rapid correlation and representation of col-lected trace data in a comprehensive graphical manner for trace analysis. We call this toolMultiTracer1. MultiTracer is a set of script files that converts the trace data into the input for-mat required by a plotting tool such as Xgraph. MultiTracer also automatically determines theutilization of each measurement indicating whether a data transfer suffered from inefficientTCP/RLP interactions as explained in Section 3.2.2. For that purpose MultiTracer inspects theRLP trace to determine idle phases at the RLP sender, and it inspects the TCP traces for spuri-ous retransmissions.

In addition to the labeling scheme described in Section 3.2.1 we label the graphs comprisingpoints corresponding to

• the congestion window at the TCP sender as TcpSnd_cwnd,

• frames sent by the RLP sender for the first time as RlpSnd_Data,

• retransmitted frames sent by the RLP sender as RlpSnd_Xmit,

• flow control signals (XON/XOFF) sent by the RLP receiver as RlpRcv_XON andRlpRcv_XOFF,

• RLP link resets as RlpSnd_Rst.

MultiTracer generates more information (e.g., RTT, SRTT, RTO), but for this analysis we onlyuse the items listed above. To correlate RLP and TCP traces, MultiTracer uses the TCPsequence number space. Note, however, that in all plots the RlpSnd_*, TcpSnd_*, andTcpRcv_* graphs are offset by 10,000 bytes from each other in the plots so that the graphs donot overlap. All traces are loosely synchronized with respect to the TCP connect request asdescribed in Section 3.2.1.

To demonstrate the capability to correlate and visualize multi-layer traces we show a typicalmeasurement in Figure 3-6. The three rectangles in Figure 3-6 indicate sections of this plot thatare “zoomed in” for detailed analysis in the following subsection and in Section 4.2. This par-ticular measurement yielded optimal throughput performance. This can be seen from the factthat the TCP receiver continuously receives data. Linear regression of the RlpSnd_Data graphshows that throughput provided by RLP is almost 960 bytes/s which is equivalent to a bit rateof 9.6 Kbit/s asynchronous. Likewise, the trendline through TcpRcv_Data yields a throughputof 848 bytes/s. This is what we expected as TCP/IP header compression was not used for this

1. MultiTracer was implemented by Almudena Konrad at U.C. Berkeley.

58 ________________________________________________________________________________________________ CHAPTER 3

measurement and the overhead per MSS of 460 bytes was 59 bytes (40 bytes for the IP andTCP headers, 12 bytes timestamp option and 7 bytes PPP overhead). Thus, the TCP senderoptimally utilized the bandwidth provided by RLP. Note that the graph for RlpSnd_data alwayshas a larger slope than the TCP graphs because it includes the TCP, IP, and PPP overhead.

3.2.5 Detected “Implementation Bugs” in GSM

The maximum data rate provided by RLP is 1200 bytes/s. We were therefore surprised whenwe saw the gaps in the RlpSnd_Data graphs in some of our traces. However, after we tracedthe flow control messages at the L2R protocol (see Section 2.4.1) it became clear what wasoccurring. Due to limitations in some commercial GSM networks, the data rate appears to belimited to only 960 bytes/s (9.6 Kbit/s asynchronous).

In these networks, the RLP sender is flow controlled from the remote side so that the averagedata rate becomes 960 bytes/s. Figure 3-7 shows that the RLP sender sends at the maximumrate of almost 1200 bytes/s at times when it is not flow controlled, but the linear regression lineshows that the real throughput is throttled by 20 percent down to about 960 bytes/s. However,as can be seen from Figure 3-6, the periodic gaps of 950 - 1300 ms did not trigger spurioustimeouts in TCP.

As mentioned in Section 2.4.3, RLP can be implemented to provide fully-reliable service. Inthat case the data call is completely dropped when the error recovery persistency is reached.We have measured this effect several times in some commercial GSM networks. Simply drop-

0

10000

20000

30000

40000

50000

60000

70000

0 10 20 30 40 50

Sequence Number

Time of Day (s)

TcpRcv_Data (848 bytes/s)

RlpSnd_Data (958 bytes/s)

TcpRcv_Ack

TcpSnd_AckTcpSnd_Data

Figure 3-6: A typical multi-layer trace plot.

Figure 4-6

Figure 4-10

Figure 3-7


ping the call is, however, an unacceptable alternative. Not only will the user in many caseshave to re-initiate the data transfer (e.g., a file transfer), but will also be charged for air timethat yielded an unsuccessful transmission. Implementing RLP to provide semi-reliable serviceis therefore more “user friendly”.

3.3 Reproducing Inefficient Cross-Layer Interactions

In this section, we explain the methodology we use in Section 5.1 and Section 5.2 to study andsolve the problem of competing error recovery for the case of TCP. With this analysis we par-ticularly address the problem of “competing error recovery” (see Section 2.5.4). This work hasbeen published in [LK00].


The goal of our analysis is to study the impact of competing error recovery on TCP’s opera-tion. In TCP, competing error recovery can cause spurious retransmissions as explained inSection 2.5.4. Those can be triggered by spurious timeouts or packet re-ordering events. In thelatter case, we speak of spurious fast retransmits to distinguish those from spuriousretransmissions that have been triggered by spurious timeouts.

Spurious timeouts have not generally been a concern in the past. They are rare over all wirelinepaths [Pax97d], as well as on path’s that include reliable wireless links that do not lose connec-

7000

12000

17000

22000

27000

10 15 20 25 30

RlpRcv_XON

RlpRcv_XOFF

Time of Day (s)

Sequence Number

Linear regression ofRlpSnd_Data(958 bytes/s) RlpSnd_Data

(1198 bytes/son each sectionof the graph)

Figure 3-7: L2R flow control (zoom of Figure 3-6).

60 ________________________________________________________________________________________________ CHAPTER 3

tivity as we show in Section 4.2. This is due to the fact that the retransmission timer imple-mented in TCP-Lite is overly conservative as we show in Section 5.3. However, we believethat the problem will occur more frequently with the increasing number of hosts accessing theInternet via wide-area packet-radio networks1. Frequent disconnections - on the order of sec-onds - without losing data are not only common in these networks, but are explicitly accountedfor in their design. Over such links spurious timeouts in TCP are likely to be more frequent.

Spurious fast retransmits can occur if a link layer implements the out-of-order delivery func-tion, or if packets of the same flow are routed differently in the Internet. It is difficult to evalu-ate how serious this problem is in the Internet today. For wireless links that implement the out-of-order delivery function, we are not aware of any study that investigates this problem. Forthe other case, some studies [Pax97d] conclude that spurious fast retransmits occur rarely,while other studies [BPS99] find this problem to be more serious. Clearly, this depends on thepaths underlying such studies, e.g., whenever routers are inter-connected via multiple links/paths (e.g., for fault tolerance) and load balancing is performed across those links/paths on theaggregate traffic, packet re-orderings will occur more frequently.

To study competing error recovery in TCP, we setup a “clean” environment in which measure-ments are not blurred by uncontrolled effects like delay variations, or packet losses commonlyfound in the Internet. We then used the hiccup tool explained in Section 3.3.3 to artificiallyintroduce excessive packet delays and/or packet re-orderings to trigger spuriousretransmissions.


We used a single-hop, all wireline path for our experiments consisting of two hosts (BSD/386Version 3.0) inter-connected via a direct cable connection running PPP at 9.6 Kbit/s with anMTU of 512 bytes. In all measurements the TCP timestamp option was enabled. The TCPreceiver advertised a window of 8496 bytes ( ). We always measured a single con-nection at a time, and the pipe capacity was two segments. The size of the sending host’s inter-face buffer (IFQ_MAXLEN [WS95]) which in BSD-derived systems is maintained in terms ofIP packets was used to limit the number of queued packets. For example, an interface buffersize of 12 allows 12 packets to be queued before a packet (tail-)drop occurs. We used the inter-

1. Note that GSM-CSD is not a packet-radio network.

18 MSS×


face buffer size to trigger certain effects explained in Section 5.1. We used sock for bulk data

traffic generation.

3.3.3 The Hiccup Tool

We developed a tool called hiccup 1 to trigger spurious timeouts and/or spurious fast retrans-

mits. Depending on the parameters specified by a user-level process, hiccup operates on a

given interface in the inbound, outbound, or both directions, and generates transient delays by

queueing packets, or re-orders packets according to a user-specified re-ordering length. When

generating transient delays, hiccup can additionally be provided with an “expiration time”

after which each packet is dropped from the queue after its arrival2. The default “expiration

time” is indefinite. In that case packets are never dropped by hiccup . We have used this fea-

ture to demonstrate in Section 4.1.4 the problems that less persistent link layer error recovery

may cause for fully-reliable end-to-end protocols such as TCP. Effectively, hiccup emulates

a semi-reliable link layer protocol with a configurable error recovery persistency, and with the

in-order or the out-of-order delivery function.

1. hiccup was implemented by Keith Sklower at U.C. Berkeley.

2. Note, that this feature results in a drop-from-front as opposed to a tail-drop queue management scheme.

Receiver (BSDi 3.0)

Direct Cable (Serial Line)(9.6 Kb/s)

Sender (BSDi 3.0)

IPIP

PPPPPP

IFQ_MAXLEN

hiccup

TCPTCP

socksock

BSD PacketFilter

BSD PacketFilter

Figure 3-8: Measurement Setup.

62 ________________________________________________________________________________________________ CHAPTER 3

The location of hiccup in the protocol stack is important to understand the trace plots (e.g.,see Figure 5-2) in Section 5.1 and Section 5.2. Outbound packets queued by hiccup arelogged as a single burst by the BSD Packet Filter (BPF) although they have not been sent as aburst by the TCP sender. Those packets are clocked out separately by the TCP sender eachtime an ACK arrives (marked as + in the trace plots), but then get queued by hiccup . At thatpoint those packets are not logged by BPF. That is done after the transient delay is over, andhiccup flushes the queue of packets into the outbound interface buffer. The packets are thenspread out in time due to the transmission delay on the outgoing link before they are receivedby the TCP receiver.

3.4 Analyzing TCP’s Retransmission Timer

In this section, we explain the methodology we use in Section 5.3 to study TCP-Lite’s retrans-mission timer, the Lite-Xmit-Timer (see Section 2.2.2). We also use that methodology todevelop a new retransmission timer for TCP in Section 5.4. With this analysis we address theproblem of “competing error recovery” (see Section 2.5.4). This work has been described in[LS99].

Our experience with measuring TCP, especially the large amount of delay variation that isrequired to trigger a spurious timeout in TCP (see Section 4.2 and Section 5.1), led us tobelieve that something was wrong with the Lite-Xmit-Timer. We suspected that it was overlyconservative. We therefore analyzed the Lite-Xmit-Timer and confirmed our conjecture. Forthat purpose, we developed a model of the class of network-limited TCP bulk data transfers insteady state which we describe in Section 3.4.1 and Section 3.4.2. In Section 3.4.3, wedescribe the measurement setup that was used for validation purposes.

3.4.1 Choosing a “typical” TCP Connection

TCP’s operation and performance is largely determined by the path’s metrics such as availablebandwidth, end-to-end delay, and packet drop pattern. Ideally, a well-designed retransmissiontimer should perform well over any possible end-to-end path. In the Internet, however, thosepath metrics can vary considerably over short and long time scales [Pax97a]. Consequently, thetypical TCP connection does not exist. This makes it particularly difficult to validate thedesign of an end-to-end retransmission timer. Our approach is therefore to study one commonclass of TCP connections which is frequently found in the Internet, yet, is simple enough toallow for a model-based analysis.


We study the class of network-limited TCP bulk data transfers in steady state. In this case theTCP sender goes through periodic congestion avoidance cycles during which it linearlyincreases the load on the network until it receives a congestion signal. It then halves the loadwhich effectively means that it does not send any more segments for one half the RTT. Thisgives the queue at the bottleneck link time to drain. We further assume a non-shared bottlenecklink with a fixed bandwidth and the sender always sends fixed size segments. In addition, weassume that the sender fully utilizes (as defined in Section 3.1.2) the bottleneck link at anypoint in time. The latter has the effect that whenever the sender increases its load by one seg-ment, that this will increase the queue length at the bottleneck by one. Consequently, the RTTincreases by the segment’s service time at the bottleneck link. It also yields a maximum RTTthat is twice the minimum RTT as illustrated in Figure 3-9. Given these assumptions, the RTTof a given flight within one congestion avoidance cycle is the sum of the RTT of the precedingflight and a segment’s service time at the bottleneck link (see Figure 3-9 where each dot in thegraph denotes one RTT sample).

TCP connections that fulfill these assumptions can, e.g., be found in situations where theaccess link (e.g., low bandwidth dial-up or wide-area wireless) becomes the bottleneck link,and only a single application creates traffic. The analysis of a receiver-limited connection insuch a situation is trivial as the RTT is constant in that case.

TimeOfDay

RTT

MAX-RTT

MIN-RTTMAX-RTT = 2 x MIN-RTT

Bottleneck LinkService Time

One Congest ionAvoidance Cycle

RTT Samples ofthe same Flight

Figure 3-9: The RTT in steady state.

64 ________________________________________________________________________________________________ CHAPTER 3

3.4.2 Model-based Analysis

Given an RTT that evolves in a deterministic and recurrent manner as outlined in Section 3.4.1,the RTO does also, as it is a function of RTT. Thus, we have chosen to model the RTT, theRTO, and all other relevant sender-side connection state variables on a spreadsheet [Lud99a].

We make the following additional assumptions:

• In our model, we assume that every segment is timed to measure the RTT and that thereceiver acknowledges every segment, i.e., we assume an RTT sampling rate of one.

• We assume that congestion is signalled explicitly at the end of each congestion avoidancecycle instead of through a dropped packet (see Section 2.3). This simplifies the model-based analysis without limiting it.

• To make our model independent of the impact of the timer granularity (see Section 2.2.2)we model time in terms of ticks which can be arbitrarily defined.

On our spreadsheet, columns correspond to a specific connection state variable (e.g., the RTTor the RTO) and rows correspond to the arrival of a new ACK, i.e., a new RTT sample. Thus,the “Time of Day” progresses from one row to the next by the bottleneck link’s service time.The spreadsheet has a number of parameters including the segment size, the bottleneck link’sbandwidth and buffer size, and the end-to-end latency. Those are used to instantiate the spread-sheet to reflect a specific connection, i.e., a specific evolution of RTT. In the following werefer to such an instantiation of the spreadsheet as “the model”. The mentioned parametersitself are less important for our analysis. What matters is the flow’s load at the end of each con-gestion avoidance cycle. This is further discussed in Chapter 5.

Using spreadsheet software as a modeling tool for our purpose has a number of advantages.First, spreadsheet software usually includes graphing components which greatly ease the anal-ysis. Second, debugging is implicitly supported as the spreadsheet itself reflects the history ofthe sender-side connection state, i.e., the value of the modeled state variables over time. Thegreatest advantage over techniques like simulations, however, is the little processing timerequired to determine target metrics over time for a given parameter set. Once the parametershave been specified, a graph of interest can be viewed instantly.

3.4.3 Measurement-based Analysis

We perform a measurement-based analysis to validate our model-based analysis. Thus, ourgoal is to reproduce a connection with characteristics as close as possible to a connection wecan model using the technique and the assumptions described in the preceding two subsec-tions. For that purpose, we used the measurement setup described in Abschnitt 3.3.2 without


hiccup , an MTU of 1500 bytes, and a link speed of 2.4 Kbit/s. In addition, we set the size ofthe interface buffer to 40 packets. We chose those settings to produce RTTs that are severalmultiples of the timer granularity used in TCP-Lite (500 ms) to study the RTO at a sufficientresolution. With these settings, the RTT at the end of a congestion avoidance cycle is about250 s (40 packets of 1500 bytes draining from the interface buffer at 240 bytes/s).

The transmission delay for a segment in this setup is too high to trigger delayed ACKs. Conse-quently, we always measured with an RTT sampling rate of one. The only difference to themodel of this connection is that the TCP sender in the measurements had to rely on a droppedpacket and the corresponding three DUPACKs as the congestion signal. The minor impact ofthis difference is discussed in Abschnitt 5.4.2.

3.5 Summary

In this chapter, we explained the methods and tools we used to obtain the results presented inChapter 4 and Chapter 5. Most of our studies are based on measurements. To support thiswork, we have developed four new tools:

• rlpdump ,

• hiccup ,

• ReTracer, and

• MultiTracer.

Many thanks to Almudena Konrad, Bela Rathonyi, and Keith Sklower who contributed to thetools’ development.

The first two tools are kernel extensions of the BSD system with a corresponding user levelprocess. rlpdump logs RLP and block erasure traces. hiccup emulates a semi-reliable linklayer protocol with a configurable error recovery persistency, and with the in-order orout-of-order delivery function. It is used in a controlled, “non-wireless” environment. hiccup

can also be used to artificially generate excessive packet delays and/or packet re-orderings dur-ing a measurement to trigger spurious retransmissions in TCP.

The latter two tools are script files required for post-processing of TCP and RLP traces provid-ing the ability to visualize them correlated in time and at multiple levels of detail. Using thesetools, we have post-processed and analyzed a large base of traces representing a variety ofmobile data uses (e.g., stationary indoors, walking, driving, etc.).

66 ________________________________________________________________________________________________ CHAPTER 3

We use rlpdump and MultiTracer in Section 4.2 to study in general inefficient cross-layerinteractions that may occur when running TCP-based bulk data transfers over RLP in GSM-CSD. We use rlpdump and ReTracer in Section 4.3 to evaluate the benefit of link layer errorrecovery for reliable flows. This analysis addresses the problems of “underestimation of avail-able bandwidth” (see Section 2.5.1), “inefficiency of end-to-end error control” (seeSection 2.5.2), and also the problem of “failure of link layer differential encodings” (seeSection 2.5.5). We use hiccup in Section 5.1 and Section 5.2 to further study and solve theproblem of “competing error recovery” (see Section 2.5.4) for the case of TCP.

In addition to measurements, we use a model-based analysis approach to study TCP-Lite’sretransmission timer in Section 5.3, and to develop a new retransmission timer for TCP inSection 5.4. For that purpose we modeled on a spreadsheet the RTT, the RTO, and all other rel-evant sender-side connection state variables for the class of network-limited TCP bulk datatransfers in steady state. In Section 5.3.5, we validate our model-based analysis through mea-surements in a real network that yield the same results. This analysis addresses the problem of“competing error recovery” (see Section 2.5.4).

analysis methodology - university of california,...

Documents