
Assignment 1: Network Emulation of TCP CUBIC and SACK (Draft #1)

Naeem Khademi
Networks and Distributed Systems Group

Department of Informatics
University of Oslo, Norway

naeemk@ifi.uio.no


ABSTRACT
The goal of this report is to experimentally evaluate the performance of two TCP congestion control mechanisms, namely SACK and CUBIC, as implemented in the Linux kernel TCP suite, using network emulation techniques. The performance of these TCP variants has been evaluated using real-life measurements under different parameter settings, namely varying bottleneck buffer sizes and numbers of concurrent download flows.

Keywords
TCP, Measurement, Network emulation

1. INTRODUCTION
TCP has been the dominant reliable data transfer protocol on the Internet for over a decade and is expected to remain so in the future. One of the most challenging research issues has been to maximize TCP's performance (e.g., goodput) under different Internet scenarios. Several congestion control mechanisms have been proposed to achieve this goal, with CUBIC currently being the default congestion control mechanism in the Linux kernel. Moreover, the TCP suite implemented in the Linux kernel provides the functionality to select congestion control mechanisms other than the default, and their source code is openly available. This gives researchers the opportunity to use passive measurements to evaluate the performance of these mechanisms under varying conditions and to enhance or modify the existing congestion control mechanisms to optimize performance.

Buffer size plays an important role in determining TCP's performance. The common assumption about the buffer sizing requirement for a single TCP flow follows the rule of thumb which identifies the required buffer size at the bottleneck as the bandwidth×delay product of the flow's path. This buffer size requirement is associated with the sawtooth behavior of the TCP congestion window (cwnd) in NewReno or SACK. TCP is designed to saturate the maximum available bandwidth over the long term, and it eventually fills up the total buffer size provided along the path. Larger buffer sizes introduce higher queueing delays and are therefore not favorable for delay-sensitive applications, and they increase TCP's round-trip time (RTT).
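As a concrete illustration of this rule of thumb, the short Python sketch below computes the bandwidth×delay product for a path like the one emulated later in this report (10 Mbps bottleneck, roughly 100 ms of base RTT from the emulator's added delay); the bandwidth, delay, and 1500-byte MTU values are taken from Section 2, and the conversion to packets is only indicative.

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: the rule-of-thumb buffer size
    for a single long-lived TCP flow."""
    return bandwidth_bps * rtt_s / 8.0

# Path parameters mirroring the emulated setup described in Section 2
# (10 Mbps bottleneck, ~100 ms base RTT, 1500-byte MTU).
BANDWIDTH_BPS = 10e6
RTT_S = 0.100
MTU_BYTES = 1500

bdp = bdp_bytes(BANDWIDTH_BPS, RTT_S)
print(f"BDP = {bdp:.0f} bytes, roughly {bdp / MTU_BYTES:.0f} packets of {MTU_BYTES} B")
# -> BDP = 125000 bytes, roughly 83 packets of 1500 B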

It has been shown that, under realistic traffic scenarios, such as when multiple TCP flows coexist along the path between sender and receiver, the cwnd sawtooths of desynchronized TCP flows cancel each other out and produce an almost flat aggregate cwnd, easing the buffer sizing requirement to a value smaller than the bandwidth×delay product. Research has shown that the required buffer size can be reduced to roughly the bandwidth×delay product divided by the square root of the number of flows. This phenomenon facilitates the development of core routers for the Internet backbone with smaller buffers that provide almost the same utilization level while requiring fewer RAM chips, lowering product cost.
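Extending the sketch above, the square-root rule scales that requirement down with the number of desynchronized flows; the figures below are a back-of-the-envelope illustration of the published result, not values measured in this report.

import math

def buffer_for_n_flows(bdp_bytes: float, n_flows: int) -> float:
    """Approximate buffer requirement for n desynchronized long-lived flows:
    the bandwidth-delay product divided by the square root of the flow count."""
    return bdp_bytes / math.sqrt(n_flows)

bdp = 125_000.0  # bytes, from the previous sketch (10 Mbps x 100 ms)
for n in (1, 2, 5, 10, 20):
    print(f"n = {n:2d}: buffer of about {buffer_for_n_flows(bdp, n) / 1500:.0f} packets")
# With n = 20 this comes to roughly 19 packets, i.e. close to the smallest
# buffer setting (20 packets) used in the experiments below.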

In this paper, we study the impact of different buffer sizes at the bottleneck router (here, a network emulator node) jointly with various numbers of coexisting flows between a sender and a receiver, for both TCP CUBIC and SACK. We study how these scenario settings affect TCP's throughput and RTT and compare the behavior of CUBIC and SACK in these respects. The rest of this paper is organized as follows: Section 2 presents the experimental setup and methodology used in the measurements. Section 3 presents the measurement results and evaluates the performance of TCP CUBIC and SACK under various parameter settings, and finally Section 4 discusses the results and concludes the paper.

2. EXPERIMENTAL METHODOLOGY
This section describes the experimental setup and methodology used in this paper. To passively measure the performance of different TCP variants, we set up a testbed using three computers acting as sender, receiver and an emulator/router node in between. Each TCP peer is physically connected to the emulator node by a 100 Mbps wired link (however, we have used an emulation tool to limit the available bandwidth to 10 Mbps). Figure 1 is a schematic diagram of the network topology used in the tests. We have used Emulab to set up this testbed and conduct our experiments. Emulab is an open network testbed at the University of Utah which provides a platform for researchers to conduct real-life network experiments (Figure 2). The specifications of this testbed are presented in Table 1.


Figure 1: Network topology (node A (sender) and node B (receiver) connected through the emulator node (delay=100 ms) by 10 Mbps links)

Figure 2: Emulab testbed

The PCs employed for the measurements are located in clusters of densely stacked nodes in network lab environments.

Table 1: Test-bed setup
  Test-bed                       Emulab
  PC                             Pentium III 600 MHz
  Memory                         256 MB
  Operating System               Fedora Core 4
  Linux kernel                   2.6.18.6
  Default Interface Queue Size   1000 pkts
  Node numbers                   3

Each experiment was repeated for 10 runs, each lasting 50 seconds, with a 20-second gap between consecutive runs. The presented results are averages over all runs. The iperf traffic generation tool was used to generate and measure TCP traffic during each test. TCP's send and receive buffers were set large enough (256 KB) to allow cwnd to grow without restriction. The Maximum Segment Size (MSS) was set to 1448 bytes (MTU = 1500 bytes). Two performance metrics are measured: 1) TCP throughput and 2) round-trip time (RTT). To gather the traffic statistics, we used tcpdump to monitor the traffic traversing the path between sender and receiver. To calculate the throughput and the average RTT, we employed tcptrace, a tool which parses the dumped traffic files in pcap format and produces various statistics about the TCP flows.
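The report does not list the exact commands, so the following Python sketch only illustrates one plausible way to drive a single run with the tools named above (iperf, tcpdump, tcptrace). The host name, capture interface, and file names are hypothetical placeholders, and it assumes an iperf server (e.g. iperf -s -w 256K) is already running on node B.

import subprocess, signal, time

RECEIVER = "nodeB"      # hypothetical host name of node B (iperf server)
IFACE = "eth0"          # hypothetical capture interface on the sender
DURATION_S = 50         # traffic duration used in the experiments
FLOWS = 5               # number of parallel flows for this run

def run_once(run_id: int) -> None:
    # Capture all TCP traffic of this run for later offline analysis.
    dump = subprocess.Popen(
        ["tcpdump", "-i", IFACE, "-w", f"run{run_id}.pcap", "tcp"])
    time.sleep(1)  # give tcpdump a moment to start capturing

    # 50 s of TCP traffic, FLOWS parallel streams, 256 KB socket buffers,
    # 1448-byte MSS, as described in this section.
    subprocess.run(
        ["iperf", "-c", RECEIVER, "-t", str(DURATION_S), "-P", str(FLOWS),
         "-w", "256K", "-M", "1448"], check=True)

    dump.send_signal(signal.SIGINT)   # stop the capture
    dump.wait()

    # Per-connection throughput and RTT statistics from the dumped trace.
    subprocess.run(["tcptrace", "-l", "-r", f"run{run_id}.pcap"], check=True)

if __name__ == "__main__":
    for run in range(10):   # 10 runs per experiment
        run_once(run)
        time.sleep(20)      # 20 s gap between consecutive runs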

Each experiment consists of 50 seconds of iperf TCP traffic from node A to node B. The number of coexisting flows between these nodes varies from 1 to 20. The role of the network emulator (the intermediate node) is to add a fixed delay of 100 ms to each arriving TCP packet and to limit the maximum available bandwidth on the link to 10 Mbps.

Figure 3: Aggregate throughput of parallel flows (aggregate throughput in b/s vs. number of parallel flows 1-20, buffer size = 20 pkts; curves for CUBIC and SACK)

In addition, the emulator buffers the arriving TCP packets in its ingress and egress buffers, which are of equal size (ranging from 20 to 100 packets). To provide a fine-grained distribution of the packets serviced by the emulator node among different TCP flows, we have limited the maximum burst size of a TCP flow at the emulator node to 10 consecutive packets. The TCP congestion control mechanism of the Linux TCP suite can be changed using the /proc file system or the sysctl command.
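The sketch below shows how the congestion control algorithm can be switched through the /proc interface just mentioned, together with one possible way to realize the emulator's delay, rate limit, and per-queue packet limit with Linux tc (netem plus a token bucket filter). The report does not state which emulation tool was actually used, so the tc portion is an assumed, purely illustrative configuration; the interface name and the parameter values mirroring this section are placeholders.

import subprocess

CC_PROC = "/proc/sys/net/ipv4/tcp_congestion_control"

def set_congestion_control(algorithm: str) -> None:
    """Select the TCP congestion control algorithm, e.g. 'cubic' or 'reno'.
    Equivalent to: sysctl -w net.ipv4.tcp_congestion_control=<algorithm>."""
    with open(CC_PROC, "w") as f:
        f.write(algorithm)

def configure_emulator(iface: str, buffer_pkts: int) -> None:
    """Assumed, illustrative emulator setup on a Linux router:
    100 ms fixed delay and a packet-count queue limit via netem,
    plus a 10 Mbit/s rate limit with a ~10-packet burst via tbf."""
    subprocess.run(["tc", "qdisc", "replace", "dev", iface, "root",
                    "handle", "1:", "netem", "delay", "100ms",
                    "limit", str(buffer_pkts)], check=True)
    subprocess.run(["tc", "qdisc", "replace", "dev", iface, "parent", "1:1",
                    "handle", "10:", "tbf", "rate", "10mbit",
                    "burst", "15000", "limit", "30000"], check=True)

if __name__ == "__main__":
    set_congestion_control("cubic")             # "reno" (with net.ipv4.tcp_sack=1)
                                                # gives the SACK variant (assumption)
    configure_emulator("eth1", buffer_pkts=20)  # hypothetical egress interface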

3. MEASUREMENT RESULTS
This section presents the performance measurement results of TCP CUBIC and SACK for different buffer sizes and various numbers of parallel flows.

3.1 The Impact of Number of Parallel TCP Flows

Figure 3 shows the impact of the number of coexisting flows on the overall TCP performance when the buffer size at the bottleneck node is 20 packets. We can observe that as the number of concurrent TCP flows grows from 1 to 20, the total aggregate throughput of the system increases from 2.5 and 2.9 Mbps to 4.5 and 4.9 Mbps for SACK and CUBIC respectively, with CUBIC achieving a slightly higher TCP throughput than SACK. This almost two-fold increase in throughput is due to the fact that as the number of coexisting flows increases, the flows become more desynchronized over time, providing a better utilization level than in a single-flow scenario. In this scenario, CUBIC's performance stands at a higher level than SACK's due to the aggressive nature of CUBIC's cwnd growth (a cubic function of the time elapsed since the last congestion event).
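To make the "cubic function of elapsed time" concrete, the sketch below evaluates the standard CUBIC window-growth formula from RFC 8312, W(t) = C·(t − K)^3 + W_max with K = ((W_max·(1 − β))/C)^(1/3), C = 0.4 and β = 0.7; the formula and constants come from the CUBIC specification, not from measurements in this report.

# CUBIC window growth after a loss event, per RFC 8312:
#   W(t) = C * (t - K)^3 + W_max,  K = ((W_max * (1 - beta)) / C) ** (1/3)
# where W_max is the window (in MSS) at the last loss, beta = 0.7 is the
# multiplicative decrease factor and C = 0.4. For comparison, a Reno/SACK
# sender restarts from 0.5 * W_max and grows by roughly one MSS per RTT.

C = 0.4        # CUBIC scaling constant (RFC 8312)
BETA = 0.7     # window is reduced to BETA * W_max after a loss

def cubic_window(t: float, w_max: float) -> float:
    """CUBIC congestion window (in MSS) t seconds after a loss event."""
    k = ((w_max * (1.0 - BETA)) / C) ** (1.0 / 3.0)
    return C * (t - k) ** 3 + w_max

if __name__ == "__main__":
    w_max = 100.0  # illustrative window at the last loss, in MSS
    for t in (0.0, 1.0, 2.0, 4.0, 6.0):
        print(f"t = {t:3.1f} s: cwnd of about {cubic_window(t, w_max):6.1f} MSS")
    # The window climbs back toward W_max quickly (concave phase) and then
    # probes beyond it (convex phase), which is consistent with the more
    # aggressive queue occupancy observed for CUBIC compared to SACK.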

Figure 4 demonstrates the impact of various numbers of coexisting TCP flows on the average RTT of all flows. The flows' average RTT remains almost fixed at a minimal level across different numbers of parallel flows when the bottleneck buffer size is set to a small value of 20 packets, for both CUBIC and SACK. This is because small buffers incur very little queueing delay; instead, packet drop events become more frequent.


Figure 4: RTT vs. number of parallel flows (average RTT in ms vs. number of parallel flows; curves for CUBIC and SACK with buffer sizes of 20, 50, and 100 pkts)

Figure 5: Throughput vs. Buffer Size (aggregate throughput in b/s vs. buffer size of 20, 50, 100, and 200 pkts; curves for CUBIC and SACK with 1, 2, 5, 10, and 20 flows)

However, the difference between CUBIC's and SACK's RTT becomes more significant as the buffer size grows, with CUBIC's RTT being higher than SACK's under various numbers of parallel flows. This is due to the aggressive behavior of CUBIC after a loss event, most probably caused by a buffer overflow, which keeps a higher number of TCP packets in the bottleneck buffer at any instant of time and therefore increases the queueing delay. In contrast, SACK behaves conservatively by halving the cwnd (similar to NewReno), so fewer packets traverse the path, the bottleneck buffer drains after a loss event, and the queueing delays are consequently smaller.

3.2 The Impact of Buffer Sizing
Figure 5 shows the impact of buffer sizing on the aggregate TCP throughput of the system (this graph will be explained in the next draft/presentation). Figure 6 demonstrates the average RTT of TCP flows for varying buffer sizes (this graph will be explained in the next draft/presentation).

Figure 6: RTT vs. Buffer Size (average RTT in ms vs. buffer size of 20, 50, 100, and 200 pkts; curves for CUBIC and SACK with 1, 2, 5, 10, and 20 flows)

4. CONCLUDING REMARKS

In this paper, we evaluated the performance of two congestion control mechanisms, namely CUBIC and SACK, using passive measurements and emulation techniques under various scenarios. The impact of buffer sizing at the bottleneck on the throughput and RTT of these TCP variants has been studied. Furthermore, the impact of the number of coexisting TCP flows has been studied as well. While a higher number of parallel flows increases the aggregate throughput for both CUBIC and SACK (with CUBIC performing slightly better most of the time), it also increases the average RTT of CUBIC compared to SACK. On the other hand, considering various buffer sizes, an increase in buffer size increases both throughput and RTT for SACK and CUBIC. However, TCP throughput remains constant beyond a certain threshold while the queueing delay (and therefore RTT) keeps increasing, with CUBIC being more exposed to the increase in RTT.

5. REMARKS
This draft will be revised to include more results and figures. We are already continuing the experiments for buffer sizes larger than 100 packets. Each test is repeated for 10 runs in Emulab, and the final averaged results will be presented with 95% confidence intervals.