
Fast Pattern-Based Throughput Prediction for TCP Bulk Transfers

Tsung-i (Mark) Huang

Jaspal Subhlok

University of Houston

GAN’05 / May 10, 2005

Outline

• Background
• Problem Description
• Methodology
• Experiments and Results
• Conclusion and Future Work

“Are we there yet?”

When do you need throughput prediction?

• File download: "xx minutes left" estimates (MS IE vs. Mozilla)
• Mirror site selection: Knoppix from Florida State Univ. (fsu.edu) or TU Ilmenau, Germany (tu-ilmenau.de)
• Resource selection in a grid environment
• Cache selection for web content delivery services

Which site will give the best throughput?

Current approaches and tools:

• Geographical distance
• Ping (ICMP)
• Download 512 KBytes (fixed size): NWS / iperf
• Download for 10 seconds (fixed duration): iperf

The last two approaches are the most accurate, but they raise questions:

• How much data should be downloaded, or for how long?
• Is “Bandwidth * Delay” the answer? Does one size fit all?
• “All or nothing”: no result is available until the end of the transmission

Problem Description

• Predicted future throughput can be used in mirror/replica site selection
• Predict the throughput of a TCP bulk transfer
• Single TCP stream
• Input: time series of (arrival time, bytes received)
• Output: predicted future throughput
• Make a prediction of future throughput after 10 to 100 RTTs
• Utilize knowledge of TCP flow patterns
• Assume TCP flow patterns will repeat later in the same TCP stream

TCP Flow Patterns

[Figure: four throughput plots. Textbook examples: (a) Rate Control, (b) Congestion Control. In reality: (c) Rate Control with delay, (d) Mixed Congestion Control.]

Approach to Throughput Prediction

• Analyze the time series (TS1) of (arrival time, bytes received) to get a meaningful throughput time series. Possible solutions:
    • Instant throughput: throughput since the previous TCP segment
    • Fixed-interval throughput: average throughput over a fixed time period
    • Per-RTT throughput: partition using the fixed SYN-ACK RTT. Idea: TCP sends a window full of data segments every RTT
• Partition the time series (TS1) with the fixed SYN-ACK RTT to get the per-RTT throughput series (TS2)
• Analyze the per-RTT throughput time series (TS2) to predict future throughput
• Compare different prediction methods across all traces
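The TS1 to TS2 partitioning step can be sketched as follows. This is a minimal illustration, not the authors' Perl implementation: the function name `per_rtt_throughput`, the assumption that samples are sorted by arrival time, and the choice to start the first bin at the first sample are all assumptions made here.

```python
from typing import List, Tuple


def per_rtt_throughput(samples: List[Tuple[float, int]], rtt: float) -> List[float]:
    """Turn TS1, a list of (arrival_time_sec, bytes_received), into TS2,
    the throughput in bytes/sec for each consecutive RTT-wide bin.

    Assumes samples are sorted by arrival time and rtt is the fixed
    SYN-ACK RTT in seconds.
    """
    if rtt <= 0:
        raise ValueError("rtt must be positive")
    if not samples:
        return []
    start = samples[0][0]
    n_bins = int((samples[-1][0] - start) // rtt) + 1
    bytes_per_bin = [0] * n_bins
    for arrival, nbytes in samples:
        # Index of the RTT-wide bin this segment falls into.
        bytes_per_bin[int((arrival - start) // rtt)] += nbytes
    return [b / rtt for b in bytes_per_bin]


# Hypothetical trace: four segments, fixed RTT of 0.5 s.
ts1 = [(0.0, 1000), (0.2, 1000), (0.6, 500), (1.7, 2000)]
ts2 = per_rtt_throughput(ts1, rtt=0.5)
```

Bins with no arrivals are kept as zero-throughput entries so that the index in TS2 still corresponds to elapsed RTTs.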

TCP Segment Partitioning (1)

• Instant throughput shows a wide range of fluctuation. [Figure, log scaled; annotations: "Over 1 GBytes/sec", "About 220 Bytes/sec"]
• Fixed-interval throughput shows less fluctuation. [Figure; annotations: "121 KB/sec", "40 KB/sec"; panel labels: "Fixed interval of 100 ms", "Per-RTT throughput, SYN-ACK RTT = 176 ms"]

TCP Segment Partitioning (2)

• RTT estimation: use the fixed SYN-ACK RTT (simple and effective)
• Partition TCP segments into a per-RTT throughput time series

[Figure: SYN / ACK exchange with the RTT marked]
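One way to obtain a fixed SYN-ACK RTT from a receiver-side capture is to take the time between the outgoing SYN and the first SYN-ACK that follows it. The sketch below assumes the capture has already been reduced to (timestamp, flag label) pairs; the labels "SYN" and "SYN-ACK" and the function name `syn_ack_rtt` are hypothetical and are not tcpdump output fields.

```python
from typing import Iterable, Optional, Tuple


def syn_ack_rtt(packets: Iterable[Tuple[float, str]]) -> Optional[float]:
    """Return the delay in seconds between the first SYN and the first
    SYN-ACK seen after it, or None if the handshake is not in the capture."""
    syn_time = None
    for timestamp, flags in packets:
        if flags == "SYN" and syn_time is None:
            syn_time = timestamp
        elif flags == "SYN-ACK" and syn_time is not None:
            return timestamp - syn_time
    return None


# Hypothetical handshake as seen at the downloading host.
handshake = [(10.0, "SYN"), (10.25, "SYN-ACK"), (10.25, "ACK")]
rtt = syn_ack_rtt(handshake)
```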

Throughput Prediction (1)

• TCP patterns: Rate Control limited (RC), Congestion Control limited (CC)
• Identify basic elements: flat regions, exponential climb regions, linear climb regions, drop points

[Figure: throughput plot annotated with Flat, Exponential Climb, Linear Climb, and Drop points]
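A rough way to label the basic elements is to compare each per-RTT throughput value with the previous one. The slides do not give the detection rules, so the thresholds below (`exp_ratio`, `drop_ratio`, `flat_tol`) and the function name `classify_steps` are illustrative assumptions only.

```python
from typing import List


def classify_steps(ts2: List[float],
                   exp_ratio: float = 1.8,
                   drop_ratio: float = 0.7,
                   flat_tol: float = 0.03) -> List[str]:
    """Label each step between consecutive per-RTT throughput values as
    'exponential', 'linear', 'flat', or 'drop'. Thresholds are assumed."""
    labels = []
    for prev, cur in zip(ts2, ts2[1:]):
        if prev <= 0:
            # No baseline to compare against; treat any traffic as a climb.
            labels.append("exponential" if cur > 0 else "flat")
            continue
        ratio = cur / prev
        if ratio <= drop_ratio:
            labels.append("drop")
        elif ratio >= exp_ratio:
            labels.append("exponential")   # roughly doubling per RTT
        elif ratio > 1 + flat_tol:
            labels.append("linear")        # slow, steady growth
        else:
            labels.append("flat")          # includes small dips
    return labels


labels = classify_steps([10, 20, 40, 42, 44, 44, 20])
```

Runs of identical labels then correspond to the regions named above, and each "drop" label marks a drop point.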

Throughput Prediction (2)

• Peak of slow start: data points up to the end of the first slow start are ignored for prediction, because the initial slow start does not repeat
• RC-based prediction: use flat regions
• CC-based prediction: use complete CC cycles
• Window-based prediction: used if no clear pattern is observed

[Figure: throughput plot with the peak of slow start marked]
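The selection between prediction modes can be sketched as below. This is a simplified illustration under several assumptions: the slow-start peak is taken to be the first sample followed by a lower one, a trace counts as rate-control limited when at least half of the remaining samples lie within `flat_tol` of their median, and the CC-based mode (complete congestion-control cycles) is left out. None of these rules are stated in the slides.

```python
from statistics import mean, median
from typing import List, Optional, Tuple


def slow_start_peak(ts2: List[float]) -> int:
    """Index of the first sample that is followed by a lower sample."""
    for i in range(len(ts2) - 1):
        if ts2[i + 1] < ts2[i]:
            return i
    return len(ts2) - 1


def predict(ts2: List[float], flat_tol: float = 0.05,
            window: int = 10) -> Optional[Tuple[str, float]]:
    """Return (mode, predicted throughput), or None if no data remains
    after the initial slow start is discarded."""
    usable = ts2[slow_start_peak(ts2) + 1:]
    if not usable:
        return None
    level = median(usable)
    flat = [x for x in usable if abs(x - level) <= flat_tol * level]
    if len(flat) * 2 >= len(usable):
        return ("RC", mean(flat))                 # average over the flat region
    return ("window", mean(usable[-window:]))     # fall back to recent samples


mode, value = predict([10, 20, 40, 80, 50, 50, 51, 50])
```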

Experiments (1) - Setup

• Download data files from 290 web sites (Debian/Gentoo mirrors)
• Use TCPDUMP to capture the receiver’s traffic
• Record SYN-ACK RTTs
• Include retransmitted packets (0.09%)
• Average file size is 30 MBytes
• 461 traces collected at Univ. of Houston
• Traces are analyzed using Perl scripts

Experiments (2) – Prediction Methods

Prediction methods compared:

• Moving Average (MA): average throughput of the previous 10 RTTs
• Exponential Weighted Moving Average (EWMA)
• Aggregate throughput: average of all past throughput (same as cumulative average), used as the predicted throughput
• TCP Pattern prediction

Average error in predicted future throughput:

$$\text{error} = \frac{\text{predicted throughput} - \text{measured throughput}}{\text{measured throughput}} \times 100\%$$

• Cut off at 100% if over, in case the measured future throughput is very small
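The three baseline predictors and the error measure can be sketched in a few lines. The EWMA smoothing factor `alpha` is not given in the slides, so the default of 0.5 is an assumption, as is the handling of a zero measured throughput in `percent_error`.

```python
from statistics import mean
from typing import List


def moving_average(ts2: List[float], window: int = 10) -> float:
    """Average throughput of the previous `window` RTTs."""
    return mean(ts2[-window:])


def ewma(ts2: List[float], alpha: float = 0.5) -> float:
    """Exponentially weighted moving average; alpha is an assumed value."""
    if not ts2:
        raise ValueError("ts2 must not be empty")
    smoothed = ts2[0]
    for x in ts2[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed


def aggregate(ts2: List[float]) -> float:
    """Cumulative average of all past per-RTT throughput."""
    return mean(ts2)


def percent_error(predicted: float, measured: float, cap: float = 100.0) -> float:
    """Signed error in percent, cut off at `cap` when it is exceeded."""
    if measured <= 0:
        # Assumed convention for a degenerate measurement.
        return cap if predicted > 0 else 0.0
    return min((predicted - measured) / measured * 100.0, cap)


history = [10, 20, 30, 40]
predictions = (moving_average(history, window=2), ewma(history), aggregate(history))
```

With non-negative predictions the signed error cannot fall below -100%, so only the upper cut-off is applied.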

Illustration of Prediction (1)

Make a prediction for next 200 RTTs:

• TCP Throughput Prediction: average throughput of 9~25 RTTs (RC-based prediction)
• Aggregate Throughput Prediction: average throughput of 0~25 RTTs
• TCP Throughput Prediction: using Window-based prediction after the 27th RTT (a significant drop)

[Figure: per-RTT throughput, Aggregate, and TCP Pattern plotted against window size (in RTTs); annotations: peak of slow start, drop at 27th RTT, prediction at 25th RTT, prediction at 40th RTT]

Illustration of Prediction (2)

Make a prediction for next 200 RTTs:

• Average error against the measured future throughput of the next 200 RTTs (for example, at the 20th RTT, the average throughput of RTTs 21~220 is used)
• The closer to 0, the better the prediction.

[Figure: two plots against window size (in RTTs); legends: (1) per-RTT throughput, Aggregate, TCP Pattern; (2) per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]
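The evaluation window can be made concrete with a short sketch: a prediction made at the k-th RTT is compared with the mean throughput of RTTs k+1 through k+200. The function names `future_mean` and `aggregate_error_at` are hypothetical, and the Aggregate predictor is used only as an example.

```python
from statistics import mean
from typing import List, Optional


def future_mean(ts2: List[float], k: int, horizon: int = 200) -> Optional[float]:
    """Mean throughput of RTTs k+1 .. k+horizon (1-based), or None if the
    trace ends at or before the k-th RTT."""
    future = ts2[k:k + horizon]
    return mean(future) if future else None


def aggregate_error_at(ts2: List[float], k: int, horizon: int = 200) -> Optional[float]:
    """Signed percent error of the cumulative average of the first k RTTs
    against the measured future throughput, cut off at 100%."""
    measured = future_mean(ts2, k, horizon)
    if k == 0 or not measured:
        return None
    predicted = mean(ts2[:k])
    return min((predicted - measured) / measured * 100.0, 100.0)


# Synthetic trace in which the i-th RTT has throughput i.
trace = list(range(1, 301))
measured_at_20 = future_mean(trace, 20)   # mean of RTTs 21..220
```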

Illustration of Prediction (3)

Make a prediction for next 200 RTTs:

• Throughput prediction using Congestion-Control based patterns
• Prediction made at 65th RTT using 3 complete CC cycles
• The closer to 0, the better the prediction.

[Figure: two plots; annotation: one complete CC cycle; legends: (1) per-RTT throughput, Aggregate, TCP Pattern; (2) per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]

Results (1) – predict next 200 RTTs at different times

• Aggregate is not accurate for small window sizes (< 30 RTTs)
• MA / EWMA are generally not as accurate

[Figure: three plots; annotation: 30th RTT; legends: (1) per-RTT throughput, Aggregate, TCP Pattern; (2) per-RTT throughput, Moving Average, EWMA, TCP Pattern; (3) per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]

Results (2) – predict at 15th RTT for different times in the future

• When only limited data is available:
    • MA performs best; TCP Pattern is close
    • Aggregate is not accurate

[Figure: three plots; legends: (1) per-RTT throughput, Aggregate, TCP Pattern; (2) per-RTT throughput, Moving Average, EWMA, TCP Pattern; (3) per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]

Results (3) – predict at 25th RTT for different times in the future

• When more data is available:
    • TCP Pattern performs best; MA is close
    • Aggregate performs better

[Figure: three plots; legends: (1) per-RTT throughput, Aggregate, TCP Pattern; (2) per-RTT throughput, Moving Average, EWMA, TCP Pattern; (3) per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]

Results (4) – predict at 50th RTT for different times in the future

• When even more data is available:
    • MA now performs worse, due to the dynamics of TCP flows
    • TCP Pattern is best and Aggregate is close

[Figure: three plots; legends: (1) per-RTT throughput, Aggregate, TCP Pattern; (2) per-RTT throughput, Moving Average, EWMA, TCP Pattern; (3) per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]

Summary of Results

• Aggregate is accurate with sufficient data, but not with only a few RTTs of data
• MA performs very well with a few RTTs of data
• EWMA is not a good predictor
• TCP Pattern generally performs better than or as well as the other methods

Summary of Results (table view)

Methods          Small # of RTTs of data   Large # of RTTs of data
Aggregate        Worse (3)                 Better (2)
Moving Average   Best (1)                  Worse (3)
EWMA             Worst (4)                 Worst (4)
TCP Pattern      Better (2)                Best (1)

Conclusion and Future Work

• TCP-pattern based throughput prediction is as good as or better than other methods.
• Good predictions within 25 RTTs (or ~5 sec).
• Patterns observed: 65% Rate Control, few Congestion Control
• Methods using Aggregate (e.g. NWS) cannot be expected to work well for small test files

What’s next?

• Identify more patterns
• Add a degree of confidence for each prediction
• Multiple TCP streams


That’s all, folks!

Thank You!


Supplement Slides

Characteristics of collected traces (1)

Terms                                    Values       Comments
Number of traces                         461
Downloaded file size                     26-34 MB     Avg: 30 MB
Unique web sites                         290          Debian/Gentoo
Avg # segments per trace                 24,062       (min/max/median) = (17,025/69,866/24,412)
Retransmitted segments                   0.09%        97 out of 461 traces
Avg # retransmitted segments per trace   103.6        (min/max/median) = (0/2,672/4)
Avg SYN-ACK RTT                          0.1696 sec   (min/max/median) = (0.02/2.91/0.155)
Avg # RTTs per trace                     2,589        (min/max/median) = (143/110,673/662)

Characteristics of collected traces (2)

Type                          # traces   %         Comments
Rate Control                  301        65.29%    35 traces (7.59%) have big gaps (> 10 RTTs)
Congestion Avoidance          30         6.51%
Mixed or Congestion Control   130        28.20%    51 traces (11.06%) are very low in volume (up to 8~12 pkts/RTT, vs. ~44 pkts/RTT)
Total                         461        100.00%

• Classification: a trace is assigned to a type when over 50% of the trace presents that type of pattern.

Some Trace Patterns (300 RTTs)

[Figure: sample trace plots; annotation: Under-estimated RTT; 100 RTTs]

Results (0.5) – predict next 100 RTTs at different times

[Figure legend: per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]

Results (1.5) – predict next 400 RTTs at different times

[Figure legend: per-RTT throughput, Aggregate, Moving Average, EWMA, TCP Pattern]

Bandwidth

• Bandwidth: the amount of data that can be pushed through a link in unit time. Usually measured in bits or bytes per second.
• Bottleneck Bandwidth (BB)
• Available Bandwidth (AB)
• Throughput (T)
• T ≤ AB ≤ BB