the power of prediction: cloud bandwidth and cost...

18
The Power of Prediction: Cloud Bandwidth and Cost Reduction EyalZohar Israel Cidon Technion Osnat(Ossi) Mokryn Tel-Aviv College

Upload: dinhhuong

Post on 24-May-2018

218 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

The Power of Prediction:

Cloud Bandwidth and Cost

Reduction

Eyal Zohar Israel Cidon

Technion

Osnat (Ossi) Mokryn

Tel-Aviv College

Page 2: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Traffic Redundancy Elimination (TRE)

Traffic redundancy stems from downloading

same or similar information items.

We found around 70% redundancy in

end-clients traffic, compared with past traffic

and local files.

2SIGCOMM 2011

Page 3: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

TRE Importance

Moving to the cloud => higher e2e traffic.

Cloud users pay for traffic used in practice =>

incentive to use TRE.

Cloud ProviderCloud Provider

Cloud User

Pay for Use

End-user

Application

TRE

Cloud Traffic

3SIGCOMM 2011

Page 4: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

How TRE Works

Chunk 1 Chunk 2 Chunk 3

Byte stream

Anchor 1 Anchor 2 Anchor 3 Anchor 4

Sign. 1 Sign. 2 Sign 3

Rolling hash

SHA-1 signature

Server parses the outgoing stream to content-

based chunks and signs with SHA-1

Chunk 1 Chunk 2’ Chunk 3Insertion example

New bytes

4SIGCOMM 2011

Page 5: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Problems in Existing Solutions

In the cloud environment:

1. High processing costs in the cloud.

2. Scalability – remember each client.

3. Elasticity - unaware of data from other sources.

4. Do not handle long-term repeats (days/weeks).

ReceiverServer 1

Server 2

5SIGCOMM 2011

Page 6: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Our Solution: PACK (Predictive ACK)

Redundancy detection by the client.

Repeats appear in chains.

Tries to match incoming chunks with a

previously received chain or local file.

Sends to the server predictions of the future

data.

6SIGCOMM 2011

Page 7: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

PACK: The Client Prediction

Each prediction:

1.TCP seq. – no server parsing

2.Hint – spare unnecessary SHA-1

3.SHA-1 signature

Chunk

SHA-1

7

Last-byte

hint

TCP seq.

SIGCOMM 2011

Chunk 1 Chunk 2 Chunk 3

Sign. 1 Sign. 2 Sign 3

SHA-1 signature

Stream chunks

Chain of chunks

Received Prediction

Page 8: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

3

PACK: Server Operation

The server compares the hint with the last-byte to sign.

Upon a hint match it performs the expensive SHA-1.

PACK saves cloud’s computational effort in the absence

of redundancy.

First receiver-based TRE: the server does not parse. It

signs with >99% confidence.

ClientServer

Local

storage

1 21 2 3

2,3?

Chain

2,3V

8SIGCOMM 2011

Page 9: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

PACK Benefits

Minimizes processing costs induced by TRE.

– Signs with SHA-1 in the presence of redundancy.

Receiver-based end-to-end TRE => suitable for cloud

server elasticity and client mobility.

– Does not require the server to continuously maintain

clients’ status.

9SIGCOMM 2011

Page 10: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Server Effort Experiment

Several data-sets in 3 modes: baseline no-TRE,

PACK and a sender-based TRE.

0%

20%

40%

60%

80%

100%

120%

140%

0% 10% 20% 30% 40% 50%

Sin

gle

Serv

er

Clo

ud O

pera

tional C

ost

(100%

=w

ithout

TR

E s

yste

m)

Redundancy Elimination Ratio

EndRE-like

PACK

25%-30% redundancy:

common to many

data-sets

10SIGCOMM 2011

Sender-based

Page 11: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

0%

5%

10%

15%

20%

25%

30%

35%

0.0

0.5

1.0

1.5

2.0

2.5

3.0

PA

CK

TR

E (

Rem

oved R

edundancy)

All

YouT

ube T

raff

ic (

Gbps)

Time (24 hours)

YouTube Traffic

PACK TRE

YouTube Redundancy

Traces of 40k clients, captured at an ISP.

Found 30% end-to-end (personal) redundancy.

11SIGCOMM 2011

Page 12: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

0%

10%

20%

30%

40%

50%

60%

70%

80%

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33

Avera

ge R

edundancy o

f D

aily

Tra

ffic

Days Since Start

Unlimited

1 Hour

24 Hours

Social network: eliminated 30% with one hour cache

and 75% with a long-term cache.

12

Long-Term TRE

SIGCOMM 2011

Page 13: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Gmail account with 1,000 Inbox messages.

Found 32% static redundancy (higher when

messages are read multiple times).

Cloud Email Redundancy

0

50

100

150

200

250

300

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

Tra

ffic

Vo

lum

e P

er

Mo

nth

(M

B)

Month

Redundant

Non-redundant

13SIGCOMM 2011

Page 14: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Implementation

Linux with Netfilter Queue, 25k lines of C and

Java, available for download.

Receiver-sender protocol is embedded in the

TCP Options field.

Transparent use at both sides.

14SIGCOMM 2011

Page 15: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Processing Effort in the Client

SIGCOMM 2011 15

Laptop experiment: PACK-related CPU consumption is

~4% when playing HD video (9 Mbps with 30%

redundancy).

Smartphone experiment: PACK consumes ~3% of the

battery power when processing 1 GB video (avg.

monthly data plan).

Virtual traffic saves

the client the need

to chunk or sign.

Page 16: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

New Chunking Algorithm

SIGCOMM 2011 16

Most existing solutions use Rabin fingerprint.

Page 17: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

New Chunking Algorithm64 bits

nn-1

n-2n-3

n-4

n-5n-6

n-7n-8

n-40n-41

n-42n-43

n-44n-45

n-46n-47

Mask=00 00 8A 31 10 58 30 80

SIGCOMM 2011 17

Page 18: The Power of Prediction: Cloud Bandwidth and Cost Reductionconferences.sigcomm.org/sigcomm/2011/slides/s86.pdf · Traffic Redundancy Elimination (TRE) Traffic redundancy stems from

Summary

Current TRE solutions may not reduce cloud cost.

PACK is the first receiver-based TRE – leverages the

power of prediction.

Minimizes processing costs induced by TRE.

Suitable for cloud server migration and client mobility.

Implementation is available for download.

18SIGCOMM 2011