Submitting: Barak Pinhas 037584042, Gil Fiss 031801731, Laurent Levy 921012076


Post on 21-Dec-2015


Caching by proxy servers

• Since its introduction in 1990, the World-Wide Web has evolved from a simple client-server model into a complex distributed architecture.

• This evolution has been driven largely by the scaling problems associated with exponential growth.

• One of the core infrastructure components employed to meet the demands of this growth is caching by proxy servers.

• A proxy server is an intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients.

• The proxy’s cache stores cacheable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests.

A proxy:

Proxy cache capacity

• The proxy cache must utilize a clean-up algorithm to determine which documents to throw away when its capacity limit is reached.

• When the proxy’s free space drops to a certain limit, a number of web pages are cleared from its cache.

Proxy cluster

(Diagram: user, director, Proxy #1, Proxy #2, ..., Proxy #n, web server.)

1. The user requests a web page.

2. The director selects a proxy to handle the request.

3. The proxy checks whether it has the requested page.

4. If the requested web page is not in the proxy’s cache, the proxy searches the neighboring proxies for the web page; if the web page is not found in the neighboring proxies, the proxy takes the page from the web server.
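The lookup order in steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Proxy` class, the `handle` method, the `fetch_from_origin` callback, and the choice to keep a local copy after a cluster hit are all assumptions made for the example, and cache capacity is ignored.

```python
from typing import Callable, Dict, List, Tuple


class Proxy:
    """Hypothetical proxy with an unbounded in-memory cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, bytes] = {}
        self.neighbors: List["Proxy"] = []

    def handle(
        self, url: str, fetch_from_origin: Callable[[str], bytes]
    ) -> Tuple[bytes, str]:
        # Step 3: check the proxy's own cache.
        if url in self.cache:
            return self.cache[url], "local hit"
        # Step 4, first part: search the neighboring proxies.
        for neighbor in self.neighbors:
            if url in neighbor.cache:
                page = neighbor.cache[url]
                self.cache[url] = page  # assumption: keep a local copy
                return page, "cluster hit"
        # Step 4, second part: take the page from the web server.
        page = fetch_from_origin(url)
        self.cache[url] = page
        return page, "miss"


if __name__ == "__main__":
    p1, p2 = Proxy("Proxy #1"), Proxy("Proxy #2")
    p1.neighbors, p2.neighbors = [p2], [p1]
    origin = lambda url: b"page for " + url.encode()
    for proxy in (p1, p2, p2):
        print(proxy.name, proxy.handle("/index.html", origin)[1])
```

The three outcomes correspond to the local hit, cluster hit, and miss cases that the later results distinguish.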

Proxy cache capacity

Methods to determine the web pages that would be cleared:

• LFU aging

Least Frequently Used aging. The document which has been requested least frequently is cleared from the cache.

• LRU aging

Least Recently Used aging. The document which has been requested least recently is cleared from the cache.

• Perfect LFU

Perfect Least Frequently Used aging. As in LFU, the document which has been requested least frequently is cleared from the cache, but the system remembers the usage frequency record of each document after that document was erased.
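A minimal sketch of the three clean-up methods, assuming a cache whose capacity is counted in documents rather than bytes. The `AgingCache` class, its `request` method, and the tie-breaking rule (least recently used among equally frequent documents) are illustrative assumptions, not details given in the slides.

```python
from collections import Counter, OrderedDict
from typing import Callable, Tuple


class AgingCache:
    """Hypothetical cache; `policy` is 'LRU', 'LFU' or 'PLFU'."""

    def __init__(self, capacity: int, policy: str) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if policy not in ("LRU", "LFU", "PLFU"):
            raise ValueError("unknown policy: " + policy)
        self.capacity = capacity
        self.policy = policy
        # Cached documents, ordered from least to most recently used.
        self.docs: "OrderedDict[str, bytes]" = OrderedDict()
        # Request count per URL.
        self.freq: Counter = Counter()

    def request(self, url: str, fetch: Callable[[str], bytes]) -> Tuple[bytes, bool]:
        """Return (document, was_hit), fetching and caching on a miss."""
        self.freq[url] += 1
        if url in self.docs:
            self.docs.move_to_end(url)  # mark as most recently used
            return self.docs[url], True
        body = fetch(url)
        if len(self.docs) >= self.capacity:
            self._evict()
        self.docs[url] = body
        return body, False

    def _evict(self) -> None:
        if self.policy == "LRU":
            victim = next(iter(self.docs))  # least recently used
        else:
            # Least frequently used; min() keeps the first (oldest) on ties.
            victim = min(self.docs, key=lambda u: self.freq[u])
        del self.docs[victim]
        if self.policy == "LFU":
            # Plain LFU forgets the count of an erased document;
            # perfect LFU (PLFU) keeps it.
            del self.freq[victim]
```

The only difference between the LFU and PLFU branches is whether the request count survives eviction, which matters once an erased document is requested again.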

Project’s aims and measurements

Proxy servers that cache web pages can potentially reduce:

• The number of requests that arrive at the web servers (the load on web servers in units of web pages).

• The volume of network traffic resulting from document requests (the load on web servers in units of bytes).

• The latency that an end user experiences in retrieving a document.
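The first two reductions are measured later as an object hit ratio and a byte hit ratio. A minimal sketch of how the two could be computed from a request log, assuming each log record is a hypothetical `(size_in_bytes, was_hit)` pair:

```python
from typing import Iterable, Tuple


def hit_ratios(log: Iterable[Tuple[int, bool]]) -> Tuple[float, float]:
    """Return (object hit ratio, byte hit ratio) for a request log."""
    requests = hits = total_bytes = hit_bytes = 0
    for size, was_hit in log:
        requests += 1
        total_bytes += size
        if was_hit:
            hits += 1
            hit_bytes += size
    # Guard against an empty log or a log of zero-length documents.
    object_ratio = hits / requests if requests else 0.0
    byte_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return object_ratio, byte_ratio
```

The two ratios can diverge: a cache that serves many small documents from memory but misses a few large ones has a high object hit ratio and a low byte hit ratio.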

Selection methods for the Director

1. RRM - Round Robin selection method

When a page request arrives, the director selects the proxy that is after the last selected proxy in a predetermined fixed order.
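A minimal sketch of round-robin selection. The class name and the 0-based proxy indices are assumptions made for the example.

```python
class RoundRobinDirector:
    """Hypothetical director that cycles through proxies 0..n-1 in order."""

    def __init__(self, num_proxies: int) -> None:
        if num_proxies < 1:
            raise ValueError("need at least one proxy")
        self.num_proxies = num_proxies
        # Start "before" proxy 0 so that the first selection is proxy 0.
        self._last = num_proxies - 1

    def select(self) -> int:
        # Pick the proxy that follows the last selected one, wrapping around.
        self._last = (self._last + 1) % self.num_proxies
        return self._last
```

The selection ignores the requested URL, so repeated requests for the same page generally land on different proxies.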

2. Hash function

The hash key is the URL: when a new request arrives with a given URL, the request goes to proxy number hash(URL).

(Diagram: the director applies the hash function to the URL www.cs.technion.ac.il, obtains index k, and passes the request to Proxy #k among Proxy #1 ... Proxy #n.)


• The Hash function is:

Hash(Url Number) = <sum of the Url Number digits> mod <number of proxies>
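The formula can be written directly as a small function. This sketch assumes each URL has already been mapped to an integer "Url Number" (the slides do not say how) and that the result is used as a 0-based proxy index; the function name is hypothetical.

```python
def hash_select(url_number: int, num_proxies: int) -> int:
    """Sum of the decimal digits of url_number, modulo the number of proxies."""
    if num_proxies < 1:
        raise ValueError("need at least one proxy")
    digit_sum = sum(int(digit) for digit in str(abs(url_number)))
    return digit_sum % num_proxies
```

For example, Url Number 1234 with 7 proxies has digit sum 10, and 10 mod 7 = 3, so the request goes to the proxy at index 3. The same URL always maps to the same proxy, but digit sums are not uniformly distributed, so some proxies may receive more URLs than others.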


3. Unfinished job

The director evaluates the number of jobs that each proxy is currently taking care of and gives the request to the proxy with the fewest jobs at the time of the request.

(Diagram: number of tasks over time for Proxy #1, Proxy #2 and Proxy #3. At the time of the request, proxy #1 has 1 task, proxy #2 has 2 tasks, and proxy #3 has 2 tasks. The director selects the proxy with the fewest tasks.)
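A minimal sketch of the unfinished-job selection, using the task counts from the diagram. The function name and the tie-breaking rule (lowest proxy number wins) are assumptions; the slides do not say how ties are resolved.

```python
from typing import Dict


def select_least_tasks(active_tasks: Dict[int, int]) -> int:
    """Return the proxy number with the fewest unfinished tasks.

    `active_tasks` maps proxy number -> tasks in progress at request time.
    """
    if not active_tasks:
        raise ValueError("no proxies available")
    # Iterating in sorted order makes min() prefer the lowest proxy number on ties.
    return min(sorted(active_tasks), key=lambda proxy: active_tasks[proxy])


if __name__ == "__main__":
    print(select_least_tasks({1: 1, 2: 2, 3: 2}))
```

With 1, 2 and 2 tasks on proxies #1, #2 and #3, the function returns proxy 1.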

4. PLFU – perfect LFU

In this method each proxy has a PLFU value. (In the PLFU method the system remembers the usage frequency record of each request after that request was erased.)

Proxies with a PLFU value higher than a certain threshold are ignored.

5. BLFU – byte least frequently used

In this method each proxy has a BLFU value. (In the PLFU method the system remembers the usage frequency record of each request after that request was erased.)

Proxies with a BLFU value higher than a certain threshold are ignored.
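The slides do not say how the PLFU or BLFU value of a proxy is computed, nor how the director chooses among the proxies that remain. The sketch below therefore takes the per-proxy values as input and assumes a round-robin pass that skips proxies above the threshold, since the expected-results section groups PLFU/BLFU with the round-robin oriented methods. The class name, the 0-based indices, and the fallback when every proxy exceeds the threshold are all assumptions.

```python
from typing import Sequence


class ThresholdDirector:
    """Hypothetical round-robin director that ignores proxies above a threshold."""

    def __init__(self, num_proxies: int, threshold: float) -> None:
        if num_proxies < 1:
            raise ValueError("need at least one proxy")
        self.num_proxies = num_proxies
        self.threshold = threshold
        self._last = num_proxies - 1  # so that the first candidate is proxy 0

    def select(self, values: Sequence[float]) -> int:
        """`values[i]` is the current PLFU or BLFU value of proxy i."""
        if len(values) != self.num_proxies:
            raise ValueError("one value per proxy is required")
        # Walk the proxies in round-robin order, skipping those above the threshold.
        for step in range(1, self.num_proxies + 1):
            candidate = (self._last + step) % self.num_proxies
            if values[candidate] <= self.threshold:
                self._last = candidate
                return candidate
        # Assumption: if every proxy is above the threshold, use plain round robin.
        self._last = (self._last + 1) % self.num_proxies
        return self._last
```

The same selector serves both methods; only the meaning of the supplied values changes.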

Expected results

• Object hit ratio

• We expected the round-robin oriented selection methods (RRM, TASKS and PLFU/BLFU) to get a lower local hit ratio than the hash selection method, mainly because with the hash method a cluster hit always comes with a local hit. This is also why a hash function that is not even might lead to a lower cluster hit rate.

• Among the round-robin oriented selection methods, we expected the “even load distribution” methods like TASKS and PLFU/BLFU to get a better hit ratio than plain RRM, because we expected these methods to give an even distribution and therefore less pruning and a more adaptive cluster (a better cluster hit ratio).

• Byte hit ratio

• We expected the methods that take request sizes into consideration to get a better byte hit ratio.

• Load distribution

• We expected the methods that strive for a better distribution (like RRM, TASKS and PLFU/BLFU) to get a better load distribution than the hash method.

• We expected the following ranking of results, best first:

• BLFU/PLFU - because of the “perfect” technique used.

• TASK - because of the load distribution.

• RRM - because of a simple distribution.

• HASH – because the hash keying might not be even.

(Chart: Proxy Requests Distribution. Vertical axis: Proxy Requests Ratio, 0 to 0.8. Horizontal axis: Proxy Number, 1 to 7. Series: test #1 to test #5.)

The results

In this graph we see a significant advantage for the hash selection method. The average hit rate for the hash selection method is about 62%-65%. A disadvantage of this method, however, is that the load distribution is not even. All the other methods produced a hit rate of about 50%.

Comparing the director selection methods

(Chart: Proxy Requests Distribution. Vertical axis: Proxy Requests Ratio, 0 to 0.8. Horizontal axis: Proxy Number, 1 to 7. Series: test #1 to test #11.)

Hit rate

As we can see in the graph, the best hit rate was obtained with hash as the director method and LFU, LRU and PLFU as the proxies’ pruning method. In the next place we see the task method for the director with PLFU as the proxy pruning method.

Comparing all the possible director selection and proxy pruning methods

(Chart: Proxy Requests Bytes Distribution. Vertical axis: Proxy Requests Bytes Ratio, 0 to 0.7. Horizontal axis: Proxy Number, 1 to 7. Series: test #1 to test #11.)

Byte hit rate

In this graph we can see again that the hash method for the director got the best results. Compared with the hit rate graph, the hash method’s result differs more from the other methods’ results in the byte hit rate.


(Chart: Load Distribution. Vertical axis: Proxy Requests Bytes, 16000000000 to 23000000000. Horizontal axis: Proxy Number, 1 to 6. Series: test #1 to test #11.)

Load distribution

In this graph we see that there is no even distribution among the proxies. Proxy #1 got more requests than the other proxies when the hash method was chosen for the director, and proxy #5 got more requests when the tasks method was chosen for the director.
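A minimal sketch of how per-proxy load shares like those plotted in the distribution charts could be computed. The input format, a list of hypothetical `(proxy_number, size_in_bytes)` records with one record per request, is an assumption made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def load_shares(
    assignments: Iterable[Tuple[int, int]]
) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Return (share of requests, share of bytes) per proxy number."""
    request_count: Counter = Counter()
    byte_count: Counter = Counter()
    for proxy, size in assignments:
        request_count[proxy] += 1
        byte_count[proxy] += size
    total_requests = sum(request_count.values())
    total_bytes = sum(byte_count.values())
    request_share = (
        {p: n / total_requests for p, n in request_count.items()}
        if total_requests else {}
    )
    byte_share = (
        {p: b / total_bytes for p, b in byte_count.items()}
        if total_bytes else {}
    )
    return request_share, byte_share
```

With a perfectly even distribution over n proxies, every share equals 1/n; the further the largest share is from that value, the less even the load.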
