Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers over InfiniBand
K. Vaidyanathan, P. Balaji, H.-W. Jin, D. K. Panda
Network-Based Computing Laboratory, Department of Computer Science and Engineering, The Ohio State University


Post on 22-Mar-2016


TRANSCRIPT

Page 1: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers over InfiniBand

K. Vaidyanathan, P. Balaji, H.-W. Jin, D. K. Panda

Network-Based Computing Laboratory
Department of Computer Science and Engineering
The Ohio State University

Page 2: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Presentation Outline

• Introduction and Background
• Characterization of local and network-based file systems
• Multi File System for Data-Centers
• Experimental Results
• Conclusions

Page 3: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Introduction

• Exponential growth of the Internet
  – Primary means of electronic interaction
  – Online book-stores, World-cup scores, Stock markets
  – Ex. Google, Amazon, etc.
• Highly Scalable and Available Web-Services
• Performance is critical for such Services
• Utilizing Clusters for Web-Services? [shah01]
  – High Performance-to-cost ratio
  – Has been proposed by Industry and Research Environments

[shah01]: CSP: A Novel System Architecture for Scalable Internet and Communication Services. H. V. Shah, D. B. Minturn, A. Foong, G. L. McAlpine, R. S. Madukkarumukumana and G. J. Regnier. In USITS 2001.

Page 4: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Cluster-Based Data-Centers

• Nodes are logically partitioned
  – Each partition provides specific services (serving static and dynamic content)
  – Nodes use high-speed interconnects like InfiniBand, Myrinet, etc.
• Requests get forwarded through multiple tiers
• Replication of content on all nodes

[Figure: multi-tier data-center. Labels: Clients, WAN, Proxy Server, Web Server (Apache), Application Server (PHP), Database Server (MySQL), Storage]

Page 5: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Shared Cluster-Based Data-Centers

• Hosting several unrelated services on a single data-center
  – Currently used by several ISPs and Web Service Providers (IBM, HP)
• Replication of content
  – Amount of data replicated increases linearly with the number of web-sites hosted

[Figure: shared data-center. Labels: Clients, WAN, Proxy Server, Web Server, Application Server, Database Server, Storage; Website A, Website B, Website C; node groups each marked "ABC"]

Page 6: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Issues in Shared Cluster-Based Data-Centers

• File System Caches being shared across multiple web-sites
• Under-utilization of aggregate cache of all nodes
• Web-site Content
  – Replication of content on all nodes if we use a local file system
  – Need to fetch the document via the network if we use a network file system; however, no replication is required
• Can we adapt the file system to avoid these?

Page 7: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

File System Interactions

[Figure: file system interactions. Labels: Proxy Server, Web Server, Application Server, Database Server, Local file system (three instances), SAN (two instances), Network-based File Systems; legend: Data-Center Interaction, File System Interaction]

Page 8: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Existing File Systems

• Network-based File System: Parallel Virtual File System (PVFS) and Lustre (supports client-side caching)

• Local File System: ext3fs and memory file system (ramfs)

[Figure: existing file systems. Labels: compute node (four instances), SAN, Web Server, Local file system, Metadata Manager, I/O (OST) Node (two instances), MetaData, Data, Client-side Cache, Server-side Cache]

Page 9: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Presentation Outline

• Introduction and Background
• Characterization of local and network-based file systems
• Multi File System for Data-Centers
• Experimental Analysis
• Conclusions

Page 10: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Characterization of local and network-based File Systems

• Network Traffic Requirements
• Aggregate Cache
• Cache Pollution Effects

Page 11: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Network Traffic Requirements

• Absolute Network Traffic generated
  – Static Content
  – Dynamic Content
• Network Utilization
  – Large/Small burst (static or dynamic content)
• Overhead of Metadata Operations

Page 12: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Aggregate Cache in Data-Centers

• Local File Systems use only a single node's cache
  – Small files get huge benefits if they are in memory; otherwise, we pay the penalty of accessing the disk
  – Large files may not fit in memory and also incur high penalties when accessing the disk
• Network File Systems use the aggregate cache from all nodes
  – Large files, if striped, can reside in the file system cache on multiple nodes
  – Small files also benefit from the aggregate cache
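The striping argument above can be sketched in a few lines of Python. This is only an illustration, assuming simple round-robin striping; the 64 KB stripe unit, the node counts, and the function names `stripe_layout` and `fits_in_cache` are assumptions made for the example and do not come from the slides.

```python
from collections import Counter


def stripe_layout(file_size, stripe_size, num_io_nodes):
    """Return the number of bytes each I/O node holds under round-robin striping."""
    per_node = Counter()
    offset = 0
    stripe_index = 0
    while offset < file_size:
        # The last stripe may be shorter than stripe_size.
        chunk = min(stripe_size, file_size - offset)
        per_node[stripe_index % num_io_nodes] += chunk
        offset += chunk
        stripe_index += 1
    return [per_node[node] for node in range(num_io_nodes)]


def fits_in_cache(file_size, cache_per_node, num_io_nodes, stripe_size=64 * 1024):
    """True if every node's share of the file fits in that node's cache.

    stripe_size defaults to 64 KB purely as an assumption for this sketch.
    """
    layout = stripe_layout(file_size, stripe_size, num_io_nodes)
    return max(layout) <= cache_per_node


if __name__ == "__main__":
    four_mb = 4 * 1024 * 1024
    two_mb = 2 * 1024 * 1024
    print(fits_in_cache(four_mb, two_mb, num_io_nodes=1))  # single node's cache
    print(fits_in_cache(four_mb, two_mb, num_io_nodes=4))  # aggregate cache of 4 nodes
```

With these hypothetical numbers, a 4 MB file does not fit in one node's 2 MB cache, but it does fit once it is striped over four nodes, which is the effect the slide attributes to the aggregate cache.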

Page 13: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Cache Pollution Effects

• Working set: frequently accessed documents; usually fits in memory
• Shared Data-Centers
  – Multiple web-sites share the file system cache; each web-site has less file system cache to utilize
  – Bursts of requests/accesses to one web-site may result in cache pollution
  – May result in a drastic drop in the number of cache hits
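A toy simulation can make the pollution effect concrete. The sketch below assumes a strict LRU cache shared by two hypothetical web-sites, A and B; real file system caches use more elaborate replacement policies, and the capacity, working-set size, and burst length are made-up values chosen only to show the effect.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache that only tracks which documents are resident."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert key and evict the LRU entry if full."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        self.entries[key] = None
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return False


def hit_rate(cache, requests):
    """Fraction of requests served from the cache."""
    hits = sum(cache.access(request) for request in requests)
    return hits / len(requests)


def site_a_hit_rate(with_burst):
    """Hit rate of site A's working set, optionally after a burst to site B."""
    cache = LRUCache(capacity=100)               # assumed shared cache size
    working_set = [("A", i) for i in range(80)]  # site A's working set fits in the cache
    for doc in working_set:                      # warm the cache with site A's documents
        cache.access(doc)
    if with_burst:
        for i in range(200):                     # burst of distinct documents from site B
            cache.access(("B", i))
    return hit_rate(cache, working_set)


if __name__ == "__main__":
    print("site A hit rate, no burst:  ", site_a_hit_rate(with_burst=False))
    print("site A hit rate, after burst:", site_a_hit_rate(with_burst=True))
```

Under these assumptions, site A is served entirely from the cache when it is alone, and misses on every document of its working set right after the burst to site B.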

Page 14: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Presentation Outline

• Introduction and Background
• Characterization of local and network-based file systems
• Multi File System for Data-Centers
• Experimental Results
• Conclusions

Page 15: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Multi File System for Data-Centers

Characterization             ext3fs   ramfs   pvfs           lustre
Network traffic generated    Min      Min     More traffic   Min
Use of aggregate cache       No       No      Yes            Yes
Cache pollution effects      Yes      No      Yes            Yes
Metadata overhead            No       No      Yes            Yes

Page 16: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Multi File System for Data-Centers

• A combination of file systems for different environments

• Memory file system and local file system (ext3fs) for workloads with high temporal locality

• Memory file system and network file system (pvfs/lustre) for workloads with low temporal locality
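One possible reading of this combination is a simple placement policy, sketched below. The rule that small files go to the memory file system, the 64 KB threshold, and the function name `choose_file_system` are assumptions for illustration; the slides do not specify how files are assigned to each file system.

```python
def choose_file_system(file_size_bytes, high_temporal_locality,
                       small_file_threshold=64 * 1024):
    """Pick a file system for one file under a hypothetical placement policy.

    Assumption: small files are placed on the memory file system (ramfs);
    the remaining files go to the local file system when the workload has
    high temporal locality and to a network file system otherwise.
    """
    if file_size_bytes <= small_file_threshold:
        return "ramfs"
    if high_temporal_locality:
        return "ext3fs"
    return "pvfs"  # lustre would fill the same role in this sketch


if __name__ == "__main__":
    for size in (4 * 1024, 1024 * 1024):
        for locality in (True, False):
            print(size, locality, choose_file_system(size, locality))
```

A real deployment would also have to bound how much memory ramfs may consume, which this sketch ignores.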

Page 17: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Presentation Outline

• Introduction and Background
• Characterization of local and network-based file systems with data-centers
• Multi File System for Data-Centers
• Experimental Results
• Conclusions

Page 18: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Experimental Test-bed

• Cluster 1 with:
  – 8 SuperMicro SUPER X5DL8-GG nodes; Dual Intel Xeon 3.0 GHz processors
  – 512 KB L2 Cache, 2 GB memory; PCI-X 64 bit 133 MHz
• Cluster 2 with:
  – 8 SuperMicro SUPER P4DL6 nodes; Dual Intel Xeon 2.4 GHz processors
  – 512 KB L2 Cache, 512 MB memory; PCI-X 64 bit 133 MHz
• Mellanox MT23108 Dual Port 4x HCAs; MT43132 24-port switch
• Apache 2.0.48 Web and PHP 4.3.7 Servers; MySQL 4.0.12, PVFS 1.6.2, Lustre 1.0.4

Page 19: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Workloads

• Zipf workloads: the relative probability of a request for the i-th most popular document is proportional to 1/i^α, with α ≤ 1
  – High temporal locality (constant α)
  – Low temporal locality (varying α)
• TPC-W traces according to the specifications

Class     File Sizes    Size
Class 0   1K – 250K     25 MB
Class 1   1K – 1MB      100 MB
Class 2   1K – 4MB      450 MB
Class 3   1K – 16MB     2 GB
Class 4   1K – 64MB     6 GB
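A Zipf-like request stream of this kind can be generated with a short Python sketch. It assumes the exponent is the α that labels the later result slides; the document count, the seed, and the function names are illustrative choices, not the generator used for the experiments.

```python
import random


def zipf_weights(num_docs, alpha):
    """Unnormalized popularity of documents ranked 1..num_docs: 1 / i**alpha."""
    return [1.0 / (rank ** alpha) for rank in range(1, num_docs + 1)]


def zipf_requests(num_docs, alpha, num_requests, seed=0):
    """Draw document ranks (1 = most popular) from the Zipf-like distribution."""
    rng = random.Random(seed)
    ranks = range(1, num_docs + 1)
    return rng.choices(ranks, weights=zipf_weights(num_docs, alpha), k=num_requests)


def top_share(num_docs, alpha, top_k):
    """Fraction of all requests expected to hit the top_k most popular documents."""
    weights = zipf_weights(num_docs, alpha)
    return sum(weights[:top_k]) / sum(weights)


if __name__ == "__main__":
    for alpha in (0.8, 0.5, 0.3):
        share = top_share(num_docs=1000, alpha=alpha, top_k=10)
        print(f"alpha={alpha}: top 10 of 1000 documents receive {share:.1%} of requests")
```

A larger α concentrates requests on fewer documents (higher temporal locality), while a smaller α spreads them out, which is why lowering α stresses the cache more.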

Page 20: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Experimental Analysis (Outline)

• Basic Performance of different file systems
• Network Traffic Requirements
• Impact of Aggregate Cache
• Cache Pollution Effects
• Multi File System for Data-Centers

Page 21: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Basic Performance

• Network File Systems incur high overhead for metadata operations (open() and close())
• Lustre supports client-side cache
• For large files, a network-based file system does better than a local file system due to striping of the file

Latency (usecs)           ext3fs          ramfs           pvfs            lustre
                          4K     1M       4K     1M       4K     1M       4K     1M
Open & Close overhead     6      6        6      6        1060   1060     876    876
Read Latency (cache)      4      1602     4      1578     680    13825    7.7    1998
Read Latency (no cache)   1500   76312    1400   2379     9600   44108    3000   50713
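To see how much the metadata overhead weighs on small files, the table values can be combined in a short calculation. The sketch assumes that serving one file costs one open/close pair plus one cached read, which is a simplification introduced here; the latencies are copied from the table, and the dictionary layout and function names are made up for the example.

```python
# Latencies in microseconds, copied from the table above.
LATENCY_USECS = {
    "ext3fs": {"open_close": 6,    "read_cached": {"4K": 4,   "1M": 1602}},
    "ramfs":  {"open_close": 6,    "read_cached": {"4K": 4,   "1M": 1578}},
    "pvfs":   {"open_close": 1060, "read_cached": {"4K": 680, "1M": 13825}},
    "lustre": {"open_close": 876,  "read_cached": {"4K": 7.7, "1M": 1998}},
}


def request_cost(fs, size):
    """Cost of one file access, assumed to be open/close overhead plus one cached read."""
    entry = LATENCY_USECS[fs]
    return entry["open_close"] + entry["read_cached"][size]


def metadata_fraction(fs, size):
    """Share of the per-file cost spent in open() and close()."""
    return LATENCY_USECS[fs]["open_close"] / request_cost(fs, size)


if __name__ == "__main__":
    for fs in LATENCY_USECS:
        for size in ("4K", "1M"):
            print(f"{fs:7s} {size}: {request_cost(fs, size):9.1f} usecs, "
                  f"{metadata_fraction(fs, size):.0%} in open/close")
```

Under this assumption, open() and close() account for nearly all of the cost of a cached 4K read on Lustre, which fits the later observation that reduced open() and close() overhead is one benefit of placing small files on the memory file system.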

Page 22: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Network Traffic Requirements

[Figure: number of packets sent/received (y-axis, 0 to 800000) for ext3fs, pvfs, and lustre; one panel for Zipf Class 0 to Class 3, one panel for TPCW Class 0 to Class 3]

• Absolute Network Traffic Generated:
  – For PVFS, the traffic increases proportionally compared to the local file system
  – For Lustre, the traffic is close to that of the local file system
  – For dynamic content, the network traffic does not increase with an increase in database size

Page 23: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Impact of Caching and Metadata operations

• Local File Systems are better for workloads with high temporal locality
• Surprisingly, Lustre performs comparably to local file systems

[Figure: TPS for ext3fs, ramfs, pvfs, and lustre; one panel for Zipf Class 0 to Class 3 (y-axis 0 to 14000), one panel for TPCW Class 0 to Class 3 (y-axis 0 to 250)]

Page 24: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Impact of Aggregate Cache

[Figure: TPS (y-axis, 0 to 100) for ext3fs, pvfs, and lustre under a workload with varying temporal locality; x-axis α = 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.5, 0.4, 0.3]

• Aggregate Cache improves data-center performance for network-based file systems

Page 25: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Cache Pollution Effects in Shared Data-Centers

• Small Workloads: web-sites are not affected
• Large Workloads: cache pollution affects multiple web-sites
• Placing files on a memory file system might avoid the cache pollution effects

[Figure: percentage of cached/non-cached content (y-axis, 0% to 100%) for Single and Shared configurations, Zipf Class 0 to Class 4; series: NonCached, Cached]

Page 26: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Multi File System Data-Centers

• Performance benefit for static content is close to 48%
• Performance benefit for dynamic content is close to 41%

[Figure: performance improvement (y-axis, 0% to 60%) under Low, Medium, and Heavy Load for Zipf Class 0, Class 1, and Class 2]

[Figure: performance improvement (y-axis, 0% to 50%) under Low, Medium, and Heavy Load for TPCW Class 0, Class 1, and Class 2]

Page 27: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Multi File System Data-Centers

• Benefits are twofold:
  – Avoidance of Cache Pollution
  – Reduced overhead of open() and close() operations for small files

[Figure: TPS (y-axis, 0 to 20) for pvfs and pvfs with ramfs under a workload with varying temporal locality; x-axis α = 0.75, 0.65, 0.55, 0.45]

Page 28: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Conclusions & Future Work

• Fragmentation of resources in shared data-centers
  – Under-utilization of file system cache in clusters
  – Cache Pollution affects performance
• Studied the impact of file systems in terms of network traffic, aggregate cache and cache pollution effects
• Proposed a Multi File System approach to utilize the benefits from each file system
  – Combination of Network and Memory File System for static content with low temporal locality
  – Memory File System and local file system for static content with high temporal locality and dynamic content
• Propose to perform dynamic reconfiguration based on each node's memory cache and provide prioritization and QoS

Page 29: Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers  over InfiniBand

Web Pointers

http://www.cse.ohio-state.edu/~panda
http://nowlab.cse.ohio-state.edu

{vaidyana,balaji,jinhy,panda}@cse.ohio-state.edu

NOWLAB