Ceph vs Local Storage for Virtual Machine, 26th March 2015, HEPiX Spring 2015, Oxford. Alexander Dibbo, George Ryall, Ian Collier, Andrew Lahiff, Frazer Barnsley


Page 1:

Ceph vs Local Storage for Virtual Machine

26th March 2015, HEPiX Spring 2015, Oxford

Alexander Dibbo, George Ryall, Ian Collier, Andrew Lahiff, Frazer Barnsley

Page 2:

Background
• As presented earlier, we have developed a cloud based on OpenNebula and backed by Ceph storage
• We have:
– 28 hypervisors (32 threads, 128GB RAM, 2TB disk, 10Gb networking)
– 30 storage nodes (8 threads, 8GB RAM, 8 x 4TB disks, 10Gb front and backend networks)
– The OpenNebula headnode is virtual
– The Ceph monitors are virtual

Page 3:

What Are We Testing?
• The performance of virtual machines on local storage (hypervisor local disk) vs Ceph RBD storage
• How quick machines are to deploy and to become usable in different configurations
• How quickly management tasks (e.g. live migration) can be performed

Page 4:

What are we trying to find out?

• The performance characteristics of virtual machines on each type of storage

• How agile we can be with machines on each type of storage

Page 5:

Test Setup
• Virtual machine: our SL6 image, 1 CPU, 4GB of RAM, a 10GB OS disk and a 50GB sparse disk for data
• 4 different configurations:
– OS on Ceph, Data on Ceph
– OS Local, Data Local
– OS on Ceph, Data Local
– OS Local, Data on Ceph
• 3 VMs of each configuration spread across the cloud, for a total of 12 VMs (enumerated in the sketch after this list)
• The cloud is very lightly used as it is still being commissioned
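To make the combinations concrete, here is a minimal sketch (Python, purely illustrative) of the 4-configuration x 3-VM test matrix; the VM naming scheme is an assumption, not something taken from the slides.

```python
# Illustrative sketch of the test matrix described above: four OS/data
# placement combinations, three VMs of each, twelve VMs in total.
# The VM names below are hypothetical; they do not come from the slides.
from itertools import product

PLACEMENTS = ["ceph", "local"]   # where a disk can live
VMS_PER_CONFIG = 3               # three VMs of each configuration

def build_test_matrix():
    """Return a list of (vm_name, os_disk_placement, data_disk_placement)."""
    matrix = []
    for os_disk, data_disk in product(PLACEMENTS, repeat=2):
        for i in range(1, VMS_PER_CONFIG + 1):
            name = f"bench-os-{os_disk}-data-{data_disk}-{i}"
            matrix.append((name, os_disk, data_disk))
    return matrix

if __name__ == "__main__":
    vms = build_test_matrix()
    assert len(vms) == 12        # 4 configurations x 3 VMs
    for vm in vms:
        print(vm)
```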

Page 6:

How Are We Testing?
• Pending to Running (time to deploy to the hypervisor)
• Running to usable (how long to boot)
• Pending to usable (the total of the above)
– This is what users care about (a rough timing sketch follows this list)
• Live migration time
• IOZone single-thread tests (Read, ReRead, Write, ReWrite)
– 6GB on the OS disk
– 24GB on the data disk
– 3 VMs of each configuration throughout our cloud; 20 instances of each test per VM
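A minimal sketch of how these launch timings could be captured, assuming the standard OpenNebula `onevm` CLI is available and that "usable" is approximated by SSH accepting a connection; the state string checked and the SSH criterion are assumptions, not details from the slides.

```python
# Rough sketch of timing the launch phases of one VM.  Assumptions (not from
# the slides): `onevm show <id>` output contains the string "RUNNING" once the
# VM is deployed, and "usable" is approximated by port 22 accepting a TCP
# connection.
import socket
import subprocess
import time

def wait_for(predicate, poll=2.0):
    """Poll `predicate` until it returns True; return the elapsed seconds."""
    start = time.time()
    while not predicate():
        time.sleep(poll)
    return time.time() - start

def vm_is_running(vm_id):
    out = subprocess.run(["onevm", "show", str(vm_id)],
                         capture_output=True, text=True).stdout
    return "RUNNING" in out          # assumed state string

def ssh_is_up(host, port=22, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def time_launch(vm_id, host):
    pending_to_running = wait_for(lambda: vm_is_running(vm_id))
    running_to_usable = wait_for(lambda: ssh_is_up(host))
    return pending_to_running, running_to_usable
```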

Page 7:

How Are We Testing?
• IOZone aggregate test: 12 threads, equal split of mixed Read and Write (Read, ReRead, Write, ReWrite); an invocation sketch follows this list
– 0.5GB per thread on the OS disk (6GB total)
– 2GB per thread on the data disk (24GB total)
– 3 VMs of each configuration throughout our cloud; 20 instances of each test per VM (240 data points)
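A minimal sketch of how the single-thread and 12-thread IOZone runs described above could be driven, assuming the standard iozone3 command-line options (-i to select tests, -s/-r for file and record size, -t/-F for throughput mode). The file paths and the 1m record size are assumptions, and selecting tests with -i 0 -i 1 only approximates the "equal split mixed" workload on the slide.

```python
# Sketch of driving the IOZone runs described above.  Paths, the 1m record
# size, and the temp-file naming are assumptions; the flags are the standard
# iozone3 ones (-i 0 write/rewrite, -i 1 read/reread, -s file size,
# -r record size, -t threads, -F per-thread file names).
import subprocess

def single_thread_run(target_file, size):
    """One single-thread write/rewrite/read/reread pass on `target_file`."""
    subprocess.run(["iozone", "-i", "0", "-i", "1",
                    "-s", size, "-r", "1m",
                    "-f", target_file], check=True)

def throughput_run(target_dir, per_thread_size, threads=12):
    """One 12-thread throughput pass, one temporary file per thread."""
    files = [f"{target_dir}/iozone.{n}" for n in range(threads)]
    subprocess.run(["iozone", "-i", "0", "-i", "1",
                    "-s", per_thread_size, "-r", "1m",
                    "-t", str(threads), "-F", *files], check=True)

if __name__ == "__main__":
    for _ in range(20):                              # 20 instances per VM
        single_thread_run("/testfile", "6g")         # 6GB on the OS disk
        single_thread_run("/data/testfile", "24g")   # 24GB on the data disk
        throughput_run("/", "512m")                  # 0.5GB/thread, 6GB total
        throughput_run("/data", "2g")                # 2GB/thread, 24GB total
```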

Page 8:

Results: Launch Tests
[Bar chart: time in seconds (0 to 1200) for each configuration (OS-Ceph/Data-Ceph, OS-Local/Data-Local, OS-Ceph/Data-Local, OS-Local/Data-Ceph); series: Pending to Running, Running to Usable, Time to Usable, Live Migration Time.]

Page 9:

Results: Launch Tests (Log Scaled)
[Same bar chart with a log-scaled time axis (1 to 10,000 seconds); series: Pending to Running, Running to Usable, Time to Usable, Live Migration Time.]

Page 10:

Results: IOZone Single Thread Tests Read/ReRead

Page 11:

Results: IOZone Single Thread Tests Read/ReRead (Log Scaled)

Page 12:

Results: IOZone Single Thread Tests Write/ReWrite
[Bar chart: throughput in KB/s (0 to 160,000) for each configuration; series: OS Write, OS ReWrite, Data Write, Data ReWrite.]

Page 13:

Results: IOZone Single Thread Tests Write/ReWrite (Log Scaled)
[Same bar chart with a log-scaled throughput axis (1 to 1,000,000 KB/s); series: OS Write, OS ReWrite, Data Write, Data ReWrite.]

Page 14:

Results: IOZone Multi Thread Tests Read/ReRead
[Bar chart: throughput in KB/s (0 to 1,400,000) for each configuration; series: OS Read, OS ReRead, Data Read, Data ReRead.]

Page 15:

Results: IOZone Multi Thread Tests Read/ReRead (Log Scaled)
[Same bar chart with a log-scaled throughput axis (1 to 10,000,000 KB/s); series: OS Read, OS ReRead, Data Read, Data ReRead.]

Page 16:

Results: IOZone Multi Thread Tests Write/ReWrite
[Bar chart: throughput in KB/s (0 to 140,000) for each configuration; series: OS Write, OS ReWrite, Data Write, Data ReWrite.]

Page 17:

Results: IOZone Multi Thread Tests Write/ReWrite (Log Scaled)
[Same bar chart with a log-scaled throughput axis (1 to 1,000,000 KB/s); series: OS Write, OS ReWrite, Data Write, Data ReWrite.]

Page 18:

Conclusions
• Local disk wins for single-threaded read operations (such as booting the virtual machine)
• Ceph wins for single-threaded write operations (large sequential writes)
• Ceph wins for both reads and writes in multi-threaded operations

Page 19:

Why is this?
• Local disks have a very limited maximum throughput
• Because RBD stripes data across the Ceph cluster, the bottleneck here is the NIC on the hypervisor
– In this case the NICs are 10Gb, so getting equivalent performance locally would require a large RAID set in each hypervisor (rough arithmetic below)
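As a back-of-the-envelope check (the per-disk figure below is an assumed typical streaming rate for a spinning disk, not a number from these tests): a 10Gb/s NIC tops out around 1.25 GB/s, so matching it locally would need on the order of ten disks in a stripe.

```python
# Back-of-the-envelope comparison of the hypervisor NIC ceiling with local
# disk throughput.  The 150 MB/s per-disk figure is an assumption, not a
# measurement from these tests.
NIC_GBPS = 10                        # hypervisor NIC, gigabits per second
nic_mb_per_s = NIC_GBPS * 1000 / 8   # ~1250 MB/s theoretical ceiling

disk_mb_per_s = 150                  # assumed single-disk streaming rate
disks_needed = nic_mb_per_s / disk_mb_per_s

print(f"NIC ceiling : ~{nic_mb_per_s:.0f} MB/s")
print(f"Disks needed: ~{disks_needed:.1f} striped disks to match the NIC")
```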

Page 20:

Further Work

• Test when the cloud is under more load
• Test using micro-kernel VMs such as µCernVM
• Test larger data sets

Page 21:

A Minor Issue
During the testing run we noticed that one of the storage nodes had dropped out of use. After some investigation we found this (photo on the slide). The testing, and the cloud as a whole, didn't skip a beat.

Page 22:

Any Questions?

Email: [email protected]