open cloud consortium: an update (04-23-10, v9)
DESCRIPTION
This is an overview of the Open Cloud Consortium that I gave at Cloud Lab '10 on April 23, 2010.
TRANSCRIPT
Open Cloud Consortium: An Update
Robert Grossman
Open Cloud Consortium
April 23, 2010
www.opencloudconsortium.org
Part 1.
Overview of the Open Cloud Consortium (OCC)
501(c)(3) not-for-profit corporation.
Supports the development of standards, interoperability frameworks, and reference implementations.
Manages testbeds: Open Cloud Testbed and Intercloud Testbed.
Manages cloud computing infrastructure to support scientific research: Open Science Data Cloud.
Develops benchmarks.
OCC Members
Companies: Aerospace, Booz Allen Hamilton, Cisco, InfoBlox, Open Data Group, Raytheon, Yahoo
Universities: CalIT2, Johns Hopkins, MIT Lincoln Lab, Northwestern Univ., University of Illinois at Chicago, University of Chicago
Government agencies: NASA
Open Source Projects: Sector Project
OCC Working Groups
1. Large Data Cloud Working Group
2. Open Cloud Testbed Working Group
3. Intercloud Testbed Working Group
4. Open Science Data Cloud Working Group
[Figure: cloud services stack. Layers: IaaS, PaaS, Apps. Components: Storage Services, Compute Services, Applications, Virtual Network Manager, Data Services, Network Transport, Virtual Machine Manager, Metadata Services, Identity Manager.]
Part 2. Intercloud Testbed
[Figure: Cloud 1 and Cloud 2]
We have several cloud standards…
Infrastructure as a Service
– Virtual Data Centers (VDC)
– Virtual Networks (VN)
– Virtual Machines (VM)
Platform as a Service
– Cloud Compute Services
– Data/Table Cloud Services
– Cloud Storage Services
Open Virtualization Format (OVF)
Open Cloud Computing Interface (OCCI)
SNIA Cloud Data Management Interface (CDMI)
Large Data Cloud Interoperability Framework
Where are the Gaps?
Infrastructure as a Service
– Virtual Data Centers (VDC)
– Virtual Networks (VN)
– Virtual Machines (VM)
– Physical Resources
Platform as a Service
– Cloud Compute Services
– Data as a Service
Open Virtualization Format (OVF)
Open Cloud Computing Interface (OCCI)
SNIA Cloud Data Management Interface (CDMI)
Large Data Cloud Interoperability Framework
Naming entities in IaaS & PaaS
Bridging IaaS & DaaS
Services that span multiple VMs, ….
Bridging the Gaps… A Small Step
Infrastructure as a Service
– Virtual Data Centers (VDC)
– Virtual Networks (VN)
– Virtual Machines (VM)
– Physical Resources
Platform as a Service
– Cloud Compute Services
– Data as a Service
Open Virtualization Format (OVF)
Open Cloud Computing Interface (OCCI)
SNIA Cloud Data Management Interface (CDMI)
Large Data Cloud Interoperability Framework
Metadata service linking IaaS and DaaS
Metadata service naming and linking entities in the IaaS layers
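To make the idea of a metadata service that names and links entities more concrete, here is a minimal in-memory sketch. It is not an OCC specification or API: the class, the naming scheme ("vdc:...", "daas:...") and the relation names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataService:
    """Toy registry (hypothetical): names entities and records links between them."""
    entities: dict = field(default_factory=dict)  # name -> {"layer": ..., "kind": ...}
    links: set = field(default_factory=set)       # (source, relation, target) triples

    def register(self, name, layer, kind):
        # Each entity gets exactly one name; duplicates are rejected.
        if name in self.entities:
            raise ValueError(f"name already registered: {name}")
        self.entities[name] = {"layer": layer, "kind": kind}

    def link(self, source, relation, target):
        # Links may only connect entities that have already been named.
        for name in (source, target):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.links.add((source, relation, target))

    def linked_from(self, source, relation=None):
        # All targets reachable from `source`, optionally filtered by relation.
        return sorted(
            t for s, r, t in self.links
            if s == source and (relation is None or r == relation)
        )


svc = MetadataService()
# Hypothetical names for a virtual network, a VM inside it, and a DaaS table.
svc.register("vdc:example/vn1", layer="IaaS", kind="VN")
svc.register("vdc:example/vn1/vm1", layer="IaaS", kind="VM")
svc.register("daas:example/table1", layer="DaaS", kind="table")

svc.link("vdc:example/vn1", "contains", "vdc:example/vn1/vm1")  # link within IaaS
svc.link("vdc:example/vn1/vm1", "reads", "daas:example/table1")  # link IaaS to DaaS

print(svc.linked_from("vdc:example/vn1/vm1"))
```

The same registry covers both cases on the slide: links between entities inside the IaaS layers, and links from an IaaS entity to a DaaS entity.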
Part 3. Large Data Cloud Working Group
Standards for integrating and interoperating large data cloud services such as those provided by Hadoop and similar systems.
Focus of Working Group
Cloud Storage Services
Cloud Compute Services (MapReduce, UDF, & other programming frameworks)
Table-based Data Services
Relational-like Data Services
[Figure: multiple applications (App) on top of these services]
Developing APIs for this framework.
Benchmarks for Large Data Clouds
Until recently, the only benchmark used was Terasort (sorting 10 billion 100-byte records).
Replaced by Gray Sort and Minute Sort.
Gray Sort tries to maximize TB/min sorted on 100 TB or more of data.
Hadoop holds the current Gray Sort and Minute Sort records.
Problem: sort is just one of the types of workload for analytic applications.
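The arithmetic behind these sort benchmarks can be sketched in a few lines. The record count and record size come from the slide; the run time used in the example is a made-up figure for illustration, not a reported result, and the function names are hypothetical.

```python
RECORD_BYTES = 100                 # record size from the Terasort description
TERASORT_RECORDS = 10_000_000_000  # 10 billion records
TB = 10**12                        # decimal terabyte, in bytes


def dataset_tb(records, record_bytes=RECORD_BYTES):
    """Size of a dataset of fixed-size records, in TB."""
    return records * record_bytes / TB


def sort_rate_tb_per_min(data_tb, elapsed_seconds):
    """Gray Sort style metric: TB sorted per minute."""
    return data_tb / (elapsed_seconds / 60)


terasort_tb = dataset_tb(TERASORT_RECORDS)
print(terasort_tb)  # 10 billion x 100 bytes = 1 TB

# Hypothetical run: 100 TB sorted in 200 minutes (illustrative numbers only).
hypothetical_rate = sort_rate_tb_per_min(100, 200 * 60)
print(hypothetical_rate)
```

A single TB/min figure like this measures sorting only, which is the limitation the last bullet points out.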
MalStone
MalGen – generates synthetic data with realistic distributions.
MalStone A & B – “stylized” computations that can be used as benchmarks for architectures, software and systems for large data clouds.
Open source and available at malgen.googlecode.com
Part 4. Open Cloud Testbed
Condominium Clouds
In a condominium cloud, you buy your own rack or bunch of racks.
The racks are managed and operated by the condominium association, in this case the OCC.
If your rack is 120 TB, you get the rights to c. 40 TB of storage in the cloud. The rest is a shared resource.
The Open Cloud Testbed is a condo cloud managed by the OCC.
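As a rough worked example of the condominium split: the 120 TB to c. 40 TB example suggests an owner share of about one third. That fraction is inferred from the example, not a stated OCC policy, and the function name is hypothetical.

```python
from fractions import Fraction

# Assumption: owner share of about 1/3, inferred from the 120 TB -> c. 40 TB example.
ASSUMED_OWNER_FRACTION = Fraction(1, 3)


def condo_allocation(rack_tb, owner_fraction=ASSUMED_OWNER_FRACTION):
    """Split a contributed rack into (owner's rights, shared pool), both in TB."""
    owner_tb = rack_tb * owner_fraction
    shared_tb = rack_tb - owner_tb
    return owner_tb, shared_tb


owner_tb, shared_tb = condo_allocation(120)
print(f"owner: {owner_tb} TB, shared: {shared_tb} TB")
```

Under that assumed fraction, a 120 TB rack gives its owner rights to 40 TB and adds 80 TB to the shared resource.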
Open Cloud Testbed
Phase 2: 9 racks, 250+ nodes, 1000+ cores, 10+ Gb/s
Phase 3 (2011) – we will stand up some 100 Gb/s links.
[Figure: Open Cloud Testbed map. Networks: MREN, CENIC, Dragon, C-Wave. Software: Hadoop, Sector/Sphere, Thrift, KVM VMs, Eucalyptus VMs.]
Part 5. Open Science Data Cloud Working Group
Open Science Data Cloud
Astronomical data
Biological data (Bionimbus)
Networking data
Provide a long term home for selected scientific data sets and support elastic cloud-based analysis & integration of the data.
Part 6. Image Processing for Disaster Relief Using Elastic Clouds
The Challenge
When a disaster strikes, there is usually an immediate and critical need for computing power to process images.
For example, there was a delay getting current images of Haiti to non-governmental organizations (NGOs) after the earthquake on January 12, 2010.
The Idea… The OCC Elastic Cloud for Disaster Relief
Set up a permanent elastic cloud that is available to assist with disaster relief.
Establish connections to sources of images that can be enabled at times of need.
Set up a network of volunteers with accounts on the cloud and knowledge of the tools that can swarm when needed.
Use as a test of large data cloud standards and interoperability.
Image Processing on Large Data Clouds
Data parallel applications
– Parallelism is often required at file or directory level
– Data locality is important
– Parallel disk IO is also critical
Requirements
– The input data size can be 10+ TB per day
– Want to integrate with open source libraries such as OSSIM
Distributed File Systems & Image Processing
Sector is broadly similar to the Hadoop Distributed File System
Main differences
– Hadoop directly implements a distributed block-based file system
– Sector is a layer over a native file system
Sector does not split files
– A single image will not be split, so when it is being processed, the application does not need to read the data from other nodes via the network
– A directory can be kept together on a single node as well, as an option
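The difference between block-based and whole-file placement can be illustrated with a toy model. This is only a sketch: the round-robin and hash-based node choices below are assumptions made for illustration, and do not describe how Hadoop or Sector actually select nodes. All function and node names are hypothetical.

```python
import posixpath
import zlib


def block_placement(file_size, block_size, nodes):
    """Block-based layout: cut the file into fixed-size blocks and spread
    them over nodes (round-robin here, purely as an illustrative assumption)."""
    n_blocks = -(-file_size // block_size)  # ceiling division
    return [nodes[i % len(nodes)] for i in range(n_blocks)]


def whole_file_placement(path, nodes, keep_directory_together=False):
    """File-based layout: the whole file lives on one node. With the option
    set, every file in the same directory maps to the same node."""
    key = posixpath.dirname(path) if keep_directory_together else path
    return nodes[zlib.crc32(key.encode()) % len(nodes)]


nodes = ["node1", "node2", "node3"]  # hypothetical node names

# A 300 MB image with 128 MB blocks ends up on three different nodes,
# so processing the image means reading blocks over the network.
print(block_placement(300, 128, nodes))

# With whole-file placement the image is read from a single node, and
# optionally all images in the directory share that node.
a = whole_file_placement("/images/haiti/a.tif", nodes, keep_directory_together=True)
b = whole_file_placement("/images/haiti/b.tif", nodes, keep_directory_together=True)
print(a, b)
```

In this model, an image-processing task scheduled on the node that holds the file needs no remote reads, which is the data-locality point made above.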
Get Involved…
… Join our volunteer effort.
Part 7. Virtual Networks
How Long Does It Take…
… To Move A Cloud Application Spanning Multiple VMs Between Clouds?
… To Add A New Rack to a Cloud Service?
… To Add Another Public Cloud to A Private/Public Cloud?
We Have Several Ways of Defining Virtual Networks….
VN-Link
VLAN
VPNs
BGP
MPLS
OpenFlow
Open vSwitch
vSwitch
CloudSwitch
But No Vendor Neutral VN Standard That
Scales to 100,000+ VMs
Is supported by multiple vendors
Spans multiple physical switches
Supports VN mobility
Provides strong isolation of VNs
Is easy for VMs to join and leave VNs
Includes management interfaces
….
For More Information