Enabling the Autonomic Data Center with a Smart Bare-Metal Server Platform

Arzhan Kinzhalin, Rodolfo Kohn, Ricardo Morin, David Lombard

Software and Services Group

6th International Conference on Autonomic Computing, Barcelona, Spain, June 17, 2009

Software and Services Group

6th IEEE International Conference on Autonomic Computing

Barcelona, Spain June 17, 2009 Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries.*Other brands and names are the property of their respective owners.

Agenda

•Motivation

•The requirements

•The solution

•The value for autonomic computing

•Summary

The Motivation

•Data centers consist of tens or even hundreds of thousands of commodity servers

−Modern applications that data centers run are designed to scale out and thus require dynamic allocation of resources

•The Data Centers employ management software to discover, query, provision, configure, allocate and de-allocate resources

•Nevertheless, an automation gap remains, referred to as the time-zero problem

Time-zero Problem Illustrated

A general-purpose BIOS forces clusters, enterprise grids, or clouds, which have well-defined but limited operational modes, to be operated as PCs in racks instead of as a robust, scalable, integrated entity.

[Figure: PCs in racks vs. a robust, scalable, integrated entity]

Time-zero Problem and PXE

•PXE is widely used to address the time-zero problem

•It allows booting an arbitrary OS image

−Normal management layer takes it from there

−Management tools are proprietary

•PXE is static

−Bound to the MAC address, normally configured manually

−Dynamic provisioning capabilities are minimal

•PXE is unreliable and not scalable

−Uses UDP and TFTP

Understanding the Industry Needs

Data Centers need a platform that is

•Based on standards

−DMTF, networking

•Scalable

−Routable protocols, low overhead

•Reliable

−Discovery and transport

•Secure

−Authentication and authorization

•Available at time zero

−Take it out of the box, plug it in, turn it on

Proposed solution

Use the technologies available on modern Intel platforms to expose the server as a manageable entity, thus enabling intelligent hardware and software configuration, provisioning, and management

Data Center Model

Simplified node roles

•Resource Manager owns and manages the datacenter server repository

•Directory Agent is the SLP DA

•Server Configuration Manager is the policy enforcement server

•Bare-metal servers are the commodity server units

The architecture

[Figure: layered architecture diagram. Labels recovered from the diagram:]

•Hardware: ME, ME Firmware, NPTM

•EFI BIOS: FWT Driver, Firmware Tools API, PXE++

•EFI-bootable OS (transient) or Domain0 (persistent); Virtual Machine Monitor (present only in persistent mode)

•Standardized Runtime Environment: WBEM interfaces for discovery and access

•Capabilities: Capability Inventory, Power Management, Platform Configuration, Better Boot, Firmware Management, Workload Characterization, Custom/Additional Capability

•Infrastructure Elements: Deployment (deployment tools)

•Development Toolkit: compiler, binutils, standardized runtime, developer's docs

The features

•Runs at time zero on a bare-metal platform

−All the features are readily available out of the box at power-on

•Does not require human intervention once it’s plugged in

−Automatic discovery, configuration, and provisioning

•Uses reliable protocols

−Both application- and transport-level

•Extensible

−Based on the standard CIM model

−Easy to develop and deploy custom providers

•Leverages modern DC infrastructure

−Uses stable, widely accepted technologies

−Enables smart policy-based resource management

Technology Overview

WBEM-compliant set of protocols and technologies

•Service Location Protocol (SLP)

−RFC 2608

−Reliable discovery protocol

•Common Information Model (CIM) Schema

−DMTF standard representation of manageable resources

−Has its own XML representation (CIM-XML) as well as bindings to WS-Management

−Extensible

•Security

−SSL/TLS transport-level security
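
To make the CIM-XML representation mentioned above more concrete, here is a minimal Python sketch of what an EnumerateInstances request body for `CIM_Processor` might look like. The slides do not show any payloads, so the element layout follows the DMTF CIM-XML conventions as commonly documented, and the `root/cimv2` namespace, the message ID, and the function name `build_enumerate_instances` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_enumerate_instances(class_name: str,
                              namespace: str = "root/cimv2",
                              message_id: str = "1001") -> str:
    """Build a CIM-XML EnumerateInstances request body (illustrative only)."""
    cim = ET.Element("CIM", CIMVERSION="2.0", DTDVERSION="2.0")
    message = ET.SubElement(cim, "MESSAGE", ID=message_id, PROTOCOLVERSION="1.0")
    simple_req = ET.SubElement(message, "SIMPLEREQ")
    call = ET.SubElement(simple_req, "IMETHODCALL", NAME="EnumerateInstances")

    # The namespace path is encoded as one NAMESPACE element per component.
    path = ET.SubElement(call, "LOCALNAMESPACEPATH")
    for part in namespace.split("/"):
        ET.SubElement(path, "NAMESPACE", NAME=part)

    # The class to enumerate is passed as the ClassName input parameter.
    param = ET.SubElement(call, "IPARAMVALUE", NAME="ClassName")
    ET.SubElement(param, "CLASSNAME", NAME=class_name)

    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(cim, encoding="unicode")


if __name__ == "__main__":
    print(build_enumerate_instances("CIM_Processor"))
```

In a real deployment this body would be sent over HTTP(S) to the CIM broker; the transport and its headers are left out here to keep the sketch self-contained.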

Patagonia Lake: Proof-of-Concept Implementation

[Figure: Patagonia Lake flow diagram. Participants: SLP Directory Agent, WBEM Resource Manager (developed by any ISV), bare-metal server. Numbered annotations recovered from the diagram:]

1-Subscribes to SLP DA to receive notifications of new WBEM services

2-Subscribe for WBEM services

3-Upon initiation, EFI BIOS passes control to Patagonia Lake in pre-boot. SLP SA and SFCBD with special providers are running

7-Notification of new WBEM services

11-Continues booting the existing OS (or the provisioned one)

Unnumbered annotations:

•SLP DA receives WBEM service registration and notifies all registered applications

•Configures the discovered server in the pre-boot environment and tells the server to continue booting

Message Sequence Diagram

[Figure: message sequence diagram]

Participants: Manageable Server, Directory Agent, Resource Manager, DHCP Server

Components on the manageable server: CIM broker, CPU configuration provider, boot control provider

Messages, in the order recovered from the diagram:

•Request IP address

•Assign IP address

•Start CIM broker

•SLP: register WBEM service

•Subscribe to new registration events

•Notify on new service registration

•CIM-XML: request server configuration

•CIM-XML: report server configuration

•CIM-XML: configure CPU knobs

•CIM-XML: success

•CIM-XML: boot to the production OS

•CIM-XML: success
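
The message sequence above can be sketched as a small in-memory simulation in Python. This is not the Patagonia Lake code: the class names, the knob names, the `service:wbem:https://...:5989` URL, and the example address are all assumptions, and plain method calls stand in for the SLP and CIM-XML traffic.

```python
from typing import Callable, Dict, List, Tuple


class DirectoryAgent:
    """Stand-in for the SLP DA: stores registrations, notifies subscribers."""

    def __init__(self) -> None:
        self.subscribers: List[Callable[[str], None]] = []
        self.services: List[str] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self.subscribers.append(callback)

    def register(self, service_url: str) -> None:
        self.services.append(service_url)
        for notify in self.subscribers:
            notify(service_url)


class ManageableServer:
    """Stand-in for the pre-boot runtime with two hypothetical providers."""

    def __init__(self) -> None:
        self.cpu_knobs: Dict[str, bool] = {
            "hardware_prefetcher": True,
            "adjacent_sector_prefetcher": True,
        }
        self.booted = False

    def get_configuration(self) -> Dict[str, bool]:
        return dict(self.cpu_knobs)

    def configure_cpu(self, knobs: Dict[str, bool]) -> str:
        unknown = set(knobs) - set(self.cpu_knobs)
        if unknown:
            raise KeyError(f"unknown CPU knobs: {sorted(unknown)}")
        self.cpu_knobs.update(knobs)
        return "success"

    def boot_production_os(self) -> str:
        self.booted = True
        return "success"


class ResourceManager:
    """Applies a CPU policy to every newly registered server, then boots it."""

    def __init__(self, network: Dict[str, ManageableServer],
                 policy: Dict[str, bool]) -> None:
        self.network = network  # maps service URL to server (replaces the wire)
        self.policy = policy
        self.log: List[str] = []

    def on_new_service(self, service_url: str) -> None:
        server = self.network[service_url]
        self.log.append(f"notified: {service_url}")
        self.log.append(f"configuration before: {server.get_configuration()}")
        self.log.append(f"configure CPU knobs: {server.configure_cpu(self.policy)}")
        self.log.append(f"configuration after: {server.get_configuration()}")
        self.log.append(f"boot production OS: {server.boot_production_os()}")


def run_sequence() -> Tuple[ManageableServer, ResourceManager]:
    agent = DirectoryAgent()
    network: Dict[str, ManageableServer] = {}
    manager = ResourceManager(network, policy={"hardware_prefetcher": False})
    agent.subscribe(manager.on_new_service)   # subscribe to registration events

    server = ManageableServer()
    url = "service:wbem:https://192.0.2.10:5989"  # assumed URL and address
    network[url] = server
    agent.register(url)                       # triggers the notification
    return server, manager


if __name__ == "__main__":
    _, rm = run_sequence()
    print("\n".join(rm.log))
```

The point of the sketch is the ordering: the Resource Manager subscribes first, so a server that registers at power-on is configured and booted without any manual step.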

Proof-of-Concept Overview

•CIM Broker

−SBLIM-SFCB, a lightweight, low-footprint implementation

•CIM providers

−CIM_Processor extension to expose CPU configuration

>Adjacent Sector pre-fetcher, aka Second Sector pre-fetcher

>Hardware pre-fetcher

−CIM_BootControl

>Controls the boot sequence

•SLP

−OpenSLP, client integrated into SFCB and stand-alone SA

•Linux* kernel + uClibc + busybox for the runtime

All in 1.28MB!

The Enabling Technologies

•Intel® Rapid Boot Toolkit (IRBT)

−UEFI-compliant BIOS

>Certain features irrelevant to the server market removed

>E.g. video, UI, waiting for user input

>Freed-up space can be used to place payloads

−Payloads

>These are EFI applications

>One of them is Linux*

>IRBT 1.0 leaves 1.28MB for the payload

•kexec mechanism

−Replaces the running kernel with another one

−We used it for fast boot

>No hard reset required
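
As a rough illustration of the kexec step, the Python sketch below builds the two `kexec` command lines typically used to load a new kernel and then jump into it. It only constructs the argument lists and does not execute anything; the kernel path, initrd path, and kernel command line are placeholders, and the slides do not say how the proof of concept actually invoked kexec.

```python
import shlex
from typing import List, Optional


def kexec_commands(kernel: str,
                   initrd: Optional[str] = None,
                   cmdline: Optional[str] = None) -> List[List[str]]:
    """Return the argv lists for loading a kernel and then executing it."""
    load = ["kexec", "--load", kernel]
    if initrd:
        load.append(f"--initrd={initrd}")
    if cmdline:
        load.append(f"--append={cmdline}")
    # The second command switches to the loaded kernel without a hard reset.
    return [load, ["kexec", "--exec"]]


if __name__ == "__main__":
    # Placeholder paths; a real runtime would take these from provisioning data.
    for argv in kexec_commands("/boot/vmlinuz-production",
                               initrd="/boot/initrd-production.img",
                               cmdline="root=/dev/sda1 ro"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the commands as data makes it easy to log or review them before a privileged helper runs them.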

Enabling Autonomous Data Center

•The runtime environment presented is the first step and one of the possible enabling technologies for the future smart hardware platforms

•The solution enables intelligent Resource Managers that

−Discover newly plugged-in servers

−Create a capability inventory of each server

−Make intelligent allocation and provisioning decisions

•There are many applications

>Dynamic inventory

>Power-reduction mechanisms

>Hardware and software configuration

>Provisioning

>Reliable network boot
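
The discover, inventory, and allocate steps listed above could be sketched as follows. This is a hypothetical Python example, not part of the presented platform: the `ServerRecord` fields, the `Inventory` class, and the best-fit policy (pick the smallest free server that satisfies the request) are assumptions chosen only to show how a capability inventory can drive an allocation decision.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ServerRecord:
    """Capabilities a Resource Manager might record for a discovered server."""
    name: str
    cores: int
    memory_gb: int
    allocated_to: Optional[str] = None


class Inventory:
    def __init__(self) -> None:
        self.servers: Dict[str, ServerRecord] = {}

    def discover(self, record: ServerRecord) -> None:
        # Called when a newly plugged-in server announces itself.
        self.servers[record.name] = record

    def allocate(self, workload: str, min_cores: int,
                 min_memory_gb: int) -> Optional[ServerRecord]:
        """Best-fit policy: smallest free server that meets the request."""
        candidates = [
            s for s in self.servers.values()
            if s.allocated_to is None
            and s.cores >= min_cores
            and s.memory_gb >= min_memory_gb
        ]
        if not candidates:
            return None
        best = min(candidates, key=lambda s: (s.cores, s.memory_gb, s.name))
        best.allocated_to = workload
        return best


if __name__ == "__main__":
    inventory = Inventory()
    inventory.discover(ServerRecord("node-a", cores=8, memory_gb=32))
    inventory.discover(ServerRecord("node-b", cores=16, memory_gb=64))
    chosen = inventory.allocate("web", min_cores=6, min_memory_gb=16)
    print(chosen.name if chosen else "no server available")
```

A real Resource Manager would fill such records from the CIM providers on each server rather than from hard-coded values.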

Example application: rack-level power management

[Figure: rack-level power management diagram. Labels recovered from the diagram:]

•Rack: Headnode (TBD); Intel Smart Platform (≥12); Message Fabric (IB); Mgt Fabric (Ethernet)

•RC (Menlow…): 0-U, tucked away in top, side, or bottom of rack; display shows real-time power utilization

•RC detail: Menlow; Moblin; Runtime/Libraries; DMTF CIM; CIM Broker; SLP Discovery; WS Mgmt; WS Events; Security; Field-level CIM Providers; Ethernet/IP, Serial, (other options?)

Future work

•Larger flash spaces will enable new features

•Security

−WS-Security

−Transport

•WS-Management

−via WS-CIM binding

•Persistent VMM

•Production deployments

−HPC for starters

Conclusion

•We present a smart server platform which enables extensible representation of the server identity and can be used as the vehicle for autonomic computing use cases

•It provides a better solution for the pre-boot environment than existing ones such as PXE

•The proof-of-concept demonstrates how CPU configuration can be made scalable and driven by policies

Thank you!