autonomic computing in action - university of st andrews lecture3.pdf · autonomic computing in...

24
Autonomic Computing © 2004 IBM Corporation Autonomic Computing in Action Ric Telford Director, Strategy and Technology Autonomic Computing April 14, 2004

Upload: ngohanh

Post on 25-Mar-2018

218 views

Category:

Documents


2 download

TRANSCRIPT

Autonomic Computing

© 2004 IBM Corporation

Autonomic Computing in Action

Ric TelfordDirector, Strategy and TechnologyAutonomic ComputingApril 14, 2004

2

Autonomic Computing

© 2004 IBM Corporation

Agenda

�Putting the pieces together

�Autonomic Computing system examples

�Building a self-healing system

�A self-configuring system

�Autonomic Computing research

�Self-configuring devices

�Self-optimizing web servers

�Utility functions

�Autonomic Computing Storage

�Ethnographic Studies

3

Autonomic Computing

© 2004 IBM Corporation

AMKS

TP

Registries

Knowledge Services

Putting the pieces together

EM

A P

K

Symptom

Action Plan

CBEs

MR

MR

CommandsLogs

Configuration Files

EventsAPIs

Sensor Effector

MR

MR

CommandsLogs

Configuration Files

EventsAPIs

Sensor Effector

EM

A P

K

Symptom

Action Plan

CBEs

When autonomic

managers are introduced

The AMs must register

with the fabric.

This knowledge must be

passed to the ‘fabric’. AND

the touchpoints must

register themselves with the

fabric.

Now the system is ready to

run! CBEs can flow, MAPE

loops can make decisions,

resources can be managed

CBEsCBEs

Execution Workflow Execution Workflow

Initially, there is the ‘fabric’.

The ‘services’ in the fabric

must register themselves

with the Fabric Registries

Managed resource prototype

Policies

Symptoms

Treatable cause

Configuration

Dependency Data

Managed resource prototype

Policies

Symptoms

Treatable cause

Configuration

Dependency Data

} {

When managed resources

are introduced

There is knowledge that is

created about each

managed resource.

Knowledge Creation

Knowledge about MRs,

symptoms, etc. must

either be passed to the

AMs

OR it must be made

available via k-services

�Symptom: something

that indicates the

existence of something

else (from

webster.com)

AMs and TPs must talk to

fabric registries to find the

appropriate

manager/touchpointES ES

CBEs are created by MRs

CBEs are passed to the fabric,

which uses the deliver-

notification interaction style to

deliver CBEs to AMs

The ‘M’ function correlates CBEs

into Symptoms

Symptoms are passed to ‘A’

‘A’ creates an action and passes

that to ‘P’

‘P ’creates a plan and passes that

to ‘E’

‘E’ executes on the plan, using the

fabric and the perform operation

interaction style

4

Autonomic Computing

© 2004 IBM Corporation

� Disparate pieces and parts

� Tools focused on individual products

� No common interfaces

among tools

� No synergies in building tools OR in creating log entries

Problem Determination todayProblem Determination tomorrow

� Generic log adapter

� Common format for

log files

� Common set of tools

� Common interfaces

among tools

common base event

Ad

ap

ters

Ad

ap

ters

Common Base Eventsubmitted to OASIS

Common Base Eventsubmitted to OASIS

Database

Networks

ApplicationServer

Servers

Storage devices

Applications

5

Autonomic Computing

© 2004 IBM Corporation

Ad

ap

ters

Autonomic computing self-healing systems

Feedback

Data

Polic

ies

Autonomic

manager

Knowledge

Policy

engine

Symptom

ApplicationServer

Servers

Storage devices

Database

Networks

Applications

Ad

ap

ters

Common Base Event

6

Autonomic Computing

© 2004 IBM Corporation

IBM Global Services

©2003 IBM Corporation

In a basic problem bypass and resolution service flow, most tasks and information exchanges are performed manually

Customer

External Service

Problem Manager

Problem Analyst

Problem Queue Manager

Automated Tools

Customer-Detected Problem

Resolved Problem

Report Problem Status and Trends

Incident Management

Incident Management

Change Management

Configuration Management

Manage Problem

Resolution

Bypass Problem

Analyze Problem

Identify Problem

7

Autonomic Computing

© 2004 IBM Corporation

EM

A P

K

Symptom

Action Plan

Self-healing Vision

SymptomsSymptoms

Managed

Resource

Properties

Symptom 1

Component: WAS

Situation: DEPENDENCY NOT MET

Correlation: MATCH

Treatable Cause: Path configured wrong

Symptom: Symptom 1

Action : CONFIG PATH

Identify

Problem

Manage

Problem

Resolution

Analyze

Problem

In order for AM to operate:

- Knowledge must be created about the

symptoms and treatable causes.

- Knowledge about the MR, like what sensors and

effectors it has must be created

- Policies must be created based on customer

goals

- All this knowledge must be passed into the AM

Managed Resource : WAS

Sensors Implemented:

SendEvent

Effectors Implemented

CheckPath

ConfigurePath

CBEs Generated

Dependency

…..

Knowledge CreationManaged

Resource

Prototype

Scope: WAS Server

Condition:

8am to 6PM

BV: 100

GOAL: 99.5 % Availability

PoliciesPolicies

When an error occurs:

Monitor takes in CBEs

Monitor uses correlation technology to identify Symptoms

Monitor passes symptoms to Analyze

Analyze uses Decision Tree (or symptom service) to determine Treatable Cause

Analyze determines best corrective action and passes that to Plan

Plan creates a plan to enact

Execute takes the actions. ES

CBEs

MR

MR

CommandsLogsConfiguration Files

EventsAPIs

Sensor Effector

CBEsComponent: WAS

Situation: DEPENDENCY NOT MET

Data: EAR file missing

Check Path

Configure ear Path

Autonomic Manager

8

Autonomic Computing

© 2004 IBM Corporation

IBM Global Services

©2003 IBM Corporation

In a more autonomic workflow, the infrastructure can detect and bypass many problems and the tools help automate the informationexchange between tasks

Customer

External Service

Problem Manager

Problem Analyst

Problem Queue Manager

Automated Tools

Problem Management

Reports

Identify Problem

Report Problem Status and

Trends

Event Management

Incident Management

Change Management

Configuration Management

Manage Problem

Resolution

Bypass Problem

Analyze Problem

Event Mgmt Tool

Config. Mgmt Tool

Change Mgmt Tool

Incident Mgmt Tool

Problem Mgmt Tool

Incident Management

Problem Mgmt Tool

Incident Mgmt Tool

Event Mgmt Tool

9

Autonomic Computing

© 2004 IBM Corporation

A self-configuring system: Think Dynamics

10

Autonomic Computing

© 2004 IBM Corporation

AutonomicPCs

Autonomic Storage

AutonomicDatabase

Technology

AutonomicMiddleware

Autonomic Capabilities for eServer

Focus on Autonomic Computing Research

Autonomic Computing

Core ResearchAC building block

technologies

Policy technologies

System prototyping

11

Autonomic Computing

© 2004 IBM Corporation

Autonomic Computing ResearchPCE - Personal Software Configuration Engine

inventorycollection

Analysis

rules

Analysis:characterize

inventory

PPE

Plan:Choose

components, resolve

dependencies

Smart

Catalog

Clean up?

New software?Make space?

install/uninstall

Planning

rules

Policies

ACTI ACTI ‘‘0202

� Automate SW

maintenance &

migration on personal devices

� “Upgrade all my

applications”

� “Make my new laptop

work like the old one”

� “Migrate most valuable

Palm apps to my PC”

12

Autonomic Computing

© 2004 IBM Corporation

Fewminutes

later…

An important scenario: Workload surge

�� SystemsSystems cancan go go fromfrom steadysteady state state ……

Internet

�� to to overloadedoverloaded withoutwithout

warning warning

13

Autonomic Computing

© 2004 IBM Corporation

Autonomic Computing: Dynamic SurgeProtection

DatabaseWeb

Application

Servers

Driver (simulates

Internet in/out)

SurgeButton

MonitoringConfiguration management

Sensor Effector

ControllerForecaster

Performance Modeler

Statistics /Knowledge

Element

Monitor

Analyze

(Forecaster)

Sensors

Execute

(Controller)

Plan

(Perf Modeler)

Effectors

Knowledge

14

Autonomic Computing

© 2004 IBM Corporation

Response Time

Actual BOPS

Predicted BOPS

#Active Servers

#Requested Servers

1

1. Steady State

15

Autonomic Computing

© 2004 IBM Corporation

2. Monitor, Detect Surge

Response Time

Actual BOPS

Predicted BOPS

#Active Servers

#Requested Servers

1 2

16

Autonomic Computing

© 2004 IBM Corporation

3. Forecast, Provision Servers

Response Time

Actual BOPS

Predicted BOPS

#Active Servers

#Requested Servers

1 21 2 3

17

Autonomic Computing

© 2004 IBM Corporation

Response Time

Actual BOPS

Predicted BOPS

#Active Servers

#Requested Servers

4. Monitor, Remove Servers1 2 43

18

Autonomic Computing

© 2004 IBM Corporation

Utility Functions and Autonomic Computing

� Utility functions can guide autonomic

decision making

� Self-optimization: natural way to

express optimization criteria

•Declarative: preferable to implicitly hard-

coded in special purpose algorithms

� Derivable from business objectives

(e.g. optimize total profits)

•Can translate to computing metrics at

different levels

� Exploring applications in Workload

Management, other areas

Response time RT

V(R

T)Utility function

19

Autonomic Computing

© 2004 IBM Corporation

IBM IceCube Server

“Brick”

10 Gbit/s

capacitive

“Coupler”

(6) per brick

=

“Thermal

Bus Array”

6”

Prototype Brick:

- (12) 2.5” disks

- 8-port Switch

- Linux on fast CPU

Full IceCube System

blue: Storage Bricks

yellow: Compute Bricks

3D mesh @ 10 Gb/s per link

No connectors,

wires, fibers,

lasers or fans

� Lego-like Collection of ‘Intelligent Bricks”

� Fail-in-place policy: bad bricks are left in place

� 7 x smaller than equivalent standard systems

� Fast, power-hungry components (CPU etc) ok

� Includes resource allocation software

� First Application : Petabyte-class Storage Server

� intended to be managed by one person

20

Autonomic Computing

© 2004 IBM Corporation

Ethnographic Field Studies of Middleware Admins

DB2

Poughkeepsie, IBM IGS SDC

WAS

Boulder, IBMIGS SDC

WAS/Middleware

Southbury, IBMIGS SDC

WAS/Middleware

Southbury, IBM IGS SDC

� Early finding: Collaboration compensating for system complexity and simple tools

DB2

Charlotte, Belk

Video Log Spreadsheets

� Study and Analysis methods used to study Sys Administrators at work

“Let’s do it together”

Task Time ChartsVideotape Logger Video Analysis Lab

� What tasks do Autonomic Systems need to simplify for Sys Administrators?

21

Autonomic Computing

© 2004 IBM Corporation

What do those admins do? (Video Analysis)

face to face

5%

email

4%

command line

11%

web

3%

phone

36%

instant

messenger

16%

gui

21%

other

4%

face to face

22%

email

2%

command line

6%

web

3%

phone

44% instant

messenger

23%

Tool Usage Topics Discussed

• ~90% time collaborating

• Main topics • collaboration

• configuration• strategies

• ~60% time collaborating

• Main topics

• commands• strategies• collaboration

• configuration

log

0%

personal

2%

strategy

20%

error

3%

documentation

8%

administrative

3%

collaboration

22%

command

24%

configuration

17%

state

1%

log

3%

personal

3%

strategy

15%

state

11%

tools

2%

configuration

31%

command

7%

collaboration

21%

administrative

0%

documentation

3%

error

4%Tro

ub

lesh

oo

tin

g (

2.5

ho

urs

)N

ew

Web

Sealin

sta

nce

Tro

ub

lesh

oo

tin

g (

2 h

ou

rs)

IIS

Cert

ific

ate

issu

e

22

Autonomic Computing

© 2004 IBM Corporation

Preliminary findings for AC Console UI designs

• Troubleshooting depends on information access

• Need clear and consistent naming conventions for system objects

• Support easy correlation of log data and config settings

• Make logs browsable, visualizable, query-able, annotatable

• Place events in the context of end-to-end systems rather than of components

• Admins build their own tools, configure their own environments

• Support “tools built by admins for admins”

• Provide composable components and appropriate combination methods

• Enable widespread sharing of admin developed tools

• Collaboration is a vital part of admin practice

• Help establish trust among participants

• Support side-by-side exploration of system events,

configurations, etc.

• Enable ready, shared access to all relevant information

23

Autonomic Computing

© 2004 IBM Corporation

Summary

� Autonomic Computing architecture

components and interfaces come together

to form a system

� Today’s IT processes can be improved

through Autonomic Computing� IBM Research is working on some of the

hard problems in Autonomic Computing, but

there is still a lot of work to do!

24

Autonomic Computing

© 2004 IBM Corporation

© Copyright IBM Corporation 2004. All rights reserved.

The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.

References in these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in these materials may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way.

IBM, the IBM logo, the e-business logo and other IBM products and services are trademarks or registered trademarks of the International Business Machines Corporation, in the United States, other countries or both.

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries or both.

Microsoft, Windows, Windows NT and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries or both.

All other trademarks, company, products or service names may be trademarks, registered trademarks or service marks of others

Disclaimer: NOTICE – BUSINESS VALUE INFORMATION IS PROVIDED TO YOU 'AS IS' WITH THE UNDERSTANDING THAT THERE ARE NO REPRESENTATIONS OR WARRANTIES OF ANY KIND EITHER EXPRESS OR IMPLIED. IBM DISCLAIMS ALL WARRANTIES INCLUDING, BUT NOT LIMITED TO, IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. IBM DOES NOT WARRANT OR MAKE ANY REPRESENTATIONS REGARDING THE USE, VALIDITY, ACCURACY OR RELIABILITY OF THE BUSINESS BENEFITS SHOWN.. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGES, INCLUDING THOSE ARISING AS A RESULT OF IBM'S NEGLIGENCE.WHETHER THOSE DAMAGES ARE DIRECT, CONSEQUENTIAL, INCIDENTAL, OR SPECIAL, FLOWING FROM YOUR USE OF OR INABILITY TO USE THE INFORMATION PROVIDED HEREWITH OR RESULTS EVEN IF IBM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. THE ULTIMATE RESPONSIBILITY FOR ACHIEVING THE CALCULATED RESULTS REMAINS WITH YOU.