Autonomic Computing
GRETR Presentation

Matthieu Lagarde, Mouna Makni

Master SAR, Université Pierre et Marie Curie

March 13, 2006

Lagarde, Makni (UPMC), GRETR 2006

Outline

1 Introduction

2 The Autonomic Computing Vision

3 Agent-Based Approach

4 Role of Ontologies in Autonomic Systems

5 Conclusion


Introduction

IBM Research, Research Challenges in Autonomic Computing | October 29, 2003. © 2003 IBM Corporation

Complex heterogeneous infrastructures are a reality!

[Figure: an enterprise infrastructure made up of Directory and Security Services, Existing Applications and Data, Business Data, Data Server, Web Application Server, Storage Area Network, BPs and External Services, Web Server, DNS Server, and Data]

• Dozens of systems and applications
• Hundreds of components
• Thousands of tuning parameters


Introduction

• Autonomic Computing: a new challenge introduced by IBM in 2001, by analogy with the way the human nervous system works.
• A set of concepts, technologies, and tools that let systems operate more autonomously.
• Ensure that the company's infrastructure does not merely withstand change but enables it.

⇒ Increase productivity.
⇒ Strike a good balance between human staff and technology.
⇒ Reduce management costs.


The Autonomic Computing Vision

Toward autonomic systems

…tures must be introduced to allow the enterprise to optimize resource usage across the collection of systems within their infrastructure, while also maintaining their flexibility to meet the ever-changing needs of the enterprise.

Self-protecting. Systems anticipate, detect, identify, and protect themselves from attacks from anywhere. Self-protecting systems must have the ability to define and manage user access to all computing resources within the enterprise, to protect against unauthorized resource access, to detect intrusions and report and prevent these activities as they occur, and to provide backup and recovery capabilities that are as secure as the original resource management systems. Systems will need to build on top of a number of core security technologies already available today, including LDAP (Lightweight Directory Access Protocol), Kerberos, hardware encryption, and SSL (Secure Socket Layer). Capabilities must be provided to more easily understand and handle user identities in various contexts, removing the burden from administrators.

An evolution, not a revolution

To implement autonomic computing, the industry must take an evolutionary approach and deliver improvements to current systems that will provide significant self-managing value to customers without requiring them to completely replace their current IT environments. New open standards must be developed that will define the new mechanisms for interoperating heterogeneous systems. Figure 2 is a representation of those levels, starting from the basic level, through managed, predictive, and adaptive levels, and finally to the autonomic level.

As seen in the figure, the basic level represents the starting point where some IT systems are today. Each system element is managed independently by IT professionals who set it up, monitor it, and eventually replace it. At the managed level, systems management technologies can be used to collect information from disparate systems onto fewer consoles, reducing the time it takes for the administrator to collect and synthesize information as the systems become more complex to operate. In the predictive level, as new technologies are introduced that provide correlation among several elements of the system, the system itself can begin to recognize patterns, predict the optimal configuration, and provide advice on what course of action the administrator should take. As these technologies improve and as people become more comfortable with the advice and predictive power of these systems, we can pro-

Figure 2 Evolving to autonomic operations

[Figure: five maturity levels along an axis running from MANUAL to AUTONOMIC]

BASIC (LEVEL 1)
• MULTIPLE SOURCES OF SYSTEM GENERATED DATA
• REQUIRES EXTENSIVE, HIGHLY SKILLED IT STAFF

MANAGED (LEVEL 2)
• CONSOLIDATION OF DATA THROUGH MANAGEMENT TOOLS
• IT STAFF ANALYZES AND TAKES ACTIONS
• GREATER SYSTEM AWARENESS
• IMPROVED PRODUCTIVITY

PREDICTIVE (LEVEL 3)
• SYSTEM MONITORS, CORRELATES, AND RECOMMENDS ACTIONS
• IT STAFF APPROVES AND INITIATES ACTIONS
• REDUCED DEPENDENCY ON DEEP SKILLS
• FASTER AND BETTER DECISION MAKING

ADAPTIVE (LEVEL 4)
• SYSTEM MONITORS, CORRELATES, AND TAKES ACTION
• IT STAFF MANAGES PERFORMANCE AGAINST SLAs
• IT AGILITY AND RESILIENCY WITH MINIMAL HUMAN INTERACTION

AUTONOMIC (LEVEL 5)
• INTEGRATED COMPONENTS DYNAMICALLY MANAGED BY BUSINESS RULES/POLICIES
• IT STAFF FOCUSES ON ENABLING BUSINESS NEEDS
• BUSINESS POLICY DRIVES IT MANAGEMENT
• BUSINESS AGILITY AND RESILIENCY

From IBM Global Services and Autonomic Computing, IBM White Paper, October 2002; see http://www-3.ibm.com/autonomic/pdfs/wp-igs-autonomic.pdf.

IBM SYSTEMS JOURNAL, VOL 42, NO 1, 2003 GANEK AND CORBI 9


The Autonomic Computing Vision

Characteristics of an autonomic system

⇒ Self-Configuring: dynamically adjusts its resources.
⇒ Self-Healing: recovers from failures.
⇒ Self-Optimizing: organizes its resources.
⇒ Self-Protecting: able to defend itself against internal and external attacks.
⇒ Self-Defining: has knowledge of itself.
⇒ Contextually Aware: monitors, manages, and distributes its resources, systems, and networks.
⇒ Open: independent of the heterogeneity of systems.
⇒ Anticipatory: anticipates user actions.


Agent-Based Approach

Definitions

• Agent: performs operations and can communicate.
• Autonomous agent: an agent that negotiates; it can accept or refuse to perform actions.
• Intelligent agent: an agent with goals, a memory, and learning abilities, which adapts to its environment.



Overview

• Autonomic elements (= agents) are added to the system.
• Each element has its own behavior.
• It complies with policies.
• It communicates with other elements to reach its goals.



The autonomic element

• must manage itself: self-configuration, internal faults.
• must protect itself: external attacks.
• must find other suitable elements to reach its goals:
⇒ understand one another.
⇒ describe its services.
⇒ negotiate services.
⇒ check that commitments do not conflict with policies.
⇒ honor commitments.
⇒ ask for help when commitments and policies conflict.



Behavior

• offer efficient, reliable, and secure services.
• request realistic services → planning.
• "translate" its services so that they are understood → self-assembly without centralized planning.
• protect itself against unwanted requests/responses → authentication.



Policies

• An agent must follow rules: they define its behavior.
• 3 types of policies:
⇒ low level: on actions.
⇒ intermediate level: on goals (temporal, ...). Does not say how to reach the goals.
⇒ high level: on service functions; determines the most suitable goal.
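The three policy levels can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the state keys (`load`, `response_time_ms`, `servers`), the thresholds, and the choice of modelling the high-level policy as a scoring function over states.

```python
from typing import Dict, List

State = Dict[str, float]  # hypothetical system state, e.g. {"load": 0.9}


def action_policy(state: State) -> str:
    """Low level: states directly which action to take in a given state."""
    return "add_server" if state["load"] > 0.8 else "no_op"


def goal_policy(state: State) -> bool:
    """Intermediate level: says which states are acceptable, not how to reach them."""
    return state["response_time_ms"] <= 200


def service_score(state: State) -> float:
    """High level: scores each state so the most suitable goal can be selected."""
    return -state["response_time_ms"] - 50 * state["servers"]


def best_state(candidates: List[State]) -> State:
    """Pick the candidate state with the highest score."""
    return max(candidates, key=service_score)
```

The lower the level, the more the administrator has to spell out; the high-level function leaves the choice of goal to the agent.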



An autonomic personal computer

Combines the goals of the personal computer:

• ease of use.
• flexibility.

with those of autonomic computing:

• simplicity of use.
• availability.
• security.



Control loop



Architecture of an autonomic personal computer



Network of autonomic elements



Autonomic personal computing is already in your PCs

IBM ThinkVantage Technologies, which include autonomic computing, in ThinkPads and desktop computers.


Role of Ontologies in Autonomic Systems

Semantic Web

• Motivation: the scale and heterogeneity of the information sources available on the web.
• Purpose: describe published information and better interpret received information.
• Layered structure:

Correlation rules. One task of the correlation engine is to reduce the number of events shown, for example, to the system administrator, and to enrich the meaning of the events. Ideally, the correlation engine should be able to condense the received events into a single event directly indicating a problem (i.e., a situation event) in the managed system. For example, a rule might specify that the administrator is to be notified only if three memory problems occur within an hour.

Correlation rules can be divided in two types:

● Stateless rules consider events in isolation. They perform passive filtering on the attribute values of an incoming event. For example, a specific stateless rule detects a system failure when a file system has crashed or an IP address has failed.

● State-based rules are critical for analyzing events over time. They allow the same or repeating events to be distilled into a single event, regardless of the frequency of occurrence. For example, a rule might require that the administrator be alerted if an IP address is involved in five separate attacks on different parts of the network over a 6-month period.

Thus, stateless correlation rules operate on a single current event, whereas state-based correlation rules rely on a history of events.
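The difference between the two rule types can be sketched in Python. This is an illustration only, not the engine described in the excerpt: the `Event` fields, the event kind names, and the assumption that events arrive in timestamp order are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    kind: str         # hypothetical kinds, e.g. "memory_problem"
    timestamp: float  # seconds; events are assumed to arrive in order


def stateless_rule(event: Event) -> bool:
    """Stateless: decide from the attributes of the current event alone."""
    return event.kind in {"filesystem_crash", "ip_address_failed"}


class StateBasedRule:
    """State-based: fire once `threshold` matching events fall within `window` seconds."""

    def __init__(self, kind: str, threshold: int = 3, window: float = 3600.0):
        self.kind = kind
        self.threshold = threshold
        self.window = window
        self.history = deque()  # timestamps of recent matching events

    def observe(self, event: Event) -> bool:
        if event.kind != self.kind:
            return False
        self.history.append(event.timestamp)
        # Forget events that have dropped out of the time window.
        while event.timestamp - self.history[0] > self.window:
            self.history.popleft()
        return len(self.history) >= self.threshold
```

With the default parameters, `StateBasedRule("memory_problem")` mirrors the "three memory problems within an hour" example: it needs a history, whereas `stateless_rule` needs only the incoming event.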

Action rules. Preprocessing, filtering, and correlating events before they are passed to the next level of autonomic manager or directly to the operating staff minimizes the time spent on repairs, provides more specific alarm information, and clarifies fault correlation. However, in order to automate corrective actions, additional inference rules and designated action rules might be needed. These rules are used to reduce a system administrator's work in two ways:

● By triggering automatic remedy actions
● By gathering additional monitoring data to obtain a detailed view of the current exceptional state of resources; this additional information should reduce the efforts a system administrator must make to decide how to treat an affected resource. For example, in the case of a printer jam, action rules can be used to restart the printer or to inform the administrator about the printer failure.
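An action rule can be pictured as a mapping from a situation event to the remedies it triggers. The Python sketch below is hypothetical throughout: the action functions only return descriptive strings, and the rule table runs both printer remedies for illustration, whereas a real rule might choose one or the other.

```python
from typing import Callable, Dict, List


def restart_printer(resource: str) -> str:
    """Hypothetical remedy; a real system would call a device API here."""
    return f"restart {resource}"


def notify_admin(resource: str) -> str:
    """Hypothetical notification; a real system would page or mail someone."""
    return f"notify administrator about {resource}"


# Action rules: situation event -> ordered list of actions to trigger.
ACTION_RULES: Dict[str, List[Callable[[str], str]]] = {
    "printer_jam": [restart_printer, notify_admin],
}


def apply_action_rules(situation: str, resource: str) -> List[str]:
    """Run every action registered for the situation; unknown situations do nothing."""
    return [action(resource) for action in ACTION_RULES.get(situation, [])]
```

Keeping the rules in a table rather than in code paths is what lets an administrator change the reaction without touching the engine.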

The Semantic Web

The main goal of the Semantic Web is to be able to express the meaning of resources that can be found on the Web. [11] In order to achieve that objective, several layers of representational structures are needed. [10] The subset of these layers that is relevant for our discussion is presented in Figure 2.

These layers have the following roles:

● The XML (eXtensible Markup Language) [14] layer represents the structure of data.

● The RDF (Resource Description Framework) [15] layer represents the meaning of data.

● The ontology layer represents the formal common agreement about the meaning of data.

● The logic layer enables intelligent reasoning with meaningful data.

● The proof layer supports the exchange of proofs in an interagent communication, enabling common understanding of how the desired information is derived.

It is worth noting that the real power of the Semantic Web will be realized when many systems are created that (1) collect Web content from diverse sources, (2) integrate and process the information, and (3) exchange the results with other human or machine agents. Thus, the effectiveness of the Semantic Web will increase drastically as more machine-readable Web content and more automated services become available. This level of interagent communication will require the exchange of proofs to ensure common understanding among these agents.

Two important technologies for developing the Semantic Web are already in place, namely XML and RDF. XML lets users create their own tags to annotate Web pages or sections of text on a page. Systems can make use of these tags in sophisticated ways, but to do so a systems programmer must know what the page author intended by each new tag. In other

Figure 2 Layers of the Semantic Web architecture

[Figure: layer stack, from bottom to top: XML, RDF + RDFS, Ontology Vocabulary, Logic, Proof]

IBM SYSTEMS JOURNAL, VOL 43, NO 3, 2004 L. STOJANOVIC ET AL. 603



Ontologies

• Definition: an explicit, formal specification of a shared conceptualization.
• Advantage: systems can be built on high-level policies.
• Several ontology representation languages: OWL, KAON, ...

⇒ A high-level, abstract approach to describing correlation engines.



Correlation Engine

• Basic autonomic components that perform automated, continuous analysis of data, based on policies predefined by the administrator.
• Rules are used to detect threats, attacks, and system failures, and to trigger the corresponding reactions.
• Reference model = MAPE (Monitor, Analyze, Plan, Execute)
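One pass through a MAPE loop can be sketched in a few lines of Python. This is a toy under stated assumptions, not the reference model itself: the resource is a plain dictionary with hypothetical keys `name`, `current`, and `desired`, and the only known actions are `start` and `stop`.

```python
from typing import Dict, List

Resource = Dict[str, str]  # hypothetical: {"name": ..., "current": ..., "desired": ...}


def monitor(resource: Resource) -> Dict[str, str]:
    """Monitor: capture the observed and desired state as an event."""
    return {"name": resource["name"], "current": resource["current"],
            "desired": resource["desired"]}


def analyze(event: Dict[str, str]) -> bool:
    """Analyze: there is a problem when the observed state differs from the goal."""
    return event["current"] != event["desired"]


def plan(event: Dict[str, str]) -> List[str]:
    """Plan: choose the actions that should bring the resource to its goal."""
    return ["start"] if event["desired"] == "online" else ["stop"]


def execute(resource: Resource, actions: List[str]) -> None:
    """Execute: apply the planned actions to the managed resource."""
    for action in actions:
        resource["current"] = "online" if action == "start" else "offline"


def mape_iteration(resource: Resource) -> List[str]:
    """Run Monitor, Analyze, Plan, Execute once; return the actions taken."""
    event = monitor(resource)
    if not analyze(event):
        return []
    actions = plan(event)
    execute(resource, actions)
    return actions
```

In a real engine the Analyze and Plan steps would be driven by the administrator's rules rather than hard-coded comparisons.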



MAPE reference model

The layered reference model for correlation engines enables the separation of:

● What has to be managed
● Why management is required (monitoring)
● How to manage (decision logic)

Consequently, correlation engines are able to:

● Capture and select important events and update key characteristics (states) of the managed resources

● Detect changes or problems based on knowledge of state changes

● Initiate actions to correct any behavior not in line with a desired goal

In the rest of this section, we first describe the resources that have to be managed. We then provide a classification of the events that can trigger the management process. Finally, we describe the rules that represent automated responses to events.

Resources. Unless each resource in a system can share information with every other part and contribute to some overall system awareness, the goal of autonomic computing will not really be reached.

Figure 1 Reference model for correlation engines

[Figure labels: Resources; Events (Event 1, Event 2, ..., Event n); Correlation Engine with the stages MONITOR, ANALYZE, PLAN, EXECUTE; Correlation Rules; Action Rules; Rules (R1, R2, ..., Rm)]




Example: eAutomation Correlation Engine

by extending and combining it with the Semantic Web technologies described previously.

Specifically, we will now focus on how ontologies may advance autonomic-computing solutions. We use the eAutomation correlation engine [24] as an example throughout this section. Note that similar strategies can be applied for other engines as well. The approach can be summarized in the following steps:

1. The model of the eAutomation engine is first transformed into the eAutomation ontology.

2. Hidden (hard-coded) knowledge embedded in the eAutomation engine is translated into a set of rules in the corresponding ontology and is used in typical inferencing tasks.

In the rest of this section we describe the model of the eAutomation engine and show how it can be improved by translating the model of the eAutomation engine into the eAutomation ontology.

The eAutomation engine. Any correlation engine has to provide answers to three questions: (1) what to manage, (2) why management is required, and (3) how to manage. Hence, for the rest of this subsection we elaborate the resource, event, and rule models used in the eAutomation engine.

Resources. One of the core foundations of the eAutomation engine is the abstract representation of any type of IT resource, with the ultimate goal of availability management. This abstract resource representation is depicted in Figure 4. (Note that we will not discuss all elements of the abstract resource model explicitly in this paper.) Each resource has a unique name that is represented through the Name attribute. The most important attributes of a resource are those related to the operational state of a resource. Whereas the Current_Operational_State attribute describes the current availability state of a resource, which is typically monitored and whose change is represented as an event, the Desired_Operational_State attribute is used to specify the desired state of a resource. This state is typically set by an administrator or is part of a policy reflecting overall goals. The Compound_State attribute indicates the state of a resource in the context of other resources together with the composition of the states of these other resources, thus providing a view of the overall situation. This state is computed based on internal aggregation; a correlation rule is used to derive an overall state. The set of possible values of the state attributes is predefined. For example, if the resource is not started, the value of the Current_Operational_State attribute is Offline. The Online value means that the resource is ready for work, whereas Pending Online means that the resource has been started, but is not yet ready for work. There are several operational states indicating problems. For example, Failed Offline signifies that a resource is broken and cannot be used.
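The state attributes just described can be mirrored in a small Python sketch. It is not the eAutomation implementation: the class layout, the method names `set_current_state` and `needs_action`, and the event tuple are assumptions, and only the four state values named in the text are modelled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class OperationalState(Enum):
    OFFLINE = "Offline"
    PENDING_ONLINE = "Pending Online"
    ONLINE = "Online"
    FAILED_OFFLINE = "Failed Offline"


@dataclass
class Resource:
    name: str  # unique key
    current_operational_state: OperationalState
    desired_operational_state: OperationalState

    def set_current_state(
        self, new_state: OperationalState
    ) -> Optional[Tuple[str, OperationalState, OperationalState]]:
        """Record a monitored state change; return it as an event, or None if unchanged."""
        if new_state == self.current_operational_state:
            return None
        old = self.current_operational_state
        self.current_operational_state = new_state
        return (self.name, old, new_state)

    def needs_action(self) -> bool:
        """True while the observed state differs from the desired one."""
        return self.current_operational_state != self.desired_operational_state
```

Restricting the states to an enumeration reflects the remark that the set of possible values is predefined.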

The eAutomation resource model allows resources to be grouped in resource groups, a process known as composition. A resource group is itself an eAutomation resource which contains a collection of resources that are handled as one logical entity. Different types of groups can be supported by implementing different state aggregation rules. These rules are used to derive the observed state of the group resource. Because a resource group is just another resource, it can in turn be a member of another resource group. Internally, we build an isMemberOf relationship tree.
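Composition with pluggable state aggregation rules might look like the following Python sketch. The two aggregation rules (`all_online`, `any_online`) and the two-valued state are assumptions chosen for brevity; the excerpt does not say which rules the engine actually ships.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Resource:
    name: str
    state: str  # "Online" or "Offline" in this sketch


def all_online(states: List[str]) -> str:
    """Aggregation rule: the group is Online only if every member is Online."""
    return "Online" if states and all(s == "Online" for s in states) else "Offline"


def any_online(states: List[str]) -> str:
    """Aggregation rule: the group is Online if at least one member is Online."""
    return "Online" if any(s == "Online" for s in states) else "Offline"


@dataclass
class ResourceGroup:
    name: str
    aggregate: Callable[[List[str]], str]
    members: List[Union[Resource, "ResourceGroup"]] = field(default_factory=list)

    @property
    def state(self) -> str:
        """Derive the observed group state from the members' states."""
        return self.aggregate([member.state for member in self.members])
```

Because a group exposes `state` just like a plain resource, groups nest without special handling, which is the point of treating a group as "just another resource".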

An equivalency is a collection of resources that provide the same functionality. An equivalency consists of a set of resources from the same class. For example, network adapters might be defined as members of an equivalency. If one network adapter fails,

Figure 4 Abstract resource model

[Figure: the eAutomation Resource, with attributes Name (unique key), Compound_State, Current_Operational_State, Desired_Operational_State, Number_of_Instances, Restart_Interval, Number_of_Retries, and Include/Exclude Location; requests RequestOnline, RequestOffline, Request_Modify, and ResetFromBroken; Generic Resource States; Resource Relationship: operation sequence, placement constraints, resource compositions]

L. STOJANOVIC ET AL. IBM SYSTEMS JOURNAL, VOL 43, NO 3, 2004 606



Example: eAutomation Correlation Engine

• Relationships between resources:
⇒ Start/stop relationships
⇒ Location relationships

• Events: state changes of resources.
• Rules:
⇒ Simple correlation rules
⇒ Relationship correlation rules
⇒ Request propagation rules



Example: eAutomation Ontology

The eAutomation ontology. The model of the eAutomation engine presented in the previous section was then used as the basis for defining the eAutomation ontology. Our goal was to model all information that exists in the eAutomation model, including any implicit knowledge. Here we briefly describe the process of enriching the eAutomation model into the eAutomation ontology.

Figure 5 shows a part of the eAutomation ontology. The most important concept in this ontology is the concept Resource, corresponding to the abstract resource in the eAutomation model shown in Figure 4. Each ontology concept is described with a set of properties that can be either attributes or relations. The concept Resource contains only the one attribute name; the value for this property can be an arbitrary string (e.g., Resource123). All other properties defined for the concept Resource are relationships. For example, the current state of a resource is modeled as the property currentState (the range of this property is the concept CurrentState) because the set of possible values is known in advance. In this way the correctness of the concrete model is improved significantly because only instances of the concept CurrentState can be used to specify the value of the current resource state.

Figure 5 Part of the eAutomation ontology

[Figure: concepts Resource, ResourceType, FixedResource, FloatingResource, Node, State, OperationalState, CompoundState, DesiredState, CurrentState; properties Name, currentState, desiredState, allowedNode, includes, isEquivalent, hasResourceType, isStartable, startStopDependency, locationDependency, startAfter, dependsOn, dependsOnAny, collocated, anticollocated, affinity, antiaffinity]

L. STOJANOVIC ET AL. IBM SYSTEMS JOURNAL, VOL 43, NO 3, 2004 608



Example: eAutomation Ontology

Note that domains and ranges simply specify schema constraints that must be satisfied for the property to be instantiated. They do not infer new facts, but rather guide the user in constructing the ontology by determining what can and cannot be explicitly stated.
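A domain and range constraint of this kind acts as a gate on what may be asserted, not as a source of new facts. The Python sketch below illustrates that reading with plain dictionaries; the `SCHEMA` and `INSTANCES` tables and the function `can_assert` are hypothetical and stand in for whatever an ontology tool would provide.

```python
from typing import Dict, Set

# Hypothetical schema: each property names its domain and range concepts.
SCHEMA: Dict[str, Dict[str, str]] = {
    "currentState": {"domain": "Resource", "range": "CurrentState"},
}

# Hypothetical instance sets for the concepts used above.
INSTANCES: Dict[str, Set[str]] = {
    "Resource": {"Resource123"},
    "CurrentState": {"online", "offline", "stopping"},
    "CompoundState": {"good", "bad", "unknown"},
}


def can_assert(subject: str, prop: str, value: str) -> bool:
    """Allow a statement only if subject and value satisfy the property's domain and range."""
    constraint = SCHEMA[prop]
    return (subject in INSTANCES[constraint["domain"]]
            and value in INSTANCES[constraint["range"]])
```

For instance, `can_assert("Resource123", "currentState", "good")` is rejected because `good` is an instance of CompoundState, not of CurrentState; nothing is inferred, the statement is simply refused.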

In order to define the state of a resource in a unique way, the concept State is organized into the concept hierarchy shown in Figure 6. The concept State contains two subconcepts. The first subconcept CompoundState has the instances good, bad, and unknown. These instances represent the concrete values that can be used for the property instances. Note that good is a unique identifier for an instance of the concept CompoundState and is much more than just the string "good." When good is considered as a URI (Uniform Resource Identifier), additional values can be specified according to the structure of the concept CompoundState. Further, lexical entries can be defined. For example, the instance good can be related to the term (string) "gut" in German. The second subconcept OperationalState is divided into the concepts DesiredState and CurrentState. The possible values for the operational state are modeled as instances. The values that are specific only for the current state (e.g., stopping) are represented as instances of that concept.

Several properties of resources are derived from the model of the eAutomation correlation engine. These include startAfter, dependsOn, dependsOnAll, collocated, anticollocated, affinity, antiaffinity, and isStartable among others. Based on their semantics, these properties are organized into two groups: startStopDependencies (used to define a start/stop behavior) and locationDependencies (used for locating resources on nodes). These groups correspond to the relationships that can be defined between resources in the eAutomation model.

Each of these relationships has an implicit meaning that is hard-coded in the program that uses them. Moreover, there are semantic connections between some of these relationships. For example, the collocated relationship is symmetric; the collocated and anticollocated relationships are mutually inverse. This knowledge may be defined formally and explicitly by axioms.

In an ontology there are two types of implicit knowledge: axioms and general rules. Axioms are a standard set of rules, such as the rules for symmetric, transitive, and inverse properties. For example, if A contains B, B contains C, and contains is a transitive property, then the ontology system can infer that A contains C as well. Thus, we do not need to express this information explicitly. General rules are domain-specific rules that are needed to combine and to adapt information available in the ontology. They are used to specify the relationships between ontological entities in the form of rules. For example, if A contains B and B is about C, then it can be concluded that A also is about C.
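Both kinds of implicit knowledge can be illustrated with a naive forward-chaining sketch in Python. It is only a toy: facts are `(subject, property, object)` tuples, the function name `infer` and the property name `isAbout` are assumptions, and real ontology reasoners are far more general and efficient.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


def infer(facts: Iterable[Triple],
          symmetric: Iterable[str] = (),
          transitive: Iterable[str] = ()) -> Set[Triple]:
    """Apply the rules repeatedly until no new triple appears."""
    known = set(facts)
    symmetric, transitive = set(symmetric), set(transitive)
    while True:
        new: Set[Triple] = set()
        for s, p, o in known:
            if p in symmetric:  # axiom: p(s, o) implies p(o, s)
                new.add((o, p, s))
            if p in transitive:  # axiom: p(s, o) and p(o, x) imply p(s, x)
                new.update((s, p, o2) for s2, p2, o2 in known
                           if p2 == p and s2 == o)
            if p == "contains":  # general rule: contains + isAbout
                new.update((s, "isAbout", o2) for s2, p2, o2 in known
                           if p2 == "isAbout" and s2 == o)
        if new <= known:
            return known
        known |= new
```

Given `A contains B` and `B contains C` with `contains` declared transitive, the result includes `A contains C` without that fact ever being stated, which is the saving the text describes.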

Figure 6 The concept "state" in the eAutomation ontology

[Figure: State has the subconcepts CompoundState (instances good, bad, unknown) and OperationalState (subconcepts DesiredState and CurrentState); instance labels shown: offline, online, starting, stopping, startPending, stopPending]




Advantages

• Implicit inference mechanisms for deducing knowledge.
• A shared understanding of a domain.
• Reuse, extensibility, verification.
• Rigorous situation analysis that humans can understand.


Conclusion

• Evolution of software and systems.
• Need for self-managed systems.
• Lower costs for companies.
• A meeting point of several fields: distributed systems, artificial intelligence.


Conclusion

Main players

• HP: Adaptive Enterprise
• Microsoft: Dynamic Systems
• IBM: Autonomic Computing
• ...

Research work:
• IBM: the existence of links between autonomic systems and quantum systems.
• LAAS: critical autonomous systems (SAC) for robots.


Conclusion

Questions?

