Autonomic Computing
GRETR presentation
Matthieu Lagarde, Mouna Makni
Master SAR, Université Pierre et Marie Curie
13 March 2006
Lagarde, Makni (UPMC) GRETR 2006 1 / 28
Outline
1 Introduction
2 The Autonomic Computing Vision
3 Agent-Based Approach
4 Role of Ontologies in Autonomic Systems
5 Conclusion
Introduction
Source: Research Challenges in Autonomic Computing, IBM Research, October 29, 2003 (© 2003 IBM Corporation)
Complex heterogeneous infrastructures are a reality!
[Figure: infrastructure diagram with Directory and Security Services, Existing Applications and Data, Business Data, Data Server, Web Application Server, Storage Area Network, BPs and External Services, Web Server, DNS Server, Data]
Dozens of systems and applications
Hundreds of components
Thousands of tuning parameters
Introduction
● Autonomic Computing: a new challenge introduced by IBM in 2001, by analogy with the human autonomic nervous system.
● A set of concepts, technologies, and tools that let systems operate more autonomously.
● Ensure that the company's infrastructure does not merely withstand change but enables it.
⇒ Increase productivity.
⇒ Strike a good balance between human staff and technology.
⇒ Reduce management costs.
The Autonomic Computing Vision
Toward autonomic systems
[…]tures must be introduced to allow the enterprise to optimize resource usage across the collection of systems within their infrastructure, while also maintaining their flexibility to meet the ever-changing needs of the enterprise.
Self-protecting. Systems anticipate, detect, identify, and protect themselves from attacks from anywhere. Self-protecting systems must have the ability to define and manage user access to all computing resources within the enterprise, to protect against unauthorized resource access, to detect intrusions and report and prevent these activities as they occur, and to provide backup and recovery capabilities that are as secure as the original resource management systems. Systems will need to build on top of a number of core security technologies already available today, including LDAP (Lightweight Directory Access Protocol), Kerberos, hardware encryption, and SSL (Secure Socket Layer). Capabilities must be provided to more easily understand and handle user identities in various contexts, removing the burden from administrators.
An evolution, not a revolution
To implement autonomic computing, the industry must take an evolutionary approach and deliver improvements to current systems that will provide significant self-managing value to customers without requiring them to completely replace their current IT environments. New open standards must be developed that will define the new mechanisms for interoperating heterogeneous systems. Figure 2 is a representation of those levels, starting from the basic level, through managed, predictive, and adaptive levels, and finally to the autonomic level.
As seen in the figure, the basic level represents the starting point where some IT systems are today. Each system element is managed independently by IT professionals who set it up, monitor it, and eventually replace it. At the managed level, systems management technologies can be used to collect information from disparate systems onto fewer consoles, reducing the time it takes for the administrator to collect and synthesize information as the systems become more complex to operate. In the predictive level, as new technologies are introduced that provide correlation among several elements of the system, the system itself can begin to recognize patterns, predict the optimal configuration, and provide advice on what course of action the administrator should take. As these technologies improve and as people become more comfortable with the advice and predictive power of these systems, we can pro[…]
Figure 2 Evolving to autonomic operations (from MANUAL to AUTONOMIC)
BASIC (LEVEL 1)
• Multiple sources of system-generated data
• Requires extensive, highly skilled IT staff
MANAGED (LEVEL 2)
• Consolidation of data through management tools
• IT staff analyzes and takes actions
• Greater system awareness
• Improved productivity
PREDICTIVE (LEVEL 3)
• System monitors, correlates, and recommends actions
• IT staff approves and initiates actions
• Reduced dependency on deep skills
• Faster and better decision making
ADAPTIVE (LEVEL 4)
• System monitors, correlates, and takes action
• IT staff manages performance against SLAs
• IT agility and resiliency with minimal human interaction
AUTONOMIC (LEVEL 5)
• Integrated components dynamically managed by business rules/policies
• IT staff focuses on enabling business needs
• Business policy drives IT management
• Business agility and resiliency
From IBM Global Services and Autonomic Computing, IBM White Paper, October 2002; see http://www-3.ibm.com/autonomic/pdfs/wp-igs-autonomic.pdf.
IBM SYSTEMS JOURNAL, VOL 42, NO 1, 2003 GANEK AND CORBI 9
The Autonomic Computing Vision
Characteristics of an autonomic system
⇒ Self-Configuring: dynamically adjusts its resources.
⇒ Self-Healing: recovers from failures.
⇒ Self-Optimizing: organizes its resources.
⇒ Self-Protecting: can defend itself against internal and external attacks.
⇒ Self-Defining: knows itself.
⇒ Contextually Aware: monitors, manages, and distributes its resources, systems, and networks.
⇒ Open: independent of the heterogeneity of systems.
⇒ Anticipatory: anticipates user actions.
Agent-Based Approach
Definitions
● Agent: performs operations and can communicate.
● Autonomous agent: an agent that negotiates; it can accept or refuse to perform actions.
● Intelligent agent: an agent with goals, a memory, and learning abilities, which adapts to its environment.
Agent-Based Approach
Overview
Autonomic elements (= agents) are added to the system. Each element:
● has its own behavior.
● follows policies.
● communicates with other elements to reach its goals.
Agent-Based Approach
The autonomic element
● must manage itself: self-configuration, internal faults.
● must protect itself: external attacks.
● must find other suitable elements to reach its goals:
⇒ understand one another.
⇒ describe its services.
⇒ negotiate services.
⇒ check that commitments do not conflict with policies.
⇒ honor its commitments.
⇒ ask for help when commitments and policies conflict.
Agent-Based Approach
Behavior
● offer efficient, reliable, and secure services.
● request realistic services → planning.
● "translate" its services so that they are understood → self-assembly without centralized planning.
● protect itself against unwanted requests/responses → authentication.
Agent-Based Approach
Policies
● An agent must follow rules, which define its behavior.
● Three types of policies:
⇒ low level: on actions.
⇒ intermediate level: on goals (temporal, ...). Does not say how to reach the goals.
⇒ high level: on service functions; determines the most suitable goal.
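The three policy levels can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the metric names (`response_ms`, `servers`), the 200 ms threshold, the cost weight, and the reading of the high-level policy as a scoring (utility-style) function are assumptions made for illustration, not something the slides specify.

```python
from typing import Dict, List

State = Dict[str, float]  # hypothetical system state: metric name -> value


def action_policy(state: State) -> str:
    """Low level: a condition directly selects an action."""
    return "add_server" if state["response_ms"] > 200 else "no_op"


def goal_policy(state: State) -> bool:
    """Intermediate level: says which states are acceptable, not how to reach them."""
    return state["response_ms"] <= 200


def utility_policy(state: State) -> float:
    """High level: scores every state, so the most suitable goal can be chosen."""
    return -state["response_ms"] - 50.0 * state["servers"]


def best_state(candidates: List[State]) -> State:
    """Pick the candidate state with the highest score."""
    return max(candidates, key=utility_policy)
```

With an action policy the element only reacts; with a goal policy it must plan a way to reach an acceptable state; with a scoring function it can also compare acceptable states against each other.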
Agent-Based Approach
An autonomic personal computer
Combines the goals of the personal computer:
● ease of use.
● flexibility.
with those of autonomic computing:
● simplicity of use.
● availability.
● security.
Agent-Based Approach
Control loop
Agent-Based Approach
Architecture of an autonomic personal computer
Agent-Based Approach
Network of autonomic elements
Agent-Based Approach
Autonomic personal computing is already in your PCs
IBM ThinkVantage Technologies, which include autonomic computing, are built into ThinkPad laptops and desktop computers.
Role of Ontologies in Autonomic Systems
Semantic Web
● Motivation: the scale and heterogeneity of the information sources available on the web.
● Purpose: describe published information and better interpret received information.
● Layered structure:
Correlation rules. One task of the correlation engine is to reduce the number of events shown, for example, to the system administrator, and to enrich the meaning of the events. Ideally, the correlation engine should be able to condense the received events into a single event directly indicating a problem (i.e., a situation event) in the managed system. For example, a rule might specify that the administrator is to be notified only if three memory problems occur within an hour.
Correlation rules can be divided in two types:
● Stateless rules consider events in isolation. They perform passive filtering on the attribute values of an incoming event. For example, a specific stateless rule detects a system failure when a file system has crashed or an IP address has failed.
● State-based rules are critical for analyzing events over time. They allow the same or repeating events to be distilled into a single event, regardless of the frequency of occurrence. For example, a rule might require that the administrator be alerted if an IP address is involved in five separate attacks on different parts of the network over a 6-month period.
Thus, stateless correlation rules operate on a single current event, whereas state-based correlation rules rely on a history of events.
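The difference between the two rule types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` fields, the event kind strings, and the returned labels are hypothetical, and the default threshold simply mirrors the "three memory problems within an hour" example from the excerpt.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass(frozen=True)
class Event:
    kind: str         # hypothetical labels, e.g. "memory_problem"
    timestamp: float  # seconds since an arbitrary origin


def stateless_rule(event: Event) -> Optional[str]:
    """Stateless: looks only at the attributes of the current event."""
    if event.kind in {"filesystem_crash", "ip_address_failed"}:
        return "system_failure"
    return None


class StateBasedRule:
    """State-based: fires once `threshold` matching events fall within `window` seconds."""

    def __init__(self, kind: str, threshold: int = 3, window: float = 3600.0) -> None:
        self.kind = kind
        self.threshold = threshold
        self.window = window
        self.history: Deque[float] = deque()

    def feed(self, event: Event) -> Optional[str]:
        if event.kind != self.kind:
            return None
        self.history.append(event.timestamp)
        # Forget events that are older than the window.
        while event.timestamp - self.history[0] > self.window:
            self.history.popleft()
        if len(self.history) >= self.threshold:
            self.history.clear()  # condense the burst into one situation event
            return f"notify_admin:{self.kind}"
        return None
```

The stateless rule is a pure function of one event, while the state-based rule needs the `history` buffer: that buffer is exactly the "history of events" the excerpt refers to.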
Action rules. Preprocessing, filtering, and correlating events before they are passed to the next level of autonomic manager or directly to the operating staff minimizes the time spent on repairs, provides more specific alarm information, and clarifies fault correlation. However, in order to automate corrective actions, additional inference rules and designated action rules might be needed. These rules are used to reduce a system administrator's work in two ways:
● By triggering automatic remedy actions
● By gathering additional monitoring data to obtain a detailed view of the current exceptional state of resources; this additional information should reduce the efforts a system administrator must make to decide how to treat an affected resource. For example, in the case of a printer jam, action rules can be used to restart the printer or to inform the administrator about the printer failure.
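The printer-jam example can be turned into a small Python sketch of an action rule. The escalation order (try a restart first, inform the administrator once the restart budget is used up), the `max_restarts` parameter, and the action label strings are assumptions chosen for illustration; the excerpt only says that either reaction is possible.

```python
from typing import Callable, Dict, List


def make_action_rules(max_restarts: int = 1) -> Callable[[str, str], List[str]]:
    """Build an action rule that remembers how often each resource was restarted."""
    restarts: Dict[str, int] = {}

    def decide(resource: str, situation: str) -> List[str]:
        if situation != "printer_jam":
            return []
        # Always gather extra monitoring data for the administrator.
        actions = [f"collect_status:{resource}"]
        if restarts.get(resource, 0) < max_restarts:
            # Automatic remedy action.
            restarts[resource] = restarts.get(resource, 0) + 1
            actions.append(f"restart:{resource}")
        else:
            # Remedy already tried: hand over to a human.
            actions.append(f"inform_admin:{resource}")
        return actions

    return decide
```

Both uses of action rules from the list above appear here: the `collect_status` entry stands for gathering additional data, and the `restart` entry stands for the automatic remedy.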
The Semantic Web
The main goal of the Semantic Web is to be able to express the meaning of resources that can be found on the Web [11]. In order to achieve that objective, several layers of representational structures are needed [10]. The subset of these layers that is relevant for our discussion is presented in Figure 2.
These layers have the following roles:
● The XML (eXtensible Markup Language) [14] layer represents the structure of data.
● The RDF (Resource Description Framework) [15] layer represents the meaning of data.
● The ontology layer represents the formal common agreement about the meaning of data.
● The logic layer enables intelligent reasoning with meaningful data.
● The proof layer supports the exchange of proofs in an interagent communication, enabling common understanding of how the desired information is derived.
It is worth noting that the real power of the Semantic Web will be realized when many systems are created that (1) collect Web content from diverse sources, (2) integrate and process the information, and (3) exchange the results with other human or machine agents. Thus, the effectiveness of the Semantic Web will increase drastically as more machine-readable Web content and more automated services become available. This level of interagent communication will require the exchange of proofs to ensure common understanding among these agents.
Two important technologies for developing the Semantic Web are already in place, namely XML and RDF. XML lets users create their own tags to annotate Web pages or sections of text on a page. Systems can make use of these tags in sophisticated ways, but to do so a systems programmer must know what the page author intended by each new tag. In other […]
Figure 2 Layers of the Semantic Web architecture
[Figure: layer stack, from top to bottom: Proof, Logic, Ontology Vocabulary, RDF + RDFS, XML]
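To make the RDF layer more concrete, here is a minimal Python sketch that stores statements as (subject, predicate, object) triples, the data model RDF is built on, and answers simple pattern queries. It uses plain tuples rather than an RDF library, and the `ex:` names are hypothetical identifiers invented for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements about one managed resource.
TRIPLES: List[Triple] = [
    ("ex:Resource123", "rdf:type", "ex:Resource"),
    ("ex:Resource123", "ex:currentState", "ex:online"),
    ("ex:online", "rdf:type", "ex:CurrentState"),
]


def match(
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]
```

XML alone would only fix how such data is nested; the triple form is what lets a program ask a question such as `match(predicate="ex:currentState")` without knowing the document layout.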
IBM SYSTEMS JOURNAL, VOL 43, NO 3, 2004 L. STOJANOVIC ET AL. 603
Role of Ontologies in Autonomic Systems
Ontologies
● Definition: an explicit, formal specification of a shared conceptualization.
● Benefit: systems can be built on high-level policies.
● Several ontology representation languages exist: OWL, KAON, ...
⇒ A high-level, abstract approach for describing correlation engines.
Role of Ontologies in Autonomic Systems
Correlation Engine
● Basic autonomic components that analyze data automatically and continuously, based on policies predefined by the administrator.
● Rules are used to detect threats, attacks, and system failures, and to trigger the corresponding reactions.
● Reference model = MAPE (Monitor, Analyze, Plan, Execute)
Role of Ontologies in Autonomic Systems
MAPE reference model
The layered reference model for correlation engines enables the separation of:
● What has to be managed
● Why management is required (monitoring)
● How to manage (decision logic)
Consequently, correlation engines are able to:
● Capture and select important events and update key characteristics (states) of the managed resources
● Detect changes or problems based on knowledge of state changes
● Initiate actions to correct any behavior not in line with a desired goal
In the rest of this section, we first describe the resources that have to be managed. We then provide a classification of the events that can trigger the management process. Finally, we describe the rules that represent automated responses to events.
Resources. Unless each resource in a system can share information with every other part and contribute to some overall system awareness, the goal of autonomic computing will not really be reached.
Figure 1 Reference model for correlation engines
[Figure: labels: Correlation Engine; MONITOR, ANALYZE, PLAN, EXECUTE; Events (Event1, Event2, ..., Eventn); Rules (Correlation Rules, Action Rules); Resources (R1, R2, ..., Rm)]
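One pass through the Monitor, Analyze, Plan, Execute loop of the reference model can be sketched in Python as follows. This is a deliberately small illustration, not the engine described in the paper: the `Resource` fields, the state strings, and the single corrective action (request the desired state) are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Resource:
    name: str
    current: str  # last observed state
    desired: str  # state requested by the administrator or a policy


def monitor(resources: Dict[str, Resource], observed: Dict[str, str]) -> List[str]:
    """MONITOR: record observations and emit one event per state change."""
    events = []
    for name, state in observed.items():
        if resources[name].current != state:
            resources[name].current = state
            events.append(name)
    return events


def analyze(resources: Dict[str, Resource], events: List[str]) -> List[str]:
    """ANALYZE: keep the events that reveal a deviation from the desired state."""
    return [n for n in events if resources[n].current != resources[n].desired]


def plan(resources: Dict[str, Resource], problems: List[str]) -> List[Tuple[str, str]]:
    """PLAN: choose a corrective action (here: request the desired state)."""
    return [(n, resources[n].desired) for n in problems]


def execute(resources: Dict[str, Resource], actions: List[Tuple[str, str]]) -> None:
    """EXECUTE: apply the actions; a real engine would drive the resource here."""
    for name, state in actions:
        resources[name].current = state


def mape_cycle(resources: Dict[str, Resource], observed: Dict[str, str]) -> List[Tuple[str, str]]:
    """Run one full loop and return the actions that were taken."""
    events = monitor(resources, observed)
    actions = plan(resources, analyze(resources, events))
    execute(resources, actions)
    return actions
```

The split mirrors the separation the excerpt describes: `resources` is what is managed, the events produced by `monitor` say why management is required, and `analyze` plus `plan` hold the decision logic.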
Role of Ontologies in Autonomic Systems
Example: eAutomation Correlation Engine
[…] by extending and combining it with the Semantic Web technologies described previously.
Specifically, we will now focus on how ontologies may advance autonomic-computing solutions. We use the eAutomation correlation engine [24] as an example throughout this section. Note that similar strategies can be applied for other engines as well. The approach can be summarized in the following steps:
1. The model of the eAutomation engine is first transformed into the eAutomation ontology.
2. Hidden (hard-coded) knowledge embedded in the eAutomation engine is translated into a set of rules in the corresponding ontology and is used in typical inferencing tasks.
In the rest of this section we describe the model of the eAutomation engine and show how it can be improved by translating the model of the eAutomation engine into the eAutomation ontology.
The eAutomation engine. Any correlation engine has to provide answers to three questions: (1) what to manage, (2) why management is required, and (3) how to manage. Hence, for the rest of this subsection we elaborate the resource, event, and rule models used in the eAutomation engine.
Resources. One of the core foundations of the eAutomation engine is the abstract representation of any type of IT resource, with the ultimate goal of availability management. This abstract resource representation is depicted in Figure 4. (Note that we will not discuss all elements of the abstract resource model explicitly in this paper.) Each resource has a unique name that is represented through the Name attribute. The most important attributes of a resource are those related to the operational state of a resource. Whereas the Current_Operational_State attribute describes the current availability state of a resource, which is typically monitored and whose change is represented as an event, the Desired_Operational_State attribute is used to specify the desired state of a resource. This state is typically set by an administrator or is part of a policy reflecting overall goals. The Compound_State attribute indicates the state of a resource in the context of other resources together with the composition of the states of these other resources, thus providing a view of the overall situation. This state is computed based on internal aggregation; a correlation rule is used to derive an overall state. The set of possible values of the state attributes is predefined. For example, if the resource is not started, the value of the Current_Operational_State attribute is Offline. The Online value means that the resource is ready for work, whereas Pending Online means that the resource has been started, but is not yet ready for work. There are several operational states indicating problems. For example, Failed Offline signifies that a resource is broken and cannot be used.
The eAutomation resource model allows resources to be grouped in resource groups, a process known as composition. A resource group is itself an eAutomation resource which contains a collection of resources that are handled as one logical entity. Different types of groups can be supported by implementing different state aggregation rules. These rules are used to derive the observed state of the group resource. Because a resource group is just another resource, it can in turn be a member of another resource group. Internally, we build an isMemberOf relationship tree.
An equivalency is a collection of resources that provide the same functionality. An equivalency consists of a set of resources from the same class. For example, network adapters might be defined as members of an equivalency. If one network adapter fails, […]
Figure 4 Abstract resource model
[Figure: eAutomation Resource, with the labels Name (unique key), Compound_State, Current_Operational_State, Desired_Operational_State, Number_of_Instances, Number_of_Retries, Restart_Interval, RequestOnline, RequestOffline, Request_Modify, ResetFromBroken, Include/Exclude Location; Generic Resource States; Resource Relationship: operation sequence, placement constraints, resource compositions]
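The resource model and the idea of composition can be sketched in Python as below. The state names (Offline, Online, Pending Online, Failed Offline) come from the excerpt, but the class layout and, above all, the aggregation rule inside `ResourceGroup` are assumptions: the paper says that different group types use different aggregation rules without giving one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    name: str
    current_operational_state: str = "Offline"
    desired_operational_state: str = "Online"

    def observed_state(self) -> str:
        return self.current_operational_state


@dataclass
class ResourceGroup(Resource):
    """A group is itself a resource, so groups can be nested."""

    members: List[Resource] = field(default_factory=list)

    def observed_state(self) -> str:
        # Hypothetical aggregation rule: the worst member state wins.
        states = [m.observed_state() for m in self.members]
        if "Failed Offline" in states:
            return "Failed Offline"
        if states and all(s == "Online" for s in states):
            return "Online"
        if any(s in ("Online", "Pending Online") for s in states):
            return "Pending Online"
        return "Offline"
```

Because `ResourceGroup` inherits from `Resource`, a group can appear in another group's `members` list, which is the nesting behind the isMemberOf tree mentioned in the text.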
Role of Ontologies in Autonomic Systems
Example: eAutomation Correlation Engine
● Relationships between resources:
⇒ Start/stop relationships
⇒ Location relationships
● Events: changes in resource state.
● Rules:
⇒ Simple correlation rules
⇒ Relationship correlation rules
⇒ Request propagation rules
Role of Ontologies in Autonomic Systems
Example: eAutomation Ontology
The eAutomation ontology. The model of the eAutomation engine presented in the previous section was then used as the basis for defining the eAutomation ontology. Our goal was to model all information that exists in the eAutomation model, including any implicit knowledge. Here we briefly describe the process of enriching the eAutomation model into the eAutomation ontology.
Figure 5 shows a part of the eAutomation ontology. The most important concept in this ontology is the concept Resource, corresponding to the abstract resource in the eAutomation model shown in Figure 4. Each ontology concept is described with a set of
properties that can be either attributes or relations. The concept Resource contains only the one attribute name; the value for this property can be an arbitrary string (e.g., Resource123). All other properties defined for the concept Resource are relationships. For example, the current state of a resource is modeled as the property currentState (the range of this property is the concept CurrentState) because the set of possible values is known in advance. In this way the correctness of the concrete model is improved significantly because only instances of the concept CurrentState can be used to specify the value of the current resource state. Note that domains and ranges simply specify schema constraints that must be satisfied for the property to be instantiated. They do not infer new facts, but rather guide the user in constructing the ontology by determining what can and cannot be explicitly stated.
Figure 5 Part of the eAutomation ontology
[Figure: concepts Resource, FixedResource, FloatingResource, ResourceType, Node, State, CompoundState, OperationalState, DesiredState, CurrentState; properties Name, currentState, desiredState, allowedNode, includes, isEquivalent, hasResourceType, locationDependency, startStopDependency, startAfter, dependsOn, dependsOnAny, collocated, anticollocated, affinity, antiaffinity, isStartable]
In order to define the state of a resource in a unique way, the concept State is organized into the concept hierarchy shown in Figure 6. The concept State contains two subconcepts. The first subconcept CompoundState has the instances good, bad, and unknown. These instances represent the concrete values that can be used for the property instances. Note that good is a unique identifier for an instance of the concept CompoundState and is much more than just the string "good." When good is considered as a URI (Uniform Resource Identifier), additional values can be specified according to the structure of the concept CompoundState. Further, lexical entries can be defined. For example, the instance good can be related to the term (string) "gut" in German. The second subconcept OperationalState is divided into the concepts DesiredState and CurrentState. The possible values for the operational state are modeled as instances. The values that are specific only for the current state (e.g., stopping) are represented as instances of that concept.
Several properties of resources are derived from the model of the eAutomation correlation engine. These include startAfter, dependsOn, dependsOnAll, collocated, anticollocated, affinity, antiaffinity, and isStartable among others. Based on their semantics, these properties are organized into two groups: startStopDependencies (used to define a start/stop behavior) and locationDependencies (used for locating resources on nodes). These groups correspond to the relationships that can be defined between resources in the eAutomation model.
Each of these relationships has an implicit meaning that is hard-coded in the program that uses them. Moreover, there are semantic connections between some of these relationships. For example, the collocated relationship is symmetric; the collocated and anticollocated relationships are mutually inverse. This knowledge may be defined formally and explicitly by axioms.
In an ontology there are two types of implicit knowledge: axioms and general rules. Axioms are a standard set of rules, such as the rules for symmetric, transitive, and inverse properties. For example, if A contains B, B contains C, and contains is a transitive property, then the ontology system can infer that A contains C as well. Thus, we do not need to express this information explicitly. General rules are domain-specific rules that are needed to combine and to adapt information available in the ontology. They are used to specify the relationships between ontological entities in the form of rules. For example, if A contains B and B is about C, then it can be concluded that A also is about C.
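The two kinds of implicit knowledge can be illustrated with a small forward-chaining sketch in Python. It is a toy, not an ontology reasoner: facts are plain tuples, the property name `isAbout` is a hypothetical spelling of "is about", and only the symmetric axiom, the transitive axiom, and the one general rule quoted above are implemented.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]  # (subject, property, object)

SYMMETRIC = {"collocated"}  # axiom: p(a, b) implies p(b, a)
TRANSITIVE = {"contains"}   # axiom: p(a, b) and p(b, c) imply p(a, c)


def infer(facts: Set[Fact]) -> Set[Fact]:
    """Apply the axioms and the general rule until no new fact appears."""
    closed = set(facts)
    while True:
        new: Set[Fact] = set()
        for s, p, o in closed:
            if p in SYMMETRIC:
                new.add((o, p, s))
            for s2, p2, o2 in closed:
                if o != s2:
                    continue
                if p in TRANSITIVE and p2 == p:
                    new.add((s, p, o2))
                # General (domain-specific) rule:
                # A contains B and B isAbout C  =>  A isAbout C
                if p == "contains" and p2 == "isAbout":
                    new.add((s, "isAbout", o2))
        if new <= closed:
            return closed
        closed |= new
```

Only the facts passed to `infer` have to be stated explicitly; everything else is derived, which is the point the paragraph makes about not needing to express inferred information by hand.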
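Figure 6 The concept "state" in the eAutomation ontology
[Figure: concept hierarchy: State has the subconcepts CompoundState (instances good, bad, unknown) and OperationalState, which is divided into DesiredState and CurrentState; operational state instances shown: offline, online, starting, stopping, startPending, stopPending]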
Role of Ontologies in Autonomic Systems
Benefits
● Implicit inference mechanisms for deducing knowledge.
● Shared understanding of a domain.
● Reuse, extensibility, verification.
● Rigorous situation analysis that humans can understand.
Conclusion
● Software and systems keep evolving.
● Self-managed systems are needed.
● Lower costs for companies.
● Several fields meet: distributed systems, artificial intelligence.
Conclusion
Main players
● HP: Adaptive Enterprise
● Microsoft: Dynamic Systems
● IBM: Autonomic Computing ...
Ongoing work:
● IBM: the existence of links between autonomic systems and quantum systems.
● LAAS: critical autonomous systems (SAC) for robots
Conclusion
Questions?