viewing knowledge management as a case-based reasoning ... · figure 1: the km process. consider...

Viewing Knowledge Management as a Case-Based ReasoningApplication

W. Dubitzky§, A.G. Büchner§, F.J. Azuaje‡

§ Faculty of Informatics, University of Ulster, at Jordanstown, Northern Ireland, UK‡ Northern Ireland Bio-engineering Centre, University of Ulster, at Jordanstown, Northern Ireland, UK

{w.dubitzky, ag.buchner, fj.azuaje}@ulst.ac.uk

AbstractKnowledge management refers to the process of creating,sharing, and reusing of knowledge to improve and supportthe overall business strategy of an organization. Case-basedreasoning is a general problem-solving or decision-makingframework, which revolves around the processes of caseretrieval, reuse, retention, and maintenance. Based on theobvious affinity of the two approaches, we are proposing acase-based model to knowledge management. The keyassumption of this model is that knowledge management canbe viewed as a decision support task. Both the notion ofcase-level information synthesis (knowledge creation) andthe integration of multiple case bases (knowledge sharing)constitute crucial concepts in the proposed model.

IntroductionKnowledge management (KM) can be thought of as abusiness discipline involving the identification, capturing,organization, analysis, and processing of data andinformation to create new knowledge, which is madeavailable to others in order to use and create moreknowledge (Radding et al., 1998). KM solutions constitutean attempt to address the business problem that largeamounts of an organization’s knowledge are not reused.Benefits of successful KM include improved decision-making (faster, more accurate decisions), higheradaptability and flexibility, and prevention of knowledgeloss. The key KM processes are knowledge creation,sharing, and reuse. Traditionally, KM solutions involvetechnologies such as computer networks (e.g., intranets,extranets), storage facilities (e.g., relational databases),capture and collection systems (e.g., documentmanagement systems, groupware), disseminationtechnologies (e.g., data warehouses), and knowledgeprocessing and analysis technologies (e.g., data mining,statistical tools, on-line analytical processing, datavisualisation).

Case-based reasoning (CBR) is a problem-solvingmethod or reasoning model whose core processes revolvearound the retrieval, reuse, and retention of previouslyencountered problem-solving episodes or cases (Lenz et al.1998). CBR technology has been applied to a variety ofanalytic (e.g. classification, prognosis, decision support)and synthetic (e.g., design, configuration, planning)

problem-solving or decision-making tasks.The principal goals of CBR and KM are the same: to

capture and reuse experience or knowledge. In contrast toKM, CBR work has put less emphasis on knowledgecreation and sharing aspects; cases are assumed to exist andnormally there is only a single case base. This paper putsforth a case-based KM framework by extending the currentCBR process and explicitly putting knowledge creation andsharing on the same conceptual footing as, for example,retrieval. Key aspects of the proposed case-based KMprocess include the integration or synthesis ofheterogeneous information at the level of individual cases,and the integration or sharing of multiple, autonomous casebases.

Although the paper has a strong positional character,some key aspects of the proposed CBR/KM architecturehave been modelled and tested by the authors in variousstudies and experiments. Given the nature and availablespace, this paper is deliberately very focused. Therefore, anumber of relevant aspects (e.g., knowledge as humanresource, properties of knowledge and data, etc.) areignored.

The remainder of the paper is organised as follows: Toachieve some degree of self-containment, Section 2recapitulates KM and CBR concepts important for thediscussion. Section 3 is concerned with presenting thevarious aspects and the rationale for the proposed CBR/KMframework. Finally, Section 4 concludes the paper with abrief summary and some remarks.

KM and CBR: Setting the Scene

Knowledge ManagementThe business problem KM is designed to solve is that theknowledge in an organization, which has been gainedthrough experience, is not reused because it is not sharedand made available to others in the organization in a formalway (Radding 1998). In order to address this problem, KMresearchers and practitioners have devised variousapproaches that revolve around the basic knowledgeprocesses of creating, sharing, and reusing knowledge.

In the basic KM process, we distinguish between

From: AAAI Technical Report WS-99-10. Compilation copyright © 1999, AAAI (www.aaai.org). All rights reserved.

knowledge creation, and knowledge dissemination andreuse. The basis for the KM process is manifest in theorganization’s disparate data, information, and knowledgesources, information systems, and information-orientedbusiness processes. Up to 90% of the data held in thesesources is unstructured information (e.g., text documents,graphics, images), and the rest is in structured format (e.g,databases).

Figure 1: The KM process.

Consider Figure 1. The first step of the KM process isto filter, validate, select, and aggregate relevant data fromdisparate sources and organize and store it in some kind ofinformation source, data warehouse or staging area. Instep 2, the cleansed and pre-processed information isfurther analyzed, summarized, abstracted, and synthesizedusing powerful statistical and data mining techniques. Theoutput of this step is considered knowledge, it is stored insome knowledge source (e.g., database, data warehouse).An example for a piece of knowledge created in step 2 isthe business rule “Ignore all insurance claims of less than$750”. Step 3 makes use of technologies such asgroupware, computer networks, databases, and others, toshare the newly created knowledge among variousknowledge seekers within the organization. The sharedknowledge can than be reused and applied to businessproblems, and the lessons learned from using thatknowledge can be recorded, either as new knowledge or asraw data.

Some Potential KM IssuesThe lack of systematically integrating the knowledgecreated in the KM process is arguably one of the moresevere shortcomings in current KM and data miningapproaches. At least for rule-formatted knowledge, thisproblem is partly due to the maintenance problem knownfrom classical rule-based or expert systems. A crucialelement for meaningful and efficient reuse of anyknowledge is that the knowledge can be accessed based onthe understanding of what the knowledge seeker needs or

wants to know. It seems that existing KM models are moreconcerned with facilitating physical access, as opposed tointelligent access that is driven by the needs of theknowledge seeker. Consistent knowledge integration andrelevance-driven knowledge access has been at the centre ofCBR research. Moreover, there seems to be scope forimproving the KM process with regard to consistentlyintegrating related heterogeneous data (e.g., graphics, text)into reusable knowledge units. Recent work on second-generation CBR has been addressing issues arising fromloosely structured and heterogeneous cases such as textdocuments (Brüninghaus and Ashley 1999), signals(Azuaje et al. 1999), and others (Gebhardt 1997).

Case-Based ReasoningCBR refers to a reasoning model or problem-solvingprocess that revolves around accessing and reusingpreviously encountered problem-solving episodes (cases)(Lenz et al. 1998). The notion of a problem in this contextdescribes a very general concept ranging from simple tocomplex analytic (e.g., diagnosis) as well as synthetic (e.g.,design) tasks. Including the design-time case base creationphase, the CBR process could be described by the sub-processes case retrieval, case reuse, and case retention.

Figure 2: The CBR process.

Consider Figure 2. The first step in the CBR process issimilar to the knowledge creation phase in the KM process,it revolves around constructing the actual case base.Typically, this part of the process is concerned withidentifying and defining the content structure of the actualcases, a retrieval model (similarity measure, indexingstructure), rules for solution transformation and caseretention. Furthermore, a set of suitable seed cases must beselected in this phase. Conventional case base design ischaracterized by a considerable knowledge engineeringeffort involving subject matter specialists and knowledgeengineers. Once the design is completed, the CBR processis largely governed by the processes case retrieval, reuse,and retention, and case base maintenance. Given the

description of a new problem-solving situation, the caseretrieval process locates those cases in the case base that aremost relevant to the problem at hand. Relevance isnormally approximated using a similarity measure(computational approach) or an indexing model(representational approach) (Liao et al. 1998). Based onthe retrieved cases, the actual reuse or reasoning processtakes place, it is concerned with adapting (the solution of)the past case to the problem at hand. This process mayinvolve simple modification rules or more complexinferences. Case retention refers to integrating the newproblem-solving episode or case into the case library, and,possibly, updating of other knowledge containers (e.g.,indexing structures, adaptation rules). The maintenanceprocess encompasses consolidation of the case base’sknowledge containers (Leake and Wilson 1998). Two keyissues faced by this process are the utility problem — whichhas to do with the trade-off of the system’s performance(time complexity) and reasoning competence (Smyth andKeane 1995) — and concerns about the consistency andaccuracy of the case knowledge (e.g., obsolete cases need tobe updated or removed).

The KM/CBR Synergy FrameworkAmong other aspects, such as the prevention of knowledgeloss, KM solutions could essentially be considered asdecision support systems whose knowledge content iscomposed from various distributed and heterogeneous datasources, and made available to decision-makers accordingto the KM process depicted in Figure 1. Although thenotion defies precise definition, decision support generallyrefers to data and information access facilities targeted toprovide the information needed by (business) decision-makers. Decision support can be as simple as providingfront-line managers periodically with reports onproduction, or it can be as complex as extensive modellingof prospective customers using sophisticated techniques likeBayesian belief networks. Decision support systems areoften used in domains where terminology is highlyambiguous, context plays a strong role, or existing domainknowledge is too incomplete or weak for stronger problem-solving methods to be applied. Decision support belongs tothe broad category of analytic tasks (Lenz et al. 1998).Analytic tasks involve the analysis and interpretation of agiven situation, and the drawing of inferences based on theresults of the analyses and interpretations. However, incontrast to more specific analytic tasks, such asclassification and diagnosis, decision support is of a moregeneral nature. It is often not obvious, which parts of thesituation describe the problem and which the solution, oreven to decide if a solution proposed by the system iscorrect or not, or more useful than another solution or not.

CBR systems have been successfully applied to bothanalytic as well as synthetic problem tasks. Thus, CBR canfulfil the decision-support role of a KM system.

Viewing the KM run-time as a decision-support CBRsystem (see Figure 3) has the advantage that powerful CBRmechanisms, including relevance-based retrieval, casereuse, and learning, could be used for building KMsystems. However, the question of how the KM sub-processes of (initial) knowledge creation and sharing arehandled within case-based KM framework needs to bediscussed. (The items labelled in bold in the diagramindicate KM elements introduced from the CBRframework).

Figure 3: Case-based KM process.

The standard KM knowledge creation process ischaracterized by analysis and abstraction. In contrast,knowledge creation in the proposed case-based KM willfocus on synthesis and integration. Given the diverse typesof information sources in the organization, integration atcase level will involve representing case content that is ofheterogeneous nature, and effective handling (retrieval,reuse) of such cases. Sharing in the standard KM processis often achieved by simply providing “physical” access tovarious data warehouses and data marts. In the proposedcase-based KM model, sharing will require relevance-driven access to a set of heterogeneous case bases based onthe needs of the knowledge seeker.

Integration: Heterogeneous CasesAt the case level, the case-based KM approach will requirethe selection, validation, filtering, and organization ofinformation that will be used to describe case contentinformation. This information will be held in one or moreCBR warehouses (see Figure 3). A CBR warehouse issimilar to its conventional counterpart except that itscontent is geared towards construction of case bases. EachCBR warehouse will eventually be associated with a more

or less autonomous case base or case library — calleddepartmental case base. This sub-process will requireanalytic as well as synthetic tasks. It will involve decisionson what constitutes a case, what vocabulary should be usedto describe the case content, what model will be suitable todescribe relevance, and so on. Because graphics, text, andother types of information will be crucial in describing thecontent of individual cases, this process will draw upon thelessons learned with more recent second-generation CBR(see Section 2.1.1) and data fusion approaches. Data fusion

refers to the acquisition, processing, and merging ofinformation originating from multiple sources to provide abetter insight and understanding of the phenomena underconsideration (Arabnia and Zhu 1998). In (Azuaje et al.1999) we have proposed a case-level integration model, thatcombines signal (ECG signals) case data with structuredpatient records (Figure 4b). An index knowledge discoverymodel, based on growing cell structures neural networks(Fritzke 1996), was used in these experiments toautomatically generate the indexing-based retrieval model.

Figure 4: Case-level and inter-case-base integration (IKD = index knowledge discovery)

Integration: Heterogeneous Case BasesWithin the proposed case-based KM model, knowledgesharing will be viewed as the provision and integration ofrelevance-driven access to a set of autonomous,heterogeneous case bases (departmental case bases).Simply speaking, the integration at this level will facilitatethe retrieval of relevant cases from the set of underlyingindividual case bases without requiring the user of thisknowledge pool to possess detailed information on thevarious cases bases (meta data, case base location, etc.).Instead, the user should only be required to have someknowledge of the vocabulary relevant to his or her area ofinterest or expertise. To enable a coherent retrievalmechanism across a set of loosely coupled or independentcase bases, a mechanism must be provided that is able tohandle interoperability conflicts arising from schematic(syntactical description of case content) and semantic(knowledge content of case) heterogeneity across differentcase bases. Focusing on semantic heterogeneity (the morecomplex of the two types), we propose a model based on(Kashyap and Sheth 1998). This model achieves semanticinteroperability using terminological relationships betweenterms across ontologies. This approach is based on theassumption that terminological ontologies are used torepresent the knowledge content of cases across the entireset of case bases. The basic architecture for integratingmultiple departmental case bases is depicted in Figure 5.

Figure 5: Shared access to multiple case bases (adapted from (Kashyapand A. Sheth 1998)).

The problem of semantic heterogeneity is theidentification of semantically related objects (cases) indifferent case bases and the resolution of discrepanciesamong them. The key to interoperability is vocabularysharing among the ontological terms and case contentdescriptions associated with the various case bases(highlighted elements in Figure 5). The ontology levelrepresents domain and application-specific terminologies.To integrate the terms into a single vocabulary, mapping-based mediation is necessary. Appropriate relationshipsthat allow inter-ontology interoperation are synonyms,hyponyms, and hypernyms (Kashyap and Sheth 1998). Wehave extended their approach with an alternativerelationship and applied the concepts in the context ofmultiple manufacturing ontologies (Büchner et al. 1999).The content level represents intensional descriptions of the

meaning of multiple case bases, each of which uses its ownapplication-specific ontological terminology. In order tocreate an intensional view that contains the contents of allcase bases participating in the case base federation,mediation is required, which is usually relying on Booleancombinations of content descriptions. The case base levelholds extensional descriptions of each participating casebase instance. The virtual synergy of all loosely joined casebases results in the extensional view. Explicit mediation isnot necessary, since it is implicitly achieved through thenegotiation mechanism at the content and ontology levels.Within a software design reuse scenario, we havedemonstrated that the terminological ontology approach isuseful to model cross-domain remindings (Lester et al.1999). The basic concept used to bridge domainboundaries is the synonym relation. In (Azuaje et al. 1999)we have combined the knowledge contained within twodifferently-formatted case bases at the retrieval-result level(Figure 4c).

The Need for Automation and ToolsClearly, there is a need for appropriate tools andautomation techniques that facilitate the construction andintegration of cases with heterogeneous content; the designand management of suitable retrieval and maintenancemodels; the management and maintenance of multiple casebases. Representing multi-format case content willcertainly require tools that support the labelling andannotation of the knowledge contained in cases. Machinelearning techniques and tools such as genetic algorithmsand neural networks could turn out to be crucial forautomatic or semi-automatic construction of suitablevocabularies, indexing (Azuaje et al. 1999), similarity(Dubitzky et al. 1999), adaptation (Hanney and Keane1997), and other knowledge containers.

Brief Summary and Some RemarksViewing KM as a decision support exercise, the paperoutlines some aspects of a possible approach to KM throughCBR. It proposes the synthesis of disparate andheterogeneous information sources into cases as a keyelement of the knowledge creation process. Further, itsuggests that effective knowledge sharing could be achievedvia relevance-driven access to multiple case bases on thebasis of ontology-based mediation models. The underlyingcharacter of the paper is deliberately positional, its mainintention is to stimulate discussion. Having adopted aCBR-biased approach, we are particularly interested incomments, criticism, and feedback from the KMcommunity.

ReferencesArabnia, H., and Zhu D. (eds.), 1998. In Proc. of the Int’l.Conf. on Multisource-Multisensor Information Fusion,(FUSION-98), vol. I and II. CSREA PRESS.Azuaje, F., Dubitzky, W., Black, N., Adamson, K. 1999.Improving Clinical Decision Support Through Case-BasedFusion. To appear in IEEE Transactions on BiomedicalEngineering, Special Issue on Biomedical Data Fusion.Brüninghaus, S., and Ashley, K.D. 1999. Bootstrappingcase base development with annotated case summaries. Toappear in ICCBR-99.Büchner, A.G., Ranta, R., Hughes, J.G, Mäntylä, M. 1999.Semantic Information Mediation among Multiple ProductOntologies. To appear in Proc. World Conf. on IntegratedDesign & Process Technology.Dubitzky, W., Azuaje, F., Lopes, P., Black, N., McCullagh,P., and Song, Y. 1999. On Local and Global FeatureWeight Discovery for CBR. In Proc. Int’l. Conf. onComputers and their Applications, 107-110.Fritzke, B. Growing Self-organising Networks — Why?1996. In European Symposium on ANNs, 61-72.Gebhardt, F., 1997. Survey on structure-based caseretrieval. In The Knowledge Engineering Review, 41-58.Hanney, K. and Keane M.T. 1997. The AdaptationKnowledge Bottleneck: How to Ease it by Learning fromCases. In Proc. 2nd ICCBR-97, 359-370.Kashyap, V. and Sheth, A. 1998. Semantic Heterogeneityin Global Information Systems: The Role of Metadata,Context, and Ontologies. In M.P. Papazoglou and G.Schlageter (eds.), Cooperative Information Systems: Trendsand Directions, Academic Press, 139-178.Leake D.B. and Wilson D.C., 1998. Categorizing CaseBase Maintenance: Dimensions and Directions. In Proc.4th EWCBR-98, 196-207.Lenz, M., Bartsch-Spörl, B., Burkhard, H-D., and Wess S.(eds) 1998. CBR Technology: From Foundations toApplications, Springer-Verlag.Lester, N.G., Wilkie, F.G., Dubitzky, W., and Bustard,D.W. 1999. A Knowledge-Guided Retrieval Framework forReusable Software Artefacts. 1999. Submitted to Int’l.Information Reuse and Integration Conf.Liao, T.W., Zhang, Z., and Mount, C.R. 1998. SimilarityMeasures for Retrieval in Case-Based Reasoning Systems.In Applied Artificial Intelligence, 12: 267-288, 1998.Radding A. (1st ed.). 1998. Knowledge Management:Succeeding in the Information-based Global Economy. InReport of Computer Technology Research Corp.Smyth, B., Keane, M.T. 1995. Remembering to Forget: ACompetence-Preserving Case Deletion Policy for CBRSystems. In Proc. IJCAI-95, 337-382.

viewing knowledge management as a case-based reasoning ... · figure 1: the km process. consider...

Documents