curation of information and knowledge

Curation of Information & Knowledge © 2011 Jorn Bettin http://commons.wikimedia.org/wiki/ File:Wentletrap_001.jpg Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)

Upload: jorn-bettin

Post on 22-Apr-2015


TRANSCRIPT

Page 1: Curation of information and knowledge

Curation of Information & Knowledge

© 2011 Jorn Bettin

http://commons.wikimedia.org/wiki/File:Wentletrap_001.jpg

Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)

Page 2: Curation of information and knowledge

Converting raw data and tacit knowledge into Relevant Information and Explicit Knowledge

Page 4: Curation of information and knowledge

Complexity ...

Page 6: Curation of information and knowledge

Understandability ...

Page 7: Curation of information and knowledge

Value of Knowledge

The value of knowledge is hard to communicate:

• It’s not tangible

• It’s not raw data

• Much of it is tacit

http://commons.wikimedia.org/wiki/File:Cloud_computing_icon.svg

Page 8: Curation of information and knowledge

Measuring Quality of Information

Relevant dimensions

1. Accuracy

2. Currency

3. Completeness

4. Security

5. Reliability

6. Unambiguity

7. Findability

8. Traceability

9. Simplicity

10. Usability

Page 9: Curation of information and knowledge

Accuracy

Why does it matter?

• Information is used for operational and strategic decision making

• It must be trustworthy

How is it measurable?

• Define acceptable tolerance intervals

How can it be improved?

• Focus on relevant information and eliminate irrelevant information
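As a rough illustration of measuring accuracy against a tolerance interval, the sketch below counts how many reported values fall inside an agreed range. The names `ToleranceInterval` and `accuracy_ratio` and the sample readings are assumptions made for this example; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToleranceInterval:
    """Acceptable range for a reported value (hypothetical helper)."""
    lower: float
    upper: float

    def contains(self, value: float) -> bool:
        return self.lower <= value <= self.upper


def accuracy_ratio(reported: list[float], interval: ToleranceInterval) -> float:
    """Share of reported values that fall inside the tolerance interval."""
    if not reported:
        return 0.0
    within = sum(1 for value in reported if interval.contains(value))
    return within / len(reported)


# Illustrative data: readings that should lie between 95 and 105 units.
stock_interval = ToleranceInterval(lower=95.0, upper=105.0)
readings = [100.0, 97.5, 110.0, 104.9]
print(accuracy_ratio(readings, stock_interval))  # 3 of the 4 readings are in range
```

The interval itself is a business decision; the code only makes an agreed interval checkable.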

Page 10: Curation of information and knowledge

Currency

Why does it matter?

• Information is used for operational and strategic decision making

• It must be timely

How is it measurable?

• Define acceptable temporal delays

How can it be improved?

• Increase the level of automated system integration

• Invest in adequate computing and network infrastructure
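An acceptable temporal delay can be checked mechanically once it has been defined. The following is a minimal sketch, assuming each piece of information carries a last-updated timestamp; the function name `is_current` and the 15-minute limit are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def is_current(last_updated: datetime, now: datetime, max_delay: timedelta) -> bool:
    """True if the information is no older than the acceptable delay."""
    return now - last_updated <= max_delay


# Illustrative values: a feed that may lag by at most 15 minutes.
max_delay = timedelta(minutes=15)
now = datetime(2011, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2011, 1, 1, 11, 50, tzinfo=timezone.utc)  # 10 minutes old
stale = datetime(2011, 1, 1, 11, 30, tzinfo=timezone.utc)  # 30 minutes old

print(is_current(fresh, now, max_delay))
print(is_current(stale, now, max_delay))
```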

Page 11: Curation of information and knowledge

Completeness

Why does it matter?

• Information is used for operational and strategic decision making

• It must be sufficiently free of gaps

How is it measurable?

• Specify the sources of each piece of information

• Distinguish between mandatory and optional information for decision making

How can it be improved?

• Focus on relevant information and eliminate irrelevant information
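The distinction between mandatory and optional information can be turned into a simple completeness report. This is a sketch under assumed field names (`customer_id`, `amount`, and so on); the slides do not prescribe any particular record structure.

```python
def completeness(record: dict, mandatory: set[str], optional: set[str]) -> dict:
    """Report missing mandatory fields and the share of optional fields present."""

    def present(field_name: str) -> bool:
        # A field counts as present only if it holds a non-empty value.
        return record.get(field_name) not in (None, "")

    missing = sorted(name for name in mandatory if not present(name))
    filled_optional = sum(1 for name in optional if present(name))
    return {
        "usable": not missing,
        "missing_mandatory": missing,
        "optional_coverage": filled_optional / len(optional) if optional else 1.0,
    }


# Illustrative record with one mandatory gap and no optional data.
order = {"customer_id": "C-17", "amount": 120.0, "delivery_note": ""}
report = completeness(
    order,
    mandatory={"customer_id", "amount", "currency"},
    optional={"delivery_note", "sales_rep"},
)
print(report)
```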

Page 12: Curation of information and knowledge

Security

Why does it matter?

• To enforce information ownership

• To ensure compliance with privacy legislation

• To prevent theft of information

How is it measurable?

• Strength of authentication mechanisms

• Strength of encryption mechanisms

• Level of alignment between role based access control and job descriptions

How can it be improved?

• Introduce stronger authentication and encryption

• Remove ambiguities from job descriptions

Page 13: Curation of information and knowledge

Reliability

Why does it matter?

• To avoid outages

• To prevent disasters

How is it measurable?

• Define the acceptable minimum availability of each information source

How can it be improved?

• Use software designs that tolerate temporary outages of required/external services

• Invest in system and data centre replication technology
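A minimum availability target becomes measurable once uptime and downtime are recorded per information source. The sketch below assumes availability is computed as uptime divided by the observation period; the function names and the figures are illustrative only.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of the observation period during which the source was available."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("observation period must be positive")
    return uptime_hours / total


def meets_target(observed: float, minimum: float) -> bool:
    """Compare observed availability with the agreed minimum."""
    return observed >= minimum


# Illustrative figures: a 30-day month (720 hours) with 3 hours of outage.
observed = availability(uptime_hours=717.0, downtime_hours=3.0)
print(round(observed, 4))
print(meets_target(observed, minimum=0.995))
print(meets_target(observed, minimum=0.999))
```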

Page 14: Curation of information and knowledge

Unambiguity

Why does it matter?

• To minimise communication errors

• To prevent wrong decisions

• To prevent disasters

How is it measurable?

• Count the homonyms in each role-specific context

How can it be improved?

• Establish a comprehensive registry of concepts

• Use concept names that are tailored to the role-specific context

• Use semantic identities instead of names when communicating information
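Counting homonyms per role-specific context can be sketched as follows, assuming a homonym is a name that refers to more than one concept within the same context. The tuple layout `(context, name, concept_id)` and the sample identifiers are assumptions for illustration.

```python
from collections import defaultdict


def count_homonyms(usages: list[tuple[str, str, str]]) -> dict[str, int]:
    """Count names that denote more than one concept within a single context.

    Each usage is a (context, name, concept_id) tuple.
    """
    concepts_by_name = defaultdict(set)
    for context, name, concept_id in usages:
        concepts_by_name[(context, name.lower())].add(concept_id)

    counts: dict[str, int] = defaultdict(int)
    for (context, _name), concepts in concepts_by_name.items():
        if len(concepts) > 1:
            counts[context] += 1
    return dict(counts)


# Illustrative usages: "account" means two different things within sales.
usages = [
    ("sales", "Account", "concept-customer-account"),
    ("sales", "account", "concept-ledger-account"),
    ("sales", "Order", "concept-sales-order"),
    ("finance", "Account", "concept-ledger-account"),
]
print(count_homonyms(usages))
```

The same name used in two different contexts is not counted, since the metric is defined per context.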

Page 15: Curation of information and knowledge

Findability

Why does it matter?

• To enable staff to find relevant information

• To speed up decision making

• To prevent disasters

How is it measurable?

• Count how often staff need to talk to colleagues to find information that is stored in an information system

How can it be improved?

• Provide advanced support for queries

• Make the query engine aware of the role-specific context

• Allow query by information category, by container, by name, and by semantic identity

Page 16: Curation of information and knowledge

Traceability

Why does it matter?

• To speed up root cause analysis of errors

• To speed up the learning curve for new staff

• To meet legal & regulatory compliance needs

How is it measurable?

• Count how often staff need to talk to colleagues or need to resort to ad-hoc search for tracing the source of an error

How can it be improved?

• Consistent use of information categories and containers

• Automatic tagging of information with temporal & spatial meta data

• Adherence to retention constraints

Page 17: Curation of information and knowledge

Simplicity

Why does it matter?

• To accommodate human cognitive limits

• To prevent wrong decisions

• To prevent disasters

How is it measurable?

• Collect artefact complexity metrics

How can it be improved?

• Intuitive representations that are developed in collaboration with domain experts

• As needed, role-specific representations

• Provide an explicit modularisation mechanism for all artefacts

Page 18: Curation of information and knowledge

Usability

Why does it matter?

• Intuitive user/system interaction

• Device independent information access

• To discourage use of non-compliant tools

How is it measurable?

• Validation by average users

How can it be improved?

• Consistency of representations across devices

• Use of high-quality icons that are developed in collaboration with domain experts

• Ensure adequate reliability

Page 20: Curation of information and knowledge

Knowledge Repositories

Page 21: Curation of information and knowledge

Examples

A language artefact is a non-hardware artefact

• information content of pheromones

• information content of body language

• live music

• live speech

• information content in traditional symbolic notations

• program/diagram/hypertext/database content

• information content of recorded sound/pictures/videos

• information content of genetic material

http://commons.wikimedia.org/wiki/File:Photo_with_histogram.JPG


Page 22: Curation of information and knowledge

Definition

A language artefact

• is a container of information

• is instantiated by a specific actor (human or system)

• is consumed by at least one actor (human or system)

• represents a natural unit of work (for the instantiating & consuming actors)

• may contain links to other artefacts

• has a state and a lifecycle
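The properties of a language artefact listed in the definition can be mirrored in a small data structure. This is only a sketch: the class name, the three states, and the allowed transitions are assumptions chosen for illustration, not part of the definition.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"
    RETIRED = "retired"


# Hypothetical lifecycle: which states may follow which.
LIFECYCLE = {
    State.DRAFT: {State.PUBLISHED},
    State.PUBLISHED: {State.RETIRED},
    State.RETIRED: set(),
}


@dataclass
class LanguageArtefact:
    content: str                      # the information it contains
    producer: str                     # the instantiating actor (human or system)
    consumers: list[str]              # at least one consuming actor
    links: list["LanguageArtefact"] = field(default_factory=list)
    state: State = State.DRAFT

    def __post_init__(self) -> None:
        if not self.consumers:
            raise ValueError("an artefact is consumed by at least one actor")

    def advance(self, new_state: State) -> None:
        """Move along the lifecycle, rejecting transitions it does not allow."""
        if new_state not in LIFECYCLE[self.state]:
            raise ValueError(f"cannot move from {self.state.value} to {new_state.value}")
        self.state = new_state


glossary = LanguageArtefact("term definitions", producer="domain expert", consumers=["analyst"])
report = LanguageArtefact(
    "monthly report", producer="reporting system", consumers=["manager"], links=[glossary]
)
report.advance(State.PUBLISHED)
print(report.state.value, len(report.links))
```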

Page 23: Curation of information and knowledge

Communication

Page 24: Curation of information and knowledge

Definition

Software is an arbitrary set of language artefacts


Page 25: Curation of information and knowledge

Software Producers

software systems & other humans

software developers


Page 26: Curation of information and knowledge

1st-Level Categorisation

operational data

meta data

meta data


Page 27: Curation of information and knowledge

Definitions

Data, Information, Knowledge

• uncategorised data has very little value

• categorised data is valuable information

• information combined with an understanding of its usage context is valuable knowledge

the categories (= meta data) must be relevant to the organisation


Page 28: Curation of information and knowledge

Value Chain

[Figure: roles A, B, C, D, E, and F linked by produce and consume relationships]

Selection criteria for a metadata repository

• Adequate support for CR compatible versioning, branching, locking requirements

• Support for interfaces with current commercial products (e.g. ERWin)

• Metamodelling capability and ideally an extensible metametamodel

• Support for development of adapters

• Adequate support for generalisation/specialisation

• Support for multiple terminologies/jargons

• Integration with open source template/transformation languages

• RDBMS datastore binding (to support referential integrity)

• Support for information ownership

• Adequate support for role based access control

Page 29: Curation of information and knowledge

Elements of knowledge acquisition

• Collaboration

• Exploration

• Observation

• Validation

• Abstraction

• Modularisation

• Representation

Learning

Page 30: Curation of information and knowledge

Collaboration

“We are smarter than me”

Jean-Marie Favre, Software Anthropologist

Page 31: Curation of information and knowledge

Exploration

Raw data acquired by exploration is essential for understanding an unknown domain

• Data can be analysed and categorised

• Lack of data only leads to speculation

Page 32: Curation of information and knowledge

Observation

Connecting the dots – building a mental model

• Associating information with time, space, and other attributes of origin

• Noticing possible associations between different pieces of information

http://commons.wikimedia.org/wiki/File:Knowledge,_observation_and_reality.svg

Tacit

Page 33: Curation of information and knowledge

Validation

Confirming observations

• Using the scientific method

• By comparing with observations from others

• By involving domain experts from related disciplines

• Remember: we are smarter than me!

Page 34: Curation of information and knowledge

Abstraction

Look for Commonalities

• Avoid repetition

• Identify patterns

• Remember: KISS!

Photographer: Kurt Salzmann, www.salzmaenner.ch

Page 35: Curation of information and knowledge

Modularisation

http://commons.wikimedia.org/wiki/File:Modular_origami.jpg

Modules preserve Simplicity

• Rely on role-based separation of concerns

• Modules must correspond to a natural unit of work

• Roles and modular artefacts represent the building blocks of value chains

• Optimise within the organisational context of customers, suppliers, and available skills

Page 36: Curation of information and knowledge

Representation

Modelling is about clarity

• Balancing act between simplicity and not compromising the desired intent

• Focus is on human cognitive abilities & limits

• As needed use multiple syntax elements (visual containers, symbols, text, mathematical expressions)

• Borrow syntax from established languages, or design syntax in close collaboration with the user community

Page 37: Curation of information and knowledge

Code

All models are code

a system of symbols used for

• identification

• classification in the sense of grouping

a system of signals used to send messages

a set of conventions governing behaviour

Modelling is meta coding to improve clarity of code

Page 38: Curation of information and knowledge

Examples

Class : Mammal

dateOfBirth

Class : Dog

isPoliceDog

Class : Cat

Dog : Jack{1/5/03, yes}

Dog : Susie{1/2/00, no}

Cat : Coco{4/3/07}

Cat : Peter{10/9/98}

[*]

[2]

[*]

[2]

http://commons.wikimedia.org/wiki/
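The class and instance example on this slide can be written down as code, which is one way to see that a model is itself code. The sketch below is an assumption about how the diagram maps to classes: the `name` attribute is introduced here to hold the instance labels, and the dates are kept as the strings shown on the slide because the day/month order is not stated.

```python
from dataclasses import dataclass


@dataclass
class Mammal:
    name: str           # instance label from the slide (attribute name is assumed)
    date_of_birth: str  # kept verbatim; the slide does not state the date format


@dataclass
class Dog(Mammal):
    is_police_dog: bool


@dataclass
class Cat(Mammal):
    pass


# The four instances shown on the slide.
jack = Dog("Jack", "1/5/03", True)
susie = Dog("Susie", "1/2/00", False)
coco = Cat("Coco", "4/3/07")
peter = Cat("Peter", "10/9/98")

animals = [jack, susie, coco, peter]
police_dogs = [a.name for a in animals if isinstance(a, Dog) and a.is_police_dog]
print(police_dogs)
```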

Page 39: Curation of information and knowledge

Communication Costs

Not all code is a model

• a system of signals that includes a translation of messages to deal with someone else’s syntax

• a system of symbols used for classification in the sense of obfuscation or encryption

http://commons.wikimedia.org/wiki/File:Encryption_-_decryption.svg

Page 40: Curation of information and knowledge

Software suffers from the same problems as way back when natural language evolved to enrich the exchange between humans

Increasingly the artefacts exchanged between humans are neither hardware nor natural language (encoded in speech or symbolic notation)

All language artefacts share the problems of natural language: unanticipated interpretations

Today

Page 41: Curation of information and knowledge

http://commons.wikimedia.org/wiki/File:Discussion.jpg

Requires collaboration and good will between artefact producers & all consumers

Associating information with its usage context

Respecting the notational and terminological preferences of all parties

Assigning a unique semantic identity to each piece of information (= concept)

Minimising Unanticipated Interpretation

Page 42: Curation of information and knowledge

Semantic Modelling

AC

B

Page 43: Curation of information and knowledge

Semantic Modelling

Semantic Domains

Models

1. Identification of concepts and assignment of semantic identities

2. Modelling

3. Naming of concepts in as many terminologies as required by artefact producers and consumers
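Identifying a concept once and naming it in several terminologies can be sketched with a small registry. Everything here is an illustrative assumption: the class names, the use of UUIDs as semantic identities, and the sample terminologies. The modelling step between identification and naming is not covered.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    # The semantic identity is independent of any name given to the concept.
    semantic_id: uuid.UUID = field(default_factory=uuid.uuid4)
    names: dict[str, str] = field(default_factory=dict)  # terminology -> name


class ConceptRegistry:
    def __init__(self) -> None:
        self._concepts: dict[uuid.UUID, Concept] = {}

    def identify(self) -> Concept:
        """Register a new concept and assign its semantic identity."""
        concept = Concept()
        self._concepts[concept.semantic_id] = concept
        return concept

    def name(self, concept: Concept, terminology: str, name: str) -> None:
        """Name the concept within one terminology."""
        concept.names[terminology] = name

    def lookup(self, terminology: str, name: str) -> Optional[Concept]:
        """Resolve a terminology-specific name back to its concept."""
        for concept in self._concepts.values():
            if concept.names.get(terminology) == name:
                return concept
        return None


registry = ConceptRegistry()
customer = registry.identify()
registry.name(customer, "sales", "Customer")
registry.name(customer, "billing", "Debtor")

# Two different names, one semantic identity.
same = registry.lookup("sales", "Customer") is registry.lookup("billing", "Debtor")
print(same)
```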

Page 44: Curation of information and knowledge

• Based on the mathematics of model theory & denotational semantics

• Constitutes a solid foundation for information engineering & knowledge curation

• Not the same as modelling with the Resource Description Framework (Semantic Web)

• Not the same as classical entity-relationship modelling

• Not the same as object-oriented modelling

Semantic Modelling

Semantic Domains

Models

Page 45: Curation of information and knowledge

• Focuses on the meaning of information in a concrete usage context

• Converts tacit knowledge into explicit knowledge for use by humans and software tools

• The Resource Description Framework only partially implements denotational semantics

• Entity-relationship schemas lack a mechanism for modularity

• Object-oriented models are limited to one level of instantiation

Semantic Modelling

Semantic Domains

Models

Page 46: Curation of information and knowledge

Without delving into the formal mathematical details, the significance of model theory is best appreciated intuitively by considering the following observations:

• Formal linguistics as pioneered by Noam Chomsky in the 1950s and 1960s can be expressed as a special case of model theory.

• The work of model theorists goes back to the beginning of the 20th century, and was motivated by mathematicians who were concerned about potential logical inconsistencies in the mathematical symbol system and the conventions governing its use.

• The resulting research into symbol systems has led to a mathematical theory that can be used to formalise any symbol system, not limited to the languages invented by humans, and including the genetic code.

• The pictures produced on flip charts and white boards constitute domain specific languages as well, and with the help of their authors, sets of pictures can easily be formalised mathematically, using a specialised software tool for semantic modelling.

Model Theory

Page 47: Curation of information and knowledge

Semantic Domains

[Figure: semantic domains A, B, C, D, E, and F]

Page 48: Curation of information and knowledge

Modular Models

Modules preserve Simplicity

• Roles and modular artefacts represent the building blocks of value chains

• Optimise within the organisational context of customers, suppliers, and available skills

[Figure labels: role based, separation of concerns, unit of work; domains A, B, C, D, E, F]


Page 49: Curation of information and knowledge

A B C

D E F

Connected Semantic Domains

Page 50: Curation of information and knowledge


Shared Language

ab

ac

df

de

ad

Page 51: Curation of information and knowledge

ab ac df de ad bc ef cf

Jargon = Words + Symbols

Page 52: Curation of information and knowledge

df

D

View Point

Perspective

Jargon

F

Page 53: Curation of information and knowledge

F

f

View Point

Reflexive Jargon

DSML

F

DSML = Domain Specific Modelling Language

Page 54: Curation of information and knowledge

Jargons develop on top of Shared Semantic Subdomains

A B C

D E F

ab ac

df de

ad

bc

ef

cf
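One possible reading of this slide is that each lowercase pair (ab, ac, ...) names the semantic subdomain shared by two domains. Under that assumption, shared subdomains can be sketched as set intersections. The concept identifiers below are invented for the example; only the domain letters and the pairs come from the slide.

```python
from itertools import combinations

# Pairs of domains that share a subdomain, as listed on the slide.
shared_pairs = ["ab", "ac", "df", "de", "ad", "bc", "ef", "cf"]

# Build each domain as a set of concept identifiers (hypothetical names):
# every pair contributes one concept that both of its domains contain.
domains: dict[str, set[str]] = {letter: set() for letter in "ABCDEF"}
for pair in shared_pairs:
    for letter in pair:
        domains[letter.upper()].add(f"concept-{pair}")


def shared_subdomain(x: str, y: str) -> set[str]:
    """Concepts that two domains have in common."""
    return domains[x] & domains[y]


# Every pair of domains whose intersection is not empty.
overlapping = sorted(
    (x + y).lower()
    for x, y in combinations(sorted(domains), 2)
    if shared_subdomain(x, y)
)
print(overlapping)
```

A jargon shared by two roles would then be a set of names for the concepts in such an intersection.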

Page 55: Curation of information and knowledge

Thank you

Jorn Bettin

+61 424 758 540

Knowledge Reconstruction & Risk Management http://jornbettin.com

Gmodel Team Blog the-software-artefact.blogspot.com

The Role of Artefacts tiny.cc/artefacts

From Muddling to Modelling tiny.cc/muddleToModel

Model Oriented Domain Analysis tiny.cc/domainanalysis

More Information

jbettin @ ibrs.com.au www.ibrs.com.au