
Page 1: CardioToronto.pps

New York State Center of Excellence in Bioinformatics & Life Sciences

R T U

CHSS Data Center Work Weekend

Ontology, Terminology, and Cardiovascular Surgery

Nov 21, 2008 – Toronto, Canada

Werner CEUSTERS, MD Center of Excellence in Bioinformatics and Life Sciences, and National Center for Biomedical Ontology, University at Buffalo, NY, USA

Page 2: CardioToronto.pps

Short personal history

[Timeline, 1959 - 2008: 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004, 2006]

Page 3: CardioToronto.pps

Structure of this presentation

• Data and where they (should) come from

• Realism-based ontology

• Referent Tracking

• How to build ontologies from terminologies

• How to link to patient data

• How disparate views can be accommodated

Page 4: CardioToronto.pps

The central hypothesis

• For disease registries to facilitate meaningful multi-institutional outcomes analysis, there must be:

1. Common language = nomenclature,
2. Mechanism of data collection (database or registry) with an established uniform core data set,
3. Mechanism of evaluating case complexity,
4. Mechanism to ensure and verify data completeness and accuracy,
5. Collaboration between medical subspecialties.

JP Jacobs et al. Nomenclature and Databases - The Past, the Present, and the Future: A Primer for the Congenital Heart Surgeon. Pediatr Cardiol (2007)

Page 5: CardioToronto.pps

Would this do ?

• For disease registries to facilitate meaningful multi-institutional outcomes analysis, there must be:

1. Whatever sort of Common language = nomenclature,
2. Whatever sort of Mechanism of data collection (database or registry) with an established uniform core data set,
3. Whatever sort of Mechanism of evaluating case complexity,
4. Whatever sort of Mechanism to ensure and verify data completeness and accuracy,
5. Whatever sort of Collaboration between medical subspecialties.

?

Page 6: CardioToronto.pps

The answer is clearly …

• … No !

• There are
– many such animals
– of various sorts,
– which all have shortcomings,
– and therefore lead to the creation of even more such animals,
– which finally end up suffering (more or less) from the same flaws.

Page 7: CardioToronto.pps

MeSH 2008: congenital heart defects

All MeSH Categories > Diseases Category > Cardiovascular Diseases > Cardiovascular Abnormalities > Heart Defects, Congenital

All MeSH Categories > Diseases Category > Congenital, Hereditary, and Neonatal Diseases and Abnormalities > Congenital Abnormalities > Cardiovascular Abnormalities > Heart Defects, Congenital

All MeSH Categories > Diseases Category > Cardiovascular Diseases > Heart Diseases > Heart Defects, Congenital

Entries listed in the three trees:

Alagille Syndrome, Aortic Coarctation, Arrhythmogenic RV Dysplasia, Cor Triatriatum, ...

Aortic Coarctation, Arrhythmogenic RV Dysplasia, Cor Triatriatum, ...

Alagille Syndrome, Aortic Coarctation, Arrhythmogenic RV Dysplasia, Cor Triatriatum, ...

?

Page 8: CardioToronto.pps

SNOMED-CT version 2008.01.7AC

Page 9: CardioToronto.pps

SNOMED-CT’s ‘Fallot’s trilogy’ versus ‘Fallot’s triad’

Page 10: CardioToronto.pps

Trilogy of Fallot

• Definition:
– Combination of pulmonary valve stenosis and atrial septal defect with right ventricular hypertrophy.

• Typical representational mistake:
– From (correct, if the definition is right):
  ‘a patient who has Fallot’s triad
  – has a pulmonary valve stenosis,
  – has an atrial septal defect,
  – has a right ventricular hypertrophy.’
– To (wrong, even if the definition is right):
  ‘a Fallot’s triad
  – is a pulmonary valve stenosis,
  – is an atrial septal defect,
  – is a right ventricular hypertrophy.’
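The difference between the two readings can be sketched in a few lines of Python. This is only an illustration: the disjointness assumption (no single disorder is at once a stenosis, a septal defect and a hypertrophy), the `Patient` class and the identifiers `d1` to `d3` are hypothetical and are not taken from SNOMED-CT.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: these three disorder types are mutually disjoint,
# i.e. no single particular can be an instance of more than one of them.
TRIAD_TYPES = {
    "pulmonary valve stenosis",
    "atrial septal defect",
    "right ventricular hypertrophy",
}

def inconsistent(types):
    """True if one particular would instantiate two disjoint types."""
    return len(TRIAD_TYPES & set(types)) > 1

# Wrong reading: 'a Fallot's triad IS A ...' puts one particular under all three types.
triad_is_a = ["pulmonary valve stenosis", "atrial septal defect",
              "right ventricular hypertrophy"]

# Right reading: the patient HAS three distinct disorders, one per type.
@dataclass
class Patient:
    name: str
    disorders: dict = field(default_factory=dict)  # disorder id -> disorder type

    def has_fallots_triad(self):
        return TRIAD_TYPES <= set(self.disorders.values())

patient = Patient("hypothetical patient", {
    "d1": "pulmonary valve stenosis",
    "d2": "atrial septal defect",
    "d3": "right ventricular hypertrophy",
})

print(inconsistent(triad_is_a))       # the is-a reading contradicts disjointness
print(patient.has_fallots_triad())    # the has-a reading is satisfiable
```

Under the is-a reading a reasoner that knows the three types are disjoint has to reject the definition; under the has-a reading the same knowledge causes no conflict.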

Page 11: CardioToronto.pps

In general: some alarming publications

• Why most published research findings are false. Ioannidis JPA (2005). PLoS Med 2(8): e124.
– Institute for Clinical Research and Health Policy Studies, Department of Medicine, Tufts-New England Medical Center, Tufts University School of Medicine, Boston, Massachusetts.

• Why Current Publication Practices May Distort Science. Young NS, Ioannidis JPA, Al-Ubaydli O (2008, October 7) PLoS Med 5(10): e201. doi:10.1371/journal.pmed.0050201.
– Hematology Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland.

Page 12: CardioToronto.pps

Key question:

Why is this ?

Page 13: CardioToronto.pps

‘The spectrum of the Health Sciences’

http://www.uvm.edu/~ccts

Turning data into knowledge

Page 14: CardioToronto.pps

What is missing here ?

http://www.uvm.edu/~ccts

?

Turning data into knowledge

Page 15: CardioToronto.pps

Source of all data

Reality !

Page 16: CardioToronto.pps

Today’s data generation and use

observation & measurement

data organization

model development

use

add

Generic beliefs

verify

further R&D (instrument and study optimization)

application

Δ = outcome

Page 17: CardioToronto.pps

Key components

data

information

knowledge

hypotheses

• Players
• HIT
• Outcomes

generates

generates

generates

influences

representation

reality

about

Page 18: CardioToronto.pps

Current deficiencies

• At the level of reality:
– Desired outcomes different for distinct players
  • Competing interests
– Multitude of HIT applications and paradigms

• At the level of representations:
– Variety of formats
– Silo formation
– Doubtful semantics

• In their interplay:
– Very poor provenance or history keeping
– No formal link with what the data are about
– Low quality

Page 19: CardioToronto.pps

Where should we go?

Page 20: CardioToronto.pps

Ultimate goal (at least mine)

A digital copy of the world

Page 21: CardioToronto.pps

Requirements for this digital copy

• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
  – what is generic → scientific theories
  – what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
  – … allow queries over the world’s past and present,
  – … make predictions,
  – … fill in gaps,
  – … identify mistakes,
  – ...

Page 22: CardioToronto.pps

In fact … the ultimate crystal ball

Page 23: CardioToronto.pps

The ‘binding’ wall

How to do it right ?

A cartoon of the world

Page 24: CardioToronto.pps

Major problems

1. A mismatch between what is - and has been - the case in reality, and representations thereof in:

a) (generic) Knowledge repositories, and

b) (specific) Data and Information repositories.

2. An inadequate integration of a) and b).

Solutions

Philosophical realism

Realism-based Ontology

Referent Tracking

Philosophy

HIT

Page 25: CardioToronto.pps

Realism-based Ontology

Page 26: CardioToronto.pps

‘Ontology’: one word, two meanings

• In philosophy:
– Ontology (no plural) is the study of what entities exist and how they relate to each other;

• In computer science and (biomedical informatics) applications:
– An ontology (plural: ontologies) is a shared and agreed upon conceptualization of a domain;

• Our ‘realist’ view within the Ontology Research Group combines the two:
– We use realism, a specific theory of ontology, as the basis for building high quality ontologies, using reality as benchmark.

Page 27: CardioToronto.pps

Realism-based ontology

• Basic assumptions:
1. reality exists objectively in itself, i.e. independent of the perceptions or beliefs of cognitive beings;
2. reality, including its structure, is accessible to us, and can be discovered through (scientific) research;
3. the quality of an ontology is at least determined by the accuracy with which its structure mimics the pre-existing structure of reality.

Page 28: CardioToronto.pps

However: the dominant view in Comp Sc is conceptualism

Semantic Triangle

concept

object term

Embedded in Terminology

Page 29: CardioToronto.pps

The concept-based view

[Diagram: particulars (P) linked by isa to a class]

Page 30: CardioToronto.pps

The realism-based view

[Diagram: particulars (P) are instance-of a universal (e.g. human), member-of the class that is the extension-of that universal (e.g. all humans), and member-of a defined class (e.g. all humans in this room)]
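The three notions in this view (universal, its extension, a defined class) can be told apart with a small Python sketch. The `Particular` record, the `in_this_room` flag and the three sample entities are assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Particular:
    iui: str            # identifier of the individual entity
    universal: str      # what the particular is an instance of
    in_this_room: bool  # contingent fact, used only by the defined class

# Hypothetical toy world.
world = [
    Particular("#1", "human", True),
    Particular("#2", "human", False),
    Particular("#3", "dog", True),
]

def extension_of(universal, particulars):
    """The class of all particulars that are instances of the universal."""
    return {p.iui for p in particulars if p.universal == universal}

def defined_class(universal, condition, particulars):
    """A class picked out by a universal plus an extra, stipulated condition."""
    return {p.iui for p in particulars
            if p.universal == universal and condition(p)}

all_humans = extension_of("human", world)                              # e.g. all humans
humans_in_room = defined_class("human", lambda p: p.in_this_room, world)

print(sorted(all_humans))
print(sorted(humans_in_room))
```

The universal itself (`"human"`) is not a set; only its extension and the defined class are, and the defined class is always a subset of the extension.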

Page 31: CardioToronto.pps

[Diagram contrasting two views of the same particulars (P).
Ontology: particulars are instance-of a universal (e.g. human), member-of the class that is the extension-of that universal (e.g. all humans), and member-of a defined class (e.g. all humans in this room).
Terminology: particulars are instance-of a class/concept.]

Page 32: CardioToronto.pps

The ‘terminology / ontology divide’

• Terminology:
– solves certain issues related to language use, i.e. with respect to how we talk about entities in reality (if any);
  • Relations between terms / concepts
– does not provide an adequate means to represent, independent of use, what we talk about, i.e. how reality is structured;
  • Women, Fire and Dangerous Things (Lakoff).

• Ontology (of the right sort):
– Language and perception neutral view on reality.
  • Relations between entities in first-order reality

Page 33: CardioToronto.pps

Terminological versus Ontological approach

• The terminologist defines:
– ‘a clinical drug is a pharmaceutical product given to (or taken by) a patient with a therapeutic or diagnostic intent’. (RxNorm)

• The ontologist thinks:
– Does ‘given’ include ‘prescribed’?
– Is ‘manufactured with the intent to …’ not sufficient?
  • Are newly marketed products – available in the pharmacy, but not yet prescribed – not clinical drugs?
  • Are products stolen from a pharmacy not clinical drugs?
  • What about such products taken by persons that are not patients?
    – e.g. children mistaking tablets for candies.

Page 34: CardioToronto.pps

Cardiovascular surgery examples

• Systemic venous anomaly, SVC, Bilateral SVC
• Systemic venous anomaly, SVC, Bilateral SVC, Innominate absent
• Systemic venous anomaly, SVC, Bilateral SVC, Innominate present

• VA valve overriding
• VA valve overriding, Aortic valve
• VA valve overriding, Left sided VA Valve
• VA valve overriding, Pulmonary valve
• VA valve overriding, Right sided VA Valve
• VA valve overriding-modifier for degree of override, Override of VA valve <50%
• VA valve overriding-modifier for degree of override, Override of VA valve >90%
• VA valve overriding-modifier for degree of override, Override of VA valve 50–90%

JP Jacobs et al. The nomenclature, definition and classification of cardiac structures in the setting of heterotaxy. Cardiol Young 2007; 17(Suppl. 2): 1–28

Page 35: CardioToronto.pps

The semantic triangle revisited

concepts

terms

objects

Representation and Reference

First Order Reality

about

terms

concepts

Page 36: CardioToronto.pps

Terminology Realist Ontology

Representation and Reference

First Order Reality

about

representational units

universals

particulars

objects

terms

concepts

Page 37: CardioToronto.pps

Terminology Realist Ontology

Representation and Reference

First Order Reality

about

representational units

universals

particulars

objects

terms

concepts

Page 38: CardioToronto.pps

Terminology Realist Ontology

Representation and Reference

First Order Reality

about

universals

particulars

objects

terms

concepts

cognitive units

communicative units

representational units

Page 39: CardioToronto.pps

Terminology Realist Ontology

Representation and Reference

First Order Reality

universals particulars

cognitive units

representational units

(1) Entities with objective existence which are not about anything

(2) Cognitive entities which are our beliefs about (1)

communicative units

(3) Representational units in various forms about (1), (2) or (3)

Three levels of reality in Realist Ontology

Page 40: CardioToronto.pps

The three levels in medical practice

1. First-order reality
   Generic: MOLECULE, PERSON, DISEASE, PATHOLOGICAL STRUCTURE, BLOOD PRESSURE, DRUG
   Specific: me, my blood pressure, my ASD, my doctor, my doctor’s computer

2. Beliefs (knowledge)
   Generic: DIAGNOSIS, INDICATION
   Specific: my doctor’s work plan, my doctor’s diagnosis

3. Representation
   ‘atrial septal defect’, ‘W. Ceusters’, ‘my heart defect’

Page 41: CardioToronto.pps

Terminology is too reductionist

What concepts do we need?

How do we name concepts properly?

Page 42: CardioToronto.pps

The power of realism in ontology design

Reality as benchmark !

1. Is the scientific ‘state of the art’ consistent with biomedical reality ?

Page 43: CardioToronto.pps

The power of realism in ontology design

Reality as benchmark !

2. Is my doctor’s knowledge up to date?

Page 44: CardioToronto.pps

The power of realism in ontology design

Reality as benchmark !

3. Does my doctor have an accurate assessment of my health status?

Page 45: CardioToronto.pps

The power of realism in ontology design

Reality as benchmark !

4. Is our terminology rich enough to communicate about all three levels?

Page 46: CardioToronto.pps

The power of realism in ontology design

Reality as benchmark !

5. How can we use case studies better to advance the state of the art?

Page 47: CardioToronto.pps

Referent Tracking

Page 48: CardioToronto.pps

Another problem to solve: how many disorders?

PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract

• Three references of hypertension for the same patient denote three times the same disease.

• If two different fracture codes are used in relation to observations made on the same day for the same patient, they might refer to the same fracture.

• The same type of location code used in relation to three different events might or might not refer to the same location.

• If the same fracture code is used for the same patient on different dates, then these codes might or might not refer to the same fracture.

• The same fracture code used in relation to two different patients cannot refer to the same fracture.

• If two different tumor codes are used in relation to observations made on different dates for the same patient, they may still refer to the same tumor.

Page 49: CardioToronto.pps

Requirements for a digital copy of the world

• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
  – what is generic → scientific theories → realism-based ontologies
  – what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
  – … allow queries over the world’s past and present,
  – … make predictions,
  – … fill in gaps,
  – … identify mistakes,
  – ...

Page 50: CardioToronto.pps

The reality: a digital copy of part of the world

Applying the grid should not give a distorted representation of reality, but only an incomplete representation !!!

Page 51: CardioToronto.pps

Key issue: keeping track of what the bits denote

Page 52: CardioToronto.pps

• explicit reference to the concrete individual entities relevant to the accurate description of each patient’s condition, therapies, outcomes, ...

Fundamental goal of Referent Tracking

Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.

Page 53: CardioToronto.pps

Method: numbers instead of words

Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.

– Introduce an Instance Unique Identifier (IUI) for each relevant particular (individual) entity

78

Page 54: CardioToronto.pps

The essence of Referent Tracking

• Keeping track of particulars

• By means of singular and globally unique identifiers (#1, #2, #3, …)

• That function as surrogates for these entities in information systems, documents, etc

• And are managed IN a referent tracking system.

Ceusters W. and Smith B. Tracking Referents in Electronic Health Records. In: Engelbrecht R. et al. (eds.) Medical Informatics Europe, IOS Press, Amsterdam, 2005:71-76
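As a rough illustration of the idea of singular, globally unique identifiers, the following Python sketch hands out one identifier per particular. It is not the referent tracking system described in the cited papers: the `IUIRepository` class, the `local_key` handles and the use of UUIDs are all assumptions made for this example.

```python
import uuid

class IUIRepository:
    """Toy repository that hands out one globally unique identifier per particular."""

    def __init__(self):
        self._by_key = {}

    def assign(self, local_key):
        # local_key is a hypothetical handle the caller uses to say
        # "this is the same particular again"; it is not part of the IUI.
        if local_key not in self._by_key:
            self._by_key[local_key] = f"IUI-{uuid.uuid4()}"
        return self._by_key[local_key]

repo = IUIRepository()
a = repo.assign("patient 5572 / femur fracture of 1990")
b = repo.assign("patient 5572 / femur fracture of 1990")  # same particular again
c = repo.assign("patient 2309 / femur fracture of 1992")  # a different particular

print(a == b)  # one particular, one identifier
print(a == c)  # different particulars never share an identifier
```

The identifier carries no meaning of its own; it only stands in for the entity, which is what makes it usable as a surrogate across systems and documents.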

Page 55: CardioToronto.pps

The principle of Referent Tracking

‘John Doe’s liver tumor was treated with RPCI’s irradiation device’

‘John Smith’s liver tumor was treated with RPCI’s irradiation device’

[Diagram: the phrases of the first sentence are assigned the identifiers #1, #2, #3, #4, #5 and #6, those of the second sentence #10, #20, #30, #40, #5 and #6. The identified particulars are linked by instance-of (at t1 and at t2 respectively) to the universals person, liver, tumor, treating, clinic and device. #5 and #6 occur in both sentences.]
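The point of the two sentences can be sketched in Python: at the level of universals they look identical, while at the level of particulars only two entities are shared. Which identifier goes with which phrase is an assumption here (the slide only shows that #5 and #6 recur); the triples are illustrative.

```python
# (IUI, universal, time index) triples. The pairing of #1-#4 and #10-#40 with
# specific phrases is assumed for this sketch.
doe = [
    ("#1", "person", "t1"), ("#2", "liver", "t1"), ("#3", "tumor", "t1"),
    ("#4", "treating", None), ("#5", "clinic", "t1"), ("#6", "device", "t1"),
]
smith = [
    ("#10", "person", "t2"), ("#20", "liver", "t2"), ("#30", "tumor", "t2"),
    ("#40", "treating", None), ("#5", "clinic", "t2"), ("#6", "device", "t2"),
]

def iuis(statements):
    """Set of particulars referred to in a list of statements."""
    return {iui for iui, _, _ in statements}

def universals(statements):
    """Set of universals instantiated in a list of statements."""
    return {u for _, u, _ in statements}

same_types = universals(doe) == universals(smith)
shared_particulars = iuis(doe) & iuis(smith)

print(same_types)                  # both sentences use the same universals
print(sorted(shared_particulars))  # only the clinic and the device are shared
```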

Page 56: CardioToronto.pps

EHR – Ontology “collaboration”

Page 57: CardioToronto.pps

Reasoning over instances and universals

instance-of at t

#105

caused by

Page 58: CardioToronto.pps

PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract

IUIs shown next to the rows: IUI-001, IUI-001, IUI-001, IUI-003, IUI-004, IUI-004, IUI-005, IUI-005, IUI-005, IUI-007, IUI-007, IUI-007, IUI-002, IUI-012, IUI-006

7 distinct disorders

Codes for types AND identifiers for instances
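A minimal Python sketch of why instance identifiers are needed next to type codes: counting codes and counting disorders give different answers. The rows are a subset of the table for patients 5572 and 0939; the IUI attached to each row is an assumed pairing for illustration, since the slide text does not preserve which IUI belongs to which row.

```python
from collections import namedtuple

Obs = namedtuple("Obs", "pt_id date code narrative iui")

# Subset of the table. The iui column is an ASSUMED pairing for illustration.
rows = [
    Obs("5572", "04/07/1990", "26442006", "closed fracture of shaft of femur", "IUI-001"),
    Obs("5572", "04/07/1990", "81134009", "Fracture, closed, spiral", "IUI-001"),
    Obs("5572", "12/07/1990", "26442006", "closed fracture of shaft of femur", "IUI-001"),
    Obs("5572", "04/07/1990", "79001", "Essential hypertension", "IUI-005"),
    Obs("5572", "17/05/1993", "79001", "Essential hypertension", "IUI-005"),
    Obs("5572", "01/04/1997", "26442006", "closed fracture of shaft of femur", "IUI-012"),
    Obs("5572", "01/04/1997", "79001", "Essential hypertension", "IUI-005"),
    Obs("0939", "24/12/1991", "255174002", "benign polyp of biliary tract", "IUI-004"),
    Obs("0939", "20/12/1998", "255087006", "malignant polyp of biliary tract", "IUI-004"),
]

# Counting by code answers "how many kinds of thing were recorded per patient".
distinct_codes = {(r.pt_id, r.code) for r in rows}

# Counting by IUI answers "how many individual disorders are there".
distinct_disorders = {r.iui for r in rows}

print(len(rows), len(distinct_codes), len(distinct_disorders))
```

With codes alone, two fractures of the same type in the same patient collapse into one, and one tumor coded first as benign and later as malignant splits into two; the IUI column removes both ambiguities.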

Page 59: CardioToronto.pps

Requirements for a digital copy of the world

• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
  – what is generic → scientific theories
  – what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
  – … allow queries over the world’s past and present,
  – … make predictions,
  – … fill in gaps,
  – … identify mistakes,
  – ...

Page 60: CardioToronto.pps

Eternal memory

Page 61: CardioToronto.pps

Accept that everything may change:

1. changes in the underlying reality:
   • Particulars come, change and go

Page 62: CardioToronto.pps

Identity & instantiation

child adult

caterpillar butterfly

t

person

animal

Living creature

Page 63: CardioToronto.pps

Accept that everything may change:

1. changes in the underlying reality:
   • Particulars come, change and go

2. changes in our (scientific) understanding:
   • The planet Vulcan does not exist

Page 64: CardioToronto.pps

Reality and representation: both in evolution

IUI-#3

O-#2: ‘cancer’

O-#1: ‘benign tumor’

t

U1: benign tumor

U2: malignant tumor

p3

Reality

Repr.

O-#0: diabolic possession

= “denotes” = what constitutes the meaning of representational units …. Therefore: O-#0 is meaningless

Page 65: CardioToronto.pps

Accept that everything may change:

1. changes in the underlying reality:
   • Particulars come, change and go

2. changes in our (scientific) understanding:
   • The planet Vulcan does not exist

3. reassessments of what is considered to be relevant for inclusion (notion of purpose).

4. encoding mistakes introduced during data entry or ontology development.

Page 66: CardioToronto.pps

Changes over time

• In John Smith’s Electronic Health Record:
– At t1: “male”, at t2: “female”

• What are the possibilities ?
– Change in reality:
  • transgender surgery
  • change in legal self-identification
– Change in understanding: it was female from the very beginning but interpreted wrongly
– Correction of data entry mistake: it was understood as male, but wrongly transcribed
– (Change in word meaning)
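One way to keep these possibilities apart in a record is to store, with every changed value, the reason for the change. The following Python sketch assumes a hypothetical `Entry` record and `ChangeReason` enumeration; neither is taken from an existing EHR or referent tracking implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class ChangeReason(Enum):
    REALITY = "change in reality"
    UNDERSTANDING = "change in understanding"
    DATA_ENTRY = "correction of data entry mistake"
    WORD_MEANING = "change in word meaning"

@dataclass(frozen=True)
class Entry:
    value: str
    time: str
    reason: Optional[ChangeReason] = None  # why this entry replaced the previous one

def true_at_first_entry(first, second):
    """Value that held in reality when `first` was recorded, given why it was replaced."""
    if second.reason in (ChangeReason.UNDERSTANDING, ChangeReason.DATA_ENTRY):
        return second.value  # the first entry never matched reality
    return first.value       # reality (or word meaning) changed afterwards

first = Entry("male", "t1")
after_change_in_reality = Entry("female", "t2", ChangeReason.REALITY)
after_correction = Entry("female", "t2", ChangeReason.DATA_ENTRY)

print(true_at_first_entry(first, after_change_in_reality))  # male
print(true_at_first_entry(first, after_correction))         # female
```

Without the recorded reason, the two histories are indistinguishable, and a query about the state of affairs at t1 cannot be answered.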

Page 67: CardioToronto.pps

Requirements for a digital copy of the world

• R1: A faithful representation of reality
• R2: … of everything that is digitally registered,
  – what is generic → scientific theories
  – what is specific → what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
  – … allow queries over the world’s past and present,
  – … make predictions,
  – … fill in gaps,
  – … identify mistakes,
  – ...

Page 68: CardioToronto.pps

Referent Tracking System Components

• Referent Tracking Software
  Manipulation of statements about facts and beliefs

• Referent Tracking Datastore:
  • IUI repository
    A collection of globally unique singular identifiers denoting particulars
  • Referent Tracking Database
    A collection of facts and beliefs about the particulars denoted in the IUI repository

Manzoor S, Ceusters W, Rudnicki R. Implementation of a Referent Tracking System. International Journal of Healthcare Information Systems and Informatics 2007;2(4):41-58.

Page 69: CardioToronto.pps

Place in the Health IT arena

#IUI-1 ‘affects’ #IUI-2
#IUI-3 ‘affects’ #IUI-2
#IUI-1 ‘causes’ #IUI-3
...

Referent Tracking Database

CAG repeat

Juvenile HD

person

disorder

continuant

Ontology

EHR

Page 70: CardioToronto.pps

How to build an ontology from a terminology?

Page 71: CardioToronto.pps

Steps in ontology building

1. For all terms identified in the terminology, find the entities in reality that are directly denoted;

2. Determine the top categories these entities belong to;

3. Determine for any dependent entity:
   • If process: the continuants that participate in it
   • If dependent continuant: the continuant upon which it depends

4. For any entity determined in step 3, go to step 2.

Rudnicki R, Ceusters W, Manzoor S, Smith B. What Particulars are Referred to in EHR Data? A Case Study in Integrating Referent Tracking into an Electronic Health Record Application. In Teich JM, Suermondt J, Hripcsak C. (eds.), American Medical Informatics Association 2007 Annual Symposium Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago IL, 2007:630-634.
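The four steps describe a loop that stops once no new entities turn up, which can be sketched as a worklist in Python. The lookup tables below (`DENOTES`, `CATEGORY`, `DEPENDS_ON`) and the two sample terms are hypothetical, hand-made inputs; in practice each lookup is an analysis done by an ontologist, not a dictionary access.

```python
from collections import deque

# Hypothetical inputs for illustration only.
DENOTES = {                      # step 1: term -> entities directly denoted
    "walking item": ["walking"],
    "temperature item": ["body temperature"],
}
CATEGORY = {                     # step 2: entity -> top category
    "walking": "process",
    "body temperature": "dependent continuant",
    "person": "independent continuant",
}
DEPENDS_ON = {                   # step 3: participants / bearers
    "walking": ["person"],           # continuant participating in the process
    "body temperature": ["person"],  # continuant the quality depends on
}

def build(terms):
    """Collect every entity reachable from the terms, with its top category."""
    queue = deque(e for t in terms for e in DENOTES[t])   # step 1
    ontology = {}
    while queue:
        entity = queue.popleft()
        if entity in ontology:
            continue                                      # already categorized
        ontology[entity] = CATEGORY[entity]               # step 2
        queue.extend(DEPENDS_ON.get(entity, []))          # steps 3 and 4
    return ontology

print(build(["walking item", "temperature item"]))
```

Note that `person` enters the result although no term denotes it directly, which is the purpose of steps 3 and 4.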

Page 72: CardioToronto.pps

Building the Ontology underlying a terminology (MDS)

[Diagram: the MDS terms (MDS1, MDS2, MDS3, MDS4, MDS5, MDS6, …) are linked to universals (U1, U2, …, U17) in the MDS Ontology; the universals are connected by class-relations and anchored in BFO.]

Page 73: CardioToronto.pps

Adding another terminology

[Diagram: the universals (U1, U2, …, U17, U…) now form the OPO Ontology (MDS + CARE + …), anchored in BFO; the MDS terms (MDS1 to MDS6, …) remain linked to them.]

Page 74: CardioToronto.pps

Adding another terminology

[Diagram: as on the previous slide, with the CARE terms (CARE 1 to CARE 4, …) linked to the universals of the OPO Ontology (MDS + CARE + …) alongside the MDS terms; a universal U15 appears next to U16.]

Page 75: CardioToronto.pps

How to link to patient data ?

Page 76: CardioToronto.pps

Semantic integration of data expressed in distinct terminologies

• Purpose:
– Better comparability
– Statistical validation of the ontology
  • Explanation of observed correlations between assessment data elements
  • Finding patient subpopulations exhibiting correlations which are near-significant without the ontology, but significant with the ontology

• Two level integration:
– Type level: poor man’s linkage
– Particular level: rich man’s linkage

Page 77: CardioToronto.pps

‘Poor man’s’ data linkage

[Diagram: patient data for pt3 and pt4 are linked, through the MDS terms (MDS1 to MDS6, …), to the universals (U1, U2, …, U17, U…) of the MDS Ontology.]

Page 78: CardioToronto.pps

Data linkage using multiple instruments

[Diagram: the data of Patient 1, Patient 2 and Patient 3 are marked (X) against MDS terms and CARE terms, which are linked to the universals of the OPO Ontology (MDS + CARE + …), anchored in BFO.]

Page 79: CardioToronto.pps

Problems with this level

• Exclusive focus on universals, ignoring that in data collection (almost) everything is about particulars.

• Therefore Referent Tracking must be brought into the picture.

Page 80: CardioToronto.pps

Referent Tracking solves this problem:

• It is true that:
– (1) ‘All Americans have one mother’
– (2) ‘All Americans have one president’

• But:
– (1) ‘all Americans have a distinct mother’
– (2) ‘all Americans have a (numerically) identical president’
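The contrast can be made concrete by counting particulars instead of types. In this Python sketch the three persons and all identifiers are made up; the only point is that "one per person" at the type level says nothing about how many distinct individuals are involved.

```python
# Hypothetical data: each person is linked to the IUI of their mother
# and to the IUI of their president.
people = {
    "#p1": {"mother": "#m1", "president": "#x"},
    "#p2": {"mother": "#m2", "president": "#x"},
    "#p3": {"mother": "#m3", "president": "#x"},
}

def each_has_one(relation):
    """Type-level reading: every person has exactly one entity in this relation."""
    return all(relation in links for links in people.values())

def distinct_particulars(relation):
    """Particular-level reading: how many different individuals are involved."""
    return len({links[relation] for links in people.values()})

print(each_has_one("mother"), each_has_one("president"))                   # same at type level
print(distinct_particulars("mother"), distinct_particulars("president"))  # differ at particular level
```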

Page 81: CardioToronto.pps

From ‘poor man’s’ to ‘rich man’s’ data linkage

[Diagram: the ‘poor man’s’ linkage picture (patient data for pt3 and pt4, MDS terms MDS1 to MDS6, universals of the MDS Ontology), with a formula added.]

Page 82: CardioToronto.pps

Rich man’s data linkage: focus on particulars

[Figure: patients pt3 and pt4, particulars IUI-1 … IUI-5, universals U6 and U11, and MDS terms MDS3 and MDS4; edges are labelled ‘instance-of’ and ‘particular relations’]

Page 83: CardioToronto.pps

Many more combinations possible

• The terms used in MDS4 denote distinct particulars related to both patients

• One of the terms used in MDS4 denotes the same particular for both patients

[Figure: several variants of the previous diagram, showing different configurations of the particulars IUI-1 … IUI-5 between pt3, pt4 and the universals U6 and U11]
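The two combinations named in the bullets can be told apart mechanically once each term use is tied to an IUI. The sketch below is a hedged illustration: the term names (`MDS4-term-a`, `MDS4-term-b`) and the assignment of IUIs to patients are assumptions made for the example, not a reading of the diagram.

```python
def shared_terms(record_a, record_b):
    """Return the terms that denote the numerically same particular in both records."""
    common = record_a.keys() & record_b.keys()
    return {term for term in common if record_a[term] == record_b[term]}


# Case 1 (hypothetical): the terms denote distinct particulars for the two patients.
case_distinct = {
    "pt3": {"MDS4-term-a": "IUI-4", "MDS4-term-b": "IUI-5"},
    "pt4": {"MDS4-term-a": "IUI-2", "MDS4-term-b": "IUI-3"},
}

# Case 2 (hypothetical): one term denotes the same particular for both patients.
case_shared = {
    "pt3": {"MDS4-term-a": "IUI-3", "MDS4-term-b": "IUI-5"},
    "pt4": {"MDS4-term-a": "IUI-3", "MDS4-term-b": "IUI-2"},
}

print(shared_terms(case_distinct["pt3"], case_distinct["pt4"]))
print(shared_terms(case_shared["pt3"], case_shared["pt4"]))
```

At the type level the two cases are indistinguishable, since in both every term is linked to the same universals.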

Page 84: CardioToronto.pps

What has worked? How have disparate views been accommodated?

Page 85: CardioToronto.pps

Definitions for ‘Adverse Event’

D4 (BRIDG): an observation of a change in the state of a subject assessed as being untoward by one or more interested parties within the context of a protocol-driven research or public health.

D5 (IOM): an event that results in unintended harm to the patient by an act of commission or omission rather than by the underlying disease or condition of the patient

D6 (NCI): any unfavorable and unintended sign (including an abnormal laboratory finding), symptom, or disease temporally associated with the use of a medical treatment or procedure that may or may not be considered related to the medical treatment or procedure

D7 (CDISC): any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product and which does not necessarily have to have a causal relationship with this treatment

D8 (JTC): an untoward, undesirable, and usually unanticipated event, such as death of a patient, an employee, or a visitor in a health care organization. Incidents such as patient falls or improper administration of medications are also considered adverse events even if there is no permanent effect on the patient.

D9 (QUIC): an injury that was caused by medical management and that results in measurable disability.

Page 86: CardioToronto.pps

At least one argument

• There is no entity which would be such that, were it placed before these authors, they would each in turn be able to point to it and respectively say, faithfully and honestly:
– “that is an observation” (definition D4),
– “that is an injury” (definition D9),
– “that is a laboratory finding” (definition D6).

• Clearly,
– nothing which is an injury can be a laboratory finding, although, of course, laboratory findings can aid in diagnosing an injury or in monitoring its evolution.
– nothing which is a laboratory finding can be an observation, although, of course, some observation must have been made if we are to arrive at a laboratory finding.

Page 87: CardioToronto.pps

Hypothesis

• Because …
– all the authors of the mentioned definitions use the term ‘adverse event’ in some context for a variety of distinct entities, and
– these contexts look quite similar
  • in each of them, more or less the same sort of entities seem to be involved

• … there is some common ground (some portion of reality) which is such that the entities within it can be used as referents for the various meanings of ‘adverse event’.

Page 88: CardioToronto.pps

Why does this matter?

• Be precise about what representational units in either an ontology or data repository stand for.

• Each such unit in an ontology should come with additional information on whether it denotes:
– an entity at level 1, level 2 or level 3
and
– a universal, or a defined or composite class

Page 89: CardioToronto.pps

Examples from our adverse event domain ontology

Denotation | Class Type | Particular Type | Description (role in adverse event scenario)

Level 1
C1 subject of care | DC | independent continuant | person to whom harm might have been done through an act under scrutiny
C2 act under scrutiny | DC | act of care | act of care that might have caused harm to the subject of care
C7 structure change | U | process | change in an anatomical structure of a person
C8 structure integrity | U | dependent continuant | aspect of an anatomical structure deviation from which would bring it about that the anatomical structure would either (1) itself become dysfunctional or (2) cause dysfunction in another anatomical structure
C12 subject investigation | DC | process | looking for a structure change in the subject of care

Level 2
C15 observation | DC | dependent continuant | cognitive representation of a structure change resulting from an act of perception within a subject investigation
C16 harm diagnosis | DC | dependent continuant | cognitive representation, resulting from a harm assessment, and involving an assertion to the effect that a structure change is or is not a harm

Level 3
C18 care reference | DC | information entity | concretized (through text, diagram, …) piece of knowledge drawn from state of the art principles that can be used to support the appropriateness of (or correctness with which) processes are performed involving a subject of care
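One way to carry this extra information alongside each representational unit is sketched below in Python. The class and field names are hypothetical, and reading ‘DC’ as ‘defined class’ and ‘U’ as ‘universal’ is an assumption based on the preceding slide; the codes, levels and particular types are taken from the table.

```python
from dataclasses import dataclass
from enum import Enum


class ClassType(Enum):
    UNIVERSAL = "U"
    DEFINED_CLASS = "DC"  # assumption: 'DC' abbreviates 'defined class'


@dataclass(frozen=True)
class RepresentationalUnit:
    code: str
    label: str
    level: int             # 1, 2 or 3, as in the table
    class_type: ClassType
    particular_type: str

    def __post_init__(self):
        # Reject units that do not state one of the three levels.
        if self.level not in (1, 2, 3):
            raise ValueError(f"{self.code}: level must be 1, 2 or 3")


DC, U = ClassType.DEFINED_CLASS, ClassType.UNIVERSAL
UNITS = [
    RepresentationalUnit("C1", "subject of care", 1, DC, "independent continuant"),
    RepresentationalUnit("C2", "act under scrutiny", 1, DC, "act of care"),
    RepresentationalUnit("C7", "structure change", 1, U, "process"),
    RepresentationalUnit("C8", "structure integrity", 1, U, "dependent continuant"),
    RepresentationalUnit("C12", "subject investigation", 1, DC, "process"),
    RepresentationalUnit("C15", "observation", 2, DC, "dependent continuant"),
    RepresentationalUnit("C16", "harm diagnosis", 2, DC, "dependent continuant"),
    RepresentationalUnit("C18", "care reference", 3, DC, "information entity"),
]


def codes_at_level(level):
    """Codes of all units that denote entities at the given level."""
    return [u.code for u in UNITS if u.level == level]


def codes_of_type(class_type):
    """Codes of all units of the given class type."""
    return [u.code for u in UNITS if u.class_type is class_type]


print(codes_at_level(2))
print(codes_of_type(ClassType.UNIVERSAL))
```

Making level and class type mandatory fields means a unit cannot be added without stating what it stands for.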

Page 90: CardioToronto.pps

Representing particular cases

• Is the generic representation of the portion of reality adequate for the description of particular cases?

• Example: a patient
– born at time t0
– undergoing anti-inflammatory treatment and physiotherapy since t2
– for an arthrosis present since t1
– develops a stomach ulcer at t3.

Page 91: CardioToronto.pps

Anti-inflammatory treatment with ulcer development

IUI | Particular description | Properties

#1 | the patient who is treated | #1 member C1 since t2
#2 | #1’s treatment | #2 instance_of C3; #2 has_participant #1 since t2; #2 has_agent #3 since t2
#3 | the physician responsible for #2 | #3 member C4 since t2
#4 | #1’s arthrosis | #4 member C5 since t1
#5 | #1’s anti-inflammatory treatment | #5 part_of #2; #5 member C2 since t3
#6 | #1’s physiotherapy | #6 part_of #2
#7 | #1’s stomach | #7 member C6 since t2
#8 | #7’s structure integrity | #8 instance_of C8 since t0; #8 inheres_in #7 since t0
#9 | #1’s stomach ulcer | #9 part_of #7 since t3
#10 | coming into existence of #9 | #10 has_participant #9 at t3
#11 | change brought about by #9 | #11 has_agent #9 since t3; #11 has_participant #8 since t3; #11 instance_of C10 at t3
#12 | noticing the presence of #9 | #12 has_participant #9 at t3+x; #12 has_agent #3 at t3+x
#13 | cognitive representation in #3 about #9 | #13 is_about #9 since t3+x
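The properties listed for this case can be held as simple tuples and queried by IUI. The following Python sketch is only an illustration: the `Assertion` type and the two helper functions are hypothetical names, and the temporal qualifiers are kept as the plain strings used on the slide rather than as parsed times.

```python
from typing import NamedTuple, Optional


class Assertion(NamedTuple):
    subject: str          # IUI of the particular the assertion is about
    relation: str         # e.g. "member", "instance_of", "part_of"
    target: str           # another IUI ("#7") or a class code ("C8")
    when: Optional[str]   # temporal qualifier as written on the slide, if any


ASSERTIONS = [
    Assertion("#1", "member", "C1", "since t2"),
    Assertion("#2", "instance_of", "C3", None),
    Assertion("#2", "has_participant", "#1", "since t2"),
    Assertion("#2", "has_agent", "#3", "since t2"),
    Assertion("#3", "member", "C4", "since t2"),
    Assertion("#4", "member", "C5", "since t1"),
    Assertion("#5", "part_of", "#2", None),
    Assertion("#5", "member", "C2", "since t3"),
    Assertion("#6", "part_of", "#2", None),
    Assertion("#7", "member", "C6", "since t2"),
    Assertion("#8", "instance_of", "C8", "since t0"),
    Assertion("#8", "inheres_in", "#7", "since t0"),
    Assertion("#9", "part_of", "#7", "since t3"),
    Assertion("#10", "has_participant", "#9", "at t3"),
    Assertion("#11", "has_agent", "#9", "since t3"),
    Assertion("#11", "has_participant", "#8", "since t3"),
    Assertion("#11", "instance_of", "C10", "at t3"),
    Assertion("#12", "has_participant", "#9", "at t3+x"),
    Assertion("#12", "has_agent", "#3", "at t3+x"),
    Assertion("#13", "is_about", "#9", "since t3+x"),
]


def mentioning(iui):
    """Every assertion in which the given IUI appears as subject or target."""
    return [a for a in ASSERTIONS if iui in (a.subject, a.target)]


def classified_as(code):
    """IUIs asserted to be a member or an instance of the given class code."""
    return [a.subject for a in ASSERTIONS
            if a.relation in ("member", "instance_of") and a.target == code]


# Everything recorded about the stomach ulcer (#9).
for assertion in mentioning("#9"):
    print(assertion)
```

Because every assertion names its particulars by IUI, a query about the ulcer returns the ulcer itself, its coming into existence, the change it brings about, and the physician's noticing and cognitive representation of it.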

Page 92: CardioToronto.pps

Advantage 1: reduce ambiguity in definitions

• E.g. ‘adverse drug reaction: an undesirable response associated with use of a drug that either compromises therapeutic efficacy, enhances toxicity, or both.’ (Joint Technical Committee)

– May denote something on level 1, e.g. a realizable entity which exists objectively as an increased health risk; in this sense any event ‘that either compromises therapeutic efficacy, enhances toxicity, or both’ is undesirable;

– May denote something on level 2, so that, amongst all of those events which influence therapeutic efficacy or toxicity, only some are considered undesirable (for whatever reason) by either the patient, the caregiver or both; or

– May denote something relating to level 3, so a particular event occurring on level 1 is undesirable only when it is an instance of a type of event that is listed in some guideline or good practice management handbook, i.e. in some published statement of the state of the art in relevant matters.

Page 93: CardioToronto.pps

Advantage 2: reveal hidden assumptions

• E.g.: ‘adverse event: an event that results in unintended harm to the patient by an act of commission or omission rather than by the underlying disease or condition of the patient’ (IOM)

• But:
– An ‘act of omission’ is, under the realist agenda, not an entity that exists at level 1, but rather a level 3 entity denoting a configuration in which what good practice requires to be done was not done,
– Something that does not exist at level 1 cannot cause harm by itself,
– Thus the cause of the harm must be the underlying disease.

Page 94: CardioToronto.pps

Conclusion

• Health data management involves many actors and IT systems: semantic interoperability is thus a key issue.

• Ontologies (of the right sort) provide a deep level of semantic interoperability between IT systems, thereby keeping track:

– of what is the case;

– of what is known by some actor(s);

– of what has been and still needs to be done.

• Realism-based ontology, as a discipline, helps in creating ontologies of the right sort.