praveen r. rao , stanley a. edlavitch , jeffrey l. hackman ,...



Page 1: Praveen R. Rao , Stanley A. Edlavitch , Jeffrey L. Hackman , …r.web.umkc.edu/raopr/MLSS.POSTER.pdf · 2010. 3. 17. · Health Research Information Network (SHRINE): A Prototype


Our Vision

Architectural Overview

Introduction

Collaborators

Motivations

References

1. Centers for Medicare and Medicaid Services, Office of the Actuary, 2009, http://www.cms.hhs.gov/NationalHealthExpendData/downloads/proj2009.pdf.

2. American Cancer Society, Surveillance Research, 2009, http://www.cancer.org/downloads/PRO/2009_cases_deaths_by_age.pdf.

3. Fenstermacher D, Street C, McSherry T, Nayak V, Overby C, Feldman M: The Cancer Biomedical Informatics Grid (caBIG). Conference Proceedings of the IEEE Engineering in Medicine and Biology Society, 2005.

4. Weber GM, Murphy SN, McMurry AJ, Macfadden D, Nigrin DJ, Churchill S, Kohane IS: The Shared Health Research Information Network (SHRINE): A Prototype Federated Query Tool for Clinical Data Repositories. Journal of the American Medical Informatics Association, September 2009.

5. Stead WW, Lin HS: Computational Technology for Effective Health Care: Immediate Steps and Strategic Directions. Washington, D.C.: The National Academies Press, 2009.

6. Hristidis V: Information Discovery on Electronic Health Records. CRC - Taylor & Francis, December 2009.

7. Rao PR, Moon B: Locating XML Documents in a Peer-to-Peer Network Using Distributed Hash Tables. IEEE Transactions on Knowledge and Data Engineering, Volume 21, No. 12, pp. 1737-1752, December 2009.

8. Rao PR, Moon B: An Internet-Scale Service for Publishing and Locating XML Documents. Proceedings of the 25th IEEE International Conference on Data Engineering (ICDE ’09), Shanghai, China, March 2009.

9. Boag S, Chamberlin D, Fernández MF, Florescu D, Robie J, Siméon J: XQuery 1.0: An XML Query Language. World Wide Web Consortium (W3C) Recommendation, 23 January 2007.

10. Nebraska Health Information Initiative (NeHII), http://www.nehii.org/.

Potential Funding Sources

National Institutes of Health (NIH)

National Science Foundation (NSF)

Agency for Healthcare Research and Quality (AHRQ)

Missouri Life Sciences Research Board (MLSRB)

[Figure: National Health Expenditure, in trillions of dollars (vertical axis 0.00 to 5.00; labeled value 4.48). Based on data provided by the US Department of Health and Human Services [1]. Source: Centers for Medicare and Medicaid Services, Office of the Actuary, National Health Statistics Group.]

[Figure: Cancer cases (2009), in millions (vertical axis 0 to 1.6), by age group (All Ages, Under 45, 45 and Over, Under 65, 65 and Over) for Men, Women, and Both Genders. Based on data provided by the American Cancer Society [2].]

Key Challenges

Sample Queries

Incidence: What is the incidence of small cell lung cancer in non-smoking males between 2007 and 2010?

Regimen: What is the best treatment regimen for melanoma? What are the alternative regimens?

Staging: How many patients were diagnosed with stage II-B prostate cancer?

Radiation side-effects: What kinds of cardiac side-effects were observed in patients receiving radiation to the left breast?

Chemotherapy side-effects: How safe is the R-CHOP regimen for my condition?

Survival rate: What is the 5-year survival rate for a patient with stage II-B colon cancer?

Remission: What are the chances of complete remission for a cervical cancer patient treated with adjuvant chemoradiotherapy?

Praveen R. Rao (1), Stanley A. Edlavitch (2), Jeffrey L. Hackman (3), Timothy P. Hickman (2), Douglas S. McNair (4), and Deepthi S. Rao (5)

(1) School of Computing and Engineering, University of Missouri-Kansas City, Kansas City

(2) School of Medicine, University of Missouri-Kansas City, Kansas City

(3) Truman Medical Centers, Kansas City

(4) Cerner Corporation, Kansas City

(5) Argentine Family Health, Kansas City

Today the nation faces one of its toughest challenges in health care due to high operating costs. National health expenditures reached $2.3 trillion in 2008.

Cancer is the second most common cause of death in the US.

Health care costs are rising steadily. It is estimated that in 2019 the US will spend $4.48 trillion on health care.

Vast amounts of information (e.g., electronic health records, drug data, data from clinical diagnosis) remain largely untapped due to the lack of suitable IT solutions. Data sources evolve over time and are heterogeneous.

Data sharing and collaboration, and large-scale management of health care data, have been identified as the key IT challenges to advancing the nation’s health care system [3].

The Institute of Medicine (IOM) envisions the development of a learning healthcare system.

IOM’s quality aims: safe, timely, effective, efficient, equitable, patient-centered.

Can we provide cost-effective, more efficient, and higher quality care to patients by sharing health care data?

How can we share health care data on a very large scale (e.g., petabytes of data)?

How can we manage non-standardized data sources that evolve with time?

How can we pose a single query to execute across multiple data sources?

Can we protect the privacy of patients and ownership of patient data?

Can we ensure HIPAA compliance?

<?xml version="1.0" ?>
<ClinicalDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <id extension="49912" root="2.16.840.1.113883.3.933"/>
  <patient>
    <name>
      <given>John</given>
      <family>Doe</family>
    </name>
    <genderCode code="M" codeSystem="2.16.840.1.5.1"/>
    <birthTime value="20020924"/>
  </patient>
  <component>
    <StructuredBody>
      <component>
        <section>
          <code code="10160-0" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" />
          <title>Medications</title>
          <entry>
            <Observation>
              <code code="84100007" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" displayName="medication history"/>
              <value xsi:type="CD" code="195967001" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" displayName="Emphysema">
                <value xsi:type="CD" code="91143003" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" displayName="Albuterol" />
                <originalText>
                  <reference value="m1"/>
                </originalText>
              </value>
            </Observation>
          </entry>
          <entry>
            <Observation>
              <code code="84100007" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" displayName="medication history"/>
              <value xsi:type="CD" code="32398004" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" displayName="Squamous cell lung carcinoma">
              </value>
            </Observation>
          </entry>
          <entry>
            ……………………………………………………
          </entry>
        </section>
      </component>
      <component>
        <section>
          <code code="10164-2" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" />
          <title>History of Present Illness</title>
          <text>65 year old gentleman with a history of Emphysema presented to our hospital
            with cough since two months. He experienced severe bouts of cough since two
            days and an episode of Hemoptysis yesterday evening. A CT scan image showed a
            3x2.5 cm mass in the apex of right lung and was confirmed through biopsy
            about the possibility of squamous cell lung carcinoma.
          </text>
        </section>
      </component>
    </StructuredBody>
  </component>
</ClinicalDocument>

An Example XML Document [6]

HL7 Clinical Document Architecture
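To make the structure of the example concrete, the following is a minimal sketch of how the coded observations in such a document could be read with Python's standard library. It works on a trimmed copy of the example; the `xsi` namespace declaration and the helper names `coded_values` and `has_concept` are assumptions introduced here for illustration and are not part of the poster.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document. The xmlns:xsi declaration is an
# assumption added so that a namespace-aware parser accepts xsi:type.
DOC = """<ClinicalDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <id extension="49912" root="2.16.840.1.113883.3.933"/>
  <component><StructuredBody><component><section>
    <title>Medications</title>
    <entry><Observation>
      <value xsi:type="CD" code="195967001" codeSystemName="SNOMED CT" displayName="Emphysema">
        <value xsi:type="CD" code="91143003" codeSystemName="SNOMED CT" displayName="Albuterol"/>
      </value>
    </Observation></entry>
    <entry><Observation>
      <value xsi:type="CD" code="32398004" codeSystemName="SNOMED CT" displayName="Squamous cell lung carcinoma"/>
    </Observation></entry>
  </section></component></StructuredBody></component>
</ClinicalDocument>"""


def coded_values(xml_text):
    """Return (code, displayName) for every <value> element, in document order."""
    root = ET.fromstring(xml_text)
    return [(v.get("code"), v.get("displayName")) for v in root.iter("value")]


def has_concept(xml_text, display_name):
    """Return True if any coded <value> carries the given display name."""
    return any(name == display_name for _, name in coded_values(xml_text))


if __name__ == "__main__":
    for code, name in coded_values(DOC):
        print(code, name)
```

Matching on `displayName` keeps the sketch short; a real system would more plausibly match on the `code` and `codeSystem` pair, since display strings can vary between sources.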

Potential Impact

Individual hospital level

Allow a more rapid review of events by quality improvement staff

Significantly decrease the time spent gathering information for nationally reportable databases such as Core Measures and PQRI (Physician Quality Reporting Initiative)

National and international level

Monitor side-effects of new medicines

Assess best practices

Conduct comparative effectiveness research

Leverage knowledge from clinical trials

Superior decision support and data mining tools

Local, state, and national level health information exchange (HIE) initiatives

Limitations of current data integration systems

E.g., National Cancer Institute’s caBIG [3], SHRINE [4], NeHII [10]

Creating a mediated schema and semantic mappings in a federated database model becomes very cumbersome as the number of data sources increases, and requires sufficient domain knowledge of each data source.

A web-services-based architecture requires the data location to be specified explicitly in the query, which allows only coarse-grained selection of data sources and limits scalability.

Collaborative Data Network

Missouri Regional Life Sciences Summit (2010), Kansas City, MO

Design Principles

A CDN benefits from the marriage of two successful technologies:

Peer-to-peer (P2P) computing: scalability, fault tolerance, load balancing, decentralized design

XML/JSON standards: modeling of heterogeneous, non-standardized data sources; rich query expressiveness (e.g., XQuery [9])

We propose a new design for sharing Electronic Health Records (EHRs) called a Collaborative Data Network (CDN).

Data resides with the data provider, which can implement local access control policies and protect the privacy of patients.

Location-oblivious queries: a user can pose a single query across multiple data sources.
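The idea of a location-oblivious query can be illustrated with a small in-memory sketch: the user names only what to look for, an index resolves which publishers hold candidate documents, and the query is then evaluated at those publishers. This is a toy inverted index over tag paths, not the psiX method of [7,8] (which works over a distributed hash table); the class `ToyCDN`, its methods, and the publisher names are all hypothetical.

```python
from collections import defaultdict


class ToyCDN:
    """Toy index mapping tag paths to the publishers whose documents contain them."""

    def __init__(self):
        self._index = defaultdict(set)  # tag path -> publisher names
        self._stores = {}               # publisher name -> [(doc_id, paths)]

    def publish(self, publisher, doc_id, paths):
        """Register a document; the data itself stays with the publisher."""
        self._stores.setdefault(publisher, []).append((doc_id, frozenset(paths)))
        for path in paths:
            self._index[path].add(publisher)

    def locate(self, query_paths):
        """Return publishers that hold every path named in the query."""
        candidates = [self._index.get(p, set()) for p in query_paths]
        return set.intersection(*candidates) if candidates else set()

    def query(self, query_paths):
        """Locate candidate publishers, evaluate at each one, merge the results."""
        wanted = frozenset(query_paths)
        results = []
        for publisher in sorted(self.locate(wanted)):
            for doc_id, paths in self._stores[publisher]:
                if wanted <= paths:  # the document contains every requested path
                    results.append((publisher, doc_id))
        return results


if __name__ == "__main__":
    cdn = ToyCDN()
    cdn.publish("hospital_a", "doc1",
                ["/ClinicalDocument/patient",
                 "/ClinicalDocument/component/section/entry/Observation"])
    cdn.publish("clinic_b", "doc7", ["/ClinicalDocument/patient"])
    # The caller never names hospital_a or clinic_b.
    print(cdn.query(["/ClinicalDocument/patient"]))
```

The point of the sketch is the calling convention: `query` takes no data-source argument, in contrast to architectures where the query must specify the data location.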

[Figure: Architectural overview. Labels in the diagram: natural language parser, form-based query, XQuery generator, clinical terminology/thesaurus, XQuery processor, XML/P2P based Collaborative Data Network, query, publisher, schema recommender, data, XML signature, location oblivious queries, query results, user feedback, web service (twice), feedback analyzer, gossip.]
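The overview names an XQuery generator that sits between a form-based query and the XQuery processor. As a rough sketch of what such a step might produce, the code below turns one form field (a diagnosis display name) into an XQuery 1.0 FLWOR expression over documents shaped like the example. The function names, the use of the default `collection()`, and the choice to match on `displayName` are assumptions made for illustration; the poster does not describe the generator's actual output.

```python
def xquery_string_literal(text):
    """Quote text as an XQuery string literal.

    Inside a double-quoted XQuery literal, '&' must be written as an entity
    reference and '"' is escaped by doubling it.
    """
    return '"' + text.replace("&", "&amp;").replace('"', '""') + '"'


def build_xquery(display_name):
    """Build a query returning the ids of documents with a matching coded value."""
    literal = xquery_string_literal(display_name)
    return (
        "for $d in collection()/ClinicalDocument\n"
        f"where $d//value[@displayName = {literal}]\n"
        "return $d/id"
    )


if __name__ == "__main__":
    print(build_xquery("Squamous cell lung carcinoma"))
```

Escaping the form input before embedding it in the query text matters here because the value comes from a user; a production generator would more likely bind it as an external variable if the XQuery processor supports that.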

XML Document Tree

[Figure: tree view of the example XML document. Node labels: ClinicalDocument; id, patient, component; Observation with values Emphysema and Albuterol (SNOMED CT); Observation with value Squamous cell lung carcinoma; component, section, text ("65 year old gentleman with a history of Emphysema presented to our hospital ...... about the possibility of squamous cell lung carcinoma."); codeSystem 2.16.840.1.113883.6.1.]

Location Oblivious Queries

[Figure: query processing in the XML/P2P based CDN. (1) Single XQuery query; (2) find matching XML documents and their publishers (e.g., using psiX [7,8]); (3) return matching documents/publishers; (4) data or query shipping; (5) query results.]

[Figure: a single query over chemotherapy data and radiotherapy data. (1) Single XQuery query; (2) join across multiple data sources; (3) query results.]

HIPAA Compliance

By design, the actual data resides with the publisher (a.k.a. data provider). Data protection and local access control policies can be implemented as in a federated design.

Data ownership can be achieved in a CDN.

Our CDN can protect the privacy of patients.

Only the publisher has permission to modify its data.

By design, a user posing a query cannot access a data provider's XML document unless authorized by that data provider.

Data can be encrypted while being transferred through the network.
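The access rule above (a document is released only to users the data provider has authorized) can be sketched as a check enforced at the publisher, where the data resides. The `Publisher` class, its method names, and the per-document set of authorized user ids are hypothetical; the poster does not specify how policies are represented, and encryption in transit is outside the scope of this sketch.

```python
class Publisher:
    """A data provider that keeps its documents locally and enforces its own policy."""

    def __init__(self, name):
        self.name = name
        self._docs = {}  # doc_id -> XML text, held locally by the publisher
        self._acl = {}   # doc_id -> set of user ids authorized by this publisher

    def store(self, doc_id, xml_text, authorized_users):
        """Store or replace a document together with its access list."""
        self._docs[doc_id] = xml_text
        self._acl[doc_id] = set(authorized_users)

    def fetch(self, user, doc_id):
        """Release a document only to a user this publisher has authorized."""
        if user not in self._acl.get(doc_id, set()):
            raise PermissionError(f"{user} is not authorized for {doc_id} at {self.name}")
        return self._docs[doc_id]


if __name__ == "__main__":
    clinic = Publisher("clinic")
    clinic.store("d1", "<ClinicalDocument/>", {"dr_lee"})
    print(clinic.fetch("dr_lee", "d1"))
```

Because the check runs at the publisher rather than in the network, each provider can apply a different policy without coordinating with the others.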

Cloud Computing

Cloud computing can reduce the infrastructure setup and maintenance costs for health care providers.

Our proposed collaborative platform can be adapted to run in the cloud.

A data provider stores the actual data locally and accesses the cloud via the Internet.

We have a working prototype of psiX [7,8] at http://vortex.sce.umkc.edu/psix that demonstrates proof of concept.
