Analytics of Clinical “Big Data”
-Toward Applications for Pharmaceutical Industry-

The 13th Kitasato University - Harvard School of Public Health Symposium
July 4, 2014

Toru Hisamitsu
Chief Researcher
Life Science Research Center
Central Research Laboratory
Hitachi, Ltd.

© Hitachi, Ltd. 2014. All rights reserved.

Contents

1. Hitachi’s Concept for Healthcare IT
2. Hitachi’s Technologies and Their Pharmaceutical Industry Applications

1. Hitachi’s Concept for Healthcare IT

1-1. Hitachi’s existing healthcare-related business

Healthcare around the world is changing rapidly and significantly due to aging populations, an increase in the number of patients with lifestyle-related diseases, and rising medical costs.

Hitachi has been focusing on the healthcare business field within its Social Innovation Business.

Figure: Hitachi products and services across the pre-medical, medical, and post-medical stages.
• Checkup system
• “Harasuma” diet
• Life Microscope
• Clinical inspection device
• Clinical inspection system
• CT/MRI/PET/ultrasound diagnostics
• Open MRI
• Imaging/radiation info. system
• EMR system
• Proton beam therapy
• Diagnosis of mental health (advanced medical technology)
• Nursing care service
• Facility operation
• Management solution for nursing care and welfare

Hitachi is bringing together all its assets to provide innovative technologies, associated systems, solutions, and services.

Contributions towards realizing a society where everyone lives a healthy life.

1-2. Healthcare IT mission

Goal
• Provide a comprehensive solution for care-cycle optimization by integrating and analyzing data during the care cycle.
• Provide solutions for pharmaceutical/food companies using aggregated data.

Strategy
• Focus on prediction of medical conditions, which is essential for both disease prevention support and medical process optimization.
• Develop core technologies using UK and JP PoC, and break into the ACO market in the US.

Figure: the care cycle (prevention, test, diagnosis, therapy, home care) and the value delivered to each stakeholder, covering primary use and secondary use of the data.
• For home: personalized home care
• For checkup provider: personalized prevention
• For medical provider / hospital & clinic: optimize medical process and clinical resources to collaborate with checkup and home care
• For insurer: improve total health status of insured
• For pharmaceutical companies: marketing, drug development, drug efficacy / adverse event detection
• For food companies: marketing, food efficacy / healthy food development

PoC: Proof of Concept, ACO: Accountable Care Organization

1-3. Hitachi’s next generation healthcare service

Figure: service stack on the 【Secured healthcare cloud】, with three layers: Service (application), Analyze (common modules), and Aggregation.

Service (application):
• Healthcare data analytics for value-based healthcare, for insurers and ACOs: disease prevention/management support, population health management, ...
• Secondary use of healthcare data for product development, for pharmaceutical/food companies: marketing, medical cost estimation, adverse event analysis, secure database for open innovation

Components shown in the Analyze and Aggregation layers:
• Common analytic library and API
• Clinical knowledge
• Health records
• Extraction of medical meta information
• Examination analysis
• Structured DB
• Contents warehouse (file contents storage, metadata DB)

Data sources: medical institutions and healthcare providers (medical care guidelines, medical care data)

1-4. Hitachi’s healthcare related assets

3 Hospitals and 2 Health Checkup Centers
• Hospitals: Hitachi General Hospital, Hitachinaka General Hospital, Taga General Hospital (bed counts shown on the slide: 288 beds, 148 beds, 410 beds)
• Health checkup centers: Hitachi Healthcare Center, Tsuchiura Health Checkup Center

Health Insurance Society
• Insured Numbers: 270,000

3 Research Laboratories and 1 Design Div. (Total: 200, Global Lab: US, UK, CN, SG)
• Central Research Laboratory (900)
• Hitachi Research Laboratory (1,200)
• Yokohama Research Laboratory (1,100)
• Design Division (150)

2. Hitachi’s Technologies and Their Pharmaceutical Industry Applications

2-1. Technologies mapped on service stack

Figure: the service stack of 1-3 on the 【Secured healthcare cloud】, annotated with the technologies and applications covered in the following sections.

• 2-2: Graph-based clinical repository construction, medical text analysis, security/privacy protection
• 2-3: Disease progression analysis, medical cost simulation
• 2-4: Marketing, medical cost estimation, adverse event analysis, secure database for open innovation

2-2. Technologies for managing clinical data

① Clinical data is messy → It is necessary to reduce the cost of the data preparation process.
② Clinical data is sensitive → It is necessary to guarantee data security while keeping data easy to handle.

【Issues】
• Missing data (lab data, disease name, etc.)
• Data fragmentation
• Privacy protection

Figure: life habit data, health data, omics data, and clinical data pass through ingest, cleansing, and curation into the staging database and the analytic database, which serves Research A, B, and C. The figure also shows a medical knowledge store.

Technologies:
• Graph-based clinical repository construction
• Medical text analysis
• Security / privacy protection

Functions listed on the figure:
• Cohort extraction
• k-anonymization
• Privacy-preserved search/analysis

2-2-1. Graph-based clinical repository construction

Constructing a repository of clinical events extracted from hospital information systems (EMR, PACS, LIS, etc.).

By adding clinical knowledge as semantic links between clinical events, the repository enables a 90% reduction in the human labor needed to make a data mart.

◆ Add semantic links between clinical events automatically extracted from hospital information systems

Clinical knowledge used for the links:
• Clinical guideline
• Disease-drug relation
• Disease-test relation, etc.

Figure: clinical semantic linkage in a graph-based representation. Events for Patient A (problems such as cirrhosis and liver cancer, drugs, tests, operations, treatment) are extracted from EMR, PACS, LIS, finance systems, and medical texts. Added links connect the drug, test, and operation events to the disease they are focused on.
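The linking idea in 2-2-1 can be illustrated with a small sketch. It is only an assumption about how such a repository might be organized, not Hitachi's implementation: the `KNOWLEDGE` table, the event names (`drug-1`, `test-1`, and so on), and the relation names `has_event` and `focused_on` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical clinical knowledge: (event type, event name) -> related disease.
KNOWLEDGE = {
    ("Drug", "drug-1"): "Liver Cancer",
    ("Test", "test-1"): "Liver Cancer",
    ("Operation", "operation-1"): "Liver Cancer",
    ("Drug", "drug-2"): "Cirrhosis",
}

def build_graph(events):
    """Link each patient to its events, then add knowledge-based semantic links."""
    problems = defaultdict(set)
    for patient, kind, name in events:
        if kind == "Problem":
            problems[patient].add(name)
    graph = defaultdict(set)  # node -> set of (relation, node)
    for patient, kind, name in events:
        graph[patient].add(("has_event", (kind, name)))
        disease = KNOWLEDGE.get((kind, name))
        # Added link: only when this patient actually has the related problem.
        if disease in problems[patient]:
            graph[(kind, name)].add(("focused_on", ("Problem", disease)))
    return graph

def data_mart(graph, disease):
    """Collect every event that is linked to the given disease."""
    target = ("focused_on", ("Problem", disease))
    return sorted(node for node, links in graph.items() if target in links)

events = [
    ("PatientA", "Problem", "Cirrhosis"),
    ("PatientA", "Problem", "Liver Cancer"),
    ("PatientA", "Drug", "drug-1"),
    ("PatientA", "Drug", "drug-2"),
    ("PatientA", "Test", "test-1"),
    ("PatientA", "Operation", "operation-1"),
]
graph = build_graph(events)
print(data_mart(graph, "Liver Cancer"))
```

Once the links exist, extracting a disease-specific data mart becomes a graph lookup rather than a manual join across source systems, which is the kind of labor reduction the slide points to.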

2-2-2. Medical text analysis (1/2)

Unstructured medical texts (such as discharge summaries) contain a lot of medical information that cannot be directly utilized by a computer.

Analyzing medical texts transforms unstructured texts into XML. This enables knowledge extraction from a large volume of medical texts (including publicly available DBs).

Flow: Discharge summary (pdf) → Layout analysis → Hospital course → Structuring

Input is text.
• Flexible
• Human friendly

  Prednisolone 30 mg daily was commenced, but her clinical picture did not improve. Dose was increased to 50 mg daily, .......

Output is structured XML.
• Machine friendly
• Easy to analyze

  <event>
    <type>Action</type>
    <modifier>commence</modifier>
    <action>.prescription.</action>
    <what>Prednisolone</what>
    <quantity>30 mg</quantity>
  </event>
  <event>
    <type>State-Change</type>
    <change>improve</change>
    <wrt>clinical picture</wrt>
  </event>
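To make the text-to-XML step concrete, here is a minimal sketch that produces the first `<event>` of the example from the sample sentence. It swaps in a single hand-written regular expression for the real analysis, so it is an illustration of the output format only; the pattern `PRESCRIPTION` and the function `structure` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern for one sentence shape: "<Drug> <dose> mg daily was commenced".
PRESCRIPTION = re.compile(r"(?P<drug>[A-Z][a-z]+) (?P<dose>\d+ mg) daily was commenced")

def structure(text):
    """Return a list of <event> elements found in the text."""
    events = []
    for match in PRESCRIPTION.finditer(text):
        event = ET.Element("event")
        ET.SubElement(event, "type").text = "Action"
        ET.SubElement(event, "modifier").text = "commence"
        ET.SubElement(event, "action").text = ".prescription."
        ET.SubElement(event, "what").text = match.group("drug")
        ET.SubElement(event, "quantity").text = match.group("dose")
        events.append(event)
    return events

text = "Prednisolone 30 mg daily was commenced, but her clinical picture did not improve."
for event in structure(text):
    print(ET.tostring(event, encoding="unicode"))
```

A pattern like this covers one phrasing only. Handling free clinical text needs linguistic analysis rather than a growing list of patterns.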

2-2-2. Medical text analysis (2/2)

■ Structuring clinical documents (e.g. discharge summaries)
Aiming to extract information such as time, date, symptoms, and treatments from unstructured text documents.

Flow: Text documents → Morphological analysis → Predicate argument analysis (priority rule for argument search) → Predicate argument structure data → Event structure analysis → Structuring (XML)

Example: 体温が38度から40度まで上昇した (Body temperature increased from 38℃ to 40℃)

  <event>
    <polarity>0</polarity>
    <attitude>confirm</attitude>
    <change>.up.</change>
    <action>上昇 (increase)</action>
    <attribute>体温 (body temperature)</attribute>
    <is>40度 (40 degrees)</is>
    <was>38度 (38 degrees)</was>
  </event>

■ Analysis speed: 0.45 sec/document
Two years of discharge summaries from Hitachi General Hospital (approximately 19,000) can be analyzed in 2.4 hours.

■ Accuracy: 80 - 85% (precision: 90%, recall (giving answers): 90%)
Internal medicine cases score higher than surgery cases.
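The stated throughput follows from the per-document speed, taking the approximate count of 19,000 documents at face value:

```latex
\[
19{,}000 \ \text{documents} \times 0.45 \ \text{s/document}
  = 8{,}550 \ \text{s}
  = \frac{8{,}550}{3{,}600} \ \text{h}
  \approx 2.4 \ \text{h}
\]
```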

2-2-3. Privacy protection / k-Anonymization (1/2)

What’s k-anonymization?

Simple anonymization: from the data, remove all or some of the details by which individuals could be identified, e.g. name, address, and phone number.

  Name            Phone number
  John Smith      0332581111
  Mary Williams   0332581111
  Robert Jones    0423231111
  Linda Taylor    0458603093

  Address                Age   Gender
  Tokyo, Japan           28    Male
  New York, USA          27    Female
  Okinawa, Japan         112   Male
  Singapore, Singapore   33    Female

Records with rare data still remain, which might result in identification of individuals. Re-identification risk remains!

k-anonymization: remove some details of cells so that the number of rows with the same values becomes k or more. After k-anonymization, the probability of identifying an individual is at most 1/k.

e.g. k = 2

  Before:
  Address          Age   Gender
  Tokyo, Japan     28    Male
  NY, USA          27    Female
  Wash., USA       25    Female
  Okinawa, Japan   21    Male
  LA, USA          25    Female

  After:
  Address   Age        Gender
  Japan     Twenties   Male
  USA       Twenties   Female
  USA       Twenties   Female
  Japan     Twenties   Male
  USA       Twenties   Female

• Technical challenges in k-anonymization are (1) how to reduce the information loss accompanying anonymization, and (2) how to reduce trial and error in operation.
• Hitachi has techniques for solving these problems.
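The k = 2 example can be reproduced with a short sketch. It uses the five rows from the slide and one fixed generalization rule (keep only the country and the age decade); the function names `generalize` and `is_k_anonymous` are illustrative, and the rule is an assumption chosen to match the slide's output.

```python
from collections import Counter

def generalize(record):
    """Coarsen one record: keep only the country and the age decade."""
    address, age, gender = record
    country = address.split(", ")[-1]
    decade = f"{age // 10 * 10}s"
    return (country, decade, gender)

def is_k_anonymous(records, k):
    """True if every combination of values is shared by at least k records."""
    counts = Counter(records)
    return all(n >= k for n in counts.values())

records = [
    ("Tokyo, Japan", 28, "Male"),
    ("NY, USA", 27, "Female"),
    ("Wash., USA", 25, "Female"),
    ("Okinawa, Japan", 21, "Male"),
    ("LA, USA", 25, "Female"),
]
anonymized = [generalize(r) for r in records]
print(is_k_anonymous(records, 2))     # False: every original row is unique
print(is_k_anonymous(anonymized, 2))  # True: groups of 2 (Japan) and 3 (USA)
```

The same generalized table fails for k = 3, because the Japan group has only two rows. That is where the trade-off between k and information loss comes from.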

2-2-3. Privacy protection / k-Anonymization (2/2)

k-anonymization technology controls the re-identification risk of personal data for privacy-aware data usage.

Compared with previous methods, our method reduces the information loss due to k-anonymization by more than 30% by using (1) automatic generation of data hierarchy and (2) evaluation of information loss.

Flow: Personal info. → (1) Generation of data hierarchy (automatic generation) → (2) Info. loss calculation → Anonymization → Anonymized info.

Personal info.:

  Name     Gender   Age   Cancer
  (mask)   F        45    Stomach
  (mask)   F        40    Liver
  (mask)   M        98    Rectum
  (mask)   M        32    Colon

Conventional method (mask): individuals can be identified from the unmasked data.

k-Anonymization technology:

  Name     Gender   Age            Cancer
  (mask)   F        40’s           Stomach
  (mask)   F        40’s           Liver
  (mask)   M        50’s or more   Colon
  (mask)   M        30’s           Colon

Individual identification is made difficult by specifying the number (k) of persons who match the same unmasked data. The difficulty of identification can be controlled by changing k.

Figure: example of an automatically generated data hierarchy for the Cancer attribute under 4-anonymization, with nodes such as “Colon”, “Rectum”, “Colon or …”, and “Liver or Colon or …” and their record counts.
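The idea of evaluating information loss to choose a generalization can be sketched as follows. This is a simplified stand-in, not the method on the slide: it considers only age buckets of fixed width, and the loss metric (bucket width relative to the observed age range) is an assumption. The first four ages come from the slide's table; the other four are hypothetical.

```python
from collections import Counter

def generalize_ages(ages, width):
    """Replace each age by the start of its bucket of the given width."""
    return [age // width * width for age in ages]

def smallest_group(values):
    """Size of the smallest group of identical values."""
    return min(Counter(values).values())

def information_loss(width, age_range):
    """Assumed metric: bucket width relative to the full age range (0 = exact ages)."""
    return (width - 1) / max(1, age_range)

def best_width(ages, k, candidates):
    """Pick the candidate with the least loss whose groups all have size >= k."""
    age_range = max(ages) - min(ages)
    feasible = [w for w in candidates
                if smallest_group(generalize_ages(ages, w)) >= k]
    if not feasible:
        return None
    return min(feasible, key=lambda w: information_loss(w, age_range))

ages = [45, 40, 98, 32, 47, 41, 93, 36]  # last four values are hypothetical
print(best_width(ages, k=2, candidates=[5, 10, 20, 50]))
```

Scoring candidates automatically replaces the manual trial and error of picking a hierarchy, checking k, and trying again.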

2-2-3. Privacy protection / Searchable encryption

Searchable encryption scheme: “Encryption”, “Decryption” + “Searching with data remaining encrypted”

Hitachi scheme
• High efficiency: uses a symmetric-key encryption scheme that enables high-speed processing
• High security: ensures that completely different ciphertexts are produced, even for the same source data

Private search between the user (data owner) and the cloud (service provider):
• Preliminary: the user encrypts his data and stores it in the cloud.
• 1. Send encrypted query: the search word m is sent as Q(m).
• 2. Return found data: the cloud evaluates Check(Q(m), Ci) = Y/N and returns the matching [Ci, Di, Ei].

Data encrypted by searchable encryption:

  Name (B55BE115)    Address (C33A0B12)    Gender (EC4FA6BC)
  C1 = 128E (Abe)    D1 = B22A (Tokyo)     E1 = 6E7A (M)
  C2 = 84D2 (Ito)    D2 = 1AC3 (Chiba)     E2 = D0D5 (M)
  C3 = 24F9 (Abe)    D3 = 4T7A (Tokyo)     E3 = EC2B (F)
  ・・・                ・・・                   ・・・

• Highly secure encryption: for the same plaintexts, different ciphertexts
• Highly secure query: for the same words, different queries
• High efficiency: the additional time cost of encryption is negligible

Search for 10,000 records:

              Total     Client   Transmission (intranet)   Server
  Plaintext   8.2 ms    2.3 ms   4.9 ms                    1.0 ms
  Encrypted   16.3 ms   2.8 ms   5.5 ms                    8.0 ms

This work was supported by Ministry of Internal Affairs and Communications, Japan.
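The Check(Q(m), Ci) interaction can be illustrated with a toy construction built on HMAC from the Python standard library. This is not the Hitachi scheme and should not be used as real cryptography. It only shows how a server can test for a match without seeing the key or the word, and how equal plaintexts can yield different stored values. Unlike the scheme on the slide, the query here is deterministic, and encryption of the data for later decryption is omitted. The names `trapdoor`, `encrypt_tag`, and `check` are hypothetical.

```python
import hashlib
import hmac
import os

def trapdoor(key, word):
    """Query Q(m): a keyed hash of the search word (deterministic in this toy)."""
    return hmac.new(key, word.encode(), hashlib.sha256).digest()

def encrypt_tag(key, word):
    """Searchable tag C: a fresh random salt plus an HMAC of the salt under the word's trapdoor."""
    salt = os.urandom(16)
    mac = hmac.new(trapdoor(key, word), salt, hashlib.sha256).digest()
    return salt, mac

def check(query, tag):
    """Server-side Check(Q(m), C): recompute the MAC without knowing the key or the word."""
    salt, mac = tag
    expected = hmac.new(query, salt, hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)

key = os.urandom(32)                      # held by the data owner only
names = ["Abe", "Ito", "Abe"]
tags = [encrypt_tag(key, n) for n in names]

query = trapdoor(key, "Abe")
print([check(query, t) for t in tags])    # [True, False, True]
print(tags[0] != tags[2])                 # same plaintext, different stored tags
```

Each check costs one extra hash evaluation per record, so most of the added time falls on the server. The slide's timings show the same pattern, with the server share rising from 1.0 ms to 8.0 ms.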

2-3-1. Disease progression modeling

We have been constructing a “disease progression model” using health checkup and claim data.
• Number of health checkups: 200,000
• Number of claims: 1,600,000

Example of disease progression model
• Circles denote health or disease status related to diabetes.
• Arrows denote the direction of disease progression.
• Nodes shown: eating habits, exercise, weight gain, obesity, high blood sugar, diabetes, complications.

Source data
• XXX patients over XX years
• XXX patients over XX years
• 100,000 patients over 2 years

Patient information: age, gender, weight, height, body fat, HbA1c, FPG, exercise habits, smoking habits, etc.

Condition status: impaired glucose tolerance? Impaired FPG? High blood pressure? Have diabetes? Taking oral medication? Using insulin injections? Using insulin pump? Diabetes foot disease? Renal failure? Etc.

Treatment received: blood test, lifestyle intervention, blood pressure medication, oral diabetes medication, glucose test, insulin pens, insulin pump, foot amputation, dialysis, etc.

Visualized in 2D space; the length of the green lines indicates the relational strength of the values.
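One simple way to turn checkup and claim histories into a graph of states and arrows is to count the observed year-to-year transitions. The sketch below assumes a first-order Markov chain, which the slides do not state, and uses made-up patient histories; `transition_probabilities` is a hypothetical name.

```python
from collections import Counter, defaultdict

def transition_probabilities(histories):
    """Estimate P(next state | current state) from yearly state sequences."""
    counts = defaultdict(Counter)
    for states in histories:
        for current, following in zip(states, states[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }

# Hypothetical yearly status sequences, one list per patient.
histories = [
    ["Obesity", "High blood sugar", "Diabetes"],
    ["Obesity", "Obesity", "High blood sugar"],
    ["High blood sugar", "Diabetes", "Complications"],
    ["Obesity", "High blood sugar", "High blood sugar"],
]
model = transition_probabilities(histories)
print(model["Obesity"])
```

Each non-zero entry corresponds to an arrow between two circles in the example figure, weighted by how often that progression was observed.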

2-3-2. Medical cost simulation

We have been developing medical cost simulation using the disease progression model.

The simulation is used not only to estimate medical cost reduction, but also to extract the subjects for whom health promotion services are effective.

Figure: Effect of health guidance in case of 0.2% HbA1c improvement. Horizontal axis: initial HbA1c [%] (5.5 to 7.3). Vertical axis: cumulative savings of 10 years [£/person] (0 to 4600; the Japanese version of the axis reads “cumulative medical cost savings per person over 10 years (yen)”, 0 to 700,000). One curve per elapsed year, from 1 year to 10 years.

Figure: Medical cost simulation in case of guidance for a specific person. Vertical axis: cumulative cost [£/person] (0 to 10k), for the intervention and control cases.

Annotations on the slide: “Highly cost-effective”; “For 10 years: £3500”.
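The comparison of intervention and control curves can be sketched by propagating a state distribution through a progression model and accumulating expected cost. Every number below (states, transition probabilities, annual costs) is a hypothetical placeholder rather than a value from the slides, and the three-state Markov model is an assumption made for illustration.

```python
STATES = ["Healthy", "Diabetes", "Complications"]

# All numbers are hypothetical placeholders, not values from the slides.
ANNUAL_COST = {"Healthy": 100.0, "Diabetes": 1000.0, "Complications": 5000.0}
CONTROL = {
    "Healthy": {"Healthy": 0.9, "Diabetes": 0.1},
    "Diabetes": {"Diabetes": 0.9, "Complications": 0.1},
    "Complications": {"Complications": 1.0},
}
INTERVENTION = {  # assumed effect of guidance: slower progression
    "Healthy": {"Healthy": 0.95, "Diabetes": 0.05},
    "Diabetes": {"Diabetes": 0.95, "Complications": 0.05},
    "Complications": {"Complications": 1.0},
}

def cumulative_cost(transitions, start, years):
    """Expected cumulative cost per person for each year, starting from one state."""
    dist = {s: 0.0 for s in STATES}
    dist[start] = 1.0
    total, curve = 0.0, []
    for _ in range(years):
        total += sum(dist[s] * ANNUAL_COST[s] for s in STATES)
        curve.append(total)
        nxt = {s: 0.0 for s in STATES}
        for s, p in dist.items():
            for t, q in transitions[s].items():
                nxt[t] += p * q
        dist = nxt
    return curve

control = cumulative_cost(CONTROL, "Healthy", 10)
intervention = cumulative_cost(INTERVENTION, "Healthy", 10)
savings = [c - i for c, i in zip(control, intervention)]
print(round(savings[-1], 2))
```

Running the same comparison from different starting conditions shows which subjects gain the most from an intervention, which is how the simulation can be used to select them.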

2-4. Pharmaceutical industry applications

Marketing
• Analysis of medical needs for planning R&D of new drugs
• Planning sales strategy by precise prediction of drug demand

Medical cost estimation
• Evaluation/comparison of effects and medical costs of drugs

Adverse event analysis
• Estimation of the actual rate of adverse events of a drug
• Early detection of adverse events and analysis of contributing factors for minimizing risk

Secure database for open innovation
• Promotion of open innovation by providing a secure data sharing environment

Appendix

A1-1. UK NHS project

Enhance Quality of Life by Integrated Healthcare Platform

Utilizing IT for diabetes prevention and contribution to medical expenditure reduction

NHS GM ※1 and Hitachi, Ltd. are promoting a healthcare service project utilizing IT (from Oct. 2013).

Secure integrated healthcare platform architecture
• The platform consists of an analytics database, privacy enhancing technology, and an analytics application.
• It connects GPs ※2, a hospital, and a research institute.

Adopt IT-based treatment for lifestyle-related diseases
• Participant: self-care, recording glucose level, weight, and exercise history
• Operator: data analysis & guide, visualization, life guidance, advice

Figure: sample self-care screens showing weight (65 kg to 70 kg) from day 0 to day 30, starting 5/18 (Thu), together with the degree of improvement implementation and events.

※1 National Health Service Greater Manchester ※2 General Practitioner

A1-2. Diabetes prevention

Providing a cost effective care plan to people at high risk for diabetes
• Adapting data analysis technology developed by Hitachi to the disease prevention programme in Salford, England
• Adapting medical cost simulation technology, and developing a cost effective care plan

Current lifestyle-related disease prevention programme
• High-risk people and telecarer: interview by form / phone
• Advice based on each telecarer’s decision
• Issues
  - Recorded on paper
  - Invisible effect
  - Manual data management
  - Advice quality is variable

Enhanced lifestyle-related disease prevention programme
• High-risk people: self-check of blood sugar, weight, daily activity, etc.
• Telecarer: advice based on data analytics
• Effective and efficient advice based on the patient’s disease status
  - Visualize daily effect
  - Effective advice

Figure: sample self-check screens showing weight (65 kg to 70 kg) from day 0 to day 30, starting 5/18 (Thu), together with the degree of improvement implementation and events.

Validating improvement of QoL (Quality of Life) by prevention of diabetes and reduction of care cost through PoC
