Page 1: CausalTriad: Toward Pseudo Causal Relation Discovery and ...ir.hit.edu.cn/~sdzhao/Stan_Zhao_ACM BCB 2018-presentation.pdf · CausalTriad: Toward Pseudo Causal Relation Discovery and

CausalTriad: Toward Pseudo Causal Relation Discovery and Hypotheses Generation from

Medical Text Data

Sendong (Stan) Zhao+, Meng Jiang*, Ming Liu+, Bing Qin+, Ting Liu+

+ Harbin Institute of Technology, China; * University of Notre Dame, USA

Pseudo Causal Relation

• Gold standard

⁃ Randomized controlled experiments

⁃ Too costly

• Observational data

⁃ Structured data, e.g., EHR

⁃ Unstructured data (text data), e.g., medical literature, patient reports

• Pseudo causal relation

⁃ Semantic-level causal relations

⁃ May be verified true causal knowledge

⁃ Or may not have been identified previously

⁃ Or may have no evidence to support them

Previous Studies

• Extract causal relations from single sentences

⁃ But causal relations usually span multiple sentences

• Use only textual information and ignore structural information

⁃ But causal relations naturally have an attached network structure

• Perform only extraction rather than inference

⁃ But causality itself is a basic logical rule

Causation Transitivity

• Preserving transitivity is a basic desideratum for an adequate analysis of causation

--L. A. Paul and Ned Hall “Causation: A User’s Guide”

[Figure: a causal chain A → B → … → C, from which A → C follows by transitivity]

Causation Transitivity in Medical Text

Obesity usually increases the risk of diabetes.

People with diabetes have more sugar in blood called hyperglycemia.

Metformin has become a mainstay of type 2 diabetes management and is now the recommended first-line drug for treating the disease.

[Figure: network over Obesity, Diabetes, Hyperglycemia, and Metformin; two edges are labeled "cause" and two are marked "?"]
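The example sentences above state two relations (obesity to diabetes, diabetes to hyperglycemia) that can be chained by transitivity. The following is a minimal sketch of that chaining step only, not the authors' model: the function name `transitive_hypotheses` and the edge-list representation are assumptions made for illustration.

```python
from itertools import product


def transitive_hypotheses(edges):
    """Return pairs (a, c) implied by a->b and b->c that are not already known edges."""
    known = set(edges)
    hypotheses = set()
    for (a, b), (b2, c) in product(known, repeat=2):
        # Chain two edges that meet at the same middle entity.
        if b == b2 and a != c and (a, c) not in known:
            hypotheses.add((a, c))
    return hypotheses


# Relations read off the first two example sentences (illustrative labels).
known_causes = [("Obesity", "Diabetes"), ("Diabetes", "Hyperglycemia")]
hypotheses = transitive_hypotheses(known_causes)
print(hypotheses)
```

Here the only inferred candidate is the pair (Obesity, Hyperglycemia). It is a hypothesis to be checked against evidence, not an established relation.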

Motivation

• Jointly utilize

⁃ Textual information (context and co-occurrence)

⁃ Structural information (causation transitivity rule)

• Through inference to

⁃ Discover causal relations in text

⁃ Generate new causal relation hypotheses

Problem Definition

• Problem: Causal Relation Discovery from Triad Structures

• Medical cause-effect candidate network $G = (V, E)$, $E \subseteq V \times V$

• Triad Structure

⁃ Each Triangle in the network

⁃ Basic unit

Our method

• Causal Relation Candidates Matching

• 3 Clues for Causal Discovery

⁃ Causal Association

⁃ Contextual Information

⁃ Causal Transitivity Rules

• Factor Graph Model

Causal Relation Candidates Matching

• Medical Dictionary

⁃ Dryad data package

⁃ TCMonline and TCMID

• For every n consecutive sentences

⁃ Match medical entities

⁃ Pair the matched entities with one another

⁃ Every two pairs with a shared entity generate a triad structure

⁃ E.g., $(e_i, e_k)$ and $(e_i, e_j)$ generate the triad structure $(e_k, e_i, e_j)$
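The matching steps above can be sketched as follows. This is a hedged illustration, not the authors' code: the helper names, the case-insensitive substring matching, and the toy four-entry dictionary are all assumptions, and a real system would use the medical dictionaries named on the slide.

```python
from itertools import combinations


def match_entities(sentence, dictionary):
    """Return dictionary entries found in the sentence (naive substring match, an assumption)."""
    lowered = sentence.lower()
    return [e for e in dictionary if e.lower() in lowered]


def candidate_pairs(sentences, dictionary, n=2):
    """Pair all entities matched inside each window of n consecutive sentences."""
    pairs = set()
    for start in range(max(1, len(sentences) - n + 1)):
        window = sentences[start:start + n]
        entities = sorted({e for s in window for e in match_entities(s, dictionary)})
        pairs.update(combinations(entities, 2))
    return pairs


def triads(pairs):
    """Two pairs sharing one entity e_i yield the triad (e_k, e_i, e_j)."""
    result = set()
    for p, q in combinations(sorted(pairs), 2):
        shared = set(p) & set(q)
        if len(shared) == 1:
            (e_i,) = shared
            (e_k,) = set(p) - shared
            (e_j,) = set(q) - shared
            result.add((e_k, e_i, e_j))
    return result


toy_dictionary = ["obesity", "diabetes", "hyperglycemia", "metformin"]
toy_sentences = [
    "Obesity usually increases the risk of diabetes.",
    "People with diabetes have more sugar in blood called hyperglycemia.",
    "Metformin has become a mainstay of type 2 diabetes management.",
]
pairs = candidate_pairs(toy_sentences, toy_dictionary, n=2)
triad_set = triads(pairs)
```

With a window of two sentences, obesity and metformin never fall in the same window, so they are not paired directly; they are connected only through triads that share diabetes or hyperglycemia.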

3 Clues for Causal Discovery

• Causal Association

⁃ Frequently co-occurring entities are more likely to be causally related [Do and Roth 2013]

⁃ $e_i$ is a possible cause of entity $e_j$ if $e_j$ happens more frequently with $e_i$ than by itself [Suppes 1970]

• Contextual Information

⁃ Causal relations in the text tend to share special contexts

⁃ Such as domain-related words, causal triggers, connectives, etc.

• Causation Transitivity Rule
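The [Suppes 1970] criterion can be read as a probability-raising test: $e_i$ is a candidate cause of $e_j$ when $P(e_j \mid e_i) > P(e_j)$. A minimal sketch under that reading, assuming probabilities are estimated from raw co-occurrence counts (the function name and the count-based estimate are illustrative assumptions):

```python
def raises_probability(count_i, count_j, count_ij, total):
    """Check P(e_j | e_i) > P(e_j) using simple count-based estimates."""
    if count_i == 0 or total == 0:
        return False  # no evidence about e_i, so no claim is made
    p_j = count_j / total
    p_j_given_i = count_ij / count_i
    return p_j_given_i > p_j


# Toy counts: e_j appears in 10% of windows overall, but in 50% of windows containing e_i.
result = raises_probability(count_i=20, count_j=100, count_ij=10, total=1000)
print(result)
```

Probability raising is symmetric in the two entities, so by itself it signals association and cannot settle the direction of a relation.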

Causal Association

• Modeling causal association

$$CA(e_{ij}) = I(e_i, e_j) \times D(e_i, e_j) \times \max(u_i, u_j)$$

⁃ Favor pairs with larger mutual information

$$I(e_i, e_j) = \log \frac{P(e_i, e_j)}{P(e_i)\,P(e_j)}$$

⁃ Reward pairs that occur closer together, while penalizing those that are further apart in the text

$$D(e_i, e_j) = -\log \frac{|\mathrm{sent}(e_i) - \mathrm{sent}(e_j)| + 1}{2 \times WS}$$

⁃ Model the frequency of co-occurrence of two medical entities with $\max(u_i, u_j)$

$$u_i = \frac{P(e_i, e_j)}{\max_k P(e_i, e_k) - P(e_i, e_j) + \varepsilon}, \quad u_j = \frac{P(e_i, e_j)}{\max_k P(e_k, e_j) - P(e_i, e_j) + \varepsilon}$$
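The three terms of the causal association score can be combined as in the sketch below. It assumes the probabilities are already estimated, that sent(·) is a sentence index, that the distance uses an absolute difference, and that WS is the sentence window size; none of these are spelled out on the slide, and the argument names are hypothetical.

```python
import math


def causal_association(p_ij, p_i, p_j, max_p_i, max_p_j, sent_i, sent_j, ws, eps=1e-6):
    """CA(e_ij) = I(e_i, e_j) * D(e_i, e_j) * max(u_i, u_j).

    max_p_i stands for max_k P(e_i, e_k) and max_p_j for max_k P(e_k, e_j).
    """
    # I: log ratio of the joint probability to the product of the marginals.
    mutual_info = math.log(p_ij / (p_i * p_j))
    # D: larger when the two mentions are closer together (ws assumed to be the window size).
    distance = -math.log((abs(sent_i - sent_j) + 1) / (2 * ws))
    # u_i, u_j: how close this pair is to the strongest pairing of each entity.
    u_i = p_ij / (max_p_i - p_ij + eps)
    u_j = p_ij / (max_p_j - p_ij + eps)
    return mutual_info * distance * max(u_i, u_j)


# Same toy pair, mentioned in the same sentence versus two sentences apart.
near = causal_association(0.02, 0.05, 0.1, 0.03, 0.04, sent_i=0, sent_j=0, ws=3)
far = causal_association(0.02, 0.05, 0.1, 0.03, 0.04, sent_i=0, sent_j=2, ws=3)
```

With these toy numbers the mutual-information term is positive, so the pair mentioned in one sentence scores higher than the same pair two sentences apart.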

Contextual Information (1)

• Encode Synthetic Context

Contextual Information (2)

• Encode context based on pre-trained word2vec word embeddings

• Three ways
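The transcript does not preserve which three encodings the slide showed. As one common option, and purely as an assumption, the context between two entities can be encoded by averaging the embeddings of its words. The toy two-dimensional vectors below stand in for pre-trained word2vec vectors.

```python
def average_embedding(tokens, embeddings, dim):
    """Average the vectors of the tokens that have an embedding; zero vector if none do."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Hypothetical toy embeddings (real word2vec vectors have hundreds of dimensions).
toy_embeddings = {"increases": [1.0, 0.0], "risk": [0.0, 1.0]}
context = ["usually", "increases", "the", "risk", "of"]
encoded = average_embedding(context, toy_embeddings, dim=2)
print(encoded)
```

Words without an embedding are skipped, so the result depends only on the covered vocabulary.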

Causation Transitivity Rules

• Angle rules and triadic rule

Integrate 3 Clues

• Combining evidence from both textual support and structural inference, the above three clues are better equipped to discover causal relations.

• They are complementary in several ways:

⁃ Causal association gives preference to frequently co-occurring causal pairs.

⁃ Causal transitivity rules identify causal relations that have little textual support but follow the transitivity rule, and they generate new causal hypotheses.

⁃ Incorporating contextual information from the text can potentially eliminate frequently co-occurring medical entities that are not causally related.

CausalTriad: Factor Graph for Each Triad Structure
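The factor graph diagram is not preserved in this transcript. As a rough illustration of the idea, and not the authors' model, a triad can be treated as three binary edge labels with one unary score per edge (standing in for the association and context clues) plus one triadic factor that rewards transitive consistency. The scores, the weight, and the brute-force search are all assumptions for illustration.

```python
from itertools import product
import math


def triad_map(unary, transitivity_weight=1.0):
    """Brute-force the best labeling of edges (a->b, b->c, a->c) in one triad.

    unary[k] is a hypothetical score for edge k being causal.
    """
    best, best_score = None, -math.inf
    for labels in product([0, 1], repeat=3):
        score = sum(u * y for u, y in zip(unary, labels))
        ab, bc, ac = labels
        # Triadic factor: if a->b and b->c both hold, reward a->c and penalize its absence.
        if ab and bc:
            score += transitivity_weight if ac else -transitivity_weight
        if score > best_score:
            best, best_score = labels, score
    return best


# Edge a->c has weak textual support on its own (negative unary score).
with_rule = triad_map([2.0, 1.5, -0.5], transitivity_weight=1.0)
without_rule = triad_map([2.0, 1.5, -0.5], transitivity_weight=0.0)
print(with_rule, without_rule)
```

In this toy setting the weakly supported edge is labeled causal only when the transitivity factor is active, which mirrors the role the slides give to structural inference.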

Experiments

• Data collection

⁃ TCM consists of the abstracts of 106,151 papers.

⁃ HealthBoards consists of posts on health and medical issues such as diseases, symptoms, medicines, and side effects.

Experimental Results

• Generating new causal relation hypotheses

Experimental Results

• Different types of causal relations

⁃ DISEASE–cause–SYMPTOM

⁃ FORMULA–against–DISEASE

⁃ HERB–against–DISEASE

⁃ FORMULA–relieve–SYMPTOM

⁃ HERB–relieve–SYMPTOM

⁃ DISEASE–bring–DISEASE

⁃ DRUG–against–DISEASE

⁃ DISEASE–cause–SYMPTOM

Experimental Results

• Patterns of causal reasoning rules

Experimental Results

• Causal relation extraction

Experimental Results

• Extracting causal relations from single sentences and from multiple sentences

• Extracting implicit causal relations

Influence Factors

• Influence of the size of the labeled training data

Influence Factors

• Influence of the number of bootstrapping rounds and the window size

Conclusions

• We propose CausalTriad, which incorporates both textual and structural clues for causal relation discovery from text.

• Experimental results on two datasets demonstrate that:

⁃ CausalTriad is effective at discovering explicit and implicit causal relations, both within single sentences and across multiple sentences.

⁃ CausalTriad can generate new causal relation hypotheses through inference.

Thank You!

Any comments and suggestions?

Homepage: http://ir.hit.edu.cn/~sdzhao/

Email: [email protected]

Sendong (Stan) Zhao, Meng Jiang, Ming Liu, Ting Liu, Bing Qin