
Eidos, INDRA, & Delphi: From Free Text to Executable Causal Models

Rebecca Sharp, Adarsh Pyarelal, Benjamin M. Gyori†, Keith Alcock, Egoitz Laparra, Marco A. Valenzuela-Escárcega, Ajay Nagesh, Vikas Yadav,

John A. Bachman†, Zheng Tang, Heather Lent, Fan Luo, Mithun Paul, Steven Bethard, Kobus Barnard, Clayton T. Morrison, Mihai Surdeanu

University of Arizona, Tucson, Arizona, USA
†Harvard Medical School, Boston, Massachusetts, USA
{bsharp,adarsh}@email.arizona.edu

Abstract

Building causal models of complicated phenomena such as food insecurity is currently a slow and labor-intensive manual process. In this paper, we introduce an approach that builds executable probabilistic models from raw, free text. The proposed approach is implemented through three systems: Eidos1, INDRA2 and Delphi3. Eidos is an open-domain machine reading system designed to extract causal relations from natural language. It is rule-based, allowing for rapid domain transfer, customizability, and interpretability. INDRA aggregates multiple sources of causal information and performs assembly to create a coherent knowledge base and assess its reliability. This assembled knowledge serves as the starting point for modeling. Delphi is a modeling framework that assembles quantified causal fragments and their contexts into executable probabilistic models that respect the semantics of the original text and can be used to support decision making.

1 Introduction

Food insecurity is an extremely complex phenomenon that affects wide swathes of the global population, and is governed by factors ranging from biophysical variables that affect crop yields, to social, economic, and political factors such as migration, trade patterns, and conflict.

For any attempt to combat food insecurity to be effective, it must be informed by a model that comprehensively considers the myriad of factors influencing it. Furthermore, for analysts and decision makers to truly trust such a model, it must be causal and interpretable; that is, it must provide a mechanistic explanation of the phenomenon, rather than just being a black-box statistical construction.

1 https://github.com/clulab/eidos
2 https://github.com/sorgerlab/indra
3 https://github.com/ml4ai/delphi

Currently, however, these models are hand-built for each new situation and require many months to construct, resulting in long delays for much-needed interventions.

Here we propose an end-to-end system that combines open-domain information extraction (IE) with a quantitative model-building framework, transforming free text into executable probabilistic models that capture complex real-world systems. All code and data described here are open-source and publicly available, and we provide a short video demonstration4.

Contributions:
(1) We introduce Eidos, a rule-based open-domain IE system that extracts causal statements from raw text. To maximize domain independence, Eidos is largely unlexicalized (with the exception of causal cues such as promotes), and implements a top-down approach where causal interactions are extracted first, followed by the participating concepts, which are grounded with specific geospatial and temporal contexts for model contextualization. Eidos also extracts quantifiable adjectives (e.g., significant) that can be used to form a bridge between qualitative statements and quantitative modeling.

(2) We describe an extension of the Integrated Network and Dynamical Reasoning Assembler (INDRA, Gyori et al., 2017), an automated knowledge and model assembly system that implements interfaces to Eidos and multiple other machine reading systems. INDRA was originally developed to assemble models of biochemical mechanisms; we generalized it to represent general causal influences as INDRA Statements and to load a taxonomy of concepts used to align related Statements from multiple readers and documents.

(3) We introduce Delphi, a Bayesian modeling framework that converts the above statements into executable probabilistic models that respect the semantics of the source text. These models can help decision-makers rapidly build intuition about complicated systems and their dynamics. The proposed framework is interpretable due to its foundation in rule-based IE and Bayesian generative modeling.

4 https://youtu.be/FcLEJej1uAg

Figure 1: Overall architecture showing the flow of information between the systems (Text → Eidos: causal relation extraction → INDRA: multi-reader integration → Delphi: model assembly and parameterization → causal probabilistic models).

Architecture: In Fig. 1, we show a high-level depiction of the information flow pipeline. First, natural language texts serve as inputs to Eidos, which performs causal relation extraction, grounding, and spatiotemporal contextualization. The extracted relations are subsequently aggregated by INDRA into data structures called INDRA Statements for downstream modeling. These serve as an input to Delphi, which assembles a causal probabilistic model from them.

2 Causal Information Extraction

Eidos was designed as an open-domain IE system (Banko et al., 2007) with a top-down approach that allows us to not be limited to a fixed set of concepts, as determining this set across multiple distinct domains (e.g., agronomy and socioeconomics) is close to impossible. First, we find trigger words signaling a relation of interest and then extract and expand the participating concepts (Section 2.1), link these concepts to a taxonomy (Section 2.2), and annotate them with temporal and spatial context (Section 2.3).5,6

In addition to an API that can be used for machine reading at scale, Eidos has a webapp that provides users a way to see what rules were responsible for the extracted content, as well as brat visualizations (Stenetorp et al., 2012) of the output, facilitating rapid development of the interpretable rule grammars.

5 This has some similarities to FrameNet (Baker et al., 1998), whose Causation frame has targets (triggers) and frame elements (participating concepts) that are associated with a taxonomy (the FrameNet hierarchy). In our case, the concepts come from a domain-specific taxonomy.

6 We assume here that causal relations are specified within sentences rather than across sentences at the document level, and that the concepts involved in the causal relations can be linked to an appropriate taxonomy.

2.1 Reading Approach

To understand our top-down approach, let us consider the individual steps involved in processing the following sentence: The significantly increased conflict seen in South Sudan forced many families to flee in 2017.

(1) We begin by preprocessing the text with dependency syntax using Stanford CoreNLP (Manning et al., 2014) and the processors library7.

(2) Then, Eidos finds any occurrences of quantifiers (gradable adjectives and adverbs). These are common in the high-level texts relevant to food insecurity, such as reports from UN agencies and nonprofits, but they are difficult to use in quantitative models without additional information. In the example above, the word significantly is found as a quantifier of increased. Delphi uses these quantifiers to construct probability density functions using the crowdsourced data of Sharp et al. (2018), as detailed in Section 4.

(3) Next, Eidos uses a set of trigger words to find causal and correlational relations with an Odin grammar (Valenzuela-Escárcega et al., 2016). Odin is an information extraction framework that includes a declarative language supporting both surface and syntactic patterns, as well as a runtime system. Eidos’s grammar was based in part on the biomedical grammar developed by Valenzuela-Escárcega et al. (2018), but adapted to the open domain and our representation of concepts. This rule grammar is fully interpretable and easily editable, allowing users to make modifications without needing to retrain a complex model. In the example sentence from earlier, the extraction of a causal relation would be triggered by the word forced, with conflict and families identified as the initial cause and effect, respectively.

(4) The initial cause and effect are then expanded using dependency syntax, following the approach of Hahn-Powell et al. (2017). Namely, from each of the initial arguments, we traverse outgoing dependency links to expand the arguments into their dependency subgraphs (see the sketch following this walkthrough). Here, the resulting arguments are significantly increased conflict seen in South Sudan and many families to flee in 2017.

7 https://github.com/clulab/processors


Figure 2: Screenshot of the Eidos output for the running example sentence, visualized in Eidos’s webapp.

(5) Relevant state information is then added to the expanded concepts. Representing the polarity of an influence on the causal relation edge (i.e., in terms of promotes or inhibits) can be lossy, so Eidos instead uses concept states (i.e., concepts can be increased, decreased, and/or quantified). In the example above, Eidos marks the concept pertaining to conflict as being increased and quantified. If desired, the promotion/inhibition representation with edge polarity can be straightforwardly recovered. The final output of the Eidos system for the running example sentence, as displayed in the Eidos webapp, is shown in Fig. 2.
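To make step (4) concrete, below is a minimal sketch of dependency-based argument expansion. This is illustrative Python, not Eidos’s Scala implementation; the set of relations that block traversal and the toy parse are assumptions.

```python
import networkx as nx

# Dependency relations we do not traverse when expanding an argument.
# This set is illustrative; Eidos's actual list lives in its grammar/configuration.
INVALID_OUTGOING = {"conj", "cc", "punct", "advcl"}

def expand_argument(dep_graph, head):
    """Expand an initial argument head into its dependency subgraph by
    traversing outgoing dependency edges, skipping invalid relations."""
    selected = {head}
    frontier = [head]
    while frontier:
        node = frontier.pop()
        for _, child, data in dep_graph.out_edges(node, data=True):
            if data["rel"] in INVALID_OUTGOING or child in selected:
                continue
            selected.add(child)
            frontier.append(child)
    return sorted(selected)  # token indices covered by the expanded argument

# Toy parse for "significantly(0) increased(1) conflict(2) seen(3) in(4) South Sudan(5)",
# treating "South Sudan" as a single token for simplicity.
g = nx.DiGraph()
g.add_edge(2, 1, rel="amod")    # conflict -> increased
g.add_edge(1, 0, rel="advmod")  # increased -> significantly
g.add_edge(2, 3, rel="acl")     # conflict -> seen
g.add_edge(3, 5, rel="obl")     # seen -> South Sudan
g.add_edge(5, 4, rel="case")    # South Sudan -> in

print(expand_argument(g, head=2))  # -> [0, 1, 2, 3, 4, 5]
```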

2.2 Concept linking

The Eidos reading system, with its top-down approach, was designed to keep extracted concepts as close to the text as possible, intentionally allowing downstream users to make decisions about event semantics depending on their use cases. As a result, linking concepts to a taxonomy becomes critical for preventing sparsity.

Eidos’s concept linking is based on word-embedding similarities. A given concept (with stop words removed) is represented by the average of the word embeddings of its words. A vector for each node in the taxonomy is calculated similarly (using the provided “examples” for the node), and the taxonomy node whose vector is closest to the concept vector is considered to be the grounding. In practice, Eidos returns the top k groundings, allowing for downstream disambiguation. The concept linking strategy is modular and allows for grounding to any taxonomy provided in the human-readable YAML format. With this method, Eidos is able to link to an arbitrary number of taxonomies, at both high and low levels of abstraction.
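The grounding procedure can be sketched as follows. This is illustrative Python rather than the Scala implementation in Eidos; the embedding table, stop-word list, and taxonomy of node-to-example-words are placeholders the caller must supply.

```python
import numpy as np

def embed(tokens, vectors, stop_words):
    """Average the word vectors of the non-stop-word tokens (None if nothing remains)."""
    vecs = [vectors[t] for t in tokens if t in vectors and t not in stop_words]
    return np.mean(vecs, axis=0) if vecs else None

def ground(concept_tokens, taxonomy, vectors, stop_words, k=3):
    """Return the top-k taxonomy nodes ranked by cosine similarity between the
    averaged concept vector and each node's averaged 'examples' vector."""
    c = embed(concept_tokens, vectors, stop_words)
    if c is None:
        return []
    scored = []
    for node, examples in taxonomy.items():   # e.g. "conflict" -> ["war", "fighting", ...]
        n = embed(examples, vectors, stop_words)
        if n is None:
            continue
        sim = float(np.dot(c, n) / (np.linalg.norm(c) * np.linalg.norm(n)))
        scored.append((node, sim))
    return sorted(scored, key=lambda x: x[1], reverse=True)[:k]
```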

2.3 Temporal and geospatial normalization

Time normalization The context surrounding the extractions is often critical for downstream reasoning. Eidos integrates the temporal parser of Laparra et al. (2018), which uses a character recurrent neural network to identify time expressions in the text. These expressions are then linked together with a set of rules into semantic graphs that follow the SCATE schema (Bethard and Parker, 2016) and can be interpreted using temporal logic to obtain the intervals referred to by the time expressions.

After the time expressions are identified and normalized, an Odin grammar attaches them to the causal relations extracted by Eidos. If the document creation time is provided, it is also parsed by our model and used as the default temporal attachment for those causal relations without a temporal expression in their close context.

Geospatial normalization Eidos’s geospatial normalization module (Yadav et al., 2019) has two components: a detection component, consisting of the word-level LSTM named entity recognition (NER) model of Yadav and Bethard (2018), and a normalization component, which implements a population heuristic (i.e., selecting the most populous location (Magge et al., 2018)) and filters candidates using a distance-based heuristic (Magge et al., 2018).
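A minimal sketch of the population heuristic follows; the candidate schema below is hypothetical, and the real normalizer works over gazetteer entries and additionally applies the distance-based filter.

```python
def most_populous(candidates):
    """Select the most populous candidate location (population heuristic of
    Magge et al., 2018). Each candidate is a dict with 'name', 'entry_id',
    and 'population' keys -- an illustrative schema, not the real one."""
    return max(candidates, key=lambda c: c.get("population", 0)) if candidates else None

candidates = [
    {"name": "Springfield", "entry_id": "A", "population": 500_000},
    {"name": "Springfield", "entry_id": "B", "population": 1_200},
]
print(most_populous(candidates)["entry_id"])  # -> A
```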

3 Assembly of causal relations

The output of Eidos is processed by INDRA into a collection of INDRA Statements, each of which represents a causal influence relation. INDRA is also able to process the output of multiple other reading systems that extract causal relations from text (these systems are not described in detail here). INDRA implements input processor modules to extract standardized Statements from each reading system. A Statement represents a causal influence between two Concepts (a subject and an object), each of which is linked to one or more taxonomies (see Section 2.2). The Statement also captures the polarity and magnitude of change in both subject and object, if available. Finally, one or more Evidences are attached to each Statement, capturing provenance (reader, document, sentence) and context (time, location) information. This common representation establishes a link between diverse knowledge sources and several model formalism endpoints.
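The shape of this representation can be illustrated with a simplified sketch. These are not INDRA’s actual class definitions; field names, taxonomy labels, and scores below are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class Concept:
    text: str                                       # surface text of the concept
    groundings: Dict[str, List[Tuple[str, float]]]  # taxonomy -> (node, score) list

@dataclass
class Evidence:
    reader: str                      # reading system that produced this evidence
    document: str
    sentence: str
    time: Optional[str] = None       # normalized temporal context
    location: Optional[str] = None   # normalized geospatial context

@dataclass
class Influence:
    subj: Concept
    obj: Concept
    subj_polarity: Optional[int] = None   # +1 increase, -1 decrease, None unknown
    obj_polarity: Optional[int] = None
    evidence: List[Evidence] = field(default_factory=list)

stmt = Influence(
    subj=Concept("significantly increased conflict seen in South Sudan",
                 {"toy_taxonomy": [("conflict", 0.82)]}),
    obj=Concept("many families to flee in 2017",
                {"toy_taxonomy": [("human migration", 0.77)]}),
    subj_polarity=+1,
    obj_polarity=+1,
    evidence=[Evidence(reader="eidos", document="doc-1",
                       sentence="The significantly increased conflict seen in "
                                "South Sudan forced many families to flee in 2017.",
                       time="2017", location="South Sudan")],
)
```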

Given the attributes of each Statement and a taxonomy to which Concepts are linked, INDRA creates a Statement graph whose edges capture (i) redundancy between two Statements, (ii) hierarchical refinement between two Statements, and (iii) contradiction between two Statements. Statements that are redundant, i.e., that capture the same causal relation, are merged and their evidences are aggregated. A probability model that captures the empirical precision of each reader is then used to calculate the overall support (a “belief” score) for a Statement, given the joint probability of correctness implied by the evidence. As a seed to this probability model, INDRA loads empirical precision values collected via human curation for each Eidos rule. INDRA exposes a collection of methods to filter Statements that can be composed to form a problem-specific assembly pipeline, including (i) filtering by Statement belief and Concept linking accuracy, (ii) filtering to more general or more specific Statements (with respect to a taxonomy), and (iii) filtering contradictions by belief. INDRA also exposes a REST API and JSON-based serialization of Statements.
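As a simplified illustration of the belief calculation, each piece of evidence can be treated as an independent chance of being wrong, governed by the empirical precision of the rule that produced it. This is not INDRA’s full belief engine (which also models systematic reader error); the rule names and precision values are placeholders.

```python
def belief(evidences, rule_precision):
    """Probability that at least one supporting evidence is correct, assuming
    independent errors: belief = 1 - prod_i (1 - precision_i)."""
    p_all_wrong = 1.0
    for ev in evidences:
        p_all_wrong *= 1.0 - rule_precision[ev["rule"]]
    return 1.0 - p_all_wrong

rule_precision = {"rule_A": 0.9, "rule_B": 0.8}       # curated per-rule precisions (placeholders)
evidences = [{"rule": "rule_A"}, {"rule": "rule_B"}]  # two evidences for one Statement
print(belief(evidences, rule_precision))              # -> ~0.98
```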

INDRA contains multiple modules that can assemble Statements into causal graphs (for visualization or inference) and executable ODE models. In the architecture presented here, Delphi (our Bayesian modeling framework) takes INDRA Statements directly as input and serves as a probabilistic model assembly system.

4 Causal Probabilistic Models from Text

Statements produced by INDRA are assembled by Delphi into a structure called a causal analysis graph, or CAG. In Fig. 3, we show the CAG resulting from our running example sentence (cell [1]). The node labels (conflict and human migration) in the CAG correspond to entries in the high-level taxonomy that the concepts have been grounded to.

Figure 3: Construction of a causal analysis graph from the running example sentence.

Representation We represent abstract concepts such as conflict and human migration as real-valued latent variables in a dynamic Bayes network (DBN) (Dagum et al., 1992), and the indicators corresponding to these concepts as observed variables. By an indicator, we mean a tangible quantity that serves as a proxy for the abstract concept8. For example, the variable Net migration (as defined in World Bank (2018)) is one of several indicators for the concept of human migration. To capture the uncertainty inherent in interpreting natural language, we take the transition model of the DBN itself to be a random variable with an associated probability distribution. We interpret sentences about causal relations as saying something about the functional relationship between the concepts involved. For example, we interpret the running example sentence as giving us a clue about the shape of ∂(human migration)/∂(conflict).

8 Note that these are not the same as the indicator random variables encountered in probability theory.
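One way to write the dynamics this representation implies is the following sketch; it is consistent with the linear differential equation formulation used in the assembly step below, but the exact parameterization in Delphi may differ:

$$x_{t+\Delta t} = A\,x_t, \qquad A_{ij} \sim p\!\left(A_{ij} \mid \text{adjectives, polarities}\right),$$

where $x_t$ stacks the latent concept values (e.g., conflict and human migration), the entry of $A$ coupling conflict to human migration plays the role of ∂(human migration)/∂(conflict), and each indicator is modeled as a noisy observation of its latent concept.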

Assembly To assemble our model, we do the following9:

(1) We construct the aforementioned distribution over the transition model of the DBN using the extracted polarities of the causal relations as well as the gradable adjectives associated with the concepts involved in the relations. The transition model is a matrix whose elements are random variables representing the coefficients of a system of linear differential equations (Guan et al., 2015), with distributions obtained by constructing a Gaussian kernel density estimator over Cartesian products of the crowdsourced responses collected by Sharp et al. (2018) for the adjectives in each relation (see the sketch following these steps).

(2) To provide more tangible results, we map the abstract concepts to indicator variables for which we have time series data. These data are gathered from a number of databases, including but not limited to the FAOSTAT (Food and Agriculture Organization of the United Nations, 2018) and the World Development Indicators (World Bank, 2018) databases. The mapping is done using the OntologyMapper tool in Eidos, which uses word embedding similarities to map entries in the high-level taxonomy to the lower-level variables in the time series data.

9 The complete mathematical details of the model assembly process are out of the scope of this paper, but can be found at: http://vision.cs.arizona.edu/adarsh/Arizona_Text_to_Model_Procedure.pdf

(3) Then, we associate values with indicators using a parameterization algorithm that takes as input some spatiotemporal context and retrieves values for the indicators from the time series data, falling back to aggregation over a (configurable) set of aggregation axes in order to prevent null results. In Fig. 3, we show the indicators automatically mapped to the conflict and human migration nodes (conflict incidences and net migration, respectively) and their values for the spatiotemporal context of South Sudan in April 2017.
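As a sketch of how step (1) could look in code: the crowdsourced response values below are invented, and the mapping from a pair of responses to a coefficient (here a simple ratio) stands in for whatever function Delphi actually applies over the Cartesian product.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Invented crowdsourced responses (0-100 scale) for the adjectives grounding
# the cause ("significantly increased") and the effect ("many").
cause_responses = np.array([70.0, 80.0, 75.0, 90.0, 85.0])
effect_responses = np.array([60.0, 65.0, 72.0, 58.0, 70.0])

# Map each (cause, effect) pair in the Cartesian product to a candidate value
# of the coefficient d(effect)/d(cause); the ratio is a placeholder choice.
coeffs = np.array([e / c for e in effect_responses for c in cause_responses])

# Gaussian kernel density estimator over the candidate coefficients.
kde = gaussian_kde(coeffs)

# Sample values for the transition-matrix entry coupling conflict to migration.
samples = kde.resample(1000).ravel()
print(samples.mean(), samples.std())
```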

Figure 4: Results of conditional forecasting experiment with CAG built from example sentence.

Conditional forecasting Once the model is assembled, we can run experiments to obtain quantitative predictions for indicators, which can build intuitions about the complex system in question and support decision making. The outputs take the form of time series data, with associated uncertainty estimates. An example is shown in Fig. 4, in which we investigate the impact of increasing conflict on human migration using our model, with ∂(conflict)/∂t = 0.1 e^(−t). The predictions of the model reflect (i) the semantics of the source text (increased conflict leads to increased migration) and (ii) the uncertainty in interpreting the source sentence. The confidence bands in the lower plot reflect the distribution of the crowdsourced gradable adjective data.
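The forecast and its confidence bands can be understood as Monte Carlo propagation: sample a coefficient from the adjective-derived distribution, integrate the dynamics forward under the imposed ∂(conflict)/∂t, and summarize the trajectories with percentiles. A minimal sketch follows; the coefficient distribution, step size, and horizon are assumptions, not Delphi's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, dt, n_samples = 24, 1.0, 500
beta = rng.normal(0.8, 0.3, n_samples)   # sampled values of d(migration)/d(conflict)

trajectories = np.zeros((n_samples, n_steps))
for s in range(n_samples):
    migration = 1.0                      # arbitrary starting level
    for t in range(n_steps):
        d_conflict = 0.1 * np.exp(-t * dt)         # imposed d(conflict)/dt = 0.1 e^(-t)
        migration += beta[s] * d_conflict * dt     # propagate through the causal edge
        trajectories[s, t] = migration

low, median, high = np.percentile(trajectories, [5, 50, 95], axis=0)
print(median[-1], (low[-1], high[-1]))   # point forecast and a 90% band at the horizon
```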

5 Assessment

We are currently in the process of developing a framework to quantitatively evaluate the models assembled using this pipeline, primarily via backcasting. However, the systems have been qualitatively evaluated by MITRE, an independent performer group in the World Modelers program charged with designing and conducting evaluations of the technologies developed. For the evaluation, a causal analysis graph larger than the toy running example in this paper (≈ 20 nodes) was created and executed. Noted strengths of the system include the ability to drill down into the provenance of the causal relations, the integration of multiple machine readers, and the plausible directionality of the produced forecast (given the sentences used to construct the models). Some limitations were also noted, namely that the initialization and parameterization of the models were somewhat opaque (which hindered explainability) and that some aspects of uncertainty are captured by the readers but not fully propagated to the model. We are actively working on addressing both of these limitations.

6 Conclusion

Complex causal models are required in order to address key issues such as food insecurity that span multiple domains. As an alternative to expensive, hand-built models which can take months to years to construct, we propose an end-to-end framework for creating executable probabilistic causal models from free text. Our entire pipeline is interpretable and intervenable, such that domain experts can use our tools to greatly reduce the time required to develop new causal models for urgent situations.

Acknowledgments: This work was supported by the Defense Advanced Research Projects Agency (DARPA) under the World Modelers program, grant W911NF1810014, and by the Bill and Melinda Gates Foundation HBGDki Initiative. Marco Valenzuela-Escárcega and Mihai Surdeanu declare a financial interest in lum.ai. This interest has been properly disclosed to the University of Arizona Institutional Review Committee and is managed in accordance with its conflict of interest policies.


References

Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet project. In Proceedings of the 17th International Conference on Computational Linguistics - Volume 1, pages 86–90. Association for Computational Linguistics.

Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI'07, pages 2670–2676, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

Steven Bethard and Jonathan Parker. 2016. A semantically compositional annotation scheme for time normalization. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Paris, France. European Language Resources Association (ELRA).

Paul Dagum, Adam Galper, and Eric Horvitz. 1992. Dynamic network models for forecasting. In Didier Dubois, Michael P. Wellman, Bruce D'Ambrosio, and Phillipe Smets, editors, Uncertainty in Artificial Intelligence, pages 41–48. Morgan Kaufmann.

Food and Agriculture Organization of the United Nations. 2018. FAOSTAT Database.

Jinyan Guan, Kyle Simek, Ernesto Brau, Clayton T. Morrison, Emily Butler, and Kobus Barnard. 2015. Moderated and drifting linear dynamical systems. In Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 2473–2482, Lille, France. PMLR.

Benjamin M. Gyori, John A. Bachman, Kartik Subramanian, Jeremy L. Muhlich, Lucian Galescu, and Peter K. Sorger. 2017. From word models to executable models of signaling networks using automated assembly. Molecular Systems Biology, 13(11).

Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, and Mihai Surdeanu. 2017. Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph. In Proceedings of ACL 2017, System Demonstrations, pages 103–108.

Egoitz Laparra, Dongfang Xu, and Steven Bethard. 2018. From characters to time intervals: New paradigms for evaluation and neural parsing of time normalizations. Transactions of the Association for Computational Linguistics, 6:343–356.

Arjun Magge, Davy Weissenbacher, Abeed Sarker, Matthew Scotch, and Graciela Gonzalez-Hernandez. 2018. Deep neural networks and distant supervision for geographic location mention extraction. Bioinformatics, 34(13):i565–i573.

Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Association for Computational Linguistics (ACL) System Demonstrations, pages 55–60.

Rebecca Sharp, Mithun Paul, Ajay Nagesh, Dane Bell, and Mihai Surdeanu. 2018. Grounding gradable adjectives through crowdsourcing. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, France. European Language Resources Association (ELRA).

Pontus Stenetorp, Sampo Pyysalo, Goran Topic, Tomoko Ohta, Sophia Ananiadou, and Jun'ichi Tsujii. 2012. brat: a web-based tool for NLP-assisted text annotation. In Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, pages 102–107. Association for Computational Linguistics.

Marco A. Valenzuela-Escárcega, Ozgun Babur, Gus Hahn-Powell, Dane Bell, Thomas Hicks, Enrique Noriega-Atala, Xia Wang, Mihai Surdeanu, Emek Demir, and Clayton T. Morrison. 2018. Large-scale automated machine reading discovers new cancer-driving mechanisms. Database: The Journal of Biological Databases and Curation.

Marco A. Valenzuela-Escárcega, Gustave Hahn-Powell, and Mihai Surdeanu. 2016. Odin's runes: A rule language for information extraction. In Proceedings of the 10th Edition of the Language Resources and Evaluation Conference (LREC).

World Bank. 2018. World Development Indicators Database.

Vikas Yadav and Steven Bethard. 2018. A survey on recent advances in named entity recognition from deep learning models. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2145–2158.

Vikas Yadav, Egoitz Laparra, Ti-Tai Wang, Mihai Surdeanu, and Steven Bethard. 2019. University of Arizona at SemEval-2019 Task 12: Deep-affix named entity recognition of geolocation entities. In Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, USA. Association for Computational Linguistics.