what wikicite can learn from biomedical citation networks--wikicite2017--2017-05-22

What WikiCite can learn from biomedical citation networks. Jodi Schneider. WikiCite, 2017-05-22. http://jodischneider.com/jodi.html @jschneider

Upload: jodischneider

Post on 28-Jan-2018

TRANSCRIPT

Page 1: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

What WikiCite can learn from biomedical citation networks

Jodi Schneider

WikiCite, 2017-05-22

http://jodischneider.com/jodi.html

@jschneider

Page 2: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

My hats

• Librarian
now: Professor of future librarians and info managers

• Ontologist

• Journal editor/founder (not quite a publisher!)
co-founder of open access Code4Lib Journal c. 2007

• Researcher
scholarly communication, Biomedical informatics, Linked Data, argumentation, Computer-Supported Collaboration, Wikipedia

• Wikipedian
mainly EN.WP, EN.WikiQuote

• User
acawiki.org (Community Manager c. 2009)
bibliographic management software
articles, books, etc. in many, many fields

Page 3: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Acawiki.org

Page 4: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Which evidence do we take into

account for a given purpose?

Page 5: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Hierarchy of Evidence

Figure credit: SUNY Downstate Medical Center. Medical Research

Library of Brooklyn. Evidence Based Medicine Course. A Guide to

Research Methods: The Evidence Pyramid:

http://library.downstate.edu/EBM2/2100.htm

Page 6: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

BUT: we cannot judge

evidence in isolation.

Page 7: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Approaches include:

• Appraisal

Page 8: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Approaches include:

• Appraisal

• Aggregation

Figure credit: Forest plot from Underhill, Kristen, Paul

Montgomery, and Don Operario. "Sexual abstinence only

programmes to prevent HIV infection in high income countries: systematic review." BMJ 335.7613 (2007): 248.

Page 9: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Approaches include:

• Appraisal

• Aggregation

• Contextualization

Figure credit: Duke University Medical Center Library. Introduction to Evidence-based Practice. What is Evidence-Based Practice (EBP)? http://guides.mclibrary.duke.edu/c.php?g=158201&p=1036021

Page 10: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

How trustworthy and valid is a

given scientific “fact”?

Page 11: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

How fragile is this knowledge?

Page 12: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

What supports it? What holds it up?

By Biochem1 (Own work) CC BY-SA 3.0 via Wikimedia

Commons

https://commons.wikimedia.org/wiki/File:جنگا_ست_یک.JPG

Page 13: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Can it be shored up?

By Biochem1 (Own work) CC BY-SA 3.0 via Wikimedia Commons

https://commons.wikimedia.org/wiki/Category:Jenga#/media/File:جنگا.JPG

Page 14: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Do citations show the lineage of

an idea?

Page 15: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

{{Existing “facts”}} + {{New “facts”}}

Page 16: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

{{Existing “facts”}} + {{New “facts”}}

{{Stuff you cite}}

Page 17: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Examples from discussion

sections of reports of

Randomized Controlled Trials

Page 18: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

“This” work agrees with…

• “This is in accordance with earlier studies

in the ambulatory surgical setting [3]” -

PMC1637100

Jodi Schneider, Graciela Rosemblat, Shabnam Tafreshi and Halil Kilicoglu. Rhetorical moves and audience considerations in the discussion sections of Randomized Controlled Trials of health interventions. To be presented at European Conference on Argumentation, June 2017

Page 19: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Definitions and background info

• “Self-efficacy, which may relate to

motivation, is the perceived confidence in

one's ability to accomplish a specific task

[19].” - PMC2194735

Page 20: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Presenting a range of evidence

• “Except in one study [20], short-term

administration of GH transiently worsened

insulin resistance [19,53] and increased

fasting glucose levels [53].” - PMC1865086

Page 21: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Clause-level changes in meaning

• “Two of four randomised clinical trials

…have found a difference in admission

rate [12,19] and two have not [22,23].” -

PMC1142326

Page 22: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

A single citation can support a whole paragraph

• “Dutton and colleagues [8] described a series of 81 coagulopathic trauma patients treated with rFVIIa. Of these, 20 received rFVIIa for treatment of coagulopathy related to TBI. Six of these patients had additional polytrauma. The outcome of these patients was poor and 15 of 20 patients died. The authors attributed this high mortality rate to the severity of brain injury. None of the 81 trauma patients in this series had any clinical indication of TE events.”

Page 23: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Discussing treatments, outcomes, other authors’ conclusions

• “Dutton and colleagues [8] described a series of 81 coagulopathic trauma patients treated with rFVIIa. Of these, 20 received rFVIIa for treatment of coagulopathy related to TBI. Six of these patients had additional polytrauma. The outcome of these patients was poor and 15 of 20 patients died. The authors attributed this high mortality rate to the severity of brain injury. None of the 81 trauma patients in this series had any clinical indication of TE events.”

Page 24: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Sometimes several parallel

paragraphs.

• “Dutton and colleagues [8] described a series of 81 … patients treated with rFVIIa”

• “Zaaroor and Bar-Lavie [23] reported the first series of five patients …”

• “Morenski and colleagues [24] described …three pediatric … cases”

Page 25: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Multiple citations in a paragraph

• “Berger et al. [42] compared the efficacy of hypertonic saline and mannitol to reduce ICP after a combination of two different neuronal injuries. Initially, …. The authors demonstrated that … After …. It is remarkable that … An accumulation … These different effects … [42]. Furthermore, Prough et al. observed a higher regional cerebral blood flow in dogs with induced intracerebral hemorrhage after hypertonic saline without any increase of the CPP [43].” - PMC1297608

Page 26: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Avoiding a 1-sentence paragraph?

• “Berger et al. [42] compared the efficacy of hypertonic saline and mannitol to reduce ICP after a combination of two different neuronal injuries. Initially, …. The authors demonstrated that … After …. It is remarkable that … An accumulation … These different effects … [42]. Furthermore, Prough et al. observed a higher regional cerebral blood flow in dogs with induced intracerebral hemorrhage after hypertonic saline without any increase of the CPP [43].” - PMC1297608

Page 27: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

“[Y]ou can transform a fact into fiction or a fiction into fact just by adding or subtracting references”

- Bruno Latour

Page 28: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

As claims get cited, they become facts:

Voorhoeve et al, Cell, 2006:

Hypothesis: To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we...

Implication: Therefore, these results point to LATS2 as a mediator of the miR-372 and miR-373 effects on cell proliferation and tumorigenicity,

Yabuta, JBioChem 2007:

Cited Implication: ... two miRNAs, miRNA-372 and-373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).

Raver-Shapira et al., JMolCell 2007:

Fact: miR-372 and miR-373 target the Lats2 tumor suppressor (Voorhoeve et al., 2006)

Slide credit: Anita DeWaard: 'Stories that persuade with data' - talk at CENDI meeting January 9 2014 https://www.slideshare.net/anitawaard/stories-that-persuade-with-data-talk-at-cendi-meeting-january-9-2014/6

Page 29: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

“The conversion of hypothesis to

fact through citation alone.”

- Steven Greenberg

Page 30: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Greenberg, Steven A. "Understanding belief using citation networks." Journal of evaluation in clinical practice 17.2 (2011): 389-393.

http://dx.doi.org/10.1111/j.1365-2753.2011.01646.x

Page 31: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

“The conversion of hypothesis to fact through citation alone.”

- Steven Greenberg

Greenberg, Steven A. "How citation distortions create unfounded authority: analysis of a citation network." BMJ 339 (2009): b2680.

https://doi.org/10.1136/bmj.b2680

Page 32: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Funded grants with citation bias & citation distortion.

Greenberg, Steven A. "How citation distortions create unfounded authority: analysis of a citation network." BMJ 339 (2009): b2680.

https://doi.org/10.1136/bmj.b2680
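
One way to look for citation bias in a network is to compare how often supportive and critical papers get cited. The following is a minimal sketch, not Greenberg's method: the paper IDs, stance labels, and citation edges are toy values invented for illustration, and `citations_by_stance` and `citation_share` are hypothetical helper names.

```python
from collections import Counter

# Toy data, invented for illustration only (NOT taken from Greenberg's network).
# Each cited paper is labeled by its stance toward a claim.
STANCE = {"P1": "supportive", "P2": "supportive", "P3": "critical", "P4": "critical"}

# (citing paper, cited paper) pairs.
CITATIONS = [
    ("R1", "P1"), ("R1", "P2"),
    ("R2", "P1"),
    ("R3", "P1"), ("R3", "P3"),
    ("R4", "P2"),
]


def citations_by_stance(citations, stance):
    """Count citations received, grouped by the stance of the cited paper."""
    counts = Counter()
    for _citing, cited in citations:
        if cited in stance:
            counts[stance[cited]] += 1
    return counts


def citation_share(citations, stance, label):
    """Fraction of all stance-labeled citations that go to papers with `label`."""
    counts = citations_by_stance(citations, stance)
    total = sum(counts.values())
    return counts[label] / total if total else 0.0


print(citations_by_stance(CITATIONS, STANCE))
print(round(citation_share(CITATIONS, STANCE, "supportive"), 2))
```

In this toy network half of the labeled papers are critical, yet they receive only one of six citations. A skew like that is the kind of signal worth checking by hand.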

Page 33: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Modeling arguments and

evidence

Page 34: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

SEPIO – evidence lines

Brush, Matthew, Kent Shefchek, and Melissa Haendel. "SEPIO: a

semantic model for the integration and analysis of scientific

evidence." International Conference on Biomedical Ontology and BioCreative. 2016. http://ceur-ws.org/Vol-1747/IT605_ICBO2016.pdf

“A proposition has_evidence

one or more evidence lines, which have_supporting_data

one or more data items used in evaluation of the

proposition’s truth.”

Page 35: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

SEPIO – evidence lines example

Brush, Matthew, Kent Shefchek, and Melissa Haendel. "SEPIO: a

semantic model for the integration and analysis of scientific

evidence." International Conference on Biomedical Ontology and BioCreative. 2016. http://ceur-ws.org/Vol-1747/IT605_ICBO2016.pdf

“A simplified account of existing evidence related to this proposition is presented below,

presenting summaries of five evidence lines (E1-E5) from five studies relevant to the

classification of the variant for Fabry Disease:

E1. Six affected individuals with the variant were found to have reduced GLA enzyme

activity.

E2. The variant was absent from 528 unaffected controls.

E3. The variant is predicted to cause abnormal splicing that inserts additional sequence.

E4. Pedigree analyses showed Fabry Disease phenotypes segregating with the variant.

E5. Population databases show high frequency of individuals homozygous for the variant.”
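
The has_evidence / have_supporting_data structure quoted from SEPIO can be sketched with plain data classes. This is only an illustration under stated assumptions: the class and field names are hypothetical and are not SEPIO's actual ontology terms, the proposition text is a placeholder, and treating each evidence-line summary as a single data item is a simplification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataItem:
    """A piece of data used when evaluating a proposition (hypothetical class)."""
    description: str


@dataclass
class EvidenceLine:
    """One line of evidence; it have_supporting_data one or more data items."""
    label: str
    supporting_data: List[DataItem] = field(default_factory=list)


@dataclass
class Proposition:
    """A proposition that has_evidence one or more evidence lines."""
    statement: str
    evidence: List[EvidenceLine] = field(default_factory=list)

    def data_items(self):
        """Flatten all data items reachable through the evidence lines."""
        return [item for line in self.evidence for item in line.supporting_data]


# Placeholder statement; E1 and E2 reuse the summaries quoted on the slide.
prop = Proposition(
    statement="(placeholder) classification of the variant for Fabry Disease",
    evidence=[
        EvidenceLine("E1", [DataItem("Six affected individuals with the variant "
                                     "were found to have reduced GLA enzyme activity.")]),
        EvidenceLine("E2", [DataItem("The variant was absent from 528 unaffected controls.")]),
    ],
)

for line in prop.evidence:
    print(line.label, "->", [d.description for d in line.supporting_data])
```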

Page 36: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

SEPIO – evidence lines example

Brush, Matthew, Kent Shefchek, and Melissa Haendel. "SEPIO: a

semantic model for the integration and analysis of scientific

evidence." International Conference on Biomedical Ontology and BioCreative. 2016. http://ceur-ws.org/Vol-1747/IT605_ICBO2016.pdf

“A simplified account of existing evidence related to this proposition is presented below,

presenting summaries of five evidence lines (E1-E5) from five studies relevant to the

classification of the variant for Fabry Disease:

E1. Six affected individuals with the variant were found to have reduced GLA enzyme

activity.

E2. The variant was absent from 528 unaffected controls.

E3. The variant is predicted to cause abnormal splicing that inserts additional sequence.

E4. Pedigree analyses showed Fabry Disease phenotypes segregating with the variant.

E5. Population databases show high frequency of individuals homozygous for the variant.”

Page 37: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Modeling arguments and

evidence

Page 38: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

SEE

Bölling, Christian, Michael Weidlich, and Hermann-Georg Holzhütter.

"SEE: structured representation of scientific evidence in the biomedical

domain using Semantic Web techniques." Journal of Biomedical Semantics 5.1 (2014): 1.

Page 39: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

SEE

Bölling, Christian, Michael Weidlich, and Hermann-Georg Holzhütter.

"SEE: structured representation of scientific evidence in the biomedical

domain using Semantic Web techniques." Journal of Biomedical Semantics 5.1 (2014): 1.

Page 40: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Modeling arguments and

evidence

Page 41: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Micropublications

Clark, Tim, Paolo N. Ciccarese, and Carole A. Goble.

"Micropublications: a semantic model for claims, evidence, arguments

and annotations in biomedical communications." Journal of Biomedical

Semantics 5.28 (2014). http://dx.doi.org/10.1186/2041-1480-5-28

Page 42: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Jodi Schneider, Paolo Ciccarese, Tim Clark, Richard D. Boyce. “Using the Micropublications ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base.” Linked Science at ISWC 2014 http://ceur-ws.org/Vol-1282/lisc2014_submission_8.pdf

Page 43: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Formalizing knowledge

Page 44: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Cataloging evidence types for

knowledge bases.

Boyce, R.D.: A Draft Evidence Taxonomy and Inclusion Criteria for the Drug Interaction Knowledge Base (DIKB),

http://purl.net/net/drug-interaction-knowledge-base/evidence-types-and-inclusion-criteria

Page 45: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Biological Expression Language

Rastegar-Mojarad, Majid, Ravikumar Komandur Elayavilli, and

Hongfang Liu. "BELTracker: evidence sentence retrieval for BEL

statements." Database 2016 (2016).

See also: http://openbel.org

Page 46: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Corpora of Interest

Page 47: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

TAC Corpus: Curated Collection of 500 Citing > 50 Cited Papers

Citing Papers

Voorhoeve et al. (116) employed a novel strategy by combining an miRNA vector library and corresponding bar code array… to identify miRNAs that when overexpressed could substitute for p53 loss and allow continued proliferation in the context of Ras activation

miR-372 and miR-373 were consequently found to permit proliferation and tumorigenesis of these primary cells carrying both oncogenic RAS and wild-type p53, probably through direct inhibition of the expression of the tumor-suppressor LATS2 and subsequent neutralization of the p53 pathway.

Reference Paper

Voorhoeve et al. (2006), A Genetic Screen …

In mammals, a near-perfect complementarity between miRNAs and protein coding genes almost never exists, making it difficult to directly pinpoint relevant downstream targets of a miRNA. Several algorithms were developed that predict miRNA targets, most notably TargetScanS, PicTar, and miRanda (John et al., 2004, Lewis et al., 2005 and Robins et al., 2005).

These programs predict dozens to hundreds of target genes per miRNA, making it difficult to directly infer the cellular pathways affected by a given miRNA. Furthermore, the biological effect of the downregulation depends greatly on the cellular context, which exemplifies the need to deduce miRNA functions by in vivo genetic screens in well-defined model systems.

The cancerous process can be modeled by in vitro neoplastic transformation assays in primary human cells (Hahn et al., 1999). Using this system, sets of genetic elements required for transformation were identified. For example, the joint expression of the telomerase reverse transcriptase subunit (hTERT), oncogenic H-RASV12, and SV40-small t antigen combined with the suppression of p53 and p16INK4A were sufficient to render primary human fibroblasts tumorigenic (Voorhoeve and Agami, 2003).

Segment labels: Goal, Method, Result, Conclusion

Slide credit: Anita DeWaard: Argumentation in biology papers https://www.slideshare.net/anitawaard/argumentation-in-biology-papers/27

Page 48: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Take away messages

• Biomedicine has evolved multiple approaches for managing and appraising individual papers and bodies of “facts”.

• Citations come in many shapes and sizes.

• Citations may support “facts” – as part of a larger scientific fabric that includes data, evidence, arguments.

• Power move: identify the critical supports (think Jenga)
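
Identifying critical supports can be sketched as tracing citations back to the sources a claim ultimately rests on. This is a rough sketch under assumptions: `primary_sources` is a hypothetical helper, the "Hypothetical review" node is invented, and the edges cover only the citations quoted in this talk (the real papers cite many other works).

```python
def primary_sources(paper, cites, _seen=None):
    """Follow citation edges from `paper` and return the papers at the end of
    the chains, i.e. those with no outgoing edges in `cites`."""
    seen = set() if _seen is None else _seen
    if paper in seen:
        return set()  # already visited via another path (also guards cycles)
    seen.add(paper)
    cited = cites.get(paper, [])
    if not cited:
        return {paper}
    roots = set()
    for target in cited:
        roots |= primary_sources(target, cites, seen)
    return roots


# Only the edges mentioned in this talk, plus one invented citing paper.
CITES = {
    "Hypothetical review": ["Okada et al., 2011", "Yabuta et al., 2007"],
    "Okada et al., 2011": ["Voorhoeve et al., 2006"],
    "Yabuta et al., 2007": ["Voorhoeve et al., 2006"],
}

roots = primary_sources("Hypothetical review", CITES)
print(roots)
if len(roots) == 1:
    print("Single critical support: pull this block and the tower falls.")
```

Two citations look like two independent supports, but both trace back to the same paper, so the claim stands on a single block.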

Page 49: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Take away messages

• Network analysis may help identify problematic citation practices.

• Modeling arguments can help identify the robustness of a claimed “fact”.

• Semantic models could enable inference-based reasoning and citation network querying.

• Relevant citation corpora exist.

Page 50: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22
Page 51: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22
Page 52: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22
Page 53: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Slide credit: Anita DeWaard: Epistemics, https://www.slideshare.net/anitawaard/epistemics/6

The Latour quote is from Science in Action: How to Follow Scientists and Engineers through Society, p. 33

Page 54: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Anita’s insight: Detect and Track Metadiscourse

• Voorhoeve et al., 2006: “These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.”

• Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).”

• Yabuta et al., 2007: “[On the other hand,] two miRNAs, miRNA-372 and-373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).”

• Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).”

Slide credit: Based on Anita DeWaard: How to persuade with data https://www.slideshare.net/anitawaard/stories-thatpersuadev-4/18
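
Tracking metadiscourse across the four quotes above can be approximated by spotting hedging cues. This is a deliberately naive sketch: the cue list `HEDGE_CUES` and the helper `find_hedges` are hypothetical, and real hedge detection (see the work cited on the next slide) uses much richer lexicons or trained models.

```python
import re

# Hypothetical, deliberately tiny cue list chosen for this illustration.
HEDGE_CUES = ("possibly", "probably", "potential", "suggests", "may", "might")

# The four quotes from the slide above, keyed by citing paper.
QUOTES = {
    "Voorhoeve et al., 2006": "These miRNAs neutralize p53-mediated CDK inhibition, "
        "possibly through direct inhibition of the expression of the tumor suppressor LATS2.",
    "Kloosterman and Plasterk, 2006": "In a genetic screen, miR-372 and miR-373 were found "
        "to allow proliferation of primary human cells that express oncogenic RAS and active "
        "p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).",
    "Yabuta et al., 2007": "[On the other hand,] two miRNAs, miRNA-372 and-373, function as "
        "potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 "
        "expression, which suggests that Lats2 is an important tumor suppressor "
        "(Voorhoeve et al., 2006).",
    "Okada et al., 2011": "Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the "
        "expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 "
        "(Voorhoeve et al., 2006).",
}


def find_hedges(sentence):
    """Return the hedging cues found in a sentence, in order of appearance."""
    pattern = r"\b(" + "|".join(HEDGE_CUES) + r")\b"
    return [m.group(1).lower() for m in re.finditer(pattern, sentence, flags=re.IGNORECASE)]


for paper, quote in QUOTES.items():
    print(paper, "->", find_hedges(quote))
```

With this cue list the 2011 quote comes back with no hedges at all, which matches the drift from hypothesis to "fact" shown in the quotes.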

Page 55: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Metadiscourse: some progress

• Hedging cues, speculative language, modality/negation:

  • Light et al [6]: finding speculative language

  • Wilbur et al (Hagit) [7]: focus, polarity, certainty, evidence, and directionality

  • Thompson et al (Sophia) [8]: level of speculation, type/source of the evidence and level of certainty

• Sentiment detection (e.g. Kim and Hovy [9] a.m.o.):

  • Holder of the opinion, strength, polarity as ‘mathematical function’ acting on main propositional content

• Can make this part of the semantic web (e.g., Ontology for Reasoning, Certainty and Attribution, ORCA [10]):

  • Value (Presumed True, Probable, Possible, Unknown)

  • Source (Author, Named Other, Unknown)

  • Basis (Data, Reasoning, Unknown)

Slide credit: Anita DeWaard: How to persuade with data https://www.slideshare.net/anitawaard/stories-thatpersuadev-4/19
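
The three ORCA dimensions listed above (Value, Source, Basis) map naturally onto enumerations attached to a claim. This is a minimal sketch, not the ORCA ontology itself: the class names and the `AnnotatedClaim` record are hypothetical, and the example annotation is an illustrative judgment rather than a curated one.

```python
from dataclasses import dataclass
from enum import Enum


class Value(Enum):
    PRESUMED_TRUE = "Presumed True"
    PROBABLE = "Probable"
    POSSIBLE = "Possible"
    UNKNOWN = "Unknown"


class Source(Enum):
    AUTHOR = "Author"
    NAMED_OTHER = "Named Other"
    UNKNOWN = "Unknown"


class Basis(Enum):
    DATA = "Data"
    REASONING = "Reasoning"
    UNKNOWN = "Unknown"


@dataclass(frozen=True)
class AnnotatedClaim:
    """A claim plus its certainty, attribution, and basis (hypothetical record)."""
    text: str
    value: Value
    source: Source
    basis: Basis


# Illustrative annotation only: "possibly" is read as Value.POSSIBLE, the claim
# is attributed to the paper's own authors, and the basis is left unjudged.
example = AnnotatedClaim(
    text="These miRNAs neutralize p53-mediated CDK inhibition, possibly through "
         "direct inhibition of the expression of the tumor suppressor LATS2.",
    value=Value.POSSIBLE,
    source=Source.AUTHOR,
    basis=Basis.UNKNOWN,
)

print(example.value.value, "/", example.source.value, "/", example.basis.value)
```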

Page 56: What WikiCite can learn from biomedical citation networks--Wikicite2017--2017-05-22

Anita’s citations

[6] Light M, Qiu XY, Srinivasan P. (2004). The language of bioscience: facts, speculations, and statements in between. BioLINK 2004: Linking Biological Literature, Ontologies and Databases 2004:17-24.

[7] Wilbur WJ, Rzhetsky A, Shatkay H (2006). New directions in biomedical text annotations: definitions, guidelines and corpus construction. BMC Bioinformatics 2006, 7:356.

[8] Thompson P., Venturi G., McNaught J, Montemagni S, Ananiadou S. (2008). Categorising modality in biomedical texts. Proc. LREC 2008 Wkshp Building and Evaluating Resources for Biomedical Text Mining 2008.

[9] Kim, S-M., Hovy, E.H. (2004). Determining the Sentiment of Opinions. Proceedings of the COLING conference, Geneva, 2004.

[10] de Waard, A. and Schneider, J. (2012) An Ontology of Reasoning, Certainty and Attribution (ORCA), ISWC 2012, http://ceur-ws.org/Vol-930/p2.pdf