Open data, compound repurposing, and rare diseases (ISCB)
TRANSCRIPT
![Page 1: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/1.jpg)
Open data, compound repurposing, and rare diseases
Andrew Su, Ph.D.
sulab.org
February 16, 2017
Slides: slideshare.net/andrewsu
![Page 2: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/2.jpg)
Raynaud disease and fish oil
Raynaud disease
![Page 3: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/3.jpg)
Raynaud disease and fish oil
Raynaud disease
Fish oil / EPA
Abnormal platelet activity
Abnormal blood viscosity
High blood viscosity
Elevated RBC rigidity
Vasodilation
Low blood triglycerides
Increased prostacyclins
![Page 4: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/4.jpg)
Raynaud disease and fish oil
![Page 5: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/5.jpg)
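“Undiscovered public knowledge”
Raynaud disease
Fish oil / EPA
Abnormal platelet activity
Abnormal blood viscosity
High blood viscosity
Elevated RBC rigidity
Vasodilation
Low blood triglycerides
Increased prostacyclins
[Diagram labels: A and C mark the two end concepts (Raynaud disease, fish oil / EPA); B marks each of the seven intermediate concepts]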
![Page 6: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/6.jpg)
“Undiscovered public knowledge”
![Page 7: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/7.jpg)
“Undiscovered public knowledge”
![Page 8: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/8.jpg)
Building a Network of BioThings (then)
Eicosapentaenoic acid
Platelet aggregation
Fatty Acid
Edge = co-mention
× 1000s of article titles
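The co-mention idea can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the titles and the concept vocabulary are hypothetical toy data, and concept matching is plain substring search. Two concepts get an edge when they share a title; the second function then looks for concepts that are reachable from a starting concept only through shared intermediates, which is the "undiscovered public knowledge" pattern.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy titles, invented for illustration only.
TITLES = [
    "Eicosapentaenoic acid reduces platelet aggregation",
    "Platelet aggregation in Raynaud disease",
    "Blood viscosity and Raynaud disease",
    "Eicosapentaenoic acid lowers blood viscosity",
]

# Hypothetical concept vocabulary (lowercase surface forms).
CONCEPTS = [
    "eicosapentaenoic acid",
    "platelet aggregation",
    "blood viscosity",
    "raynaud disease",
]


def build_comention_graph(titles, concepts):
    """Add an undirected edge between concepts that appear in the same title."""
    graph = defaultdict(set)
    for title in titles:
        text = title.lower()
        found = [c for c in concepts if c in text]
        for first, second in combinations(found, 2):
            graph[first].add(second)
            graph[second].add(first)
    return graph


def indirect_links(graph, start):
    """Map each concept not directly linked to `start` to the shared intermediates."""
    direct = graph[start]
    result = defaultdict(set)
    for middle in direct:
        for end in graph[middle]:
            if end != start and end not in direct:
                result[end].add(middle)
    return {end: sorted(middles) for end, middles in result.items()}


graph = build_comention_graph(TITLES, CONCEPTS)
links = indirect_links(graph, "raynaud disease")
print(links)
```

In this toy data no title mentions both Raynaud disease and eicosapentaenoic acid, so the link between them surfaces only through the two intermediate concepts.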
![Page 9: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/9.jpg)
Building a Network of BioThings (now)
Eicosapentaenoic acid = PubChem:446284 = Timnodonic acid
Platelet aggregation
Fatty Acid
Edge = “decreases” (then: co-mention)
× 26 million articles … and full abstracts (then: 1000s of article titles)
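One way to picture the two changes on this slide (names resolved to identifiers, edges carrying a relationship type) is the sketch below. It is an assumption-laden illustration: only the PubChem:446284 mapping comes from the slide, while the second identifier, the `Edge` fields, the `source` labels, and the helper names are hypothetical.

```python
from typing import NamedTuple, Optional

# Synonym table. The PubChem:446284 entries follow the slide;
# the platelet aggregation identifier is a made-up placeholder.
SYNONYMS = {
    "eicosapentaenoic acid": "PubChem:446284",
    "timnodonic acid": "PubChem:446284",
    "platelet aggregation": "EXAMPLE:platelet_aggregation",
}


class Edge(NamedTuple):
    subject: str    # normalized identifier
    predicate: str  # relationship type, e.g. "decreases"
    obj: str        # normalized identifier
    source: str     # where the statement came from (hypothetical label)


def normalize(mention: str) -> Optional[str]:
    """Resolve a text mention to an identifier, or None if unknown."""
    return SYNONYMS.get(mention.strip().lower())


def make_edge(subj: str, predicate: str, obj: str, source: str) -> Optional[Edge]:
    """Build a typed edge only when both ends resolve to identifiers."""
    subject_id, object_id = normalize(subj), normalize(obj)
    if subject_id is None or object_id is None:
        return None
    return Edge(subject_id, predicate, object_id, source)


e1 = make_edge("Eicosapentaenoic acid", "decreases", "Platelet aggregation", "example-abstract-1")
e2 = make_edge("Timnodonic acid", "decreases", "platelet aggregation", "example-abstract-2")

# Both mentions collapse onto the same statement once names are normalized.
same_statement = e1[:3] == e2[:3]
print(same_statement)
```

Keeping the `source` field separate from the statement itself lets identical statements from different articles be merged while each one stays traceable.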
![Page 10: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/10.jpg)
Information extraction
1: Identify all biomedical concepts
2: Identify relationships between concepts
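The two steps above can be sketched with a deliberately naive Python example. It is only an illustration of the task, not the approach taken in the talk: the concept dictionary, the list of relation verbs, and the example sentence are all hypothetical, and real text needs far more than dictionary lookup and a verb between two mentions.

```python
import re

# Hypothetical concept dictionary: surface form -> concept type.
CONCEPTS = {
    "fish oil": "chemical",
    "blood viscosity": "phenotype",
    "raynaud disease": "disease",
}

# Hypothetical relation vocabulary.
RELATION_VERBS = ("decreases", "increases", "treats")


def find_concepts(text):
    """Step 1: return (start, end, mention, type) tuples sorted by position."""
    hits = []
    for term, kind in CONCEPTS.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), match.group(0), kind))
    return sorted(hits)


def find_relations(text):
    """Step 2: link adjacent mentions when a known verb lies between them."""
    hits = find_concepts(text)
    relations = []
    for left, right in zip(hits, hits[1:]):
        between = text[left[1]:right[0]].lower()
        for verb in RELATION_VERBS:
            if re.search(r"\b" + verb + r"\b", between):
                relations.append((left[2], verb, right[2]))
    return relations


sentence = "Fish oil decreases blood viscosity in patients with Raynaud disease."
concepts = find_concepts(sentence)
relations = find_relations(sentence)
print([c[2] for c in concepts])
print(relations)
```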
![Page 11: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/11.jpg)
Pathways, Diseases, Proteins, Variants, Genes, Drugs
Goal: Assemble a network of biomedical entities that is comprehensive, current, computable, and traceable.
![Page 12: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/12.jpg)
![Page 13: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/13.jpg)
![Page 14: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/14.jpg)
Source: https://wilsoncommonslab.org/2014/03/06/calling-all-supporters
![Page 15: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/15.jpg)
Question: Can a group of non-scientists collectively perform concept recognition in biomedical texts?
![Page 16: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/16.jpg)
Experts versus crowd for concept identification
593 PubMed abstracts
6,900 mentions of “disease concepts”
F = 0.87
F = 0.78
$$$
![Page 17: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/17.jpg)
Experts versus crowd for concept identification
593 PubMed abstracts
6,900 mentions of “disease concepts”
F = 0.87
F = 0.87
$$$
• 9 days
• 145 workers
• Total: $630.96
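For readers unfamiliar with the F values quoted on these slides, the sketch below shows one common way such a score can be computed for crowd annotations. Everything in it is an assumption for illustration: the slides do not state how worker answers were combined or how matches were counted, so the vote threshold, the exact-span matching, and the toy spans are hypothetical.

```python
from collections import Counter


def aggregate(worker_annotations, min_votes):
    """Keep a span only if at least `min_votes` workers marked it."""
    votes = Counter()
    for spans in worker_annotations:
        votes.update(set(spans))  # one vote per worker per span
    return {span for span, count in votes.items() if count >= min_votes}


def f_score(predicted, gold):
    """Harmonic mean of precision and recall over exactly matching spans."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: spans are (start, end) character offsets in one abstract.
gold = {(0, 15), (40, 52), (70, 81)}
workers = [
    [(0, 15), (40, 52)],
    [(0, 15), (70, 81), (90, 95)],
    [(0, 15), (40, 52), (90, 95)],
]

consensus = aggregate(workers, min_votes=2)
score = f_score(consensus, gold)
print(sorted(consensus), round(score, 2))
```

Raising `min_votes` generally trades recall for precision, which is why the threshold is worth tuning against a gold standard.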
![Page 18: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/18.jpg)
http://mark2cure.org
![Page 19: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/19.jpg)
Paid crowdsourcing ($$$)
• F = 0.87
• 9 days
• 145 workers
• Total: $630.96
Citizen Science (“Help science, please”)
• F = 0.84
• 28 days
• 212 workers
• Total cost: $0
![Page 20: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/20.jpg)
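Does Citizen Science scale?
$$
\text{volunteers needed}
= \frac{\text{number of annotation events per year}}{\text{number of annotation events per year per volunteer}}
= \frac{1{,}000{,}000 \text{ articles} \times 10 \text{ AE/article}}{\dfrac{10{,}275 \text{ AE} \times 365 \text{ days}}{212 \text{ annotators} \times 28 \text{ days}}}
\approx 15{,}828
$$
AE = Annotation events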
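The estimate on this slide can be reproduced with a few lines of arithmetic. The figures are the ones on the slide; the variable names are my own, and reading "1,000,000 articles" as a yearly volume is an assumption based on the slide's "per year" labels.

```python
import math

# Demand side (figures from the slide; yearly volume is an assumption).
ARTICLES_PER_YEAR = 1_000_000
AE_PER_ARTICLE = 10  # AE = annotation events

# Observed citizen-science throughput (figures from the slide).
OBSERVED_AE = 10_275
OBSERVED_ANNOTATORS = 212
OBSERVED_DAYS = 28
DAYS_PER_YEAR = 365

# Numerator: annotation events needed per year.
ae_needed_per_year = ARTICLES_PER_YEAR * AE_PER_ARTICLE

# Denominator: annotation events one volunteer contributes per year,
# extrapolated from the 28-day observation window.
ae_per_volunteer_per_year = (
    OBSERVED_AE * DAYS_PER_YEAR / (OBSERVED_ANNOTATORS * OBSERVED_DAYS)
)

volunteers_needed = math.ceil(ae_needed_per_year / ae_per_volunteer_per_year)
print(volunteers_needed)
```

The extrapolation assumes volunteers keep annotating at the rate seen during the 28 observed days, which is the main uncertainty in the estimate.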
![Page 21: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/21.jpg)
Does Citizen Science scale?
15,828 volunteers needed
200,000 volunteers
460,000 volunteers
37,000 volunteers
1,000,000+ volunteers
![Page 22: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/22.jpg)
Nina Hale https://flic.kr/p/zoVih
![Page 23: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/23.jpg)
Rare disease case study #2
![Page 24: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/24.jpg)
![Page 25: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/25.jpg)
… but no obvious treatments
![Page 26: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/26.jpg)
Mapping the biomedical network around NGLY1
NGLY1
![Page 27: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/27.jpg)
http://mark2cure.org
![Page 28: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/28.jpg)
A preliminary view of the NGLY1-focused biological network
1,200 contributors
3,200 documents
787,400 annotations
![Page 29: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/29.jpg)
A preliminary view of the NGLY1-focused biological network
[Diagram: networks of nodes labeled A, B, and C]
![Page 30: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/30.jpg)
http://slides.com/dhimmel/big-data-seminar
![Page 31: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/31.jpg)
![Page 32: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/32.jpg)
![Page 33: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/33.jpg)
![Page 34: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/34.jpg)
![Page 35: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/35.jpg)
Biomedical research relies on effective KNOWLEDGE MANAGEMENT
Pietro Bellini https://flic.kr/p/k5jmja
![Page 36: Open data, compound repurposing, and rare diseases (ISCB)](https://reader035.vdocument.in/reader035/viewer/2022070516/58ce66141a28ab2f268b69ad/html5/thumbnails/36.jpg)
Ben Good
Chunlei Wu
Shirley Willis
Sebastien Lelong
Andra Waagmeester
Max Nanis
Cyrus Afrasiabi
Julia Turner
Ginger Tsueng
Louis Gioia
Toby Li
Karthik G
Kevin Xin
Jake Bruggemann
Mike Mayers
Julee Adesara
Ramya Gamini
Greg Stupp
Sebastian Burgstaller
Tim Putman
Nuria Queralt Rosinach
[Project tags beside some names: M2C, DR]
The Crowds
Funding