text mining, machine learning, NLP and all that (in 10 minutes)
Byron C Wallace, from #CochraneTech Symposium, Québec 2013
Byron C Wallace
Brown Center for Evidence Based Medicine
#CochraneTech
why do we need this stuff?
[Bastian et al, PLoS Medicine 2010]
eleven systematic reviews. every day.
PubMed growth
[http://altmetrics.org/wp-content/uploads/2010/10/medline-articles-by-year-lg.png]
what can we automate?
abstracts from PubMed search
doctor conducting review
manually screened abstracts
SVM
how does this work?
SVMs
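To make the SVM step concrete, here is a minimal sketch of a linear SVM trained by batch subgradient descent on the regularized hinge loss. It is an illustration only, not the implementation used in the talk: the function names, the hyperparameters (`lam`, `lr`, `epochs`) and the two-example toy data are all assumptions.

```python
def decision(w, b, x):
    """Signed score: positive means 'relevant', negative means 'irrelevant'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Batch subgradient descent on the L2-regularized hinge loss.

    X: list of feature vectors; y: labels in {+1, -1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of (lam / 2) * ||w||^2
        grad_b = 0.0
        for x, label in zip(X, y):
            # Only examples inside the margin (or misclassified) contribute.
            if label * decision(w, b, x) < 1:
                for j, xj in enumerate(x):
                    grad_w[j] -= label * xj
                grad_b -= label
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy data, invented for illustration: feature 0 appears only in the
# relevant abstract, feature 1 only in the irrelevant one.
X = [[1, 0], [0, 1]]
y = [1, -1]
w, b = train_linear_svm(X, y)
predictions = [1 if decision(w, b, x) > 0 else -1 for x in X]
```

In practice one would reach for an established SVM library rather than hand-rolled training; the point here is only that the model is a weight per feature plus a bias, learned so that labeled examples fall on the correct side of a margin.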
bag of words
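A bag-of-words representation turns each abstract into a vector of word counts, ignoring word order. The sketch below shows one simple way to build it; the tokenizer, the helper names and the two toy abstracts are assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and keep runs of letters or digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}

def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector

# Toy abstracts, invented purely for illustration.
abstracts = [
    "randomized trial of aspirin",
    "aspirin aspirin cohort study",
]
vocab = build_vocabulary(abstracts)
vectors = [bag_of_words(a, vocab) for a in abstracts]
```

Each vector has one entry per vocabulary word, so the second abstract gets a count of 2 in the "aspirin" column. These vectors are what a classifier such as an SVM consumes.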
special considerations for the case of systematic reviews
• class imbalance
  – far fewer relevant than irrelevant abstracts
  – asymmetric costs: sensitivity more important than specificity
• reviewer time is scarce and expensive
  – better models, fewer labels: active learning and dual supervision
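Two of these considerations can be sketched in a few lines. The first function lowers the decision cutoff until every known relevant abstract is flagged, one simple way to favor sensitivity over specificity. The second picks the unlabeled abstracts whose scores lie closest to the decision boundary, the uncertainty-sampling flavor of active learning. Neither is claimed to be the method used in the talk, and all scores below are hypothetical classifier outputs.

```python
def threshold_for_full_sensitivity(scores, labels):
    """Highest cutoff that still flags every known relevant abstract.

    scores: classifier scores (higher means more likely relevant).
    labels: 1 for relevant, 0 for irrelevant.
    """
    relevant_scores = [s for s, label in zip(scores, labels) if label == 1]
    if not relevant_scores:
        raise ValueError("need at least one relevant example")
    return min(relevant_scores)

def most_uncertain(unlabeled_scores, k):
    """Indices of the k abstracts whose scores are closest to zero."""
    ranked = sorted(range(len(unlabeled_scores)),
                    key=lambda i: abs(unlabeled_scores[i]))
    return ranked[:k]

# Hypothetical scores for abstracts the reviewer has already screened.
labeled_scores = [2.1, -0.4, -1.5, 0.3, -2.0]
labels = [1, 1, 0, 0, 0]
cutoff = threshold_for_full_sensitivity(labeled_scores, labels)
flagged = [s >= cutoff for s in labeled_scores]

# Hypothetical scores for abstracts not yet screened.
pool_scores = [1.8, -0.1, 0.05, -2.2, 0.6]
to_label = most_uncertain(pool_scores, 2)
```

A cutoff chosen this way guarantees full sensitivity only on the abstracts already labeled; whether it holds on new abstracts has to be checked on held-out data.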
how do we do?
we can achieve 100% sensitivity while substantially reducing workload
“Towards Modernizing the Systematic Review Pipeline: Efficient Updating via Data Mining” Genetics in Medicine 2012
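For reference, the two quantities in that claim can be computed as below. This is a sketch under assumptions: workload reduction is taken here to mean the fraction of abstracts the classifier marks irrelevant (so the reviewer skips them), and the labels are invented toy values, not results from the cited paper.

```python
def sensitivity(y_true, y_pred):
    """Fraction of truly relevant abstracts that the classifier flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

def workload_reduction(y_pred):
    """Fraction of abstracts predicted irrelevant, i.e. not read by hand."""
    return sum(1 for p in y_pred if p == 0) / len(y_pred)

# Toy labels, invented for illustration (1 = relevant, 0 = irrelevant).
y_true = [1, 0, 0, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]

sens = sensitivity(y_true, y_pred)
saved = workload_reduction(y_pred)
```

In this toy case both relevant abstracts are flagged, and five of the eight abstracts never need manual screening.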
beyond citation screening
Questions?
http://www.cebm.brown.edu/software
www.cebm.brown.edu/byron