improving knowledge discovery through development of big data to knowledge skills courses and open...
TRANSCRIPT
Nicole Vasilevsky1,2, Shannon McWeeney1, William Hersh1, David A. Dorr1,3, Ted Laderas1,Jackie Wirz4, Bjorn Pederson1, Melissa Haendel1,3
1Department of Medical Informatics and Clinical Epidemiology, 2Ontology Development Group, Library, 3Department of Medicine, 4Graduate Student Affairs, Oregon Health & Science University, Portland, OR
Students felt the course
AcknowledgementsThis work is supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01.
Skills Course TopicsWhy?
Course Offerings
Defining The Problem Wrangling Data
Methods, Tools And Analysis
Scientific Communication
Data Identification And Resources
…
?
v Problems amenable to analytics
v Importance of questionv Team definitionsv Scopev When we do this wrong:
methods don't match
v Finding the right datav Search methods
v Use of metadatav Data management
v Exploratory Data Analysis
v Data Dictionaryv As you touch data, what
can go wrong?
v Visualizationv Matching algorithms to
problems
v Reporting Findings and Limitationsv Giving “Elevator Speech” on ideas
of how to approach problemv Critique of related problem
Major challenge: how to manage, analyze and interpret vast amounts of data being generated in biomedical research
One goal of NIH Big Data to Knowledge (BD2K) initiative: provide training for students and researchers to address this
Research team in the Department of Medical Informatics and Clinical Epidemiology (DMICE) is developing Open Educational Resources and Skills Courses
Approach
OERs and courses connect the dots that help researchers understand how to apply data science techniques in the context of their whole research life cycle
§ Skills course and OER topics are aimed to fill specific gaps
§ Teaching students how to ask the question and follow through
①Develop Open Educational Resources (OERS)
②Teach Skills Courses
OER Modules
ChallengesScope
Images
Style
Dissemination
How to scope generic curricula for different levels of users
How to translate diverse teaching styles into general materials
How to maximize dissemination while protecting intellectual property
How to incorporate images and other copyrighted materials into open resources
Available at:http://dmice.ohsu.edu/bd2k
Improving Knowledge Discovery Through Development of Big Data to Knowledge Skills Courses and Open Educational Resources
Intro Course
Data After Dark
§ Week long course in Summer 2015§ Offered to interns and undergraduates§ Taught basics of data science in the context of
the research life cycle
§ Two-evening course in January 2016 § Offered to OHSU students, staff and faculty§ Taught basics of data science
Advanced Course
§ Four-evening course in May 2016 § Offered to OHSU students, staff and faculty§ Taught advanced topics in of data science
Data and Donuts
§ Two courses were offered in June and July 2016§ Offered to OHSU summer interns§ Taught basics of data science
01 | Biomedical Big Data Science02 | Introduction to Big Data in Biology
and Medicine 03 | Ethical Issues in Use of Big Data 04 | Clinical Standards Related to Big Data05 | Basic Research Data Standards06 | Public Health and Big Data07 | Team Science 08 | Secondary Use (Reuse) of Clinical Data 09 | Publication and Peer Review10 | Information Retrieval 11 | Version Control and Identifiers 12 | Data annotation and curation 13 | Data Tools and Landscape14 | Ontologies 10115 | Data metadata and provenance 16 | Semantic data interoperability 17 | Choice of Algorithms and Algorithm
Dynamics 18 | Visualization and Interpretation 19 | Replication, Validation and the
spectrum of Reproducibility Semantic data interoperability
20 | Regulatory Issues in Big Data for Genomics and Health Semantic Web data
21 | Hosting data dissemination and data stewardship workshops
22 | Hosting data dissemination and data stewardship workshops
23 | Terminology of Biomedical, Clinical, and Translational Research
24 | Computing Concepts for Big Data25 | Data modeling26 | Semantic Web data27 | Context-based selection of data28 | Translating the Question29 | Implications of Provenance and Pre-
processing30 | Data tells a story31 | Statistical Significance, P-hacking and
Multiple-testing32 | Displaying Confidence and
Uncertainty
Grey = still under development