

Upload: nicole-vasilevsky-phd

Post on 10-Apr-2017


Improving Knowledge Discovery Through Development of Big Data to Knowledge Skills Courses and Open Educational Resources

Nicole Vasilevsky1,2, Shannon McWeeney1, William Hersh1, David A. Dorr1,3, Ted Laderas1, Jackie Wirz4, Bjorn Pederson1, Melissa Haendel1,3

1Department of Medical Informatics and Clinical Epidemiology, 2Ontology Development Group, Library, 3Department of Medicine, 4Graduate Student Affairs, Oregon Health & Science University, Portland, OR

Students felt the course

Acknowledgements

This work is supported by NIH Grants 1R25EB020379-01 and 1R25GM114820-01.

Why?

• Major challenge: how to manage, analyze and interpret the vast amounts of data being generated in biomedical research
• One goal of the NIH Big Data to Knowledge (BD2K) initiative: provide training for students and researchers to address this
• A research team in the Department of Medical Informatics and Clinical Epidemiology (DMICE) is developing Open Educational Resources and Skills Courses

Approach

OERs and courses connect the dots, helping researchers understand how to apply data science techniques in the context of their whole research life cycle

• Skills course and OER topics aim to fill specific gaps
• Teaching students how to ask the question and follow through

① Develop Open Educational Resources (OERs)
② Teach Skills Courses

Skills Course Topics

Topic areas: Defining The Problem; Wrangling Data; Methods, Tools And Analysis; Scientific Communication; Data Identification And Resources

• Problems amenable to analytics
• Importance of question
• Team definitions
• Scope
• When we do this wrong: methods don't match
• Finding the right data
• Search methods
• Use of metadata
• Data management
• Exploratory Data Analysis
• Data Dictionary
• As you touch data, what can go wrong?
• Visualization
• Matching algorithms to problems
• Reporting Findings and Limitations
• Giving “Elevator Speech” on ideas of how to approach problem
• Critique of related problem

Course Offerings

Intro Course
• Week-long course in Summer 2015
• Offered to interns and undergraduates
• Taught basics of data science in the context of the research life cycle

Data After Dark
• Two-evening course in January 2016
• Offered to OHSU students, staff and faculty
• Taught basics of data science

Advanced Course
• Four-evening course in May 2016
• Offered to OHSU students, staff and faculty
• Taught advanced topics in data science

Data and Donuts
• Two courses were offered in June and July 2016
• Offered to OHSU summer interns
• Taught basics of data science

OER Modules

01 | Biomedical Big Data Science
02 | Introduction to Big Data in Biology and Medicine
03 | Ethical Issues in Use of Big Data
04 | Clinical Standards Related to Big Data
05 | Basic Research Data Standards
06 | Public Health and Big Data
07 | Team Science
08 | Secondary Use (Reuse) of Clinical Data
09 | Publication and Peer Review
10 | Information Retrieval
11 | Version Control and Identifiers
12 | Data annotation and curation
13 | Data Tools and Landscape
14 | Ontologies 101
15 | Data metadata and provenance
16 | Semantic data interoperability
17 | Choice of Algorithms and Algorithm Dynamics
18 | Visualization and Interpretation
19 | Replication, Validation and the spectrum of Reproducibility
20 | Regulatory Issues in Big Data for Genomics and Health
21 | Hosting data dissemination and data stewardship workshops
22 | Hosting data dissemination and data stewardship workshops
23 | Terminology of Biomedical, Clinical, and Translational Research
24 | Computing Concepts for Big Data
25 | Data modeling
26 | Semantic Web data
27 | Context-based selection of data
28 | Translating the Question
29 | Implications of Provenance and Pre-processing
30 | Data tells a story
31 | Statistical Significance, P-hacking and Multiple-testing
32 | Displaying Confidence and Uncertainty

Grey = still under development

Challenges

• Scope: How to scope generic curricula for different levels of users
• Style: How to translate diverse teaching styles into general materials
• Dissemination: How to maximize dissemination while protecting intellectual property
• Images: How to incorporate images and other copyrighted materials into open resources

Available at: http://dmice.ohsu.edu/bd2k