Training and Outreach Efforts
BD2K-LINCS
Webinar for the NIH BD2K Working Group on Training
May 29, 2015
Avi Ma’ayan, PhD (Contact PI), Associate Professor
Sherry Jenkins, MS, Program Manager
Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai
New York, New York
DATA COORDINATION AND INTEGRATION CENTER
[Diagram: the DCIC and six data and signature generation centers, each labeled Assays, Cells, Pert]
• High Throughput Transcriptomics, L1000, Connectivity Map
• NeuroLINCS, ALS, Imaging, Proteomics, Transcriptomics
• Microenvironment Effects on Cancer Cells, Transcriptomics, Imaging, Proteomics
• High Throughput Proteomics, P100, Epigenomics
• Drug Combinations Mitigating Side Effects, Proteomics, Transcriptomics
• High Throughput Imaging, Proteomics, Phenotypes, Cancer Cells, Modeling Cell Signaling
LINCS PHASE II: Library of Integrated Network-based Cellular Signatures
BD2K-LINCS Data Coordination and Integration Center
DSR | IKE | CTO | CCA
• Internal & External Data Science Research Projects (iDSRs, eDSRs)
• Metadata, APIs, Visualization, Integration Tools
• Training and Outreach
• Coordination, Infrastructure
Ma’ayan, Schurer, Medvedovic
lincs-dcic.org, lincsproject.org
• Summer Research Training Program
• Webinars
• Mini-symposium, seminars and workshops
• LINCS Working Groups
BD2K-LINCS Data Coordination and Integration Center – Scientific Objectives
- Understand how different layers of human cellular regulatory networks, i.e., transcriptomics and proteomics, correlate and interact.
- Develop methods to benchmark computational and experimental methods to objectively evaluate their quality and extract more knowledge from the data.
- Understand the inherent biases within low- and high-content experiments, and develop methods to correct for such biases.
- Map the dimensionality of all possible global molecular states of human cells in normal physiology, disease, and in response to perturbations by small molecules and genetic manipulations.
- Develop methods to connect cellular and organismal phenotypes with molecular cellular signatures.
BD2K-LINCS Data Coordination and Integration Center – Data Science Objectives
- Organize, curate and serve for search and download the largest possible collection of annotated molecular cellular signatures, networks and attribute tables.
- Develop novel data visualization methods for dynamically interacting with large genomics and proteomics datasets.
- Develop educational and outreach activities for training and engaging the next generation of data scientists.
- Develop ontologies and other methods for data integration across diverse sets of experimental data collected by different laboratories, centers and large-scale projects utilizing different high-content profiling assays.
Community Training and Outreach (CTO)
• Courses
• MOOCs on Coursera:
1. Network Analysis in Systems Biology
2. Big Data Science with the BD2K-LINCS DCIC
• ISMMS Graduate Courses:
1. BD2K-LINCS DCIC - Programming for Big Data Biomedicine
2. BD2K-LINCS DCIC - Data Mining in Systems Biology
• Big Data Biostatistics PhD Program (at the University of Cincinnati College of Medicine)
• Summer Research Training Program in Biomedical Big Data
• Data Science Research Webinars
• Crowdsourcing Projects Portal
• External Data Science Projects
• Mini-symposium, Seminars and Workshops
lincs-dcic.org DCIC Outreach Activities
Funding Opportunities
Network Analysis in Systems Biology
MOOC on Coursera
https://class.coursera.org/netsysbio-002
Description:
A graduate-level course that introduces Big Data analysis in systems biology, including statistical methods for identifying differentially expressed genes, performing various types of enrichment analyses, and applying clustering algorithms.
Course Features:
• 8 weeks / 7 modules
• 34 Short video lectures
• 24 Auto-graded short quizzes
• Auto-graded final exam
• Crowdsourcing tasks
• Weekly overviews
• Discussion forum
Last session: January 5, 2015 – March 3, 2015
Network Analysis in Systems Biology: Course Analytics
[Charts: Engagement; Content]
~600 students passed the course to obtain a statement of accomplishment
Big Data Science with the BD2K-LINCS Data Coordination and Integration Center
MOOC on Coursera
https://www.coursera.org/course/bd2klincs
Session: Sep 15 – Nov 9, 2015
Syllabus:
• Overview of the NIH Common Fund LINCS Program
• Overview of the Data and Signature Generation Centers (experiments and data)
• Meta-Data and Ontologies
• Data Normalization
• Unsupervised Learning Methods: Data Clustering
• Supervised Learning Methods
• Enrichment Analyses
• Bayesian Data Integration
• Network Analysis and Network Visualization
• Cheminformatics
• Serving data through RESTful APIs and JSON
• Interactive Data Visualization of LINCS Data
BD2K-LINCS DCIC: Programming for Big Data Biomedicine
ISMMS Graduate Course
Ten-week mini-course taught by Avi Ma’ayan, PhD, and members of his research team within the BD2K-LINCS DCIC at the Icahn School of Medicine at Mount Sinai
Topics:
• Agent Based Modeling with NetLogo
• Agent Based Modeling with MATLAB
• Python
• Python and MatPlotLib
• HTML and CSS
• JavaScript and PHP
• MySQL
• MongoDB
• Bootstrap Templates
• R
• Final Project
Spring 2015
Course Dates: Feb 24 – May 4, 2015
BD2K-LINCS DCIC: Data Mining in Systems Biology
ISMMS Graduate Course
Ten-week mini-course taught by Avi Ma’ayan, PhD, and members of his research team within the BD2K-LINCS DCIC at the Icahn School of Medicine at Mount Sinai
Topics:
• Self Organizing Maps
• Hierarchical Clustering
• PCA
• Linear Regression
• Decision Trees
• Graph Theory Concepts
• Support Vector Machines
• Final Project
Fall 2015
Fall 2014 Course Dates: September 16 – December 2, 2014
BD2K-LINCS DCIC Summer Research Training Program in Biomedical Big Data Science
Ten-week training program for undergraduate and master’s students interested in research projects aimed at solving data-intensive biomedical problems.
Summer 2015 Program Dates: June 1 – August 7
http://lincs-dcic.org/#/srp
Summer 2015 | Training Sites
• Icahn School of Medicine at Mount Sinai, Ma’ayan Laboratory of Computational Systems Biology: Dynamic Data Visualization, Machine Learning, Data Harmonization
• University of Washington, Yeung / Computational Systems Biology Group: Machine Learning, Data Integration, Network Visualization Plugins
• Carnegie Mellon University, Bar-Joseph / Systems Biology Group: Machine Learning, Time Series Analysis, Transcriptional Regulatory Networks
2015 Cohort Summary
• 6 trainees
• 2 master’s / 3 undergraduate / 1 high school
• 4 women / 2 men
• All future plans include STEM graduate degrees
• Carnegie Mellon University, Biological Sciences (Bar-Joseph)
• Carnegie Mellon University, Computational Biology (Ma’ayan)
• University of Washington, Computer Engineering (Yeung)
• University of Washington, Computer Science (Ma’ayan)
• The City College of New York, Bioinformatics (Ma’ayan)
• Yorktown High School (Ma’ayan)
Data Science Research Webinars
http://lincs-dcic.org/#/webinars
www.lincsproject.org/community/webinars/
Purpose / Target Audience
Serve as a general forum to engage data scientists within and outside of the LINCS project to work on problems related to LINCS data analysis and integration.
• Open to data science research community
• Advertised on DCIC website, LINCS portal, Twitter, Google group
• Schedule and connection details posted on the DCIC website and LINCS portal
• Past webinar videos posted on the DCIC’s YouTube channel
BD2K-LINCS DCIC | @BD2KLINCSDCIC
BD2K-LINCS DCIC Crowdsourcing Portal
http://www.maayanlab.net/crowdsourcing/
Community Science Project: Building a Database of Gene Expression Signatures Extracted from Single Gene Knockout/Knockdown Studies
http://www.maayanlab.net/crowdsourcing/
http://www.lincs-dcic.org/#/edsr
Data Science Research Collaborations with the BD2K-LINCS DCIC
Mini-symposium, Seminars and Workshops
Winter 2014 - 2015
Mini-symposium | January 7, 2015
Symposium was co-sponsored by the BD2K-LINCS DCIC and Mount Sinai’s Knowledge Management Center for Illuminating the Druggable Genome
Invited Seminar Speakers
• December 5, 2014: Reverse Engineering a more Reliable Translational Pipeline with Patient-Derived iPSC Models of Neurodegenerative Disease, Robotic Longitudinal Single Cell Analysis and Deep Learning
Steven Finkbeiner, MD, PhD / NeuroLINCS Center
• January 14, 2015: The PAGE Study and Coordinating Center (Population Architecture using Genomics and Epidemiology)
Tara Matise, PhD / PAGE Coordinating Center
Works in Progress Seminar Series
• January 15, 2015: Enrichr and GEO2Enrichr: Tools to Extract and Analyze Signatures
Gregory Gundersen and Matthew Jones / BD2K-LINCS DCIC
Outreach Session at the Society of Toxicology’s Annual Meeting
• March 23, 2015: BD2K-LINCS Outreach Session: Turning Big Data to Knowledge (BD2K-LINCS): A discussion of the NIH BD2K initiative and how it might advance the practice of Toxicology and Risk Assessment
John Reichard, PhD, Mario Medvedovic, PhD / BD2K-LINCS DCIC
• Poster Session: Big Data to Knowledge (BD2K) - A Graphical Approach for Data Coordination and Integration
J.F. Reichard, M. Medvedovic, S. Sivagas / BD2K-LINCS DCIC
Calendar of events on lincs-dcic.org
WORKSHOP | June 19, 2015
Genomic and Computational Approaches for Biomarker and Drug Discovery
Workshop hosted by the NIAAA
Location: San Antonio, TX (Grand Hyatt, San Antonio)
Room: Travis C/D
Time: 2:00 – 5:00pm
http://www.lincsproject.org/news/
Hands-on Session: Web Apps and Tools
• Enrichr: Search engine for gene lists and signatures (http://amp.pharm.mssm.edu/Enrichr/)
• GEO2Enrichr: Differential Expression Analysis Tool (http://maayanlab.net/g2e)
• L1000CDS2: L1000 Characteristic Direction Signature Search Engine (http://amp.pharm.mssm.edu/L1000CDS2/)
• PAEA: Principal Angle Enrichment Analysis (http://amp.pharm.mssm.edu/PAEA/)
Acknowledgements
The BD2K-LINCS DCIC is co-funded by BD2K and the NIH Common Fund
NIH Grant Number: U54HL127624
Follow BD2K-LINCS DCIC: BD2K-LINCS DCIC | @BD2KLINCSDCIC | +BD2K-LINCS
BD2K-LINCS DCIC website: lincs-dcic.org
LINCS Consortium portal: lincsproject.org