towards open and reproducible neuroscience in the age of big data

Towards open and reproducible neuroscience in the age of big data Chris Gorgolewski Center for Reproducible Neuroscience Stanford University

Upload: krzysztof-gorgolewski

Post on 14-Feb-2017


TRANSCRIPT

Page 1: Towards open and  reproducible neuroscience in the age of big data

Towards open and reproducible neuroscience in the age of big data

Chris Gorgolewski
Center for Reproducible Neuroscience
Stanford University

Page 2: Towards open and  reproducible neuroscience in the age of big data

ON THE IMPORTANCE OF DATA

Page 3: Towards open and  reproducible neuroscience in the age of big data

ROSALIND FRANKLIN AND PHOTOGRAPH 51

Page 4: Towards open and  reproducible neuroscience in the age of big data

NEUROVAULT.ORG DATA REUSE

Sochat et al. 2015

Page 5: Towards open and  reproducible neuroscience in the age of big data

OPENFMRI DATA REUSE

Gorgolewski et al. 2015

Page 6: Towards open and  reproducible neuroscience in the age of big data

DATA SHARING SAVES MONEY

$878,988

COST OF REACQUIRING DATA FOR EACH OF THE REUSES OF OPENFMRI DATASETS (2015)

Page 7: Towards open and  reproducible neuroscience in the age of big data

STUDIES SHARING DATA HAVE HIGHER STATISTICAL QUALITY

Wicherts JM, Bakker M, Molenaar D (2011) Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE 6(11): e26828. doi: 10.1371/journal.pone.0026828

Page 8: Towards open and  reproducible neuroscience in the age of big data

SHARING DATA IS RELATED TO HIGHER CITATION RATE

PIWOWAR, DAY & FRIDSMA (2007)

Piwowar & Vision (2013)

Page 9: Towards open and  reproducible neuroscience in the age of big data
Page 10: Towards open and  reproducible neuroscience in the age of big data

MAKING MORE DATA ACCESSIBLE TO MORE RESEARCHERS

Page 11: Towards open and  reproducible neuroscience in the age of big data

MEET PROF. SMITH

BIDS.NEUROIMAGING.IO

Page 12: Towards open and  reproducible neuroscience in the age of big data

MEET MIKE

BIDS.NEUROIMAGING.IO

Page 13: Towards open and  reproducible neuroscience in the age of big data

GETTING LOST IN YOUR DATA

BIDS.NEUROIMAGING.IO

Page 14: Towards open and  reproducible neuroscience in the age of big data

GETTING LOST IN YOUR DATA

HETEROGENEITY IN DATA DESCRIPTION PRACTICES CAUSES:
• PROBLEMS IN SHARING DATA,
• UNNECESSARY MANUAL METADATA INPUT,
• NO WAY TO AUTOMATICALLY VALIDATE DATASETS.

BIDS.NEUROIMAGING.IO

Page 15: Towards open and  reproducible neuroscience in the age of big data

GETTING LOST IN YOUR DATA

• MRI HAS BEEN USED TO STUDY THE HUMAN BRAIN FOR OVER 20 YEARS.

• DESPITE SIMILARITIES IN EXPERIMENTAL DESIGNS AND DATA TYPES EACH RESEARCHER TENDS TO ORGANIZE AND DESCRIBE THEIR DATA IN THEIR OWN WAY.

http://www.nature.com/news/brain-imaging-fmri-2-0-1.10365

BIDS.NEUROIMAGING.IO

Page 16: Towards open and  reproducible neuroscience in the age of big data

BRAIN IMAGING DATA STRUCTURE

A NEW STANDARD FOR ORGANIZING HUMAN NEUROIMAGING DATASETS

BIDS.NEUROIMAGING.IO

Page 17: Towards open and  reproducible neuroscience in the age of big data

WHO IS IT FOR?

1. LAB PIS. IT WILL MAKE HANDING OVER A DATASET FROM ONE STUDENT/POSTDOC TO ANOTHER EASY.

2. WORKFLOW DEVELOPERS. IT’S EASIER TO WRITE PIPELINES EXPECTING A PARTICULAR FILE ORGANIZATION.

3. DATABASE CURATORS. ACCEPTING ONE DATASET FORMAT WILL MAKE CURATION EASIER.
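A fixed file organization is what lets a pipeline find its inputs without per-lab configuration. The following is a minimal Python sketch, assuming the BIDS naming pattern of underscore-separated key-value pairs followed by a suffix and an extension. The function name `parse_bids_name` and the example filename are hypothetical, chosen only for illustration.

```python
def parse_bids_name(filename):
    """Split a BIDS-style filename into entities, suffix, and extension."""
    # Everything after the first dot is treated as the extension (e.g. "nii.gz").
    stem, _, extension = filename.partition(".")
    # The last underscore-separated chunk is the suffix (e.g. "bold");
    # the chunks before it are key-value pairs such as "sub-001".
    *pairs, suffix = stem.split("_")
    entities = {}
    for pair in pairs:
        key, _, value = pair.partition("-")
        entities[key] = value
    return entities, suffix, extension


# Hypothetical filename, used only to exercise the function.
entities, suffix, extension = parse_bids_name("sub-001_task-rest_bold.nii.gz")
print(entities, suffix, extension)
```

Because every dataset follows the same pattern, a workflow developer can write this lookup once instead of once per lab.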

BIDS.NEUROIMAGING.IO

Page 18: Towards open and  reproducible neuroscience in the age of big data

WHO IS IT FOR?

BIDS.NEUROIMAGING.IO

Page 19: Towards open and  reproducible neuroscience in the age of big data

PRINCIPLES BEHIND BIDS

1. ADOPTION IS CRUCIAL.
2. DON’T REINVENT THE WHEEL.
3. 80/20 RULE.

BIDS.NEUROIMAGING.IO

Page 20: Towards open and  reproducible neuroscience in the age of big data

EVOLUTION OF BIDS

1. KICKOFF MEETING AT STANFORD IN SPRING 2015
2. MEETING AT OHBM 2015 (JUNE)
3. INTRODUCED TO NEUROINFORMATICS COMMUNITY AT NEUROINFORMATICS CONGRESS 2015 (AUGUST)
4. FIRST RELEASE CANDIDATE AND PUBLIC CALL FOR COMMENTS (SEPTEMBER)
5. VERSION 1.0.0 PUBLISHED ALONGSIDE AN INTRODUCTORY PAPER

BIDS.NEUROIMAGING.IO

Page 21: Towards open and  reproducible neuroscience in the age of big data

COMMUNITY OUTREACH

• REACHED OVER 5000 RESEARCHERS
• EXCHANGED HUNDREDS OF EMAIL COMMENTS
• PRODUCED ~40 EXAMPLE DATASETS
• 27 COAUTHORS ON THE FINAL MANUSCRIPT

BIDS.NEUROIMAGING.IO

Gorgolewski et al. (2016) Scientific Data

Page 22: Towards open and  reproducible neuroscience in the age of big data

FOLDER ORGANIZATION

BIDS.NEUROIMAGING.IO

Page 23: Towards open and  reproducible neuroscience in the age of big data

FOLDER ORGANIZATION

BIDS.NEUROIMAGING.IO

Page 24: Towards open and  reproducible neuroscience in the age of big data

FOLDER ORGANIZATION

participant_id  age  sex
sub-001         34   M
sub-002         12   F
sub-003         33   F
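A table like this can be read with standard tools. The sketch below assumes the table is stored as a tab-separated `participants.tsv` file at the dataset root, as BIDS does. The helper name `load_participants` is hypothetical, and the text is embedded in the script so the example runs without any files.

```python
import csv
import io

# Tab-separated contents mirroring the participants table on this slide.
PARTICIPANTS_TSV = (
    "participant_id\tage\tsex\n"
    "sub-001\t34\tM\n"
    "sub-002\t12\tF\n"
    "sub-003\t33\tF\n"
)


def load_participants(text):
    """Parse participants.tsv text into a list of dicts with integer ages."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row, age=int(row["age"])) for row in reader]


participants = load_participants(PARTICIPANTS_TSV)
mean_age = sum(p["age"] for p in participants) / len(participants)
print(len(participants), "participants, mean age", round(mean_age, 1))
```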

BIDS.NEUROIMAGING.IO

Page 25: Towards open and  reproducible neuroscience in the age of big data

FOLDER ORGANIZATION

NIfTI

BIDS.NEUROIMAGING.IO

Page 26: Towards open and  reproducible neuroscience in the age of big data

FOLDER ORGANIZATION

{
  "RepetitionTime": 3.0,
  "EchoTime": 0.03,
  "FlipAngle": 78,
  "SliceTiming": [0.0, 0.2, 0.4, …],
  "MultibandAccelerationFactor": 4,
  "PhaseEncodingDirection": "j-"
}
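Sidecar metadata of this kind is plain JSON, so any language can read it. The sketch below assumes the BIDS conventions that `RepetitionTime` and `EchoTime` are given in seconds and `FlipAngle` in degrees. It uses only fields whose values are fully shown on the slide and leaves out the elided `SliceTiming` list. The function name `summarize_sidecar` and its output keys are hypothetical.

```python
import json

# A subset of the sidecar shown on the slide (SliceTiming omitted because
# its values are elided there).
SIDECAR_TEXT = """
{
  "RepetitionTime": 3.0,
  "EchoTime": 0.03,
  "FlipAngle": 78,
  "PhaseEncodingDirection": "j-"
}
"""


def summarize_sidecar(text):
    """Return a few acquisition parameters in analysis-friendly units."""
    meta = json.loads(text)
    direction = meta["PhaseEncodingDirection"]
    return {
        "tr_s": float(meta["RepetitionTime"]),        # seconds
        "te_ms": float(meta["EchoTime"]) * 1000.0,     # seconds -> milliseconds
        "flip_angle_deg": meta["FlipAngle"],
        "phase_encoding_axis": direction[0],           # "i", "j", or "k"
        "phase_encoding_reversed": direction.endswith("-"),
    }


summary = summarize_sidecar(SIDECAR_TEXT)
print(summary)
```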

BIDS.NEUROIMAGING.IO

Page 27: Towards open and  reproducible neuroscience in the age of big data

THE VALIDATOR

incf.github.io/bids-validator/
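To give a feel for what automatic validation means, here is a toy Python check. It is not the bids-validator and covers only two rules, both assumptions taken from the BIDS layout: a `dataset_description.json` file at the dataset root and subject folders named `sub-<label>`. The function name `check_dataset` and the temporary dataset are hypothetical.

```python
import tempfile
from pathlib import Path


def check_dataset(root):
    """Return (subject folder names, list of issues) for a dataset root."""
    root = Path(root)
    issues = []
    if not (root / "dataset_description.json").is_file():
        issues.append("missing dataset_description.json")
    subjects = sorted(
        p.name for p in root.iterdir()
        if p.is_dir() and p.name.startswith("sub-")
    )
    if not subjects:
        issues.append("no sub-<label> directories found")
    return subjects, issues


# Build a throwaway dataset that deliberately lacks dataset_description.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub-001").mkdir()
    (Path(tmp) / "sub-002").mkdir()
    subjects, issues = check_dataset(tmp)

print(subjects, issues)
```

The real validator checks far more (file names, metadata fields, image headers); the point here is only that a shared standard makes such checks possible at all.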

BIDS.NEUROIMAGING.IO

Page 28: Towards open and  reproducible neuroscience in the age of big data

PROF. SMITH (2030)

BIDS.NEUROIMAGING.IO

Page 29: Towards open and  reproducible neuroscience in the age of big data

SOFTWARE SUPPORTING BIDS

• QAP (QUALITY ASSESSMENT)
• MRIQC (QUALITY ASSESSMENT)
• FMRIPREP (PREPROCESSING WORKFLOW)
• AUTOMATIC ANALYSIS (FMRI PROCESSING TOOLBOX)
• OPENFMRI2BIDS (CONVERTER)
• BIDS2ISATAB (CONVERTER)
• DCM2NIIX (CONVERTER)
• DICM2NII (CONVERTER)
• OPENFMRI (REPOSITORY)
• BIDSTO3COL (CONVERTER)
• BIDS2NDA (CONVERTER)
• AFNI BIDS-TOOLS (SET OF TOOLS FOR CONVERTING TO AND ANALYZING BIDS DATASETS IN AFNI)
• HEUDICONV (CONVERTER)
• DCM2BIDS (CONVERTER)
• C-PAC (CONFIGURABLE PIPELINE FOR THE ANALYSIS OF CONNECTOMES)
• BRAINSTORM (MEG/EEG ANALYSIS PACKAGE)

Page 30: Towards open and  reproducible neuroscience in the age of big data

BIDS APPS

BIDS-APPS.NEUROIMAGING.IO

PORTABLE NEUROIMAGING PIPELINES THAT SUPPORT BIDS DATASETS
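What makes these pipelines interchangeable is a shared command-line shape. The sketch below assumes the BIDS Apps convention of three positional arguments (input dataset, output folder, analysis level) plus an optional `--participant_label` list. The paths and labels passed in at the bottom are hypothetical.

```python
import argparse


def build_parser():
    """Command-line interface following the assumed BIDS Apps convention."""
    parser = argparse.ArgumentParser(description="Example BIDS App entry point (sketch).")
    parser.add_argument("bids_dir", help="Folder with the input dataset in BIDS format.")
    parser.add_argument("output_dir", help="Folder where results are written.")
    parser.add_argument(
        "analysis_level",
        choices=["participant", "group"],
        help="Run per-participant processing or a group-level step.",
    )
    parser.add_argument(
        "--participant_label",
        nargs="+",
        default=[],
        help="Labels of participants to process (without the 'sub-' prefix).",
    )
    return parser


# Hypothetical invocation, parsed from a list instead of sys.argv.
args = build_parser().parse_args(
    ["/data/ds001", "/data/out", "participant", "--participant_label", "01", "02"]
)
print(args.analysis_level, args.participant_label)
```

Because each participant can be processed by a separate call, the same interface also lends itself to running many participants in parallel.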

Page 31: Towards open and  reproducible neuroscience in the age of big data

BIDS APPS

BIDS-APPS.NEUROIMAGING.IO

Page 32: Towards open and  reproducible neuroscience in the age of big data

BIDS APPS - CONTAINERS

BIDS-APPS.NEUROIMAGING.IO

• BATTERIES INCLUDED – NO NEED TO INSTALL ANY EXTRA SOFTWARE
• NO NEED TO WORRY ABOUT INCOMPATIBLE THIRD PARTY SOFTWARE UPDATES
• EASY TO SWITCH BETWEEN VERSIONS
• WORKS ON WINDOWS, MAC, LINUX AND MULTI USER HPC
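As an illustration of the container workflow, the sketch below builds (but does not execute) a `docker run` command for a BIDS App. It assumes the dataset is mounted read-only at `/bids_dataset` and results go to `/outputs` inside the container. The image name, tag, host paths, and the helper name `docker_command` are hypothetical placeholders.

```python
import shlex


def docker_command(image, bids_dir, output_dir, analysis_level, participant_labels=()):
    """Assemble the argument list for running a BIDS App image with Docker."""
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{bids_dir}:/bids_dataset:ro",   # input dataset, read-only
        "-v", f"{output_dir}:/outputs",         # where results are written
        image,
        "/bids_dataset", "/outputs", analysis_level,
    ]
    if participant_labels:
        cmd += ["--participant_label", *participant_labels]
    return cmd


# Hypothetical image tag and paths; pinning a tag is how versions are switched.
cmd = docker_command("bids/example:0.0.4", "/data/ds001", "/data/out",
                     "participant", ["01", "02"])
print(" ".join(shlex.quote(part) for part in cmd))
```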

Page 33: Towards open and  reproducible neuroscience in the age of big data

BIDS-APPS.NEUROIMAGING.IO

Page 34: Towards open and  reproducible neuroscience in the age of big data

WHAT IS DOCKER?

• DOCKER IS THE MOST POPULAR CONTAINER IMPLEMENTATION
• IT CONSISTS OF:
  • DOCKER ENGINE (RUNS CONTAINERS)
  • DOCKER HUB (CENTRALIZED WEB SERVICE FOR STORING AND SHARING CONTAINER IMAGES)

BIDS-APPS.NEUROIMAGING.IO

Page 35: Towards open and  reproducible neuroscience in the age of big data

RUNNING CONTAINERS ON CLUSTERS/HPCS

• DOCKER HAS LIMITATIONS:
  • IT WAS DESIGNED FOR THE CLOUD, WHERE YOU ARE IN TOTAL CONTROL
  • REQUIRES A MODERN KERNEL VERSION
  • ALLOWS USERS TO GAIN ROOT ACCESS

BIDS-APPS.NEUROIMAGING.IO

Page 36: Towards open and  reproducible neuroscience in the age of big data

RUNNING CONTAINERS ON CLUSTERS/HPCS

• THE ADVANCED DOCKER FEATURES ARE USEFUL FOR:
  • NETWORKING MANAGEMENT
  • SANDBOXING RESOURCES
  • MAPPING USERNAMES
• ALL SCIENTISTS CARE ABOUT IS PORTABILITY (CAPTURING BINARY DEPENDENCIES)

Page 37: Towards open and  reproducible neuroscience in the age of big data

RUNNING CONTAINERS ON CLUSTERS/HPCS

• SINGULARITY IS A CONTAINER FRAMEWORK THAT
  • WAS BUILT FROM THE GROUND UP TO SUPPORT CLUSTERS/HPCS
  • HAS MINIMAL REQUIREMENTS
  • RUNS ON LEGACY KERNELS
  • DOES NOT ELEVATE PERMISSIONS
  • ALLOWS IMPORTING DOCKER IMAGES

BIDS-APPS.NEUROIMAGING.IO

Page 38: Towards open and  reproducible neuroscience in the age of big data

BIDS APPS - PARALLELIZATION

BIDS-APPS.NEUROIMAGING.IO

Page 39: Towards open and  reproducible neuroscience in the age of big data

BIDS APPS - COMMUNITY

BIDS-APPS.NEUROIMAGING.IO

GORGOLEWSKI ET AL. 2017

Page 40: Towards open and  reproducible neuroscience in the age of big data

22 AVAILABLE BIDS APPS

BIDS-APPS.NEUROIMAGING.IO

Page 41: Towards open and  reproducible neuroscience in the age of big data

BIDS APPS - VERSIONING

BIDS-APPS.NEUROIMAGING.IO

Page 42: Towards open and  reproducible neuroscience in the age of big data

EXAMPLE BIDS APPS: MRIQC AND FMRIPREP

Page 43: Towards open and  reproducible neuroscience in the age of big data

MRIQC – QUALITY CONTROL FOR STRUCTURAL AND FUNCTIONAL IMAGES

MRIQC.ORG

Page 44: Towards open and  reproducible neuroscience in the age of big data
Page 45: Towards open and  reproducible neuroscience in the age of big data
Page 46: Towards open and  reproducible neuroscience in the age of big data
Page 47: Towards open and  reproducible neuroscience in the age of big data

SPIKES

Page 48: Towards open and  reproducible neuroscience in the age of big data
Page 49: Towards open and  reproducible neuroscience in the age of big data
Page 50: Towards open and  reproducible neuroscience in the age of big data

CROWDSOURCING ARTEFACTS

WHAT HAPPENS WHEN YOU ASK TWITTER FOR HELP...

Page 51: Towards open and  reproducible neuroscience in the age of big data

FMRIPREP

ROBUST, EASY, TRANSPARENT

FMRIPREP.READTHEDOCS.IO

Page 52: Towards open and  reproducible neuroscience in the age of big data

WHAT IS IT?
FMRI DATA PREPROCESSING TOOL

PREPROCESSING?
DENOISING AND NORMALIZATION

FMRIPREP.READTHEDOCS.IO

Page 53: Towards open and  reproducible neuroscience in the age of big data

WHAT IT IS NOT

GLM
DCM
CONNECTIVITY
DYNAMICS
ETC.

FMRIPREP.READTHEDOCS.IO

Page 54: Towards open and  reproducible neuroscience in the age of big data

FMRIPREP: PRINCIPLES

• EASY TO INSTALL AND USE

• ROBUST – WORKS ON ANY* DATA

• TRANSPARENT – “GLASS BOX” RATHER THAN “BLACK BOX”

FMRIPREP.READTHEDOCS.IO

Page 55: Towards open and  reproducible neuroscience in the age of big data

FMRIPREP: T1W PREPROCESSING

• N4 BIAS FIELD CORRECTION (ANTS)

• SKULL STRIPPING (ANTS)

• 3 CLASS TISSUE SEGMENTATION (FSL FAST)

• ROBUST MNI COREGISTRATION (ANTS)

FMRIPREP.READTHEDOCS.IO

Page 56: Towards open and  reproducible neuroscience in the age of big data

FMRIPREP: EPI PREPROCESSING

• MOTION CORRECTION (FSL MCFLIRT)

• SKULL STRIPPING (NILEARN)

• COREGISTRATION TO T1 (FSL FLIRT WITH BBR)

FMRIPREP.READTHEDOCS.IO

Page 57: Towards open and  reproducible neuroscience in the age of big data

REPORTS

Page 58: Towards open and  reproducible neuroscience in the age of big data

BIDS DERIVATIVES

Page 59: Towards open and  reproducible neuroscience in the age of big data

MAKING MORE DATA ACCESSIBLE TO MORE RESEARCHERS

Poldrack and Gorgolewski, 2014

Page 60: Towards open and  reproducible neuroscience in the age of big data

MAKING MORE DATA ACCESSIBLE TO MORE RESEARCHERS

Page 61: Towards open and  reproducible neuroscience in the age of big data
Page 62: Towards open and  reproducible neuroscience in the age of big data

OPENNEURO*

*BORN OUT OF A TWEET

Page 63: Towards open and  reproducible neuroscience in the age of big data

FEATURES

• DATA ORGANIZATION – BIDS
• DATA MANAGEMENT PLATFORM
  • UPLOADING
  • VALIDATION
  • SNAPSHOTTING
  • DOWNLOADING
• DATA ANALYSIS – BIDS APPS

Page 64: Towards open and  reproducible neuroscience in the age of big data

UPLOADING

Page 65: Towards open and  reproducible neuroscience in the age of big data

BROWSING AND VALIDATION

Page 66: Towards open and  reproducible neuroscience in the age of big data

ANALYSIS RESULTS

Page 67: Towards open and  reproducible neuroscience in the age of big data

FUTURE DIRECTIONS

• HYPOTHESIS GENERATION MACHINE
  • SAME MODEL, SAME DATA, NEW BRAIN-BEHAVIOR RELATIONSHIPS
• VIBRATION RATIO ESTIMATION
  • HOW MUCH YOUR RESULTS DEPEND ON PREPROCESSING DECISIONS (CARP 2012)
• EXPOSING NEUROIMAGING DATASETS TO ML COMMUNITY
  • FROM NIFTIS TO CSV FILES
• BIDS EXTENSIONS:
  • PET, MEG, MODELS, DERIVATIVES
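One way to read the vibration ratio item above is sketched below, under the assumption that it means the largest effect estimate divided by the smallest one obtained for the same question under different preprocessing choices. The slide does not define it, so this definition, the function name `vibration_ratio`, the pipeline names, and the effect sizes are all illustrative assumptions, not results.

```python
def vibration_ratio(effects):
    """Ratio of the largest to the smallest positive effect estimate.

    `effects` maps a pipeline name to the effect size it produced for the
    same data and the same statistical model.
    """
    values = list(effects.values())
    if not values or min(values) <= 0:
        raise ValueError("this sketch expects at least one estimate, all positive")
    return max(values) / min(values)


# Made-up effect sizes for three hypothetical preprocessing pipelines.
effects = {"pipeline_a": 0.42, "pipeline_b": 0.30, "pipeline_c": 0.21}
ratio = vibration_ratio(effects)
print(f"vibration ratio: {ratio:.2f}")
```

A ratio near 1 would suggest the result barely depends on preprocessing decisions; a large ratio would suggest it depends on them heavily.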

Page 68: Towards open and  reproducible neuroscience in the age of big data

ACKNOWLEDGMENTS

The Poldrack Lab @ Stanford
Data Sharing Task Force

Page 69: Towards open and  reproducible neuroscience in the age of big data

ACKNOWLEDGMENTS

• TIBOR AUER
• VINCE D. CALHOUN
• R. CAMERON CRADDOCK
• SAMIR DAS
• EUGENE P. DUFF
• GUILLAUME FLANDIN
• SATRAJIT S. GHOSH
• TRISTAN GLATARD
• YAROSLAV O. HALCHENKO
• DANIEL A. HANDWERKER
• MICHAEL HANKE
• DAVID KEATOR
• XIANGRUI LI
• DAN MARCUS
• ZACHARY MICHAEL
• CAMILLE MAUMET
• B. NOLAN NICHOLS
• THOMAS E. NICHOLS
• JOHN PELLMAN
• JEAN-BAPTISTE POLINE
• ARIEL ROKEM
• CHRIS RORDEN
• GUNNAR SCHAEFER
• VANESSA SOCHAT
• WILLIAM TRIPLETT
• JESSICA A. TURNER
• GAËL VAROQUAUX
• RUSSELL A. POLDRACK

Page 70: Towards open and  reproducible neuroscience in the age of big data

ACKNOWLEDGMENTS

• FIDEL ALFARO-ALMAGRO
• PIERRE BELLEC
• MIHAI CAPOTĂ
• M. MALLAR CHAKRAVARTY
• NATHAN W. CHURCHILL
• ALEXANDER LI COHEN
• GABRIEL A. DEVENYI
• ANDERS EKLUND
• OSCAR ESTEBAN
• J. SWAROOP GUNTUPALLI
• MARK JENKINSON
• ANISHA KESHAVAN
• GREGORY KIAR
• FRANZISKUS LIEM
• PRADEEP REDDY RAAMANA
• DAVID RAFFELT
• CHRISTOPHER J. STEELE
• PIERRE-OLIVIER QUIRION
• ROBERT E. SMITH
• STEPHEN C. STROTHER
• GAËL VAROQUAUX
• TAL YARKONI
• YIDA WANG
• ROSS BLAIR
• SHOSHANA BERLEANT
• SUYASH BHOGOWAR
• JOSEPH WEXLER
• CHRIS MARKIEWICZ

Page 71: Towards open and  reproducible neuroscience in the age of big data

OHBM REPLICATION AWARD

BEST REPLICATION IN NEUROIMAGING
PEER REVIEWED PAPER OR PREPRINT
PUBLISHED/POSTED ANYTIME BEFORE FEBRUARY 2017
SUBMISSION DEADLINE: FEBRUARY 22 2017
THE AWARD ($2000) WILL BE PRESENTED AT A PLENARY SESSION OF OHBM 2017
HTTP://WWW.HUMANBRAINMAPPING.ORG/I4A/PAGES/INDEX.CFM?PAGEID=3731
HTTP://WWW.OHBMBRAINMAPPINGBLOG.COM/BLOG/OHBM-REPLICATION-AWARD-QA-WITH-CHRIS-GORGOLEWSKI

Page 72: Towards open and  reproducible neuroscience in the age of big data

HOW TO GET INVOLVED (WE NEED YOUR HELP!)

• SHARE YOUR DATA!
  • OPENFMRI.ORG (RAW)
  • NEUROVAULT.ORG (STATISTICAL MAPS)
• LEARN MORE ABOUT BIDS AND JOIN THE WORK ON ITS EXTENSIONS:
  • BIDS.NEUROIMAGING.IO
• CHECK OUT BIDS APPS, DEVELOP YOUR OWN:
  • BIDS-APPS.NEUROIMAGING.IO
• TRY MRIQC, LET US KNOW HOW TO MAKE IT BETTER:
  • MRIQC.ORG
• TRY FMRIPREP AND TELL US HOW IT WORKS WITH YOUR DATA:
  • FMRIPREP.READTHEDOCS.ORG
• APPLY FOR THE OHBM REPLICATION AWARD