Hands-On Data Management Planning for Life Sciences


DESCRIPTION

Workshop session at University of Virginia on 3/19/13. Presenters Andrew Sallans and Andrea Denton.

TRANSCRIPT

Page 1: Hands-On Data Management Planning for Life Sciences

Hands-On Data Management Planning for Life Sciences

Andrew Sallans
Head of Strategic Data Initiatives
University of Virginia Library
[email protected]

Andrea Denton
Research and Data Services Manager
Claude Moore Health Sciences Library
[email protected]

Creative Commons License: "Hands-On Data Management Planning for Life Sciences", 3/19/13, by Andrew L. Sallans is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Page 2: Hands-On Data Management Planning for Life Sciences

Goals for the workshop

• Learn about data management planning
• Learn about available resources
• Develop a rough draft of a data management plan for a grant
• Gain peer and expert feedback

Page 3: Hands-On Data Management Planning for Life Sciences

Why should you care about data management planning?

• It’s good science: reproducible results and continuity
• Transparency and accountability
• Gain a competitive edge in grant competition
• Get credit and more impact by making your data citable
• Be efficient and avoid data loss
• It’s complex and requires attention to many parts
• You may be required to by your government, funder, institution, publishers, etc.

Page 4: Hands-On Data Management Planning for Life Sciences

Why not?

http://memegenerator.net/Fist-Pump-Baby

Page 5: Hands-On Data Management Planning for Life Sciences

Recent news

• White House Office of Science and Technology Policy memo, February 22, 2013
• Federal research agencies funding more than $100M/year must develop a plan to make the results (papers and data) of federally funded research available to the public within one year of publication
• Also requires researchers to better account for and manage data

http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf

Page 6: Hands-On Data Management Planning for Life Sciences

Example: National Science Foundation

– Data Sharing Policy: Award & Administration Guide (AAG) Chapter VI.D.4
– Data Management Plan requirement: Grant Proposal Guide (GPG) Chapter II.C.2.j
– Additional requirements from individual Directorates and Divisions (e.g., BIO, CISE, EHR, GEO, MPS, SBE): Dissemination and Sharing of Results

Page 7: Hands-On Data Management Planning for Life Sciences

Caveat: it’s not just the NSF

Read calls for proposals carefully and ask the program director about specific data management requirements. Build time into your proposal development to formulate a data management plan!

• CDC
• DOE
• EPA
• IMLS
• NASA
• NOAA
• NEH
• NIH
• USDA
• Private and public foundations
• Many research funding agencies in the U.K., Australia, and other countries
• Etc.

Page 8: Hands-On Data Management Planning for Life Sciences

What is a Data Management Plan?

• A comprehensive plan of how you will manage your research data throughout the lifecycle of your research project

AND

• Brief description of how you will comply with funder’s data sharing policy
• Reviewed as part of a grant application

Page 9: Hands-On Data Management Planning for Life Sciences

Dissemination & Sharing of Research Results

“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”

National Science Foundation: Award & Administration Guide (AAG) Chapter VI.D.4

Page 10: Hands-On Data Management Planning for Life Sciences

Plan for Data Management & Sharing of the Products of Research

As of January 18, 2011:

“Proposals must include a supplementary document of no more than two pages labeled “Data Management Plan”. This supplement should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results, and may include…”

NSF: Grant Proposal Guide (GPG) Chapter II.C.2.j

Page 11: Hands-On Data Management Planning for Life Sciences

Which NSF requirement to use?

• Which guideline should I follow?
  § First: follow the requirements laid out in the specific solicitation, if any.
  § Second: follow the guidelines published by the appropriate NSF directorate and/or division. If there is a conflict, the latter takes precedence.
  § Third: follow the more general guidelines.
• Interdisciplinary proposals
  § Use guidelines appropriate to the lead program (if there are specific guidelines)
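
The order of precedence above can be sketched as a simple fallback lookup. This is only an illustration, not an NSF tool: the function name and the placeholder strings are hypothetical, and it reads "the latter takes precedence" as meaning that division guidance wins over directorate guidance, which is an assumption.

```python
from typing import Optional

# Fallback when nothing more specific exists (reference taken from the slides).
GENERAL_GUIDANCE = "NSF Grant Proposal Guide (GPG) Chapter II.C.2.j"


def pick_guidance(solicitation: Optional[str] = None,
                  division: Optional[str] = None,
                  directorate: Optional[str] = None) -> str:
    """Return the most specific guidance that is available.

    Assumed order: solicitation, then division, then directorate,
    then the general guidelines.
    """
    for source in (solicitation, division, directorate):
        if source:
            return source
    return GENERAL_GUIDANCE


# Hypothetical example: only directorate-level guidance exists.
print(pick_guidance(directorate="BIO directorate guidance"))
```

For an interdisciplinary proposal, the same lookup would be run with the values that apply to the lead program.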

Page 12: Hands-On Data Management Planning for Life Sciences

Parts of a (Generic) NSF Data Management Plan

I. Products of the Research: The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.

II. Data Formats: The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).

III. Access to Data and Data Sharing Practices and Policies: Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.

IV. Policies for Re-Use, Re-Distribution, and Production of Derivatives.

V. Archiving of Data: Plans for archiving data, samples, and other research products, and for preservation of access to them.

Grant Proposal Guide (GPG) Chapter II.C.2.j

Page 13: Hands-On Data Management Planning for Life Sciences

I. Types of Data

• Questions to answer:
  § What data will be generated in the research?
  § What data types will you be creating or capturing?
  § How/when/where will you capture or create the data?
  § How will the data be processed?
  § If you will be using existing data, state that fact and include where you got it. What is the relationship between the data you are collecting and the existing data?

Page 14: Hands-On Data Management Planning for Life Sciences

II. Data and Metadata Standards

• Questions to answer:
  § Which file formats will you use for your data, and why?
  § What form will metadata describing/documenting your data take?
  § How will you create or capture these details?
  § Which metadata standards will you use and why have you chosen them?
  § What contextual details (metadata) are needed to make the data you capture or collect meaningful?

Page 15: Hands-On Data Management Planning for Life Sciences

III. Policies for Access and Sharing & Provisions for Appropriate Protection/Privacy

• Questions to answer:
  § How/when will you make the data available?
  § What is the process for gaining access to the data?
  § Does the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?
  § Are there any embargo periods for political/commercial/patent reasons? If so, give details.

Page 16: Hands-On Data Management Planning for Life Sciences

III. Policies for Access and Sharing & Provisions for Appropriate Protection/Privacy (Cont.)

• More questions to answer:
  § Are there ethical and privacy issues? If so, how will these be resolved?
  § What have you done to comply with your obligations in your IRB Protocol?
  § Who will hold the intellectual property rights to the data and how might this affect data access?

Page 17: Hands-On Data Management Planning for Life Sciences

IV. Policies and Provisions for Re-Use, Re-Distribution

• Questions to answer:
  § Will any permission restrictions need to be placed on the data?
  § Which bodies/groups are likely to be interested in the data?
  § What could be the intended uses of the data?

Page 18: Hands-On Data Management Planning for Life Sciences

V. Plans for Archiving and Preservation of Access

• Questions to answer:
  § What data will be preserved for the long-term?
  § What is the long-term strategy for maintaining, curating and archiving the data?
  § Which archive/repository/database have you identified as a place to deposit data?
  § What procedures does your intended long-term data storage facility have in place for preservation and backup?
  § How long will/should data be kept beyond the life of the project?

Page 19: Hands-On Data Management Planning for Life Sciences

V. Plans for Archiving and Preservation of Access (Cont.)

• More questions to answer:
  § What transformations will be necessary to prepare data for preservation/data sharing?
  § What metadata/documentation will be submitted alongside the data or created on deposit/transformation in order to make the data reusable?
  § What related information will be deposited?

Page 20: Hands-On Data Management Planning for Life Sciences

What needs to be in a data management plan for a grant?

Example: NSF
• No more than two pages long
• Reviewed for merit and/or impact with proposal
• Your plan should minimally address five points:
  – Data being produced
  – Format and description
  – Access and sharing
  – Reuse
  – Archiving

Be sure to address additional Directorate or Division guidelines and specific program requirements!
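
As a rough self-check while drafting, the five points above can be treated as a checklist. The sketch below is only an illustration: the function name and the sample draft text are hypothetical and are not part of the DMPTool or any NSF system.

```python
from typing import Dict, List

# The five points a generic NSF plan should minimally address (from the slide).
REQUIRED_POINTS = (
    "Data being produced",
    "Format and description",
    "Access and sharing",
    "Reuse",
    "Archiving",
)


def missing_points(draft: Dict[str, str]) -> List[str]:
    """Return the required points that are absent or left blank in a draft."""
    return [point for point in REQUIRED_POINTS
            if not draft.get(point, "").strip()]


# Hypothetical draft with two sections written and one left empty.
draft = {
    "Data being produced": "Sequencing reads and derived summary tables.",
    "Format and description": "Open file formats with a README per dataset.",
    "Access and sharing": "",
}
print(missing_points(draft))  # ['Access and sharing', 'Reuse', 'Archiving']
```

A check like this only confirms that each point has some text. It says nothing about whether the plan satisfies directorate, division, or program-specific requirements.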

Page 21: Hands-On Data Management Planning for Life Sciences

Three Data Management Planning Resources

• DMPTool, http://dmptool.org – Helps you create a data management plan to meet grant requirements and identify UVA support resources and policies
• Databib, http://databib.org – Helps you find an appropriate place to deposit your data
• Libra, http://libra.virginia.edu – Helps UVA faculty, graduate students, and staff by providing a place to deposit and share datasets

Page 22: Hands-On Data Management Planning for Life Sciences

Step-by-step wizard for generating DMP

Create | edit | re-use | share | save | generate

Open to community

Links to institutional resources

Directorate information & updates

http://dmptool.org

Page 23: Hands-On Data Management Planning for Life Sciences

Goals of the DMPTool

I. To provide researchers a simple way to create a DMP for their funding agency

• Questions asked by the agency
• Additional explanation/context provided by the agency
• Links to the agency website for policies, help, guidance

Page 24: Hands-On Data Management Planning for Life Sciences

Goals of the DMPTool

II. To provide researchers with DMP information from their home institution

• Resources and services to help them manage data
• Help text for specific questions
• Suggested answers to questions; easy to cut and paste
• News & events related to data management on campus

Page 25: Hands-On Data Management Planning for Life Sciences

Last point: Grant requirements versus ideal

• Grant driven
  – Requirements
  – Sharing and public access to research
• Operational
  – Research continuity
  – Avoiding data loss
  – Efficiency

Page 26: Hands-On Data Management Planning for Life Sciences

Team Exercise (30 minutes)

1. Identify a grant that you have or might apply for.
2. Locate the requirements for that grant in the DMPTool.
3. Go through the plan sections in the DMPTool workflow to produce a draft plan.
   – Be sure to address metadata, access policies, and repositories.
4. Identify solutions and available support through DMPTool sections or ask for guidance.
5. Record issues and questions for discussion.

Page 27: Hands-On Data Management Planning for Life Sciences

Presentation of Draft DMPs (15 minutes)

• Identify grant
• Describe project briefly
• Explain requirements
• Describe planned solutions
  – Must address metadata, access policies, and repositories.

Page 28: Hands-On Data Management Planning for Life Sciences

Questions and Discussion?

Page 29: Hands-On Data Management Planning for Life Sciences

Follow-up

• Contact the Scientific Data Consulting Group for help with DMP preparation
  – Grant driven: http://www2.lib.virginia.edu/brown/data/DMP_Support.html
  – Operational
• Email: [email protected]
