Christine Borgman keynote

Data, Data, Everywhere, Nor Any Drop to Drink. Keynote presentation, Research Data Alliance, Fourth Plenary Meeting, Amsterdam, September 2014. Christine L. Borgman, Professor and Presidential Chair in Information Studies, University of California, Los Angeles. Gustave Dore, Rime of the Ancient Mariner, Woodcut, 1798

Upload: research-data-alliance

Post on 25-May-2015

DESCRIPTION

RDA Fourth Plenary Keynote - Prof. Christine L. Borgman, Professor and Presidential Chair in Information Studies at UCLA: "Data, Data, Everywhere, Nor Any Drop to Drink." Tuesday 23rd Sept 2014, Amsterdam, the Netherlands https://rd-alliance.org/plenary-meetings/fourth-plenary/plenary4-programme.html

TRANSCRIPT

Page 1: Christine borgman keynote

Data, Data, Everywhere, Nor Any Drop to Drink

Keynote presentation
Research Data Alliance, Fourth Plenary Meeting
Amsterdam, September 2014

Christine L. Borgman
Professor and Presidential Chair in Information Studies
University of California, Los Angeles

Gustave Dore, Rime of the Ancient Mariner, Woodcut, 1798

Page 2: Christine borgman keynote

Gustave Dore, Ancient Mariner Illustration, 1798

Day after day, day after day,
We stuck, nor breath nor motion;
As idle as a painted ship
Upon a painted ocean.

Water, water, every where,
And all the boards did shrink;
Water, water, every where,
Nor any drop to drink.

Stanzas from The Rime of the Ancient Mariner
Samuel Taylor Coleridge, 1798

Page 3: Christine borgman keynote
Page 4: Christine borgman keynote

Big Data, Little Data, No Data: Scholarship in the Networked World*

• Part I: Data and Scholarship
  – Ch 1: Provocations
  – Ch 2: What Are Data?
  – Ch 3: Data Scholarship
  – Ch 4: Data Diversity

• Part II: Case Studies in Data Scholarship
  – Ch 5: Data Scholarship in the Sciences
  – Ch 6: Data Scholarship in the Social Sciences
  – Ch 7: Data Scholarship in the Humanities

• Part III: Data Policy and Practice
  – Ch 8: Releasing, Sharing, and Reusing Data
  – Ch 9: Credit, Attribution, and Discovery
  – Ch 10: What to Keep and Why

*C. L. Borgman (2015, January). MIT Press.

Page 5: Christine borgman keynote

Neelie Kroes, VP European Commission:

• To collect, curate, preserve and make available ever-increasing amounts of scientific data, new types of infrastructures will be needed. The potential benefits are enormous but the same is true for the costs. We therefore need to lay the right foundations and the sooner we start the better.

Wood, J., Andersson, T., Bachem, A., Best, C., Genova, F., Lopez, D. R., … Hudson, R. L. (2010). Riding the wave: How Europe can gain from the rising tide of scientific data. Final report of the High Level Expert Group on Scientific Data. Retrieved from http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf

Page 6: Christine borgman keynote

Precondition:

Researchers share data

Page 7: Christine borgman keynote

Scholars’ perspectives on data sharing

• Rewards
• Responsibility
• Data
• Incentives

Persistent URL: photography.si.edu/SearchImage.aspx?id=5799
Repository: Smithsonian Institution Archives

Page 8: Christine borgman keynote

Rewards

• Publications
• Publications
• Publications
• Publications
• Publications
• Publications
• Grants
• Awards and honors
• Teaching
• Service
• Data

http://blog.starjreshtoday.com/Portals/170402/images/improve-credit-score1.jpg

Page 9: Christine borgman keynote

Functions of Scholarly Publications

• Legitimization
  – Authority, quality
  – Priority, trustworthiness

• Dissemination
  – Awareness
  – Diffusion
  – Publicity

• Access, preservation, curation
  – Availability
  – Discovery
  – Retrieval
  – Persistence

Borgman, C.L. (2007). Scholarship in the Digital Age: Information, Infrastructure, and the Internet. MIT Press.

Page 10: Christine borgman keynote

Scholars’ perspectives on data sharing

• Rewards
• Responsibility
• Data
• Incentives

Persistent URL: photography.si.edu/SearchImage.aspx?id=5799
Repository: Smithsonian Institution Archives

Page 11: Christine borgman keynote

Responsibility

Publications are arguments made by authors, and data are the evidence used to support the arguments.

C.L. Borgman (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press

Page 12: Christine borgman keynote

Responsibility

• Publications
  – Independent units
  – Authorship is negotiated

• Data
  – Compound objects
  – Ownership is rarely clear
  – Attribution

• Long term responsibility: Investigators
• Expertise for interpretation: Data collectors and analysts

hudsonalpha.org

Page 13: Christine borgman keynote

Attribution of data

• Legal responsibility
  – Licensed data
  – Specific attribution required

• Scholarly credit: contributorship
  – “Author” of data
  – Contributor of data to this publication
  – Colleague who shared data
  – Software developer
  – Data collector
  – Instrument builder
  – Data curator
  – Data manager
  – Data scientist
  – Field site staff
  – Data calibration
  – Data analysis, visualization
  – Funding source
  – Data repository
  – Lab director
  – Principal investigator
  – University research office
  – Research subjects
  – Research workers, e.g., citizen science…

Page 14: Christine borgman keynote

Scholars’ perspectives on data sharing

• Rewards
• Responsibility
• Data
• Incentives

Persistent URL: photography.si.edu/SearchImage.aspx?id=5799
Repository: Smithsonian Institution Archives

Page 15: Christine borgman keynote

http://www.census.gov/population/cen2000/map02.gif

What are data?

ncl.ucar.edu
http://onlineqda.hud.ac.uk/Intro_QDA/Examples_of_Qualitative_Data.php

Marie Curie’s notebook, aip.org

hudsonalpha.org

NASA Astronomy Picture of the Day

Page 16: Christine borgman keynote


Page 17: Christine borgman keynote


Page 18: Christine borgman keynote

Center for Embedded Networked Sensing

• NSF Science & Tech Ctr, 2002-2012
• 5 universities, plus partners
• 300 members
• Computer science and engineering
• Science application areas

Slide by Jason Fisher, UC-Merced, Center for Embedded Networked Sensing (CENS)

Page 19: Christine borgman keynote

CENS data variation

Center for Embedded Networked Sensing: UCLA, USC, UCR, Caltech, UCM

Data categories shown in the figure:
• Sensor-collected application data
• Sensor-collected proprioceptive data
• Sensor-collected performance data
• Hand-collected application data

Variables shown in the figure: flow, water depth, ammonium, ammonia, phosphate, water temp, pH, temperature, conductivity, chlorophyll, GPS/location, time, sap flow, CO2, humidity, rainfall, packets transmitted, packets received, ORP, PAR, motor speed, rudder angle, heading, roll/pitch/yaw, soil moisture, nitrate, calcium, chloride, water potential, wind speed, wind direction, wind duration, leaf wetness, routing table, neighbor table, fault detection, awake time, organism presence, organism concentration, battery voltage, mercury, methylmercury, nutrient concentration, nutrient presence, LandSat images, Mosscam, CDOM, bird calls

Borgman, et al. (2007). Drowning in data: Digital library architecture to support scientific use of embedded sensor networks. JCDL

Page 20: Christine borgman keynote

Documenting Data for Interpretation

Engineering researcher: “Temperature is temperature.”

Biologist: “There are hundreds of ways to measure temperature. ‘The temperature is 98’ is low-value compared to, ‘the temperature of the surface, measured by the infrared thermopile, model number XYZ, is 98.’ That means it is measuring a proxy for a temperature, rather than being in contact with a probe, and it is measuring from a distance. The accuracy is plus or minus .05 of a degree. I [also] want to know that it was taken outside versus inside a controlled environment, how long it had been in place, and the last time it was calibrated, which might tell me whether it has drifted.”

CENS Robotics team
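The biologist's list of context can be read as a record layout. The sketch below is only an illustration of that idea in Python: the class name, the field names, and the choice of which fields are optional are assumptions made here, not something taken from CENS or from the talk.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class TemperatureReading:
    """One temperature value plus the context the biologist asks for.

    All field names are hypothetical; the quote names the kinds of
    context, not a schema.
    """
    value: float                       # e.g. 98 (the quote gives no unit)
    measured_object: str               # what was measured, e.g. "surface"
    instrument: str                    # e.g. "infrared thermopile"
    instrument_model: str              # e.g. "XYZ"
    in_contact: bool                   # False for a remote (proxy) measurement
    accuracy: float                    # plus or minus, in degrees
    setting: Optional[str] = None      # "outside" or "controlled environment"
    days_in_place: Optional[int] = None
    last_calibrated: Optional[date] = None

    def missing_context(self) -> List[str]:
        """Return the names of context fields that were not recorded."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


reading = TemperatureReading(
    value=98.0,
    measured_object="surface",
    instrument="infrared thermopile",
    instrument_model="XYZ",
    in_contact=False,
    accuracy=0.05,
    setting="outside",
)
print(reading.missing_context())
```

A reading that carries only `value` would correspond to "the temperature is 98"; the remaining fields are what make the number interpretable by someone who did not collect it.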

Page 21: Christine borgman keynote

Center for Dark Energy Biosphere Investigations

Repository for seafloor cores. Photo: Peter Darch

International Ocean Discovery Program Iodp.tamu.org

• NSF Science & Tech Ctr, 2010-2020
• 20 universities, plus partners (35 institutions)
• 90 scientists
• Biological sciences
• Physical sciences

Page 22: Christine borgman keynote

Social science data

http://dss.princeton.edu/images/gss.gif

Page 23: Christine borgman keynote

Social science data

http://dss.princeton.edu/images/gss.gif

Page 24: Christine borgman keynote

Data are representations of observations, objects, or other entities used as evidence of phenomena for the purposes of research or scholarship.

C.L. Borgman (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press

hudsonalpha.org

Page 25: Christine borgman keynote

Scholars’ perspectives on data sharing

• Rewards
• Responsibility
• Data
• Incentives

Persistent URL: photography.si.edu/SearchImage.aspx?id=5799
Repository: Smithsonian Institution Archives

Page 26: Christine borgman keynote

Incentives

• Publications that report the research
vs.
• Data that are reusable by others

Image: Alyssa Goodman, Harvard Astronomy

Page 27: Christine borgman keynote

Pepe, A., Mayernik, M. S., Borgman, C. L. & Van de Sompel, H. (2010). From Artifacts to Aggregations: Modeling Scientific Life Cycles on the Semantic Web. Journal of the American Society for Information Science and Technology, 61(3): 567–582.

Page 28: Christine borgman keynote

Metadata

• Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.*
  – descriptive
  – structural
  – administrative

*National Information Standards Organization 2004
photo by @kissane
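As a rough illustration of the three metadata types named on this slide, the sketch below groups a dataset's metadata under those headings. Only the three category names come from the slide; the class, the keys, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataRecord:
    """Metadata for one information resource, grouped by type."""
    descriptive: Dict[str, str] = field(default_factory=dict)     # what it is
    structural: Dict[str, str] = field(default_factory=dict)      # how its parts fit together
    administrative: Dict[str, str] = field(default_factory=dict)  # how it is managed

    def empty_categories(self) -> List[str]:
        """Return the metadata types for which nothing has been recorded."""
        names = ("descriptive", "structural", "administrative")
        return [name for name in names if not getattr(self, name)]


# Hypothetical example values, for illustration only.
record = MetadataRecord(
    descriptive={"title": "Stream temperature readings", "creator": "Field team"},
    structural={"layout": "one CSV file per deployment"},
)
print(record.empty_categories())
```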

Page 29: Christine borgman keynote

Provenance

• Libraries: Origin or source
• Museums: Chain of custody
• Internet: Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness.*

*World Wide Web Consortium (W3C) Provenance working group

British Library, provenance record: Bestiary - caption: 'Owl mobbed by smaller birds'
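The W3C definition above speaks of entities, activities, and people. A minimal sketch of how such records might be linked and traced is shown below. It borrows only those three notions from the definition; the class layout, the `lineage` helper, and the example names are assumptions for illustration and are not the W3C data model or any library's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Activity:
    """Something that happened, carried out by a person or group."""
    name: str
    agent: str                                  # who was responsible
    used: List[str] = field(default_factory=list)


@dataclass
class Entity:
    """A piece of data or a thing whose origin we want to trace."""
    name: str
    generated_by: Optional[str] = None          # name of the producing activity
    derived_from: Optional[str] = None          # name of the source entity


def lineage(entities: Dict[str, Entity], name: str) -> List[str]:
    """Walk derived_from links back from `name` to the earliest known source."""
    chain: List[str] = []
    current: Optional[str] = name
    while current is not None and current not in chain:  # guard against cycles
        chain.append(current)
        entity = entities.get(current)
        current = entity.derived_from if entity else None
    return chain


# Hypothetical example: a figure derived from calibrated sensor readings.
activities = {
    "deployment": Activity("deployment", agent="field team"),
    "calibration": Activity("calibration", agent="data analyst", used=["raw_readings"]),
}
entities = {
    "raw_readings": Entity("raw_readings", generated_by="deployment"),
    "calibrated_readings": Entity(
        "calibrated_readings", generated_by="calibration", derived_from="raw_readings"
    ),
    "figure": Entity("figure", derived_from="calibrated_readings"),
}
print(lineage(entities, "figure"))
```

Following the chain back from a published figure to the raw readings, and to the people behind each step, is the kind of information a reader could use to judge quality, reliability, or trustworthiness.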

 

Page 30: Christine borgman keynote

Reuse across place and time

• Reuse by investigator
• Reuse by collaborators
• Reuse by colleagues
• Reuse by unaffiliated others
• Reuse at later times
  – Months
  – Years
  – Decades
  – Centuries

http://chandra.harvard.edu/photo/2013/kepler/kepler_525.jpg

Page 31: Christine borgman keynote

Gustave Dore, Ancient Mariner Illustration, 1798

Day after day, day after day,
We stuck, nor breath nor motion;
As idle as a painted ship
Upon a painted ocean.

Water, water, every where,
And all the boards did shrink;
Water, water, every where,
Nor any drop to drink.

Stanzas from The Rime of the Ancient Mariner
Samuel Taylor Coleridge, 1798

Page 32: Christine borgman keynote

Emerging themes in data practices

• Scarcity or abundance of data
• Centrality of data to research
• Time frame of research
• Heterogeneity of expertise
• Maturity of standards
• Community building

Borgman, C. L., et al. (2014). The Ups and Downs of Knowledge Infrastructures in Science: Implications for Data Management. IEEE/ACM Digital Libraries Conference, London

Page 33: Christine borgman keynote

Economics of the Knowledge Commons

                        Subtractability / Rivalry
Exclusion    Low                           High
Difficult    Public goods:                 Common-pool resources:
             general knowledge,            libraries,
             public domain data            data archives
Easy         Toll or club goods:           Private goods:
             subscription journals,        printed books,
             subscription data             raw or competitive data

Adapted from C. Hess & E. Ostrom (Eds.), Understanding knowledge as a commons: From theory to practice. MIT Press.
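The two-by-two grid above can also be written as a small lookup. The sketch below encodes the four cells exactly as they appear on the slide; the function and parameter names are hypothetical.

```python
from enum import Enum


class GoodType(Enum):
    PUBLIC = "Public goods"
    COMMON_POOL = "Common-pool resources"
    TOLL_OR_CLUB = "Toll or club goods"
    PRIVATE = "Private goods"


def classify(exclusion_is_difficult: bool, subtractability_is_high: bool) -> GoodType:
    """Map the two dimensions of the grid to one of its four cells."""
    grid = {
        (True, False): GoodType.PUBLIC,        # e.g. public domain data
        (True, True): GoodType.COMMON_POOL,    # e.g. data archives
        (False, False): GoodType.TOLL_OR_CLUB, # e.g. subscription data
        (False, True): GoodType.PRIVATE,       # e.g. raw or competitive data
    }
    return grid[(exclusion_is_difficult, subtractability_is_high)]


# Data archives: hard to exclude users, high subtractability.
print(classify(exclusion_is_difficult=True, subtractability_is_high=True).value)
```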

Page 34: Christine borgman keynote

To share data, scholars need

• Fresh water
  – Tools
  – Services
  – Skills
  – Resources
  – Incentives

http://environment.nationalgeographic.com/environment/habitats/water-pressure/

Page 35: Christine borgman keynote

To share data, scholars need

• Life boats
  – Repositories
  – Governance models
  – Provenance models
  – Data stewardship workforce

Patent Model, Life Boat, 1841; Smithsonian American History Museum

Page 36: Christine borgman keynote

Knowledge Infrastructures

Image: Alyssa Goodman, Harvard Astronomy

Page 37: Christine borgman keynote

http://knowledgeinfrastructures.org

Page 38: Christine borgman keynote

Acknowledgements

UCLA Data Practices team

• Peter Darch, Milena Golshan, Irene Pasquetto, Ashley Sands, Sharon Traweek

• Former members: Rebekah Cummings, David Fearon, Ariel Hernandez, Elaine Levia, Jaklyn Nunga, Matthew Mayernik, Alberto Pepe, Kalpana Shankar, Katie Shilton, Jillian Wallis, Laura Wynholds, Kan Zhang

• Research funding: National Science Foundation, Alfred P. Sloan Foundation, Microsoft Research

• University of Oxford: Balliol College, Oliver Smithies Fellowship, Oxford Internet Institute, Oxford eResearch Center, Bodleian Library