data citation from the perspective of tracking data reuse
Post on 01-Nov-2014
Data Citation Challenges and Opportunities
from the perspective of Tracking Data Reuse
Heather Piwowar
DataONE postdoc with NESCent and Dryad
@researchremix
DataCite Summer Meeting
August 2011
http://www.metmuseum.org/toah/ho/09/euwf/ho_24.45.1.htm
http://www.flickr.com/photos/jsmjr/62443357/
http://www.flickr.com/photos/camilleharrington/3587294608/
http://www.flickr.com/photos/rkuhnau/3318245976/
http://www.flickr.com/photos/conformpdx/1796399674/
http://www.flickr.com/photos/rkuhnau/3317418699/
http://www.flickr.com/photos/zemlinki/261617721/
http://www.flickr.com/photos/tracenmatt/3020786491/
http://www.flickr.com/photos/the-o/2078239333/
http://www.flickr.com/photos/ryanr/142455033/
?
http://www.flickr.com/photos/archeon/2941655917/
http://upload.wikimedia.org/wikipedia/commons/thumb/e/e6/Gamma_distribution_pdf.svg/500px-Gamma_distribution_pdf.svg.png
http://www.flickr.com/photos/jima/606588905/
http://www.flickr.com/photos/lofaesofa/248546821/
We have observed reuse of at least 35% of the GEO datasets submitted in 2005.
Piwowar, Vision, Whitlock (2011) Data archiving is a good investment. Nature 473, 285
http://researchremix.wordpress.com/2011/05/19/nature-letter/
Tracking 1k
10 * 100 = 1000
[Chart: percentages from 0% to 100% for GEO, Pangaea and TreeBASE; legend: attribution in reference, attribution in footnote, attribution in table, attribution in text; one label reads "(estimate)"]
[Chart: values from 0 to 250 by year, 2004 to 2011; series: GEO, Pangaea, TreeBASE, each with a moving average]
https://notebooks.dataone.org/tracking1000datasets/
Piwowar, Carlson, Vision (2011) Beginning to track 1000 datasets from public repositories into the published literature. ASIS&T poster.
My research blog:
ResearchRemix.wordpress.com
http://www.flickr.com/photos/myklroventine/892446624/
Data citation in the wild IDCC 2010 poster.
A best-practice solution!
http://www.flickr.com/photos/nilsrinaldi/5157809483/
#1
Lack of tool support for our best practice
Research Remix blog: http://bit.ly/aOwLoJ “Tracking Dataset Citations Using Common Citation Tracking Tools Doesn’t Work”
We need more diversity
We need more players
We need start-ups
Abstracts are open. Ref lists should be too.
#2
Our best practice doesn’t scale to mega-reuse
[Chart: axis label "Number of datasets referenced by articles that reused GEO data, cumulative"; axis ticks 0% to 100% and 0 to 30]
http://www.nature.com/nature/authors/gta/
But wait!
#2
Our best practice can scale to mega-reuse (if we work at it)
Another place where having a few big players is a bottleneck
Open reference lists.
#3a
Adoption of best practices erodes incentives in the short term
~70% in multivariate analysis
#3b
Data citations only matter if they are valued
Please don’t tweet or publicize this next bit...
Early results from an ongoing survey.
n=538
[Chart: survey responses (0% to 50%) on a scale from "Strongly disagree" through "Neutral" to "Strongly agree". Statement: "pleased others can build on my work more easily"]
[Chart: same scale. Statement: "pleased I will get more citations"]
Do not publicize
[Chart: same scale. Statement: "pleased it will be valued by my funder"]
[Chart: same scale. Statement: "pleased it will be valued by my promotion or tenure committee"]
Top-down
http://www.nsf.gov/pubs/policydocs/pappguide/nsf08_1/gpg_2.jsp
Bottom-up
Text
DataCite!
http://www.flickr.com/photos/ginable/325235488/
http://www.flickr.com/photos/ryanr/142455033/
http://www.flickr.com/photos/supersam5/216868485/
thank you
Todd Vision,
Jonathan Carlson, Estephanie Sta Maria, Nicholas Weber, Sarah Judson, Valerie Enriquez
Jason Priem and Beyond Impact
Dryad and DataONE teams
The open science online community and those who release their articles, datasets and photos openly
blog: ResearchRemix.wordpress.com
No consistent practice
Sarah Judson, Data citation in the wild IDCC 2010 poster.
We reviewed 500 articles in six major evolution and ecology journals for evidence of data citation:
In 2009, 116 articles cited ORNL DAAC data.
Finding these articles took 70-80 hours
across at least 12 resources, all chosen from a deep understanding of this specific research domain
then the full text of all the hits was manually reviewed
Valerie Enriquez interview with James Kidder
http://openwetware.org/wiki/DataONE:Notebook/Reuse_of_repository_data
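A rough illustration of the first-pass screening step, not the procedure used in the interview: the sketch below assumes article full text is already available as plain strings and flags articles that mention a repository name or a DOI, so that only those go on to manual review. The pattern list, the function names and the sample articles are all hypothetical, and a real screen would need search terms chosen with knowledge of the research domain.

```python
import re

# Hypothetical search patterns for this sketch only. A real screen would
# use repository-specific terms picked by someone who knows the domain.
PATTERNS = {
    "repository name": re.compile(r"\bORNL DAAC\b"),
    "doi": re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+"),
}


def screen_article(full_text):
    """Return the matched strings per pattern label for one article."""
    hits = {}
    for label, pattern in PATTERNS.items():
        found = pattern.findall(full_text)
        if found:
            hits[label] = found
    return hits


def needs_manual_review(articles):
    """Map article id -> hits, keeping only articles with at least one hit."""
    flagged = {}
    for article_id, full_text in articles.items():
        hits = screen_article(full_text)
        if hits:
            flagged[article_id] = hits
    return flagged


# Made-up sample texts, used only to show the call.
sample_articles = {
    "a1": "Data were obtained from the ORNL DAAC archive",
    "a2": "No data source is named here",
    "a3": "See doi:10.1234/abc for the dataset",
}
flagged_articles = needs_manual_review(sample_articles)
```

A screen like this only narrows the pile. Each flagged article still needs a person to read the full text and decide whether the mention is a real case of data reuse.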