introduction to research data management; lecture 01 for grad521
TRANSCRIPT
![Page 1: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/1.jpg)
GRAD 521, Research Data Management
Winter 2014 - Lecture 1
Amanda L. Whitmire, Asst. Professor
![Page 2: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/2.jpg)
Lesson One Outline
Introductions
The importance ofdata management
What is/are ‘data’?
![Page 3: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/3.jpg)
B.S. in Aquatic Biology, 2000Worked in a bioluminescence laboratory
Ph.D. in Oceanography, emphasis in biological oceanography, 2008Dissertation study area: bio-optics; using optical tools to study ocean ecology (N. California Current)
Post-doc in Oceanography, emphasis in biological oceanography, 2008-2012Study area: bio-optics; using optical tools to study ocean ecology in low oxygen zones (N. Chile)
Assistant Professor, Data Management Specialist, Sept. 2012 - present
![Page 4: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/4.jpg)
![Page 5: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/5.jpg)
Course Overview
Overview of research data management, definitions & best practices
Types, formats & stages of research data
Data storage, backup & security
Metadata (data documentation)
Legal & ethical considerations of research data
Data sharing & reuse
Archiving & preservation
![Page 6: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/6.jpg)
Pair & Share
Name
College/Department/Unit/etc.
1st year, 2nd year, etc.
What is/are data?
![Page 7: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/7.jpg)
Why actively manage it?
What is data?
![Page 8: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/8.jpg)
“…the recorded factual material commonly accepted in the scientific community as necessary to validate
research findings.”
Research data is:
U.S. Office of Management and Budget, Circular A-110
8
![Page 9: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/9.jpg)
“Unlike other types of information, research data are collected, observed, or created, for the
purposes of analysis to produce and validate original research results.”
University of Edinburgh
MANTRA Research Data Management Training,
‘Research Data Explained’
What is research data?
![Page 10: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/10.jpg)
Actions that contribute to effective storage, preservation and reuse ofdata and documentation throughout the research lifecycle.
What is data management?
![Page 11: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/11.jpg)
Data management is not:
Data scienceComputational scienceDatabase administrationA research method:
• what data to collect• how to collect them• how to design an experiment
![Page 12: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/12.jpg)
Why Data Management?
![Page 13: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/13.jpg)
Images collected by DataOne.org
![Page 14: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/14.jpg)
Ph
oto
co
urt
esy
of
ww
w.c
arb
oaf
rica
.net
Data is collected from sensors, sensor networks, remote sensing, observations, and more - this calls for increased attention to data management and stewardship
Data deluge
Ph
oto
co
urt
esy
of
htt
p:/
/mo
dis
.gsf
c.n
asa.
gov/
Ph
oto
co
urt
esy
of
htt
p:/
/ww
w.f
utu
rlec
.co
m
CC
imag
e b
y ta
jaio
n F
lickr
CC
imag
e b
y C
IMM
YT o
n F
lickr
Imag
e co
llect
ed b
y V
ivH
utc
hin
son
![Page 15: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/15.jpg)
Source: John Gantz, IDC Corporation: The Expanding Digital Universe
0
100,000
200,000
300,000
400,000
500,000
600,000
700,000
800,000
900,000
1,000,000
2005 2006 2007 2008 2009 2010
Transient information or unfilled demand for storage
Information
Available Storage
Peta
byt
es W
orl
dw
ide
The World of Data Around Us
![Page 16: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/16.jpg)
Natural disaster
Facilities infrastructure failure
Storage failure
Server hardware/software failure
Application software failure
External dependencies (e.g. PKI failure)
Format obsolescence
Legal encumbrance
Human error
Malicious attack by human or automated agents
Loss of staffing competencies
Loss of institutional commitment
Loss of financial stability
Changes in user expectations and requirements
The World of Data Around Us: Data Loss
CC
imag
e b
y Sh
aryn
Mo
rro
w o
n F
lickr
CC
imag
e b
y m
om
bo
leu
mo
n F
lickr
![Page 17: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/17.jpg)
Poor Data Management Affects Everyone
“MEDICARE PAYMENT ERRORS NEAR $20B” | (CNN) December 2004
Miscoding and billing errors from doctors and hospitals totaled $20,000,000,000 in FY2003 (9.3% error rate). The error rate measured claims that were paid despite being medically unnecessary, inadequately documented or improperly coded. In some instances, Medicare asked health care providers for medical records to back up their claims and got no response. The survey did not document instances of alleged fraud. This error rate actually was an improvement over the previous fiscal year (9.8% error rate).
“AUDIT: JUSTICE STATS ON ANTI-TERROR CASES FLAWED” | (AP) February 2007
The Justice Department Inspector General found only two sets of data out of 26 concerning terrorism attacks were accurate. The Justice Department uses these statistics to argue for their budget. The Inspector General said the data “appear to be the result of decentralized and haphazard methods of collections … and do not appear to be intentional.”
“OOPS! TECH ERROR WIPES OUT ALASKA INFO” | (AP) March 2007
A technician managed to delete the data and backup for the $38 billion Alaska oil revenue fund – money received by residents of the State. Correcting the errors cost the State an additional $220,700 (which of course was taken off the receipts to Alaska residents.)
Slide courtesy of BLM
![Page 18: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/18.jpg)
A wildlife biologist for a small field office was the in-house GIS expert and provided support for all the staff’s GIS needs. However, the data was stored on her own workstation. When the biologist relocated to another office, no one understood how the data was stored or managed.
Solution: A state office GIS specialist retrieved the workstation and sifted through files trying to salvage relevant data.
Cost: 1 work month ($4,000) plus thevalue of data that was not recovered
Poor Science Data Management Example
CC
imag
e b
y D
TRav
eo
n
Op
en C
lip A
rt L
ibra
ry
![Page 19: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/19.jpg)
Importance of Data Management
The climate scientists at the centre of a media storm over leaked emails were yesterday cleared of accusations that they fudged their results and silenced critics, but a review found they had failed to be open enough about their work.
![Page 20: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/20.jpg)
![Page 21: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/21.jpg)
Manage your data for yourself:
o Keep yourself organized
o Track your research processes for
reproducibility
o Better control versions of data
oQuality control your data more efficiently
Why Data Management: Researcher Perspective
![Page 22: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/22.jpg)
Make backups to avoid data loss
Format your data for re-use (by yourself or others)
Be prepared: Document your data for your own
recollection, accountability, and re-use (by yourself or others)
Prepare it to share it – gain credibility
and recognition for your science efforts!
CC
imag
e b
y U
WW
Res
Net
on
Flic
kr
Why Data Management: Researcher Perspective
![Page 23: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/23.jpg)
Data is a valuable asset
It is expensive & time consuming to collect
Why data management: Foundation to advance science
![Page 24: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/24.jpg)
Well-managed data can result in re-use, integration & new science
Spatio-Temporal Exploratory Models predict the probability of occurrence of bird species across the United States at a 35 km x 35 km grid.
Land Cover
Potential Uses-• Examine patterns of migration • Infer impacts of climate change• Measure patterns of habitat usage• Measure population trends
Model results
eBird
Meteorology
MODIS –Remote sensing data
Occurrence of Indigo Bunting (2008)
Jan Sep DecJunApr
Slide courtesy of DataONE
![Page 25: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/25.jpg)
Data Integration Results
Images court
esy o
f C
orn
ell
Orn
itholo
gy L
ab
http://www.youtube.com/watch?v=Cik6fIuoPDk
![Page 26: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/26.jpg)
Where a majority of data end up now…
![Page 27: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/27.jpg)
Imagine if data were more accessible
![Page 28: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/28.jpg)
New discoveriesA new image processing technique reveals something not before seen in this Hubble Space Telescope image taken 11 years ago: A faint planet (arrows), the outermost of three discovered with ground-based telescopes last year around the young star HR 8799.D. Lafrenière et al., Astrophysical Journal Letters
“The first thing it tells you is how valuable maintaining long-term archives can be. Here is a major discovery that’s been lurking in the data for about 10 years!” comments Matt Mountain, director of the Space Telescope Science Institute in Baltimore, which operates Hubble.
“The second thing its tells you is having a well calibrated archive is necessary but not sufficient to make breakthroughs — it also takes a very innovative group of people to develop very smart extraction routines that can get rid of all the artifacts to reveal the planet hidden under all that telescope and detector structure.”
“Planet hidden in Hubble archives”
Science News Feb. 27, 2009
D. L
afre
niè
reet
al.,
Ap
JLe
tter
s
![Page 29: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/29.jpg)
The data deluge has created a surge of information that needs to be well-managed and made accessible.
The cost of not doing data management can be very high.
Be cognizant of best practices and tools associated with the data lifecycle to manage your data well.
Many benefits are associated with the act of managing data, including the ability to find, access, understand, integrate and re-use data.
Summary
![Page 30: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/30.jpg)
Summary, continued
If data are:
Well-organized
Documented
Preserved
Accessible
Verified as to accuracy and validity
The result is:
High quality data
Easy to share and re-use
Citation & credibility to the researcher
Cost-savings to science
![Page 31: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/31.jpg)
Thursday
Data management plans & the research lifecycle
Homework:Take the pre-assessment survey
(link in Canvas)
![Page 32: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/32.jpg)
Archived slides
![Page 33: Introduction to research data management; Lecture 01 for GRAD521](https://reader034.vdocument.in/reader034/viewer/2022042518/55a6337c1a28ab6c488b46ff/html5/thumbnails/33.jpg)
About You