data managementbasics issr_20130301

Data Management Basics: A Workshop for Graduate Students, March 1, 2013

Upload: rebecca-reznik-zellen

Post on 01-Nov-2014


DESCRIPTION

Data Management Basics workshop slides for

TRANSCRIPT

Page 1: Data managementbasics issr_20130301

Data Management Basics
A Workshop for Graduate Students
March 1, 2013

Page 2: Data managementbasics issr_20130301


WHY MANAGE DATA?


Page 3: Data managementbasics issr_20130301

1. Funders Require It
• National Institutes of Health: Data Sharing Policy (2003)
  • All grants funded at $500K or above must include a Data Sharing Plan
• National Science Foundation: Data Management Plan Requirement (2011)
  • All proposals must submit a two-page supplementary “Data Management Plan” to describe how projects will comply with NSF data sharing policy
• National Endowment for the Humanities: Sustainability and Data Management Plans Requirement (2012)
  • Digital Humanities Implementation Grants must include a plan to discuss how data will be managed, disseminated, and preserved
• OSTP Directive to Funding Agencies (2013)
  • Federal agencies with more than $100M in R&D expenditures must ensure that published results of federally funded research, including data, are freely available to the public within one year of publication

Page 4: Data managementbasics issr_20130301

National Science Foundation
• Data Management Plan Requirement
  • How projects will conform to NSF data sharing policy
  • Flexible
    • “The plan should reflect best practices in your area of research, and should be appropriate to the data you generate.”
• Directorate for Social, Behavioral and Economic Sciences
  • Discipline-specific guidelines
    • Archeology (Digital Archaeological Record)
    • Economics (American Economic Association)
• Universals (for the NSF Universe)
  • What data are generated by your research?
  • What is your plan for managing the data?

Page 5: Data managementbasics issr_20130301

2. It Makes Life Easier
• For you…
  • Increases efficiency
    • Easier to understand the data collected throughout the life cycle of the project
    • Easier to find the data that you need throughout the life cycle of the project
  • Satisfies applicable legal obligations
  • Addresses preservation, documentation, verification issues
  • Helps reviewers understand the characteristics of your data
  • Increases citation rates for articles
• For others…
  • Provides continuity: other researchers can build on your data
  • Enhances longevity and usability
  • Facilitates new discoveries
  • Supports open access

Page 6: Data managementbasics issr_20130301

3. It’s the Right Thing To Do

Responsible Conduct of Research/Research Ethics
• Data Acquisition, Management, Sharing and Ownership
  • Using the appropriate research method
  • Providing attention to detail
  • Obtaining appropriate permissions
  • Recording data accurately and securely
  • Maintaining data so that it can confirm research findings, establish priority, and be reanalyzed by other researchers
  • Storing data so that it stays confidential, is secure from physical and electronic damage, destruction or theft, and is kept for the time frame dictated by sponsor and University policies

Compliance
• Research using Human Subjects (Institutional Review Board)

Page 7: Data managementbasics issr_20130301

SMART DATA PRACTICES

Naming Your Files
Organizing Your Data
Backup and Storage
Post-Project Considerations

Page 8: Data managementbasics issr_20130301

Organizing Your Data
• Getting Started
  • Consider your goals
    • What do you want to get out of managing your data?
    • What is the most efficient way to organize your data?
  • Figure out your criteria for keeping data
  • Think about where you want your data to end up

Page 9: Data managementbasics issr_20130301


filename = chief identifier for a research data file


Page 10: Data managementbasics issr_20130301

File naming and labeling
• Organization
• Context
• Consistency

Page 11: Data managementbasics issr_20130301

Some potential components for your file naming strategy
• Version number
• Date of creation
• Name of creator
• Description of content
• Name of individual/research team/department
• Publication date
• Project number

Page 12: Data managementbasics issr_20130301


Organizing Your Data


W. E. B. Du Bois, Niagara delegate meeting, Boston, 1907. W. E. B. Du Bois Papers (MS 312). Special Collections and University Archives, University Libraries, University of Massachusetts Amherst

Page 13: Data managementbasics issr_20130301

Organizing Your Data
• Let’s Clean Up Those File Names
  • abcdefghijklmnopqrstuvwxyz.jpg
    • doesn’t make much sense, does it?
  • How about:
    • 20120925_credo_du_bois_rrz_001.jpg
  • And I put it in a directory called:
    • credo_du_bois

Page 14: Data managementbasics issr_20130301

Organizing Your Data
• Why this structure?
  • Oh, I just made it up! But I’m going to be consistent
    • 20120925 = date I found the image
    • credo = database/collection where I found the image
    • du_bois = image subject
    • rrz = my initials (I am working in a group!)
    • 001 = an accession number (I made that up, too, but I’ll continue to use that schema)
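The naming scheme on this slide can be assembled by a small helper so that every file follows the same pattern. This is a minimal sketch, not part of the workshop materials: the function name `build_filename` and its parameters are hypothetical, and lowercasing and zero-padding to three digits are assumptions read off the example name.

```python
from datetime import date


def build_filename(found_on: date, collection: str, subject: str,
                   initials: str, accession: int, extension: str) -> str:
    """Assemble a name of the form YYYYMMDD_collection_subject_initials_NNN.ext."""
    parts = [
        found_on.strftime("%Y%m%d"),          # date the item was found
        collection.lower(),                   # database/collection
        subject.lower().replace(" ", "_"),    # subject, spaces become underscores
        initials.lower(),                     # who handled the file
        f"{accession:03d}",                   # accession number, zero-padded (assumed width 3)
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


print(build_filename(date(2012, 9, 25), "credo", "du bois", "rrz", 1, "jpg"))
# prints 20120925_credo_du_bois_rrz_001.jpg
```

Generating names from one function is one way to keep the promise of consistency made on the slide; the components themselves remain a personal choice.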

Page 15: Data managementbasics issr_20130301

BAD naming practices
• Using generic data file names that may conflict when moved from one location to another
• Failing to think about scale
• Using special characters in a filename such as: & * % $ £ ] { ! @
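A quick check for special characters can be scripted. The sketch below is illustrative only: the "safe" set (letters, digits, underscore, hyphen, dot) is an assumption that is stricter than the slide, which lists a few example characters and does not mention spaces. The function names are hypothetical.

```python
import re

# Assumption: anything other than letters, digits, ".", "_" and "-" is flagged.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def problem_characters(filename: str) -> list[str]:
    """Return the distinct flagged characters, in order of first appearance."""
    seen: list[str] = []
    for char in UNSAFE.findall(filename):
        if char not in seen:
            seen.append(char)
    return seen


def safe_name(filename: str) -> str:
    """Replace every flagged character with an underscore."""
    return UNSAFE.sub("_", filename)


print(problem_characters("data&draft!.csv"))  # prints ['&', '!']
print(safe_name("data&draft!.csv"))           # prints data_draft_.csv
```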

Page 16: Data managementbasics issr_20130301

Versioning
• Use ordinal numbers (1,2,3) for major version changes and the decimal for minor changes: v1, v1.1, v2.6
• Beware of using confusing labels: revision, final, final2, definitive_copy
• Discard or delete obsolete versions
• Use an auto-backup facility (if available) rather than saving or archiving multiple versions
• Turn on versioning or tracking in collaborative documents or storage utilities such as Wikis, GoogleDocs, etc.
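Version labels of the form v1, v1.1, v2.6 can be compared numerically, which plain alphabetical sorting does not do (v10 would sort before v2). The following is a rough sketch: it assumes the label appears in the filename as `_v<major>` or `_v<major>.<minor>`, and the file names used are made up for illustration.

```python
import re

# Assumption: the version label looks like "_v2" or "_v2.6" somewhere in the name.
VERSION = re.compile(r"_v(\d+)(?:\.(\d+))?", re.IGNORECASE)


def version_key(filename: str) -> tuple[int, int]:
    """Return (major, minor) so that versions compare as numbers, not text."""
    match = VERSION.search(filename)
    if match is None:
        raise ValueError(f"no version label found in {filename!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # "v1" is treated as "v1.0"
    return (major, minor)


names = ["report_v10.docx", "report_v1.1.docx", "report_v2.6.docx", "report_v1.docx"]
print(sorted(names, key=version_key))
print(max(names, key=version_key))
```

With this key the list sorts as v1, v1.1, v2.6, v10, and `max` picks out the newest file.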

Page 17: Data managementbasics issr_20130301

Quiz! File naming by date

What is the best filename?
A. 2012-09-25_Attachment
B. 25 September 2012 Attachment
C. 25092012attch
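One point in favor of the year-first form in option A is that such names sort chronologically when sorted as plain text, which is what a file browser does. The sketch below shows this with made-up file names; `to_iso` is a hypothetical helper that assumes a `DDMMYYYY_` prefix.

```python
from datetime import datetime

# Hypothetical day-first names: 25 Sep 2012, 1 Oct 2012, 13 Jan 2013.
day_first = ["25092012_attachment", "01102012_attachment", "13012013_attachment"]


def to_iso(name: str) -> str:
    """Rewrite a DDMMYYYY_ prefix as YYYY-MM-DD_ (assumed input pattern)."""
    stamp, rest = name.split("_", 1)
    parsed = datetime.strptime(stamp, "%d%m%Y").date()
    return f"{parsed.isoformat()}_{rest}"


print(sorted(day_first))                      # text order, not chronological
print(sorted(to_iso(n) for n in day_first))   # text order equals date order
```

Sorted as text, the day-first names come out as October 2012, January 2013, September 2012; the year-first names come out in date order.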

Page 18: Data managementbasics issr_20130301

Quiz! File naming by description

What is the best filename?
A. dubois_great_barrington_recent_20120925_old version.docx
B. 2012-09-25_dubois_great_barrington_V1.docx
C. FFTX_2365498_old.docx

Page 19: Data managementbasics issr_20130301

Organizing Your Data
• Organizational methods
  • Hierarchical
  • Tag-based
• Retrieval
  • Location-based
  • Search-based

“Very little skill is needed to actually be organized and efficient…. just the consciousness to put this file or folder in the right place.”

Page 20: Data managementbasics issr_20130301

Organizing Your Data
Use folders!

DuBois
  DuBois_Images
    DuBois_Images/1868-1898/
    DuBois_Images/1898-1928/
  DuBois_Letters
    DuBois_Letters/1868-1898/
    DuBois_Letters/1898-1928/
  DuBois_Newspapers/
  etc.
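A folder layout like this one can be created in one step at the start of a project. This is a sketch under an assumed reading of the slide (one top-level DuBois folder, one subfolder per material type, date-range folders beneath); `create_layout` is a hypothetical name, and the demo writes into a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path

# Assumed reading of the slide's folder diagram.
LAYOUT = [
    "DuBois/DuBois_Images/1868-1898",
    "DuBois/DuBois_Images/1898-1928",
    "DuBois/DuBois_Letters/1868-1898",
    "DuBois/DuBois_Letters/1898-1928",
    "DuBois/DuBois_Newspapers",
]


def create_layout(root: Path) -> list[Path]:
    """Create every folder in LAYOUT under root; existing folders are left alone."""
    created = []
    for relative in LAYOUT:
        folder = root / relative
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Demo in a scratch directory that is removed afterwards.
with tempfile.TemporaryDirectory() as scratch:
    for folder in create_layout(Path(scratch)):
        print(folder.relative_to(scratch))
```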

Page 21: Data managementbasics issr_20130301

Archive what you don’t or won’t need
• Decide what your final data sets are
• Once your project is over, weed out obsolete data and decide what you want to keep for the long-term
• Move files and folders to an ‘Archive’ or ‘Old files’ folder
  • z_archive
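Moving retired files into a z_archive folder can be scripted once you have decided what to retire. A minimal sketch follows; `archive_files` and the file names in the demo are hypothetical, and the choice to refuse overwriting an already archived file is an assumption, not something the slide prescribes.

```python
import shutil
import tempfile
from pathlib import Path


def archive_files(project_dir: Path, names: list[str],
                  archive_name: str = "z_archive") -> Path:
    """Move the named files from project_dir into project_dir/archive_name."""
    archive_dir = project_dir / archive_name
    archive_dir.mkdir(exist_ok=True)
    for name in names:
        target = archive_dir / name
        if target.exists():
            # Assumed policy: never silently overwrite something already archived.
            raise FileExistsError(f"{target} is already in the archive")
        shutil.move(str(project_dir / name), str(target))
    return archive_dir


# Demo in a scratch directory that is removed afterwards.
with tempfile.TemporaryDirectory() as scratch:
    project = Path(scratch)
    (project / "draft_v1.txt").write_text("superseded draft")
    (project / "final_dataset.csv").write_text("id,value\n1,42\n")
    archive_files(project, ["draft_v1.txt"])
    print(sorted(p.name for p in project.iterdir()))
```

The "z_" prefix keeps the archive folder at the bottom of an alphabetical listing, out of the way of active files.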

Page 22: Data managementbasics issr_20130301


Backup and Storage


January 2011: “Stolen laptop contains cancer cure data”

Page 23: Data managementbasics issr_20130301

Backup and Storage
• Backup is an essential component of data management
  • Protect against accidental or malicious data loss
  • Restore original data
• Keep 3 copies
  • Original
  • External Local
  • External Remote
• Consider
  • How much?
  • How frequently?
  • Which media?
  • Synchronization
• Test your system
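One way to "test your system" is to confirm that each backup copy is byte-for-byte identical to the original by comparing checksums. The sketch below uses SHA-256 from Python's standard library; the function names and the demo files are hypothetical, and in practice the copies would live on the external local and external remote media.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copies: list[Path]) -> dict[Path, bool]:
    """Report, for each copy, whether it exists and matches the original."""
    expected = sha256_of(original)
    return {copy: copy.exists() and sha256_of(copy) == expected for copy in copies}


# Demo with stand-in files in a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    base = Path(scratch)
    original = base / "survey.csv"
    good_copy = base / "survey_local_backup.csv"
    bad_copy = base / "survey_remote_backup.csv"
    original.write_bytes(b"id,answer\n1,yes\n")
    good_copy.write_bytes(b"id,answer\n1,yes\n")
    bad_copy.write_bytes(b"id,answer\n1,ye")  # simulated truncated transfer
    for path, ok in copies_match(original, [good_copy, bad_copy]).items():
        print(path.name, "OK" if ok else "MISMATCH")
```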

Page 24: Data managementbasics issr_20130301

Backup and Storage
• Accessibility of data depends on storage media and file format
  • Vulnerable to deterioration
  • Become obsolete over time
• Plan for disruption
• Consider
  • Non-proprietary file formats
  • Different media types in storage strategy
  • Migrate data
  • Unencrypted, uncompressed

Diagram labels: Original, External Local, External Remote

Page 25: Data managementbasics issr_20130301

Backup and Storage
• Security
  • Encryption can be used for safely moving or storing files
    • Encrypting files on storage devices (flash drives)
    • Encryption during file transfer (e.g., WinSCP)
    • Encrypted storage services
• Deleting Data
  • Weed out obsolete data and decide what you want to keep for the long-term
  • Deleting a file does not erase its contents from the storage device
• Other things to Consider
  • How will the data be used?
  • Who pays for storage?

Page 26: Data managementbasics issr_20130301

Post-Project Activities
• Publication? Sharing?
• Intellectual Property
  • Copyright
  • Creative Commons
• Platforms?
  • ScholarWorks@UMass Amherst
  • ICPSR
• Copyright & Information Policy Librarian: Laura Quilter [email protected]

Page 27: Data managementbasics issr_20130301

Data Management is About Planning

Data management will:
• Prevent bad things from happening to your data;
• Make you a more efficient researcher;
• Prepare you for grant management.

Diagram labels: Collection, Description, Storage and Backup, Access

Page 28: Data managementbasics issr_20130301

Data Management Plans
NSF
• The types of data;
• The standards to be used for data and metadata format and content;
• The policies for access and sharing;
• The policies and provisions for re-use, re-distribution, and the production of derivatives; and
• The plans for archiving and for preservation of access.

Page 29: Data managementbasics issr_20130301


RESOURCES


Page 30: Data managementbasics issr_20130301

Planning
• Data Working Group (email [email protected])
  • Digital projects
  • Long-term preservation
  • Assessment
• Web resources
  • UMass Amherst Libraries: General Resources (http://guides.library.umass.edu/datamanagement)
  • Discipline-specific
• Your faculty
• Your mentors
• Your professional associations
• Industry partners
• Public engagement

Page 31: Data managementbasics issr_20130301

Backup and Storage
• Storage
  • Udrive (http://www.oit.umass.edu/udrive)
  • Departmental servers
  • CDs/DVDs/external hard drives
• Filesharing (see http://chronicle.com/blogs/profhacker/protecting-your-data/37350)
  • Dropbox
  • Google Docs
• Cloud Storage
  • Amazon Web Services
  • Rackspace
  • Microsoft Azure
  • Sugar Sync
• Additional Information
  • MIT on Backups and Security: http://libraries.mit.edu/guides/subjects/data-management/backups.html
  • UK Data Archive on Data Storage: http://www.data-archive.ac.uk/create-manage/storage
  • UK Preservation Office “Caring for CDs and DVDs”: http://www.bl.uk/blpac/pdf/cd.pdf

Page 32: Data managementbasics issr_20130301

Tools

Information Management
• Devonthink: http://www.devontechnologies.com
• Yojimbo: http://www.barebones.com/products/yojimbo
• EverNote: http://www.evernote.com/about/home.php
• Scribe (Mac, Windows, Free): http://chnm.gmu.edu/tools/scribe/
• Springpad: http://springpadit.com/home

Citation Management
• Mendeley: http://www.mendeley.com/features/
• Zotero: http://www.zotero.org/
• RefWorks: http://guides.library.umass.edu/refworksatumass

Desktop Search Tools
• Windows Search: http://www.microsoft.com/en-us/download/details.aspx?id=23
• UltraSearch: http://www.jam-software.com/ultrasearch/
• Locate 32: http://locate32.cogit.net/

Tagging Tools
• Tabbles: http://tabbles.net/
• TaggTool: http://www.taggtool.com/index.php
• TaggedFrog: http://lunarfrog.com/taggedfrog/

Tool Directories
• Bamboo DiRT: http://dirt.projectbamboo.org/
• CHNM Research + Tools: http://chnm.gmu.edu/research-and-tools/

Page 33: Data managementbasics issr_20130301

Sources
• MIT Data Management (http://libraries.mit.edu/guides/subjects/data-management/)
• UK Data Archive (http://www.data-archive.ac.uk/)
• MANTRA (http://datalib.edina.ac.uk/mantra/organisingdata.html)
• Creating Order from Chaos: 9 Great Ideas for Managing Your Computer Files (http://www.makeuseof.com/tag/creating-order-chaos-9-great-ideas-managing-computer-files/)
• Research Information Management: Tools for the Humanities (http://sudamih.oucs.ox.ac.uk/docs/Generic%20Courses/Tools%20for%20the%20Humanities%20course%20book.docx)

Page 34: Data managementbasics issr_20130301


Questions/contact

[email protected]
