Data Organization
Data Collection and Spreadsheets
Consistent Data Organization
• Spreadsheets (such as those found in Excel) are sometimes a necessary evil
– They allow “shortcuts” that can leave your data not machine-readable
• But there are some simple steps you can take to ensure that you are creating spreadsheets that are machine-readable and will withstand the test of time
Spreadsheets
From NASA Environmental Data Management Best Practices Webinar: Bob Cook
Spreadsheet Best Practices
• Include a header line as the 1st line (or record)
• Label each column with a short but descriptive name
– Names should be unique
– Use letters, numbers, or “_” (underscore)
– Do not include blank spaces or symbols (+ - & ^ *)
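The naming rules above can be checked automatically before a file is shared. The following is a minimal sketch, assuming the column names have already been read into a list; the function name `check_header` is hypothetical and not part of the webinar material.

```python
import re

# Assumed rule set, taken from the bullets above:
# only letters, digits, and underscore are allowed in a column name.
VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_header(names):
    """Return a list of problems found in a list of column names."""
    problems = []
    seen = set()
    for name in names:
        # fullmatch rejects blanks, symbols, and empty names.
        if not VALID_NAME.fullmatch(name):
            problems.append(f"invalid characters in {name!r}")
        if name in seen:
            problems.append(f"duplicate name {name!r}")
        seen.add(name)
    return problems


if __name__ == "__main__":
    for problem in check_header(["site_id", "air temp", "precip", "precip"]):
        print(problem)
```

An empty result means the header line follows the rules. The check says nothing about whether a name is descriptive, which still needs a human reader.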
More Spreadsheet Best Practices
• Columns of data should be consistent
– Use the same naming convention for text data
• Each line should be “complete”
• Columns should include only a single kind of data
– Text or “string” data
– Integer numbers
– Floating point or real numbers
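The "complete line" and "single kind of data" rules can also be tested on a spreadsheet exported as CSV. This is a rough sketch under stated assumptions: the data is comma-separated text with a header line, and the helper names `classify` and `check_table` are made up for illustration.

```python
import csv
import io


def classify(value):
    """Label a cell as 'int', 'float', or 'text' (hypothetical helper)."""
    for label, cast in (("int", int), ("float", float)):
        try:
            cast(value)
            return label
        except ValueError:
            pass
    return "text"


def check_table(text):
    """Report incomplete records and columns that mix kinds of data."""
    rows = list(csv.reader(io.StringIO(text)))
    header, records = rows[0], rows[1:]
    problems = []
    kinds = {name: set() for name in header}
    # The header is line 1, so data records start at line 2.
    for line_no, record in enumerate(records, start=2):
        if len(record) != len(header) or "" in record:
            problems.append(f"line {line_no}: incomplete record")
            continue
        for name, value in zip(header, record):
            kinds[name].add(classify(value))
    for name, found in kinds.items():
        if len(found) > 1:
            problems.append(f"column {name}: mixed kinds {sorted(found)}")
    return problems


if __name__ == "__main__":
    sample = "site_id,temp_c,count\nA1,12.5,3\nA2,n/a,4\nA3,13.1\n"
    for problem in check_table(sample):
        print(problem)
```

The check is deliberately strict: a column holding both `12` and `12.5` is reported as mixed, because integers and floating point numbers are treated as different kinds of data.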
More Spreadsheet Best Practices
Use Naming Standards & Codes
• Use commonly accepted label names that describe the contents (e.g., precip for precipitation)
• Use consistent capitalization (e.g., not: temp, Temp, and TEMP in same file)
• Standard codes
– State Postal (VA, MA)
– FIPS Codes for Counties and County Equivalent Entities (http://www.census.gov/geo/reference/codes/cou.html)
Use Standardized Formats
• Use standardized formats for units
– International System of Units (SI): http://physics.nist.gov/Pubs/SP330/sp330.pdf
• ISO 8601 Standard for Date and Time
– YYYY-MM-DDThh:mm:ss.sTZD
– 2009-10-13T09:12:34.9Z
– 2009-10-13T09:12:34.9+05:00
• Spatial Coordinates for Latitude/Longitude
– +/- DD.DDDDD
– -78.476 (longitude)
– +38.029 (latitude)
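Date, time, and coordinate values are easier to keep consistent when they are written by code rather than typed by hand. The sketch below uses Python's standard `datetime` module to write ISO 8601 timestamps in the extended form (with hyphens and colons) and formats coordinates as signed decimal degrees. The function names `iso_timestamp` and `format_coordinate` are hypothetical, and the choice of millisecond precision is an assumption.

```python
from datetime import datetime, timedelta, timezone


def iso_timestamp(dt):
    """Format a timezone-aware datetime as an ISO 8601 string."""
    if dt.tzinfo is None:
        raise ValueError("timestamp needs an explicit time zone")
    # Python writes UTC as +00:00 rather than Z; both are valid ISO 8601.
    return dt.isoformat(timespec="milliseconds")


def format_coordinate(value, limit, decimals=5):
    """Format decimal degrees with an explicit sign, e.g. +38.02900.

    limit is 90 for latitude and 180 for longitude.
    """
    if not -limit <= value <= limit:
        raise ValueError(f"{value} is outside +/-{limit} degrees")
    return f"{value:+.{decimals}f}"


if __name__ == "__main__":
    utc_time = datetime(2009, 10, 13, 9, 12, 34, 900000, tzinfo=timezone.utc)
    local_time = utc_time.replace(tzinfo=timezone(timedelta(hours=5)))
    print(iso_timestamp(utc_time))
    print(iso_timestamp(local_time))
    print(format_coordinate(-78.476, 180))  # longitude
    print(format_coordinate(38.029, 90))    # latitude
```

Rejecting timestamps without a time zone and coordinates outside the valid range catches common entry errors at the point where the value is written.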