Christen and McKendry / Geography 309
Introduction to data analysis
1
Measurement and Instrumentation, Data Analysis
Error in scientific measurement means the inevitable uncertainty that attends all measurements (Fritschen and Gay).
Uncertainties are ubiquitous and therefore no reflection on the usefulness of the measurement or the competence of the measurer.
If we are to rationally use a measurement, the uncertainties must be known quantitatively.
Precision vs Accuracy
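The distinction can be made quantitative. The following is a minimal sketch, assuming repeated readings of a known reference value: accuracy is expressed as the bias (mean reading minus reference) and precision as the sample standard deviation. The readings, the reference value and the function name `accuracy_and_precision` are illustrative assumptions, not part of the course material.

```python
import statistics

def accuracy_and_precision(readings, reference):
    """Return (bias, spread) for repeated readings of a known reference."""
    bias = statistics.mean(readings) - reference   # accuracy: systematic offset
    spread = statistics.stdev(readings)            # precision: random scatter
    return bias, spread

# Hypothetical thermometer readings of a 20.0 degC reference bath.
readings = [20.4, 20.5, 20.6, 20.5, 20.5]
bias, spread = accuracy_and_precision(readings, 20.0)
print(f"bias = {bias:.2f} degC, spread = {spread:.2f} degC")
```

In this made-up case the sensor is precise (small spread) but not accurate (bias of about half a degree).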
Measurement Systems
INSTRUMENT: a device that contains at least a sensor, a signal conditioning device and a data display.
SENSOR: interacts with the variable to be measured (the measurand) and generates an output signal proportional to that variable.
TRANSDUCER: a device that converts energy from one form to another (an instrument may include several in addition to the primary transducer, the sensor).
Sources of Error
Static: measured when the input is held steady and after calibration has been applied, e.g. random electrical "noise" or effects of unwanted inputs such as temperature.
Dynamic: due to changing inputs, e.g. time lag.
Drift: physical changes in the sensor over time.
Exposure: imperfect coupling between sensor and measurand, e.g. for temperature: radiation, conduction, dead air around the sensor.
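The dynamic error caused by time lag can be illustrated with a simple model. The sketch below assumes a first-order sensor response (the reading closes a fixed fraction of the gap to the true value in each time step); this model, the time constant of 30 s and the function name `first_order_response` are assumptions chosen for illustration, not taken from the slides.

```python
import math

def first_order_response(true_values, dt, tau, initial):
    """Simulate a sensor with time constant tau (s), sampled every dt (s)."""
    alpha = 1.0 - math.exp(-dt / tau)   # fraction of the gap closed per step
    reading = initial
    out = []
    for value in true_values:
        reading += alpha * (value - reading)
        out.append(reading)
    return out

# Step change from 10 to 20 degC, hypothetical time constant of 30 s.
step = [20.0] * 60
readings = first_order_response(step, dt=1.0, tau=30.0, initial=10.0)
lag_error_after_tau = 20.0 - readings[29]   # remaining error after 30 s
print(f"error after one time constant: {lag_error_after_tau:.2f} degC")
```

After one time constant the modelled sensor has covered about 63 % of the step, so roughly 37 % of the change is still missing from the reading.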
Instrument Platforms
Data-sets and data analysis.
Keep it simple.
Image: UCAR Digital Image Library
A classical research approach.
[Diagram: three phases (Design, Experiment, Analysis) shown across three columns (Creative part, 'Hands-on', Documentation).]
Design: hypothesis, motivation, possibilities; experiment design; analysis plan; information; project proposal.
Experiment: lab and field measurements; data; experiment documentation.
Analysis: data analysis; analysis documentation; conclusions, hypothesis verification / falsification; report.
Documentation: Meta-data
• Document your ideas and plans → proposal.
• Document your instrumentation (specifications, manufacturers, serial numbers).
• Document your sampling strategy (how, where and when...).
• Document your data files (parameter, units, time zone, location).
• Document your analysis (data filtering, data selection criteria, statistical methods).
• Document and prove your conclusions → report.
Keep a log or field notebook.
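One possible way to keep such meta-data together with a data file is a commented header block. This is only a sketch: the field names and all values are placeholders invented for illustration, and `metadata_header` is a hypothetical helper, not a standard format.

```python
import json

# All values below are placeholders for illustration.
metadata = {
    "parameter": "air temperature",
    "units": "degC",
    "time_zone": "UTC-8",
    "location": {"latitude": None, "longitude": None, "height_m": None},
    "instrument": {"manufacturer": None, "model": None, "serial_number": None},
    "sampling": "10 min averages",
    "processing": [],   # append one entry per filtering or correction step
}

def metadata_header(meta):
    """Render meta-data as '#'-prefixed lines to put at the top of a data file."""
    text = json.dumps(meta, indent=2, sort_keys=True)
    return "\n".join("# " + line for line in text.splitlines())

header = metadata_header(metadata)
print(header)
```

Because the header is plain JSON behind a comment prefix, it stays readable in a text editor and can still be parsed by a script later.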
What will your data look like? Data format

Continuous data   Classification   Boolean data
-5.24             Cold             Yes
-3.20             Cold             Yes
 2.32             Cool             No
 9.63             Cool             No
14.15             Warm             No
17.83             Warm             No
27.08             Hot              No
16.34             Warm             No
10.22             Cool             No
 4.43             Cool             No
-1.32             Cold             Yes

[Each column runs along the data dimension.]
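The three formats on this slide can be derived from one another. The sketch below converts the continuous values into classes and into a Boolean flag. The class thresholds (0, 12 and 20) and the Boolean question ("below zero?") are assumptions chosen so that the output matches the labels on the slide; the slide itself does not state them.

```python
def classify(temp):
    """Map a continuous value to a class (thresholds are assumed)."""
    if temp < 0.0:
        return "Cold"
    if temp < 12.0:
        return "Cool"
    if temp < 20.0:
        return "Warm"
    return "Hot"

def below_zero(temp):
    """Boolean form of the same data (assumed question: is the value below zero?)."""
    return "Yes" if temp < 0.0 else "No"

temps = [-5.24, -3.20, 2.32, 9.63, 14.15, 17.83, 27.08, 16.34, 10.22, 4.43, -1.32]
classes = [classify(t) for t in temps]
flags = [below_zero(t) for t in temps]

for t, c, f in zip(temps, classes, flags):
    print(f"{t:6.2f}  {c:5s}  {f}")
```

Each step from continuous to classified to Boolean data discards information, so it is usually safer to store the continuous values and derive the other formats when needed.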
Field data in physical geography.
Any natural environment is a complex, multivariate web of interacting variables.
We can never measure everything, everywhere, continuously.
We sample selected data with an appropriate strategy.
Dimensions.
Strictly speaking, we sample any physical, chemical or biological variable in a four-dimensional setting, i.e. as a function of time t and space x (x, y, z).
Luckily, quite often we are not interested in all four dimensions. Typically, we focus on one or two dimensions for logistical reasons or because the study object allows this. You might assume that variability in one dimension is much smaller than in another (homogeneity and stationarity criteria).
→ In your project you will implicitly include certain dimensions and exclude others.
Dimensions.
Basic dimension / Data-set / Resolution, each followed by examples of 1-D data:

Time / Time series / Temporal resolution
• One day of 10 min temperature measurements at a climate station.
• One year of hourly discharge measurements from a stream.

Space / Horizontal profile / Spatial resolution
• Vegetation classification along a traverse.
• A horizontal profile of snow water content along a line.

Space / Vertical profile / Spatial resolution
• A tethered balloon run measuring wind with height.
• Temperature change in soil with depth.

(Frequency)* / Spectrum / Spectral resolution
• A histogram of different grain sizes in sediments.
• Irradiance in different wavelengths.

*Strictly speaking, this is not a basic dimension but a transformation of time or space.
Examples of data sets - time series.
Example: carbon dioxide in a forest as a function of time of day.
[Figure axis: time dimension]
Examples of data sets - horizontal profiles.
Example: horizontal transect through vegetation.
Example: horizontal transect showing air temperature.
[Figure axes: horizontal dimension in both panels]
Sources: Ecosystems of BC / T.R. Oke (1987): 'Boundary Layer Climates', 2nd Edition.
Examples of data sets - vertical profiles.
[Figures: two vertical profiles; axis: vertical dimension in both panels]
Sources: Univ. Stuttgart / T.R. Oke (1987): 'Boundary Layer Climates', 2nd Edition.
Examples of two dimensional data sets.
time-space
Example: temperatures in a lake as a function of time of year and water depth.
[Figure axes: yearly course (time dimension), vertical dimension]
space-space (map)
Example: land use.
[Figure axes: horizontal dimension, horizontal dimension]
Examples of two dimensional data sets.
time-space
Example: temperatures in the air of a forest as a function of time over one hour.
[Figure axes: time dimension, vertical dimension]
Source: Christen et al. (2001)
Resolution.
• Temporal and spatial resolution: How many data points per unit of a dimension? For example, 1 measurement a day vs. 1440 measurements a day, or 1 measurement per km vs. 1000 measurements per km.
• Data depth: How accurately can we distinguish between different physical values? For example, 0.02 vs. 0.0214523.
Illustration: Wikipedia
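Both kinds of resolution can be put into numbers. The sketch below counts samples per day for a given sampling interval and rounds a value to a given resolvable step to mimic limited data depth. The function names and the step size of 0.01 are illustrative assumptions.

```python
def samples_per_day(interval_minutes):
    """Number of measurements per day for a given sampling interval."""
    return (24 * 60) // interval_minutes

def quantize(value, step):
    """Round a value to the nearest multiple of the resolvable step."""
    return round(value / step) * step

daily = samples_per_day(24 * 60)        # one measurement a day
minutely = samples_per_day(1)           # one measurement a minute
coarse = quantize(0.0214523, 0.01)      # stored with a data depth of 0.01

print(daily, minutely, coarse)
```

A sampling interval of one minute gives the 1440 measurements a day mentioned above, and a data depth of 0.01 cannot distinguish 0.0214523 from 0.02.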
Integration and interpolation.
Integration refers to the process of combining or accumulating data from an existing set of measured data points, or more generally to methods of upscaling.
Interpolation refers to the process of splitting or filling in data to construct new data points from an existing set of measured data points, or more generally to methods of downscaling.
Both can be done in the time and space domains, and there are various methods.
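As a minimal sketch of one method of each kind in the time domain: block averaging as an integration method (10 min values to hourly means) and linear interpolation as an interpolation method. These two methods are picked for illustration only, and the data values are made up.

```python
def block_mean(values, n):
    """Integrate (upscale): average consecutive, complete blocks of n values."""
    return [sum(values[i:i + n]) / n for i in range(0, len(values) - n + 1, n)]

def linear_interpolate(times, values, t):
    """Interpolate (downscale): estimate the value at time t between samples."""
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            v0, v1 = values[i], values[i + 1]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the measured range")

# Two hours of hypothetical 10 min temperatures (degC).
ten_min = [10.0, 10.2, 10.4, 10.6, 10.8, 11.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0]
hourly = block_mean(ten_min, 6)

# Estimate a value half-way between two hourly values (times in minutes).
midpoint = linear_interpolate([0, 60], hourly, 30)
print(hourly, midpoint)
```

Note that interpolation creates no new information: the interpolated point is only as good as the assumption that the variable changes linearly between samples.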
Gridded vs. irregular data
[Figure: two panels, irregular and regular. Labels: data points, Voronoi tessellation. Axes in both panels: East (space), North (space).]
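One simple way to move from irregular to gridded data is nearest-neighbour assignment: each grid node takes the value of the closest data point, which is the same as asking in which Voronoi cell the node lies. The sketch below assumes this method; the station coordinates and values are made up, and the function names are hypothetical.

```python
import math

def nearest_value(points, x, y):
    """Return the value of the data point closest to (x, y).

    points is a list of (east, north, value) tuples. Every location inside
    a point's Voronoi cell is, by definition, closest to that point.
    """
    return min(points, key=lambda p: math.hypot(p[0] - x, p[1] - y))[2]

def to_regular_grid(points, east_coords, north_coords):
    """Fill a regular grid (rows = north, columns = east) from irregular points."""
    return [[nearest_value(points, x, y) for x in east_coords] for y in north_coords]

# Three hypothetical stations: (east, north, value).
stations = [(1.0, 1.0, 1.0), (9.0, 2.0, 2.0), (5.0, 9.0, 3.0)]
grid = to_regular_grid(stations, east_coords=[0, 5, 10], north_coords=[0, 5, 10])

for row in grid:
    print(row)
```

The result is blocky, with sharp jumps at the cell boundaries; smoother gridding methods exist, but they add further assumptions about the field between the data points.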
Example of a regular grid in vegetation studies.
Photo: http://www.marine.gov/
Your data-set?
Choose a physical parameter or a classification of interest in your potential project:
• Data format?
• Dimensions?
• Resolution?
• Regular or irregular?
• Assumptions?
Data processing.
• Correct sensors with data from lab calibrations or field intercomparisons (mainly climatology).
• Plausibility checks: define criteria for errors, experiment disturbances, etc.
• Flag data: exclude data that meet the above criteria from the analysis (never delete data permanently, just flag them, and back up the raw data!).
• Integrate or interpolate data, but only if your data are not at the scale required or if you have to compare two data sets with different resolutions.
• Select data for further analysis if your assumptions require that certain criteria are fulfilled.
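The first step, correcting a sensor with lab calibration data, can be sketched as a linear least-squares fit of reference values against raw readings. A linear sensor response is an assumption here, and the calibration numbers are made up for illustration.

```python
def fit_linear_calibration(raw, reference):
    """Least-squares fit of reference = gain * raw + offset."""
    n = len(raw)
    mean_raw = sum(raw) / n
    mean_ref = sum(reference) / n
    sxy = sum((x - mean_raw) * (y - mean_ref) for x, y in zip(raw, reference))
    sxx = sum((x - mean_raw) ** 2 for x in raw)
    gain = sxy / sxx
    offset = mean_ref - gain * mean_raw
    return gain, offset

# Hypothetical lab calibration: sensor output vs. reference instrument.
raw_lab = [0.0, 10.0, 20.0, 30.0]
ref_lab = [0.5, 10.7, 20.9, 31.1]
gain, offset = fit_linear_calibration(raw_lab, ref_lab)

# Apply the correction to field readings; keep the raw values as well.
field_raw = [12.3, 15.8]
field_corrected = [gain * x + offset for x in field_raw]
print(gain, offset, field_corrected)
```

Storing the corrected values next to the raw ones, rather than overwriting them, keeps the correction reversible if the calibration is revised later.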
Intercompare your sensors.
If you are interested in spatial or temporal differences and you use multiple sensors at different locations in space or in different time slots of your experiment, you have to ensure that these sensors are comparable.
Pre-experimental lab or field intercomparison
Field measurements
Post-experimental lab or field intercomparison
Some sensors need recalibration during field experiments.
[Photo: field intercomparison]
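A minimal sketch of how an intercomparison might be evaluated, assuming all sensors were run side by side under identical conditions and that the mean of all sensors is used as the reference (a choice made here for illustration; a dedicated reference instrument would also work). Sensor names and readings are made up.

```python
import statistics

def intercomparison_offsets(side_by_side):
    """Mean offset of each sensor from the all-sensor mean.

    side_by_side maps a sensor name to its readings taken while all
    sensors measured the same conditions at the same place and time.
    """
    names = list(side_by_side)
    n = len(side_by_side[names[0]])
    ensemble = [statistics.mean(side_by_side[k][i] for k in names) for i in range(n)]
    return {
        k: statistics.mean(v - e for v, e in zip(side_by_side[k], ensemble))
        for k in names
    }

# Hypothetical side-by-side readings (degC) from three sensors.
runs = {
    "A": [10.0, 11.0, 12.0],
    "B": [10.4, 11.4, 12.4],
    "C": [9.8, 10.8, 11.8],
}
offsets = intercomparison_offsets(runs)
print(offsets)
```

Subtracting each sensor's offset from its field data makes differences between locations interpretable; comparing pre- and post-experimental offsets shows whether a sensor drifted during the experiment.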
Check your data - potential approaches.
• Global criteria (minimum, maximum, ...).
• Local criteria (rate of change, ...).
• Statistical criteria.
• Manual data flagging.
[Figure labels: CO2 concentration, standard deviation]
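The first three approaches can be combined in a small flagging routine. This is a sketch under stated assumptions: the limits, the maximum step and the two-sigma threshold are arbitrary example values, the CO2 series is made up, and `flag_data` is a hypothetical helper. The raw values are never changed; the function only returns flags.

```python
import statistics

def flag_data(values, vmin, vmax, max_step, n_sigma):
    """Return one set of flag names per value; the raw values stay untouched."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)
    flags = []
    for i, v in enumerate(values):
        f = set()
        if not vmin <= v <= vmax:
            f.add("range")      # global criterion: outside plausible limits
        if i > 0 and abs(v - values[i - 1]) > max_step:
            f.add("step")       # local criterion: implausible rate of change
        if sigma > 0 and abs(v - mean) > n_sigma * sigma:
            f.add("outlier")    # statistical criterion: far from the mean
        flags.append(f)
    return flags

# Hypothetical CO2 concentrations (ppm) with one spike.
co2 = [385.0, 386.0, 384.0, 950.0, 385.0, 387.0, 386.0, 385.0]
flags = flag_data(co2, vmin=300.0, vmax=600.0, max_step=50.0, n_sigma=2.0)

for value, f in zip(co2, flags):
    print(value, sorted(f))
```

The good value directly after the spike also trips the rate-of-change criterion, which is one reason why automatic flags are usually followed by a manual check.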
Analysis tools.
• Describe data distribution: statistical probability of occurrence, histograms, statistical moments, ...
• Find events: peak detection, integration, ...
• Find and quantify correlations (same variable at different locations, same variable at different times, between two variables, between modelled and measured values): correlation, regression, curve fitting, statistical tests.
• Find groups and dominating dimensions: clustering, principal component analysis.
• Find process-dominating scales: spectral analysis and wavelets find the time and length scales that dominate a process.
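Two of the simplest tools from this list can be sketched in a few lines: a histogram to describe a distribution and the Pearson correlation coefficient to quantify a linear relationship. The bin edges and the modelled and measured values are made up for illustration.

```python
import math

def histogram(values, edges):
    """Count values in the bins [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

temps = [-5.24, -3.20, 2.32, 9.63, 14.15, 17.83, 27.08, 16.34, 10.22, 4.43, -1.32]
counts = histogram(temps, edges=[-10, 0, 10, 20, 30])

modelled = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [1.1, 1.9, 3.2, 3.8, 5.0]
r = pearson_r(modelled, measured)
print(counts, round(r, 3))
```

A high correlation between modelled and measured values shows that they vary together, not that they agree in absolute terms; a regression or a plot of the differences is needed for that.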
Data analysis tools.
Method / System: Manual analysis
Advantage: very simple, fast.
Limitations: up to a few tens of data points, no large data sets, no modelling.
Example of a system: calculator, paper, pen...

Method / System: Spreadsheet software
Advantage: simple analysis and graph tools.
Limitations: limited number of data points, limited statistics, modelling and automation; slow.
Example of a system: Microsoft Excel

Method / System: GIS system
Advantage: complex spatial analysis and modelling.
Limitations: expert knowledge; expensive.
Example of a system: workstation with ArcGIS

Method / System: Statistical software and programming languages
Advantage: complex and fast time series analysis, automation, modelling.
Limitations: programming skills; expensive.
Example of a system: workstation with Matlab, R, IDL ...
Your data-analysis?
Think about your potential project:
• Corrections, calibrations?
• Data checks?
• Analysis concept?
• Software needs?
• Hardware needs?
Again: keep it simple!