Oceanography 569 Oceanographic Data Analysis Laboratory

Oceanography 569 Oceanographic Data Analysis Laboratory. Kathie Kelly, Applied Physics Laboratory, 515 Ben Hall IR Bldg. Class web site: faculty.washington.edu/kellyapl/classes/ocean569_2014/


Post on 24-Feb-2016



TRANSCRIPT

Page 1: Oceanography 569 Oceanographic Data Analysis Laboratory

Oceanography 569
Oceanographic Data Analysis Laboratory

Kathie Kelly
Applied Physics Laboratory
515 Ben Hall IR Bldg

class web site: faculty.washington.edu/kellyapl/classes/ocean569_2014/

Page 2: Oceanography 569 Oceanographic Data Analysis Laboratory

Organization

• 1 lecture, 1 lab period (2 hrs) per week
• Exercise assigned in lab, finish by following lecture
• Presentation of solution in lecture session
• One class project completed individually
• Grade based on presentations and project
• Office hours by appointment

Page 3: Oceanography 569 Oceanographic Data Analysis Laboratory

Materials

Materials available on class web site:
• PowerPoint notes
• mfiles & mat files for exercises
• specialized functions (mfiles)
• example solutions (following week)

Text: “Modeling Methods for Marine Science” by Glover, Jenkins & Doney
• on reserve in Physics Library
• a good reference to buy

Page 4: Oceanography 569 Oceanographic Data Analysis Laboratory

General Procedure for Data Analysis

• Define analysis goal
• Characterize data
• Prepare data
• Errors and error propagation
• Statistical analyses
• Combine data with model (prognostic, diagnostic, statistical)

Page 5: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 1: Aegean Sea temperatures
analysis goal: create continuous 3-m time series

Daily satellite SST maps
• 5 buoys (POSEIDON)
• 3-m 3-hourly temperatures (with gaps)

Page 6: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 1: Characterize Data

3-m: higher resolution, but gaps
SST: continuous, but only daily

What happens when the data are “merged”?
To make a consistent series, what is sacrificed?

Page 7: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 1: Data discrepancies
compare apples & apples: average 3-m to daily

What are the characteristics of the differences?
How can the differences be reconciled?
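The "average 3-m to daily" step can be sketched as follows. This is only an illustration on a synthetic series, not the class solution: the variable names, the made-up temperatures, and the `minobs` threshold (minimum valid samples required for a daily mean) are all assumptions. It shows one characteristic of the differences: a partly sampled day yields a biased daily mean when a diurnal cycle is present.

```matlab
% Hypothetical 3-hourly series: time in days, temperature with gaps as NaN.
t3 = (0:1/8:10-1/8)';                  % 10 days, 8 samples per day
T3 = 20 + 0.5*cos(2*pi*t3);            % synthetic diurnal cycle about 20 degC
T3(17:20) = NaN;                       % first half of day 3 missing
T3(41:48) = NaN;                       % all of day 6 missing

nday   = 10;
minobs = 4;                            % assumed threshold for a usable day
Tdaily = NaN(nday,1);                  % daily means; NaN where too few samples
ngood  = zeros(nday,1);                % valid samples per day
for k = 1:nday
    inday = floor(t3) == (k-1);        % samples falling in day k
    vals  = T3(inday);
    vals  = vals(~isnan(vals));        % keep valid data only
    ngood(k) = numel(vals);
    if ngood(k) >= minobs
        Tdaily(k) = mean(vals);
    end
end
```

Complete days average to 20, the fully missing day stays NaN, and the half-sampled day 3 comes out lower than 20 because only part of the diurnal cycle was averaged.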

Page 8: Oceanography 569 Oceanographic Data Analysis Laboratory

Periodic Signals

Robust way to estimate periodic signals, especially for gappy data:

fit_harmonics: fit to cosines with period L, L/2, etc (cf. Fourier series)

[amp,phase,frac,offset,da]=fit_harmonics(data,time,nharm,L,cutoff);

For nharm=n, with t the time axis:

d_periodic = amp(1)*cos(2*pi*t/L+phase(1)) + amp(2)*cos(2*pi*2*t/L+phase(2)) + ... + amp(n)*cos(2*pi*n*t/L+phase(n)) + offset

• includes the jth term only if the frac(tion) of variance removed > cutoff/100
• returns the anomaly: da = data - d_periodic

Note: offset is not the same as mean(data).
Remove the mean using fit_harmonics if there is a strong seasonal cycle!
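The class function fit_harmonics is not reproduced in these notes, so the following is only a sketch of the same idea: a least-squares fit of an offset plus nharm cosines to the valid points of a gappy series. It assumes synthetic data and omits the frac/cutoff test; the design matrix `A` and the cos/sin coefficients `a`, `b` are names introduced here for illustration.

```matlab
% Synthetic gappy series with an annual cycle (all values are made up).
L     = 365.25;                        % fundamental period, days
time  = (0:5:1000)';                   % sample times, days
data  = 15 + 4*cos(2*pi*time/L - 1.0) + 1*cos(2*pi*2*time/L + 0.5);
data(60:90) = NaN;                     % a gap
nharm = 2;

ok = ~isnan(data);                     % fit to valid points only
t  = time(ok);

% Design matrix: constant plus a cos/sin pair for each harmonic.
A = ones(numel(t), 1 + 2*nharm);
for j = 1:nharm
    A(:,2*j)   = cos(2*pi*j*t/L);
    A(:,2*j+1) = sin(2*pi*j*t/L);
end
coef = A \ data(ok);                   % least-squares solution

offset = coef(1);
a = coef(2:2:end);                     % cosine coefficients
b = coef(3:2:end);                     % sine coefficients
% a*cos(w*t) + b*sin(w*t) = amp*cos(w*t + phase)
amp   = sqrt(a.^2 + b.^2);
phase = atan2(-b, a);

% Rebuild the periodic signal on the full time axis and form the anomaly.
d_periodic = offset*ones(size(time));
for j = 1:nharm
    d_periodic = d_periodic + amp(j)*cos(2*pi*j*time/L + phase(j));
end
da = data - d_periodic;                % NaN where data are missing
```

Because the fit uses only the valid points, the gap does not need to be filled first, and the fitted offset can differ from mean(data(ok)) when the record does not span whole cycles.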

Page 9: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 1: Fix discrepancies
find & remove seasonal cycle in difference

Result: daily average temperature that matches the seasonal cycle of the 3-m series

Page 10: Oceanography 569 Oceanographic Data Analysis Laboratory

Other goals

1. Continuous SST with a diurnal cycle: use 3m temperature to find diurnal cycle

2. Correct SST for aliasing from undersampling the diurnal cycle

3. Create non-seasonal temperature anomalies

Page 11: Oceanography 569 Oceanographic Data Analysis Laboratory

Aliasing
SST sampling aliases the diurnal cycle

“Nyquist frequency”: 1/(2*Δt), i.e. a period of 2*Δt, the shortest period resolved at sampling interval Δt

Example: sample the diurnal temperature signal at 26-hr intervals
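The 26-hr example can be worked through numerically. This is a sketch with a pure 24-hr cosine standing in for the diurnal temperature signal; the variable names are introduced here. The apparent frequency is taken as the distance from the signal frequency to the nearest multiple of the sampling frequency.

```matlab
T_signal = 24;                         % diurnal period, hours
dt       = 26;                         % sampling interval, hours
t        = (0:dt:24*60)';              % 60 days of sample times
x        = cos(2*pi*t/T_signal);       % the sampled diurnal signal

% Nyquist period: shortest period resolved at this sampling interval.
T_nyquist = 2*dt;                      % longer than the 24-hr signal, so it aliases

% Apparent (aliased) frequency and period.
f_signal = 1/T_signal;
f_sample = 1/dt;
f_alias  = abs(f_signal - round(f_signal/f_sample)*f_sample);
T_alias  = 1/f_alias;                  % apparent period, hours

% The samples are indistinguishable from a slow cosine at the alias period.
x_alias  = cos(2*pi*t/T_alias);
max_diff = max(abs(x - x_alias));
```

Here T_alias works out to 312 hours (13 days): the undersampled diurnal cycle appears as a slow oscillation that could be mistaken for a real 13-day signal.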

Page 12: Oceanography 569 Oceanographic Data Analysis Laboratory

Matlab functions

datenum: converts yyyy,mm,dd to serial date numbers (days counted from year 0, loosely called Julian dates); also datestr, datevec, datetick('x')

imagesc: bit map that shows each image pixel, scaled to colormap (cf. pcolor, which interpolates pixels to a grid)

NaN, “not a number”: use to flag invalid data, then nanmean, nansum, etc. ignore NaNs. NaN values are not plotted. To find valid data:

ind=find(~isnan(data));

fit_harmonics(data,time,nharm,L,cutoff): use to find any periodic signal in the data, using the time axis, period L and a cutoff (% of variance explained)
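A short sketch of the date and NaN functions above, using three made-up daily samples. Only base MATLAB calls are used; nanmean belongs to the Statistics toolbox, so the mean over valid indices is shown as an equivalent.

```matlab
% Serial date numbers for three made-up daily samples.
time = datenum(2014, 1, [1 2 3]);      % one value per day
data = [14.2 NaN 14.6];                % NaN flags the invalid sample

ind   = find(~isnan(data));            % indices of valid data
dmean = mean(data(ind));               % mean over valid data only
label = datestr(time(ind(1)), 'yyyy-mm-dd');   % date string of first valid sample
```

Calling mean(data) directly would return NaN, which is why the valid indices are selected first.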

Page 13: Oceanography 569 Oceanographic Data Analysis Laboratory

Statistics of Observations
“random” variables

Are these observations of random variables?
Will removing the mean make them random?

Page 14: Oceanography 569 Oceanographic Data Analysis Laboratory

Statistical Definitions: mean

The sample mean is given by

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The mean of the parent population is given by

$$\mu = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} x_i$$

But we never know it, since the sample is finite. For class, the mean will refer to the sample mean, regardless of the symbol.

The factor N here is the number of degrees of freedom.

Page 15: Oceanography 569 Oceanographic Data Analysis Laboratory

Statistical Definitions: variance

The sample variance is given by

$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$$

where s is the standard deviation of x and $\bar{x}$ is the sample mean. The variance of the parent population corresponds to an infinite number of samples, N.

The N-1 factor occurs because using the sample mean “uses up” one of the degrees of freedom of the data set.

In class we will refer to the sample variance.
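The sample mean and sample variance can be written out directly and checked against the built-in functions. The data vector is made up for illustration.

```matlab
x = [2 4 4 4 5 5 7 9]';                % made-up sample
N = numel(x);

xbar = sum(x)/N;                       % sample mean
s2   = sum((x - xbar).^2)/(N - 1);     % sample variance, N-1 normalization
s    = sqrt(s2);                       % sample standard deviation

% MATLAB's var and std use the same N-1 normalization by default.
diff_var = abs(s2 - var(x));
diff_std = abs(s  - std(x));
```

For this sample the mean is 5 and the sum of squared deviations is 32, so the sample variance is 32/7.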

Page 16: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 2: Periodic Signals
need to remove non-random components

Both have periodic signals (seasonal, not random)

Page 17: Oceanography 569 Oceanographic Data Analysis Laboratory

Caution: mean of data with periodic component
if incomplete cycles in sample

Use “offset” from fit_harmonics instead
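The caution can be illustrated with a synthetic record of one and a half cycles. Since fit_harmonics itself is not listed here, the offset below comes from a plain least-squares fit of a constant plus one cos/sin pair, which is assumed to play the same role; all numbers are made up.

```matlab
L    = 365;                                  % period, days
time = (0:1:1.5*L)';                         % one and a half cycles
data = 10 + 3*cos(2*pi*time/L - pi/2);       % true offset is 10

naive_mean = mean(data);                     % biased by the extra half cycle

% Least-squares fit of constant + one harmonic.
A      = [ones(size(time)) cos(2*pi*time/L) sin(2*pi*time/L)];
coef   = A \ data;
offset = coef(1);                            % recovers the true offset
```

The extra half cycle is entirely positive, so the simple mean lands well above 10 while the fitted offset does not.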

Page 18: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 2: Probability Distributions
(histogram)

Both non-seasonal SST and non-seasonal rain are random variables.
Is either of these normally distributed?

Page 19: Oceanography 569 Oceanographic Data Analysis Laboratory

Normal Distribution for Random Variable

Why do we want a normal distribution?

Least-squares fit, correlations, optimal interpolation have error estimates based on assumption of normal distributions of random data and/or errors

Page 20: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 2: Making a variable more normal
distribution of log(rain)

[Figure panels: rain, log(rain)]

Page 21: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 2: distributions for modified variable
deciles

[Figure panels: rain, rain deciles (uniform)]

Page 22: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 2: test for normal cumulative distribution
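One way to sketch a cumulative-distribution test is to compare the empirical cumulative distribution with a normal one that has the same mean and standard deviation. This is not necessarily the test used in the exercise; the stand-in data (a normal variable and its exponential, a skewed rain-like variable) and the `max_gap` measure are assumptions made for illustration.

```matlab
rng(0);                                          % fixed seed for repeatability
samples = {randn(2000,1), exp(randn(2000,1))};   % near-normal and skewed stand-ins
max_gap = zeros(1,2);
for k = 1:2
    x  = samples{k};
    xs = sort(x(~isnan(x)));                     % sorted valid data
    n  = numel(xs);
    F_emp  = ((1:n)' - 0.5)/n;                   % empirical cumulative distribution
    z      = (xs - mean(xs))/std(xs);            % standardized values
    F_norm = 0.5*(1 + erf(z/sqrt(2)));           % normal CDF, same mean and std
    max_gap(k) = max(abs(F_emp - F_norm));       % largest departure from normal
end
```

The skewed variable shows a much larger gap than the normal one; taking its log returns it to the first case, which is the point of the log(rain) transformation.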

Page 23: Oceanography 569 Oceanographic Data Analysis Laboratory

To edit or not to edit

For a truly normal distribution, 0.3% of the data are more than 3 standard deviations from the mean

“Three-sigma edit”: remove data more than 3 std dev from mean

Best to justify edits in terms of likely error sources and characteristics
• spikes
• unphysical values
• comparisons with other variables

Page 24: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 2: Edit data
3-sigma outliers

Procedure for removing suspicious data:
1) remove known signals (diurnal, seasonal, trends)
2) check for normal distribution
3) compute σ (standard deviation)
4) remove data more than 3*σ from mean

do not iterate!
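Steps 3 and 4 of the procedure can be sketched as a single pass. The anomaly series `da` is a synthetic stand-in assumed to have had known signals removed already, and the two spikes are planted for illustration. Suspect points are set to NaN rather than deleted so that the time axis is preserved.

```matlab
rng(1);                                % fixed seed for repeatability
da = randn(1000,1);                    % stand-in anomaly, known signals removed
da([100 500]) = [12 -15];              % two artificial spikes

ok    = ~isnan(da);
mu    = mean(da(ok));
sigma = std(da(ok));                   % computed once, from the unedited data
bad   = abs(da - mu) > 3*sigma;        % more than 3 sigma from the mean

da_edit      = da;
da_edit(bad) = NaN;                    % flag rather than delete
% Single pass only: recomputing sigma and repeating would keep trimming
% the tails of the valid data.
```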

Page 25: Oceanography 569 Oceanographic Data Analysis Laboratory

Central Limit Theorem

Why is Normal distribution commonly used?

Underlying distributions may be unknown or non-Normal

BUT if measurement (or error) is sum of many processes, distribution will approach Normal

Example: distribution of the mean of X for different distributions as the number of samples increases
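The example can be imitated with a short simulation. This sketch assumes an exponential parent distribution (strongly skewed), generated from uniform random numbers so that only base MATLAB is needed; `nsamp`, `ntrial`, and the `skew` helper are names introduced here.

```matlab
rng(2);                                % fixed seed for repeatability
nsamp  = 50;                           % samples averaged per mean
ntrial = 5000;                         % number of means
X  = -log(rand(nsamp, ntrial));        % exponential variables: mean 1, variance 1
xm = mean(X, 1)';                      % one sample mean per trial

% Skewness: zero for a normal distribution.
skew = @(v) mean((v - mean(v)).^3) / mean((v - mean(v)).^2)^1.5;
skew_single = skew(X(1,:)');           % single samples: strongly skewed
skew_mean   = skew(xm);                % means of 50 samples: much closer to 0
std_mean    = std(xm);                 % close to 1/sqrt(nsamp)
```

The means are far more symmetric than the individual samples, and their spread shrinks as 1/sqrt(nsamp), consistent with the approach to a Normal distribution as the number of samples increases.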