University of Iowa
Iowa Research Online
Theses and Dissertations
2011
The Hilbert-Huang Transform: theory, applications, development
Bradley Lee Barnhart
University of Iowa
Copyright 2011 Bradley L. Barnhart
This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/2670
Part of the Physics Commons
Recommended Citation
Barnhart, Bradley Lee. "The Hilbert-Huang Transform: theory, applications, development." PhD diss., University of Iowa, 2011. http://ir.uiowa.edu/etd/2670.
THE HILBERT-HUANG TRANSFORM:
THEORY, APPLICATIONS, DEVELOPMENT
by
Bradley Lee Barnhart
An Abstract
Of a thesis submitted in partial fulfillment of the requirements for the Doctor of
Philosophy degree in Physics in the Graduate College of
The University of Iowa
December 2011
Thesis Supervisor: Professor William Eichinger
ABSTRACT
The Hilbert-Huang Transform (HHT) is a data analysis tool, first developed in 1998,
which can be used to extract the periodic components embedded within oscillatory data.
This thesis is dedicated to the understanding, application, and development of this tool.
First, the background theory of HHT will be described and compared with other spectral
analysis tools. Then, a number of applications will be presented, which demonstrate the
capability for HHT to dissect and analyze the periodic components of different oscillatory
data. Finally, a new algorithm is presented which expands the ability of HHT to analyze
discontinuous data. The sum result is the creation of a number of useful tools developed
from the application of HHT, as well as an improvement of the HHT tool itself.
Abstract Approved: ________________________________
Thesis Supervisor ________________________________
Title and Department ________________________________
Date
THE HILBERT-HUANG TRANSFORM:
THEORY, APPLICATIONS, DEVELOPMENT
by
Bradley Lee Barnhart
A thesis submitted in partial fulfillment of the requirements for the Doctor of
Philosophy degree in Physics in the Graduate College of
The University of Iowa
December 2011
Thesis Supervisor: Professor William Eichinger
Graduate College The University of Iowa
Iowa City, Iowa
CERTIFICATE OF APPROVAL
_______________________
PH.D. THESIS
_______________
This is to certify that the Ph.D. thesis of
Bradley Lee Barnhart
has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Physics at the December 2011 graduation.
Thesis Committee: ___________________________________ William Eichinger, Thesis Supervisor
___________________________________ Thomas Boggess Jr.
___________________________________ Paul Kleiber
___________________________________ Wayne Polyzou
___________________________________ Anton Kruger
Dedicated to Eduardo and his duende
ACKNOWLEDGMENTS
I want to first thank my adviser Dr. Bill Eichinger. Thank you for all of your
encouragement, advice, and support. This work would not be possible without you.
Also thank you to my wife Rebecca. You have always created such joy in my life, and
I thank you for all of your love, kindness, and support.
Thank you to my parents, Randall and Nancy, for a childhood which provided the
pathway to success. You are my role models.
And thank you to my dog Lucy. You always give me a great excuse for a long walk.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER
I. INTRODUCTION
II. BACKGROUND
    Traditional Spectral Analysis Tools
        Fourier Analysis
        Short-Time Fourier Transform
        Wavelet Analysis
        Generalized Time-Frequency Distributions
III. HILBERT-HUANG TRANSFORM (HHT)
    Hilbert Spectral Analysis
    Empirical Mode Decomposition (EMD)
IV. ANALYSIS OF SUNSPOT VARIABILITY USING THE HILBERT-HUANG TRANSFORM
    Introduction
    Ensemble Empirical Mode Decomposition (EEMD)
    Results
    Discussion
    Further Research
V. EMD APPLIED TO SOLAR IRRADIANCE, GLOBAL TEMPERATURE, AND CO2 CONCENTRATION DATA
    Introduction
    Data Used
    Results
        Cycles in Data
        IMF Comparisons
    Discussion
VI. CHARACTERIZING SAMPLING ERRORS ASSOCIATED WITH THE NEAR-SURFACE ENERGY BUDGET CLOSURE PROBLEM
    The Energy Balance Problem
    EMD as a Dyadic Filter
    Eddy Covariance Methods
        Traditional Eddy Covariance Method
        EMD Eddy Covariance Method
    Orthogonality and Sampling Durations
    How Long is Long Enough?
    Conclusions
VII. AN IMPROVED ENSEMBLE EMD ALGORITHM
    Motivation
    Ensemble Empirical Mode Decomposition
    Errors Due to Data Gaps
    Error Reduction Methods
    Discussion
VIII. SUMMARY
REFERENCES
LIST OF TABLES
Table
5.1 Mean and standard deviation of instantaneous frequencies (1/yrs) calculated using the Hilbert Transform.
5.2 Periods (in years) calculated using Hilbert analysis and zero-crossing method.
5.3 Correlation coefficients (r) between total solar irradiance and sunspot from 1749 to 2009
5.4 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1880 to 1945
5.5 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1945 to 2009
5.6 Correlation coefficients (r) between sunspot number and global mean temperature from 1880 to 1945
5.7 Correlation coefficients (r) between sunspot number and global mean temperature from 1945 to 2009
LIST OF FIGURES
Figure
4.1 Monthly sunspot data decomposed into its intrinsic mode functions (IMFs) using EEMD
4.2 Statistical significance test for the extracted IMFs. Notice the first extracted IMF is below the 1% confidence limit and is therefore considered statistically insignificant from noise
4.3 The monthly sunspot data denoised by removing the first IMF extracted using EEMD
4.4 Hilbert spectra of IMFs representing the (a) 11-year cycle, (b) 20-50-year cycle, and the (c) quasi-100-year cycle
4.5 Short-time Fourier spectrogram of the monthly sunspot data with window sizes of (a) 100 years and (b) 26 years
4.6 Wigner-Ville distribution of sunspot data
4.7 Extracted IMF representing the 11-year solar cycle plotted along with its instantaneous frequency as calculated using equation (6)
5.1 Sunspot number data set and its decomposed IMFs
5.2 Total Solar Irradiance (TSI) measurements and their decomposed IMFs
5.3 Global mean temperature and its decomposed IMFs
5.4 CO2 concentration as measured from the Mauna Loa Observatory
5.5 Subsection of IMF 2, the yearly cycle extracted from the CO2 data using EEMD
5.6 Comparison between TSI and sunspot number IMFs. The TSI data were multiplied by a factor of 100 in order to improve visibility.
5.7 Comparison of IMFs for global mean temperature and total solar irradiance
5.8 Comparison of global mean temperature and sunspot number data IMFs. The temperature was multiplied by a factor of 500 to allow for visibility
6.1 Dyadic nature of EMD when applied to turbulence
6.2 Variance contributions from IMF pairs for 60 minute data sets of vertical wind velocity and temperature
6.3 Covariance contributions from IMF pairs of vertical wind velocity and temperature
6.4 Covariance contributions from w IMF 10 and all T IMFs
6.5 Absolute value of the nonorthogonal fraction of the total covariance from the IMF Covariance Matrices as calculated for 10 days. The top two plots show SMEX 2002 data from Site 161. The bottom two plots show SMEX 2002 from Site 152
6.6 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wT plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF)
6.7 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wq plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF).
6.8 Orthogonal (blue), nonorthogonal (red), and total (black) contributions from each IMF for the sensible heat flux as calculated from Site 161 from SMEX 2002. Each subplot is a different sampling duration, going from left to right and top to bottom in the following order: 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, 150 minutes
7.1 Original data decomposed into its IMFs as well as the IMFs decomposed from discontinuous data
7.2 Error as defined by the summed differences between the discontinuous and continuous extracted IMFs, plotted against data gap size
7.3 Errors plotted as a function of frequency of IMFs. The errors are primarily in the low-frequency IMFs
7.4 Errors plotted as a function of time. For the high-frequency IMFs, the errors occur largely near the gap endpoints
7.5 Comparison of three different decompositions of data. The original signals (black) contain no gaps. The red signals are the decomposed IMFs from the discontinuous EEMD algorithm, and the blue signals are the discontinuous EEMD algorithm used after a mirroring technique was performed
7.6 Comparison of relative error associated with including or not including the mirror technique when using the discontinuous EEMD decomposition
CHAPTER I INTRODUCTION
In order to describe the physical world, measurements must be gathered and
interpreted. Just as it is essential to understand the specifications of the instruments used to
collect data, it is also necessary to understand the strengths and limitations of the tools used
to interpret data.
Frequency analysis tools are used to analyze the internal fluctuations of a signal in
terms of their frequency, or size scales. While frequency analysis tools are beneficial for
describing the contributions to a signal from various frequency or size scales, oftentimes the
tools which are used have limitations that restrict how the data can be interpreted. For
example, Fourier-based analysis tools rely on the mathematical property that any signal can
be reconstructed from the sum of sinusoidal functions. This, in theory, is advantageous and
can be used to describe the relative contributions to the signal from the various sine
functions with different frequencies. However, these sinusoidal functions are infinite in extent,
and are required to have constant amplitudes and phases. Imagine standing in a grass field
for an hour feeling the intermittent puffs of wind on your face, and it becomes clear that
nature is not stationary. Or conversely, imagine if ocean waves were required to have
constant amplitudes and phases, and how oddly convenient the world would seem. Since
nature does not fit stationary and linear assumptions, it is necessary to extend the
mathematical tools which describe nature to more adaptive methods. That is, methods
should be extended to accommodate signals that are nonstationary and that may be the
result of many, perhaps nonlinear, combinations of processes.
Following the advent of traditional Fourier analysis, many new methods have been
developed to accommodate nonstationary signals. These range from short-time Fourier
transforms (STFT), which allow a signal to be nonstationary as long as it is piecewise
stationary, to wavelet analysis, which can sift out particular signatures from a signal on a
variety of size scales. Generalized time-frequency distributions have also been derived which
encompass special cases such as wavelets or STFT, and include much more complicated
versions of these tools. With each frequency analysis tool come assumptions and limitations
which affect the signal being analyzed. It is important to understand these limitations to
properly interpret the signal.
This dissertation describes a relatively new data analysis tool called the Hilbert-
Huang transform (HHT) which is able to extract the frequency components from possibly
nonlinear and nonstationary intermittent signals. As with any frequency analysis tool, it has
strengths and weaknesses which need to be understood in order to accurately interpret the
output. However, it is a powerful tool which can describe the frequency components locally
and adaptively for nearly any oscillating signal. This makes the tool extremely versatile. For
instance, HHT has been used to study a wide variety of data including rainfall, earthquakes,
heart-rate variability, financial time series, Lidar data, and ocean waves to name a few
subjects. Therefore, it is justified to continue research on this relatively new tool in order to
fully understand the underlying theory, its potential applications, and its development.
This dissertation is divided into eight chapters. Chapter 2 gives a brief background of
current data analysis tools, including their strengths and limitations. Chapter 3 introduces HHT and
compares its abilities to these traditional data analysis tools.
Chapters 4-7 describe four separate papers which were submitted to refereed
journals for publication between 2009 and 2011; two are currently published (Chapters 4 and 5) and
two are currently under review (Chapters 6 and 7).
Chapter 4 demonstrates the utility of HHT when applied to oscillatory data, in
particular, sunspot number data. HHT is compared with other data analysis tools and shown
to be useful to describe the local frequency components of complicated data. Chapter 5 uses
a portion of HHT in order to compare two or more periodic cycles within oscillatory data.
The techniques used in Chapter 4 are extended in Chapter 5 in order to compare two
separate cyclic data oscillations.
Chapters 4 and 5 utilize HHT with well-known data which have been analyzed
extensively using alternative methods. In contrast, Chapter 6 utilizes HHT to address an
unsolved problem: the lack of near-surface energy budget closure. This
chapter will apply HHT to meteorological data in an attempt to shed new light on this
problem. The poorly understood measurement sampling errors associated with near-surface
fluxes will be analyzed and conclusions associated with the energy budget closure problem
will be discussed.
Chapter 7 describes an improvement to the EMD algorithm in order to
accommodate discontinuous, intermittently sampled data. The benefits of such an
improvement are discussed as well as its limitations.
Finally, Chapter 8 will give a brief summary of the research completed thus far, and
give several suggestions for needed future research.
In order to understand and predict the natural world around us, it is essential not to
fit the world into mathematical equations but rather to expand our mathematical equations
to better fit the natural world. This dissertation aims to accomplish this by analyzing a new data
analysis tool, HHT, which may describe the natural world more locally and adaptively.
CHAPTER II BACKGROUND
Traditional Spectral Analysis Tools
Data analysis is the fundamental connection between measurements and the
conclusions we draw from those measurements. Typically, data analysis tools attempt to
describe the intrinsic variability of measured variables, whether they be temperature, wind
velocity, heart rate, population, rainfall, stock volatility, or any other variable system.
However, it is important to understand how the tools used to analyze data affect the data
itself.
In order to understand the intrinsic variability of a system, measured signals are
oftentimes written mathematically as the sum of their contributing components. Equation
2.1 shows that a time-dependent signal, f(t), can be written as a sum of products of amplitude
coefficients, $a_n$, and basis functions, $\{\phi_n(t)\}$ [6].
$$f(t) = \sum_{n} a_n \phi_n(t) \tag{2.1}$$
The signal and the basis functions could also be written as a function of space
depending on the system being analyzed. If the basis functions form an orthonormal set, the
amplitude coefficients can be calculated as in equation 2.2.
$$a_n = \int f(t)\, \phi_n^{*}(t)\, dt \tag{2.2}$$
The energy density contribution from each component, then, is shown in equation
2.3 [5][6][43].
$$|a_n|^2 \tag{2.3}$$
Note that the series in equation 2.1 can be thought of as a mathematical
approximation to the original signal.
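As an illustration of equations 2.1 to 2.3, the following is a minimal Python sketch, assuming NumPy is available. The test signal and the orthonormal basis (taken from a QR factorization of a random matrix) are hypothetical choices made only for this example; any orthonormal set would serve.

```python
import numpy as np

# Hypothetical sampled signal (not data from this work).
N = 64
t = np.arange(N)
f = np.sin(2 * np.pi * 3 * t / N) + 0.5 * np.cos(2 * np.pi * 7 * t / N)

# An arbitrary orthonormal basis: the rows of `basis` are orthonormal vectors
# obtained from a QR factorization (an assumption made for illustration).
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((N, N)))
basis = q.T

a = basis @ f              # amplitude coefficients, discrete form of equation 2.2
energy = np.abs(a) ** 2    # energy contribution per component, equation 2.3
f_rebuilt = a @ basis      # sum of coefficient-weighted basis vectors, equation 2.1
```

Because the basis is orthonormal, the rebuilt signal matches the original to rounding error and the component energies sum to the total energy of the signal.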
An enormous number of solutions to problems utilize this technique of describing a
complicated signal in terms of simpler ones. For example, consider the Schrödinger equation
in equation 2.4.
$$i\hbar \frac{\partial \Psi}{\partial t} = \left[ -\frac{\hbar^2}{2m}\nabla^2 + V \right] \Psi \tag{2.4}$$
The general solution to equation 2.4 is shown in equation 2.5
$$\Psi(\mathbf{r}, t) = \sum_{n} c_n \psi_n(\mathbf{r})\, e^{-iE_n t/\hbar} \tag{2.5}$$
where the time-dependence in equation 2.5 can be included only when the potential,
V, is independent of time [23]. The functions $\psi_n(\mathbf{r})$ are the solutions to the time-independent Schrödinger
equation as shown in equation 2.6.
$$\left[ -\frac{\hbar^2}{2m}\nabla^2 + V \right] \psi_n = E_n \psi_n \tag{2.6}$$
Depending on the potential V of the system, different coefficients, $c_n$, and
functions, $\psi_n(\mathbf{r})$, are used to describe the solution. For the hydrogen atom, the functions
$\psi_n(\mathbf{r})$ are written as the product of radial functions, described by Laguerre polynomials, and
the angular functions, which are known as spherical harmonics. Other potentials give
solutions for the wave functions which require spherical or cylindrical Bessel functions [23].
These functions are mathematical approximations which represent the physical processes of
the system.
Fourier Analysis
When analyzing periodic fluctuations in measured data, the most common form of
data analysis is Fourier analysis. Formulated by Joseph Fourier in the early 1800s, Fourier
analysis utilizes the postulate that any signal can be constructed as a sum of sinusoidal
functions [5][6][43]. Therefore, Fourier analysis relies on the assumption that any signal can
be written as in equation 2.1 where the basis functions are sine and cosine functions.
For signals which contain multiple frequency components, Fourier analysis describes these
signals as the sum of sine waves of infinite extent with different frequencies, as shown in
equation 2.7.
$$f(t) = \sum_{n} a_n e^{i\omega_n t} \tag{2.7}$$
Because the frequency of each sinusoidal function must be time-independent,
Fourier analysis can represent only stationary data. That is, the frequency of the signal
being analyzed is assumed not to change with time. Also, because the sine waves used to
describe a signal are infinite in extent, Fourier analysis is considered a global analysis tool.
The amplitudes of the basis functions can be calculated as shown in equation 2.8.
$$a(\omega) = F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt \tag{2.8}$$
Equation 2.8 shows that the amplitude coefficients describe the contribution of the
signal at different frequency components, $\omega$. This equation is called the Fourier transform,
and is useful because it provides a frequency-domain representation of a time-domain
function [5][6][43]. The energy density is the square of the amplitude functions. The relative
contribution to the energy density from each frequency component is shown in equation 2.9.
$$E(\omega) = |F(\omega)|^2 \tag{2.9}$$
The total energy density is then the sum of the contributions from all frequencies. It
is a common practice to plot Fourier spectra which plot energy density vs. frequency. This
allows the largest contributing components to be located at particular frequencies.
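A Fourier energy spectrum of the kind described above can be sketched in a few lines of Python. This is a minimal example assuming NumPy; the sampling rate and the two-tone test signal are hypothetical values chosen so that both tones fall exactly on frequency bins.

```python
import numpy as np

fs = 100.0                       # sampling rate in Hz (assumed for the example)
t = np.arange(1000) / fs         # 10 s of samples
f = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

F = np.fft.rfft(f)                           # discrete counterpart of equation 2.8
freqs = np.fft.rfftfreq(t.size, d=1 / fs)    # frequency of each bin in Hz
E = np.abs(F) ** 2                           # energy density, as in equation 2.9

peak = freqs[np.argmax(E)]       # frequency of the largest contributing component
```

Plotting `E` against `freqs` gives the Fourier spectrum. Note that the result carries no time information: the two tones are treated as present over the whole record.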
Short-Time Fourier Transform
As mentioned before, nonstationary signals such as sporadic impulses or aperiodic
signals cannot be described locally using Fourier analysis. In order to accommodate
nonstationary signals, the short-time Fourier transform (STFT) was developed. It is shown
in equation 2.10.
$$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{\infty} f(t)\, w(t - \tau)\, e^{-i\omega t}\, dt \tag{2.10}$$
The idea of STFT is to break a nonstationary signal into sections, within each of which the
signal is assumed to be stationary. Then, the regular Fourier transform can be calculated in each section and
the energy density can be determined in each section. Therefore, the original signal can be
nonstationary, as long as it is stationary within each window. The window function,
$w(t - \tau)$, is chosen by the user to be a particular size. The STFT spectrogram, which
plots the energy density contributions from each frequency, is time-dependent. However, the
frequencies must be constant within each window [5][43].
Note that the choice of window size is important and determines what frequencies
will be resolved from the data. For instance, a short window will be able to show the
time-dependence of frequencies very locally in time; however, it will only capture the high
frequency components and will not resolve the lower frequency, longer period oscillations.
If a longer window is used, the lower frequency components can be resolved; however, the
likelihood that the signal is nonstationary within the window increases [5][43].
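The windowing idea can be sketched as follows. This is a minimal Python example assuming NumPy; the function name `stft_energy`, the Hann window, and the test signal (10 Hz for the first two seconds, 40 Hz afterwards) are illustrative assumptions rather than the implementation used in this work.

```python
import numpy as np

def stft_energy(x, fs, win_len, hop):
    """Energy spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(win_len)
    starts = np.arange(0, len(x) - win_len + 1, hop)
    # Each row is one windowed section of the signal.
    frames = np.array([x[s:s + win_len] * window for s in starts])
    spec = np.fft.rfft(frames, axis=1)             # Fourier transform per section
    freqs = np.fft.rfftfreq(win_len, d=1 / fs)
    times = (starts + win_len / 2) / fs            # centre time of each section
    return times, freqs, np.abs(spec) ** 2

fs = 200.0
t = np.arange(800) / fs                            # 4 s of samples
x = np.where(t < 2.0, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 40 * t))

times, freqs, energy = stft_energy(x, fs, win_len=100, hop=50)
dominant = freqs[np.argmax(energy, axis=1)]        # strongest frequency per window
```

With a 0.5 s window the frequency resolution is 2 Hz. Shortening `win_len` localizes the change at 2 s more sharply but coarsens the frequency resolution, which is the trade-off described above.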
Wavelet Analysis
Wavelet analysis is another type of frequency analysis tool. The wavelet transform of
the signal f(t) is shown in equation 2.11 where ψ is the wavelet, a is the scale factor and b is
the time shift.
$$W(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{2.11}$$
The transform represents the similarity between a signal and the
predetermined wavelet at scale a at time b. Wavelets of different size (frequency) scales are used
to generate many wavelet transform coefficients. Energies can be calculated just as in
equation 2.9 to produce time-frequency-energy spectrograms without the need for a fixed
window [8][9][43]. This was an improvement over STFT because, as a result of the flexible
basis functions, both the high-frequency and the low-frequency structures could be analyzed.
Large scale wavelets are used to extract low frequency, large scale features while high
frequency oscillations are extracted with smaller scale wavelets [8][9][43].
Wavelet analysis works well to seek out particular structures at different size
(frequency) scales within data. For example, Morlet wavelets can isolate and analyze
rainfall-runoff events. However, a drawback of wavelet analysis is that the wavelet basis functions,
and therefore the structures being sifted out from the original signal, are chosen a priori. The
chosen wavelets may not reflect the processes in the analyzed
signal. If an inappropriate set of wavelets is used to correlate with a signal, the calculated
wavelet coefficients and variance of the signal may give misleading and nonphysical results
[8][9][43].
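A direct evaluation of the wavelet transform can be sketched as below. This is a minimal Python example assuming NumPy. The Morlet wavelet with center parameter `w0 = 6` (without its small admissibility correction), the function names, and the approximate scale-to-frequency relation a ≈ w0/(2πf) are assumptions made for illustration; the wavelet is fixed before the data are examined, which is the a priori choice discussed above.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Complex Morlet wavelet (admissibility correction term omitted)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)

def cwt(x, dt, scales, w0=6.0):
    """Evaluate W(a, b) for every scale a and every sample time b."""
    n = len(x)
    tau = (np.arange(n) - n // 2) * dt          # time offsets centred on zero
    out = np.empty((len(scales), n), dtype=complex)
    for i, a in enumerate(scales):
        psi = morlet(tau / a, w0) / np.sqrt(a)
        # Correlating with the wavelet equals convolving with its
        # time-reversed complex conjugate.
        out[i] = np.convolve(x, np.conj(psi[::-1]), mode="same") * dt
    return out

fs = 100.0
t = np.arange(400) / fs
x = np.sin(2 * np.pi * 8.0 * t)                 # hypothetical 8 Hz test signal

freqs = np.arange(2.0, 21.0)                    # candidate frequencies, 2 to 20 Hz
scales = 6.0 / (2 * np.pi * freqs)              # approximate scale for each frequency
W = cwt(x, 1 / fs, scales)
power = np.mean(np.abs(W[:, 100:300]) ** 2, axis=1)   # interior only, avoids edges
best = freqs[np.argmax(power)]
```

Small scales respond to the high-frequency content and large scales to the low-frequency content, so no single fixed window length has to be chosen.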
Generalized Time-Frequency Distributions
There are many spectral analysis tools which can be described by an overall
generalized time-frequency distribution. Equation 2.12 shows this distribution where x(t) is
the signal and $\phi(\theta, \tau)$ is the kernel which determines the properties of the distribution
[9][33].
$$C(t, \omega) = \frac{1}{4\pi^2} \iiint x\!\left(u + \frac{\tau}{2}\right) x^{*}\!\left(u - \frac{\tau}{2}\right) \phi(\theta, \tau)\, e^{-i\theta t - i\tau\omega + i\theta u}\, du\, d\tau\, d\theta \tag{2.12}$$
The distribution is called the Wigner-Ville distribution when $\phi(\theta, \tau) = 1$. Overall,
the Wigner-Ville distribution gives better time and frequency resolution than STFT and does
not have to sacrifice one resolution for the benefit of the other. Drawbacks of the
Wigner-Ville distribution include the possibility of calculating nonphysical harmonics and even
negative amplitudes. Therefore, the real frequency contributions have to be picked out
amidst nonphysical harmonics. Other kernels can be used, including a bi-Gaussian, which
produces a pseudo-Wigner distribution (PWD), or an exponential kernel, which produces a
Choi-Williams distribution (CWD) [9][33].
CHAPTER III HILBERT-HUANG TRANSFORM (HHT)
An alternative data analysis tool, called the Hilbert-Huang Transform (HHT), has been
proposed by Norden E. Huang [26]. The HHT technique for analyzing data consists of
two components: a decomposition algorithm called empirical mode decomposition (EMD)
and a spectral analysis tool called Hilbert spectral analysis. Both tools will be introduced and
described hereafter. It will be shown that HHT can provide a local description of the
oscillating components of a signal, whether nonstationary or nonlinear. This provides a new
approach for analyzing the variability of signals and can be compared with current tools such
as any of the methods mentioned previously.
Hilbert Spectral Analysis
The purpose of HHT is to provide an alternative to existing spectral
analysis tools for producing the time-frequency-energy description of time series data. Also,
the method attempts to describe nonstationary data locally. Rather than a Fourier or wavelet
based transform, the Hilbert transform is used in order to compute instantaneous
frequencies and amplitudes and describe the signal more locally. Equation 3.1 displays the
Hilbert transform, $y(t)$, which can be written for any function x(t) of $L^p$ class [6]. The PV
denotes Cauchy's principal value integral.
$$H[x(t)] = y(t) = \frac{1}{\pi}\, PV \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau \tag{3.1}$$
[6][21] determined that an analytic function can be formed with the Hilbert
transform pair as shown in equation 3.2.
$$z(t) = x(t) + i\,y(t) = a(t)\, e^{i\theta(t)} \tag{3.2}$$
where
$$a(t) = \left(x^2 + y^2\right)^{1/2}, \qquad \theta(t) = \tan^{-1}\!\left(\frac{y}{x}\right), \qquad i = \sqrt{-1} \tag{3.3}$$
$a(t)$ and $\theta(t)$ are the instantaneous amplitude and phase functions, respectively
[21]. The instantaneous frequency can then be written as the time derivative of the phase, as
shown in equation 3.4.
$$\omega(t) = \frac{d\theta(t)}{dt} \tag{3.4}$$
Note that the analytic function z(t) is the mathematical approximation to the original
signal x(t).
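The analytic signal and the instantaneous amplitude, phase, and frequency of equations 3.2 to 3.4 can be computed numerically as sketched below. This is a minimal Python example assuming NumPy. The FFT-based construction (zeroing negative frequencies and doubling positive ones) is a common discrete approach and is an assumption here, as are the function name `analytic_signal` and the 10 Hz test signal.

```python
import numpy as np

def analytic_signal(x):
    """Return z(t) = x(t) + i*y(t), with y the discrete Hilbert transform of x."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep the zero-frequency term, double positive frequencies,
    # and drop negative frequencies.
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

fs = 200.0
t = np.arange(400) / fs
x = np.cos(2 * np.pi * 10.0 * t)                 # hypothetical 10 Hz test signal

z = analytic_signal(x)
amplitude = np.abs(z)                            # a(t), equation 3.3
phase = np.unwrap(np.angle(z))                   # theta(t), equation 3.3
inst_freq = np.diff(phase) * fs / (2 * np.pi)    # equation 3.4, converted to Hz
```

For this single-tone signal the amplitude is constant at 1 and the instantaneous frequency is constant at 10 Hz. Signals that are not so well behaved can give nonphysical values, which motivates the decomposition step of HHT.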
Because the amplitude and frequency functions are expressed as functions of
time, the Hilbert spectrum, which displays the relative amplitude or energy (square of
amplitude) contribution at a given frequency and time, can be constructed as
H(ω, t). Then, a marginal spectrum can be calculated as in equation 3.5, where the spectrum is
integrated over the time domain from 0 to T.
$$h(\omega) = \int_{0}^{T} H(\omega, t)\,dt \tag{3.5}$$
The marginal spectrum represents the accumulated amplitude (or energy) at each frequency over the
entire data span. This can be directly compared to the Fourier spectrum shown in
equation 2.9.
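One way to build H(ω, t) and the marginal spectrum of equation 3.5 on a discrete grid is sketched below. The binning scheme, the function names, and the two constant-frequency test components are assumptions made for illustration; the thesis does not specify how its spectra were gridded.

```python
import numpy as np

def hilbert_spectrum(amps, freqs, freq_edges):
    """Bin instantaneous amplitude onto a (frequency bin, time sample) grid.

    amps, freqs: arrays of shape (n_components, n_time).
    """
    n_bins = len(freq_edges) - 1
    n_time = amps.shape[1]
    H = np.zeros((n_bins, n_time))
    for a, f in zip(amps, freqs):
        k = np.digitize(f, freq_edges) - 1        # frequency bin at each time
        ok = (k >= 0) & (k < n_bins)              # ignore values outside the grid
        np.add.at(H, (k[ok], np.nonzero(ok)[0]), a[ok])
    return H

def marginal_spectrum(H, dt):
    """Equation 3.5 as a Riemann sum over the time axis."""
    return H.sum(axis=1) * dt

# Synthetic example: two constant-frequency components, monthly samples, 100 years.
dt = 1.0 / 12.0
n = 1200
amps = np.vstack([np.full(n, 2.0), np.full(n, 1.0)])
freqs = np.vstack([np.full(n, 0.095), np.full(n, 0.045)])
edges = np.linspace(0.0, 0.2, 21)                 # 0.01 cycles/year bins
H = hilbert_spectrum(amps, freqs, edges)
h = marginal_spectrum(H, dt)
```

Squaring the amplitudes before binning would give the energy form of the spectrum instead.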
[26] and [27] showed that not all functions give “good” Hilbert transforms,
meaning those which produce physical instantaneous frequencies. For example, functions
with non-zero means will give negative frequency contributions using the Hilbert transform
[26][27]. Therefore, the signals which can be analyzed using the Hilbert transform must be
restricted so that their calculated instantaneous frequency functions have physical meaning.
Next, the empirical mode decomposition will be described. It is essentially an algorithm
which decomposes nearly any signal into a finite set of functions which have “good” Hilbert
transforms that produce physically meaningful instantaneous frequencies.
Empirical Mode Decomposition
The EMD algorithm is the other component to the HHT method. The
algorithm attempts to decompose nearly any signal into a finite set of functions, whose
Hilbert transforms give physical instantaneous frequency values. These functions are called
intrinsic mode functions (IMFs). The algorithm utilizes an iterative sifting process which
successively subtracts the local mean from a signal. The sifting process is as follows:
1. Determine the local extrema (maxima, minima) of the signal.
2. Connect the maxima with an interpolation function, creating an upper
envelope about the signal.
3. Connect the minima with an interpolation function, creating a lower
envelope about the signal.
4. Calculate the local mean as the average of the upper and lower
envelopes (half their sum).
5. Subtract the local mean from the signal.
6. Iterate on the residual.
The sifting process is repeated until the signal meets the definition of an IMF, which
will be explained shortly. Then, the IMF is subtracted from the original signal, and the sifting
process is repeated on the remainder. This is repeated until the final residue is a monotonic
function. The last extracted IMF is the lowest frequency component of the signal, better
known as the trend.
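The sifting steps above can be sketched in a few lines. This is a simplified illustration, not the implementation used in this thesis: it assumes cubic-spline envelopes, pins both envelopes at the end points of the record (a crude boundary treatment), and stops each sift after a fixed number of iterations instead of testing the IMF definition. All function names are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def local_mean(t, x):
    """Mean of the spline envelopes through the maxima and minima, or None."""
    imax = argrelextrema(x, np.greater)[0]
    imin = argrelextrema(x, np.less)[0]
    if len(imax) < 2 or len(imin) < 2:
        return None                       # too few extrema: treat x as the residue
    ends = [0, len(x) - 1]                # assumption: pin envelopes at the end points
    imax = np.unique(np.r_[ends, imax])
    imin = np.unique(np.r_[ends, imin])
    upper = CubicSpline(t[imax], x[imax])(t)
    lower = CubicSpline(t[imin], x[imin])(t)
    return 0.5 * (upper + lower)

def emd(t, x, max_imfs=10, n_sifts=10):
    """Return (imfs, residue) such that imfs.sum(axis=0) + residue equals x."""
    imfs, residue = [], x.astype(float).copy()
    for _ in range(max_imfs):
        if local_mean(t, residue) is None:
            break                         # nothing left to sift
        h = residue.copy()
        for _ in range(n_sifts):          # assumption: fixed number of sifts
            m = local_mean(t, h)
            if m is None:
                break
            h = h - m                     # subtract the local mean
        imfs.append(h)
        residue = residue - h             # continue on the remainder
    return np.array(imfs), residue

# Synthetic example: a fast tone, a slow tone, and a linear trend.
t = np.linspace(0.0, 20.0, 2000)
x = np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 0.1 * t) + 0.05 * t
imfs, residue = emd(t, x)
```

By construction the IMFs and the residue always add back to the input, which is a useful sanity check for any EMD implementation.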
Previously, the sifting process was said to stop when the signal met the
criteria of an IMF. Therefore, it is important to understand how an IMF is defined.
Remember the definition of an IMF was formed to ensure that the IMF signals give physical
frequency values when using the Hilbert transform. The definition of an IMF, therefore, is a
signal which has a zero-mean, and whose number of extrema and zero-crossings differ by at
most one [26][27]. IMFs are considered monocomponent functions which do not contain
riding waves [26][27].
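The counting part of the IMF definition can be checked directly. The sketch below is an illustrative assumption about how such a check might be written; it tests only the extrema and zero-crossing counts and does not test the zero-mean condition.

```python
import numpy as np

def count_extrema_and_zero_crossings(x):
    """Count slope sign changes (extrema) and signal sign changes (zero crossings)."""
    d = np.sign(np.diff(x))
    n_extrema = int(np.sum(d[1:] * d[:-1] < 0))
    s = np.sign(x)
    s = s[s != 0]                          # ignore samples that are exactly zero
    n_zero = int(np.sum(s[1:] * s[:-1] < 0))
    return n_extrema, n_zero

def satisfies_counting_condition(x):
    """True if the numbers of extrema and zero crossings differ by at most one."""
    n_ext, n_zc = count_extrema_and_zero_crossings(x)
    return abs(n_ext - n_zc) <= 1

t = np.arange(1000) / 1000.0
wave = np.sin(2 * np.pi * 5 * t)           # oscillates about zero
offset_wave = wave + 2.0                   # same oscillation, never crosses zero
```

Here `wave` passes the check while `offset_wave` fails it, which mirrors the earlier remark that functions with non-zero means do not give well-behaved Hilbert transforms.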
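Once a signal has been fully decomposed, the signal D(t) can be written as
the finite sum of the IMFs, c_j(t), and a final residue, r_n(t), as shown in equation 3.6.
$$D(t) = r_n(t) + \sum_{j=1}^{n} c_j(t) \tag{3.6}$$
Using equations 3.2 and 3.3, the analytic function can be formed as shown in
equation 3.7.
$$D(t) = r_n(t) + \mathrm{Re}\left[\sum_{j=1}^{n} a_j(t)\, e^{\,i\int \omega_j(t)\,dt}\right] \tag{3.7}$$
Also, for reference, equation 3.8 shows the Fourier decomposition of a signal, x(t).
$$x(t) = \mathrm{Re}\left[\sum_{j=1}^{\infty} a_j\, e^{\,i\omega_j t}\right] \tag{3.8}$$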
Notice that the EMD decomposition can be considered a generalized Fourier
decomposition, because it describes a signal in terms of amplitude and basis functions whose
amplitudes and frequencies may fluctuate with time [26][27]. The HHT will now be used on
a number of different data sets to analyze its applicability.
CHAPTER IV ANALYSIS OF SUNSPOT VARIABILITY USING
THE HILBERT-HUANG TRANSFORM
Introduction
Sunspot number variation has been well studied and represents a crucial component
in the analysis of solar activity [53]. Understanding the intrinsic cycles of sunspot number
fluctuations helps to better characterize and understand the solar processes that are
responsible for them. It also aids in the prediction of future solar activity.
Because sunspot number data are nonstationary and the result of nonlinear
processes, it is necessary to choose a data analysis tool which will accurately describe its
cyclic components locally and adaptively [26][27]. Sunspot cycles are known to be of varying
lengths and amplitudes. While Fourier analysis is the most common data analysis technique
used to extract periodicities from periodic signals, it requires constant amplitudes and phases
and is not well-suited to the problem [5][9]. Therefore, it is justified to explore a new data
analysis technique which may be more suitable to extract the cyclic components from the
sunspot number data set.
The Hilbert-Huang Transform (HHT) is a relatively new data analysis
tool that was specifically developed for analyzing nonstationary and nonlinear signals
[26][27]. Here we present an HHT analysis of monthly sunspot numbers from 1749 to 2010 and
compare the extracted cyclic components with those found using Fourier analysis as well as
generalized time-frequency distributions.
Ensemble Empirical Mode Decomposition (EEMD)
EMD acts as a dyadic filter bank in the frequency domain [15]. This means that the sifting
method can only separate components whose frequencies differ by more than a factor of 2. An
improved EMD algorithm called Ensemble EMD (EEMD) has been developed by [58],
which utilizes this characteristic to extract robust and statistically significant IMFs. EEMD is
summarized here:
1. Add finite amplitude noise to the original signal.
2. Decompose signal into a finite set of IMFs using the EMD sifting method
described previously.
3. Repeat steps 1 and 2 with different noise data sets.
4. Average the ensemble of extracted IMFs to average out the noise and obtain
mean IMFs.
A complete description of EEMD can be found in [58]. EEMD was used to analyze
monthly sunspot data and will be shown below.
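The ensemble procedure listed above can be sketched independently of the decomposition itself. In the sketch below, `decompose` is a caller-supplied function standing in for EMD, and `two_band_split` is a deliberately trivial placeholder (a moving-average split, not the sifting process) so that the example runs on its own. How trials with different numbers of modes are combined, and whether the noise level is absolute or relative to the data, are assumptions here; see [58] for the actual method.

```python
import numpy as np

def eemd(x, decompose, n_modes, n_trials=80, noise_std=0.2, seed=0):
    """Average the modes returned by `decompose` over noise-perturbed copies of x."""
    rng = np.random.default_rng(seed)
    total = np.zeros((n_modes, len(x)))
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, noise_std, size=len(x))   # step 1: add noise
        modes = np.asarray(decompose(noisy))                   # step 2: decompose
        k = min(n_modes, len(modes))
        total[:k] += modes[:k]        # assumption: missing modes count as zero
    return total / n_trials           # step 4: ensemble mean

def two_band_split(x, width=25):
    """Placeholder decomposer (NOT EMD): remainder plus moving-average low band."""
    low = np.convolve(x, np.ones(width) / width, mode="same")
    return np.array([x - low, low])

t = np.linspace(0.0, 10.0, 500)
signal = np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 0.1 * t)
mean_modes = eemd(signal, two_band_split, n_modes=2)
```

Because the added noise has zero mean, its contribution to the ensemble mean shrinks roughly as one over the square root of the number of trials.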
Results
Monthly sunspot data from January 1749 to April 2010 were decomposed
into different frequency components using EEMD. Eighty different sets of noise with a
standard deviation of 0.2 were added to the original data and decomposed using EMD. The
ensemble of decomposed IMFs was then averaged to obtain mean IMFs. The data along
with the mean extracted IMFs are shown in Fig. 4.1.
Clearly the extracted IMFs have time-dependent amplitudes and phases and differ
from pure sinusoidal functions. They are the intrinsic fluctuations extracted directly from the
signal using the sifting process and are not pre-determined functions. The 11-year cycle is
shown as the second extracted IMF in Fig. 4.1. Notice that the IMF captures the oscillation
of the signal even though the signal is nonstationary. It is well known that each 11-year cycle
does not oscillate as a perfect sinusoid. In fact, it is possible the cycle is made up of two or
more cyclic components. While the EMD method is unable to separate components
whose periods differ by less than a factor of 2, it is able to display the varying 11-year
cycle and demonstrate the changes in frequency due to its nonlinear behavior. This
investigation focuses mainly on periodicities equal to or greater than the 11-year Schwabe cycle.
There were originally four higher frequency components, which were combined into one
IMF and labeled the high-frequency IMF, as seen in the top plot of Fig. 4.1. The high
frequency oscillations in IMF 1 were determined to be statistically indistinguishable from noise
using a statistical test suggested by [58], who decomposed a large number of
noise data sets using EMD to create statistical significance confidence limits. The 5 extracted
IMFs are shown in Fig. 4.2 along with the 1%, 50%, and 99% confidence limits derived
from [58].
The star in the upper left corner, with mean energy below the zero mark, is the data
set itself and can be ignored. All IMFs are above the 99-percentile confidence limit except
for the highest frequency fluctuations found in the first IMF. Therefore, only the first IMF is
not statistically distinguishable from noise [58].
One application of the EMD or EEMD method is to remove the high frequency
fluctuations from the sunspot data by subtracting IMF 1 from the original signal as shown in
Fig. 4.3. The advantage of this technique is that the highest frequency components were
removed locally through the EMD sifting process. Therefore, meaningful structures were
not smoothed over, as often occurs with low-pass filtering [58]. For comparison, a
Butterworth low-pass filter was used to remove the high frequency oscillations of the
sunspot number data. The 3 dB cutoff frequency was set to remove periodicities shorter than
approximately 5.5 years. The correlation coefficient between the Butterworth filtered data
and the original sunspot data was 0.8498 whereas the EMD method filter gave a value of
0.9375. Therefore, EMD provided a more accurate representation of the original signal while
removing the high frequency fluctuations. It is conceivable that an alternative cutoff frequency
or a different low-pass filter could yield a better fit; however, this
post-processing is subjective and prone to bias. The EMD method, in contrast, does not require
prior knowledge of the system in order to locally and adaptively extract and remove the
highest frequency content from the signal [26][27].
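The low-pass comparison described above can be reproduced in outline as follows. The filter order, the zero-phase (forward-backward) application, and the synthetic test series are assumptions, since the thesis states only the filter type and the cutoff; the sunspot data themselves are not used here.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def butterworth_lowpass(x, cutoff, fs, order=4):
    """Zero-phase Butterworth low-pass; cutoff and fs share units (cycles/year here)."""
    b, a = butter(order, cutoff, btype="low", fs=fs)
    return filtfilt(b, a, x)

def correlation(u, v):
    """Pearson correlation coefficient between two equal-length series."""
    return float(np.corrcoef(u, v)[0, 1])

# Synthetic monthly series: an 11-year oscillation plus white noise.
rng = np.random.default_rng(0)
fs = 12.0                                   # samples per year
t = np.arange(1200) / fs
slow = 50.0 + 50.0 * np.sin(2 * np.pi * t / 11.0)
data = slow + rng.normal(0.0, 10.0, size=t.size)

smooth = butterworth_lowpass(data, cutoff=1.0 / 5.5, fs=fs)
r_filtered_vs_data = correlation(smooth, data)
```

The EMD-based alternative would replace `smooth` with `data - imf1`, where `imf1` is the first extracted IMF, and compare the two correlation coefficients. Note that forward-backward filtering doubles the attenuation, so the response at the nominal cutoff is about 6 dB down rather than 3 dB.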
The other extracted IMFs in Fig. 4.1 represent the longer cycles of the signal.
The 20-50-year cycle is shown as the third IMF in Fig. 4.1. Because EMD is a dyadic filter
bank in the frequency domain, it is possible this IMF is the sum of two or more cycles
whose periods differ by less than a factor of two. The 22-year (Hale) cycle dominates
between approximately 1825 and 1940. However, before and after this time period there is a
slightly longer cycle of approximately 40-50 years. IMF 4 exhibits an approximately 100-year
cyclic oscillation which is known as the Gleissberg cycle [53]. The Gleissberg cycle period is
typically between 60 and 120 years [53]. Finally, the trend is displayed, which shows an upward
trend in sunspot number for the past 250 years.
Using the Hilbert transform, Hilbert spectra, H(ω, t), were calculated for
each IMF. These can be compared and contrasted with alternative spectral analysis methods
such as STFT and time-frequency distributions.
Fig. 4.4 shows the Hilbert spectra for the extracted IMFs. The frequency is displayed
in cycles per year. Figs. 4.5a and 4.5b show STFT spectrograms of the overall dataset. Fig. 4.5a
uses a window size of 100 years and Fig. 4.5b uses a window size of 26 years. Notice that
the longer window gives better frequency resolution but poorer time resolution, and the
shorter window the reverse. [28] also analyzed sunspot data with STFT. They used a pre-emphasis filter, which
amplifies certain portions of the spectrogram, in order to more easily distinguish the cycles
within sunspot data. Figs. 4.5a and 4.5b did not use a pre-emphasis filter; therefore, the
cycles are slightly less resolved than in [28]. See Fig. 2 in [28] for their STFT spectrograms.
[33] also analyzed solar sunspot number using the pseudo-Wigner distribution (PWD). Refer to
Fig. 6 in [33] for the PWD spectrogram of solar sunspot data.
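The trade-off between the two STFT window sizes can be made concrete with a short calculation. The sketch below assumes SciPy's `scipy.signal.stft` with its default Hann window and 50% overlap, and uses random numbers as a stand-in for the monthly record; only the grid spacings are of interest, not the spectrogram values.

```python
import numpy as np
from scipy.signal import stft

def stft_grid_steps(x, fs, window_years):
    """Return (frequency step, time step) of the STFT grid for a given window length."""
    nperseg = int(round(window_years * fs))
    f, t, _ = stft(x, fs=fs, nperseg=nperseg)
    return f[1] - f[0], t[1] - t[0]

fs = 12.0                                             # monthly samples per year
x = np.random.default_rng(0).normal(size=262 * 12)    # placeholder series
df_100, dt_100 = stft_grid_steps(x, fs, 100)          # 0.01 cycles/year, 50 years
df_26, dt_26 = stft_grid_steps(x, fs, 26)             # about 0.038 cycles/year, 13 years
```

With the 100-year window the frequency bins are fine enough to place the 11-year cycle, but each spectral column summarizes a century of data; the 26-year window localizes better in time at the cost of bins nearly four times wider.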
The Hilbert spectrum in Fig. 4.4a shows that the frequency of the 11-year solar cycle is not constant
but changes with time. This is because the 11-year solar cycle is different from a
constant frequency sinusoid. In Fig. 4.4a, it oscillates about a mean of 0.0909 cycles/year
which corresponds to a period of 11.11 years. Fig. 4.4a also shows that the amplitude
fluctuations of the 11-year Schwabe cycle, as can be seen by the color variations from gray to
black, are oscillatory. These fluctuations correspond to the oscillations of IMF 4, as shown in
Fig. 4.1. This is not surprising as the Gleissberg cycle (IMF 4) is the amplitude modulation of
the Schwabe cycle [53].
The STFT spectra in Figs. 4.5a and 4.5b also exhibit a peak near 0.09 cycles/year, which
corresponds to the 11-year cycle. However, there are large contributions from other
frequencies. Because Fourier analysis attempts to construct the original signal with a sum of
sine and cosine functions with constant amplitudes and phases, it requires an infinite number
of contributions from different frequencies [5]. Also, the STFT spectrogram does not
capture the oscillation in frequency of the 11-year solar cycle, which is due to the
nonlinearity of the signal. [33] used PWD to better resolve the 11-year solar cycle. The
distributions did not resolve the oscillation in instantaneous frequency due to nonlinearity;
however, they significantly increased the resolution in both frequency and time as compared
to STFT [33].
The high frequency noise as represented by IMF 1 does not show any coherent
energy contributions from a particular frequency so its Hilbert spectrum was not shown.
The Hilbert spectrum for IMF 3, the 20-50-year (quasi-Hale) cycle, is shown in Fig.
4.4b. Notice the frequency increases between approximately 1830 and 1940. The amplitude,
as shown by the color, decreases from 1830 to 1940 but is larger before and after this time
period. It is interesting to note that the STFT does not capture the Hale cycle, while the PWD
from [33] shows a very faint Hale cycle in their Fig. 6.
The Gleissberg cycle, as mentioned previously, represents the periodic amplitude
modulation of the 11-year Schwabe cycle [53]. The Hilbert spectrum of the extracted
Gleissberg cycle, IMF 4, is shown in Fig. 4.4c. It exhibits a mostly constant frequency. This
is in contrast to both the STFT and PWD spectra which show a steady decrease in frequency
corresponding to a steadily lengthening cycle. Also, [33] are able to display shorter period
cycles in the PWD spectrograms. For this investigation, only cycles of approximately 11
years and greater are shown.
Fig. 4.6 displays the Wigner-Ville distribution for this data set. While the Wigner-
Ville distribution is able to capture the 11-year cycle, there are nonphysical harmonics which
dominate the spectrum. Therefore, the Hilbert, STFT and PWD spectra are more
informative when used for interpreting the sunspot data.
Fig. 4.7 displays the instantaneous frequency of the 11-year cycle IMF and
the IMF itself. The 11-year IMF has been divided by 1000 and a constant of 0.1 has been
added for visibility with the instantaneous frequency. Notice that the IMF cycles tend to
increase more quickly when rising in number and decay more slowly when falling, which is
the cause for the change in instantaneous frequency. The instantaneous frequency is higher
during the rising in sunspot number and is lower during the prolonged “tail” when the
sunspot number decreases more slowly. This nonlinear behavior is similar to rainfall-runoff
data when a short duration rain event occurs followed by a longer runoff period, causing the
instantaneous frequency of the process to fluctuate with time. This nonlinearity is not
captured using alternative spectral analysis tools.
Discussion
The Hilbert-Huang transform (HHT) has been used to analyze monthly sunspot
numbers and their variability from 1749 to 2010. HHT decomposed the data set into a
number of cyclic components using the ensemble empirical mode decomposition (EEMD).
The IMFs could be viewed in the time domain and compared with the original data. They
were extracted locally and adaptively from the data set and did not require a priori
knowledge about the system or the selection of prescribed basis functions. However, the
method acts as a dyadic filter bank in the frequency domain, meaning that it cannot separate
cycles which differ in period by less than a factor of 2. The Hilbert transform was then used
to calculate spectra and compare with the short-time Fourier transform (STFT) and the
pseudo-Wigner distribution (PWD). The Hilbert method displayed energy contributions
from only a few cyclic components. They were found to be representative of the Schwabe,
Hale, and Gleissberg sunspot cycles. Also, the periodicity of the 11-year solar Schwabe cycle
was shown to be time-dependent. Overall, this analysis demonstrates the utility of HHT
when analyzing nonstationary data which may be due to nonlinear processes. Also, it has
extracted the various cycles from sunspot number data, which can be compared and
contrasted with previous and future sunspot research.
Future Research
The HHT has been shown to be useful for decomposing sunspot number into its
intrinsic frequency components. From this, the study of the signal’s variability on different
time scales is possible. Also, one of the main strengths of the HHT method is the ability to compare
the frequency components of two or more signals to determine relationships between them.
Further research will be pursued to utilize this technique to analyze mean global temperature,
CO2 measurements and total solar irradiance proxy data. Then the different frequency
oscillations (IMFs) for each signal can be compared directly and checked for correlations.
Research should focus on developing techniques to compare different frequency
components and determine whether the two IMFs may be related.
Figure 4.1 Monthly sunspot data decomposed into its intrinsic mode
functions (IMFs) using EEMD
Figure 4.2 Statistical significance test for the extracted IMFs. Notice that the first
extracted IMF is below the 1% confidence limit and is therefore considered statistically indistinguishable from noise
Figure 4.3 The monthly sunspot data denoised by removing the first IMF extracted
using EEMD
Figure 4.4 Hilbert spectra of IMFs representing the (a) 11-year cycle, (b) 20-50-year
cycle, and the (c) quasi-100-year cycle
Figure 4.5 Short-time Fourier spectrogram of the monthly sunspot data with window
sizes of (a) 100 years and (b) 26 years
Figure 4.6 Wigner-Ville distribution of sunspot data
Figure 4.7 Extracted IMF representing the 11-year solar cycle plotted along with its
instantaneous frequency as calculated using equation 3.4
CHAPTER V EMD APPLIED TO SOLAR IRRADIANCE, GLOBAL
TEMPERATURE, AND CO2 CONCENTRATION DATA
Introduction
In this investigation, the EMD method is used to isolate and analyze the various
cycles within total solar irradiance, global mean temperature, and CO2 concentration
measurements. The different cyclic components will be compared with one another in the
time domain. For instance, the solar forcing from total solar irradiance will be compared
with the global mean temperature fluctuations at different frequency scales in the time
domain. This makes it easy to tell when two cyclic components are in phase and when they
are not. The EMD method can locally and adaptively analyze the inherent cyclic components
of nonlinear and nonstationary data. Therefore, it is beneficial to analyze the strengths and
weaknesses of this new tool in the context of climate data sets.
Data Used
The primary goal of this investigation is to utilize a relatively new data analysis tool
to identify, as well as compare and contrast, the intrinsic cycles of a number of possibly
inter-related variables. For demonstration, the following data sets were chosen: sunspot
number, total solar irradiance, global mean temperature, and CO2 concentration.
Sunspot number has been extensively recorded and studied; see, for instance, [53].
Also, previously in this thesis, we decomposed monthly sunspot numbers into the 11-year
(Schwabe), quasi-22-year (Hale), and quasi-100-year (Gleissberg) cycles using Empirical
Mode Decomposition. The HHT results were compared with those from time–frequency
distributions, including short-time Fourier and pseudo-Wigner distributions. This
investigation will build upon these results by comparing the extracted IMFs with IMFs from
other variables in the time-domain. The monthly sunspot number records utilized were
obtained from the solar physics group at NASA's Marshall Space Flight Center. The monthly
sunspot number data set can be located at [39]. The monthly data from 1749 to 2009 were
decomposed into IMFs, then averaged to obtain annual resolution.
Total Solar Irradiance (TSI) is a direct measure of the solar output. Because TSI
measurements have only been prevalent since the mid-1970s via satellites, this investigation
chose to utilize a reconstructed TSI data set from 1749 to 2009. The proxy data used were
annual data from 1749 to 2009, obtained from the LASP Interactive Solar Irradiance Data
Center. The data can be downloaded at [34]. It is recognized that this data set was
reconstructed with the use of sunspot number data. Therefore, correlations between sunspot
number and the TSI data should be self-evident.
Global mean temperature is one of the most controversial and important data sets
that exist today. This investigation used monthly temperature data from 1880 to 2009 taken
from NASA's Land-Ocean Temperature Index, LOTI. The data can be found at [38]. The
data were decomposed into IMFs, then averaged to achieve annual resolution.
Yearly CO2 concentration data measured at Mauna Loa observatory were also used
for this investigation. Data were obtained from 1958 to 2010 from the NOAA Earth System
Research Laboratory c/o Dr. Pieter Tans at [40].
Results
The EMD method was utilized to extract the intrinsic cycles of the previously
mentioned datasets. Their decomposed intrinsic mode functions (IMFs) are displayed in
Figs. 5.1 through 5.3. The global mean temperature data consisted of monthly measurements,
while the TSI data and CO2 concentration data were recorded annually. Therefore, the IMFs
were decomposed for each data set, then averaged to provide annual temporal resolution.
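Averaging a monthly IMF to annual resolution amounts to taking the mean of each block of twelve samples. The sketch below assumes the record starts in January and simply drops any incomplete final year; the function name and the toy input are hypothetical.

```python
import numpy as np

def annual_means(monthly):
    """Average consecutive blocks of 12 monthly values; a trailing partial year is dropped."""
    monthly = np.asarray(monthly, dtype=float)
    n_years = len(monthly) // 12
    return monthly[: n_years * 12].reshape(n_years, 12).mean(axis=1)

# Toy input: three years of constant monthly values plus two extra months.
monthly_imf = np.r_[np.repeat([1.0, 2.0, 3.0], 12), 9.0, 9.0]
annual_imf = annual_means(monthly_imf)
```

Decomposing first and averaging afterwards, as described above, keeps sub-annual structure available to the sifting process before it is averaged away.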
Cycles in Data
The CO2 concentration data set is perhaps the most straightforward data set
decomposed by the EMD method. The decomposition, shown in Fig. 5.4, yielded three
IMFs, including high-frequency noise, an annual oscillation, and a steadily increasing trend.
[60] have previously decomposed CO2 concentration proxy data from 1880 to 2002
using EMD. They utilized annual data so they were unable to extract the yearly cycle as
shown in IMF 2. Instead, they extracted noise IMFs and a century long increasing trend. Fig.
5.5 is an enlarged plot of IMF 2, the annual oscillation. Notice how the signal is not a pure
sinusoid, for instance, at approximately the 1984 cycle. EMD is able to represent the actual
fluctuations of the signal, without forcing assumptions of linearity or stationarity [26][27].
Therefore, non-sinusoidal oscillations such as in IMF 2 can be extracted and analyzed in the
time-domain.
Apart from CO2 concentration, the EMD method was also applied to sunspot
number, total solar irradiance, and global mean temperature data sets as shown in Figs. 5.1
through 5.3. Instantaneous Frequencies (IF) were calculated for each IMF from the three
data sets using equation 3.4. These frequencies fluctuate over the entire data duration period.
The mean and standard deviation of the instantaneous frequencies were calculated and are
shown in Table 5.1.
The periods could also be approximated by equation 5.1, where Dur is the length of
the signal and ZC is the number of zero-crossings.
$$\text{Period} \approx \frac{2\,\mathrm{Dur}}{\mathrm{ZC}} \tag{5.1}$$
This calculation of the periods does not take into account the nonstationarity, or
frequency changes in the cycles. However, it does give an idea for the general cycle time
scales for each IMF. Both methods yielded nearly identical periods for the lower IMFs, as
shown in Table 5.2. The higher IMFs, which represent the longer periodic oscillations of the
signal, have greater differences because there were fewer cycles over which to average the fluctuations
in instantaneous frequency.
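The zero-crossing estimate of the period can be sketched as below. It assumes that each full cycle contributes two zero crossings, so that the period is approximately twice the record length divided by the number of crossings; the function name and the synthetic 11-year test signal are illustrative only.

```python
import numpy as np

def period_from_zero_crossings(x, duration):
    """Approximate mean period as 2 * duration / (number of zero crossings)."""
    s = np.sign(x)
    s = s[s != 0]                              # ignore samples that are exactly zero
    zc = int(np.sum(s[1:] * s[:-1] < 0))       # count sign changes
    return 2.0 * duration / zc

# Synthetic check: ten 11-year cycles sampled monthly over 110 years.
t = np.arange(1320) / 12.0
imf_like = np.sin(2 * np.pi * t / 11.0 + 0.3)
period = period_from_zero_crossings(imf_like, duration=110.0)
```

For an IMF with only a handful of cycles, a single extra or missing crossing changes this estimate noticeably, which is consistent with the larger discrepancies reported for the longer-period IMFs.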
The cycles of sunspot number have been studied extensively in the scientific
community. For an overview, see [53]. The most prominent cycle in sunspot data is the 11-year
Schwabe cycle shown as IMF 2 in Fig. 5.1. The third IMF shows a less uniform cycle
with a period between 13 and 16 years. IMF 4, which is the quasi-Hale IMF, exhibits a 22-year
cycle from approximately 1840 to 1940 but has longer cycles before and after this time
period. IMF 5 is approximately a 100-year cycle, which is known as the Gleissberg cycle.
Finally, IMF 6 is the trend or the lowest frequency component of the signal.
Total solar irradiance proxy data was also decomposed into 5 IMFs and a trend, as
shown in Fig. 5.2. IMF 2 has an approximately 11-year oscillation. This is not surprising
since the proxy data was partially reconstructed using solar sunspot number data. However,
it is interesting to note the time-dependent amplitude in IMF 2. The IMFs extracted using
EMD are not required to be constant amplitude or frequency [26][27]. The remaining IMFs
are longer oscillations, which will be shown to correspond to sunspot number IMFs.
The extracted IMFs from global mean temperature data are shown in Fig. 5.3. IMF 2
is approximately a 5-year oscillation and IMF 3 represents a quasi-11-year cycle. This will be
compared with the 11-year IMFs of sunspot number and TSI. IMFs 4 and 5 represent the
longer cyclic oscillations with mean periods of 17–24 and 58–65 years, respectively. [53] also
describes an 11-year climate oscillation, as well as large climate oscillations with periods of
approximately 20 and 60 years, using the Fourier spectral analysis. These were determined
from peaks in the frequency-domain spectra. The EMD method has expanded this research
by providing the ability to view the cyclic components in the time-domain.
IMF Comparisons
The variables have been decomposed into their IMFs, which represent oscillations at
various time scales. The IMFs can now be compared and contrasted.
CO2 concentration data was decomposed into an annual cycle and a trend. The
annual cycles were not resolved in the other variables because annually averaged data were
used. The trends, by inspection, are positively correlated.
Because the total solar irradiance proxy data was reconstructed partially using
sunspot data, it is not surprising that its IMFs are correlated with those of sunspot data. For
instance, Fig. 5.6 plots the two variables' IMFs together for comparison. Notice that the 11-year
cycle IMFs match very closely, as do the 100-year cycles. The middle IMFs
correlate for most of the time period; however, there is some disconnect from 1850 to 1900.
The lowest frequency components for each variable can be compared directly from Figs. 5.1
and 5.2. These are clearly well correlated. Correlation values for the IMFs are given in Table
5.3. Again, the correlation between solar irradiance and sunspot number was expected
because the TSI data were reconstructed using sunspot data.
The possibility of correlation between TSI data and global mean temperature has
been a widely debated topic. It is important to understand how the fluctuations in solar
irradiance affect the earth's climate and global mean temperature. Also, it is beneficial to
quantify how much effect fluctuations in solar irradiance have compared to other forcings
within the global climate system. In order to approach this topic, we will first compare
visually the different IMFs of TSI data and global mean temperature. This is comparing the
different periodic cycles inherent within each data set. Fig. 5.7 shows the comparison of
various IMFs from TSI data with those from global mean temperature data.
Fig. 5.7 shows that the data alternate between being correlated and not. The first
regime can be seen from 1880 to approximately 1945 where the fluctuations in solar
irradiance appear to not be correlated well with the global temperature fluctuations. In the
third plot down in Fig. 5.7, the two are out of phase by 180 degrees. The two variables in the
second plot appear to lock in phase at approximately 1940 and continue until 2009.
However, the first and third plots show phase locking between the two variables from
1970 to approximately 1995, and lose the correlation for subsequent time periods. Also, the
trends of TSI and temperature can be compared by inspection of Fig. 5.2 and Fig. 5.3.
The difference between these time periods can be seen graphically as well as analyzed
in terms of correlation coefficients. Tables 5.4 and 5.5 display the correlation coefficients
between the IMFs of TSI and globally averaged temperature. These correspond to two
different time periods, namely, 1880–1945 and 1945–2009. It is quite clear that from 1880 to
1945, there is small or negative correlation between the two signals for all time scales,
neglecting the trend. However, the time period 1945–2009 shows a dramatic increase in
correlation coefficient values. Tables 5.4 and 5.5 also show that the trends are well correlated
throughout the entire data duration.
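Correlation coefficients restricted to a time window, of the kind reported in Tables 5.4 and 5.5, can be computed as sketched below. The two series here are synthetic stand-ins (an 11-year sinusoid and a copy whose sign flips in 1945), not the TSI or temperature IMFs, and the function name is hypothetical.

```python
import numpy as np

def windowed_correlation(years, a, b, start, end):
    """Pearson correlation of a and b over start <= year < end."""
    mask = (years >= start) & (years < end)
    return float(np.corrcoef(a[mask], b[mask])[0, 1])

years = np.arange(1880, 2010)
imf_a = np.sin(2 * np.pi * years / 11.0)
imf_b = np.where(years < 1945, -imf_a, imf_a)   # out of phase early, in phase late

r_early = windowed_correlation(years, imf_a, imf_b, 1880, 1945)   # close to -1
r_late = windowed_correlation(years, imf_a, imf_b, 1945, 2010)    # close to +1
```

A single coefficient over the whole record would average these two regimes together and hide the change in phase relationship, which is why the windows are treated separately.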
Consider two oscillating processes that oscillate with different frequencies. It is
inevitable that they will reinforce one another during certain times and will be out of phase
during others. The EMD method demonstrates that while from 1880 to 1945 the variations
in solar output were not correlated with global temperature, between 1945 and 2009 they
were positively correlated. These correlations were found at a variety of time scales,
including approximately the 11-year, the 22-year, and the 100-year cycles.
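The statement that two oscillations of different frequency must drift in and out of phase can be illustrated with a short worked example. The two periods used, 9 and 11 years, are hypothetical values chosen for the arithmetic and are not taken from the data.

```latex
\begin{aligned}
x_1(t) &= \cos(2\pi f_1 t), \qquad x_2(t) = \cos(2\pi f_2 t),\\
\Delta\phi(t) &= 2\pi\,(f_1 - f_2)\,t,\\
T_{\text{realign}} &= \frac{1}{|f_1 - f_2|}
  = \frac{1}{\left|\tfrac{1}{9} - \tfrac{1}{11}\right|}\ \text{yr}
  = \frac{99}{2}\ \text{yr} = 49.5\ \text{yr}.
\end{aligned}
```

Under these assumptions the two cycles are in phase every 49.5 years and in antiphase halfway between, so a record of roughly 130 years would be expected to contain both correlated and anticorrelated intervals.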
Finally we compare the IMFs of global mean temperature with those from sunspot
number data. The results are plotted in Fig. 5.8. Also, all correlations are given in Tables 5.6
and 5.7 where the combination of IMFs of interest are highlighted in bold.
The IMFs appear to give very similar results as comparing global mean temperature
IMFs with those of total solar irradiance. Tables 5.6 and 5.7 show shifts in correlation
numerically. Notice that the correlations are mostly negative between 1880 and 1945, as
shown in Table 5.6. Between 1945 and 2009, however, the correlation values dramatically
increase, as shown in Table 5.7. For the first and third plots in Fig. 5.8, there is no correlation
from 1880 until approximately 1970, after which there is correlation until approximately
1995. For years after 1995, the correlation is reduced. For the second plot in Figure 5.8,
there is no correlation until approximately 1940. After 1940 there appears to be correlation
up until 2009. The third plot shows sunspot number and temperature out of phase from
1880 to 1945, as well as after 2000. However, between 1950 and approximately 1980, the
two signals are in phase.
Discussion
The Hilbert–Huang transform is a data analysis tool that is able to analyze
nonstationary data, which may be the result of nonlinear processes. Therefore, it is justified
to analyze various data sets to study the periodic cycles inherent in the data and to compare
different variables at different time scales.
By decomposing CO2 concentration into its IMFs, the different periodic
components could be analyzed in the time-domain. The CO2 concentration measurements
exhibited an annual cycle which, remarkably, has not changed much since 1958. For the last
50 years, the change in CO2 from annual minimum to maximum was 5.7±0.56 ppm as
calculated from the cycles in IMF 2. Superimposed upon this cycle is the long term trend,
which has increased approximately 75 ppm since 1958.
One of the most interesting results of this investigation is the identification of a
quasi-11-year cycle and quasi-22-year cycles within globally averaged temperature data. Also,
the EMD method showed that during particular time periods the quasi-11-year temperature
cycle was locked in phase with the cycles from total solar irradiance and sunspot number. It
seems intuitive that the dominant cycle in solar irradiance output, the 11-year cycle, would
directly affect the temperature at the earth. In fact, a number of studies have shown changes
within the troposphere, which are associated with solar fluctuations [10]. TSI and
temperature oscillations at longer time scales of 22 years and 65 years were also shown to be
correlated during these time periods. There have also been suggested correlations between
arctic-wide surface air temperature records and solar irradiance on decadal and multi-decadal
scales using wavelet analysis (Soon et al., 2009).
The magnitudes of the 11-year cycle fluctuations can be estimated empirically from
the IMFs. From Fig. 5.7, during the last five 11-year cycles, the average change in TSI from
solar minimum to solar maximum was 0.775±0.055 W/m2. For these same five 11-year
cycles, the average change in global mean temperature from minimum to maximum, as
calculated from Figure 5.7 was 0.101±0.012 °C.
It should be noted that decadal variations between 9 and 15 years in the temperature
records could be due to a variety of occurrences in addition to solar forcing. For instance,
volcanic eruptions and the Pacific Decadal Oscillation (PDO) both exhibit cycles at these
periodic scales and may be partially responsible for the resulting cycles present in the global
mean temperature data set.
Over the entire data duration, the 11-year Schwabe cycles remain relatively constant.
That is, over longer periods, they tend to cancel themselves out with relatively symmetric
fluctuations. To determine the net radiative forcing over longer time periods, the trends
extracted using the EMD method can be analyzed. From Fig. 5.3 the trend of globally
averaged temperature has increased approximately 0.44 °C since 1959. Looking at the trend
of TSI data in Fig. 5.2, the change in TSI from 1959 to 2010 was approximately 0.3 W/m2.
Note that this is a maximum estimate, because not all of the energy from the Sun will be
absorbed by the Earth. This can be compared with the forcing associated with an increase in
CO2 concentration over the same time period, which can be calculated using equation 5.2
$$\Delta F = 5.35 \ln\left(\frac{C}{C_0}\right) \ \mathrm{W/m^2} \tag{5.2}$$
Equation 5.2 was formulated using radiative transfer models to calculate the radiative
forcing due to a change in CO2 from some initial value to its present value (Myhre et al.,
1998). Solving equation 5.2 for the change in CO2 from 1958 to 2010, based upon the trend
in Figure 5.4 gives a radiative forcing of 1.13 W/m2. Therefore, while the short term and
long term fluctuations of total solar irradiance do produce radiative forcing upon the Earth,
the long term net radiative forcing is much smaller than the net forcing from increasing CO2
concentrations.
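The forcing estimate can be checked with a short calculation. The sketch below assumes that equation 5.2 takes the commonly quoted simplified form from Myhre et al. (1998), a coefficient of 5.35 W/m2 multiplied by the natural logarithm of the concentration ratio. The starting concentration of 315 ppm is an illustrative assumption; only the rise of approximately 75 ppm is taken from the text, and the function name is hypothetical.

```python
import math


def co2_radiative_forcing(c_ppm, c0_ppm, alpha=5.35):
    """Simplified CO2 forcing in W/m^2: alpha * ln(C / C0).

    alpha = 5.35 W/m^2 is the coefficient of the simplified
    expression attributed to Myhre et al. (1998).
    """
    return alpha * math.log(c_ppm / c0_ppm)


# Assumed illustrative starting concentration (ppm); the ~75 ppm
# increase since 1958 is the value quoted in the text.
c_1958 = 315.0
c_2010 = c_1958 + 75.0

forcing = co2_radiative_forcing(c_2010, c_1958)
print(f"CO2 forcing 1958-2010: {forcing:.2f} W/m^2")
```

With these assumed concentrations the result is close to the 1.13 W/m2 obtained from the trend in Figure 5.4, and several times larger than the upper estimate of approximately 0.3 W/m2 for the TSI trend.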
These estimates of forcing are not necessarily directly connected to absolute changes
in temperature. Multiple feedback mechanisms may exist, which complicate the processes by
which the Earth absorbs and retains energy.
The Hilbert–Huang Transform has been introduced as a relatively new spectral
analysis tool capable of analyzing the cyclic components of nonlinear and nonstationary data.
The Empirical Mode Decomposition method was used to decompose oscillatory signals of
total solar irradiance, sunspot number, global mean temperature, and CO2 concentration into
their Intrinsic Mode Functions. These IMFs exhibited time-dependent amplitudes and
frequencies. The IMFs were then analyzed and compared in the time-domain. Also,
empirical evaluations of radiative forcing from different periodic components of CO2
concentration and total solar irradiance were estimated. The net radiative forcing from
increasing solar irradiance was shown to be much smaller than the forcing due to increases
in CO2 during the last 50 years.
Figure 5.1 Sunspot number data set and its decomposed IMFs.
Figure 5.2 Total Solar Irradiance (TSI) measurements and their decomposed IMFs.
Figure 5.3 Global mean temperature and its decomposed IMFs.
Figure 5.4 CO2 concentration as measured from the Mauna Loa Observatory.
Figure 5.5 Subsection of IMF 2, the yearly cycle extracted from the CO2 data using
EEMD.
        SSN Mean(IF)  SSN Stdev(IF)  TSI Mean(IF)  TSI Stdev(IF)  T Mean(IF)  T Stdev(IF)
IMF 1   0.28          0.14           0.24          0.13           0.28        0.14
IMF 2   0.09          0.02           0.09          0.04           0.20        0.08
IMF 3   0.08          0.09           0.05          0.02           0.10        0.05
IMF 4   0.03          0.01           0.04          0.09           0.06        0.09
IMF 5   0.01          0.005          0.009         0.006          0.02        0.006
Table 5.1. Mean and standard deviation of instantaneous frequencies (1/yrs)
calculated using the Hilbert Transform.
        SSN Hilb  SSN ZC  TSI Hilb  TSI ZC  T Hilb  T ZC
IMF 1   3.6       3.4     4.2       4.1     3.6     3.0
IMF 2   11        11      11        11      5.0     5.5
IMF 3   13        16      20        19      10      10
IMF 4   37        37      28        52      17      24
IMF 5   93        104     113       104     58      65
Table 5.2 Periods (in years) calculated using Hilbert analysis and zero-crossing
method.
Figure 5.6 Comparison between TSI and sunspot number IMFs. The TSI data were multiplied by a factor of 100 in order to improve visibility.
Sunspot IMF1 IMF2 IMF3 IMF4 IMF5 IMF6
TSI 0.85 0.50 0.82 0.27 0.28 0.33 0.20
IMF1 0.21 0.54 0.26 0.04 -0.08 0.02 -0.02
IMF2 0.61 0.54 0.95 0.10 -0.01 -0.001 -0.03
IMF3 0.36 0.10 0.31 0.74 0.10 0.03 0.06
IMF4 0.26 -0.02 0.01 0.19 0.79 -0.05 0.03
IMF5 0.48 -0.01 0.03 0.01 0.31 0.84 0.25
IMF6 0.59 -0.03 -0.05 -0.01 -0.001 0.59 0.96
Table 5.3 Correlation coefficients (r) between total solar irradiance and sunspot number from 1749 to 2009.
Figure 5.7 Comparison of IMFs for global mean temperature and total solar irradiance.
TSI IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6
T 0.28 0.02 -0.03 -0.34 0.13 0.28 0.41
IMF 1 -0.13 -0.03 -0.08 -0.09 -0.04 -0.04 -0.10
IMF 2 -0.18 0.004 -0.02 -0.34 -0.07 -0.04 -0.14
IMF 3 0.09 0.02 -0.02 -0.40 0.30 0.10 0.13
IMF 4 0.11 0.07 0.08 0.04 -0.06 -0.26 0.26
IMF 5 0.77 0.03 0.02 0.02 0.20 0.66 0.86
IMF 6 0.67 0.01 -0.04 -0.04 0.27 0.37 0.87
Table 5.4 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1880 to 1945.
TSI IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6
T 0.13 0.01 0.09 0.39 0.27 -0.06 0.06
IMF 1 0.02 0.004 0.13 0.17 0.01 0.005 -0.04
IMF 2 -0.02 0.02 0.06 0.43 0.08 -0.10 -0.11
IMF 3 0.10 -0.003 0.03 0.08 0.67 -0.05 0.04
IMF 4 0.27 0.02 0.01 -0.05 0.13 0.59 0.08
IMF 5 -0.85 0.004 0.06 -0.006 -0.28 -0.81 -0.77
IMF 6 0.83 -0.04 -0.04 0.007 0.33 0.26 0.99
Table 5.5 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1945 to 2009.
Figure 5.8 Comparison of global mean temperature and sunspot number data IMFs. The temperature was multiplied by a factor of 500 to allow for visibility.
T IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6
Sunspot # 0.04 0.02 -0.03 -0.34 0.06 0.17 0.07
IMF 1 -0.14 -0.09 -0.11 -0.02 -0.04 0.01 -0.13
IMF 2 -0.22 -0.005 -0.01 -0.31 -0.15 -0.06 -0.17
IMF 3 -0.03 0.08 -0.02 -0.37 0.20 0.08 -0.07
IMF 4 -0.16 0.06 0.01 0.01 0.36 -0.42 -0.22
IMF 5 0.79 0.04 0.03 0.05 0.28 0.80 0.76
IMF 6 0.68 0.01 -0.04 -0.03 0.27 0.40 0.88
Table 5.6 Correlation coefficients (r) between sunspot number and global mean temperature from 1880 to 1945.
T IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6
Sunspot # -0.11 0.01 0.08 0.34 0.13 -0.14 -0.20
IMF 1 -0.13 -0.09 0.05 -0.05 -0.07 -0.11 -0.09
IMF 2 0.04 0.02 0.07 0.42 0.08 -0.03 -0.05
IMF 3 0.06 -0.005 0.02 0.09 0.68 -0.20 0.05
IMF 4 -0.02 0.03 0.02 -0.10 0.03 0.30 -0.17
IMF 5 -0.90 0.01 0.05 -0.005 -0.31 -0.64 -0.90
IMF 6 0.83 -0.04 -0.04 0.007 0.33 0.26 0.99
Table 5.7 Correlation coefficients (r) between sunspot number and global mean temperature from 1945 to 2009.
CHAPTER VI CHARACTERIZING SAMPLING ERRORS ASSOCIATED WITH
THE NEAR-SURFACE ENERGY BUDGET CLOSURE PROBLEM
The Energy Balance Problem
Conservation of energy at the earth’s surface, as defined by the balance of net
radiation and ground heat flux with the sum of turbulent sensible and latent heat fluxes, has
consistently not been satisfied experimentally [14][16][18][20]. Many studies have found the
net radiation and ground heat fluxes are consistently approximately 20% greater than the
turbulent fluxes [20]. This residual is often calculated as in equation 6.1,
$$R = (Q_{net} - G) - (H + E) \tag{6.1}$$
where R is the residual, Qnet is the net radiation, G is the soil heat flux, and H and E
are the sensible and latent heat fluxes, respectively, which shows the amount of energy
needed to balance the budget. Any lack of closure pertains not only to heat and moisture
measurements, but also those for trace gases such as carbon dioxide; the energy budget is
not closed for most of the FLUXNET sites, which measure flux of carbon dioxide
[3][13][55]. Accuracy of these measurements is pivotal for understanding the surface
exchange of greenhouse gases and quantifying carbon, as well as heat and moisture, cycling
over specific ecosystems.
A workshop was held in Grenoble, France in 1994 to address the problems resulting
in the lack of closure [19]. The workshop formed the basis for the EBEX-2000 (Energy
Budget Experiment in 2000) which was conducted 50 miles south of Fresno, California.
However, the experiment was not able to completely resolve the problems of
closure. More recently, in October 2009 in Thurnau, Germany, there was a panel
discussion about the energy budget closure problem which cited the current state of
knowledge, as well as areas where future research was needed [20]. They concluded the
following:
(1) Currently, the energy budget cannot be closed with experimental measurements.
(2) While previous studies have blamed the lack of closure on the high-frequency
response of meteorological instruments, this response has been shown not to have a
significant effect with the advent of newer and faster sampling systems [20].
(3) The primary issues resulting in the lack of closure are attributed to the Eddy
Covariance technique and the resulting miscalculation of the sensible and latent
heat fluxes, not the net radiation or soil heat flux [20].
(4) One of the main contributions to the lack of closure is the energy transport of
large, low-frequency contributions to the vertical component of the eddy
transport, which are not fully measured using traditional eddy covariance
methods [13][14][19][52]. These are generally due to heterogeneity of the land
surface near the measurement system. These low-frequency transport
mechanisms can be due to slowly moving convection cells, or by the passage of
clouds above the sensing instruments [20].
(5) For some tall tower measurements, these mechanisms can be fully measured by
increasing the averaging time or using wavelet analysis, resulting in energy
balance [14][45][52]. However, when measurements are made nearer the
surface, or near the surface of the roughness layer (i.e. above forest canopies),
the low-frequency oscillations are not fully measured. Therefore, no significant
amount of flux is measured from these low-frequency oscillations, at least for
averaging intervals between 30 and 240 minutes [19][20].
This section will specifically address the low-frequency contributions to the turbulent
fluxes using empirical mode decomposition and the Eddy Covariance method. We intend to
demonstrate that any finite measurement duration cannot fully capture all the low-frequency
oscillations within a realistic atmosphere. We will demonstrate that the errors within
turbulent fluxes, as calculated using the Eddy Covariance method, are partially due to
including undersampled low-frequency processes.
Other studies have come to similar conclusions; namely, using Ogive functions, they
suppose that low-frequency circulations must be responsible for missing flux [19][52]. We
will propose an alternative method to Ogive functions by determining the largest structure
that can be sufficiently sampled with a particular sampling duration. The EMD method,
then, provides a new method to view the frequency contributions to the total flux. The
contributions also demonstrate whether the processes have been sufficiently sampled.
To present this investigation, we will first introduce a relatively new spectral analysis
tool, Hilbert-Huang Transform, which is specifically designed to handle nonstationary data.
Much like wavelet analysis, it is a spectral analysis tool which extracts the frequency
contributions from an oscillatory signal. However, it does not require the use of pre-
determined basis functions, as in wavelet analysis. This relatively new method utilizes a
decomposition algorithm, called Empirical Mode Decomposition (EMD), which will be
introduced and used to decompose atmospheric wind components, temperature, and
humidity variables into their frequency components. Then we will quantify contributions to
the near-surface turbulent fluxes from each of these decomposed components. From this,
we will demonstrate and quantify errors due to undersampling low-frequency, nonstationary
oscillations when calculating the turbulent fluxes in the near-surface energy budget.
EMD as a Dyadic Filter
The EMD algorithm is able to sift out the intrinsic periodic components from
complicated oscillatory data. These components, called intrinsic mode functions (IMFs), are
time-domain functions which represent the local variability of the original signal at a
particular size (frequency) scale. There are limitations regarding the periodic components
EMD is able to extract from oscillatory data. For instance, EMD can only sift out periodic
components which differ in period by more than factors of two [15][57]. If a signal has two
or more superimposed periodic components which have periods closer than factors of two,
the extracted IMF will be the superposition of all the components within that dyadic range.
When dealing with turbulent atmospheric data, which can be thought of as a
collection of eddies existing at all size scales, the EMD algorithm acts as a dyadic filter. To
demonstrate this point, multiple data sets of 5 minute, 20 Hz temperature data were
decomposed using the EMD algorithm. Fig. 6.1 shows the calculated mean periods of the
IMFs as plotted on a log2 graph against IMF number. The mean periods were calculated
roughly by counting the number of zero-crossings of each IMF and dividing twice the
total length of the IMF by that count.
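The zero-crossing estimate can be sketched as follows. This is a minimal illustration rather than the code used for Fig. 6.1: the function name is hypothetical, and a pure sinusoid with a 4 second period stands in for an IMF of a 5 minute, 20 Hz record. Because each full cycle contains two zero-crossings, the mean period is twice the record length divided by the number of crossings.

```python
import math


def mean_period_zero_crossings(signal, sample_rate_hz):
    """Estimate the mean period (s) as 2 * duration / zero-crossings."""
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    if crossings == 0:
        raise ValueError("signal has no zero-crossings")
    duration_s = len(signal) / sample_rate_hz
    return 2.0 * duration_s / crossings


# Synthetic stand-in for one IMF: 5 minutes at 20 Hz, 4 s period.
rate_hz = 20.0
n_samples = int(300 * rate_hz)
imf = [
    math.sin(2.0 * math.pi * (i / rate_hz) / 4.0 + 0.3)
    for i in range(n_samples)
]

period_s = mean_period_zero_crossings(imf, rate_hz)
print(f"estimated mean period: {period_s:.2f} s")
```

For a real IMF the amplitude and frequency vary in time, so the value returned is only an average over the record.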
Fig. 6.1 demonstrates that the average period of each IMF is approximately twice the
preceding IMF. Since all time-domain components are additive, each IMF can be interpreted
as containing the sum of all oscillatory components within its particular dyadic range.
This also demonstrates that the problem of “mode mixing” is not influential in these
data. The problem called “mode mixing” is where a decomposed IMF contains a mixture of
different, sometimes drastically different, periodic scales [26][27]. Because the IMFs
displayed in Fig. 6.1 are clearly dyadic, each IMF is the sum of all frequency (periodic) scales
within the dyadic range of that IMF; in other words, there is no mixture of drastically
different modes. If there were “mode mixing”, Fig. 6.1 would not be linear, and each IMF
would not have the mean period which is twice the previous.
So, the EMD method acts as a dyadic filter when used with turbulent atmospheric
data. We can use the EMD method to decompose our atmospheric oscillatory data into a set
of IMFs whose different contributions to the turbulent flux can be calculated using a form
of the Eddy Covariance method. First, we will briefly give an overview of the Eddy
Covariance (EC) method.
Eddy Covariance Methods
Traditional Eddy Covariance Method
The eddy covariance (EC) technique is most commonly used for calculating the heat,
moisture, and CO2 fluxes near the earth’s surface [3][19]. The first two of these are the
sensible and latent turbulent heat fluxes which exist in the surface energy budget, and which
are consistently underestimated by approximately 20%. The EC method applies Reynolds
averaging, the method of separating a signal into its mean and fluctuating components, to
the near-surface mass balance, as shown in equations 6.2 through 6.4.
$$\frac{\partial k}{\partial t} + \nabla \cdot (\mathbf{u} k) = 0 \tag{6.2}$$
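$$k = \bar{k} + k' \,; \qquad \mathbf{u} = \bar{\mathbf{u}} + \mathbf{u}' \tag{6.3}$$
$$(\mathbf{u} k) = [(\bar{\mathbf{u}} + \mathbf{u}')(\bar{k} + k')] = (\bar{\mathbf{u}}\bar{k} + \bar{\mathbf{u}} k' + \mathbf{u}'\bar{k} + \mathbf{u}' k') \tag{6.4}$$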
Here k is some scalar, u is the wind velocity vector, overbar denotes the mean, and
prime denotes the fluctuation from this mean. Then, typically the mean of both sides in
equation 6.4 is calculated in order to make $\overline{\bar{\mathbf{u}} k'} = \overline{\mathbf{u}' \bar{k}} = 0$,
because by definition the average of fluctuating components will be zero.
Under particular assumptions of horizontal homogeneity, the vertical near-surface flux of
some scalar k is shown to be equal to the covariance of the fluctuating component of the
vertical wind velocity w and the fluctuating component of the scalar k, as in equation 6.5
$$\mathrm{cov}(w,k) = \overline{w'k'} = \frac{1}{N}\sum_{n=1}^{N}(w_n - \bar{w})(k_n - \bar{k}) \tag{6.5}$$
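The averaging step that leads to the covariance can be written out for the vertical wind component w. This is a sketch of the standard Reynolds-averaging argument, using only the overbar and prime notation defined above:

$$\overline{wk} = \overline{(\bar{w} + w')(\bar{k} + k')} = \bar{w}\,\bar{k} + \bar{w}\,\overline{k'} + \overline{w'}\,\bar{k} + \overline{w'k'} = \bar{w}\,\bar{k} + \overline{w'k'}$$

since $\overline{w'} = \overline{k'} = 0$. The turbulent part of the vertical flux is therefore carried entirely by the covariance term $\overline{w'k'}$.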
Overall, the EC method says that the vertical turbulent transport of sensible and
latent heat can be calculated by their covariances with w; just replace k in equation 6.5 with
temperature T and specific humidity, q, for sensible and latent heat, respectively. However,
these calculations are used within near-surface energy budget calculations and are thought to
be approximately 20% underestimated.
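A minimal sketch of the covariance calculation in equation 6.5 is given below. The function name and the synthetic series are assumptions made for illustration; they are not the processing code or the data used in this investigation. The result is a kinematic flux (K m/s for temperature), so conversion to W/m2 would still require multiplying by air density and specific heat.

```python
import math


def eddy_covariance(w, k):
    """Covariance of vertical wind w and scalar k over one averaging period."""
    n = len(w)
    if n == 0 or n != len(k):
        raise ValueError("w and k must be non-empty and of equal length")
    w_mean = sum(w) / n
    k_mean = sum(k) / n
    # Mean product of the fluctuations about the period means.
    return sum((wi - w_mean) * (ki - k_mean) for wi, ki in zip(w, k)) / n


# Synthetic example: 12 full cycles of a unit-amplitude vertical wind
# oscillation, and a temperature that partly follows it.
n_samples = 1200
w = [math.sin(2.0 * math.pi * i / 100.0) for i in range(n_samples)]
T = [
    290.0 + 0.5 * w[i] + 0.2 * math.cos(2.0 * math.pi * i / 100.0)
    for i in range(n_samples)
]

flux = eddy_covariance(w, T)
print(f"kinematic heat flux: {flux:.3f} K m/s")
```

In this example only the part of T that is in phase with w contributes to the flux; the cosine component is uncorrelated with w over whole cycles and contributes nothing.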
The EC method relies on a number of assumptions associated with instrumental
setup and data collection. For a full explanation of proper measurement techniques, see
[16][18][54]. For this investigation, we assume that all systematic errors have been minimized
prior to our data analysis.
EMD Eddy Covariance Method
In addition to calculating the total vertical transport of sensible and latent heat fluxes
near the earth’s surface with the EC method, many studies calculate the relative
contributions to the flux from various size (frequency) scales of eddies [31][45][52]. To do
this, they separate the signals into frequency components using spectral analysis tools such as
Fourier analysis or wavelet analysis. Likewise, the EMD method can be used to separate
signals into their periodic components, and analyze their relative contributions to the total
flux.
Remember that any signal can be decomposed with the EMD method into a finite
number of fluctuating IMFs and a residue, as shown in equation 6.6
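$$w(t) = \sum_{i=1}^{n} w_i(t) + r_w(t), \qquad k(t) = \sum_{j=1}^{n} k_j(t) + r_k(t), \tag{6.6}$$
where $w_i$ and $k_j$ are the IMFs of w and k, and $r_w$ and $r_k$ are the residues.
It is possible to calculate the contribution to the flux from particular IMFs.
This is equivalent to calculating contributions to the flux from particular frequency (size)
scales of eddies. Equation 6.7 shows the total flux,
$$\mathrm{cov}(w,k) = \mathrm{cov}\Big(\sum_{i} w_i, \sum_{j} k_j\Big) = \sum_{i}\sum_{j} \mathrm{cov}(w_i, k_j) \tag{6.7}$$
which is equal to the sum of all covariances from each of the IMF pairs, where the
IMF numbers are indexed by i and j. The total flux can therefore be written as the total sum
of an “IMF Covariance Matrix” as shown in equation 6.8
$$\mathrm{cov}(w,k) = \sum \begin{pmatrix} \mathrm{cov}(w_1,k_1) & \mathrm{cov}(w_1,k_2) & \cdots & \mathrm{cov}(w_1,k_n) \\ \mathrm{cov}(w_2,k_1) & \mathrm{cov}(w_2,k_2) & \cdots & \mathrm{cov}(w_2,k_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(w_n,k_1) & \mathrm{cov}(w_n,k_2) & \cdots & \mathrm{cov}(w_n,k_n) \end{pmatrix} \tag{6.8}$$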
where the total covariance cov(w,k) is equal to the sum of all the covariance
contributions from each of the IMF pairs.
The sum of the IMF Covariance Matrix
gives results identical to calculating the total covariance via Fourier analysis or by simply
calculating the covariance of w and k as in equation 6.5.
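The equality between the matrix sum and the direct covariance follows from the additivity of the components, and can be checked numerically. In the sketch below, sinusoids with dyadic periods stand in for the IMFs of w and k; this is an assumption made so that the example is self-contained, since no EMD implementation is invoked, and all function and variable names are hypothetical.

```python
import math


def covariance(x, y):
    """Covariance of two equal-length series about their means."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


def imf_covariance_matrix(w_imfs, k_imfs):
    """Element (i, j) is the covariance of w component i with k component j."""
    return [[covariance(wi, kj) for kj in k_imfs] for wi in w_imfs]


# Stand-in components with dyadic periods (in samples).
n_samples = 2000
periods = [16, 32, 64, 128]
w_imfs = [
    [math.sin(2.0 * math.pi * t / p) for t in range(n_samples)]
    for p in periods
]
k_imfs = [
    [0.5 * math.sin(2.0 * math.pi * t / p + 0.4) for t in range(n_samples)]
    for p in periods
]

# The full signals are the sums of their components.
w = [sum(values) for values in zip(*w_imfs)]
k = [sum(values) for values in zip(*k_imfs)]

matrix = imf_covariance_matrix(w_imfs, k_imfs)
matrix_sum = sum(sum(row) for row in matrix)
total = covariance(w, k)
print(f"sum of matrix: {matrix_sum:.6f}, direct covariance: {total:.6f}")
```

The two printed values agree to within floating-point rounding, whatever the individual off-diagonal elements happen to be.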
Orthogonality and Sampling Durations
So far, we have introduced the EMD method and shown how the sum of its IMF
Covariance Matrix is equivalent to the total near-surface vertical turbulent flux of either
sensible or latent heat. Next, we will discuss how the IMF Covariance Matrix can be related
to sampling durations and show how errors due to undersampling of fluxes can be
calculated.
The total covariance of w and k can be separated into contributions from orthogonal
IMF (i = j) and from nonorthogonal IMF (i ≠ j) components as shown in equation (6.9).
$$\mathrm{cov}(w,k) = \sum_{i=j} \mathrm{cov}(w_i, k_j) + 2\sum_{i<j} \mathrm{cov}(w_i, k_j) \tag{6.9}$$
The orthogonal terms are the diagonal terms in the IMF Covariance Matrix (first
sum on right hand side of equation 6.9) while the nonorthogonal terms are the off-diagonal
terms (second sum on right hand side of equation 6.9). The factor of two is related to the
fact that it is a symmetric matrix.
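The separation into diagonal and off-diagonal parts can be sketched as below. The matrix values are invented for illustration and the function name is hypothetical. The off-diagonal part is obtained here by summing every element with i ≠ j, which gives the same total as the factor-of-two form when the matrix is symmetric and does not rely on symmetry otherwise.

```python
def split_covariance_matrix(matrix):
    """Return (orthogonal, nonorthogonal) sums of a covariance matrix."""
    size = min(len(matrix), len(matrix[0]))
    orthogonal = sum(matrix[i][i] for i in range(size))
    total = sum(sum(row) for row in matrix)
    # Everything that is not on the diagonal is nonorthogonal.
    return orthogonal, total - orthogonal


# Hypothetical 3 x 3 IMF Covariance Matrix (arbitrary units).
example = [
    [4.0, 0.5, -0.2],
    [0.3, 2.0, 0.1],
    [-0.1, 0.2, 1.0],
]

orthogonal, nonorthogonal = split_covariance_matrix(example)
total = orthogonal + nonorthogonal
print(f"orthogonal fraction:    {orthogonal / total:.3f}")
print(f"nonorthogonal fraction: {nonorthogonal / total:.3f}")
```

Expressing both parts as fractions of the total covariance makes different sampling durations and different days directly comparable.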
For comparison, consider typical Fourier decompositions. Fourier analysis describes
an oscillatory signal as the sum of an infinite set of basis functions, which are weighted sine
and cosine functions with different constant frequencies from zero to the Nyquist frequency
[5][43][51]. The sine and cosine basis functions are, by definition, orthogonal when summed
(or integrated) over all space or time. Therefore, the Fourier analogue to the IMF Covariance
Matrix is an infinite matrix whose off-diagonal terms are all equal to zero.
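$$\mathrm{cov}(w,k) = \sum \begin{pmatrix} \mathrm{cov}(w_{f_1},k_{f_1}) & 0 & \cdots \\ 0 & \mathrm{cov}(w_{f_2},k_{f_2}) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \tag{6.10}$$
where $w_{f_n}$ and $k_{f_n}$ denote the Fourier components of w and k at frequency $f_n$.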
The Fourier Covariance Matrix, shown in equation 6.10, says that the only
contributions to the covariance come from diagonal (orthogonal) terms between identical
basis functions with frequencies from zero to the Nyquist frequency. However, in order for
the basis functions to be completely orthogonal, they must be infinite in extent [6][51].
Therefore, when dealing with finite length data, which all measured data are, Fourier analysis
assumes that the finite data length repeats infinitely [5][43][51]. This is why many scientists
use bell-tapering or other tapering methods at the beginning and end of the data set before
processing with Fourier analysis in order to avoid any discontinuous jumps between the end
and the beginning of the data.
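One common form of tapering can be sketched as follows. The text does not specify a particular taper, so the split cosine bell used here, the 10% taper fraction, and the function name are all assumptions made for illustration.

```python
import math


def cosine_bell_taper(data, fraction=0.1):
    """Smoothly reduce the first and last `fraction` of a series toward zero."""
    n = len(data)
    m = int(fraction * n)
    if m < 1 or 2 * m > n:
        raise ValueError("fraction must give 1 <= taper length <= n / 2")
    tapered = list(data)
    for i in range(m):
        # Weight rises from near 0 at the ends to near 1 in the interior.
        weight = 0.5 * (1.0 - math.cos(math.pi * (i + 0.5) / m))
        tapered[i] *= weight
        tapered[n - 1 - i] *= weight
    return tapered


# A constant series makes the shape of the taper itself visible.
flat = [1.0] * 100
tapered = cosine_bell_taper(flat)
print([round(v, 3) for v in tapered[:10]])
```

Because both ends are brought toward the same value, the periodic extension implied by Fourier analysis no longer contains a jump between the end and the beginning of the record.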
We have shown that the EMD method allows for nonorthogonal (off-diagonal)
contributions in the IMF Covariance Matrix. Since the IMF Covariance Matrix and the
Fourier analogue matrix give identical results when summed over all components, it is
natural to wonder what causes the differences in the matrices themselves. To explore the
different contributions from orthogonal and nonorthogonal components via the EMD
method, data sets of w and T were decomposed into their IMFs for two different sampling
durations: 5 minutes and 60 minutes.
First, the variance contributions of w and T were calculated for all the IMFs. Fig. 6.2
shows the variance contributions from IMF 5 of w and T for the 60 minute duration data
set. This is essentially looking at one particular row (in this case, row 5) of the IMF
Covariance Matrix between w,w and T,T.
The orthogonal contribution comes from the same IMF number (i = j), and is the
largest contribution in the row. None of the other IMFs contribute significantly to the
variance. Therefore, the orthogonal term dominates. This row, then, is considered
orthogonal.
The covariance was also calculated for w and T. Fig. 6.3 shows w IMF 5 and its
covariance contributions with all the T IMFs. This is equivalent to looking at row 5 in the
IMF Covariance Matrix between w,T.
Again, the major contribution comes from the orthogonal contribution, IMF 5
of w and T. However, there is energy spreading to the adjacent IMFs, mainly IMFs 3, 4, and
6. This is common when calculating covariances between different variables, such as
$\overline{u'w'}$, the momentum flux, $\overline{w'T'}$, the vertical sensible heat flux, and
$\overline{w'q'}$, the vertical latent heat flux. These contributions from adjacent IMFs, termed
energy spreading, are not treated as nonorthogonal; they are considered orthogonal,
pseudo-diagonal contributions. Therefore, Fig. 6.3 still represents an orthogonal row in
the Covariance Matrix.
Now for a nonorthogonal case. The 5 minute duration data sets were used and
decomposed into their respective set of IMFs. Fig. 6.4 shows the w IMF 10 and its
covariance contributions with all T IMFs.
IMF 10 shows contributions at a number of different IMFs. Notice there are even
significant negative contributions. This demonstrates a case where the row (row 10) has
significant nonorthogonal contributions.
Nonorthogonal contributions are due to the finite duration of the data, and the
undersampling of the lowest frequencies. When the short data sets of w and T were used (5
minutes) there were nonorthogonal contributions when looking at the higher (longer
periodicities) IMFs, specifically IMF 10. When the longer data sets of w and T were used (60
minutes) there were only orthogonal contributions, noting the occurrence of energy
spreading, when looking at the lower (shorter periodicities) IMFs, particularly IMF 5.
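The dependence of the cross terms on record length can be illustrated with a toy case. In the sketch below two pure sinusoids, with assumed periods of 40 s and 400 s, stand in for a short-period and a long-period IMF, and the sampling rate is reduced to 1 Hz to keep the example small; none of these values come from the SMEX data. The record lengths are chosen to end on a quarter cycle of the long component so that the cross term is not zero by coincidence.

```python
import math


def correlation(x, y):
    """Correlation coefficient of two equal-length series."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    return cov / math.sqrt(var_x * var_y)


def cross_term(duration_s, short_period_s=40.0, long_period_s=400.0,
               rate_hz=1.0):
    """Correlation between two sinusoids over a finite record."""
    n = int(duration_s * rate_hz)
    times = [i / rate_hz for i in range(n)]
    short = [math.sin(2.0 * math.pi * t / short_period_s) for t in times]
    long_ = [math.sin(2.0 * math.pi * t / long_period_s) for t in times]
    return correlation(short, long_)


r_short_record = cross_term(300.0)    # 0.75 cycles of the long component
r_long_record = cross_term(3500.0)    # 8.75 cycles of the long component
print(f"|r| over 300 s:  {abs(r_short_record):.4f}")
print(f"|r| over 3500 s: {abs(r_long_record):.4f}")
```

The two sinusoids are orthogonal only in the limit of an infinitely long record; over the short record, where less than one cycle of the long component is sampled, the cross term is markedly larger.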
To investigate this further, we analyzed 10 data sets (consecutive days) of 20 Hz
meteorological data from 1000 to 1230 local time, from the SMEX 2002 experiment in Iowa.
These data were chosen because they represent “ideal” turbulent conditions over corn fields.
Each data set (w, q, T) was broken into 11 sampling durations, each starting at 1000
local time, including lengths of 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, and 150 minutes. Each
data set was decomposed into their IMFs using the EMD method. Then, IMF Covariance
Matrices were calculated for each sampling duration.
Fig. 6.5 shows the absolute value of the nonorthogonal contributions (fraction of
total covariance) from the sensible (H) and latent (E) heat IMF Covariance Matrices as
summed for different sampling durations. Data are shown from site 161, as shown in the
top row, and site 152, as shown in the bottom row, from SMEX 2002. Notice that the
nonorthogonal contributions decrease as the sampling duration is increased, and that the
deviations between the different days decrease.
The physical reason for these nonorthogonal contributions at the short sampling
durations is that the longer periodic IMFs, which represent the periodic eddies within the
system of particular frequencies, are not sampled sufficiently. As the measurement duration
is increased, more and more cycles are sampled which causes the nonorthogonal terms to
decrease. Absolute values were used because the nonorthogonal contributions can
sometimes be negative, as will be shown shortly. Also, ten data sets were used to show that
this is not an uncharacteristic result which occurs within one particular data set; the result is
not dependent upon the random fluctuations of the EMD algorithm. Notice, though, that
even a 150 minute duration is unable to reduce the nonorthogonal contributions to zero.
Fig. 6.5 shows that the nonorthogonal terms in the IMF Covariance Matrices should
decrease for the IMFs which are becoming more sufficiently sampled. However, if the
sampling duration is increased enough, more IMFs (rows and columns in the matrix) will be
created because using a longer sampling duration can capture larger cycles; these cycles will
not be sufficiently sampled, which results in nonzero nonorthogonal terms. In an idealized
case, where the measuring duration is infinite, the Covariance Matrix will equal the analogue
Fourier Covariance Matrix (as shown in equation 6.10), where it is an infinite matrix and
where all the nonorthogonal contributions are zero.
Next we can use the idea of orthogonality to determine when a signal is
sufficiently sampled, and how this affects the measurement and calculations of turbulent
fluxes.
How Long is Long Enough?
A long-standing question within the atmospheric community is how to find the
appropriate duration to sufficiently sample an oscillatory process, or more pointedly, “How
Long is Long Enough?” (Lenschow et al. 1994). The question about appropriate length is
one which depends not only on the sampling duration, but also the period of the oscillatory
process being sampled.
The EMD has proven to be a unique frequency decomposition tool which can
provide insights into this question. From the EMD perspective, a process has been
sufficiently sampled when its nonorthogonal contributions have decreased to zero.
Therefore, a signal has been measured for enough cycles and its components are completely
distinguished from the other components embedded within the signal. We ask then, how
many cycles of a process need to be sampled in order for it to be sufficiently distinguished
from the other processes embedded in the signal?
Fig. 6.6 shows the orthogonal (blue) and nonorthogonal (red) fractions of the total
covariance as a function of sampling duration divided by period of IMF. The x-axis can be
explained as the number of cycles of a periodic process which is captured with a particular
sampling duration. For short sampling durations, or long periods, there is great scatter.
However, as the sampling duration divided by the period is increased, the nonorthogonal
fraction asymptotes to zero. Likewise, the orthogonal fraction asymptotes to one. The
region where these contributions reach their asymptotes gives a quantitative estimate for the number of
cycles required to sufficiently sample the process. From the 10 days of data used for this test,
the number of cycles required was approximately 7. Physically, this means that it takes 7
cycles to sufficiently distinguish one periodic process from another when the two are
embedded within the same signal.
This can be used to determine the longest periodic oscillations sufficiently sampled
for a given sampling duration. If a 30 minute sampling duration is used, the longest periodic
process that will be sufficiently sampled will have a period of approximately 4 minutes,
assuming that it takes approximately 7 cycles to be sufficiently sampled. This explains why
typical Ogive functions do not display significant changes in flux estimates when increasing
sampling durations in small increments. In order to sufficiently sample a 30 minute process,
a sampling duration of at least 210 minutes is necessary.
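The arithmetic behind these two statements is simple enough to write down directly. The sketch below assumes the empirical value of approximately 7 cycles found from the 10 days of data; the function names are hypothetical.

```python
def longest_resolved_period(duration_min, cycles_required=7.0):
    """Longest period (minutes) sampled for at least `cycles_required` cycles."""
    return duration_min / cycles_required


def required_duration(period_min, cycles_required=7.0):
    """Sampling duration (minutes) needed to capture `cycles_required` cycles."""
    return period_min * cycles_required


print(longest_resolved_period(30.0))  # about 4.3 minutes
print(required_duration(30.0))        # 210.0 minutes
```

The same two functions can be applied to any other sampling duration or cycle requirement, for example if a different data set suggested a number of cycles other than 7.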
The EMD method has created a tool which can be used to determine whether a
signal is sufficiently sampled. We now show that the errors in the turbulent fluxes are
partially due to the undersampling of the lowest frequency components. Fig. 6.8 shows the
nonorthogonal, orthogonal, and total contributions to the covariance of wT for site 161.
Each subplot is a different sampling duration in the following order: 5, 10, 15, 20, 30, 45, 60,
80, 100, 120, 150 minutes. Notice that the nonorthogonal contributions are typically found
in the higher IMFs, which represent lower frequency oscillations. As the sampling duration
is increased, as you can see by looking through the subplots, the high-frequency components
become more sufficiently sampled; however, the low-frequency components still have large
fluctuations. These fluctuations are random errors associated with undersampling the lowest
frequencies.
We have used the EMD method to develop a tool which has proven useful to
identify the largest periodic structure which can be sampled sufficiently with a particular
sampling duration. Therefore, the EMD method is able to determine which periodic
contributions to the total covariance contain random errors due to undersampling, and
which do not. However, it has been typically thought that the turbulent fluxes have always
been underestimated [16][18][20]. While the EMD method shows that random errors can
occur due to undersampling, it does not explain why the turbulent fluxes are consistently
underestimated. Instead, any undersampled IMF will contribute either positively or
negatively to the flux, causing errors.
Conclusions
This work has introduced a relatively new spectral analysis tool called Hilbert-Huang
Transform and has utilized its empirical mode decomposition algorithm to decompose
meteorological data into their intrinsic periodic oscillations. By using EMD as a dyadic filter
for meteorological data, we have calculated the contributions to near-surface fluxes from
different frequency components and constructed the idea of an IMF Covariance Matrix for
calculating near-surface turbulent fluxes.
This investigation also determined an approximate estimation for the number of
cycles needed to sufficiently distinguish a process among other embedded processes within a
signal. By recognizing that nonorthogonal contributions are evidence of undersampled
processes, the EMD method can be used to determine which of the frequency components
that contribute to the calculated turbulent flux have been sampled sufficiently.
While the method determines which periodic processes are undersampled, it does
not show that the errors due to undersampling are always negative. Rather, they occur
randomly, contributing positive and negative contributions to the total flux.
Further research should be performed to compare the nonorthogonal components
with direct calculations for many different data sets, including studies with small or large
energy budget residuals. Also, the nonorthogonal contributions could be compared to other
meteorological values such as u* or stability parameters in order to determine how and if
they are related.
Figure 6.1 Dyadic nature of EMD when applied to turbulence
Figure 6.2 Variance contributions from IMF pairs for 60 minute data sets of vertical
wind velocity and temperature
Figure 6.3 Covariance contributions from IMF pairs of vertical wind velocity and
temperature
Figure 6.4 Covariance contributions from w IMF 10 and all T IMFs
Figure 6.5 Absolute value of the nonorthogonal fraction of the total covariance from the IMF Covariance Matrices as calculated for 10 days. The top two plots show SMEX 2002
data from Site 161. The bottom two plots show SMEX 2002 from Site 152.
Figure 6.6 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wT plotted against essentially the number of cycles sampled, as defined by the
sampling duration divided by the period of the process (in this case, an IMF).
Figure 6.7 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wq plotted against essentially the number of cycles sampled, as defined by the
sampling duration divided by the period of the process (in this case, an IMF).
Figure 6.8 Orthogonal (blue), nonorthogonal (red), and total (black) contributions from each IMF for the sensible heat flux as calculated from Site 161 from SMEX 2002.
Each subplot is a different sampling duration, going from left to right and top to bottom in the following order: 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, 150 minutes.
CHAPTER VII AN IMPROVED EEMD ALGORITHM
Motivation
There are a number of Empirical Mode Decomposition (EMD) algorithms available
today. These include commercial software called the Hilbert-Huang Transform Data
Processing System (HHT-DPS), which was developed by Norden Huang at NASA and is
available through NASA's website. There are also publicly available MATLAB codes by Patrick
Flandrin [22] and R code by Kim and Oh [32], which extract IMFs from a given input data series.
However, to our knowledge no investigation has used these algorithms to process
discontinuous data. For example, the EEMD algorithm introduced by [58] will not run when
there are gaps, or NaNs, in the input data.
Instruments often fail in the field, resulting in gaps within the data. These
gaps prevent the EMD algorithm from properly sifting through the data. Typically, scientists
fill data gaps with interpolated values, such as the mean of the surrounding data points.
This may be useful for small data gaps. However, it assumes that the data remain constant
during the gap, and so it is insufficient for larger gaps in fluctuating data. In the spirit
of a local and adaptive decomposition tool, it is important to manipulate the data as little as
possible, and merely describe the data that do exist. This investigation suggests an
improvement to the Ensemble Empirical Mode Decomposition (EEMD) algorithm which
allows for gaps in the input data. The implications of applying EEMD to discontinuous data
with gaps of varying size will be discussed. We also suggest an error reduction technique
which extracts IMFs that are more locally accurate.
Ensemble Empirical Mode Decomposition
This investigation uses the Ensemble Empirical Mode Decomposition (EEMD)
algorithm, which was pioneered by Wu and Huang [58]. The algorithm uses the original EMD sifting
method, which is described fully in [26][27]. An overview of the EEMD algorithm follows:
1. Add finite amplitude noise to the original input signal.
2. Decompose the signal into a finite set of intrinsic mode functions (IMFs) using the
original EMD sifting method.
3. Repeat steps 1 and 2 with different noise realizations of the same standard deviation.
4. Average the ensemble of extracted IMFs to average out the noise and obtain their
mean IMFs.
A complete description of EEMD can be found in [58]. The standard deviation of
random noise which is added to the original data before decomposition can be specified by
the user. For this investigation, we used a standard deviation of 0.2.
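The four steps above can be sketched in Python as follows. This is a minimal illustration, not the MATLAB code of [58]: the function name `eemd_sketch`, the `decompose` argument (a stand-in for any EMD sifting routine that returns a list of IMFs), the trial count, and the choice to treat 0.2 as an absolute noise standard deviation are all assumptions made for the example.

```python
import random


def eemd_sketch(signal, decompose, n_trials=100, noise_std=0.2, seed=0):
    """Average the IMFs obtained from noise-perturbed copies of `signal`.

    `decompose` is a hypothetical, user-supplied EMD routine: it takes a list
    of floats and returns a list of IMFs, each as long as the input.
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    rng = random.Random(seed)
    n = len(signal)
    sums = None
    for _ in range(n_trials):                      # step 3: repeat
        # Step 1: add finite-amplitude Gaussian noise to the input.
        noisy = [x + rng.gauss(0.0, noise_std) for x in signal]
        # Step 2: decompose the noisy signal into IMFs.
        imfs = decompose(noisy)
        if sums is None:
            sums = [[0.0] * n for _ in imfs]
        # Assumption: every trial returns the same number of IMFs.
        for acc, imf in zip(sums, imfs):
            for i, value in enumerate(imf):
                acc[i] += value
    # Step 4: average the ensemble so the added noise cancels out.
    return [[value / n_trials for value in acc] for acc in sums]
```

The decomposition is passed in as a function so that the same wrapper could, in principle, drive either a standard sifting routine or one adapted to data gaps.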
To accommodate discontinuous data, the MATLAB version of the
EEMD algorithm by Zhaohua Wu [58] was modified. The sifting
process, which fits spline functions to the local maxima and minima of the signal, must
be performed on each individual continuous data segment. When the algorithm encounters a
data gap, the splines must be halted, and a new set of piecewise spline functions begins on
the next segment of continuous data.
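Locating the continuous segments is the first step of such a modification. The sketch below is illustrative only and is not taken from the modified MATLAB code; the function name `continuous_segments` and the convention that gaps are marked by NaN values are assumptions. It returns the index ranges on which envelope splines would be fit independently.

```python
import math


def continuous_segments(data):
    """Return (start, stop) index pairs for each run of non-NaN samples.

    `stop` is exclusive, so data[start:stop] is one continuous segment.
    """
    segments = []
    start = None
    for i, x in enumerate(data):
        if math.isnan(x):
            if start is not None:       # a gap begins: close the open segment
                segments.append((start, i))
                start = None
        elif start is None:             # first valid sample after a gap
            start = i
    if start is not None:               # close a segment that runs to the end
        segments.append((start, len(data)))
    return segments
```

For a series with one interior gap this yields two index ranges, and the upper and lower envelopes of each sifting pass would then be built separately on each range.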
Figure 7.1 shows the original data and its decomposed IMFs, as well as the decomposed
set with artificial data gaps created.
The improvement is simple but powerful. The majority of the discontinuous IMFs
seem to replicate the original IMFs away from the gap section. For instance, look at
approximately Arbitrary Time 90 for IMF2. The local oscillations are still captured even with
the gap present. The two decompositions are not identical, however. This investigation will
now quantify the errors associated with decomposing discontinuous data and suggest
ways to reduce them.
Errors Due to Data Gaps
In order to assess the abilities of EEMD applied to discontinuous data, we
present an error analysis comparing an original IMF with a discontinuous IMF. Equation 1
shows the root mean square error equation.
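\[
E_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{IMF}^{\mathrm{orig}}_{i} - \mathrm{IMF}^{\mathrm{gap}}_{i}\right)^{2}} \qquad (1)
\]
Here the superscripts denote the IMF extracted from the original data and from the discontinuous data, and N is the number of data points compared.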
The total error for a single decomposition, then, is the sum of the rms errors from all
IMFs.
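The error measure described above can be sketched as follows. The function names `rms_error` and `total_error` are hypothetical, and skipping samples that fall inside a gap (NaN in the discontinuous IMF) is an assumption, since the text does not state how those points are treated.

```python
import math


def rms_error(imf_ref, imf_gap):
    """Root mean square difference between two IMFs of equal length.

    Assumption: samples where the discontinuous IMF is NaN are skipped.
    """
    squared = [(a - b) ** 2 for a, b in zip(imf_ref, imf_gap)
               if not math.isnan(b)]
    if not squared:
        raise ValueError("no overlapping valid samples to compare")
    return math.sqrt(sum(squared) / len(squared))


def total_error(imfs_ref, imfs_gap):
    """Sum of the rms errors over all pairs of corresponding IMFs."""
    return sum(rms_error(a, b) for a, b in zip(imfs_ref, imfs_gap))
```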
It is interesting to compare the rms error with the size of the gaps within the data. Figure
7.2 shows the error calculated for a number of data gap sizes. For each case, a data gap was
artificially created in the exact middle of the data, with a length equal to some percentage of
the original data length. Then, the IMFs were decomposed and compared with the IMFs
from the original data.
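A centered artificial gap of this kind could be created as in the sketch below. The function name `insert_center_gap`, the use of NaN as the gap marker, and the rounding of the gap length are assumptions made for illustration.

```python
import math


def insert_center_gap(data, fraction):
    """Return a copy of `data` with a NaN gap centered in the series.

    `fraction` is the gap length as a fraction of the data length,
    for example 0.2 for a gap covering 20% of the record.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    n = len(data)
    gap_len = int(round(n * fraction))
    start = (n - gap_len) // 2          # center the gap in the record
    out = list(data)
    for i in range(start, start + gap_len):
        out[i] = math.nan
    return out
```

With a 400-point series and a fraction of 0.2, this produces an 80-point gap, which matches the 80-point, roughly 20% case examined later in this chapter.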
The errors, as calculated by equation (1), are shown in Figure 7.2 for the first 6 IMFs
extracted using the new discontinuous EEMD algorithm. Notice that the errors increase
with increased gap size, as expected. Also, the highest-frequency IMFs (those with low IMF
numbers) have the smallest differences from the original data. As the IMF number increases,
which corresponds to lower-frequency IMFs, the errors increase more quickly with increasing
data gap size. This shows that the algorithm works better with high-frequency IMFs.
Another question is where the errors are largest. Figure 7.4 shows the errors
plotted against time for all IMFs with a gap size of 80 points, which is approximately 20% of
the total data.
Notice that the high-frequency IMFs have the largest errors near the endpoints of
the gaps. The low-frequency IMFs also have errors, but they are not specifically located near
the gap endpoints.
A common problem with the original EMD sifting algorithm is the so-called
end-effect error [26][44][58]. These errors, which have been well studied, have
traditionally occurred at the endpoints of the data, because the first or
second derivatives required for spline fitting are unavailable there. The endpoints, then,
have large fluctuations which do not represent the real signal. For our particular algorithm
dealing with discontinuous data, these end-effect errors occur not only at the beginning and
end of the input data, but at the start and end of every data gap. While the differences in the
low-frequency IMFs are primarily due to the size of the data gap, the high-frequency IMFs
have differences primarily due to gap end-effect errors. Therefore, in order to reduce the
errors from the high-frequency IMFs, we can use traditional end-effect mitigation tools as
described in the subsequent section.
Error Reduction Methods
There are a number of investigations which have dealt with end-effect errors
[26][44][58]. Therefore, it may be possible to utilize these end-effect mitigation tools in order
to decrease the errors in the high-frequency IMFs due to data gaps.
The authors of [44] suggest a mirror extension technique to lower the errors
due to end effects. This would keep the splines from varying wildly at the ends of
the gaps.
The mirroring technique used in this investigation is now reviewed. When a data gap
is encountered, it is split into two sections. The first section is filled by the mirror image of the
data directly before the gap. The second section is filled by the mirror image of the data
directly after the gap. The amount of mirroring needed is dependent on the gap size. The
result is a continuous data set. The traditional EEMD algorithm is then used to decompose
the data into its IMFs. Once decomposed, the data gaps are recreated by removing the
mirrored data.
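The mirroring procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in this investigation: the function names `mirror_fill` and `restore_gaps` are hypothetical, gaps are assumed to be marked by NaN, the reflection is assumed to repeat the sample at the gap edge, and for odd gap lengths the extra sample is assigned to the first section.

```python
import math


def mirror_fill(data):
    """Fill each NaN gap by reflecting the data on either side of it."""
    data = list(data)
    out = list(data)
    n = len(data)
    i = 0
    while i < n:
        if not math.isnan(data[i]):
            i += 1
            continue
        g0 = i
        while i < n and math.isnan(data[i]):
            i += 1
        g1 = i                          # the gap occupies indices g0 .. g1-1
        first = (g1 - g0 + 1) // 2      # samples filled from the left side
        second = (g1 - g0) - first      # samples filled from the right side
        before = data[g0 - first:g0] if g0 >= first else []
        after = data[g1:g1 + second]
        if (len(before) < first or len(after) < second
                or any(math.isnan(v) for v in before + after)):
            raise ValueError("not enough data beside the gap to mirror")
        out[g0:g0 + first] = before[::-1]    # mirror image of data before gap
        out[g1 - second:g1] = after[::-1]    # mirror image of data after gap
    return out


def restore_gaps(imf, data):
    """Recreate the gaps in a decomposed IMF by removing the mirrored samples."""
    return [math.nan if math.isnan(d) else v for v, d in zip(imf, data)]
```

After `mirror_fill` the series is continuous and can be passed to an ordinary EEMD routine; `restore_gaps` is then applied to each resulting IMF so that only the originally measured time steps are retained.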
To test the effectiveness of this technique with discontinuous data, two gaps were
created in the continuous data. Three different decompositions were performed. The
first one was a traditional EEMD algorithm decomposing the original data without gaps.
The second was the discontinuous EEMD algorithm as applied to the data set without
mirroring. The third was the discontinuous EEMD algorithm as applied to the data which
had undergone the mirroring technique.
Figure 7.5 compares the three decompositions. This verifies, at least
visually, the effectiveness of the mirroring technique in reducing end-effect errors near the
endpoints of the gaps. Also, the low-frequency IMFs more closely match the original
decomposition.
Figure 7.6 shows the relative error from each IMF as compared between the
discontinuous EEMD algorithms with and without mirroring applied.
The first IMF is the actual data and can be ignored. For all the other IMFs, the
mirroring technique greatly reduces the error in the decompositions. Therefore, the
processing of discontinuous data is greatly improved by using the mirroring technique.
Discussion
Overall, this investigation presents a new version of the Ensemble Empirical Mode
Decomposition (EEMD) algorithm which is now applicable to discontinuous data. For short
gap durations, the errors are small and the decomposition is locally representative. A
mirroring technique was utilized to improve the discontinuous decomposition. This makes
for a more local and adaptive decomposition of data which may contain one or more gaps.
Further research should be pursued which utilizes neural networks or prediction models to
fill gaps and improve on the reduction of errors.
Figure 7.1 Original data decomposed into its IMFs as well as the IMFs decomposed from discontinuous data
Figure 7.2 Error as defined by the summed differences between the discontinuous and continuous extracted IMFs, plotted against data gap size.
Figure 7.3 Errors plotted as a function of frequency of IMFs. The errors are primarily in the low-frequency IMFs
Figure 7.4 Errors plotted as a function of time. For the high-frequency IMFs, the errors occur largely near the gap endpoints.
Figure 7.5 Comparison of three different decompositions of data. The original signals (black) contain no gaps. The red signals are the decomposed IMFs from the discontinuous EEMD algorithm, and the blue signals are the discontinuous EEMD
algorithm used after a mirroring technique was performed.
Figure 7.6 Comparison of relative error associated with including or not including the mirror technique when using the discontinuous EEMD decomposition.
CHAPTER VIII SUMMARY
This dissertation focused on the theory, application, and development of a relatively
new spectral analysis tool called the Hilbert-Huang Transform. While it is an empirical tool, its
power lies in its versatility, where it may be applied to virtually any oscillating data signal,
whether nonlinear or nonstationary.
While some of the data sets analyzed in this thesis are well-known and have been
studied extensively using other data analysis tools, the contribution of this thesis is the
development of the tools and techniques related to the Hilbert-Huang Transform and how it
can be applied to different types of data.
First, using sunspot data, the periodic components were extracted using HHT and
compared to well-known processes. The results were shown to give more local descriptions
of the frequency components than traditional spectral analysis tools.
Next, the periodic components were shown useful when compared to one another in
the time domain. This provided new methods for analyzing frequency components of
different processes, embedded within a signal, in the time domain. The HHT was used to
calculate the similarity between the relative amplitudes and phases of two nonstationary
processes. This created new techniques which can be used on other nonstationary data.
The pivotal characteristic of the EMD method is that it acts as a dyadic filter bank in
the frequency domain. That is, it is unable to separate fluctuations whose periodicities
differ by less than a factor of 2. This limits the cycles EMD is able to sift out from
fluctuating signals. The dyadic nature has been demonstrated on turbulent wind velocity,
temperature, and humidity data.
HHT was also used to approach the problem of energy budget closure near the
earth’s surface from a completely new viewpoint. The orthogonality of the extracted
components using the EMD method was shown to be related to whether or not the
internal oscillations were sufficiently sampled. This provides researchers with a tool to justify
that the covariance contributions from various frequency components have been sufficiently
sampled.
Finally, a modification to the EMD algorithm has been presented which allows for
data gaps within the input signal. As most real data does contain gaps during some duration,
this expands the potential applications of the EMD method greatly. Problems with the
algorithm have been discussed, including end-effect errors due to under-defined
interpolation functions. A mirroring technique has been shown to reduce these end-effect
errors. Therefore, this new algorithm can accurately and adaptively extract
the periodic components embedded within discontinuous data. This improvement is
essential to allowing HHT to work with discontinuous data, thereby making the tool much
more adaptable to all types of data.
Overall, this dissertation has provided an in-depth analysis of a new tool, and
has strengthened the tool itself. Future studies will further broaden its applicability to new
problems, and will attempt to strengthen the theoretical foundation on which it stands. Real
data is inherently messy; it is noisy, nonstationary, and intermittent. This thesis has taken a
step towards describing this messy reality more adaptively and efficiently.
REFERENCES
[1] Attoh-Okine, N. O. 2005: Perspectives on the Theory and Practices of the Hilbert-Huang Transform. In The Hilbert-Huang Transform in Engineering, edited by N.E. Huang and N.O. Attoh-Okine, 281-305. Taylor & Francis.
[2] Aubinet et al. 2000: Estimates of the annual net carbon and water exchange of forest: The EUROFLUX methodology. Adv Ecol Res. 30, 113–175.
[3] Baldocchi et al. 2001: FLUXNET: A new tool to study the temporal and spatial variability of ecosystem-scale carbon dioxide, water vapor, and energy flux densities. Bull Amer Meteorol Soc. 82, 2415-2434.
[4] Balocchi, R. 2004: Deriving the respiratory sinus arrhythmia from the heartbeat time series using empirical mode decomposition, Chaos, solitons, and fractals, 20, 171-172.
[5] Brigham, E. 1988: The Fast Fourier Transform and its Applications, Prentice Hall, Englewood Cliffs, NJ.
[6] Byron F.W., and Fuller R.W. 1992: Mathematics of Classical and Quantum Physics, Dover Publications, Inc., New York.
[7] Chen, Q., N.E. Huang, S. Riemenschneider, and Y. Xu. 2006: A B-spline approach for empirical mode decompositions. Advances in computational mathematics, 24, 1-4, 171.
[8] Chui, C.K. 1992: An Introduction to Wavelets. Academic Press, Boston, MA.
[9] Cohen, L. 1995: Time-Frequency Analysis, Prentice-Hall, Englewood Cliffs, NJ.
[10] Coughlin, K.T. & K.K. Tung. 2004: 11-year solar cycle in the stratosphere extracted by the empirical mode decomposition method. Advances in space research. 34, 2, 323.
[11] Duffy, D.G. 2004: The application of Hilbert-Huang transforms to meteorological datasets. Journal of Atmospheric and Oceanic Technology. 21, 4, 599.
[12] Echeverría, J.C., J.A. Crowe, M.S. Woolfson, and B.R. Hayes-Gill. 2001: Application of empirical mode decomposition to heart rate variability analysis. Medical biological engineering computing. 39, 4, 471.
[13] Feigenwinter C., Bernhofer C., and R. Vogt. 2004: The influence of advection on the short term CO2-budget in and above a forest canopy. Bound.-Layer Meteor. 113, 201–224.
[14] Finnigan JJ, Clement R, Malhi Y, Leuning R, and H.A. Cleugh. 2003: A re-evaluation of long-term flux measurement techniques. Part I: Averaging and coordinate rotation. Bound.-Layer Meteor. 107, 1–48.
[15] Flandrin, P., Rilling, G., and Goncalves, P. 2004: Empirical mode decomposition as a filter bank. IEEE Signal Processing Letters. 11, 2, 112.
[16] Foken T., S.P. Oncley. 1995: Results of the workshop “Instrumental and methodical problems of land surface flux measurements.” Bull Am Meteorol Soc. 76, 1191–1193.
[17] Foken T., Wichura B., Klemm O., Gerchau J., Winterhalter M., and T. Weidinger. 2001: Micrometeorological measurements during the total solar eclipse of August 11, 1999. Meteorologische Zeitschrift. 10, 171-178.
[18] Foken T., Gockede M., Mauder M., Mahrt L., Amiro B.D., and J.W. Munger. 2004: Post-field data quality control. In Handbook of micrometeorology: A guide for surface flux measurement and analysis. Lee X, Massman WJ, Law B. Kluwer, Dordrecht. 181–208.
[19] Foken T., Wimmer F., Mauder M., Thomas C., and C. Liebethal, 2006: Some aspects of the energy balance closure problem. Atmos Chem Phys Discuss. 6, 3381.
[20] Foken T., Aubinet M., Finnigan J.J., Leclerc M.Y., Mauder M., and U. Kyaw Tha Paw. 2011: Results Of A Panel Discussion About The Energy Balance Closure Correction For Trace Gases. Bull Amer Meteor Soc. 92, ES13–ES18.
[21] Gabor, D. 1946: Theory of Communication. Proc. IEEE Part III, 93, 26, 429-457.
[22] Gabriel Rilling. Empirical Mode Decomposition. http://perso.ens-lyon.fr/patrick.flandrin/emd.html (accessed Nov 2, 2011).
[23] Griffiths, D.J. 2005, Introduction to Quantum Mechanics, 2nd Edition. Pearson Ed. Intl. Prentice Hall, Upper Saddle River, NJ.
[24] Haar, A. 1910: Zur theorie der orthogonalen funktionensysteme. Mathematische Annalen. 69, 3, 331.
[25] Holder H.E., A.M. Bolch, and R. Avissar. 2009: Using the Empirical Mode Decomposition (EMD) method to process turbulence data collected on board aircraft. Submitted to J. Atmos. Ocean. Tech. http://hdl.handle.net/10161/1074
[26] Huang, N.E., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., Yen, N., Tung, C. and Liu, H. 1998: The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc.. 454, 1971, 903.
[27] Huang, N.E., Z. Shen, S.R. Long. 1999: A new view of nonlinear water waves: The Hilbert spectrum. Annual Review of Fluid Mechanics. 31, 1, 417.
[28] Huang Y., F.G. Schmitt, Z. Lu, Y. Liu. 2007: Empirical mode decomposition analysis of experimental homogeneous turbulence time series. Colloque GRETSI, 11-14 September, Troyes, http://documents.irevues.inist.fr/handle/2042/17539
[29] Huang, N.E. and Z. Wu. 2008: A review on Hilbert-Huang transform: Method and its applications to geophysical studies. Reviews of geophysics. 46, 2.
[30] Islam, M.K., M.S. Rahman, S. Akimasa, P. Banik. 2006: Empirical mode decomposition analysis of climate changes with special reference to rainfall data. Discrete Dynamics in Nature and Society. 2006.
[31] Katul et al. 2001: Multiscale analysis of vegetation surface fluxes: from seconds to years. Adv in Water Resources. 24, 1119-1132.
[32] Kim D. and H. Oh. 2009: EMD: A Package for Empirical Mode Decomposition and Hilbert Spectrum. The R Journal. 1, May 2009.
[33] Kolláth, Z. and K. Oláh. 2009: Multiple and changing cycles of active stars – I. Methods of analysis and application to the solar cycles. Astronomy & Astrophysics. 501, 2, 695.
[34] LASP Interactive Solar Irradiance Datacenter. Historical Total Solar Irradiance. http://lasp.colorado.edu/lisird/tsi/historical_tsi.html (accessed Nov 2, 2011).
[35] Lenschow D.H., Mann J., and L. Kristensen. 1994: How long is long enough when measuring fluxes and other turbulence statistics? J of Atmos. and Oceanic Technology. 11, 661-673.
[36] Liu, Z., N. Zhang, R. Wang, and J. Zhu. 2007: Doppler wind lidar data acquisition system and data analysis by empirical mode decomposition method. Opt. Eng., 46, 26001.
[37] Malinowski, S.P., Haman, K.E., Kopec, M.K., Kumala, W., Gerber, H.E., and Krueger, S.K. 2008: Small-scale variability of temperature and LWC at Stratocumulus top. 13th AMS Conference on Cloud Physics, 2.21.
[38] NASA GISS. GLOBAL Land-Ocean Temperature Index. http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts+dSST.txt (accessed Nov 2, 2011).
[39] NASA Marshall Space Flight Center. Solar Physics. http://solarscience.msfc.nasa.gov/greenwch/spot_num.txt (accessed Nov 2, 2011).
[40] NOAA Earth System Research Laboratory. Trends in Atmospheric Carbon Dioxide. www.esrl.noaa.gov/gmd/ccgg/trends/. (accessed Nov 2, 2011).
[41] Peel MC, G.G.S. Pegram, and T.A. McMahon. 2007: Empirical Mode Decomposition: Improvement and application. In International Congress on Modeling and Simulation, edited by Oxley, L. and D. Kulasiri. Modelling and Simulation Society of Australia and New Zealand, December 2007, 2996-3002.
[42] Pegram, G. G. S., Peel, M. C. and T.A. McMahon. 2008: Empirical mode decomposition using rational splines: an application to rainfall time series. Proc. R. Soc. A. 464, 1483–1501.
[43] Qian, S. 2002: Introduction to Time-Frequency and Wavelet Transforms, Prentice-Hall Inc. Upper Saddle River, NJ.
[44] Qingjie, Z., Huayong, Z., and S. Lincheng. 2010: A new method for mitigation of end effect in empirical mode decomposition. Informatics in Control, Automation, and Robotics (CAR), 2010 2nd International Asia Conf. March 2010.
[45] Sakai R., Fitzjarrald D., and K.E. Moore. 2001: Importance of low-frequency contributions to eddy fluxes observed over rough surfaces. J Appl Meteor. 40, 2178–2192.
[46] Sarabandi, K. and I. Koh. 2002: Effect of canopy-air interface roughness on HF-VHF wave propagation in forest. IEEE Transactions on Antennas and Propagation. 50, 2, 111.
[47] Sneddon, I. 1951: Fourier Transforms. McGraw-Hill Book Company, Inc. New York, NY.
[48] Sonett, C. P. 1983: J. Geophys. Res., vol. 88, no. A4, p. 3225-3228.
[49] Stephens, G. L. 1986: Radiative transfer in spatially heterogeneous, two-dimensional anisotropically scattering media. J. Quant. Spectrosc. Radiat. Transfer, 36, 51-67.
[50] Stephens, G.L., and C.M.R. Platt. 1987: Aircraft observations of the radiative and microphysical properties of stratocumulus and cumulus cloud fields. J. Clim. Appl. Meteorol., 26, 1243-1269.
[51] Stull, R. 1988, An Introduction to Boundary Layer Meteorology, Kluwer Academic Publishers. Boston, MA.
[52] Sun X., Zhu Z., Wen X., Yuan G., and G. Yu. 2006: The impact of averaging period on eddy fluxes observed at ChinaFLUX sites. Agricultural and Forest Meteorology. 137, 188-193.
[53] Usoskin, I.G. and K. Mursula. 2003: Long-term solar cycle evolution: Review of recent developments. Solar Phys. 218, 319-343.
[54] Vickers D., and L. Mahrt. 1997: Quality control and flux sampling problems for tower and aircraft data. J of Atmos and Oceanic Technology. 14, 512-526.
[55] Wilson et al. 2002: Energy balance closure at FLUXNET sites. Agric Forest Meteorol. 113, 223–234.
[56] Wu, S., Liu, Z., and B. Liu. 2006: Enhancement of lidar backscatters signal-to-noise ratio using empirical mode decomposition. Optics Comm. 267, 1, 137.
[57] Wu, Z., and N.E. Huang. 2004: A study of the characteristics of white noise using the empirical mode decomposition. Proc. R. Soc. Lond. A, 460, 2046, 1597-1611.
[58] Wu, Z., and N.E. Huang. 2009: Ensemble Empirical Mode Decomposition: a noise-assisted data analysis method. Adv. in Adaptive Data Analysis. 1, 1, 1-41.
[59] Zhao, Jin-ping and D. Huang. 2001: Mirror extending and circular spline function for empirical mode decomposition method. Journal of Zhejiang University, 2, 3, 247-252.
[60] Zhen-Shan, L. 2007: Multi-scale analysis of global temperature changes and trend of a drop in temperature in the next 20 years. Meteorology and Atmospheric Physics. 95, 1-2, 115.