BOSTON UNIVERSITY
GRADUATE SCHOOL OF ARTS AND SCIENCES
Thesis
BIOME LEVEL CLASSIFICATION OF LAND COVER AT
CONTINENTAL SCALES USING DECISION TREES
by
ALEXANDER LOTSCH
Vordiplom, Free University of Berlin, 1996
Submitted in partial fulfillment of the
requirements for the degree of
Master of Arts
1999
Approved by
First Reader: Mark Friedl, Ph.D., Assistant Professor of Geography
Second Reader: Ranga Myneni, Ph.D., Associate Professor of Geography
Third Reader: Sucharita Gopal, Ph.D., Associate Professor of Geography
Acknowledgments
I would like to thank the people at the Department of Geography at Boston Univer-
sity, who made my time as a graduate student a rewarding and enriching experience.
Particular thanks go to Mark Friedl, who guided me through this thesis with
dedication and an extraordinary combination of rigour, flexibility and academic support.
His attitude and openness helped foster a fruitful atmosphere that enhanced my
academic experience. I would like to thank Ranga Myneni, who initially encouraged
me to pursue a degree in physical geography. He provided me the financial and
academic opportunity to integrate and participate in on-going research, which gave
me many critical insights. Also, I am grateful for the academic advice I received
from Sucharita Gopal. I especially appreciate her holistic view on geography, which
provided me orientation at several stages of my studies.
Many things I achieved during the two years in the Department of Geography were
only possible with the support of other graduate students. I am especially grateful to
Doug McIver, who has been extremely helpful throughout my thesis and coursework.
Many thanks to all my office-mates, who assisted me numerous times with computer
problems and John Hodges for his imaginary support.
Finally, I would like to express my sincere gratitude to Chung Yi Lung for her
patience, advice and unyielding support as well as the German Fulbright Commission,
which funded my first year and allowed me to broaden my horizons in many ways.
BIOME LEVEL CLASSIFICATION OF LAND COVER AT
CONTINENTAL SCALES USING DECISION TREES
ALEXANDER LOTSCH
Abstract
Land cover plays a key role in terrestrial biogeochemical processes. Therefore
many problems require accurate information on the distribution and properties
of land cover. A decision tree classification algorithm is used to generate
a land cover map of North America from remotely sensed data with 1 km
resolution in a 6-biome classification scheme. To do this, the normalized
difference vegetation index (NDVI) data from the Advanced Very High Resolution
Radiometer (AVHRR) is used in association with ancillary data sources. Training
sites required for this approach were generated by the Boston University
Land Cover and Land-Cover Change Research Group and improved in five
preprocessing steps. Accuracy assessment of the map produced via decision tree
classification yields a site-based map accuracy of 73%. The map is compared
with maps generated from the same data, but classified using the International
Geosphere Biosphere Program (IGBP) classification scheme. Biome classes are
mapped with approximately 5% higher overall accuracies than IGBP classes.
The biome map will be useful for remote sensing-based retrievals of leaf area
index (LAI) and the fraction of absorbed photosynthetically active radiation
(FAPAR).
Contents
1 Introduction
2 Background
2.1 The Role of Land Cover in Biogeochemical Modeling
2.2 Global Land Cover Classification Approaches
2.2.1 Conventional Approaches
2.2.2 Remote Sensing-Based Approaches
2.2.3 Biome-Based Classification
2.3 Radiative Transfer Modeling of Vegetation Canopies
2.4 Tree-Based Classification Algorithms
3 Methodology
3.1 Land Cover Classification Algorithms
3.1.1 Data
3.1.2 Site Data Extraction and Classification Estimation
3.1.3 Decision Tree Parameters
3.2 Cross-Walking from IGBP Classes to Biomes
3.3 Comparison of UMD, EDC and BU Maps
3.4 Accuracy Assessment
3.5 Improving Training Data Quality
4 Results
4.1 Classification Performance
4.1.1 IGBP Scheme
4.1.2 Biome Scheme
4.2 Comparison between Classification Schemes
4.3 Map Comparisons
4.3.1 Accuracy Coefficients for the UMD and EDC Maps
4.3.2 Pixel-Based Comparisons
5 Discussion
5.1 Training Data Improvement
5.2 IGBP Classification Performance
5.3 Biome-Level Classification Performance
5.4 Separability of Land Cover Classes
5.5 Map Comparison
6 Conclusions
A Appendix
List of Tables
1 Visible, red, near-infrared (NIR) and shortwave infrared bands (SWIR) for AVHRR and MODIS
2 Canopy structural attributes of global land covers from the viewpoint of radiative transfer modeling
3 Comparison of the IGBP and biome classification scheme
4 Recoded UMD classes
5 Arrangement of reference and test data in a confusion matrix
6 Overview of site-based classification performance improvement
7 Errors of omission for selected classes in the IGBP scheme.
8 Errors of commission for selected classes in the IGBP scheme.
9 Error matrix for biome classes and site-based accuracy coefficients for the uncleaned training data set (I).
10 Error matrix for biome classes and site-based accuracy coefficients for the cleaned data set (II).
11 Error matrix for biome classes and site-based accuracy coefficients using SLCR labels (III).
12 Error matrix for biome classes and site-based accuracy coefficients with additional training sites (IV).
13 Error matrix for biome classes and site-based accuracy coefficients for proportional sampling (V).
14 Test of significant differences between accuracy coefficients
15 Accuracy coefficients for aggregated IGBP maps into a 7-class scheme.
16 Error matrix and site-based accuracy coefficients for the UMD map in the IGBP scheme.
17 Error matrix and site-based accuracy coefficients for the EDC map in the IGBP scheme.
18 Error matrix and site-based accuracy coefficients for the UMD map in the biome scheme.
19 Error matrix and site-based accuracy coefficients for the EDC map in the biome scheme.
20 Frequency of classes in the IGBP scheme for the UMD, EDC and BU maps.
21 Frequency of classes in the biome scheme for the UMD, EDC and BU maps.
22 Overall agreement of the UMD, EDC and BU maps in the IGBP and biome classification scheme.
23 IGBP class definitions
24 Error matrix for IGBP classes and site-based accuracy coefficients for the uncleaned training data set (I).
25 Error matrix for IGBP classes and site-based accuracy coefficients for the cleaned data set (II).
26 Error matrix for IGBP classes and site-based accuracy coefficients using SLCR labels (III)
27 Error matrix for IGBP classes and site-based accuracy coefficients with additional training sites (IV).
28 Error matrix for IGBP classes and site-based accuracy coefficients for proportional sampling (V).
29 Pixel-based comparison of UMD and BU maps in the IGBP scheme.
30 Pixel-based comparison of UMD and EDC maps in the IGBP scheme.
31 Pixel-based comparison of BU and EDC maps in the IGBP scheme.
32 Pixel-based comparison of UMD and EDC maps in the biome scheme.
33 Pixel-based comparison of UMD and BU maps in the biome scheme.
34 Pixel-based comparison of BU and EDC maps in the biome scheme.
List of Figures
1 Relationships of NDVI/LAI and NDVI/FAPAR
2 Decision tree structure
3 Data processing flow
4 Examples of multivariate statistical outliers.
5 Supervised classification of IGBP classes for North America.
6 Supervised classification of biome classes for North America.
7 Map comparison in the biome scheme between EDC and BU.
8 Map comparison in the biome scheme between UMD and BU.
List of Abbreviations
AVHRR Advanced Very High Resolution Radiometer
BU Boston University
CART Classification and Regression Tree
DAAC Distributed Active Archive Center
EROS Earth Resources Observation System
EDC EROS Data Center
EOS Earth Observing System
ET Evapo-Transpiration
FAPAR Fraction of Absorbed Photosynthetically Active Radiation
GLCC Global Land Cover Characterization
IGBP International Geosphere Biosphere Program
IR Infra-Red
ISG Integerized Sinusoidal Grid
LAI Leaf Area Index
LUT Look-Up Table
MISR Multiangle Imaging Spectroradiometer
MODIS Moderate Resolution Imaging Spectroradiometer
MLRC Multiresolution Land Characterization
NASA National Aeronautics and Space Administration
NIR Near-Infrared
NDVI Normalized Difference Vegetation Index
NPP Net Primary Productivity
NOAA National Oceanic and Atmospheric Administration
POLDER Polarization and Directionality of Earth's Reflectances
RTM Radiative Transfer Model
SLCR Seasonal Land Cover Regions
SPOT Système Probatoire d'Observation de la Terre
SVI Spectral Vegetation Index
SWIR Shortwave Infra-Red
TM Thematic Mapper
UMD University of Maryland, College Park
UTM Universal Transverse Mercator
1 Introduction
Land cover plays a key role in terrestrial biogeochemical processes and is related in
a number of ways to the dynamics of global climate. Further, changes in land cover
induced by human activity have profound implications for climate, the functioning of
ecosystems, and biogeochemical fluxes at regional and global scales [Lean and Warilow
1989; Dickinson and Henderson-Sellers 1988]. As a consequence, a wide range of
problems require reliable and accurate information on global land cover, most
importantly the distribution and properties of vegetation. Mapping techniques from the
remote sensing domain are superior to conventional ground-based methods of
vegetation mapping [Townshend et al. 1991]. The data source most commonly used in the
mapping of global vegetation cover is the Advanced Very High Resolution Radiometer
(AVHRR) with a spatial resolution of 1.1 km at nadir. In particular, the normalized
difference vegetation index (NDVI) has been used to map vegetation as well as to
infer the amount of photosynthetically active vegetation on the ground [Tucker 1979].
With the implementation of NASA's Earth Observing System (NASA-EOS) a
new generation of satellite data will be available for scientific research. The Moderate
Resolution Imaging Spectroradiometer (MODIS) is expected to provide substantially
better data for future land cover mapping [Justice et al. 1998]. Further, the
Multi-Angle Imaging Spectroradiometer (MISR) will obtain multiple view angles on the
earth's surface, which will be particularly useful for retrieving more accurate
information about structural properties of vegetation canopies [Knyazikhin et al. 1998].
A number of different techniques exist to classify remotely sensed spectral data
into classes of land cover or vegetation types. Historically, supervised maximum
likelihood classification algorithms and unsupervised techniques based on clustering
algorithms have been commonly used [Loveland et al. 1991]. More recently, the use
of neural networks [Gopal and Woodcock 1996], fuzzy logic [Gopal and Woodcock
1994] and decision trees [Friedl and Brodley 1997; DeFries et al. 1998] has provided
promising results.
The number and properties of classes of interest vary with the intended use of
the final vegetation map. One approach is the classification of biomes based solely
on remotely sensed characteristics of vegetation [Running et al. 1995]. For example,
multi-temporal red, near-infrared and thermal infrared data from NOAA/AVHRR have
been used to distinguish six structural vegetation classes [Nemani and Running 1997]
based on a hierarchical classification structure. The classification process involves a
series of rules to partition the feature space into smaller, more distinct sets of data
points. A key requirement for the successful implementation of this method is the
choice of thresholds used to define the classification structure. Unfortunately, there
are several shortcomings of this approach, notably:
• The assumption that the chosen thresholds are general and robust is not
necessarily statistically sound or adequate. Specifically, the threshold choices are
relatively arbitrary and are not derived from an adequately large training sample.
• The thresholds are specific to a particular data set.
• The method does not allow a reliable and systematic validation and assessment
of the classification performance unless an independent validation dataset is
available.
One common approach to create a map in a desired classification scheme is to
collapse an existing map with finer class definitions into one with broader classes,
or alternatively to relabel class values according to a cross-walking rule set. This
approach has the following shortcomings:
• The class definitions in the different classification schemes may not be
compatible, e.g., different thresholds for the discrimination of vegetation density may
be used.
• It is often impossible to unambiguously cross-walk broad classes to a finer
resolution (e.g., forest to needleleaf forest and broadleaf forest).
• The cross-walking process can introduce confusion and errors which may then
be propagated through algorithms that use land cover as input.
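The asymmetry between collapsing and refining classes can be illustrated with a minimal Python sketch. The forest classes follow the example in the list above; the rule set itself, the "grassland" entry and the function names `crosswalk` and `invert` are hypothetical and are not taken from any published cross-walking table.

```python
# Hypothetical cross-walking rule set: fine classes -> broad classes.
FINE_TO_BROAD = {
    "needleleaf forest": "forest",
    "broadleaf forest": "forest",
    "grassland": "grasses",
}

def crosswalk(labels, rules):
    """Relabel each class value according to the rule set."""
    return [rules[label] for label in labels]

def invert(rules):
    """Group the fine classes by the broad class they collapse into."""
    inverse = {}
    for fine, broad in rules.items():
        inverse.setdefault(broad, []).append(fine)
    return inverse

fine_map = ["needleleaf forest", "broadleaf forest", "grassland"]
broad_map = crosswalk(fine_map, FINE_TO_BROAD)

# A broad class with more than one candidate cannot be refined unambiguously.
ambiguous = {b: f for b, f in invert(FINE_TO_BROAD).items() if len(f) > 1}
```

Collapsing is a plain lookup, whereas the reverse direction leaves "forest" with two candidate labels, which is the ambiguity described in the second point of the list.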
The spectral information about the earth's surface to be measured by MODIS
and MISR will be the basis for a wide range of biophysical algorithms and products
(e.g., net primary productivity or leaf area index). For many of these algorithms land
cover is one of the most important input parameters. Therefore, inaccuracies in land
cover classification will propagate through downstream algorithms.
The primary objective of this research is to generate a biome-based land cover
map for North America and compare its accuracy and properties with existing land
cover maps at the same scale and resolution. To this end, a decision tree
classification algorithm is used to create land cover maps in a 6-biome scheme and the
International Geosphere-Biosphere Program (IGBP) classification scheme. The
primary data source to train the classification algorithm is a 12 month time series of
AVHRR NDVI in conjunction with ancillary data sources. Issues relating to
cross-walking between classification schemes are addressed as well as methods to generate
a training sample for a supervised classification of biomes.
This research specifically employs the biome-based land cover classification scheme
suggested by Myneni et al. [1997]. The underlying assumption of this classification
scheme is that the earth's vegetation can be categorized into 6 structurally distinct
classes. Vegetation canopy structure is defined by plant geometry and distribution.
The classification scheme is designed to complement an algorithm to retrieve global
leaf area index (LAI) and the fraction of absorbed photosynthetically active radiation
(FAPAR) from spectral reflectances from MODIS and MISR [Knyazikhin et al. 1998].
Specifically, radiative transfer models (RTM) simulate the transport and interactions
of photons in vegetation canopies to retrieve information about plant structure from
reflected solar radiation. However, the parameterization of the RTM is dependent on
the structural characteristics of the plant canopy, which can be categorized into biomes.
The availability of high quality biome level maps of vegetation will be very useful to
this MODIS/MISR LAI/FAPAR retrieval algorithm.
2 Background
2.1 The Role of Land Cover in Biogeochemical Modeling
The importance of vegetation in global climate and biogeochemical cycles is well rec-
ognized [Sellers and Schimel 1993]. This is particularly true with respect to carbon,
which is fixed via primary production by terrestrial vegetation [Myneni et al. 1995].
The estimation of carbon fixation by terrestrial vegetation and the prescription of
accurate land surface properties requires variables descriptive of radiation absorp-
tion, plant physiology, climatology and surface assimilation area. As a consequence,
global biogeochemical models require accurate parameterization of the structural and
functional properties of plant canopies.
Hydro-meteorological conditions determine plant growth and structure in the
sense that plants adapt and grow by optimizing the use of resources like water,
nutrients and solar radiation. These adaptation processes can affect vegetation attributes
including plant size, leaf type, leaf longevity and density, and are fundamental
mechanisms for optimizing the energy absorption and dissipation under water availability
constraints [Woodward 1987].
Because of the diversity of global vegetation, there is an infinite variety of plant
canopy shapes, sizes and attributes. In order to characterize plant canopies in a
useful way, leaf area index (LAI) and the fraction of absorbed photosynthetically
active radiation (FAPAR) have proven to be powerful parameters representing the
basic structural characteristics of vegetation canopies and their interaction with
incoming solar radiation [Ruimy et al. 1994; Sellers et al. 1986]. LAI is defined as the
one-sided leaf area per unit ground area in broadleaf canopies and the projected or
total leaf area in needleleaf canopies and ignores the complexities of canopy
geometry. The characterization of vegetation by LAI, rather than species composition,
is a critical simplification used to make global comparisons of terrestrial ecosystems
possible. LAI provides a measure of the physiology that is most directly involved in
energy, H2O and CO2 exchange processes. Strong correlations across different
vegetation types between LAI and net primary production (NPP), site water availability
and evapotranspiration (ET) have been found [Gholz 1982; Webb et al. 1983; Grier
and Running 1977; Jarvis and McNaughton 1986]. The fraction of absorbed
photosynthetically active radiation by vegetation (FAPAR) exhibits diurnal variation and
therefore requires appropriate time integration for models with time steps longer than
one day [Myneni et al. 1997].
Remote sensing has established the relationship between LAI, FAPAR and spectral
vegetation indices, in particular the normalized difference vegetation index (NDVI)
(reviewed in Myneni et al. [1995]). The NDVI is defined as the difference between
near-infrared and red reflectance, normalized by their sum:
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \tag{1}$$
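Equation (1) translates directly into code. The following Python sketch is only an illustration: the function name `ndvi`, the handling of a zero denominator and the reflectance values are assumptions made for the example, not values from this study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Undefined when both reflectances are zero (assumed convention).
        return None
    return (nir - red) / total

# Illustrative reflectances only, not measurements.
dense_canopy = ndvi(nir=0.50, red=0.05)
bare_soil = ndvi(nir=0.30, red=0.25)
```

Because the difference is divided by the sum, the index is bounded between -1 and 1 for non-negative reflectances, and a canopy that reflects strongly in the near-infrared while absorbing red light yields a value close to 1.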
Asrar et al. [1992] found that under specific canopy conditions FAPAR was
linearly related to NDVI, whereas LAI exhibited a curvilinear relationship. Other
studies have shown that the relation between FAPAR and NDVI is similar for
one-dimensional and three-dimensional canopies [Myneni et al. 1992; Myneni and Williams
1994]. The theoretical basis for the existence of those relations is described in Myneni
et al. [1995] and summarized in Myneni et al. [1997]. FAPAR is frequently used to
translate satellite data into simple estimates of primary production and
photosynthetic activity. However, it is important to note that different biomes exhibit distinct
differences in their NDVI/LAI and NDVI/FAPAR relationships. Essentially, these
differences are used in the MODIS/MISR algorithm [Knyazikhin et al. 1998]. To
do this, a priori knowledge is required regarding the global distribution of biomes.
This thesis seeks to support the MODIS/MISR LAI/FAPAR algorithm by developing
improved methods to map biomes in an accurate and repeatable fashion at global
scales.
2.2 Global Land Cover Classification Approaches
2.2.1 Conventional Approaches
Because of the diversity of vegetation at a global scale, the accurate mapping and
representation of terrestrial vegetation has been a challenge for many years. The
compilation of reliable databases at global scales involves both the generalization of
vegetation types into a smaller set of critical attributes and the development of means
for measuring vegetation globally in a meaningful timespan [Running et al. 1995].
Current global climate models, however, rely on land-cover data sets which are
typically derived from pre-existing maps and atlases [Olson and Watts 1982; Matthews
1983; Wilson and Henderson-Sellers 1985; Prentice et al. 1992]. This approach has a
number of limitations regarding model parameterization. First, the reference sources
themselves often represent a range of different scales, dates and classification schemes,
and the translation of mapping units into the classification system and scale of
interest may introduce significant new errors. Second, some datasets are derived from
maps of potential vegetation, which is usually inferred from climate variables rather
than the actual vegetation type. A third limitation is that many datasets are static
and are therefore prone to the perpetuation of errors in the source from which they
were derived [Loveland et al. 1991; DeFries et al. 1995].
A good illustration of the problems presented in this regard is given by Town-
shend et al. [1991], who compared existing maps of global vegetation and showed
that the estimates of vegetation distribution from common sources varied consider-
ably. The lack of consistency among the various map sources was attributed to both
the vegetation classification and resolutions used in spatial sampling. While such
databases have obvious limitations, they represent the state of the science for driving
large scale process models.
2.2.2 Remote Sensing-Based Approaches
There is wide consensus that remotely sensed data can provide an accurate and
repeatable means of land cover mapping and monitoring, especially with respect to
areas with rapidly changing landuse and land management activities [Running et al.
1994; Townshend et al. 1991]. In particular, remote sensing based approaches make
use of the distinct spectral reflectances from different land cover types in
association with the temporal variation of reflected radiation caused by the phenological
dynamics in vegetation [Loveland et al. 1991; Justice et al. 1985].
Most recent research on global land cover classification has used satellite data
collected by the Advanced Very High Resolution Radiometer (AVHRR) instrument
on board the National Oceanic and Atmospheric Administration (NOAA) series of
satellites [Justice et al. 1985; Running et al. 1994]. The high temporal resolution
of AVHRR data is desirable for global land cover classification and allows repeated
unobscured views on land surface features [Townshend and Tucker 1984]. In order
to reduce data volumes, 10-day or monthly composited NDVI is commonly used as
input to classification algorithms [Holben 1986].
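The compositing step can be sketched in a few lines of Python. The sketch assumes maximum-value compositing, the procedure commonly associated with Holben [1986]; the text above does not state which rule was applied, and the function name, the use of `None` for missing observations and the sample values are all illustrative.

```python
def max_value_composite(daily_ndvi, period=10):
    """Keep the highest NDVI of each compositing period for one pixel.

    None marks a missing observation (e.g. cloud cover); a period without
    any valid observation yields None.
    """
    composites = []
    for start in range(0, len(daily_ndvi), period):
        window = daily_ndvi[start:start + period]
        valid = [v for v in window if v is not None]
        composites.append(max(valid) if valid else None)
    return composites

# Made-up 20-day series for a single pixel.
series = [0.21, None, 0.35, 0.10, 0.33, None, 0.05, 0.30, 0.34, 0.12,
          None, None, 0.41, 0.38, 0.02, 0.40, None, 0.44, 0.15, 0.39]
composited = max_value_composite(series, period=10)
```

Here twenty daily values reduce to two composites, which shows how compositing cuts data volume while favouring the least contaminated observation in each period.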
Surface temperature from NOAA/AVHRR, used in conjunction with spectral
vegetation indices (SVI), has been found to be useful for the description and
quantification of energy exchange processes and absorption by plant canopies [Goward et al.
1994]. Satellite-derived land surface temperatures are a function of the proportion
of soil versus vegetation in a pixel as well as surface wetness. Nemani et al. [1993]
showed that under dry surface conditions, surface temperature is linearly correlated
with canopy density across different vegetation types, whereas this relation is poorly
defined over wet surfaces. Furthermore, radiometric temperatures from space-borne
sensors are a complex function of viewing geometry and illumination [Choudhury 1991].
Using AVHRR data, Loveland et al. [1995] developed a land cover database
using an unsupervised classification algorithm in conjunction with extensive ancillary
data. The unsupervised classification yielded spectrally similar clusters of vegetation.
Ancillary data was then used to label those clusters. The final classification included
205 classes for North America, which may be collapsed into a smaller set of broader
classes in a straightforward manner. However, their algorithm involves significant
amounts of ancillary data and requires substantial manual post processing.
Most current classification schemes designed for application at continental to
global scales are based on the magnitude and temporal dynamic of spectral vege-
tation indices such as NDVI [Justice et al. 1985; Loveland et al. 1991; Loveland et al.
1995; DeFries and Townshend 1994]. More recently, Nemani and Running [1997]
have demonstrated the potential of a combination of both spectral vegetation indices
(SVI) and surface temperature observations. Their methodology is based on known
energy exchange processes rather than statistical associations of vegetation types and
spectral properties.
The use of additional information in the training process, such as thermal bands
or seasonal metrics has also been suggested by DeFries et al. [1998]. Friedl et al.
[1999], however, showed that the use of additional phenological metrics provided little
improvement in classification accuracy relative to using an annual time series of NDVI
data. Also, the use of geographic location as an input feature yielded substantially
better accuracies than using only NDVI. However, this does not reflect the true
accuracies and can be explained by interaction between the decision tree algorithm
and the bias introduced by the geographic distribution of training data.
Although the approaches described above provide promising results, it must be
noted that AVHRR data is limited in several regards including a high level of
atmospheric noise (especially in channel 2), lack of onboard calibration, and only five
spectral bands [Zhu and Yang 1996; Cihlar et al. 1997; Moody and Strahler 1994]. As
a consequence, AVHRR data is insufficient to discriminate subtle differences among
many vegetation types. The MODIS instrument is expected to overcome these
limitations for global land cover classification. Specifically, it will provide superior spectral
and spatial resolution as well as better facilities for atmospheric correction and
instrument calibration. The specific properties of the MODIS instrument are documented
in Running et al. [1994] and Barnes et al. [1998].
Band    AVHRR (µm)     MODIS (µm)
Blue    NA             0.459-0.479
Green   NA             0.545-0.565
Red     0.580-0.680    0.620-0.670
NIR     0.720-1.10     0.841-0.876
SWIR    NA             1.23-1.25
SWIR    NA             1.63-1.65
SWIR    NA             2.11-2.16
Table 1: Visible, red, near-infrared (NIR) and shortwave infrared bands (SWIR) for
AVHRR and MODIS
2.2.3 Biome-Based Classification
As described above, climate and biogeochemical models require accurate input
data on land cover [DeFries et al. 1995]. For example, Running and Hunt [1993]
introduced an ecosystem model (BIOME-BGC) designed to capture the essential
physio-morphological factors that regulate energy exchange processes in vegetation.
Within BIOME-BGC, global vegetation is represented by six different biome classes.
The ecological foundation for this classification approach was given in Running et al.
[1995] and the classification is based on three primary attributes of plant canopy
structure: (i) permanence of above ground biomass, (ii) leaf longevity and (iii) leaf
type or shape.
The first attribute, aboveground biomass, discriminates between permanent
respiring biomass, such as forests and woody shrubs, and annual crops and grasses. It is an
important determinant of carbon cycles and is controlled primarily by climate. Leaf
longevity, on the other hand, separates evergreen from deciduous canopies and plays
a major role in carbon and energy exchange processes. Finally, the leaf type criterion
distinguishes broadleaf and needleleaf plants as well as grasses. It also determines
the radiation and gas exchange characteristics of canopies.
The combination of these three criteria yields the following six biome classes: (1)
evergreen needleleaf, (2) evergreen broadleaf, (3) deciduous needleleaf, (4) deciduous
broadleaf, (5) broadleaf annual and (6) grasses. This classification scheme has three
advantages over earlier classification efforts. First, it uses only plant attributes,
therefore other variables, such as climate, are excluded from the class definition.
Second, it is tailored to the information content of remotely sensed observations. Most
importantly, it provides a relatively stable and unambiguous classification scheme for
the purpose of global biogeochemical modeling [Nemani and Running 1997].
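One way to read the combination of the three attributes is as a small decision rule. The Python sketch below is a hypothetical rendering: the function name, the argument encoding and the choice to ignore leaf longevity for non-permanent vegetation are assumptions made for illustration and are not part of the scheme as published.

```python
LEAF_TYPES = ("needleleaf", "broadleaf", "grass")

def biome_class(permanent_biomass, evergreen, leaf_type):
    """Combine the three canopy attributes into one of the six classes.

    permanent_biomass: True for permanent above-ground biomass.
    evergreen: True for evergreen, False for deciduous canopies; ignored
        for grasses and annuals (an assumption of this sketch).
    leaf_type: "needleleaf", "broadleaf" or "grass".
    """
    if leaf_type not in LEAF_TYPES:
        raise ValueError(f"unknown leaf type: {leaf_type!r}")
    if leaf_type == "grass":
        return "grasses"
    if not permanent_biomass:
        if leaf_type == "broadleaf":
            return "broadleaf annual"
        raise ValueError("the scheme has no annual needleleaf class")
    longevity = "evergreen" if evergreen else "deciduous"
    return f"{longevity} {leaf_type}"

# Enumerate the attribute combinations that map onto a class.
six_classes = sorted({
    biome_class(True, True, "needleleaf"),
    biome_class(True, True, "broadleaf"),
    biome_class(True, False, "needleleaf"),
    biome_class(True, False, "broadleaf"),
    biome_class(False, False, "broadleaf"),
    biome_class(False, False, "grass"),
})
```

The enumeration returns exactly the six class names listed above, which makes explicit that the scheme is defined by plant attributes alone.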
Nemani and Running [1997] implemented this logic using a hierarchical
classification structure based on different thresholds for NDVI, surface temperature and their
seasonality. However, the choice of thresholds is somewhat arbitrary and estimation
of the accuracy and performance of this algorithm can only be done using pre-existing
land cover maps. A somewhat similar biome classification scheme based on canopy
architecture will be described in the next section in the context of radiative transfer
modeling of vegetation canopies.
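A hierarchical threshold structure of this general kind can be sketched as follows. Every threshold, feature and class label in the Python example is a hypothetical placeholder chosen for illustration; none of them are the values or classes used by Nemani and Running [1997].

```python
# All thresholds below are hypothetical placeholders.
NDVI_VEGETATED = 0.2       # annual maximum NDVI below this: unvegetated
NDVI_EVERGREEN_AMP = 0.3   # seasonal NDVI amplitude below this: evergreen
TEMP_DENSE_CANOPY = 300.0  # surface temperature (K) below this: dense canopy

def classify(ndvi_max, ndvi_min, surface_temp_k):
    """Walk a fixed hierarchy of threshold tests for one pixel."""
    if ndvi_max < NDVI_VEGETATED:
        return "unvegetated"
    canopy = "dense" if surface_temp_k < TEMP_DENSE_CANOPY else "sparse"
    amplitude = ndvi_max - ndvi_min
    phenology = "evergreen" if amplitude < NDVI_EVERGREEN_AMP else "seasonal"
    return f"{canopy} {phenology}"

labels = [
    classify(0.10, 0.05, 310.0),
    classify(0.80, 0.70, 295.0),
    classify(0.70, 0.20, 305.0),
]
```

Because the constants are fixed by hand rather than estimated from training data, moving any one of them changes the resulting map, which is the sensitivity criticized above.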
2.3 Radiative Transfer Modeling of Vegetation Canopies
Canopy radiative transfer models (RTM) simulate radiation absorption and
scattering in vegetation canopies. A review of canopy radiative transfer models can be found
in Myneni et al. [1995]. Myneni et al. [1997] suggested an algorithm for the
estimation of LAI and FAPAR at a global scale using such models. For a more detailed
description of three-dimensional radiative transfer modeling efforts refer to Myneni
et al. [1990]. A synergistic algorithm for the estimation of vegetation canopy LAI
and FAPAR from MODIS and MISR data is described in Knyazikhin et al. [1998].
The relationship between NDVI and LAI/FAPAR has been established theoretically.
However, the utility of this relationship depends on the sensitivity of these
variables to canopy characteristics [Myneni et al. 1997]. While FAPAR exhibits a
positive linear relationship with increasing NDVI, LAI is curvilinearly related and shows
saturation with increasing NDVI (Figure 1). In order to estimate LAI/FAPAR from
remotely sensed data, canopy structural types must be defined that exhibit different
NDVI-LAI or FAPAR relations from one another. If the canopy types have similar
NDVI-LAI/FAPAR relations, information on land cover is redundant for the
estimation of LAI/FAPAR. Therefore many classification schemes that are based on
ecological, botanical or functional metrics are not necessarily suitable for LAI/FAPAR
estimation.
[Figure 1 plots omitted: one panel shows NDVI against LAI, the other FPAR against
NDVI, each with curves for broadleaf forests and needle forests.]
Figure 1: Relationships of NDVI/LAI and NDVI/FAPAR: Results for broadleaf
forests and needleleaf forests from prototyping efforts with POLDER data (Zhang
et al., BU MODIS/MISR LAI/FAPAR team at Boston University).
The planned algorithm for the retrieval of LAI and FAPAR from MODIS/MISR
data is based on six distinct plant structural types (biomes), which can be
parameterized with variables that many radiative transfer models employ [Knyazikhin et al.
1998].
This implies that a land cover classification scheme that is compatible with
radiative transfer and LAI/FAPAR algorithms is needed. Myneni et al. [1997] define the
following six biomes based on their canopy structure, which invoke different radiative
transfer models to estimate LAI/FAPAR from remote sensing data.
Grasses and Cereal Crops (Biome 1): This land cover type is characterized by
vertical and lateral homogeneity, full ground cover and plant height less than about
a meter. The plants have erect leaf inclination, no woody material, minimal leaf
clumping and intermediate soil brightness.
Shrubs (Biome 2): Unlike biome 1, canopies are laterally heterogeneous and show
sparse to intermediate vegetation ground cover (20-60 percent). The plants have small
leaves, woody material, and bright backgrounds. This land cover type is typically
found in semi-arid regions with extreme temperature regimes and poor soils.
Broadleaf Crops (Biome 3): These canopies are laterally heterogeneous and ex-
hibit large variations in vegetation ground cover, ranging from about 10 percent after
planting to 100 percent at full maturity. They are characterized by regular leaf spa-
tial dispersion, a high level of photosynthetic activity in both leaves and stems, and
dark background soil.
Savannas (Biome 4): Savanna canopies have two distinct vertical layers, an un-
derstory of grass (biome 1) and an overstory of trees with about 20 percent ground
cover. Savannas in the tropical and sub-tropical regions are described as mixtures of
broadleaf trees and warm grasses, whereas in the cooler regimes of higher latitudes,
they are characterized as mixtures of cool grasses and needleleaf trees.
Broadleaf Forests (Biome 5): Broadleaf forests are characterized by both vertical
and horizontal heterogeneity, i.e. high ground cover, green understory, mutual crown
shadowing and foliage clumping. Trunks and branches are included in the radiative
transfer models, which means that canopy structure and optical properties differ
spatially. Trunks are modeled as erect structures and branches as randomly oriented.
Needleleaf Forests (Biome 6): Needleleaf forests represent the most complex
canopy structure. They are characterized by needle clumping on shoots, shoot clump-
ing in whorls, dark vertical trunks, sparse green understory and mutual crown shad-
owing. Branches are modeled as randomly oriented and trunks as erect structures.
Needles are assumed to be clumped in the shoots, and the shoots clumped in the
crown space.
The definitions and properties of the six biomes as they relate to radiative transfer
are shown in Table 2.

                        Grasses/      Shrubs        Broadleaf     Savannas      Broadleaf     Needleleaf
                        Cereal Crops                Crops                       Forests       Forests
Horizontal              No            Yes           Variable      Yes           Yes           Yes
Heterogeneity           gc = 100%     gc = 10-60%   gc = 10-100%  gc < 20%      gc > 70%      gc > 70%
(Ground Cover)
Vertical Heterogeneity  No            No            No            Yes           Yes           Yes
Stems/Trunks            No            No            Green Stems   Yes           Yes           Yes
Understory              No            No            No            Grasses       Yes           Yes
Foliage Dispersion      Minimal       Random        Regular       Minimal       Clumped       Severe
                        Clumping                                  Clumping                    Clumping
Crown Shadowing         No            Not Mutual    No            No            Yes, Mutual   Yes, Mutual
Background Brightness   Medium        Bright        Dark          Medium        Dark          Dark

Table 2: Canopy structural attributes of global land covers from the viewpoint of
radiative transfer modeling [Myneni et al. 1997].
2.4 Tree-Based Classification Algorithms
A suite of techniques is currently used to classify remotely sensed data into classes of
land cover. Traditionally, the vast majority of land cover mapping approaches have
used parametric supervised classification algorithms or unsupervised classification
algorithms. The latter use clustering techniques to identify spectrally distinct groups
of data [Schowengerdt 1997]. These techniques have generally been used for high
resolution imagery, such as Landsat or SPOT.
Global land cover classification efforts, however, have mostly employed coarse
resolution data from NOAA/AVHRR [DeFries and Townshend 1994]. The literature
provides various examples of global land cover classification efforts. The more
traditional approaches include unsupervised clustering in conjunction with ancillary data
and manual labeling of clusters [Loveland et al. 1991], maximum likelihood
classification [DeFries and Townshend 1994], and simple classification logic based on structural
and biophysical parameters [Running et al. 1995].
More recent approaches include applications of neural networks [Gopal and Woodcock
1996], including fuzzy neural networks [Carpenter et al. 1992]. Neural networks
can handle relatively complex relations among the class properties, whereas
traditional classification algorithms are somewhat limited in their statistical and
theoretical sophistication. However, neural networks require an understanding of the
underlying theory and, to run in real time, a parallel processor. They may therefore
not be a viable solution for all applications.
More recently, decision tree algorithms have been used for the classification of
global datasets with promising results [Friedl and Brodley 1997; Friedl et al. 1999;
DeFries et al. 1998; Hansen et al. 1999]. Decision tree techniques have been used
successfully for a wide spectrum of classification problems in various fields [Safavian
and Landgrebe 1991]. They are computationally efficient and flexible, and also have
an intuitive simplicity. They therefore have substantial advantages in remote sensing
applications [Friedl and Brodley 1997].
A decision tree is a classification algorithm which recursively partitions the feature
space of the data set into increasingly homogeneous subsets based on a set of splitting
rules. The tree has a root, which represents the entire data set, a set of internal nodes
(splits), and a set of terminal nodes (leaves). Each node represents a subset of the data
set, and the terminal nodes at the bottom of the tree represent the predictions of
the tree. Every internal node has one parent node and two or more descendant nodes;
the root has descendants but no parent. Each observation is labeled according to the majority
class of the leaf in which it falls [Breiman et al. 1984].
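
The structure described above can be illustrated with a minimal sketch. It assumes a binary, univariate tree in which each internal node tests one feature against a threshold; the `Node` class, the helper names, the thresholds and the class labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a binary, univariate decision tree (illustrative only)."""
    feature: Optional[int] = None      # index of the feature tested (internal nodes)
    threshold: Optional[float] = None  # split value (internal nodes)
    left: Optional["Node"] = None      # branch taken when value <= threshold
    right: Optional["Node"] = None     # branch taken when value > threshold
    label: Optional[str] = None        # predicted class (terminal nodes only)


def make_leaf(training_labels: Sequence[str]) -> Node:
    """A leaf predicts the majority class of the training cases that reach it."""
    majority, _ = Counter(training_labels).most_common(1)[0]
    return Node(label=majority)


def classify(node: Node, x: Sequence[float]) -> str:
    """Walk from the root to a leaf, applying one splitting rule per level."""
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Hypothetical two-level tree on a 12-month NDVI vector (indices 0..11).
tree = Node(
    feature=6, threshold=0.4,
    left=make_leaf(["shrubs", "shrubs", "grasses"]),
    right=Node(
        feature=0, threshold=0.5,
        left=make_leaf(["broadleaf forest", "broadleaf forest"]),
        right=make_leaf(["needleleaf forest"]),
    ),
)
```

Calling `classify(tree, ndvi_vector)` returns the label of the leaf into which the observation falls.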
Running et al. [1995] and Nemani and Running [1997] applied a tree-based
decision structure to a global data set of NDVI values. The data set is both well
understood and well behaved and the classification tree was defined solely on analyst
expertise, where the threshold values are defined based on ecological knowledge. This
algorithm, however, is somewhat difficult to implement since significant spatial,
temporal and spectral variation make globally robust user-defined threshold specification
almost impossible.
Figure 2: Decision tree structure
More commonly, tree-based algorithms use statistical procedures, which estimate
the classification rules from a training sample. A classic example is the
classification and regression tree (CART) model described by Breiman et al. [1984]. These
algorithms combine the advantages of statistically based techniques and learning
algorithms, which have their origin in the machine-learning and pattern-recognition
communities. Tree-based methods are supervised techniques and therefore a training
set is required from which the classes can be learned.
A critical step in the estimation of a decision tree is to prune the tree back in
order to avoid overfitting. By convention a tree is constructed in such a way that all
(or nearly all) training samples are correctly classified, i.e. the training classification
accuracy is 100%. If the training data contain errors, the tree will be overfitted
and will generate poor results when applied to unseen data. Common methods for
pruning decision trees are described in Mingers [1989] and briefly discussed in the
next section.
3 Methodology
The analysis for this thesis involved three main methodological components, each of
which is described in the sections below. Section 3.1 describes in detail the algorithm
that was used to generate land cover maps using both the IGBP and the 6-biome
classification schemes. Section 3.2 explains the steps that were taken to translate
(cross-walk) the IGBP classes into biomes throughout the analysis. Section 3.3
discusses how the maps of UMD, EDC and BU were compared. Section 3.4 describes
the methodology used for map accuracy assessment.
3.1 Land Cover Classification Algorithms
This section focuses on the data used for the analysis, the major data
processing steps, the methods used for classification performance evaluation, and the
decision tree parameters used for the classification algorithm. Also, steps that were
undertaken to address shortcomings in the training data are described.
The land cover classification algorithm pursued in this research is based on the
concept of combining remotely sensed reflected and emitted radiation through time
and over space with ancillary data and information collected on the ground. The
underlying assumption is that the spectral information measured by satellites
contains information about plant canopy properties. NDVI is assumed to be a powerful
metric to represent these properties. The validity of this assumption is supported by
numerous studies in the past two decades [Tucker et al. 1986; Townshend et al. 1991].
The algorithm's goal is to distinguish land cover types on the basis of the spectral
and spatial properties of features on the Earth's land surface and their temporal
trajectories.
3.1.1 Data
The most commonly used source of satellite imagery for continental to global scale
studies is provided by the Advanced Very High Resolution Radiometer (AVHRR)
on board the NOAA series of satellites. The major advantage of AVHRR data over
other sources of satellite imagery is its high temporal resolution and global coverage.
Further, it provides sufficiently high spatial resolution (1.1 km at nadir) for global
studies. However, its spectral properties are substantially less useful for land cover
classification problems than Landsat Thematic Mapper (TM) data, for instance.
The classification analyses presented below were based on a 12-month NDVI time
series. The data set is composed of monthly composited NDVI data covering the
time span between February 1995 and January 1996. In addition, seasonal land
cover region (SLCR) labels [Loveland et al. 1991; Loveland et al. 1995] were
tested as a predictive variable.
For supervised classification approaches, a training sample is required to train the
classification algorithm. To this end, a database of global land cover training sites has
been compiled and is currently being improved and extended by the MODIS Land
Cover and Land-Cover Change group at BU [Strahler et al. 1996]. The database
currently contains approximately 1000 sites distributed over the North American
continent (including sites in Central America) and has undergone several iterations
of reevaluation. Each site polygon in the database has an areal extent ranging
between 2 and 100 km² and a label assignment defined by the IGBP classification
scheme [Loveland and Belward 1997]. Where possible, a set of biophysical
parameters has been assigned by the analyst to each training site. The label and attribute
assignments were performed using recent TM imagery from the multiresolution land
resource characterization database [Loveland and Shaw 1996] along with ancillary
data sources such as existing paper or digital maps, literature sources, aerial imagery
as well as verified ground information from collaborating science teams. The suite of
site attributes is described in Muchoney et al. [1998].
3.1.2 Site Data Extraction and Classification Estimation
The biome-based classification and map production essentially follow the algorithmic
steps developed by the MODIS Land Cover and Land-Cover Change group at Boston
University [Strahler et al. 1996]. These steps are:
1. Extraction of AVHRR NDVI pixel values for each training site and assignment
of class labels from the training site database.
2. Manual detection and removal of multivariate outliers in the training data.
3. Tree estimation and pruning.
4. Cross-validation evaluation of classification performance using independent
training and test datasets.
5. Analysis of classification performance.
To extract the respective NDVI values for each training and test site from the
AVHRR imagery, careful and accurate registration of each site to geographic
coordinates needs to be assured. To this end, each training site polygon was registered to
coordinates in the Universal Transverse Mercator Projection (UTM), converted to a
raster image format with a 30 m resolution, aggregated to a 1 km resolution and
reprojected to the Integerized Sinusoidal Grid (ISG) Projection used for MODIS products.
In this projection the globe is tiled into a grid of 25x17 cells, of which 326 contain
land mass. Each tile has an extent of approximately 1200x1200 km (see footnote 1).
A key step in the training database development is to remove statistical outliers
in order to avoid unwanted confusion in the classification algorithm. To do this, a
two-step generalized gap test for multivariate outlier detection was performed [Rohlf
1975]. In the first step, the largest pixel outliers in each training site were removed
from the training data with the intent of increasing the homogeneity in each site. In
the second step, sites were identified as outliers within each class to decrease
within-class heterogeneity. Examples of outliers are shown in Figure 4. A total of 35 sites
(768 pixels) were removed from the training data based on this analysis.
Classification performance was assessed using cross-validation procedures.
Specifically, the training data were randomly split five times into a training sample
consisting of 80 percent of the data and an independent test sample consisting of the
remaining 20 percent, such that the five test samples were mutually exclusive. For each
80/20 split a decision tree was estimated using the training sample and its performance
evaluated on the independent test sample. In this way, the information contained in the
test sample was previously unseen (independent) and not used to build the tree. The
classification accuracies herein are reported as averages across the five cross-validation runs.

Footnote 1: For a detailed description of the Hierarchical Data Format (HDF) used by the
Earth Observing System and MODIS data storage and gridding, the reader may refer to
http://daac.gsfc.nasa.gov/ and [Wolfe et al. 1998].

Figure 3: Data processing flow
Since the training sites were defined such that within-site homogeneity
is maximized, substantial spatial autocorrelation was present in the AVHRR data
within sites (see footnote 2). Spatial autocorrelation can have a significant impact on accuracy
assessment measures [Congalton 1988] and influence accuracy coefficients. That is,
two features in space are likely to be autocorrelated when they are close to each
other. Conceptually speaking, the prediction of a pixel's value becomes "easier" for
the classification algorithm based on prior information about adjacent pixels [Friedl
et al. 1999]. Therefore, both pixel-based and site-based accuracies are reported here.
For pixel-based accuracies the training data were randomly split on a per-pixel level.
That is, pixels used to estimate the decision tree may be used to predict other pixels
from the same site. For site-based accuracies, on the other hand, the splits were
constrained by site membership, which means that pixels from 80% of the sites
were used to predict the remaining 20% of the sites, which are spatially separated.

[Figure 4, four panels: NDVI (100-200) plotted against Month (2-12). Panel titles:
"NDVI Trajectory for Max Outlier Class: 2 #Sites = 38 Site ID = 340";
"NDVI Trajectory for Max Outlier Class: 1 #Sites = 45 Site ID = 419";
"NDVI Trajectory for Max Outlier Class: 14 #Sites = 61 Site ID = 424";
"NDVI Trajectory for Max Outlier Class: 4 #Sites = 51 Site ID = 793".]
Figure 4: Examples of multivariate statistical outliers in the training database. The
solid line represents the trajectory of the mean maximum NDVI value for a class.
The diamonds show the monthly mean maximum NDVI values for the largest outlier
in a site.

Footnote 2: Spatial autocorrelation occurs when the presence, absence, or degree of a
certain characteristic affects the presence, absence, or degree of the same characteristic
in neighbouring units [Cliff and Ord 1973].
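
The difference between the pixel-based and site-based splits described in this section can be sketched as follows. This is a minimal illustration under assumed conventions, not the procedure actually implemented for the thesis: pixels are identified by integer indices, `site_ids[i]` gives the site of pixel `i`, and the function names are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def pixel_folds(n_pixels: int, k: int = 5, seed: int = 0) -> List[List[int]]:
    """Assign pixel indices to k mutually exclusive test folds at random."""
    idx = list(range(n_pixels))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def site_folds(site_ids: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Assign whole sites to folds, so a site's pixels never straddle train and test."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    fold_of_site: Dict[int, int] = {s: i % k for i, s in enumerate(sites)}
    folds: List[List[int]] = [[] for _ in range(k)]
    for pixel, site in enumerate(site_ids):
        folds[fold_of_site[site]].append(pixel)
    return folds


def train_test(folds: List[List[int]], i: int) -> Tuple[List[int], List[int]]:
    """Fold i is the test sample; the remaining folds form the training sample."""
    test = folds[i]
    train = [p for j, f in enumerate(folds) if j != i for p in f]
    return train, test
```

With five folds, each call to `train_test` yields roughly an 80/20 split. Under `site_folds` no site contributes pixels to both samples, which is what keeps the site-based accuracies from being inflated by within-site autocorrelation.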
The processing steps described above allow a statistically sound evaluation of the
classification performance on a given data set. To produce the final map, all the
training data were pooled and a final tree was built based on the entire training data
set. This tree was then used to classify the NDVI image dataset.
3.1.3 Decision Tree Parameters
For this analysis C5.0, a widely used and tested univariate decision tree algorithm, was
used. A detailed description of the algorithm can be found in Quinlan [1993]. The
most important elements, however, will be discussed briefly here. The criterion used by
C5.0 to estimate the splits at each internal node of the tree is called the information
gain ratio. This metric measures the reduction in entropy in the data produced by
a split, and the split which maximizes the reduction in entropy in descendant nodes
is selected. The algorithm terminates when no more gain is yielded by further
splitting [Quinlan 1993]. Unlike other trees used in global land cover classifications
(e.g. DeFries et al. 1998), the final tree is often very complex and large, and the tree
may be overfit to noise in the data. Errors in the training data can therefore lead
to poor performance on unseen (independent) cases. C5.0 addresses this problem by
using error-based pruning, i.e. the tree is "cut back" until all parts of the tree are
removed that have a high predicted error rate based on unseen cases [Mingers 1989;
Quinlan 1987]. For this analysis a conservative value of 5 percent pruning confidence
was used.
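
As an illustration of the splitting criterion, the sketch below computes the gain ratio for a single binary threshold split on one feature. It follows the textbook definition (information gain divided by split information) and is not taken from the C5.0 source. The function names and the handling of degenerate splits are assumptions.

```python
import math
from collections import Counter
from typing import Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (bits) of the class distribution in `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values: Sequence[float], labels: Sequence[str], threshold: float) -> float:
    """Information gain of the split `value <= threshold`, divided by the split information."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # assumption: a split that separates nothing scores zero
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info
```

A tree-growing routine would evaluate `gain_ratio` for every candidate feature and threshold at a node and keep the split with the largest value.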
A second important concept used in the C5.0 classification algorithm is boosting,
a technique developed in the machine learning research community [Schapire 1990].
Boosting attempts to increase the classification accuracy of a given learning
algorithm by iteratively estimating a number of classifications from the same data using
the same algorithm. At each iteration, weights are assigned to each training
observation, where observations that were misclassified in the previous iteration obtain a
higher weight than correctly classified ones. This allows the algorithm to
concentrate on cases that are more difficult to classify. Friedl et al. [1999] demonstrated
that boosting can increase classification accuracy in global land cover classification
problems. When applied to different datasets, boosting has been shown to increase
classification accuracy with differing numbers of iterations. Based on the results
of Quinlan [1996] and Friedl et al. [1999], this research applied boosting with ten
iterations.
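
The reweighting idea can be illustrated with a short sketch. It assumes an AdaBoost.M1-style update, which may differ in detail from the scheme implemented inside C5.0. The function name is hypothetical and the input weights are assumed to sum to one.

```python
import math
from typing import List, Sequence, Tuple


def reweight(weights: Sequence[float], correct: Sequence[bool]) -> Tuple[List[float], float]:
    """One AdaBoost.M1-style update (illustrative, not the C5.0 internals).

    Returns the renormalized observation weights and the vote weight of the
    classifier that produced `correct`. The weighted error must lie strictly
    between 0 and 0.5.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < error < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    beta = error / (1.0 - error)
    # Shrink the weights of correctly classified cases; misclassified ones keep theirs,
    # so after renormalization they carry relatively more weight.
    updated = [w * beta if ok else w for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], math.log(1.0 / beta)
```

Repeating this update for a fixed number of iterations (ten in this research) and combining the resulting classifiers by weighted vote gives the boosted prediction.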
3.2 Cross-Walking from IGBP Classes to Biomes
Cross-walking between different classification schemes of interest cannot necessarily
be done in an unambiguous fashion and may introduce unwanted errors and
inaccuracies. A critical step for the work presented here was to translate the training data
from the International Geosphere-Biosphere Program (IGBP) classification scheme
into the biome classification scheme (section 2.3). In particular, direct translation of
the 17 IGBP classes into the six biome classes is not possible for the IGBP classes
5, 6, 8, 12, 14 (mixed forest, closed shrublands, woody savanna, croplands and
croplands mosaic, respectively; for detailed definitions of the classes refer to Table 23 in
the appendix).
To resolve these ambiguities, the seasonal land cover region characterization (SLCR)
[Loveland et al. 1995] was used as an ancillary data source. This map possesses
significantly more classes than the IGBP scheme and therefore much narrower class
definitions. The SLCR project defined approximately 200 classes for each of the five
continents (205 classes for North America, 963 globally). The narrow definition of
the SLCR classes allows their aggregation into broader classes of other classification
schemes, e.g., the IGBP scheme. Look-up tables (LUT) for the aggregation of SLCR
classes into various existing classification schemes are provided by EDC and were used as a
guideline for the translation to 6 biomes performed for this work. For a more detailed
description of the SLCR map product and its classification scheme the reader may
refer to Loveland et al. [1995].
IGBP classes                                  Biome classes
 1 Evergreen Needleleaf Forests (ENF)         Grasses and Cereal Crops (Biome 1)
 2 Evergreen Broadleaf Forests (EBF)          Shrubs (Biome 2)
 3 Deciduous Needleleaf Forests (DNF)         Broadleaf Crops (Biome 3)
 4 Deciduous Broadleaf Forests (DBF)          Savannas (Biome 4)
 5 Mixed Forests (MXF)                        Broadleaf Forests (Biome 5)
 6 Closed Shrubland (CSH)                     Needleleaf Forests (Biome 6)
 7 Open Shrubland (OSH)                       Non-Vegetated (Biome 7)
 8 Woody Savannas (WSA)
 9 Savannas (SAV)
10 Grasslands (GRL)
11 Permanent Wetlands (PWL)
12 Croplands (CRL)
13 Urban and Built-up (URB)
14 Cropland Mosaics (CRM)
15 Snow and Ice (SNI)
16 Barren or Sparsely Vegetated (BSV)
17 Water Bodies (WAT)

Table 3: Comparison of the IGBP and biome classification schemes [Loveland et al.
1995; Myneni et al. 1997]
For this work a LUT based on those provided by EDC was used to assign a biome
label to each training site in those cases where the training site possessed an
ambiguous IGBP label (classes 5, 6, 8, 12, 14, 16). The relabeled training sites were then
used as input to the classification process as described above (pre-classification
aggregation). To accomplish this task, a SLCR label for each training site was obtained
by overlaying training site polygons with the SLCR map. The most common class
within the training site polygon was used as the SLCR label. The SLCR and IGBP
labels were then compared and examined for agreement. In 40 cases the training site
label and the corresponding SLCR label were not in agreement, and these sites were
removed from further analysis.
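
The relabeling step described above amounts to a majority vote followed by a table look-up, which can be sketched as follows. The SLCR codes and biome assignments in `SLCR_TO_BIOME` are placeholders invented for illustration; they are not the EDC look-up tables, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Sequence

# Hypothetical look-up table from SLCR class codes to biome numbers.
# The codes and assignments are placeholders, not the EDC tables.
SLCR_TO_BIOME: Dict[int, int] = {101: 5, 102: 6, 103: 2}


def site_slcr_label(slcr_pixels: Sequence[int]) -> int:
    """Most common SLCR class among the map pixels inside one training site polygon."""
    return Counter(slcr_pixels).most_common(1)[0][0]


def biome_for_site(slcr_pixels: Sequence[int], lut: Dict[int, int]) -> Optional[int]:
    """Biome label for a site with an ambiguous IGBP label, or None if the LUT has no entry."""
    return lut.get(site_slcr_label(slcr_pixels))
```

Sites for which the look-up returns `None`, or whose SLCR and IGBP labels disagree, would be candidates for removal in the agreement check.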
Note that the use of the SLCR labels introduces a bias in favor of the EDC map, which is
based on the SLCR map. That is, the training site label assignment was not directly
done by an expert, but was based on an ancillary data source, which was evaluated
later using the same data. Unfortunately, no other independent map with narrow
class definitions is available at this point which could be used as an independent data
source for this purpose.
3.3 Comparison of UMD, EDC and BU Maps
In the second part of the analysis a quantitative comparison of land cover map
products provided by the EROS Data Center (EDC) and University of Maryland College
Park (UMD) was performed. This served two purposes. First, it provided an
additional way to assess the properties of the maps produced with the decision tree
classification algorithm. Second, it highlighted the strengths and weaknesses of each
map and helped to decide which map to use for global retrieval of LAI and FAPAR
by the Vegetation and Climate Research Group (section 5.4). While the map
produced by EDC was created using a classification approach based on an unsupervised
algorithm with subsequent labeling of spectral classes, UMD uses an approach
similar to BU. For a detailed description of the respective classification algorithms, refer
to Hansen et al. [1999] and Loveland et al. [1995].
The classification scheme used by UMD essentially follows the IGBP classification
logic. However, three IGBP classes are not included in the UMD scheme: snow and ice
(IGBP 15), permanent wetland (IGBP 11) and cropland mosaic (IGBP 14). Therefore
these three classes were excluded from further analysis. Furthermore, the UMD class
names and class numbers do not always correspond to the IGBP class names, even
though the class definitions are the same. For the purpose of this analysis, the UMD
map was recoded to correspond to the IGBP class numbers (Table 4).
Class    Original UMD class number    Recoded UMD class number
Water     0                           17
ENF       1                            1
EBF       2                            2
DNF       3                            3
DBF       4                            4
MXF       5                            5
WSA       6                            8
SAV       7                            9
CSH       8                            6
OSH       9                            7
GRL      10                           10
CRL      11                           12
BSV      12                           16
URB      14                           13

Table 4: Recoded UMD classes
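
The recoding in Table 4 is a simple look-up, sketched below. The mapping is transcribed from the table; the function name and the choice to raise an error on unknown codes are assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Original UMD class number -> IGBP class number, transcribed from Table 4.
UMD_TO_IGBP: Dict[int, int] = {
    0: 17, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 8,
    7: 9, 8: 6, 9: 7, 10: 10, 11: 12, 12: 16, 14: 13,
}


def recode(umd_pixels: Sequence[int]) -> List[int]:
    """Recode UMD class numbers to IGBP numbers; unknown codes raise KeyError."""
    return [UMD_TO_IGBP[c] for c in umd_pixels]
```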
The impact of misregistration on accuracy assessment and image analysis has been
previously demonstrated [Townshend et al. 1992]. Therefore, in order to perform a
meaningful comparison of the UMD, EDC and BU maps, it was necessary to coregister
them accurately. To do this, the data and maps were analyzed and processed in the
Interrupted Goode's Homolosine map projection, which is commonly used for global
scale studies, and allows one-to-one mapping at global scales. That is, each pixel of
a continental to global scale map can be related to a corresponding pixel in another
map using the same pixel coordinates. A global map in the Goode's projection is
composed of a mosaic of 12 tiles in the Mollweide and the Sinusoidal projections,
which meet approximately at 40 degrees latitude [Steinwand 1994]. Reprojection of
source maps into other projections was avoided since it would have introduced errors.
The three map products were compared both qualitatively and quantitatively.
First, the areal extents of each class in the respective classification scheme were
compared. Next, to provide a more rigorous analysis of the EDC and UMD maps, the
training site data from BU were used as reference data ("ground truth") to generate
accuracy statistics. Finally, by overlaying the maps, areas of agreement and
disagreement were identified.
3.4 Accuracy Assessment
Classification accuracy is typically assessed using an error or confusion matrix. This
matrix documents errors of omission and errors of commission by cross-tabulating
per-pixel labels output by the classification algorithm with labels obtained from ground
truth mapping [Congalton 1991]. Errors of omission are calculated as the sum of all
off-diagonal values in a row divided by the row total. They indicate the proportion
of sites or pixels in a particular class of the reference data that were not classified
correctly by the algorithm. Errors of commission are calculated as the sum of all
off-diagonal values in a column divided by the column total. They indicate the proportion
of sites or pixels assigned to a particular class in the map that were misclassified by
the algorithm. The total error is calculated as the sum of all off-diagonal values divided
by the total number of samples in the matrix. Overall accuracy is computed by dividing
the number of correctly classified samples by the matrix total, whereas conditional
(i.e., class-specific) accuracies are computed by dividing by the corresponding row or
column total. For this analysis the reference data are presented in rows,
whereas the test samples are presented in columns.
Error Matrix
↓R / C→   1            2            ...   q            x_k+    PA_i
1         x_11         x_12         ...   x_1q         x_1+    x_11/x_1+
2         x_21         x_22         ...   x_2q         x_2+    x_22/x_2+
:         :            :            ...   :            :       :
q         x_q1         x_q2         ...   x_qq         x_q+    x_qq/x_q+
x_+k      x_+1         x_+2         ...   x_+q
PU_j      x_11/x_+1    x_22/x_+2    ...   x_qq/x_+q

Table 5: Arrangement of reference and test data in confusion matrix. ↓R refers to
the reference data, C→ to the classified (test) data.
The accuracy parameters used for this analysis are described below. Upper case
($P$) is used to denote summary parameters and lower case ($x$) denotes individual cell
values. Row and column totals are referred to as $x_{k+}$ and $x_{+k}$, respectively. The
total number of classes is $q$ and the total number of pixels in the matrix is $p$. Using
this notation we have:
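1. The overall proportion of area correctly classified:

\[ P_o = \frac{1}{p} \sum_{k=1}^{q} x_{kk} \tag{2} \]

2. The Kappa coefficient [Cohen 1960]:

\[ \kappa = \frac{p \sum_{k=1}^{q} x_{kk} - \sum_{k=1}^{q} x_{k+} x_{+k}}{p^2 - \sum_{k=1}^{q} x_{k+} x_{+k}} \tag{3} \]

where the proportion of chance agreement is

\[ P_c = \frac{1}{p^2} \sum_{k=1}^{q} x_{k+} x_{+k} \tag{4} \]

Therefore $\kappa$ can be rewritten as:

\[ \kappa = \frac{P_o - P_c}{1 - P_c} \tag{5} \]

3. User's accuracy $P_{U_j}$ and commission error $E_{U_j}$ for cover type $j$:

\[ P_{U_j} = \frac{x_{jj}}{x_{+j}}, \qquad E_{U_j} = 1 - P_{U_j} \tag{6} \]

4. Producer's accuracy $P_{A_i}$ and omission error $E_{A_i}$ for cover type $i$:

\[ P_{A_i} = \frac{x_{ii}}{x_{i+}}, \qquad E_{A_i} = 1 - P_{A_i} \tag{7} \]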
The kappa coefficient was introduced by Cohen [1960] and provides a more
realistic estimate than a simple percentage agreement value because it considers all
cells in the error matrix and provides a correction for the proportion of chance
agreement between reference and test data [Rosenfield and Fitzpatrick-Lins 1986]. $P_{U_j}$
describes the probability that a pixel classified as class $j$ in the map is labeled as
class $j$ in the reference data. $P_{A_i}$ describes the probability that a pixel labeled as
class $i$ in the reference data is classified as class $i$ in the map.
Each parameter uses different information contained in the confusion matrix and
therefore summarizes the matrix in different ways. While $P_o$ and $\kappa$ provide a single
summary measure for the entire matrix, $P_{U_j}$ and $P_{A_i}$ summarize columns and rows,
respectively. However, since each of them obscures important details of the error
matrix, the full matrix is also reported [Stehman 1997].
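
The accuracy parameters defined in this section can be computed from an error matrix as in the following sketch. It assumes a square matrix given as a list of rows, with reference data in rows and classified data in columns, and it assumes every class occurs at least once in both, so that no row or column total is zero. The function name and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def accuracy_summary(matrix: List[List[int]]) -> Dict[str, object]:
    """Accuracy measures for a square error matrix with reference data in rows."""
    q = len(matrix)                                   # number of classes
    p = sum(sum(row) for row in matrix)               # total number of pixels
    row_tot = [sum(row) for row in matrix]            # x_{k+}
    col_tot = [sum(matrix[i][j] for i in range(q)) for j in range(q)]  # x_{+k}
    diag = [matrix[k][k] for k in range(q)]           # x_{kk}

    p_o = sum(diag) / p                                          # overall accuracy
    p_c = sum(r * c for r, c in zip(row_tot, col_tot)) / p ** 2  # chance agreement
    return {
        "overall": p_o,
        "kappa": (p_o - p_c) / (1.0 - p_c),
        "producers": [d / r for d, r in zip(diag, row_tot)],     # 1 - omission error
        "users": [d / c for d, c in zip(diag, col_tot)],         # 1 - commission error
    }
```

For the two-class matrix `[[40, 10], [5, 45]]`, the overall accuracy is 0.85, the chance agreement is 0.5 and kappa is therefore 0.7.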
It is important to note that error matrices with different row and column totals
and a different distribution of cell values may have the same overall accuracy or $\kappa$
[Stehman 1997]. In order to measure whether two matrices are significantly different,
the Z statistic is employed. This statistic allows maps to be ranked based on accuracy
coefficients. Following the notation of Ma and Redmond [1995], Z for overall accuracy
is used as:

\[ Z(P_o) = \frac{P_{o2} - P_{o1}}{\sqrt{\sigma_{o2}^2 + \sigma_{o1}^2}} \tag{8} \]

For $\kappa$, Z is used as:

\[ Z(\kappa) = \frac{\kappa_2 - \kappa_1}{\sqrt{\sigma_{\kappa 2}^2 + \sigma_{\kappa 1}^2}} \tag{9} \]
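
Both Z statistics have the same form and can be computed with one small function. The sketch below assumes that the variance of each coefficient has already been estimated elsewhere; the function and argument names are hypothetical.

```python
import math


def z_statistic(coef_1: float, var_1: float, coef_2: float, var_2: float) -> float:
    """Z for the difference between two accuracy coefficients with known variances."""
    return (coef_2 - coef_1) / math.sqrt(var_1 + var_2)
```

For example, `z_statistic(0.70, 0.0004, 0.78, 0.0005)` is about 2.67, which exceeds the two-sided 5 percent critical value of 1.96 for a standard normal variate.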
The database of training sites compiled by BU provides extensive ground truth
for North America and can therefore also be used as an independent data set in
order to evaluate the EDC and UMD maps. To compare maps, the same method can
be applied, except that each pixel of each map is compared rather than individual
polygons.
3.5 Improving Training Data Quality
Before the final analysis was performed, shortcomings in the training data were addressed
based on preliminary results and exploratory data analysis. This was accomplished
in three steps.
First, missing values in the AVHRR NDVI data (data dropout) introduced
additional confusion in the classification algorithm. This was particularly a problem in
northern latitudes. In order to account for this problem, a set of temporal smoothing
and interpolation routines was applied to the dataset.
Second, due to misregistration of some of the TM scenes used in the training site
generation, not all sites could be used in the analysis. Of the approximately 1000
sites, only 665 were used. As a consequence, areas in the northern part of the
continent were undersampled. To compensate for undersampled regions, a total of 32
new training sites were generated, based on areas of agreement between the UMD and
EDC maps. This approach rests on the assumption that confidence in the correct
assignment of a class label is high where two independently generated maps agree.
The sites were chosen randomly across the undersampled regions, with sufficient
distance between them to limit the effects of spatial autocorrelation. This method
was also employed by Friedl and Brodley [1997].
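The site selection described above can be sketched as follows. This is a simplified
illustration on a small grid, not the procedure actually implemented: the two toy
maps, the function names, the distance threshold and the greedy acceptance rule are
all assumptions made for the example.

```python
import math
import random


def agreement_cells(map_a, map_b):
    """Return (row, col, label) for every cell where the two maps agree."""
    cells = []
    for r, (row_a, row_b) in enumerate(zip(map_a, map_b)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            if a == b:
                cells.append((r, c, a))
    return cells


def pick_sites(candidates, n_sites, min_distance, rng):
    """Randomly pick up to n_sites candidates that are at least min_distance apart."""
    shuffled = list(candidates)
    rng.shuffle(shuffled)
    chosen = []
    for r, c, label in shuffled:
        if all(math.hypot(r - pr, c - pc) >= min_distance for pr, pc, _ in chosen):
            chosen.append((r, c, label))
        if len(chosen) == n_sites:
            break
    return chosen


# Hypothetical 4 x 4 class maps standing in for the UMD and EDC maps.
umd = [[1, 1, 5, 5], [1, 7, 5, 5], [7, 7, 10, 10], [7, 7, 10, 12]]
edc = [[1, 1, 5, 4], [1, 7, 5, 5], [7, 8, 10, 10], [7, 7, 12, 12]]
sites = pick_sites(agreement_cells(umd, edc), n_sites=3, min_distance=2.0,
                   rng=random.Random(0))
```

Enforcing a minimum spacing is one simple way to reduce the influence of spatial
autocorrelation among the added sites.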
Third, some categories were oversampled, which biased the classification algorithm
toward more frequent classes. To compensate for oversampled classes, the training
data were resampled to reflect the expected proportions of land cover classes on the
North American continent. To do this, the proportions of each class in the UMD and
EDC maps were used as a guideline. In cases where the number of training pixels
available in a class was below the threshold required to characterize the properties
of the class, all the pixels were kept. In cases where the class size was too large,
a random sample proportional to the estimated frequency of this class on the ground
was generated and used for further analysis.
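A minimal sketch of such a proportional resampling is given below. The threshold
value, the overall sample size, the target proportions and the function name are
hypothetical and chosen only for illustration; the actual values used in this
analysis are not stated in this section.

```python
import random
from collections import defaultdict


def resample_to_proportions(labels, target_props, sample_size, min_per_class, rng):
    """Return indices of a training sample whose class mix follows target_props.

    Classes with no more than min_per_class pixels are kept whole. Larger
    classes are randomly thinned to round(target_props[c] * sample_size),
    but never below min_per_class and never above the pixels available.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    selected = []
    for label, indices in by_class.items():
        if len(indices) <= min_per_class:
            selected.extend(indices)  # too few pixels: keep them all
            continue
        wanted = round(target_props.get(label, 0.0) * sample_size)
        wanted = min(len(indices), max(min_per_class, wanted))
        selected.extend(rng.sample(indices, wanted))
    return sorted(selected)


# Hypothetical training labels in which cropland (class 12) is over-represented.
labels = [12] * 600 + [1] * 300 + [10] * 80 + [9] * 20
target = {12: 0.2, 1: 0.5, 10: 0.25, 9: 0.05}  # assumed map-derived proportions
keep = resample_to_proportions(labels, target, sample_size=400,
                               min_per_class=30, rng=random.Random(42))
```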
4 Results
This section discusses results from the analysis described above. Section 4.1 presents
results from the map generation in the IGBP and 6-biome classification scheme.
Section 4.2 compares the accuracy coefficients of the two maps, and section 4.3
includes the results from the comparison of the EDC, UMD and BU maps.
4.1 Classification Performance
The training data that were input to the classification algorithm were processed in
five distinct iterations. Each iteration attempted to improve the quality of the maps
and increase accuracy coefficients. The same methods and routines were applied to the
training data in both the IGBP and biome classification schemes. In this section, the
iterations are referred to as I, II, III, IV and V. Each iteration has a particular
training data set and map associated with it. For each of iterations I-V, accuracy
assessments were performed and the associated accuracy coefficients are reported
herein. The estimated Z statistics demonstrate statistically significant differences
in classification accuracy between iterations.
Training set I represents the raw training data, without any data manipulation.
Training set II was manually cleaned for multivariate statistical outliers. Training
set III contains SLCR labels as an additional feature. Training set IV represents
the extended set with additional training sites added to it (i.e., SLCR labels
included). Training set V is training set IV with resampled proportions of land cover
and biome classes, respectively. Note that the reported accuracies are averages across
five-fold cross-validations. The results and the subsequent discussion mostly focus on
site-based accuracies, since these values were considered to provide a more rigorous
assessment of the classification performance. The main results for the classification
in the IGBP scheme are summarized in this section (Tables 6, 7 and 8). The full
error matrices from which the summary statistics are derived can be found in the
appendix (Tables 24, 25, 26, 27 and 28). The results for the classification in the
biome scheme are also fully reported below.
The results presented in table 6 show a general increase in overall classification
accuracy across iterations, with the exception of the last iteration in the IGBP
scheme. For the IGBP scheme, the overall accuracy improved from 55% to 64% (peaking
at 68% for training set IV). The accuracies for the biome scheme are generally about
5 percentage points higher and range from 61% to 73% (Table 6). As expected, κ is
generally smaller than Po since it accounts for chance agreement. However, the same
trend is observed for κ, with values ranging from 0.49 to 0.59 for IGBP classes and
0.51 to 0.68 for biome classes.
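To make the relation between Po and κ concrete, the sketch below recomputes the
accuracy coefficients from the cell counts of the biome error matrix for training
set I (Table 9). It assumes the standard definition of κ as (Po - Pc)/(1 - Pc), with
Pc obtained from the products of row and column totals; the function name is
hypothetical.

```python
def accuracy_coefficients(matrix):
    """Return (Po, Pc, kappa) for a square error matrix given as a list of rows."""
    n = sum(sum(row) for row in matrix)
    trace = sum(matrix[k][k] for k in range(len(matrix)))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    po = trace / n
    pc = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (po - pc) / (1.0 - pc)
    return po, pc, kappa


# Cell counts of the biome error matrix for training set I (Table 9).
table_9 = [
    [1213, 212, 193, 163, 350, 192, 189],
    [285, 477, 22, 52, 54, 8, 67],
    [106, 0, 1286, 81, 107, 134, 9],
    [136, 37, 128, 137, 153, 176, 82],
    [151, 15, 130, 110, 3745, 330, 69],
    [226, 26, 114, 160, 664, 929, 78],
    [188, 31, 12, 90, 23, 51, 602],
]
po, pc, kappa = accuracy_coefficients(table_9)
print(f"Po = {po:.2f}, Pc = {pc:.2f}, kappa = {kappa:.2f}")
```

With these counts the rounded values agree with the coefficients reported under
Table 9 (Po = 0.61, Pc = 0.20, κ = 0.51).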
Visual inspection of the class maps was a crucial step required to assess the
reliability of these results. Specifically, each map was checked for overall patterns
and the distribution of land cover classes. This was particularly important since the
accuracy statistics do not necessarily reflect a meaningful or expected distribution of
land cover patterns. The results for two dominant land cover classes which control
much of the overall patterns of the maps, namely the forest and cropland classes, are
shown below and discussed in the next section.
Site-based accuracies
                 I       II      III     IV      V
IGBP     Po      55%     59%     62%     68%     64%
         κ       0.49    0.53    0.57    0.64    0.59
BIOME    Po      61%     62%     68%     71%     73%
         κ       0.51    0.52    0.60    0.64    0.68
Pixel-based accuracies
                 I       II      III     IV      V
IGBP     Po      90.2%   90.3%   92.7%   95.5%   94.3%
BIOME    Po      91.2%   91.5%   93.9%   94.1%   95.7%
Table 6: Overview of site-based classification performance improvement. Cases I-V
represent different training data sets as follows: Case I - uncleaned training data;
Case II - outliers removed; Case III - SLCR as additional training variable; Case IV
- additional training data included for undersampled regions; Case V - proportionally
sampled training data set. Please refer to the appendix for corresponding confusion
tables.
Unfortunately, not all the maps derived from each training set in the respective
classification scheme can be depicted with adequate detail in this thesis. The versions
of the maps used for the comparative analysis with the UMD and EDC maps (training
data set V) are printed in the appendix to provide the reader a representative result.
4.1.1 IGBP Scheme
While the focus of this research is the generation of a biome level map, important
issues in the mapping of the IGBP classes will be discussed here and are summarized
in tables 7 and 8 (omission errors and commission errors, respectively). These data
are helpful for recognizing particular characteristics of the biome map. Note here
that the maps in both classification systems were generated from the same training
data in each step.
Needleleaf forests (IGBP class 1), deciduous broadleaf forests (IGBP class 4),
grasslands (IGBP class 10) and croplands (IGBP class 12) are the main land cover
classes on the North American continent and possess relatively distinct geographic
distributions. Independent source maps generally agree on the overall distribution
of these classes [Knapp 1965; Brown et al. 1998; Omernik 1987]. Therefore, visual
inspection concentrated on these classes. The results for these four IGBP classes are
shown below.
Needleleaf forests (class 1): Visual inspection of the maps produced by supervised
classification revealed relatively poor results from the classification algorithm for
training sample I. The error of omission for needleleaf forests was 59% (Table 7)
and a significant portion of the pixels was incorrectly assigned to classes 12, 5, and 2
(croplands, mixed forest and broadleaf forests, respectively). Also, the same classes
contributed the majority of the total error of commission of 58% (Table 8). Over
steps I-V, PAi increased to 84% and the corresponding omission error was reduced to
16%. The inclusion of additional training pixels for class 1 resulted in a
significantly higher classification performance and a better map with less obvious
confusions. Note that the contribution to the error of commission for class 1 by the
cropland classes was lowered from 10% to 1% (Tables 24 and 28 in the appendix).
IGBP class   Total omission error   Contribution by individual classes
Set I
ENF (1)      59%                    EBF(21%), MXF(16%), CRL(11%)
DBF (4)      48%                    MXF(13%), CRL(7%), CRM(8%)
GRL (10)     76%                    EBF(14%), CRL(23%), BSV(20%)
CRL (12)     37%                    GRL(5%), CRM(15%)
Set II
ENF (1)      48%                    EBF(12%), MXF(16%), CRL(9%)
DBF (4)      47%                    ENF(7%), MXF(10%), CRM(8%)
GRL (10)     68%                    EBF(13%), CRL(25%)
CRL (12)     28%                    ENF(4%), CRM(10%)
Set III
ENF (1)      39%                    MXF(15%), CRL(9%)
DBF (4)      44%                    MXF(16%), CRM(10%)
GRL (10)     71%                    EBF(15%), CRM(20%), BSV(14%)
CRL (12)     27%                    CRM(10%)
Set IV
ENF (1)      28%                    CRL(14%), MXF(6%)
DBF (4)      52%                    ENF(20%), GRL(5%)
GRL (10)     67%                    CRL(20%), BSV(14%)
CRL (12)     26%                    ENF(4%), CRM(9%)
Set V
ENF (1)      16%                    EBF(3%), MXF(6%)
DBF (4)      31%                    ENF(5%), EBF(7%), MXF(5%)
GRL (10)     47%                    OSH(10%), CRL(10%)
CRL (12)     53%                    GRL(10%), CRM(18%)
Table 7: Errors of omission for selected classes in the IGBP scheme.
Deciduous broadleaf forests (class 4): For deciduous broadleaf forests, omission
errors improved from 48% to 31% (Table 7) and commission errors from 48% to 28%
(Table 8) from steps I-V. In particular, the contribution to the error of omission by
class 12 was reduced from 7% to less than 0.5%. However, the added training sites
introduced confusion with classes 1 and 2.
Note that these results do not distinguish the severity of the errors made. For
example, it can be argued that the confusion between croplands and forests is more
severe than confusion among forest classes. The relatively poor appearance of the
map based on training sample I is largely attributed to these types of commission
and omission errors.
IGBP class   Total commission error   Contribution by individual classes
Set I
ENF (1)      58%                      EBF(10%), MXF(21%), CRL(10%)
DBF (4)      48%                      MXF(14%), CRM(13%)
GRL (10)     47%                      CRL(18%), BSV(14%)
CRL (12)     40%                      ENF(4%), GRL(9%), CRM(11%)
Set II
ENF (1)      52%                      DBF(5%), MXF(17%), CRL(9%)
DBF (4)      46%                      MXF(13%), CRM(10%)
GRL (10)     60%                      OSH(11%), CRL(10%), BSV(9%)
CRL (12)     39%                      GRL(9%), CRM(11%)
Set III
ENF (1)      43%                      EBF(6%), MXF(21%), CRL(6%)
DBF (4)      30%                      MXF(11%), CRM(5%)
GRL (10)     62%                      OSH(16%), CRL(12%), BSV(13%)
CRL (12)     36%                      GRL(8%), CRM(8%)
Set IV
ENF (1)      28%                      DBF(10%), CRL(4%), MXF(7%)
DBF (4)      29%                      MXF(8%), CRM(7%)
GRL (10)     52%                      DBF(9%), SAV(8%), BSV(9%)
CRL (12)     43%                      ENF(12%), GRL(7%), CRM(9%)
Set V
ENF (1)      18%                      MXF(9%)
DBF (4)      28%                      ENF(7%), MXF(8%)
GRL (10)     49%                      OSH(21%), BSV(10%)
CRL (12)     63%                      CSH(9%), GRL(16%), CRM(14%)
Table 8: Errors of commission for selected classes in the IGBP scheme.
In general, the confusion of both the needleleaf and broadleaf forest classes
with the mixed forest class was consistently high. This is not surprising since mixed
forest lies on a continuum of forest classes and its labeling is subject to analyst
error. Further, spectral information was not included in the classification process
and might have provided an additional feature to resolve these misclassifications.
Grasslands (class 10): The grassland class exhibits significant misclassification
errors with respect to croplands and the sparsely vegetated/barren class. High
misclassification rates were also found with respect to broadleaf forests (14% for
training sample I, Table 7). Errors of both omission and commission with respect to
classes 12, 14 and 16 were reduced. PAi improved from 24% to 53%, and PUj from 33% to
51% (Tables 24 and 28). The confusion with broadleaf forest was reduced in training
samples III and IV. The confusion of grassland with classes 12, 14 and 16 may be
explained as a function of their similar spectral signal, whereas the confusion with
broadleaf forest is probably due to a similar temporal signal.
Croplands (class 12): The distribution of croplands on the North American continent
is very distinct due to ecological constraints and settlement structure and can
therefore be used to assess map properties in a qualitative way. Generally, the rates
at which other classes were misclassified as croplands or cropland mosaic were very
high, which was verified by a quick visual assessment of each map. In particular, the
occurrence of croplands in northern latitudes was a common weakness of all class maps
and could not be entirely resolved. Oversampling of cropland in the initial training
data is assumed to be the major source of misclassification. This assumption is
supported by visual inspection of the maps.
Decision tree algorithms are optimized to maximize classification accuracy. As
a result, predictions are biased toward more frequent classes. That is, the classifier
will tend to predict cropland for ambiguous cases, since this class is over-represented
in the training data. This observation was the motivation for applying proportional
sampling to the training data. The effect of proportional sampling is reflected in
PAi and PUj for class 12 (Tables 24-28), where the initial accuracies were relatively
high, but dropped drastically for training sample V (63% to 47% and 60% to 37%,
respectively). Rescaling the proportion of cropland pixels in the training data had the
effect that confusion between class 12 and class 14 (cropland mosaic) became more
significant, which resulted in a drop in the accuracies for these classes.
Misclassification as grassland was also relatively high. This can be explained by the
similarity of the NDVI trajectories of cereal croplands and grasses, which are most
common in the mid-western US. The over-representation of croplands in the training
database can likely be attributed to the fact that they are easily discernible on TM
images and airphotos and are therefore frequently picked by analysts to designate a
training polygon.
4.1.2 Biome Scheme
The classification performance for the biome classes was generally better than for
the IGBP classes (Table 6). Both producer's and user's accuracies improved for all
classes for steps I-V. Misclassification of forest and cropland classes was found to be
the most significant problem.
Error matrix for uncleaned training data (I)
↓R/C→      1      2      3      4      5      6      7    xk+    PAi
1       1213    212    193    163    350    192    189   2512   0.48
2        285    477     22     52     54      8     67    965   0.49
3        106      0   1286     81    107    134      9   1723   0.75
4        136     37    128    137    153    176     82    849   0.16
5        151     15    130    110   3745    330     69   4550   0.82
6        226     26    114    160    664    929     78   2197   0.42
7        188     31     12     90     23     51    602    997   0.60
x+k     2305    798   1885    793   5096   1820   1096  13793
PUj     0.53   0.60   0.68   0.17   0.74   0.51   0.55
Trace = 8389   Po = 0.61   Pc = 0.20   Σ(k=1..q) xk+ x+k = 38759394   κ = 0.51
Table 9: Error matrix for biome classes and site-based accuracy coefficients for the
uncleaned training data set (I).
Grasses and Cereal Crops (Biome 1): Throughout the analysis, biome 1 exhibited
the highest errors of omission with respect to both needleleaf and broadleaf forests.
The contribution to the total omission error by broadleaf forest decreased from 14%
(Table 9) to 5% (Table 13). The magnitude of omission errors for classes 2, 3, 5 and
6 was generally only slightly lower. However, the degree of confusion is highest for the
forest classes, which are structurally distinctly different from biome 1. The
misclassification rate for biome 4 (savannas, e.g., in training set V, Table 13) of
6%, compared to 5% for broadleaf forests, very likely relates to the spectral
properties of savannas, which by definition possess up to 80% grass understory. Shrubs
and needleleaf forest exhibited the highest commission errors for biome 1. Shrubs
contributed 14% to the total commission error of 35% for training set V; needleleaf
forests contributed 7% (Table 13).
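The omission and commission contributions quoted for biome 1 can be recomputed from
the error matrix for training set V (Table 13), as in the following sketch. It
assumes that rows hold the reference classes and columns the assigned classes, which
is consistent with PAi being reported per row; the function names are hypothetical.

```python
def omission_contributions(matrix, k):
    """Share of row k (reference pixels of class k) assigned to each other class."""
    row_total = sum(matrix[k])
    return {j + 1: v / row_total for j, v in enumerate(matrix[k]) if j != k}


def commission_contributions(matrix, k):
    """Share of column k (pixels assigned to class k) that come from each other class."""
    col_total = sum(row[k] for row in matrix)
    return {i + 1: row[k] / col_total for i, row in enumerate(matrix) if i != k}


# Cell counts of the biome error matrix for training set V (Table 13).
table_13 = [
    [2089, 197, 122, 188, 151, 256, 85],
    [448, 2091, 27, 839, 24, 34, 36],
    [91, 2, 1240, 126, 90, 91, 7],
    [150, 3, 114, 620, 110, 130, 60],
    [53, 16, 42, 63, 2079, 199, 18],
    [241, 21, 45, 68, 220, 3311, 45],
    [153, 45, 16, 64, 21, 45, 1422],
]
omission = omission_contributions(table_13, 0)      # biome 1 is index 0
commission = commission_contributions(table_13, 0)
total_commission = sum(commission.values())
```

The dictionary keys are the biome numbers, so commission[2] is the contribution of
shrubs and omission[5] that of broadleaf forests.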
Error matrix for cleaned training data (II)
↓R/C→      1      2      3      4      5      6      7    xk+    PAi
1       1093    128    196    139    347    144    287   2334   0.47
2        267    402     51      4     27     45     27    823   0.49
3         72      0   1210    108    121    141      9   1661   0.73
4        167      3     98    127    155    209     46    805   0.16
5        183     17    120     77   3619    297     29   4342   0.83
6        233      8    103    139    423   1054    112   2072   0.51
7        215     26     13     83     29     55    573    994   0.58
x+k     2230    584   1791    677   4721   1945   1083  13031
PUj     0.49   0.69   0.68   0.19   0.77   0.54   0.53
Trace = 8078   Po = 0.62   Pc = 0.21   Σ(k=1..q) xk+ x+k = 34810412   κ = 0.52
Table 10: Error matrix for biome classes and site-based accuracy coefficients for the
cleaned data set (II).
Shrubs (Biome 2): Shrubs showed their highest omission errors with respect to biome 1
for training sets I-III. With the addition of supplemental training sites (IV),
misclassification of shrubs as savannas increased drastically (33%, Table 12). This
results from the fact that the training samples previously did not contain samples of
shrubs from northern latitudes (tundra). Whereas the shrublands in the western part of
the continent have bright backgrounds, the shrublands in the subarctic region are
more similar to savannas in terms of their NDVI. This confusion is present in the
final data set, where 13% of the shrub pixels were classified as grasses/cereal crops
and 24% as savannas, contributing to a total omission error of 40% (Table 13). The
commission error for shrubs is generally smaller than the omission error, i.e. classes
1 and 3-6 are less often classified as shrubs than shrubs are classified as one of the
other classes.
Error matrix for training data with SLCR as additional variable (III)
↓R/C→      1      2      3      4      5      6      7    xk+    PAi
1       1426     85    168     94    264    160    137   2334   0.61
2        318    346     36      7     18     16     82    823   0.42
3        152      1   1296     68     79     46     19   1661   0.78
4        183      0     95    180    144    150     53    805   0.22
5        188      5     98     49   3672    303     27   4342   0.85
6        207     24     71     89    330   1335     16   2072   0.64
7        171     21     17     61     71     16    637    994   0.64
x+k     2645    482   1781    548   4578   2026    971  13031
PUj     0.54   0.72   0.73   0.33   0.80   0.66   0.66
Trace = 8892   Po = 0.68   Pc = 0.21   Σ(k=1..q) xk+ x+k = 35010219   κ = 0.60
Table 11: Error matrix for biome classes and site-based accuracy coefficients using
SLCR labels (III).
Broadleaf Crops (Biome 3): For broadleaf crops, the error matrices demonstrate
that the highest omission errors were associated with the two forest classes (biomes 5
and 6), whereas the highest commission errors were generally contributed by biome 1
(column 1 in Tables 9-13). The latter had an important influence on the proportion
of the two cropland classes in the biome maps. Again, the confusion between forests
and broadleaf crops is more severe in terms of misclassification costs than the
confusion with biome 1. The best results were obtained for training sample V, where
the commission error for broadleaf and needleleaf forest was as low as 3% (Table 13).
Savannas (Biome 4): The accuracies for savannas improved from 16% to 52% for
PAi and from 17% to 32% for PUj (Tables 9-13, row 4 and column 4, respectively).
In training sets I-III, savanna pixels were largely misclassified as one of the forest
classes or as grasses. This is clearly related to the properties of this class,
specifically the mixture of grasses and woodlands. Also, savannas represent a small
portion of the training data and were therefore penalized by the classification
algorithm.
Error matrix for training data with additional training sites (IV)
↓R/C→      1      2      3      4      5      6      7    xk+    PAi
1       2107    130    223     74    231    259    128   3152   0.67
2        344   1736     39   1135     34     30    133   3451   0.50
3         70      3   1206    153    132     83     14   1661   0.73
4        143      5    117    510    128    179    105   1187   0.43
5        151     36     99     57   4706    278     37   5364   0.88
6        265     36     76     64    326   3148     36   3951   0.80
7        142    308     13     65     92     42    930   1592   0.58
x+k     3222   2254   1773   2058   5649   4019   1383  20358
PUj     0.65   0.77   0.68   0.25   0.83   0.78   0.67
Trace = 14343   Po = 0.71   Pc = 0.17   Σ(k=1..q) xk+ x+k = 71726426   κ = 0.64
Table 12: Error matrix for biome classes and site-based accuracy coefficients with
additional training sites (IV).
Broadleaf forests (Biome 5): Broadleaf forests had both the highest PAi and
PUj throughout the 5 iterations. The highest errors were observed with respect to
needleleaf forests and grasses/cereal crops. The latter is probably caused by a similar
temporal pattern of NDVI. The misclassification of biome 5 as needleleaf forests can
be explained by naturally occurring mixtures of both classes.
Needleleaf forests (Biome 6): The major improvement in classification performance
for needleleaf forests can be attributed to the addition of training sites, which
resolved the bias toward broadleaf forest pixels in the previous training sets and the
undersampling of needleleaf forests in northern latitudes. The most severe source of
misclassification is the classification of needleleaf forests as grasses, which amounts
to 11% in the uncleaned training sample (Table 9) and 6% in the last training sample
(Table 13). This problem could not be entirely resolved and is evident in the final
biome map (see Appendix).
Error matrix for training data with proportional sampling (V)
↓R/C→      1      2      3      4      5      6      7    xk+    PAi
1       2089    197    122    188    151    256     85   3088   0.68
2        448   2091     27    839     24     34     36   3499   0.60
3         91      2   1240    126     90     91      7   1647   0.75
4        150      3    114    620    110    130     60   1187   0.52
5         53     16     42     63   2079    199     18   2470   0.84
6        241     21     45     68    220   3311     45   3951   0.84
7        153     45     16     64     21     45   1422   1766   0.81
x+k     3225   2375   1606   1968   2695   4066   1673  17608
PUj     0.65   0.88   0.77   0.32   0.77   0.81   0.85
Trace = 12852   Po = 0.73   Pc = 0.16   Σ(k=1..q) xk+ x+k = 48961277   κ = 0.68
Table 13: Error matrix for biome classes and site-based accuracy coefficients for
proportional sampling (V).
The non-vegetated class showed an interesting interaction with the 6 biome classes.
In particular, biome 1 was frequently assigned to the non-vegetated class and vice
versa. This is not too surprising since many agricultural fields are actually
non-vegetated for a number of months a year. Some misclassification of class 7 as
shrubs was observed as well. Note that shrubs are defined by low vegetation density and
bright backgrounds, which is very similar to bare ground. Supplementing additional
training sites resolved these issues for the most part (Table 12).
The Z-statistic was used to test for significant differences between the accuracy
coefficients Po and κ for the training sets I-V. These results are summarized in table
14. At the 5% significance level, error matrices are statistically significantly
different if |Z| > 1.96 [Ma and Redmond 1995]. The table shows that each iteration's
accuracy coefficients were significantly different from those of the previous
iteration. For the IGBP scheme the coefficients for training set V were lower than for
IV, which resulted in a negative Z-score.
               IGBP scheme         Biome scheme
               Z(Po)     Z(κ)      Z(Po)     Z(κ)
II vs. I        5.16     6.61       1.97     2.77
III vs. II      6.11     8.01      10.60    15.64
IV vs. III      9.84    14.00       4.28    10.25
V vs. IV       -7.65    -9.13       5.48     9.46
Table 14: Test of significant differences between accuracy coefficients.
4.2 Comparison between Classification Schemes
In order to test whether the relative classification performance of the biome scheme
versus IGBP can be attributed to a better separability of classes based on NDVI or
whether it is simply a by-product of the fact that this classification scheme possesses
a smaller number of classes, the error matrices in the IGBP scheme were aggregated
into a 7-class scheme. To do this, the forest classes were aggregated into broadleaf and
needleleaf forests (half of the pixels in the mixed forest class were assigned to each),
and the two savanna classes, the two shrubland classes and the two cropland classes
were each combined into one class. Grassland was kept as one class and all other
classes were combined into one. Even though this scheme does not exactly reflect the
classes in the biome scheme, this aggregation procedure provides an estimate of the
magnitude of improvement caused by having fewer classes. The associated accuracy
coefficients for each training sample are shown in table 15. It can be seen that the
accuracies for the aggregated error matrices are generally of the same magnitude as
the ones from the biome scheme, in some cases even higher. The same result is
observed for κ. This result does not support the hypothesis that the biome classes
exhibit a stronger separability than the IGBP classes based on a time series of NDVI.
                                 I       II      III     IV      V
Biome scheme              Po     61%     62%     68%     70%     73%
                          κ      0.49    0.53    0.58    0.61    0.62
Aggregated IGBP classes   Po     63%     66%     68%     73%     68%
                          κ      0.52    0.56    0.58    0.67    0.62
Table 15: Accuracy coefficients for IGBP maps aggregated into a 7-class scheme.
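The aggregation of an error matrix into fewer classes can be sketched as follows.
How exactly the mixed forest pixels were split is not detailed beyond assigning half
to each forest class, so the fractional weighting used here is only one possible
reading. The 4-class matrix, the weights and the function names are hypothetical and
serve only to illustrate the mechanics.

```python
def aggregate_matrix(matrix, weights):
    """Aggregate an error matrix into fewer classes.

    weights[i][g] is the fraction of original class i assigned to group g.
    Each row of weights sums to 1, so the matrix total is preserved.
    """
    n_groups = len(weights[0])
    out = [[0.0] * n_groups for _ in range(n_groups)]
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            for g in range(n_groups):
                for h in range(n_groups):
                    out[g][h] += weights[i][g] * count * weights[j][h]
    return out


def overall_accuracy(matrix):
    """Trace of the matrix divided by the matrix total."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[k][k] for k in range(len(matrix))) / total


# Hypothetical classes: needleleaf, broadleaf, mixed forest, grassland.
toy_matrix = [
    [8, 1, 1, 0],
    [1, 6, 2, 1],
    [2, 2, 4, 0],
    [0, 1, 0, 9],
]
# Groups: needleleaf forest, broadleaf forest, grassland; mixed forest is split.
toy_weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
aggregated = aggregate_matrix(toy_matrix, toy_weights)
```

In this toy case the overall accuracy rises from 27/38 to 28.5/38 purely because
confusion with the mixed class is partly absorbed, which is the effect the
aggregation test is meant to quantify.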
4.3 Map Comparisons
In this section, an accuracy assessment of the UMD and the EDC maps is presented,
using the training sites from the analysis above as reference data. The areal
distributions of land cover classes in the biome and IGBP schemes are then compared
with each other. Error matrices are used to identify confusion among particular
classes in both classification schemes. Maps of areal agreement are provided in the
appendix.
4.3.1 Accuracy Coefficients for the UMD and EDC Maps
In order to perform a site-based accuracy assessment for the UMD and EDC maps, the
BU training sites were overlaid on each of the maps. However, not all of the sites
were used in the error matrices, because some sites were not entirely covered by one
class. The most frequent class of the respective map in each polygon was associated
with the site. For the IGBP scheme, only those sites were used in the comparison that
were covered with at least 75% of one class in both maps. For the biome scheme, only
those sites were used that were covered with at least 90% of one biome. These
thresholds were chosen because they maximized the area covered by one class, while
still maintaining a sufficiently large sample in each category. Also, the sites that
were detected as outliers in the analysis above were not used in the error matrix.
Unfortunately, this reduced the number of available sites for the analysis.
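The labeling rule described above (most frequent class per polygon, subject to a
minimum coverage) can be sketched as follows. The function name and the pixel lists
are hypothetical; only the 75% and 90% thresholds are taken from the text.

```python
from collections import Counter


def site_label(pixel_labels, min_cover):
    """Return the majority class of a site, or None if it covers less than min_cover."""
    label, count = Counter(pixel_labels).most_common(1)[0]
    return label if count / len(pixel_labels) >= min_cover else None


# Hypothetical map pixels falling inside two training polygons.
homogeneous = [5] * 18 + [4] * 2   # 90% of the pixels belong to class 5
mixed = [5] * 12 + [4] * 8         # only 60% of the pixels belong to class 5

print(site_label(homogeneous, 0.75))  # site is kept and labeled 5
print(site_label(mixed, 0.75))        # site is dropped (None)
```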
Error matrix for UMD, 75% covered
↓R/C→     1     2     4     5     6     7     8     9    10    12    16   xk+   PAi
1        24     0     2    13     0     0     3     0     0     0     0    42  0.57
2         0    26     2     0     0     0     2     0     0     0     0    30  0.87
4         4     1    13    30     1     2     5     1     1     1     1    60  0.22
5         2     0     6    16     0     0     2     0     0     0     0    26  0.62
6         0     0     1     1     0     1     2     0     1     0     0     6  0.00
7         0     0     0     0     1     1     0     0     1     0     0     3  0.33
8         0     3     0     0     0     3     0     1     2     0     0     9  0.00
9         0     0     0     1     0     0     1     0     0     0     0     2  0.00
10        0     0     0     0     1     2     1     1     6     3     0    14  0.43
12        0     0     1     4     0     0     0     3     2    11     0    21  0.52
16        2     0     0     0     0     1     2     0     2     0     1     8  0.13
x+k      32    30    25    65     3    10    18     6    15    15     2   221
PUj    0.75  0.87  0.52  0.25  0.00  0.10  0.00  0.00  0.40  0.73  0.50
Trace = 98   Po = 0.44   Pc = 0.13   Σ(k=1..q) xk+ x+k = 6197   κ = 0.36
Table 16: Error matrix and site-based accuracy coefficients for the UMD map in the
IGBP scheme.
It is important to note that the sites used for the analysis in the biome scheme
have a bias, since the SLCR labels were used to overcome ambiguities in cross-walking
the IGBP class labels to biome labels in the training site generation (section 3). Both
the EDC and UMD maps were cross-walked using the SLCR-biome LUT.
EDC map: 254 sites were used for the accuracy assessment of the EDC map in
the IGBP scheme (Table 17). The overall accuracy was 47% and κ was 0.38. Open
shrubland (class 7) and evergreen broadleaf forest (class 2) were found to have the
highest producer's accuracies (100% and 93%, respectively). Classes 5, 6, 8, 9, 11,
14 and 16 showed poor PAi (13% and less). In terms of user's accuracies most of the
classes performed relatively poorly, with the exception of classes 1, 2, 4, 10, 12 and
16, which showed at least 39% for PUj. Note that deciduous needleleaf forest (class 3)
is excluded from the table, since it was not classified in either of the maps. Also,
class 13 is not shown since an ancillary urban mask was used.
Table 19 shows the analysis for the EDC map in the biome scheme for 306 sites.
The overall accuracy was 84% and κ was 0.76. Shrubs (biome 2) and broadleaf crops
(biome 3) had a PAi of 100%. Broadleaf forests were classified with 96% accuracy,
whereas needleleaf forests showed only 59% for PAi. The table also shows high user's
accuracies, in particular for biomes 1 and 3 and the non-vegetated category.
Error matrix for EDC using training data, 75% covered
↓R/C→     1     2     4     5     6     7     8     9    10    12    14    16   xk+   PAi
1        26     0    13     4     0     0     0     0     0     0     0     0    43  0.60
2         1    28     1     0     0     0     0     0     0     0     0     0    30  0.93
4         4     1    39     6     1     2     3     1     1     1     1     1    61  0.64
5         3     0    19     3     0     0     1     0     0     0     0     0    26  0.12
6         1     0     3     0     0     1     0     0     1     0     0     0     6  0.00
7         0     0     0     0     0     3     0     0     0     0     0     0     3  1.00
8         0     3     0     0     0     3     1     0     1     0     1     0     9  0.11
9         0     0     1     0     0     0     0     0     0     0     1     0     2  0.00
10        0     0     0     0     0     3     1     3     6     0     1     0    14  0.43
12        0     1     5     0     0     0     0     0     1    13     1     0    21  0.62
14        0     0     6     0     0     1     0     0     5    18     1     0    31  0.03
16        3     0     0     0     0     3     0     0     0     0     1     1     8  0.13
x+k      38    33    87    13     1    16     6     4    15    32     7     2   254
PUj    0.68  0.85  0.45  0.23  0.00  0.19  0.17  0.00  0.40  0.40  0.14  0.50
Trace = 121   Po = 0.47   Pc = 0.15   Σ(k=1..q) xk+ x+k = 8883   κ = 0.38
Table 17: Error matrix and site-based accuracy coefficients for the EDC map in the
IGBP scheme.
UMD map: 221 sites were used for the accuracy assessment of the UMD map in
the IGBP scheme. Overall accuracy and κ were 44% and 0.36, respectively. PAi was
higher than 40% for classes 1, 2, 5, 10 and 12. User's accuracy was poor (25% and
less) for classes 5, 6, 7, 8 and 9. The other classes showed PUj greater than 40%.
Classes 3, 11 and 13 through 15 were not used (Table 16).
The results from the analysis of the UMD map in the biome scheme are shown in table
18. Overall accuracy and κ are 83% and 0.75, respectively. High PAi is shown for
biomes 3 and 5. Biome 3 also shows the highest PUj of 100%, as does the non-vegetated
category.
Error matrix for UMD, 90% covered
↓R/C→      1      2      3      4      5      6      7    xk+    PAi
1         24      3      0      5      0      2      0     34   0.71
2          2      5      0      0      0      0      0      7   0.71
3          0      0     18      0      0      0      0     18   1.00
4          0      0      0      5      1      0      0      6   0.83
5          0      1      0      1    143      5      0    150   0.95
6          0      0      0      1     27     39      0     67   0.58
7          1      0      0      1      1      0     19     22   0.86
x+k       27      9     18     13    172     46     19    304
PUj     0.89   0.56   1.00   0.38   0.83   0.85   1.00
Trace = 253   Po = 0.83   Pc = 0.33   Σ(k=1..q) xk+ x+k = 30683   κ = 0.75
Table 18: Error matrix and site-based accuracy coefficients for the UMD map in the
biome scheme.
Error matrix for EDC, 90% covered
↓R/C→      1      2      3      4      5      6      7    xk+    PAi
1         24      3      0      5      0      2      0     34   0.71
2          0      7      0      0      0      0      0      7   1.00
3          0      0     18      0      0      0      0     18   1.00
4          0      0      0      5      1      0      0      6   0.83
5          0      1      0      0    144      5      0    150   0.96
6          0      0      0      1     27     40      0     68   0.59
7          1      1      0      1      1      0     19     23   0.83
x+k       25     12     18     12    173     47     19    306
PUj     0.96   0.58   1.00   0.42   0.83   0.85   1.00
Trace = 257   Po = 0.84   Pc = 0.33   Σ(k=1..q) xk+ x+k = 30913   κ = 0.76
Table 19: Error matrix and site-based accuracy coefficients for the EDC map in the
biome scheme.
4.3.2 Pixel-Based Comparisons
Tables 29-31 and 32-34 (Appendix) show pixel-based comparisons between the
class maps in the IGBP scheme and the biome scheme, respectively. The differences
between two class maps are best represented by a confusion matrix. This
representation shows the confusion of one class in map A with all other classes in
map B in one table. Note that in this case rows and columns do not refer to reference
and classified data, but simply to different maps in the same classification scheme.
The diagonal values show the pixels that were classified to the same class in both
maps. The sum of all diagonal values divided by the matrix total gives the overall
agreement of the two maps. PCi and PCj refer to the proportion of pixels in agreement
in each row and column, respectively.
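The agreement measures defined above can be computed from two co-registered class
maps as in the following sketch. The two short pixel lists stand in for flattened
maps and are hypothetical, as are the function names.

```python
def cross_tabulate(map_a, map_b, classes):
    """Count pixels for every (class in map A, class in map B) combination."""
    index = {c: k for k, c in enumerate(classes)}
    table = [[0] * len(classes) for _ in classes]
    for a, b in zip(map_a, map_b):
        table[index[a]][index[b]] += 1
    return table


def agreement(table):
    """Return overall agreement plus per-row (PCi) and per-column (PCj) agreement."""
    size = len(table)
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    overall = sum(table[k][k] for k in range(size)) / total
    per_row = [table[k][k] / row_totals[k] if row_totals[k] else 0.0
               for k in range(size)]
    per_col = [table[k][k] / col_totals[k] if col_totals[k] else 0.0
               for k in range(size)]
    return overall, per_row, per_col


# Hypothetical flattened class maps of eight pixels each.
map_a = [1, 1, 1, 2, 2, 3, 3, 3]
map_b = [1, 1, 2, 2, 2, 3, 1, 3]
overall, pc_i, pc_j = agreement(cross_tabulate(map_a, map_b, classes=[1, 2, 3]))
```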
In tables 29 and 30 the UMD map is represented in rows and the BU and EDC
maps in columns. Table 31 shows the BU map in rows and the EDC map in columns.
Note that the row and column totals do not exactly correspond to the histogram in
table 20. This is because classes that were excluded from the map comparison, e.g.
class 13 (urban and built-up), were masked in the analysis. The overall agreement
between the three maps in both classification schemes is summarized in table 22.
Frequency of IGBP classes in the EDC, UMD and BU maps
              EDC                 UMD                 BU
Class      Pixels      %       Pixels      %       Pixels      %
1         3736947   17.0      2306360   10.5      3472296   15.9
2          354661    1.6       338431    1.5       356568    1.6
4         1488111    6.8       778641    3.6      1574119    7.2
5         2854132   13.0      1196545    5.5       948452    4.3
6          579582    2.6      1656401    7.6       398707    1.8
7         2306720   10.5      2970955   13.6      5047635   23.0
8         1571191    7.2      3263042   14.9      1469668    6.7
9           73694    0.3      3111528   14.2       316218    1.4
10        1658740    7.6      1977154    9.0       963878    4.4
11         359708    1.6            -      -            -      -
12        1852240    8.5      1818480    8.3      3693475   16.9
13          84539    0.4        84539    0.4        84539    0.4
14        1510139    6.9            -      -      1497631    6.8
15        1472376    6.7            -      -      1555290    7.1
16        1998884    9.2      2399588   11.0       605572    2.8
Total land mass = 21899509
Table 20: Frequency of classes in the IGBP scheme for the UMD, EDC and BU maps.
The frequencies of IGBP classes in the EDC, UMD and BU maps are shown in
table 20, and the frequencies of biome classes in table 21. The tables show that the
areal proportions in the three maps differ significantly. Note the high percentage of
croplands in the BU map. The distortions in proportions for the comparison in the
biome scheme are largely explained by the fact that these maps were generated using a
simple aggregation scheme (see maps in the Appendix). This demonstrates how
significant the errors may be if a map is aggregated to the biome scheme using simple
cross-walking rules.
59
Frequency of IGBP classes in the EDC, UMD and BU maps
EDC UMD BU
Class Pixels % Pixels % Pixels %
1 1658740 7.5 1647527 7.5 3686851 16.8
2 2306720 10.5 2947821 13.4 4896929 22.4
3 1852240 8.5 1429950 6.5 1378206 6.3
4 1644885 7.5 5627713 25.7 1847457 8.4
5 5276486 24.1 3632644 16.6 2006770 9.1
6 3736947 17.1 2262153 10.3 5337426 24.4
7 3555799 16.2 2484009 11.3 2748025 12.5
Total land mass = 21899509
Table 21: Frequency of classes in the biome scheme for the UMD, EDC and BU
maps.
               IGBP    Biome
UMD vs. EDC   38.8%    45.5%
UMD vs. BU    36.3%    42.7%
EDC vs. BU    38.3%    47.6%

Table 22: Overall agreement of the UMD, EDC and BU maps in the IGBP and biome
classification schemes.
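
A minimal sketch of how an overall agreement figure of the kind reported in table 22 could be computed, assuming it is the percentage of co-registered pixels that carry the same label in both maps. The function name and the toy label lists are hypothetical and are not taken from the analysis.

```python
def overall_agreement(map_a, map_b):
    """Return the percentage of pixels with identical labels in both maps."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must cover the same pixels")
    if not map_a:
        raise ValueError("maps must not be empty")
    same = sum(1 for a, b in zip(map_a, map_b) if a == b)
    return 100.0 * same / len(map_a)


# Toy example with eight co-registered pixels (labels are made up).
umd = [1, 1, 5, 6, 2, 2, 7, 4]
edc = [1, 5, 5, 6, 2, 3, 7, 7]
agreement = overall_agreement(umd, edc)
print(f"{agreement:.1f}%")
```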
5 Discussion
The analysis involved four main components: training data improvement,
classification performance in the IGBP scheme, classification performance in the biome
scheme, and map comparison. These four components are discussed in the
following sections.
5.1 Training Data Improvement
The initial generation of maps in both classification schemes provided important
insights into shortcomings of the training data. In particular, maps derived from the
extracted training data without any further pre-processing yielded poor results. The
associated site-based map accuracies were 55% for the IGBP scheme and 61% for the
biome scheme. Major misclassification errors were observed for IGBP classes
12 and 14 (cropland and cropland mosaics) and biome classes 1 and 3 (grasses/cereal
crops and broadleaf crops). In higher latitudes, these classes were frequently confused
with needleleaf forests. These errors were attributed to four main factors.
First, mislabeled or noisy training sites were found to introduce errors in the
classification algorithm. These problems were generally related to errors made during
the generation of the training site database. Some sites were too heterogeneous in their
NDVI signal, and others were likely to be mislabeled or incorrectly georeferenced. Also,
dropouts in the AVHRR data caused problems.
Second, seasonal trajectories of NDVI data were limited in their ability to predict
and discriminate certain classes from the given training data. For example,
the trajectories of certain cropland training sites were not distinct from those of
grassland sites.
Third, the representation of certain land cover classes in the training data was
insufficient. In particular, regions in northern latitudes were undersampled, because a
number of training sites could not be used in the analysis due to misregistration errors.
This particularly affected predictions of tundra (shrublands) and boreal needleleaf
forest.
Fourth, cropland classes were oversampled in the training data and C5.0 tended
to overpredict these classes. This was mainly a problem in high latitudes, where
needleleaf forests were misclassified as croplands.
These issues were addressed in four steps of training data improvement and are
reflected in the training sets I-V. Manual removal of within-site and within-class
outliers (step I) resulted in an improvement of 4% in overall accuracy for the IGBP
scheme, but only 1% for the biome scheme. Note that the outlier detection was
performed for IGBP classes, i.e., if a pixel or a site was identified as a multivariate
outlier in the IGBP scheme, the same pixel or site was removed from the training
data in the biome scheme. In this context, the smaller improvement produced for the
biome classification scheme was probably caused by the narrower definition of the
IGBP classes, i.e., an outlier that was identified within an IGBP class may not be
an outlier within biome classes. Even though this method involved a certain degree
of subjectivity, confidence was high that mislabeled and extremely heterogeneous
sites were removed from the training data. Unfortunately, the improvement due to
temporal smoothing of the AVHRR data was not quantified in this analysis. However,
significantly fewer artifacts were detected upon visual inspection of the maps.
The baseline dataset (step I) was not adequate to classify major ecological
regions. Visual comparison of each of the maps produced with existing vegetation
maps [Omernik 1987; Bailey 1996; Knapp 1965; Brown et al. 1998] revealed
weaknesses regarding the overall pattern and distribution of land cover types. The use
of SLCR labels improved the maps significantly and the patterns were generally in
better agreement with other map sources and expert knowledge. Note that the SLCR
labels were produced using extensive ancillary data in a labor intensive fashion
[Loveland et al. 1995] and provided high quality input to the classification algorithm.
The analysis of the decision tree created from the training data in association with SLCR
labels showed that the SLCR labels were frequently selected as a decision feature,
but did not outweigh NDVI. The percentage of SLCR labels chosen as a feature in
the decision tree ranged from 20-25%. This number can only be estimated from
multiple trees that were generated by boosting. This indicates the magnitude of the bias
introduced by using the SLCR labels in the training process.
Unfortunately, a significant portion of the training sites could not be used in the
analysis due to misregistration errors (approximately 400 out of 1000 sites). The
generation of additional training sites using the EDC and UMD maps was therefore an
important step in improving the properties of the final maps. Undersampled regions
were mostly located in northern latitudes where land cover is generally homogeneous.
Even though no outlier detection was performed on these sites, confidence was high
that these sites were of good quality, since the EDC and UMD maps were in agreement.
This observation is underscored by the improvement in overall accuracy from 62% to
68% for the IGBP map and 68% to 71% for the biome map in step IV. It is
acknowledged that the use of the UMD and EDC maps to generate new training sites
introduces an additional source of uncertainty that cannot be accounted for. Nonetheless,
this step was important to supplement the classification algorithm with training data
from undersampled regions.
Proportional sampling of classes in step V was particularly important since the
frequency distribution of land cover types in the training data did not reflect the
proportions of land cover types in other maps. The overall accuracy of 68% for the
IGBP classes in step IV is partly attributed to a bias towards overpredicting class
12 (which was oversampled in the training data). Once this bias was removed the
accuracy dropped by 4% in step V. This effect was not observed for the biome classes,
because the training data were rescaled to account for the smallest class in the dataset.
Therefore, the bias in the IGBP results was more pronounced than in the biome data.
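
The idea of proportional sampling can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in step V: the function name, the class names and all counts are hypothetical. Given the number of training pixels available per class and the class frequencies of a reference map, it returns the largest per-class sample sizes that match the reference proportions; the class that is scarcest relative to its target share limits the total.

```python
from fractions import Fraction
from math import floor


def proportional_sample_sizes(available, reference):
    """Largest per-class sample sizes that follow the reference proportions.

    available: {class: number of training pixels on hand}
    reference: {class: frequency of that class in a reference map}
    """
    # The class with the smallest available/reference ratio limits the total.
    # Exact fractions avoid floating-point rounding surprises.
    scale = min(Fraction(available[c], reference[c]) for c in reference)
    return {c: floor(scale * reference[c]) for c in reference}


# Hypothetical counts: croplands are heavily oversampled in the training data.
available = {"cropland": 900, "needleleaf": 300, "grassland": 200}
reference = {"cropland": 20, "needleleaf": 50, "grassland": 30}  # percent
sizes = proportional_sample_sizes(available, reference)
print(sizes)
```

In this toy case needleleaf forest is the limiting class, so only 120 of the 900 available cropland pixels would be retained.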
It must be noted that shortcomings in the sampling design will affect the
accuracy statistics derived from the training data [Congalton 1991; Stehman 1996]. The
sample of training sites is biased in three ways with respect to the SLCR map. First,
the SLCR labels were used to cross-walk the IGBP labels in the training data to
biome labels. Second, the SLCR labels were also used as a feature in the estimation of
the tree. Third, the generation of training sites used the SLCR map as a guideline for
a stratified sampling scheme. Unfortunately, only a few alternative map sources are
available at continental scales that could serve as sampling strata. For example, the
maps generated by Omernik [1987] and Bailey [1996] provide an alternative for an
independent sampling stratum. The accuracies reported herein are therefore expected
to contain errors and do not necessarily represent the exact accuracies. However, it is
difficult to quantify the magnitude of this bias. The issues relating to shortcomings
in the sampling scheme could not be addressed in this work and need to be assessed
in the future. It also has to be kept in mind that the generation of a statistically
sound training site sample at global scales is extremely expensive and labor intensive.
Visual inspection was a very important tool in the production of land cover maps
and allowed identification of weaknesses in the training data. This is a very
common approach in supervised classification and is frequently reported in the literature
[DeFries et al. 1998].
The land cover maps from EDC and UMD are the only comparable land cover
map products at a global to continental scale derived from 1 km AVHRR data.
Unfortunately, accuracy coefficients associated with these maps are yet to be published.
Therefore, the accuracies reported herein can only be benchmarked against results
from classification efforts at a different spatial resolution. Also, pixel-based
accuracies are generally reported in the literature. This research, however, focused on
site-based accuracies.
DeFries et al. [1998], for example, reported pixel-based overall accuracies ranging
from 81.4% to 90.3% using different phenological metrics for a global land cover
classification using 8 km AVHRR data. However, validation of these map products
was based on the same data that were used to train their supervised classification
algorithm. Therefore, these accuracies are expected to be biased. Friedl et al. [1999]
assessed the impact of boosting, phenological metrics and geographic location in
a supervised classification process using the 1 degree land cover set compiled by
DeFries and Townshend [1994] and the EDC IGBP land cover map for North America
[Loveland et al. 1995]. The associated accuracies ranged from 78.7 to 96.6% and from
67.4 to 79.5%, respectively. The 96.6% overall accuracy associated with the 1 degree
dataset, however, was mostly attributed to the effect of using geographic location in
the classification process and was not considered to be representative of the true
accuracy of the dataset. Also, non-independent splits of training and test data are
expected to cause a bias in the accuracy assessment.
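
One common way to limit the non-independence of training and test data is to hold out whole training sites rather than individual pixels, so that no site contributes pixels to both sets. The sketch below illustrates this idea only; the function `split_by_site`, its parameters and the site labels are hypothetical, and it is not claimed to be the splitting procedure used in any of the studies cited above.

```python
import random


def split_by_site(site_ids, test_fraction=0.25, seed=0):
    """Assign whole sites to train or test so that no site appears in both.

    site_ids: one site identifier per pixel.
    Returns two lists of pixel indices (train, test).
    """
    sites = sorted(set(site_ids))
    rng = random.Random(seed)
    rng.shuffle(sites)
    n_test = max(1, round(len(sites) * test_fraction))
    test_sites = set(sites[:n_test])
    train_idx = [i for i, s in enumerate(site_ids) if s not in test_sites]
    test_idx = [i for i, s in enumerate(site_ids) if s in test_sites]
    return train_idx, test_idx


# Hypothetical pixel-to-site assignment: ten pixels drawn from four sites.
pixel_sites = ["a", "a", "a", "b", "b", "c", "c", "c", "d", "d"]
train_idx, test_idx = split_by_site(pixel_sites)
print(len(train_idx), len(test_idx))
```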
5.2 IGBP Classification Performance
The major improvement in the results from the IGBP classification scheme across
steps I-V can be attributed to the reduction in errors of omission and commission with
respect to class 1 (evergreen needleleaf forest) and class 4 (deciduous broadleaf forest).
In particular, the confusion with croplands was significantly reduced and was limited
to other forest classes in training set V. At this time the prediction of croplands in
northern latitudes could not be removed. Misclassification of croplands was limited
to class 10 (grasslands) and class 14 (cropland mosaics). Further, misclassification
of class 10 (grasslands) as class 7 (open shrubland) and class 16 (barren/sparsely
vegetated) was still present in training set V, but is considered to be a relatively
minor and expected error. The temporal signal and geographic occurrence of these
classes are very similar.
These results suggest that some of the IGBP classes are not separable using time
series of NDVI. The use of SLCR labels helped to separate some of these classes
somewhat better. However, major confusions are consistently observed for classes that
are mixtures by definition or which possess a continuum of fractional cover. More
specifically, high errors of omission and commission for mixed forests (class
5) are found throughout training sets I-V. Also, confusion of croplands (class 12) and
cropland mosaics (class 14) is consistently observed, as well as confusion of grasslands
(class 10) with open shrubland (class 7) and barren/sparsely vegetated (class 16).
Finally, confusion of savannas (class 9) with forest classes (1, 2, 4) and grassland
(class 10) can be attributed to the continuum of fractional cover for savannas.
A significant amount of post-classification processing would be required to remove
these misclassifications entirely (e.g., manually pruning decision trees and removing
misclassified leaves from the estimated tree). This process is very labor intensive and
cannot be performed on a routine basis. Also, manual pruning of trees generated by
C5.0 is more complicated since the trees are generally more complex and larger than
those generated by Splus, for instance.
5.3 Biome-Level Classification Performance
Error matrices produced for the biome scheme show that the improvement in overall
accuracy can be attributed largely to the improvement in accuracy for needleleaf
forest. In training set I the producer's and user's accuracies were 42% and 51%,
respectively. These statistics improved to 84% and 81% in training set V.
Even though the accuracies for savannas improved from training set I to V, savannas
remained the class with the lowest accuracies. User's accuracy for biome 2 (shrubs) is
generally higher than the producer's accuracy. This is largely caused by the confusion
with savannas. A constant number of pixels in the broadleaf and needleleaf forest
classes is misclassified as one of the other forest classes. This can be explained by the
absence of a mixed forest class in the biome scheme. Note that there were a number of
mixed forest training sites that needed to be assigned to either biome 5 or 6. This
confusion could not be resolved.
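
For reference, producer's and user's accuracies can be derived from an error matrix as sketched below. The sketch assumes that rows hold the reference classes and columns the mapped classes; the orientation of the matrices used in this work is not restated here, and the 3-class matrix is made up for illustration.

```python
def class_accuracies(matrix):
    """Return (producer, user) accuracies in percent for each class.

    matrix[i][j] = number of samples of reference class i mapped as class j.
    Producer's accuracy relates to errors of omission (row totals),
    user's accuracy to errors of commission (column totals).
    """
    n = len(matrix)
    producer, user = [], []
    for k in range(n):
        row_total = sum(matrix[k])
        col_total = sum(matrix[i][k] for i in range(n))
        correct = matrix[k][k]
        producer.append(100.0 * correct / row_total if row_total else float("nan"))
        user.append(100.0 * correct / col_total if col_total else float("nan"))
    return producer, user


# Made-up error matrix for three classes.
matrix = [
    [40, 10, 0],
    [5, 30, 15],
    [5, 0, 45],
]
producer, user = class_accuracies(matrix)
print(producer, user)
```

In this toy matrix the second class has a user's accuracy (75%) above its producer's accuracy (60%), the same pattern noted for biome 2: few samples are wrongly mapped into the class, but many of its reference samples are mapped elsewhere.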
Surprisingly, the confusion between cereal crops (biome 1) and broadleaf crops
(biome 3) was not more pronounced than for the other biomes. Note that the training
sites that were previously labeled as IGBP class 12 (croplands) were translated to
the biome label using a SLCR-to-biome LUT. This suggests that the use of SLCR
labels to cross-walk the training data resolved the issues of ambiguous translation of
IGBP classes relatively well.
Examination of table 2 shows that a wide spectrum of naturally occurring
vegetation is not captured by the 6-biome classification scheme. In particular the
definition of fractional cover is not consistent, i.e., broadleaf forests and needleleaf
forests are defined by ground cover greater than 70%, whereas savannas are defined by
less than 20% overstory. A significant amount of naturally occurring land cover,
however, falls in the category of 20-70% ground cover [DeFries et al. 1998]. This caused
problems in cross-walking the IGBP classes to biomes and the use of the SLCR-to-biome
LUT could only partially resolve these issues. This is reflected in the accuracies for
savannas (32% user's accuracy and 52% producer's accuracy in training set V).
5.4 Separability of Land Cover Classes
The overall accuracies produced for the biome scheme were generally higher than
those produced for the IGBP scheme. Under the assumption that the features
selected in the classification process were adequate for the classification of IGBP and
biome classes, this may theoretically be attributed to two effects. First, it may
suggest that the biome classes are more separable than IGBP classes in the 12 month
NDVI data space. Second, this result may be attributed to a statistical effect, i.e.,
the likelihood of misclassification is smaller when there are fewer classes in the
classification system. In order to assess the validity of these two alternatives, the
confusion matrices in the IGBP scheme were aggregated to a 7-class scheme similar to the
biome scheme. The comparison showed that the accuracies of the aggregated scheme were
of about the same magnitude as the accuracies associated with the biome scheme.
The results from this aggregation suggest that the improvement in accuracy may be
largely attributed to the fact that there are fewer classes in the biome scheme than
in the IGBP scheme.
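
The aggregation test described above can be illustrated with a small sketch. It collapses a fine-class confusion matrix into coarser groups and compares the overall accuracies. The 4-class matrix and the grouping are made up; the actual IGBP-to-7-class grouping used in the analysis is not reproduced here.

```python
def overall_accuracy(matrix):
    """Percentage of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return 100.0 * correct / total


def aggregate_matrix(matrix, groups):
    """Collapse a confusion matrix; groups[i] is the coarse index of class i."""
    m = max(groups) + 1
    out = [[0] * m for _ in range(m)]
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            out[groups[i]][groups[j]] += count
    return out


# Made-up confusion matrix for four fine classes.
fine = [
    [30, 8, 2, 0],
    [10, 25, 5, 0],
    [1, 4, 35, 10],
    [0, 0, 6, 24],
]
groups = [0, 0, 1, 1]  # fine classes 0,1 -> coarse 0; 2,3 -> coarse 1
coarse = aggregate_matrix(fine, groups)
print(overall_accuracy(fine), overall_accuracy(coarse))
```

Confusions between fine classes that fall into the same coarse group move onto the diagonal, so the overall accuracy cannot decrease under aggregation. This is the statistical effect referred to above.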
However, it is important to note that the classification algorithm used solely a one
year time series of NDVI in association with ancillary data in order to separate both
IGBP and biome classes. Even though time series of NDVI are generally used for
the purpose of land cover classification [DeFries et al. 1998; Loveland et al. 1995], it
can be argued that the temporal trajectory of NDVI may
not be a sufficient metric to differentiate land cover types. This is particularly the
case for biomes, which are defined by structural properties rather than phenological
attributes. It is indeed questionable whether the definition of the IGBP land cover
classes is suitable for remote sensing applications. The confusion among classes in the
IGBP scheme may be explained by their inseparability based on the temporal profile
of NDVI.
As a consequence it would be reasonable to consider including additional
metrics in the classification process that account for the structural properties of biomes.
Radiative transfer theory of vegetation canopies supports the hypothesis that a
definition of land cover types based on structural properties is more adequate for remote
sensing applications, since the geometric properties directly translate into a
particular radiative transfer regime for different biomes [Myneni et al. 1990; Myneni et al.
1997]. The introduction of AVHRR channel data and directional information into the
classification process may therefore provide a potential to improve classification
accuracies with respect to biomes. Unfortunately, directional measurements are not yet
available at continental scales. MISR will therefore be an important source of data
for future classification efforts. The information contained in AVHRR channel data
has been used by DeFries et al. [1998] in a supervised classification and was found to
improve classification accuracies. The algorithm applied for this research, however,
is very sensitive to noise and it is crucial to reduce the amount of noise in the
training data in order to achieve reasonable results. AVHRR channel data were
therefore considered an insufficient source of additional classification features.
5.5 Map Comparison
The reference data used to estimate accuracy coefficients for the EDC and UMD maps
sampled all the classes with approximately the correct proportions. However, it must
be noted that many sites could not be used in the analysis because they were either
misregistered or not entirely covered by one class. Therefore, the set of reference sites
was considerably reduced and some class-specific accuracies are based on a relatively
small sample. The sites used for accuracy assessment, however, were considered the
best subset available to evaluate the EDC and UMD maps without significant
bias and inaccuracies. Also, note that the accuracy coefficients derived from the
cross-validation trials are not directly comparable with those estimated by overlaying the
sites with the EDC and UMD maps. Nonetheless, the estimated accuracies generally
reflect the accuracies obtained from the decision tree algorithm as well as the class
specific weaknesses. Even though the estimated biome level accuracies may be biased,
the associated accuracies are not unreasonably inflated. This can be explained by
the fact that both teams at EDC and UMD performed extensive manual labeling and
editing of the final map product and resolved many errors that are still present in
the BU map.
6 Conclusions
The general objective of this research was to generate a biome-based land cover map
for North America using decision trees. The more specific objectives were to use
multi-source data to generate land cover maps in the IGBP and biome classification
schemes, and to compare the resultant maps with existing maps at the same scale and
resolution.
Training data for the supervised classification algorithm were only available in the
IGBP classification scheme, which is not consistent with the land surface
parameterization used by radiative transfer models to retrieve LAI and FAPAR from spectral
reflectances. Therefore, SLCR labels were used to cross-walk the training data to
biome classes. The training data were pre-processed and improved in 5 steps. Final
maps in both the IGBP and biome classification schemes were then generated. These
maps were compared with other maps at the same scale and resolution in both
classification schemes (biome and IGBP). The results from this analysis point to three
major conclusions:
First, the decision tree algorithm implemented in this research provides a powerful
technique to map biomes at continental scales using a time series of AVHRR NDVI
data in association with ancillary data sources. However, human interaction plays a
very important role at several stages of the mapping process.
Second, using AVHRR NDVI and SLCR labels as features in the classification
process, biomes cannot be mapped with significantly higher accuracies than IGBP
classes. Lower accuracies are generally associated with transitional land cover types
and types that occur in mixtures with other classes. In this context, it needs to be
noted that the features used for this work (i.e., NDVI) do not necessarily represent
the best metrics to characterize the structural properties of biomes.
Third, the utilization of multiple data sources was effective in generating a
biome-based land cover map. SLCR labels can help reduce ambiguities in cross-walking
IGBP classes to biome classes. The SLCR labels also represent a powerful variable in
estimating decision trees for the mapping of IGBP classes and biomes at a continental
scale. Further, areas of agreement between the UMD and EDC maps are useful for
generating additional reference data for a supervised classi�cation approach using
decision trees.
Besides these general conclusions, additional lessons were learned in the mapping
process. In particular, exploratory data analysis involving the detection of
multivariate outliers in the training data is a crucial step. Even though the effect of
removing outliers is not directly reflected in the overall accuracy coefficients, class
specific misclassifications were affected. Also, specific misclassification errors in the
maps were corrected by the removal of outliers. For this analysis the removal was
performed interactively in a manual fashion. For routine mapping of global land cover
this step needs to be automated in a rigorous way.
Visual inspection of the maps produced through this work demonstrated that
the accuracy coefficients did not necessarily represent the particular properties of the
maps. This is particularly true for overall accuracy and �, but is also true for class
specific user's and producer's accuracies. For instance, an error matrix may show
confusion between evergreen needleleaf forest and evergreen broadleaf forest. If this
confusion occurs in Florida this may not be of concern. However, if this kind of
confusion is observed in Alaska, it is a much more serious type of error.
The analysis also identified areas to be addressed through future research. First,
the given set of training site samples had shortcomings in several respects. Current
efforts in the context of the EOS validation program will provide substantially better
resources to set up more adequate sampling designs for global scale studies. Second,
machine learning techniques are a new tool in land cover classification research and
current research will provide a better understanding of the capabilities of these
techniques. Third, the availability of more high quality data sources at high resolution
and at a global scale will provide better ways to validate global scale products.
Finally, the source data that will be available for global land cover classification from
MODIS and MISR will be dramatically superior to AVHRR data due to higher
spatial resolution, higher signal to noise ratios, better calibration, and the synergism
inherent to these two instruments.
References
Asrar, G., Myneni, R., and Choudhury, B. 1992. Spatial heterogeneity in
vegetation canopies and remote sensing of absorbed photosynthetically active
radiation: A modeling study. Remote Sensing of Environment 41, 85-103.
Bailey, R. G. 1996. Ecosystem Geography. Springer, New York.
Barnes, W., Pagano, T., and Salomonson, V. 1998. Prelaunch Characteristics
of the Moderate Resolution Imaging Spectroradiometer (MODIS) on EOS-AM1.
IEEE Transactions on Geoscience and Remote Sensing 36, 1088-1100.
Breiman, L., Friedman, J., Olshen, R., and Stone, C. 1984. Classification
and Regression Trees. Wadsworth International Group, Belmont, CA.
Brown, D., Reichenbacher, F., and Franson, S. 1998. A classification of
North American biotic communities. The University of Utah, Salt Lake City.
Carpenter, G., Grossberg, S., Markuzon, N., Reynolds, J., and Rosen,
D. 1992. Fuzzy ARTMAP: A neural network architecture for incremental
supervised learning of analog multidimensional maps. IEEE Transactions on Neural
Networks 3, 698-713.
Choudhury, B. 1991. Multispectral satellite data in the context of land surface
heat balance. Reviews of Geophysics 29, 217-236.
Cihlar, J., Ly, H., Chen, Z., Pokrant, H., and Huang, F. 1997.
Multitemporal, multichannel AVHRR datasets for land biosphere studies-artifacts and
corrections. Remote Sensing of Environment 60, 35-37.
Cliff, A. and Ord, J. 1973. Spatial Autocorrelation. Pion Limited, London
(England).
Cohen, J. 1960. A coefficient of agreement for nominal scales. Educational and
Psychological Measurement 20(1), 37-46.
Congalton, R. 1988. Using Spatial Autocorrelation Analysis to Explore the Errors
in Maps Generated from Remotely Sensed Data. Photogrammetric Engineering
and Remote Sensing 54(5), 587-592.
Congalton, R. 1991. A review of assessing the accuracy of classifications of
remotely sensed data. Remote Sensing of Environment 37, 35-46.
DeFries, R., Field, C., Fung, I., Justice, C., Los, S., Matson, P. A.,
Matthews, E., Mooney, H., Potter, C., Prentice, K., Sellers, P.,
Townshend, J., Tucker, C., Ustin, S., and Vitousek, P. 1995.
Mapping the land surface for global atmosphere-biosphere models: Towards
continuous distributions of vegetation's functional properties. Journal of Geophysical
Research 100, D10, 20,867-20,882.
DeFries, R., Hansen, M., and Townshend, J. 1998. Continuous fields of
vegetation properties from multiyear, 8km AVHRR data, training data derived from
Landsat Imagery in decision tree classifiers. International Journal of Remote
Sensing. (Submitted).
DeFries, R., Hansen, M., Townshend, J., and Sohlberg, R. 1998. Global
land cover classification at 8Km spatial resolution: the use of training data derived
from Landsat imagery in decision tree classifiers. International Journal of Remote
Sensing 19, 3141-3168.
DeFries, R. and Townshend, J. 1994. NDVI-derived land cover classifications
at a global scale. International Journal of Remote Sensing 15, 3567-3586.
Dickinson, R. and Henderson-Sellers, A. 1988. Modeling deforestation: A
study of GCM Land-Surface parameterization. Q.J.R. Meteorological Society 114,
439-462.
Friedl, M. and Brodley, C. 1997. Decision tree classification of land cover from
remotely sensed data. Remote Sensing of Environment 61, 399-409.
Friedl, M., Brodley, C., and Strahler, A. 1999. Maximizing land cover
classification accuracies produced by decision trees at continental to global scales.
IEEE Transactions on Geoscience and Remote Sensing, In Press.
Gholz, H. 1982. Environmental limits on aboveground net primary production,
leaf area, and biomass in vegetation zones of the Pacific Northwest. Ecology 63,
469-481.
Gopal, S. and Woodcock, C. 1994. Theory and methods for accuracy assessment
of thematic maps using fuzzy sets. Photogrammetric Engineering and Remote
Sensing 60, 181-188.
Gopal, S. and Woodcock, C. 1996. Remote sensing of forest change using
artificial neural networks. IEEE Transactions on Geoscience and Remote Sensing 34, 2,
398-404.
Goward, S., Waring, R., Dye, D., and Yang, Y. 1994. Ecological remote
sensing at OTTER: satellite macroscale observations. Ecological Applications 4,
322-343.
Grier, C. and Running, S. 1977. Leaf area of mature northwestern coniferous
forests: relations to site water balance. Ecology 58, 893-899.
Hansen, M., DeFries, R., Townshend, J., and Sohlberg, R. 1999. Global
land cover classification at 1km spatial resolution using a classification tree
approach. International Journal of Remote Sensing. (Submitted).
Holben, B. 1986. Characteristics of maximum-value composite images from
temporal AVHRR data. International Journal of Remote Sensing 5, 145-160.
Jarvis, P. and McNaughton, K. 1986. Stomatal control of transpiration: scaling
up from leaf to region. Advances in Ecological Research 15, 1-49.
Justice, C., Hall, D., Salomonson, V., Privette, J., Riggs, G., Strahler,
A., Lucht, W., Myneni, R., Knjazihhin, Y., Running, S., Nemani, R.,
Vermote, E., Townshend, J., DeFries, R., Roy, D., Wan, Z., Huete,
A., van Leeuwen, W., Wolfe, R., Giglio, L., Muller, J.-P., Lewis, P.,
and Barnsley, M. 1998. The Moderate Resolution Imaging Spectroradiometer
(MODIS): Land remote sensing for global change research. IEEE Transactions on
Geoscience and Remote Sensing 36(4), 1228-1249.
Justice, C., Townshend, J., Holben, B., and Tucker, C. 1985. Analysis of the
phenology of global vegetation using meteorological satellite data. International
Journal of Remote Sensing 6, 1271-1318.
Knapp, R. 1965. Die Vegetation von Nord- und Mittelamerika und der Hawaii-Inseln.
Gustav Fischer.
Knyazikhin, Y., Martonchik, J., Myneni, R., Diner, D., and Running, S.
1998. Synergistic algorithm for estimating vegetation canopy leaf area index and
fraction of absorbed photosynthetically active radiation from MODIS and MISR
data. Journal of Geophysical Research 103, 32,257-32,279.
Lean, J. and Warilow, D. 1989. Simulation of the regional climatic impact of
Amazon deforestation. Nature 342, 411-413.
Loveland, T. and Shaw, D. 1996. Multiresolution Land Characterization:
Building Collaborative Partnerships. In Technologies for Biodiversity Gap Analysis:
Proceedings of the ASPRS/GAP Symposium, Charlotte, NC, in press.
Loveland, T. R., Merchant, J. W., Brown, J. F., Ohlen, D. O., Reed,
B. C., Olsen, P., and Hutchinson, J. 1995. Seasonal land cover of the United
States. Annals of the Association of American Geographers 85, 2, 339-355.
Loveland, T. R., Merchant, J. W., Ohlen, D. O., and Brown, J. F. 1991.
Development of a land cover characteristics data base for the conterminous U.S.
Photogrammetric Engineering and Remote Sensing 57, 1453-1463.
Ma, Z. and Redmond, R. 1995. Tau Coefficients for Accuracy Assessment of
Classification of Remote Sensing Data. Photogrammetric Engineering and Remote
Sensing 61(4), 435-439.
Matthews, E. 1983. Global vegetation and land use: New high resolution databases
for climate studies. Journal of Climate and Applied Meteorology 22, 474-487.
Mingers, J. 1989. An empirical comparison of pruning methods for decision tree
induction. Machine Learning 4, 227-243.
Moody, A. and Strahler, A. 1994. Characteristics of composited AVHRR
data and problems in their classification. International Journal of Remote
Sensing 15(17), 3473-3491.
Myneni, R., Asrar, G., and Hall, F. 1992. A three dimensional radiative
transfer method for optical remote sensing of vegetated land surfaces. Remote Sensing
of Environment 41, 105-121.
Myneni, R., Hall, F., Sellers, P., and Marshak, A. 1995. The
interpretation of spectral vegetation indices. IEEE Transactions on Geoscience and Remote
Sensing 33, 481-486.
Myneni, R., Los, S., and Asrar, G. 1995. Potential gross primary production
of terrestrial vegetation from 1982-1990. Geophysical Research Letters 22,
2617-2620.
Myneni, R., Maggion, S., Iaquinta, J., Privette, J., Gobron, N., Pinty,
B., Verstraete, M., Kimes, D., and Williams, D. 1995. Optical remote
sensing of vegetation: Modeling, caveats and algorithms. Remote Sensing of
Environment 51, 169-188.
Myneni, R., Nemani, R., and Running, S. 1997. Estimation of global leaf area
index and absorbed PAR using radiative transfer models. IEEE Transactions on
Geoscience and Remote Sensing 35, 1380-1393.
Myneni, R. and Williams, D. 1994. On the relationship between FAPAR and
NDVI. Remote Sensing of Environment 49, 200-211.
Myneni, R. B., Asrar, G., and Gerstl, S. A. W. 1990. Radiative transfer
in three dimensional leaf canopies. Transport Theory and Statistical Physics 19,
205-250.
Nemani, R., Pierce, L., Running, S., and Goward, S. 1993. Developing
satellite derived estimates of surface moisture status. Journal of Applied
Meteorology 32, 548-557.
Nemani, R. and Running, S. 1997. Land Cover Characterization Using
Multitemporal Red, Near-IR, And Thermal-IR Data From NOAA/AVHRR. Ecological
Applications 7(1), 79-90.
Olson, J. and Watts, J. 1982. Major world ecosystem complexes. Oak Ridge
National Lab, Oak Ridge, TN.
Omernik, J. 1987. Ecoregions of the conterminous United States. Annals of the
Association of American Geographers 77, 118-125.
Prentice, C., Cramer, W., Harrison, S., Leemans, R., Monserud, R.,
and Solomon, R. 1992. A global biome model based on plant physiology and
dominance, soil properties and climate. Journal of Biogeography 19, 117-134.
Quinlan, J. 1987. Simplifying decision trees. International Journal of Man-machine
Studies 27, 221–234.
Quinlan, J. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San
Mateo, CA.
Quinlan, J. 1996. Bagging, boosting, and C4.5. Proceedings of the Thirteenth
National Conference on Artificial Intelligence Portland, OR, AAAI Press, 725–730.
Rohlf, F. 1975. Generalization of the Gap Test for the Detection of Multivariate
Outliers. Biometrics 31, 93–101.
Rosenfield and Fitzpatrick-Lins 1986. A coefficient of agreement as a measure
of thematic classification accuracy. Photogrammetric Engineering and Remote
Sensing 52(2), 223–227.
Ruimy, A., Saugier, B., and Dedieu, G. 1994. Methodology for the estimation
of net primary production from remotely sensed data. Journal of Geophysical
Research 99, 5263–5283.
Running, S. and Hunt, E. 1993. Generalization of a forest ecosystem process
model for other biomes, BIOME-BGC, and an application for global-scale model.
In: Scaling Physiological Processes: Leaf to Globe, Ehleringer, J. and Field, C.
Running, S., Justice, C., Salomonson, V., Hall, D., Barker, J.,
Kaufmann, Y., Strahler, A., Huete, A., Vanderbilt, J. M. V., Wan, Z.,
Teillet, P., and Carneggie, D. 1994. Terrestrial remote sensing science
and algorithms planned for EOS/MODIS. International Journal of Remote
Sensing 15, 3587–3620.
Running, S. W., Loveland, T. R., Pierce, L. L., Nemani, R. R., and
Hunt Jr., E. R. 1995. A remote sensing based vegetation classification logic
for global land cover analysis. Remote Sensing of Environment 51, 39–48.
Safavian, S. R. and Landgrebe, D. 1991. A survey of decision tree classifier
methodology. IEEE Transactions on Systems, Man, and Cybernetics 21(3),
660–674.
Schowengerdt, R. 1997. Remote Sensing, models and methods for image
processing. Academic Press.
Sellers, P., Mintz, Y., Sud, Y., and Dalcher, A. 1986. A simple biosphere
model (SiB) for use within general circulation models. Journal of Atmospheric
Science 43, 505–531.
Sellers, P. and Schimel, D. 1993. Remote Sensing of the land biosphere and
biogeochemistry in the EOS era: Science priorities, methods and implementation
- EOS land biosphere and biogeochemical cycles panels. Global and Planetary
Change 7, 279–297.
Schapire, R. 1990. The strength of weak learnability. Machine Learning 5(2),
197–227.
Stehman, S. 1996. Estimating the Kappa Coefficient and its Variance under
Stratified Random Sampling. Photogrammetric Engineering and Remote Sensing 62(4),
401–402.
Stehman, S. 1997. Selecting and Interpreting Measures of Thematic Classification
Accuracy. Remote Sensing of Environment 62, 77–89.
Steinwand, D. 1994. Mapping raster imagery to the Interrupted Goode Homolosine
projection. International Journal of Remote Sensing 15(17), 3463–3471.
Strahler, A., Townshend, J., Muchoney, D., Borak, J., Friedl, M.,
Gopal, S., Hyman, A., Moody, A., and Lambin, E. 1996. MODIS Land
Cover Product Algorithm Theoretical Basis Document (ATBD), V4.1. Boston
University Center for Remote Sensing.
Townshend, J., Justice, C., Gurney, C., and McManus, J. 1992. The impact
of misregistration on change detection. IEEE Transactions on Geoscience and
Remote Sensing 30(5), 1054–1060.
Townshend, J., Justice, C., Li, W., Gurney, C., and McManus, J. 1991.
Global land cover classification by remote sensing: Present capabilities and future
possibilities. Remote Sensing of Environment 35, 243–255.
Townshend, J. and Tucker, C. 1984. Objective assessment of Advanced Very
High Resolution Radiometer data for land cover mapping. International Journal
of Remote Sensing 5, 497–504.
Tucker, C. 1979. Red and photographic infrared linear combination for monitoring
vegetation. Remote Sensing of Environment 8, 127–150.
Tucker, C., Justice, C., and Prince, S. 1986. Monitoring the grasslands of the
Sahel 1984-1985. International Journal of Remote Sensing 7, 1571–1782.
Webb, W., Lauenroth, W., Szareck, S., and Kinerson, R. S. 1983. Primary
production and abiotic controls in forests, grassland, and desert ecosystems in the
United States. Ecology 64, 134–151.
Wilson, M. and Henderson-Sellers, A. 1985. A global archive of land cover
and soils data for use in general circulation models. Journal of Climatology 5,
119–143.
Wolfe, R., Roy, P., and Vermote, E. 1998. MODIS Land Data Storage,
Gridding, and Compositing Methodology: Level 2 Grid. IEEE Transactions on
Geoscience and Remote Sensing 36, 1324–1338.
Woodward, F. 1987. Climate and plant distribution. Cambridge University Press,
Cambridge.
Zhu, Z. and Yang, L. 1996. Characteristics of the 1-km AVHRR data set for North
America. International Journal of Remote Sensing 17, 1915–1924.
A Appendix
IGBP Land Cover Classes
Class  Name                               Ground Cover  Canopy Height  Description
1      Evergreen Needleleaf Forest (ENF)  > 60%         > 2m           woody, green year-round
2      Evergreen Broadleaf Forest (EBF)   > 60%         > 2m           woody, green year-round
3      Deciduous Needleleaf Forest (DNF)  > 60%         > 2m           woody, shed leaves during dry season
4      Deciduous Broadleaf Forest (DBF)   > 60%         > 2m           woody, shed leaves in annual cycle
5      Mixed Forest (MXF)                 > 60%         > 2m           woody, needleleaf/broadleaf mixture, neither component > 60%
6      Closed Shrubland (CSH)             > 60%         < 2m           woody, herbaceous understory, evergreen or deciduous
7      Open Shrubland (OSH)               < 60%         < 2m           woody, sparse herbaceous understory, evergreen or deciduous
8      Woody Savannas (WSA)               30–60%        > 2m           tree/shrub, herbaceous understory, evergreen or deciduous
9      Savannas (SAV)                     10–30%        > 2m           tree/shrub, herbaceous understory, evergreen or deciduous
10     Grasslands (GRL)                   < 10%         < 2m           herbaceous
11     Permanent Wetlands (PWL)           -             -              water mosaic, herbaceous/woody, salt, brackish or fresh water
12     Croplands (CRL)                    > 60%         < 2m           broadleaf crops, cereal crops
13     Urban and Built-Up (URB)           -             -              man-made structures, buildings
14     Cropland Mosaics (CRM)             > 60%         -              croplands/nat. vegetation mosaic, neither component > 60%
15     Snow and Ice (SNI)                 -             -              snow/ice covered most of the year
16     Barren/Sparsely Vegetated (BSV)    -             -              exposed soil, sand, rocks
17     Water Bodies (WAT)                 -             -              oceans, lakes, reservoirs, rivers
Table 23: IGBP class definitions
Error matrix for uncleaned training data (I)
Rows (↓): R; columns (→): C
R\C     1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   x_k+  PA_i
1     446  222    0   16  171   20    6    5   21   14    0  116    0   44    0    0    0   1081  0.41
2     106 1628    0    0   70   16    0   25   36   43    0   60    0    8    0    0    0   1992  0.82
3       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
4      37    1    0  442  106   37   10    2    0   57    0   60    0   64    0   19    0    845  0.52
5     222   74    0  123  439   33    3   22    4    8    0   82    0   46    0    0    0   1056  0.42
6      38   13    0   46   51  189   27    0   32   31    0   43    0    9    0    0    0    479  0.39
7       6    0    0   45    5   65  186    3    4   83    0   44    0   38    0   29    0    508  0.37
8       7   94    0    9   38    9   17   12   10   23    0   99    0    1    0    6    0    325  0.04
9      11   26    0    0    4   15    1    3   57   29    0   91    0    5    0    0    0    242  0.24
10     35  149    0   28   16   41   22    7   48  278    0  264    0   41    0  238    0   1167  0.24
11      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
12    103   38    0   22  107   43   36   27   36  155    0 1805    0  433    0   49    0   2854  0.63
13      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
14     40   72    0  107   74    3    2    4   14   13    0  325    0 1171    0    0    0   1825  0.64
15      0    0    0    0    0    0    1    0    0    0    0    0    0    0    1   32    0     34  0.03
16      1    1    0   12    0    0    6    0    0  115    0   13    0    6    1  705    0    860  0.82
17      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
x_+k 1052 2318   NA  850 1081  471  317  120  262  849   NA 3002   NA 1866    2 1078    2  13268
PU_j 0.42 0.70   NA 0.52 0.41 0.40 0.59 0.10 0.22 0.33   NA 0.60   NA 0.63 0.50 0.65 0.00
Trace = 7359    Po = 0.55    Pc = 0.12    Σ_{k=1}^{q} x_k+ x_+k = 21994592    κ = 0.49
Table 24: Error matrix for IGBP classes and site-based accuracy coefficients for the
uncleaned training data set (I).
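The summary coefficients printed with Table 24 can be checked from three of the reported quantities. The sketch below is an illustration, not code from the thesis. It assumes the standard definitions Po = trace / N, Pc = Σ x_k+ x_+k / N², and κ = (Po − Pc) / (1 − Pc), with N taken as the grand total 13268; the function and variable names are hypothetical.

```python
def kappa_from_summary(trace, n_total, sum_marginal_products):
    """Return (Po, Pc, kappa) from error-matrix summary statistics.

    Assumes the standard kappa definitions; names are illustrative only.
    """
    p_o = trace / n_total                        # observed agreement
    p_c = sum_marginal_products / n_total ** 2   # chance agreement
    kappa = (p_o - p_c) / (1.0 - p_c)
    return p_o, p_c, kappa


# Values printed with Table 24 (uncleaned training data, I).
p_o, p_c, kappa = kappa_from_summary(7359, 13268, 21994592)
print(f"Po = {p_o:.2f}, Pc = {p_c:.2f}, kappa = {kappa:.2f}")
# prints: Po = 0.55, Pc = 0.12, kappa = 0.49
```

Under these assumed definitions the rounded results match the values reported for Table 24.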
Error matrix for cleaned training data (II)
Rows (↓): R; columns (→): C
R\C     1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   x_k+  PA_i
1     528  119    0   35  164    7    9    1   23    7    0   91    0   25    0    0    0   1009  0.52
2      94 1549    0    0   63    3    0   24   49   47    0   45    0    5    0    0    0   1879  0.82
3       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
4      57    0    0  435   80   35    7   10    0   70    0   52    0   64    0    8    0    818  0.53
5     191   93    0  103  388   27   13   40    5   17    0   93    0   35    0    0    0   1005  0.39
6      22   11    0   55   27  156   43    0   30   59    0   69    0    1    0    0    0    473  0.33
7      11    0    0    8    5   54  186    0    8  100    0   64    0   46    0   23    0    505  0.37
8      34   94    0    4   15   12    0   32   11   15    0   83    0    0    0    0    0    300  0.11
9      10    4    0    0    5   24    3    6   61   49    0   65    0    0    0    0    0    227  0.27
10     17  155    0   45   17   37   43   13   47  371    0  288    0   34    0   84    0   1151  0.32
11      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
12    103   61    0   26   58   44   38   20   34   89    0 1902    0  272    0    9    0   2656  0.72
13      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
14     26   29    0   77   79    3    8    0    0   15    0  334    0 1131    0    6    0   1708  0.66
15      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0   33    0     33  0.00
16      0    2    0   13    0    0   21    0    0   82    0   16    0    6    4  592    0    736  0.80
17      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
x_+k 1093 2117   NA  801  901  402  371  146  268  921   NA 3102   NA 1619    4  755    0  12500
PU_j 0.48 0.73   NA 0.54 0.43 0.39 0.50 0.22 0.23 0.40   NA 0.61   NA 0.70 0.00 0.78 0.00
Trace = 7331    Po = 0.59    Pc = 0.13    Σ_{k=1}^{q} x_k+ x_+k = 19745828    κ = 0.53
Table 25: Error matrix for IGBP classes and site-based accuracy coefficients for the
cleaned data set (II).
Error matrix for training data with SLCR as additional variable (III)
Rows (↓): R; columns (→): C
R\C     1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   x_k+  PA_i
1     615   55    0   25  156    6    4    1   22   10    0   90    0   25    0    0    0   1009  0.61
2      64 1603    0    0   27    5    0   36    6   25    0  102    0   11    0    0    0   1879  0.85
3       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
4      42    0    0  461  131    1    0    3    0   53    0   28    0   79    0   20    0    818  0.56
5     227   50    0   74  430   41   12   46    2   15    0   51    0   57    0    0    0   1005  0.43
6       5   13    0    5   93  187    1    2   35   36    0   90    0    6    0    0    0    473  0.40
7      15    1    0    5    9    2  254    0    2  140    0   63    0   13    0    1    0    505  0.50
8       7   96    0    2   20    3    0   32   10   23    0  106    0    1    0    0    0    300  0.11
9      17    3    0    0    3    8    0    3  118    8    0   66    0    1    0    0    0    227  0.52
10     11  169    0   30   15   16   55   17   65  330    0  231    0   48    0  164    0   1151  0.29
11      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
12     60   59    0   16   62   28   41   33   62  103    0 1932    0  259    0    1    0   2656  0.73
13      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     NA    NA
14     20   23    0   36   96    4    1    0    0    8    0  253    0 1267    0    0    0   1708  0.74
15      0    0    0    0    0    0    0    0    0    0    0    0    0    0    2   30    0     32  0.06
16      3    0    0    2    1    0   13    0    0  114    0    3    0    4   25  571    0    736  0.78
17      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
x_+k 1086 2072   NA  656 1043  301  381  173  322  865   NA 3015   NA 1771   27  787    0  12499
PU_j 0.57 0.77   NA 0.70 0.41 0.62 0.67 0.18 0.37 0.38   NA 0.64   NA 0.72 0.07 0.73 0.00
Trace = 7802    Po = 0.62    Pc = 0.13    Σ_{k=1}^{q} x_k+ x_+k = 19642076    κ = 0.57
Table 26: Error matrix for IGBP classes and site-based accuracy coefficients using
SLCR labels (III)
Error matrix for training data with additional training sites (IV)
Rows (↓): R; columns (→): C
R\C     1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   x_k+  PA_i
1    2084   82    0   49  171   11    9   19    0    2    0  412    0   46    0    3    0   2888  0.72
2      65 1608    0    1   39   11    4   26   16   17    0   89    0    1    0    0    0   1877  0.86
3       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
4     278  111    0  676   89   55   15    3    0   71    0   46    0   58    0    4    0   1406  0.48
5     213   71    0   80  508   26    0   14    6    0    0   43    0   43    0    0    0   1004  0.51
6       6   13    0   28   32  224   26    7   17   51    0   67    0    2    0    0    0    473  0.47
7      21    0    0    5    6   84 2505    0    1   49    0  126    0    8    9  321    0   3135  0.80
8      18   99    0    7    6    3    6  403   27   25    0   88    0    0    0    0    0    682  0.59
9       7    6    0    0    5    2    0   12   59   63    0   73    0    0    0    0    0    227  0.26
10     16   63    0   31   14   28   81   20   81  385    0  233    0   34    0  164    0   1150  0.33
11      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
12    116   99    0   12   73   38   37   12   26   44    0 1959    0  240    0    0    0   2656  0.74
13      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
14     54   23    0   64   70    2    9    0    0    9    0  293    0 1184    0    0    0   1708  0.69
15      0    0    0    0    0    0  210    0    0    0    0    0    0    0  408   15    0    633  0.64
16      5    0    0    0    1    0   30    0    0   85    0    0    0    5    8  600    0    734  0.82
17      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
x_+k 2883 2175    0  953 1014  484 2932  516  233  801    0 3429    0 1621  425 1107    0  18573
PU_j 0.72 0.74 0.00 0.71 0.50 0.46 0.85 0.78 0.25 0.48 0.00 0.57 0.00 0.73 0.96 0.54 0.00
Trace = 12603    Po = 0.68    Pc = 0.11    Σ_{k=1}^{q} x_k+ x_+k = 38470913    κ = 0.64
Table 27: Error matrix for IGBP classes and site-based accuracy coefficients with
additional training sites (IV).
Error matrix for training data with proportional sampling (V)
Rows (↓): R; columns (→): C
R\C     1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   x_k+  PA_i
1    2434   74    0  101  180    9   10    3   16   14    0   27    0   19    1    0    0   2888  0.84
2      27  451    0    0   14    1    4   28   11   10    0    9    0    2    0    0    0    557  0.81
3       0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
4      73   99    0  973   67   48   31    5    0   39    0    7    0   56    0    8    0   1406  0.69
5     271   32    0  106  458   42    5    4    4    3    0   54    0   25    0    0    0   1004  0.46
6      11   12    0   34   19  189    5    9   83   49    0   62    0    0    0    0    0    473  0.40
7      16    0    0   16    4   34  801  817    5  254    0    6    0   19   50   21    0   2043  0.39
8      24   91    0    3   17    4    5  429   26   25    0   56    0    2    0    0    0    682  0.63
9       7   10    0    0    5   29    2   10   92   18    0   13    0    0    0    0    0    186  0.49
10     18   63    0   47   14   29  120   12   74  609    0  110    0   25    0   29    0   1150  0.53
11      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
12     32   19    0    4   20   27   13   11   18   53    0  261    0   99    0    0    0    557  0.47
13      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
14     39   10    0   66   55    3   10    4    0    7    0  101    0  634    0    0    0    929  0.68
15      0    0    0    0    0    0    7    0    0    0    0    0    0    0  562   64    0    633  0.89
16      6    0    0    0    0    0   38    0    0  124    0    0    0    5   16  545    0    734  0.74
17      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0      0  0.00
x_+k 2958  861    0 1350  853  415 1051 1332  329 1205    0  706    0  886  629  667    0  13242
PU_j 0.82 0.52 0.00 0.72 0.54 0.46 0.76 0.32 0.28 0.51 0.00 0.37 0.00 0.72 0.89 0.82 0.00
Trace = 8438    Po = 0.64    Pc = 0.11    Σ_{k=1}^{q} x_k+ x_+k = 18579720    κ = 0.59
Table 28: Error matrix for IGBP classes and site-based accuracy coefficients for
proportional sampling (V).
Areal comparison of UMD and BU map in IGBP scheme
Rows (↓): UMD; columns (→): BU
Class        1        2        4        5        6        7        8        9       10       12       16    Total   PC_i
1      1385316    13722   185728   155501     5802   181328    84878     6884    29202   171261    21062  2240684  61.8%
2        48527   154835     2106    71081     2412     4859    16700     4757    16600    12587     1652   336116  46.1%
4        46856    73969   305819    84923      730     9048    15780     1872    25116   186921     2007   753041  40.6%
5       268495     5183   298802   141310     1307    63053    23296      783    15905   295515    15942  1170945  12.5%
6        30183     1302    15672     4604    66288   691290   300417     9040    94618   248476    82204  1589447   4.3%
7         9449       21   159900      343    19585  2160696    78013      588   110747    85964   296910  2922216  73.9%
8      1098828    42658   269229   258276    46098   258893   389455    94259   138578   475822    20443  3092539  12.6%
9       425311    37694   222658   163857   138116   379342   389795   126403   194052   764514    28855  3111528   4.4%
10       98885     8230    34971    22533    81779   305998   122217    36980   214042   598352    59101  1583088  13.5%
12       56867    17912    69222    43376    34204   214789    46931    31521   119056   815392    12664  1461934  55.8%
16          21        0     5220        0        8   774368       68        1      110      241  1617459  2397496  67.5%
Total  3468738   355526  1569327   945804   396329  5043664  1467550   313088   958026  3655045  2158299
PC_j     39.9%     3.6%    19.5%    14.9%    16.7%    42.8%    26.5%    40.4%    22.3%    22.3%    74.9%
Total pixels = 20331396    Overall agreement = 36.3%
Table 29: Pixel-based comparison of UMD and BU maps in the IGBP scheme.
Areal comparison of UMD and EDC map in IGBP scheme
Rows (↓): UMD; columns (→): EDC
Class        1        2        4        5        6        7        8        9       10       12       16    Total   PC_i
1      1475583     5755    63780   656089    16743     1360    15430      842     7409    19038      124  2262153  65.2%
2        50142   153551      435    44776        3       46     3501     3485    11450    56214        5   323608  47.4%
4        38255    98151   324741   212453     1375       22      716      740     7299    34731        0   718483  45.2%
5       282721     2935   238753   580452     5540      226     3195      189      433     9806        7  1124257  51.6%
6        22003        0    74476    95744    79565   327081   449075     5343   285121    14584   113304  1466296   5.4%
7          284        0      456    55685    29236  1257460   240405      174    52138     4829  1307154  2947821  42.7%
8      1233402    50651   345829   715377   159932    13524   182368    10713    52118   201328    11578  2976820   6.1%
9       451056    36932   324148   379340   179208    70031   449741    25695   245613   470994    18135  2650893   1.0%
10      110098     1607    23574    86279   105152   246162   131203     6824   792029   141670     2929  1647527  48.1%
12       73403     5079    91919    27893     2828     9480    95523    19689   205130   898985       21  1429950  62.9%
16           0        0        0       44        0   381328       34        0        0       61  2018003  2399470  84.1%
Total  3736947   354661  1488111  2854132   579582  2306720  1571191    73694  1658740  1852240  3471260
PC_j     39.5%    43.3%    21.8%    20.3%    13.7%    54.5%    11.6%    34.9%    47.7%    48.5%    58.1%
Total pixels = 19947278    Overall agreement = 38.8%
Table 30: Pixel-based comparison of UMD and EDC maps in the IGBP scheme.
Areal comparison of BU and EDC map in IGBP scheme
Rows (↓): BU; columns (→): EDC
Class        1        2        4        5        6        7        8        9       10       11       12       14       15       16    Total   PC_i
1      2878390    18737    38227   406213    22532      932    21405     5082     8425     2998    48686    17111        0        0  3468738  83.0%
2        29337   175851     4360    13721      323       99     7462     3458    22522     2170    65290    30933        0        0   355526  49.5%
4        86811     2682   915706   291041     5450   180675    66468     1440     5534        9     7441     6070        0        0  1569327  58.4%
5       316191    70461    97734   236556    19164      242    51990     3858    14881      592    98386    35749        0        0   945804  25.0%
6        32585     1841    26025    11529     1076    43479    91533     2062   117367     3223    55370    10239        0        0   396329   0.3%
7        37369      633   157155   261646   296634  1053231   745072      457   230055     1422   144100   150013   162480  1803397  5043664  20.9%
8        45048    27674    18965   682215    22157    37835   256689     3882    39864   231720    49001    52500        0        0  1467550  17.5%
9        41029     6078    21421    42585     4984      522    49859     8103    22851     3618    44700    67338        0        0   313088   2.6%
10       27607    30537    51897    75971    64214   169949    86648    23451   202092     9219    68496   147945        0        0   958026  21.1%
11           0        0        0        0        0        0        0        0        0        0        0        0        0        0        0  0.00%
12      125078    16722   101135   619864    99757   156570   158232    19259   646445    97030   960257   654696        0        0  3655045  26.3%
14      113581      448    39861   168355    29709   114555    32573     1066   341511     4746   301841   335328        0        0  1483574  22.6%
15           0        0        0        0        0    73642      646        0        0        0       15        0  1308142   172794  1555239  84.1%
16        3921     2997    15625    44436    13582   474923     2614     1576     7193     2961     8657     2217        0    22358   603060   3.7%
Total  3736947   354661  1488111  2854132   579582  2306654  1571191    73694  1658740   359708  1852240  1510139  1470622  1998549
PC_j     77.0%    49.6%    61.5%     8.3%     0.2%    45.7%    16.3%    11.0%    12.2%     0.0%    51.8%    22.2%    89.0%     1.1%
Total pixels = 21814970    Overall agreement = 38.3%
Table 31: Pixel-based comparison of BU and EDC maps in the IGBP scheme.
Areal comparison of UMD and EDC map in biome scheme
Rows (↓): UMD; columns (→): EDC
Class        1        2        3        4        5        6        7    Total   PC_i
1       792029   246162   141670   138027   216612   110098     2929  1647527  48.1%
2        52138  1257460     4829   240579    85377      284  1307154  2947821  42.7%
3       205130     9480   898985   115212   127719    73403       21  1429950  62.9%
4       297731    83555   672322   668517  2191417  1684458    29713  5627713  11.9%
5       304303   327375   115335   466244  1912950   393121   113316  3632644  52.7%
6         7409     1360    19038    16272   742367  1475583      124  2262153  65.2%
7            0   381328       61       34       44        0  2102542  2484009  84.6%
Total  1658740  2306720  1852240  1644885  5276486  3736947  3555799
PC_j     47.7%    54.5%    48.5%    40.6%    36.3%   39.55%    59.1%
Total pixels = 20031817    Overall agreement = 45.5%
Table 32: Pixel-based comparison of UMD and EDC maps in the biome scheme.
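The agreement figures reported with Table 32 can be recomputed from the pixel counts. The sketch below is an illustration, not code from the thesis. It assumes, based on the printed values, that overall agreement is the diagonal sum divided by the total pixel count and that PCi is the diagonal cell divided by its row total; `agreement` and `UMD_VS_EDC` are hypothetical names.

```python
# Pixel counts from Table 32 (rows: UMD biomes 1-7, columns: EDC biomes 1-7).
UMD_VS_EDC = [
    [792029, 246162, 141670, 138027, 216612, 110098, 2929],
    [52138, 1257460, 4829, 240579, 85377, 284, 1307154],
    [205130, 9480, 898985, 115212, 127719, 73403, 21],
    [297731, 83555, 672322, 668517, 2191417, 1684458, 29713],
    [304303, 327375, 115335, 466244, 1912950, 393121, 113316],
    [7409, 1360, 19038, 16272, 742367, 1475583, 124],
    [0, 381328, 61, 34, 44, 0, 2102542],
]


def agreement(matrix):
    """Return (total pixels, overall agreement, per-row agreement)."""
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(len(matrix))]
    overall = sum(diagonal) / total
    # Assumed definition of PCi: diagonal cell over its row total.
    per_row = [d / sum(row) for d, row in zip(diagonal, matrix)]
    return total, overall, per_row


total, overall, per_row = agreement(UMD_VS_EDC)
print(f"Total pixels = {total}")
print(f"Overall agreement = {100 * overall:.1f}%")
print("PCi:", ", ".join(f"{100 * p:.1f}%" for p in per_row))
```

With these counts the sketch gives a total of 20031817 pixels and an overall agreement of 45.5%, matching the values printed with the table.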
Areal comparison of UMD and BU map in biome scheme
Rows (↓): UMD; columns (→): BU
Class        1        2        3        4        5        6        7    Total   PC_i
1       652144   176637    53541   153166    33515   357383   221141  1647527  39.6%
2       151806  2461179     2014    26648    63532    47257   195385  2947821  83.5%
3       520201    21282   358357   208611   123468   136137    61894  1429950  25.1%
4      1097453   364515   328838   694396   535548  2341016   265947  5627713  12.3%
5       562538   726865   161065   239013  1017841   708270   217052  3632644  28.0%
6       201336   105241   125620    65614   110138  1636584    17620  2262153  72.3%
7        24457   681678     9845    15218    11015    10880  1730916  2484009  69.7%
Total  3686851  4896929  1378206  1847457  2006770  5337426  2748025
PC_j     20.3%    54.2%    34.5%    49.5%    53.7%    31.2%    63.9%
Total pixels = 20031817    Overall agreement = 42.7%
Table 33: Pixel-based comparison of UMD and BU maps in the biome scheme.
Areal comparison of EDC and BU map in biome scheme
Rows (↓): BU; columns (→): EDC
Class        1        2        3        4        5        6        7    Total   PC_i
1       851029   239292   705732   188682  1010254   190536    24410  3209935  26.5%
2        81794  1502988    10884   737049   345679    44025  1814978  4537397  33.1%
3        56649      897   490456    55375   368025    58033     9845  1039280  47.2%
4       126469     2294   261062   328291   612432    56934    15184  1402666  23.4%
5        55986    69410   168111   105829  1397185    84210    14326  1895057  73.7%
6       307122    70969    98081   220610  1234142  3295752    10851  5237527  62.9%
7       179691   420870   117914     9049   308769     7457  1666205  2709955  61.5%
Total  1658740  2306720  1852240  1644885  5276486  3736947  3555799
PC_j     51.3%    65.2%    26.5%    20.0%    26.5%    88.2%    46.9%
Total pixels = 20031817    Overall agreement = 47.6%
Table 34: Pixel-based comparison of BU and EDC maps in the biome scheme.