David Tenenbaum – GEOG 070 – UNC-CH Spring 2005
Passive vs. Active Remote Sensing
Passive sensors receive solar energy reflected by the Earth’s surface (2), along with energy emitted by the atmosphere (1), surface (3) and sub-surface (4)
Active sensors receive energy reflected from the Earth’s surface that originally came from an emitter other than the Sun
http://www.ccrs.nrcan.gc.ca/ccrs/learn/tutorials/fundam/chapter3/chapter3_1_e.html
Electromagnetic radiation energy: Wave-particle duality
• Wavelength (λ)
• EMR energy moves at the speed of light (c): c = f λ
• f = frequency: the number of waves passing through a point within a unit time (usually expressed per second)
• Energy carried by a photon: ε = h f [h = Planck constant (6.626×10^-34 J s)]
• The shorter the wavelength, the higher the frequency, and the more energy a photon carries. Therefore, short-wave ultraviolet solar radiation is very destructive (sunburns)
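As a rough numerical illustration of the two relations above (c = f λ and ε = h f), the sketch below compares a photon of ultraviolet light with a photon of red light. The speed-of-light value and the two example wavelengths (0.30 µm and 0.65 µm) are assumptions added here, not values from the slides.

```python
PLANCK_H = 6.626e-34       # Planck constant in J s (value given on the slide)
SPEED_OF_LIGHT = 2.998e8   # speed of light in m/s (assumed standard value)

def frequency_hz(wavelength_m):
    """Frequency from c = f * lambda."""
    return SPEED_OF_LIGHT / wavelength_m

def photon_energy_j(wavelength_m):
    """Photon energy from epsilon = h * f."""
    return PLANCK_H * frequency_hz(wavelength_m)

uv_energy = photon_energy_j(0.30e-6)   # hypothetical ultraviolet wavelength
red_energy = photon_energy_j(0.65e-6)  # hypothetical red wavelength

print(f"UV photon:  {uv_energy:.3e} J")
print(f"Red photon: {red_energy:.3e} J")
print("UV / red energy ratio:", round(uv_energy / red_energy, 2))
```

Because energy is inversely proportional to wavelength, the ratio of the two energies is simply the inverse ratio of the two wavelengths.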
Solar Radiation
Atmospheric windows
Solar Electromagnetic Radiation
• The sun emits EMR across a broad spectrum of wavelengths:
But the atmosphere blocks much of the energy before it reaches the surface
Satellite Imagery - 4 Resolutions
• Satellite imagery can be described by four resolutions:
– Spatial resolution: area on ground represented by each pixel, e.g.
• Landsat Thematic Mapper - 30m
• Advanced Very High Resolution Radiometer (AVHRR) and Moderate Resolution Imaging Spectroradiometer (MODIS) - 1km
• SPOT - 10m panchromatic / 20m multispectral
• IKONOS - 1m panchromatic / 4m multispectral
– Temporal resolution: how often a satellite obtains imagery of a particular area
– Spectral resolution: specific wavelength intervals in the electromagnetic spectrum captured by each sensor (bands)
– Radiometric Resolution: number of possible data values reportable by each sensor (how many bits)
IKONOS panchromatic image of Sydney Olympic Park - 1m
Spatial Resolution
Temporal Resolution
• Number of days between overhead passes - satellite orbit
– Landsat - 16 days
– AVHRR & MODIS - daily
– IKONOS - 1 to 3 days
Spectral Resolution
• Number, spacing and width of sampled wavelength bands (Landsat: 7 bands, AVIRIS: 224 bands!)
• Multispectral vs. Panchromatic
• Higher resolution results in more precision in the representation of spectral signatures
Radiometric Resolution
• Number of possible data values reported by the sensor, which determines how many levels of brightness it can distinguish
• Range is expressed as a power of two, 2^n
– 8-bit radiometric resolution has 2^8 values, or 256 values - range is 0-255 (e.g. Landsat TM data)
– 16-bit resolution has 2^16 values, or 65,536 values - range is 0-65535 (e.g. MODIS data)
• The value in each pixel is called the
– Digital Number (DN)
– Brightness Value (BV)
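The relationship between bit depth and the number of brightness levels can be checked with a few lines of code. This is only a small sketch; the function names are made up for illustration.

```python
def radiometric_levels(bits):
    """Number of distinct data values an n-bit sensor can report (2^n)."""
    return 2 ** bits

def dn_range(bits):
    """Smallest and largest Digital Number for an n-bit sensor."""
    return 0, radiometric_levels(bits) - 1

for bits in (8, 16):
    low, high = dn_range(bits)
    print(f"{bits}-bit: {radiometric_levels(bits)} values, DN range {low}-{high}")
```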
Image Pre-Processing
• Radiometric Corrections
– changing the image data BVs to correct for errors or distortions from a variety of sources:
• atmospheric effects
• sensor errors
• Geometric Corrections
– changing the geometric/spatial properties of the image data so that we can accurately project the image, a.k.a.
• image rectification
• rubber sheeting
Image Enhancements
• Image enhancements are designed to improve the usefulness of image data for various applications:
– Contrast Enhancement - maximizes the performance of the image for visual display
– Spatial Enhancements - increases or decreases the level of spatial detail in the image
– Spectral Enhancements - makes use of the spectral characteristics of different physical features to highlight specific features
Spatial Enhancements
• Filters - used to emphasize or de-emphasize spatial information
– Low-pass filter - emphasizes large-area changes and de-emphasizes local detail
– High-pass filter - emphasizes local detail and de-emphasizes large-area changes
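To make the two filter types concrete, here is a minimal sketch using a 3x3 moving-average window as the low-pass filter and "original minus smoothed" as the high-pass filter. Both choices are assumptions made for illustration (many other kernels exist), and the small test image is invented.

```python
def low_pass_3x3(image):
    """Replace each pixel by the mean of its 3x3 neighbourhood.
    Edge pixels use only the neighbours that actually exist."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(window) / len(window)
    return out

def high_pass_3x3(image):
    """Keep local detail by subtracting the smoothed image from the original."""
    smooth = low_pass_3x3(image)
    return [[image[r][c] - smooth[r][c] for c in range(len(image[0]))]
            for r in range(len(image))]

# Hypothetical 5x5 image: a flat background with one bright pixel in the middle
image = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 90, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]

low = low_pass_3x3(image)
high = high_pass_3x3(image)
print("Centre pixel, original:", image[2][2])
print("Centre pixel, low-pass:", round(low[2][2], 2))
print("Centre pixel, high-pass:", round(high[2][2], 2))
```

The low-pass result spreads the bright pixel into its neighbours (local detail is suppressed), while the high-pass result is zero in the flat corners and large at the bright pixel (local detail is emphasized).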
Spectral Enhancements
• Often involve taking ratios or other mathematical combinations of multiple input bands to produce a derived index of some sort, e.g.:
• Normalized Difference Vegetation Index (NDVI)
– Designed to contrast heavily-vegetated areas with areas containing little vegetation, by taking advantage of vegetation’s strong absorption of red and reflection of near infrared:
– NDVI = (NIR-R) / (NIR + R)
• Other examples: Surface temperature (Ts) from IR bands, TVDI from NDVI and Ts
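The NDVI formula above translates directly into code. The sketch below applies it to two invented pairs of reflectance values; the numbers are hypothetical and only meant to show the contrast between a vegetated and a sparsely vegetated pixel.

```python
def ndvi(nir, red):
    """NDVI = (NIR - R) / (NIR + R); returns None when both bands are zero."""
    total = nir + red
    if total == 0:
        return None  # index is undefined for a pixel with no signal in either band
    return (nir - red) / total

# Hypothetical reflectance values (not from the slides)
vegetated = ndvi(nir=0.50, red=0.08)   # strong NIR reflection, strong red absorption
bare_soil = ndvi(nir=0.30, red=0.25)   # similar reflectance in both bands

print("Vegetated pixel NDVI:", round(vegetated, 2))
print("Bare soil pixel NDVI:", round(bare_soil, 2))
```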
Classification
• One of the key processing techniques in remote sensing
• Categorizes pixels into thematic categories that correspond to land cover types
– e.g. forest, crops, water, urban, etc.
• Complex process that aims to ensure that the variation among pixel BVs within a class is less than the variation between classes
• The basis for differentiation is the spectral signatures of the classes (although supplemental information such as texture/pattern etc. can be used in the process as well)
Ground Observation
Aerial observation
Space observation
Airborne Remote Sensing
Air Photo Scale
• The scale of an air photo can be described with a representative fraction (just like a map), which in turn is inversely proportional to the distance above the ground:
Scale = photo distance / ground distance = f / H
where f is the focal length of the camera and H is the flying height above the ground
How does the scale of air photos change with flight height?
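One way to explore how scale changes with flight height is to compute the denominator of the representative fraction from scale = f / H, where f is the camera focal length and H is the flying height. The focal length (152 mm) and the three flying heights below are hypothetical values chosen for illustration.

```python
def photo_scale_denominator(focal_length_m, flying_height_m):
    """Return D for a photo scale of 1:D, using scale = f / H."""
    return flying_height_m / focal_length_m

focal_length = 0.152  # hypothetical focal length in metres (152 mm)

for height in (1520.0, 3040.0, 6080.0):  # hypothetical flying heights in metres
    denominator = photo_scale_denominator(focal_length, height)
    print(f"H = {height:.0f} m -> scale about 1:{denominator:.0f}")
```

Doubling the flying height doubles the denominator, so the photo scale becomes smaller as the aircraft flies higher.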
Orthophotographs
• An orthophoto is an orthographic photograph which does not contain scale, tilt and relief distortions
• Orthophotos are like ‘photo maps’:
– Like maps: They have one scale everywhere
– Like photos: They show the terrain in actual detail
Landsat Platforms and their Sensors
Satellite   Launched      Decom.          RBV    MSS    TM     Orbit Info.
Landsat-1   23 Jul 1972   6 Jan 1978      1-3    4-7    none   18d/900km
Landsat-2   22 Jan 1975   25 Feb 1982     1-3    4-7    none   18d/900km
Landsat-3   5 Mar 1978    31 Mar 1983     A-D    4-8    none   18d/900km
Landsat-4   16 Jul 1982   --              none   1-4    1-7    16d/705km
Landsat-5   2 Mar 1984    --              none   1-4    1-7    16d/705km
Landsat-6   5 Oct 1993    Launch Failure  none   none   ETM    16d/705km
Landsat-7   15 Apr 1999   --              none   none   ETM+   16d/705km
RBV: Return Beam Vidicon {Blue, Green, Red} @ ~40m
MSS: Multi-spectral Scanner {Green, Red, NIR1, NIR2} @ ~80m
TM: Thematic Mapper {Blue, Green, Red, NIR, IR1, IR2} @ ~30m, TIR @ 120m
ETM: Enhanced Thematic Mapper {Blue, Green, Red, NIR, IR1, IR2} @ ~30m, TIR @ 60m
Landsat Orbits
• Landsat satellites’ orbits are designed to be sun-synchronous, meaning that the satellites always cross the Equator at the same local time (~10:00 am)
• In this way, images of different parts of the globe are collected under illumination conditions that are as similar as possible
‘Whiskbroom’ Sensors
Aronoff, S. 1989. Geographic Information Systems: A Management Perspective. WDL Publications, Ottawa, Ontario, Canada, p. 72.
SPOT Characteristics
Launch Dates
SPOT 1: February 22, 1986
SPOT 2: January 22, 1990
SPOT 3: September 26, 1993
SPOT 4: March 24, 1998
SPOT 5: May 3, 2002
HRV imaging instruments: SPOT 1, 2 and 3
Spectral bands (µm)        Spatial resolution   Swath width
0.5-0.59 (green)           20x20 m              60km
0.61-0.68 (red)            20x20 m              60km
0.79-0.89 (NIR)            20x20 m              60km
0.51-0.73 (panchromatic)   10x10 m              60km
HRVIR imaging instruments: SPOT 4
Spectral bands (µm)        Spatial resolution   Swath width
1.58-1.75 (SWIR)           20x20 m              60km
HRG imaging instruments: SPOT 5
Higher spatial resolution: 5m panchromatic, 10m visible/NIR bands, 20m SWIR
These are the primary sensors, each platform carries other …
Temporal resolution = 26 days
Radiometric resolution = 8-bit
‘Pushbroom’ Sensors
Aronoff, S. 1989. Geographic Information Systems: A Management Perspective. WDL Publications, Ottawa, Ontario, Canada, p. 74.
Ikonos
Owner: Space Imaging (a commercial concern)
Launched: September 1999
Temporal resolution: 11 days (1-3 days considering oblique views)
Radiometric resolution: 11-bit (8x better than TM or SPOT)
Spectral bands (µm)        Spatial resolution
0.45-0.52 (blue)           4m
0.51-0.60 (green)          4m
0.63-0.70 (red)            4m
0.76-0.85 (NIR)            4m
0.45-0.90 (panchromatic)   1m
Swath width: 11km
Sensor systems: pushbroom system, pointable both along track and across track.
Orbit: 682km sun-synchronous having an equatorial crossing time of 10:30am
Quickbird
Owner: Digital Globe (another commercial concern, the competition!)
Launched: October 18, 2001
Temporal resolution: 1-5 days (considering oblique views)
Radiometric resolution: 11-bit (8x better than TM or SPOT)
Spectral bands (µm)        Spatial resolution
0.45-0.52 (blue)           2.5m
0.52-0.60 (green)          2.5m
0.63-0.69 (red)            2.5m
0.76-0.90 (NIR)            2.5m
0.45-0.90 (panchromatic)   60cm
Swath width: 16.5km
Sensor systems: pushbroom system, pointable both along track and across track.
Orbit: 450km sun-synchronous having an equatorial crossing time of 10:30am
Geostationary Orbit
• Instead of revolving around the Earth every 90-100 minutes in a sun-synchronous orbit like the other satellites we have discussed, these satellites were placed into an orbit that maintains a fixed relationship with the Earth
• These orbits are very high (~35,800 km above the surface of the Earth), and the combination of this high orbit with a broad field of view means that sensors on these platforms can image a ‘full-disk’ or half the planet at one time
• Because this orbit is geostationary, these satellites can image that half of the planet within their view continuously, such that information can be gathered over the full diurnal night-day cycle, although spatial resolution is sacrificed in this approach (much bigger pixels!)
AVHRR Characteristics
AVHRR Bands
Orbit: 705 km, sun-synchronous, near-polar, circular
Time to cross equator: 10:30 a.m. descending node (Terra), 1:30 p.m. ascending node (Aqua)
Sensor Systems: Across Track Scanning (‘Whiskbroom’)
Radiometric resolution: 12 bits
Temporal resolution: 1-2 days
Spatial Resolution:
250 m (bands 1-2)
500 m (bands 3-7)
1000 m (bands 8-36)
Design Life: 6 years
MODIS Characteristics
MODIS Bands
MODIS Applications - Snow
Spectral Properties of Clouds and Snow
In the visible spectrum clouds and snow look very similar. Thus, it is difficult to separate them with human eyes. But they are very different in the mid-infrared
Improvement Over Old Global DEMs
Lake Balbina, near Manaus, Brazil as depicted using old global 1km data (on the left), and the SRTM 30m DEM (on the right)
http://srtm.usgs.gov/srtmimagegallery/Lake%20Balbina,%20near%20Manaus,%20Brazil.htm
Nexrad Doppler Weather RADAR
• The Nexrad network of weather RADAR sensors consists of 158 radars, each with a maximum range of 250 miles, that together provide excellent coverage of the continental United States
• The sensors are known by the designation WSR-88D (Weather Surveillance Radar 88 Doppler), and the station in this area, located at RDU airport, is #64 - KRAX
http://www.roc.noaa.gov/
CONUS Hourly Nexrad Rainfall
• Here is Nexrad gauge-corrected rainfall for six one-hourly periods for the afternoon and evening of March 10, 2005
• Note the changes in shape of the blue bounding box, which show that some RADARs were offline; where no overlapping coverage was present, no information was available
http://wwwt.emc.ncep.noaa.gov/mmb/ylin/pcpanl/stage4/images/st4.6hrloop.gif
TIGER/Line Address Range Basics
The complete chain has two address ranges: left: odd-numbered, right: even-numbered.
Potential address ranges along a complete chain have values that encompass the addresses of existing structures, as well as those not yet built
Coordinate Systems in TIGER/Line Files
Global coordinate system: Lat/Long.
No projection: some source maps scanned are in UTM projection, but then converted back to Lat/Long (causes error!)
Position accuracy: +/- 167 feet; meets National Map Accuracy Standards, but is not suitable for high-precision measurement applications, e.g. engineering, property transfer
Probability – Some Definitions
• Probability – Refers to the likelihood that something (an event) will have a certain outcome
• An Event – Any phenomenon you can observe that can have more than one outcome (e.g. flipping a coin)
• An Outcome – Any unique condition that can be the result of an event (e.g. the available outcomes when flipping a coin are heads and tails), a.k.a. simple events or sample points
• Sample Space – The set of all possible outcomes associated with an event (e.g. the sample space for flipping a coin includes heads and tails)
Probability – An Example, Part III
• We can plot this probability distribution as a probability mass function:
x_i    P(x_i)
1      1/6 = 0.167
2      1/6 = 0.167
3      1/6 = 0.167
4      3/6 = 0.5
[Figure: probability mass function; x-axis: x_i (1 to 4); y-axis: P(x_i) (0 to 0.50)]
• This plot uses thin lines to denote that the probabilities are massed at discrete values of this random variable
Probability Mass Functions
• Probability mass functions have the following rules that dictate their possible values:
1. The probability of any outcome must be greater than or equal to zero and must also be less than or equal to one, i.e.
$0 \le P(x_i) \le 1$ for i = {1, 2, 3, …, k-1, k}
2. The sum of all probabilities in the sample space must total one, i.e.
$\sum_{i=1}^{k} P(x_i) = 1$
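The two rules for probability mass functions are easy to check mechanically. The sketch below tests them on the four-outcome example from the earlier slide (probabilities 1/6, 1/6, 1/6 and 3/6) and on an invented set of values that breaks the second rule. Exact fractions are used so that the sum can be compared with 1 without rounding problems; the function name is made up for illustration.

```python
from fractions import Fraction

def is_valid_pmf(probabilities):
    """Check rule 1 (each value between 0 and 1) and rule 2 (values sum to 1)."""
    values = list(probabilities.values())
    in_range = all(0 <= p <= 1 for p in values)
    sums_to_one = sum(values) == 1
    return in_range and sums_to_one

# Outcome -> probability, from the four-outcome example
example_pmf = {1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6), 4: Fraction(3, 6)}

# Hypothetical values that violate rule 2 (they sum to 5/4)
broken_pmf = {1: Fraction(1, 2), 2: Fraction(3, 4)}

print("Example is a valid PMF:", is_valid_pmf(example_pmf))
print("Broken set is a valid PMF:", is_valid_pmf(broken_pmf))
```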
Continuous Random Variables
• Continuous random variables can assume all real number values within an interval, for example: measurements of precipitation, pH, etc.
• Some random variables that are technically discrete exhibit such a tremendous range of values that it is desirable to treat them as if they were continuous variables, e.g. population
• Discrete random variables are described by probability mass functions, and continuous random variables are described by probability density functions
Probability Density Functions
• Probability density functions are defined using the same rules required of probability mass functions, with some additional requirements:
1. The function must have a non-negative value throughout the interval a to b, i.e.
$f(x) \ge 0$ for $a \le x \le b$
2. The area under the curve defined by f(x), within the interval a to b, must equal 1:
$\int_{a}^{b} f(x)\,dx = 1$
[Figure: curve f(x) plotted against x between a and b, with the area under the curve labeled area = 1]
Probability Density Functions
• Suppose we are interested in computing the probability of a continuous random variable falling within a range of values bounded by lower limit c and upper limit d, within the interval a to b
How can we find the probability of a value occurring between c and d?
[Figure: curve f(x) plotted against x between a and b, with the area between c and d shaded]
• We need to calculate the shaded area; if we know the density function, we could use calculus:
$P(c \le x \le d) = \int_{c}^{d} f(x)\,dx$
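When the integral is awkward to do by hand, the shaded area can be approximated numerically. The sketch below uses the trapezoid rule on a hypothetical density f(x) = 2x defined on the interval a = 0 to b = 1; the density, the limits c = 0.5 and d = 1, and the function names are all assumptions chosen for illustration.

```python
def integrate(f, lower, upper, steps=1000):
    """Approximate the area under f between lower and upper (trapezoid rule)."""
    width = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * width)
    return total * width

def density(x):
    """Hypothetical probability density on the interval 0 to 1."""
    return 2.0 * x

a, b = 0.0, 1.0   # interval on which the density is defined
c, d = 0.5, 1.0   # range of values we are interested in

print("Total area from a to b:", round(integrate(density, a, b), 6))
print("P(c <= x <= d):", round(integrate(density, c, d), 6))
```

The first number confirms that the density satisfies the area-equals-one requirement; the second is the probability of a value falling between c and d.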
Probability Rules
• We can use a Venn diagram to describe the relationships between two sets or events, and the corresponding probabilities:
• The union of sets A and B (written symbolically as A ∪ B) is represented by the areas enclosed by set A and B together, and can be expressed by OR (i.e. the union of the two sets includes any location in A or B, i.e. blue OR red)
• The intersection of sets A and B (written symbolically as A ∩ B) is the area that is overlapped by both the A and B sets, and can be expressed by AND (i.e. the intersection of the two sets includes locations in A AND B, i.e. purple)
[Figure: Venn diagrams of sets A and B illustrating A ∪ B and A ∩ B]
Mutually Exclusive & Collectively Exhaustive
[Figure: two Venn diagrams of sets A, B, and C]
A, B, and C are mutually exclusive, but are not collectively exhaustive, e.g. rolling a 6-sided die and getting a 2, or a 4, or a 6
A, B, and C are mutually exclusive, and are collectively exhaustive, e.g. rolling a 6-sided die and getting a 1, 2, 3, 4, 5, or 6
Probability Rules
• If the sets A and B do overlap in the Venn diagram, the sets are not mutually exclusive, and this represents a case of independent, but not exclusive events
• The union of sets A and B here is
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
because we do not wish to count the intersection area twice, thus we need to subtract it from the sum of the areas of A and B when taking the union of a pair of overlapping sets
• The intersection of sets A and B here is calculated by taking the product of the two probabilities (valid because the events are independent), a.k.a. the multiplication rule:
P(A ∩ B) = P(A) * P(B)
[Figure: Venn diagrams of overlapping sets A and B illustrating A ∩ B and A ∪ B]
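Both rules can be verified by brute-force counting on a small sample space. The example below is an assumption added for illustration: two dice are rolled, A is "the first die shows a 6" and B is "the second die shows a 6". These two events are independent and not mutually exclusive, which is the case the rules above describe.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of rolling two six-sided dice
outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    """Share of equally likely outcomes for which the event is true."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

def event_a(outcome):
    return outcome[0] == 6   # first die shows a 6

def event_b(outcome):
    return outcome[1] == 6   # second die shows a 6

p_a = probability(event_a)
p_b = probability(event_b)
p_a_and_b = probability(lambda o: event_a(o) and event_b(o))
p_a_or_b = probability(lambda o: event_a(o) or event_b(o))

print("P(A) =", p_a, " P(B) =", p_b)
print("P(A and B) by counting:", p_a_and_b, " by multiplication rule:", p_a * p_b)
print("P(A or B) by counting:", p_a_or_b, " by addition rule:", p_a + p_b - p_a_and_b)
```

Note that the multiplication rule only matches the counted intersection because the two dice do not influence each other; for dependent events the product would not equal P(A ∩ B).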
Sampling Concepts
• We must define the sampling unit - the smallest sub-division of the population that becomes part of our sample
• We want to minimize sampling error when we design how we will collect data: Typically the sampling error ↓ as the sample size ↑ because larger samples make up a larger proportion of the population (and a complete census, for example, theoretically has no sampling error)
• We want to try and avoid sampling bias when we design how we will collect data: Bias here refers to a systematic tendency in the selection of members of a population to be included in a sample; to avoid it, any given member of a population should have an equal chance of being included in the sample (for random sampling)
Random Spatial Sampling
• We can choose a random point in (x,y) space by choosing pairs of random numbers … this produces a Poisson distribution if we divide the area into quadrats and count
• This is easy with rectangular study areas, otherwise we also need to reject any points outside the study area (e.g. my method for selecting the beginning of a transect)
• We can also produce stratified and systematic point samples by dividing the area into a group of mutually exclusive and collectively exhaustive strata:
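The random-point procedure described in the first two bullets (pairs of random numbers, with rejection of points that fall outside a non-rectangular study area) might be sketched as follows. The circular study area, the bounding box, the seed and all function names are hypothetical and chosen only to keep the example short.

```python
import random

def random_points_in_area(n, bbox, inside, rng):
    """Draw random (x, y) pairs in a bounding box and keep only those
    that fall inside the study area, until n points are collected."""
    xmin, ymin, xmax, ymax = bbox
    points = []
    while len(points) < n:
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if inside(x, y):          # reject points outside the study area
            points.append((x, y))
    return points

def in_unit_circle(x, y):
    """Hypothetical study area: a circle of radius 1 centred on the origin."""
    return x * x + y * y <= 1.0

rng = random.Random(42)  # fixed seed so the sample can be reproduced
sample = random_points_in_area(100, (-1.0, -1.0, 1.0, 1.0), in_unit_circle, rng)
print("Number of sample points:", len(sample))
print("First point:", sample[0])
```

For a rectangular study area the `inside` test is always true and no points are rejected.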
Data Portrayal
• Once we have sampled some geographic phenomenon, it is often useful to portray it in some fashion that allows you to get a sense of the values in the dataset
• Many portrayal approaches still involve reducing the volume of data (and information content), but if applied properly, they can help you see the interesting characteristics of data
• For the various scales of measurement, there are different approaches that are applicable
Scales of Measurement
• Thematic data can be divided into four types
1. The Nominal Scale
2. The Ordinal Scale
3. The Interval Scale
4. The Ratio Scale
As we progress through these scales, the types of data they describe have increasing information content
Nominal Data
Class        Frequency   % of Total
Woody        105         32.92
Herbaceous   151         47.34
Water        1           0.31
Ground       6           1.88
Road         23          7.21
Pavement     22          6.90
Structures   11          3.45
• This is a tabular presentation of data – has the advantage of giving the exact quantities, but can be ‘busy’, especially in larger tables
Nominal Data
Class        Frequency
Woody        105
Herbaceous   151
Water        1
Ground       6
Road         23
Pavement     22
Structures   11
• The frequency of nominal data classes can be well displayed by a bar graph
[Figure: bar graph "Segment Type Frequency"; x-axis: Segment Types (Woody, Herbaceous, Water, Ground, Road, Pavement, Structures); y-axis: Frequency (0 to 160)]
Nominal Data
Class        % of Total
Woody        32.92
Herbaceous   47.34
Water        0.31
Ground       1.88
Road         7.21
Pavement     6.90
Structures   3.45
• Once normalized, the values are well displayed in a pie chart, which emphasizes each category’s portion of the whole
[Figure: pie chart "Segment Types": Woody 33%, Herbaceous 48%, Water 0%, Ground 2%, Road 7%, Pavement 7%, Structures 3%]
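The normalization step (turning class frequencies into percentages of the total) can be reproduced from the frequency table. This is a small sketch using the segment-type counts from the slides; the variable names are made up.

```python
# Class frequencies from the nominal data table
frequencies = {
    "Woody": 105,
    "Herbaceous": 151,
    "Water": 1,
    "Ground": 6,
    "Road": 23,
    "Pavement": 22,
    "Structures": 11,
}

total = sum(frequencies.values())

# Percentage of the total for each class, rounded to two decimals
percent_of_total = {name: round(100 * count / total, 2)
                    for name, count in frequencies.items()}

for name, percent in percent_of_total.items():
    print(f"{name:<12}{frequencies[name]:>5}{percent:>8.2f}")
print("Total segments:", total)
```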
Building a Histogram
4. Plot the frequencies of each class
• All that remains is to create the plot:
[Figure: "Pond Branch TMI Histogram"; x-axis: Topographic Moisture Index (4 to 16); y-axis: Percent of cells in catchment (0 to 48)]
Frequencies & Distributions
• By examining the shape of frequency distribution curves we can gain some sense of the distribution through some general characteristics:
1. Modality – Most distributions are unimodal, but we might also see bimodal or multi-modal distributions (if unimodal, we can also consider):
2. Symmetry – a.k.a. skewness of the distribution – Is it positively or negatively skewed?
3. Kurtosis – Describes the degree of peakedness or flatness of the curve
Simple Descriptive Statistics
• Descriptive statistics provide an organization and summary of a dataset
• A small number of summary measures replaces the entirety of a dataset
• You’re likely already familiar with some simple descriptive summary measures:
1. Ratios
2. Proportions
3. Percentages
4. Rates of Change
5. (Location Quotients)
Simple Descriptive Summary Measures
5. Location Quotients cont. – For example, suppose we have a region of 1000 km^2 which we subdivide into three smaller areas of 200, 300, and 500 km^2 respectively (labeled A, B, and C)
• The region has an influenza outbreak with 150 cases in area A, 100 in area B, and 350 in area C (a total of 600 flu cases):
Area   Proportion of Area   Proportion of Cases   Location Quotient
A      200/1000=0.2         150/600=0.25          0.25/0.2=1.25
B      300/1000=0.3         100/600=0.167         0.167/0.3=0.56
C      500/1000=0.5         350/600=0.583         0.583/0.5=1.17
Location Quotient = Prop. of Cases / Prop. of Area
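The location quotient calculation can be reproduced directly from the areas and case counts in the example. In this sketch the proportions are not rounded before dividing, so a result can differ in the second decimal place from a hand calculation that rounds the proportions first (area B comes out at about 0.56). The function and variable names are made up for illustration.

```python
# Area in km^2 and number of flu cases for each sub-area, from the example
areas = {"A": 200, "B": 300, "C": 500}
cases = {"A": 150, "B": 100, "C": 350}

def location_quotients(areas, cases):
    """Location Quotient = proportion of cases / proportion of area."""
    total_area = sum(areas.values())
    total_cases = sum(cases.values())
    quotients = {}
    for name in areas:
        prop_area = areas[name] / total_area
        prop_cases = cases[name] / total_cases
        quotients[name] = prop_cases / prop_area
    return quotients

lq = location_quotients(areas, cases)
for name, value in lq.items():
    print(f"Area {name}: location quotient = {value:.2f}")
```

A quotient above 1 (areas A and C) means the area has more than its share of cases relative to its size; a quotient below 1 (area B) means it has less.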
Simple Descriptive Statistics
• These are ways to summarize a number set quickly and accurately
• The most common way of describing a variable distribution is in terms of two of its properties:
• Central tendency – describes the central value of the distribution, around which the observations cluster
• Dispersion – describes how the observations are distributed
• First we’ll look at measures of central tendency
Measures of Central Tendency
• Think of this from the following point of view: We have some distribution in which we want to locate the center, and we need to choose an appropriate measure of central tendency. We can choose from:
1. Mode
2. Median
3. Mean
• Each of these measures is appropriate to different distributions / under different circumstances
Measures of Central Tendency - Mean
3. Mean cont. – A standard geographic application of the mean is to locate the center (a.k.a. centroid) of a spatial distribution by assigning to each member of the spatial distribution a gridded coordinate and calculating the mean value in each coordinate direction: the bivariate mean or mean center
For a set of (x,y) coordinates, the mean center $(\bar{x}, \bar{y})$ is computed using:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$
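In code, the mean center is just the mean of the x coordinates paired with the mean of the y coordinates. The three points below are hypothetical.

```python
def mean_center(points):
    """Bivariate mean: average the x and y coordinates separately."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    return x_bar, y_bar

# Hypothetical (x, y) coordinates of three spatial objects
points = [(1.0, 2.0), (3.0, 4.0), (5.0, 9.0)]
center = mean_center(points)
print("Mean center:", center)
```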
Measures of Dispersion - Range
1. Range – this is the most simply formulated of all measures of dispersion
• Given a set of measurements x_1, x_2, x_3, …, x_{n-1}, x_n, the range is defined as the difference between the largest and smallest values:
Range = x_max – x_min
• This is another descriptive measure that is vulnerable to the influence of outliers in a data set, which result in a range that is not really descriptive of most of the data
Measures of Dispersion – Variance, Standard Deviation, Z-scores
2. Variance etc. cont. – Variance is formulated as the sum of squares divided by the population size or the sample size minus one:
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Measures of Dispersion – Variance, Standard Deviation, Z-scores
2. Variance etc. cont. – Standard deviation is calculated by taking the square root of variance:
Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
• Why do we prefer standard deviation over variance as a measure of dispersion? Its magnitude and units match those of the mean
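The difference between the population and sample formulas (dividing by N versus n - 1) is easiest to see on a small dataset. The eight values below are hypothetical; the results are cross-checked against Python's standard `statistics` module.

```python
import math
import statistics

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

pop_var = population_variance(data)
samp_var = sample_variance(data)
print("Population variance:", pop_var, " standard deviation:", math.sqrt(pop_var))
print("Sample variance:", round(samp_var, 4), " standard deviation:", round(math.sqrt(samp_var), 4))

# Cross-check against the standard library
print("statistics.pvariance:", statistics.pvariance(data))
print("statistics.variance:", round(statistics.variance(data), 4))
```

The standard deviations are in the same units as the data, which is why they are easier to interpret next to the mean than the variances are.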
Measures of Dispersion – Variance, Standard Deviation, Z-scores
2. Variance etc. cont. – Just as the mean can be applied to spatial distributions through the bivariate mean center and weighted mean center formulae (computed by considering the (x,y) coordinates of a set of spatial objects), standard deviation can be applied to examining the dispersion of a spatial distribution. This is called standard distance (SD):
$SD = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} + \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$
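Standard distance combines the sample variances of the x and y coordinates under a single square root. The sketch below assumes the n - 1 form of the formula shown on this slide and applies it to four hypothetical points at the corners of a square.

```python
import math

def standard_distance(points):
    """Square root of (sample variance of x + sample variance of y)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sum_sq_x = sum((x - x_bar) ** 2 for x, _ in points)
    sum_sq_y = sum((y - y_bar) ** 2 for _, y in points)
    return math.sqrt(sum_sq_x / (n - 1) + sum_sq_y / (n - 1))

# Hypothetical points at the corners of a 2 x 2 square
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
sd = standard_distance(points)
print("Standard distance:", round(sd, 4))

# Spreading the same pattern out increases the standard distance
wider = [(x * 3, y * 3) for x, y in points]
print("Standard distance (3x wider):", round(standard_distance(wider), 4))
```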
Discrete Probability Distributions
• The first part of today’s lecture will introduce three kinds of discrete probability distributions that are useful for us to examine:
1. The Uniform Distribution
2. The Binomial Distribution
3. The Poisson Distribution
• Each of these probability distributions is appropriately applied in certain situations to help us understand particular geographic phenomena
The Uniform Distribution
• While the idea of a uniform distribution might seem a little simplistic and perhaps useless, it actually is well applied in two situations:
1. When the probabilities of each and every possible outcome are truly equal (e.g. the coin toss)
2. When we have no prior knowledge of how a variable is distributed (i.e. when we are dealing with complete uncertainty), the first distribution we should use is uniform, because it makes no assumptions about the distribution
The Binomial Distribution
• The binomial distribution provides information about the probability of the repetition of events when there are only two possible outcomes, e.g. heads or tails, left or right, success or failure, rain or no rain … any nominal data situation where there are only two categories / outcomes possible
• The binomial distribution is useful for describing when the same event is repeated over and over, characterizing the probability of a proportion of the events having a certain outcome over a specified number of events
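As an illustration of "the probability of a proportion of the events having a certain outcome over a specified number of events", the sketch below uses the standard binomial formula P(k) = C(n, k) p^k (1 - p)^(n - k). The formula itself is not stated on the slides, and the coin-toss numbers (10 tosses, p = 0.5) are an assumed example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k 'successes' in n independent repetitions,
    each with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n_tosses = 10   # hypothetical number of coin tosses
p_heads = 0.5   # probability of heads on each toss

for k in range(n_tosses + 1):
    print(f"P({k} heads in {n_tosses} tosses) = {binomial_pmf(k, n_tosses, p_heads):.4f}")

total = sum(binomial_pmf(k, n_tosses, p_heads) for k in range(n_tosses + 1))
print("Sum over all outcomes:", total)
```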
The Poisson Distribution
• The usual application of probability distributions is to find a theoretical distribution that reflects a process such that it explains what we see in some observed sample of a geographic phenomenon
• The theoretical distribution does this by virtue of the fact that the forms of the sampled information and the theoretical distribution can be compared and found to be similar through a test of significance
• One theoretical concept that we often study in geography concerns discrete random events in space and time (e.g. where will a tornado occur?)
The Poisson Distribution
• Here, the counts of points per quadrat form the frequencies we use to check Poisson probabilities:
[Figure: three point patterns, each labeled with its variance-to-mean ratio σ²:x̄]
Regular: low variance, mean 1; σ²:x̄ is low
Random: variance ≈ mean; σ²:x̄ ~ 1
Clustered: low variance, mean 0; σ²:x̄ is high
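The variance-to-mean ratio of quadrat counts can be computed in a few lines. The three sets of counts below are invented to represent a regular, a random-looking and a clustered pattern over nine quadrats; the sample variance (n - 1 denominator) is used here, which is an assumption rather than something the slide specifies.

```python
import statistics

def variance_mean_ratio(counts):
    """Ratio of the sample variance to the mean of points-per-quadrat counts."""
    return statistics.variance(counts) / statistics.mean(counts)

# Hypothetical counts of points in nine quadrats (nine points in total each time)
regular = [1, 1, 1, 1, 1, 1, 1, 1, 1]     # one point in every quadrat
random_like = [0, 2, 1, 0, 3, 1, 0, 1, 1]  # counts scattered around the mean
clustered = [9, 0, 0, 0, 0, 0, 0, 0, 0]    # all points in a single quadrat

for name, counts in (("Regular", regular),
                     ("Random", random_like),
                     ("Clustered", clustered)):
    print(f"{name:<10} variance:mean ratio = {variance_mean_ratio(counts):.2f}")
```

A ratio well below 1 suggests a regular pattern, a ratio near 1 is what a Poisson (random) process would give, and a ratio well above 1 suggests clustering.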
The Normal Distribution
• Many naturally occurring variables are distributed approximately normally (e.g. heights, weights, shoe sizes, annual temperature variations, test scores, IQ scores, etc.)
• For example, if we take the land surface temperature values for Maryland Climate Division 6 on May 24, 2002 from the LST GRID we are using in lab #6 and generate a histogram of these values, we can see that the values are basically normally distributed:
LST Distribution
Mean = 85.4994 degrees F
Standard Deviation = 4.3092 degrees F
The Normal Distribution
• You will recall the z-score (a.k.a. standard normal variate, standard normal deviate, or just the standard score), which is calculated by subtracting the sample mean from the observation, and then dividing that difference by the sample standard deviation:
$Z\text{-score} = \frac{x - \mu}{\sigma}$
• The z-score is the means that is used to transform our normal distribution into a standard normal distribution, which is simply a normal distribution that has been standardized to have µ = 0 and σ = 1
LST Z-Scores
Mean = 85.4994 degrees F
Std. Dev. = 4.3092 degrees F
$Z\text{-score} = \frac{x - \mu}{\sigma}$
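Using the LST mean and standard deviation given above, the z-score of any temperature observation can be computed as follows. The three example temperatures are hypothetical; only the mean and standard deviation come from the slide.

```python
LST_MEAN_F = 85.4994     # mean land surface temperature, degrees F (from the slide)
LST_STD_DEV_F = 4.3092   # standard deviation, degrees F (from the slide)

def z_score(x, mean, std_dev):
    """Number of standard deviations by which x lies above or below the mean."""
    return (x - mean) / std_dev

# Hypothetical temperature observations in degrees F
for temperature in (80.0, 85.4994, 90.0):
    z = z_score(temperature, LST_MEAN_F, LST_STD_DEV_F)
    print(f"{temperature:.4f} degrees F -> z = {z:+.3f}")
```

An observation equal to the mean has a z-score of 0, and an observation one standard deviation above the mean has a z-score of 1, which is what standardizing to µ = 0 and σ = 1 means.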