Post on 15-Jan-2016
CS 128/ES 228 - Lecture 14a 1
Data Quality Management
Geospatial errors can cause real-life problems!
http://www.brownsmarina.com/fun.html
One management strategy …
Murphy’s Law
Ignoring data quality issues usually doesn’t work very well
Some geospatial goofs
This one’s worse…
Mars Climate Orbiter (MCO) was lost on 23 Sep 1999 when it failed to enter orbit around Mars, instead approaching the planet too closely and being destroyed. The craft cost $125 million, part of a $328 million mission.
The root cause of the failure was a computer program that was supposed to provide its output in newton-seconds (N·s) but instead provided pound-force seconds (lbf·s).
http://lamar.colostate.edu/~hillger/unit-mixups.html#mco
http://www.boeing.com/companyoffices/gallery/images/space/d2_mars_climate_orbiter_01.htm
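A minimal sketch of the unit mix-up described above. The impulse value is hypothetical and chosen only for illustration; the conversion factor (1 lbf = 4.4482216152605 N) is the standard definition. It shows how far off a number becomes when a value in lbf·s is read as if it were N·s.

```python
# 1 pound-force = 4.4482216152605 newtons (standard conversion factor).
LBF_TO_N = 4.4482216152605


def lbf_s_to_n_s(impulse_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton-seconds."""
    return impulse_lbf_s * LBF_TO_N


# Hypothetical impulse value, for illustration only.
reported = 10.0                   # number produced by the program, in lbf·s
assumed = reported                # the consumer wrongly reads it as N·s
actual = lbf_s_to_n_s(reported)   # what the number really means in N·s

print(f"assumed {assumed:.2f} N·s, actual {actual:.2f} N·s")
print(f"the true impulse is {actual / assumed:.2f} times the assumed one")
```

Carrying the unit alongside the number (in a field name, a type, or metadata) is one way to make this kind of mismatch visible before it matters.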
And these are really bad!
Just a 'map error'? The China Daily website carries a cartoon of the damaged US plane at Hainan Island's airbase and asks sarcastically if Sunday's collision "might be due to another map error" - a reference to the US bombing of the Chinese embassy in Belgrade in 1999. "Last time it's due to a map error, and this time another map error? What about the next?"
http://news.bbc.co.uk/1/hi/world/monitoring/media_reports/1260185.stm
It might be due to another map error
China Daily
What is error?
“Error is the physical difference between the real world and the GIS facsimile”
-Heywood, Cornelius, & Carver, p. 178
Errors are impossible to avoid, but can be managed
A Data Management Model
Data acquisition
Data representation & analysis
Data outputs
Data acquisition errors
Scientists use the term “error” for two very different concepts:
natural variability
actual mistakes
Take a sidewalk …
What’s its width? 1.77, 1.82, 1.69 … meters
a. “Error” (natural variability): mean width = 1.76 m, range 1.69 - 1.82 m
b. “Error” (actual mistake): mean = 1.67 ft
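The sidewalk numbers can be reproduced with a short sketch. The three widths are the ones quoted on the slide; the helper `within_observed_range` is a hypothetical name introduced here to show one simple way a reported value such as 1.67 could be flagged as a likely mistake rather than natural variability.

```python
from statistics import mean

# The three sidewalk widths quoted on the slide, in meters.
widths_m = [1.77, 1.82, 1.69]

mean_width = mean(widths_m)
low, high = min(widths_m), max(widths_m)
print(f"mean width = {mean_width:.2f} m, range {low:.2f} - {high:.2f} m")


def within_observed_range(value, observations):
    """Hypothetical check: a mean must lie between the smallest and largest observation."""
    return min(observations) <= value <= max(observations)


# A reported mean of 1.67 cannot be the mean of these measurements,
# so it points to a mistake (wrong unit, typo) rather than variability.
print(within_observed_range(1.67, widths_m))
```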
Accuracy vs. Precision
Figure 10.1, An Introduction to Geographic Information Systems by Heywood, Cornelius, and Carver
Random error vs. Bias
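One way to see the distinction numerically is sketched below, assuming a known "true" value is available for comparison. Both measurement sets and the true width are hypothetical. Bias shows up as an offset of the mean from the truth (an accuracy problem); random error shows up as spread around the mean (a precision problem).

```python
from statistics import mean, stdev

TRUE_WIDTH_M = 1.76  # assumed "true" value, for illustration only

# Hypothetical measurement sets (not from the lecture).
precise_but_biased = [1.90, 1.91, 1.89, 1.90]   # tight cluster, wrong place
accurate_but_noisy = [1.60, 1.92, 1.70, 1.82]   # right place, wide scatter


def bias_and_spread(measurements, truth):
    """Return (bias, spread): mean offset from truth and sample standard deviation."""
    return mean(measurements) - truth, stdev(measurements)


for label, data in [("biased", precise_but_biased), ("noisy", accurate_but_noisy)]:
    bias, spread = bias_and_spread(data, TRUE_WIDTH_M)
    print(f"{label}: bias = {bias:+.3f} m, spread = {spread:.3f} m")
```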
Where does lack of precision come from?
Natural variability
Poor input assumptions
Imprecise equipment
Sloppy measurement
Accumulated error
Random error is often “normal”
mean
Standard deviation
95% of observations fall within ±2 s.d. of the mean
Mean - 2 s.d.    mean    Mean + 2 s.d.
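The "95% within 2 s.d." rule of thumb can be checked for an exactly normal distribution with Python's standard library. This is a sketch under the assumption that the errors really are normal; real measurements may not be.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Fraction of a normal population lying within 2 standard deviations of the mean.
within_2sd = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)
print(f"within ±2 s.d.: {within_2sd:.4f}")

# The multiplier that captures exactly 95% is slightly less than 2.
z95 = standard_normal.inv_cdf(0.975)
print(f"multiplier for exactly 95%: {z95:.3f}")
```

The rule is a convenient rounding: 2 s.d. covers a little more than 95%, and about 1.96 s.d. covers 95%.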
Means have smaller variability than single measurements
S.E. (mean) = standard deviation / √n
If n = 4, √n = ?
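A short sketch of the standard error of the mean, computed as the sample standard deviation divided by √n. The four measurements are hypothetical. With n = 4, √n = 2, so the mean varies half as much as a single measurement.

```python
from math import sqrt
from statistics import stdev


def standard_error(measurements):
    """Standard error of the mean: sample standard deviation divided by sqrt(n)."""
    return stdev(measurements) / sqrt(len(measurements))


# Hypothetical set of four width measurements, in meters.
widths_m = [1.77, 1.82, 1.69, 1.76]

sd = stdev(widths_m)
se = standard_error(widths_m)
print(f"s.d. = {sd:.4f} m, S.E. of mean = {se:.4f} m, ratio = {sd / se:.1f}")
```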
Where does lack of accuracy come from?
Dubious source data
Incompatible source data
  Data collected at different times through different methods, possibly in different formats
Bias
How can we fix it?
Benchmarks
  ex. National Geodetic Survey maintains a database of survey “monuments” at http://www.ngs.noaa.gov/cgi-bin/datasheet.prl
Otherwise, just measure variability
http://upload.wikimedia.org/wikipedia/commons/thumb/6/66/USCGS-E134.jpg/617px-USCGS-E134.jpg
Data representation errors
Transference error
Data storage errors
Analysis errors
Where does transference error come from?
Typos, etc.
  Less likely with automated data collection and transformation
  Can be prevented through diligence and software “sanity” checks
Format conversion
  Many inter-format conversions cause loss/corruption of data/information
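A software "sanity" check can be as simple as rejecting values that cannot be right. The sketch below assumes points are stored as latitude/longitude in decimal degrees; the function name and the sample records are hypothetical.

```python
def coordinate_problems(lat, lon):
    """Return a list of problems found; an empty list means the point passed."""
    problems = []
    if not -90.0 <= lat <= 90.0:
        problems.append(f"latitude {lat} outside [-90, 90]")
    if not -180.0 <= lon <= 180.0:
        problems.append(f"longitude {lon} outside [-180, 180]")
    return problems


# Hypothetical records; the second has a slipped decimal point in the latitude.
records = [
    ("site A", 42.08, -78.48),
    ("site B", 420.8, -78.48),
]

for name, lat, lon in records:
    for problem in coordinate_problems(lat, lon):
        print(f"{name}: {problem}")
```

A range check like this catches only impossible values. A typo that produces a plausible but wrong coordinate still needs diligence, or a comparison against an independent source.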
Something got lost in the translation
“geographic information systems is an interesting course”
“지리적인 정보 시스템은 재미있는 과정 이다”
“The geography information system is the process which is fun”
Thanks to http://babelfish.altavista.com/babelfish/tr
Raster ↔ Vector conversions
Aliasing is an intrinsic problem of GISs
Digitization errors
Topology errors
Figure 10.5, An Introduction to Geographic Information Systems by Heywood, Cornelius, and Carver
Data storage/retrieval errors
Hardware failure
Hardware limitations
What is a hardware limitation?
Numbers in a computer are stored in a finite number of bits. Using too few bits can cause round-off error.
Box 9.2, Principles of Geographic Information Systems by Burrough and McDonnell
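Round-off from too few bits is easy to demonstrate. The sketch below pushes a hypothetical coordinate (in meters) through a 32-bit float using Python's standard `struct` module. At this magnitude a 32-bit float can only represent steps of 0.5, so the fractional part is lost.

```python
import struct


def as_float32(value):
    """Round-trip a Python float (64-bit) through 32-bit storage."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Hypothetical easting in meters, with millimeter detail.
easting_m = 4500000.123

stored = as_float32(easting_m)
print(f"original: {easting_m!r}")
print(f"stored:   {stored!r}")
print(f"round-off error: {abs(stored - easting_m):.3f} m")
```

Python's own floats are 64-bit, which is why the original value survives until it is packed into 32 bits.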
Where do errors of data rot come from?
Link rot
  Not Found: The requested URL /cs/dlevine/ was not found on this server. Apache/1.3.27 Server at www.xxx.edu Port 80
Poor “style”
  E.g. “Employees may appeal to Sr. Carney” as opposed to “Employees may appeal to the President of the University”
Where do errors of analysis come from?
How long do you have? …
Mistaken queries
Analyzing layers with different datums or coordinate systems
Comparing attributes with incompatible units
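Two of these analysis errors, mismatched coordinate systems and mismatched units, can be guarded against before any overlay is run. The sketch below uses a hypothetical `Layer` record and a hypothetical check function; it is not the API of any particular GIS package. The EPSG codes are real identifiers used only as example labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Hypothetical minimal layer description, for illustration only."""
    name: str
    crs: str     # coordinate reference system label, e.g. an EPSG code
    units: str   # linear or angular units of the coordinates


def compatibility_problems(a, b):
    """List reasons why two layers should not be analyzed together as-is."""
    problems = []
    if a.crs != b.crs:
        problems.append(f"CRS mismatch: {a.name} is {a.crs}, {b.name} is {b.crs}")
    if a.units != b.units:
        problems.append(f"unit mismatch: {a.name} uses {a.units}, {b.name} uses {b.units}")
    return problems


parcels = Layer("parcels", "EPSG:26917", "meters")   # NAD83 / UTM zone 17N
wells = Layer("wells", "EPSG:4326", "degrees")       # WGS 84 latitude/longitude

for problem in compatibility_problems(parcels, wells):
    print(problem)
```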
More errors of analysis …
Inappropriate resolution
Combining rasters/vectors with different resolutions
Using exact/abrupt surface fits when approximate/gradual is appropriate (or vice versa)
Data output errors
Maps
Reports
Junket at taxpayers’ expense?
Did a politician misuse federal funds to visit Alaska on the way to official business in Japan?
Muehrcke. Map Use, 2nd ed. p. 395
No - Intentional map error*
*More like lying with maps!
Muehrcke. Map Use, 2nd ed. p. 395
Should maps be as accurate as possible?
Map simplification
  Features are omitted
  Area features become lines or points
Exaggeration
  Features’ apparent size is “increased” (e.g. hydrants)
  Features’ separation is increased on the map for visibility
Must Mapquest be accurate?
Reporting significance of findings
Hypothesis testing
What does the term “significant” mean to scientists?
Are two means really different?
These two normal distributions have a very large overlap. The means of the two populations are not significantly different, because the overlap is > 5% of the area under the curves. t would be very small.
http://www.steve.gb.com/science/statistics.html#t
What about these two means?
http://www.steve.gb.com/science/statistics.html#t
These means are also significantly different - why?
http://www.steve.gb.com/science/statistics.html#t
How do we actually test for statistical differences?
Student’s t-test
t = difference in means / measure of variability
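A minimal sketch of a t statistic as a ratio of the difference in means to a measure of variability. The slide does not say which form of the test is meant; the version below is Welch's form (unequal variances allowed), chosen here as an assumption, and the two samples are hypothetical. It computes only t; turning t into a significance level also needs the degrees of freedom and a t distribution, which a statistics package would supply.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """t = (difference in means) / (standard error of that difference)."""
    difference = mean(sample_a) - mean(sample_b)
    variability = sqrt(variance(sample_a) / len(sample_a)
                       + variance(sample_b) / len(sample_b))
    return difference / variability


# Hypothetical width measurements (meters) for two sidewalks.
sidewalk_1 = [1.77, 1.82, 1.69, 1.76]
sidewalk_2 = [1.78, 1.80, 1.70, 1.75]   # nearly the same as sidewalk_1
sidewalk_3 = [2.77, 2.82, 2.69, 2.76]   # same spread, clearly wider

print(f"t (1 vs 2) = {welch_t(sidewalk_1, sidewalk_2):.2f}")   # small |t|
print(f"t (1 vs 3) = {welch_t(sidewalk_1, sidewalk_3):.2f}")   # large |t|
```

A large |t| can come from a large difference in means, from small variability, or from large samples, since n sits in the denominator of the variability term.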
Three Commandments of Data Reporting
Thou Shalt Not …
I. Report insignificant digits (or omit significant trailing zeros)
II. Report means without also reporting sample sizes and variability
III. Report results as “significant” (or even worth talking about) without doing the appropriate statistical tests.
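The first two commandments can be built into how a result is printed. The sketch below uses a hypothetical helper, `summarize`, that reports the mean together with the standard deviation and sample size, at a number of decimal places the caller chooses to match the measuring precision. Fixed-point formatting keeps significant trailing zeros, so 1.70 prints as 1.70 and not 1.7.

```python
from statistics import mean, stdev


def summarize(measurements, unit, decimals):
    """Format as 'mean ± s.d. unit (n = N)' with a fixed number of decimals."""
    m = mean(measurements)
    sd = stdev(measurements)
    return f"{m:.{decimals}f} ± {sd:.{decimals}f} {unit} (n = {len(measurements)})"


# Widths measured to the nearest centimeter, so two decimals are reported.
widths_m = [1.77, 1.82, 1.69]
print(summarize(widths_m, "m", 2))
```

The third commandment cannot be handled by formatting: whether a difference is significant still has to be decided by an appropriate test.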
How do we minimize (NOT avoid) error?
“CONSTANT VIGILANCE”
-- “Mad Eye” Moody, Defense Against the Dark Arts Instructor, Hogwarts School of Witchcraft and Wizardry
http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/entertainment_goblet_of_fire/html/3.stm