Introduction: Chapters 14 & 15
Imperfect or uncertain reconciliation [science, practice] [concepts, application] [analytical capability, social context]
It is impossible to make a perfect representation of the world, so uncertainty about it is inevitable
Sources of Uncertainty
Measurement error: different observers, measuring instruments
Specification error: omitted variables
Ambiguity, vagueness and the quality of a GIS representation
A catch-all for ‘incomplete’ representations or a ‘quality’ measure
Fuzzy Approaches to Uncertainty
In fuzzy set theory, it is possible to have partial membership in a set
Membership can vary, e.g. from 0 to 1
This adds a third option to classification: yes, no, and maybe
Fuzzy approaches have been applied to the mapping of soils, vegetation cover, and land use
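A minimal sketch of partial membership, assuming a simple linear ramp between two thresholds. The function names and the canopy-cover thresholds (10% and 60%) are hypothetical and chosen only for illustration; the slides do not prescribe any particular membership function.

```python
def ramp_membership(value, low, high):
    """Return a membership grade between 0 and 1 using a linear ramp.

    Values at or below `low` get 0 (not a member), values at or above
    `high` get 1 (full member), and values in between get a partial grade.
    """
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def forest_membership(canopy_cover_percent):
    # Hypothetical thresholds: under 10% cover is "not forest",
    # over 60% is "forest", anything between is "maybe".
    return ramp_membership(canopy_cover_percent, 10.0, 60.0)


for cover in (5.0, 35.0, 80.0):
    print(cover, forest_membership(cover))
```

A crisp classification would force the 35% cell into either class; the fuzzy grade keeps the "maybe" explicit.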
Statistical measures of uncertainty: nominal case
How to measure the accuracy of nominal attributes? e.g., a vegetation cover map
The confusion matrix compares recorded classes (the observations) with classes obtained by some more accurate process, or from a more accurate source (the reference)
Example of a misclassification or confusion matrix. A grand total of 304 parcels have been checked. The rows of the table correspond to the land use class of each parcel as
recorded in the database, and the columns to the class as recorded in the field. The numbers appearing on the principal diagonal of the table (from top left to bottom right)
reflect correct classification.
Class    A    B    C    D    E  Total
A       80    4    0   15    7    106
B        2   17    0    9    2     30
C       12    5    9    4    8     38
D        7    8    0   65    0     80
E        3    2    1    6   38     50
Total  104   36   10   99   55    304
Confusion Matrix Statistics
Percent correctly classified: total of diagonal entries divided by the grand total, times 100
209/304*100 = 68.8%
But chance would give a score of better than 0
Kappa statistic: normalized to range from 0 (chance) to 100
Evaluates to 58.3%
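The two statistics can be reproduced from the parcel table above. This is a sketch that assumes the usual Cohen's kappa formula (observed agreement corrected by the agreement expected from the row and column totals); the slides quote the result but do not spell out the formula, and the function names are illustrative.

```python
# Rows: class recorded in the database; columns: class recorded in the field.
matrix = [
    [80, 4, 0, 15, 7],
    [2, 17, 0, 9, 2],
    [12, 5, 9, 4, 8],
    [7, 8, 0, 65, 0],
    [3, 2, 1, 6, 38],
]


def percent_correct(m):
    """Diagonal total divided by grand total, times 100."""
    total = sum(sum(row) for row in m)
    diagonal = sum(m[i][i] for i in range(len(m)))
    return 100.0 * diagonal / total


def kappa_percent(m):
    """Cohen's kappa on a 0 (chance) to 100 (perfect) scale."""
    n = len(m)
    total = sum(sum(row) for row in m)
    observed = sum(m[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in m]
    col_totals = [sum(m[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance, given the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return 100.0 * (observed - expected) / (1.0 - expected)


print(f"PCC:   {percent_correct(matrix):.1f}%")
print(f"Kappa: {kappa_percent(matrix):.1f}%")
```

Both values agree with the figures quoted on the slide (68.8% and 58.3%).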
The term precision is often used to refer to the repeatability of measurements. In both diagrams six measurements have been
taken of the same position, represented by the center of the circle. On the left, successive measurements have similar values (they are
precise), but show a bias away from the correct value (they are inaccurate). On the right, precision is lower but accuracy is higher.
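The distinction in the diagrams can be sketched numerically. This example assumes one-dimensional measurements for simplicity (the diagrams show two-dimensional positions), and all measurement values are hypothetical. Bias is taken as the mean error; spread (the inverse of precision) as the population standard deviation.

```python
import statistics


def bias_and_spread(measurements, true_value):
    """Return (mean error, standard deviation) of repeated measurements."""
    errors = [m - true_value for m in measurements]
    bias = statistics.mean(errors)             # systematic offset from the truth
    spread = statistics.pstdev(measurements)   # repeatability of the measurements
    return bias, spread


true_value = 100.0
# Hypothetical data: tightly clustered but offset (precise, inaccurate).
left = [103.9, 104.0, 104.1, 104.0, 103.9, 104.1]
# Hypothetical data: scattered but centred on the truth (less precise, accurate).
right = [98.0, 102.0, 99.0, 101.0, 97.0, 103.0]

bias_left, spread_left = bias_and_spread(left, true_value)
bias_right, spread_right = bias_and_spread(right, true_value)
print("left: ", bias_left, spread_left)
print("right:", bias_right, spread_right)
```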
Measuring Accuracy
Root Mean Square Error (RMSE) is the square root of the average squared error
The primary measure of accuracy in map accuracy standards and GIS databases
e.g., elevations in a digital elevation model might have an RMSE of 2 m
The abundances of errors of different magnitudes often closely follow a Gaussian or normal distribution
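A minimal sketch of the RMSE calculation as defined above. The DEM and survey elevations are hypothetical, picked so that the result matches the 2 m example; the function name is illustrative.

```python
import math


def rmse(measured, reference):
    """Square root of the average squared difference between two series."""
    if len(measured) != len(reference):
        raise ValueError("series must have the same length")
    squared_errors = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical elevations in metres at four check points.
dem_elevations = [210.0, 221.0, 198.0, 205.0]
survey_elevations = [212.0, 219.0, 200.0, 203.0]

print(rmse(dem_elevations, survey_elevations))
```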
Map scale Ground distance corresponding to 0.5 mm map distance
1:1250 62.5 cm
1:2500 1.25 m
1:5000 2.5 m
1:10,000 5 m
1:24,000 12 m
1:50,000 25 m
1:100,000 50 m
1:250,000 125 m
1:1,000,000 500 m
1:10,000,000 5 km
A useful rule of thumb is that positions measured from maps are accurate to about 0.5 mm on the map.
Multiplying this by the scale of the map gives the corresponding distance on the ground.
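The rule of thumb is a single multiplication, sketched here with a hypothetical helper name. The 0.5 mm default comes from the rule above; the scale is given as its denominator (e.g. 24000 for 1:24,000).

```python
def ground_distance_m(scale_denominator, map_error_mm=0.5):
    """Ground distance in metres corresponding to a map distance in mm."""
    return map_error_mm * scale_denominator / 1000.0


for denominator in (1250, 24000, 250000, 10000000):
    print(f"1:{denominator} -> {ground_distance_m(denominator)} m")
```

The results reproduce the table rows, e.g. 12 m at 1:24,000 and 5 km at 1:10,000,000.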
Data Quality Issues: Geographic Data
Error: difference between the real world and the geographic data representation of it
Components of error in geographic data...
Accuracy: extent to which an estimated data value approaches its true value
e.g., cell in raster DEM with elevation of 210 meters -vs.- actual elevation at that point of 219 meters
Precision: level of detail at which data values are recorded
e.g., cell in raster DEM with elevation values recorded with no decimal place: 219 meters -vs.- two decimal places: 219.05 meters
More Data Quality Issues
Bias: systematic variation in accuracy within geographic data
e.g., GIS tech mistypes coordinate values when entering control points to register map to digitizing tablet
all coordinate data from this map is systematically offset (biased)
Error can propagate…
e.g., what happens if a layer digitized with a spatial bias problem is used as the spatial reference to create another, new layer?
Propagation can be additive
Remember our resolution and generalization issues…
And don’t forget the limitations of our data models…
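A small sketch of additive propagation, assuming the simplest case in which each layer contributes a constant (dx, dy) offset. The coordinates and offsets are hypothetical, and real error propagation is usually messier than a pure sum.

```python
def apply_offset(point, offset):
    """Shift an (x, y) point by a (dx, dy) offset."""
    return (point[0] + offset[0], point[1] + offset[1])


true_point = (1000.0, 2000.0)
base_layer_bias = (3.0, -2.0)  # hypothetical bias in the reference layer, metres
new_layer_bias = (1.5, 0.5)    # hypothetical offset added when digitizing against it

# The reference layer already misplaces the point...
base_position = apply_offset(true_point, base_layer_bias)
# ...and the new layer inherits that error and adds its own.
new_position = apply_offset(base_position, new_layer_bias)

total_error = (new_position[0] - true_point[0], new_position[1] - true_point[1])
print(total_error)
```

The new layer carries the sum of both offsets even though only one of them was introduced when it was created.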
Some Non-Error Data Quality Issues
Compatibility: can two or more geographic data sets be used together properly?
e.g. is it meaningful to overlay roads data digitized at 1:250,000 scale with road hazard sites digitized at 1:10,000?
Completeness: does a given data set adequately cover a study area? are there gaps in space or time?
e.g. a city’s municipal cadastral database -- do all parcel polygons have attribute information? are any parcels missing?
Consistency: are pieces of a geographic data set consistent in terms of content, format, etc?
e.g. landcover data layer for a study area -- different sub-areas produced from two satellite scenes... one Landsat TM & classified into 10 classes -vs.- one Landsat MSS & classified into 5 classes
Living with Uncertainty
It is easy to see the importance of uncertainty in GIS, but much more difficult to deal with it effectively
But we may have no option, especially in disputes that are likely to involve litigation
More Basic Principles
Use as many sources of data as possible and cross-check them for accuracy
Be honest and informative in reporting results
Add plenty of caveats and cautions
In the end…
Your responsibility: --- is it appropriate or suitable for the intended use?
Truth in Labeling; Fitness for Use
Assessing the applicability of a data set or analysis result given the accuracy, precision, bias, resolution, level of
generalization, lineage, positional accuracy, attribute accuracy, compatibility, completeness, & logical consistency of a data set or analysis result.
Make use of lineage information
At a minimum: description of source data
How was the data transformed in preparation or analysis?
Assessing Data Quality
Positional accuracy: RMSE
Attribute accuracy: error (confusion) matrix, overall accuracy (PCC – Percent Correctly Classified), User’s accuracy (error of commission), Producer’s accuracy (error of omission), kappa statistic (index of agreement, contribution of chance).
QA/QC, error propagation, sensitivity analyses
Special issues: sample data, reference data, spatial autocorrelation, sample size, sampling scheme (e.g., random, systematic, stratified).
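User's and producer's accuracy can be sketched with the 304-parcel confusion matrix from the earlier slide. This assumes the common convention that rows hold the mapped (database) class and columns the reference (field) class, so user's accuracy divides the diagonal by the row total and producer's accuracy divides it by the column total. The function names are illustrative.

```python
classes = ["A", "B", "C", "D", "E"]
# Rows: class recorded in the database; columns: class recorded in the field.
matrix = [
    [80, 4, 0, 15, 7],
    [2, 17, 0, 9, 2],
    [12, 5, 9, 4, 8],
    [7, 8, 0, 65, 0],
    [3, 2, 1, 6, 38],
]


def users_accuracy(m):
    """Per class: correct parcels / all parcels mapped as that class.

    One minus this value is the error of commission.
    """
    return [m[i][i] / sum(m[i]) for i in range(len(m))]


def producers_accuracy(m):
    """Per class: correct parcels / all parcels of that class in the field.

    One minus this value is the error of omission.
    """
    n = len(m)
    return [m[j][j] / sum(m[i][j] for i in range(n)) for j in range(n)]


users = dict(zip(classes, users_accuracy(matrix)))
producers = dict(zip(classes, producers_accuracy(matrix)))
for name in classes:
    print(f"{name}: user's {users[name]:.3f}, producer's {producers[name]:.3f}")
```

Class C illustrates why both measures are reported: most field parcels of class C are mapped correctly (9 of 10), yet most parcels mapped as C belong to another class (29 of 38).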
Geographic Data Standards
A standard is defined by the International Organization for Standardization (ISO) as a document, established by consensus and approved by a recognized body, that provides, for common and repeated use, rules, guidelines, or characteristics for activities or their results, aimed at the achievement of the optimum degree of order in a given context.
Standardization is an indispensable component of the information system development strategy of any organization.
Without standards, users would tend to go their own way in systems operations, data management, and applications development, resulting in incompatibility.
Implementing a GIS
Design and implementation of a GIS is a major, long-term undertaking (more than technical issues); it is where technology and people meet.
Organization must evolve, for in adopting any new technology, especially one with an influence as pervasive as a GIS, the organization itself is changed.
Information flows are shifted, and different people exert different degrees of control over information, its distribution and use – a new organization power structure can develop; computer technology is political; knowledge and access is power.
Informal control and review of information requests may be lost; hard to deny access or hide preferential treatment.
More on Implementation
Computerized data and analysis techniques are subject to the same types of political bias and inaccuracies as other data.
Important to coordinate the integration process so that the operation of the organization during the transition to GIS is not jeopardized
New technology will change an organization in ways that cannot be entirely predicted.
Awareness, development of system requirements, system evaluation, development of an implementation plan, system acquisition and startup, operational phase.
Awareness & System Requirements
Top-down, bottom-up, 3rd party; group within the organization can be assigned the responsibility of analyzing the need for a GIS, but often a 3rd party.
Functional definition of user needs analysis - interviews, analyzing the information product and services handled by the organization, and the systems and procedures used to provide them.
Analysis should provide a systematic report of the information tasks the organization is mandated to perform, input data needed, output products required, and the procedures used to generate those output products.
Form, accuracy, timeliness, and volume of those products are critical characteristics.
Project Initiation & Planning
The use of GIS in an organization does not necessarily mean that there are problems to be fixed.
Planning – studying the mission of the organization, defining the information architecture of the proposed system (i.e., data, people, applications, and technology), and performing analyses that identify the requirements of individual functions for the purpose of sharing databases and technologies.
User needs assessment – function information product table, information product data table, data attribute table.
Alternative Systems
User-friendliness is a relative term.
Possible problems – poor training; poor documentation; software does not perform as expected; system installation and start-up is late; customer support is too slow; data entry is more costly and slower than expected; price higher than anticipated for hardware, software, maintenance; back-up or recovery systems fail and data are lost; software cannot be modified to provide additional functions or handle unexpected problems.
Benchmarking, pilot studies, end user inputs critical.
Report that includes recommendations and stated criteria, financial and staff resources, and support of the key stakeholders of the organization.
User Needs Assessment & Impetus for a GIS
Identification of users, definition of required products, evaluation of work flow, estimation of database development effort, inventory of applications, refinement of GIS product characteristics, calculation of necessary production rates, estimation of data flows, cost/benefit analysis.
Spatial information is poorly maintained or is out of date; spatial data is not recorded or stored in a standardized way; spatial data may not be defined in a consistent manner; data are not shared; data retrieval and manipulation capabilities are inadequate; new demands are made of the organization that cannot be met using the current information system.
GIS Software Evaluation
Software benchmark testing for specific applications; mandatory vs. desirable functions and features, technical specifications to vendors, demo data sets, scripting & macros for customized functions.
Hardware systems – turn key, multi-function, graphic devices, input/output, internal and external memory requirements, servers vs. local systems, network and data volumes, system security.
Database & Applications Design
System design is largely the work of system analysts, but end users are intimately involved in aspects of the process – to define the scope and requirements of a database.
Conceptual data modeling is carried out independently of software and hardware that will be used to implement the database.
Identification of database entities, attributes, data relationships, and logical data connections.
Need to connect the conceptual analysis scheme to the logical scheme; identify key fields and ensure integrity of links.
Layer or coverage design – geo-relational data model that organizes spatial data in layers or coverages; anticipated use of the data, scale and resolutions of source data, data-to-data relationships, data-to-function relationships, integrity of shape files, tile boundaries, tile size, etc.
Beta testing through physical data modeling – storage structure, access paths, etc.
Populating the Geographic Database
Creating the necessary attribute relations and the graphical layers as set out by the design specifications.
Regardless of the source of the data, quality assurance and quality control (QA/QC) is essential.
System maintenance and technical support.
Getting Started…
Collection of data products from municipalities and counties, pre-processing (e.g., digitizing), data transformations (e.g., coordinate conversion), QA/QC protocols, create a production plan and time schedule for each phase of the project.
Decide on base map to be used, identify and gather supporting data layers, create attribute template.
Develop classification cross-walks.
Critical assessment of inputs, outputs, and analysis streams for development and use of the database.
Stay in contact with the stakeholders, train end-users and technical services people, inform administrators.
Develop internal and external validity checks.
Role of a GIS Manager
Your role is to make sure:
A good system is selected
It works efficiently
It demonstrably contributes to the organisation’s strategic objectives
It is sustainable
Consequences of failure severe for you and others
Success demands sharing experience and knowledge with others
Building the Case for a GIS
Why GIS?
Cost reduction, e.g. tax assessment, work orders
Cost avoidance, e.g. minimize delivery costs, avoid flood damage
Increased revenue, e.g. attract more customers, sell more maps
Getting wholly new (and valued) products, e.g. those too costly or time-consuming previously
Non-tangible benefits, e.g. better decisions, happy staff and customers
Benefit: Cost Analysis
Economic (tangible) costs: hardware and software; data purchase, collection; training; new staff or skills; additional space
Economic (tangible) benefits: reduced cost (staff); greater throughput; increased revenues; new market services or products
Institutional (intangible) costs: interpersonal shifts; layoffs of low-skilled staff; staff anxiety; neglect of other projects
Institutional (intangible) benefits: improved client relationships; better decisions; improved morale; better information flow; better culture of ‘achievers’
GIS Implementation Management Issues (1)
Plan effectively
Obtain support
Communicate with users
Anticipate and avoid obstacles
Avoid false economies
GIS Implementation Management Issues (2)
Ensure database quality and security
Accommodate GIS within organization
Avoid unreasonable timeframes
Secure ongoing funding
Prevent meltdown
Reasons GIS Fail
Lack of executive-level commitment
Inadequate oversight of key participants
Inexperienced managers
Unsupportive organizational structure
Political pressures e.g. in times of fast change
Inability to demonstrate benefits
Unrealistic deadlines
Poor planning
Lack of core funding
Managing an Operational GIS
Customer support: all users are customers; create a customer support facility
Operations support: administration, backup, system support, helpdesk
Data management support: database administrator
On-going application customization: use well-proven project management tools
Golden Rules of Project Management
Projects must be completed on time, within budget, and according to quality standards
You will be responsible for the work of others. Make sure they are competent
Uncertainty of many kinds exists: you have to live with it but agree how much is acceptable
Have fun doing it and celebrate success!
Government, Academia, and Commerce are all businesses
‘Business’ used for entities which act coherently, meet particular objectives, act to please customers /clients /stakeholders
Many similarities – goals to be achieved, cost-effectiveness, innovation, managing knowledge + creation of Geographic Information
Some convergence between sectors e.g. revenue generation even by some governments
But some distinctions e.g. complex of objectives in governments, profit motive in commerce
Almost all GIS software now commercial
Some Concerns for Managers
People cause more problems than technology
Everything changes faster than you would like, e.g. user expectations
Uncertainty is always with us
Everything interacts with everything else
Users often have very imprecise ideas of what they want – even if they say otherwise
Big differences in national culture impact on how things can be done
Information as Currency in the Knowledge Economy: Some Myths
More information always leads to better decisions
Managers need all the information they want
Managers can model the decision they wish to make (cause and effect not always clear!)
Managers don’t need to understand information systems
Information systems lead to better communication within the organisation
But GIS can help in each case to create evidence for decisions…
Information as Currency in the Knowledge Economy: Characteristics
Value depends on use e.g. consumption or as factor of production
Initial production costs often high
For information to be a ‘pure public good’:
Marginal cost of copying near-zero
Use by one user does not prevent use by others
Individuals cannot be excluded from its use
Is GI Special?
Has many of the normal information characteristics, some of the time
Different governments take different views on access to GI – some seeing it as a ‘tradeable commodity’
Some GI has the characteristics of a natural monopoly
Its widespread availability brings positive and negative externalities e.g. sharing one ‘framework’ brings many benefits though anti-competitive!
Difficult to exclude people from using GI? But licensing etc now common, even of government data
Much GI is long-lasting, changing little.
GIS and GI as Business Assets
The commercial sector now drives GIS:
Software almost all charged-for by commercial bodies
Much data now charged-for by commerce
Consultancy, etc also charged-for
GIS generates large revenues and costs
Linking data together can generate added value
But US federal government GI is free
Many voluntary groups do work for free or low cost
Some other governments expect users to pay for GI
Navigating Constraints on GIS Success
Legal constraints on GIS operations
People with right skills, attitudes, knowledge
Issues about availability, pricing, quality and ownership of GI especially common ‘geographical framework’
Risk management and GIS strategy
The Legal Framework
The law touches everything – be prepared
There is a geography of the law – it varies. ‘Commerce is global. Law, for the most part, is not’ Financial Times 12/23/99
Innovation and investment is protected through time-limited copyright or patents
Some governments (e.g. US federal government) do not impose copyright on information they create. Others do – and sometimes charge for the information
Legal areas particularly important for GIS:
Liability
Intellectual Property Rights
Information access laws
Privacy
Exploiting GI: the Big Questions
Can GI be treated as (intellectual) property? YES
Can ‘geographical facts’ be protected? USUALLY NOT IN USA
Should government GI be protected? LAWS VARY
Who owns GI when new stuff added? ALL CONTRIBUTORS
How to price GI?
BY PERCEIVED VALUE, NOT COST
EXPLOIT ECONOMIES OF SCALE AND SCOPE
DIFFERENTIATE PRODUCTS (& PRICE) FOR DIFFERENT MARKETS
SEIZE FIRST MOVER ADVANTAGE
USE PROMOTIONS, ETC
GIS/GI Skills and Education
People much more expensive than hard/software
Many GIS folk see themselves as skilled technicians
Many global similarities in technical GIS education/training
Over 2000 universities involved + private sector providers +schools
Growth of interest in professional accreditation, CPD
Graduate Certificate Programs in GIScience
National Partnerships via NSDIs
The problem: Data duplication commonplace – so waste occurs
Ad hoc data sharing has many difficulties
Data often tailored to one application
Best data often collected in greatest detail at local level but not accessible to regional or national folk
Indexes/metadata to available GI unknown until recently
No general protocols for any of this until NSDI
Many Countries Claim to have an NSDI
Australia Finland Japan South Africa
Cambodia France Malaysia Spain
Canada Germany Nicaragua Sweden
Chile Hungary Norway Switzerland
China India Philippines The Netherlands
Colombia Indonesia Poland United Kingdom
Cuba Ireland Portugal United States
Czech Republic Iceland Russia Uruguay
Denmark Israel Salvador Venezuela
Dominican Republic Italy Slovenia
What is a National Spatial Data Infrastructure?
‘the technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilization of geospatial data’
Source: Presidential Executive Order #12906 (1994): 'Co-ordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure' W Clinton.
Initial Elements of the US NSDI
Defined standards (mandated on federal agencies and encouraged for others) Minimising inconsistency
Clearing house – metadata descriptions of existing data. Advertising what is available
National geospatial data framework - a common ‘template’ on which to assemble other data
But lots of people involved…No one in Charge
Federal government (many agencies)
State government
Local government
Private sector – contractors, value-adders, exploiters
Not for profit organizations
Citizenry
Others…
Beyond the National Frontiers..
National governments own and control national mapping agencies
All such mapping produced to national specifications until recently
New private sector providers:
Produce imagery for anywhere in world
Produce road databases
How do we get everyone to work together?
Multi-National and Global Partnerships
European attempts to implement a GIS/GI policy for 450 million people in 25 countries (INSPIRE)
Permanent Committee on GIS for Asia and the Pacific (55 countries)
Potential or existing global GI:
Topographic mapping - military and ISCGM
Road guidance data by NAVTEQ and Tele-Atlas
Commercial satellites e.g. use after SE Asia tsunami
Scientific missions e.g. NASA, ESA
Global standards for GI e.g. Open Geospatial Consortium
Global Spatial Data Infrastructure
A Global Spatial Data Infrastructure?
Difficult enough to get players within any one country to work together…
Demonstrating benefit to those who face costs is a challenge.
Who are the stakeholders? Who needs it? (military doing what they need themselves?)
GSDI now focused on:
Articulating value of SDI
Fostering all SDIs – the more that exist, the better the chance of a global SDI
Promoting informed and responsible use of GI generally
Extreme Events Change Everything
First duty of government = protect its citizens
Events like 9/11, other atrocities around world and SE Asia tsunami require much use of GIS/GI
GIS/GI can aid terrorists by:
Locating ‘choke points’ or unique impact big targets
Modeling of likely effects of disruption
Defining access and escape routes
BUT is this a real danger?
Various organisations removed material from web sites after 9/11 (e.g. layout of nuclear plants and risk factors)
Geographic Impact of 9/11
Courtesy: US Department of Homeland Security, US Geological Survey and ESRI
The Rand Report
Rand’s conclusions:
Publicly accessible GI could help terrorists
But much available from so many sources that it can’t be stopped
Big cost to society of curtailing access to GI via web
Federal Government should work out how to guard sensitive GI + raise public awareness of dangers…
GIS/GI can help in Disasters by…
Contributing to:
Risk assessment
Preparedness
Mitigation
Response
Recovery
BUT someone has to be in charge. The old NSDI scheme and loose partnerships may not work..
Systems, Science, Study
GIS support and drive science
GIScience grounds successful applications in established scientific practices
‘geospatial’ and geographically enabled disciplines
GIStudies addresses how systems and science interact with society
Technology remains both a major driver of GIS and a constraint on access to it
Poor management remains the Number One reason why GIS projects fail
GIS helps resolve global problems
10 ‘Grand Challenges’ for GIS
Global data layers
The GIS profession
The GIS curriculum
Near-universal empowerment of GIS users
Global population statistics
Development of richer geo-demographic data infrastructures
The transition from geo-centered to ego-centered mapping
Supporting data models for a complete range of geographic phenomena
Combating terrorism, preserving culture
Supporting a wide range of types of geographic simulation
Issues of Implementing a GIS
Data Issues
Access to Geographic Data (disparate data & sources, data conversion, geospatial data infrastructures, format standardization).
Data ownership, copyright, & cost recovery (US & public domain of data vs. chargeable services).
Liability of data misuse (inherent data uncertainties & flawed decision)
Freedom of Info & Privacy (access vs. privacy of individuals).
Geometric Incompatibility (failure to match across data types, sources, edges, NAD 83/27, time).
DB Updating & Maintenance (degradation of data quality over time & assessment of dynamic systems and key variables).
Data quality & doc (still treated as a luxury vs. an absolute necessity).
Issues Pertaining to People
Chronic shortage of skilled specialists
Diversity of training, education, experiences
Too much weight on the use of software packages; capable of applying only a very small portion of the power of GIS technology
Pyramid model of GIS education and training (bottom to top): (a) basic spatial and computer understanding, (b) routine use of GIS technology, (c) higher level modeling applications, (d) GIS applications design and development, (e) GIS systems design, and (f) GIS research and software development.
Issues Pertaining to Technology
Evaluation of technology (formal benchmark testing).
Management of technology (performance continues to meet design specifications).
Data related technology issues (incompatibility, distributed systems and networks).
High speed transmissions
Applications Issues
Breadth & depth of applications development (allows users to focus more on the jobs that they do, i.e., using the applications software, rather than on the tools that they use, i.e., the information system).
Approach to applications development (toolbox approach – commands and scripting languages, hence proprietary applications); expensive and inefficient.
Move to a standardized applications development environment.
Integration with other types of applications (e.g., MapObjects of ESRI conforms to the programming standards of the Windows NT environment, thus allowing GIS applications and data to be linked easily to their non-geographic counterparts).
Trend of GIS Development – Internet and Multimedia Technology
Enterprise computing & GIS (integration of GIS into mainstream information technology).
DB management (e.g., GIS & CAD)
Data Modeling (ID corporate data requirements and relationships).
Applications development (interoperability – a real-time interface between different software & hardware platforms).
Enhanced network analyses
Advances in mobile computing technology (wireless, internet based data delivery, field data acquisition hardware & software).
Spatial Data Warehouses
Central repositories of all the data collected by the various information systems of an enterprise.
Unify disparate domain-specific databases without disrupting or changing them.
Interoperability and Open GIS
Incompatibility of proprietary data; use of data interchange standards to neutral formats.
Develop data storage specifications that will enable different GIS applications to access disparate data sets residing in different hardware platforms by means of standard data sharing protocols such as structured query language for interrogating relational databases.
Example is the Open GIS Consortium Inc.
National Spatial Data Infrastructure
NSDI – foster the awareness of existing geographic data resources and to encourage harmonization in geographic data characteristics.
Internet and Impacts on GIS
A cooperative system of computer networks based on the Transmission Control Protocol/Internet Protocol (TCP/IP).
Data communication facilities: file transfer protocol (FTP), telnet, electronic mail, gopher, newsgroup, world wide web (WWW), wide area information servers (WAIS).
Hypertext transfer protocol (HTTP). Hypertext Markup Language (HTML). GIS applications (Web-GIS, Internet-GIS).
Internet GIS - Functions
Static maps, dynamic maps, animations
Spatial data catalogs
Geographic information search engines (emphasis on where, e.g., city names).
Map generators (on-the-fly mapping).
Real-time map browsers (high-end map generators).
Real-time maps and images (web cams)
Interface to a GIS (real-time, graphics, text).