Post on 20-Dec-2015

Predicting Locations Using Map Similarity (PLUMS): A Framework for Spatial Data Mining

Sanjay Chawla (Vignette Corporation)

Shashi Shekhar, Weili Wu (CS, Univ. of Minnesota)

Uygar Ozesmi (Erciyes University, Turkey)

http://www.cs.umn.edu/research/shashi-group

Outline
• Motivation
• Application Domain
• Distinguishing characteristics of spatial data mining
• Problem Definition
• Spatial Statistics Approach
• Our approach: PLUMS
• Experiments, Results, Conclusion and Future Work

Motivation
• Historical Examples of Spatial Data Exploration
– Asiatic Cholera, 1855
– Theory of Gondwanaland
– Effect of fluoride on Dental Hygiene
• A potential application in news
– Tracking the West Nile Virus

Application Domain

• Wetland Management: Predicting locations of bird (red-winged blackbird) nests in wetlands
• Why did we choose this application?
– Strong spatial component
– Domain expertise
– Classical data mining techniques (logistic regression, neural nets) had already been applied

Application Domain (continued)

[Figure: four map panels: Nest Locations; Distance to Open Water; Vegetation Durability; Water Depth]

Unique characteristics of spatial data mining

Spatial Autocorrelation Property

Unique characteristics (continued)

Average Distance to Nearest Prediction (ADNP):

$$\mathrm{ADNP}(A, P) = \frac{1}{K} \sum_{k=1}^{K} d\bigl(A_k,\ A_k.\mathrm{nearest}(P)\bigr)$$
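As a rough illustration of the ADNP measure, the sketch below averages, over the K actual locations A_k, the distance d from each one to its nearest predicted location in P. The slide does not say which distance d is used, so Euclidean distance on grid coordinates is assumed here; the function name `adnp` and the sample coordinates are hypothetical.

```python
import math


def adnp(actual, predicted):
    """Average Distance to Nearest Prediction.

    actual    -- list of (x, y) coordinates of the K actual locations A_k
    predicted -- list of (x, y) coordinates of the predicted locations P
    Assumption: d is Euclidean distance; the slides leave d unspecified.
    """
    if not actual or not predicted:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for a in actual:
        # Distance from this actual location to its nearest prediction.
        total += min(math.dist(a, p) for p in predicted)
    return total / len(actual)


# Hypothetical grid coordinates, for illustration only.
actual_nests = [(0, 0), (3, 4)]
predicted_nests = [(0, 1), (3, 0)]
score = adnp(actual_nests, predicted_nests)
```

A lower value means the predictions fall closer to the actual locations, and a prediction map identical to the actual map scores 0. Note that the measure is not symmetric: a model that predicts every cell also scores 0, which is presumably why the objective combines it with classification accuracy rather than using it alone.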

Location Prediction: Problem Formulation

• Given:
– A spatial framework S
– Explanatory functions $f_{X_k} : S \to R^{k}$
– A dependent function $f_Y : S \to R^{Y} = \{0, 1\}$
– A family F of learning model functions mapping $R^{k} \times \cdots \times R^{k} \to R^{Y}$
• Find: an element $\hat{f}_Y \in F$
• Objective: maximize (map_similarity = classification_accuracy + spatial_accuracy)
• Constraints: spatial autocorrelation exists

Spatial Statistics Approach

1. $y = X\beta + \epsilon$
2. $y = \rho W y + X\beta + \epsilon$

Logistic Regression: $\Pr(y = 1) = \dfrac{e^{X\beta}}{1 + e^{X\beta}}$
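To make the difference between the two models concrete, here is a minimal sketch assuming the usual forms: logistic regression with Pr(y = 1) = e^{Xβ} / (1 + e^{Xβ}), and the spatial autoregressive model y = ρWy + Xβ + ε, in which each site's value also depends on its neighbours through the weight matrix W. The function names, the three-site example, and the fixed-point solver are illustrative assumptions, not the authors' implementation.

```python
import math


def logistic_prob(x_row, beta):
    """Pr(y = 1) for one site: e^z / (1 + e^z) with z = x . beta."""
    z = sum(x * b for x, b in zip(x_row, beta))
    return 1.0 / (1.0 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)


def spatial_lag(W, y):
    """The spatial lag W y: a weighted average of neighbouring values."""
    return [sum(w * yj for w, yj in zip(row, y)) for row in W]


def sar_solve(W, X, beta, rho, eps=None, iters=200):
    """Solve y = rho * W y + X beta + eps by fixed-point iteration.

    Assumption: W is row-normalized and |rho| < 1, so the iteration converges.
    """
    n = len(X)
    eps = eps or [0.0] * n
    base = [sum(x * b for x, b in zip(X[i], beta)) + eps[i] for i in range(n)]
    y = base[:]
    for _ in range(iters):
        lag = spatial_lag(W, y)
        y = [rho * lag[i] + base[i] for i in range(n)]
    return y


# Hypothetical example: three sites on a line, each row of W sums to 1.
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
X = [[1.0], [1.0], [1.0]]
y = sar_solve(W, X, beta=[1.0], rho=0.5)
```

With ρ = 0 the spatial term vanishes and the model reduces to the classical regression, which is one way to see that the classical model ignores spatial autocorrelation entirely.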

Spatial Stat: Solution Techniques

• Least Squares Estimation: biased and inconsistent

• Maximum Likelihood: involves computation of a large determinant (from W)

• Bayesian: Markov Chain Monte Carlo (e.g. Gibbs Sampling)
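The cost of the maximum likelihood route can be seen in a small sketch. For a Gaussian spatial autoregressive model, the log-likelihood contains the term ln|I − ρW|, which has to be re-evaluated for every candidate ρ. The code below computes it by plain Gaussian elimination with partial pivoting; the function names are hypothetical, and real implementations typically exploit the sparsity of W rather than using this dense approach.

```python
import math


def log_abs_det(A):
    """ln|det(A)| via Gaussian elimination with partial pivoting, O(n^3)."""
    n = len(A)
    M = [row[:] for row in A]  # work on a copy
    log_det = 0.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            return float("-inf")  # singular matrix
        M[col], M[pivot] = M[pivot], M[col]
        log_det += math.log(abs(M[col][col]))
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return log_det


def sar_log_jacobian(W, rho):
    """ln|I - rho W|, the determinant term of the SAR log-likelihood."""
    n = len(W)
    A = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)]
         for i in range(n)]
    return log_abs_det(A)


# Hypothetical two-site example: |I - 0.5 W| = 1 - 0.25 = 0.75.
W2 = [[0.0, 1.0],
      [1.0, 0.0]]
term = sar_log_jacobian(W2, 0.5)
```

Because W has one row and one column per site, the dense computation grows cubically with the number of sites, which is the bottleneck the bullet above refers to.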

Our Approach

Experiment Setup

Result(1)

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
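For reference, the two rates are computed from a confusion matrix as TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The sketch below assumes binary labels (1 = nest, 0 = no nest) given per cell; the function name and the sample labels are hypothetical.

```python
def roc_rates(actual, predicted):
    """Return (TPR, FPR) for two equal-length lists of 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    # Guard against empty classes to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical per-cell labels: TP = 2, FN = 1, FP = 1, TN = 2.
actual_cells = [1, 1, 0, 0, 1, 0]
predicted_cells = [1, 0, 0, 1, 1, 0]
tpr, fpr = roc_rates(actual_cells, predicted_cells)
```

Sweeping the decision threshold of a probabilistic model and plotting the resulting (FPR, TPR) pairs gives an ROC curve.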

Result(2)

Conclusion and Future work

• PLUMS >> Classical Data Mining techniques

• PLUMS ≈ State-of-the-art Spatial Statistics approaches

• Better performance (two orders of magnitude)

• Try other configurations of the PLUMS framework and formalize!
