Anomaly Detection in GPS Data Based on Visual Analytics
Kyung Min Su
- Zicheng Liao, Yizhou Yu, and Baoquan Chen, Anomaly Detection in GPS Data Based on Visual Analytics. IEEE Conference on Visual Analytics Science and Technology, 2010
Overview
Data analysis on GPS traces of taxis
For traffic monitoring
To detect abnormal situations
Visual analytics approach: collaboration between machines and human analysts
System architecture
Feature Set
Feature Extraction
Probabilistic Models
Conditional Random Fields (CRF)
Hidden state sequence y
Z(x): normalization term
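The role of Z(x) can be illustrated with a minimal sketch. It assumes the standard linear-chain form from [2], p(y | x) = exp(score(y, x)) / Z(x), where Z(x) sums exp(score) over every possible label sequence. The label names, the discretized speed observations, and all weights below are made-up values for illustration and are not taken from the paper.

```python
import math
from itertools import product

LABELS = ("normal", "abnormal")

# Hypothetical weights: EMISSION[label][observation], TRANSITION[(prev, next)].
EMISSION = {
    "normal": {"fast": 1.0, "slow": -0.5},
    "abnormal": {"fast": -1.0, "slow": 0.8},
}
TRANSITION = {
    ("normal", "normal"): 0.5,
    ("normal", "abnormal"): -0.5,
    ("abnormal", "normal"): -0.5,
    ("abnormal", "abnormal"): 0.5,
}

def score(y, x):
    """Unnormalized log-score: weighted features summed along the chain."""
    total = sum(EMISSION[label][obs] for label, obs in zip(y, x))
    total += sum(TRANSITION[(a, b)] for a, b in zip(y, y[1:]))
    return total

def partition(x):
    """Z(x): sum of exp(score) over all label sequences (brute force)."""
    return sum(math.exp(score(y, x)) for y in product(LABELS, repeat=len(x)))

def probability(y, x):
    """p(y | x) = exp(score(y, x)) / Z(x)."""
    return math.exp(score(y, x)) / partition(x)

x = ("fast", "slow", "slow")
probs = {y: probability(y, x) for y in product(LABELS, repeat=len(x))}
best = max(probs, key=probs.get)
```

Brute-force enumeration is only feasible for very short sequences; practical implementations compute Z(x) with the forward algorithm instead.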
CRF - Training
Training: compute the model parameters (the weight vector) from labeled training data pairs {y, x}
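As a rough sketch of what training involves, the code below fits the weight vector by plain gradient ascent on the conditional log-likelihood. The gradient for each feature is its observed count in the labeled data minus its expected count under the current model. The optimizer, the feature templates, the 0/1 label encoding, the toy data, and the step size are all assumptions for illustration; the slides do not say which training procedure the authors used.

```python
import math
from collections import Counter
from itertools import product

LABELS = (0, 1)  # hypothetical encoding: 0 = normal, 1 = abnormal

def features(y, x):
    """Count emission (label, obs) and transition (prev, next) features."""
    counts = Counter(("emit", label, obs) for label, obs in zip(y, x))
    counts.update(("trans", a, b) for a, b in zip(y, y[1:]))
    return counts

def score(weights, y, x):
    return sum(weights.get(f, 0.0) * c for f, c in features(y, x).items())

def log_likelihood(weights, data):
    total = 0.0
    for y, x in data:
        log_z = math.log(sum(math.exp(score(weights, cand, x))
                             for cand in product(LABELS, repeat=len(x))))
        total += score(weights, y, x) - log_z
    return total

def gradient(weights, data):
    """Observed feature counts minus counts expected under the model."""
    grad = Counter()
    for y, x in data:
        grad.update(features(y, x))
        cands = list(product(LABELS, repeat=len(x)))
        exp_scores = [math.exp(score(weights, c, x)) for c in cands]
        z = sum(exp_scores)
        for cand, e in zip(cands, exp_scores):
            for f, c in features(cand, x).items():
                grad[f] -= (e / z) * c
    return grad

def train(data, steps=200, lr=0.05):
    weights = {}
    for _ in range(steps):
        for f, g in gradient(weights, data).items():
            weights[f] = weights.get(f, 0.0) + lr * g
    return weights

# Toy labeled pairs (y, x); observations are discretized speeds.
data = [
    ((0, 0, 1, 1), ("fast", "fast", "slow", "slow")),
    ((0, 0, 0), ("fast", "fast", "fast")),
    ((1, 1, 0), ("slow", "slow", "fast")),
]
before = log_likelihood({}, data)
weights = train(data)
after = log_likelihood(weights, data)
```

The expectation is computed by enumerating all label sequences, which keeps the sketch short but only works for tiny inputs; real CRF trainers use forward-backward and usually add regularization.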
CRF - Inference
Inference: find the most likely hidden state assignment y, i.e. the label sequence for the unlabeled input sequence x
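For a linear-chain CRF, the most likely label sequence is commonly found with the Viterbi algorithm. The sketch below is a minimal version under that assumption; the label names, observations, and weights are hypothetical and not taken from the paper.

```python
LABELS = ("normal", "abnormal")

# Hypothetical weights: EMISSION[label][observation], TRANSITION[(prev, next)].
EMISSION = {
    "normal": {"fast": 1.0, "slow": -0.5},
    "abnormal": {"fast": -1.0, "slow": 0.8},
}
TRANSITION = {
    ("normal", "normal"): 0.5,
    ("normal", "abnormal"): -0.5,
    ("abnormal", "normal"): -0.5,
    ("abnormal", "abnormal"): 0.5,
}

def viterbi(x):
    """Return the highest-scoring label sequence for observations x."""
    # best[label] = (score of the best path ending in label, that path)
    best = {lab: (EMISSION[lab][x[0]], (lab,)) for lab in LABELS}
    for obs in x[1:]:
        step = {}
        for lab in LABELS:
            # Pick the predecessor that maximizes the score of reaching lab.
            prev = max(LABELS, key=lambda p: best[p][0] + TRANSITION[(p, lab)])
            prev_score, prev_path = best[prev]
            step[lab] = (
                prev_score + TRANSITION[(prev, lab)] + EMISSION[lab][obs],
                prev_path + (lab,),
            )
        best = step
    return max(best.values(), key=lambda item: item[0])[1]

path = viterbi(("fast", "slow", "slow"))
```

Because Z(x) is the same for every candidate y, the argmax can be taken over unnormalized scores, so inference never needs the normalization term.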
Active Learning
Active learning: the learner selectively chooses which examples to label, either to reduce the amount of training data needed or to improve the generalization performance on a fixed-size training set
Criteria: uncertainty, representativeness, diversity
Uncertainty
High model uncertainty: labeling such samples helps enrich the classifier
Confidence
Uncertainty
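One simple way to turn these two quantities into a selection rule is sketched below. It assumes that confidence is the probability the model assigns to its most likely label sequence and that uncertainty is one minus that confidence; this is an illustrative definition and not necessarily the formula used in the paper. The trace names and probabilities are made up.

```python
def confidence(seq_probs):
    """Probability of the most likely label sequence (assumed definition)."""
    return max(seq_probs.values())

def uncertainty(seq_probs):
    """One minus the confidence (assumed definition)."""
    return 1.0 - confidence(seq_probs)

def most_uncertain(pool, k):
    """Names of the k unlabeled sequences the model is least sure about."""
    return sorted(pool, key=lambda name: uncertainty(pool[name]), reverse=True)[:k]

# Hypothetical posterior distributions over label sequences for three traces.
pool = {
    "trace_a": {("normal", "normal"): 0.90, ("normal", "abnormal"): 0.05,
                ("abnormal", "normal"): 0.03, ("abnormal", "abnormal"): 0.02},
    "trace_b": {("normal", "normal"): 0.40, ("normal", "abnormal"): 0.35,
                ("abnormal", "normal"): 0.15, ("abnormal", "abnormal"): 0.10},
    "trace_c": {("normal", "normal"): 0.60, ("normal", "abnormal"): 0.20,
                ("abnormal", "normal"): 0.10, ("abnormal", "abnormal"): 0.10},
}
queries = most_uncertain(pool, 2)
```

Here `trace_b` and `trace_c` would be presented to the analyst first, since the model's best guess for them carries the least probability mass.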
Representativeness
High representativeness: the sample sequence is not similar to any other
Diversity
Diversity: remove items that are redundant with respect to data items already in the training set from the previous iteration.
A candidate is kept only if its similarity score is not greater than the average pairwise similarity among all sequences currently in the training set.
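The diversity test can be sketched as a simple filter. Several details here are assumptions, since the slides do not specify them: each sequence is summarized as a fixed-length numeric feature vector, similarity is cosine similarity, and a candidate's similarity score is its maximum similarity to any item in the training set.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def average_pairwise_similarity(training):
    """Mean similarity over all pairs; needs at least two training items."""
    pairs = list(combinations(training, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

def diverse_candidates(candidates, training):
    """Keep candidates whose similarity score does not exceed the threshold."""
    threshold = average_pairwise_similarity(training)
    return [c for c in candidates
            if max(cosine(c, t) for t in training) <= threshold]

# Hypothetical feature vectors.
training = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
candidates = [(1.0, 0.1), (-1.0, 0.2)]
kept = diverse_candidates(candidates, training)
```

The first candidate is nearly identical to an existing training item and is dropped as redundant, while the second points in a direction the training set does not cover and is kept.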
Visualization and Interaction
Interaction Interface
Basic mode: raw GPS traces without any labels.
Monitoring mode: anomaly tags are shown, along with the internal CRF states of the tagged data items.
Tagging mode: the active learning module is activated. Highly uncertain labels from the CRF model are highlighted, requesting user input.
Visualizing CRF Features
CRF internal states visualization
Features and their weights (red: +, negative: -)
Summary
Anomaly detection system
Conditional Random Fields
Active Learning
Visualization and Interaction
References
[1] Zicheng Liao, Yizhou Yu, and Baoquan Chen. Anomaly Detection in GPS Data Based on Visual Analytics. IEEE Conference on Visual Analytics Science and Technology (VAST 2010), 2010.
[2] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the International Conference on Machine Learning (ICML-2001), 2001.
[3] C. T. Symons, N. F. Samatova, R. Krishnamurthy, B. H. Park, T. Umar, D. Buttler, T. Critchlow, and D. Hysom. Multi-criterion active learning in conditional random fields. In Proceedings of the 18th IEEE International Conference on Tools with Artificial Intelligence, 2006.