Robust Geotag Generation Algorithms from Noisy Loran Data
for Security Applications
Di Qiu, Sherman Lo, Dan Boneh, Per Enge
Stanford University
ILA 2009
Sponsored by FAA Loran Program CRDA 2000-G-028
Geo-security: Why?
Doesn’t require memorization
• Users don’t have to memorize location-dependent signal characteristics.
Resistant to misplaced tokens
• Users won’t forget to bring location.
Can’t be delegated
• Users can’t lend location to someone else.
Low awareness
Don’t need physical access to attack
Geo-security: Big picture
Use location-dependent signal characteristics from multiple transmitters.
Restrict access to information content or electronic devices.
Geo-security: How?
[Diagram: Calibration: Receiver → Geotag generation → Database. Verification: Receiver → Geotag generation → Matching against the database → Application (grant/deny?).]
Example geotag: 0111000…
Example database entries: Location 1 = 011…, Location 2 = 100…, Location 3 = 101…
Desired properties:
1. Hard to forge
2. Reproducible
Geotag applications
Loopt
• Social networking
Digital Manners Policy
• Microsoft
Data access control
• Geo-encryption
Geo-fencing
• Laptop anti-theft technology
[Diagram: User A (xA), User B (xB), central server, SMS.]
Performance standards
Two hypotheses
• H0: accepting as authentic user
• H1: rejecting as an attacker
Two errors
• False reject: accepting hypothesis H1 when H0 is true
• False accept: accepting hypothesis H0 when H1 is true
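The two error rates above can be estimated by counting match outcomes. The following is a minimal sketch, assuming each trial is recorded as a boolean match result; the function name `error_rates` and the sample outcomes are hypothetical and not taken from the slides.

```python
def error_rates(genuine_matches, attacker_matches):
    """Return (false reject rate, false accept rate).

    genuine_matches: match outcomes (True = accepted) for trials where H0 is true.
    attacker_matches: match outcomes (True = accepted) for trials where H1 is true.
    """
    # False reject: an authentic user's geotag failed to match.
    frr = sum(1 for m in genuine_matches if not m) / len(genuine_matches)
    # False accept: an attacker's geotag matched.
    far = sum(1 for m in attacker_matches if m) / len(attacker_matches)
    return frr, far


# Hypothetical outcomes, for illustration only.
genuine = [True, True, False, True]
attacker = [False, False, True, False, False]
frr, far = error_rates(genuine, attacker)
print(f"FRR = {frr:.2f}, FAR = {far:.2f}")
```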
Geotag generation
Quantization-based
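Fuzzy extractor-based
Pattern classification-based
• k-Nearest Neighbor (kNN)
• Support Vector Machines (SVM)

Quantization-based tag and matching function:
$$T(i) = k; \quad x_i \in S_k = [k\Delta_i, (k+1)\Delta_i)$$
$$M(\tilde{T}, T') = \begin{cases} 1 & \text{if } \frac{1}{n}\sum_{i=1}^{n} \tilde{T}(i) \oplus T'(i) = 1 \\ 0 & \text{otherwise.} \end{cases}$$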
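Quantization-based geotag generation can be sketched in a few lines. This is an illustration only: it assumes each location parameter is mapped to the index of its quantization bin and that a match requires every quantized component to agree. The step sizes and parameter values are made up, and the helper names are hypothetical.

```python
import math


def quantize_geotag(x, steps):
    """Map each parameter x_i to the index k of its bin [k*step_i, (k+1)*step_i)."""
    return [math.floor(xi / si) for xi, si in zip(x, steps)]


def match(tag_cal, tag_ver):
    """Return 1 if all quantized components agree, otherwise 0 (assumed rule)."""
    return 1 if tag_cal == tag_ver else 0


# Hypothetical step sizes and measurements, for illustration only.
steps = [10.0, 0.5, 2.0]
x_cal = [123.4, 1.26, 17.9]   # calibration measurement
x_ver = [125.1, 1.31, 17.2]   # same site, small noise
x_far = [131.0, 1.26, 17.9]   # first parameter crosses a bin boundary

tag_cal = quantize_geotag(x_cal, steps)
print(tag_cal)
print(match(tag_cal, quantize_geotag(x_ver, steps)))
print(match(tag_cal, quantize_geotag(x_far, steps)))
```

A larger step tolerates more noise at the same site, but it also makes nearby sites more likely to share a tag.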
Quantization-based geotag reproducibility
Tamper-resistant device and self-authenticated signals for spoofing attacks.
Temporal variation degrades performance.
Parameters (TD, ECD, SNR) from GRI 9940.
Non-monotonic trend comes from quantization.
[Figure: Reproducibility of Loran Tag. False reject rate (log scale, 10^-2 to 10^0) versus quantization steps (2σ to 8σ).]
Fuzzy extractor
[Diagram: Generation takes x and outputs T and P; Reproduce takes x’ and P and outputs T’.]
Definition. A fuzzy extractor is a tuple (M, t0, Gen, Rep), where M is the metric space with a distance function dis, Gen is a generate procedure and Rep is a reproduce procedure, which has the following properties:
1. If dis(x, x’) ≤ t0, T’ = T.
2. If dis(x, x’) > t0, T’ ≠ T.
Y. Dodis et al., “Fuzzy extractors: How to generate strong keys from biometrics and other noisy data,” 2004.
Three fuzzy extractors for location data
Euclidean metric fuzzy extractor
• Noise, bias, and quantization errors
• Adjust offsets
• Real numbers
Hamming metric fuzzy extractors
• Constructions
░ Reed-Solomon based fuzzy extractor
░ Secret Sharing based fuzzy extractor
• Offline transmitter
• Inputs are integers: quantized values of location parameters
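The "adjust offsets" idea behind a Euclidean metric fuzzy extractor can be illustrated with a simple offset-to-bin-center construction. This is a sketch under assumptions, not necessarily the construction used by the authors: `gen` stores a public offset that moves each calibration value to the center of its quantization bin, and `rep` applies that offset before quantizing a later measurement. All names and numbers are hypothetical.

```python
import math


def gen(x, step):
    """Generate: return (tag T, public offsets P) from calibration values x."""
    tag, offsets = [], []
    for xi in x:
        k = math.floor(xi / step)
        center = (k + 0.5) * step
        tag.append(k)
        offsets.append(center - xi)  # shift that moves x_i to the bin center
    return tag, offsets


def rep(x_new, offsets, step):
    """Reproduce: apply the stored offsets, then quantize."""
    return [math.floor((xi + pi) / step) for xi, pi in zip(x_new, offsets)]


# Hypothetical values: both calibration points sit close to a bin edge.
step = 10.0
x_cal = [19.8, 40.3]
x_ver = [20.9, 38.0]  # later measurement with noise or bias

tag, offsets = gen(x_cal, step)
naive = [math.floor(xi / step) for xi in x_ver]  # plain quantization
print(tag, naive, rep(x_ver, offsets, step))
```

In this sketch the tag is reproduced whenever each component moves by less than half a step, whereas plain quantization fails as soon as a value near a bin edge crosses it.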
Performance of Euclidean fuzzy extractor
[Figure: Euclidean Metric Fuzzy Extractor Performance. False reject rate (log scale, 10^-2 to 10^0) versus quantization steps (2σ to 8σ), without and with fuzzy extractor. Annotation: 84% reduction.]
RS-based fuzzy extractor – “Locking”
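[Diagram: Receiver output x (continuous) is quantized to qx. A random generator produces T, which is RS-encoded to c = (c1, c2, c3, …). Mapping matrix generation builds a matrix over quantization levels 1, 2, 3, 4, …, n that relates the quantized values q1, q2, q3, … to the codeword symbols c1, c2, c3, ….]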
RS-based fuzzy extractor – “Unlocking”
[Diagram: Receiver output x’ (continuous) is quantized to q’x. The quantized values select symbols from the mapping matrix (c1, c2, c’3, …, cn), which are RS-decoded to T’.]
T’=T if dis(q’x, qx) ≤ t, t is a design parameter
Performance analysis: RS-based
$$\Pr\{\text{error or decode failure}\} = \sum_{j=t+1}^{n} \binom{n}{j} p^{j} (1-p)^{n-j}$$
p: symbol error probability
t ≤ 2
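The failure probability for the RS-based extractor is a binomial tail: decoding fails when more than t of the n symbols are in error. A minimal sketch, assuming independent symbol errors with probability p; the function name and the sample numbers are hypothetical.

```python
from math import comb


def decode_failure_prob(n, t, p):
    """Probability that more than t of n symbols are in error."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(t + 1, n + 1))


# Hypothetical numbers: 8 symbols, 5% symbol error probability.
for t in (0, 1, 2):
    print(t, decode_failure_prob(8, t, 0.05))
```

Raising t lowers the failure probability for a fixed p, at the cost of accepting tags that differ in more symbols.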
Pattern classification
Goal: Extract decision rules from data to assign class labels to future data samples.
• Maximize the difference between classes
• Minimize the within-class scatter
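One common way to put a number on these two goals is the Fisher discriminant ratio: squared distance between class means divided by the summed within-class variances. The slides do not name a specific criterion, so this is an assumed illustration for a single feature and two classes, with made-up sample values.

```python
from statistics import mean, pvariance


def fisher_ratio(class_a, class_b):
    """Between-class separation divided by within-class scatter (one feature)."""
    between = (mean(class_a) - mean(class_b)) ** 2
    within = pvariance(class_a) + pvariance(class_b)
    return between / within


# Hypothetical feature samples from two locations.
good = ([1.0, 1.1, 0.9], [3.0, 3.1, 2.9])   # well separated, tight clusters
bad = ([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])    # overlapping, wide clusters

print(fisher_ratio(*good), fisher_ratio(*bad))
```

A higher ratio indicates a feature that separates the two locations more cleanly.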
The quality of a feature vector is essential to spatial discrimination/decorrelation.
[Figure: examples of “good” features and “bad” features.]
[Figure: feature properties: linear separability, non-linear separability, highly correlated, multi-modal.]
Dimensionality reduction
“Curse of dimensionality” – high-dimensional data are difficult to work with.
• The added parameters or features can increase noise
• More observations are needed to get good estimates
Dimensionality reduction
• Efficiency – computation and storage costs
• Classification performance
• Ease of modeling
Geotag generation using pattern classification
[Diagram: Data collection → Signal processing → Dimensionality reduction → Classification, with model selection. Calibration: T = f(ω) is stored in the geotag database. Verification: T’ = f(ω’) is matched against T = f(ω) for decision making.]
K-Nearest Neighbor (kNN)
‘Memory’ based classification
No training phase is required: ‘lazy’ learning approach
Given data point x, find the k nearest training inputs x1, x2,…, xk to x using a distance metric.
Large k produces smoother boundaries and reduces the impact of noise.
Computational cost
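The kNN rule described above can be sketched with the standard library. This is a minimal illustration that assumes Euclidean distance and a majority vote among the k nearest training points; the training data and labels are hypothetical.

```python
import math
from collections import Counter


def knn_classify(x, training, k):
    """Return the majority label among the k training points nearest to x.

    training: list of (feature_vector, label) pairs.
    """
    # Every query scans all stored training points, which is the cost of
    # a memory-based method with no training phase.
    nearest = sorted(training, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training data from two locations.
training = [
    ((0.0, 0.0), "A"), ((0.5, 0.2), "A"), ((0.1, 0.6), "A"),
    ((5.0, 5.0), "B"), ((5.5, 4.8), "B"), ((4.9, 5.3), "B"),
]

print(knn_classify((0.3, 0.3), training, k=3))
print(knn_classify((5.1, 5.1), training, k=3))
```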
Visualization of kNN, k=8
Easy to implement but computationally intensive.
Euclidean distance metric
[Figure: KNN, k=8. Class 1, Class 2, and Class 3 plotted in the x1–x2 plane (x1 from -30 to 50, x2 from -10 to 10).]
Support Vector Machines (SVM)
Optimal separating hyperplane – find a hyperplane with minimum misclassification rate
Non-linear SVM
• Perform a non-linear mapping of the feature vector x onto a high-dimensional space
• Construct an optimal separating hyperplane in the high-dimensional space
Tradeoff – margin and capacity
[Diagram: x → ϕ(x) → z → wTz → y]
[Figure: large-margin and small-margin separating hyperplanes in the x1–x2 plane.]
Visualization of SVM
[Figure: two panels, “SVM, Kernel argument=5” and “SVM, Kernel argument=1”, each showing Class 1, Class 2, and Class 3 in the x1–x2 plane (x1 from -30 to 50, x2 from -10 to 10).]
Sequential Minimal Optimization (SMO) is applied to solve the optimization problem.
Large kernel argument reduces misclassification errors but lowers discrimination ability.
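The effect of the kernel argument can be illustrated with a Gaussian (RBF) kernel. The slides do not state which kernel is used, so this sketch assumes an RBF kernel whose width `sigma` plays the role of the kernel argument; the function name and sample points are hypothetical.

```python
import math


def rbf_kernel(a, b, sigma):
    """Gaussian kernel exp(-||a - b||^2 / (2 * sigma^2)); sigma is the assumed kernel argument."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma**2))


# Two hypothetical feature vectors three units apart.
a, b = (0.0, 0.0), (3.0, 0.0)
for sigma in (1.0, 5.0):
    print(sigma, rbf_kernel(a, b, sigma))
```

Under this assumption, a wider kernel makes points from different locations look more alike, which is consistent with the slide's note that a large kernel argument lowers discrimination ability.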
Data set to evaluate spatial discrimination
Classifier visualization
[Figure: two panels, “SVM” and “KNN, k=8”, each showing Class 1 through Class 13 in the x1–x2 plane (x1 from -0.2 to 0.3, x2 from -0.4 to 0.5).]
Spatial discrimination comparison
[Figure: Performance Comparison. False accept rate (FAR), 0 to 1, versus separation [m], 0 to 30, for SVM, kNN (k=8), and quantization. Annotation: 85% reduction.]
Conclusion
Modeled geo-security and standardized the performance evaluation.
• Geotag reproducibility and spatial discrimination
Developed fuzzy extractors to reduce continuity risks.
• Euclidean metric for noise, seasonal bias, and quantization error
• Hamming metric for offline transmitters
Applied pattern classification for geotag generation to improve spatial discrimination.
• kNN and SVM
Acknowledgment
The authors would like to thank Mitch Narins of the FAA, Dr. Greg Johnson and Ruslan Shalaev at Alion Science & Technology and USCG LSU for their support and help during the research.
Thank you!
Questions?
Backup slides
Geotag reproducibility
90-day seasonal monitor data
Same hypothesis problem
Tradeoff by varying kernel argument
[Figure: SVM. False reject rate (FRR), 0 to 0.9, versus kernel argument, 0 to 1.]