1
Continuous k-Dominant Skyline Query Processing
Presented by Prasad Sriram and Nilu Thakur
Post on 20-Dec-2015
3
Example Skyline
Finding the skyline of hotels: lower price and closer to the beach
Which one is better, e or b? (e, because it beats b on both price and distance)
c or f?
[Figure: hotels a to f plotted by Distance (ticks 1 to 4) against Price (ticks 50 to 200)]
4
Problem Definition
Input: A set of points p1, p2, …, pn
Output: A set of points P (referred to as the skyline points), such that any point pi ∈ P is not dominated by any other point in the data set
Objective:
Provide correct and complete results
Minimize the query response time and memory consumption
Continuous queries require continuous evaluation
Scalability in terms of the number of queries
Constraints: Minimize the number of dominance checks
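The output definition above can be illustrated with a minimal sketch. It assumes that smaller values are better in every dimension, as in the hotel example. The (distance, price) values are hypothetical, since the transcript does not preserve the plotted coordinates; they are chosen only so that e dominates b, as the hotel slide states. The names `dominates` and `skyline` are illustrative.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Dict[str, Point]) -> List[str]:
    """Names of the points that no other point dominates (quadratic scan)."""
    return [
        name
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    ]


# Hypothetical (distance, price) pairs, not taken from the slides.
hotels = {
    "a": (1.0, 200.0),
    "b": (3.0, 150.0),
    "e": (2.0, 100.0),
    "f": (4.0, 50.0),
}

print(skyline(hotels))  # ['a', 'e', 'f']
```

This pairwise scan performs a quadratic number of dominance checks, which is exactly the cost the constraint above asks to minimize.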
5
Skyline Properties (1/2)
Meaningful for incomparable dimensions
Browsing Laptops: Price, weight, size, memory, etc.
Insensitive to scaling and shifting of the dimensions
Skyline - Curse of Dimensionality
Movie Rating
Different users may have different rating preferences
Movie p better than q only if p rated higher or equal to q by all users
One outlier opinion will invalidate the dominance
6
Skyline Properties (2/2)
Too many skyline points in high-dimensional spaces
Example: NBA data set, 17000 player season statistics on 17 attributes
Over 1000 skyline points in the full space
Some average-skilled players are in the skyline if they are not bad on some attributes
Possible Solutions
Dimension Reduction Techniques - Requires domain knowledge
Subspace Skylines - Many subspaces need to be explored
Relax the notion of d-dominance - k-dominance
7
k-dominant Skyline
k-Dominate
If A is not worse than B on k dimensions, and better on at least one of the k dimensions, we say A k-dominates B.
k-Dominant Skyline
The k-dominant skyline contains all the points that cannot be k-dominated by any other point
k-Dominant Skyline Query
Given a data set, find the k-dominant skyline
When k=d, we have the conventional skyline
k-dominance is cyclic, unlike d-dominance
Slide Courtesy [2]
8
k-dominant Skyline - Example
d1 d2 d3 d4 d5 d6
p1 2 2 2 4 4 4
p2 4 4 4 2 2 2
p3 3 3 3 5 3 3
p4 4 4 4 3 3 3
p5 5 5 5 1 5 5
conventional skyline
5-dominant skyline
4-dominant skyline
Smaller k, smaller k-dominant skyline
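As a rough check of the example table, the k-dominance definition can be sketched as follows. The sketch assumes that smaller values are better in every dimension, which the slide does not state explicitly. The function names `k_dominates` and `k_dominant_skyline` are illustrative and do not come from the slides or from [2].

```python
from typing import Dict, List, Tuple

Point = Tuple[int, ...]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """True if p is no worse than q on at least k dimensions and strictly
    better on at least one of them (assumption: smaller is better)."""
    no_worse = sum(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    # Any strictly-better dimension is also a no-worse dimension, so it can
    # always be included among the k chosen dimensions.
    return no_worse >= k and strictly_better


def k_dominant_skyline(points: Dict[str, Point], k: int) -> List[str]:
    """Names of the points that no other point k-dominates."""
    return [
        name
        for name, p in points.items()
        if not any(k_dominates(q, p, k) for other, q in points.items() if other != name)
    ]


# The five points from the example table (d1..d6).
data = {
    "p1": (2, 2, 2, 4, 4, 4),
    "p2": (4, 4, 4, 2, 2, 2),
    "p3": (3, 3, 3, 5, 3, 3),
    "p4": (4, 4, 4, 3, 3, 3),
    "p5": (5, 5, 5, 1, 5, 5),
}

for k in (6, 5, 4, 3):
    print(k, k_dominant_skyline(data, k))
# 6 ['p1', 'p2', 'p3', 'p5']   (k = d: conventional skyline)
# 5 ['p1', 'p2', 'p3']
# 4 ['p1', 'p2']
# 3 []
```

Under this assumption the result shrinks as k decreases, and at k = 3 it is empty because p1 and p2 3-dominate each other.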
9
Cyclic Properties of k-dominance
k-dominance can be cyclic
A 3-dominates B
d1 d2 d3 d4
A 5 5 5 5
B 1 6 6 6
C 2 1 7 7
D 3 2 1 8
10
Cyclic Properties of k-dominance
B 3-dominates C
d1 d2 d3 d4
A 5 5 5 5
B 1 6 6 6
C 2 1 7 7
D 3 2 1 8
11
Cyclic Properties of k-dominance
C 3-dominates D
d1 d2 d3 d4
A 5 5 5 5
B 1 6 6 6
C 2 1 7 7
D 3 2 1 8
12
Cyclic Properties of k-dominance
D 3-dominates A
d1 d2 d3 d4
A 5 5 5 5
B 1 6 6 6
C 2 1 7 7
D 3 2 1 8
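The cycle shown across these slides can be verified with a short sketch. It assumes that smaller values are better; `k_dominates` is an illustrative helper, not a function from the slides.

```python
from typing import Tuple

Point = Tuple[int, ...]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """p is no worse than q on at least k dimensions and strictly better
    on at least one of them (assumption: smaller is better)."""
    no_worse = sum(a <= b for a, b in zip(p, q))
    return no_worse >= k and any(a < b for a, b in zip(p, q))


# The four points from the slides (d1..d4).
A = (5, 5, 5, 5)
B = (1, 6, 6, 6)
C = (2, 1, 7, 7)
D = (3, 2, 1, 8)

# Each point 3-dominates the next one, and the last 3-dominates the first.
cycle = [("A", A, "B", B), ("B", B, "C", C), ("C", C, "D", D), ("D", D, "A", A)]
for left_name, left, right_name, right in cycle:
    print(left_name, "3-dominates", right_name, ":", k_dominates(left, right, 3))

# With k = d = 4 (conventional dominance) none of these pairs holds.
print(any(k_dominates(left, right, 4) for _, left, _, right in cycle))  # False
```

Because every point in the cycle is 3-dominated by another one, the 3-dominant skyline of these four points is empty, and transitivity-based pruning from conventional skyline algorithms cannot be applied directly.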
14
A naïve approach
Case 1: A new point arrives
It is k-dominated by some points
It k-dominates some points
Case 2: A point expires
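The transcript does not spell out the steps of the naïve approach. One possible baseline, sketched here purely as an assumption, keeps every point of a time-based sliding window and recomputes the k-dominant skyline from scratch whenever a point arrives or expires. The class name `NaiveWindow`, the window length, and the arrival times are all hypothetical; the coordinates reuse the A to D points from the cyclic-dominance slides, with smaller values taken as better.

```python
from collections import deque
from typing import Deque, List, Tuple

Point = Tuple[int, ...]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """p is no worse than q on at least k dimensions and strictly better
    on at least one of them (assumption: smaller is better)."""
    no_worse = sum(a <= b for a, b in zip(p, q))
    return no_worse >= k and any(a < b for a, b in zip(p, q))


class NaiveWindow:
    """Hypothetical baseline: full recomputation on every window change."""

    def __init__(self, k: int, window: int) -> None:
        self.k = k
        self.window = window
        self.points: Deque[Tuple[int, str, Point]] = deque()  # (arrival, name, coords)

    def _expire(self, now: int) -> None:
        # Case 2: drop every point whose lifetime has ended.
        while self.points and self.points[0][0] + self.window <= now:
            self.points.popleft()

    def insert(self, now: int, name: str, coords: Point) -> List[str]:
        # Case 1: a new point arrives; it may be k-dominated and may
        # k-dominate others, so every pair is checked again.
        self._expire(now)
        self.points.append((now, name, coords))
        return self.skyline()

    def skyline(self) -> List[str]:
        pts = list(self.points)
        return [
            n for _, n, p in pts
            if not any(k_dominates(q, p, self.k) for _, m, q in pts if m != n)
        ]


w = NaiveWindow(k=3, window=3)
results = [
    w.insert(1, "A", (5, 5, 5, 5)),
    w.insert(2, "B", (1, 6, 6, 6)),
    w.insert(3, "C", (2, 1, 7, 7)),
    w.insert(4, "D", (3, 2, 1, 8)),  # A has expired by now
]
print(results)  # [['A'], ['A'], ['A'], ['B']]
```

In this run B re-enters the result once A expires, which is why expired points cannot simply be dropped without re-examining the points they used to k-dominate.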
16
An improved approach
Skyline heap (level order): a(1) | b(3) c(5) | d(7) e(9) f(11) g(13)
Non-Skyline heap:
h(15)
h(26)
a 16 DIS
b 18 DIS
c 20 DIS
d 22 DIS
e 24 DIS
f 26 DIS
g 28 DIS
h 26 RET
17
An improved approach
Skyline heap (level order): b(3) | d(7) c(5) | e(9) f(11) g(13)
Non-Skyline heap:
h(26)
b 18 DIS
c 20 DIS
d 22 DIS
e 24 DIS
f 26 DIS
g 28 DIS
h 26 RET
at t = 16
18
An improved approach
Skyline heap (level order): b(3) | d(7) c(5) | e(9) f(11) g(13)
Non-Skyline heap:
h(26)
b 18 DIS
c 20 DIS
d 22 DIS
e 24 DIS
f 26 DIS
g 28 DIS
i 20 RET
i(17)
i(20)
19
An improved approach
Skyline heap (level order): c(5) | d(7) f(11) | e(9) g(13)
Non-Skyline heap:
i(20)
c 20 DIS
d 22 DIS
e 24 DIS
f 26 DIS
g 28 DIS
i 20 RET
at t = 18
20
An improved approach
Skyline heap (level order): c(5) | d(7) f(11) | e(9) g(13)
Non-Skyline heap:
i(20)
c 20 DIS
d 22 DIS
e 24 DIS
f 26 DIS
g 28 DIS
i 20 RET
j(19)
21
An improved approach
Skyline heap (level order): c(5) | d(7) f(11) | e(9) g(13)
Non-Skyline heap:
i(20)
c 20 DIS
d 22 DIS
e 24 DIS
f 26 DIS
g 28 DIS
i 20 RET
j 32 RET
j(32)
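The expiry side of this walkthrough can be sketched with a min-heap keyed by expiry time. This is not the presenters' implementation. It assumes a window length of 15, inferred only from a(1) being listed with time 16, and it leaves out the non-skyline heap and the DIS/RET bookkeeping, whose exact rules the transcript does not preserve. The function name `expire` is illustrative.

```python
import heapq
from typing import List, Tuple

WINDOW = 15  # assumption: inferred from a(1) appearing with time 16

# Arrival times of the skyline-heap points as shown on the slides.
arrivals = {"a": 1, "b": 3, "c": 5, "d": 7, "e": 9, "f": 11, "g": 13}

# Min-heap of (expiry time, name); the earliest-expiring point is on top.
heap: List[Tuple[int, str]] = [(t + WINDOW, name) for name, t in arrivals.items()]
heapq.heapify(heap)


def expire(h: List[Tuple[int, str]], now: int) -> List[str]:
    """Pop and return every point whose expiry time is <= now."""
    expired = []
    while h and h[0][0] <= now:
        expired.append(heapq.heappop(h)[1])
    return expired


first = expire(heap, 16)
top_after_16 = heap[0]
second = expire(heap, 18)
top_after_18 = heap[0]

print(first, top_after_16)   # ['a'] (18, 'b')
print(second, top_after_18)  # ['b'] (20, 'c')
```

Keeping the earliest expiry on top means an expiry check only inspects the root, rather than scanning every point in the window.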
22
Validations
Methodology
Theorem-based proving for correctness and completeness
Experiments to analyze performance
Validation criteria
Query response time
23
Experimental Analysis
[Figure: response time in milliseconds (axis ticks 4500 to 5050) against k-dominance checks (axis ticks 1 to 29), with one series for the Improved Approach and one for the Naïve Approach]
24
Rewrite today
Improvements
A better technique for k-dominance
Conduct detailed experiments with network object generators
Think about how to find (spatial) skyline in road networks
25
References
1. Yufei Tao, Dimitris Papadias: Maintaining Sliding Window Skylines on Data Streams. IEEE Trans. Knowl. Data Eng. 18(2): 377-391 (2006).
2. Chee Yong Chan, H. V. Jagadish, Kian-Lee Tan, Anthony K. H. Tung, Zhenjie Zhang: Finding k-dominant skylines in high dimensional space. SIGMOD Conference 2006: 503-514.
3. M. Sharifzadeh, C. Shahabi: The Spatial Skyline Queries. In Proceedings of VLDB'06.
4. Michael D. Morse, Jignesh M. Patel, William I. Grosky: Efficient Continuous Skyline Computation. ICDE 2006: 108.
5. Zhiyong Huang, Hua Lu, Beng Chin Ooi, Anthony K. H. Tung: Continuous Skyline Queries for Moving Objects. IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 12, pp. 1645-1658, Dec. 2006.
6. S. Borzsonyi, D. Kossmann, K. Stocker: The Skyline Operator. In Proceedings of ICDE'01.
7. D. Kossmann, F. Ramsak, S. Rost: Shooting Stars in the Sky: An Online Algorithm for Skyline Queries. In Proceedings of VLDB'02.