Quotient Cube: How to Summarize the Semantics of a Data Cube

Laks V.S. Lakshmanan (Univ. of British Columbia)*

Jian Pei (State Univ. of New York at Buffalo)*

Jiawei Han (Univ. of Illinois at Urbana-Champaign)+

* The work is partially supported by NSERC and NCE/IRIS.

+ The work is partially supported by NSF, UI, and Microsoft Research.

Outline

• Introduction and motivation

• Cube lattice partitions

• Semantics preserving partitions

• Algorithms

• Experimental results

• Discussion and summary

Data Cube

Base table

Dimensions                Measure
Store  Product  Season    Sales
S1     P1       Spring    6
S1     P2       Spring    12
S2     P1       Fall      9

Aggregation yields the data cube:

Dimensions                Measure
Store  Product  Season    AVG(Sales)
S1     P1       Spring    6
S1     P2       Spring    12
S2     P1       Fall      9
S1     *        Spring    9
…      …        …         …
*      *        *         9
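The aggregation step on this slide can be sketched in Python. This is a minimal illustration, assuming the three-row base table above and AVG as the measure; the names `BASE` and `build_cube` are made up for this sketch and do not come from the talk.

```python
from itertools import product
from statistics import mean

# Base table from the slide: (Store, Product, Season) -> Sales
BASE = [
    (("S1", "P1", "Spring"), 6),
    (("S1", "P2", "Spring"), 12),
    (("S2", "P1", "Fall"), 9),
]

def build_cube(base, agg):
    """Aggregate every non-empty cell obtained by replacing dimensions with '*'."""
    groups = {}
    for dims, sales in base:
        # Each subset of the dimensions may be generalized to '*'.
        for mask in product([False, True], repeat=len(dims)):
            cell = tuple("*" if star else d for d, star in zip(dims, mask))
            groups.setdefault(cell, []).append(sales)
    return {cell: agg(values) for cell, values in groups.items()}

CUBE = build_cube(BASE, mean)
for cell in [("S1", "*", "Spring"), ("*", "P1", "*"), ("*", "*", "*")]:
    print(cell, CUBE[cell])
```

Enumerating every generalization of every tuple is exponential in the number of dimensions, so this only illustrates what the cube contains, not how the cited algorithms compute it.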

Previous Work: Efficient Cube Computation

• Compute a cube from a base table: e.g., (Agarwal et al. 96), (Zhao et al. 97)

• View materialization with space constraints: e.g., (Harinarayan et al. 96)

• Handling sparsity: (Ross & Srivastava 97)

• Cube compression: e.g., (Sismanis et al. 02), (Shanmugasundaram et al. 99), (Wang et al. 02)

• Approximation: e.g., (Barbara & Sullivan 97), (Barbara & Wu 00), (Vitter et al. 98)

• Constrained cube construction: e.g., (Beyer & Ramakrishnan 99)

Previous Work: Extracting Semantics From Cubes

• General contexts of patterns (Sathe & Sarawagi 01)

• Generalizing association rules (Imielinski et al. 00)

• Cube gradient analysis (Dong et al. 01)

Cube (Cell) Lattice

• Many cells have the same aggregate value

• Can we summarize the semantics of the cube by grouping cells by aggregate values?

(S1,P1,s):6 (S1,P2,s):12 (S2,P1,f):9

(S1,*,s):9 (S1,P1,*):6 (*,P1,s):6 (S1,P2,*):12 (*,P2,s):12 (S2,*,f):9 (S2,P1,*):9 (*,P1,f):9

(S1,*,*):9 (*,*,s):9 (*,P1,*):7.5 (*,P2,*):12 (*,*,f):9 (S2,*,*):9

(*,*,*):9

A Naïve Attempt

• Put all cells having the same aggregate value in a class

[Figure: the cube lattice from the Cube (Cell) Lattice slide, with cells grouped by aggregate value into four classes C1 to C4]

Problems with the Naïve Attempt

• The result is not a lattice anymore!

  – Anomaly: $C_3 \preceq_{\text{rollup}} C_4 \preceq_{\text{rollup}} C_3$

  – The rollup/drilldown semantics is lost

[Figure: the cube lattice grouped by aggregate value into classes C1 to C4; classes C3 and C4 (the cells with values 9 and 7.5) roll up to each other]
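The anomaly can be reproduced on the running example. The sketch below groups cells by AVG value and checks whether two classes roll up to each other; the helper names (`build_cube`, `rolls_up_to`, `class_rolls_up`) are illustrative assumptions, and "class A rolls up to class B" is taken here to mean that some cell of A rolls up to some cell of B.

```python
from itertools import product
from statistics import mean

BASE = [
    (("S1", "P1", "s"), 6),
    (("S1", "P2", "s"), 12),
    (("S2", "P1", "f"), 9),
]

def build_cube(base, agg):
    """Aggregate every non-empty cell of the cube."""
    groups = {}
    for dims, sales in base:
        for mask in product([False, True], repeat=len(dims)):
            cell = tuple("*" if star else d for d, star in zip(dims, mask))
            groups.setdefault(cell, []).append(sales)
    return {cell: agg(values) for cell, values in groups.items()}

def rolls_up_to(c, d):
    """True if cell d generalizes cell c (d is reachable from c by rollups)."""
    return all(y == "*" or x == y for x, y in zip(c, d))

def class_rolls_up(a, b):
    """Assumed class-level relation: some cell of a rolls up to some cell of b."""
    return any(rolls_up_to(c, d) for c in a for d in b)

CUBE = build_cube(BASE, mean)

# Naive partition: one class per distinct aggregate value.
classes = {}
for cell, value in CUBE.items():
    classes.setdefault(value, set()).add(cell)

nine, seven_half = classes[9], classes[7.5]
cycle = class_rolls_up(nine, seven_half) and class_rolls_up(seven_half, nine)
print("classes:", len(classes), "| 9 and 7.5 roll up to each other:", cycle)
```

The cycle arises because (S2,P1,f) rolls up to (*,P1,*), which in turn rolls up to (*,*,*), so the class-level relation is no longer a partial order.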

A Better Partitioning

• Quotient cube: a partitioning that preserves the rollup/drilldown semantics

[Figure: the cube lattice from the Cube (Cell) Lattice slide, partitioned into five classes C1 to C5]

Problem Statement

• Given a cube, characterize a good way (quotient cube) of partitioning its cells into classes such that

  – The partition generates a reduced lattice preserving the rollup/drilldown semantics

  – The partition is optimal: the number of classes is as small as possible

• Compute quotient cubes efficiently

Why Is a Quotient Cube Useful?

• Semantic compression

• Semantic OLAP browsing

[Figure: the cube lattice partitioned into classes C1 to C5; class C3 is expanded to show its cells:]

(S2,P1,f):9

(S2,*,f):9 (S2,P1,*):9 (*,P1,f):9

(*,*,f):9 (S2,*,*):9

Outline

• Introduction and motivation

• Cube lattice partitions

• Semantics preserving partitions

• Algorithms

• Experimental results

• Discussion and summary

Convex Partitions

• A convex partition retains semantics

$c_1 \preceq_{\text{rollup}} c_2 \preceq_{\text{rollup}} c_3,\ \ c_1, c_3 \in CLS \ \Rightarrow\ c_2 \in CLS$

[Figure: the cube lattice partitioned into the five convex classes C1 to C5]
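The convexity condition on this slide (no cell lying between two members of a class may be left out) can be checked mechanically. The following is a small sketch on the running example; `all_cells`, `rolls_up_to`, and `is_convex` are hypothetical helper names, and the two sample classes are chosen only for illustration.

```python
from itertools import product

TUPLES = [("S1", "P1", "s"), ("S1", "P2", "s"), ("S2", "P1", "f")]

def rolls_up_to(c, d):
    """True if cell d generalizes cell c (d is reachable from c by rollups)."""
    return all(y == "*" or x == y for x, y in zip(c, d))

def all_cells(tuples):
    """Every non-empty cell: each generalization of some base tuple."""
    cells = set()
    for t in tuples:
        for mask in product([False, True], repeat=len(t)):
            cells.add(tuple("*" if star else v for v, star in zip(t, mask)))
    return cells

def is_convex(cls, cells):
    """No 'hole': any cell between two members of cls must also be in cls."""
    for c1 in cls:
        for c3 in cls:
            if not rolls_up_to(c1, c3):
                continue
            for c2 in cells:
                if rolls_up_to(c1, c2) and rolls_up_to(c2, c3) and c2 not in cls:
                    return False
    return True

CELLS = all_cells(TUPLES)
p2_class = {("S1", "P2", "s"), ("S1", "P2", "*"), ("*", "P2", "s"), ("*", "P2", "*")}
holey = {("S2", "P1", "f"), ("*", "*", "*")}  # skips every cell in between

print("P2 class convex:", is_convex(p2_class, CELLS))
print("holey class convex:", is_convex(holey, CELLS))
```

The check is cubic in the number of cells, which is acceptable for an 18-cell example but not meant as a practical test on real cubes.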

A Non-convex Partition

• Anomaly

• The rollup/drilldown semantics is lost

$C_3 \preceq_{\text{rollup}} C_4 \preceq_{\text{rollup}} C_3$

[Figure: the cube lattice grouped by aggregate value into classes C1 to C4, a non-convex partition]

Connected Partitions

• Cells c1 and c2 are connected if a series of rollup/drilldown operations starting from c1 can reach c2

• Intuitively, (each class of) a partition should be connected
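Connectivity can be tested with a breadth-first search over single rollup/drilldown steps. The sketch below assumes that the path must stay inside the class, which the slide does not state explicitly, and uses the class of all cells with COUNT = 1 as an example of a class that is not connected. All function names are illustrative.

```python
from collections import deque
from itertools import product

TUPLES = [("S1", "P1", "s"), ("S1", "P2", "s"), ("S2", "P1", "f")]

def covers(cell, t):
    """True if base tuple t can be rolled up to cell."""
    return all(c == "*" or c == v for c, v in zip(cell, t))

def all_cells(tuples):
    """Every non-empty cell: each generalization of some base tuple."""
    cells = set()
    for t in tuples:
        for mask in product([False, True], repeat=len(t)):
            cells.add(tuple("*" if star else v for v, star in zip(t, mask)))
    return cells

def one_step(c, d):
    """True if c and d differ in exactly one dimension and one of them has '*' there."""
    diff = [(x, y) for x, y in zip(c, d) if x != y]
    return len(diff) == 1 and "*" in diff[0]

def components(cls):
    """Split a class into groups linked by single rollup/drilldown steps inside the class."""
    remaining, parts = set(cls), []
    while remaining:
        start = remaining.pop()
        part, queue = {start}, deque([start])
        while queue:
            c = queue.popleft()
            linked = {d for d in remaining if one_step(c, d)}
            remaining -= linked
            part |= linked
            queue.extend(linked)
        parts.append(part)
    return parts

# All cells whose COUNT is 1, i.e. cells covering exactly one base tuple.
count_one = {c for c in all_cells(TUPLES) if sum(covers(c, t) for t in TUPLES) == 1}
parts = components(count_one)
print(sorted(len(p) for p in parts))
```

On this example the COUNT = 1 cells fall into three separate groups, one per base tuple, so putting them all into a single class would violate the connectivity intuition.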

Cover Partition

• For a cell c, a tuple t in the base table is in c's cover if t can be rolled up to c

  – E.g., Cov(S1,*,spring) = {(S1,P1,spring), (S1,P2,spring)}

Base table

Dimensions                Measure
Store  Product  Season    Sales
S1     P1       Spring    6
S1     P2       Spring    12
S2     P1       Fall      9

Cover Partitions Are Convex

• All cells having the same cover are in a class

• (S1,P2,s) and (*,P2,*) cover the same tuples in the base table, so the cells between them, (S1,P2,*) and (*,P2,s), are in the same class

[Figure: the cube lattice from the Cube (Cell) Lattice slide]
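A cover partition for the running example can be computed by grouping cells on their cover. This is a brute-force sketch with hypothetical helper names (`cover`, `cover_partition`), intended only to make the definition concrete.

```python
from itertools import product

TUPLES = [("S1", "P1", "s"), ("S1", "P2", "s"), ("S2", "P1", "f")]

def rolls_up_to(c, d):
    """True if cell d generalizes cell c (d is reachable from c by rollups)."""
    return all(y == "*" or x == y for x, y in zip(c, d))

def all_cells(tuples):
    """Every non-empty cell: each generalization of some base tuple."""
    cells = set()
    for t in tuples:
        for mask in product([False, True], repeat=len(t)):
            cells.add(tuple("*" if star else v for v, star in zip(t, mask)))
    return cells

def cover(cell, tuples):
    """Base tuples that can be rolled up to the given cell."""
    return frozenset(t for t in tuples if rolls_up_to(t, cell))

def cover_partition(tuples):
    """Group cells so that each class holds exactly the cells sharing one cover."""
    classes = {}
    for cell in all_cells(tuples):
        classes.setdefault(cover(cell, tuples), set()).add(cell)
    return classes

PARTITION = cover_partition(TUPLES)
for cov, cells in PARTITION.items():
    print(sorted(cov), "->", sorted(cells))
```

On the three-tuple base table this yields six classes, one per distinct cover.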

Cover Partitions Are Connected

• If cells c1 and c2 have the same cover, there must be some common ancestor c3 of c1 and c2 such that c3 has the same cover

  – Cells c1 and c2 are in the same class and connected

[Figure: the cube lattice from the Cube (Cell) Lattice slide]

Cover Partitions & Aggregates

• All cells in a class of a cover partition carry the same aggregate value w.r.t. any aggregate function

  – But cells in a class of MIN() may have different covers

• For COUNT() and SUM() (positive), cover equivalence coincides with aggregate equivalence
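Both directions of the first bullet can be seen on the running example. In this sketch (helper names are assumptions), two cells with the same cover agree under several aggregate functions, while two cells with the same MIN() value have different covers.

```python
# Base table of the running example: (Store, Product, Season) -> Sales
BASE = {
    ("S1", "P1", "s"): 6,
    ("S1", "P2", "s"): 12,
    ("S2", "P1", "f"): 9,
}

def cover(cell):
    """Base tuples that can be rolled up to the given cell."""
    return frozenset(
        t for t in BASE if all(c == "*" or c == v for c, v in zip(cell, t))
    )

def agg(cell, f):
    """Aggregate value of a cell: f applied to the measures of its cover."""
    return f([BASE[t] for t in cover(cell)])

# Same cover, hence the same value under any function of the cover.
c, d = ("*", "P2", "*"), ("S1", "P2", "s")
same_everywhere = all(agg(c, f) == agg(d, f) for f in (min, max, sum, len))

# Same MIN() value, but different covers.
a, b = ("S1", "P1", "s"), ("S1", "*", "s")
same_min = agg(a, min) == agg(b, min)
same_cover = cover(a) == cover(b)

print(same_everywhere, same_min, same_cover)
```

Here `len` stands in for COUNT(); the aggregate is always computed from the cover, which is why equal covers force equal values.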

Outline

• Introduction and motivation

• Cube lattice partitions

• Semantics preserving partitions

• Algorithms

• Experimental results

• Discussion and summary

Weak Congruence

• Weak congruence preserves semantics

[Diagram: cells c and c' lie in Class 1, cells d and d' lie in Class 2. If a rollup leads from c to d and another rollup leads from d' to c', this implies Class 1 = Class 2.]

Weak Congruence = Convex

• Convex (no "hole" in the class) is equivalent to weak congruence

• They preserve the rollup/drilldown semantics

• Quotient cube lattice is the lattice of convex classes

• How to derive the coarsest quotient cube?

Monotone Aggregate Functions

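• Monotone functions

  – $S \subseteq T \Rightarrow f(S) \le f(T)$, or

  – $S \subseteq T \Rightarrow f(S) \ge f(T)$

  – MIN(), MAX(), COUNT(), PSUM(), …

• If the aggregate function f is monotone, then $\equiv_f$ (cells grouped by equal f value) is the unique coarsest partition

  – MIN(): put all cells having the same MIN() value into a class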
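As a sanity check on the running example, grouping cells by MIN() value gives only convex classes, whereas grouping by AVG() does not. The sketch below verifies this for the toy base table only; it does not prove the general claim, and the helper names are assumptions.

```python
from itertools import product
from statistics import mean

BASE = {
    ("S1", "P1", "s"): 6,
    ("S1", "P2", "s"): 12,
    ("S2", "P1", "f"): 9,
}

def rolls_up_to(c, d):
    """True if cell d generalizes cell c (d is reachable from c by rollups)."""
    return all(y == "*" or x == y for x, y in zip(c, d))

def all_cells():
    """Every non-empty cell: each generalization of some base tuple."""
    cells = set()
    for t in BASE:
        for mask in product([False, True], repeat=len(t)):
            cells.add(tuple("*" if star else v for v, star in zip(t, mask)))
    return cells

CELLS = all_cells()

def value(cell, f):
    """Aggregate of the measures of all base tuples covered by the cell."""
    return f([BASE[t] for t in BASE if rolls_up_to(t, cell)])

def classes_by_value(f):
    """Partition the cells by equal aggregate value."""
    classes = {}
    for cell in CELLS:
        classes.setdefault(value(cell, f), set()).add(cell)
    return classes

def is_convex(cls):
    """No cell between two members of cls may lie outside cls."""
    return not any(
        rolls_up_to(c1, c2) and rolls_up_to(c2, c3) and c2 not in cls
        for c1 in cls for c3 in cls for c2 in CELLS
    )

def all_convex(f):
    return all(is_convex(cls) for cls in classes_by_value(f).values())

print("MIN:", all_convex(min), "| MAX:", all_convex(max), "| AVG:", all_convex(mean))
```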

Non-monotone Functions

• Bad news: $\equiv_f$ (cells grouped by equal f value) may or may not be convex, i.e., a weak congruence.

• Good news: the cover partition is convex (i.e., a weak congruence) and always yields a quotient cube w.r.t. any aggregate function!

Outline

• Introduction and motivation

• Cube lattice partitions

• Semantics preserving partitions

• Algorithms

• Experimental results

• Discussion and summary

How to Compute A QC

• Aggregate functions

  – Monotone functions

  – Non-monotone functions

• Settings

  – The cube is available

  – Only the base table is available

Monotone Functions

• If the cube is available: grab all cells with the same aggregate value and put them into a class

• If only the base table is available: bottom-up, depth-first search

  – For a cell, compute its cover and find the upper bound having the same aggregate value

  – Group lower bounds by upper bounds
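The upper-bound step can be illustrated for cover classes. This sketch makes two assumptions that the slide does not spell out: the "upper bound" of a cell is taken to be the most specific cell with the same cover, and "lower bounds" are the most general cells of a class. It also enumerates all cells exhaustively instead of performing the depth-first search named on the slide. All function names are hypothetical.

```python
from itertools import product

TUPLES = [("S1", "P1", "s"), ("S1", "P2", "s"), ("S2", "P1", "f")]

def rolls_up_to(c, d):
    """True if cell d generalizes cell c (d is reachable from c by rollups)."""
    return all(y == "*" or x == y for x, y in zip(c, d))

def all_cells():
    """Every non-empty cell: each generalization of some base tuple."""
    cells = set()
    for t in TUPLES:
        for mask in product([False, True], repeat=len(t)):
            cells.add(tuple("*" if star else v for v, star in zip(t, mask)))
    return cells

def cover(cell):
    """Base tuples that can be rolled up to the given cell."""
    return [t for t in TUPLES if rolls_up_to(t, cell)]

def upper_bound(cell):
    """Assumed upper bound: keep a dimension value only where all covered tuples agree."""
    covered = cover(cell)
    bound = []
    for i in range(len(cell)):
        values = {t[i] for t in covered}
        bound.append(values.pop() if len(values) == 1 else "*")
    return tuple(bound)

def classes_by_upper_bound():
    """Group every cell under its upper bound."""
    classes = {}
    for cell in all_cells():
        classes.setdefault(upper_bound(cell), set()).add(cell)
    return classes

def lower_bounds(cls):
    """Most general cells of a class: no other member generalizes them."""
    return {c for c in cls if not any(d != c and rolls_up_to(c, d) for d in cls)}

CLASSES = classes_by_upper_bound()
for ub, cells in sorted(CLASSES.items()):
    print(ub, "lower bounds:", sorted(lower_bounds(cells)))
```

Storing one upper bound plus the lower bounds per class is enough to describe a convex class without listing every cell in it.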

Example: Cover QC

[Figure: the cube lattice from the Cube (Cell) Lattice slide, with cells grouped into cover partition classes]

Non-monotone Functions

• Class merging

• Find cover partition classes

• Merge classes as long as convexity is retained
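The class-merging idea can be sketched as a greedy loop for AVG() on the running example. The merge condition used here is an assumption made for illustration: two classes are merged only if they have the same aggregate value, contain directly comparable cells (so the result stays connected), and their union is convex. The actual algorithm in the talk may differ in its details.

```python
from itertools import combinations, product
from statistics import mean

BASE = {
    ("S1", "P1", "s"): 6,
    ("S1", "P2", "s"): 12,
    ("S2", "P1", "f"): 9,
}

def rolls_up_to(c, d):
    """True if cell d generalizes cell c (d is reachable from c by rollups)."""
    return all(y == "*" or x == y for x, y in zip(c, d))

CELLS = {
    tuple("*" if star else v for v, star in zip(t, mask))
    for t in BASE
    for mask in product([False, True], repeat=len(t))
}

def cover(cell):
    return frozenset(t for t in BASE if rolls_up_to(t, cell))

def is_convex(cls):
    """No cell between two members of cls may lie outside cls."""
    return not any(
        rolls_up_to(c1, c2) and rolls_up_to(c2, c3) and c2 not in cls
        for c1 in cls for c3 in cls for c2 in CELLS
    )

def linked(a, b):
    """Some cell of one class rolls up to some cell of the other."""
    return any(rolls_up_to(c, d) or rolls_up_to(d, c) for c in a for d in b)

def quotient_classes(f):
    # Step 1: cover partition classes, each tagged with its aggregate value.
    by_cover = {}
    for cell in CELLS:
        by_cover.setdefault(cover(cell), set()).add(cell)
    classes = [(cells, f([BASE[t] for t in cov])) for cov, cells in by_cover.items()]
    # Step 2: merge pairs greedily while the assumed conditions hold.
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(classes)), 2):
            (a, va), (b, vb) = classes[i], classes[j]
            if va == vb and linked(a, b) and is_convex(a | b):
                classes[i] = (a | b, va)
                del classes[j]
                merged = True
                break
    return classes

RESULT = quotient_classes(mean)
for cells, val in RESULT:
    print(val, sorted(cells))
```

Starting from six cover classes, the only merge that survives the convexity check joins the class of (S1,*,s) with the class of (*,*,*), which leaves five classes, matching the count shown on the earlier slides.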

Example: AVG QC

[Figure: the cube lattice from the Cube (Cell) Lattice slide, with cells grouped into the classes of the AVG quotient cube]

Outline

• Introduction and motivation

• Cube lattice partitions

• Semantics preserving partitions

• Algorithms

• Experimental results

• Discussion and summary

Reduction Ratio vs. Dimensionality

[Figure: reduction ratio (%) vs. dimensionality (2 to 10) for MinCube, QC_Cov, and QC_MIN; # base tuples = 200k, Zipf factor = 2.0]

Reduction Ratio vs. Zipf Factor

[Figure: reduction ratio (%) vs. Zipf factor (0 to 3) for MinCube, QC_Cov, and QC_MIN; # base tuples = 200k, # dimensions = 6]

Reduction Ratio vs. Base Table Size

[Figure: reduction ratio (%) vs. number of tuples (0 to 1400k) for MinCube, QC_Cov, and QC_MIN; Zipf factor = 2.0, # dimensions = 6]

Runtime

[Figure: runtime (seconds) vs. number of tuples (0 to 1400k) for MinCube, QC_Cov, QC_MIN, and BUC; Zipf factor = 2.0, # dimensions = 6]

Compression Ratio on Weather Data Set

[Figure, first panel: reduction ratio (%) vs. number of dimensions (2 to 7) for QC_Cov and QC_AVG]

[Figure, second panel: reduction ratio (%) vs. number of dimensions (2 to 9) for MinCube and QC_Cov]

Outline

• Introduction and motivation

• Cube lattice partitions

• Semantics preserving partitions

• Algorithms

• Experimental results

• Discussion and summary

Semantic Cube Exploration

• Theoretical foundation for semantic summarization in data cubes

  – Concept and properties of quotient cubes

• Efficient algorithms for quotient cube construction

  – Quotient cubes can be computed directly from base tables

Ongoing Research

• Efficient implementation of a quotient cube-based OLAP system

  – Data warehouse built using quotient cubes

• Hierarchies and constraints

• Incremental maintenance

• Semantics based OLAP and mining

• Efficient query answering

References (1)

• R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994

• S. Agarwal, R. Agrawal, P.M. Deshpande, A. Gupta, J.F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. VLDB, 1996.

• D. Barbara and M. Sullivan. Quasi-cubes: Exploiting approximation in multidimensional databases. SIGMOD Record, 26:12--17, 1997.

• D. Barbara and X. Wu. Using loglinear models to compress datacube. In WAIM'2000, pages 311--322, 2000.

• K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cubes. In SIGMOD'99.

References (2)

• G. Birkhoff. Lattice Theory, 2nd edition. New York, American Mathematical Society (Colloquium Publications, vol. 25), 1948.

• S. Geffner, D. Agrawal, A. El Abbadi, and T. R. Smith. Relative prefix sums: An efficient approach for querying dynamic OLAP data cubes. In ICDE'99.

• Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE'96.

• C.-T. Ho, J. Bruck, and R. Agrawal. Partial-sum queries in data cubes using covering codes. In PODS'97.

• J. Han, J. Pei, G. Dong, and K. Wang. Efficient Computation of Iceberg Cubes with Complex Measures. In SIGMOD'01.

References (3)

• V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In SIGMOD'96.

• T. Imielinski, L. Khachiyan, and A. Abdulghani. Cubegrades: Generalizing Association Rules. Technical Report, Rutgers University, August 2000.

• H. V. Jagadish, J. Madar, R.T. Ng. Semantic Compression and Pattern Extraction with Fascicles. VLDB'99.

• K. Ross and D. Srivastava. Fast computation of sparse datacubes. In VLDB'97.

• G. Sathe and S. Sarawagi. Intelligent Rollups in Multidimensional OLAP Data. VLDB'01.

References (4)

• J. Shanmugasundaram, U.M. Fayyad, and P. S. Bradley. Compressed Data Cubes for OLAP Aggregate Query Approximation on Continuous Dimensions. SIGKDD’99.

• J. S. Vitter, M. Wang, and B. R. Iyer. Data cube approximation and histograms via wavelets. In CIKM'98.

• W. Wang, H. Lu, J. Feng, and J. X. Yu. Condensed cube: An effective approach to reducing data cube size. In ICDE'02.

• Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. In SIGMOD'97.

• G.K. Zipf. Human Behavior and The Principle of Least Effort. Addison-Wesley, 1949.