Post on 20-Dec-2015
Traditional Database Indexing Techniques for Video Database Indexing
Jianping Fan
Department of Computer Science
University of North Carolina at Charlotte
Charlotte, NC 28223
http://www.cs.uncc.edu/~jfan
1. Why do we need indexing?
Library: 2,000,000 books
Query: find the book with the title “Multimedia Systems, Standards, and Networks” without indexing!
Too hard! 2,000,000 books to scan!
How can we do this more efficiently?
2. How does a library work?
a. Classify the books into several subjects:
I get it! Too easy!
[Figure: subject hierarchy for Books in Library. Level 1: Natural Sciences, Social Sciences. Level 2: Dancing, Computer Science, Electrical Engineering. Level 3: Computer Languages, Research. Level 4: Database, Multimedia.]
b. How do they get this good partition and management?
Taxonomy & Library Science!
[Figure labels: Natural Sciences, Social Science.]
How can we do this for data & images?
3. Key problems in building an index?
What can you find from this map?
A database is a set of tables, and a map is similar to a table!
a. Partition: partition the large-scale data set hierarchically into meaningful and manageable small regions.
b. Representation: represent these regions with an efficient technique so that they can be accessed very fast.
4. How to build an indexing structure for data?
a. Space partition approach:
Partition the space into regions according to some measure.
Space-partition trees are attractive for GIS systems.
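As a rough illustration of the space partition approach, the sketch below splits a 2-D region into four equal quadrants whenever a cell holds more points than a fixed capacity. The function names, the `capacity` and `max_depth` parameters, and the sample points are assumptions made for this example; they do not come from the slides.

```python
def build_quadtree(points, x0, y0, x1, y1, capacity=2, depth=0, max_depth=8):
    """Recursively split the box [x0, x1) x [y0, y1) into four equal quadrants."""
    if len(points) <= capacity or depth >= max_depth:
        return {"box": (x0, y0, x1, y1), "points": points, "children": []}
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
    children = []
    for qx0, qy0, qx1, qy1 in quadrants:
        inside = [(x, y) for x, y in points
                  if qx0 <= x < qx1 and qy0 <= y < qy1]
        children.append(build_quadtree(inside, qx0, qy0, qx1, qy1,
                                       capacity, depth + 1, max_depth))
    return {"box": (x0, y0, x1, y1), "points": [], "children": children}


def count_leaves(node):
    """Number of leaf cells in the partition."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])


# Hypothetical sample data on a 16 x 16 map.
points = [(1, 1), (2, 2), (9, 9), (12, 3), (13, 14)]
tree = build_quadtree(points, 0, 0, 16, 16)
leaf_count = count_leaves(tree)
```

The cells are fixed by the geometry of the space, not by the data, so one of the four quadrants here ends up empty while others hold several points.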
Space partition may not work in some cases!
Partition the data based on its distribution instead: use clustering to partition the data set!
b. Data Partition via Clustering
K-means data clustering
(1) Select K centers to start (the dark points in the figure).
(2) Put each test point into the cluster of the most similar center:
$\min \{ D(\mathrm{test}, \mathrm{center}_i) \mid i \in [1, K] \}$
(3) Update the corresponding cluster center.
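The three K-means steps can be sketched as follows. This is a minimal sketch that assumes Euclidean distance for D, a fixed number of iterations, and caller-supplied starting centers; the function name `kmeans` and the sample points are illustrative only.

```python
import math


def kmeans(points, centers, iterations=10):
    """Plain K-means: assign each point to its nearest center, then recompute centers."""
    centers = list(centers)            # step (1): the K starting centers
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # step (2): pick the center i that minimizes D(p, center_i)
            nearest = min(range(len(centers)),
                          key=lambda k: math.dist(p, centers[k]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            # step (3): move each center to the mean of its members
            if members:
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, clusters


# Hypothetical 2-D data with two obvious groups.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
```

Each resulting cluster is one data partition, so the regions follow the data distribution instead of a fixed grid.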
c. Representation of Data Partition Results:
(1) Rectangular box: (ID, x1, y1, x2, y2)
(2) Sphere: (ID, xc, yc, R) (SR-tree)
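A small sketch of how the two representations could be computed for one partition. The sphere here is centered on the centroid, which is an assumption made for simplicity and is not necessarily the minimum enclosing sphere; the function names and the sample cluster are hypothetical.

```python
import math


def bounding_box(region_id, points):
    """Rectangular box (ID, x1, y1, x2, y2) enclosing all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (region_id, min(xs), min(ys), max(xs), max(ys))


def bounding_sphere(region_id, points):
    """Sphere (ID, xc, yc, R) centered on the centroid (not guaranteed minimal)."""
    xc = sum(x for x, _ in points) / len(points)
    yc = sum(y for _, y in points) / len(points)
    radius = max(math.dist((xc, yc), p) for p in points)
    return (region_id, xc, yc, radius)


cluster = [(0, 0), (4, 0), (4, 2), (0, 2)]   # hypothetical partition "A"
box = bounding_box("A", cluster)
sphere = bounding_sphere("A", cluster)
```

Either tuple is a compact stand-in for the whole partition: a query only has to be compared against five (or four) numbers to decide whether the partition can contain a match.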
[Figure: a data set with regions A to N (left) and its indexing tree (right). The root holds entries A, B, C; the lower-level nodes hold D E F G H, I J K, and L M N. The search road (path) runs from the root down to one lower-level node.]
R-tree: Minimum Rectangular Box (data partition approach)
[Figure sequence: the first partition splits the data set into four minimum rectangular boxes A, B, C, D, which become the root entries. The second partition splits A into a, b, c, d, e, f, g; B into h, i, j, k; and C into l, m.]
Final indexing structure:
Root: A B C D
A: a b c d e f g
B: h i j k
C: l m
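To show how such a two-level structure is used at query time, here is a minimal sketch of a window query. The coordinates are invented, only a subset of the leaf boxes from the figure is included, and a real R-tree also handles insertion, node splitting, and deeper levels, none of which is modeled here.

```python
def intersects(a, b):
    """True if rectangles a and b, given as (x1, y1, x2, y2), overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def mbr(rects):
    """Minimum bounding rectangle of a list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


# Hypothetical leaf boxes grouped under root entries A, B, C.
leaves = {
    "A": {"a": (0, 0, 1, 1), "b": (2, 0, 3, 1), "c": (1, 2, 2, 3)},
    "B": {"h": (6, 0, 7, 1), "i": (8, 1, 9, 2)},
    "C": {"l": (0, 6, 1, 7), "m": (2, 7, 3, 8)},
}
root = {name: mbr(list(children.values())) for name, children in leaves.items()}


def search(query):
    """Return matching leaf boxes and the root entries that had to be opened."""
    hits, visited = [], []
    for name, box in root.items():
        if intersects(box, query):          # prune subtrees whose box misses the query
            visited.append(name)
            hits += [leaf for leaf, r in leaves[name].items()
                     if intersects(r, query)]
    return hits, visited


hits, visited = search((1.5, 0.5, 2.5, 2.5))
```

Only the subtree under A is opened for this query; B and C are skipped after a single rectangle test each.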
R-tree family
[Figure: root node with entries A, B, C; the lower level holds D, E, F, G, H. The boxes of A and C overlap.]
a. Overlap between A and C! A query that falls in the overlapping area has to search both branches.
X-tree: Minimum Rectangular Box with Fat Nodes
[Figure: X-tree with a root, normal directory nodes, super-nodes, and data nodes.]
SR-tree: Minimum Sphere
Grid file: it can be treated as an extended Q-tree with multiple partitions on each attribute!
[Figure: grid over the attributes age and salary. Each grid cell maps to a bucket; a full primary bucket is extended with overflow buckets.]
I/O cost of the grid file:
a. Equality query: 1 + M
b. Range query: N + N*M
c. Insert: 1 + M + 1
d. Delete: 1 + M + 1
Number of buckets: N; number of overflow buckets: M; number of data entries per leaf node: K
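The equality-query cost of 1 + M can be made concrete with a toy grid file over age and salary. This is a sketch under simplifying assumptions: the cut points are fixed up front, each bucket counts as one page, and overflow buckets are chained behind the primary bucket. The class name `GridFile` and all values are hypothetical.

```python
import bisect


class GridFile:
    """Toy grid file: fixed cut points per attribute, one bucket chain per cell."""

    def __init__(self, age_cuts, salary_cuts, capacity=2):
        self.age_cuts = age_cuts
        self.salary_cuts = salary_cuts
        self.capacity = capacity          # K: data entries per bucket
        self.cells = {}                   # cell -> [primary bucket, overflow buckets...]

    def _cell(self, age, salary):
        return (bisect.bisect_right(self.age_cuts, age),
                bisect.bisect_right(self.salary_cuts, salary))

    def insert(self, age, salary):
        chain = self.cells.setdefault(self._cell(age, salary), [[]])
        if len(chain[-1]) >= self.capacity:
            chain.append([])              # last bucket is full: add an overflow bucket
        chain[-1].append((age, salary))

    def equality_query(self, age, salary):
        """Return (found, pages_read); pages_read is 1 primary + M overflow buckets."""
        chain = self.cells.get(self._cell(age, salary), [[]])
        found = any((age, salary) in bucket for bucket in chain)
        return found, len(chain)


grid = GridFile(age_cuts=[30, 50], salary_cuts=[40000, 80000], capacity=2)
for record in [(25, 30000), (28, 35000), (29, 20000), (45, 60000)]:
    grid.insert(*record)

crowded = grid.equality_query(29, 20000)   # cell with one overflow bucket
sparse = grid.equality_query(45, 60000)    # cell with only a primary bucket
```

The crowded cell costs two page reads (1 + M with M = 1), while the sparse cell costs one, which is why skewed data hurts a grid with fixed cut points.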
Data distribution information can be used to improve the performance of a grid file.
[Figure: grid over the attributes age and salary.]
Dynamic Grid File
[Figure: grid over the attributes age and salary; each cell maps to a bucket.]
Inserting entry 20* (bucket capacity 4):
[Figure, left: DIRECTORY with GLOBAL DEPTH 2 and entries 00, 01, 10, 11; every bucket has LOCAL DEPTH 2. Bucket A: 32* 16*. Bucket B: 1* 5* 21* 13*. Bucket C: 10*. Bucket D: 15* 7* 19*. Bucket A2 ("split image" of Bucket A): 4* 12* 20*.]
[Figure, right: DIRECTORY doubled to GLOBAL DEPTH 3 with entries 000, 001, 010, 011, 100, 101, 110, 111. Bucket A (LOCAL DEPTH 3): 32* 16*. Bucket B (LOCAL DEPTH 2): 1* 5* 21* 13*. Bucket C (LOCAL DEPTH 2): 10*. Bucket D (LOCAL DEPTH 2): 15* 7* 19*. Bucket A2 (LOCAL DEPTH 3, "split image" of Bucket A): 4* 12* 20*.]
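The directory with global and local depths shown in the figure is the scheme usually called extendible hashing. The sketch below reproduces the insertion of 20* under two assumptions: the key itself is used as the hash value, and the directory is indexed by the least significant bits. The class names are made up for this example.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth                # local depth
        self.keys = []


class ExtendibleHash:
    """Sketch of a directory of buckets addressed by the low bits of the key."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.global_depth = 0
        self.directory = [Bucket(0)]

    def _slot(self, key):
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        if bucket.depth == self.global_depth:
            # Double the directory: each new slot points to the same bucket as its twin.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        image = Bucket(bucket.depth)      # the "split image" of the full bucket
        high_bit = 1 << (bucket.depth - 1)
        old_keys, bucket.keys = bucket.keys, []
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = image
        for k in old_keys + [key]:        # redistribute using one more bit
            self.insert(k)


table = ExtendibleHash(capacity=4)
for key in [4, 12, 32, 16, 1, 5, 21, 13, 10, 15, 7, 19]:
    table.insert(key)
depth_before = table.global_depth         # state before 20* arrives
table.insert(20)                          # Bucket A is full: split and double
bucket_a = table.directory[0b000]
bucket_a2 = table.directory[0b100]
```

Only the full bucket is split; buckets B, C, and D keep local depth 2 and are simply referenced by two directory slots each after the doubling.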
A database indexing structure is built for decision making, and it tries to make the decision as fast as possible!
[Figure: decision tree]
Color = Green?
  yes: Size = Big?
    yes: watermelon
    no: Size = Medium?
      yes: apple
      no: grape
  no: Color = Yellow?
    yes: Shape = Round?
      yes: Size = Big?
        yes: grapefruit
        no: lemon
      no: banana
    no: Size = Small?
      yes: Taste = Sweet?
        yes: cherry
        no: grape
      no: apple
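Decision Tree
How to obtain a decision tree for a database?
a. Obtain a labeled training data set from the database.
b. Calculate the entropy impurity of node n, where $p(\omega_j)$ is the fraction of training samples at n that belong to class $\omega_j$:
$i(n) = -\sum_j p(\omega_j) \log_2 p(\omega_j)$
c. The classifier is built by choosing, at each node, the test that gives the largest drop in impurity:
$\max \Delta i(n)$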
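A short worked example of the entropy impurity, together with the drop in impurity that a candidate yes/no test produces (a common criterion for choosing the test at a node). The function names and the fruit labels are illustrative assumptions.

```python
import math
from collections import Counter


def entropy_impurity(labels):
    """i(n) = -sum_j p_j * log2(p_j), where p_j is the fraction of class j at the node."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def impurity_drop(labels, yes_labels, no_labels):
    """Impurity of the parent minus the size-weighted impurity of its two children."""
    n = len(labels)
    return (entropy_impurity(labels)
            - len(yes_labels) / n * entropy_impurity(yes_labels)
            - len(no_labels) / n * entropy_impurity(no_labels))


node = ["apple", "apple", "lemon", "lemon"]
mixed = entropy_impurity(node)                             # two equally likely classes
perfect_split = impurity_drop(node, ["apple", "apple"], ["lemon", "lemon"])
useless_split = impurity_drop(node, ["apple", "lemon"], ["apple", "lemon"])
```

A test that separates the classes completely removes all of the impurity, while a test that leaves both children as mixed as the parent removes none, so the first test would be chosen.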
KD-tree
By treating a query as a decision-making procedure, we can use decisions to build a more effective database index!
[Figure: kd-tree over a data table. The database root node tests Salary > $75000? (yes / no); each branch then tests Age > 60? (yes / no); the leaves point into the data table.]
At each internal node, only one attribute is used!
It is not balanced! Searches that end at different nodes may have different I/O costs!
Like the R-tree, it can support multi-attribute database indexing!
It integrates decision making and database query!
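A minimal kd-tree sketch in the spirit of the salary/age example: every internal node tests a single attribute against a threshold, and the attributes alternate by level. Choosing the median as the threshold is an assumption of this sketch (the slides use fixed values such as $75000 and 60), and the records and function names are invented.

```python
def build_kdtree(records, attributes, depth=0, leaf_size=1):
    """Each internal node asks 'attribute > threshold?' on one attribute only."""
    if len(records) <= leaf_size:
        return {"leaf": records}
    attr = attributes[depth % len(attributes)]
    values = sorted(r[attr] for r in records)
    threshold = values[(len(values) - 1) // 2]      # lower median
    yes = [r for r in records if r[attr] > threshold]
    no = [r for r in records if r[attr] <= threshold]
    if not yes or not no:                           # cannot separate further
        return {"leaf": records}
    return {"attr": attr, "threshold": threshold,
            "yes": build_kdtree(yes, attributes, depth + 1, leaf_size),
            "no": build_kdtree(no, attributes, depth + 1, leaf_size)}


def lookup(node, query):
    """Equality query: follow one yes/no decision per level, then scan the leaf."""
    while "leaf" not in node:
        branch = "yes" if query[node["attr"]] > node["threshold"] else "no"
        node = node[branch]
    return [r for r in node["leaf"]
            if all(r[a] == v for a, v in query.items())]


employees = [
    {"name": "Ann", "salary": 90000, "age": 65},
    {"name": "Bob", "salary": 80000, "age": 40},
    {"name": "Cy", "salary": 50000, "age": 62},
    {"name": "Di", "salary": 40000, "age": 30},
]
tree = build_kdtree(employees, attributes=["salary", "age"])
result = lookup(tree, {"salary": 50000, "age": 62})
```

The lookup makes one decision per level, which is the sense in which the index combines decision making with the database query.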
I/O cost of the kd-tree:
a. Equality query: N + M
b. Range query: N + M
c. Insert: N + M + 1
d. Delete: N + M + 1
Tree levels: N; leaf nodes: M; number of data entries per leaf node: K
The internal nodes of the kd-tree at the same level are stored on the same page.
5. Storage Management for High-Dimensional Indexing Structures
[Figure: CLUSTERED vs. UNCLUSTERED index. In both cases the index entries (index file) direct the search for data entries, and the data entries point to the data records (data file).]
We want to put similar data in the same page or in neighboring pages!
It is very hard to sort multi-dimensional data!
Hilbert curve: maps multi-dimensional data onto one dimension.
[Figure: Hilbert curve over a 4 × 4 grid with axis coordinates 00, 01, 10, 11; the cells are numbered 0 to 15 along the curve.]
From multi-dimensional indexing to one-dimensional storage on disk!
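The mapping from a 2-D cell to its position along a Hilbert curve can be sketched with the standard bit-manipulation algorithm. The orientation of the curve produced here may differ from the one drawn in the slide; the function name `hilbert_index` is an assumption.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve over an n x n grid (n a power of 2)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                       # rotate/flip the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Order the 16 cells of a 4 x 4 grid by their one-dimensional Hilbert position.
cells = [(x, y) for x in range(4) for y in range(4)]
order = sorted(cells, key=lambda c: hilbert_index(4, *c))
```

Consecutive positions on the curve are always neighboring cells, so records stored in this order keep nearby points on the same or adjacent disk pages.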
6. Video Database Indexing
Can these techniques be used for video database indexing?
a. Curse of dimensionality: bounding regions overlap heavily in high-dimensional space
b. Semantic gap: visual features ≠ semantic concepts
What should we do?
Video Sequence: Shot 1, …, Shot i, …, Shot n
Visual Representation (Schema Determination):
Color: HSV color histogram, dominant color, …
Texture: edge histogram, wavelet coefficients, Tamura features, …
Motion: directional motion histogram, camera motion, …
Other features
Video Sequence: Key Object 1, …, Key Object i, …, Key Object n
Schema Determination:
Color: HSV color histogram, dominant color, …
Texture: edge histogram, Tamura, …
Shape: rectangular box, moments, …
Motion: trajectory, motion histogram, …
Other features
[Figure: regions A, B, and C overlap (curse of dimensionality).]
a. Concept Hierarchy
Objective: we should try to bridge the semantic gap in the video content partition procedure.
[Figure: concept hierarchy. Root: 2000 Olympic Games. Sport level: field, basketball, softball, soccer, volleyball. Team level: Team USA, Team Norway, Team Slovakia. Under each team: Players, News, Game Actions.]
b. Semantic classification
[Figure: the video contents in the database are described by visual features $[x_{i1}, x_{i2}, \ldots, x_{in}]$, which are mapped to semantic clusters $C_j$. Weighted mapping?]
[Figure: hierarchical video database index. Level 1: Video in Database. Level 2: Cluster 1, Cluster i, Cluster n. Level 3: Subcluster 11, Subcluster 1j, Subcluster n1, Subcluster nl. Level 4: Subregion 11k, Subregion nl1, Subregion nlm. Level 5: object 1111, object nlm1. Storage: Disk for Cluster 1, Disk for Cluster i, Disk for Cluster n.]
$N_i \log D_i$
7. Video Query with Indexing
[Figure: query processing. The query object goes through feature extraction and is then matched level by level: Cluster 1 / Cluster i / Cluster n, then Subcluster i1 / Subcluster ij / Subcluster im, then Subregion ij1 / Subregion ijl / Subregion ijr, then Object ijrm, which is read from the disk for cluster i.]
Video Browsing
A* Search Algorithm
[Figure: the hierarchical index again (Video in Database, clusters, subclusters, subregions, objects, one disk per cluster), annotated with $N_i \log D_i$.]
Multimedia Database System Design
[Figure: system components: Access control & rights management; Query & Delivery; Delivery; Query Presentation; Query Processing; Visual Summarization; Indexing; Video Collections; MPEG Encoder.]
Indexing is very important!