CS4432: Database Systems II
![Page 1: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/1.jpg)
CS 4432 lecture #9 1
CS4432: Database Systems II
Lecture #8
(Basic indexing)
Professor Elke A. Rundensteiner
![Page 2: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/2.jpg)
Indexing (Chapter 14)
Indexing helps retrieve data more quickly for certain queries.
Example: SELECT * FROM Emp WHERE salary = 1000000;
Figure: an index on salary maps the search value (value = 1,000,000) to the matching record.
![Page 3: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/3.jpg)
Topics
• Sequential Index Files
• Secondary Indexes
![Page 4: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/4.jpg)
Sequential File
Data blocks (two records per block, sorted by key): [10, 20] [30, 40] [50, 60] [70, 80] [90, 100]
![Page 5: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/5.jpg)
Sequential File
Data blocks: [10, 20] [30, 40] [50, 60] [70, 80] [90, 100]
Dense Index
Index blocks: [10, 20, 30, 40] [50, 60, 70, 80] [90, 100, 110, 120]
Every record is in the index.
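A dense index lookup can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the names `build_dense_index` and `dense_lookup` are hypothetical, and a record pointer is assumed to be a `(block_no, slot)` pair.

```python
from bisect import bisect_left

# Data file from the slide: two records per block, sorted by key.
DATA_BLOCKS = [[10, 20], [30, 40], [50, 60], [70, 80], [90, 100]]


def build_dense_index(blocks):
    """Return one (key, (block_no, slot)) entry per record, in key order."""
    return [(key, (b, s))
            for b, block in enumerate(blocks)
            for s, key in enumerate(block)]


def dense_lookup(index, key):
    """Binary-search the index; return the record pointer, or None if absent."""
    keys = [k for k, _ in index]
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return index[i][1]
    return None  # answered from the index alone, without touching the data file


dense_index = build_dense_index(DATA_BLOCKS)
print(dense_lookup(dense_index, 40))
print(dense_lookup(dense_index, 45))
```

Because every key appears in the index, a miss is detected without reading any data block.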
![Page 6: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/6.jpg)
Sequential File
Data blocks: [10, 20] [30, 40] [50, 60] [70, 80] [90, 100]
Sparse Index
Index blocks: [10, 30, 50, 70] [90, 110, 130, 150] [170, 190, 210, 230]
Only the first record per block is in the index.
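A sparse index lookup differs in that the index only narrows the search to one block, which must then be read. The sketch below assumes hypothetical helper names (`build_sparse_index`, `sparse_lookup`) and uses the block number as the block pointer.

```python
from bisect import bisect_right

# Data file from the slide: two records per block, sorted by key.
DATA_BLOCKS = [[10, 20], [30, 40], [50, 60], [70, 80], [90, 100]]


def build_sparse_index(blocks):
    """Return one (first_key, block_no) entry per data block."""
    return [(block[0], b) for b, block in enumerate(blocks)]


def sparse_lookup(index, blocks, key):
    """Pick the last entry whose first key is <= key, then scan that block."""
    first_keys = [k for k, _ in index]
    i = bisect_right(first_keys, key) - 1
    if i < 0:
        return None  # key is smaller than every key in the file
    block_no = index[i][1]
    block = blocks[block_no]  # this step stands in for a disk read
    if key in block:
        return (block_no, block.index(key))
    return None


sparse_index = build_sparse_index(DATA_BLOCKS)
print(sparse_lookup(sparse_index, DATA_BLOCKS, 40))
```

Unlike the dense case, deciding that a key is absent requires fetching the candidate data block.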
![Page 7: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/7.jpg)
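Sequential File
Data blocks: [10, 20] [30, 40] [50, 60] [70, 80] [90, 100]
1st-level sparse index blocks: [10, 30, 50, 70] [90, 110, 130, 150] [170, 190, 210, 230]
Sparse 2nd level
2nd-level index blocks: [10, 90, 170, 250] [330, 410, 490, 570]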
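The two-level search can be sketched like this. The block sizes (2 records per data block, 4 entries per index block) are read off the slide; the key range 10..240 and the helper names are assumptions made only for this example.

```python
from bisect import bisect_right


def chunk(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


keys = list(range(10, 250, 10))                   # assumed file contents: 10, 20, ..., 240
data_blocks = chunk(keys, 2)                      # 2 records per data block
level1 = chunk([b[0] for b in data_blocks], 4)    # sparse: first key of each data block
level2 = [b[0] for b in level1]                   # sparse: first key of each index block


def locate(sorted_keys, key):
    """Position of the last entry <= key, or -1 if there is none."""
    return bisect_right(sorted_keys, key) - 1


def lookup(key):
    """Return the data block number holding `key`, or None."""
    i = locate(level2, key)          # choose a 1st-level index block
    if i < 0:
        return None
    j = locate(level1[i], key)       # choose a data block within it
    block_no = i * 4 + j
    return block_no if key in data_blocks[block_no] else None


print(level1[:2], level2)
print(lookup(100))
```

Each level is itself an ordered file, so the same "last entry <= key" search is reused at both levels.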
![Page 8: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/8.jpg)
Note: Both the DATA FILE and the INDEX can be “ordered files”.
Question: How would we lay them out on disk?
- Contiguous layout on disk?
- Block-chained layout on disk?
![Page 9: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/9.jpg)
Questions:
• Do we want to build a dense 2nd-level index for a dense index?
• Can we even do this?
Sequential file data blocks: [10, 20] [30, 40] [50, 60] [70, 80] [90, 100]
1st level? Index blocks: [10, 30, 50, 70] [90, 110, 130, 150] [170, 190, 210, 230]
2nd level? Index blocks: [10, 90, 170, 250] [330, 410, 490, 570]
![Page 10: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/10.jpg)
Notes on pointers:
(1) A block pointer (BP, used in a sparse index) can be smaller than a record pointer (RP, used in a dense index), because a record pointer must also identify the record within its block.
![Page 11: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/11.jpg)
Figure: keys K1, K2, K3, K4 stored in order alongside R1, R2, R3, R4, with no pointers between them.
Say: 1024 B per block.
• If we want the K3 block:
• get it at offset (3-1)*1024 = 2048 bytes.
Note: If the file is contiguous, then we can omit pointers.
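The offset arithmetic above generalizes to any position in a contiguous file. A minimal sketch, where `block_offset` is a hypothetical helper and positions are 1-based as on the slide:

```python
BLOCK_SIZE = 1024  # bytes per block, as on the slide


def block_offset(position, block_size=BLOCK_SIZE):
    """Byte offset of the block at 1-based `position` in a contiguous file."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return (position - 1) * block_size


print(block_offset(3))  # the K3 block
```

Since the offset is computed from the position alone, the index needs to store keys only.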
![Page 12: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/12.jpg)
Sparse vs. Dense Tradeoff
• Sparse: Less index space per record, so we can keep more of the index in memory
(Later: sparse is better for insertions)
• Dense: Can tell whether a record exists without accessing the data file
(Later: dense is needed for secondary indexes)
![Page 13: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/13.jpg)
Terms
• Index sequential file
• Search key (≠ primary key)
• Primary index (on sequencing field)
• Secondary index
• Dense index (contains all search key values)
• Sparse index
• Multi-level index
![Page 14: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/14.jpg)
Next:
• Duplicate keys
• Deletion/Insertion
• Secondary indexes
![Page 15: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/15.jpg)
Duplicate keys
Data blocks: [10, 10] [10, 20] [20, 30] [30, 30] [40, 45]
![Page 16: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/16.jpg)
Duplicate keys
Data blocks: [10, 10] [10, 20] [20, 30] [30, 30] [40, 45]
Dense index blocks: [10, 10, 10, 20] [20, 30, 30, 30]
Dense index: point to each value!
![Page 17: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/17.jpg)
Duplicate keys
Data blocks: [10, 10] [10, 20] [20, 30] [30, 30] [40, 45]
Dense index block: [10, 20, 30, 40]
Dense index: point to each distinct value!
![Page 18: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/18.jpg)
Duplicate keys
Data blocks: [10, 10] [10, 20] [20, 30] [30, 30] [40, 45]
Sparse index block: [10, 10, 20, 30]
Sparse index: point to start of block!
Careful if looking for 20 or 30!
![Page 19: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/19.jpg)
Duplicate keys
Data blocks: [10, 10] [10, 20] [20, 30] [30, 30] [40, 45]
Sparse index, another way?
– place first new key from block
Sparse index block: [10, 20, 30, 30]
Question on the figure: should the last entry (30) be 40?
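Why care is needed when looking up 20 or 30 with a "first key per block" sparse index can be seen in a short sketch. The function name `find_all` and the `(block_no, slot)` result format are assumptions for illustration only.

```python
from bisect import bisect_left

# Data file with duplicate keys, as on the slides.
DATA_BLOCKS = [[10, 10], [10, 20], [20, 30], [30, 30], [40, 45]]
SPARSE = [block[0] for block in DATA_BLOCKS]  # first key of each block


def find_all(key):
    """Return (block_no, slot) for every record whose key equals `key`."""
    # The first block whose first key is >= key is not enough: the block
    # just before it may END with the same key, so start one block earlier.
    start = max(bisect_left(SPARSE, key) - 1, 0)
    hits = []
    for b in range(start, len(DATA_BLOCKS)):
        if DATA_BLOCKS[b][0] > key:
            break  # all later blocks hold larger keys
        hits.extend((b, s) for s, k in enumerate(DATA_BLOCKS[b]) if k == key)
    return hits


print(find_all(20))
print(find_all(30))
```

Following only the index entry for 20 would land on block [20, 30] and miss the 20 stored at the end of block [10, 20].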
![Page 20: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/20.jpg)
Summary
Duplicate values, primary index
• Index may point to the first instance of each value only
Figure labels: File, Index; values shown: a, a, a, b.
![Page 21: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/21.jpg)
Next:
• Duplicate keys
• Deletion/Insertion
• Secondary indexes
![Page 22: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/22.jpg)
Deletion from sparse index
Data blocks: [10, 20] [30, 40] [50, 60] [70, 80]
Sparse index blocks: [10, 30, 50, 70] [90, 110, 130, 150]
![Page 23: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/23.jpg)
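Deletion from sparse index
– delete record 40
Data blocks after deletion: [10, 20] [30] [50, 60] [70, 80]
Sparse index is unchanged, because 30 is still the first key in its block: [10, 30, 50, 70] [90, 110, 130, 150]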
![Page 24: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/24.jpg)
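Deletion from sparse index
– delete record 30
Data blocks after deletion: [10, 20] [40] [50, 60] [70, 80] (record 40 moves to the front of its block)
Sparse index after deletion: [10, 40, 50, 70] [90, 110, 130, 150] (entry 30 becomes 40)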
![Page 25: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/25.jpg)
Deletion from sparse index
– delete records 30 & 40
The block that held 30 and 40 is now empty, so its index entry is removed.
Index entries 50 and 70 move up to close the gap.
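The three deletion cases can be traced with a small sketch. It is a simplification under stated assumptions: the sparse index is a plain list of first keys parallel to the block list, the target block is found by a linear scan, and `make_file` and `delete_record` are hypothetical names.

```python
def make_file():
    """Data file and sparse index (first key per block) from the slides."""
    blocks = [[10, 20], [30, 40], [50, 60], [70, 80]]
    index = [block[0] for block in blocks]
    return blocks, index


def delete_record(blocks, index, key):
    """Delete one record with `key` and keep the sparse index in sync."""
    for b, block in enumerate(blocks):
        if key in block:
            block.remove(key)
            if block:
                index[b] = block[0]   # first key may have changed (or not)
            else:
                del blocks[b]         # block is empty: drop it ...
                del index[b]          # ... and its index entry
            return True
    return False


blocks, index = make_file()
delete_record(blocks, index, 40)
print(index)   # first key of the block did not change
delete_record(blocks, index, 30)
print(index)   # block became empty, so its entry is gone
```

Deleting a non-first record leaves the index untouched, which is one reason sparse indexes are cheaper to maintain.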
![Page 26: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/26.jpg)
Deletion from dense index
Data blocks: [10, 20] [30, 40] [50, 60] [70, 80]
Dense index blocks: [10, 20, 30, 40] [50, 60, 70, 80]
![Page 27: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/27.jpg)
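Deletion from dense index
– delete record 30
Data blocks after deletion: [10, 20] [40] [50, 60] [70, 80] (record 40 moves to the front of its block)
Dense index after deletion: entry 30 is removed and entry 40 moves up into its place.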
![Page 28: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/28.jpg)
Insertion, sparse index case
Data blocks (two slots each; "_" marks a free slot): [10, 20] [30, _] [40, 50] [60, _]
Sparse index block: [10, 30, 40, 60]
![Page 29: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/29.jpg)
Insertion, sparse index case
– insert record 34
Data blocks after insertion: [10, 20] [30, 34] [40, 50] [60, _]
Sparse index is unchanged: [10, 30, 40, 60]
• Our lucky day! We have free space where we need it!
![Page 30: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/30.jpg)
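Insertion, sparse index case
– insert record 15
Data blocks after insertion: [10, 15] [20, 30] [40, 50] [60, _] (record 20 moves into the next block)
Sparse index after insertion: [10, 20, 40, 60] (entry 30 becomes 20)
• Immediate reorganization
• Other variations?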
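Immediate reorganization can be sketched as sliding the largest record of a full block into the next block. This is one possible policy, assumed for illustration; the name `insert_record`, the capacity of 2, and rebuilding the whole sparse index after each insert are simplifications.

```python
from bisect import bisect_right, insort

CAPACITY = 2  # records per block, as drawn on the slide


def insert_record(blocks, key):
    """Insert `key`, pushing overflow into following blocks; return the new sparse index."""
    first_keys = [blk[0] for blk in blocks]
    b = max(bisect_right(first_keys, key) - 1, 0)  # block where key belongs
    carry = key
    while True:
        if b == len(blocks):
            blocks.append([carry])    # ran off the end: start a new block
            break
        insort(blocks[b], carry)      # keep the block sorted
        if len(blocks[b]) <= CAPACITY:
            break
        carry = blocks[b].pop()       # block overflowed: move its largest record on
        b += 1
    return [blk[0] for blk in blocks]


blocks = [[10, 20], [30], [40, 50], [60]]
print(insert_record(blocks, 15))
print(blocks)
```

Inserting 15 displaces 20 into the second block, so that block's index entry changes from 30 to 20, as on the slide.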
![Page 31: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/31.jpg)
• Just illustrated:
– Immediate reorganization
• Now a variation:
– insert new block (chained file)
– otherwise leave data file
– update index only
![Page 32: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/32.jpg)
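Insertion, sparse index case
– insert record 25
Data blocks: [10, 20] [30, _] [40, 50] [60, _]
Sparse index block: [10, 30, 40, 60]
Record 25 goes into an overflow block; the existing data blocks and index entries stay as they are.
Overflow blocks (reorganize later...)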
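The overflow-block variation can be sketched as below. It assumes, as a simplification, that a record that does not fit in its primary block is appended to an overflow block chained from that primary block; the dictionary layout and the names `make_block`, `insert_with_overflow`, and `contains` are hypothetical.

```python
from bisect import bisect_right, insort

CAPACITY = 2  # records per block


def make_block(records):
    """A block holds its records plus a link to an optional overflow block."""
    return {"records": list(records), "overflow": None}


def insert_with_overflow(blocks, index, key):
    """Insert `key`; if the primary block is full, use its overflow chain."""
    b = max(bisect_right(index, key) - 1, 0)
    block = blocks[b]
    if len(block["records"]) < CAPACITY:
        insort(block["records"], key)
        index[b] = block["records"][0]
        return
    # Primary block is full: walk the chain to a block with free space.
    while len(block["records"]) >= CAPACITY:
        if block["overflow"] is None:
            block["overflow"] = make_block([])
        block = block["overflow"]
    block["records"].append(key)      # primary blocks and index are untouched


def contains(blocks, index, key):
    """Search the primary block and then its overflow chain."""
    block = blocks[max(bisect_right(index, key) - 1, 0)]
    while block is not None:
        if key in block["records"]:
            return True
        block = block["overflow"]
    return False


blocks = [make_block(r) for r in ([10, 20], [30], [40, 50], [60])]
index = [10, 30, 40, 60]
insert_with_overflow(blocks, index, 25)
print(blocks[0]["overflow"]["records"], index)
```

The insert is cheap, but lookups may now follow a chain, which is why the file is reorganized later.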
![Page 33: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/33.jpg)
Insertion, dense index case
• Similar
• Often more expensive . . .
![Page 34: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/34.jpg)
Next:
• Duplicate keys
• Deletion/Insertion
• Secondary indexes
![Page 35: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/35.jpg)
Secondary indexes
Data blocks (ordered on the sequence field, not on the indexed field): [30, 50] [20, 70] [80, 40] [100, 10] [90, 60]
Can I make a secondary index sparse?
![Page 36: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/36.jpg)
Secondary indexes
Data blocks (ordered on the sequence field): [30, 50] [20, 70] [80, 40] [100, 10] [90, 60]
• Sparse index
Index blocks: [30, 20, 80, 100] [90, ...]
![Page 37: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/37.jpg)
Secondary indexes
Data blocks (ordered on the sequence field): [30, 50] [20, 70] [80, 40] [100, 10] [90, 60]
• Sparse index?
Index blocks: [30, 20, 80, 100] [90, ...]
![Page 38: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/38.jpg)
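Secondary indexes
Data blocks (ordered on the sequence field): [30, 50] [20, 70] [80, 40] [100, 10] [90, 60]
• Sparse index
Index blocks: [30, 20, 80, 100] [90, ...]
Does not make sense! The file is not sorted on the indexed field, so the first key of a block says nothing about the other keys in that block.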
![Page 39: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/39.jpg)
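Secondary indexes
Data blocks (ordered on the sequence field): [30, 50] [20, 70] [80, 40] [100, 10] [90, 60]
• Must be dense index!
Dense index blocks: [10, 20, 30, 40] [50, 60, 70, ...]
Higher-level index block: [10, 50, 90, ...]
Is a sparse high level allowed?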
![Page 40: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/40.jpg)
Reminder: With secondary indexes:
• Lowest level is dense
• Other levels are sparse
Also: Pointers are record pointers (not block pointers, nor offsets)
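A secondary index with a dense lowest level and a sparse upper level can be sketched as follows. The data values come from the slides; the index block size of 4, the `(block_no, slot)` record pointer, and the name `lookup` are assumptions for this example.

```python
from bisect import bisect_right

# Data blocks ordered on the sequence field; values shown are the indexed field.
DATA_BLOCKS = [[30, 50], [20, 70], [80, 40], [100, 10], [90, 60]]
ENTRIES_PER_INDEX_BLOCK = 4

# Lowest level: dense, sorted by key, one record pointer per record.
entries = sorted((key, (b, s))
                 for b, block in enumerate(DATA_BLOCKS)
                 for s, key in enumerate(block))
level1 = [entries[i:i + ENTRIES_PER_INDEX_BLOCK]
          for i in range(0, len(entries), ENTRIES_PER_INDEX_BLOCK)]

# Upper level: sparse, first key of each lowest-level index block.
level2 = [index_block[0][0] for index_block in level1]


def lookup(key):
    """Return the (block_no, slot) record pointer for `key`, or None."""
    i = bisect_right(level2, key) - 1
    if i < 0:
        return None
    for k, pointer in level1[i]:
        if k == key:
            return pointer
    return None


print(level2)
print(lookup(40))
```

The upper level can be sparse because the lowest index level, unlike the data file, is sorted on the indexed field.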
![Page 41: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/41.jpg)
Duplicate values & secondary indexes
Data blocks: [20, 10] [20, 40] [10, 40] [10, 40] [30, 40]
![Page 42: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/42.jpg)
Duplicate values & secondary indexes
Data blocks: [20, 10] [20, 40] [10, 40] [10, 40] [30, 40]
One option: one index entry per record.
Index blocks: [10, 10, 10, 20] [20, 30, 40, 40] [40, 40, ...]
Problem: excess overhead!
• disk space
• search time
![Page 43: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/43.jpg)
Duplicate values & secondary indexes
Data blocks: [20, 10] [20, 40] [10, 40] [10, 40] [30, 40]
Another option: one index entry per distinct key (10, 20, 30, 40), each with pointers to all of its records.
Problem: variable-size records in index!
![Page 44: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/44.jpg)
Duplicate values & secondary indexes
Data blocks: [20, 10] [20, 40] [10, 40] [10, 40] [30, 40]
Index blocks: [10, 20, 30, 40] [50, 60, ...]
Another idea: Chain records with the same key!
Problems:
• Need to add fields to data records for each index
• Need to follow the chain to find all the records
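The chaining idea can be sketched with record ids standing in for record pointers. Everything here is illustrative: records are numbered 0..9 in file order, `next_same` plays the role of the extra field added to each data record, and the function names are hypothetical.

```python
# Indexed-field values in file order (data blocks from the slide, flattened).
VALUES = [20, 10, 20, 40, 10, 40, 10, 40, 30, 40]


def build_chained_index(values):
    """Index holds one pointer per distinct key; records link to the next duplicate."""
    index = {}                          # key -> record id of the first occurrence
    next_same = [None] * len(values)    # extra per-record field: next record with same key
    last_seen = {}
    for rid, key in enumerate(values):
        if key in last_seen:
            next_same[last_seen[key]] = rid
        else:
            index[key] = rid
        last_seen[key] = rid
    return index, next_same


def find_all(index, next_same, key):
    """Follow the chain from the index entry to collect every matching record id."""
    rids = []
    rid = index.get(key)
    while rid is not None:
        rids.append(rid)
        rid = next_same[rid]
    return rids


index, next_same = build_chained_index(VALUES)
print(find_all(index, next_same, 40))
```

The index stays fixed-size per key, but each additional index on the file would need its own `next_same` field, and finding all matches means visiting every record on the chain.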
![Page 45: CS4432: Database Systems II](https://reader036.vdocument.in/reader036/viewer/2022081519/56812b24550346895d8f1ff2/html5/thumbnails/45.jpg)
Summary: Indexing Basics
– Basic Ideas: sparse, dense, multi-level…
– Duplicate Keys
– Deletion/Insertion
– Secondary Indexes