C20.0046: Database Management Systems Lecture #26


Page 1: C20.0046: Database Management Systems Lecture #26

M.P. Johnson, DBMS, Stern/NYU, Sp2004

C20.0046: Database Management Systems, Lecture #26

Matthew P. Johnson

Stern School of Business, NYU

Spring, 2004

Page 2: C20.0046: Database Management Systems Lecture #26

Agenda
Previously: Indices
Next:
• Finish indices, advanced indices
• Failure/recovery
• Data warehousing & mining
• Websearch

Hw3 due today; no extensions!

1-minute responses

Review: clustered, dense, primary, #/tbl, syntax

Page 3: C20.0046: Database Management Systems Lecture #26

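[Figure: DBMS architecture, top to bottom]
User/Application
  sends queries/updates to the Query compiler/optimizer
  sends transaction commands to the Transaction manager
Query compiler/optimizer
  passes the query execution plan to the Execution engine
Execution engine
  issues record and index requests to the Index/record mgr.
Index/record mgr.
  issues page commands to the Buffer manager
Buffer manager
  reads/writes pages through the Storage manager
Storage manager
  storage
Transaction manager:
• Concurrency control
• Logging/recovery

Let’s get physical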

Page 4: C20.0046: Database Management Systems Lecture #26

BSTs
Very simple data structure in CS: BSTs
• Binary Search Trees
• Keep balanced
• Each node ~ one item

Each node has (up to) two children:
• Left subtree: <
• Right subtree: >=

Can search, insert, delete in log time
• log2(1M) = log2(2^20) = 20

Page 5: C20.0046: Database Management Systems Lecture #26

Search for DBMS
Big improvement: log2(1M) = 20
• Each op divides the remaining range in half!

But recall: all that matters is the number of disk accesses
• 20 is better than 2^20, but:

Can we do better?

Page 6: C20.0046: Database Management Systems Lecture #26

BSTs → B-trees
Like BSTs, except each node ~ one block
Branching factor is >> 2
• Each access divides the remaining range by, say, 300

B-trees = BSTs + blocks
B+ trees are a variant of B-trees
• Data stored only in leaves
• Leaves form a (sorted) linked list
• Better supports range queries

Consequences:
• Much shorter depth
• Many fewer disk reads
• Must find the element within a node
• Trades CPU/RAM time for disk time
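As a rough sketch of why a large branching factor gives a much shorter tree, the Python snippet below counts how many levels a balanced tree needs to tell apart about a million keys, first with branching factor 2 and then with 300. The function name `levels_needed` is made up for illustration, and 300 is just the slide's "say, 300" figure, not a property of any particular DBMS.

```python
def levels_needed(num_keys: int, branching: int) -> int:
    """Smallest depth d such that branching**d >= num_keys."""
    depth, reachable = 0, 1
    while reachable < num_keys:
        reachable *= branching  # one more level multiplies the reachable keys
        depth += 1
    return depth


if __name__ == "__main__":
    n = 2 ** 20  # about one million keys
    print("binary tree levels:   ", levels_needed(n, 2))
    print("branching-300 levels: ", levels_needed(n, 300))
```

If each level costs one disk read, the second number is the one that matters.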

Page 7: C20.0046: Database Management Systems Lecture #26

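B+ Trees
Parameter n
• Branching factor is n+1
• n is the largest number such that one block can contain n search-key values and n+1 pointers
• Each node (except the root) has at least n/2 keys

[Figure: an internal node with keys 30, 120, 240; its four pointers lead to]
  Keys k < 30 | Keys 30 <= k < 120 | Keys 120 <= k < 240 | Keys 240 <= k

[Figure: a leaf with keys 40, 50, 60; each key points to its record (40, 50, 60), and the last pointer leads to the next leaf]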

Page 8: C20.0046: Database Management Systems Lecture #26

Searching a B+ Tree
Exact key values:
• Start at the root
• If we’re in a leaf, walk through its key values
• If not, look at keys K1..Kn
  • If Ki <= K < Ki+1, follow the pointer between Ki and Ki+1
  • If K < K1, follow the first pointer; if K >= Kn, follow the last

    Select name
    From people
    Where age = 25

Range queries:
• As above, searching for the lower bound
• Then walk right along the leaves until the test fails

    Select name
    From people
    Where 20 <= age and age <= 30
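The two kinds of lookup can be sketched in a few lines of Python. This is a minimal illustration, not a full B+ tree: there is no insertion or node splitting, the class and function names (`Leaf`, `Inner`, `find_leaf`, `search`, `range_search`) are invented here, and the two-level example tree is a hypothetical one. The range search assumes leaves are chained by next-leaf pointers, so it descends to the leaf for the lower bound and then walks right.

```python
from bisect import bisect_left, bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys      # sorted search-key values
        self.next = None      # pointer to the next leaf


class Inner:
    def __init__(self, keys, children):
        self.keys = keys          # n sorted keys
        self.children = children  # n+1 child pointers


def find_leaf(node, key):
    """Descend from the root to the leaf that could hold `key`."""
    while isinstance(node, Inner):
        # bisect_right picks child i when keys[i-1] <= key < keys[i]
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Exact-match lookup: True if `key` is present."""
    leaf = find_leaf(root, key)
    i = bisect_left(leaf.keys, key)
    return i < len(leaf.keys) and leaf.keys[i] == key


def range_search(root, lo, hi):
    """All keys k with lo <= k <= hi, found by walking the leaf list."""
    leaf = find_leaf(root, lo)
    found = []
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return found      # test failed: stop walking
            if k >= lo:
                found.append(k)
        leaf = leaf.next
    return found


def build_example_tree():
    """A small hypothetical tree: one root over four chained leaves."""
    leaves = [Leaf([10, 15, 18]), Leaf([20, 30, 40, 50]),
              Leaf([60, 65]), Leaf([80, 85, 90])]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return Inner([20, 60, 80], leaves)


if __name__ == "__main__":
    root = build_example_tree()
    print(search(root, 40))
    print(range_search(root, 18, 60))
```

Only the descent costs one block read per level; the walk along the leaves costs one read per leaf touched.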

Page 9: C20.0046: Database Management Systems Lecture #26

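B+ Tree Example
n = 4
Find the key 40

[Figure: B+ tree]
Root: 80
Internal nodes: 20 60 | 100 120 140
Leaf keys: 10 15 18 20 30 40 50 60 65 80 85 90
Data records: 10 15 18 20 30 40 50 60 65 80 85 90

Search path for 40:
• At the root: 40 <= 80
• At the internal node: 20 < 40 <= 60
• At the leaf: 30 < 40 <= 40

NB: Leaf keys are sorted; the data they point to is sorted only if the index is clustered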

Page 10: C20.0046: Database Management Systems Lecture #26

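Clustered & unclustered B-trees

[Figure: two index layouts side by side. In each, the data entries in the index file point to data records in the data file. CLUSTERED: the data records are stored in the same order as the data entries. UNCLUSTERED: the data records are stored in an unrelated order.]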

Page 11: C20.0046: Database Management Systems Lecture #26

B+ trees, and, or
Assume index on a,b,c
• Intuition: phone book

WHERE a = 'x' and b = 'y'
WHERE b = 'y' and c = 'z'
WHERE a = 'a' and c = 'z'
WHERE a = 'x' or b = 'y' or c = 'z'
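The phone-book intuition can be made concrete with a small Python sketch. It models only ANDed equality predicates against a single composite index; the helper name `usable_prefix` is invented for illustration, and real optimizers have further strategies that this ignores.

```python
def usable_prefix(index_cols, equality_cols):
    """Leading index columns that all have an equality predicate.

    An index on (a, b, c) is sorted by a, then b, then c, so a search can
    only narrow the range while the columns form an unbroken prefix.
    """
    prefix = []
    for col in index_cols:
        if col not in equality_cols:
            break
        prefix.append(col)
    return prefix


if __name__ == "__main__":
    index = ["a", "b", "c"]
    print(usable_prefix(index, {"a", "b"}))  # a and b
    print(usable_prefix(index, {"b", "c"}))  # nothing usable
    print(usable_prefix(index, {"a", "c"}))  # only a
```

The OR query has no such prefix at all: a row may qualify through b or c alone, so a single lookup on (a, b, c) cannot find every match.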

Page 12: C20.0046: Database Management Systems Lecture #26

B+ trees and LIKE
Supports only hard-coded prefix LIKE checks
• Intuition: phone book

Select * from T where a like 'xyz%'
Select * from T where a like '%xyz'
Select * from T where a like 'xyz%zyx%'
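One way to see why a prefix pattern works is that `like 'xyz%'` is equivalent to a range condition on the sorted keys. The Python sketch below uses a sorted list in place of the index leaves; `prefix_matches` and the sample keys are hypothetical, and the sketch assumes a non-empty prefix whose last character is not the largest possible code point.

```python
from bisect import bisect_left


def prefix_matches(sorted_keys, prefix):
    """Keys starting with `prefix`, found as one contiguous sorted range."""
    # Smallest string that sorts after every string beginning with prefix:
    # bump the last character, e.g. 'xyz' -> 'xy{'.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect_left(sorted_keys, prefix)
    hi = bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]


if __name__ == "__main__":
    keys = sorted(["abcxyz", "xya", "xyz", "xyzabc", "xyzzy", "xz"])
    print(prefix_matches(keys, "xyz"))
```

A pattern such as `'%xyz'` gives no such range: matching keys like `abcxyz` can sit anywhere in the sort order, so the whole index would have to be scanned.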

Page 13: C20.0046: Database Management Systems Lecture #26

B-tree search efficiency
With params:
• block = 4 KB
• integer = 4 bytes, pointer = 8 bytes

The largest n satisfying 4n + 8(n+1) <= 4096 is n = 340
• Each node has 170..340 keys
• Assume on avg it has (170+340)/2 = 255

Then:
• 255 rows: depth = 1
• 255^2 ~ 64k rows: depth = 2
• 255^3 ~ 16M rows: depth = 3
• 255^4 ~ 4G rows: depth = 4
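The arithmetic on this slide can be checked with a short Python sketch. The function name `max_keys_per_block` is invented here; the sizes are the slide's own parameters.

```python
def max_keys_per_block(block_bytes, key_bytes, pointer_bytes):
    """Largest n with n*key_bytes + (n+1)*pointer_bytes <= block_bytes."""
    return (block_bytes - pointer_bytes) // (key_bytes + pointer_bytes)


if __name__ == "__main__":
    n = max_keys_per_block(4096, 4, 8)
    average = (n // 2 + n) // 2  # nodes are between half full and full
    print("n =", n, "average keys per node =", average)
    for depth in range(1, 5):
        print("depth", depth, "covers about", average ** depth, "rows")
```

The exact powers are 65,025, 16,581,375 and 4,228,250,625, which the slide rounds to 64k, 16M and 4G.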

Page 14: C20.0046: Database Management Systems Lecture #26

B-trees in practice
Most DBMSs use B-trees for most indices
• Default in MySQL
• Default in Oracle

Speeds up
• where clauses
• Some like checks
• Min or max functions
• joins

Limitation: fields used must
• Be a prefix of the indexed fields
• Be ANDed together

Page 15: C20.0046: Database Management Systems Lecture #26

Next topic: Advanced types of indices
Spatial indices based on R-trees (R = region)
• Support multi-dimensional searches on “geometry” fields
• 2-d not 1-d ranges

Oracle:

    CREATE INDEX geology_rtree_idx ON geology_tab(geometry)
    INDEXTYPE IS MDSYS.SPATIAL_INDEX;

MySQL:

    CREATE TABLE geom (g GEOMETRY NOT NULL, SPATIAL INDEX(g));

Page 16: C20.0046: Database Management Systems Lecture #26

Advanced types of indices
Inverted indices for web doc search

First, think of each webpage as a tuple
• One column for every possible word
• True means the word appears on the page
• Index on all columns

Now can search: you’re fired

    select * from T where youre=T and fired=T

Page 17: C20.0046: Database Management Systems Lecture #26

Advanced types of indices
Can simplify somewhat:
1. For each field index, delete False entries
2. True entries for each index become a bucket

Create “inverted index”:
• One entry for each search word
• Search word entry points to corresponding bucket
• Bucket points to pages with its word

Amazon
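A toy version of the inverted index can be written directly from this description. In the Python sketch below, each search word maps to a bucket (a set) of page ids, and a multi-word query intersects the buckets. The page ids, page texts and function names are all made up for illustration; a real search engine would also handle punctuation, ranking and much more.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each word to the bucket (set) of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search_all(index, words):
    """Pages containing every query word: intersect the buckets."""
    buckets = [index.get(word.lower(), set()) for word in words]
    if not buckets:
        return set()
    return set.intersection(*buckets)


if __name__ == "__main__":
    pages = {
        "p1": "youre fired",
        "p2": "youre hired",
        "p3": "fired up",
    }
    index = build_inverted_index(pages)
    print(search_all(index, ["youre", "fired"]))
```

Only True entries are stored, which is what keeps the structure small compared with one boolean column per possible word.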

Page 18: C20.0046: Database Management Systems Lecture #26

Advanced types of indices
Function-based indices
• Speeds up WHERE upper(name)='BUSH', etc.
• Now supported in Oracle 8, not MySQL

    create index on T(my_soundex(name));
    create index on T(substr(DOB,4,5));

Bitmap indices
• Speeds up arbitrary combination of reqs
• Not limited to prefixes or conjunctions
• Now supported in Oracle 9, not MySQL

Page 19: C20.0046: Database Management Systems Lecture #26

Bitmap indices
Assume table has n records
Assume F is a field with m different values
Bitmap index on F: m length-n bitstrings
• One bitstring for each value of F
• Each one says which rows have that value for F

Example: n = ?, mF = ?, mG = ?

Row  F   G
1    30  Foo
2    30  Bar
3    40  Baz
4    50  Foo
5    40  Bar
6    30  Baz

Q: find rows where F=50 or (F=30 and G='Baz')
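The six-row example can be worked through in Python as a sketch. Each bitmap is stored as an integer whose bit i stands for row i+1, so the least-significant bit is row 1 (the reverse of reading a bitstring left to right). The function names `build_bitmaps` and `rows_of` are invented for illustration.

```python
def build_bitmaps(values):
    """One bitmap per distinct value; bit i is set if row i+1 has that value."""
    bitmaps = {}
    for row, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def rows_of(bitmap):
    """Row numbers (1-based) whose bit is set."""
    return [row + 1 for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


F = [30, 30, 40, 50, 40, 30]
G = ["Foo", "Bar", "Baz", "Foo", "Bar", "Baz"]

if __name__ == "__main__":
    f_index = build_bitmaps(F)
    g_index = build_bitmaps(G)
    # F=50 or (F=30 and G='Baz'), evaluated purely with bitwise operations
    answer = f_index[50] | (f_index[30] & g_index["Baz"])
    print(rows_of(answer))
```

The query mixes OR and AND across two fields, which is exactly the kind of combination a B-tree on (F, G) cannot serve with one lookup.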

Page 20: C20.0046: Database Management Systems Lecture #26

Bitmap index search
Larger example: (age, salary) of jewelry buyers:

Row  Age  Sal.
1    25   60
2    45   60
3    50   75
4    50   100
5    50   120
6    70   110
7    85   140
8    30   260
9    25   400
10   45   350
11   50   275
12   60   260

Bitmaps for age:
  25: 100000001000
  30: 000000010000
  45: 010000000100
  50: 001110000010
  60: 000000000001
  70: 000001000000
  85: 000000100000

Bitmaps for salary:
  60:  110000000000
  75:  001000000000
  100: 000100000000
  110: 000001000000
  120: 000010000000
  140: 000000100000
  260: 000000010001
  275: 000000000010
  350: 000000000100
  400: 000000001000

Page 21: C20.0046: Database Management Systems Lecture #26

Bitmap index search
Query: find buyers of age 45-55 with salary 100-200

Age range: 010000000100 (45) | 001110000010 (50) = 011110000110
Salary range: bitwise OR of the bitmaps for 100, 110, 120, 140 = 000111100000
AND together: 011110000110 & 000111100000 = 000110000000

What does this mean?
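The same steps can be replayed in Python on the bitstrings themselves. This is only a sketch: the helper names are invented, and the dictionaries hold just the age and salary bitmaps that fall inside the two query ranges.

```python
AGE = {45: "010000000100", 50: "001110000010"}
SALARY = {100: "000100000000", 110: "000001000000",
          120: "000010000000", 140: "000000100000"}


def bitwise_or(bitstrings):
    """OR together equal-length bitstrings, keeping leading zeros."""
    width = len(bitstrings[0])
    combined = 0
    for bits in bitstrings:
        combined |= int(bits, 2)
    return format(combined, f"0{width}b")


def bitwise_and(left, right):
    """AND two equal-length bitstrings, keeping leading zeros."""
    return format(int(left, 2) & int(right, 2), f"0{len(left)}b")


def matching_rows(bitstring):
    """Row numbers (1-based, left to right) whose bit is 1."""
    return [i + 1 for i, bit in enumerate(bitstring) if bit == "1"]


if __name__ == "__main__":
    age_range = bitwise_or(list(AGE.values()))
    salary_range = bitwise_or(list(SALARY.values()))
    both = bitwise_and(age_range, salary_range)
    print(age_range, salary_range, both, matching_rows(both))
```

The 1 bits in the final bitstring are the row numbers of the buyers that satisfy both range conditions.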

Page 22: C20.0046: Database Management Systems Lecture #26

Bitmap index search
Once we have row numbers, then what?
• Get rows with those numbers (How?)

Bitmap indices in Oracle:

    CREATE BITMAP INDEX ON T(F,G);

Best for low-cardinality fields
• Boolean, enum, gender

Lots of 0s in our bitmaps
• Compress: 000000100001 → 6141
• “run-length encoding”
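A minimal run-length encoder in Python shows where 6141 comes from, reading it as the lengths of alternating runs: six 0s, one 1, four 0s, one 1. That reading is an interpretation of the slide, the function names are invented here, and production systems use more compact encodings than a plain list of run lengths.

```python
from itertools import groupby


def run_lengths(bits):
    """Lengths of alternating runs, always starting with a run of 0s."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    if bits and bits[0] == "1":
        runs.insert(0, 0)  # an empty leading run of 0s keeps the alternation
    return runs


def expand(runs):
    """Inverse of run_lengths: rebuild the bitstring."""
    pieces, bit = [], "0"
    for length in runs:
        pieces.append(bit * length)
        bit = "1" if bit == "0" else "0"
    return "".join(pieces)


if __name__ == "__main__":
    encoded = run_lengths("000000100001")
    print(encoded, expand(encoded))
```

The sparser the bitmap, the longer the runs of 0s and the better this compresses.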

Page 23: C20.0046: Database Management Systems Lecture #26

New topic: Recovery

Type of Crash                    Prevention
Wrong data entry                 Constraints and data cleaning
Disk crashes                     Redundancy: e.g. RAID, archive
Fire, theft, bankruptcy…         Buy insurance, change jobs…
System failures: e.g. blackout   DATABASE RECOVERY

Page 24: C20.0046: Database Management Systems Lecture #26

System Failures
Each transaction has internal state
When system crashes, internal state is lost
• Don’t know which parts executed and which didn’t

Remedy: use a log
• A file that records each action of each xact
• Trail of breadcrumbs

Page 25: C20.0046: Database Management Systems Lecture #26

Media Failures
Rule of thumb: Pr(hard drive has head crash within 10 years) = 50%
Simpler rule of thumb: Pr(hard drive has head crash within 1 year) = 10%
Serious problem

Soln: different RAID strategies
RAID: Redundant Arrays of Independent Disks

Page 26: C20.0046: Database Management Systems Lecture #26

RAID levels
RAID level 1: each disk gets a mirror
RAID level 4: one disk is the xor of all the others
• Each bit is the sum mod 2 of the corresponding bits

E.g.:
  Disk 1: 11110000
  Disk 2: 10101010
  Disk 3: 00111000
  Disk 4: ?

How to recover?
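The parity idea can be tried out in Python on the three data blocks from the slide. This is a sketch with invented function names, treating each disk as a single 8-bit block: the parity block is the xor of the data blocks, and any one lost block is the xor of all the blocks that survive.

```python
from functools import reduce


def parity(blocks):
    """Bitwise xor (sum mod 2) of equal-length bitstrings."""
    width = len(blocks[0])
    value = reduce(lambda acc, block: acc ^ int(block, 2), blocks, 0)
    return format(value, f"0{width}b")


def recover(surviving_blocks):
    """Rebuild the one missing block from all the surviving ones."""
    return parity(surviving_blocks)


if __name__ == "__main__":
    disk1, disk2, disk3 = "11110000", "10101010", "00111000"
    disk4 = parity([disk1, disk2, disk3])
    print("parity disk:", disk4)
    # Pretend disk 2 crashed and rebuild it from the other three.
    print("recovered disk 2:", recover([disk1, disk3, disk4]))
```

Recovery works because xoring a block in twice cancels it out, so what remains is exactly the missing block.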

Page 27: C20.0046: Database Management Systems Lecture #26

Transactions
Transaction: unit of code to be executed atomically

In ad-hoc SQL
• one command = one transaction

In embedded SQL
• Transaction starts = first SQL command issued
• Transaction ends =
  • COMMIT
  • ROLLBACK (=abort)

Can turn off/on autocommit

Page 28: C20.0046: Database Management Systems Lecture #26

Primitive operations of transactions
Each xact reads/writes rows or blocks: elements
• INPUT(X): read element X into a memory buffer
• READ(X,t): copy element X to transaction-local variable t
• WRITE(X,t): copy transaction-local variable t to element X
• OUTPUT(X): write element X to disk

LOG RECORD

Page 29: C20.0046: Database Management Systems Lecture #26

Transaction example
Xact: Transfer $100 from savings to checking
• A = A+100; B = B-100;

READ(A,t);
t := t+100;
WRITE(A,t);
READ(B,t);
t := t-100;
WRITE(B,t)

Page 30: C20.0046: Database Management Systems Lecture #26

Transaction example
READ(A,t); t := t+100; WRITE(A,t); READ(B,t); t := t-100; WRITE(B,t)

Action      t     Mem A  Mem B  Disk A  Disk B
INPUT(A)          1000          1000    1000
READ(A,t)   1000  1000          1000    1000
t:=t+100    1100  1000          1000    1000
WRITE(A,t)  1100  1100          1000    1000
INPUT(B)    1100  1100   1000   1000    1000
READ(B,t)   1000  1100   1000   1000    1000
t:=t-100    900   1100   1000   1000    1000
WRITE(B,t)  900   1100   900    1000    1000
OUTPUT(A)   900   1100   900    1100    1000
OUTPUT(B)   900   1100   900    1100    900
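The trace can be reproduced with a small Python simulation of the four primitives. The class and method names below are invented for illustration, and the model is deliberately simple: one dictionary for the disk, one for the memory buffer, and one for the transaction's local variables.

```python
class Simulator:
    """Toy model of disk, memory buffer and transaction-local variables."""

    def __init__(self, disk):
        self.disk = dict(disk)
        self.buffer = {}
        self.local = {}

    def input(self, x):
        """INPUT(X): read element X from disk into the memory buffer."""
        self.buffer[x] = self.disk[x]

    def read(self, x, t):
        """READ(X,t): copy element X to local variable t (INPUT first if needed)."""
        if x not in self.buffer:
            self.input(x)
        self.local[t] = self.buffer[x]

    def write(self, x, t):
        """WRITE(X,t): copy local variable t to element X in the buffer."""
        self.buffer[x] = self.local[t]

    def output(self, x):
        """OUTPUT(X): write element X from the buffer to disk."""
        self.disk[x] = self.buffer[x]


def run_transfer(sim):
    """The slide's transaction, up to but not including the OUTPUTs."""
    sim.read("A", "t")
    sim.local["t"] += 100
    sim.write("A", "t")
    sim.read("B", "t")
    sim.local["t"] -= 100
    sim.write("B", "t")


if __name__ == "__main__":
    sim = Simulator({"A": 1000, "B": 1000})
    run_transfer(sim)
    print("before OUTPUT:", sim.buffer, sim.disk)
    sim.output("A")
    print("after OUTPUT(A):", sim.disk)  # a crash here leaves the disk inconsistent
    sim.output("B")
    print("after OUTPUT(B):", sim.disk)
```

The state between the two OUTPUT steps is the dangerous one: the disk holds the new A but the old B.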

Page 31: C20.0046: Database Management Systems Lecture #26

The log
An append-only file containing log records
Note: multiple transactions run concurrently, so log records are interleaved
After a system crash, use the log to:
• Redo some transactions that did commit
• Undo other transactions that didn’t commit

Three kinds of logs: undo, redo, undo/redo
We’ll discuss only undo

Page 32: C20.0046: Database Management Systems Lecture #26

Undo Logging
Log records
• <START T>: transaction T has begun
• <COMMIT T>: T has committed
• <ABORT T>: T has aborted
• <T,X,v>: T has updated element X, and its old value was v

Page 33: C20.0046: Database Management Systems Lecture #26

Undo-Logging Rules
U1: Changes are logged (<T,X,v>) before being written to disk
U2: Commits are logged (<COMMIT T>) after the changes are written to disk

Results:
• May forget that we did a whole xact (and so wrongly undo it)
• Will never forget that we did a partial xact (and so leave it half-done)

Order: log-change, change, log-change, change, commit, log-commit

Page 34: C20.0046: Database Management Systems Lecture #26

Undo-Logging e.g. (inputs omitted)

Action      t     Mem A  Mem B  Disk A  Disk B  Log
                                                <START T>
READ(A,t)   1000  1000          1000    1000
t:=t+100    1100  1000          1000    1000
WRITE(A,t)  1100  1100          1000    1000    <T,A,1000>
READ(B,t)   1000  1100   1000   1000    1000
t:=t-100    900   1100   1000   1000    1000
WRITE(B,t)  900   1100   900    1000    1000    <T,B,1000>
OUTPUT(A)   900   1100   900    1100    1000
OUTPUT(B)   900   1100   900    1100    900
COMMIT
                                                <COMMIT T>

Page 35: C20.0046: Database Management Systems Lecture #26

Recovery with Undo Log
After a system crash, run the recovery manager
1. Decide for each xact T whether it was completed

    <START T>….<COMMIT T>   yes
    <START T>….<ABORT T>    yes
    <START T>……………………      no

2. Undo all modifications from incomplete xacts, in reverse order (why?), and abort each

Page 36: C20.0046: Database Management Systems Lecture #26

Recovery with Undo Log
Read log from the end; cases:
• <COMMIT T>: mark T as completed
• <ABORT T>: mark T as completed
• <T,X,v>:
    if T is not completed then
      write X=v to disk
    else
      ignore
• <START T>: ignore
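The backward scan can be sketched in Python. The tuple layout of the log records, the function name `undo_recover` and the sample log are all assumptions made for illustration; checkpoints and the writing of new abort records are left out.

```python
def undo_recover(log, disk):
    """Scan the log from the end, restoring old values of incomplete xacts.

    Records are tuples: ("START", T), ("COMMIT", T), ("ABORT", T),
    or ("UPDATE", T, X, old_value). Returns the set of undone xacts.
    """
    completed = set()
    undone = set()
    for record in reversed(log):
        kind = record[0]
        if kind in ("COMMIT", "ABORT"):
            completed.add(record[1])
        elif kind == "UPDATE":
            _, txn, element, old_value = record
            if txn not in completed:
                disk[element] = old_value  # write X=v to disk
                undone.add(txn)
        # ("START", T) records are ignored
    return undone


if __name__ == "__main__":
    # Hypothetical crash state: T2 committed, T1 was cut off half-way.
    log = [
        ("START", "T1"),
        ("UPDATE", "T1", "A", 1000),
        ("START", "T2"),
        ("UPDATE", "T2", "C", 5),
        ("COMMIT", "T2"),
        ("UPDATE", "T1", "B", 1000),
    ]
    disk = {"A": 1100, "B": 1000, "C": 7}
    print(undo_recover(log, disk), disk)
```

Scanning from the end matters when one xact updated the same element more than once: the oldest old value is written last and therefore wins.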

Page 37: C20.0046: Database Management Systems Lecture #26

Recovery with Undo Log

Start:
  …
  …
  <T2,X2,v2>
  …
  …
  <START T5>
  <START T4>
  <T1,X1,v1>
  <T5,X5,v5>
  <T4,X4,v4>
  <COMMIT T5>
  <T3,X3,v3>
  <T2,X2,v2>
Crash!

Q: Which updates are undone?

Page 38: C20.0046: Database Management Systems Lecture #26

Recovery with Undo Log
Note: undo commands are idempotent
• No harm done if we repeat them
• Q: What if system crashes during recovery?

How far back in the log do we go?
• Don’t go all the way back to the start
• May be very large

Better idea: use checkpointing

Page 39: C20.0046: Database Management Systems Lecture #26

Checkpointing
Checkpoint the database periodically
1. Stop accepting new transactions
2. Wait until all current xacts complete
3. Flush log to disk
4. Write a <CKPT> log record, flush log
5. Resume accepting new xacts

Page 40: C20.0046: Database Management Systems Lecture #26

Undo Recovery with Checkpointing

Before <CKPT>: other xacts (all completed)
  …
  …
  <T1,X1,v1>
  …
  …
  <CKPT>
After <CKPT>: xacts T2, T3, T4, T5
  <START T2>
  <START T3>
  <START T5>
  <START T4>
  <T4,X4,v4>
  <T5,X5,v5>
  <T4,X4,v4>
  <COMMIT T5>
  <T3,X3,v3>
  <T2,X2,v2>

During recovery, can stop at first <CKPT>

Page 41: C20.0046: Database Management Systems Lecture #26

Non-quiescent Checkpointing
Problem: database must freeze during checkpoint
Would like to checkpoint while database is operational
Idea: non-quiescent checkpointing
• Quiescent: quiet, still, at rest; inactive

Page 42: C20.0046: Database Management Systems Lecture #26

Next time
Next: Data warehousing, mining!
For next time: reading online

Proj5 due next Thursday; no extensions!

Now: one-minute responses
• Relative weight: warehousing, mining, websearch
• Data mining techniques
  • NNs
  • GAs
  • kNN
  • Decision Trees