Storage and Indexes Introduction to Databases Computer Science 557 Instructor: Joe Bockhorst University of Wisconsin - Milwaukee


Post on 04-Apr-2022


Storage and Indexes

Introduction to Databases
Computer Science 557

Instructor: Joe Bockhorst
University of Wisconsin - Milwaukee

Announcements

• Any problems logging in to course accounts?

• Use grid3, grid5 or weise for course work
  – Send bugs on weise to [email protected]

• Reading Assignment: Chapter 14 in the textbook

• Program 1 assigned today, due next Friday

Hard Disks

• A DBMS typically stores information on “hard” disks

• I/O operations (“Read” and “Write”) are costly
  – Should be planned carefully

• A block is the amount of data transferred in one I/O operation (example block size: about 1024 bytes)

[Figure: blocks move between Disk and Main Memory (RAM) via Read and Write operations]

Our First Equation

$$\text{I/O\_Cost} = s + r + t$$

where s = seek time, r = rotational delay, t = transfer time

Anatomy of a Hard Disk

Hard Drive Glossary

• Disks are divided into concentric circular tracks on each disk surface.
  – Track capacities typically vary from ~4 to 50 Kbytes

• The division of a track into sectors is hard-coded on the disk surface and cannot be changed

• A block is an integer number of sectors
  – The block size B is fixed for each system.

• Typical block sizes range from B=512 bytes to B=4096 bytes.
  – Whole blocks are transferred between disk and main memory

Typical Disk Parameters

(Courtesy of Seagate Technology)

Accessing a Disk Block

• To read or write a block, give the block address to the disk controller
  – Hardware block address: cylinder #, surface #, block #
  – Logical block addressing allows higher levels to refer to a hardware block address using a block_id

• Seek time: move the head to the correct cylinder (~10 ms)

• Rotation time: rotate the start of the block under the head
  – At 15,000 rpm one full rotation takes 4 ms, so the average rotational delay is about 2 ms

• Transfer time: transfer the entire block

• Accessing consecutive blocks pays the seek and rotation time only once

• Compare to typical main memory access times, which are measured in microseconds (10^-6 s) or nanoseconds (10^-9 s)
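The difference between random and consecutive block access can be sketched with a few lines of Python. This is only a back-of-the-envelope model: the 10 ms seek time and the 15,000 rpm figure come from the slide, while the 0.1 ms per-block transfer time and the function names are assumptions made for illustration.

```python
def avg_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay: half of one full revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def random_read_cost_ms(n_blocks: int, seek_ms: float, rot_ms: float,
                        transfer_ms: float) -> float:
    """Blocks scattered over the disk: every block pays seek + rotation + transfer."""
    return n_blocks * (seek_ms + rot_ms + transfer_ms)


def sequential_read_cost_ms(n_blocks: int, seek_ms: float, rot_ms: float,
                            transfer_ms: float) -> float:
    """Consecutive blocks: seek and rotation are paid once, then only transfers."""
    return seek_ms + rot_ms + n_blocks * transfer_ms


if __name__ == "__main__":
    seek = 10.0                             # ms, from the slide
    rot = avg_rotational_delay_ms(15_000)   # half of a 4 ms revolution
    transfer = 0.1                          # ms per block, ASSUMED for illustration
    n = 1000
    print("random     :", random_read_cost_ms(n, seek, rot, transfer), "ms")
    print("sequential :", sequential_read_cost_ms(n, seek, rot, transfer), "ms")
```

Under these assumed numbers the random layout costs roughly a hundred times more than the consecutive layout, which is why a DBMS tries to place related blocks next to each other.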

Coming soon: Solid State “Disks”?

+ Random access devices eliminate seek times

+ faster startup

- $$$$ much more expensive than hard disks ($8 / GB vs $0.25 / GB)

- Capacities are smaller

We will assume hard disk storage in this course

Why not store DB in main memory?

• $$$$
  – cost of RAM is > 100× hard drive cost

• main memory is volatile

• 32-bit addressing limits how much memory can be addressed (at most 4 GB)

Managing the Hard Disk

Query Optimization

Relational Operators

Files and Access Methods

Buffer Management

Disk Space Management

The disk space manager (DSM) provides an abstraction of the block as a unit of data

The DSM interface includes commands to read and write blocks

I/O requests

Operations Supported by DSM

• allocate_blocks(num_blocks)
  – Add blocks to DB

• deallocate_block(blockID)
  – Remove block from DB

• write_block(blockID, blockPtr)
  – Write block to disk

• read_block(blockID, blockPtr)
  – Read block from disk
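The four operations above can be sketched as a small in-memory class. This is a toy model, not the course's implementation: the class name, the dictionaries standing in for the disk, and the choice to return bytes from read_block (instead of filling a blockPtr buffer) are all assumptions made so the example runs on its own. Freed block IDs are never reused here, which is also just a simplification.

```python
class DiskSpaceManager:
    """Toy disk space manager: maps block IDs to hardware addresses (illustrative only)."""

    def __init__(self, block_size: int = 1024, first_hw_addr: int = 0):
        self.block_size = block_size
        self._next_block_id = 1            # block IDs handed to higher layers
        self._next_hw_addr = first_hw_addr
        self._hw_addr = {}                 # block ID -> hardware address (allocated only)
        self._disk = {}                    # hardware address -> bytes (stands in for the disk)

    def allocate_blocks(self, num_blocks: int) -> list:
        """Add num_blocks consecutive blocks to the DB and return their block IDs."""
        ids = []
        for _ in range(num_blocks):
            block_id = self._next_block_id
            self._hw_addr[block_id] = self._next_hw_addr
            self._disk[self._next_hw_addr] = bytes(self.block_size)  # zero-filled block
            self._next_block_id += 1
            self._next_hw_addr += 1
            ids.append(block_id)
        return ids

    def deallocate_block(self, block_id: int) -> None:
        """Remove a block from the DB; raises KeyError if it is not allocated."""
        hw = self._hw_addr.pop(block_id)
        del self._disk[hw]

    def write_block(self, block_id: int, data: bytes) -> None:
        """Write one whole block; partial blocks are rejected."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._disk[self._hw_addr[block_id]] = bytes(data)

    def read_block(self, block_id: int) -> bytes:
        """Return the contents of one whole block."""
        return self._disk[self._hw_addr[block_id]]


if __name__ == "__main__":
    dsm = DiskSpaceManager(block_size=4, first_hw_addr=792)
    ids = dsm.allocate_blocks(3)
    dsm.write_block(ids[1], b"abcd")
    print(ids, dsm.read_block(ids[1]))
    dsm.deallocate_block(ids[1])
```

Higher layers only ever see block IDs; the translation to a hardware address stays inside the class, which is the abstraction the DSM is meant to provide.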

Managing the Hard Disk

Disk Space Manager

block ID:       1    2    3    4    5    6    7    ...  N-1  N
allocated?:     yes  yes  yes  no   no   no   no   ...  no   no
hardware addr:  792  793  794  *    *    *    *    ...  *    *

Example: Managing the Hard Disk

allocate_blocks(3)
write_block(5,data)
read_block(5)
deallocate_block(2)

Disk Space Manager

block ID:       1    2    3    4    5    6    7    ...  N-1  N
allocated?:     yes  yes  yes  no   no   no   no   ...  no   no
hardware addr:  792  793  794  *    *    *    *    ...  *    *

Example: Managing the Hard Disk

allocate_blocks(3)   // allocate three consecutive blocks

Disk Space Manager

block ID:       1    2    3    4    5    6    7    ...  N-1  N
allocated?:     yes  yes  yes  yes  yes  yes  no   ...  no   no
hardware addr:  792  793  794  902  903  904  *    ...  *    *

Example: Managing the Hard Disk

write_block(5,data)   // write blockID 5 to disk

Disk Space Manager

block ID:       1    2    3    4    5    6    7    ...  N-1  N
allocated?:     yes  yes  yes  yes  yes  yes  no   ...  no   no
hardware addr:  792  793  794  902  903  904  *    ...  *    *

Block ID 5 maps to hardware address 903, so the request to the disk is write_block(903,data)

Example: Managing the Hard Disk

read_block(5)   // read blockID 5 to buffer

Disk Space Manager

block ID:       1    2    3    4    5    6    7    ...  N-1  N
allocated?:     yes  yes  yes  yes  yes  yes  no   ...  no   no
hardware addr:  792  793  794  902  903  904  *    ...  *    *

Block ID 5 maps to hardware address 903, so the request to the disk is read_block(903)

Example: Managing the Hard Disk

deallocate_block(2)   // remove blockID 2 from the DB

Disk Space Manager

block ID:       1    2    3    4    5    6    7    ...  N-1  N
allocated?:     yes  no   yes  yes  yes  yes  no   ...  no   no
hardware addr:  792  *    794  902  903  904  *    ...  *    *

Buffer Management

• Responsible for managing a region of main memory called the buffer pool

• Main-memory pages are called frames (slots that can each hold one block)

• Higher levels of the DBMS need not worry whether a page is in memory or not; they just ask for it.

[Figure: DBMS layers, top to bottom: Query Optimization, Relational Operators, Files and Access Methods, Buffer Management, Disk Space Management]

Buffer Manager Operations

• add_blocks_to_DB(num_blocks)
  – add new blocks to DB

• delete_block_from_DB(block_id)
  – delete block from the DB

• pin_block(block_id)
  – bring block from disk to buffer pool if not in BP
  – increment pin count for block

• unpin_block(block_id)
  – decrement pin count for block

• mark_dirty(block_id)

• Buffer Manager maintains for each frame
  – pin count
  – dirty bit
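The pin count and dirty bit bookkeeping can be sketched as follows. This is a minimal illustration under stated assumptions: the class layout is hypothetical, a plain dict stands in for the disk space manager, and the victim is simply the first unpinned block found (the slides do not name a replacement policy). add_blocks_to_DB and delete_block_from_DB are left out to keep the sketch short.

```python
class BufferManager:
    """Toy buffer manager: a fixed number of frames over a dict-backed 'disk'."""

    def __init__(self, num_frames: int, disk: dict):
        self.num_frames = num_frames
        self.disk = disk      # block_id -> bytes; stands in for the disk space manager
        self.frames = {}      # block_id -> {"data", "pin_count", "dirty"}

    def pin_block(self, block_id):
        """Bring the block into the pool if needed, then increment its pin count."""
        frame = self.frames.get(block_id)
        if frame is None:
            if len(self.frames) >= self.num_frames:
                self._evict_one()
            frame = {"data": bytearray(self.disk[block_id]),
                     "pin_count": 0, "dirty": False}
            self.frames[block_id] = frame
        frame["pin_count"] += 1
        return frame["data"]

    def unpin_block(self, block_id) -> None:
        """Decrement the pin count; the frame becomes evictable at zero."""
        frame = self.frames[block_id]
        if frame["pin_count"] == 0:
            raise ValueError("block is not pinned")
        frame["pin_count"] -= 1

    def mark_dirty(self, block_id) -> None:
        """Record that the in-memory copy differs from the copy on disk."""
        self.frames[block_id]["dirty"] = True

    def _evict_one(self) -> None:
        """Free one frame whose pin count is 0, writing it back first if dirty."""
        victim_id = None
        for block_id, frame in self.frames.items():
            if frame["pin_count"] == 0:
                victim_id = block_id
                break
        if victim_id is None:
            raise RuntimeError("all frames are pinned")
        victim = self.frames.pop(victim_id)
        if victim["dirty"]:
            self.disk[victim_id] = bytes(victim["data"])


if __name__ == "__main__":
    disk = {13: b"aa", 22: b"bb", 76: b"cc"}
    bm = BufferManager(num_frames=2, disk=disk)
    bm.pin_block(76)
    buf = bm.pin_block(13)
    buf[0:2] = b"zz"
    bm.mark_dirty(13)
    bm.unpin_block(13)
    bm.pin_block(22)          # pool is full, so the unpinned block 13 is written back
    print(disk[13], sorted(bm.frames))
```

A block with a nonzero pin count is never chosen as a victim, so callers must unpin blocks when they are done or the pool eventually fills with unevictable frames.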

Buffer Manager Example

buffer pool with M frames

frame:      1    2    3    ...  M-1  M
block ID:   -    -    -    ...  -    -
pin count:  0    0    0    ...  0    0
dirty:      no   no   no   ...  no   no

initial state of buffer manager

disk blocks: 1  2  3  ...  N-1  N

Buffer Manager Example

buffer pool with M frames

frame:      1    2    3    ...  M-1  M
block ID:   76   22   -    ...  -    -
pin count:  2    1    0    ...  0    0
dirty:      no   no   no   ...  no   no

initial state of buffer manager

(draw on whiteboard)

disk blocks: 1  2  3  ...  N-1  N

Buffer Manager Example

add_blocks_to_DB(3)
pin_block(76)
pin_block(13)
mark_dirty(13)
pin_block(76)
unpin_block(13)
// now assume all frames are filled and blk 22
// is not in the buffer pool
pin_block(22)
// BufMgr flushes blk w/ pin count = 0
delete_block(35)