
UNIT-IV DBMS

Prepared by Mrs.D.Maladhy (AP/IT/RGCET) Page 1

UNIT IV

STORAGE AND FILE STRUCTURE

1. We have been looking mostly at the higher-level models of a database. At the conceptual or

logical level the database was viewed as

o A collection of tables (relational model).

o A collection of classes of objects (object-oriented model).

2. The logical model is the correct level for database users to focus on. However, performance

depends on the efficiency of the data structures used to represent data in the database, and on

the efficiency of operations on these data structures.

OVERVIEW OF PHYSICAL STORAGE MEDIA

1. Several types of data storage exist in most computer systems. They vary in speed of access,

cost per unit of data, and reliability.

o Cache: most costly and fastest form of storage. Usually very small, and managed by

the computer-system hardware.

o Main Memory (MM): the storage area for data available to be operated on.

General-purpose machine instructions operate on main memory.

Contents of main memory are usually lost in a power failure or ``crash''.

Usually too small (even with megabytes) and too expensive to store the entire

database.

o Flash memory: EEPROM (electrically erasable programmable read-only memory).

Data in flash memory survives a power failure.

Reading data from flash memory takes about 10 nanoseconds (roughly as fast as reading from main memory); writing data into flash memory is more complicated: a write-once takes about 4-10 microseconds.

To overwrite what has been written, one has to first erase the entire bank of the

memory. It may support only a limited number of erase cycles.

It has found its popularity as a replacement for disks for storing small volumes

of data (5-10 megabytes).

o Magnetic-disk storage: primary medium for long-term storage.

Typically the entire database is stored on disk.

Data must be moved from disk to main memory in order for the data to be

operated on.

After operations are performed, data must be copied back to disk if any changes

were made.

Storage and File Structures: Overview of Physical Storage Media – Magnetic Disks – RAID –

Tertiary Storage – Storage Access – File Organization. Indexing and Hashing: Basic Concepts –

Static Hashing – Dynamic Hashing.


Disk storage is called direct access storage as it is possible to read data on the

disk in any order (unlike sequential access).

Disk storage usually survives power failures and system crashes.

o Optical storage: CD-ROM (compact-disk read-only memory), WORM (write-once

read-many) disk (for archival storage of data), and Juke box (containing a few drives

and numerous disks loaded on demand).

o Tape Storage: used primarily for backup and archival data.

Cheaper, but much slower access, since tape must be read sequentially from the

beginning.

Used as protection from disk failures!

2. The storage-device hierarchy is presented in Figure 10.1, where the higher levels are more expensive (in cost per bit) and faster (in access time), but smaller in capacity.

Figure 10.1: Storage-device hierarchy

3. Another classification: Primary, secondary, and tertiary storage.

1. Primary storage: the fastest storage media, such as cache and main memory.

2. Secondary (or on-line) storage: the next level of the hierarchy, e.g., magnetic disks.

3. Tertiary (or off-line) storage: magnetic tapes and optical disk juke boxes.

4. Volatility of storage. Volatile storage loses its contents when the power is removed. Without

power backup, data in the volatile storage (the part of the hierarchy from main memory up)

must be written to nonvolatile storage for safekeeping.

MAGNETIC DISKS

Physical Characteristics of Disks

1. The storage capacity of a single disk ranges from 10MB to 10GB. A typical commercial

database may require hundreds of disks.

2. Figure 10.2 shows a moving-head disk mechanism.


o Each disk platter has a flat circular shape. Its two surfaces are covered with a magnetic

material, and information is recorded on the surfaces. The platters of hard disks are made

from rigid metal or glass, while floppy disks are made from flexible material.

o The disk surface is logically divided into tracks, which are subdivided into sectors. A

sector (varying from 32 bytes to 4096 bytes, usually 512 bytes) is the smallest unit of

information that can be read from or written to disk. There are 4-32 sectors per track

and 20-1500 tracks per disk surface.

o The arm can be positioned over any one of the tracks.

o The platter is spun at high speed.

o To read information, the arm is positioned over the correct track.

o When the data to be accessed passes under the head, the read or write operation is

performed.

3. A disk typically contains multiple platters (see Figure 10.2). The read-write heads of all the

tracks are mounted on a single assembly called a disk arm, and move together.

o Multiple disk arms are moved as a unit by the actuator.

o Each arm has two heads, to read the platter surfaces above and below it.

o The set of tracks over which the heads are located forms a cylinder.

o This cylinder holds the data that is accessible within the disk latency time.

o It is clearly sensible to store related data in the same or adjacent cylinders.

4. Disk platters range from 1.8" to 14" in diameter. The 5¼" and 3½" disks dominate because they cost less and have faster seek times than larger disks, while still providing high storage capacity.

5. A disk controller interfaces between the computer system and the actual hardware of the disk

drive. It accepts commands to read or write a sector and initiates the corresponding actions. Disk controllers also attach checksums to each sector to detect read errors.

6. Remapping of bad sectors: If a controller detects that a sector is damaged when the disk is

initially formatted, or when an attempt is made to write the sector, it can logically map the

sector to a different physical location.

7. SCSI (Small Computer System Interconnect) is commonly used to connect disks to PCs and

workstations. Mainframe and server systems usually have a faster and more expensive bus to

connect to the disks.

8. Head crash: the head may touch and scratch the platter surface, destroying the recorded data and, in the worst case, causing the entire disk to fail.

9. A fixed-head disk has a separate head for each track -- very many heads, very expensive.

Multiple disk arms: allow more than one track to be accessed at a time. Both were used in high

performance mainframe systems but are relatively rare today.

Performance Measures of Disks

The main measures of the qualities of a disk are capacity, access time, data transfer rate, and

reliability.

1. access time: the time from when a read or write request is issued to when data transfer begins.

To access data on a given sector of a disk, the arm first must move so that it is positioned over

the correct track, and then must wait for the sector to appear under it as the disk rotates. The

time for repositioning the arm is called seek time, and it increases with the distance the arm

must move. Typical seek times range from 2 to 30 milliseconds.


Average seek time is the average of the seek time, measured over a sequence of (uniformly

distributed) random requests, and it is about one third of the worst-case seek time.

Once the seek has occurred, the time spent waiting for the sector to be accessed to appear under

the head is called rotational latency time. Average rotational latency time is about half of the

time for a full rotation of the disk. (Typical rotational speeds of disks range from 60 to 120

rotations per second).

The access time is then the sum of the seek time and the latency and ranges from 10 to 40

milliseconds.

2. data transfer rate, the rate at which data can be retrieved from or stored to the disk. Current

disk systems support transfer rate from 1 to 5 megabytes per second.

3. reliability, measured by the mean time to failure. The typical mean time to failure of disks

today ranges from 30,000 to 800,000 hours (about 3.4 to 91 years).
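The arithmetic behind these measures can be checked with a short calculation. A minimal sketch; the figures are illustrative values picked from the ranges quoted above, not from any particular drive:

```python
def avg_rotational_latency_ms(rotations_per_sec):
    # Average rotational latency is half the time of one full rotation.
    return (1000.0 / rotations_per_sec) / 2

def access_time_ms(seek_ms, rotations_per_sec):
    # Access time = seek time + rotational latency.
    return seek_ms + avg_rotational_latency_ms(rotations_per_sec)

# At 120 rotations/sec, one rotation takes about 8.33 ms, so the
# average rotational latency is about 4.17 ms.
latency = avg_rotational_latency_ms(120)

# With a 10 ms average seek, total access time is about 14.17 ms,
# inside the 10-40 ms range quoted above.
total = access_time_ms(10, 120)

# Converting mean time to failure from hours to years (8760 hours/year)
# reproduces the "30,000 hours is about 3.4 years" figure in the text.
mttf_years = 30000 / 8760
```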

Optimization of Disk-Block Access

1. Data is transferred between disk and main memory in units called blocks.

2. A block is a contiguous sequence of bytes from a single track of one platter.

3. Block sizes range from 512 bytes to several thousand.

4. The lower levels of the file-system manager convert block addresses into the hardware-level

cylinder, surface, and sector number.

5. Access to data on disk is several orders of magnitude slower than access to data in main memory. Several optimization techniques are used besides buffering of blocks in main memory:

o Scheduling: If several blocks from a cylinder need to be transferred, we may save time

by requesting them in the order in which they pass under the heads. A commonly used

disk-arm scheduling algorithm is the elevator algorithm.

o File organization. Organize blocks on disk in a way that corresponds closely to the

manner that we expect data to be accessed. For example, store related information on

the same track, or physically close tracks, or adjacent cylinders in order to minimize

seek time. IBM mainframe OS's provide programmers fine control over the placement of files, but this increases the programmer's burden.

UNIX or PC OSs hide disk organizations from users. Over time, a sequential file may

become fragmented. To reduce fragmentation, the system can make a back-up copy of

the data on disk and restore the entire disk. The restore operation writes back the blocks

of each file continuously (or nearly so). Some systems, such as MS-DOS, have utilities

that scan the disk and then move blocks to decrease the fragmentation.

o Nonvolatile write buffers. Use nonvolatile RAM (such as battery-back-up RAM) to

speed up disk writes drastically (the block is first written to the nonvolatile RAM buffer, and the OS is informed that the write has completed).

o Log disk. Another approach to reducing write latency is to use a log disk, a disk

devoted to writing a sequential log. All access to the log disk is sequential, essentially


eliminating seek time, and several consecutive blocks can be written at once, making

writes to log disk several times faster than random writes.
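The elevator disk-arm scheduling algorithm mentioned above can be sketched in a few lines. This is an illustrative model (the cylinder numbers and the upward-first sweep direction are assumptions), not a device-driver implementation:

```python
def elevator_schedule(start_cylinder, requests):
    # The arm sweeps in one direction (upward here), servicing every
    # pending request it passes, then reverses and services the rest.
    up = sorted(c for c in requests if c >= start_cylinder)
    down = sorted((c for c in requests if c < start_cylinder), reverse=True)
    return up + down

# Arm at cylinder 50 with pending requests scattered across the disk:
order = elevator_schedule(50, [95, 180, 34, 119, 11, 123, 62, 64])
# Upward sweep first (62, 64, 95, 119, 123, 180), then downward (34, 11).
```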

RAID

In computing, the acronym RAID (originally redundant array of inexpensive disks, now

also known as redundant array of independent disks) refers to a data storage scheme using multiple

hard drives to share or replicate data among the drives. Depending on the configuration of the RAID

(typically referred to as the RAID level), the benefit of RAID is one or more of increased data

integrity, fault-tolerance, throughput or capacity compared to single drives. In its original

implementations, its key advantage was the ability to combine multiple low-cost devices using older

technology into an array that offered greater capacity, reliability, speed, or a combination of these

things, than was affordably available in a single device using the newest technology.

A RAID appears to the operating system to be a single logical hard disk. RAID employs the technique

of disk striping, which involves partitioning each drive's storage space into units ranging from a sector

(512 bytes) up to several megabytes. The stripes of all the disks are interleaved and addressed in order.

In a single-user system where large records, such as medical or other scientific images, are stored, the

stripes are typically set up to be small (perhaps 512 bytes) so that a single record spans all disks and

can be accessed quickly by reading all disks at the same time.

In a multi-user system, better performance requires establishing a stripe wide enough to hold the

typical or maximum size record. This allows overlapped disk I/O across drives.

There are at least nine types of RAID plus a non-redundant array (RAID-0):

RAID-0: This technique has striping but no redundancy of data. It offers the best performance

but no fault-tolerance.

RAID-1: This type is also known as disk mirroring and consists of at least two drives that

duplicate the storage of data. There is no striping. Read performance is improved since either

disk can be read at the same time. Write performance is the same as for single disk storage.

RAID-1 provides the best performance and the best fault-tolerance in a multi-user system.

RAID-2: This type uses striping across disks with some disks storing error checking and

correcting (ECC) information. It has no advantage over RAID-3.

RAID-3: This type uses striping and dedicates one drive to storing parity information. The

embedded error checking (ECC) information is used to detect errors. Data recovery is

accomplished by calculating the exclusive OR (XOR) of the information recorded on the other

drives. Since an I/O operation addresses all drives at the same time, RAID-3 cannot overlap

I/O. For this reason, RAID-3 is best for single-user systems with long record applications.

RAID-4: This type uses large stripes, which means you can read records from any single drive.

This allows you to take advantage of overlapped I/O for read operations. Since all write

operations have to update the parity drive, no I/O overlapping is possible. RAID-4 offers no

advantage over RAID-5.

RAID-5: This type includes a rotating parity array, thus addressing the write limitation in

RAID-4. Thus, all read and write operations can be overlapped. RAID-5 stores parity


information but not redundant data (but parity information can be used to reconstruct data).

RAID-5 requires at least three and usually five disks for the array. It's best for multi-user

systems in which performance is not critical or which do few write operations.

RAID-6: This type is similar to RAID-5 but includes a second parity scheme that is distributed

across different drives and thus offers extremely high fault- and drive-failure tolerance.

RAID-7: This type includes a real-time embedded operating system as a controller, caching via

a high-speed bus, and other characteristics of a stand-alone computer. One vendor offers this

system.

RAID-10: Combining RAID-0 and RAID-1 is often referred to as RAID-10, which offers

higher performance than RAID-1 but at much higher cost. There are two subtypes: In RAID-0+1, data is organized as stripes across multiple disks, and then the striped disk sets are

mirrored. In RAID-1+0, the data is mirrored and the mirrors are striped.

RAID-50 (or RAID-5+0): This type consists of a series of RAID-5 groups striped in

RAID-0 fashion to improve RAID-5 performance without reducing data protection.

RAID-53 (or RAID-5+3): This type uses striping (in RAID-0 style) for RAID-3's virtual disk

blocks. This offers higher performance than RAID-3 but at much higher cost.

RAID-S (also known as Parity RAID): This is an alternate, proprietary method for striped

parity RAID from EMC Symmetrix that is no longer in use on current equipment. It appears to

be similar to RAID-5 with some performance enhancements as well as the enhancements that

come from having a high-speed disk cache on the disk array.
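The XOR-based recovery that RAID-3 and RAID-5 rely on, described above, can be demonstrated directly. A minimal sketch with three hypothetical data blocks standing in for whole drives:

```python
def xor_blocks(blocks):
    # Byte-wise XOR of equal-length blocks; this is the parity computation.
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

data = [b"disk", b"raid", b"unit"]   # three data blocks (one per drive)
parity = xor_blocks(data)            # stored on the parity drive

# Drive 1 fails: its block is rebuilt by XOR-ing the surviving data
# blocks with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```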

TERTIARY STORAGE

Optical Disks

1. CD-ROM has become a popular medium for distributing software, multimedia data, and other

electronic published information.

2. Capacity of CD-ROM: about 500 MB. Both the disks and the drives are cheap to mass-produce.

3. CD-ROM: much longer seek time (about 250 ms) and lower rotation speed (400 rpm), leading to higher latency and a lower data-transfer rate (about 150 KB/sec at the standard speed). Drives that spin at multiples of the standard audio-CD speed are available.

4. Recently, a new optical format, the digital video disk (DVD), has become standard. These disks hold between 4.7 and 17 GB of data.

5. WORM (write-once, read many) disks are popular for archival storage of data since they have

a high capacity (about 500 MB), a longer lifetime than hard disks, and can be removed from the drive -- good for audit trails (hard to tamper with).

Magnetic Tapes

1. Magnetic tapes have a long history, are slow, and are limited to sequential access; thus they are used for backup, storage for infrequently accessed data, and as an off-line medium for system transfer.

2. Moving to the correct spot may take minutes, but once positioned, tape drives can write data at

density and speed approaching those of disk drives.

3. 8mm tape drives have the highest density; a 350-foot tape can store 5 GB of data.

4. Popularly used for storage of large volumes of data, such as video, image, or remote sensing

data.


STORAGE ACCESS

1. Each file is partitioned into fixed-length storage units, called blocks, which are the units of both

storage allocation and data transfer.

2. It is desirable to keep as many blocks as possible in main memory. Usually, we cannot keep all

blocks in main memory, so we need to manage the allocation of available main memory space.

3. We need to use disk storage for the database, and to transfer blocks of data between main

memory and disk. We also want to minimize the number of such transfers, as they are time-

consuming.

4. The buffer is the part of main memory available for storage of copies of disk blocks.

Buffer manager

1. The subsystem responsible for the allocation of buffer space is called the buffer manager.

o The buffer manager handles all requests for blocks of the database.

o If the block is already in main memory, the address in main memory is given to the

requester.

o If not, the buffer manager must read the block in from disk (possibly displacing some

other block if the buffer is full) and then pass the address in main memory to the

requester.

2. The buffer manager must use some sophisticated techniques in order to provide good service:

o Replacement Strategy -- When there is no room left in the buffer, some block must be

removed to make way for the new one. Typical operating system memory management

schemes use a ``least recently used'' (LRU) method. (Simply remove the block least

recently referenced.) This can be improved upon for database applications.

o Pinned Blocks - For the database to be able to recover from crashes, we need to restrict

times when a block may be written back to disk. A block not allowed to be written is

said to be pinned. Many operating systems do not provide support for pinned blocks,

and such a feature is essential if a database is to be ``crash resistant''.

o Forced Output of Blocks - Sometimes it is necessary to write a block back to disk

even though its buffer space is not needed, (called the forced output of a block.) This

is due to the fact that main memory contents (and thus the buffer) are lost in a crash,

while disk data usually survives.
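The request-handling logic above (a hit returns the buffered block; a miss reads from disk, evicting an unpinned least-recently-used block when the buffer is full) can be sketched as follows. This is an illustrative in-memory model; the class and its names are assumptions, not a real DBMS component:

```python
from collections import OrderedDict

class BufferManager:
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk                # block_id -> contents ("disk" is a dict)
        self.buffer = OrderedDict()     # block_id -> contents, in LRU order
        self.pinned = set()

    def pin(self, block_id):
        # A pinned block may not be written back or evicted.
        self.pinned.add(block_id)

    def unpin(self, block_id):
        self.pinned.discard(block_id)

    def get_block(self, block_id):
        if block_id in self.buffer:           # hit: mark most recently used
            self.buffer.move_to_end(block_id)
            return self.buffer[block_id]
        if len(self.buffer) >= self.capacity:
            # evict the least recently used block that is not pinned
            victim = next(b for b in self.buffer if b not in self.pinned)
            del self.buffer[victim]
        self.buffer[block_id] = self.disk[block_id]   # read in from "disk"
        return self.buffer[block_id]

disk = {i: f"block-{i}" for i in range(5)}
bm = BufferManager(capacity=2, disk=disk)
bm.get_block(0)
bm.pin(0)            # e.g. needed for crash recovery; may not be evicted
bm.get_block(1)
bm.get_block(2)      # buffer full: evicts block 1, not the pinned block 0
```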

Buffer replacement policies

1. Replacement Strategy: Goal is minimization of accesses to disk. Generally it is hard to

predict which blocks will be referenced. So operating systems use the history of past references

as a guide to prediction.

o General Assumption: Blocks referenced recently are likely to be used again.

o Therefore, if we need space, throw out the least recently referenced block (LRU

replacement scheme).

2. LRU is acceptable in operating systems, however, a database system is able to predict future

references more accurately.


3. Consider processing of the relational algebra expression borrower ⋈ customer (the natural join of the borrower and customer relations).

4. Further, assume the strategy to process this request is given by the following pseudo-code:

       for each tuple b of borrower do
           for each tuple c of customer do
               if b[cname] = c[cname]
               then begin
                   let x be a tuple defined as follows:
                   x[cname]  := b[cname]
                   x[loan#]  := b[loan#]
                   x[street] := c[street]
                   x[city]   := c[city]
                   include tuple x as part of result of borrower ⋈ customer
               end
           end
       end

32. Assume that the two relations in this example are stored in different files.

o Once a tuple of borrower has been processed, it is not needed again. Therefore, once

processing of an entire block of tuples is finished, that block is not needed in main

memory, even though it has been used very recently.

o The buffer manager should free the space occupied by a borrower block as soon as it is

processed. This strategy is called toss-immediate.

o Consider blocks containing customer tuples.

o Every block of customer tuples must be examined once for every tuple of the borrower

relation. When processing of a customer block is completed, it will not be used again

until all other customer blocks have been processed. This means the most recently used

(MRU) block will be the last block to be re-referenced, and the least recently used will

be referenced next.

o This is the opposite of LRU assumptions. So for inner block, use MRU strategy -- if a

customer block must be removed from the buffer, choose MRU block.


o For MRU strategy, the system must pin the customer block currently being processed

until the last tuple has been processed. Then it is unpinned, becoming the most recently

used block.

33. The buffer manager may also use statistical information regarding the probability that a request

will reference a particular relation.

o The data dictionary is the most frequently-used part of the database. It should,

therefore, not be removed from main memory unless necessary.

o File indices are also frequently used, and should generally be in main memory.

o No single strategy is known that handles all possible scenarios well.

o Many database systems use LRU, despite its faults.

o Concurrency and recovery may need other buffer management strategies, such as

delayed buffer-out or forced output.

FILE ORGANIZATION

1. A file is organized logically as a sequence of records.

2. Records are mapped onto disk blocks.

3. Files are provided as a basic construct in operating systems, so we assume the existence of an

underlying file system.

4. Blocks are of a fixed size determined by the operating system.

5. Record sizes vary.

6. In relational database, tuples of distinct relations may be of different sizes.

7. One approach to mapping database to files is to store records of one length in a given file.

8. An alternative is to structure files to accommodate variable-length records. (Fixed-length is

easier to implement.)

Fixed-Length Records

1. Consider a file of deposit records of the form:

       type deposit = record
           bname    : char(22);
           account# : char(10);
           balance  : real;
       end

o If we assume that each character occupies one byte, an integer occupies 4 bytes, and a

real 8 bytes, our deposit record is 40 bytes long.

o The simplest approach is to use the first 40 bytes for the first record, the next 40 bytes

for the second, and so on.

o However, there are two problems with this approach.

o It is difficult to delete a record from this structure.


o The space occupied must somehow be reclaimed, or we need to mark deleted records so that

they can be ignored.

o Unless block size is a multiple of 40, some records will cross block boundaries.

o It would then require two block accesses to read or write such a record.

12. When a record is deleted, we could move all successive records up one (Figure 10.7), which

may require moving a lot of records.

o We could instead move the last record into the ``hole'' created by the deleted record

(Figure 10.8).

o This changes the order the records are in.

o It turns out to be undesirable to move records to occupy freed space, as moving requires

block accesses.

o Also, insertions tend to be more frequent than deletions.

o It is acceptable to leave the space open and wait for a subsequent insertion.

o This leads to a need for additional structure in our file design.

13. So one solution is:

o At the beginning of a file, allocate some bytes as a file header.

o This header for now need only be used to store the address of the first record whose

contents are deleted.

o This first record can then store the address of the second available record, and so on

(Figure 10.9).

o To insert a new record, we use the record pointed to by the header, and change the

header pointer to the next available record.

o If no deleted records exist we add our new record to the end of the file.

14. Note: Use of pointers requires careful programming. If a record pointed to is moved or deleted,

and that pointer is not corrected, the pointer becomes a dangling pointer. Records pointed to

are called pinned.

15. Fixed-length file insertions and deletions are relatively simple because ``one size fits all''. For

variable length, this is not the case.
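The free-list scheme of item 13 above can be sketched with an in-memory model: the file header stores the index of the first deleted slot, each deleted slot stores the index of the next, and an insert reuses the head of the chain before growing the file. The class name is illustrative, and the branch names follow the notes' bank example:

```python
class FixedLengthFile:
    def __init__(self):
        self.slots = []     # each slot: a record, or the next free slot index
        self.header = None  # file header: index of first deleted slot, or None

    def insert(self, record):
        if self.header is not None:        # reuse the first deleted slot
            slot = self.header
            self.header = self.slots[slot] # header now points to the next free slot
            self.slots[slot] = record
        else:                              # no free slot: append at end of file
            slot = len(self.slots)
            self.slots.append(record)
        return slot

    def delete(self, slot):
        # The freed slot stores the old head of the free chain.
        self.slots[slot], self.header = self.header, slot

f = FixedLengthFile()
for r in ["Perryridge", "Round Hill", "Downtown"]:
    f.insert(r)
f.delete(1)                 # header now points at slot 1
pos = f.insert("Mianus")    # reuses slot 1 instead of growing the file
```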

Variable-Length Records

1. Variable-length records arise in a database in several ways:

o Storage of multiple items in a file.

o Record types allowing variable field size

o Record types allowing repeating fields

2. We'll look at several techniques, using one example with a variable-length record:

       type account-list = record
           bname        : char(22);
           account-info : array of record
               account# : char(10);
               balance  : real;
           end
       end

Account-information is an array with an arbitrary number of elements.

Byte string representation

1. Attach a special end-of-record symbol to the end of each record. Each record is stored as a

string of successive bytes (See Figure 10.10).

Byte string representation has several disadvantages:

o It is not easy to re-use the space left by a deleted record.

o In general, there is no space for records to grow longer. (Must move to expand, and

record may be pinned.)

So this method is not usually used.

2. An interesting structure: the slotted-page structure.

There is a header at the beginning of each block, containing:

o the number of record entries in the header

o the end of free space in the block

o an array whose entries contain the location and size of each record.

3. The slotted-page structure requires that there be no pointers that point directly to records. Instead,

pointers must point to the entry in the header that contains the actual location of the record.

This level of indirection allows records to be moved to prevent fragmentation of space inside a

block, while supporting indirect pointers to the record.
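The slotted-page header described above (entry count, end of free space, and an array holding the location and size of each record) can be sketched as follows. The class is a hypothetical in-memory model, with records grown from the end of the block toward the header:

```python
class SlottedPage:
    def __init__(self, block_size=512):
        self.block = bytearray(block_size)
        self.entries = []            # (offset, size) per record; the entry
                                     # count is len(self.entries)
        self.free_end = block_size   # records grow downward from here

    def insert(self, record):
        offset = self.free_end - len(record)
        self.block[offset:self.free_end] = record
        self.free_end = offset
        self.entries.append((offset, len(record)))
        return len(self.entries) - 1  # slot number, not a direct pointer

    def get(self, slot):
        # External pointers hold slot numbers; this indirection lets
        # records move inside the block (e.g. to compact free space)
        # without breaking those pointers.
        offset, size = self.entries[slot]
        return bytes(self.block[offset:offset + size])

page = SlottedPage()
s0 = page.insert(b"Perryridge|A-102|400")
s1 = page.insert(b"Downtown|A-101|500")
```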

Fixed-length representation

1. Uses one or more fixed-length records to represent one variable-length record.

2. Two techniques:

o Reserved space - uses fixed-length records large enough to accommodate the largest

variable-length record. (Unused space filled with end-of-record symbol.)

o Pointers - represent by a list of fixed-length records, chained together.

3. The reserved space method requires the selection of some maximum record length. (Figure

10.12)

If most records are of near-maximum length this method is useful. Otherwise, space is wasted.


4. Then the pointer method may be used (Figure 10.13). Its disadvantage is that space is wasted in

successive records in a chain as non-repeating fields are still present.

5. To overcome this last disadvantage we can split records into two blocks (See Figure 10.14)

o Anchor block - contains first records of a chain

o Overflow block - contains records other than first in the chain.

Now all records in a block have the same length, and there is no wasted space.

ORGANIZATION OF RECORDS IN FILES

There are several ways of organizing records in files.

heap file organization. Any record can be placed anywhere in the file where there is space for

the record. There is no ordering of records.

sequential file organization. Records are stored in sequential order, based on the value of the

search key of each record.

hashing file organization. A hash function is computed on some attribute of each record. The

result of the function specifies in which block of the file the record should be placed -- to be

discussed in chapter 11 since it is closely related to the indexing structure.

clustering file organization. Records of several different relations can be stored in the same

file. Related records of the different relations are stored on the same block so that one I/O

operation fetches related records from all the relations.
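The hashing organization in the list above is treated fully in the next chapter, but a minimal sketch shows the principle: the hash of a record's search key picks the block where the record is placed, so a lookup goes straight to one block instead of scanning the file. The hash function and sample records are illustrative:

```python
NUM_BLOCKS = 8

def block_for(cname):
    # A deliberately simple textbook-style hash: sum of character
    # codes of the search key, modulo the number of blocks.
    return sum(ord(ch) for ch in cname) % NUM_BLOCKS

blocks = [[] for _ in range(NUM_BLOCKS)]
for record in [("Hayes", 400), ("Turner", 350), ("Jones", 900)]:
    blocks[block_for(record[0])].append(record)

# Lookup recomputes the hash and examines only that one block.
found = ("Hayes", 400) in blocks[block_for("Hayes")]
```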

Sequential File Organization

1. A sequential file is designed for efficient processing of records in sorted order on some

search key.

o Records are chained together by pointers to permit fast retrieval in search key order.

o Pointer points to next record in order.

o Records are stored physically in search key order (or as close to this as possible).

o This minimizes number of block accesses.

o Figure 10.15 shows an example, with bname as the search key.

2. It is difficult to maintain physical sequential order as records are inserted and deleted.

o Deletion can be managed with the pointer chains.

o Insertion poses problems if there is no space where the new record should go.

o If space, use it, else put new record in an overflow block.

o Adjust pointers accordingly.

o Figure 10.16 shows the previous example after an insertion.

o Problem: we now have some records out of physical sequential order.

o If very few records in overflow blocks, this will work well.

o If order is lost, reorganize the file.

o Reorganizations are expensive and done when system load is low.

3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize when

insertion occurs. In this case, the pointer fields are no longer required.
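The pointer-chain insertion described above can be sketched as a toy in-memory model (a sketch only; the record layout and branch names are illustrative, not from the text):

```python
class SequentialFile:
    """Toy sequential file: records chained by 'next' pointers in search-key order."""

    def __init__(self, keys):
        # Physically sorted main area; chain follows physical order initially.
        self.recs = [{"key": k, "next": None} for k in sorted(keys)]
        for i in range(len(self.recs) - 1):
            self.recs[i]["next"] = i + 1
        self.head = 0 if self.recs else None

    def insert(self, key):
        # New record goes to an "overflow" slot at the end of the file;
        # only the pointer chain keeps search-key order.
        self.recs.append({"key": key, "next": None})
        new = len(self.recs) - 1
        prev, cur = None, self.head
        while cur is not None and self.recs[cur]["key"] < key:
            prev, cur = cur, self.recs[cur]["next"]
        self.recs[new]["next"] = cur
        if prev is None:
            self.head = new
        else:
            self.recs[prev]["next"] = new

    def scan(self):
        # Retrieval in search-key order follows the chain, not physical order.
        cur, out = self.head, []
        while cur is not None:
            out.append(self.recs[cur]["key"])
            cur = self.recs[cur]["next"]
        return out
```

After an insertion, a scan still returns keys in order even though the new record sits physically at the end of the file, which is exactly why reorganization eventually becomes necessary.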


Clustering File Organization

1. One relation per file, with fixed-length records, is good for small databases; it also reduces the code size.

2. Many large-scale DB systems do not rely directly on the underlying operating system for file management. Instead, one large OS file is allocated to the DB system, and all relations are stored in that one file.

3. To efficiently execute queries involving a join of customer and depositor, one may store the depositor tuple for each cname near the customer tuple for the corresponding cname, as shown in Figure 10.19.

4. This structure mixes together tuples from two relations, but allows for efficient processing of

the join.

5. If the customer has many accounts which cannot fit in one block, the remaining records appear

on nearby blocks. This file structure, called clustering, allows us to read many of the required

records using one block read.

6. Our use of clustering enhances the processing of a particular join but may result in slow

processing of other types of queries, such as selection on customer.

For example, the query

select *
from customer

now requires more block accesses, as the customer relation is now interspersed with the depositor relation.

7. Thus it is a trade-off, depending on the types of query that the database designer believes to be

most frequent. Careful use of clustering may produce significant performance gain.

DATA DICTIONARY STORAGE

1. The database also needs to store information about the relations, known as the data

dictionary. This includes:

o Names of relations.

o Names of attributes of relations.

o Domains and lengths of attributes.

o Names and definitions of views.

o Integrity constraints (e.g., key constraints).

plus data on the system users:

o Names of authorized users.

o Accounting information about users.

plus (possibly) statistical and descriptive data:


o Number of tuples in each relation.

o Method of storage used for each relation (e.g., clustered or non-clustered).

2. When we look at indices (Chapter 11), we'll also see a need to store information about each

index on each relation:

o Name of the index.

o Name of the relation being indexed.

o Attributes the index is on.

o Type of index.

3. This information is, in itself, a miniature database. We can use the database to store data about

itself, simplifying the overall structure of the system, and allowing the full power of the

database to be used to permit fast access to system data.

4. The exact choice of how to represent system data using relations must be made by the system

designer. One possible representation follows.

5. One possible representation:

System-catalog-schema = (relation-name, number-attrs)
Attr-schema = (attr-name, rel-name, domain-type, position, length)
User-schema = (user-name, encrypted-password, group)
Index-schema = (index-name, rel-name, index-type, index-attr)
View-schema = (view-name, definition)
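The catalog relations above can be illustrated with a few hypothetical tuples (the sample relation branch and its attributes are assumptions for illustration):

```python
# The data dictionary is itself a set of relations. Hypothetical tuples
# describing a 'branch' relation and one index on it:
system_catalog = [("branch", 2)]  # (relation-name, number-attrs)
attr_catalog = [
    # (attr-name, rel-name, domain-type, position, length)
    ("bname", "branch", "char", 1, 15),
    ("assets", "branch", "int", 2, 4),
]
index_catalog = [("b-index", "branch", "B+-tree", "bname")]

# Because system data is stored as ordinary relations, ordinary queries
# apply to it, e.g. "attributes of relation branch, in position order":
attrs = sorted((a for a in attr_catalog if a[1] == "branch"),
               key=lambda a: a[3])
```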

INDEXING & HASHING

1. Many queries reference only a small proportion of records in a file. For example, finding all

records at Perryridge branch only returns records where bname = ``Perryridge''.

2. We should be able to locate these records directly, rather than having to read every record and

check its branch-name. We then need extra file structuring.

BASIC CONCEPTS

1. An index for a file works like a catalogue in a library. Cards in alphabetic order tell us where to

find books by a particular author.

2. In real-world databases, indices like this might be too large to be efficient. We'll look at more

sophisticated indexing techniques.

3. There are two kinds of indices.

o Ordered indices: indices are based on a sorted ordering of the values.

o Hash indices: indices are based on the values being distributed uniformly across a range of buckets. The bucket to which a value is assigned is determined by a function, called a hash function.

4. We will consider several indexing techniques. No one technique is the best. Each technique is

best suited for a particular database application.

5. Methods will be evaluated on:

1. Access Types -- types of access that are supported efficiently, e.g., value-based search

or range search.


2. Access Time -- time to find a particular data item or set of items.

3. Insertion Time -- time taken to insert a new data item (includes time to find the right

place to insert).

4. Deletion Time -- time to delete an item (includes time taken to find item, as well as to

update the index structure).

5. Space Overhead -- additional space occupied by an index structure.

6. We may have more than one index or hash function for a file. (The library may have card

catalogues by author, subject or title.)

7. The attribute or set of attributes used to look up records in a file is called the search key (not to

be confused with primary key, etc.).

ORDERED INDICES

1. In order to allow fast random access, an index structure may be used.

2. A file may have several indices on different search keys.

3. If the file containing the records is sequentially ordered, the index whose search key specifies

the sequential order of the file is the primary index, or clustering index. Note: The search key

of a primary index is usually the primary key, but it is not necessarily so.

4. Indices whose search key specifies an order different from the sequential order of the file are

called the secondary indices, or nonclustering indices.

Primary Index

1. Index-sequential files: Files are ordered sequentially on some search key, and a primary index

is associated with it.

Figure 11.1: Sequential file for deposit records.

Dense and Sparse Indices


1. There are two types of ordered indices:

Dense Index:

o An index record appears for every search key value in file.

o This record contains search key value and a pointer to the actual record.

Sparse Index:

o Index records are created only for some of the records.

o To locate a record, we find the index record with the largest search key value less than

or equal to the search key value we are looking for.

o We start at that record pointed to by the index record, and proceed along the pointers in

the file (that is, sequentially) until we find the desired record.

2. Figures 11.2 and 11.3 show dense and sparse indices for the deposit file.

Figure 11.2: Dense index.

3. Notice how we would find records for Perryridge branch using both methods. (Do it!)


Figure 11.3: Sparse index.

4. Dense indices are faster in general, but sparse indices require less space and impose less

maintenance for insertions and deletions. (Why?)

5. A good compromise: to have a sparse index with one entry per block.

Why is this good?

o Biggest cost is in bringing a block into main memory.

o We are guaranteed to have the correct block with this method, unless record is on an

overflow block (actually could be several blocks).

o Index size still small.
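The sparse-index lookup rule (largest entry less than or equal to the target, then a sequential scan) can be sketched as follows, assuming one index entry per block as in the compromise above; the block contents are illustrative:

```python
import bisect

# Hypothetical deposit file sorted on bname, two records per block.
blocks = [["Brighton", "Downtown"],
          ["Mianus", "Perryridge"],
          ["Redwood", "Round Hill"]]

# Sparse index: one (first-key-of-block, block-number) entry per block.
index = [(blk[0], i) for i, blk in enumerate(blocks)]

def lookup(key):
    # Find the index record with the largest search key value <= key ...
    keys = [k for k, _ in index]
    i = bisect.bisect_right(keys, key) - 1
    if i < 0:
        return None
    # ... then proceed sequentially from the block it points to.
    for blk in blocks[index[i][1]:]:
        for rec in blk:
            if rec == key:
                return rec
            if rec > key:   # passed where it would be; not present
                return None
    return None
```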

Multi-Level Indices

1. Even with a sparse index, index size may still grow too large. For 100,000 records, 10 per

block, at one index record per block, that's 10,000 index records! Even if we can fit 100 index

records per block, this is 100 blocks.

2. If index is too large to be kept in main memory, a search results in several disk reads.

o If there are no overflow blocks in the index, we can use binary search.
o This will read as many as ⌈log2(b)⌉ blocks (as many as 7 for our 100 blocks).
o If the index has overflow blocks, then sequential search is typically used, reading all b index blocks.

3. Solution: Construct a sparse index on the index (Figure 11.4).


Figure 11.4: Two-level sparse index.

4. Use binary search on outer index. Scan index block found until correct index record found.

Use index record as before - scan block pointed to for desired record.

5. For very large files, additional levels of indexing may be required.

6. Indices must be updated at all levels when insertions or deletions require it.

7. Frequently, each level of index corresponds to a unit of physical storage (e.g. indices at the

level of track, cylinder and disk).

Index Update

Regardless of what form of index is used, every index must be updated whenever a record is either

inserted into or deleted from the file.

1. Deletion:
o Find (look up) the record.
o If it is the last record with that particular search key value, delete the search key value from the index.
o For dense indices, this is like deleting a record in a file.
o For sparse indices, delete a key value by replacing its entry in the index with the next search key value. If that value already has an index entry, delete the entry instead.


2. Insertion:
o Find the place to insert.
o Dense index: insert the search key value if it is not present.
o Sparse index: no change unless a new block is created. (In that case, the first search key value appearing in the new block is inserted into the index.)

Secondary Indices

1. If the search key of a secondary index is not a candidate key, it is not enough to point to just

the first record with each search-key value because the remaining records with the same

search-key value could be anywhere in the file. Therefore, a secondary index must contain

pointers to all the records.

Figure 11.5: Sparse secondary index on cname.

2. We can use an extra-level of indirection to implement secondary indices on search keys that

are not candidate keys. A pointer does not point directly to the file but to a bucket that contains

pointers to the file.

o See Figure 11.5 on secondary key cname.

o To perform a lookup on Peterson, we must read all three records pointed to by entries in

bucket 2.

o Only one entry points to a Peterson record, but three records need to be read.

o As file is not ordered physically by cname, this may take 3 block accesses.

3. Secondary indices must be dense, with an index entry for every search-key value, and a pointer

to every record in the file.

4. Secondary indices improve the performance of queries on non-primary keys.

5. They also impose serious overhead on database modification: whenever a file is updated, every

index must be updated.

6. Designer must decide whether to use secondary indices or not.


B+-TREE INDEX FILES

1. The primary disadvantage of index-sequential file organization is that performance degrades as the file grows. This can be remedied by costly reorganizations.

2. The B+-tree file structure maintains its efficiency despite frequent insertions and deletions. It imposes some acceptable update and space overhead.

3. A B+-tree index is a balanced tree in which every path from the root to a leaf is of the same length.

4. Each nonleaf node in the tree must have between ⌈n/2⌉ and n children, where n is fixed for a particular tree.

Structure of a B+-Tree

1. A B+-tree index is a multilevel index, but it is structured differently from the index of a multi-level index-sequential file.

2. A typical node (Figure 11.6) contains up to n−1 search key values K1, K2, ..., Kn−1, and n pointers P1, P2, ..., Pn. The search key values in a node are kept in sorted order.

Figure 11.6: Typical node of a B+-tree.

3. For leaf nodes, pointer Pi (i = 1, ..., n−1) points to either a file record with search key value Ki, or a bucket of pointers to records with that search key value. The bucket structure is used if the search key is not a primary key and the file is not sorted in search key order.

Pointer Pn (the nth pointer in the leaf node) is used to chain leaf nodes together in linear order (search key order). This allows efficient sequential processing of the file.

The ranges of values in the leaves do not overlap.

4. Non-leaf nodes form a multilevel index on the leaf nodes.

A non-leaf node may hold up to n pointers, and must hold at least ⌈n/2⌉ pointers. The number of pointers in a node is called the fan-out of the node.

Consider a node containing m pointers. Pointer Pi (2 ≤ i ≤ m−1) points to the subtree containing search key values K with Ki−1 ≤ K < Ki. Pointer P1 points to the subtree containing search key values K < K1. Pointer Pm points to the subtree containing search key values K ≥ Km−1.

5. Figures 11.7 (textbook Fig. 11.8) and textbook Fig. 11.9 show B+-trees for the deposit file with n = 3 and n = 5.


Figure 11.7: B+-tree for deposit file with n = 3.

Queries on B+-Trees

1. Suppose we want to find all records with a search key value of k.
o Examine the root node and find the smallest search key value Ki > k.
o Follow pointer Pi to another node.
o If k ≥ Km−1, follow pointer Pm instead.
o Otherwise, find the appropriate pointer to follow.
o Continue down through the non-leaf nodes, looking for the smallest search key value > k and following the corresponding pointer.
o Eventually we arrive at a leaf node, where a pointer will point to the desired record or bucket.

2. In processing a query, we traverse a path from the root to a leaf node. If there are K search key values in the file, this path is no longer than ⌈log⌈n/2⌉(K)⌉.

This means that the path is not long, even in large files. For a 4 KB disk block with a search-key size of 12 bytes and a disk pointer of 8 bytes, n is around 200. Even with a conservative n = 100, a lookup of 1 million search-key values requires at most ⌈log50(1,000,000)⌉ = 4 nodes to be accessed. Since the root is usually in the buffer, typically only 3 or fewer disk reads are needed.
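The path-length bound above can be checked with a few lines (a sketch; the figures 4 KB, n = 100, and 1 million keys come from the text):

```python
import math

def max_path_length(K, n):
    """Worst-case B+-tree path length: ceil(log base ceil(n/2) of K),
    for K search key values and fan-out limit n."""
    return math.ceil(math.log(K, math.ceil(n / 2)))

# With n = 100, each node has at least ceil(100/2) = 50 children,
# so 1 million keys need at most 4 levels.
height = max_path_length(1_000_000, 100)
```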

Updates on B+-Trees

1. Insertions and Deletions:

Insertion and deletion are more complicated, as they may require splitting or combining nodes to keep the tree balanced. If splitting or combining is not required, insertion works as follows:
o Find the leaf node where the search key value should appear.
o If the value is present, add the new record to the bucket.
o If the value is not present, insert the value into the leaf node (so that the search keys are still in order). Create a new bucket and insert the new record.

If splitting or combining is not required, deletion works as follows:
o Find the record to be deleted, and remove it from the bucket.
o If the bucket is now empty, remove the search key value from the leaf node.


2. Insertions Causing Splitting:

When insertion causes a leaf node to be too large, we split that node. In Figure 11.8, assume we wish to insert a record with a bname value of ``Clearview''.
o There is no room for it in the leaf node where it should appear.
o We now have n values (the n−1 search key values plus the new one we wish to insert).
o We put the first ⌈n/2⌉ values in the existing node, and the remainder into a new node.
o Figure 11.10 shows the result.
o The new node must be inserted into the B+-tree.
o We also need to update search key values for the parent (or higher) nodes of the split leaf node. (Except if the new node is the leftmost one.)
o Order must be preserved among the search key values in each node.
o If the parent was already full, it will have to be split.
o When a non-leaf node is split, the children are divided among the two new nodes.
o In the worst case, splits may be required all the way up to the root. (If the root is split, the tree becomes one level deeper.)
o Note: when we start a B+-tree, we begin with a single node that is both the root and a single leaf. When it gets full and another insertion occurs, we split it into two leaf nodes, requiring a new root.
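The leaf-split step above (n values divided so that the first ⌈n/2⌉ stay put) can be sketched for keys alone, ignoring the record pointers (a simplification):

```python
import math

def split_leaf(keys, new_key, n):
    """Insert new_key into a full leaf holding n-1 keys, then split:
    the first ceil(n/2) keys stay in the existing node, the rest move
    to a new node. Returns (old_node, new_node, key_for_parent)."""
    assert len(keys) == n - 1, "leaf must be full before a split"
    all_keys = sorted(keys + [new_key])        # now n values, in order
    cut = math.ceil(n / 2)
    left, right = all_keys[:cut], all_keys[cut:]
    # The parent gets the first key of the new (right) node.
    return left, right, right[0]
```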

3. Deletions Causing Combining:

Deleting records may cause tree nodes to contain too few pointers. Then we must combine nodes.
o If we wish to delete ``Downtown'' from the B+-tree of Figure 11.11, this occurs.
o In this case, the leaf node is empty and must be deleted.
o If we wish to delete ``Perryridge'' from the B+-tree of Figure 11.11, the parent is left with only one pointer, and must be coalesced with a sibling node.
o Sometimes higher-level nodes must also be coalesced.
o If the root becomes empty as a result, the tree is one level less deep (Figure 11.13).
o Sometimes the pointers must be redistributed to keep the tree balanced.
o Deleting ``Perryridge'' from Figure 11.11 produces Figure 11.14.

4. To summarize:
o Insertion and deletion are complicated, but require relatively few operations.
o The number of operations required for insertion and deletion is proportional to the logarithm of the number of search keys.
o B+-trees are fast index structures for databases.


B+-Tree File Organization

1. The B+-tree structure is used not only as an index but also as an organizer for records in a file.

2. In a B+-tree file organization, the leaf nodes of the tree store records instead of storing pointers to records, as shown in Fig. 11.17.

3. Since records are usually larger than pointers, the maximum number of records that can be stored in a leaf node is less than the maximum number of pointers in a nonleaf node.

4. However, the leaf nodes are still required to be at least half full.

5. Insertion and deletion in a B+-tree file organization are handled in the same way as in a B+-tree index.

6. When a B+-tree is used for file organization, space utilization is particularly important. We can improve space utilization by involving more sibling nodes in redistribution during splits and merges.

7. In general, if m nodes are involved in redistribution, each node can be guaranteed to contain at least ⌊(m−1)n/m⌋ entries. However, the cost of an update becomes higher as more siblings are involved in the redistribution.

B-TREE INDEX FILES

1. B-tree indices are similar to B+-tree indices.
o The difference is that a B-tree eliminates the redundant storage of search key values.
o In the B+-tree of Figure 11.11, some search key values appear twice.
o A corresponding B-tree (Figure 11.18) allows each search key value to appear only once.
o Thus we can store the index in less space.

Figure 11.8: Leaf and nonleaf node of a B-tree.

2. Advantages:
o Lack of redundant storage (but only marginally different).
o Some searches are faster (the key may be found in a non-leaf node).

3. Disadvantages:
o Leaf and non-leaf nodes are of different sizes (complicates storage).
o Deletion may occur in a non-leaf node (more complicated).

Generally, the structural simplicity of the B+-tree is preferred.


STATIC HASHING

1. Index schemes force us to traverse an index structure. Hashing avoids this.

Hash File Organization

1. Hashing involves computing the address of a data item by computing a function on the search key value.

2. A hash function h is a function from the set of all search key values K to the set of all bucket addresses B.

o We choose a number of buckets to correspond to the number of search key values we will have stored in the database.
o To perform a lookup on a search key value Ki, we compute h(Ki), and search the bucket with that address.
o If two search keys i and j map to the same address, because h(Ki) = h(Kj), then the bucket at that address will contain records with both search key values.
o In this case we have to check the search key value of every record in the bucket to get the ones we want.
o Insertion and deletion are simple.

Hash Functions

1. A good hash function gives an average-case lookup that is a small constant, independent of the

number of search keys.

2. We hope records are distributed uniformly among the buckets.

3. The worst hash function maps all keys to the same bucket.

4. The best hash function maps all keys to distinct addresses.

5. Ideally, distribution of keys to addresses is uniform and random.

6. Suppose we have 26 buckets, and map names beginning with ith letter of the alphabet to the ith

bucket.

o Problem: this does not give uniform distribution.

o Many more names will be mapped to ``A'' than to ``X''.

o Typical hash functions perform some operation on the internal binary machine

representations of characters in a key.

o For example, compute the sum, modulo # of buckets, of the binary representations of

characters of the search key.

o See Figure 11.18, using this method for 10 buckets (assuming the ith character in the

alphabet is represented by integer i).
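The character-sum hash described above might be sketched as follows (assuming, as in the text, that the ith letter of the alphabet is represented by the integer i):

```python
def h(key, n_buckets=10):
    """Sum the integer codes of the letters in the search key
    (a=1, b=2, ...), modulo the number of buckets."""
    return sum(ord(c.lower()) - ord('a') + 1
               for c in key if c.isalpha()) % n_buckets
```

Note that this simple function is far from uniform in practice; it merely illustrates operating on the internal representation of the key rather than on its first letter.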

Handling of bucket overflows

1. The scheme just described, in which overflow buckets may be chained to a bucket, is called closed hashing (overflow chaining): we compute the hash function and search the corresponding bucket and its overflow chain to find a record.

2. Under open hashing, the set of buckets is fixed and there are no overflow chains; a record that does not fit in its bucket is placed in some other bucket according to a policy such as linear probing. (Deletions are difficult.) Open hashing is not used much in database applications.


3. Drawback to our approach: Hash function must be chosen at implementation time.

o Number of buckets is fixed, but the database may grow.

o If number is too large, we waste space.

o If number is too small, we get too many ``collisions'', resulting in records of many

search key values being in the same bucket.

o Choosing the number to be twice the number of search key values in the file gives a

good space/performance tradeoff.

Hash Indices

1. A hash index organizes the search keys with their associated pointers into a hash file structure.

2. We apply a hash function on a search key to identify a bucket, and store the key and its

associated pointers in the bucket (or in overflow buckets).

3. Strictly speaking, hash indices are only secondary index structures, since if a file itself is

organized using hashing, there is no need for a separate hash index structure on it.

DYNAMIC HASHING

1. As the database grows over time, we have three options:

o Choose hash function based on current file size. Get performance degradation as file

grows.

o Choose hash function based on anticipated file size. Space is wasted initially.

o Periodically re-organize hash structure as file grows. Requires selecting new hash

function, recomputing all addresses and generating new bucket assignments. Costly,

and shuts down database.

2. Some hashing techniques allow the hash function to be modified dynamically to accommodate

the growth or shrinking of the database. These are called dynamic hash functions.

o Extendable hashing is one form of dynamic hashing.

o Extendable hashing splits and coalesces buckets as database size changes.

o This imposes some performance overhead, but space efficiency is maintained.

o As reorganization is on one bucket at a time, overhead is acceptably low.

3. How does it work?


Figure 11.9: General extendable hash structure.

o We choose a hash function that is uniform and random and that generates values over a relatively large range.
o The range is b-bit binary integers (typically b = 32).
o 2^32 is over 4 billion, so we do not create that many buckets!
o Instead we create buckets on demand, and do not use all b bits of the hash initially.
o At any point we use i bits, where 0 ≤ i ≤ b.
o The i bits are used as an offset into a table of bucket addresses.
o The value of i grows and shrinks with the database.
o Figure 11.19 shows an extendable hash structure.
o Note that the i appearing over the bucket address table tells how many bits are required to determine the correct bucket.
o It may be the case that several entries point to the same bucket.
o All such entries will have a common hash prefix, but the length of this prefix may be less than i.
o So we give each bucket an integer giving the length of the common hash prefix.
o This is shown in Figure 11.9 (textbook 11.19) as i_j.
o The number of bucket-address-table entries pointing to bucket j is then 2^(i − i_j).

4. To find the bucket containing search key value Kl:
o Compute h(Kl).
o Take the first i high-order bits of h(Kl).
o Look at the corresponding table entry for this i-bit string.
o Follow the bucket pointer in the table entry.
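The lookup steps above amount to one shift on the hash value (a sketch; b = 32 as in the text):

```python
def bucket_table_index(hash_value, i, b=32):
    """Take the first i high-order bits of a b-bit hash value as the
    offset into the bucket address table (0 when i = 0: single entry)."""
    return hash_value >> (b - i) if i > 0 else 0
```

For example, a hash value whose top two bits are 11 lands at offset 1 when i = 1 and at offset 3 when i = 2; as i grows, each old offset simply splits into two adjacent ones.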

5. We now look at insertions in an extendable hashing scheme.

o Follow the same procedure for lookup, ending up in some bucket j.

o If there is room in the bucket, insert information and insert record in the file.

o If the bucket is full, we must split the bucket, and redistribute the records.

o If bucket is split we may need to increase the number of bits we use in the hash.


6. Two cases exist:

1. If i = i_j, then only one entry in the bucket address table points to bucket j.
o Then we need to increase the size of the bucket address table so that we can include pointers to the two buckets that result from splitting bucket j.
o We increment i by one, thus considering more of the hash, and doubling the size of the bucket address table.
o Each entry is replaced by two entries, each containing the original value.
o Now two entries in the bucket address table point to bucket j.
o We allocate a new bucket z, and set the second pointer to point to z.
o Set i_j and i_z to i.
o Rehash all records in bucket j, which are put in either j or z.
o Now insert the new record.
o It is remotely possible, but unlikely, that the new hash will still put all of the records in one bucket.
o If so, split again and increment i again.

2. If i > i_j, then more than one entry in the bucket address table points to bucket j.
o Then we can split bucket j without increasing the size of the bucket address table (why?).
o Note that all entries that point to bucket j correspond to hash prefixes that have the same value on the leftmost i_j bits.
o We allocate a new bucket z, and set i_j and i_z to the original i_j value plus 1.
o Now adjust the entries in the bucket address table that previously pointed to bucket j.
o Leave the first half pointing to bucket j, and make the rest point to bucket z.
o Rehash each record in bucket j as before.
o Reattempt the new insert.

7. Note that in both cases we only need to rehash records in bucket j.

8. Deletion of records is similar. Buckets may have to be coalesced, and bucket address table may

have to be halved.

9. Insertion is illustrated for the example deposit file of Figure 11.20.

o 32-bit hash values on bname are shown in Figure 11.21.

o An initial empty hash structure is shown in Figure 11.22.

o We insert records one by one.

o We (unrealistically) assume that a bucket can only hold 2 records, in order to illustrate

both situations described.

o As we insert the Perryridge and Round Hill records, this first bucket becomes full.

o When we insert the next record (Downtown), we must split the bucket.

o Since i = i_j, we need to increase the number of bits we use from the hash.
o We now use 1 bit, allowing us 2^1 = 2 buckets.

o This makes us double the size of the bucket address table to two entries.


o We split the bucket, placing the records whose search key hash begins with 1 in the

new bucket, and those with a 0 in the old bucket (Figure 11.23).

o Next we attempt to insert the Redwood record, and find it hashes to 1.

o That bucket is full, and i = i_j.

o So we must split that bucket, increasing the number of bits we must use to 2.

o This necessitates doubling the bucket address table again to four entries (Figure 11.24).

o We rehash the entries in the old bucket.

o We continue on for the deposit records of Figure 11.20, obtaining the extendable hash

structure of Figure 11.25.

10. Advantages:
o Extendable hashing provides performance that does not degrade as the file grows.
o Minimal space overhead: no buckets need be reserved for future use; the bucket address table contains only one pointer for each hash value of the current prefix length.

11. Disadvantages:
o An extra level of indirection in the bucket address table.
o Added complexity.

12. Summary: A highly attractive technique, provided we accept added complexity.

COMPARISON OF INDEXING AND HASHING

1. To make a wise choice between the methods seen, database designer must consider the

following issues:

o Is the cost of periodic re-organization of index or hash structure acceptable?

o What is the relative frequency of insertion and deletion?

o Is it desirable to optimize average access time at the expense of increasing worst-case

access time?

o What types of queries are users likely to pose?

2. The last issue is critical to the choice between indexing and hashing. If most queries are of the form

select A1, A2, ..., An
from r
where Ai = c

then to process this query the system will perform a lookup on an index or hash structure for attribute Ai with value c.

8. For these sorts of queries a hashing scheme is preferable.
o An index lookup takes time proportional to the log of the number of values in R for Ai.
o A hash structure provides an average lookup time that is a small constant (independent of database size).

9. However, the worst case favors indexing:
o The hash worst case gives time proportional to the number of values in R for Ai.


o The index worst case is still the log of the number of values in R.

10. Index methods are preferable where a range of values is specified in the query, e.g.

select A1, A2, ..., An
from r
where A ≥ c1 and A ≤ c2

This query finds records with A values in the range from c1 to c2.

o Using an index structure, we can find the bucket for value c1, and then follow the pointer chain to read the next buckets in alphabetic (or numeric) order until we find c2.
o If we have a hash structure instead of an index, we can find a bucket for c1 easily, but it is not easy to find the ``next bucket''.
o A good hash function assigns values randomly to buckets.
o Also, each bucket may be assigned many search key values, so we cannot chain them together.
o To support range queries using a hash structure, we need a hash function that preserves order.
o For example, if v1 and v2 are search key values and v1 < v2, then h(v1) < h(v2).
o Such a function would ensure that buckets are in key order.
o Order-preserving hash functions that also provide randomness and uniformity are extremely difficult to find.

o Thus most systems use indexing in preference to hashing unless it is known in advance

that range queries will be infrequent.

INDEX DEFINITION IN SQL

1. Some SQL implementations include data definition commands to create and drop indices. The IBM SAA-SQL commands are as follows.

o An index is created by

create index <index-name>
on r (<attribute-list>)

o The attribute list is the list of attributes in relation r that form the search key for the index.
o To create an index on bname for the branch relation:

create index b-index
on branch (bname)


o If the search key is a candidate key, we add the word unique to the definition:

create unique index b-index
on branch (bname)

o If bname is not a candidate key, an error message will appear.
o If the index creation succeeds, any attempt to insert a tuple violating this requirement will fail.
o The unique keyword is redundant if primary keys have already been defined with integrity constraints.

2. To remove an index, the command is

drop index <index-name>

MULTIPLE-KEY ACCESS

1. For some queries, it is advantageous to use multiple indices if they exist.

2. If there are two indices on deposit, one on bname and one on balance, suppose we have a query like

select balance
from deposit
where bname = ``Perryridge'' and balance = 1000

9. There are 3 possible strategies to process this query:

o Use the index on bname to find all records pertaining to the Perryridge branch. Examine them to see if balance = 1000.
o Use the index on balance to find all records with balance = 1000. Examine them to see if bname = ``Perryridge''.
o Use the index on bname to find pointers to records pertaining to the Perryridge branch. Use the index on balance to find pointers to records with balance = 1000. Take the intersection of these two sets of pointers.

10. The third strategy takes advantage of the existence of multiple indices. This may still not work

well if

o There are a large number of Perryridge records AND

o There are a large number of 1000 records AND

o Only a small number of records pertain to both Perryridge and 1000.

11. To speed up multiple search key queries special structures can be maintained.
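The third strategy above can be sketched in a few lines. This is illustrative only (the indices are modeled as dictionaries mapping a key value to a set of record ids, and the data is made up): each single-key index yields a pointer set, and the intersection gives the qualifying records.

```python
# Toy single-key indices: key value -> set of record ids (the "pointers").
bname_index = {"Perryridge": {1, 4, 7, 9}, "Downtown": {2, 3}}
balance_index = {1000: {4, 5, 9}, 2000: {1, 2}}

def multi_key_lookup(bname, balance):
    """Intersect the pointer sets from both indices; only records
    satisfying both predicates survive."""
    return bname_index.get(bname, set()) & balance_index.get(balance, set())

result = multi_key_lookup("Perryridge", 1000)  # {4, 9}
```

Note the cost profile matches the caveat in the text: both full pointer sets must be fetched before intersecting, so two large sets with a tiny intersection still mean a lot of index work.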

Grid File

1. A grid structure for queries on two search keys is a 2-dimensional grid, or array, indexed by values for the search keys. Figure 11.10 (textbook 11.31) shows part of a grid structure for the deposit file.


Figure 11.10: Grid structure for deposit file.

2. A particular entry in the array contains pointers to all records with the specified search-key values.
o No special computations need to be done.
o Only the right records are accessed.
o Can also be used for single-search-key queries (one column or row).
o Easy to extend to queries on n search keys: construct an n-dimensional array.
o Significant improvement in processing time for multiple-key queries.
o Imposes space overhead.
o Performance overhead on insertion and deletion.
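A highly simplified sketch of the idea (illustrative only; a real grid file partitions each key's domain into ranges rather than using exact values as cell coordinates): cells are keyed by a (cname, bname) pair and hold record pointers, so a two-key query reads one cell and a single-key query scans one row or column.

```python
from collections import defaultdict

grid = defaultdict(set)  # cell (cname, bname) -> set of record ids

def grid_insert(cname, bname, rid):
    grid[(cname, bname)].add(rid)

grid_insert("Williams", "Perryridge", 10)
grid_insert("Williams", "Downtown", 11)
grid_insert("Smith", "Perryridge", 12)

# Two-key query: exactly one cell, no post-filtering needed.
both = grid[("Williams", "Perryridge")]                                  # {10}
# Single-key query on cname: scan the "Williams" row of cells.
row = {r for (c, b), rs in grid.items() if c == "Williams" for r in rs}  # {10, 11}
```

The space overhead noted in the text is visible here: the number of cells grows with the product of the distinct values (or value ranges) of the two keys.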

Partitioned Hashing

1. Partitioned hashing is an alternative approach to multiple-key queries.
o To construct a structure for queries on deposit involving bname and cname, we construct a hash structure for the key (cname, bname).
o We split this hash function into two parts, one for each part of the key.
o The first part depends only on the cname value.
o The second part depends only on the bname value.
o Figure 11.32 shows a sample partitioned hash function.
o Note that pairs with the same cname or bname value will have 3 bits the same in the appropriate position.
o To find the balance in all of Williams' accounts at the Perryridge branch, we compute h(Williams, Perryridge) and access the hash structure.

2. The same hash structure can be used to answer a query on just one of the search keys:
o Compute the relevant part of the partitioned hash.
o Access the hash structure and scan the buckets for which that part of the hash matches.
o (The text doesn't say so, but the hash structure must have some grid-like form imposed on it to enable searching the structure based on only part of the hash.)
3. Partitioned hashing can also be extended to n-key search.
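The split described above can be sketched as follows. This is an illustrative assumption, not the textbook's function: 3 bits of the bucket number come from cname and 3 from bname, giving 64 buckets. A two-key query hits one bucket; a query on cname alone fixes its 3 bits and scans the 8 buckets that share them.

```python
def part_hash(value, bits=3):
    """One component of the partitioned hash: the low `bits` bits of
    Python's built-in hash (an arbitrary illustrative choice)."""
    return hash(value) & ((1 << bits) - 1)

def h(cname, bname):
    """Full bucket number: cname bits concatenated with bname bits."""
    return (part_hash(cname) << 3) | part_hash(bname)

# Query on both keys: exactly one bucket.
bucket = h("Williams", "Perryridge")

# Query on cname alone: fix the cname bits, scan all 8 bname buckets.
candidates = [(part_hash("Williams") << 3) | b for b in range(8)]
```

This also shows why the structure needs the grid-like organization noted above: the buckets sharing a partial hash must be locatable without computing the full key.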