Chapter 5 Part 2: File Organization and Performance
Modern Database Management, 10th Edition
Jeffrey A. Hoffer, V. Ramesh, Heikki Topi
© 2011 Pearson Education, Inc. Publishing as Prentice Hall



Objectives
- Define terms
- Select appropriate file organizations
- Describe three types of file organization
- Describe indexes and their appropriate use
- Translate a database model into efficient structures
- Know when and how to use denormalization

Physical Records
- Physical Record: A group of fields stored in adjacent memory locations and retrieved together as a unit
- Page: The amount of data read or written in one I/O operation
- Blocking Factor: The number of physical records per page
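As a rough illustration of the blocking factor, the sketch below assumes fixed-length records that never span pages and ignores any per-page overhead. The 4,096-byte page, 300-byte record, and 10,000-record file are made-up numbers, not values from the slides.

```python
import math

def blocking_factor(page_size_bytes: int, record_size_bytes: int) -> int:
    """Whole records that fit in one page (assumes records do not span pages)."""
    return page_size_bytes // record_size_bytes

def pages_needed(num_records: int, factor: int) -> int:
    """Pages (I/O operations) required to read every record once."""
    return math.ceil(num_records / factor)

# Hypothetical sizes chosen only for illustration.
bf = blocking_factor(4096, 300)
pages = pages_needed(10_000, bf)
print(bf, pages)
```

Under these assumptions a larger blocking factor means fewer pages, and so fewer I/O operations, to read the same file.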

Designing Physical Files
- Physical File: A named portion of secondary memory allocated for the purpose of storing physical records
- Tablespace: A named set of disk storage elements in which physical files for database tables can be stored
- Extent: A contiguous section of disk space

File Organizations
- Technique for physically arranging records of a file on secondary storage
- Factors for selecting file organization:
  - Fast data retrieval and throughput
  - Efficient storage space utilization
  - Protection from failure and data loss
  - Minimizing need for reorganization
  - Accommodating growth
  - Security from unauthorized use
- Types of file organizations:
  - Sequential
  - Indexed
  - Hashed

Figure 5-7a Sequential file organization
- Records of the file (1, 2, ..., n) are stored in sequence by the primary key field values
- If not sorted: average time to find desired record = n/2 record reads, where n is the number of records in the file
- If sorted: every insert or delete requires a re-sort
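The n/2 figure can be checked with a small experiment. This is a minimal sketch, assuming a file of n records that is scanned one record at a time from the start and that every record is equally likely to be requested; the keys 1..n are hypothetical.

```python
def sequential_search(records, target_key):
    """Scan records in stored order; return how many records were read."""
    for reads, key in enumerate(records, start=1):
        if key == target_key:
            return reads
    return len(records)  # not found: every record was read

n = 1000
records = list(range(1, n + 1))  # hypothetical primary keys 1..n

# Average number of reads when each record is looked up exactly once.
average_reads = sum(sequential_search(records, k) for k in records) / n
print(average_reads)
```

The exact average is (n + 1) / 2, which is approximately n/2 for large files.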

Figure 5-7b Indexed file organization
- Uses a tree search
- Average time to find desired record = depth of the tree
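To see why tree depth matters, the following sketch estimates the depth of an index tree. It assumes a balanced tree in which every node holds the same number of entries (the fan-out); the one-million-record file and fan-out of 100 are illustrative values only.

```python
def index_depth(num_records: int, fan_out: int) -> int:
    """Smallest number of levels such that fan_out ** levels >= num_records."""
    levels, reachable = 1, fan_out
    while reachable < num_records:
        reachable *= fan_out
        levels += 1
    return levels

# Hypothetical file size and fan-out, chosen for illustration.
depth = index_depth(1_000_000, fan_out=100)
print(depth)
```

With these numbers a lookup touches a handful of index levels, compared with roughly 500,000 record reads for an average sequential scan of the same file.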

Indexed File Organizations
- Indexed File Organization: the storage of records with an index that allows software to locate individual records
- Index: a table or other data structure used to determine (within a file) the location of records that satisfy some condition
- Primary keys are automatically indexed
- Other fields or combinations of fields can also be indexed; these are called secondary keys (or nonunique keys)
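The difference between a primary key index and a secondary (nonunique) key index can be sketched with ordinary dictionaries. This is only an in-memory analogy: the customer rows and the `city` field are hypothetical, and a list position stands in for a record's storage location.

```python
from collections import defaultdict

# Hypothetical file: list position stands in for the record's storage location.
customers = [
    {"id": 11, "city": "Austin"},
    {"id": 23, "city": "Boston"},
    {"id": 35, "city": "Austin"},
]

# Primary key index: each key value identifies exactly one record.
primary_index = {rec["id"]: pos for pos, rec in enumerate(customers)}

# Secondary (nonunique) key index: one value can point to many records.
city_index = defaultdict(list)
for pos, rec in enumerate(customers):
    city_index[rec["city"]].append(pos)

# Use the secondary index to fetch all records that satisfy city = "Austin".
austin_records = [customers[pos] for pos in city_index["Austin"]]
print(primary_index[23], city_index["Austin"])
```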

Figure 5-7c Hashed file organization
- Hash algorithm: usually uses division-remainder to determine record position
- Records with the same position are grouped in lists
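A minimal sketch of division-remainder hashing, with records that hash to the same position collected in a list. The key values and the choice of 11 positions are illustrative assumptions, and the function names are hypothetical.

```python
from collections import defaultdict

def hash_position(key: int, num_positions: int) -> int:
    """Division-remainder hash: the remainder is the record's position."""
    return key % num_positions

def build_hashed_file(keys, num_positions):
    """Group keys that share a position into one list per position."""
    positions = defaultdict(list)
    for key in keys:
        positions[hash_position(key, num_positions)].append(key)
    return dict(positions)

# Hypothetical keys, hashed into 11 positions.
hashed = build_hashed_file([33, 11, 23, 12, 1, 35], num_positions=11)
print(hashed)
```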

CUSTOMER HASH STORAGE STRUCTURE

Bucket #   Slot 1        Slot 2        Slot 3
0          33 Jones …    11 Smith …
1          23 Zale …     12 Gaines …   1 Dane …
2          35 Allen …    2 Hafter …
3          14 Norris …   25 Harris …
4          4 Caine …     15 Elder …
5          16 Doan …     5 Moen …      38 Raines …
6          39 Vale …     27 Hale …     28 Tyne …
7          18 Clark …    29 Kent …
8          8 Ames …
9          20 Lord …     9 Cowell …    42 Hart …
10         32 Bundy …    31 Madoff …

Assume that a set of Customer records has been stored using the hashing method, where the storage location is determined by the remainder from dividing the customer ID by 11 (the # of buckets). Each bucket has enough room (slots) for 3 customer data records. If a bucket is full and a new record is added which belongs in that bucket, the record is placed in the next (higher) available bucket. If the last bucket is full, we roll around to the top bucket and store a record in the first available space.

1. What bucket(s) would be accessed to retrieve data for customer IDs 38, 27, and 49, and which would be found?
2. In which bucket would the following records be stored if they were added to this hashed structure in the order shown: 36, 3, 10, 21?
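One way to work through the two questions is to simulate the structure. The sketch below follows the rules stated on the slide (remainder after dividing by 11, 3 slots per bucket, overflow into the next bucket with wrap-around). It adds one assumption that the slide does not state: no records have ever been deleted, so a lookup can stop at the first bucket that still has a free slot.

```python
NUM_BUCKETS = 11  # from the slide: customer ID is divided by 11
BUCKET_SIZE = 3   # from the slide: 3 slots per bucket

# Customer IDs per bucket (index = bucket #), as shown in the table.
buckets = [
    [33, 11], [23, 12, 1], [35, 2], [14, 25], [4, 15], [16, 5, 38],
    [39, 27, 28], [18, 29], [8], [20, 9, 42], [32, 31],
]

def find(customer_id):
    """Return (list of buckets accessed, whether the ID was found)."""
    accessed = []
    b = customer_id % NUM_BUCKETS
    for _ in range(NUM_BUCKETS):
        accessed.append(b)
        if customer_id in buckets[b]:
            return accessed, True
        if len(buckets[b]) < BUCKET_SIZE:
            # Assumption: no deletions, so a free slot ends the search.
            return accessed, False
        b = (b + 1) % NUM_BUCKETS  # roll around after the last bucket
    return accessed, False

def insert(customer_id):
    """Store the ID in its home bucket or the next one with room; return that bucket."""
    b = customer_id % NUM_BUCKETS
    for _ in range(NUM_BUCKETS):
        if len(buckets[b]) < BUCKET_SIZE:
            buckets[b].append(customer_id)
            return b
        b = (b + 1) % NUM_BUCKETS
    raise RuntimeError("all buckets are full")

lookups = {cid: find(cid) for cid in (38, 27, 49)}          # question 1
placements = [(cid, insert(cid)) for cid in (36, 3, 10, 21)]  # question 2
print(lookups)
print(placements)
```

Note that the order of insertion matters: each record added in question 2 can fill a bucket and push a later record into the next one.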

Figure 6-8 Join indexes: speed up join operations

a) Join index for common non-key columns

b) Join index for matching foreign key (FK) and primary key (PK)
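The idea behind a join index on a matching FK and PK can be sketched as a precomputed list of row ID pairs. The tables, column names, and the `build_join_index` helper below are hypothetical, and a row's list position serves as its row ID; this is an illustration of the concept, not how any particular DBMS stores a join index.

```python
# Hypothetical tables; a row's list position serves as its row ID.
customers = [
    {"cust_id": "C1", "name": "Jones"},
    {"cust_id": "C2", "name": "Smith"},
]
orders = [
    {"order_id": "O1", "cust_id": "C1"},
    {"order_id": "O2", "cust_id": "C2"},
    {"order_id": "O3", "cust_id": "C1"},
]

def build_join_index(parent_rows, child_rows, key):
    """Pair each child row ID with the row ID of the parent it references."""
    parent_rid = {row[key]: rid for rid, row in enumerate(parent_rows)}
    return [
        (parent_rid[row[key]], child_rid)
        for child_rid, row in enumerate(child_rows)
        if row[key] in parent_rid
    ]

join_index = build_join_index(customers, orders, "cust_id")

# The join is answered by following the stored pairs, with no value comparison.
joined = [(customers[p]["name"], orders[c]["order_id"]) for p, c in join_index]
print(join_index)
print(joined)
```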

Cluster Storage
- Store records from two (or more) tables together on the same physical record
- E.g., I may almost always retrieve the set of ITEM_SOLD associated with a SALE every time I retrieve a SALE record
- If so, I will store them as a cluster to speed up retrieval

Clustering Files
- In some relational DBMSs, related records from different tables can be stored together in the same disk area
- Useful for improving performance of join operations
- Primary key records of the main table are stored adjacent to associated foreign key records of the dependent table
- E.g., Oracle has a CREATE CLUSTER command
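The layout described above can be pictured with the SALE and ITEM_SOLD example from the Cluster Storage slide. This is a conceptual sketch only: the rows and the `cluster` helper are hypothetical, and it does not reproduce Oracle's CREATE CLUSTER behavior or any real on-disk format.

```python
from collections import defaultdict

# Hypothetical rows for the SALE and ITEM_SOLD tables.
sales = [{"sale_id": 1}, {"sale_id": 2}]
items_sold = [
    {"sale_id": 2, "item": "lamp"},
    {"sale_id": 1, "item": "desk"},
    {"sale_id": 1, "item": "chair"},
]

def cluster(parents, children, key):
    """Lay out each parent row immediately followed by its child rows."""
    by_key = defaultdict(list)
    for child in children:
        by_key[child[key]].append(child)
    layout = []
    for parent in parents:
        layout.append(("SALE", parent))
        layout.extend(("ITEM_SOLD", child) for child in by_key[parent[key]])
    return layout

clustered = cluster(sales, items_sold, "sale_id")
for table_name, row in clustered:
    print(table_name, row)
```

Because each SALE row sits next to its ITEM_SOLD rows, retrieving a sale together with its items reads one contiguous run instead of two separate areas.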

Rules for Using Indexes
1. Use on larger tables
2. Index the primary key of each table
3. Index search fields (fields frequently in WHERE clause). Foreign keys?
4. Index fields in SQL ORDER BY and GROUP BY clauses
5. Index fields that have >100 distinct values, but not fields that have <30 distinct values
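Rule 5 can be turned into a quick check on a column's data. The sketch below is one hedged reading of the rule: it counts distinct values and applies the two thresholds from the slide. The `index_advice` function and the sample columns are hypothetical, and the rule itself gives no guidance for columns with between 30 and 100 distinct values.

```python
def index_advice(column_values):
    """Apply the >100 / <30 distinct-value rule of thumb to one column."""
    distinct = len(set(column_values))
    if distinct > 100:
        return "index"
    if distinct < 30:
        return "do not index"
    return "judgment call"  # the rule is silent between 30 and 100

# Hypothetical columns: a near-unique ID and a two-valued status flag.
print(index_advice(range(5000)))
print(index_advice(["open", "closed"] * 2500))
```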