© 2011 Pearson Education, Inc. Publishing as Prentice Hall
Chapter 5 Part 2: File Organization and Performance
Modern Database Management, 10th Edition
Jeffrey A. Hoffer, V. Ramesh, Heikki Topi
Objectives
  Define terms
  Select appropriate file organizations
  Describe three types of file organization
  Describe indexes and their appropriate use
  Translate a database model into efficient structures
  Know when and how to use denormalization
Physical Records
  Physical Record: A group of fields stored in adjacent memory locations and retrieved together as a unit
  Page: The amount of data read or written in one I/O operation
  Blocking Factor: The number of physical records per page
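The blocking factor definition can be illustrated with a little arithmetic. The sketch below assumes fixed-length records that never span pages and ignores any page header overhead. The slide states none of this, and the function names and the sample sizes (4096-byte pages, 300-byte records) are hypothetical.

```python
def blocking_factor(page_size_bytes, record_size_bytes):
    """Number of whole physical records that fit on one page.

    Assumes fixed-length records that do not span pages and
    no per-page header overhead (simplifying assumptions).
    """
    if page_size_bytes <= 0 or record_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    factor = page_size_bytes // record_size_bytes
    if factor == 0:
        raise ValueError("record does not fit on a single page")
    return factor


def pages_needed(n_records, page_size_bytes, record_size_bytes):
    """Pages (I/O units) required to hold n_records, rounded up."""
    factor = blocking_factor(page_size_bytes, record_size_bytes)
    return -(-n_records // factor)  # ceiling division


if __name__ == "__main__":
    print(blocking_factor(4096, 300))
    print(pages_needed(1000, 4096, 300))
```

A larger blocking factor means fewer pages, and therefore fewer I/O operations, to read the same number of records.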
Designing Physical Files
  Physical File: A named portion of secondary memory allocated for the purpose of storing physical records
  Tablespace: A named set of disk storage elements in which physical files for database tables can be stored
  Extent: A contiguous section of disk space
File Organizations
  Technique for physically arranging records of a file on secondary storage
  Factors for selecting file organization:
    Fast data retrieval and throughput
    Efficient storage space utilization
    Protection from failure and data loss
    Minimizing need for reorganization
    Accommodating growth
    Security from unauthorized use
  Types of file organizations:
    Sequential
    Indexed
    Hashed
Figure 5-7a Sequential file organization
  Records of the file are stored in sequence by the primary key field values (records 1, 2, ..., n)
  If not sorted: average time to find desired record = n/2
  If sorted: every insert or delete requires a re-sort
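The n/2 estimate can be checked with a small sketch of a linear scan. It assumes every record is equally likely to be requested and that keys are unique. The function names are illustrative only. Counted exactly, the mean is (n + 1)/2 probes, which is approximately n/2 for large n.

```python
def linear_search(keys, target):
    """Scan keys in stored order.

    Returns (position, probes); position is None when the
    target is absent, in which case every record was probed.
    """
    for probes, key in enumerate(keys, start=1):
        if key == target:
            return probes - 1, probes
    return None, len(keys)


def average_probes(keys):
    """Mean number of probes when each stored key is looked up once."""
    total = sum(linear_search(keys, key)[1] for key in keys)
    return total / len(keys)


if __name__ == "__main__":
    keys = list(range(1, 1001))  # n = 1000 records
    print(average_probes(keys))
```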
Figure 5-7b Indexed file organization
  Uses a tree search
  Average time to find desired record = depth of the tree
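To get a feel for how small the depth of the tree can be, the sketch below estimates the number of index levels needed to reach any one of n records. It assumes every index node holds the same number of pointers. That number, called fanout here, is a hypothetical parameter not given on the slide, and real index structures vary.

```python
def index_levels(n_records, fanout):
    """Smallest number of levels such that fanout ** levels >= n_records.

    Uses integer arithmetic only, so there is no floating-point
    rounding as there could be with a logarithm.
    """
    if n_records < 1:
        raise ValueError("n_records must be at least 1")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 1
    capacity = fanout
    while capacity < n_records:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    print(index_levels(1_000_000, 100))
    print(index_levels(1_000_000, 2))
```

Under these assumptions the number of levels grows logarithmically with n, in contrast to the n/2 average of a sequential scan.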
Indexed File Organizations
  Indexed File Organization: the storage of records with an index that allows software to locate individual records
  Index: a table or other data structure used to determine (within a file) the location of records that satisfy some condition
  Primary keys are automatically indexed
  Other fields or combinations of fields can also be indexed; these are called secondary keys (or nonunique keys)
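The difference between a primary-key index and a secondary (nonunique) key index can be sketched with plain dictionaries. This is only an illustration of the idea, not how any particular DBMS stores its indexes. The record layout and the "city" field are hypothetical.

```python
def build_primary_index(records, key_field):
    """Map each unique key value to the position of its record."""
    return {record[key_field]: pos for pos, record in enumerate(records)}


def build_secondary_index(records, field):
    """Map each field value to the positions of ALL matching records.

    Secondary keys may repeat, so each entry is a list of positions.
    """
    index = {}
    for pos, record in enumerate(records):
        index.setdefault(record[field], []).append(pos)
    return index


if __name__ == "__main__":
    # Hypothetical sample data; the "city" values are made up.
    customers = [
        {"id": 33, "name": "Jones", "city": "Austin"},
        {"id": 11, "name": "Smith", "city": "Boston"},
        {"id": 23, "name": "Zale", "city": "Austin"},
    ]
    by_id = build_primary_index(customers, "id")
    by_city = build_secondary_index(customers, "city")
    print(customers[by_id[11]])
    print([customers[pos] for pos in by_city["Austin"]])
```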
Figure 5-7c Hashed file organization
  Hash algorithm: usually uses division-remainder to determine record position
  Records with the same position are grouped in lists
CUSTOMER HASH STORAGE STRUCTURE
Bucket #   Customer records (ID, name, …)
0          33 Jones …    11 Smith …
1          23 Zale …     12 Gaines …   1 Dane …
2          35 Allen …    2 Hafter …
3          14 Norris …   25 Harris …
4          4 Caine …     15 Elder …
5          16 Doan …     5 Moen …      38 Raines …
6          39 Vale …     27 Hale …     28 Tyne …
7          18 Clark …    29 Kent …
8          8 Ames …
9          20 Lord …     9 Cowell …    42 Hart …
10         32 Bundy …    31 Madoff …
Assume that a set of Customer records has been stored using the hashing method, where the storage location is determined by the remainder from dividing the customer ID by 11 (the number of buckets). Each bucket has enough room (slots) for 3 customer data records. If a bucket is full and a new record is added that belongs in that bucket, the record is placed in the next (higher) available bucket. If the last bucket is full, we roll around to the top bucket and store the record in the first available space.
1. What bucket(s) would be accessed to retrieve data for customer IDs 38, 27, and 49, and which would be found?
2. In which bucket would each of the following records be stored if they were added to this hashed structure in the order shown: 36, 3, 10, 21?
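One way to work through the two questions is to simulate the structure. The sketch below follows the rules stated above: division-remainder by 11, three slots per bucket, overflow into the next bucket with wrap-around. It also adds one assumption that the exercise does not state: a search stops as soon as it reaches a bucket with a free slot, which is only valid if records are never deleted. The bucket contents are read from the CUSTOMER HASH STORAGE STRUCTURE table, and the function names are illustrative.

```python
NUM_BUCKETS = 11       # divisor used by the hash function
SLOTS_PER_BUCKET = 3   # capacity of each bucket


def build_buckets():
    """Customer IDs per bucket, as shown in the storage structure."""
    return [
        [33, 11],      # bucket 0
        [23, 12, 1],   # bucket 1
        [35, 2],       # bucket 2
        [14, 25],      # bucket 3
        [4, 15],       # bucket 4
        [16, 5, 38],   # bucket 5
        [39, 27, 28],  # bucket 6 (27 overflowed from bucket 5)
        [18, 29],      # bucket 7
        [8],           # bucket 8
        [20, 9, 42],   # bucket 9
        [32, 31],      # bucket 10 (31 overflowed from bucket 9)
    ]


def find(buckets, customer_id):
    """Return (buckets_accessed, found) for a lookup."""
    accessed = []
    bucket = customer_id % NUM_BUCKETS
    for _ in range(NUM_BUCKETS):
        accessed.append(bucket)
        if customer_id in buckets[bucket]:
            return accessed, True
        if len(buckets[bucket]) < SLOTS_PER_BUCKET:
            # Assumption: with no deletions, a bucket that still has
            # room means the record cannot have overflowed past it.
            return accessed, False
        bucket = (bucket + 1) % NUM_BUCKETS  # wrap around after the last bucket
    return accessed, False


def insert(buckets, customer_id):
    """Store the ID in the first bucket with a free slot; return its number."""
    bucket = customer_id % NUM_BUCKETS
    for _ in range(NUM_BUCKETS):
        if len(buckets[bucket]) < SLOTS_PER_BUCKET:
            buckets[bucket].append(customer_id)
            return bucket
        bucket = (bucket + 1) % NUM_BUCKETS
    raise OverflowError("all buckets are full")


if __name__ == "__main__":
    buckets = build_buckets()
    for customer_id in (38, 27, 49):
        print(customer_id, find(buckets, customer_id))
    for customer_id in (36, 3, 10, 21):
        print(customer_id, "stored in bucket", insert(buckets, customer_id))
```

Order matters for the inserts: each one can fill a bucket and push a later record into the next bucket.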
Figure 6-8 Join indexes: speed up join operations
  a) Join index for common non-key columns
  b) Join index for matching foreign key (FK) and primary key (PK)
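As a rough illustration of the FK/PK case, a join index can be thought of as a precomputed list of (parent row, child row) position pairs. The sketch below builds such a list in memory. The table and column names are hypothetical, and a real DBMS would store and maintain the index quite differently.

```python
def build_join_index(parent_rows, child_rows, pk, fk):
    """Pair each child row with the parent row whose PK equals its FK.

    Returns a list of (parent_position, child_position) tuples,
    in child-row order. Child rows with no matching parent are skipped.
    """
    pk_to_position = {row[pk]: pos for pos, row in enumerate(parent_rows)}
    return [
        (pk_to_position[row[fk]], child_pos)
        for child_pos, row in enumerate(child_rows)
        if row[fk] in pk_to_position
    ]


if __name__ == "__main__":
    # Hypothetical sample tables.
    customers = [{"customer_id": 10}, {"customer_id": 20}]
    orders = [
        {"order_id": 1, "customer_id": 20},
        {"order_id": 2, "customer_id": 10},
        {"order_id": 3, "customer_id": 20},
    ]
    print(build_join_index(customers, orders, "customer_id", "customer_id"))
```

With the pairs precomputed, a join can follow them directly instead of comparing key values at query time.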
Cluster Storage
  Store records from two (or more) tables together on the same physical record
  E.g., I may almost always retrieve the set of ITEM_SOLD associated with a SALE every time I retrieve a SALE record
  If so, I will store them as a cluster to speed up retrieval
Clustering Files
  In some relational DBMSs, related records from different tables can be stored together in the same disk area
  Useful for improving performance of join operations
  Primary key records of the main table are stored adjacent to associated foreign key records of the dependent table
  E.g., Oracle has a CREATE CLUSTER command
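The adjacency idea can be sketched as a storage order in which each main-table record is followed immediately by its dependent records. The example reuses the SALE and ITEM_SOLD names from the Cluster Storage slide. The column names and sample rows are hypothetical, and this is a conceptual model rather than how Oracle's CREATE CLUSTER lays out data on disk.

```python
def cluster_layout(parent_rows, child_rows, key, parent_tag, child_tag):
    """Return rows in clustered order: each parent, then its children.

    Each entry is (table_tag, row). Children are matched to parents
    by equal values in the shared key column.
    """
    children_by_key = {}
    for row in child_rows:
        children_by_key.setdefault(row[key], []).append(row)

    layout = []
    for parent in parent_rows:
        layout.append((parent_tag, parent))
        for child in children_by_key.get(parent[key], []):
            layout.append((child_tag, child))
    return layout


if __name__ == "__main__":
    # Hypothetical sample rows.
    sales = [{"sale_id": 1}, {"sale_id": 2}]
    items_sold = [
        {"sale_id": 2, "item": "A"},
        {"sale_id": 1, "item": "B"},
        {"sale_id": 2, "item": "C"},
    ]
    for tag, row in cluster_layout(sales, items_sold, "sale_id", "SALE", "ITEM_SOLD"):
        print(tag, row)
```

Because a sale and its items sit next to each other in this order, retrieving them together touches neighboring storage instead of two separate areas.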
Rules for Using Indexes
1. Use on larger tables
2. Index the primary key of each table
3. Index search fields (fields frequently in WHERE clause). Foreign keys?
4. Index fields in SQL ORDER BY and GROUP BY commands
5. Index when there are >100 distinct values for the field, but not when there are <30 values