the worlds of database systems chapter 1. database management systems (dbms) dbms: powerful tool for...
TRANSCRIPT
The Worlds of Database Systems
Chapter 1
Database Management Systems (DBMS)
• DBMS: Powerful tool for creating and managing large amounts of data efficiently and allowing it to persist over long periods of time safely
• DBMS capabilities– Persistent storage– Programming interface– Transaction management
DBMS vs File Systems
• Conventional file systems are inadequate as database systems.
• They fail to support:– Efficient search– Efficient modifications to small pieces of data– Complex queries– Controlled buffering of useful data in MM– Atomic and independent execution of
transactions
DBMS Functionalities
• Allows users to create new databases
• Allows users to querry data
• Support the storage of very large amounts of data
• Controls access to data from many users at once
Early Database Management Systems
• Airline reservation systems
• Banking systems
• Corporate Records
Airline Reservation System• Data
– Reservations (flight no, seat, meal prefer.)– Flights (src, dest, departure/arrival times, etc)– Prices
• Queries– Flights on a given date, time, city, seats available,
prices• Modifications
– Booking a flight, assigning a seat• Requirements
– Concurrent access by many agents (make sure no two agents assign the same seat simultaneously)
– Avoid loss in case of a sudden system failure
Relational Databases
• Based on the relational model
• Separates the logical view from the physical view of the data.
Student Course Quarter
Charles CS 444 Fall, 1997
Dan CS 142 Winter,1998
… … …
Querying a Database
• Find all the students who have taken CS444 in Fall, 1997.
• S(tructured) Q(uery) L(anguage)– select E.name– from Enroll E– where E.course=CS444 and – E.quarter=“Fall, 1997”
• Query processor figures out how to answer the query efficiently.
Database Industry
• Relational databases are a great success of theoretical ideas.
• “Big 3” DBMS companies are among the largest software companies in the world.
• IBM (with DB2) and Microsoft (SQL Server, Microsoft Access) are also important players.
• $20B industry• Challenged by object oriented DBMS.
Evolution • Smaller and smaller systems
– DBMS’s could only be run on large computers (only large computers could store gigabyte)
– Now, feasible to run a DBMS on a PC
• Bigger and bigger systems– Many applications need hundreds of gigabyte
• E.g. A retail chain may store terabytes (1012)
– Newer databases store images, audio, video, etc.• E.g., database of sattellites images store petabytes (1015)
– Modest size databases stored in Secondary Storage (Arrays of disks)
Dealing with large data
• One could argue that what distinguishes database systems from other software is, database systems assume that data is too big to fit in main memory; thus kept on disk
• Tertiary Storage– Capacity of terabytes– Access times seconds (ms for disks)– CD’s DVD’s may be the storage medium
• Parallel Computing– Satisfies the need of fast access to large data– Speeds up through parallel access to disks/devices– Needs efficient algorithms to intelligently break queries up
• Client-Server and Multi-Tier Architectures– Major database components at the server and the
client used to interface with the user– In multi-tier architectures, client is an application
server, manages connections to the database, transactions, authorization, etc.
– Client application server database server• Multimedia Data• Information Integration
– Providing a unified view to the database user even if data comes from different sources(legacy databases) using different structures to represent information
– Data Warehouses copy information from many legacy databases with appropriate translation to a central database;
• Warehouses reconstructed each night to reflect updates on legacy databases
Overview of a DBMS• DDL: Data Definition Language
– Capable of altering the metadata (schema or structure/constraints information of the database)
– Requires special authority (i.e., used by database administrator)
• Storage and Buffer Management– Storage manager keeps track of the location of files
on the disk and obtains the blocks containing a file on request from the buffer manager which keeps portions of the disk contents in MM. Information that various components may need include
• Data, Metadata, Statistics• İndexes: data structures that support efficient access to the
data
DBMS Components
Transaction Processing– Group of one or more database operations into a
transaction which must be executed atomically and in isolation of other trans.
– A DBMS guarantees durability: the work of a completed transaction will never be lost.
– The Transaction processor performs 3 tasks.1) Logging: Every change in the database is logged on disk
(through buffer manager) to enable recovery in case of a crash2) concurrency control assures
* atomicity: a transaction is performed either completely or not at all
* isolation: transactions are executed as if there were no other concurrently executing transactions (uses locks)
3) deadlock resolution (“roll back” or “abort” some transaction)
• Querry Processing• Querry compiler
– Parser: builds a tree structure from the textual form of the query
– Preprocessor: performs semantic checks on the query and translates parse tree into algebraic operators representing the initial query plan
– Optimizer: transforms the initial query plan into the best available sequence of operations on actual data
• Execution engine– Executes the steps of the chosen query plan– Needs to interact with most of the other components of
the DBMS
DBMS Components
DBMS Studies
• Database Design– E-R Model– Relational data model– ODL
• Database programming– SQL– OQL (Object Query Language)
• Database system implementation– Storage management– Query processing– Transaction management