CHAPTER 1 INTRODUCTION

Overview of DBMS (Database Management System)

A DBMS is often loosely defined as a collection of logically related data together with a set of programs to access that data. Strictly speaking, this is the definition of a “Database System”, which comprises two components: (i) the Database and (ii) the DBMS.

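[Figure: USER QUERIES are submitted to the DATABASE SYSTEM, which consists of the DBMS (Query Processing Software and Storage Management Software) and the DATABASE (Schema Definition and DATA).]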

DATABASE

A Database is a collection of logically related data that can be recorded. The information stored in the database must have the following implicit properties:-

(a) It must represent some real-world aspect, like a college or a company. The aspect represented by the database is called its “Mini-world”.

(b) It must comprise a logically coherent collection of data, which should have a well-understood inherent meaning (semantics).

(c) The repository of data must be designed, developed and implemented for a specific purpose. There must exist an intended group of users, who must have some pre-conceived applications of the data.

Praveen Kumar


A Database System will have the following major elements:-

- Sources of information, from where it derives its data.
- Some related real-world events, which influence its data.
- Some intended users, who would be interested in its data.

For example, in the college database, the sources of information will be students, faculty, labs etc. The real-world events affecting the information in the database will be admissions, exams, results, placements etc. The set of intended users will be faculty, students, admin staff etc.

Database Management System (DBMS)

A Database Management System (DBMS) is a set of programs for defining, creating, maintaining and manipulating a database. A DBMS must facilitate the following major functions:-

- Defining of Database Schema:- The DBMS must facilitate defining the database structure, i.e. defining the data types, the relationships amongst the data and the integrity constraints to be enforced on the database. It should also facilitate specifying the access rights of authorized users.

- Manipulation of the Database:- The DBMS must facilitate functions like:-

  - Insertion of new data into the database
  - Update of changed information
  - Deletion of data that has become defunct
  - Reading of stored information, including generation of reports

- Sharing of a Database:- The DBMS must enable concurrent access to shared data items by multiple users, while preserving the consistency of the database.

- Protection of a Database:- The DBMS must protect the database against unauthorized or malicious access.

- Database Recovery:- In the event of system failures, the DBMS must facilitate recovery of the database.
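The first two functions above (schema definition and manipulation) can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the table `student`, its columns and the sample rows are hypothetical and not part of these notes. Sharing, protection and recovery are not shown here.

```python
import sqlite3


def demo_basic_functions():
    """Define a schema, then insert, update, delete and read rows."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database

    # Schema definition: data types plus a simple integrity constraint.
    conn.execute(
        "CREATE TABLE student ("
        " roll_no INTEGER PRIMARY KEY,"
        " s_name TEXT NOT NULL,"
        " semester INTEGER CHECK (semester BETWEEN 1 AND 8))"
    )

    # Manipulation: insertion of new data.
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, "Asha", 3), (2, "Ravi", 5), (3, "Meena", 8)],
    )
    # Manipulation: update of changed information.
    conn.execute("UPDATE student SET semester = 4 WHERE roll_no = 1")
    # Manipulation: deletion of defunct data.
    conn.execute("DELETE FROM student WHERE roll_no = 3")
    # Manipulation: reading of stored information.
    rows = conn.execute(
        "SELECT roll_no, s_name, semester FROM student ORDER BY roll_no"
    ).fetchall()
    conn.close()
    return rows
```

Calling `demo_basic_functions()` returns the rows that remain after the update and the delete.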

File Processing System

Before the evolution of DBMS, dedicated systems known as “File-Processing Systems” were in vogue to handle the data repositories of organizations. Such systems needed a dedicated set of application programs to add information to the files, to extract information from the files and to update the existing information. The structure of the files used to be hard-coded in the application programs. Normally, such application programs were written by different programmers in different languages, as and when the need arose. Also, the same information used to be stored at multiple places, in different formats, on different machines, which were often not even interconnected.

Limitations of a File Processing System

(i) Data Redundancy and Inconsistency:- Since the same information is stored at multiple places, an update applied to one copy but not to the others leaves the copies inconsistent.

(ii) Difficulty in Accessing Data:- Suppose some information exists in the files, but the existing set of application programs does not support extraction of that information. In such situations, the application programs need to be updated, which is an inconvenient, time-consuming and costly solution.

(iii) Data Isolation:- The information is scattered over a large number of files, on a number of stand-alone (not networked) machines, making it very difficult to process queries that need information to be extracted from multiple locations.

(iv) Difficulty in Enforcing Integrity Constraints:- Integrity constraints have to be enforced at the application program level, making the programs very complex. The redundancy of information makes this task all the more difficult.

(v) Atomicity Problems:- Since the information needed to roll back a transaction may not be readily available in a file-processing system, ensuring atomicity of transactions is difficult.

(vi) Difficulty in Concurrency Control:- It is complex to build concurrency control features into the application programs.

(vii) Security Problems:- Since the information is scattered and does not have a centralized access path, user access rights cannot be enforced reliably.

Features of a Database System

A DBMS will support the following features:-

(a) Data Dictionary:- A Database System will support a Data Dictionary (also called Data Directory or DBMS Catalog), which contains information like the Data Types, the Relationships amongst the data and the Data Constraints of the underlying database. In addition, it contains information about the Authorized Users of the database, such as their Access Rights. Since this information defines the nature of the data stored in the database, it is called metadata (data about data). This information makes the DBMS software independent of its underlying database. When a need arises to change the structure of the data, no changes need to be made to the DBMS software; only the dictionary is updated to reflect the changes. In a file processing system, by contrast, the application programs would need to be changed. This feature also makes the DBMS software generic. The same DBMS can be used by different organizations having entirely different sets of data; the distinguishing feature will be the information stored in the Data Dictionary. This feature of a DBMS is generally referred to as the ‘Self-Describing Nature of a Database’, since the information stored in the Data Dictionary fully describes the nature of the data stored in the Database.

(b) Storage Management:- The DBMS supports a File Manager to manage the allocation of disk space for the DBMS files. It also supports a Buffer Manager to manage the memory buffers used for processing database information. Whenever some information is to be updated, it is first read from the files into the buffer, where it is manipulated, and then the updated information is written back to the files.

(c) Language Interfaces:- The DBMS supports interfaces to 4GL languages like PL/SQL for writing data manipulation applications.

(d) Transaction Management:- The DBMS ensures atomicity of transaction processing. A Transaction, when executed, transforms the database from one consistent state to another consistent state. During its execution, a log of all the operations performed by the Transaction is maintained in a system Log File. If a Transaction fails during its execution, the log file is used to roll back the transaction during recovery of the database. This ensures atomicity of transaction processing.

(e) Concurrency Control:- The DBMS will support concurrency control tools that permit multiple users or application programs to access the database concurrently, while preserving the consistency of the database.

(f) Security Management:- The Security Mechanism of a Database System will ensure that only authorized users can access the database, and only to the extent explicitly authorized by the Database Administrator. The authorized Access Rights are explicitly stored in the Data Dictionary. The access by each user and the type of operations performed on various data will be monitored and controlled by the DBMS. This will protect the database against unauthorized or malicious access.

(g) Database Recovery:- Since the DBMS maintains a log of all transactions being executed, it will enable recovery of the underlying database in the event of failures. For example, if a Transaction fails during its execution, it is rolled back to its initial state, thus reverting to the consistent state that existed prior to the commencement of the failed transaction. This is made possible by the information stored in the system log file. Also, the DBMS will support taking periodic backups, which are used to recover databases in case of catastrophic failures, like a Disk Crash.
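The atomicity described in item (d) above can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module, whose connection object commits on success and rolls back when an exception escapes a `with` block; the `account` table, the account numbers and the helper names are assumptions made for the example.

```python
import sqlite3


def make_bank():
    """Create a hypothetical two-account database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
            (bal,) = conn.execute(
                "SELECT balance FROM account WHERE acc_no = ?", (src,)
            ).fetchone()
            if bal < 0:
                # Abort: both updates above are undone together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    return conn.execute(
        "SELECT acc_no, balance FROM account ORDER BY acc_no"
    ).fetchall()
```

A transfer that would overdraw the source account leaves both balances exactly as they were, which is the "consistent state to consistent state" behaviour described above.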

DATA MODELS

A Data Model defines the underlying structure of a Database. It comprises a collection ofconceptual tools for describing the Data, the Data Relationships, the Data Semantics andthe Data Integrity Constraints.

CATEGORIES OF DATA MODELS

Basically, the data models discussed here fall into two categories:-

(a) Object Based Logical Models.
(b) Record Based Logical Models.

(a) Object Based Logical Models.

The Object Based Logical Models view the universe as a collection of objects.

(i) Entity-Relationship Model.

- An Entity refers to a real-world ‘object’ or ‘concept’ that is distinguishable from other objects and concepts in the real world. For example, a person, a bank account and a payment are all entities of different kinds.

- An Entity has a set of properties, known as Attributes; for example, the Entity “Account” may have attributes like “Account-Number”, “Current-Balance” etc.

- Each attribute has a set of permitted values, called its Domain; for example, the domain of the Balance of an account can be the set of positive real numbers.

- A collection of entities of the same kind, having the same set of attributes, is called an “Entity Set”.

- A Relationship refers to an association amongst entities. For example, in a banking database, an entity ‘Customer’ can have the relationship ‘Depositor’ with another entity ‘Account’.

- A set of Relationships of the same kind, having the same set of attributes, is called a “Relationship Set”.

- A database in the E-R Model is modeled as a collection of Entity Sets and Relationship Sets.

- The E-R Model also specifies certain constraints, like Mapping Cardinalities, i.e. whether a relationship is one-to-one, one-to-many, many-to-one or many-to-many.

- The E-R Diagram below depicts two Entity Sets “STUDENT” and “COURSE” and a Relationship Set “RESULT”, indicating the marks obtained by students in different courses.

[Figure: E-R diagram. Entity set STUDENT (Roll_No, S_Name, S-Address) and entity set COURSE (Sub_Code, Sub_Title) are connected by the relationship set RESULT (Marks).]
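The E-R vocabulary above (entity, attribute, domain, entity set, relationship set) can be made concrete with a small sketch. It uses plain Python dataclasses rather than any DBMS; the sample students and courses, the 0 to 100 domain for marks and the helper `marks_of` are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:  # an entity with attributes Roll_No, S_Name, S_Address
    roll_no: int
    s_name: str
    s_address: str


@dataclass(frozen=True)
class Course:  # an entity with attributes Sub_Code, Sub_Title
    sub_code: str
    sub_title: str


@dataclass(frozen=True)
class Result:  # a relationship between a Student and a Course
    student: Student
    course: Course
    marks: int  # attribute of the relationship itself

    def __post_init__(self):
        # Domain of Marks: assumed here to be the integers 0..100.
        if not 0 <= self.marks <= 100:
            raise ValueError("marks outside the domain 0..100")


asha = Student(1, "Asha", "Delhi")
ravi = Student(2, "Ravi", "Pune")
dbms = Course("CS301", "DBMS")
os_course = Course("CS302", "Operating Systems")

student_set = {asha, ravi}       # entity set STUDENT
course_set = {dbms, os_course}   # entity set COURSE
result_set = {                   # relationship set RESULT (many-to-many)
    Result(asha, dbms, 82),
    Result(asha, os_course, 74),
    Result(ravi, dbms, 65),
}


def marks_of(student):
    """Map each course code taken by `student` to the marks obtained."""
    return {r.course.sub_code: r.marks for r in result_set if r.student == student}
```

Because one student appears in several results and one course appears in several results, RESULT here has a many-to-many mapping cardinality.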

(ii) Object-Oriented Model. Like the E-R Model, this model also models a database as a collection of Objects. An Object encapsulates Data (Variables) as well as the Methods (Functions) that manipulate that Data. Objects that contain the same types of Variables and the same types of Functions are grouped together as a Class. Thus, a Class may be viewed as a Type Definition for its Objects. The only way an Object “A” can access the Data Items of another Object “B” is by invoking the Methods of “B”, which “A” does by making calls through B’s Interface. The Methods defined within an Object are made visible to the external world through its Interface.

[Figure: an OBJECT encapsulating its Variables and Functions, accessible to other objects only through its Interface.]

The structure of an object-oriented database is modeled as a set of classes, and the database comprises objects belonging to those classes.
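The idea that one object reaches another object's data only through its methods can be sketched as below. The classes `Account` and `Customer` and their methods are hypothetical. Note that Python marks private data by convention (a leading underscore) rather than enforcing it, so this illustrates the principle and is not a strict implementation of it.

```python
class Account:
    """Object "B": its variables are reachable only through its methods."""

    def __init__(self, number, balance=0.0):
        self._number = number    # private by convention
        self._balance = balance  # private by convention

    # The two methods below form the interface of the object.
    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self):
        return self._balance


class Customer:
    """Object "A": it never touches Account's variables directly."""

    def __init__(self, name, account):
        self.name = name
        self._account = account

    def pay_in(self, amount):
        # Access goes through Account's interface only.
        self._account.deposit(amount)
        return self._account.balance()
```

For example, `Customer("Asha", Account("A-101", 100.0)).pay_in(50.0)` changes the balance without `Customer` ever reading or writing `_balance` itself.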

(b) Record Based Logical Models. These models describe data at the Logical level, as a collection of fixed-format Records of different types. Each Record Type has a fixed number of Fields (or Attributes) and each Field is usually of fixed length. Use of fixed-length Records simplifies the Physical Level implementation of a database. The most widely used Record Based Logical Models are:-

(i) Hierarchical Model. This is one of the oldest models, dating back to the 1960s. One of the earliest commercial DBMSs based on this model was the “Information Management System” (IMS), which IBM began developing in 1966 and first released in 1968. At one time, it was the most used DBMS. In the Hierarchical Model, the Data is represented as Records, and the Records are organized as a collection of Trees. The relationships among the data are represented by Links, which can be viewed as pointers. The tree structure requires that each record has only one parent record. Thus, it permits modeling of only one-to-many relationships (not many-to-many relationships) amongst the Records.

The following diagram, showing an Academic Database in the Hierarchical Model, represents Records of three types, “Course”, “Teacher” and “Student”, with links indicating the relationship “Offered By” from “Course” to “Teacher” (the faculty offering a course) and the relationship “Attended By” from “Course” to “Student” (the students attending a course).

[Figure: HIERARCHICAL MODEL. Record type Course is the parent of record types Teacher (link “Offered By”) and Student (link “Attended By”).]

The diagram does not capture the relationships “What are the courses being offered by a faculty”, “What are the courses being attended by a student”, “Who are the students being taught by a faculty” and “Who are the faculty teaching a student”. This is due to the limitation of the tree structure that a node can have only one parent node; thus we can represent only one-to-many relationships, but not many-to-many relationships.
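The single-parent restriction described above can be sketched with a small tree of records. This is an illustration in plain Python, not how IMS stores data; the class `Record`, the helper `children_of_type` and the sample names are hypothetical.

```python
class Record:
    """A record in a tree: it may have many children but only one parent."""

    def __init__(self, rec_type, name):
        self.rec_type = rec_type
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        if child.parent is not None:
            # The tree structure forbids a second parent.
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child


def children_of_type(record, rec_type):
    """Follow the links from a parent to its children of one record type."""
    return [c.name for c in record.children if c.rec_type == rec_type]


dbms = Record("Course", "DBMS")
os_course = Record("Course", "OS")
rao = dbms.add_child(Record("Teacher", "Dr. Rao"))    # link "Offered By"
asha = dbms.add_child(Record("Student", "Asha"))      # link "Attended By"
ravi = dbms.add_child(Record("Student", "Ravi"))      # link "Attended By"
```

Attempting `os_course.add_child(asha)` raises an error, so a student attending two courses would have to be stored as two duplicate records, one under each course.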


(ii) Network Model. Like the Hierarchical Model, this Model also models a database as a collection of Records, but the Records are organized as a collection of arbitrary graphs (or Networks). A Record can therefore have any number of parent records, which supports many-to-many relationships amongst records. The relationships among the records are represented as links (pointers). Since this Model supports many-to-many relationships amongst the records, it is considered more versatile than the Hierarchical Model.

The above database can be better modeled in the Network Model, as indicated below. The diagram contains additional information, i.e. the relationship “Offers” from “Teacher” to “Course” and the relationship “Attends” from “Student” to “Course”. Since the Hierarchical Model can strictly model only Tree Structures, it was not possible to depict “Offers” and “Attends” in the Hierarchical Model. The diagram also depicts the relationships “Teaches” and “Taught By” between “Teacher” and “Student”.

[Figure: NETWORK MODEL. Record types Course, Teacher and Student, with links “Offered By” (Course to Teacher), “Offers” (Teacher to Course), “Attended By” (Course to Student), “Attends” (Student to Course), “Teaches” (Teacher to Student) and “Taught By” (Student to Teacher).]
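The same academic database can be sketched as a graph in which every link is stored together with its inverse, so a record may be reached from any number of "parents". The class `NetworkDB` and the sample names are hypothetical; a real network DBMS would use record pointers rather than Python dictionaries.

```python
class NetworkDB:
    """Records connected by named links, each stored with its inverse."""

    def __init__(self):
        self._links = {}  # (link name, source record) -> set of target records

    def link(self, name, inverse, source, target):
        self._links.setdefault((name, source), set()).add(target)
        self._links.setdefault((inverse, target), set()).add(source)

    def follow(self, name, source):
        """Return the records reachable from `source` over link `name`."""
        return sorted(self._links.get((name, source), set()))


def build_academic_network():
    db = NetworkDB()
    db.link("Offered By", "Offers", "DBMS", "Dr. Rao")
    db.link("Offered By", "Offers", "OS", "Dr. Rao")
    db.link("Attended By", "Attends", "DBMS", "Asha")
    db.link("Attended By", "Attends", "DBMS", "Ravi")
    db.link("Attended By", "Attends", "OS", "Asha")
    db.link("Teaches", "Taught By", "Dr. Rao", "Asha")
    db.link("Teaches", "Taught By", "Dr. Rao", "Ravi")
    return db
```

Here the student "Asha" is linked from two courses at once, which the tree structure of the Hierarchical Model could not express without duplication.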

(iii) Relational Model. This is the most recent and most commonly used of the Record Based Models, and it has been widely accepted. The Relational Model models a database as a collection of Tables, which represent both the data and the relationships amongst the data. Each Table is called a Relation and is assigned a unique name. Each Relation has a number of Columns, representing the Fields (or Attributes) of the Relation. Each Field is also uniquely named within its Relation. A Relation (or Table) can have an unlimited number of Rows, and each Row represents an Instance of the Relation. A Row is also termed a Tuple. Each Tuple is unique within a Relation, so a Relation can be viewed as a set of Tuples of the same type. The relationships amongst the tables are modeled as Foreign Key to Primary Key relationships.

The “Course-Student-Teacher” Database Schema in the Relational Model will be represented by six Tables: three tables to represent entities, i.e. STUDENT (giving details of all students), TEACHER (giving details of all teachers) and COURSE (giving details of all courses); and three tables to represent relationships, i.e. COURSE-TEACHER (indicating the relationships OFFERED BY and OFFERS), COURSE-STUDENT (indicating the relationships ATTENDS and ATTENDED BY) and TEACHER-STUDENT (indicating the relationships TAUGHT BY and TEACHES).

STUDENT
Roll_No | S_Name | Branch | Semester | Section | S_Address

COURSE
Sub_Code | Sub_Title | Semester | Branch | Contact_Hrs

TEACHER
Fac_Code | Fac_Name | Desig | Dept | Fac_Address

COURSE-TEACHER
Sub_Code | Fac_Code

COURSE-STUDENT
Sub_Code | Roll_No

TEACHER-STUDENT
Fac_Code | Roll_No

The Relational Model has become extremely popular because:-

(a) It is extremely simple and easy to implement.
(b) It has a strong mathematical foundation.
(c) It has been highly standardized.
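A cut-down version of the six-table schema above can be sketched with Python's built-in sqlite3 module. Only a few columns are kept, the table names use underscores because hyphens are not valid in unquoted SQL identifiers, and the sample rows are invented for illustration. Foreign key checking in SQLite must be switched on explicitly, which the sketch does.

```python
import sqlite3

SCHEMA = """
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, s_name TEXT);
CREATE TABLE course  (sub_code TEXT PRIMARY KEY, sub_title TEXT);
CREATE TABLE teacher (fac_code TEXT PRIMARY KEY, fac_name TEXT);
CREATE TABLE course_teacher (
    sub_code TEXT REFERENCES course(sub_code),
    fac_code TEXT REFERENCES teacher(fac_code),
    PRIMARY KEY (sub_code, fac_code));
CREATE TABLE course_student (
    sub_code TEXT REFERENCES course(sub_code),
    roll_no  INTEGER REFERENCES student(roll_no),
    PRIMARY KEY (sub_code, roll_no));
CREATE TABLE teacher_student (
    fac_code TEXT REFERENCES teacher(fac_code),
    roll_no  INTEGER REFERENCES student(roll_no),
    PRIMARY KEY (fac_code, roll_no));
"""


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi')")
    conn.execute("INSERT INTO course VALUES ('CS301', 'DBMS'), ('CS302', 'OS')")
    conn.execute("INSERT INTO teacher VALUES ('F01', 'Dr. Rao')")
    conn.execute("INSERT INTO course_teacher VALUES ('CS301', 'F01')")
    conn.execute(
        "INSERT INTO course_student VALUES ('CS301', 1), ('CS302', 1), ('CS301', 2)"
    )
    conn.execute("INSERT INTO teacher_student VALUES ('F01', 1), ('F01', 2)")
    conn.commit()
    return conn


def courses_attended_by(conn, roll_no):
    """Follow the foreign keys from COURSE-STUDENT to COURSE with a join."""
    rows = conn.execute(
        "SELECT c.sub_title FROM course_student cs"
        " JOIN course c ON c.sub_code = cs.sub_code"
        " WHERE cs.roll_no = ? ORDER BY c.sub_title",
        (roll_no,),
    ).fetchall()
    return [title for (title,) in rows]
```

The relationship tables hold nothing but key values; the join recovers the ATTENDS relationship, and an attempt to insert a row that refers to a non-existent student is rejected by the foreign key constraint.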

SCHEMAS AND INSTANCES


Schema. A Database Schema refers to the overall structure of a database. Once defined, schemas are rarely changed. A Database System will have several Schemas, partitioned according to its levels of abstraction.

Instance. It refers to the actual collection of data (a Snapshot of the data) existing in the database at a particular moment in time. Since a database continuously experiences insertion of new data, deletion of defunct data and update of changed data, the Instance is under continuous change.

DATA ABSTRACTION & VARIOUS SCHEMAS OF A DATABASE

There are three levels of data abstraction in a database, and each level is described by a schema, as explained below:-

(a) Physical Level. This is the lowest level of abstraction. At this level, a Physical Schema describes “how the data is physically stored”. The Physical Schema may describe complex structures used to store the data, with the sole aim of achieving efficient access to the data.

(b) Logical Level. This is the intermediate level of abstraction. At this level, a Logical Schema (or Conceptual Schema) describes “what data is stored in the database” and “what are the relationships amongst the data”. This Schema is used by Database Administrators, who decide what information is to be kept in the Database. It describes the logical structure of the database, the data types and the integrity constraints. Compared to the Physical Level, the Database at the Logical Level is described by a relatively small number of simpler structures, but the implementation of these simple structures may be quite complex at the Physical Level. A user operating at the Logical Level need not be aware of the complexities at the Physical Level.

(c) View Level. This is the highest level of abstraction. At this level, there will be many Views, defined for different categories of users and tailored to their specific needs. A View for a certain group of users describes only the subset of the underlying database that this group needs to access. At the view level, the main goal is to provide efficient and user-friendly human interaction with the system, so the interface at this level is made as simple as possible. A user does not have to be aware of the complexities at the conceptual level and the physical level.
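A view as "a subset of the database made visible to one group of users" can be sketched with SQL's CREATE VIEW, here run through Python's built-in sqlite3 module. The `student` table, the view name `cse_roll_list` and the sample rows are hypothetical.

```python
import sqlite3


def build_college_db():
    conn = sqlite3.connect(":memory:")
    # Logical level: the full table, including the address column.
    conn.execute(
        "CREATE TABLE student ("
        " roll_no INTEGER PRIMARY KEY, s_name TEXT, branch TEXT, s_address TEXT)"
    )
    conn.execute(
        "INSERT INTO student VALUES"
        " (1, 'Asha', 'CSE', 'Delhi'), (2, 'Ravi', 'ECE', 'Pune')"
    )
    # View level: CSE staff see only roll numbers and names of CSE students.
    conn.execute(
        "CREATE VIEW cse_roll_list AS"
        " SELECT roll_no, s_name FROM student WHERE branch = 'CSE'"
    )
    conn.commit()
    return conn


def roll_list(conn):
    """Read the view; its contents are derived from `student` at query time."""
    return conn.execute("SELECT * FROM cse_roll_list ORDER BY roll_no").fetchall()
```

Only the definition of the view is stored. If a new CSE student is later inserted into `student`, the next call to `roll_list` reflects it without the view being redefined.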

DATA INDEPENDENCE

The ability of a DBMS to modify its Schema definition at one level, without affecting the Schema definition at the next higher level, is called Data Independence. There are two levels of Data Independence:-

(a) Physical Data Independence. It is the ability of the DBMS to modify the Physical Schema without causing any changes in the schemas at the logical level and the view level. Modifications at the Physical Level are driven by advancements in hardware technology and by the requirement to upgrade hardware for improving system performance.

(b) Logical Data Independence. This refers to the ability of the DBMS to modify the Logical Schema without causing any changes in the application programs at the view level. Modifications at the Logical Level are necessitated by the need to alter the Logical Structure of the database. Logical Data Independence is much more difficult to achieve than Physical Data Independence, since the application programs are heavily dependent on the logical structure of the database.
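Physical data independence can be illustrated with a small sketch: adding an index changes how the data is stored and accessed, but the query text and its result stay the same. The example assumes Python's built-in sqlite3 module; the table, the index name and the rows are hypothetical.

```python
import sqlite3

# The application's query: written once, against the logical schema only.
QUERY = "SELECT s_name FROM student WHERE branch = ? ORDER BY s_name"


def setup():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, s_name TEXT, branch TEXT)"
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, "Asha", "CSE"), (2, "Ravi", "ECE"), (3, "Meena", "CSE")],
    )
    conn.commit()
    return conn


def names_in_branch(conn, branch):
    return [name for (name,) in conn.execute(QUERY, (branch,))]


def add_index(conn):
    """A physical-level change: a new access structure on the branch column."""
    conn.execute("CREATE INDEX idx_student_branch ON student (branch)")
```

Running `names_in_branch` before and after `add_index` gives the same answer with the same query; only the access path chosen by the DBMS may differ.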

DATABASE LANGUAGES

A DBMS will support two kinds of languages: a Data Definition Language (DDL) to specify the Database Schema, and a Data Manipulation Language (DML) to enable accessing and manipulation of the data stored in the database.

(a) DDL. A database schema is specified by a set of definitions expressed in DDL. In a Relational Database, the result of interpreting DDL statements is a set of Tables that are stored in a special file called the Data Dictionary (or Data Directory or DBMS Catalog). The data stored in the Data Dictionary is called Metadata, i.e. data about data. Whenever the database is to be accessed, the DBMS first refers to the Data Dictionary to determine the structure of the data to be accessed; only then does it access the actual data in the database. Thus the data dictionary is consulted during the processing of each query.
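The way DDL statements end up as metadata in a catalog can be observed in SQLite, used here through Python's sqlite3 module as one concrete example. SQLite exposes its catalog as the table `sqlite_master` and describes columns through `PRAGMA table_info`; other DBMSs expose their data dictionaries differently. The `course` table and the helper names are hypothetical.

```python
import sqlite3


def make_catalog_demo():
    conn = sqlite3.connect(":memory:")
    # A DDL statement: its effect is recorded in the catalog.
    conn.execute(
        "CREATE TABLE course ("
        " sub_code TEXT PRIMARY KEY, sub_title TEXT, contact_hrs INTEGER)"
    )
    return conn


def list_tables(conn):
    """Read table names from SQLite's catalog table."""
    return [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]


def describe(conn, table):
    """Return (column name, declared type) pairs for a trusted table name."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]
```

After a further DDL statement such as `ALTER TABLE course ADD COLUMN semester INTEGER`, `describe` reports the new column: the metadata changed while the DBMS software itself did not.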

The storage structure and access methods used by the database system are specified by a set of definitions in a special type of DDL called the Data Storage and Definition Language. The result of interpreting these definitions is a set of physical schema structures and a set of access methods supported by the system. These details are usually hidden from the database users.

(b) DML. A DML is a language that enables users to access and manipulate the data stored in the database. A DML statement specifies the information to be retrieved, inserted, updated or deleted. The portion of a DML that involves information retrieval is called a query language. The goal of a DML is to provide an efficient and friendly human interface for the following operations on a database:-

(i) Retrieval of information stored in the database.
(ii) Insertion of new information into the database.
(iii) Deletion of information from the database.
(iv) Update of information stored in the database.

There are two types of DMLs:-

(i) Procedural DMLs. A query in a procedural DML requires the user to specify not only “what data is required to be extracted from the database” but also “how to extract those data”.

(ii) Non-Procedural DMLs. A query in a Non-Procedural DML requires the user to specify only “what data is needed”, without specifying how to get those data.

Non-Procedural DMLs are easier to learn and to use than Procedural DMLs. However, since queries in Non-Procedural DMLs do not specify “how to get the data”, they may not generate code as efficient as the equivalent queries in Procedural DMLs. This limitation of Non-Procedural DMLs is overcome by performing query optimization at the System Level.
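The contrast between the two styles can be sketched by answering the same question twice: once with an explicit loop that spells out how to find the rows, and once with an SQL query that states only what is wanted. The Python loop is merely a stand-in for a procedural DML, and the data and function names are hypothetical.

```python
import sqlite3

ROWS = [(1, "Asha", 82), (2, "Ravi", 65), (3, "Meena", 91)]


def toppers_procedural(rows, cutoff):
    """Procedural style: the user states HOW (scan, test, collect, sort)."""
    selected = []
    for _roll_no, name, marks in rows:
        if marks >= cutoff:
            selected.append(name)
    selected.sort()
    return selected


def toppers_declarative(rows, cutoff):
    """Non-procedural style: the user states only WHAT is wanted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE result (roll_no INTEGER, s_name TEXT, marks INTEGER)")
    conn.executemany("INSERT INTO result VALUES (?, ?, ?)", rows)
    found = [
        name
        for (name,) in conn.execute(
            "SELECT s_name FROM result WHERE marks >= ? ORDER BY s_name", (cutoff,)
        )
    ]
    conn.close()
    return found
```

In the second function the choice of scan order, index use and sort method is left to the DBMS, which is where system-level query optimization comes in.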

OVERALL STRUCTURE OF DBMS

[Figure: Overall structure of a DBMS, from top to bottom:
- Users: Unskilled Users, Application Programmers, DML Users and the DBA.
- Interfaces and tools: Application Interfaces, Program Development Tools, DML Tools and DBA Tools, through which 4GL Programs, DML Queries and DDL Statements are submitted.
- Query Processor: Pre-Compiler for 4GL Programs, DML Compiler, DDL Interpreter, Application Programs Object Code and the Query Evaluation Engine.
- Storage Manager: Authorization & Integrity Manager, Transaction Manager, File Manager and Buffer Manager.
- Disk Storage (Database): Application Data Files, Data Dictionary (Schema + Access Rights), Index information and Query Evaluation Statistics.]

Functions of a Database Administrator (DBA)

The DBA is the custodian of the Database System placed under his or her control and is responsible for the following functions:-

1. Create the Conceptual Schema and update it periodically to adapt to changed requirements.

2. Implement efficient Storage Structures and Access Methods.

3. Liaise with the Users to ensure that the information they require is made available.

4. Ensure system security through Grant and Revoke of Access Rights to the Users. A user must have exactly the rights required by his or her role in the organization: nothing more, nothing less.

5. Ensure the Physical Security of the Database against malicious access and accidents like fire.

6. Take periodic backups and keep the archived data safe.

7. Execute immediate recovery procedures in case of failures.

8. Monitor the system performance. In case of degradation in system performance, perform tuning procedures. If necessary, upgrade the system (hardware / software) to meet the changed requirements of the organization.

9. Ensure that sufficient Disk Space is always available. If needed, upgrade the Disk Drives to meet the increased requirements.

10. Liaise with the DBMS vendor to obtain the necessary technical support, and to obtain tools and software upgrades whenever the vendor makes them available.

Characteristics of a Database System, which distinguish it from a conventional File-Processing System

In a traditional file system, each user defines and implements the files needed for a specific application, as a part of programming the application itself. Multiple users of the same set of data will create replicated sets of files, specific to their respective applications. This redundancy in defining and storing data results in higher storage costs and in inconsistencies during updates. In the database approach, on the other hand, a single repository of data is maintained, which is defined once and then accessed by the various users of the data.

The main characteristics of the database approach, which distinguish it from the file-processing approach, are:-

(i) Self-Describing Nature of a Database System:- A database contains not only the data, but also a complete definition of the data structure, data types and data constraints. This additional information is called meta-data, and it is stored in a file called the Data Dictionary (also called the DBMS Catalog). The information stored in the Data Dictionary is accessible to the DBMS software and makes the DBMS software independent of its applications. When a need arises to change the structure of the data, no changes need to be made to the DBMS software; only the meta-data in the Data Dictionary needs to be changed. This feature enables the DBMS software to be adapted for any application. The same DBMS will work for a college, a bank or a factory. In a traditional file processing system, by contrast, the application programs would need major changes when shifting from one application to another.

(ii) Data Abstraction:- In a traditional file processing system, the structure of the data files is hard-coded in the application programs; thus any change in structure requires the related application programs to be modified accordingly. In a Database System, the application programs are insulated from the way the data is stored in the database. The application programs are only concerned with ‘what data’ is stored in the database and not with ‘how the data is stored’. As long as the contents of the data remain unchanged, the storage structure can be changed without affecting the existing application programs. This feature is called Data Abstraction.

(iii) Support for Multiple Views of the Data:- Depending on their different needs and different levels of authorization, different users are provided with different perspectives of the same data, called Views. A View refers to a subset of the stored data or a set of Virtual Data, i.e. data derived from the stored data. A View is not explicitly stored in the Database; only its Definition is stored in the DBMS Catalog. Whenever a user or a program submits a query to access a View, the View is computed at that moment and presented to the user or the program. The next time the same View is accessed, it is computed afresh.

(iv) Multi-User Access & Concurrency Control:- A Multi-User DBMS allows multiple users to access the same database concurrently. This is achieved by including Concurrency Control Software in the DBMS, which ensures that the database remains consistent despite concurrent access by multiple users.

(v) Effective System Protection through Grant of Access Rights:- Access Rights are granted to the users, to the extent required for their roles in the organization. These rights are stored in the data dictionary itself. When a query is to be processed, the DBMS first ensures that the user submitting the query has sufficient rights for the processing of that query; only then is the query processed.

(vi) Support for Efficient Recovery:- When a system is restarted after a failure, log-based recovery restores the database to a consistent state efficiently.

Advantages of using a DBMS vis-à-vis File Processing System

(a) Controlling Redundancy:- While designing a database, the various Views of different users are integrated into a single database, thus controlling redundancy. This results in reduced effort and reduced storage space. It also helps keep the database consistent when updates are made.

(b) Restricting Unauthorized Access:- The user access rights are stored in the data dictionary. Whenever a query is received from a user, it is checked against that user's access rights. If the required access rights exist, the query is processed; otherwise it is rejected as an ‘Invalid Query’. This prevents unauthorized access to data.

(c) Providing Multiple User-Interfaces:- A DBMS provides various types of user interfaces for various categories of users:-

- Query Languages (like SQL) for skilled users

- Programming Languages (like PL/SQL) for application programmers

- Menus and Forms for Naive Users

- DDL for the Database Administrator

(d) Enforcing of Data Integrity Constraints:- The Data Integrity Constraints are stored in the data dictionary itself. Whenever some data is inserted, updated or deleted, the data constraints are automatically applied to the related data items and invalid operations are rejected.

(e) Supporting Concurrent Access:- A DBMS supports concurrent access by multiple users while maintaining database consistency.

(f) Providing Backup & Recovery:- A DBMS supports data backup and recovery in case of failures.

(g) Reduced Application Development Time:- The development time of a new application using a DBMS is estimated to be of the order of 15 to 25% of the time needed to develop an equivalent application in a traditional file processing system.

(h) Easy Adaptability:- A database system can be easily adapted to changed requirements, with minimal time and cost implications.

(i) Potential for Enforcing Standards:- It permits the Database Administrator (DBA) to define and enforce standards among the database users. The standards can be defined for naming conventions, formats of data items, display formats, report structures etc.

Exercises

Ex.1.1 Explain the three levels of data abstraction. Distinguish between Physical Data Independence and Logical Data Independence. Which is more difficult to achieve and why?

Ex.1.2 Explain the characteristics of DBMS that distinguish it from a File Processing System. Explain how the application development is much shorter in a DBMS environment than in a File Processing Environment.

Ex.1.3 Compare the three data models: Hierarchical, Network and Relational. What are the distinguishing features of Relational Model that make it so popular?

Ex.1.4 Distinguish between:-

(a) DDL & DML
(b) Schema & Instance
(c) Procedural DML & Non-Procedural DML

Ex.1.5 Explain the role of the following components of DBMS:-

(a) DML Compiler
(b) Query Processing Engine
(c) Buffer Manager
(d) Transaction Manager

Ex.1.6 What is the role of a Data Dictionary in DBMS? How does this feature make the DBMS independent of the underlying database?

Ex.1.7 Explain the major functions of a Database Administrator (DBA).

Ex.1.8 Explain what is implied by the statement “In DBMS, views of different users can be integrated into a single database”.

Ex.1.9 What is meant by “Self-describing nature of a database”?

Ex.1.10 Compare Procedural DMLs and Non-Procedural DMLs from the viewpoints of (i) User Friendliness (ii) Query Optimization.

CHAPTER 2

ENTITY-RELATIONSHIP MODELING

The Entity Relationship Model (ER Model) models real world situations as a collection of entities and relationships amongst the entities.

Entity. An Entity is an object (like a “CAR”) or a concept (like an “ACCOUNT”) from the real world, which is distinguishable from other objects and other concepts. Each Entity will be defined by a set of properties (called Attributes). For example, entity “ACCOUNT” may be defined by Attributes like “ACCOUNT-NUMBER”, “BRANCH-NAME” and “BALANCE” etc.

Entity Set. An Entity-Set refers to a collection of entities of the same kind. Each entity in an Entity-Set will have the same set of attributes and the set of attributes will distinguish it from other Entity Sets. No other entity set will have exactly the same set of attributes. Some of the attributes of an entity set may overlap with other entity sets.

Relationship. A Relationship refers to an association amongst Entity Sets. For example, there may be a relationship “DEPOSITOR” between Entity Set “CUSTOMER” and Entity Set “ACCOUNT”.

Relationship Set. A Relationship Set refers to the collection of Relationships of the same kind (i.e. having exactly the same set of Attributes). A Relationship Set will inherit some of the Attributes (properties) of the associating Entity Sets. For example, the Relationship Set “DEPOSITOR” between Entity Sets “CUSTOMER” and “ACCOUNT” will inherit Attribute “CUSTOMER-ID” from “CUSTOMER” and Attribute “ACCOUNT-NUMBER” from “ACCOUNT”. In addition, a Relationship Set may have some attributes of its own, called “Descriptive Attributes”; for example, the relationship set “DEPOSITOR” may have a descriptive attribute “DATE-OF-OPERATION”, indicating the date on which a customer last operated an account.

Domain of an Attribute

Each attribute has a set of permitted values called its domain or value set; for example, the attribute ‘NAME’ may have a domain that is the set of strings of characters of a specified maximum length.

A database will consist of a set of Entity-Sets and Relationship-Sets, each of which will contain a number of entities of the same type or Relationships of the same type. An entity in a database may be described by a set of (attribute, data value) pairs; for example, a student in Entity-Set “STUDENT” may be described by {(ROLL-NUMBER, 0990013010), (NAME, ‘Karan Singh’), (DATE-OF-BIRTH, ‘10-DEC-1985’)}.

Attribute Types:-

(i) Simple Vs Composite Attributes. A Simple attribute is one which is not divisible into sub-parts, like ‘BRANCH’. On the other hand, a Composite attribute is one which can be divided into sub-parts, like ‘DATE-OF-BIRTH’, which may be divided into ‘birth-date’, ‘birth-month’ & ‘birth-year’.

(ii) Single-Valued Vs Multi-Valued Attributes. An attribute which can assume only one value at a time is called a Single-Valued attribute; like ‘name’ of an EMPLOYEE entity. On the other hand, an attribute which may assume a set of values at a time is called a Multi-Valued attribute; like attribute ‘dependent’ of an Entity Set “EMPLOYEE”, which may have none or one or multiple values, depending upon the number of dependents of an employee.

(iii) Null Attribute. A null value is assigned to an attribute under any of the following three conditions:-

(a) If the attribute value is not applicable to an entity; like SPOUSE-NAME will not be applicable if an employee is unmarried.

(b) If the value is applicable, but not specified; like TEL#, where an employee may not own a telephone.

(c) If the value is applicable and specified but not known to the agency entering the information; like an employee may own a telephone but the number may not be known to the organization.

A null value can only be assigned to an Attribute if assigning a value to that attribute is optional (not mandatory). Mandatory attributes cannot be assigned a “Null” value.

(iv) Derived Attribute Vs Stored Attribute. A derived attribute is one whose value is not stored in the database, but is derived from the values of other stored attributes; like the value of attribute ‘age’ can be derived from attribute ‘date-of-birth’ and the current date obtained from the system.
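The derivation of ‘age’ from ‘date-of-birth’ can be sketched in a few lines of Python. This is only an illustration: the function name derive_age and the reference date are assumptions, and the date of birth is the one used for the STUDENT entity above.

```python
from datetime import date


def derive_age(date_of_birth: date, today: date) -> int:
    """Derive 'age' in completed years from the stored 'date-of-birth'."""
    years = today.year - date_of_birth.year
    # One year less if this year's birthday has not yet occurred.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Only DATE-OF-BIRTH is stored; AGE is computed whenever it is needed.
student = {
    "ROLL-NUMBER": "0990013010",
    "NAME": "Karan Singh",
    "DATE-OF-BIRTH": date(1985, 12, 10),
}
age = derive_age(student["DATE-OF-BIRTH"], today=date(2007, 1, 10))
```

In a real system the second argument would normally be the current system date (date.today()); a fixed date is used here so that the result is reproducible.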

Degree of Relationship Sets. Degree of a Relationship Set refers to the number of Entity Sets participating in the Relationship. Most relationships are binary.

E-R Diagram Notations

Rectangle represents an entity set.

Ellipse represents an attribute.

Diamond represents a relationship set.

Line links an attribute to an entity set or an entity set to a relationship set.

Double Line indicates total participation of an entity set in a relationship set.

Dashed Ellipse indicates derived attribute.

Double Ellipse indicates multi-valued attribute.

Double Rectangle indicates weak entity set.

Double Diamond indicates a relationship set with participation of some weak entity sets.

RELATIONSHIP constraints

- Mapping Cardinalities

- Participation Constraint

Mapping Cardinalities. For a binary relationship set R between entity sets A and B, the mapping cardinalities can be one of the following:-

(a) One-to-one. An entity in A is associated with at most one entity in B and an entity in B is associated with at most one entity in A. It is represented in E-R Model as follows:-

[Figure: entity sets A and B connected through relationship set R]

One-to-one cardinality is represented by directed lines drawn from R to both A and B.

(b) One-to-many. One-to-many cardinality from A to B implies that an entity in A is associated with any number (Nil/ one/ many) of entities in B; however, an entity in B is associated with at most one entity in A. It is represented in E-R Model as follows:-

[Figure: entity sets A and B connected through relationship set R]

(c) Many-to-one. Many-to-one cardinality from A to B implies that an entity in A is associated with at most one entity in B; however, one entity in B can be associated with any number of entities in A. It is represented in E-R Model as follows:-

[Figure: entity sets A and B connected through relationship set R]

(d) Many-to-many. Many-to-many cardinality from A to B implies that an entity in A can be associated with any number of entities in B and one entity in B can be associated with any number of entities in A. It is represented in E-R Model as follows:-

[Figure: entity sets A and B connected through relationship set R]

Example:-

One-to-One relationship from CUSTOMER to ACCOUNT implies that each customer can have only one account and each account has to be Single.

[Figure: CUSTOMER and ACCOUNT linked by DEPOSITOR (One-to-One Relationship)]

One-to-Many relationship from CUSTOMER to ACCOUNT implies that each customer can have any number (Nil or One or More than One) of accounts, but each account has to be Single.

[Figure: CUSTOMER and ACCOUNT linked by DEPOSITOR (One-to-Many Relationship)]

Many-to-One relationship from CUSTOMER to ACCOUNT implies that each customer can have only one account, but each account can be Joint (held by one or more customers).

[Figure: CUSTOMER and ACCOUNT linked by DEPOSITOR (Many-to-One Relationship)]

Many-to-Many relationship from CUSTOMER to ACCOUNT implies that each customer can have any number (Nil or One or More than One) of accounts and each account can be Joint (held by one or more customers).

[Figure: CUSTOMER and ACCOUNT linked by DEPOSITOR (Many-to-Many Relationship)]
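The four cases can be illustrated with a short Python sketch that inspects a set of (customer, account) pairs. The function name mapping_cardinality and the sample pairs are assumptions made for this illustration. Note that a mapping cardinality is a constraint declared on the schema; inspecting data only tells us the tightest cardinality that this particular set of pairs happens to satisfy.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify (a, b) pairs of a binary relationship set, read from A to B."""
    b_per_a = defaultdict(set)  # entities of B associated with each entity of A
    a_per_b = defaultdict(set)  # entities of A associated with each entity of B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"   # some customer holds several accounts
    if b_has_many:
        return "many-to-one"   # some account is held jointly
    return "one-to-one"


# Hypothetical DEPOSITOR pairs: (customer, account)
single_accounts = [("C-001", "A-101"), ("C-220", "A-203")]
several_accounts = [("C-310", "A-203"), ("C-310", "A-670")]
joint_account = [("C-001", "A-101"), ("C-310", "A-101")]
joint_and_several = several_accounts + [("C-505", "A-203")]
```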

Participation Constraints in Relationship Sets

- Total Participation

- Partial Participation

Total Participation

An Entity Set E is said to have total participation in relationship set R if each entity in E participates in at least one relationship through R. In the E-R Diagram, Total Participation is represented by a “Double Line” drawn between the Entity Set symbol and the Relationship Set symbol.

Partial Participation

An Entity Set E is said to have partial participation in relationship set R if some of the entities in E do not participate in any relationship through R. In the E-R Diagram, Partial Participation is represented by a “Single Line” drawn between the Entity Set symbol and the Relationship Set symbol.

Example:- Suppose Entity Sets “CUSTOMER” and “ACCOUNT” are related by Relationship Set “DEPOSITOR” and Entity Sets “CUSTOMER” and “LOAN” are related by Relationship Set “BORROWER”. Suppose it is possible that a customer may have only an account or only a loan or both; then the situation can be modeled as follows:-

[Figure: CUSTOMER has Partial Participation in both DEPOSITOR and BORROWER; ACCOUNT has Total Participation in DEPOSITOR and LOAN has Total Participation in BORROWER]
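As a minimal sketch, the participation of an entity set in a given collection of relationships can be checked as follows. The function name participation and the sample data are assumptions for illustration; as with cardinalities, the constraint itself belongs to the schema and the check only examines one database state.

```python
def participation(entity_set, relationship_pairs, position):
    """Return 'total' if every entity occurs in at least one pair, else 'partial'.

    position is 0 if the entity set supplies the first component of each
    pair and 1 if it supplies the second component.
    """
    participating = {pair[position] for pair in relationship_pairs}
    return "total" if set(entity_set) <= participating else "partial"


# Hypothetical data: C-310 has only a loan, the other customers only an account.
customers = {"C-001", "C-220", "C-310"}
accounts = {"A-101", "A-203"}
loans = {"L-11"}
depositor = [("C-001", "A-101"), ("C-220", "A-203")]
borrower = [("C-310", "L-11")]
```

With this data, CUSTOMER participates partially in both relationship sets, while ACCOUNT and LOAN participate totally in DEPOSITOR and BORROWER respectively.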

Concept of Key

Super Key. A Super Key of an Entity Set or Relationship Set refers to a set of attributes which, when taken collectively, uniquely determine an entity within the Entity Set or a Relationship within the Relationship Set. If K forms a Super Key (SK) of an Entity Set E, then any super set of K will also be a Super Key of E. So, a Super Key may have some extraneous (unnecessary) attributes; if these are removed, the remaining set may still form a Super Key of E.

Example:- Suppose each student in the Entity Set STUDENT (ROLL-NO, NAME, BRANCH, FATHERS-NAME, ADDRESS, DOB, TEL-NO) has a unique value of ROLL-NO. This implies that no two students can have the same ROLL-NO. Then {ROLL-NO, NAME} forms a Super Key of Entity-Set STUDENT. In this, the attribute NAME is extraneous; if it is removed, the remaining set i.e. {ROLL-NO} still forms a Super Key of STUDENT.

Candidate Key. A Super Key, none of whose proper subsets forms a Super Key, is called a Candidate Key. Thus, a Candidate Key is a minimal Super Key (i.e. a Super Key having no extraneous attributes). An Entity Set may have more than one Candidate Key.

Example:- The Entity Set STUDENT will have at least two Candidate Keys i.e. {ROLL-NO} and {NAME, FATHERS-NAME, DOB, ADDRESS}.
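The definitions of Super Key and Candidate Key can be tried out on a small table with the following Python sketch. The function names and the four sample rows are assumptions for illustration. A key is a constraint that must hold for every possible state of the database, so a check on sample rows can only show that a set of attributes is consistent with being a key, not prove it.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on all the attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal super keys of the sample rows, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing an already found key has extraneous attributes.
            if any(set(key) <= set(attrs) for key in keys):
                continue
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical STUDENT rows restricted to three attributes
students = [
    {"ROLL-NO": 1, "NAME": "Karan", "BRANCH": "CSE"},
    {"ROLL-NO": 2, "NAME": "Karan", "BRANCH": "ECE"},
    {"ROLL-NO": 3, "NAME": "Asha", "BRANCH": "CSE"},
    {"ROLL-NO": 4, "NAME": "Karan", "BRANCH": "CSE"},
]
keys = candidate_keys(students, ["ROLL-NO", "NAME", "BRANCH"])
```

Here {ROLL-NO, NAME} passes is_super_key, but it is not reported as a Candidate Key because its proper subset {ROLL-NO} is already a key.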

Primary Key. Primary Key is one of the Candidate Keys that is designated by the database designers as the primary means of identifying entities within an entity set. In the E-R Diagram, the Primary Key Attributes are underlined with a firm line.

Primary Key of a Relationship Set

Let R be a binary relationship set between Entity Sets E1 and E2. Let K1 and K2 be the respective Primary Keys of E1 and E2. Then the Primary Key of Relationship Set R will depend upon the mapping cardinality of the relationship set, as explained below:-

(i) One to One Relationship
PK (R) = PK (E1) or PK (E2)

(ii) One to Many Relationship from E1 to E2
Here E2 is called “Many-Side” Entity Set and E1 is called “One-Side” Entity-Set.
PK (R) = PK (E2) i.e. Primary Key of “Many-Side” Entity-Set.

(iii) Many to One Relationship from E1 to E2
Here E1 is called “Many-Side” Entity Set and E2 is called “One-Side” Entity-Set.
PK (R) = PK (E1) i.e. Primary Key of “Many-Side” Entity-Set.

(iv) Many to Many Relationship from E1 to E2
PK (R) = PK (E1) ∪ PK (E2)

Example

(i) [Figure: CUSTOMER (key CN) and ACCOUNT (key AN) linked by DEPOSITOR (One-to-One Relationship)]

PK (DEPOSITOR) = CN or AN

(ii) [Figure: CUSTOMER (key CN) and ACCOUNT (key AN) linked by DEPOSITOR (One-to-Many Relationship)]

PK (DEPOSITOR) = AN i.e. PK of ACCOUNT

(iii) [Figure: CUSTOMER (key CN) and ACCOUNT (key AN) linked by DEPOSITOR (Many-to-One Relationship)]

PK (DEPOSITOR) = CN i.e. PK of CUSTOMER

(iv) [Figure: CUSTOMER (key CN) and ACCOUNT (key AN) linked by DEPOSITOR (Many-to-Many Relationship)]

PK (DEPOSITOR) = {CN, AN} i.e. PK (CUSTOMER) ∪ PK (ACCOUNT)
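The four rules can be summarized as a small lookup, sketched here in Python. The function name relationship_primary_key and the string labels for the cardinalities are assumptions for illustration; the function returns a list because in the one-to-one case either key may be chosen.

```python
def relationship_primary_key(cardinality, pk_e1, pk_e2):
    """Return the possible primary keys of a binary relationship set R.

    cardinality is read from E1 to E2; pk_e1 and pk_e2 are the primary
    keys (sets of attribute names) of E1 and E2.
    """
    pk_e1, pk_e2 = frozenset(pk_e1), frozenset(pk_e2)
    choices = {
        "one-to-one": [pk_e1, pk_e2],     # either key identifies a relationship
        "one-to-many": [pk_e2],           # E2 is the "Many-Side"
        "many-to-one": [pk_e1],           # E1 is the "Many-Side"
        "many-to-many": [pk_e1 | pk_e2],  # union of both keys
    }
    return choices[cardinality]


# DEPOSITOR between CUSTOMER (key CN) and ACCOUNT (key AN)
pk_many_to_many = relationship_primary_key("many-to-many", {"CN"}, {"AN"})
```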

Concept of Weak Entity Set

An Entity Set is said to be a Weak Entity Set if it does not have sufficient attributes to form its Primary Key. On the other hand, an entity set having a primary key of its own is called a Strong Entity Set. A Weak Entity Set (say E2) depends on a Strong Entity Set (say E1) to form its Candidate Key. The Entity Set E2 is then said to be “Existence-Dependent” on E1, and E1 is said to be the “Owner Entity Set” of E2. The relationship R between E2 and E1 is called the “Identifying Relationship”. The Weak Entity Set E2 will have a set of attributes called its “Discriminator”, which together with the Primary Key of E1 will form the Primary Key of E2.

[Figure: Owner Entity Set E1 linked through an Identifying Relationship to Weak Entity Set E2]

Example:- Suppose an Entity Set EMPLOYEE (EMP-ID, EMP-NAME, SALARY, DEPENDENT) has an attribute DEPENDENT which is multi-valued i.e. an employee may have none or one or more than one dependent. This situation can be best modeled as follows:-

[Figure: Owner Entity Set EMPLOYEE (EMP-ID, EMP-NAME, SALARY) linked through an Identifying Relationship to Weak Entity Set DEPENDENT (D-NAME, RELATION, DOB)]

- The Weak Entity Set DEPENDENT is Existence Dependent on the Strong Entity Set EMPLOYEE.

- The Weak Entity Set “DEPENDENT” has a Discriminator Attribute D-NAME, which along with the primary key EMP-ID of EMPLOYEE forms the Primary Key of the weak entity set DEPENDENT. In the E-R Diagram, the Discriminator (also called Partial Key) of a weak entity set is marked by underlining with a broken line.

Special Features of an Identifying Relationship

Normally, a situation modeled by a Weak Entity Set will have the following features:-

(i) The Identifying Relationship will be one-to-many from Owner Entity Set to Weak Entity Set.

(ii) The Participation of the Owner Entity Set in the Identifying Relationship will be Partial and the participation of the Weak Entity Set in the Identifying Relationship will be Total.

Modeling of a Multi-valued Attribute as Weak Entity Set

In the above example, the Weak Entity Set DEPENDENT can also be modeled as a multi-valued attribute of Entity Set EMPLOYEE. The multi-valued attribute can be used to indicate the names of the dependents of employees. But suppose we want to indicate other parameters of dependents, like the dependent’s relationship with the employee; then the multi-valued approach will not be suitable. In this case, the Weak Entity approach will be the ideal choice, since the weak entity set DEPENDENT can then have any number of attributes.

Extended E-R Features

Specialization. An entity set E may include some sub-groups of entities (say E1, E2, ….. En), such that each of these sub-groups may have some distinct attributes that differ from those of the other sub-groups. There will be some attributes that are common to all sub-groups. The process of designating these sub-groups within an entity set is called specialization.

[Figure: Higher Level Entity Set (Super Class) E with attributes A1 and A2, linked through ISA to Lower Level Entity Sets (Sub Classes) E1, E2, ….. En, each with its own attributes]

In the above example, an Entity Set E has been specialized into sub-groups designated as E1, E2 ….. En. E is called “Super Class” or “Higher Level Entity Set” and the entity sets E1, E2 ….. En are called “Sub Classes” or “Lower Level Entity Sets” of E. The common attributes of all sub entity sets are represented with the super entity set, and the distinct attributes of each sub entity set are represented with that sub entity set.

The relationship of a Higher Level Entity Set with its Lower Level Entity Sets is called ISA relationship. It is read as “is a”.

Inheritance of Attributes in Specialization

Each Sub Class will inherit the Attributes of its Super Class; in addition, it will have its own distinct Attributes. In the above case, each lower level entity set will inherit attributes A1 and A2 of the Super Class E.

Example:- Consider an entity set ACCOUNT with attributes Account-Number and Balance. The Entity Set ACCOUNT may be specialized into different types of accounts like SAVINGS-ACCOUNT, CURRENT-ACCOUNT, FIXED-DEPOSIT (FD) and RECURRING-DEPOSIT (RD). The SAVINGS-ACCOUNT may have an attribute Interest-Rate and CURRENT-ACCOUNT may have an attribute Over-Draft. Similarly, FD and RD have distinct attributes of their own.

[Figure: ACCOUNT (Account-Number, Balance) linked through ISA to SAVINGS-ACCOUNT (Interest-Rate), CURRENT-ACCOUNT (Over-Draft), FD (Int-Rate, Mat-Date) and RD (Int-Rate, Mat-Date, Installment)]

Specialization Constraints

Disjoint Vs Overlapping Specialization

Disjoint. It implies that an entity does not belong to more than one lower-level entity set i.e. an account is either a savings-account or a current-account but not both.

Overlapping. In an overlapping specialization, an entity may belong to more than one lower-level entity set within a single specialization.

Total Vs Partial Specialization

Total. Each higher-level entity must belong to a lower-level entity set.

Partial. Some higher-level entities may not belong to any lower-level entity set.
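Attribute inheritance in the ACCOUNT example resembles class inheritance in a programming language, which the following Python sketch uses as an analogy. The class names, field names and sample values are assumptions for illustration, and only two of the four sub classes are shown. Since each object here belongs to exactly one class, the sketch corresponds to a disjoint specialization.

```python
from dataclasses import dataclass, fields


@dataclass
class Account:
    """Higher Level Entity Set (Super Class) with the common attributes."""
    account_number: str
    balance: float


@dataclass
class SavingsAccount(Account):
    """Lower Level Entity Set: inherits account_number and balance."""
    interest_rate: float


@dataclass
class CurrentAccount(Account):
    """Lower Level Entity Set: inherits account_number and balance."""
    over_draft: float


savings = SavingsAccount(account_number="A-101", balance=10000, interest_rate=4.0)
savings_attributes = [f.name for f in fields(savings)]
```

The attribute list of savings contains the two inherited attributes followed by its own distinct attribute, and isinstance(savings, Account) holds, which mirrors the “is a” reading of the ISA relationship.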

Generalization. Specialization is a top-down approach, whereas Generalization is exactly the inverse of that. Generalization refers to the process of fusing several distinct entity sets into a single Higher Level Entity Set, on the basis of commonality of their attributes. The fused sets then form sub classes or lower level entity sets. The common attributes of the Lower Level Entity Sets will be assigned to the Higher Level Entity Set. Thus, generalization is a process which proceeds in a bottom-up manner, in which multiple entity sets are synthesized into a single higher-level entity set on the basis of their common features. The higher-level entity set is termed the super-class and each lower-level entity set is termed a sub-class. In the E-R Diagram, both Specialization and Generalization are represented in exactly the same manner.

Aggregation. One limitation of the E-R Model is that it fails to express relationships among relationship sets, or a relationship between a relationship set on one side and an entity set on the other side. Aggregation provides a solution in this case. Aggregation is an abstraction through which relationships are treated as higher-level entities, which can then participate in relationships with other Entity Sets or with other relationship sets. An example is the relationship between R1 and E3 indicated below.

[Figure: Entity Sets E1 (attributes A1, A2) and E2 (attributes B1, B2) linked by Relationship Set R1; R1 with E1 and E2 is enclosed as the Aggregated Higher Level Entity Set “R1”, which is linked through Relationship Set R2 to Entity Set E3 (attributes C1, C2, C3)]

Here, the Relationship Set R1 between Entity Set E1 and Entity Set E2 has been aggregated as Higher Level Entity Set “R1”. This Higher Level Entity Set participates in a Relationship R2 with Entity Set E3. Thus, through aggregation, we are able to represent a Relationship between Relationship Set R1 and Entity Set E3.

Example:- Suppose we have Entity Sets “EMPLOYEE”, “BRANCH” and “JOB”, which are related through a Relationship “EBJ” which indicates “which employee” is performing “what jobs” at “which branch”. There will be multiple jobs at each branch, and we assume that each employee may be performing multiple jobs at one of the branches. Suppose we want to relate another Entity Set “MANAGER” to indicate:-

(i) The set of Employees managed by a manager.

(ii) The set of jobs managed by a manager.

(iii) The Branches managed by a manager (assume a manager can manage only one branch).

If we represent this scenario without the use of aggregation, then the E-R Diagram will be as follows:-

[Figure: Relationship Set EBJ among EMPLOYEE, BRANCH and JOB, with MANAGER linked to EMPLOYEE, BRANCH and JOB through separate Relationship Sets EM, BM and JM]

The above scenario can be better modeled by aggregating the Relationship Set “EBJ” as a higher level Entity Set and then creating a relationship between this higher level entity set and the Entity Set “MANAGER”, as indicated below:-

[Figure: Relationship Set EBJ among EMPLOYEE, BRANCH and JOB enclosed as the Aggregated Higher-Level-Entity-Set “EBJ”, which is linked to MANAGER through Relationship Set EBJM]

This modeling represents the situation more realistically, wherein the Relationship Set “EBJM” indicates “which combinations of employee-branch-job” are being managed by each manager.

Reduction of E-R Schema to Tables

An E-R Diagram can be reduced to a set of Tables, as explained below:-

(a) Tabular representation of a Strong Entity Set. A Strong Entity Set E will be represented by a Table named “E”. The Table will have columns as follows:-

(i) Simple, Single-valued Attributes There will be a column for each simple, single-valued attribute of Entity Set E.

(ii) Composite Attributes There will be a column for each sub-part of a Composite Attribute; no column needs to be assigned for the composite attribute as such. For example, for NAME comprising First Name (FN), Middle Name (MN) and Last Name (LN), there will be three columns for FN, MN and LN. No column needs to be assigned for NAME. If NAME needs to be produced, it can be done by combining the sub-parts.

(iii) Derived Attributes No column needs to be assigned for the derived attributes, since the values of these attributes are not stored in the database.

(iv) Multi-Valued Attribute Each Multi-Valued Attribute (say M) will be represented by a separate Table (say named E-M), which will have a column for each of the primary key attributes of E and a column for Attribute M. Each value of the multi-valued attribute will be represented in a separate row in this table.

Let E be a Strong Entity Set with simple single-valued attributes a1, a2, ……, an. This Entity Set will be represented by a Table called E with n distinct columns, each of which will correspond to one of the attributes. Let D1, D2, … Dn be the domains of attributes a1, a2, …., an respectively. The Table E will comprise a set of rows, which will be a subset of the Cartesian Product D1 × D2 × … × Dn.

Example

[Figure: Entity Set STUDENT with attributes Univ_Roll_No, Name, DOB, derived attribute Age, composite attribute Address (H-No, Street, City, Pin) and multi-valued attribute Tel_No]

The derived attribute Age will not be represented in the STUDENT table. When required, its value will be derived from DOB.

The Tel_No will be represented in a separate table (say named STUDENT-TEL-NO), which will have a column for the Primary Key of STUDENT i.e. Univ_Roll_No and a column for Tel_No. If a student has more than one Tel_No, then his Univ_Roll_No will appear that many times in this table.

The above E-R Diagram will be reduced to the following two Tables:-

STUDENT
Univ_Roll_No    Name    DOB    H-No    Street    City    Pin

STUDENT-TEL-NO
Univ_Roll_No    Tel_No
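Rules (i) to (iv) can be sketched as a small Python function that turns a description of a strong entity set into table definitions. The function name, its parameters and the way the extra table is named are assumptions for illustration; only the column lists are produced, not the data.

```python
def reduce_strong_entity_set(name, primary_key, simple, composite, derived, multi_valued):
    """Return {table name: list of columns} for a strong entity set."""
    columns = list(simple)                 # (i) one column per simple attribute
    for sub_parts in composite.values():   # (ii) one column per sub-part only
        columns.extend(sub_parts)
    # (iii) derived attributes are ignored on purpose: they get no column
    tables = {name: columns}
    for attribute in multi_valued:         # (iv) separate table per multi-valued attribute
        tables[f"{name}-{attribute}"] = list(primary_key) + [attribute]
    return tables


student_tables = reduce_strong_entity_set(
    name="STUDENT",
    primary_key=["Univ_Roll_No"],
    simple=["Univ_Roll_No", "Name", "DOB"],
    composite={"Address": ["H-No", "Street", "City", "Pin"]},
    derived=["Age"],
    multi_valued=["Tel_No"],
)
```

The result contains a STUDENT table with seven columns and a second table, here named STUDENT-Tel_No, with the columns Univ_Roll_No and Tel_No.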

(b) Tabular representation of Relationship Sets. Let R be a Relationship Set and let a1, a2, ….. am be the set of attributes formed by the union of the primary keys of all the Entity Sets participating in R, and let the descriptive attributes of R (if any) be b1, b2, ….. bn. Then the Relationship Set R will be represented by a Table named, say, “R”, which will have (m+n) columns, each column representing one of the attributes from the set {a1, a2, …… am} ∪ {b1, b2, …. bn}.

Example

[Figure: CUSTOMER (C-Id, C-Name, C-Address) and ACCOUNT (Account-No, Balance, Branch-Name) linked by DEPOSITOR with descriptive attribute Date-of-Operation]

The Relationship Set DEPOSITOR will be represented by a table named DEPOSITOR. The Entity Sets CUSTOMER and ACCOUNT have Primary Keys C-Id and Account-No respectively, which will also form part of the DEPOSITOR table. In addition, the DEPOSITOR table will have a column for its Descriptive Attribute “Date-of-Operation”. The above E-R Diagram will be reduced to the following set of tables:-

CUSTOMER
C-Id    C-Name    C-address

ACCOUNT
Account-Number    Balance    Branch-Name

DEPOSITOR
C-Id    Account-Number    Date-of-Operation

Shifting of Descriptive Attributes of a Relationship Set and Merging of Relationship Set Table with the tables of participating Entity Sets. Depending on the Mapping Cardinality of the participating Entity Sets, the Descriptive Attributes of the Relationship Set can be shifted to one of the participating Entity Sets. Also, a Relationship Set Table can be combined with the table of one of the participating Entity Sets, as per the following conditions:-

(1) One-to-One Relationship Suppose there is a One-to-One relationship between two entity sets; then the rows in the Relationship Set table will have one-to-one mapping with the rows in the tables of the participating entity sets. Under this condition, it is possible to shift the descriptive attributes of the relationship set to any of the participating Entity Sets, and it is also possible to merge the table of the Relationship Set with the table of any of the participating Entity Sets, without loss of any information.

Example:-

[Figure: CUSTOMER (C-Id, C-Name, C-Address) and ACCOUNT (Account-No, Balance, Branch-Name) linked by DEPOSITOR with descriptive attribute Date-of-Operation]

As indicated above, there is a One-to-One Relationship between CUSTOMER and ACCOUNT i.e. each Customer has at most one account and each account is “Single” (i.e. owned by only one customer).

CUSTOMER
C-Id     C-Name   C-address
C-001    Ajay     320, Sector-26, Noida
C-220    Vijay    110, Sector-8, RKP
C-310    Ram      120, Sector-25, Noida
C-505    Shyam    303, Sector-22, RKP

ACCOUNT
Account-Number   Balance   Branch-Name
A-101            10000     Sec-18
A-203            30000     Sec-26
A-305            50000     CP
A-310            25000     RKP

DEPOSITOR
C-Id     Account-Number   Date-of-Operation
C-001    A-310            10-Jan-2007
C-220    A-101            23-Dec-2006
C-310    A-203            03-Feb-2007
C-505    A-305            27-Dec-2007

As is evident, the rows in the DEPOSITOR table have one-to-one mapping with the rows in the CUSTOMER Table and also with the rows in the ACCOUNT Table. That is, the first row of DEPOSITOR maps onto the fourth row of ACCOUNT, the second row of DEPOSITOR maps onto the first row of ACCOUNT, the third row of DEPOSITOR maps onto the second row of ACCOUNT and the last row of DEPOSITOR maps onto the third row of ACCOUNT. Thus, the descriptive attribute Date-of-Operation of the Relationship Set DEPOSITOR can be shifted to either CUSTOMER or ACCOUNT. Also, the DEPOSITOR Table can be combined either with the CUSTOMER Table or with the ACCOUNT Table, without losing any information. The combined table will have the union of the columns of the two merged tables. Suppose the DEPOSITOR Table is merged with the CUSTOMER Table; then the CUSTOMER Table will also include attributes Account-Number and Date-of-Operation. The resulting set of tables will then be:-

CUSTOMER
C-Id     C-Name   C-address                Account-Number   Date-of-Operation
C-001    Ajay     320, Sector-26, Noida    A-310            10-Jan-2007
C-220    Vijay    110, Sector-8, RKP       A-101            23-Dec-2006
C-310    Ram      120, Sector-25, Noida    A-203            03-Feb-2007
C-505    Shyam    303, Sector-22, RKP      A-305            27-Dec-2007

ACCOUNT
Account-Number   Balance   Branch-Name
A-101            10000     Sec-18
A-203            30000     Sec-26
A-305            50000     CP
A-310            25000     RKP

The combined CUSTOMER Table now includes the Primary Key (Account-Number) of ACCOUNT and the descriptive attribute Date-of-Operation of DEPOSITOR.

(2) One-To-Many Relationship Suppose there is a One-to-Many relationship between CUSTOMER and ACCOUNT i.e. each customer can have many accounts, but each account has to be single.

Example

[Figure: CUSTOMER (C-Id, C-Name, C-Address) and ACCOUNT (Account-No, Balance, Branch-Name) linked by DEPOSITOR with descriptive attribute Date-of-Operation]

CUSTOMER
C-Id     C-Name   C-address
C-001    Ajay     320, Sector-26, Noida
C-220    Vijay    110, Sector-8, RKP
C-310    Ram      120, Sector-25, Noida
C-505    Shyam    303, Sector-22, RKP

ACCOUNT
Account-Number   Balance   Branch-Name
A-101            10000     Sec-18
A-203            30000     Sec-26
A-305            50000     CP
A-310            25000     RKP
A-550            35000     CP
A-670            60000     Sec-18

DEPOSITOR
C-Id     Account-Number   Date-of-Operation
C-001    A-310            10-Jan-2007
C-220    A-101            23-Dec-2006
C-310    A-203            03-Feb-2007
C-505    A-305            27-Dec-2007
C-001    A-550            22-Dec-2006
C-310    A-670            01-Jan-2007

The rows in the DEPOSITOR table have one-to-one mapping onto the rows in the ACCOUNT Table i.e. with the “Many-Side Entity Set” Table. That is, the first row of DEPOSITOR maps onto the fourth row of ACCOUNT, the second row of DEPOSITOR maps onto the first row of ACCOUNT, the third row of DEPOSITOR maps onto the second row of ACCOUNT, the fourth row of DEPOSITOR maps onto the third row of ACCOUNT, the fifth row of DEPOSITOR maps onto the fifth row of ACCOUNT and the last row of DEPOSITOR maps onto the last row of the ACCOUNT table. Thus, the descriptive attribute Date-of-Operation (DOO) can be shifted to ACCOUNT (the “Many-Side” Entity Set) and the DEPOSITOR Table can be merged with the ACCOUNT Table (i.e. with the table of the “Many-Side” Entity Set), without losing any information. The resultant ACCOUNT table will also include the Primary Key C-Id of the CUSTOMER table and the descriptive attribute DOO of the DEPOSITOR table. The resulting set of tables will then be:-

CUSTOMER
C-Id     C-Name   C-address
C-001    Ajay     320, Sector-26, Noida
C-220    Vijay    110, Sector-8, RKP
C-310    Ram      120, Sector-25, Noida
C-505    Shyam    303, Sector-22, RKP

ACCOUNT
Account-Number   Balance   Branch-Name   Customer_Id   Date_of_Operation
A-101            10000     Sec-18        C-220         23-Dec-2006
A-203            30000     Sec-26        C-310         03-Feb-2007
A-305            50000     CP            C-505         27-Dec-2007
A-310            25000     RKP           C-001         10-Jan-2007
A-550            35000     CP            C-001         22-Dec-2006
A-670            60000     Sec-18        C-310         01-Jan-2007
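The merge rule for a one-to-many relationship can be sketched in Python as follows. The function name merge_into_entity and the reduced sample data (three accounts and two customers taken from the tables above) are assumptions for illustration. The function folds each relationship row into the matching entity row and refuses to do so when an entity occurs in more than one relationship row, since a single row could not then hold the relationship.

```python
def merge_into_entity(entity_rows, relationship_rows, entity_key):
    """Merge a relationship table into the table of one participating entity set."""
    merged = {row[entity_key]: dict(row) for row in entity_rows}
    used = set()
    for rel in relationship_rows:
        key = rel[entity_key]
        if key in used:
            raise ValueError(f"{key} occurs in more than one relationship row")
        used.add(key)
        merged[key].update(rel)  # adds the other key and the descriptive attribute
    return list(merged.values())


account = [
    {"Account-Number": "A-101", "Balance": 10000, "Branch-Name": "Sec-18"},
    {"Account-Number": "A-203", "Balance": 30000, "Branch-Name": "Sec-26"},
    {"Account-Number": "A-670", "Balance": 60000, "Branch-Name": "Sec-18"},
]
customer = [
    {"C-Id": "C-220", "C-Name": "Vijay"},
    {"C-Id": "C-310", "C-Name": "Ram"},
]
depositor = [
    {"C-Id": "C-220", "Account-Number": "A-101", "Date-of-Operation": "23-Dec-2006"},
    {"C-Id": "C-310", "Account-Number": "A-203", "Date-of-Operation": "03-Feb-2007"},
    {"C-Id": "C-310", "Account-Number": "A-670", "Date-of-Operation": "01-Jan-2007"},
]

# Merging with the "Many-Side" table ACCOUNT succeeds.
merged_account = merge_into_entity(account, depositor, "Account-Number")

# Merging with the "One-Side" table CUSTOMER fails: C-310 holds two accounts.
try:
    merge_into_entity(customer, depositor, "C-Id")
    one_side_merge_possible = True
except ValueError:
    one_side_merge_possible = False
```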

(3) Many-to-One Relationship Suppose there is many-to-one relationship betweenCUSTOMER and ACCCOUNT, which implies that each account can be “Joint” but eachcustomer can hold only one account. In this case, the table DEPOSITOR can becombined with “Many-Side” Entity-Set table CUSTOMER.

Example

DEPOSITOR

CUSTOMERC-Id C-Name C-address

Praveen Kumar

CUSTOMERACCOUNT

C-Id

C-Name

C-Address

Account-No

Balance

Branch-Name

Date-of-Operation

Page 37: CHAPTER 1 INTRODUCTION Overview of DBMS (Database

37

C-001 Ajay 320, Sector-26, NoidaC-220 Vijay 110,Sector-8, RKPC-310 Ram 120,Sector-25, NoidaC-505 Shyam 303,Sector-22, RKP

ACCOUNTAccount-Number Balance Branch-NameA-101 10000 Sec-18A-203 30000 Sec-26

DEPOSITORC-Id Account-Number Date-of-Operation

C-001 A-101 10-Jan-2007C-220 A-203 23-Dec-2006C-310 A-101 03-Feb-2007C-505 A-203 27-Dec-2007The rows in the DEPOSITOR table have one-to-one mapping onto the rows inCUSTOMER Table i.e. with the “Many-Side Entity Set” Table. Thus, the descriptiveattributes of DEPOSITOR can be shifted to “Many-Side” Entity Set CUSTOMER andthe DEPOSITOR Table can be with the CUSTOMER Table, without losing anyinformation. The resultant CUSTOMER table will also include the Primary KeyAccount_Number of ACCOUNT table and descriptive attribute DOO of theDEPOSITOR table. The resulting set of tables will then be:-

CUSTOMER
C-Id   C-Name  C-address              Account_Number  DOO
C-001  Ajay    320, Sector-26, Noida  A-101           10-Jan-2007
C-220  Vijay   110, Sector-8, RKP     A-203           23-Dec-2006
C-310  Ram     120, Sector-25, Noida  A-101           03-Feb-2007
C-505  Shyam   303, Sector-22, RKP    A-203           27-Dec-2007

ACCOUNT
Account-Number  Balance  Branch-Name
A-101           10000    Sec-18
A-203           30000    Sec-26
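The merge described above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and the sample rows of this example; the lower-case table and column names, the helper function names and the table name customer_merged are assumptions made only for illustration and are not part of the notes.

```python
import sqlite3

def build_many_to_one_example():
    """Create CUSTOMER, ACCOUNT and DEPOSITOR with the sample rows of the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (c_id TEXT PRIMARY KEY, c_name TEXT, c_address TEXT);
        CREATE TABLE account (account_number TEXT PRIMARY KEY,
                              balance INTEGER, branch_name TEXT);
        -- c_id alone is the key here: each customer holds at most one account.
        CREATE TABLE depositor (
            c_id TEXT PRIMARY KEY REFERENCES customer(c_id),
            account_number TEXT NOT NULL REFERENCES account(account_number),
            doo TEXT
        );
    """)
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
        ("C-001", "Ajay", "320, Sector-26, Noida"),
        ("C-220", "Vijay", "110, Sector-8, RKP"),
        ("C-310", "Ram", "120, Sector-25, Noida"),
        ("C-505", "Shyam", "303, Sector-22, RKP"),
    ])
    conn.executemany("INSERT INTO account VALUES (?, ?, ?)", [
        ("A-101", 10000, "Sec-18"),
        ("A-203", 30000, "Sec-26"),
    ])
    conn.executemany("INSERT INTO depositor VALUES (?, ?, ?)", [
        ("C-001", "A-101", "10-Jan-2007"),
        ("C-220", "A-203", "23-Dec-2006"),
        ("C-310", "A-101", "03-Feb-2007"),
        ("C-505", "A-203", "27-Dec-2007"),
    ])
    return conn

def merge_depositor_into_customer(conn):
    """Fold DEPOSITOR into CUSTOMER: one row per customer plus Account_Number and DOO."""
    conn.execute("""
        CREATE TABLE customer_merged AS
        SELECT c.c_id, c.c_name, c.c_address, d.account_number, d.doo
        FROM customer AS c
        LEFT JOIN depositor AS d ON d.c_id = c.c_id
    """)
    return conn.execute("SELECT * FROM customer_merged ORDER BY c_id").fetchall()

if __name__ == "__main__":
    for row in merge_depositor_into_customer(build_many_to_one_example()):
        print(row)
```

Because every customer appears at most once in DEPOSITOR, the join does not multiply any CUSTOMER row, which is why the merge loses no information.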

(4) Many-to-Many Relationship Suppose there is a many-to-many relationship between CUSTOMER and ACCOUNT, which implies that each account can be “Joint” and each customer can hold many accounts. In this case, the table DEPOSITOR cannot be combined with either Entity Set and has to be created as a separate table. If we were to combine it, we would have to combine it with both Entity Sets, and that would add unnecessary data redundancy, which is not acceptable.


Example

CUSTOMER
C-Id   C-Name  C-address
C-001  Ajay    320, Sector-26, Noida
C-220  Vijay   110, Sector-8, RKP
C-310  Ram     120, Sector-25, Noida
C-505  Shyam   303, Sector-22, RKP

ACCOUNT
Account-Number  Balance  Branch-Name
A-101           10000    Sec-18
A-203           30000    Sec-26
A-305           50000    CP
A-310           25000    RKP

DEPOSITOR
C-Id   Account-Number  Date-of-Operation
C-001  A-101           10-Jan-2007
C-220  A-203           23-Dec-2006
C-310  A-101           03-Feb-2007
C-505  A-203           27-Dec-2007
C-101  A-305           30-Dec-2007
C-505  A-310           02-Jan-2007

Now, the rows in the DEPOSITOR table have a one-to-one mapping neither with the rows of the CUSTOMER table nor with the rows of the ACCOUNT table. So, the DEPOSITOR table can be merged neither with the CUSTOMER table nor with the ACCOUNT table. Thus, there has to be a separate table for DEPOSITOR, as indicated above. Also, the descriptive attributes of the Relationship Set cannot be shifted to the participating Entity Sets; the descriptive attributes have to remain with the Relationship Set itself.

[E-R diagram: entity sets CUSTOMER (C-Id, C-Name, C-Address) and ACCOUNT (Account-No, Balance, Branch-Name) connected by the relationship set DEPOSITOR (Date-of-Operation)]
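A minimal sketch of the many-to-many case, again using Python's sqlite3 module, is given here. The separate DEPOSITOR table carries a composite key made of both participating keys. The lower-case names and the two helper functions are assumptions for illustration; the DEPOSITOR row for C-101 is left out of the sketch because that Id does not appear in the CUSTOMER table of the example.

```python
import sqlite3

def build_many_to_many_example():
    """CUSTOMER, ACCOUNT and a separate DEPOSITOR table keyed on both Ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (c_id TEXT PRIMARY KEY, c_name TEXT, c_address TEXT);
        CREATE TABLE account (account_number TEXT PRIMARY KEY,
                              balance INTEGER, branch_name TEXT);
        -- Neither c_id nor account_number is unique on its own any more,
        -- so the key is the pair and the descriptive attribute stays here.
        CREATE TABLE depositor (
            c_id TEXT REFERENCES customer(c_id),
            account_number TEXT REFERENCES account(account_number),
            doo TEXT,
            PRIMARY KEY (c_id, account_number)
        );
    """)
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
        ("C-001", "Ajay", "320, Sector-26, Noida"),
        ("C-220", "Vijay", "110, Sector-8, RKP"),
        ("C-310", "Ram", "120, Sector-25, Noida"),
        ("C-505", "Shyam", "303, Sector-22, RKP"),
    ])
    conn.executemany("INSERT INTO account VALUES (?, ?, ?)", [
        ("A-101", 10000, "Sec-18"),
        ("A-203", 30000, "Sec-26"),
        ("A-305", 50000, "CP"),
        ("A-310", 25000, "RKP"),
    ])
    conn.executemany("INSERT INTO depositor VALUES (?, ?, ?)", [
        ("C-001", "A-101", "10-Jan-2007"),
        ("C-220", "A-203", "23-Dec-2006"),
        ("C-310", "A-101", "03-Feb-2007"),
        ("C-505", "A-203", "27-Dec-2007"),
        ("C-505", "A-310", "02-Jan-2007"),
    ])
    return conn

def accounts_of(conn, c_id):
    """All accounts held by one customer (possibly several)."""
    rows = conn.execute(
        "SELECT account_number FROM depositor WHERE c_id = ? ORDER BY account_number",
        (c_id,))
    return [r[0] for r in rows]

def holders_of(conn, account_number):
    """All customers holding one account (possibly several, i.e. a joint account)."""
    rows = conn.execute(
        "SELECT c_id FROM depositor WHERE account_number = ? ORDER BY c_id",
        (account_number,))
    return [r[0] for r in rows]

if __name__ == "__main__":
    conn = build_many_to_many_example()
    print(accounts_of(conn, "C-505"))
    print(holders_of(conn, "A-101"))
```

Customer C-505 appears twice in DEPOSITOR and account A-101 appears twice as well, so folding DEPOSITOR into either entity table would force that table's rows to be repeated.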

(c) Tabular representation of Weak Entity Sets. Let A be a Weak Entity Set with descriptive attributes a1, a2, ..., am. Let B be the Strong Entity Set on which A is existence dependent. Let the primary key of B consist of attributes b1, b2, ..., bn. The Entity Set A is represented by a table called A with (m+n) columns, each column representing one of the attributes from the set {a1, a2, ..., am} U {b1, b2, ..., bn}.

Example:

[E-R diagram: LOAN and PAYMENT connected by the identifying relationship LOAN-PAYMENT]

There will be tables LOAN and PAYMENT; the PAYMENT table will also include the Primary Key of LOAN, i.e. Loan-No. The Primary Key of table PAYMENT will be {Loan-No, Payment-No}, where the attribute Payment-No is called the “Discriminator” or “Partial Key” of the table PAYMENT.

Redundancy of Tables in Weak Entity Sets The table for the Identifying Relationship LOAN-PAYMENT is not required, because if we create such a table it will have only two attributes, i.e. Loan-No and Payment-No, which already form part of the table PAYMENT. Thus, no table needs to be created for an Identifying Relationship. In case an Identifying Relationship has a Descriptive Attribute, that attribute can be shifted to the “Many-Side” Entity Set, i.e. the Weak Entity Set.
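The LOAN and PAYMENT tables can be sketched as follows with Python's sqlite3 module. The column names follow the attributes of the example; the lower-case spelling, the column types, the helper try_insert and all row values used with it are assumptions for illustration only.

```python
import sqlite3

def build_loan_payment():
    """LOAN (strong entity set) and PAYMENT (weak entity set) as tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE loan (loan_no TEXT PRIMARY KEY, amount INTEGER);
        -- PAYMENT carries the key of LOAN; the discriminator payment_no
        -- is unique only within one loan, so the key is the pair.
        CREATE TABLE payment (
            loan_no TEXT NOT NULL REFERENCES loan(loan_no),
            payment_no INTEGER NOT NULL,
            payment_date TEXT,
            installment INTEGER,
            PRIMARY KEY (loan_no, payment_no)
        );
    """)
    return conn

def try_insert(conn, table, values):
    """Insert one row; return False if a key or foreign-key constraint rejects it."""
    placeholders = ", ".join("?" for _ in values)
    try:
        conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", values)
        return True
    except sqlite3.IntegrityError:
        return False

if __name__ == "__main__":
    conn = build_loan_payment()
    print(try_insert(conn, "loan", ("L-1", 50000)))
    print(try_insert(conn, "payment", ("L-1", 1, "01-Jan-2007", 5000)))
    # Same loan and same payment number again: rejected by the composite key.
    print(try_insert(conn, "payment", ("L-1", 1, "01-Feb-2007", 5000)))
```

No third table is created for LOAN-PAYMENT here: the pair (loan_no, payment_no) that such a table would hold is already present in every PAYMENT row.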

[Labels from the LOAN-PAYMENT diagram: LOAN (Loan-No, Amount); PAYMENT (Payment-No, Payment-Date, Installment)]

(e) Tabular representation of Generalization. The steps involved are:-

Create a table each for the higher-level entity set and for each lower-level entity set. The table for a lower-level entity set will include its own attributes plus all the Primary-Key attributes of its higher-level entity set.

[E-R diagram: ACCOUNT as the higher-level entity set, with the lower-level entity sets SAVINGS-ACCOUNT, CURRENT-ACCOUNT, FD and RD attached through ISA]

For example, in the above case there will be five tables, i.e. ACCOUNT, SAVINGS-ACCOUNT, CURRENT-ACCOUNT, FD and RD. The table ACCOUNT will have columns Account-Number and Balance; the table SAVINGS-ACCOUNT will have columns Account-Number and Interest-Rate; and the table CURRENT-ACCOUNT will have columns Account-Number and Over-Draft. The same is applicable to the tables FD and RD.

Combining of Tables in Generalization If a generalization is “Total”, which implies that each entity in the super-class (higher-level entity set) is a member of at least one sub-class (lower-level entity set), no table is required to be created for the higher-level entity set. Instead, a table needs to be created for each lower-level entity set, and each such table will also include all the attributes of the higher-level entity set in addition to its own distinct attributes. For example, the table SAVINGS-ACCOUNT will have columns Account-Number, Balance and Interest-Rate; and the table CURRENT-ACCOUNT will have columns Account-Number, Balance and Over-Draft. The same is applicable to the FD and RD tables.
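The two ways of deriving the column lists can be compared with a short Python sketch. The function generalization_tables and its parameters are hypothetical names introduced only for this illustration; only SAVINGS-ACCOUNT and CURRENT-ACCOUNT are shown, with the attributes named in the text.

```python
def generalization_tables(higher_name, higher_attrs, higher_key, lower_sets, total=False):
    """Return {table name: list of columns} for a generalization.

    total=False: one table for the higher-level set, and each lower-level
                 table gets the higher-level primary key plus its own attributes.
    total=True:  no higher-level table; each lower-level table gets all
                 higher-level attributes plus its own attributes.
    """
    tables = {}
    if not total:
        tables[higher_name] = list(higher_attrs)
    inherited = list(higher_attrs) if total else list(higher_key)
    for name, own_attrs in lower_sets.items():
        tables[name] = inherited + list(own_attrs)
    return tables

LOWER_SETS = {
    "SAVINGS-ACCOUNT": ["Interest-Rate"],
    "CURRENT-ACCOUNT": ["Over-Draft"],
}

separate = generalization_tables(
    "ACCOUNT", ["Account-Number", "Balance"], ["Account-Number"], LOWER_SETS)
combined = generalization_tables(
    "ACCOUNT", ["Account-Number", "Balance"], ["Account-Number"], LOWER_SETS, total=True)

if __name__ == "__main__":
    for label, tables in (("separate", separate), ("combined", combined)):
        for name, columns in tables.items():
            print(label, name, columns)
```

In the combined form Balance is stored in the lower-level tables themselves, so no join with ACCOUNT is needed to read it.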

[Labels from the generalization diagram: ACCOUNT (Account-Number, Balance); SAVINGS-ACCOUNT (Interest-Rate); CURRENT-ACCOUNT (Over-Draft); FD (Int-Rate, Mat-Date); RD (Int-Rate, Mat-Date, Installment)]

(f) Tabular representation of Aggregation. Take the following Example:-

[E-R diagram: relationship set EBJM between MANAGER and the aggregated higher-level entity set EBJ]

In the above scenario, there will be tables for Entity Sets EMPLOYEE, BRANCH, JOB and MANAGER. There will be one table for Relationship Set EBJM having attributes E#, B#, J# and Mgr-Id. No table is required for the Relationship Set EBJ, because this table would be a subset of table EBJM (its columns E#, B# and J# all appear there).

EBJM
E#  B#  J#  Mgr-Id
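The reason EBJ needs no table of its own can be seen in a small sqlite3 sketch: its rows are obtained by projecting EBJM onto E#, B# and J#. The column names e_no, b_no, j_no and mgr_id, the choice of all four columns as the key, and the sample rows are assumptions made for illustration; the notes do not specify them.

```python
import sqlite3

def build_ebjm(rows):
    """Create the EBJM table and load (E#, B#, J#, Mgr-Id) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE ebjm (
            e_no TEXT, b_no TEXT, j_no TEXT, mgr_id TEXT,
            PRIMARY KEY (e_no, b_no, j_no, mgr_id)  -- assumed key
        )
    """)
    conn.executemany("INSERT INTO ebjm VALUES (?, ?, ?, ?)", rows)
    return conn

def project_ebj(conn):
    """Recover the EBJ combinations by projecting EBJM onto E#, B#, J#."""
    return conn.execute("""
        SELECT DISTINCT e_no, b_no, j_no
        FROM ebjm
        ORDER BY e_no, b_no, j_no
    """).fetchall()

# Hypothetical rows: employee E1 doing job J1 at branch B1 has two managers.
SAMPLE_ROWS = [
    ("E1", "B1", "J1", "M1"),
    ("E1", "B1", "J1", "M2"),
    ("E2", "B1", "J2", "M1"),
]

if __name__ == "__main__":
    print(project_ebj(build_ebjm(SAMPLE_ROWS)))
```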

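[Labels from the aggregation diagram: EMPLOYEE (E#, E-Name); BRANCH (B#, B-Name); JOB (J#); MANAGER (Mgr-Id); relationship set EBJ among EMPLOYEE, BRANCH and JOB, shown as the aggregated higher-level entity set “EBJ”]

E-R DIAGRAM OF AN AIRLINE RESERVATION SYSTEM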

[E-R diagram of the airline reservation system, showing FLT_SCHEDULE, FLIGHT, AIRCRAFT, CREW, FLT_CREW, TICKET, PASSENGER, RESERVATION, CANCELLATION and REFUND with the attributes listed in the tables below]

Where
ETD: Estimated Time of Departure (i.e. scheduled take-off time).
ETA: Estimated Time of Arrival (i.e. scheduled landing time at the destination).
ATD: Actual Time of Departure. Initially, it will have a NULL value. It will get defined after the aircraft actually takes off.
ATA: Actual Time of Arrival. Initially, it will have a NULL value. Its value will get defined only after the aircraft actually lands at the destination.

The above E-R Diagram can be reduced to the following set of tables:-

FLIGHT_SCHEDULE (FLT_NO, FROM_PLACE, TO_PLACE, ETD, ETA)
AIRCRAFT (AC_NO, AC_TYPE, CAPACITY)
CREW (CREW_ID, CREW_NAME, DESIGNATION)
FLIGHT (FLT_NO, DATE, AC_NO, ATD, ATA)
FLT_CREW (FLT_NO, DATE, CREW_ID)
TICKET (TICKET_NO, ISSUE_DATE, FARE, P_NAME, P_ADDR, P_TEL_NO)
RESERVATION (TICKET_NO, FLT_NO, DATE, CONFIRMED, SEAT_NO)
CANCELLATION (TICKET_NO, FLT_NO, DATE, VOUCHER_NO, C_DATE)
REFUND (VOUCHER_NO, AMOUNT)

Since there is a one-to-one relationship between TICKET and PASSENGER, a common table TICKET will suffice for these two entity sets. Even the CANCELLATION table can be combined with the RESERVATION table, since there is a many-to-one relationship between RESERVATION and REFUND.

The SEAT_NO will get defined only after a passenger checks in for a flight.
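The NULL-until-known behaviour of ATD, ATA and SEAT_NO can be sketched with Python's sqlite3 module as follows. Only the FLIGHT and RESERVATION tables are shown. The column DATE is renamed flt_date here, and the primary keys, column types, helper function names and any row values passed to them are assumptions for illustration; the notes do not state them.

```python
import sqlite3

def build_airline():
    """Create FLIGHT and RESERVATION; ATD, ATA and SEAT_NO are nullable."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE flight (
            flt_no TEXT, flt_date TEXT, ac_no TEXT,
            atd TEXT,   -- NULL until the aircraft actually takes off
            ata TEXT,   -- NULL until the aircraft actually lands
            PRIMARY KEY (flt_no, flt_date)             -- assumed key
        );
        CREATE TABLE reservation (
            ticket_no TEXT, flt_no TEXT, flt_date TEXT,
            confirmed TEXT,
            seat_no TEXT,  -- NULL until the passenger checks in
            PRIMARY KEY (ticket_no, flt_no, flt_date)  -- assumed key
        );
    """)
    return conn

def add_flight(conn, flt_no, flt_date, ac_no):
    """Register a flight for a date; actual times are not known yet."""
    conn.execute("INSERT INTO flight VALUES (?, ?, ?, NULL, NULL)",
                 (flt_no, flt_date, ac_no))

def record_departure(conn, flt_no, flt_date, atd):
    """Fill in ATD once the aircraft has taken off."""
    conn.execute("UPDATE flight SET atd = ? WHERE flt_no = ? AND flt_date = ?",
                 (atd, flt_no, flt_date))

def reserve(conn, ticket_no, flt_no, flt_date, confirmed):
    """Create a reservation; the seat is assigned later."""
    conn.execute("INSERT INTO reservation VALUES (?, ?, ?, ?, NULL)",
                 (ticket_no, flt_no, flt_date, confirmed))

def check_in(conn, ticket_no, flt_no, flt_date, seat_no):
    """Fill in SEAT_NO when the passenger checks in."""
    conn.execute(
        "UPDATE reservation SET seat_no = ? "
        "WHERE ticket_no = ? AND flt_no = ? AND flt_date = ?",
        (seat_no, ticket_no, flt_no, flt_date))

def flight_times(conn, flt_no, flt_date):
    """Return (ATD, ATA) for one flight; None stands for NULL."""
    return conn.execute(
        "SELECT atd, ata FROM flight WHERE flt_no = ? AND flt_date = ?",
        (flt_no, flt_date)).fetchone()

def seat_of(conn, ticket_no):
    """Return SEAT_NO for one ticket; None stands for NULL."""
    return conn.execute(
        "SELECT seat_no FROM reservation WHERE ticket_no = ?",
        (ticket_no,)).fetchone()[0]
```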

E-R DIAGRAM FOR VEHICLE INSURANCE

[E-R diagram labels: OWNER (O-NAME, O_ADDRESS, O_TEL_NO); VEHICLE (REG_NO, MAKE, MODEL, COLOR); INSURANCE_POLICY (POLICY_NO, EXPIRY_DATE, PREMIUM, BONUS); ACCIDENT (A_REPORT_NO, A_DATE, PLACE); SURVEYOR REPORT (S_REPORT_NO, ASSESSED_DAMAGE); CLAIM_PAYMENT (PAYMENT_VOUCHER_NO, P_DATE, P_AMOUNT); REPAIRS (REF_NO, REPAIR_ITEM, COST)]