entity  · web view2018. 10. 5. · in a course. here, works_at and enrolls are called...


Post on 04-Jun-2021


Relational Database Systems – Unit –I

Topics
1. Purpose of Database Systems
2. Overall System Structure
3. The Entity-Relationship Model
4. Mapping Constraints
5. Keys
6. ER Diagrams
7. Data Storage and Querying
8. Transaction Management
9. Database Architecture

1. Purpose of Database Systems

1. To see why database management systems are necessary, let's look at a typical "file-processing system" supported by a conventional operating system.

2. The application is a savings bank:
   o Savings account and customer records are kept in permanent system files.
   o Application programs are written to manipulate files to perform the following tasks:
     - Debit or credit an account.
     - Add a new account.
     - Find an account balance.
     - Generate monthly statements.

3. Development of the system proceeds as follows:
   o New application programs must be written as the need arises.
   o New permanent files are created as required.
   o But over a long period of time, files may be in different formats, and
   o Application programs may be in different languages.

4. So we can see there are problems with the straight file-processing approach:
   o Data redundancy and inconsistency
     - Same information may be duplicated in several places.
     - All copies may not be updated properly.
   o Difficulty in accessing data
     - May have to write a new application program to satisfy an unusual request.
     - E.g. find all customers with the same postal code.
     - Could generate this data manually, but a long job...
   o Data isolation
     - Data in different files.
     - Data in different formats.
     - Difficult to write new application programs.
   o Integrity problems
     - Data may be required to satisfy constraints.
     - E.g. no account balance below $25.00.
     - Again, difficult to enforce or to change constraints with the file-processing approach.


   o Atomicity problems
     - Failures may leave the database in an inconsistent state with partial updates carried out.
     - Example: a transfer of funds from one account to another should either complete or not happen at all.
   o Concurrent access anomalies
     - Want concurrency for faster response time.
     - Need protection for concurrent updates.
     - E.g. two customers withdrawing funds from the same account at the same time: the account has $500 in it, and they withdraw $100 and $50. The correct result is $350, but without protection it could also be $400 or $450.
   o Security problems
     - Every user of the system should be able to access only the data they are permitted to see.
     - E.g. payroll people only handle employee records, and cannot see customer accounts; tellers only access account data and cannot see payroll data.
     - Difficult to enforce this with application programs.

These problems and others led to the development of database management systems.
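The concurrent-access anomaly above can be reproduced with a small simulation. The following is only a sketch: it assumes each withdrawal is a "read the balance" step followed by a "write the new balance" step, and the names run_schedule, all_outcomes, T1 and T2 are hypothetical. It enumerates every interleaving of the two withdrawals and collects the final balances.

```python
from itertools import permutations

# Hypothetical withdrawal amounts for the two customers in the example.
AMOUNTS = {"T1": 100, "T2": 50}


def run_schedule(schedule, start_balance=500):
    """Replay one interleaving of read/write steps with no protection."""
    balance = start_balance
    local_copy = {}  # each customer's privately read balance
    for customer, step in schedule:
        if step == "read":
            local_copy[customer] = balance
        else:  # "write": store the balance computed from the stale copy
            balance = local_copy[customer] - AMOUNTS[customer]
    return balance


def all_outcomes():
    """Return the set of final balances over all valid interleavings."""
    steps = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
    outcomes = set()
    for order in permutations(steps):
        # A customer must read before writing.
        t1_ok = order.index(("T1", "read")) < order.index(("T1", "write"))
        t2_ok = order.index(("T2", "read")) < order.index(("T2", "write"))
        if t1_ok and t2_ok:
            outcomes.add(run_schedule(order))
    return outcomes


if __name__ == "__main__":
    print(sorted(all_outcomes()))
```

Only the schedules in which one withdrawal finishes before the other starts give $350; the others lose one of the two updates.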

2. Overall System Structure

A DBMS is very large and is typically divided into modules. Some of the services are also provided by the operating system of the host computer. The following is an example of what the structure might be:


Query processor
   o DML compiler
     Translates Data Manipulation Language (DML) statements into query-engine instructions. It may also optimize the query.
   o Embedded DML precompiler
     Converts the DML statements embedded in an application program into normal procedure calls in the host language.
   o DDL interpreter
     Interprets DDL statements and records them in a set of tables containing metadata.
   o Query evaluation engine
     Executes the low-level instructions generated by the DML compiler.

Storage manager
   o Authorization and integrity manager
     Tests for the satisfaction of integrity constraints and checks the authority of users to perform various actions.


   o Transaction manager
     Ensures the database remains in a consistent (correct) state despite system failures.
   o File manager
     Responsible for the allocation of space on the disk storage system.
   o Buffer manager
     Manages the data coming into and out of the system, including the caching of data.

Data structures
   o Data files
     The database itself.
   o Data dictionary
     The metadata about the structure of the database. Actually, this is a critical element in the DBMS!
   o Indices
     Used to provide fast access to the data.
   o Statistical data
     The query processor uses this to optimize queries.
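To make the idea of a data dictionary concrete, here is a small sketch using SQLite through Python's standard sqlite3 module. SQLite is chosen only for illustration (the notes do not name a particular DBMS), and the table and index names are hypothetical. SQLite records its metadata in a catalog table called sqlite_master, which can be queried like any other table.

```python
import sqlite3


def describe_catalog():
    """Create a table and an index, then read them back from the catalog."""
    conn = sqlite3.connect(":memory:")
    # DDL statements: the system records them as metadata.
    conn.execute(
        "CREATE TABLE account (account_no INTEGER PRIMARY KEY, balance REAL)"
    )
    conn.execute("CREATE INDEX idx_account_balance ON account (balance)")
    # The catalog (data dictionary) is itself stored as a table.
    rows = conn.execute(
        "SELECT type, name FROM sqlite_master ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for object_type, name in describe_catalog():
        print(object_type, name)
```

The catalog lists both the data file object (the table) and the index built on it.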

3. The Entity-Relationship Model

The ER model defines the conceptual view of a database. It is built around real-world entities and the associations among them. At the view level, the ER model is considered a good option for designing databases.

Entity

An entity is a real-world object, either animate or inanimate, that is easily identifiable. For example, in a school database, students, teachers, classes, and courses offered can be considered as entities. All these entities have some attributes or properties that give them their identity.

Entity set

An entity set is a collection of entities of the same type. An entity set may contain entities whose attributes share similar values. For example, a Students set may contain all the students of a school; likewise a Teachers set may contain all the teachers of a school from all faculties. Entity sets need not be disjoint.

Attributes

Entities are represented by means of their properties, called attributes. All attributes have values. For example, a student entity may have name, class, and age as attributes.

There exists a domain or range of values that can be assigned to attributes. For example, a student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be negative, etc.

Types of Attributes

Simple attribute − Simple attributes are atomic values, which cannot be divided further. For example, a student's phone number is an atomic value of 10 digits.

Composite attribute − Composite attributes are made of more than one simple attribute. For example, a student's complete name may have first_name and last_name.

Derived attribute − Derived attributes are attributes that do not exist in the physical database; their values are derived from other attributes present in the database. For example, average_salary in a department should not be saved directly in the database; instead, it can be derived. As another example, age can be derived from date_of_birth.

Single-value attribute − Single-value attributes contain a single value. For example − Social_Security_Number.

Multi-value attribute − Multi-value attributes may contain more than one value. For example, a person can have more than one phone number, email_address, etc.

These attribute types can be combined as follows:
   o simple single-valued attributes
   o simple multi-valued attributes
   o composite single-valued attributes
   o composite multi-valued attributes
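The attribute types above can be illustrated with a short sketch. The classes Name and Student and their field names are hypothetical and chosen only to mirror the examples in the text; the derived attribute is computed on demand rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Name:
    """Composite attribute: made of more than one simple attribute."""
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_number: int        # simple, single-valued attribute
    name: Name              # composite, single-valued attribute
    date_of_birth: date     # simple, stored attribute
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        birthday_passed = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if birthday_passed else 1)


if __name__ == "__main__":
    s = Student(1, Name("Asha", "Rao"), date(2000, 6, 15), ["111", "222"])
    print(s.age(date(2018, 10, 5)))
```

Because age is derived, it stays correct as time passes without any update to the stored data.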

Relationship

The association among entities is called a relationship. For example, an employee works_at a department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.

Relationship Set

A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have attributes. These attributes are called descriptive attributes.


Degree of Relationship

The number of participating entities in a relationship defines the degree of the relationship.

   o Binary = degree 2
   o Ternary = degree 3
   o n-ary = degree n
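As a rough illustration, a relationship set can be modelled as a set of tuples, one tuple per relationship. In this sketch the names Enrolls, student_roll, course_code and enrolled_on are hypothetical; enrolled_on plays the role of a descriptive attribute, and the degree is the number of participating entity sets.

```python
from typing import NamedTuple


class Enrolls(NamedTuple):
    """One relationship: a student enrolls in a course."""
    student_roll: int   # identifies the participating Student entity
    course_code: str    # identifies the participating Course entity
    enrolled_on: str    # descriptive attribute of the relationship itself


# The participating entity sets; descriptive attributes are not counted.
PARTICIPANTS = ("Students", "Courses")


def degree(participants):
    """Degree of a relationship = number of participating entity sets."""
    return len(participants)


# The relationship set: all relationships of the same type.
enrolls_set = {
    Enrolls(1, "DB101", "2018-07-01"),
    Enrolls(2, "DB101", "2018-07-03"),
}

if __name__ == "__main__":
    print(degree(PARTICIPANTS), len(enrolls_set))
```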

4. Mapping Constraints

a) Mapping Cardinalities and b) Existence Dependencies

a) Mapping Cardinalities

Cardinality defines the number of entities in one entity set that can be associated with entities of another entity set via a relationship set.

One-to-one − One entity from entity set A can be associated with at most one entity of entity set B and vice versa.

One-to-many − One entity from entity set A can be associated with more than one entity of entity set B; however, an entity from entity set B can be associated with at most one entity of entity set A.


Many-to-one − More than one entity from entity set A can be associated with the same entity of entity set B; however, an entity from entity set A can be associated with at most one entity of entity set B.

Many-to-many − One entity from A can be associated with more than one entity from B and vice versa.
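The four cardinalities can be told apart mechanically. The sketch below assumes a binary relationship set given as (a, b) pairs and uses a hypothetical function name, mapping_cardinality. Note that it only classifies the pairs it is given; a mapping constraint in a design is a rule about all allowed data, not about one sample.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (a, b) pairs."""
    b_per_a = defaultdict(set)  # entities of B linked to each entity of A
    a_per_b = defaultdict(set)  # entities of A linked to each entity of B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)

    a_has_many_b = any(len(linked) > 1 for linked in b_per_a.values())
    b_has_many_a = any(len(linked) > 1 for linked in a_per_b.values())

    if a_has_many_b and b_has_many_a:
        return "many-to-many"
    if a_has_many_b:
        return "one-to-many"
    if b_has_many_a:
        return "many-to-one"
    return "one-to-one"


if __name__ == "__main__":
    # One department (A) has many employees (B).
    print(mapping_cardinality([("d1", "e1"), ("d1", "e2")]))
```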

b) Existence Dependencies


5. Keys

A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set. For example, the roll_number of a student makes him/her identifiable among students.

Super Key − A set of one or more attributes that collectively identifies an entity in an entity set.

Candidate Key − A minimal super key is called a candidate key. An entity set may have more than one candidate key.

Primary Key − A primary key is one of the candidate keys, chosen by the database designer to uniquely identify the entities in the entity set.
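The difference between super keys and candidate keys can be checked on sample data. This is a sketch under a strong assumption: it tests uniqueness only on the rows supplied, whereas a real key is a property the designer declares for all possible data. The function names and the sample students are hypothetical.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal super keys, smallest attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip any set that contains an already found key: not minimal.
            if any(set(key) <= set(attrs) for key in keys):
                continue
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


students = [
    {"roll_number": 1, "email": "a@x.edu", "name": "Asha", "class": "X"},
    {"roll_number": 2, "email": "b@x.edu", "name": "Ravi", "class": "X"},
    {"roll_number": 3, "email": "c@x.edu", "name": "Asha", "class": "IX"},
    {"roll_number": 4, "email": "d@x.edu", "name": "Asha", "class": "X"},
]

if __name__ == "__main__":
    print(candidate_keys(students, ("roll_number", "email", "name", "class")))
```

Here (roll_number, name) is a super key but not a candidate key, because roll_number alone already identifies each student. The designer would pick one of the candidate keys as the primary key.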

6. ER Diagrams

An entity-relationship model (ER model) is a systematic way of describing and defining a business process. An ER model is typically implemented as a database. The main components of the E-R model are entity sets and relationship sets.

Here are the geometric shapes and their meanings in an E-R diagram:

Rectangles: Entity sets
Ellipses: Attributes
Diamonds: Relationship sets
Lines: Link attributes to entity sets and entity sets to relationship sets
Double ellipses: Multivalued attributes
Dashed ellipses: Derived attributes
Double rectangles: Weak entity sets
Double lines: Total participation of an entity in a relationship set

A sample E-R Diagram:


Multivalued Attributes: An attribute that can hold multiple values is known as a multivalued attribute. We represent it with double ellipses in an E-R diagram. E.g. a person can have more than one phone number, so the phone number attribute is multivalued.

Derived Attribute: A derived attribute is one whose value is dynamic and derived from another attribute. It is represented by a dashed ellipse in an E-R diagram. E.g. a person's age is a derived attribute, as it changes over time and can be derived from another attribute (date of birth).

E-R diagram with multivalued and derived attributes:

Total Participation of an Entity set:

Total participation of an entity set means that each entity in the entity set must take part in at least one relationship in the relationship set. For example, in the diagram below, each college must have at least one associated student.

Total Participation Diagram


7. Data Storage and Querying

The storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

The storage manager is responsible for the following tasks:
- Interaction with the file manager
- Efficient storing, retrieving and updating of data

Components of Storage Manager

   o Authorization and integrity manager
     Tests for the satisfaction of integrity constraints and checks the authority of users to perform various actions.
   o Transaction manager
     Ensures the database remains in a consistent (correct) state despite system failures.
   o File manager
     Responsible for the allocation of space on the disk storage system.
   o Buffer manager
     Manages the data coming into and out of the system, including the caching of data.

Issues:
- Storage access
- File organization
- Indexing and hashing

Query Processing

Steps
1. Parsing and translation
2. Optimization
3. Evaluation
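The three steps can be observed in a real system. The sketch below uses SQLite through Python's sqlite3 module purely as an example DBMS; the customer table and the index name are hypothetical. SQLite's EXPLAIN QUERY PLAN statement parses the query, lets the optimizer choose an access strategy, and reports that strategy instead of evaluating the query. The exact wording of the plan differs between SQLite versions, so no particular output is shown.

```python
import sqlite3


def show_plan():
    """Return the optimizer's plan for a postal-code lookup as text lines."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, postal_code TEXT)"
    )
    # An index gives the optimizer an alternative to scanning the whole table.
    conn.execute("CREATE INDEX idx_customer_postal ON customer (postal_code)")

    query = "SELECT name FROM customer WHERE postal_code = ?"
    # Steps 1 and 2 (parsing, optimization) happen here; step 3 is skipped.
    plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("560001",)).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable description.
    return [row[-1] for row in plan]


if __name__ == "__main__":
    for line in show_plan():
        print(line)
```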


Components of Query processor

   o DML compiler
     Translates Data Manipulation Language (DML) statements into query-engine instructions. It may also optimize the query.
   o Embedded DML precompiler
     Converts the DML statements embedded in an application program into normal procedure calls in the host language.
   o DDL interpreter
     Interprets DDL statements and records them in a set of tables containing metadata.
   o Query evaluation engine
     Executes the low-level instructions generated by the DML compiler.

8. Transaction Management

Often, several operations on the database form a single logical unit of work. An example is a funds transfer, in which one department account (say A) is debited and another department account (say B) is credited. Clearly, it is essential that either both the credit and debit occur, or that neither occur. That is, the funds transfer must happen in its entirety or not at all. This all-or-none requirement is called atomicity.

In addition, it is essential that the execution of the funds transfer preserve the consistency of the database. That is, the value of the sum of the balances of A and B must be preserved. This correctness requirement is called consistency.

Finally, after the successful execution of a funds transfer, the new values of the balances of accounts A and B must persist, despite the possibility of system failure. This persistence requirement is called durability.

A transaction is a collection of operations that performs a single logical function in a database application. Each transaction is a unit of both atomicity and consistency. Thus, we require that transactions do not violate any database consistency constraints. That is, if the database was consistent when a transaction started, the database must be consistent when the transaction successfully terminates. However, during the execution of a transaction, it may be necessary temporarily to allow inconsistency, since either the debit of A or the credit of B must be done before the other. This temporary inconsistency, although necessary, may lead to difficulty if a failure occurs.

It is the programmer's responsibility to define properly the various transactions, so that each preserves the consistency of the database. For example, the transaction to transfer funds from the account of department A to the account of department B could be defined to be composed of two separate programs: one that debits account A, and another that credits account B. The execution of these two programs one after the other will indeed preserve consistency. However, each program by itself does not transform the database from a consistent state to a new consistent state. Thus, those programs are not transactions.

Ensuring the atomicity and durability properties is the responsibility of the database system itself—specifically, of the recovery manager. In the absence of failures, all transactions complete successfully, and atomicity is achieved easily.

However, because of various types of failure, a transaction may not always complete its execution successfully. If we are to ensure the atomicity property, a failed transaction must have no effect on the state of the database. Thus, the database must be restored to the state in which it was before the transaction in question started executing. The database system must therefore perform failure recovery, that is, detect system failures and restore the database to the state that existed prior to the occurrence of the failure.

Finally, when several transactions update the database concurrently, the consistency of data may no longer be preserved, even though each individual transaction is correct. It is the responsibility of the concurrency-control manager to control the interaction among the concurrent transactions, to ensure the consistency of the database. The transaction manager consists of the concurrency-control manager and the recovery manager.
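The funds transfer described in this section can be sketched with SQLite through Python's sqlite3 module. SQLite stands in for a generic DBMS here, and the account table, the CHECK constraint (no negative balance) and the function names are assumptions made for the example. The credit is deliberately done before the debit, so that a failing debit leaves a partial update for the system to undo.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two department accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; True on success."""
    try:
        # As a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


if __name__ == "__main__":
    bank = make_bank()
    print(transfer(bank, "A", "B", 100), balances(bank))
    print(transfer(bank, "A", "B", 1000), balances(bank))
```

After the failed transfer the balances are unchanged, so the sum of A and B is the same as before: either both updates happen or neither does.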

9. Database Architecture

Database architecture can be 2-tier or 3-tier, depending on how users are connected to the database to get their requests served. They either connect directly to the database, or their request is received by an intermediary layer, which processes the request and then sends it to the database.

2-tier Architecture


In 2-tier architecture, the application program interacts directly with the database. There is no user interface and no user involved in the database interaction. Imagine a front-end application of a school, where we need to display the reports of all the students who have opted for different subjects. In this case, the application will directly interact with the database and retrieve all required data. Here no inputs from the user are required. This is a 2-tier architecture of the database.

Advantages of 2-tier Architecture

   o Easy to understand, as the application communicates directly with the database.
   o Requested data can be retrieved very quickly when there are few users.
   o Easy to modify: any required changes can be sent directly to the database as requests.
   o Easy to maintain: when there are multiple requests, they are handled in a queue and there will not be any chaos.

Disadvantages of 2-tier architecture:

   o It is time consuming when there is a huge number of users. All the requests are queued and handled one after another. Hence it cannot respond to multiple users at the same time.
   o This architecture is less cost-effective.

3-tier Architecture

3-tier architecture is the most widely used database architecture. It can be viewed as below.

Presentation layer / User layer is the layer where the user uses the database. The user does not have any knowledge of the underlying database and simply interacts with it as though all the data were in front of them. You can imagine this layer as a registration form where you enter your details. Did you ever guess where the data goes after you press the 'submit' button? No, right? You just know that your details are saved. This is the presentation layer, where all the details from the user are taken and sent to the next layer for processing.

Application layer is the underlying program which is responsible for saving the details that you have entered, and retrieving your details to show on the page. This layer has all the business logic, such as validation, calculations and manipulation of data, and it sends the requests to the database to get the actual data. If this layer sees that the request is invalid, it sends a message back to the presentation layer. It will not hit the database layer at all.

Data layer or Database layer is the layer where the actual database resides. This layer holds all the tables, their mappings and the actual data. When you save your details from the front end, they are inserted into the respective tables in the database layer by the programs in the application layer. When you want to view your details in the web browser, a request is sent to the database layer by the application layer. The database layer runs queries and gets the data. These data are then transferred to the browser (presentation layer) by the programs in the application layer.
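The flow through the three layers can be sketched in a few lines. Everything here is hypothetical and simplified: the class and function names, the registration table, and the validation rule (a non-empty name and an "@" in the email) are assumptions for illustration, with SQLite standing in for the database. The queries_run counter is there only to show that an invalid request never reaches the data layer.

```python
import sqlite3


class DataLayer:
    """Data layer: owns the tables and runs the queries."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE registration (name TEXT, email TEXT)")
        self.queries_run = 0  # counts how often the database is hit

    def insert(self, name, email):
        self.queries_run += 1
        with self.conn:
            self.conn.execute(
                "INSERT INTO registration VALUES (?, ?)", (name, email)
            )

    def find(self, email):
        self.queries_run += 1
        return self.conn.execute(
            "SELECT name, email FROM registration WHERE email = ?", (email,)
        ).fetchone()


class ApplicationLayer:
    """Application layer: business logic and validation."""

    def __init__(self, data):
        self.data = data

    def register(self, name, email):
        if not name or "@" not in email:
            # Invalid request: answer directly, without hitting the database.
            return "invalid request"
        self.data.insert(name, email)
        return "saved"


def submit_form(app, form):
    """Presentation layer: collects user input and passes it on."""
    return app.register(form.get("name", ""), form.get("email", ""))


if __name__ == "__main__":
    data = DataLayer()
    app = ApplicationLayer(data)
    print(submit_form(app, {"name": "Asha", "email": "not-an-email"}))
    print(submit_form(app, {"name": "Asha", "email": "asha@example.edu"}))
    print(data.find("asha@example.edu"))
```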

Advantages of 3-tier architecture:

Easy to maintain and modify. Any changes requested will not affect any other data in the database. The application layer does all the validations.

Improved security. Since there is no direct access to the database, data security is increased. There is no fear of mishandling the data. The application layer filters out malicious actions.

Good performance. Since this architecture caches data once it has been retrieved, there is no need to hit the database for each request. This reduces the time consumed for multiple requests and hence enables the system to respond to multiple users at the same time.

Disadvantages of 3-tier architecture:

The disadvantages of 3-tier architecture are that it is a little more complex and that a little more effort is required to reach the database.