cse2004 – database management systems · unit – 2 : data models (er-data model and relational...
Post on 16-Jul-2020
5 Views
Preview:
TRANSCRIPT
Text Books :
1.R. Elmasri & S. B. Navathe, Fundamentals of Database Systems, Addison Wesley, 7 th Edition, 2015
2.Raghu Ramakrishnan, Database Management Systems,Mcgraw-Hill,4th edition,2015
Reference Book
3.A. Silberschatz, H. F. Korth & S. Sudershan, Database System Concepts, McGraw Hill, 6th Edition 2010 4.Thomas Connolly, Carolyn Begg,” Database Systems : A Practical Approach to Design, Implementation and Management”,6th Edition,2012
5.Shashank Tiwari ,“Professional NoSql”,Wiley ,2011
CSE2004 – Database Management Systems
Unit – 2 : Data Models (ER-Data Model and Relational Data Model
CSE2004 – Database Management Systems
Entity Relationship Model Types of Attributes Relationship, Structural Constraints Relational Model, Relational model Constraints Mapping ER model to a relational schema Integrity constraints
Entity - Relationship Data Model
CSE2004 – Database Management Systems
Entity - Relationship Data Model
Modelling Constraints E-R Diagram Weak Entity Sets Extended E-R Features Design of the University Database (Case Study) Design of the Company Database (Case Study) Reduction to Relation Schemas Database Modelling Tools
Modeling
A database can be modeled as: a collection of Entities, Relationship among entities.
An entity is an object that exists and is distinguishable from otherobjects.
Example: specific person, company, event, plantEntities have attributes (defines the properties of Entity)
Example: people have names and addressesAn entity set is a set of entities of the same type that share the sameproperties.
Example: set of all persons, companies, trees, holidays
Entity Sets instructor and student
instructor_ID instructor_name student-ID student_name
Relationship Sets
A relationship is an association among several entities
Example: 44553 (Peltier) advisor 22222 (Einstein)
student entity relationship set instructor entity
It is a mathematical relation among n ≥ 2 entities, each taken from entity sets {(e1 , e2 , ... en) | e1 E∈ 1, e2 E∈ 2 , ..., en E∈ n} where (e1,e2,..., en) is a relationship
lExample:
(44553,22222) ∈ advisorRelationship Set advisor
Relationship Sets
An attribute can also be property of a relationship set. For instance, the advisor relationship set between entity sets instructor and
student may have the attribute date which tracks when the student started being associated with the advisor
Degree of Relationship Sets
Binary relationship involve two entity sets (or degree two). most relationship sets in a database system are binary. Relationships between more than two entity sets are rare.
Example: students work on research projects under the guidance of an instructor. relationship proj_guide is a ternary relationship between instructor, student,
and project
Attribute
An entity is represented by a set of attributes, that is descriptive properties possessed by all members of an entity set.
lExample:
instructor = (ID, name, street, city, salary )course = (course_id, title, credits)
Domain – the set of permitted values for each attribute
Attribute types: Simple and composite attributes. Single-valued and multivalued attributes
Example: multivalued attribute: phone_numbers
Derived attributes Can be computed from other attributes
Example: age, given date_of_birth
Composite Attribute
Mapping Cardinality Constraints
Express the number of entities to which another entity can be associated via a relationship set. Most useful in describing binary relationship sets.
For a binary relationship set the mapping cardinality must be one of the following types:
One to one One to many Many to one Many to many
Mapping Cardinality Constraints
A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
A candidate key of an entity set is a minimal super key ID is candidate key of instructor course_id is candidate key of course
Although several candidate keys may exist, one of the candidate keys is selected to be the primary key
The combination of primary keys of the participating entity sets forms a super key of a relationship set.
(s_id, i_id) is the super key of advisor NOTE: pair of entity sets can have at most one relationship in a particular
relationship set.Example: if we wish to track multiple meeting dates between a student and her advisor, we cannot assume a relationship for each meeting. We can use a multivalued attribute though
E-R Diagrams
Rectangles represent entity sets. Diamonds represent relationship sets. Attributes listed inside entity rectangle Underline indicates primary key attributes
Entity With Composite, Multivalued, and Derived Attributes
Relationships sets with Attributes and Roles
Entity sets of a relationship need not be distinct Each occurrence of an entity set plays a “role” in the relationship
The labels “course_id” and “prereq_id” are called roles.
Cardinality Constraints
We express cardinality constraints by drawing either a directed line ( ), signifying “one,” or →an undirected line (—), signifying “many,” between the relationship set and the entity set.
One - One
One - Many
Many - Many
Many - One
Participation of Entity Sets in Relationship
Total participation (indicated by double line): Every entity in the entity set participates in at least one relationship in the relationship setE.g., participation of section in sec_course is total, I.e every section must have an associated course
Partial participation: some entities may not participate in any relationship in the relationship setE.g., participation of instructor in advisor is partial
Example for Ternary Relationship
Week Entity Sets (No Primary Key)
An entity set that does not have a primary key is referred to as a weak entity set.
The existence of a weak entity set depends on the existence of a identifying entity set
It must relate to the identifying entity set via a total, one-to-many relationship set from the identifying to the weak entity set
Identifying relationship depicted using a double diamond The discriminator (or partial key) of a weak entity set is the set of attributes
that distinguishes among all the entities of a weak entity set. The primary key of a weak entity set is formed by the primary key of the strong
entity set on which the weak entity set is existence dependent, plus the weak entity set’s discriminator.
We underline the discriminator of a weak entity set with a dashed line. We put the identifying relationship of a weak entity in a double diamond. Primary key for section – (course_id, sec_id, semester, year)
Note: the primary key of the strong entity set is not explicitly stored with the weak entity set, since it is implicit in the identifying relationship.
If course_id were explicitly stored, section could be made a strong entity, but then the relationship between section and course would be duplicated by an implicit relationship defined by the attribute course_id common to course and section
Week Entity Sets (No Primary Key)
E-R Diagram for a University Enterprise
Extended ER Features
Top-down design process; we designate subgroupings within an entity set that are distinctive from other entities in the set.
These subgroupings become lower-level entity sets that have attributes or participate in relationships that do not apply to the higher-level entity set.
Depicted by a triangle component labeled ISA (E.g., instructor “is a” person). Attribute inheritance – a lower-level entity set inherits all the attributes and
relationship participation of the higher-level entity set to which it is linked.
A bottom-up design process – combine a number of entity sets that share the same features into a higher-level entity set.
Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way.
The terms specialization and generalization are used interchangeably.
Specialization
Generalization
Extended ER Features
ER Notations
Entity
Relationship
Relationship against week entity
Total Participation
Attributes :Simple A1Composite A2Multivalued A3Derived A4
Primary Key
Week Entity
ER Notations
Many-Many
One-One
Role Indicator
TotalGeneralization
DisjointGeneralization
ISA : Generalization/Specialization
Cardinality Limits
Many-One
ER Notations
Total Generalization
Generalization
Week Entity
ER Notations
Chen IDE1FX Crow feet Notation
Database Modeling Tools
A number of popular tools that cover conceptual modeling and mapping into relational schema design.
– Examples: ERWin, S- Designer (Enterprise Application Suite), ER- Studio, etc.
POSITIVES: serves as documentation of application requirements, easy user interface - mostly graphics editor support
Problems with Current Modeling ToolsDIAGRAMMING
Poor conceptual meaningful notation.To avoid the problem of layout algorithms and aesthetics of diagrams, they
prefer boxes and lines and do nothing more than represent (primary-foreign key) relationships among resulting tables.
METHODOLGYlack of built-in methodology support. poor tradeoff analysis preferences.poor design verification and suggestions for improvement.
COMPANY TOOL FUNCTIONALITY
Embarcadero Technologies
ER Studio Database Modeling in ER and IDEF1X
DB Artisan Database administration and space and security management
Oracle Developer 2000 and Designer 2000
Database modeling, application development
Popkin Software System Architect 2001 Data modeling, object modeling, process modeling, structured analysis/design
Platinum Technology
Platinum Enterprice Modeling Suite: Erwin, BPWin, Paradigm Plus
Data, process, and business component modeling
Persistence Inc. Pwertier Mapping from O-O to relational model
Rational Rational Rose Modeling in UML and application generation in C++ and JAVA
Rogue Ware RW Metro Mapping from O-O to relational model
Resolution Ltd. Xcase Conceptual modeling up to code maintenance
Sybase Enterprise Application Suite Data modeling, business logic modeling
Visio Visio Enterprise Data modeling, design and reengineering Visual Basic and Visual C++
Automated Database Modeling Tools
Relation Data Model
CSE2004 – Database Management Systems
Relation Data Model
Structure of Relational Databases Database Schema Keys Schema Diagrams Relational Query Languages Relational Operations Exercises
Relational data model is the Simple and Primary data model. Used widely around the world for data storage and processing. It has all the properties and capabilities required to process data with storage
efficiency.
Introduction Relation Data Model
attributes(or columns)
tuples(or rows)
Example of a RelationExample of a Relation
Relation Data Model Concepts
Tables: Relations are saved in the format of Tables.This format stores the relation among entities. A table has rows and columnswhere rows represent records and columns represent the attributes.
Tuple: A single row of a table, which contains a single record for thatrelation is called a tuple.
Relation instance: A finite set of tuples in the relational database system They do not have duplicate tuples.
Relation schema: It describes the relation name (table name), attributes
Relation key: Each row has one or more attributes, known as relation key Identify the row in the relation (table) uniquely.
Attribute domain: Every attribute has some predefined value scope
Relation Data Model Attribute Types
The set of allowed values for each attribute is called the domain of the attribute
Attribute values are (normally) required to be atomic; that is, indivisible The special value null(Unknown) is a member of every domain. The null value causes complications in the definition of many operations
Let us consider • A1, A2, …, An are attributes• R = (A1, A2, …, An ) is a relation schema
Example: instructor = (ID, name, dept_name, salary)Formally, given sets D1, D2, …. Dn a relation r is a subset of D1 x D2 x … Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai Di
Relation Data Model Concepts
Constraints : Every relation has some conditions that must hold for it to be a validrelation (Relational Integrity Constraints).
There are three main integrity constraints: Key constraints (Entity Constraints) Domain constraints Referential integrity constraints
Relation Data Model Concepts
Key Constraints
There must be at least one minimal subset of attributes in the relation, which can identify a tuple uniquely.
This minimal subset of attributes is called key for that relation.
If there are more than one such minimal subsets, these are called candidate keys.
Key constraints force that:
in a relation with a key attribute, no two tuples can have identical values for key attributes.
a key attribute cannot have NULL values.
Key constraints are also referred to as Entity Constraints.
Relational Data Model
Domain Constraints
Attributes have specific values in real-world scenario. For Eg., age can only be a positive integer. The same constraints have been tried to employ on the attributes of a relation.
Every attribute is bound to have a specific range of values. For Eg., age cannot be less than zero and telephone numbers cannot contain a digit outside 0-9.
Referential Integrity Constraints
It works on the concept of Foreign Keys. A foreign key is a key attribute of a relation that can be referred in other relation.
It states that if a relation refers to a key attribute of a different or same relation, then that key element must exist.
Relational Data Model
IDcourse_idsec_idsemesteryeargrad
IDnamedept_nametot_cred
buildingroom_nocapacity
course_idsec_idsemesteryearbuildingroom_notime_slot_id
s_idi_id
IDcourse_idsec_idsemesteryear
takes
section
classroom
teaches
prereqcourse_idprereq_id
course_id
titledept_namecredits
course
student
dept_name
buildingbudget
department
instructorIDnamedept_namesalary
advisor
time_slottime_slot_iddaystart_timeend_time
Schema Diagram for University Database
Relational Query Languages
Relational database systems comes with a query language, which assists users to query the database instances.
There are two kinds of query languages:
Relational algebra and
Relational calculus.
Relational Algebra
Relational Algebra : It is a procedural query language, which takes instances of relations as
input and yields instances of relations as output. It uses operators to perform queries. An operator can be either unary or binary. They accept relations as their input and yield relations as their output. Relational algebra is performed recursively on a relation and
intermediate results are also considered relations.
The fundamental operations of relational algebra are as follows: Select Project Union Set different Cartesian product Rename
Relational Algebra
Select Operation ( )σ It selects tuples that satisfy the given predicate from a relation.
Notation: σp(r)Where σ stands for selection predicate and r stands for relation, p is predicate logic formula which may use connectors like and, or, and not.These terms may use relational operators like: =, ≠, ≥, <, >, ≤.
For example: σsubject="database"(Books) :
Selects tuples from Books where subject = 'database'.
σsubject="database" and price="450"(Books) :
Selects tuples from Books where subject is 'database' and 'price' is 450.
σsubject="database" and price < "450" or year > "2010"(Books) :
Selects tuples from Books where subject is 'database' and 'price' is 450 or thosebooks published after 2010.
Relational Algebra
Project Operation ( ) : ∏ It projects column(s) that satisfy a given predicate.Notation: ∏A1, A2,..An(r)
Where A1,A2,....An are attribute names of relation r.
Duplicate rows are automatically eliminated, as relation is a set.
For example: ∏subject,author(Books)
selects and projects columns named as subject and author from the relationBooks.
A=B & D > 5 (r)
Relation r Relation r
A,C (r)
Relational Algebra : (Selection/Projection)
Relational Algebra
Union Operation ( )∪
It performs binary union between two given relations and is defined as:r s = { t | t r or t s}∪ ∈ ∈
Notion: r U sWhere r and s are either database relations or relation result set.(temporary relation).
For a union operation to be valid, the following conditions must hold: r and s must have the same number of attributes. Attribute domains must be compatible. Duplicate tuples are automatically eliminated.
For example: ∏author(Books) ∪ ∏author(Articles)
Projects the names of the authors who have either written a Book or an Articlesor both.
Relational Algebra
Set Difference (−)The result of set difference operation is tuples, which are present in one relation but are not in the second relation.
Notation: r − sFinds all the tuples that are present in ‘r’ but not in ‘s’.
For example: ∏author(Books) - ∏author(Articles)
Provides the name of authors who have written books but not articles.
Relational Algebra
Rename Operation ( )ρ
The results of relational algebra are also relations but without any name. The rename operation allows us to rename the output relation. ‘rename’ operation is denoted with small Greek letter rho .ρ
Notation: ρM(E)Where the result of expression E is saved with name of M.
Additional operations are: Set intersection Assignment Natural join
Relational Algebra
Intersection (∩)The Intersection operation defines a relation consisting of the set of all
tuples that are in both R and S. R and S must be union-compatible.
Notation: r ∩ sWhere r and s are relations and their output will be defined as:r ∩ s = { q | q r and q s}∈ ∈
For example: Πcity(Branch) ∩ Πcity(PropertyForRent)
List all cities where there is both a branch office and at least one property forrent
Relational Algebra
Relations r, s:
r s
r s
Relations r, s:
Relations r, s:
r s
Note: r s = r – (r – s)
Relational Algebra
Cartesian Product ( )Χ
Combines information of two different relations into one.Notation: r sΧ
Where r and s are relations and their output will be defined as:r s = { q t | q r and t s}Χ ∈ ∈
For example: ∏author = 'tutorialspoint (Books Articles)Χ
Yields a relation, which shows all the books and articles written by “tutorialspoint”.
Relations r, s:
r x s
r.B r.C
r x s
Relations r, s:
Relational Algebra
Relational Algebra
Join Operation ⋈ It is a combinations of the Cartesian product that satisfy certain conditions It Combines two relations to form new relation. It is a derivative of Cartesian product, equivalent to performing a
Selection operation, using the join predicate as the selection formula, overthe Cartesian product of the two operand relations
It is one of the most difficult operations to implement efficiently in an RDBMS
There are various forms of Join operation, Theta join Equijoin (a particular type of Theta join) Natural join Outer join Semijoin.
Other Relational operators areDivision
Aggregate
Grouping
Dr Edgar F. Codd Rules :
Rule 1: Information Rule
The data stored in a database, may it be user data or metadata, must be avalue of some table cell, Database must be stored in a table format.
Rule 2: Guaranteed Access Rule
Every single data element (value) is guaranteed to be accessible logicallywith a combination of table-name, primary-key (row value), and attribute-name (column value).
Rule 3: Systematic Treatment of NULL Values (important rule)
The NULL values in a database must be given a systematic and uniformtreatment, because a NULL can be interpreted as data is missing/notknown/not applicable.
Dr Edgar F. Codd Rules :
Rule 4: Active Online Catalog
The structure of the entire database must be stored in an online catalog,(data dictionary), which can be accessed by authorized users.Users can use the same query language to access the catalog.
Rule 5: Comprehensive Data Sub-Language Rule
A database can only be accessed using a language having linear syntax thatsupports data definition, data manipulation, and transaction managementoperations. This language can be used directly or by means of someapplication.
Rule 6: View Updating Rule
All the views of a database, which can theoretically be updated, must alsobe updatable by the system.
Dr Edgar F. Codd Rules :
Rule 7: High-Level Insert, Update, and Delete Rule
A database must support high-level insertion, updation, and deletion. Thismust not be limited to a single row, I.e, it must also support union,intersection and minus operations to yield sets of data records.
Rule 8: Physical Data Independence
The data stored in a database must be independent of the applications thataccess the database. Any change in the physical structure of a databasemust not have any impact on how the data is being accessed by applications.
Rule 9: Logical Data Independence
The logical data in a database must be independent of its user’s view(application). Any change in logical data must not affect the applications using it.
Dr Edgar F. Codd Rules :
Rule 10: Integrity Independence
A database must be independent of the application that uses it. All itsintegrity constraints can be independently modified without the need of anychange in the application.
Rule 11: Distribution Independence
The end-user must not be able to see that the data is distributed overvarious locations. Users should always get the impression that the data islocated at one site only.
Rule 12: Non-Subversion Rule
If a system has an interface that provides access to low-level records, thenthe interface must not be able to subvert the system and bypass securityand integrity constraints.
Mapping
ER Data Model to Relational Data Model
CSE2004 – Database Management Systems
ER Model, when conceptualized into diagrams, gives a good overview of entity-relationship, which is easier to understand.
ER diagrams can be mapped to Relational schema.I.e It is possible to create relational schema using ER diagram. We cannot import all the ER constraints into relational model, but an
approximate schema can be generated.
There are several processes and algorithms available to convert ER Diagramsinto Relational Schema (automated/manual).
We may focus here on the mapping diagram contents to relational basics.
ER diagrams mainly comprise of: Entity and its attributes Relationship, which is association among entities
Mapping ER to Relational
Mapping Entity: An entity is a real-world object with some attributes.
Mapping Process (Algorithm) Create table for each entity. Entity's attributes should become fields of tables with their respective data types. Declare primary key.
Mapping ER to Relational
Mapping Relationships : A relationship is an association among entities.
Mapping Process (Algorithm) Create table for a relationship. Add the primary keys of all participating Entities as fields of table with their
respective data types. If relationship has any attribute, add each attribute as field of table. Declare a primary key composing all the primary keys of participating entities. Declare all foreign key constraints
Mapping ER to Relational
Mapping Hierarchical Entities :
ER specialization or generalization comes in the form of hierarchical entity sets.
Mapping Process (Algorithm) Create tables for all higher-level entities. Create tables for lower-level entities. Add primary keys of higher-level entities in the table of lower-level entities. In lower-level tables, add all other attributes of lower-level entities. Declare primary key of higher-level table and the primary key for lower- level table.ts. Declare foreign key constraints.
Mapping ER to Relational
Converting Strong entity types
• Each entity type becomes a table
• Each single-valued attribute becomes a column
• Derived attributes are ignored
• Composite attributes are represented by components
• Multi-valued attributes are represented by a separate table
• The key attribute of the entity type becomes the primary key of the table
Entity example
• Here address is a composite attribute
• Years of service is a derived attribute (can be calculated from date of joining and current date)
• Skill set is a multi-valued attribute
• The relational Schema
Employee (E#, Name, Door_No, Street, City, Pincode, Date_Of_Joining)
Emp_Skillset( E#, Skillset)
Entity Example (Contd…)
SkillSet
EmpCode FK
Skills
Employee Table
EmpCode PK
EmpName
DateofJoining
SkillSet
Converting weak entity types
• Weak entity types are converted into a table of their own, with the primary key of the strong entity acting as a foreign key in the table
• This foreign key along with the key of the weak entity form the composite primary key of this table
• The Relational Schema
Employee (E# ,…….)
Dependant (Employee, Dependant_ID, Name, Address)
Converting weak entity types (Contd…)
Dependent
EmpCode PK /FK
Dependent_ID PK
Name
Address
Employee Table
EmpCode PK
EmpName
DateofJoining
SkillSet
Converting relationships
• The way relationships are represented depends on the cardinality and the degree of the relationship
• The possible cardinalities are: 1:1, 1:M, N:M
• The degrees are:UnaryBinaryTernary …
Binary 1:1
• Case 1: Combination of participation types
The primary key of the partial participant will become the foreign key of the total participant
Employee( E#, Name,…)
Department (Dept#, Name…,MgrE#)
departmentEmployee Manages1 1
partial
Total
Binary 1 : 1
Department
DeptCode PK
DeptName
Location
MgrEmpCode FK
Employee Table
EmpCode PK
EmpName
DateofJoining
SkillSet
Binary 1:1
• Case 2: Uniform participation types
The primary key of either of the participants can become a foreign key in the other
Employee (E#,name…)
Chair( item#, model, location, used_by)
(or)Employee ( E#, Name….Sits_on)
Chair (item#,….)
Employee CHAIRSits_on
Binary 1 : 1
Chair
ItemNo PK
Model
Location
Used_By FK
Employee Table
EmpCode PK
EmpName
DateofJoining
SkillSet
ChairItemNo PKModelLocation
Employee TableEmpCode PKEmpNameDateofJoiningSkillSetSits_On FK
Binary 1:N
The primary key of the relation on the “1” side of the relationship becomes a foreign key in the relation on the “N” side
Teacher (ID, Name, Telephone, ...)
Subject (Code, Name, ..., Teacher)
Teacher teaches Subject1 N
Binary 1 : N
Subject
SubCode PK
SubName
Duration
TeacherID FK
Teacher
TeacherID PK
Name
Telephone
Cabin
Binary M:N
• A new table is created to represent the relationship
• Contains two foreign keys - one from each of the participants in the relationship
• The primary key of the new table is the combination of the two foreign keys
Student (Sid#,Title…) Course(C#,CName,…)
Enrolls (Sid#, C#)
Student Enrolls CourseM N
Binary M : N
Course
CourseID PK
Coursename
Student
StudentID PK
StudentName
DOB
Address
Enrolls
StudentCode PK / FK
CourseID PK / FK
DOIssue
Status
top related