characteristics of database management system · web view2017. 10. 7. · 1. stores any kind of...
Introduction:
Databases and database technology are having a major impact on the growing use of computers. It is fair to
say that databases play a critical role in almost all areas where computers are used, including business,
electronic commerce, engineering, medicine, law, education, and library science.
A database is a collection of related data. For example, consider the names, telephone
numbers, and addresses of the people you know. You may have recorded this data in an indexed address
book, or you may have stored it on a hard drive, using a personal computer and software such as Microsoft
Access, or Excel. This is a collection of related data with an implicit meaning and hence is a database.
Definition: A database management system (DBMS) is a collection of programs that enables
users to create and maintain a database. The DBMS is hence a general-purpose software
system that facilitates the processes of defining, constructing, manipulating, and sharing
databases among various users and applications.
1. Defining a database involves specifying the data types, structures, and constraints for the data to be
stored in the database.
2. Constructing the database is the process of storing the data itself on some storage medium that is
controlled by the DBMS.
3. Manipulating a database includes such functions as querying the database to retrieve specific data,
updating the database to reflect changes in the miniworld, and generating reports from the data.
4. Sharing a database allows multiple users and programs to access the database concurrently.
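The first three activities can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for a general-purpose DBMS, and the person table, its columns, and the sample rows are hypothetical, chosen to echo the address-book example above.

```python
import sqlite3

# An in-memory SQLite database stands in for a general-purpose DBMS.
conn = sqlite3.connect(":memory:")

# 1. Defining: specify the data types, structure, and constraints.
conn.execute(
    """
    CREATE TABLE person (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        phone   TEXT,
        address TEXT
    )
    """
)

# 2. Constructing: store the data itself under the control of the DBMS.
people = [
    ("Asha", "555-0101", "12 Lake Road"),
    ("Ravi", "555-0102", "7 Hill Street"),
]
conn.executemany(
    "INSERT INTO person (name, phone, address) VALUES (?, ?, ?)", people
)
conn.commit()

# 3. Manipulating: update the data to reflect a change, then query it.
conn.execute("UPDATE person SET phone = ? WHERE name = ?", ("555-0199", "Ravi"))
conn.commit()
rows = conn.execute("SELECT name, phone FROM person ORDER BY name").fetchall()
print(rows)
```

Sharing (activity 4) is not shown here; it would involve several programs or users connecting to the same stored database at once.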
NBKRIST ADBMS I M.TECH CSE I SEM Page 1
Characteristics of Database Management System
1. Stores any kind of data
A database management system should be able to store any kind of data. It should not be restricted to the
employee name, salary and address. Any kind of data that exists in the real world can be stored in DBMS
because we need to work with all kinds of data that is present around us.
2. Support ACID Properties
A DBMS should support the ACID (Atomicity, Consistency, Isolation, and Durability) properties. These
properties ensure that the meaning of the data is not lost while performing transactions such as delete,
insert, and update. For example, if an employee name is updated, the DBMS should make sure that there is
no duplicate data and no mismatch in the employee's information.
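As one illustration of these transaction guarantees, the sketch below shows all-or-nothing behavior using Python's sqlite3 module. The account table, the transfer function, and the amounts are hypothetical; the only library behavior relied on is that using a sqlite3 connection as a context manager commits on success and rolls back when an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single transaction (hypothetical helper)."""
    # `with conn:` commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )


try:
    # The credit to B succeeds, but the debit would overdraw A and violates
    # the CHECK constraint, so the whole transaction is rolled back.
    transfer(conn, "A", "B", 500)
except sqlite3.IntegrityError:
    pass

balances_after_failed_transfer = dict(
    conn.execute("SELECT name, balance FROM account")
)
print(balances_after_failed_transfer)
```

Neither half of the failed transfer is visible afterwards, which is the behavior the transaction properties are meant to guarantee.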
3. Represents complex relationship between data
Data items stored in a database are connected to one another through relationships. A DBMS should be able
to represent complex relationships between data so that the data can be used efficiently and accurately.
4. Backup and recovery
A database can fail in many ways. If no copy exists when that happens, the database cannot be recovered
and the company will suffer a big loss. The only solution is to take backups of the database so that it can
be restored whenever needed. All databases must have this characteristic.
5. Structured and described data
A database should contain not only the data but also the structures and definitions of that data. These
descriptions make the data self-describing: they include the structure, types, and format of the data and
the relationships between them.
6. Data integrity
This is one of the most important characteristics of a database management system. Integrity ensures the
quality and reliability of the database system. It protects the database against unauthorized access and
makes it more secure. It admits only consistent and accurate data into the database.
7. Concurrent use of database
It is likely that many users will access the data at the same time, and they may need to alter the
database concurrently. The DBMS allows them to use the database concurrently without interfering with
one another.
Advantages of DBMS
Controlling Data Redundancy
In non-database systems each application program has its own private files. In this case, the duplicated
copies of the same data are created in many places. In DBMS, all data of an organization is integrated into a
single database file. The data is recorded in only one place in the database and it is not duplicated.
Sharing of Data
In DBMS, data can be shared by authorized users of the organization. The database administrator manages
the data and gives rights to users to access the data. Many users can be authorized to access the same piece
of information simultaneously. Remote users can also share the same data. Similarly, the data of the same
database can be shared between different application programs.
Data Consistency
By controlling the data redundancy, the data consistency is obtained. If a data item appears only once, any
update to its value has to be performed only once and the updated value is immediately available to all users.
If the DBMS has controlled redundancy, the database system enforces consistency.
Integration of Data
In a database management system, data is stored in tables. A single database contains multiple
tables, and relationships can be created between tables (or associated data entities). This makes it easy to
retrieve and update data.
Integrity Constraints
Integrity constraints or consistency rules can be applied to a database so that only correct data can be
entered into it. The constraints may be applied to a data item within a single record, or they may be
applied to relationships between records.
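The two kinds of constraint can be sketched with Python's sqlite3 module: a CHECK constraint on a data item within one record, and a foreign key between records. The department and employee tables and the try_insert helper are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other DBMSs behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
conn.execute(
    """
    CREATE TABLE employee (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  INTEGER NOT NULL CHECK (salary > 0),           -- within one record
        dept_id INTEGER NOT NULL REFERENCES department(id)     -- between records
    )
    """
)
conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.commit()


def try_insert(row):
    """Attempt one insert and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO employee (name, salary, dept_id) VALUES (?, ?, ?)", row
            )
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert(("Asha", 50000, 1)),    # satisfies every constraint
    try_insert(("Ravi", -10, 1)),      # violates the CHECK on salary
    try_insert(("Meena", 40000, 99)),  # refers to a department that does not exist
]
for outcome in results:
    print(outcome)
```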
Creating Forms
A form is a very important object of a DBMS. You can create forms very easily and quickly in a DBMS. Once a
form is created, it can be used many times and it can be modified very easily. The created forms are also
saved along with the database and behave like a software component. A form provides a very easy
(user-friendly) way to enter data into the database, edit data, and display data from the database.
Non-technical users can also perform various operations on the database through forms without going into
the technical details of a database.
Report Writers
Most DBMSs provide report writer tools for creating reports. Users can create reports very easily
and quickly. Once a report is created, it can be used many times and it can be modified very easily. The
created reports are also saved along with the database and behave like a software component.
Control over Concurrency
In a computer file-based system, if two users are allowed to access data simultaneously, it is possible that
they will interfere with each other. For example, if both users attempt to perform update operation on the
same record, then one may overwrite the values recorded by the other. Most database management systems
have sub-systems to control the concurrency so that transactions are always recorded with accuracy.
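The overwrite problem described above can be reproduced deterministically. The sketch below uses Python's sqlite3 module with a single connection and simulates the interleaving of two users by hand; the stock table and quantities are hypothetical. A real DBMS would use locking or a similar mechanism, so this shows only the effect, not how a concurrency sub-system is implemented.

```python
import sqlite3


def fresh_db():
    """Create a small database with one stock record (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
    conn.execute("INSERT INTO stock VALUES ('pen', 100)")
    conn.commit()
    return conn


def read_qty(conn):
    return conn.execute("SELECT qty FROM stock WHERE item = 'pen'").fetchall()[0][0]


# Uncontrolled access: both users read the same value, then each writes back
# a result computed from that stale copy.
conn = fresh_db()
seen_by_user1 = read_qty(conn)
seen_by_user2 = read_qty(conn)
conn.execute("UPDATE stock SET qty = ? WHERE item = 'pen'", (seen_by_user1 - 30,))
conn.execute("UPDATE stock SET qty = ? WHERE item = 'pen'", (seen_by_user2 - 20,))
conn.commit()
lost_update_result = read_qty(conn)  # user 1's sale of 30 has been overwritten

# Controlled access: each change is applied inside the DBMS as one atomic
# statement in its own transaction, so neither sale is lost.
conn = fresh_db()
for sold in (30, 20):
    with conn:
        conn.execute("UPDATE stock SET qty = qty - ? WHERE item = 'pen'", (sold,))
controlled_result = read_qty(conn)

print(lost_update_result, controlled_result)
```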
Backup and Recovery Procedures
In a computer file-based system, the user creates backups of data regularly to protect valuable data
from damage due to failures of the computer system or application program. This is a very time-consuming
method if the amount of data is large. Most DBMSs provide 'backup and recovery' sub-systems that
automatically create backups of data and restore data if required.
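As a small illustration of a built-in backup facility, Python's sqlite3 module exposes Connection.backup, which copies a whole database to another connection. The invoice table and the simulated failure below are hypothetical, and in-memory databases are used only to keep the sketch self-contained; a real backup would go to a separate file or device.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, total INTEGER)")
live.executemany("INSERT INTO invoice (total) VALUES (?)", [(120,), (75,)])
live.commit()

# Backup: copy the entire live database into a second database.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulated failure: the live data is wiped out.
live.execute("DELETE FROM invoice")
live.commit()
rows_after_failure = live.execute("SELECT COUNT(*) FROM invoice").fetchall()[0][0]

# Recovery: rebuild a working database from the backup copy.
recovered = sqlite3.connect(":memory:")
backup_copy.backup(recovered)
rows_after_recovery = recovered.execute(
    "SELECT total FROM invoice ORDER BY id"
).fetchall()

print(rows_after_failure, rows_after_recovery)
```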
Data Independence
The separation of data structure of database from the application program that uses the data is called data
independence. In DBMS, you can easily change the structure of database without modifying the application
program.
Disadvantages of Database Management System (DBMS):
Although a DBMS has many advantages, it also has some disadvantages. These are:
1. Cost of Hardware & Software:
A processor with high data processing speed and a large memory are required to run the DBMS
software. This means that you have to upgrade the hardware used for the file-based system. Similarly, DBMS
software is also very costly.
2. Cost of Data Conversion:
When a computer file-based system is replaced with a database system, the data stored in data files must be
converted to database files. Converting data files into a database is difficult and time consuming.
You have to hire a DBA (or database designer) and a system designer along with application
programmers; alternatively, you have to use the services of a software house. So a lot of money has to
be paid for developing the database and related software.
3. Cost of Staff Training:
Most DBMSs are complex systems, so users need training to use them. Training is
required at all levels, including programming, application development, and database administration. The
organization has to spend a large amount on training staff to run the DBMS.
4. Appointing Technical Staff:
Trained technical staff, such as database administrators and application programmers, are required
to handle the DBMS. These staff command high salaries, so the system cost
increases.
5. Database Failures:
In most organizations, all data is integrated into a single database. If the database is corrupted due to
a power failure or a fault in the storage media, valuable data may be lost or the whole system
may stop.
Database Management System Vs. File Management System
A Database Management System (DMS) is a combination of computer software, hardware, and information
designed to electronically manipulate data via computer processing. Two types of DMS are
DBMSs and FMSs. In simple terms, a File Management System (FMS) is a DMS that allows access to
a single file or table at a time. FMSs accommodate flat files that
have no relation to other files. The FMS was the predecessor of the Database Management System
(DBMS), which allows access to multiple files or tables at a time (see Figure 1 below).
UNIT - I
Database System Concepts: Data Models, Schemas and Instances, Three-Schema Architecture, Database Languages and Interfaces, The Database System Environment, Centralized and Client/Server Architectures for DBMSs, Classification of DBMS.
Data Modeling Using ER Model: ER Diagrams, Naming Conventions, and Design Issues.
EER Model: Subclasses, Super Classes, and Inheritance, Data Abstraction, Knowledge Representation and Ontology Concepts.
Data Models, Schemas and Instances
Data Model: A collection of concepts that can be used to describe the structure of a database
(the data elements, their data types, and the relationships among them).
[Or]
A data model defines the data structures, operations, and constraints of the database.
Example data models are the relational data model and the entity-relationship (ER) model for defining
entities and relationships.
Example operations are retrievals and updates of the database.
An example constraint is that to take a course, you need to be a student.
1. Categories of data models:
Many data models have been proposed, which we can categorize according to the types of
concepts they use to describe the database structure.
a. Physical (low-level, internal) data models:
• Provide concepts that describe details of how data is stored in the computer as files by showing record
formats, orderings, and access paths such as indexing structures for data retrieval.
b. Implementation (representational) data models:
• Provide concepts that fall between high-level conceptual models and low-level physical models, and are
used by many commercial DBMS implementations (e.g. the relational data model).
c. Self‐Describing Data Models:
• Combine the description of data with the data values. Examples include XML, key‐value stores and some
NOSQL systems. Used to cater for non‐traditional data types like posts, long character blobs, and web data.
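The idea of combining the description of data with the data values can be sketched in Python with the standard json module. The field names and values are hypothetical; the point is only that each value in the document travels with its own field name, whereas the tuple relies on a separately declared list of columns.

```python
import json

# Schema-first style: the structure is declared separately from the values.
columns = ("name", "city")
row = ("Asha", "Nellore")

# Self-describing style: every value carries its own field name, and a record
# can hold fields (such as a list of posts) that other records do not have.
document = {"name": "Asha", "city": "Nellore", "posts": ["hello", "first day"]}

encoded = json.dumps(document)   # the description travels with the data
decoded = json.loads(encoded)    # a reader needs no separate schema to interpret it
fields = sorted(decoded)

print(encoded)
print(fields)
```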
2. Schemas, Instances and Database State
The description of a database is called the database schema. This is specified during database design and
is not expected to change frequently. The fig: shows a schema diagram for the data base shown in fig:
Database state:
The data in the database at a particular moment in time is called a database state or snapshot.
The distinction between database schema and database state is very important. When we define a new
database, we specify its database schema only to the DBMS. At this point, the corresponding database state
is the empty state with no data.
Initial database state:
Refers to the database state when it is initially loaded into the system.
Valid state:
A state that satisfies the structure and constraints of the database.
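The difference between schema and state can be made concrete with Python's sqlite3 module. In this sketch the student table and its rows are hypothetical; sqlite_master is SQLite's own catalog table, which stores the schema description. The schema is the same before and after loading data, while the state changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining the database gives the DBMS a schema only.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema(conn):
    """Names of the tables recorded in SQLite's catalog."""
    return conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()


def state(conn):
    """The data currently stored: the database state (snapshot)."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()


schema_before = schema(conn)
empty_state = state(conn)          # the empty state: schema exists, no data yet

conn.execute("INSERT INTO student (name) VALUES ('Smith')")
conn.execute("INSERT INTO student (name) VALUES ('Brown')")
conn.commit()

schema_after = schema(conn)        # unchanged by loading data
later_state = state(conn)          # a new database state

print(schema_before, empty_state)
print(schema_after, later_state)
```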
Three-Schema Architecture and Data Independence
1) Three-Schema Architecture
A DBMS uses the three-schema architecture to achieve program-data independence, support multiple user views,
and use a catalog to store the database description (schema). The three levels consist of
a) The external schemas for end users (e.g. external views and different user views of data through
queries) at the external level.
b) The conceptual schemas at the conceptual level representing the whole database for a community of
users using a conceptual or implementation data model such as relational.
c) The internal schemas at the internal level which describe physical storage structures and access paths
of the database (E.g.: Indexes).
2) Data Independence
This is the ability of the DBMS to change the schema at a lower level of DB system without having to
change the schema at the next higher level.
DBMS provides the following two types of data independence.
a) Logical Data Independence
b) Physical Data Independence
a) Logical Data Independence
Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. We may change the conceptual schema to expand the database
(by adding a record type or data item), to change constraints, or to reduce the database (by removing a
record type or data item).
In the last case, external schemas that refer only to the remaining data should not be affected.
For example, the external schema of Figure 1.4a should not be affected by changing the GRADE_REPORT file shown in Figure 1.2 into the one shown in Figure 1.5a.
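Logical data independence can be sketched with a view acting as an external schema. The example below uses Python's sqlite3 module; the student table, the student_names view, and the added email column are hypothetical and are not the figures referred to above. The external query gives the same answer before and after the conceptual schema is expanded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: one table.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO student (name) VALUES ('Smith'), ('Brown')")

# External schema: a view that exposes only what one group of users needs.
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")
conn.commit()

external_query = "SELECT name FROM student_names ORDER BY id"
answer_before = conn.execute(external_query).fetchall()

# Expand the conceptual schema by adding a data item.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
conn.commit()

# The external schema and the query written against it are unchanged.
answer_after = conn.execute(external_query).fetchall()

print(answer_before, answer_after)
```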
b) Physical Data Independence
Physical data independence is the capacity to change the internal schema without having to change the conceptual schema. Hence, the external schemas need not be changed either. Changes to the internal schema may be needed because some physical files had to be reorganized (for example, by creating additional access structures) to improve the performance of retrieval or update. If the same data as before remains in the database, we should not have to change the conceptual schema. For example, providing an access path to improve retrieval speed of SECTION records (Figure 1.2) by Semester and Year should not require a query such as "list all sections offered in fall 1998" to be changed, although the query would be executed more efficiently by the DBMS by utilizing the new access path.
All three levels of abstraction represent data descriptions only; the actual data are stored only at the
physical level. The process of transforming requests and results between levels is called mapping and is
done by the DBMS.
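The access-path example can be tried out with Python's sqlite3 module. In this sketch the section table, its sample rows, and the index name idx_section_term are hypothetical. The query text is identical before and after the index is created; only the plan reported by SQLite's EXPLAIN QUERY PLAN changes, and its exact wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section ("
    " id INTEGER PRIMARY KEY, course TEXT, semester TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO section (course, semester, year) VALUES (?, ?, ?)",
    [("CS1310", "Fall", 1998), ("MATH2410", "Fall", 1998), ("CS3320", "Spring", 1999)],
)
conn.commit()

# "List all sections offered in fall 1998": this text is never edited below.
query = "SELECT course FROM section WHERE semester = ? AND year = ? ORDER BY course"
params = ("Fall", 1998)


def plan(conn):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


result_before = conn.execute(query, params).fetchall()
plan_before = plan(conn)

# Change the internal schema only: add an access path on (semester, year).
conn.execute("CREATE INDEX idx_section_term ON section (semester, year)")
conn.commit()

result_after = conn.execute(query, params).fetchall()
plan_after = plan(conn)

print(result_before == result_after)
print(plan_before)
print(plan_after)
```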
Database Languages and Interfaces
The DBMS must provide appropriate languages and interfaces for each category of users.
a) DBMS Languages
Once the design of a database is completed and a DBMS is chosen to implement the database, the first order of the day is to specify conceptual and internal schemas for the database and any mappings between the two.
1. Data definition language (DDL): In many DBMSs where no strict separation of levels is maintained, this one language is used by the DBA and by database designers to define both schemas. In DBMSs where a clear separation is maintained between the conceptual and internal levels, the DDL is used to specify the conceptual schema only.
2. Storage definition language (SDL): used to specify the internal schema. The mappings between the two schemas may be specified in either one of these languages.
3. View definition language (VDL): used to specify user views and their mappings to the conceptual schema, but in most DBMSs the DDL is used to define both conceptual and external schemas.
Common DDL statements are CREATE, ALTER, and DROP.
Once the database schemas are compiled and the database is populated with data, users must have some means to manipulate the database. Typical manipulations include retrieval, insertion, deletion, and modification of the data. The DBMS provides a set of operations or a language called the data manipulation language (DML).
Common DML statements are SELECT, UPDATE, INSERT INTO and DELETE FROM.
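The statements named above can be seen side by side in a short sketch using Python's sqlite3 module. The course table and its rows are hypothetical, and SQLite is used only as a convenient stand-in; other DBMSs accept the same statement families with minor dialect differences.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that define or change the schema.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")

# DML: statements that insert, modify, delete, and retrieve data.
conn.execute("INSERT INTO course VALUES ('CS101', 'Databases', 4)")
conn.execute("INSERT INTO course VALUES ('CS102', 'Networks', 3)")
conn.execute("UPDATE course SET credits = 5 WHERE code = 'CS101'")
conn.execute("DELETE FROM course WHERE code = 'CS102'")
remaining = conn.execute("SELECT code, title, credits FROM course").fetchall()
conn.commit()

# DDL again: remove the table definition, and its data, from the schema.
conn.execute("DROP TABLE course")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(remaining, tables_left)
```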
b) DBMS Interface
User friendly interfaces provided by a DBMS may include the following:
1. Menu-Based Interfaces for Web Clients or Browsing
These interfaces present the user with lists of options, called menus that lead the user through the
formulation of a request. Pull-down menus are a very popular technique in Web-based user interfaces.
2. Forms-Based Interfaces
A forms-based interface displays a form to each user. Users can fill out all of the form entries to insert new
data, or they fill out only certain entries, in which case the DBMS will retrieve matching data for the
remaining entries.
3. Graphical User Interfaces
A graphical interface (GUI) typically displays a schema to the user in diagrammatic form. The user can
then specify a query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most
GUIs use a pointing device, such as a mouse, to pick certain parts of the displayed schema diagram.
4. Natural Language Interfaces
These interfaces accept requests written in English or some other language and attempt to "understand"
them. A natural language interface usually has its own "schema," which is similar to the database conceptual
schema, as well as a dictionary of important words.
5. Interfaces for Parametric Users
Parametric users, such as bank tellers, often have a small set of operations that they must perform
repeatedly. Systems analysts and programmers design and implement a special interface for each known
class of naïve users.
6. Interfaces for the DBA
Most database systems contain privileged commands that can be used only by the DBA's staff. These
include commands for creating accounts, setting system parameters, granting account authorization,
changing a schema, and reorganizing the storage structures of a database.
7. Mobile Interfaces
Interfaces allowing users to perform transactions using mobile apps
The Database System Environment
A DBMS is a complex software system. In this section we discuss the types of software components that
constitute a DBMS and the types of computer system software with which the DBMS interacts.
1) DBMS Component Modules
Fig 2.3 illustrates, in a simplified form, the typical DBMS components. The top half refers to the various
users of the database system. The lower half shows the internals of the DBMS responsible for storage of
data and processing of transactions.
The database and the DBMS catalog are usually stored on disk. Access to the disk is
controlled primarily by the Operating System (OS),which schedules disk input/output.
Stored Data Manager
It controls access to the DBMS information that is stored on disk. The dotted lines and circles marked
A, B, C, D and E in fig 2.3 illustrate accesses that are under the control of this stored data manager. The
stored data manager may use basic OS services for carrying out low-level data transfer between the disk and
computer main storage.
DDL Compiler
The DDL compiler processes schema definitions.
Run-Time database processor
It handles database access at runtime. It receives retrieval or update operations and carries them out on the
database.
Query compiler
It handles high-level queries that are entered interactively. It parses, analyzes, and compiles or interprets a query by creating database access code, and then generates calls to the runtime processor for executing the code.
Pre-compiler
It extracts DML commands from an application program written in a host programming language. These commands are sent to the DML compiler for compilation into object code for database access. The rest of the program is sent to the host language compiler.
Client program
The client program accesses the DBMS from a computer separate from the one on which the database resides. The former is called the client computer and the latter the database server. In some cases, the client accesses a middle computer, called the application server.
2. Database system utilities
Most DBMSs have database utilities that help the DBA in managing the database system. Common utilities
have the following types of functions:
1. Loading
A loading utility is used to load existing data files-such as text files or sequential files-into the database.
Usually, the current (source) format of the data files and the desired (target) database file structure are
specified to the utility, which then automatically reformats the data and stores it in the database. Includes
data conversion tools.
2. Backup
A backup utility creates a backup copy of the database, usually by dumping the entire database onto tape.
3. File reorganization
This utility can be used to reorganize a database file into a different file organization to improve performance.
4. Performance monitoring
Such a utility monitors database usage and provides statistics to the DBA.
Other utilities may be available for sorting files, handling data compression, monitoring access by users, interfacing with the network, and performing other functions.
3. Tools, Application Environments, and Communications Facilities
Other tools provided by the DB system include:
a) Data dictionary/repository
Used to store schema descriptions and other information such as design decisions, application program
descriptions, user information, usage standards, etc.
b) Application Development Environments and CASE (Computer-aided software engineering) tools like
Forms, JBuilder, JDeveloper, etc.
Centralized and Client/Server Architectures for DBMSs
1. Centralized DBMSs Architecture
All DBMS functionality, application program execution, and user interface processing are carried out on one
machine.
2. Basic Client/Server Architectures
Specialized servers with specialized functions provide database query and transaction services to the clients.
Example servers are:
Print server: All print requests by the clients are forwarded to this machine.
File server: Maintains the files of the client machines.
DBMS server
Web server
Email server
The client machines provide the user with the appropriate interfaces to utilize these servers, as well as with
local processing power to run local applications.
DBMS Example
Server handles query, update and transaction functionality
Client handles user interface programs and application programs
The fig 2.5 illustrates client/server architecture at the logical level and fig 2.6 is a simplified diagram that
shows how the physical architecture would look.
Client
User machine that provides user interface capabilities and local processing.
Server
System containing both hardware and software. It provides services to the client machines, such as file
access, printing, archiving, or database access.
3. Two-Tier Client/Server Architectures for DBMS
Server handles
Query and transaction functionality related to SQL processing.
Client handles
User interface programs and application programs.
Open Database Connectivity (ODBC)
Provides an application programming interface (API) that allows client-side programs to call the DBMS, as
long as both client and server machines have the necessary software installed.
JDBC
Allows Java client programs to access one or more DBMSs through a standard interface.
4. Three-Tier and n-Tier Architectures for Web Applications
Application server or web server
Adds intermediate layer between client and the database server. Runs application programs and stores
business rules.
N-Tier
Divide the layers between the user and the stored data further into finer components.
Classification of Database Management Systems
The classification and types of Database Management Systems (DBMS) are explained in detail below, based on
the following factors.
1. Based on the data model
2. Based on the number of users
3. Based on the sites over which network is distributed
4. Based on the cost
5. Based on the access
6. Based on the usage
1. Based on the data model
Relational database – This is the most popular data model used in industry. It is typically queried using
SQL. Relational databases are table oriented, which means data is stored in tables, each of which has a key
field whose task is to identify each row. The tables (files) that hold the data are called relations, rows
are called tuples or records, and columns are called attributes or fields. A few examples are
MySQL (Oracle, open source), Oracle Database (Oracle), Microsoft SQL Server (Microsoft) and DB2 (IBM).
Object oriented database
The information here is in the form of objects, as used in object-oriented programming. It adds
database functionality to object programming languages. It requires less code, uses more natural data
modeling, and makes code bases easier to maintain. An example is ObjectDB (ObjectDB Software).
Object relational database
Relational DBMS are evolving continuously and they have been incorporating many concepts developed in
object database leading to a new class called extended relational database or object relational database.
Hierarchical database
In this model, information about parent/child relationships between groups of records is held in the
records themselves, giving a structure similar to a tree. The data is organized as a series of records, each
with a set of values attached to it. Hierarchical databases are used in industry on mainframe platforms.
Examples are IMS (IBM) and the Windows registry (Microsoft).
Network database
Mainly used on large digital computers. This model is efficient when records have many connections. Network
databases are similar to hierarchical databases, but they look like a cobweb or interconnected network of
records. Examples are CA-IDMS (Computer Associates) and IMAGE (HP).
2. Based on the number of users
Single user
As the name itself indicates it can support only one user at a time. It is mostly used with the personal
computer on which the data resides accessible to a single person. The user may design, maintain and write
the database programs.
Multiple users
It supports multiple users concurrently. Data can be both integrated and shared. A database is
integrated when the same information need not be recorded in two places. For example, a college should
keep a student's information in one database that is accessible to all the departments that deal with the
student, such as the library and the fee section. Even though the integrated database resides in only one
place, both departments have access to it.
3. Based on the sites over which network is distributed
Centralized database system – The DBMS and database are stored at a single site that is used by several
other systems too. We can simply say that the data is maintained on a centralized server.
Parallel network database system
This system improves processing and input/output speeds. It is mainly used in
applications that query large databases. It uses multiple central processing units and data storage
disks in parallel.
Distributed database system
In this system, the data and the DBMS software are distributed over several sites that are connected by a
computer network.
4. Based on the cost
Low cost DBMS
The cost of these systems varies from $100 to $3000.
Medium cost DBMS
Cost varies from $10000 to $100000.
High cost DBMS
Costs of these systems are usually more than $100000.
5. Based on the access
This classification is based simply on how data in the database system is accessed.
Sequential access
One after the other.
Direct access
Inverted file structures
6. Based on the usage
Online transaction processing (OLTP) DBMS
They manage operational data. The database server must be able to process many simple transactions per
unit of time. Transactions are initiated in real time, simultaneously, by many users and applications, so
the workload consists of a high volume of short, simple queries.
Online analytical processing (OLAP) DBMS
They use operational data for tactical and strategic decision making. They serve a limited number of
users who deal with huge amounts of data and complex queries.
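The contrast between the two workloads can be sketched on a toy scale with Python's sqlite3 module. The sale table and its figures are hypothetical, and a single SQLite database is used for both styles purely for illustration; in practice OLTP and OLAP systems are usually separate and far larger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale ("
    " id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount INTEGER NOT NULL)"
)

# OLTP style: many short transactions, each touching a single row.
incoming_sales = [("north", 40), ("south", 25), ("north", 60), ("south", 15)]
for region, amount in incoming_sales:
    with conn:  # one small transaction per sale
        conn.execute(
            "INSERT INTO sale (region, amount) VALUES (?, ?)", (region, amount)
        )

# OLAP style: one query that reads many rows and summarizes them
# to support a decision.
totals = conn.execute(
    "SELECT region, SUM(amount), COUNT(*) FROM sale GROUP BY region ORDER BY region"
).fetchall()

print(totals)
```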
Big data and analytics DBMS
To cope with big data new database technologies have been introduced. One such is NoSQL (not only SQL)
which abandons the well-known relational database scheme.
XML DBMS – two types
1. Native XML DBMS – Use the logical, intrinsic structure of XML document.
2. Enabled XML DBMS – Existing DBMS with facilities to store XML data and structured data in
integrated way.
Multimedia DBMS – Stores data such as text, images, audio, video and 3D games, which are usually stored
as binary large objects (BLOBs).
GIS DBMS – Stores and queries spatial data.
Sensor DBMS – Manages sensor data, biometric data, and telematics data.
Mobile DBMS – Runs on smartphones and tablets. It handles local queries and supports self-management
(no DBA).
Open source DBMS – Code is publicly available and can be extended by anyone, popular for small
business applications.
http://www.studytonight.com/dbms/er-diagram.php
Data Modeling Using ER Model:
ER Diagrams, Naming Conventions, and Design Issues
1. Summary of notation for ER Diagrams
What is an Entity Relationship Diagram (ERD)?
An entity relationship diagram (ERD) shows the relationships of entity sets stored in a database. An entity
relationship diagram looks very much like a flowchart. It is the specialized symbols, and the meanings of
those symbols, that make it unique.
Common Entity Relationship Diagram symbols
An ER diagram is a means of visualizing how the information a system produces is related. There are five
main components of an ERD:
a) Entities
Entities are represented by rectangles. An entity is an object or concept about which you want to store
information.
A weak entity is an entity that must be defined by a foreign key relationship with another entity, as it
cannot be uniquely identified by its own attributes alone.
b) Actions
Actions, which are represented by diamond shapes, show how two entities share information in the database.
In some cases, entities can be self-linked. For example, employees can supervise other employees.
c) Attributes
Attributes are represented by ovals. A key attribute is the unique, distinguishing characteristic of the entity.
For example, an employee's social security number might be the employee's key attribute.
A multivalued attribute can have more than one value. For example, an employee entity can have multiple
skill values.
A derived attribute is based on another attribute. For example, an employee's monthly salary is based on the
employee's annual salary.
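The three kinds of attribute can be mirrored in a small Python data class. This is only an analogy to make the terms concrete: the Employee class, its field names, and the sample values are hypothetical, and an ER diagram does not prescribe any particular implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Employee:
    ssn: str                      # key attribute: unique for each employee
    name: str
    annual_salary: float
    skills: List[str] = field(default_factory=list)  # multivalued attribute

    @property
    def monthly_salary(self) -> float:
        """Derived attribute: computed from annual_salary, not stored."""
        return self.annual_salary / 12


emp = Employee(
    ssn="123-45-6789",
    name="Asha",
    annual_salary=60000,
    skills=["SQL", "Python"],
)
print(emp.monthly_salary, emp.skills)
```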
d) Connecting lines
Connecting lines are solid lines that link attributes to their entities and show the relationships between
entities in the diagram.
e) Cardinality
Cardinality specifies how many instances of an entity relate to one instance of another entity. Ordinality
is closely linked to cardinality. While cardinality specifies the occurrences of a relationship, ordinality
describes the relationship as either mandatory or optional. In other words, cardinality specifies the maximum
number of relationships and ordinality specifies the absolute minimum number of relationships.
There are many notation styles that express cardinality.
Information Engineering Style
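The attribute kinds summarized in the notation above (key, multivalued, and derived) can be illustrated with a
minimal Python sketch. The class name Employee, the field names, and the sample values are assumptions chosen
to mirror the examples in the text; this is not a prescribed mapping from ER diagrams to code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Employee:
    ssn: str                 # key attribute: unique, distinguishing value
    annual_salary: float     # ordinary stored attribute
    skills: List[str] = field(default_factory=list)  # multivalued attribute

    @property
    def monthly_salary(self) -> float:
        # Derived attribute: computed from annual_salary, never stored.
        return self.annual_salary / 12


# Hypothetical sample data for illustration only.
emp = Employee(ssn="123-45-6789", annual_salary=60000.0, skills=["SQL", "Java"])
print(emp.monthly_salary)
print(emp.skills)
```

Because monthly_salary is computed on demand, it can never disagree with annual_salary, which is the usual
reason for marking an attribute as derived instead of storing it.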
2. Proper Naming of Schema Constructs
Choose a singular name for entity types, rather than plural ones.
Uppercase letters for entity type and relationship type names
Attribute names have their initial letter capitalized
Lowercase letters for role names
Choose binary relationship names to make the ER diagram of the schema readable from left to right
and from top to bottom
3. Design choices for ER conceptual design
Design Choices
Should a concept be modeled as an entity or an attribute?
Should a concept be modeled as an entity or a relationship?
Identifying relationships: Binary or ternary? Aggregation?
Constraints in the ER model
A lot of data semantics can (and should) be captured.
Keep the initial ER model close to the real world.
E.g. don't show foreign keys.
When converting to tables, various relational database rules may change some attributes to entities.
4. Alternative Notations for ER Diagrams
Specify structural Constraints on relationships
Replaces cardinality ratio (1:1, 1:N, M:N) and single/double line notation for participation
constraints
Associate a pair of integer numbers (min, max) with each participation of an entity type E in a
relationship type R, where 0 <= min <= max and max >= 1
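The (min, max) rule above can be checked mechanically. The following is a minimal Python sketch, assuming a
relationship is stored as a list of (entity, other_entity) pairs. The function names and the sample
EMPLOYEE/WORKS_FOR data are hypothetical, and math.inf is used here to stand in for an unbounded max (N).

```python
import math
from collections import Counter


def is_valid_constraint(min_count, max_count):
    # The rule from the text: 0 <= min <= max and max >= 1.
    return 0 <= min_count <= max_count and max_count >= 1


def participation_violations(entities, relationship_pairs, min_count, max_count):
    """Return the entities of E whose number of participations in R
    falls outside the (min, max) range."""
    if not is_valid_constraint(min_count, max_count):
        raise ValueError("need 0 <= min <= max and max >= 1")
    # Count how many relationship instances each entity takes part in.
    counts = Counter(entity for entity, _ in relationship_pairs)
    return [e for e in entities if not (min_count <= counts[e] <= max_count)]


# Hypothetical data: e3 does not work for any department.
employees = ["e1", "e2", "e3"]
works_for = [("e1", "d1"), ("e2", "d1")]

# (1, 1): every employee must work for exactly one department.
must_work_for_one = participation_violations(employees, works_for, 1, 1)
# (0, N): participation is optional and unbounded.
may_work_for_any = participation_violations(employees, works_for, 0, math.inf)
print(must_work_for_one, may_work_for_any)
```

A min of 0 corresponds to optional (partial) participation, and a min of 1 or more corresponds to mandatory
(total) participation.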
EER Model: Subclasses, Super Classes, and Inheritance
EER model
The EER (Enhanced ER) model includes all the modeling concepts of the ER model.
In addition, it includes the concepts of subclasses/superclasses, specialization/generalization,
categories, and attribute and relationship inheritance.
These are fundamental to conceptual modeling.
The additional EER concepts are used to model applications more completely and more accurately.
EER includes some object-oriented concepts, such as inheritance.
Subclasses, Super Classes, and Inheritance
Some objects may have similar but not identical attributes and methods. If there is a large degree of
similarity, it would be useful to be able to share the common properties. Inheritance allows one class to be
defined as a special case of a more general class. These special cases are known as subclasses and the more
general cases are known as superclasses.
An entity type may have additional meaningful subgroupings of its entities.
Example: EMPLOYEE may be further grouped into SECRETARY, ENGINEER, MANAGER,
TECHNICIAN, SALARIED_EMPLOYEE, HOURLY_EMPLOYEE, and so on.
Each of these groupings is a subset of EMPLOYEE entities
Each is called a subclass of EMPLOYEE
EMPLOYEE is the superclass for each of these subclasses
These are called superclass/subclass relationships.
Example: EMPLOYEE/SECRETARY
EMPLOYEE/TECHNICIAN
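The superclass/subclass relationships above map naturally onto class inheritance. The following is a minimal
Python sketch. The attribute names (name, salary, typing_speed, grade) and the sample values are assumptions
for illustration and do not come from the text.

```python
class Employee:
    """Superclass: holds the properties shared by all employees."""

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary


class Secretary(Employee):
    """Subclass: inherits name and salary, adds its own attribute."""

    def __init__(self, name, salary, typing_speed):
        super().__init__(name, salary)
        self.typing_speed = typing_speed  # hypothetical subclass-specific attribute


class Technician(Employee):
    """Subclass: inherits name and salary, adds its own attribute."""

    def __init__(self, name, salary, grade):
        super().__init__(name, salary)
        self.grade = grade  # hypothetical subclass-specific attribute


sec = Secretary("Asha", 30000, typing_speed=70)
tech = Technician("Ravi", 35000, grade="T2")

# Every SECRETARY entity is also an EMPLOYEE entity, but not a TECHNICIAN.
print(isinstance(sec, Employee), isinstance(sec, Technician))
```

Each subclass instance carries the inherited attributes of the superclass, which matches the statement that
each grouping is a subset of the EMPLOYEE entities.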
Data Abstraction, Knowledge Representation and Ontology Concepts
Goal of Knowledge Representation (KR) techniques
a) Accurately model some domain of knowledge
b) Create an ontology (An ontology is somewhat similar to a conceptual schema, but with more
knowledge, rules and exceptions) that describes the concepts of the domain and how these concepts are
interrelated
Goals of KR are similar to those of semantic data models, but there are some important similarities and
differences between the two disciplines:
Both disciplines provide concepts, constraints, operations, and languages for defining data and representing
knowledge
There are four abstraction concepts that are used in both semantic data models, such as the EER model, and
KR schemes
1. Classification and Instantiation
2. Identification
3. Specialization and generalization
4. Aggregation and association
5. Ontologies and the Semantic Web
1. Classification and Instantiation
The process of classification involves systematically assigning similar objects/entities to object
classes/entity types.
Instantiation is the inverse of classification and refers to the generation and specific examination of distinct
objects of a class.
Exception objects
In general, the objects of a class should have a similar type structure. However, some objects may display
properties that differ in some respects from the other objects of the class.
In the EER model, entities are classified into entity types according to their basic attributes and
relationships.
One class can be an instance of another class, called a meta-class. Notice that this cannot be represented
directly in the EER model.
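Although the EER model cannot express meta-classes directly, some programming languages can. The following
minimal Python sketch shows classification, instantiation, and a class that is itself an instance of a
meta-class. The names EntityType, Employee, and john are hypothetical and chosen only for illustration.

```python
class EntityType(type):
    """Hypothetical meta-class: every entity type is an instance of it."""


class Employee(metaclass=EntityType):
    """Classification: similar objects are assigned to this class."""

    def __init__(self, name):
        self.name = name


# Instantiation: generating one distinct object of the class.
john = Employee("John")

print(type(john) is Employee)        # john is an instance of Employee
print(type(Employee) is EntityType)  # Employee is an instance of the meta-class
```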
2. Identification
Identification is the abstraction process whereby classes and objects are made uniquely identifiable by
means of some identifier. For example, a class name uniquely identifies a whole class.
3. Specialization and Generalization
Generalization
Generalization is a bottom-up approach in which two or more lower-level entities combine to form a higher-level entity. The higher-level entity can in turn combine with other lower-level entities to form an even higher-level entity.
Specialization
Specialization is the opposite of generalization. It is a top-down approach in which one higher-level entity is broken down into two or more lower-level entities. In specialization, some higher-level entities may not have any lower-level entity sets at all.
4. Aggregation and Association
Aggregation
Aggregation is a process in which a relationship between two entities is treated as a single entity. For example, the relationship between Center and Course can act as an entity in a relationship with Visitor.
Association
Association is a relationship between two objects. In other words, association defines the multiplicity
between objects. The terms one-to-one, one-to-many, many-to-one, and many-to-many all describe an
association between objects. Aggregation is a special form of association. Composition is a
special form of aggregation.
Example: A Student and a Faculty have an association.
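The Center, Course, and Visitor example of aggregation can be sketched in Python as follows. The class names
Offering and Enquiry, the field names, and the sample values are assumptions introduced for illustration. The
point is only that the Center-Course relationship is wrapped as one object that then takes part in a second
relationship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Center:
    name: str


@dataclass(frozen=True)
class Course:
    title: str


@dataclass(frozen=True)
class Visitor:
    name: str


@dataclass(frozen=True)
class Offering:
    """Hypothetical name for the Center-Course relationship,
    treated here as a single entity (aggregation)."""
    center: Center
    course: Course


@dataclass(frozen=True)
class Enquiry:
    """Hypothetical relationship between a Visitor and the aggregated
    Offering, not the Center or the Course alone."""
    visitor: Visitor
    offering: Offering


offering = Offering(Center("City Center"), Course("DBMS"))
enquiry = Enquiry(Visitor("Meena"), offering)
print(enquiry.offering.center.name, enquiry.offering.course.title)
```

A plain association, such as the one between Student and Faculty, would be a single relationship object like
Offering. Aggregation adds the second layer, in which that relationship itself participates in another one.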
5. Ontologies and the Semantic Web
In recent years, the amount of computerized data and information available on the Web has spiraled out of
control. Many different models and formats are used. In addition to the database models, much information
is stored in the form of documents, which have considerably less structure than database information
does.
One research project that is attempting to allow information exchange among computers on the
Web is called the Semantic Web.
Semantic Web: Allows meaningful information exchange and search among machines.
Ontology: A specification of a conceptualization.
Conceptualization: The set of concepts that are used to represent the part of reality or knowledge that is of interest to a community of users.
Specification: The language and vocabulary terms used to specify the conceptualization.