database

Upload: mohit-saini

Post on 06-Mar-2016

217 views

Category:

Documents


0 download

DESCRIPTION

Notes for data base management system for CSE Students

TRANSCRIPT

Database: A database is a collection of related data that are related data that are recorded in any medium.For ex: Consider the names , telephone number and address of the people you know. We may recorded this data in an address book, or recorded in a personal computer using Excel sheet . So this record is called database.A database have the following properties:a) A database represents some aspect of the real world sometimes its called mini world or Universe of Discourse (UoD).b) A database is designed , built and associated with data for a specific purpose.In other words , a database has some source from which data is derived, some interaction with events and some users that is interested in the content of database.DBMS:DBMS is a collection of programs ( A general purpose software) that provides the facility to the user to define, construct , manipulate and share databases among various users and application.Defining means to specify the data types , structure and contents of the database . DBMS also keep the information of database itself in database catalog . This information of database is called Meta Data.Constructing means to store the data itself on some storage medium.Manipulating means to retrieve the data from the database and updating the data in the database.Sharing means to allow multiple users and programs to access the database.DBMS System Application Programs/Queries

Users/Programs

Software to process Queries / Programs

DBMS SoftwareSoftware to access stored data

Stored database

DBMS Architecture Stored database definition OR META DETA

Diffence Between DBMS System and DBMS Software:DBMS system provides the environment in which all the DBMS user can easily work on.In other words it provide an interface to the user through which user can easily operate the DBMS software.DBMS software is a collection of programs which is used to process the queries of user and to access the database to fulfill the requirement of user queries.Actors on the Scene(DBMS users) :There are many types of users who are related to DBMS . These are:a) DBA (Database Administrator): In any organization there will be a leader who leads all the structure , manage all the resources etc. So will be the chief administrator of the organization. Like chief administrator , in DBMS DBA is the leader of database. DBA is responsible for authorizing to access the database, to monitoring its use and for acquiring software and hardware resources as needed.b) Database Designer: Database designer are responsible to select appropriate structure for represent and store the data. Before select the structure Designer will meet all the database users and find out all the requirements which they want to fulfill. After collect all the requirement it comes with their design and structure and this work is completed before database implementation.c) End Users: These are:1) Casual Users: These are the users who communicate with the database for little period of time. They are occasionally used the database. So these are the temporary users.2) Parametric Or Naive Users: These are the users who communicate with the database for a regular period. Their main job is to constantly querying and updating the database using standard queries , this is called Canned Transaction.3) Sophisticated End Users: These are engineers, scientists, business analysts who are familiarize with the database and when they want to use the database they used it regularly and when they dont want then they are temporally used .4) Stand alone users: These are the users who maintain their personal database using ready made software which is available in the market easily and provide the menu based interface through which they can easily used the database.d) System Analysts and Application Programmers: System analysts find out the requirements of parametric end users and develop all the analysis for canned transaction through which they can meet all the requirements of parametric users. Application programmers implement these analysis as programs. These are especially Software engineers.Advantages of DBMS:There are many advantages of DBMS over to file system . These are:a) Controlling Redundancy: Here redundancy means repetition . In normal file system we use same thing again and again which increase the storage area and access cost of data. For ex: In a college , the students names , their classes and roll no. are used in many sections like in HR section , Account Section and T & P section. So we used same data again and again in so many files. This type of redundancy is controlled in DBMS, In DBMS we make a database of these names, classes and roll no. and it can be shared among all the sections.b) Easy to access the data: In DBMS all data are stored in database in same format, so to retrieve the data from the database is easy because we use same application programs to retrieve data from database whereas in file system data are stored in files and each file has its own format. So each file need separate application programs to retrieve data. This is also called Data Isolation.c) Providing Backup and Recovery: DBMS provides facilities to recover from hardware and software failure. The backup and recovery subsystem of DBMS ensured that if a system fails in the middle of any transaction then database is restored in the state from where transaction is begin and when system is recovered it also ensure that transaction is resumed from the point where it is interrupted . This is called atomicity( To recover all the system automatically).d) Providing Multiple user interfaces: Because lot of users who are used the database have different types of computer skills or knowledge . Some users are technically strong and some are weak. So to make easy to use a database to all type of users DBMS provides many type of interfaces. It provides query languages for casual users, Programming language for application users, forms and command interfaces for parametric users and menu driven interfaces for stand alone users. Both forms and menu driven interfaces are called GUI.e) Restricted Unauthorized access: When multiple users use a database then it is not sure that every user have the permission to access that database . For ex: Financial data are always be confidential and only limited users can access that data . So in DBMS DBA(Database Administrator ) is responsible to restrict and authorized the users to use the database.f) Use Specific Storage to easily processed the query: In DBMS a specific storage medium is used to store the data because when we processed any query to retrieve data then its processed time is depend on ,how much time take data to be searched. So to processed the query efficiently DBMS used special type of storage medium auxiliary file or index file . By index file data searching can be fast.g) Data Integration: Integration means combined different types of data with different length in a unit. In DBMS we can store all the data with different data types and length . For ex: In any table bca we use can four data types. These are: char, varchar, numeric or number and date. So DBMS provides this facilities to integrate complex data types in one table or database.Data Abstraction:There are many database users who are not computer trained, so developers hide the complex portion of database from users through several levels of abstraction. These levels are:a) Physical Level Or Low level : The lowest level describe how the data are actually stored .b) Conceptual level Or logical level or high level: Describes what data are stored in the database and the relationship among the data.c) View level: The highest level describes only a part of database because this is the user level. User do not want to know that where data is stored and from where data is retrieved so its not the matter for user. So those part are hidden for users.View 2

View LevelView n

View 1

Logical Level

Physical Level

Data Model:A collection of ideas that are used to describe the overall structure of a database. Structure of database means type of data ,relationship among these data and restriction which is applied to the data. Most of data models also describe the basic operations applied to the database (updating and retrieving). Data model is used to show that how can we hide some details of database from the users that is not needed by the user.Types of Data Model:There are three type of data model and these model are categorized according to the capabilities of describe the structure of database. These are:a) High Level or Conceptual Data Models: It describe that what data is stored and how can we retrieve that data. It uses concepts to describe what data is stored are entities , attributes ,and relationships. So E-R model in the category of conceptual level.b) Low level or Physical Data Models: It describe how data is stored in the computer. Because user has no interest to know that where data is stored so this portion is always hidden to the users. This models are used specially for computer specialists.

c) Representational or Implementation Data Models: Between these two model representational model is come which hide some details of data storage and some represents to the end users. It represent data by using record structures so sometime it is called record based data models .This model is mostly used in traditional DBMS. This model also include relational data model ,Network and Hierarchical model(legacy model) , Instances:When an information is added or deleted from a database at a particular moment of time then it is called instances. In other words database changes at a particular moment is called instance.Schema:The overall design of a database is called Schema. The description of a database is also called Schema. Schema are frequently changed when any entry in database is added or deleted . Three Schema Architecture:There are three type of Schemas , these are:a) Physical or Low Level or Internal level Schema: It describe overall database at physical level. Because physical level describe that how data is stored so physical schema describe the database design at low level.b) Conceptual level or High level or Logical level Schema: It describe database design at conceptual level. It is totally related to the data so we can say that it describe the overall design of data rather than database.c) External or view level Schema: It describe the database design at user level. Because at view level there are many user view so at this level there are lot of schema is presented , one for each user.EXTERNAL VIEW

EXTERNAL VIEW

EXTERNAL LEVEL .......................

External/Conceptual Mapping CONCEPTUAL SCHEMA

CONCEPTUAL LEVEL Conceptual/Internal MappingINTERNAL SCHEMA

INTERNAL LEVEL

Stored DatabaseData Independence:The capacity to change the schema at one level of a database without having to change the schema at the next higher level is called Data Independence. There are two type of Data Independence . These are:a) Logical Data Independence: is the capacity to change the conceptual schema without having to change the external schema. b) Physical Data Independence: is the capacity to change the physical schema without having to change the conceptual schema.Database Languages:To perform operations on database for every user DBMS provides mainly three languages. These are:a) DDL(Data Definition Language): This language is used to specify the database schemas. It is used to create , modify and delete the database structure not the data . So in this language these queries are used:1) Create (It is used to create a database not to create data).2) Alter (It is used to change the database structure not the data).3) Drop(It is used to delete whole table with its structure not only the data).4) Comment (It is used to specify the database not the data). So DDL totally related to the database structure not to the data.b) DML(Data Manipulation language): Manipulation means retrieval , updation and insertion of data so DML is totally related to the data not to the database . This language is used to retrieve , update and insert the data into the database so it uses update, select and insert command. There are two types of DML.1) Procedural DML: requires a user to specify what data are needed and how to get those data.2) Non Procedural DML: requires a user to specify what data are needed without specifying how to get those data.c) DCL(Data Controlling Language): Is used to control both data and database so it has the relation to the both. It is used by DBA who specify the authorization and restriction given to the users. In DCL three commands are come. These are:COMMITROLLBACKSAVEPOINTInterfaces: Interfaces are the programs which convert system language to user understandable language and user language to system understandable language. So its work like a translator which provide an environment in which a user can easily operate the system. There are many type of interfaces . These are:a) Menu Based interfaces: These interface are used by the stand alone users.b) Forms Based or Command Based Interfaces: These are used by the parametric users.c) GUI(Graphical User Interface): These are used by both user , by parametric and by stand alone user. Both Form based and Menu based are combindly called GUI.d) Natural Languages Interfaces: These interface are uses by the sophisticated users. E R Model: E R model refers to Entity Relationship model. It is the graphical representation of database. ER Model describes data as Entities , attributes and relationship.Entity is the real world object or in other words entity can be anything in the real world which can be described by its attributes .Attributes are some properties which describes the entity.Relationship describe the association among two or more entities. Keys: Keys are the values which is used to identify the entities and to separate the entities. There are many type of keys these are:a) Primary Key: Primary key is used to avoid duplicity in any field. It will be the unique value in whole database. Primary key always be selected to those field at which we want to apply the restriction to avoid duplicity .b) Super Key: is the combination of primary key and any other field in the database.c) Candidate Key: is the combination of any those field from which we can make any field a primary key.d) Foreign Key: is a field (or collection of fields) in a table whose value is required to match the value of the primary key for the second table.Weak Entity: Those entities which can not formed the primary key are called weak entities.Strong Entity: Those entities which can formed the primary key are called strong entities.Extended(Enhanced ) ER Model:The ER modeling concepts are sufficient for representing traditional database application. For more complex database application such as telecommunications , CAD/CAM , GIS etc , we need more complex requirements than traditional applications. In late 1970s database designers have tried to design more accurate ER model , which reflects the data properties and constraints more accurately . So extended(Enhanced ) ER model have some enhanced features than normal ER model. It uses the concepts of Specialization , Generalization , Aggregation.Specialization:Specialization is the process through which we can relate one entity to more than one entity. In other words specialization is the process to defining a set of subclasses of an entity type. This entity is called superclass. For ex. An entity employee has the sub entity , faculty, Staff. So employee has the relation with both the sub entity. This relation name is IS A. as shown in fig. So Specialization follow the process of one to many relationship.

Employee

Staff

Faculty

Specialization Fig 4.4.8Generalization:Generalization is just reverse of Specialization. Generalization is the process to define a generalized entity type from the given entity type. For ex. Consider the two entity CAR and TRUCK . Because both have some common attributes, they can combindly make a super entity called VEHICLE. So it is the process to identify the common features (attributes) from two or more entity and generalized them into a super entity. Aggregation:Aggregation is the combined part of specialization and generalization. It is the process to define both specialized and generalized entity. For ex. An employee can be faculty and staff. It is the process of specialization and ,faculty and staff combindly formed or generalized an entity employee . So above fig 4.4.8 also show the concept of aggregation.Relation:A relation is a two dimensional table in which: a) Each location in the table contain a single entry.b) Each column has a different name to other column that is no two column has the same name.c) All values in a column are the values of the same attributes.Relational Database:Relational Database is the collection of relations and each relation is related each other through one of the attributes of these relation. For ex. Relation loan and borrower is related each other through the attribute loan no. as shown in fig.Customer name

Loan no.

Loan no.

Branch name

Amount

In Relational database there are two languages used to processed the queries:a) Procedural Language: In this language user instructs to the system to perform a sequence of operations on the database to got the desired result. Relational algebra is procedural language.b) Non Procedural language: In this language user only describes that what he wants without giving a specific procedure and system generates the desired result. Relational calculus is the non procedural language.Relation Schema:Relational Schema is a set of all the attributes of the relation. If any relation R have the attributes A1,A2,A3An than relational schema R={A1,A2,A3An}.Relation State:Relation State of Relation Schema is the set of all the touples. It is denoted by r(R) . where R is the relation schema. So r(R)={t1,t2,t3,tn}.Domain: Domain of the attribute is the set of all the values of that attribute. It is denoted by D(A1). Where A1 is the attribute of the relation.Degree (Arity) of Relation:Degree of the relation is the no. of attributes in any relation. If any relation have five attributes than degree of relation is five.Integrity Constraints:Integrity Constraints are the rules which are imposed on any relation to restrict the invalid entries in the relation. Integrity Constraints governed some rule on data entry in database. These are:a) Entity Constraints: Entity constraints referred the rules that no field that is part of the primary may accept the null value and every entry should be unique . Duplicate value will not be accepted. This means we distinguish each record from other. Normally nulls are used for unknown and inapplicable values. For ex . If any record in the balance table is null than its means that customer have either zero balance or it is unknown account. The decision to allow nulls is generally made on a field by field basis. There is one type of field where null value is not accepted , that is the primary key.b) Referential Constraints: referred the rule that if table A contains a foreign key that matches the primary key of table B , the value of this foreign key must either match the value of the primary key for some row in table B or be null.c) Domain Constraints: Domain Constraints referred the rule that within each tuple, the value of each attribute must be an atomic value . Atomic means that each value should match the data type of that attribute. For ex. If any attribute has the data type varchar than the entered data in that attribute should be character not a numeric value.Relational Algebra:Relational Algebra is a set of algebraic operations and expression to describe, define and manipulate the database.SQL (Structured Query Language) is the language which is formed from the relational algebra. In RDBMS , relational algebra is used to implement and optimized the queries. Mainly there are two operations used to fetch the value from the relation in relational algebra . These are:a) Unary Operations: Those operations which operate on one relation are called unary operations. It includes, Select, Project operations.b) Binary Operations: Those operations which operate on more than one relation are called binary operations. It includes Minus, Union and Cartesian Product operations.The Select operation: The select operation select tuples which satisfy the given condition. It is used for fetching the whole row or tuple. We use (Sigma ) to select the tuple. If we want those records who has more than 1200 amount in there account than the query will be: amount>1200(depositer)The Project Operation: The project operation select the attributes or columns which satisfy the given condition. It is used to fetching the particular column or attribute. We use (PI) to select the column, If we want to fetch loan no and amount of the loan in dehradun branch than the query will be: loanno,amount(branch = dehradun(loan))Union Operation:Union operation are used to find out the common values from two tables. It is denoted by( U). If we want those names who have either and account or loan or both then we use two relation for that borrower and depositor .Query is: customername(borrower) U customername(depositer).Minus operation: It is used to find those record which have in the one relation but not in other. It is denoted by (Minus). If we want those names who have the account but not the loan than the query is:: customername(depositer) customername(borrower).But if we want those names who have the lone but not the account in the bank than query is: customername(borrower) customername(depositer).Cartesian Product operation:Cartesian Product operation are used to combined two relations to formed one relation. After Cartesian product the no. of attributes in the new formed relation is the sum of the attributes in both the relation. It is denoted by (X).Relational Calculus: Relational Calculus is a non procedural and declarative query language based on mathematical logic. It form the basis for two languages QBE(Query By Example) and Datalog. There are two types of Relational Calculus. These are:a) Tuple Relational Calculus: It is a non procedural language which describe only the desired information without giving the specific procedure for obtaining the result. Query in the tuple relational calculus is expressed as: {t| A(t)}This means that the set of all tuples t which belonging to the table A.Examples:If we want all the records of the person who has the amount more than 1200 than query will be:{t| t loan ^ t[amount]>1200}Suppose if we want only the loan no and customer name attribute in branch dehradun than query will be:{t| s borrower( t[customer name]=s[customer name] ^ q loan(t[loan no]=q[loan no] ^ q[branch]=dehradun))}b) Domain Relational Calculus: The second type of relational calculus is Domain Relational Calculus. It uses Domain variables to get the value from the attribute domain, rather than the value from entire tuple. It is totally based on tuple relational calculus. The formal syntax of Domain Calculus is: { | A(x1,x2,..xn)}Where x1,x2represent the domain variable and A represent the relation.Examples:Find the loan no , branch name and amount for loans over 1200. { | loan ^ a>1200} Find all loan numbers for loans with an amount greater than 1200. {< l> | b,a ( loan ^ a>1200)} Find the names of all customers who have loan from the dehradun branch and find the loan amount. { | l( borrower ^ b( loan ^ b=dehradun))}Introduction to SQL:SQL( Structured Query language) was developed under the name SEQUEL at the IBM san jose research facilities as the data manipulation language for IBMs database management system in the mid 1970s. In 1980 it was renamed SQL . In 1986 ANSI(American National Standard Institute) and ISO(International Standard Organization) published SQL standard called SQL 89. Later on ANSI published SQL 89, SQL 99 and the recent version is SQL 2003.Aggregate Functions:The pre defined or built in functions in SQL are called Aggregate Functions. These are: SUM,AVC,MAX,MIN and COUNT.Views:A view is an application programs or an individual users picture of the database. View is the less part of the database. In other words view divides the table into the parts . But in actual table is not divided only, user imagine that table is divided and divide part is yours now. An individual can use a view to create reports, charts and other objects that show database data. To create view we use this syntax:CREATE VIEW abc AS SELECT A1,A2,A3 FROM xyz where A1=pqr;The select command , which is called the defining query indicates which part of the table is include in the view. Indexes:The mechanism to increase the efficiency through which data is retrieved more fast is called index. These indexes are much like the indexes used in the books. Indexes always make searches easy and fast . The exact process to create an index is: CREATE INDEX abc ON xyz (A1);This command creates the index named abc . A1 is the attribute of the table which is used as an index and normally it is called index key.When we create an index and if it contains single index key than it is called Single field index and if it contains more than one index key field than it is called Multiple field index. Cursors:When any query is processed ,Oracle or DBA performs the following tasks:a) Reserves an area in memory called privates SQL area.b) Populates this area with the related data.c) Process the data in the memory area.d) Frees the memory area when the processing is complete.This reserve memory which is used to do private work for the SQL is called Cursor. The data that is stored in the cursor is called the Active Data Set. Cursor size is limited or predefined so if to execute a query ,more memory is required than a row pointer is maintain for that situation. At that time row pointer points to the Oracle or DBA that more memory is required and Oracle or DBA provides another limited cursor . This is done implicitly in Oracle.Triggers:Database triggers are the procedures(functions ) that are stored in the database and are implicitly(automatic) executed (fired) when the contents of a table are changed. In other words when we modify (update , delete ,insert) any table and any trigger for that table is available in the database than it will fired automatically and restricted the modifications against the table. Parts of Trigger:A trigger has three basic parts. These are:a) Trigger Event or Statement: Those statements which forced the trigger to be fired. These are INSERT, DELETE, UPDATE.b) Trigger Restrictions: It is a logical expression which control the execution of trigger. It is applied in the WHEN clause.c) Trigger Action: A trigger action is the procedure(function) that c