mc0067 smu mca sem2 2011

MC0067 – Database Management System

(DBMS and Oracle 9i)-Book ID: B0716 & B0717

Set-1

1. What are the advantages of Indexes in a Database? Explain with the example.

Ans:1 Indexes in a Database: A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of slower writes and increased storage space. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

Advantages of Indexes:

The classical example for the need of an index is if there is a table similar to this:

CREATE TABLE test1 ( id integer, content varchar);

and the application requires a lot of queries of the form

SELECT content FROM test1 WHERE id = constant;

Ordinarily, the system would have to scan the entire test1 table row by row to find all matching entries. If there are a lot of rows in test1 and only a few rows (possibly zero or one) returned by the query, then this is clearly an inefficient method. If the system were instructed to maintain an index on the id column, then it could use a more efficient method for locating matching rows. For instance, it might only have to walk a few levels deep into a search tree.

A similar approach is used in most books of non-fiction: Terms and concepts that are frequently looked up by readers are collected in an alphabetic index at the end of the book. The interested reader can scan the index relatively quickly and flip to the appropriate page, and would not have to read the entire book to find the interesting location. As it is the task of the author to anticipate the items that the readers are most likely to look up, it is the task of the database programmer to foresee which indexes would be of advantage.

The following command would be used to create the index on the id column, as discussed:

CREATE INDEX test1_id_index ON test1 (id);

The name test1_id_index can be chosen freely, but you should pick something that enables you to remember later what the index was for.

To remove an index, use the DROP INDEX command. Indexes can be added to and removed from tables at any time.



2. Write about:

Integrity Rules

Relational Operators with examples for each

Linear Search

Collision Chain

Ans:2

2.1 Integrity Rules:

These are the rules that a relational database follows in order to stay accurate and accessible. They govern which operations can be performed on the data and on the structure of the database. Three integrity rules are defined for a relational database:

Distinct Rows in a Table - all the rows of a table should be distinct, to avoid ambiguity when accessing the rows of that table. Most modern database management systems can be configured to reject duplicate rows.

Entity Integrity (a primary key, or any part of it, cannot be null) - 'null' is a special value in a relational database; it does not mean blank or zero. It means that the data is unavailable, and hence a 'null' primary key would not be a complete identifier.

Referential Integrity - if a foreign key is defined on a table, then a value matching that foreign key value must exist as the primary key of a row in the referenced table.

The following are the integrity rules to be satisfied by any relation.

• No component of the Primary Key can be null.

• The database must not contain any unmatched Foreign Key values. This is called the referential integrity rule.

Unlike the case of Primary Keys, there is no integrity rule saying that no component of the foreign key can be null. This can be logically explained with the help of the following example:

Consider the relations Employee and Account as given below.

Employee

Emp#   EmpName   EmpCity   EmpAcc#
X101   Shekhar   Bombay    120001
X102   Raj       Pune      120002
X103   Sharma    Nagpur    Null
X104   Vani      Bhopal    120003

Account

ACC#     OpenDate      BalAmt
120001   30-Aug-1998   5000
120002   29-Oct-1998   1200
120003   01-Jan-1999   3000
120004   04-Mar-1999   500

EmpAcc# in Employee relation is a foreign key creating reference from Employee to Account. Here, a Null value in EmpAcc# attribute is logically possible if an Employee does not have a bank account. If the business rules allow an employee to exist in the system without opening an account, a Null value can be allowed for EmpAcc# in Employee relation.

In the case example given, Cust# in Ord_Aug cannot accept Null if the business rule insists that the Customer No. needs to be stored for every order placed.
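The Employee and Account example can be tried out with a short sketch. It uses Python's sqlite3 module as a stand-in for the DBMS, which is an assumption. The column names (emp_no, emp_acc and so on) are adapted because '#' is not valid in an unquoted identifier, and the rows that get rejected (Nobody, Ravi, Copy) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("""
    CREATE TABLE account (
        acc_no    INTEGER PRIMARY KEY,
        open_date TEXT,
        bal_amt   INTEGER
    )""")
conn.execute("""
    CREATE TABLE employee (
        emp_no   TEXT PRIMARY KEY NOT NULL,          -- entity integrity
        emp_name TEXT,
        emp_city TEXT,
        emp_acc  INTEGER REFERENCES account(acc_no)  -- nullable foreign key
    )""")
conn.executemany("INSERT INTO account VALUES (?, ?, ?)", [
    (120001, "30-Aug-1998", 5000), (120002, "29-Oct-1998", 1200),
    (120003, "01-Jan-1999", 3000), (120004, "04-Mar-1999", 500),
])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("X101", "Shekhar", "Bombay", 120001), ("X102", "Raj", "Pune", 120002),
    ("X103", "Sharma", "Nagpur", None),    # Null foreign key is accepted
    ("X104", "Vani", "Bhopal", 120003),
])

def rejected(params):
    """Try to insert one employee row; report whether the DBMS refused it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", params)
    except sqlite3.IntegrityError:
        return True
    return False

null_key_rejected = rejected((None, "Nobody", "Pune", None))          # null primary key
unmatched_fk_rejected = rejected(("X105", "Ravi", "Delhi", 999999))   # no such account
duplicate_key_rejected = rejected(("X101", "Copy", "Bombay", None))   # duplicate key
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

All three invalid rows are refused, while the row for X103 with a Null account number is stored, as the business rule in the example allows.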

2.2 Relational Operators:

In the relational model, the database objects seen so far have specific names:

Name                Meaning
Relation            Table
Tuple               Record (Row)
Attribute           Field (Column)
Cardinality         Number of Records (Rows)
Degree (or Arity)   Number of Fields (Columns)
View                Query/Answer table

On these objects, a set of operators (relational operators) is provided to manipulate them:

1. Restrict

2. Project

3. Union

4. Difference

5. Product

6. Intersection



7. Join

8. Divide

Restrict: Restrict simply extracts records from a table. It is also known as “Select”, but it is not the same as the SELECT statement defined in SQL.

Project: Project selects zero or more fields from a table and generates a new table that contains all of the records but only the selected fields (with no duplicates).

Union: Union creates a new table by adding the records of one table to those of another. The two tables must be compatible: they must have the same number of fields, and each pair of corresponding fields must take values from the same domain.

Difference: The difference of two tables is a third table which contains the records which appear in the first BUT NOT in the second.

Product: The product of two tables is a third table which contains every record of the first combined with every record of the second.

Intersection: The intersection of two tables is a third table which contains the records which are common to both.

Join: The join of two tables is a third table which contains the combinations of records from the first and the second that are related.

Divide: Dividing a table by another table gives all the records in the first which have values in their fields matching ALL the records in the second.
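The set-based operators can be sketched in a few lines of Python by treating a relation as a set of tuples. This is only an illustration under that assumption: the function names are invented here, attributes are addressed by position instead of by name, and the data is the July and August order data used in the examples of this answer.

```python
from itertools import product as cartesian

# Relations as sets of tuples; attribute order is (Ord#, OrdDate, Cust#).
ORD_JUL = {(101, "03-07-94", "001"), (102, "27-07-94", "003")}
ORD_AUG = {
    (101, "02-08-94", "002"), (102, "11-08-94", "003"),
    (103, "21-08-94", "003"), (104, "28-08-94", "002"),
    (105, "30-08-94", "005"),
}

def restrict(relation, predicate):
    """Keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, *positions):
    """Keep only the chosen attribute positions; the set drops duplicates."""
    return {tuple(t[i] for i in positions) for t in relation}

def union(r, s):
    return r | s

def difference(r, s):
    return r - s

def intersection(r, s):
    return r & s

def product(r, s):
    """Every tuple of r concatenated with every tuple of s."""
    return {a + b for a, b in cartesian(r, s)}

aug_orders_of_002 = restrict(ORD_AUG, lambda t: t[2] == "002")
jul_customers = project(ORD_JUL, 2)
aug_customers = project(ORD_AUG, 2)
both_months = intersection(jul_customers, aug_customers)
only_july = difference(jul_customers, aug_customers)
all_orders = union(ORD_JUL, ORD_AUG)
customer_pairs = product(jul_customers, aug_customers)
```

Because Python sets discard duplicates automatically, project behaves like the relational operator and not like an SQL SELECT without DISTINCT.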

The eight relational algebra operators are

1. SELECT – To retrieve specific tuples/rows from a relation.



Ord#   OrdDate    Cust#
101    02-08-94   002
104    18-09-94   002

2. PROJECT – To retrieve specific attributes/columns from a relation.

Description    Price
Power Supply   4000
101-Keyboard   2000
Mouse          800
MS-DOS 6.0     5000
MS-Word 6.0    8000

3. PRODUCT – To obtain all possible combination of tuples from two relations.

Ord#   OrdDate    O.Cust#   C.Cust#   CustName     City
101    02-08-94   002       001       Shah         Bombay
101    02-08-94   002       002       Srinivasan   Madras
101    02-08-94   002       003       Gupta        Delhi
101    02-08-94   002       004       Banerjee     Calcutta
101    02-08-94   002       005       Apte         Bombay
102    11-08-94   003       001       Shah         Bombay
102    11-08-94   003       002       Srinivasan   Madras

4. UNION – To retrieve tuples appearing in either or both the relations participating in the UNION.

Eg: Consider the relation Ord_Jul as follows (Table: Ord_Jul)

Ord#   OrdDate    Cust#
101    03-07-94   001
102    27-07-94   003

The union of Ord_Jul with Ord_Aug adds the August orders to these rows:

Ord#   OrdDate    Cust#
101    02-08-94   002
102    11-08-94   003
103    21-08-94   003
104    28-08-94   002
105    30-08-94   005

Note: The union operation shown above logically implies retrieval of records of Orders placed in July or in August

5. INTERSECT – To retrieve tuples appearing in both the relations participating in the INTERSECT.

Eg: To retrieve Cust# of Customers who’ve placed orders in July and in August

Cust#

003

6. DIFFERENCE – To retrieve tuples appearing in the first relation participating in the DIFFERENCE but not the second.

Eg: To retrieve Cust# of Customers who’ve placed orders in July but not in August





Cust#

001

7. JOIN – To retrieve combinations of tuples in two relations based on a common field in both the relations.

Eg: ORD_AUG join CUSTOMERS (here, the common column is Cust#)

Ord#   OrdDate    Cust#   CustNames    City
101    02-08-94   002     Srinivasan   Madras
102    11-08-94   003     Gupta        Delhi
103    21-08-94   003     Gupta        Delhi
104    28-08-94   002     Srinivasan   Madras
105    30-08-94   005     Apte         Bombay

Note: The above join operation logically implies retrieval of the details of all orders together with the details of the customers who placed them. Such a join operation, where only those rows having corresponding rows in both relations are retrieved, is called the natural join or inner join. This is the most common join operation.

Consider the example of EMPLOYEE and ACCOUNT relations.

EMPLOYEE

Emp#   EmpName   EmpCity   Acc#
X101   Shekhar   Bombay    120001
X102   Raj       Pune      120002
X103   Sharma    Nagpur    Null
X104   Vani      Bhopal    120003

ACCOUNT

Acc#     OpenDate        BalAmt
120001   30. Aug. 1998   5000
120002   29. Oct. 1998   1200
120003   1. Jan. 1999    3000
120004   4. Mar. 1999    500

A join can be formed between the two relations based on the common column Acc#. The result of the (inner) join is:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000

Note that, from each table, only those records which have corresponding records in the other table appear in the result set. This means that result of the inner join shows the details of those employees who hold an account along with the account details.

The other type of join is the outer join which has three variations – the left outer join, the right outer join and the full outer join. These three joins are explained as follows:

The left outer join retrieves all rows from the left-side (of the join operator) table. If there are corresponding or related rows in the right-side table, the correspondence will be shown. Otherwise, columns of the right-side table will take null values.

EMPLOYEE left outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000




The right outer join retrieves all rows from the right-side (of the join operator) table. If there are corresponding or related rows in the left-side table, the correspondence will be shown. Otherwise, columns of the left-side table will take null values.

EMPLOYEE right outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500

(Assume that Acc# 120004 belongs to someone who is not an employee and hence the details of the Account holder are not available here)

The full outer join retrieves all rows from both the tables. If there is a correspondence or relation between rows from the tables of either side, the correspondence will be shown. Otherwise, related columns will take null values.

EMPLOYEE full outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500
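The four join variants can be reproduced with a small Python sketch over the EMPLOYEE and ACCOUNT rows. The function names are invented for this illustration, None stands in for NULL, and the code assumes (as in the example data) that each account is held by at most one employee.

```python
EMPLOYEE = [
    ("X101", "Shekhar", "Bombay", 120001),
    ("X102", "Raj", "Pune", 120002),
    ("X103", "Sharma", "Nagpur", None),   # no account
    ("X104", "Vani", "Bhopal", 120003),
]
ACCOUNT = [
    (120001, "30-Aug-1998", 5000),
    (120002, "29-Oct-1998", 1200),
    (120003, "01-Jan-1999", 3000),
    (120004, "04-Mar-1999", 500),         # held by a non-employee
]

def inner_join(employees, accounts):
    """Only employees with a matching account, plus that account's details."""
    by_acc = {acc[0]: acc for acc in accounts}
    return [emp + by_acc[emp[3]][1:] for emp in employees if emp[3] in by_acc]

def left_outer_join(employees, accounts):
    """All employees; account columns are None where nothing matches."""
    by_acc = {acc[0]: acc for acc in accounts}
    rows = []
    for emp in employees:
        acc = by_acc.get(emp[3])
        rows.append(emp + (acc[1:] if acc else (None, None)))
    return rows

def unmatched_accounts(employees, accounts):
    """Accounts no employee refers to, padded with None employee columns."""
    held = {emp[3] for emp in employees}
    return [(None, None, None) + acc for acc in accounts if acc[0] not in held]

def right_outer_join(employees, accounts):
    return inner_join(employees, accounts) + unmatched_accounts(employees, accounts)

def full_outer_join(employees, accounts):
    return left_outer_join(employees, accounts) + unmatched_accounts(employees, accounts)

inner_rows = inner_join(EMPLOYEE, ACCOUNT)
left_rows = left_outer_join(EMPLOYEE, ACCOUNT)
right_rows = right_outer_join(EMPLOYEE, ACCOUNT)
full_rows = full_outer_join(EMPLOYEE, ACCOUNT)
```

The row counts (3, 4, 4 and 5) correspond to the four result tables shown for these joins.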

8. DIVIDE

Consider the following three relations:





R1 divide by R2 per R3 gives:

a

Thus the result contains those values from R1 whose corresponding R2 values in R3 include all R2 values.
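A minimal Python sketch of this form of division follows. The three relations of the original example are not reproduced in this text, so the sets R1, R2 and R3 below are hypothetical sample data chosen only to illustrate the rule; the function name divide is also an assumption.

```python
def divide(dividend, divisor, mediator):
    """Values x of the dividend paired in the mediator with EVERY divisor value."""
    return {x for x in dividend if all((x, y) in mediator for y in divisor)}

# Hypothetical sample relations (not the ones from the original figure).
R1 = {"a", "b", "c"}                       # dividend
R2 = {"x", "y"}                            # divisor
R3 = {("a", "x"), ("a", "y"),              # mediator: which R1 value
      ("b", "x"), ("c", "y")}              # goes with which R2 value

quotient = divide(R1, R2, R3)
```

Only "a" is paired with both "x" and "y" in R3, so it is the only value in the quotient.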

2.3 Linear Search

Linear search, also known as sequential search, is a search algorithm suitable for finding a particular value in a list of data. It starts at the beginning of the data and checks each element in turn until either the desired item is found or the end of the data is reached. Linear search is not very efficient: if the item to be found is at the end of the list, then all previous items must be read and checked before it is found. The implementation is a very straightforward loop comparing every element in the array with the key. As soon as an equal value is found, it returns. If the loop finishes without finding a match, the search has failed and -1 is returned. For small arrays, linear search is a good solution because it is so straightforward. In an array of a million elements, a successful linear search takes about 500,000 comparisons on average to find the key. For a much faster search on sorted data, take a look at binary search.




Algorithm

For each item in the database
    if the item matches the wanted info
        exit with this item
    Continue loop
wanted item is not in database
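The algorithm can be written as a short Python function. This is one straightforward rendering, assuming the data is held in a list and that -1 signals "not found" as described in the text; the sample list is illustrative.

```python
def linear_search(items, key):
    """Return the index of the first element equal to key, or -1 if absent."""
    for index, item in enumerate(items):
        if item == key:
            return index      # exit with this item
    return -1                 # wanted item is not in the data

sample = [42, 7, 19, 73, 7]   # illustrative data
found_at = linear_search(sample, 73)
first_seven = linear_search(sample, 7)
missing = linear_search(sample, 100)
```

A search for a missing key always compares against every element, which is the worst case of n comparisons.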

2.4 Collision Chain:

In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associative array. The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.

Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed; i.e. new entries are never added to the table after it is created). Instead, most hash table designs assume that hash collisions (different keys that map to the same hash value) will occur and must be accommodated in some way. One common way to accommodate them is chaining: every slot holds a list, the collision chain, of all entries whose keys hash to that slot, and a lookup searches that chain linearly.
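One common way to handle such collisions is to keep a chain of entries per slot. The following Python sketch illustrates that idea; the class name, the default of 8 slots and the integer keys are assumptions chosen so that the collisions are easy to follow (for small non-negative integers, Python's hash() returns the integer itself).

```python
class ChainedHashTable:
    """Hash table in which each slot holds a chain (list) of colliding entries."""

    def __init__(self, slot_count=8):
        self.slots = [[] for _ in range(slot_count)]

    def _chain(self, key):
        # The hash function turns the key into a slot index.
        return self.slots[hash(key) % len(self.slots)]

    def put(self, key, value):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)   # overwrite an existing entry
                return
        chain.append((key, value))        # extend the collision chain

    def get(self, key, default=None):
        for existing_key, value in self._chain(key):
            if existing_key == key:
                return value
        return default

    def longest_chain(self):
        return max(len(chain) for chain in self.slots)

table = ChainedHashTable(slot_count=8)
for key in (1, 9, 17, 2):                 # 1, 9 and 17 all map to slot 1
    table.put(key, f"value-{key}")
```

Looking up key 17 walks the chain in slot 1 past the entries for 1 and 9, which is itself a short linear search.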



4. Discuss the correspondences between the ER model constructs and the relational model constructs. Show how each ER model construct can be mapped to the relational model, and discuss any alternative mappings.

Ans:4

Relational Data Model: The model uses the concept of a mathematical relation, which looks somewhat like a table of values, as its basic building block, and has its theoretical basis in set theory and first-order predicate logic. The relational model represents the database as a collection of relations. Each relation resembles a table of values or, to some extent, a “flat” file of records. When a relation is thought of as a table of values, each row in the table represents a collection of related data values. In the relational model, each row in the table represents a fact that typically corresponds to a real-world entity or relationship. The table name and column names are used to help in interpreting the meaning of the values in each row. In the formal relational model terminology, a row is called a tuple, a column header is called an attribute, and the table is called a relation. The data type describing the types of values that can appear in each column is represented by a domain of possible values.

ER Model: An entity-relationship model (ERM) is an abstract and conceptual representation of data. Entity-relationship modeling is a database modeling method, used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion. Diagrams created by this process are called entity-relationship diagrams, ER diagrams, or ERDs.

The first stage of information system design uses these models during the requirements analysis to describe information needs or the type of information that is to be stored in a database. In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Sometimes, both of these phases are referred to as "physical design". In this way we create a relational schema from an entity-relationship (ER) schema. Key elements of the ER model are entities, attributes, identifiers and relationships.



Correspondence between ER and Relational Models:

ER Model                        Relational Model
Entity type                     “Entity” relation
1:1 or 1:N relationship type    Foreign key
M:N relationship type           “Relationship” relation and two foreign keys
n-ary relationship type         “Relationship” relation and n foreign keys
Simple attributes               Attributes
Composite attributes            Set of simple component attributes
Multivalued attributes          Relation and foreign key
Value set                       Domain
Key attribute                   Primary key or secondary key

Let's take the COMPANY database as an example:



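The COMPANY ER schema is below:

[Figure: COMPANY ER diagram. Entity types: EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT. Relationship types: WORKS_FOR, MANAGES, WORKS_ON, CONTROLS, SUPERVISION, DEPENDENTS_OF. Attribute labels shown: Name (Fname, Initial, Lname), ENO, DOB, address, sex, salary; Name, Number, Location, NoOfEmployee; Name, Number, Location; StartDate; HOURS. Cardinality labels: 1 and N.]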



Result of mapping the company ER schema into a relational database schema:

EMPLOYEE (FNAME, INITIAL, LNAME, ENO, DOB, ADDRESS, SEX, SALARY, SUPERENO, DNO)

DEPARTMENT (DNAME, DNUMBER, MGRENO, MGRSTARTDATE)

DEPT_LOCATIONS (DNUMBER, DLOCATION)

PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)

WORKS_ON (EENO, PNO, HOURS)

DEPENDENT (EENO, DEPENDENT_NAME, SEX, DOB, RELATIONSHIP)

Mapping of regular entity types: For each regular entity type E in the ER schema, create a relation R that includes all the simple attributes of E. Include only the simple component attributes of a composite attribute. Choose one of the key attributes of E as the primary key for R. If the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R. If multiple keys were identified for E during the conceptual design, the information describing the attributes that form each additional key is kept in order to specify secondary (unique) keys of relation R. Knowledge about keys is also kept for indexing purposes and other types of analyses.

We create the relations EMPLOYEE, DEPARTMENT, and PROJECT to correspond to the regular entity types EMPLOYEE, DEPARTMENT, and PROJECT. The foreign key and relationship attributes, if any, are not included yet; they will be added during subsequent steps. These include the attributes SUPERENO and DNO of EMPLOYEE, MGRENO and MGRSTARTDATE of DEPARTMENT, and DNUM of PROJECT. We choose ENO, DNUMBER, and PNUMBER as primary keys for the relations EMPLOYEE, DEPARTMENT, and PROJECT, respectively. Knowledge that DNAME of DEPARTMENT and PNAME of PROJECT are secondary keys is kept for possible use later in the design.

The relations that are created from the mapping of entity types are sometimes called entity relations because each tuple represents an entity instance.
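Part of the resulting schema can be written down as table definitions. The sketch below uses Python's sqlite3 module; the column types, the sample rows and the helper name rejected are assumptions made for illustration, and only four of the six relations are shown. Primary keys follow the choices named in the text, secondary keys become UNIQUE columns, 1:N relationship types become foreign key columns, and the M:N relationship type WORKS_ON becomes its own relation.

```python
import sqlite3

SCHEMA = """
CREATE TABLE department (
    dname        TEXT UNIQUE,                      -- secondary key
    dnumber      INTEGER PRIMARY KEY,
    mgreno       INTEGER REFERENCES employee(eno), -- MANAGES (1:1)
    mgrstartdate TEXT
);
CREATE TABLE employee (
    fname TEXT, initial TEXT, lname TEXT,          -- composite Name, flattened
    eno      INTEGER PRIMARY KEY,
    dob TEXT, address TEXT, sex TEXT, salary INTEGER,
    supereno INTEGER REFERENCES employee(eno),     -- SUPERVISION (1:N)
    dno      INTEGER REFERENCES department(dnumber)  -- WORKS_FOR (1:N)
);
CREATE TABLE project (
    pname     TEXT UNIQUE,                         -- secondary key
    pnumber   INTEGER PRIMARY KEY,
    plocation TEXT,
    dnum      INTEGER REFERENCES department(dnumber) -- CONTROLS (1:N)
);
CREATE TABLE works_on (                            -- M:N relationship relation
    eeno  INTEGER REFERENCES employee(eno),
    pno   INTEGER REFERENCES project(pnumber),
    hours REAL,
    PRIMARY KEY (eeno, pno)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

def rejected(sql, params=()):
    """Run one statement and report whether a constraint refused it."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False

# Illustrative rows only.
conn.execute("INSERT INTO department (dname, dnumber) VALUES ('Research', 5)")
conn.execute("INSERT INTO employee (fname, lname, eno, dno) VALUES ('A', 'B', 1, 5)")
conn.execute("INSERT INTO project (pname, pnumber, dnum) VALUES ('ProductX', 10, 5)")
conn.execute("INSERT INTO works_on VALUES (1, 10, 7.5)")

duplicate_dname = rejected("INSERT INTO department (dname, dnumber) VALUES ('Research', 6)")
unknown_project = rejected("INSERT INTO works_on VALUES (1, 99, 1.0)")
tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

DEPT_LOCATIONS and DEPENDENT would follow the same pattern, each with a composite primary key that includes the foreign key to its owner relation.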

5. Define the following terms: disk, disk pack, track, block, cylinder, sector, interblock gap, read/write head.

Ans:5

Disk: Disks are used for storing large amounts of data. The most basic unit of data on the disk is a single bit of information. By magnetizing an area on the disk in certain ways, one can make it represent a bit value of either 0 or 1. To code information, bits are grouped into bytes. Byte sizes are typically 4 to 8 bits, depending on the computer and the device. We assume that one character is stored in a single byte, and we use the terms byte and character interchangeably. The capacity of a disk is the number of bytes it can store, which is usually very large. Small floppy disks used with microcomputers typically hold from 400 kbytes to 1.5 Mbytes; hard disks for micros typically hold from several hundred Mbytes up to a few Gbytes. Whatever their capacity, disks are all made of magnetic material shaped as a thin circular disk and protected by a plastic or acrylic cover. A disk is single-sided if it stores information on only one of its surfaces and double-sided if both surfaces are used.

Disk Packs:

To increase storage capacity, disks are assembled into a disk pack, which may include many disks and hence many surfaces. A Disk pack is a layered grouping of hard disk platters (circular, rigid discs coated with a magnetic data storage surface). Disk pack is the core component of a hard disk drive. In modern hard disks, the disk pack is permanently sealed inside the drive. In many early hard disks, the disk pack was a removable unit, and would be supplied with a protective canister featuring a lifting handle.

Track and cylinder:



Information is stored on a disk surface in concentric circles of small width, each having a distinct diameter. Each circle is called a track; it is the (circular) area on a disk platter that can be accessed without moving the access arm of the drive. For disk packs, the tracks with the same diameter on the various surfaces are called a cylinder because of the shape they would form if connected in space. Equivalently, a cylinder is the set of tracks of a disk drive that can be accessed without changing the position of the access arm.

The number of tracks on a disk ranges from a few hundred to a few thousand, and the capacity of each track typically ranges from tens of Kbytes to 150 Kbytes.

Sector: A fixed-size physical data block on a disk drive. A track usually contains a large amount of information; it is divided into smaller blocks or sectors. The division of a track into sectors is hard-coded on the disk surface and cannot be changed. One type of sector organization calls a portion of a track that subtends a fixed angle at the center a sector. Several other sector organizations are possible, one of which is to have the sectors subtend smaller angles at the center as one moves away, thus maintaining a uniform density of recording.

Block and Interblock Gaps: A physical data record, separated on the medium from other blocks by interblock gaps, is called a block. The division of a track into equal-sized disk blocks is set by the operating system during disk formatting. Block size is fixed during initialization and cannot be changed dynamically. Typical disk block sizes range from 512 to 4096 bytes. A disk with hard-coded sectors often has the sectors subdivided into blocks during initialization. An area between data blocks which contains no data and which separates the blocks is called an interblock gap. Blocks are separated by fixed-size interblock gaps, which include specially coded control information written during disk initialization. This information is used to determine which block on the track follows each interblock gap.

Read/write Head: A read/write head is the device that reads data from, or writes data to, the magnetic surface of a storage medium; on a disk drive it is mounted on the access arm. The same term applies to tape: a tape drive is required to read the data from or to write the data to a tape reel, and a read/write head is used to read or write data on the tape. Usually, each group of bits that forms a byte is stored across the tape, and the bytes themselves are stored consecutively on the tape. Data records on tape are also stored in blocks, although the blocks may be substantially larger than those for disks, and interblock gaps are also quite large. With typical tape densities of 1600 to 6250 bytes per inch, a typical interblock gap of 0.6 inches corresponds to 960 to 3750 bytes of wasted storage space.
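The wasted-space figures follow from multiplying the recording density by the gap length. A small Python check of that arithmetic is shown below; the function name is illustrative, and exact fractions are used so that 0.6 inches does not introduce rounding error.

```python
from fractions import Fraction

GAP_INCHES = Fraction(6, 10)   # interblock gap of 0.6 inches

def wasted_bytes_per_gap(density_bytes_per_inch, gap_inches=GAP_INCHES):
    """Bytes that could have been recorded in the space taken by one gap."""
    return density_bytes_per_inch * gap_inches

low_density_waste = wasted_bytes_per_gap(1600)    # 1600 * 0.6
high_density_waste = wasted_bytes_per_gap(6250)   # 6250 * 0.6
```

The two results, 960 and 3750 bytes, match the range quoted in the text.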



Set-2



1. Explain the purpose of Data Modeling. What are the basic constructs of E-R Diagrams?

Ans:1 Data modeling is the process of creating a data model by applying formal data model descriptions using data modeling techniques.

Data modeling is the act of exploring data-oriented structures. Like other modeling artifacts data models can be used for a variety of purposes, from high-level conceptual models to physical data models.

Data modeling is the formalization and documentation of existing processes and events that occur during application software design and development. Data modeling techniques and tools capture and translate complex system designs into easily understood representations of the data flows and processes, creating a blueprint for construction and/or re-engineering.

Basic Constructs of E-R Modeling: The ER model views the real world as a construct of entities and associations between entities. The basic constructs of ER modeling are entities, attributes, and relationships.

Entity:

An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world.

An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. Although the term entity is the one most commonly used, following Chen we should really distinguish between an entity and an entity-type. An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for this term.

Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical theorem.

Relationship:

A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs, linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem.



Attributes:

Entities and relationships can both have attributes. Examples: an employee entity might have a Social Security Number (SSN) attribute; the proved relationship may have a date attribute.

2. Write about:

Types of Discretionary Privileges

Propagation of Privileges using Grant Option

Physical Storage Structure of DBMS

Indexing

Ans:2

Types of Discretionary Privileges: The concept of an authorization identifier is used to refer to a user account. The DBMS must provide selective access to each relation in the database based on specific accounts. There are two levels for assigning privileges to use the database system:

The account level: At this level, the DBA specifies the particular privileges that each account holds independently of the relations in the database.

The relation (or table) level: At this level, the DBA can control the privilege to access each individual relation or view in the database.

The privileges at the account level apply to the capabilities provided to the account itself and can include the CREATE SCHEMA or CREATE TABLE privilege, to create a schema or base relation; the CREATE VIEW privilege; the ALTER privilege, to apply schema changes such as adding or removing attributes from relations; the DROP privilege, to delete relations or views; the MODIFY privilege, to insert, delete, or update tuples; and the SELECT privilege, to retrieve information from the database by using a SELECT query.

The second level of privileges applies to the relation level, whether they are base relations or virtual (view) relations.

The granting and revoking of privileges generally follow an authorization model for discretionary privileges known as the access matrix model, where the rows of a matrix M represent subjects (users, accounts, programs) and the columns represent objects (relations, records, columns, views, operations). Each position M(i,j) in the matrix represents the types of privileges (read, write, update) that subject i holds on object j. To control the granting and revoking of relation privileges, each relation R in a database is assigned an owner account, which is typically the account that was used when the relation was created in the first place. The owner of a relation is given all privileges on that relation. The owner account holder can pass privileges on R to other users by granting privileges to their accounts.

In SQL the following types of privileges can be granted on each individual relation R:



SELECT (retrieval or read) privilege on R: Gives the account retrieval privilege. In SQL this gives the account the privilege to use the SELECT statement to retrieve tuples from R.

MODIFY privileges on R: This gives the account the capability to modify tuples of R. In SQL this privilege is further divided into UPDATE, DELETE, and INSERT privileges to apply the corresponding SQL command to R. In addition, both the INSERT and UPDATE privileges can specify that only certain attributes can be updated or inserted by the account.

REFERENCES privilege on R: This gives the account the capability to reference relation R when specifying integrity constraints. The privilege can also be restricted to specific attributes of R.

Propagation of Privileges using the GRANT OPTION: Whenever the owner A of a relation R grants a privilege on R to another account B, the privilege can be given to B with or without the GRANT OPTION. If the GRANT OPTION is given, this means that B can also grant that privilege on R to other accounts. Suppose that B is given the GRANT OPTION by A and that B then grants the privilege on R to a third account C, also with GRANT OPTION. In this way, privileges on R can propagate to other accounts without the knowledge of the owner of R. If the owner account A now revokes the privilege granted to B, all the privileges that B propagated based on that privilege should automatically be revoked by the system.
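The propagation and the cascading revoke can be modelled with a small Python sketch. It is a simplified illustration and not how any particular DBMS implements GRANT and REVOKE: the class and method names are invented, it tracks a single privilege on a single relation R, and an account D is added to the A, B, C scenario to show a grant without the GRANT OPTION.

```python
class RelationPrivilege:
    """Tracks who holds one privilege on one relation R (simplified model)."""

    def __init__(self, owner):
        self.owner = owner
        self.grants = []   # (grantor, grantee, with_grant_option)

    def _may_grant(self):
        """Accounts whose right to grant traces back to the owner."""
        allowed = {self.owner}
        changed = True
        while changed:
            changed = False
            for grantor, grantee, option in self.grants:
                if option and grantor in allowed and grantee not in allowed:
                    allowed.add(grantee)
                    changed = True
        return allowed

    def grant(self, grantor, grantee, with_grant_option=False):
        if grantor not in self._may_grant():
            raise PermissionError(f"{grantor} may not grant this privilege")
        self.grants.append((grantor, grantee, with_grant_option))

    def revoke(self, grantor, grantee):
        self.grants = [g for g in self.grants if g[:2] != (grantor, grantee)]
        # Cascade: drop every grant whose grantor can no longer grant.
        allowed = self._may_grant()
        self.grants = [g for g in self.grants if g[0] in allowed]

    def holders(self):
        return {self.owner} | {grantee for _, grantee, _ in self.grants}

privilege = RelationPrivilege(owner="A")
privilege.grant("A", "B", with_grant_option=True)
privilege.grant("B", "C", with_grant_option=True)
privilege.grant("C", "D")                      # D cannot pass it on
holders_before = privilege.holders()

try:
    privilege.grant("D", "E")
    d_could_grant = True
except PermissionError:
    d_could_grant = False

privilege.revoke("A", "B")                     # cascades to C and D
holders_after = privilege.holders()
```

Checking reachability from the owner, instead of just asking whether an account still holds some grant, keeps two accounts that granted the privilege to each other from keeping it alive after the original grant is revoked.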

Physical Storage Structure of DBMS:

The physical design of the database specifies the physical configuration of the database on the storage media. This includes detailed specification of data elements, data types, indexing options and other parameters residing in the DBMS data dictionary. It is the detailed design of a system that includes modules & the database's hardware & software specifications of the system. Physical structures are those that can be seen and operated on from the operating system, such as the physical files that store data on a disk.

• Basic Storage Concepts (Hard Disk)

• disk access time = seek time + rotational delay

• disk access times are much slower than access to main memory.

• the overriding DBMS performance objective is to minimise the number of disk accesses (disk I/Os)

Indexing:



An index is a data structure allowing a DBMS to locate particular records more quickly and hence speed up queries. A book index has index terms (stored in alphabetic order), each with a page number. A database index (on a particular attribute) has attribute values (stored in order), each with a memory address. An index gives direct access to a record and prevents having to scan every record sequentially to find the one required.

• Using SUPPLIER(Supp#, SName, SCity)

Consider the query: get all the suppliers in a certain city (e.g. London).

2 possible strategies:

a. Search the entire supplier file for records with city 'London'

b. Create an index on cities, access it for 'London' entries and follow the pointer to the corresponding records

SCity Index     Supp#   SName   SCity
Dublin          S1      Smith   London
London          S2      Jones   Paris
London          S3      Brown   Paris
Paris           S4      Clark   London
Paris           S5      Ellis   Dublin
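The two strategies can be compared in a short Python sketch over the SUPPLIER rows. The list position of each record plays the role of the pointer (memory address), and the function names are illustrative assumptions.

```python
from collections import defaultdict

SUPPLIERS = [
    ("S1", "Smith", "London"),
    ("S2", "Jones", "Paris"),
    ("S3", "Brown", "Paris"),
    ("S4", "Clark", "London"),
    ("S5", "Ellis", "Dublin"),
]

def scan_for_city(city):
    """Strategy a: read every supplier record and test its city."""
    return [s for s in SUPPLIERS if s[2] == city]

def build_city_index(suppliers):
    """Strategy b, step 1: map each city to the positions of its records."""
    index = defaultdict(list)
    for position, (_, _, city) in enumerate(suppliers):
        index[city].append(position)
    return dict(sorted(index.items()))     # keep index entries in city order

CITY_INDEX = build_city_index(SUPPLIERS)

def lookup_by_index(city):
    """Strategy b, step 2: follow the pointers stored under the city."""
    return [SUPPLIERS[position] for position in CITY_INDEX.get(city, [])]

london_by_scan = scan_for_city("London")
london_by_index = lookup_by_index("London")
```

Both strategies return the same suppliers; the index version touches only the two matching records, while the scan reads all five.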

4. What is a relationship type? Explain the differences among a relationship instance, a relationship type, and a relationship set.



Ans:4

There are three types of relationships:

1) One to one

2) One to many

3) Many to many

Say we have table1 and table2

For a one-to-one relationship, a record (row) in table1 has at most one matching record in table2, i.e. it must never have two or more matching records in table2.

For one-to-many, a record in table1 can have more than one matching record in table2, but not vice versa.

Let's take an example.

Say we have a database which stores information about guys and whom they are dating.

We have two tables in our database, Guys and Girls:

Guys table:
Guy id   Guy name
1        Andrew
2        Bob
3        Craig

Girls table:
Girl id  Girl name
1        Girl1
2        Girl2
3        Girl3

In the above example, Guy id and Girl id are the primary keys of their respective tables.

Say Andrew is dating Girl1, Bob is dating Girl2 and Craig is dating Girl3.

So we have a one-to-one relationship here.

In this case we need to modify the Girls table to have a Guy id foreign key in it:

Girl id  Girl name  Guy id
1        Girl1      1
2        Girl2      2
3        Girl3      3

Now let's say one guy has started dating more than one girl,

i.e. Andrew is dating Girl1 and a new girl, Girl4.

That takes us to a one-to-many relationship from the Guys table to the Girls table.

To accommodate this change we can modify our Girls table like this:

Girl id  Girl name  Guy id
1        Girl1      1
2        Girl2      2
3        Girl3      3
4        Girl4      1
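The one-to-many design above can be sketched with Python's built-in sqlite3 module. The table and column names (guys, girls, guy_id and so on) are assumed spellings of the names used in the example; the foreign key lives on the "many" side, in the girls table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute(
    "CREATE TABLE guys (guy_id INTEGER PRIMARY KEY, guy_name TEXT NOT NULL)"
)
# The foreign key sits on the 'many' side: each girl row names one guy.
conn.execute(
    """CREATE TABLE girls (
           girl_id   INTEGER PRIMARY KEY,
           girl_name TEXT NOT NULL,
           guy_id    INTEGER REFERENCES guys (guy_id)
       )"""
)

conn.executemany(
    "INSERT INTO guys VALUES (?, ?)", [(1, "Andrew"), (2, "Bob"), (3, "Craig")]
)
conn.executemany(
    "INSERT INTO girls VALUES (?, ?, ?)",
    [(1, "Girl1", 1), (2, "Girl2", 2), (3, "Girl3", 3), (4, "Girl4", 1)],
)

# How many girls is each guy dating?
counts = conn.execute(
    """SELECT g.guy_name, COUNT(*)
       FROM guys AS g JOIN girls AS r ON r.guy_id = g.guy_id
       GROUP BY g.guy_id
       ORDER BY g.guy_id"""
).fetchall()
print(counts)
```

Because each girl row has a single guy_id column, this layout cannot record a girl dating two guys, which is the limitation the many-to-many case has to address.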

Now say that after a few days the girls have also started dating more than one guy, i.e. we have a many-to-many relationship.

The thing to do here is to add another table, called a junction table, associative table or linking table, which contains the primary key columns of both the Girls and Guys tables.

Let's see it with an example:

Guys table:
Guy id   Guy name
1        Andrew
2        Bob
3        Craig

Girls table:
Girl id  Girl name
1        Girl1
2        Girl2
3        Girl3

Andrew is now dating Girl1 and Girl2, and Girl3 has started dating Bob and Craig,

so our junction table will look like this:

Guy id   Girl id
1        1
1        2
2        2
2        3
3        3

It contains the primary keys of both the Girls and Guys tables.
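A minimal sketch of the junction-table design, again using Python's built-in sqlite3 module. The junction table name dating and the lower-case column names are assumptions made for this illustration; the rows are the five pairs listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE guys (guy_id INTEGER PRIMARY KEY, guy_name TEXT NOT NULL)")
conn.execute("CREATE TABLE girls (girl_id INTEGER PRIMARY KEY, girl_name TEXT NOT NULL)")
# Junction table: one row per (guy, girl) pair, keyed by both foreign keys.
conn.execute(
    """CREATE TABLE dating (
           guy_id  INTEGER NOT NULL REFERENCES guys (guy_id),
           girl_id INTEGER NOT NULL REFERENCES girls (girl_id),
           PRIMARY KEY (guy_id, girl_id)
       )"""
)

conn.executemany("INSERT INTO guys VALUES (?, ?)", [(1, "Andrew"), (2, "Bob"), (3, "Craig")])
conn.executemany("INSERT INTO girls VALUES (?, ?)", [(1, "Girl1"), (2, "Girl2"), (3, "Girl3")])
conn.executemany(
    "INSERT INTO dating VALUES (?, ?)", [(1, 1), (1, 2), (2, 2), (2, 3), (3, 3)]
)

# Who is Andrew dating? (one guy, many girls)
andrews_dates = [
    name
    for (name,) in conn.execute(
        """SELECT r.girl_name
           FROM dating AS d JOIN girls AS r ON r.girl_id = d.girl_id
           WHERE d.guy_id = 1
           ORDER BY r.girl_id"""
    )
]

# Who is dating Girl3? (one girl, many guys)
girl3_dates = [
    name
    for (name,) in conn.execute(
        """SELECT g.guy_name
           FROM dating AS d JOIN guys AS g ON g.guy_id = d.guy_id
           WHERE d.girl_id = 3
           ORDER BY g.guy_id"""
    )
]
print(andrews_dates, girl3_dates)
```

The composite primary key on (guy_id, girl_id) keeps the same pair from being recorded twice, while either id on its own may repeat freely.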

A relationship type R among n entity types E1, E2, …, En defines a set of associations, called a relationship set, among entities from these types. Mathematically, the relationship set R is a set of relationship instances ri, where each ri is an n-tuple of entities (e1, e2, …, en) and each entity ej in ri is a member of entity type Ej, 1 ≤ j ≤ n. Hence, a relationship set is a mathematical relation on E1, E2, …, En; alternatively, it can be defined as a subset of the Cartesian product E1 x E2 x … x En. The relationship type is the definition of the association, whereas the relationship set is the collection of instances that currently exist for that type.

Relationship instance: Each relationship instance ri in R is an association of entities, where the association includes exactly one entity from each participating entity type. Each such relationship instance ri represents the fact that the entities participating in ri are related in some way in the corresponding miniworld situation. For example, consider a relationship type WORKS_FOR between the two entity types EMPLOYEE and DEPARTMENT, which associates each employee with the department for which the employee works. Each relationship instance in the relationship set WORKS_FOR associates one EMPLOYEE entity and one DEPARTMENT entity.
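The set-theoretic definition can be checked with a few lines of Python. The entity identifiers e1, e2, e3, d1 and d2 are hypothetical; the point is only that every relationship instance is one tuple, and the relationship set is a subset of the Cartesian product of the participating entity sets.

```python
from itertools import product

# Hypothetical entity sets for the two participating entity types.
EMPLOYEE = {"e1", "e2", "e3"}
DEPARTMENT = {"d1", "d2"}

# The relationship set for WORKS_FOR: each element is one relationship
# instance, pairing exactly one employee with exactly one department.
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}

# Every possible (employee, department) pair: EMPLOYEE x DEPARTMENT.
all_pairs = set(product(EMPLOYEE, DEPARTMENT))

print(len(all_pairs))          # size of the Cartesian product
print(works_for <= all_pairs)  # is the relationship set a subset of it?
```

In these terms the relationship type corresponds to the declaration "WORKS_FOR relates EMPLOYEE and DEPARTMENT", the relationship set to the variable works_for, and a relationship instance to a single tuple inside it.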



5. Discuss the various reasons that lead to the occurrence of null values in relations in detail.

Ans:5

Each value in a tuple is an atomic value; that is, it is not divisible into components within the framework of the basic relational model. Hence, composite and multivalued attributes are not allowed. This model is sometimes called the flat relational model. Much of the theory behind the relational model was developed with this assumption in mind, which is called the first normal form assumption. Hence, multivalued attributes must be represented by separate relations, and composite attributes are represented only by their simple component attributes in the basic relational model.

An important concept is that of nulls, which are used to represent the values of attributes that may be unknown or may not apply to a tuple. A special value, called null, is used for these cases. In general, we can have several meanings for null values, such as "value unknown," "value exists but is not available," or "attribute does not apply to this tuple." It is possible to devise different codes for different meanings of null values. Incorporating different types of null values into the relational model operations has proven difficult and is outside the scope of our presentation.
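A small sketch, using Python's built-in sqlite3 module, of why nulls complicate relational operations. The person table and its rows are hypothetical: one null stands for "value unknown" and the other for "attribute does not apply", yet the stored value is the same in both cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, office_phone TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [
        ("Asha", "555-0100"),
        ("Ben", None),   # has an office phone, but the number is unknown
        ("Chen", None),  # works remotely: the attribute does not apply
    ],
)

# Comparing with NULL using '=' is never true, so this matches no rows.
eq_null = conn.execute(
    "SELECT name FROM person WHERE office_phone = NULL"
).fetchall()

# IS NULL is the test that actually finds the missing values.
is_null = conn.execute(
    "SELECT name FROM person WHERE office_phone IS NULL ORDER BY name"
).fetchall()

print(eq_null)
print(is_null)
```

Both Ben and Chen come back from the IS NULL query, and nothing in the stored data tells the two meanings apart, which illustrates why distinguishing different kinds of null inside the model is hard.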
