enhanced erd

Enhanced E-R Model and Business Rules: CS263 Lecture 4

Upload: zargham-khan

Post on 20-Jan-2016

DESCRIPTION

erd 10

TRANSCRIPT

Page 1: Enhanced Erd

Enhanced E-R Model and Business Rules

CS263 Lecture 4

Page 2: Enhanced Erd

Introduction

The basic ER model was introduced in the mid 1970s. Since then, business relationships and business data have become more complex

The term Enhanced Entity Relationship (EER) model refers to the extension of the original ER model

Page 3: Enhanced Erd

Supertype/subtype relationships

The most important new modelling construct in the EER model. It allows us to model a general entity type (the supertype) and then subdivide it into several specialised entity types (called subtypes)

Each subtype inherits the attributes of its supertype and may have special attributes of its own

Page 4: Enhanced Erd

Representing supertypes and subtypes

• The supertype is connected with a line to a circle, which in turn is connected by a line to each subtype that has been defined (see Fig.)

• The ‘U’ shaped symbol on each line connecting a subtype to the circle indicates that the subtype is a subset of the supertype, and also indicates the direction of the relationship

• Attributes shared by all the entities are associated with the supertype, whilst attributes that are unique to a particular subtype are associated with that subtype

Page 5: Enhanced Erd

Basic notation for supertype/subtype relationships

Page 6: Enhanced Erd

An example: the EMPLOYEE supertype

Suppose that an organisation has 3 basic types of employees:

Hourly: Employee_Number, Employee_Name, Address, Date_Hired, Hourly_Rate

Salaried: Employee_Number, Employee_Name, Address, Date_Hired, Annual_Salary, Stock_Option

Contract consultants: Employee_Number, Employee_Name, Address, Date_Hired, Contract_Number, Billing_Rate

Page 7: Enhanced Erd

Employee supertype

Many attributes are the same across the 3 types, so we could define a supertype called EMPLOYEE, with subtypes for HOURLY_EMPLOYEE, SALARIED_EMPLOYEE and CONSULTANT (see Fig.)

Page 8: Enhanced Erd

Employee supertype with three subtypes

All employee subtypes will have emp_no., name, address, and date-hired

Each employee subtype will also have its own attributes

Page 9: Enhanced Erd

When to use supertype/subtype relations

Should consider using subtypes when either (or both) of the following conditions are present:

1. There are attributes that apply to some (but not all) of the instances of an entity type

2. The instances of a subtype participate in a relationship unique to that subtype

Both are true in the following Fig., where PATIENT has two subtypes: OUTPATIENT and RESIDENT PATIENT (the primary key is PATIENT_ID)

All patients have an Admit_Date and a Patient_Name

Page 10: Enhanced Erd

Patient example

Every patient is cared for by a RESPONSIBLE_PHYSICIAN who develops a treatment plan for the patient

Each subtype also has unique attributes. Outpatients have a Checkback_Date, whilst residents have a Date_Discharged and a unique relationship that assigns each patient to a bed (this is a mandatory relationship, and each bed may or may not be assigned to a patient)

Page 11: Enhanced Erd

Supertype/subtype relationships of patients

Both outpatients and resident patients are cared for by a responsible physician

Only resident patients are assigned to a bed

Page 12: Enhanced Erd

Generalization and specialization

Generalization = the process of defining a more general entity type from a set of more specialized entity types. BOTTOM-UP

Specialization = the process of defining one or more subtypes of the supertype, and forming supertype/subtype relationships. TOP-DOWN

The following Figs. show generalisation in a situation where we have 3 different entity types: CAR, TRUCK and MOTORCYCLE

In the second Fig. the more general entity type VEHICLE is shown

Page 13: Enhanced Erd

Generalisation

MOTORCYCLE is not shown as it does not satisfy the conditions for a subtype discussed earlier: the only attributes of MOTORCYCLE are those that are common to all vehicles, and there are no attributes specific to motorcycles. Further, MOTORCYCLE does not have a relationship to another entity type

Page 14: Enhanced Erd

Example of generalization

Three entity types: CAR, TRUCK, and MOTORCYCLE

All these types of vehicles have common attributes

Page 15: Enhanced Erd

Generalization to VEHICLE supertype

So we put the shared attributes in a supertype

Note: no subtype for motorcycle, since it has no unique attributes

Page 16: Enhanced Erd

Specialisation

In an example of specialisation, an entity type PART has identifier Part_No and other attributes Description, Unit_Price, Location, Qty_On_Hand, Routing_Number and Supplier (a multivalued attribute as there may be more than one supplier)

Some parts are internally Manufactured Parts whilst others are externally Purchased Parts (some parts are obtained from both sources – when the choice depends on factors such as manufacturing capacity, unit price of the parts etc.)

Page 17: Enhanced Erd

Specialisation

Some attributes apply to all parts regardless of source, others such as Routing_Number depend on the source as they apply only to Manufactured Parts. This suggests that PART should be specialised by defining the subtypes MANUFACTURED_PART and PURCHASED_PART

A new relationship ‘Supplies’ has been created between PURCHASED_PART and SUPPLIER that allows users to more easily associate purchased parts with their suppliers

The attribute Unit_Price is now associated with the relationship ‘Supplies’ so that the price may vary between suppliers

Page 18: Enhanced Erd

Example of specialization

Entity type PART

Applies only to manufactured parts

Applies only to purchased parts

Page 19: Enhanced Erd

Specialization to MANUFACTURED PART and PURCHASED PART

Note: multivalued attribute was replaced by a relationship to another entity

Created 2 subtypes

Page 20: Enhanced Erd

Completeness constraints

Whether an instance of a supertype must also be a member of at least one subtype. Has two possible rules:

• Total Specialization Rule: Yes (double line). In the following Fig. a PATIENT must be either an OUTPATIENT or a RESIDENT PATIENT. Total specialisation is indicated by the double line extending from the PATIENT entity type to the circle

Page 21: Enhanced Erd

Examples of completeness constraints

Total specialization rule

A patient must be either an outpatient or a resident patient

Page 22: Enhanced Erd

Completeness constraints

• Partial Specialization Rule: No (single line) An entity instance of the supertype is allowed not to belong to any subtype.

• In the following Fig. If a VEHICLE is a car it will appear as an instance of CAR, and if a truck as an instance of TRUCK.

• However, if the vehicle is a motorcycle it cannot appear as an instance of any subtype. This example of partial specialisation is specified by the single line from the VEHICLE supertype to the circle

Page 23: Enhanced Erd

Partial specialization rule

A vehicle could be a car, a truck, or neither

Page 24: Enhanced Erd

Disjointness constraint

Whether an instance of a supertype may simultaneously be a member of two (or more) subtypes. Has two possible rules:

• Disjoint Rule: An instance of the supertype can be only ONE of the subtypes. The following Fig. shows that at any one time a PATIENT must be either an outpatient or a resident patient but cannot be both. This is specified by the letter ‘d’ in the circle

The subclass of a patient may change over time, but at any given time a patient is of only one type

Page 25: Enhanced Erd

Disjoint rule

Examples of disjointness constraints

A patient can either be outpatient or resident, but not both

Page 26: Enhanced Erd

Disjointness constraint

• Overlap Rule: An instance of the supertype can simultaneously be a member of more than one of the subtypes. Some PARTs are both Manufactured and Purchased. An instance of PART is a particular part number (Part_No), i.e. a type of part, rather than an individual physical part. Considering Part Number 4000, at a given time there may be 250 of this part on hand, of which 100 are manufactured and 150 are purchased. The overlap is specified by placing an ‘o’ in the circle

• The double line indicates that any part must be either a purchased part or a manufactured part, or it may simultaneously be both of these

Page 27: Enhanced Erd

Overlap rule

A part may be both purchased and manufactured

Page 28: Enhanced Erd

Subtype discriminators

• Attribute of the supertype whose values determine the target subtype(s)

• Disjoint subtypes: a simple attribute with alternative values to indicate the possible subtypes. In the following Fig. a new attribute ‘Employee_Type’ has been added to the supertype to serve as a subtype discriminator. It has 3 values: ‘H’ = Hourly, ‘S’ = Salaried, ‘C’ = Consultant. The attribute is assigned the correct value when a new employee is added.

Page 29: Enhanced Erd

Subtype discriminators

The expression “Employee_Type =” (the LHS of a condition statement) is placed next to the line leading from the supertype to the circle, with the value of the attribute that selects the appropriate subtype placed adjacent to the line leading to that subtype

Page 30: Enhanced Erd

Subtype discriminator (disjoint rule)

A simple attribute with different possible values indicating the subtype
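As a rough illustration of how a disjoint discriminator could be enforced once the model reaches a database, the sketch below uses Python's built-in sqlite3 module. The table layout and the CHECK constraint are assumptions made for this example (the slides only define the attribute and its three values); SQLite stands in for whatever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# EMPLOYEE supertype with a single-valued discriminator column.
# The CHECK constraint is an assumed way of restricting it to H, S or C.
conn.execute("""
    CREATE TABLE EMPLOYEE (
        Employee_Number INTEGER PRIMARY KEY,
        Employee_Name   TEXT NOT NULL,
        Employee_Type   TEXT NOT NULL
            CHECK (Employee_Type IN ('H', 'S', 'C'))  -- Hourly, Salaried, Consultant
    )
""")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Ann', 'H')")

# A value outside the discriminator's domain is rejected.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Bob', 'X')")
    rejected = False
except sqlite3.Error:
    rejected = True

types = [row[0] for row in conn.execute("SELECT Employee_Type FROM EMPLOYEE")]
print(rejected, types)
```

Because the column holds exactly one value per row, an employee can belong to only one subtype at a time, which is what the disjoint rule requires.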

Page 31: Enhanced Erd

Subtype discriminators

• Overlapping: a composite attribute whose subparts pertain to different subtypes. Each subpart contains a boolean value to indicate whether or not the instance belongs to the associated subtype.

• In the following Fig. a new attribute Part_Type has been added to PART. Part_Type is a composite attribute with components ‘Manufactured?’ and ‘Purchased?’. Each of these attributes is a boolean variable, and they can be combined as: Manufactured only = YN, Purchased only = NY, Purchased and manufactured = YY

Page 32: Enhanced Erd

Subtype discriminator (overlap rule)

A composite attribute with sub-attributes indicating “yes” or “no” to determine whether it is of each subtype
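The overlap discriminator can be sketched the same way. Here the composite Part_Type is assumed to be stored as two columns, Manufactured and Purchased, each holding 'Y' or 'N'; the extra CHECK that forbids 'NN' is an assumed way of expressing total specialisation, and SQLite (via Python's sqlite3) is only a stand-in DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PART (
        Part_No      INTEGER PRIMARY KEY,
        Manufactured TEXT NOT NULL CHECK (Manufactured IN ('Y', 'N')),
        Purchased    TEXT NOT NULL CHECK (Purchased IN ('Y', 'N')),
        -- total specialisation: at least one flag must be 'Y'
        CHECK (Manufactured = 'Y' OR Purchased = 'Y')
    )
""")
conn.executemany(
    "INSERT INTO PART VALUES (?, ?, ?)",
    [(1000, 'Y', 'N'), (2000, 'N', 'Y'), (4000, 'Y', 'Y')],  # YN, NY, YY
)

# Parts that belong to both subtypes at once (the overlap case).
both = [r[0] for r in conn.execute(
    "SELECT Part_No FROM PART WHERE Manufactured = 'Y' AND Purchased = 'Y'")]

# 'NN' would mean the part belongs to no subtype, so it is rejected.
try:
    conn.execute("INSERT INTO PART VALUES (5000, 'N', 'N')")
    nn_rejected = False
except sqlite3.Error:
    nn_rejected = True

print(both, nn_rejected)
```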

Page 33: Enhanced Erd

Defining supertype/subtype hierarchies

It is possible for any of the subtypes in these examples to have other subtypes defined on it

A supertype/subtype hierarchy is a hierarchical arrangement of supertypes and subtypes, where each subtype has only one supertype

In modelling the Human resources in a University, the most general entity type would be PERSON (sometimes called the root)

Relevant attributes are SSN (Social Security Number – Identifier), Name, Address, Gender and Date_Of_Birth

Page 34: Enhanced Erd

Defining supertype/subtype hierarchies

Next we define all major subtypes of the root. Here there are 3: EMPLOYEE, STUDENT and ALUMNUS (already graduated)

A person may belong to more than one subtype (such as ALUMNUS and EMPLOYEE) so the overlap rule is used. Overlap allows for any overlap (3-way) so if certain combinations are not allowed a more refined supertype/subtype hierarchy would have to be developed to eliminate these

Page 35: Enhanced Erd

Defining supertype/subtype hierarchies

The next step is to evaluate whether any of the subtypes already defined qualify for further specialisation. Here EMPLOYEE has two subtypes, FACULTY and STAFF

Because there may be types of employee other than Faculty and Staff, the partial specialisation rule is indicated. However, such an employee cannot be both Faculty and Staff at the same time, so the disjoint rule is indicated in the circle

Page 36: Enhanced Erd

Defining supertype/subtype hierarchies

Two subtypes are defined for STUDENT: GRADUATE_STUDENT and UNDERGRADUATE_STUDENT

Notice that total specialisation and the disjoint rule are specified

Page 37: Enhanced Erd

Example of supertype/subtype hierarchy

Page 38: Enhanced Erd

Entity clustering

EER diagrams are difficult to read when there are too many entities and relationships

Solution: group entities and relationships into entity clusters

Entity cluster: set of one or more entity types and associated relationships grouped into a single abstract entity type

Because an entity cluster behaves like an entity type, entity clusters and entity types can be further grouped to form a higher-level entity cluster

Page 39: Enhanced Erd

Entity clustering

Entity clustering is a hierarchical decomposition of a macro-level view of the data model into finer and finer views, eventually resulting in the full, detailed data model. The first of the following Figs. shows the completed data model with dashed lines drawn around possible entity clusters, and the second shows the final result of transforming this into an EER diagram of only entity clusters and relationships

Page 40: Enhanced Erd

Entity clustering

Entity clusters are formed:

1. By abstracting a supertype and its subtypes (see CUSTOMER)

2. By combining directly related entity types and their relationships (SELLING_UNIT, ITEM, MATERIAL, MANUFACTURING)

An entity cluster can also be formed by combining a strong entity and its associated weak entity types (not shown here)

Page 41: Enhanced Erd

Possible entity clusters for Pine Valley Furniture (PVF)

Related groups of entities could become clusters

Page 42: Enhanced Erd

EER diagram of PVF entity clusters

More readable, isn’t it?

Page 43: Enhanced Erd

Business rules

Statements that define or constrain some aspect of the business. We have already seen some (rules about relationships between entity types [cardinality], supertype/subtype relationships, and definitions of attributes and entities). There are 3 main types of business rules:

1. Derivation: rule derived from other knowledge

2. Structural assertion: rule expressing static structure of the organisation

3. Action assertion: rule expressing constraints/control of organisational actions

The following Fig. is an EER diagram that classifies business rules

Page 44: Enhanced Erd

EER depiction of business rules classification

Page 45: Enhanced Erd

Business rules

The ER model in the following Fig. contains 4 entity types: FACULTY, COURSE, SECTION and STUDENT. SECTION is a weak entity type, because it cannot exist without the COURSE entity type

The identifying relationship for SECTION is ‘Is_Scheduled’ and the partial identifier is Section_ID (a composite attribute)

Only selected attributes are shown for the various entity types to simplify the diagram

Page 46: Enhanced Erd

Stating a structural assertion

Four examples are:

1. A course is a module of instruction in a particular subject area. This definition of the term course associates the two terms (module and subject area)

2. Student_Name is an attribute of student

3. A student may register for many sections, and a section may be registered for by many students. This fact states the participation of entity types in a relationship (Is_Registered)

Page 47: Enhanced Erd

Stating a structural assertion

4. A faculty member is an employee of the University. Although not shown in the following Fig., this fact designates a supertype/subtype relationship.

Facts of the types above are called base facts: they are fundamental and cannot be derived from other terms or facts

Page 48: Enhanced Erd

Derived facts

A derived fact is a fact that is derived from business rules using an algorithm or inference

A derived attribute is an example of a derived fact. Two examples of derived facts follow:

1. Student_GPA (Grade Point Average) = Quality_Points / Total_Hours_Taken, where Quality_Points = sum [for all courses attempted] (Credit_Hours * Numerical_Grade)

Student_GPA could be shown in the Fig. as a dashed oval connected to STUDENT
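A small worked example may make the GPA derivation concrete. The sketch below, using Python's sqlite3 module, computes quality points divided by total hours for one student. The COURSE_ATTEMPT table, its columns and the sample grades are hypothetical, and Total_Hours_Taken is assumed to be the sum of Credit_Hours over the attempted courses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table holding one row per course attempted by a student.
conn.execute("""
    CREATE TABLE COURSE_ATTEMPT (
        Student_ID      INTEGER,
        Credit_Hours    INTEGER,
        Numerical_Grade REAL
    )
""")
conn.executemany(
    "INSERT INTO COURSE_ATTEMPT VALUES (?, ?, ?)",
    [(1, 3, 4.0), (1, 4, 3.0), (1, 3, 2.0)],
)

# Quality_Points = sum(Credit_Hours * Numerical_Grade) = 12 + 12 + 6 = 30
# Total_Hours_Taken = sum(Credit_Hours) = 10
gpa = conn.execute("""
    SELECT SUM(Credit_Hours * Numerical_Grade) / SUM(Credit_Hours)
    FROM COURSE_ATTEMPT
    WHERE Student_ID = 1
""").fetchone()[0]

print(gpa)
```

Because the value is computed from stored facts on demand, it never needs to be stored itself, which is why a derived attribute is drawn with a dashed oval.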

Page 49: Enhanced Erd

Derived facts

2. A student is taught by the faculty assigned to the sections for which the student is registered. This fact can be derived by following the Is_Registered relationship from STUDENT to SECTION and then the Is_Assigned relationship from SECTION to FACULTY

Page 50: Enhanced Erd

Stating an action assertion

Action assertions deal with the dynamic aspects of the organisation and impose “must/must not” and “should/should not” constraints on handling data

An action assertion is the property of some business rule (called the anchor object) for a data handling action (such as create, update, delete etc.) and it states how the other business rules (called corresponding objects) act on the anchor object

Page 51: Enhanced Erd

Stating an action assertion

Examples of action assertions are:

A course (anchor object) must have a course name (corresponding object). Here the action is updating the course name property of a course

A student (anchor object) must have a value of 2.0 or greater for their Student_GPA (corresponding object) to graduate (action)

A student cannot register for (the anchor object is the Is_Registered relationship) a section of the course for which there is no qualified faculty (the corresponding object is the Is_Qualified relationship)

Page 52: Enhanced Erd

Types of action assertions

There are 3 ways of classifying action assertions:

1. Based on the type of result from the assertion. There are 3 sub-types:

I. Condition, which states that if something is true then another business rule will apply: ‘if a course has a qualified faculty, then the students can register for a section of that course’

II. Integrity constraint, which states something that must always be true: ‘the date a faculty becomes qualified to teach a course cannot be after the semester in which the faculty is assigned to teach a section of that course’

Page 53: Enhanced Erd

Types of action assertions

III. Authorisation, which states a privilege – ‘only department chairs can qualify a faculty to teach a course’

Page 54: Enhanced Erd

Types of action assertions

2. Based on the form of the assertion. There are 3 types of assertion:

I. Enabler, which if true permits or leads to the existence of the corresponding object: ‘a faculty can be created once the faculty is qualified to teach at least one course’

II. Timer, which enables/disables or creates/deletes an action: ‘when a student has a GPA above 2.0 and total credit hours above 125, then the student may graduate’. Here graduating does not occur because of the timer, but the timer enables the action to occur

Page 55: Enhanced Erd

Types of action assertions

III. Executive, which causes the execution of one or more actions. Can be thought of as a trigger for some action: ‘when a student has a GPA below 2.0, then the student goes on academic probation’

Page 56: Enhanced Erd

Types of action assertions

3. Based on the rigor of the assertion. Looking at action assertions in this way yields 2 types of assertions:

I. Controlling, which state that something must or must not be or happen (the above examples fall into this category)

II. Influencing, which are guidelines or items of interest for which a notification must occur – ‘when the number of students exceeds 90% of the capacity of that section, notify the responsible departmental chair’ – in this situation nothing is controlled (students may continue to register) but management wants to know that a particular condition has occurred

Page 57: Enhanced Erd

Data model segment for class scheduling

Page 58: Enhanced Erd

Representing and enforcing business rules

The modern approach is to declare action assertions at a conceptual level without specifying how the rule will be implemented

Can do this graphically (EER notation) or using structured grammar

Page 59: Enhanced Erd

Business rule 1

‘For a faculty member to be assigned to teach a section of a course, they must be qualified to teach the course for which that section is scheduled’

The anchor object is the relationship Is_Assigned, and we are constraining the assignment of the faculty member to the section. The following Fig. shows a dashed line from Is_Assigned to the action assertion symbol

So in this case there are 2 corresponding objects, represented by the dashed lines from the action assertion symbol to each of the two relationships

Page 60: Enhanced Erd

Business Rule 1: For a faculty member to be assigned to teach a section of a course, the faculty member must be qualified to teach the course for which that section is scheduled

Action assertion

Anchor object

Corresponding object

Corresponding object

In this case, the action assertion is a Restriction

Page 61: Enhanced Erd

Business rule 2

‘For a faculty member to be assigned to teach a section of the course, the faculty member must not be assigned to teach a total of more than 3 course sections’

This rule imposes a limitation on the total number of sections a faculty member may teach at a given time. Since the rule involves a total, it requires a modification of the previous notation.

As shown in the following Fig., the anchor object is again the relationship Is_Assigned

Page 62: Enhanced Erd

Business rule 2

However, in this case the corresponding object is also Is_Assigned – it is a count of the total sections assigned to the faculty member

The letters ‘LIM’ in the action assertion symbol stand for ‘limit’

The arrow leaving this symbol then points to a circle with the letter ‘U’ (upper).

The second circle then contains the number 3, which is the upper limit.

Page 63: Enhanced Erd

Business rule 2

The constraint reads as ‘The corresponding object is a count of the number of sections assigned to the faculty member, which has an upper limit of 3’

If a faculty member is already assigned 3 sections, any transaction that attempts to add another section will be rejected

Page 64: Enhanced Erd

Business Rule 2: For a faculty member to be assigned to teach a section of a course, the faculty member must not be assigned to teach a total of more than three course sections

Action assertion

Anchor object

Corresponding object

In this case, the action assertion is an Upper LIMit

Page 65: Enhanced Erd

Business rules in SQL

You can implement business rules such as those above using the SQL language, and store the rules as part of the database definitions

One way is to use the CREATE ASSERTION statement. For Business rule 2 we could write:

CREATE ASSERTION Overload_Protect CHECK ((SELECT COUNT(*) FROM ASSIGNED WHERE Faculty_ID = '12345') <= 3);
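CREATE ASSERTION is part of the SQL standard, but many widely used DBMSs do not implement it, so in practice a rule like Business rule 2 is often enforced with a trigger instead. The sketch below is one possible version using SQLite through Python's sqlite3 module. The trigger body, the Section_ID column and the sample values are assumptions for illustration, not the lecture's own implementation; unlike the assertion above, it applies the limit to every faculty member rather than to one Faculty_ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ASSIGNED (Faculty_ID TEXT, Section_ID TEXT)")

# Reject an insert if the faculty member already teaches 3 sections.
conn.execute("""
    CREATE TRIGGER Overload_Protect
    BEFORE INSERT ON ASSIGNED
    WHEN (SELECT COUNT(*) FROM ASSIGNED
          WHERE Faculty_ID = NEW.Faculty_ID) >= 3
    BEGIN
        SELECT RAISE(ABORT, 'faculty member already has 3 sections');
    END
""")

for section in ("S1", "S2", "S3"):
    conn.execute("INSERT INTO ASSIGNED VALUES ('12345', ?)", (section,))

# The fourth assignment violates the upper limit and is rejected.
try:
    conn.execute("INSERT INTO ASSIGNED VALUES ('12345', 'S4')")
    fourth_rejected = False
except sqlite3.Error:
    fourth_rejected = True

count = conn.execute("SELECT COUNT(*) FROM ASSIGNED").fetchone()[0]
print(fourth_rejected, count)
```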

Page 66: Enhanced Erd

Schema definition

The internal schema of a relational database can be controlled for processing and storage efficiency. Techniques include:

Choosing to index primary and/or secondary keys to increase the speed of row selection, table joining, and row ordering. Dropping indexes to increase the speed of table updating.

Selecting file organizations for base tables that match the type of processing activity on those tables (e.g. keeping a table sorted by a frequently used reporting sort key)

Page 67: Enhanced Erd

Schema definition

Selecting file organizations for indexes (which are also tables) appropriate to the way the indexes are used - and allocating extra space for an index file so that an index can grow without having to be reorganised.

Clustering data, so that related rows of frequently joined tables are stored close together in secondary storage to minimise retrieval time

Maintaining statistics about tables and their indexes, so that the DBMS can find the most efficient ways to perform various database operations

Page 68: Enhanced Erd

Creating indexes

To speed up random/sequential access to base table data. Although users do not directly refer to indexes when writing SQL commands, the DBMS recognises which existing indexes would improve query performance. In this example an alphabetical index on the CUSTOMER_NAME field in the CUSTOMER_T table is created:

CREATE INDEX NAME_IDX ON CUSTOMER_T(CUSTOMER_NAME);

Page 69: Enhanced Erd

Creating indexes

Indexes may be created or dropped at any time. If data already exists in the key column(s), index population will automatically occur for the existing data

If an index is defined as UNIQUE (CREATE UNIQUE INDEX) and the existing data violate this condition, the index creation will fail

Page 70: Enhanced Erd

Creating indexes

Once an index is created, it will be updated as data are entered, updated or deleted

When we no longer need an index we can use the DROP statement, for example to remove the index on the customer name in the CUSTOMER table:

DROP INDEX NAME_IDX;
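The index statements above can be tried out with the sketch below, which runs them against an in-memory SQLite database from Python. The sample rows, the second index name NAME_UIDX and the use of SQLite's sqlite_master catalogue to list indexes are assumptions for this example; other DBMSs expose index metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER_T (CUSTOMER_ID INTEGER, CUSTOMER_NAME TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMER_T VALUES (?, ?)",
    [(1, "Acme"), (2, "Acme"), (3, "Zenith")],  # note the duplicate name
)

def index_names():
    """List the indexes SQLite currently knows about."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return [r[0] for r in rows]

# The index is populated automatically from the existing rows.
conn.execute("CREATE INDEX NAME_IDX ON CUSTOMER_T(CUSTOMER_NAME)")
after_create = index_names()

# A UNIQUE index cannot be built because 'Acme' appears twice.
try:
    conn.execute("CREATE UNIQUE INDEX NAME_UIDX ON CUSTOMER_T(CUSTOMER_NAME)")
    unique_failed = False
except sqlite3.Error:
    unique_failed = True

conn.execute("DROP INDEX NAME_IDX")
after_drop = index_names()
print(after_create, unique_failed, after_drop)
```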

Page 71: Enhanced Erd

The SELECT statement

Is used for queries on single or multiple tables. It has the following clauses:

SELECT - Lists the columns (and expressions involving columns) from base tables or views to be projected into the table that will be the result of the command

FROM - Identifies the table(s) or view(s) from which columns will be chosen to appear in the result table, and includes the tables or views needed to join tables to process the query

WHERE - Includes the conditions for row selection within a single table or view, and the conditions between tables or views for joining

Page 72: Enhanced Erd

SELECT

The first two are required, and the third is necessary when only certain table rows are to be retrieved, or multiple tables are to be joined

The following Fig. shows how we can display product name and standard price from the PRODUCT view for all products costing less than $275

Page 73: Enhanced Erd

SELECT example

Find products with standard price less than $275

SELECT PRODUCT_NAME, STANDARD_PRICE FROM PRODUCT_V WHERE STANDARD_PRICE < 275;

Comparison Operators in SQL
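To see the query return rows, it can be run against a few made-up products. In this sketch PRODUCT_V is created as an ordinary table standing in for the view, the product names and prices are invented, and an ORDER BY is added only so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A plain table stands in for the PRODUCT_V view; the rows are sample data.
conn.execute("CREATE TABLE PRODUCT_V (PRODUCT_NAME TEXT, STANDARD_PRICE REAL)")
conn.executemany(
    "INSERT INTO PRODUCT_V VALUES (?, ?)",
    [("End Table", 175), ("Coffee Table", 200), ("Computer Desk", 375)],
)

# Only rows satisfying the WHERE condition reach the result table.
cheap = conn.execute("""
    SELECT PRODUCT_NAME, STANDARD_PRICE
    FROM PRODUCT_V
    WHERE STANDARD_PRICE < 275
    ORDER BY PRODUCT_NAME
""").fetchall()

for name, price in cheap:
    print(name, price)
```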

Page 74: Enhanced Erd

DISTINCT and *

If the user does not want to see duplicate rows in the result, SELECT DISTINCT can be used, so SELECT DISTINCT PRODUCT_NAME would display a results table without duplicate rows

SELECT * (where * is a wildcard to indicate all columns) displays all columns from all the tables or views in the FROM clause

Page 75: Enhanced Erd

Using expressions

These are mathematical manipulations of the data in the table

Some are stored functions, such as SUM or AVG

e.g. what is the average standard price of the products in inventory (to be put in the result column AVERAGE)?

SELECT AVG(STANDARD_PRICE) AS AVERAGE FROM PRODUCT_V;

Page 76: Enhanced Erd

Using functions

Functions such as COUNT, MIN, MAX, SUM and AVG of specified columns in the column list of a SELECT command may be used to specify that the resulting answer table is to contain aggregated data instead of row-level data. Using any of these aggregate functions without GROUP BY will give a one-row answer. e.g. using the COUNT aggregate function to find totals, asking how many different items were ordered on order number 1004:

SELECT COUNT(*) FROM ORDER_LINE_V WHERE ORDER_ID = 1004;

Note: without a GROUP BY clause, you can’t mix aggregate functions with single-valued (row-level) columns in the SELECT clause

Page 77: Enhanced Erd

Using functions

It is easy to confuse the functions COUNT(*) and COUNT

COUNT(*) counts all rows selected by a query, regardless of whether any of the rows contain null values

COUNT(column) tallies only those rows that contain a value in that column; it ignores null values

SUM and AVG can only be used with numeric columns

COUNT, COUNT(*), MIN and MAX can be used with any data type. e.g. using MIN on a text column will find the lowest value in the column, the one which is closest to the beginning of the alphabet
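The difference between COUNT(*) and COUNT on a column shows up as soon as a column contains a null. The sketch below uses a hypothetical CUSTOMER_T table with invented rows, one of which has no CITY, and also applies MIN to a text column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER_T (CUSTOMER_NAME TEXT, CITY TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMER_T VALUES (?, ?)",
    [("Baker", "Austin"), ("Adams", None), ("Clark", "Miami")],  # None becomes NULL
)

# COUNT(*) counts rows; COUNT(CITY) skips the row whose CITY is NULL;
# MIN on a text column returns the alphabetically first value.
count_all, count_city, min_name = conn.execute(
    "SELECT COUNT(*), COUNT(CITY), MIN(CUSTOMER_NAME) FROM CUSTOMER_T"
).fetchone()

print(count_all, count_city, min_name)
```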

Page 78: Enhanced Erd

Using wildcards

Wildcards can also be used in the WHERE clause if an exact match is not possible

Here the keyword LIKE is paired with wildcard characters and usually a string containing the characters that are known to be the desired matches

The wildcard character ‘%’ is used to represent any collection of characters

e.g. using LIKE ‘%DESK’ when searching PRODUCT_DESCRIPTION will find all the different kinds of desks

Page 79: Enhanced Erd

Using wildcards

The underscore ‘_’ is used to represent exactly one character

So using LIKE ‘_-drawer’ when searching PRODUCT_NAME will find any products with specified drawers, such as ‘3-drawer’, ‘5-drawer’ etc.
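Both wildcards can be checked against a handful of sample values. In the sketch below the table, its single column and the product names are invented for illustration. One caveat that is specific to SQLite: its LIKE is case-insensitive for ASCII letters by default, so the patterns here are written in the same case as the data to keep the example portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT_V (PRODUCT_DESCRIPTION TEXT)")
conn.executemany(
    "INSERT INTO PRODUCT_V VALUES (?)",
    [("Computer Desk",), ("Writers Desk",), ("End Table",),
     ("3-drawer",), ("5-drawer",), ("10-drawer",)],
)

def matches(pattern):
    """Return the descriptions matching a LIKE pattern, sorted."""
    rows = conn.execute(
        "SELECT PRODUCT_DESCRIPTION FROM PRODUCT_V "
        "WHERE PRODUCT_DESCRIPTION LIKE ? ORDER BY PRODUCT_DESCRIPTION",
        (pattern,),
    )
    return [r[0] for r in rows]

desks = matches("%Desk")       # '%' matches any run of characters
drawers = matches("_-drawer")  # '_' matches exactly one character

print(desks)
print(drawers)
```

Note that ‘10-drawer’ is not matched by the second pattern, because two characters precede the hyphen and ‘_’ stands for exactly one.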

Page 80: Enhanced Erd

Using Boolean operators

AND, OR, and NOT operators are used for customizing conditions in the WHERE clause. NOT is evaluated first, then AND, then OR

The following query lists product name, finish and standard price for all desks and all tables that cost more than $300 in the PRODUCT view:

SELECT PRODUCT_DESCRIPTION, PRODUCT_FINISH, STANDARD_PRICE FROM PRODUCT_V WHERE (PRODUCT_DESCRIPTION LIKE '%Desk' OR PRODUCT_DESCRIPTION LIKE '%Table') AND STANDARD_PRICE > 300;
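The parentheses in a query like this matter because AND is evaluated before OR. The sketch below runs the condition with and without them on a few invented products; PRODUCT_V is a plain table standing in for the view, and the price column is assumed to be STANDARD_PRICE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE PRODUCT_V (
    PRODUCT_DESCRIPTION TEXT, PRODUCT_FINISH TEXT, STANDARD_PRICE REAL)""")
conn.executemany(
    "INSERT INTO PRODUCT_V VALUES (?, ?, ?)",
    [("Computer Desk", "Oak", 250), ("Writers Desk", "Cherry", 325),
     ("End Table", "Cherry", 175), ("Dining Table", "Ash", 800)],
)

def names(where):
    """Run the query with the given WHERE condition and return sorted names."""
    sql = ("SELECT PRODUCT_DESCRIPTION FROM PRODUCT_V WHERE " + where +
           " ORDER BY PRODUCT_DESCRIPTION")
    return [r[0] for r in conn.execute(sql)]

# With parentheses: (desk OR table) AND price > 300.
with_parens = names(
    "(PRODUCT_DESCRIPTION LIKE '%Desk' OR PRODUCT_DESCRIPTION LIKE '%Table') "
    "AND STANDARD_PRICE > 300")

# Without parentheses AND binds first: desk OR (table AND price > 300),
# so every desk is returned regardless of price.
without_parens = names(
    "PRODUCT_DESCRIPTION LIKE '%Desk' OR PRODUCT_DESCRIPTION LIKE '%Table' "
    "AND STANDARD_PRICE > 300")

print(with_parens)
print(without_parens)
```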

Page 81: Enhanced Erd

IN and NOT IN lists

To match a list of values, use IN

e.g. list all customers who live in the warmer American states:

SELECT CUSTOMER_NAME, CITY, STATE FROM CUSTOMER_V WHERE STATE IN ('FL', 'TX', 'CA', 'HI');

Page 82: Enhanced Erd

Sorting results

• ORDER BY - sorts the final result rows in ascending or descending order

• GROUP BY - groups rows in an intermediate results table where the values in those rows are the same for one or more columns

• HAVING - can only be used following GROUP BY and acts as a secondary WHERE clause, returning only those groups which meet a specified condition

Page 83: Enhanced Erd

ORDER BY

e.g. list customer, city and state for all customers in the CUSTOMER view whose address is in Florida, Texas, California or Hawaii. List the customers alphabetically by state, and alphabetically by customer within each state: i.e. sort the results first by STATE, and within a state by CUSTOMER_NAME:

SELECT CUSTOMER_NAME, CITY, STATE FROM CUSTOMER_V WHERE STATE IN ('FL', 'TX', 'CA', 'HI') ORDER BY STATE, CUSTOMER_NAME;

Page 84: Enhanced Erd

GROUP BY

Particularly useful when paired with aggregate functions such as SUM or COUNT

GROUP BY divides a table into subsets (by groups). Then an aggregate function can be used to provide summary information for that group

Scalar aggregate: a single value returned from SQL query using an aggregate function

Vector aggregate: multiple values returned from SQL query with aggregate function (via GROUP BY)

Page 85: Enhanced Erd

GROUP BY

e.g. count the number of customers with addresses in each state to which we deliver:

SELECT STATE, COUNT(STATE)

FROM CUSTOMER_V

GROUP BY STATE;

Page 86: Enhanced Erd

GROUP BY

It is also possible to nest groups within groups; the same logic is used as when sorting multiple items

e.g. count the number of customers with addresses in each city to which we deliver. List the cities by state:

SELECT STATE, CITY, COUNT(CITY) FROM CUSTOMER_V GROUP BY STATE, CITY;

In general, each column referenced in the SELECT statement must be referenced in the GROUP BY clause, unless the column is an argument for an aggregate function included in the SELECT clause
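The two GROUP BY queries can be compared on a small set of customers. In this sketch CUSTOMER_V is a plain table standing in for the view, the four customers are invented, and ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER_V (CUSTOMER_NAME TEXT, CITY TEXT, STATE TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMER_V VALUES (?, ?, ?)",
    [("A", "Miami", "FL"), ("B", "Tampa", "FL"),
     ("C", "Austin", "TX"), ("D", "Miami", "FL")],
)

# One group per state: a vector aggregate with one count per group.
per_state = conn.execute("""
    SELECT STATE, COUNT(STATE)
    FROM CUSTOMER_V
    GROUP BY STATE
    ORDER BY STATE
""").fetchall()

# Nested groups: one group per (state, city) combination.
per_city = conn.execute("""
    SELECT STATE, CITY, COUNT(CITY)
    FROM CUSTOMER_V
    GROUP BY STATE, CITY
    ORDER BY STATE, CITY
""").fetchall()

print(per_state)
print(per_city)
```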

Page 87: Enhanced Erd

HAVING

The HAVING clause acts like a WHERE clause, but it identifies groups that meet a criterion, rather than rows. Therefore we usually see a HAVING clause following a GROUP BY

e.g. find only states with more than one customer:

SELECT STATE, COUNT(STATE) FROM CUSTOMER_V GROUP BY STATE HAVING COUNT(STATE) > 1;

Using WHERE here would not work, because WHERE does not allow aggregates. Also, WHERE qualifies rows, whereas HAVING qualifies groups
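A short sketch shows both halves of this point: HAVING filters the groups, while putting the aggregate in WHERE is rejected. CUSTOMER_V is again a plain table with invented rows, and the exact error raised for the invalid query is SQLite's behaviour; other DBMSs report it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER_V (CUSTOMER_NAME TEXT, CITY TEXT, STATE TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMER_V VALUES (?, ?, ?)",
    [("A", "Miami", "FL"), ("B", "Tampa", "FL"),
     ("C", "Austin", "TX"), ("D", "Miami", "FL")],
)

# HAVING keeps only the groups (states) with more than one customer.
busy_states = conn.execute("""
    SELECT STATE, COUNT(STATE)
    FROM CUSTOMER_V
    GROUP BY STATE
    HAVING COUNT(STATE) > 1
""").fetchall()

# The same condition in WHERE is an error, because WHERE is applied
# to individual rows before any groups exist.
try:
    conn.execute("SELECT STATE FROM CUSTOMER_V WHERE COUNT(STATE) > 1")
    where_failed = False
except sqlite3.Error:
    where_failed = True

print(busy_states, where_failed)
```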

Page 88: Enhanced Erd

Using all 6 clauses

e.g. list the product finish and average standard price for each finish for selected finishes where the average standard price is less than $750:

SELECT PRODUCT_FINISH, AVG(STANDARD_PRICE) FROM PRODUCT_V WHERE PRODUCT_FINISH IN ('Cherry', 'Natural Ash', 'Natural Maple', 'White Ash') GROUP BY PRODUCT_FINISH HAVING AVG(STANDARD_PRICE) < 750 ORDER BY PRODUCT_FINISH;
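Running the six-clause query on a few invented rows shows what each clause contributes. In this sketch PRODUCT_V is a plain table standing in for the view and the prices are made up so that one finish is removed by WHERE and another by HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT_V (PRODUCT_FINISH TEXT, STANDARD_PRICE REAL)")
conn.executemany(
    "INSERT INTO PRODUCT_V VALUES (?, ?)",
    [("Cherry", 500), ("Cherry", 900),             # average 700: kept
     ("Natural Ash", 800), ("Natural Ash", 1000),  # average 900: dropped by HAVING
     ("White Ash", 300),                           # average 300: kept
     ("Oak", 100)],                                # not in the IN list: dropped by WHERE
)

result = conn.execute("""
    SELECT PRODUCT_FINISH, AVG(STANDARD_PRICE)
    FROM PRODUCT_V
    WHERE PRODUCT_FINISH IN ('Cherry', 'Natural Ash', 'Natural Maple', 'White Ash')
    GROUP BY PRODUCT_FINISH
    HAVING AVG(STANDARD_PRICE) < 750
    ORDER BY PRODUCT_FINISH
""").fetchall()

for finish, average in result:
    print(finish, average)
```

Oak would have passed the HAVING test on price alone, but it never reaches that stage because WHERE removes its row before the groups are formed.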

Page 89: Enhanced Erd

Using all 6 clauses

The following Fig. shows the order in which SQL processes the clauses of a statement

Arrows indicate the paths that may or may not be followed

Only the SELECT and FROM clauses are mandatory

The processing order is different from the syntax order

As each clause is processed an intermediate results table is produced that will be used for the next clause (users do not see the intermediate tables, only the final results)

Page 90: Enhanced Erd

SQL statement processing order