CSI 1306 Databases – Part 4 SQL continued and Database Management Systems

Upload: databaseguys

Post on 17-Jun-2015


Page 1: CSI 1306 Databases – Part 4

CSI 1306

Databases – Part 4

SQL continued and

Database Management Systems

Page 2: CSI 1306 Databases – Part 4

Union

Union allows the merging of two Selects
– The number of fields must be the same in both Select statements
• Field types do not matter

Page 3: CSI 1306 Databases – Part 4

Query 16

Write Query 15 as a Union

(select Initials, Surname from STUDENT where Surname like 'Z*')
union
(select Initials, Surname from STUDENT where Surname like '*Z');

Page 4: CSI 1306 Databases – Part 4

Query 17

Find all students and employees whose names begin with Z
– the union operation is the only way if the partial results come from different tables

(select Initials, Surname from STUDENT where Surname like 'Z*')
union
(select Initials, Empname from EMPLOYEE where Empname like 'Z*');

W.J. ZIMMERMAN
G.R. ZENDL
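A minimal sketch of Query 17, run through Python's built-in sqlite3 module rather than the course's own database system. The two-column tables and the non-Z rows are made up for illustration, and SQLite writes the wildcard as `%` where the slides use `*`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT  (Initials TEXT, Surname TEXT);
    CREATE TABLE EMPLOYEE (Initials TEXT, Empname TEXT);
    -- Sample rows; only the two Z names come from the slide.
    INSERT INTO STUDENT  VALUES ('W.J.', 'ZIMMERMAN'), ('C.C.', 'ALLAN');
    INSERT INTO EMPLOYEE VALUES ('G.R.', 'ZENDL'), ('A.B.', 'DIXON');
""")

# Both SELECTs return two fields, so their results can be merged.
union_rows = conn.execute("""
    SELECT Initials, Surname FROM STUDENT  WHERE Surname LIKE 'Z%'
    UNION
    SELECT Initials, Empname FROM EMPLOYEE WHERE Empname LIKE 'Z%'
    ORDER BY 2
""").fetchall()
print(union_rows)
```

The column names of the merged result are taken from the first Select, which is why the employee names appear under Surname.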

Page 5: CSI 1306 Databases – Part 4

Where - In

In will make the comparison to a list of known values. It is a Set operator.

Page 6: CSI 1306 Databases – Part 4

Query 18

Use the Set operator to find information about a series of students

select Studnum, Surname, Initials, Birthdate
from STUDENT
where Studnum in ((86004), (86060), (86609));

86004 RUSSELL F.K.A. 1967/09/21
86060 ALLAN C.C. 1967/01/19
86609 HARRISON G.L. 1965/10/29

Note that this could also be written as:

Studnum = 86004 or Studnum = 86060 or Studnum = 86609
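The equivalence of the two forms can be checked with a small sketch. It uses Python's sqlite3 module and a cut-down STUDENT table (two columns only, which is an assumption made for brevity); the student numbers and surnames are the ones shown on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Studnum INTEGER, Surname TEXT)")
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?)",
    [(86004, "RUSSELL"), (86060, "ALLAN"), (86087, "MOSER"), (86609, "HARRISON")],
)

# Set membership test.
with_in = conn.execute("""
    SELECT Studnum, Surname FROM STUDENT
    WHERE Studnum IN (86004, 86060, 86609)
    ORDER BY Studnum
""").fetchall()

# The same condition spelled out as a chain of comparisons.
with_or = conn.execute("""
    SELECT Studnum, Surname FROM STUDENT
    WHERE Studnum = 86004 OR Studnum = 86060 OR Studnum = 86609
    ORDER BY Studnum
""").fetchall()

print(with_in == with_or)
```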

Page 7: CSI 1306 Databases – Part 4

Query 19

Students from Guelph who take the philosophy course

select Initials, Surname, Studnum
from STUDENT
where Address like '*GUELPH ONT' and
      Studnum in (select Studnum from REGISTER
                  where Course = 'PHIL');

C.D. MOSER 86087

Page 8: CSI 1306 Databases – Part 4

Query 20

Students from Guelph who do not take the psychology course

select Initials, Surname, Studnum
from STUDENT
where Address like '*GUELPH ONT' and
      Studnum not in (select Studnum from REGISTER
                      where Course = 'PSYCH');

C.D. MOSER 86087
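Queries 19 and 20 can be tried side by side in a small sketch using Python's sqlite3 module. Everything except MOSER's number and the course codes is invented sample data (the other students, the street addresses, the registrations), and `%` replaces the `*` wildcard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT  (Studnum INTEGER, Surname TEXT, Address TEXT);
    CREATE TABLE REGISTER (Studnum INTEGER, Course TEXT);
    -- Hypothetical rows apart from MOSER / 86087.
    INSERT INTO STUDENT VALUES
        (86087, 'MOSER', '12 MAIN ST GUELPH ONT'),
        (86001, 'SMITH', '3 ELM ST GUELPH ONT'),
        (86002, 'JONES', '9 KING ST TORONTO ONT');
    INSERT INTO REGISTER VALUES
        (86087, 'PHIL'), (86001, 'PSYCH'), (86002, 'PHIL');
""")

# Query 19 pattern: membership in the set produced by a sub-select.
takes_phil = conn.execute("""
    SELECT Surname, Studnum FROM STUDENT
    WHERE Address LIKE '%GUELPH ONT'
      AND Studnum IN (SELECT Studnum FROM REGISTER WHERE Course = 'PHIL')
""").fetchall()

# Query 20 pattern: the same test negated.
no_psych = conn.execute("""
    SELECT Surname, Studnum FROM STUDENT
    WHERE Address LIKE '%GUELPH ONT'
      AND Studnum NOT IN (SELECT Studnum FROM REGISTER WHERE Course = 'PSYCH')
""").fetchall()

print(takes_phil, no_psych)
```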

Page 9: CSI 1306 Databases – Part 4

Aggregate Functions

Select Aggregate Function
– Will return a single value from a table
– Aggregate Functions are performed on a field
– Standard Functions are:
• Avg() - Arithmetic average of the values
• Count() - Number of records
• First() - Value of the first record in the set
• Last() - Value of the last record in the set
• Min() - Smallest value in the set
• Max() - Largest value in the set
• Sum() - Total of all records
• StDev() - Standard deviation (nonbiased, i.e. sample)
• StDevP() - Standard deviation (biased, i.e. population)
• Var() - Variance (nonbiased, i.e. sample)
• VarP() - Variance (biased, i.e. population)
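A short sketch of the aggregate functions, using Python's sqlite3 module with made-up marks. SQLite provides Avg, Count, Min, Max and Sum out of the box; First, Last, StDev and Var are not built into it, so they are left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MARK (Studnum INTEGER, Course TEXT, Mark REAL)")
conn.executemany(
    "INSERT INTO MARK VALUES (?, ?, ?)",
    # Hypothetical marks: three in CS, one in BIOL.
    [(1, "CS", 60.0), (2, "CS", 70.0), (3, "CS", 80.0), (4, "BIOL", 90.0)],
)

# Each aggregate collapses the selected records into a single value.
avg_cs, count_cs, min_cs, max_cs, sum_cs = conn.execute("""
    SELECT AVG(Mark), COUNT(*), MIN(Mark), MAX(Mark), SUM(Mark)
    FROM MARK
    WHERE Course = 'CS'
""").fetchone()

print(avg_cs, count_cs, min_cs, max_cs, sum_cs)
```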

Page 10: CSI 1306 Databases – Part 4

Query 21

Students registered in 7 or fewer courses
– must use the aggregate COUNT()

select Surname, Studnum
from STUDENT as S
where (select count(*) from REGISTER as R
       where R.Studnum = S.Studnum) <= 7;

GARTON 86232
SWAN 86544
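The sub-select in Query 21 is evaluated once per student because it refers to S.Studnum from the outer query. A sketch with Python's sqlite3 module follows; the registrations are invented and the threshold is lowered from 7 to 2 so that the toy data shows a difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT  (Studnum INTEGER, Surname TEXT);
    CREATE TABLE REGISTER (Studnum INTEGER, Course TEXT);
    INSERT INTO STUDENT  VALUES (86232, 'GARTON'), (86004, 'RUSSELL');
    -- Hypothetical registrations: GARTON has 1 course, RUSSELL has 3.
    INSERT INTO REGISTER VALUES
        (86232, 'CS'),
        (86004, 'CS'), (86004, 'PHIL'), (86004, 'BIOL');
""")

# For each row S of STUDENT, count the REGISTER rows for that student.
light_load = conn.execute("""
    SELECT Surname, Studnum
    FROM STUDENT AS S
    WHERE (SELECT COUNT(*) FROM REGISTER AS R
           WHERE R.Studnum = S.Studnum) <= 2
""").fetchall()

print(light_load)
```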

Page 11: CSI 1306 Databases – Part 4

Query 22

Average mark in the CS course

select avg( Mark ) from MARK where Course = 'CS';

69.988

Page 12: CSI 1306 Databases – Part 4

Query 23

The oldest student's birthday

select min( Birthdate ) from STUDENT;

1961/09/30

Page 13: CSI 1306 Databases – Part 4

Query 24

The youngest student's birthday

select max( Birthdate ) from STUDENT;

1969/10/19

Page 14: CSI 1306 Databases – Part 4

Query 25

The name of the youngest student

select Initials, Surname
from STUDENT
where Birthdate = (select max(Birthdate) from STUDENT);

F.R. BERTRAND

Page 15: CSI 1306 Databases – Part 4

Query 26

Find the students who have handed in an assignment whose mark is more than 25% above the average CS assignment mark

select Initials, Surname
from STUDENT, MARK
where MARK.Studnum = STUDENT.Studnum
  and Mark > 1.25 * (select avg(Mark) from MARK where Course = 'CS');

There are 67 rows in the result.
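The arithmetic behind Query 26 can be followed on a tiny example, sketched here with Python's sqlite3 module. The marks are made up: with CS marks of 40, 60 and 80 the average is 60, the threshold is 1.25 × 60 = 75, and only the mark of 80 exceeds it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (Studnum INTEGER, Surname TEXT);
    CREATE TABLE MARK    (Studnum INTEGER, Course TEXT, Mark REAL);
    INSERT INTO STUDENT VALUES (86004, 'RUSSELL'), (86060, 'ALLAN'), (86609, 'HARRISON');
    -- Hypothetical marks.
    INSERT INTO MARK VALUES (86004, 'CS', 40.0), (86060, 'CS', 60.0), (86609, 'CS', 80.0);
""")

# The threshold on its own, to make the comparison value visible.
threshold = conn.execute(
    "SELECT 1.25 * AVG(Mark) FROM MARK WHERE Course = 'CS'"
).fetchone()[0]

# The full query: join STUDENT to MARK, keep marks above the threshold.
high_marks = conn.execute("""
    SELECT Surname
    FROM STUDENT, MARK
    WHERE MARK.Studnum = STUDENT.Studnum
      AND Mark > 1.25 * (SELECT AVG(Mark) FROM MARK WHERE Course = 'CS')
""").fetchall()

print(threshold, high_marks)
```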

Page 16: CSI 1306 Databases – Part 4

SQL

Data Definition Language
– (create table, drop table, alter table)

Data Manipulation Language
– (insert, delete, update)

Query Language
– (select)

Page 17: CSI 1306 Databases – Part 4

Data Definition Language

Create Table
– To create an empty table
– Create Table Name (Field Type, Field Type…)

Drop Table
– To delete a table
– Drop Table Name

Alter Table
– To add/remove a field from a table
– Alter Table Name add/drop Field Type
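The three data definition statements can be exercised in sequence. This is a sketch using Python's sqlite3 module; the ROOM column names are hypothetical, the `columns` helper is introduced only for this example, and SQLite's type names (TEXT, INTEGER) stand in for the ones used on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def columns(table):
    """Return the field names the database currently records for a table."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# Create Table: an empty table with three fields.
conn.execute("CREATE TABLE ROOM (Roomnum INTEGER, Kind TEXT, Capacity INTEGER)")
cols_created = columns("ROOM")

# Alter Table: add a field to the existing table.
conn.execute("ALTER TABLE ROOM ADD COLUMN Building TEXT")
cols_altered = columns("ROOM")

# Drop Table: the table and its definition are gone.
conn.execute("DROP TABLE ROOM")
cols_dropped = columns("ROOM")

print(cols_created, cols_altered, cols_dropped)
```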

Page 18: CSI 1306 Databases – Part 4

Data Manipulation Language

Insert
– To add records to the end of a table (append)
– Insert Into Name values (Field1, Field2 …)
• Strings are enclosed in single quotes ('')
• Works with the Select statement to add records from another table

Update
– To modify records in a database
– Update Name set Field = Expression
• Also works with a Where clause, as in the Select statement

Page 19: CSI 1306 Databases – Part 4

Data Manipulation Language

Delete
– To remove records from a table
– Delete * from Name
• Deletes all records
• Might help if someone made a mistake
• Note that the Where clause used in the Select statement also works with Delete
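A sketch of Insert, Update and Delete on a small LAB table, using Python's sqlite3 module. SQLite spells the delete statement `DELETE FROM Name` (without the `*`), and the replacement lab code 'CSLAB' is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LAB (Course TEXT, Lab TEXT)")

# Insert: strings are written in single quotes inside SQL text;
# here the values are passed as parameters instead.
conn.executemany(
    "INSERT INTO LAB VALUES (?, ?)",
    [("CHEM", "CHEML"), ("CS", "CSL"), ("PHYS", "PHYSL")],
)

# Update and Delete restricted by a Where clause touch only matching records.
conn.execute("UPDATE LAB SET Lab = 'CSLAB' WHERE Course = 'CS'")
conn.execute("DELETE FROM LAB WHERE Lab = 'PHYSL'")
after_where = conn.execute("SELECT Course, Lab FROM LAB ORDER BY Course").fetchall()

# Delete without a Where clause removes every record but keeps the table.
conn.execute("DELETE FROM LAB")
after_all = conn.execute("SELECT COUNT(*) FROM LAB").fetchone()[0]

print(after_where, after_all)
```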

Page 20: CSI 1306 Databases – Part 4

Databases

Data Files contain the data (records)
Data Definitions describe the different pieces of information (fields) in the data files
Data Schema describes how the information is related and how one is to interpret such relations
A Database is an organized collection of related data files
A Database Management System is software that allows us to create and work with databases

Page 21: CSI 1306 Databases – Part 4

Hierarchy of Data

Bit → Byte → Field → Record → Data file → Database

Physical Representation (how the data is actually stored)
– Bits
– Bytes

Logical Representation (how we, through software, organize, process and interpret the stored data)
– Fields, Records, Data files, Databases

Page 22: CSI 1306 Databases – Part 4

Hierarchy of Data

Bit – 0 or 1 (i.e. a location on a disk which is magnetized in one direction or the other)
Byte – 8 bits (a character)
Field – a collection of bytes (e.g. student number)
Record – a collection of fields (e.g. student number, name, address)
Data file – a collection of records (e.g. all students in a class)
Database – a collection of related data files (e.g. students, professors, classrooms)

Page 23: CSI 1306 Databases – Part 4

Databases

Databases are used in most business applications

– Sales, accounts receivable, inventory

– Purchasing, accounts payable

– Budgeting, payroll, general ledger

– On-line banking

– Mutual fund administration

– Library catalogues

Page 24: CSI 1306 Databases – Part 4

Databases

Building a database is accomplished in several steps using the capabilities of the database management system.

– Define the information to be stored and the relationships between the information

– Build a database schema

– Create an empty database

– Fill (or populate) the database with data

Page 25: CSI 1306 Databases – Part 4

Database Management Software

There are 3 types of data models

– Hierarchical

– Network

– Relational

Each physically structures and records the relationships among records in the database in a different manner

Whereas the relational model uses common fields, the hierarchical and network models use pointers (fields containing addresses of related records)

Page 26: CSI 1306 Databases – Part 4

Database Management Software

However, all maintain data independence
• The format of the database is recorded within the database itself, on disk with the data (and not in the application programs)

Since many applications may process data in the same database, this is important because
• Each application program does not have to define the format of the database
• When the format of the database is changed, application programs do not have to be changed, for example to accommodate the addition of a new field to a record or a new data file to the database
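Data independence can be illustrated with a small sketch in Python's sqlite3 module. The `surnames` function stands in for an application program and the Email field is a hypothetical addition; SQLite keeps the table definition in its own catalogue table, `sqlite_master`, alongside the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Studnum INTEGER, Surname TEXT)")
conn.execute("INSERT INTO STUDENT VALUES (86004, 'RUSSELL')")

def surnames():
    """A stand-in 'application program': it names only the field it needs."""
    return [row[0] for row in conn.execute("SELECT Surname FROM STUDENT")]

before = surnames()

# The format of the database changes: a new field is added to each record.
conn.execute("ALTER TABLE STUDENT ADD COLUMN Email TEXT")

# The application is unchanged and still works.
after = surnames()

# The format is recorded inside the database, not in the application.
stored_schema = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'STUDENT'"
).fetchone()[0]

print(before, after)
print(stored_schema)
```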

Page 27: CSI 1306 Databases – Part 4

Database Management Software

There are 4 classes of users
• A database administrator uses the data definition language supplied by the DBMS to define the data types and definitions for the data files in the database. The administrator develops a data schema to describe how the information is related and how to interpret such relations
• An application programmer uses the data manipulation language supplied by the DBMS to access, update and modify the data in the database
• An end user uses an application program that uses the DBMS software to access data for processing
• An advanced user uses a query language to write queries which are processed by the DBMS software to access the database and provide answers to the queries

Page 28: CSI 1306 Databases – Part 4

General organization of a DBMS

[Diagram. Users: end user, advanced user, application programmer, database administrator. Inputs: compiled application programs, query. DBMS components: data definition language processor, data manipulation language processor, query processor, database manager, file manager. Storage: database schema, data dictionary, data files.]

Page 29: CSI 1306 Databases – Part 4

Other Capabilities of Database Management Software

Efficient access to data
– Databases typically contain millions of records

Controlled access to data
– Who can see what
– Read and/or write access

Concurrent access to data
– Simultaneous access by multiple users / applications
– Locking a record during update

Page 30: CSI 1306 Databases – Part 4

Other Capabilities of Database Management Software

Support for different user views
– Cross section through a database, focusing on a specific task
– End users will see the data according to their individual needs

Support for database recovery
– Techniques for reconstruction of a database after a failure (disk crash, program error, power outage, fire)
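The idea of a user view listed above can be sketched with SQL's Create View statement, run here through Python's sqlite3 module. The view name GUELPH_STUDENTS, the street addresses and the choice of visible fields are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (Studnum INTEGER, Surname TEXT, Address TEXT);
    -- Hypothetical addresses.
    INSERT INTO STUDENT VALUES
        (86087, 'MOSER',   '12 MAIN ST GUELPH ONT'),
        (86004, 'RUSSELL', '5 KING ST TORONTO ONT');

    -- A cross section of STUDENT for one task: Guelph students,
    -- with the address itself hidden from the view's users.
    CREATE VIEW GUELPH_STUDENTS AS
        SELECT Studnum, Surname FROM STUDENT
        WHERE Address LIKE '%GUELPH ONT';
""")

cursor = conn.execute("SELECT * FROM GUELPH_STUDENTS")
view_cols = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_cols, view_rows)
```

The view stores no data of its own; it is evaluated against STUDENT each time it is queried.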

Page 31: CSI 1306 Databases – Part 4

Database Management Software

Relational is the most recent model and the one most widely used in the business world
– It is also the most intuitive
– Examples of relational database software include
• Access, Oracle, Progress, Sybase, DB2, Ingres

Page 32: CSI 1306 Databases – Part 4

Excel vs Access

Where does Excel break down for database modeling?

• Ability to search multiple data files
– Searches are typically limited to one worksheet
• Size of database
– Excel is limited in the number of rows in a worksheet
» 65,000 records vs millions in Access
• Amount of memory required
– A complete Excel worksheet must be loaded into memory
– Access does not load every record into memory at one time
• Speed of execution
– Excel is optimized for data calculations, not data retrieval
• Concurrency
– Excel allows only one user to change data at a time
– Opens as “Read Only” for a second user

Page 33: CSI 1306 Databases – Part 4

Additional Material

Additional SQL Examples

Page 34: CSI 1306 Databases – Part 4

Query 27

Is there a faster query than Query 11?
– This very slow query accesses three tables whose product has 1371648 rows!
– It's better to help the unoptimized SQL by using a temporary table.

Page 35: CSI 1306 Databases – Part 4

Query 27

Create a new table with the two attributes needed for further selection

create table tmp( Course string, Section int );

Page 36: CSI 1306 Databases – Part 4

Query 27

Insert only data about 86030 into the new table

insert into tmp select Course, Section from REGISTER where Studnum = 86030;

Page 37: CSI 1306 Databases – Part 4

Query 27

Since there are 9 rows in tmp,
– the overall size of the product drops from 1371648 to 13824

select E.Empname, A.Course, A.Section
from APPOINTMENT as A, tmp, EMPLOYEE as E
where E.Rank = 'TUTOR'
  and A.Course = tmp.Course
  and A.Section = tmp.Section
  and E.Empnum = A.Empnum;

SEARLE CHEML 2
KRAEMER CSL 1
CAMPBELL PHYSL 4

Page 38: CSI 1306 Databases – Part 4

Query 27

Clean up: remove the table

drop TABLE tmp;
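The four steps of Query 27 can be run end to end on toy data. This is a sketch with Python's sqlite3 module; the row counts, employee numbers and registrations are invented, and the `count` helper exists only for this example. It also compares the size of the three-table product before and after REGISTER is replaced by tmp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE REGISTER    (Studnum INTEGER, Course TEXT, Section INTEGER);
    CREATE TABLE APPOINTMENT (Empnum INTEGER, Course TEXT, Section INTEGER);
    CREATE TABLE EMPLOYEE    (Empnum INTEGER, Empname TEXT, "Rank" TEXT);
    -- Hypothetical sample rows.
    INSERT INTO REGISTER VALUES
        (86030, 'CSL', 1), (86030, 'PHYSL', 4), (86004, 'CSL', 2), (86004, 'CHEML', 2);
    INSERT INTO APPOINTMENT VALUES (1, 'CSL', 1), (2, 'PHYSL', 4), (3, 'CSL', 2);
    INSERT INTO EMPLOYEE VALUES
        (1, 'KRAEMER', 'TUTOR'), (2, 'CAMPBELL', 'TUTOR'), (3, 'SEARLE', 'TUTOR');

    -- Steps 1 and 2: keep only the rows about student 86030.
    CREATE TABLE tmp (Course TEXT, Section INTEGER);
    INSERT INTO tmp SELECT Course, Section FROM REGISTER WHERE Studnum = 86030;
""")

def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# Size of the unrestricted product with REGISTER versus with tmp.
full_product = count("APPOINTMENT") * count("REGISTER") * count("EMPLOYEE")
reduced_product = count("APPOINTMENT") * count("tmp") * count("EMPLOYEE")

# Step 3: the join now runs against the much smaller tmp table.
tutors = conn.execute("""
    SELECT E.Empname, A.Course, A.Section
    FROM APPOINTMENT AS A, tmp, EMPLOYEE AS E
    WHERE E."Rank" = 'TUTOR'
      AND A.Course = tmp.Course
      AND A.Section = tmp.Section
      AND E.Empnum = A.Empnum
    ORDER BY E.Empname
""").fetchall()

# Step 4: clean up.
conn.execute("DROP TABLE tmp")

print(full_product, reduced_product, tutors)
```

A query optimizer may well avoid building the full product on its own; the temporary table matters most when, as the slides assume, the SQL is unoptimized.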

Page 39: CSI 1306 Databases – Part 4

Query 28

Use the method in Query 27 to find out who tutors SCOTT

Page 40: CSI 1306 Databases – Part 4

create table tmp( Course string, Section int );

insert into tmp
select Course, Section
from STUDENT as S, REGISTER as R
where S.Surname = 'SCOTT' and S.Studnum = R.Studnum;

select E.Empname, A.Course, A.Section
from APPOINTMENT as A, tmp, EMPLOYEE as E
where E.Rank = 'TUTOR'
  and A.Course = tmp.Course
  and A.Section = tmp.Section
  and E.Empnum = A.Empnum;

DIXON CHEML 9
O'DONNELL CSL 6
CAMPBELL PHYSL 8

drop TABLE tmp;

Page 41: CSI 1306 Databases – Part 4

Query 29

Query 26 recalculated the average several times. Find a faster approach.

create table tmp( CSavg Double );

insert into tmp select avg( Mark ) from MARK where Course = 'CS';

select Initials, Surname
from STUDENT, MARK
where MARK.Studnum = STUDENT.Studnum
  and Mark > 1.25 * (select max(CSavg) from tmp);

drop table tmp;

Page 42: CSI 1306 Databases – Part 4

Query 30

Calculate the assignment mark for the CS course
– there are five marks and their weights are stored in a separate table
– this problem is best solved in two steps

Page 43: CSI 1306 Databases – Part 4

Query 30

Build a temporary table of marks in CS assignments with weights

create table MKWTCS(Studnum int, Assignnum int, Mark Double, Weighting Double );

insert into MKWTCS
select M.Studnum, M.Assignnum, M.Mark, A.Weighting
from MARK as M, ASSIGNMENT as A
where M.Course = 'CS' and A.Course = 'CS'
  and M.Assignnum = A.Assignnum;

Page 44: CSI 1306 Databases – Part 4

Query 30

Check to see that all students have handed in all assignments
– 0 should be put in for an incomplete
– empty result is

select distinct Studnum
from MKWTCS
where (select count(*) from MKWTCS as WT
       where WT.Studnum = MKWTCS.Studnum) < 5;

Page 45: CSI 1306 Databases – Part 4

Query 30

Calculate the mark

select distinct Studnum,
       (select sum(Mark * Weighting / 100.0) from MKWTCS as WT
        where WT.Studnum = MKWTCS.Studnum)
from MKWTCS;

86004 75.8599 more records

Page 46: CSI 1306 Databases – Part 4

Query 30

Now sort the results

select distinct Studnum,
       (select sum(Mark * Weighting / 100.0) from MKWTCS as WT
        where WT.Studnum = MKWTCS.Studnum)
from MKWTCS
order by Studnum;

86887 83.0

Page 47: CSI 1306 Databases – Part 4

Query 30

Now sort the results by mark

select distinct Studnum,
       (select sum(Mark * Weighting / 100.0) from MKWTCS as WT
        where WT.Studnum = MKWTCS.Studnum)
from MKWTCS
order by 2 desc;

86887 83.0

Page 48: CSI 1306 Databases – Part 4

Query 30

Clean up

drop table MKWTCS;
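A sketch of the weighted-mark calculation from Query 30 using Python's sqlite3 module, with two assignments instead of five and invented marks and weights. It runs the slide's sub-select formulation and, for comparison, a Group By formulation, which is a different technique that the slides do not use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MKWTCS (Studnum INTEGER, Assignnum INTEGER, Mark REAL, Weighting REAL)"
)
conn.executemany(
    "INSERT INTO MKWTCS VALUES (?, ?, ?, ?)",
    # Hypothetical data: assignment 1 weighs 40%, assignment 2 weighs 60%.
    [(86004, 1, 80.0, 40.0), (86004, 2, 70.0, 60.0),
     (86887, 1, 90.0, 40.0), (86887, 2, 80.0, 60.0)],
)

# The slide's formulation: one sub-select per student, sorted by mark.
by_subselect = conn.execute("""
    SELECT DISTINCT Studnum,
           (SELECT SUM(Mark * Weighting / 100.0) FROM MKWTCS AS WT
            WHERE WT.Studnum = MKWTCS.Studnum)
    FROM MKWTCS
    ORDER BY 2 DESC
""").fetchall()

# Alternative formulation: group the rows by student and sum each group.
by_group = conn.execute("""
    SELECT Studnum, SUM(Mark * Weighting / 100.0)
    FROM MKWTCS
    GROUP BY Studnum
    ORDER BY 2 DESC
""").fetchall()

print(by_subselect)
print(by_group)
```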

Page 49: CSI 1306 Databases – Part 4

Query 31

Modify the ROOM table by adding a Building description

alter table ROOM add Building string;

##### THE TABLE ROOM HAS BEEN ALTERED.
##### THE DEFAULT VALUE FOR THE NEW DOMAIN
##### Building IS

select * from ROOM;

1180 OFF 3 ' '

19 More records

Page 50: CSI 1306 Databases – Part 4

Query 32

Create a new course ECON with Lab

Add a record to the LAB table

insert into LAB values ('ECON', 'ECONL');

CONSTRAINT foreign_key(Lab) VIOLATED

Page 51: CSI 1306 Databases – Part 4

Query 32

Add a record to the COURSE table

insert into COURSE values ('ECON', 0.25, 3, 'Econ. Lab-trading (elective)');
insert into LAB values ('ECON', 'ECONL');
select * from LAB;

CHEM CHEML
CS CSL
PHYS PHYSL
ECON ECONL
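The order dependence in Query 32 can be reproduced with a sketch in Python's sqlite3 module. It assumes the constraint ties LAB.Course to COURSE, which the slides do not spell out, and it reduces COURSE to a single field; SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE COURSE (Course TEXT PRIMARY KEY)")
# Assumed constraint: every LAB row must name an existing course.
conn.execute("CREATE TABLE LAB (Course TEXT REFERENCES COURSE(Course), Lab TEXT)")

# First attempt: ECON is not yet in COURSE, so the insert is rejected.
try:
    conn.execute("INSERT INTO LAB VALUES ('ECON', 'ECONL')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Second attempt: add the course first, then the lab.
conn.execute("INSERT INTO COURSE VALUES ('ECON')")
conn.execute("INSERT INTO LAB VALUES ('ECON', 'ECONL')")
labs = conn.execute("SELECT Course, Lab FROM LAB").fetchall()

print(rejected, labs)
```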

Page 52: CSI 1306 Databases – Part 4

Query 33

Remove all the records from the LAB table

delete * from LAB;

select * from LAB;

Page 53: CSI 1306 Databases – Part 4

Query 34

Delete the Physics Lab

delete * from LAB where Lab = 'PHYSL';

select * from LAB;

CHEM CHEML
CS CSL

Page 54: CSI 1306 Databases – Part 4

Query 35

Add 1% to every mark in the table of marks

– the attribute name denotes the old value; the expression after “=” defines the new value

update MARK set Mark = Mark * 1.01;

Page 55: CSI 1306 Databases – Part 4

Query 36

Add 2.2 to a specific mark
– note that the primary key for the table MARK has three attributes, all of them necessary to identify the mark

update MARK set Mark = Mark + 2.2
where Studnum = 86004 and Course = 'BIOL' and Assignnum = 1;

Page 56: CSI 1306 Databases – Part 4

Query 37

Add 5% to the lab pay of every tutor who is paid below the average

create table tmp( Av Double );

insert into tmp select avg( Labpay ) from EMPLOYEE
where Rank = 'TUTOR';

update EMPLOYEE
set Labpay = Labpay * 1.05
where Rank = 'TUTOR' and Labpay < (select max(Av) from tmp);

drop table tmp;
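A sketch of Query 37 on toy data, using Python's sqlite3 module. The names other than the table and field names, and all pay figures, are invented: with tutor pay of 100, 200 and 300 the average is 200, so only the tutor paid 100 gets the 5% raise, and the non-tutor is left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Empname TEXT, "Rank" TEXT, Labpay REAL);
    -- Hypothetical pay figures.
    INSERT INTO EMPLOYEE VALUES
        ('DIXON', 'TUTOR', 100.0), ('SEARLE', 'TUTOR', 200.0),
        ('KRAEMER', 'TUTOR', 300.0), ('ZENDL', 'PROF', 50.0);

    -- Compute the average once and keep it in a one-row table.
    CREATE TABLE tmp (Av REAL);
    INSERT INTO tmp SELECT AVG(Labpay) FROM EMPLOYEE WHERE "Rank" = 'TUTOR';

    -- Raise only the tutors paid below that stored average.
    UPDATE EMPLOYEE
    SET Labpay = Labpay * 1.05
    WHERE "Rank" = 'TUTOR' AND Labpay < (SELECT Av FROM tmp);

    DROP TABLE tmp;
""")

pay = dict(conn.execute("SELECT Empname, Labpay FROM EMPLOYEE"))
print(pay)
```

Storing the average first also keeps the comparison value fixed while the update changes the very column the average was computed from.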

Page 57: CSI 1306 Databases – Part 4

Homework

Page 58: CSI 1306 Databases – Part 4

For the people/picture database develop the SQL queries to

– list all pictures between two dates

– list all pictures in which a specific person appears

– list all pictures between two dates in which a specific person appears

Write SQL queries for queries 1 to 4 from slide 29 of the Databases – Part 2 lecture