CSI 1306
Databases – Part 4
SQL continued and Database Management Systems
Union
Union allows the merging of two Selects
– The number of fields must be the same in both Select statements
• Field types do not matter
Query 16
Write Query 15 as a Union
(select Initials, Surname from STUDENT where Surname like 'Z*')
union
(select Initials, Surname from STUDENT where Surname like '*Z');
Query 17
Find all students and employees whose names begin with Z
– the union operation is the only way if the partial results come from different tables
(select Initials, Surname from STUDENT where Surname like 'Z*')
union
(select Initials, Empname from EMPLOYEE where Empname like 'Z*');
W.J. ZIMMERMAN
G.R. ZENDL
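The union across two tables can be tried out with a minimal sketch in Python's built-in sqlite3 module. The rows are made up for illustration, the tables carry only the two fields the query needs, and SQLite writes the wildcard as % where the slides use the Access-style *.

```python
import sqlite3

# In-memory database with made-up rows; table and field names follow the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (Initials TEXT, Surname TEXT);
    CREATE TABLE EMPLOYEE (Initials TEXT, Empname TEXT);
    INSERT INTO STUDENT VALUES ('W.J.', 'ZIMMERMAN'), ('C.C.', 'ALLAN');
    INSERT INTO EMPLOYEE VALUES ('G.R.', 'ZENDL'), ('A.B.', 'DIXON');
""")

# Both Selects return two fields, so their results can be merged.
# SQLite spells the wildcard % (the slides use the Access form *).
union_rows = conn.execute("""
    SELECT Initials, Surname FROM STUDENT WHERE Surname LIKE 'Z%'
    UNION
    SELECT Initials, Empname FROM EMPLOYEE WHERE Empname LIKE 'Z%'
    ORDER BY 2
""").fetchall()
print(union_rows)
```

The ORDER BY 2 is only there to make the output order predictable; it sorts on the second field of the merged result.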
Where - In
In will make the comparison to a list of known values. It is a Set operator.
Query 18
Use the Set operator to find information about a series of students
select Studnum, Surname, Initials, Birthdate
from STUDENT
where Studnum in ((86004), (86060), (86609));
86004 RUSSELL F.K.A. 1967/09/21
86060 ALLAN C.C. 1967/01/19
86609 HARRISON G.L. 1965/10/29
Note that this could also be written as:
Studnum = 86004 or Studnum = 86060 or Studnum = 86609
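A small check of that equivalence, sketched with Python's sqlite3 module. The table is cut down to two fields and one extra made-up row (86087) is added so that the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (Studnum INTEGER, Surname TEXT);
    INSERT INTO STUDENT VALUES (86004, 'RUSSELL'), (86060, 'ALLAN'),
                               (86087, 'MOSER'), (86609, 'HARRISON');
""")

# Set membership with In ...
with_in = conn.execute(
    "SELECT Studnum, Surname FROM STUDENT "
    "WHERE Studnum IN (86004, 86060, 86609) ORDER BY Studnum"
).fetchall()

# ... and the same condition spelled out as a chain of Or comparisons.
with_or = conn.execute(
    "SELECT Studnum, Surname FROM STUDENT "
    "WHERE Studnum = 86004 OR Studnum = 86060 OR Studnum = 86609 "
    "ORDER BY Studnum"
).fetchall()

print(with_in == with_or)
```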
Query 19
Students from Guelph who take the philosophy course
select Initials, Surname, Studnum
from STUDENT
where Address like '*GUELPH ONT' and
Studnum in (select Studnum from REGISTER
where Course = 'PHIL');
C.D. MOSER 86087
Query 20
Students from Guelph who do not take the psychology course
select Initials, Surname, Studnum
from STUDENT
where Address like '*GUELPH ONT' and
Studnum not in (select Studnum from REGISTER
where Course = 'PSYCH');
C.D. MOSER 86087
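Queries 19 and 20 feed the In list from a nested Select rather than from constants. The sketch below reproduces both with Python's sqlite3 module. The student numbers, addresses and registrations are invented, and % replaces the Access wildcard *.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (Studnum INTEGER, Surname TEXT, Address TEXT);
    CREATE TABLE REGISTER (Studnum INTEGER, Course TEXT);
    INSERT INTO STUDENT VALUES
        (1, 'MOSER', '12 ELM ST GUELPH ONT'),
        (2, 'SWAN',  '4 OAK AVE GUELPH ONT'),
        (3, 'ALLAN', '9 KING ST OTTAWA ONT');
    INSERT INTO REGISTER VALUES (1, 'PHIL'), (2, 'PSYCH'), (3, 'PHIL');
""")

# The inner Select produces the set of student numbers; In tests membership.
takes_phil = conn.execute("""
    SELECT Surname FROM STUDENT
    WHERE Address LIKE '%GUELPH ONT'
      AND Studnum IN (SELECT Studnum FROM REGISTER WHERE Course = 'PHIL')
""").fetchall()

# Not In keeps the students whose number is absent from the inner result.
no_psych = conn.execute("""
    SELECT Surname FROM STUDENT
    WHERE Address LIKE '%GUELPH ONT'
      AND Studnum NOT IN (SELECT Studnum FROM REGISTER WHERE Course = 'PSYCH')
""").fetchall()

print(takes_phil, no_psych)
```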
Aggregate Functions
Select Aggregate Function
– Will return a single value from a table
– Aggregate Functions are performed on a field
– Standard Functions are:
• Avg() - Arithmetic average of the values
• Count() - Number of records
• First() - Value of the first record in the set
• Last() - Value of the last record in the set
• Min() - Smallest value in the set
• Max() - Largest value in the set
• Sum() - Total of all records
• StDev() - Standard deviation (nonbiased)
• StDevP() - Standard deviation (biased)
• Var() - Variance (nonbiased)
• VarP() - Variance (biased)
Query 21
Students registered in 7 or fewer courses
– must use the aggregate COUNT()
select Surname, Studnum
from STUDENT as S
where (select count(*) from REGISTER as R
where R.Studnum = S.Studnum) <= 7;
GARTON 86232
SWAN 86544
Query 22
Average mark in the CS course
select avg( Mark ) from MARK where Course = 'CS';
69.988
Query 23
The oldest student's birthday
select min( Birthdate ) from STUDENT;
1961/09/30
Query 24
The youngest student's birthday
select max( Birthdate ) from STUDENT;
1969/10/19
Query 25
The name of the youngest student
select Initials, Surname
from STUDENT
where Birthdate = (select max(Birthdate) from STUDENT);
F.R. BERTRAND
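Queries 23 to 25 can be rerun on a three-row sample with Python's sqlite3 module, as a rough sketch. The dates are stored as ISO text (an assumption of this example, chosen because such strings sort in calendar order), and since only three students are loaded, the oldest birthday differs from the slide's result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (Surname TEXT, Birthdate TEXT);
    INSERT INTO STUDENT VALUES
        ('RUSSELL',  '1967-09-21'),
        ('HARRISON', '1965-10-29'),
        ('BERTRAND', '1969-10-19');
""")

# Each aggregate collapses the whole table into a single value.
oldest, youngest, how_many = conn.execute(
    "SELECT MIN(Birthdate), MAX(Birthdate), COUNT(*) FROM STUDENT"
).fetchone()

# An aggregate in a nested Select supplies the value to compare against.
youngest_name = conn.execute(
    "SELECT Surname FROM STUDENT "
    "WHERE Birthdate = (SELECT MAX(Birthdate) FROM STUDENT)"
).fetchone()[0]

print(oldest, youngest, how_many, youngest_name)
```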
Query 26
Find the students who have handed in an assignment whose mark is more than 25% above the average mark in the CS course
select Initials, Surname
from STUDENT, MARK
where MARK.Studnum = STUDENT.Studnum and Mark > 1.25 * (select avg(Mark) from MARK where Course = 'CS');
There are 67 rows in the result.
SQL
Data Definition Language
– (create table, drop table, alter table)
Data Manipulation Language
– (insert, delete, update)
Query Language
– (select)
Data Definition Language
Create Table
– To create an empty table
– Create Table Name (Field Type, Field Type…)
Drop Table
– To delete a table
– Drop Table Name
Alter Table
– To add/remove a field from a table
– Alter Table Name add/drop Field Type
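The three statements can be exercised in sequence with Python's sqlite3 module. This is a sketch in SQLite's dialect (it writes ADD COLUMN and uses TEXT/INTEGER type names); the field Term is hypothetical and the helper field_names is introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def field_names(table):
    # PRAGMA table_info returns one row per field; index 1 holds the field name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# Create Table: a new, empty table with two fields.
conn.execute("CREATE TABLE tmp (Course TEXT, Section INTEGER)")
fields_created = field_names("tmp")

# Alter Table: add a (hypothetical) third field to the existing table.
conn.execute("ALTER TABLE tmp ADD COLUMN Term TEXT")
fields_altered = field_names("tmp")

# Drop Table: remove the table and its definition.
conn.execute("DROP TABLE tmp")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(fields_created, fields_altered, tables_left)
```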
Data Manipulation Language
Insert
– To add records to the end of a table (append)
– Insert Into Name values (Field1, Field2 …)
• Strings are written in single quotes, e.g. 'ECON'
• Works with the Select statement to add records from another table
Update
– To modify records in a table
– Update Name set Field = Expression
• Also accepts a Where clause, as in the Select statement
Data Manipulation Language
Delete
– To remove records from a table
– Delete * from Name
• Deletes all records
• Might help if someone made a mistake
• Note that the Where clause from the Select statement also works with Delete
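Insert, Update and Delete can be followed on a two-row table with Python's sqlite3 module. This is a sketch with invented marks; note that SQLite writes DELETE FROM Name without the * that the Access form uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MARK (Studnum INTEGER, Course TEXT, Assignnum INTEGER, Mark REAL)"
)

# Insert: append records; strings go in single quotes.
conn.execute("INSERT INTO MARK VALUES (86004, 'BIOL', 1, 70.0)")
conn.execute("INSERT INTO MARK VALUES (86004, 'CS', 1, 80.0)")

# Update with a Where clause: only the matching record changes.
conn.execute(
    "UPDATE MARK SET Mark = Mark + 2.5 "
    "WHERE Studnum = 86004 AND Course = 'BIOL' AND Assignnum = 1"
)
after_update = conn.execute("SELECT Course, Mark FROM MARK ORDER BY Course").fetchall()

# Delete with a Where clause removes selected records ...
conn.execute("DELETE FROM MARK WHERE Course = 'CS'")
after_delete = conn.execute("SELECT Course, Mark FROM MARK").fetchall()

# ... and without one it empties the table.
conn.execute("DELETE FROM MARK")
records_left = conn.execute("SELECT COUNT(*) FROM MARK").fetchone()[0]

print(after_update, after_delete, records_left)
```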
Databases
Data Files contain the data (records)
Data Definitions describe the different pieces of information (fields) in the data files
Data Schema describes how the information is related and how one is to interpret such relations
A Database is an organized collection of related data files
A Database Management System is software that allows us to create and work with databases
Hierarchy of Data
Bit, Byte, Field, Record, Data file, Database
Physical Representation (how the data is actually stored)
– Bits
– Bytes
Logical Representation (how we, through software, organize, process and interpret the stored data)
– Fields, Records, Data files, Databases
Hierarchy of Data
Bit – 0 or 1 (i.e. a location on a disk which is magnetized in one direction or the other)
Byte – 8 bits (a character)
Field – a collection of bytes (e.g. student number)
Record – a collection of fields (e.g. student number, name, address)
Data file – a collection of records (e.g. all students in a class)
Database – a collection of related data files (e.g. students, professors, classrooms)
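The step from bytes to fields to records can be made concrete with Python's struct module. The layout below (a 4-byte student number followed by a 20-byte surname) is a hypothetical choice for illustration, not the format of any particular database product.

```python
import struct

# Hypothetical fixed-width record layout: two fields packed back to back.
#   I   -> 4-byte unsigned integer (student number)
#   20s -> 20 bytes of text (surname, padded with zero bytes)
RECORD_FORMAT = "<I20s"

# A record is a collection of fields, stored physically as bytes.
record = struct.pack(RECORD_FORMAT, 86004, b"RUSSELL")
record_bytes = len(record)        # bytes in one record
record_bits = record_bytes * 8    # 8 bits per byte

# Reading the bytes back through the same layout recovers the fields.
studnum, raw_name = struct.unpack(RECORD_FORMAT, record)
surname = raw_name.rstrip(b"\x00").decode("ascii")

# A data file is a collection of records.
data_file = [record, struct.pack(RECORD_FORMAT, 86060, b"ALLAN")]

print(record_bytes, record_bits, studnum, surname, len(data_file))
```

The bytes only become a student number and a surname once the layout is applied, which is the difference between the physical and the logical representation.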
Databases
Databases are used in most business applications
– Sales, accounts receivable, inventory
– Purchasing, accounts payable
– Budgeting, payroll, general ledger
– On-line banking
– Mutual fund administration
– Library catalogues
Databases
Building a database is accomplished in several steps using the capabilities of the database management system.
– Define the information to be stored and the relationships between the information
– Build a database schema
– Create an empty database
– Fill (or populate) the database with data
Database Management Software
There are 3 types of data models
– Hierarchical
– Network
– Relational
Each physically structures and records the relationships among records in the database in a different manner
Whereas relational uses common fields, hierarchical and network use pointers (fields containing addresses of related records)
Database Management Software
However, all maintain data independence
• the format of the database is recorded within the database itself, on disk with the data (and not in the application programs)
Since many applications may process data in the same database, this is important because
• Each application program does not have to define the format of the database
• When the format of the database is changed, application programs do not have to be changed, for example to accommodate the addition of a new field to a record or a new data file to the database
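Data independence can be observed in a small sketch with Python's sqlite3 module. SQLite keeps each table's definition in its own catalog table, sqlite_master, and the function application_query below stands in for an application program. The Email field is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Studnum INTEGER, Surname TEXT)")
conn.execute("INSERT INTO STUDENT VALUES (86004, 'RUSSELL')")

def application_query():
    # Stand-in for an application program: it names the fields it needs
    # and leaves the storage format to the DBMS.
    return conn.execute("SELECT Studnum, Surname FROM STUDENT").fetchall()

before = application_query()

# The format is recorded inside the database itself, not in the program.
stored_definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'STUDENT'"
).fetchone()[0]

# Change the format by adding a (hypothetical) new field to the record.
conn.execute("ALTER TABLE STUDENT ADD COLUMN Email TEXT")

# The unchanged application still runs and returns the same answer.
after = application_query()
print(stored_definition, before == after)
```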
Database Management Software
There are 4 classes of users
• A database administrator uses the data definition language supplied by the DBMS to define the data types and definitions for the data files in the database. The administrator develops a data schema to describe how the information is related and how to interpret such relations
• An application programmer uses the data manipulation language supplied by the DBMS to access, update and modify the data in the database
• An end user uses an application program that uses the DBMS software to access data for processing
• An advanced user uses a query language to write queries which are processed by the DBMS software to access the database and provide answers to the queries
General organization of a DBMS
[Figure: diagram of a DBMS. Labels: end user, advanced user, application programmer, database administrator; application program, compiled application program, query; data definition language processor, data manipulation language processor, query processor; database manager; file manager; database schema, data dictionary, data files.]
Other Capabilities of Database Management Software
Efficient access to data
– Databases typically contain millions of records
Controlled access to data
– Who can see what
– Read and/or write access
Concurrent access to data
– Simultaneous access by multiple users / applications
– Locking a record during update
Other Capabilities of Database Management Software
Support for different user views
– Cross section through a database, focusing on a specific task
– End users will see the data according to their individual needs
Support for database recovery
– Techniques for reconstruction of a database after a failure (disk crash, program error, power outage, fire)
Database Management Software
Relational is the most recent model and the one most widely used in the business world
– It is also the most intuitive
– Examples of relational database software include
• Access, Oracle, Progress, Sybase, DB2, Ingres
Excel vs Access
Where does Excel break down for database modeling?
• Ability to search multiple data files
– Searches are typically limited to one worksheet
• Size of database
– Excel is limited in the number of rows in a worksheet
» 65,000 records vs millions in Access
• Amount of memory required
– A complete Excel worksheet must be loaded into memory
– Access does not load every record into memory at one time
• Speed of execution
– Excel is optimized for data calculations, not data retrieval
• Concurrency
– Excel allows only one user to change data at a time
– Opens as “Read Only” for a second user
Additional Material
Additional SQL Examples
Query 27
Is there a faster query than Query 11?
– This very slow query accesses three tables whose product has 1371648 rows!
– It's better to help the unoptimized SQL by using a temporary table.
Query 27
Create a new table with the two attributes needed for further selection
create table tmp( Course string, Section int );
Query 27
Insert only data about 86030 into the new table
insert into tmp select Course, Section from REGISTER where Studnum = 86030;
Query 27
Since there are 9 rows in tmp,
– the overall size of the product drops from 1371648 to 13824
select E.Empname, A.Course, A.Section
from APPOINTMENT as A, tmp, EMPLOYEE as E
where E.Rank = 'TUTOR' and A.Course = tmp.Course and A.Section = tmp.Section and E.Empnum = A.Empnum;
SEARLE CHEML 2
KRAEMER CSL 1
CAMPBELL PHYSL 4
Query 27
Clean up: remove the table
drop TABLE tmp;
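The four steps of Query 27 can be replayed at toy scale with Python's sqlite3 module. All rows and employee numbers below are invented, and the product sizes are simply the table sizes multiplied together, which is what an engine that does no optimization would have to examine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE REGISTER (Studnum INTEGER, Course TEXT, Section INTEGER);
    CREATE TABLE APPOINTMENT (Empnum INTEGER, Course TEXT, Section INTEGER);
    CREATE TABLE EMPLOYEE (Empnum INTEGER, Empname TEXT, Rank TEXT);
    INSERT INTO REGISTER VALUES (86030, 'CSL', 1), (86030, 'PHYSL', 4),
                                (86004, 'CSL', 2), (86060, 'CHEML', 2);
    INSERT INTO APPOINTMENT VALUES (1, 'CSL', 1), (2, 'PHYSL', 4), (3, 'CSL', 2);
    INSERT INTO EMPLOYEE VALUES (1, 'KRAEMER', 'TUTOR'), (2, 'CAMPBELL', 'TUTOR'),
                                (3, 'DIXON', 'TUTOR');

    -- Steps 1 and 2: keep only the rows about student 86030.
    CREATE TABLE tmp (Course TEXT, Section INTEGER);
    INSERT INTO tmp SELECT Course, Section FROM REGISTER WHERE Studnum = 86030;
""")

def size(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# Rows in the unrestricted product versus the product that uses tmp.
full_product = size("REGISTER") * size("APPOINTMENT") * size("EMPLOYEE")
reduced_product = size("tmp") * size("APPOINTMENT") * size("EMPLOYEE")

# Step 3: the same join as on the slide, with tmp in place of REGISTER.
tutors = conn.execute("""
    SELECT E.Empname, A.Course, A.Section
    FROM APPOINTMENT AS A, tmp, EMPLOYEE AS E
    WHERE E.Rank = 'TUTOR' AND A.Course = tmp.Course
      AND A.Section = tmp.Section AND E.Empnum = A.Empnum
    ORDER BY E.Empname
""").fetchall()

# Step 4: clean up.
conn.execute("DROP TABLE tmp")
print(full_product, reduced_product, tutors)
```

Many current database engines filter before joining on their own, so the manual temporary table mainly pays off on systems that, like the one in the slides, do not optimize the query.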
Query 28
Use the method in Query 27 to find out who tutors SCOTT
create table tmp( Course string, Section int );
insert into tmp select Course, Section from STUDENT as S, REGISTER as R where S.Surname = 'SCOTT' and S.Studnum = R.Studnum;
select E.Empname, A.Course, A.Section from APPOINTMENT as A, tmp, EMPLOYEE as E where E.Rank = 'TUTOR' and A.Course = tmp.Course and A.Section = tmp.Section and E.Empnum = A.Empnum;
DIXON CHEML 9
O'DONNELL CSL 6
CAMPBELL PHYSL 8
drop TABLE tmp;
Query 29
Query 26 recalculated the average several times. Find a faster approach.
create table tmp( CSavg Double );
insert into tmp select avg( Mark ) from MARK where Course = 'CS';
select Initials, Surname
from STUDENT, MARK
where MARK.Studnum = STUDENT.Studnum and Mark > 1.25 * (select max(CSavg) from tmp);
drop table tmp;
Query 30
Calculate the assignment mark for the CS course
– there are five marks and their weights are stored in a separate table
– this problem is best solved in two steps
Query 30
Build a temporary table of marks in CS assignments with weights
create table MKWTCS(Studnum int, Assignnum int, Mark Double, Weighting Double );
insert into MKWTCS select M.Studnum, M.Assignnum, M.Mark, A.Weighting from MARK as M, ASSIGNMENT as A where M.Course = 'CS' and A.Course = 'CS' and M.Assignnum = A.Assignnum;
Query 30
Check to see that all students have handed in all assignments
– 0 should be put in for an incomplete
– an empty result means no student is missing a mark
select distinct Studnum from MKWTCS where (select count(*) from MKWTCS as WT where WT.Studnum = MKWTCS.Studnum ) < 5;
Query 30
Calculate the mark
select distinct Studnum, (select sum(Mark * Weighting / 100.0) from MKWTCS as WT where WT.Studnum = MKWTCS.Studnum)
from MKWTCS;
86004 75.8599 more records
Query 30
Now sort the results
select distinct Studnum, (select sum(Mark*Weighting / 100.0) from MKWTCS as WT where WT.Studnum = MKWTCS.Studnum)
from MKWTCS
order by Studnum;
86887 83.0
Query 30
Now sort the results by mark
select distinct Studnum, (select sum(Mark*Weighting / 100.0) from MKWTCS as WT where WT.Studnum = MKWTCS.Studnum)
from MKWTCS
order by 2 desc;
86887 83.0
Query 30
Clean up
drop table MKWTCS;
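The weighted total of Query 30 can be checked by hand on a small case. The sketch below uses Python's sqlite3 module with two assignments of weight 50 each instead of five, invented marks, and a Group By clause in place of the slides' temporary table and nested Select. That is a different technique that should give the same totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MARK (Studnum INTEGER, Course TEXT, Assignnum INTEGER, Mark REAL);
    CREATE TABLE ASSIGNMENT (Course TEXT, Assignnum INTEGER, Weighting REAL);
    INSERT INTO ASSIGNMENT VALUES ('CS', 1, 50), ('CS', 2, 50);
    INSERT INTO MARK VALUES (86004, 'CS', 1, 80), (86004, 'CS', 2, 70),
                            (86887, 'CS', 1, 90), (86887, 'CS', 2, 76);
""")

# Pair every mark with its weight, then add up Mark * Weighting / 100
# separately for each student. Group By replaces the nested Select.
totals = conn.execute("""
    SELECT M.Studnum, SUM(M.Mark * A.Weighting / 100.0) AS Total
    FROM MARK AS M, ASSIGNMENT AS A
    WHERE M.Course = 'CS' AND A.Course = 'CS' AND M.Assignnum = A.Assignnum
    GROUP BY M.Studnum
    ORDER BY Total DESC
""").fetchall()

print(totals)
```

For student 86004 the arithmetic is 80 × 50/100 + 70 × 50/100 = 40 + 35 = 75; for 86887 it is 45 + 38 = 83.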
Query 31
Modify the ROOM table by adding a Building description
alter table ROOM add Building string;
##### THE TABLE ROOM HAS BEEN ALTERED.
##### THE DEFAULT VALUE FOR THE NEW DOMAIN
##### Building IS
select * from ROOM;
1180 OFF 3 ' '
19 More records
Query 32
Create a new course ECON with Lab
Add a record to the LAB table
insert into LAB values ('ECON', 'ECONL');
CONSTRAINT foreign_key(Lab) VIOLATED
Query 32
Add a record to the COURSE table
insert into COURSE values ('ECON', 0.25, 3, 'Econ. Lab-trading (elective)');
insert into LAB values ('ECON', 'ECONL');
select * from LAB;
CHEM CHEML
CS CSL
PHYS PHYSL
ECON ECONL
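The order dependence in Query 32 comes from a foreign key: a LAB record may only name a course that already exists in COURSE. The sketch below shows the same behaviour with Python's sqlite3 module. COURSE is cut down to a single field, the field names are assumptions, and SQLite enforces foreign keys only after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
    CREATE TABLE COURSE (Course TEXT PRIMARY KEY);
    CREATE TABLE LAB (Course TEXT REFERENCES COURSE(Course), Lab TEXT);
""")

# First attempt: the lab refers to a course that does not exist yet.
try:
    conn.execute("INSERT INTO LAB VALUES ('ECON', 'ECONL')")
    first_attempt = "accepted"
except sqlite3.IntegrityError:
    first_attempt = "rejected"

# Second attempt: add the course first, then the lab.
conn.execute("INSERT INTO COURSE VALUES ('ECON')")
conn.execute("INSERT INTO LAB VALUES ('ECON', 'ECONL')")
labs = conn.execute("SELECT Course, Lab FROM LAB").fetchall()

print(first_attempt, labs)
```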
Query 33
Remove all the records from the LAB table
delete * from LAB;
select * from LAB;
Query 34
Delete the Physics Lab
delete * from LAB where Lab = 'PHYSL';
select * from LAB;
CHEM CHEML
CS CSL
Query 35
Add 1% to every mark in the table of marks
– the attribute name denotes the old value; the expression after “=” defines the new value
update MARK set Mark = Mark * 1.01;
Query 36
Add 2.2 to a specific mark
– note that the primary key for the table MARK has three attributes, all of them necessary to identify the mark
update MARK set Mark = Mark + 2.2
where Studnum = 86004 and Course = 'BIOL' and Assignnum = 1;
Query 37
Add 5% to the lab pay of every tutor who is paid below the average
create table tmp( Av Double );
insert into tmp select avg( Labpay ) from EMPLOYEE
where Rank = 'TUTOR';
update EMPLOYEE
set Labpay = Labpay * 1.05
where Rank = 'TUTOR' and Labpay < (select max(Av) from tmp);
drop table tmp;
Homework
For the people/picture database develop the SQL queries to
– list all pictures between two dates
– list all pictures in which a specific person appears
– list all pictures between two dates in which a specific person appears
Write SQL queries for queries 1 to 4 from slide 29 of the Databases – Part 2 lecture