ZEIT2301 Design of Information Systems SQL: Computing Statistics School of Engineering and Information Technology UNSW@ADFA Dr Kathryn Merrick


Post on 02-Jan-2016



Topic 11: SQL Computing Statistics

In this lecture you will learn to use functions in SQL to compute simple statistics on data:

1. Aggregating functions
2. Ordering functions
3. String functions
4. Date functions

Reference: http://www.w3schools.com/sql/

1. Aggregate Functions

Functions that operate on all the values in a single column (or expression) and return a single value:

COUNT – counts the number of values
SUM – returns the total of the values
AVG – returns the average of the values
MIN – returns the minimum value
MAX – returns the maximum value

Used in the SELECT clause (and, for grouped queries, in the HAVING clause).
NOT allowed in the WHERE clause (a very common mistake).
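The WHERE restriction can be checked directly. The following is a minimal sketch, assuming Python's built-in sqlite3 module as the database engine and a cut-down sportClub table with made-up rows; neither is prescribed by the lecture, and other database systems word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down version of sportClub; the rows are hypothetical.
conn.execute("CREATE TABLE sportClub (sport TEXT, annualBudget REAL)")
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?)",
    [("Rugby", 5000.0), ("Chess", 800.0), ("Rowing", 3200.0)],
)

# Allowed: the aggregate appears in the SELECT clause.
max_budget = conn.execute("SELECT MAX(annualBudget) FROM sportClub").fetchone()[0]
print("Largest budget:", max_budget)

# Not allowed: the aggregate appears in the WHERE clause.
try:
    conn.execute("SELECT sport FROM sportClub WHERE annualBudget = MAX(annualBudget)")
    where_rejected = False
except sqlite3.OperationalError as err:
    where_rejected = True
    print("Rejected by the database:", err)
```

WHERE is evaluated one row at a time, before any aggregate value exists, which is why the database refuses the second query.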

COUNT()

sportClub (sport, contactNo, sponsor, president, annualBudget)

How many clubs are there?

SELECT COUNT(*)
FROM sportClub;

* is a special shorthand: the query counts all rows of the table.

How many club presidents are there?

SELECT COUNT(president)
FROM sportClub;

Does not count nulls in the “president” column.

Each query returns a table with one row and one column.
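The difference between COUNT(*) and COUNT(column) only shows up when the column contains nulls. A small sketch, assuming Python's built-in sqlite3 module and three invented clubs, one of which has no president:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sportClub "
    "(sport TEXT, contactNo TEXT, sponsor TEXT, president TEXT, annualBudget REAL)"
)
# Hypothetical rows; None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?, ?, ?, ?)",
    [
        ("Rugby", "555-0101", "Acme", "Lee", 5000.0),
        ("Chess", "555-0102", "Acme", None, 800.0),
        ("Rowing", "555-0103", "Globex", "Kim", 3200.0),
    ],
)

# COUNT(*) counts rows; COUNT(president) counts non-null president values.
club_count = conn.execute("SELECT COUNT(*) FROM sportClub").fetchone()[0]
president_count = conn.execute("SELECT COUNT(president) FROM sportClub").fetchone()[0]
print(club_count, president_count)  # 3 2
```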

COUNT(DISTINCT)

How many different club sponsors are there?

SELECT COUNT(DISTINCT sponsor)
FROM sportClub;

DISTINCT discards duplicates before counting.
(Supported by Oracle and SQL Server but not by Access.)

sportClub (sport, contactNo, sponsor, president, annualBudget)
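To see the effect of DISTINCT, compare it with a plain COUNT on the same column. This sketch assumes Python's built-in sqlite3 module (SQLite also accepts COUNT(DISTINCT ...)) and a cut-down sportClub table with invented sponsors:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sportClub (sport TEXT, sponsor TEXT)")
# Hypothetical rows: two clubs share the sponsor "Acme".
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?)",
    [("Rugby", "Acme"), ("Chess", "Acme"), ("Rowing", "Globex")],
)

all_sponsors = conn.execute("SELECT COUNT(sponsor) FROM sportClub").fetchone()[0]
different_sponsors = conn.execute(
    "SELECT COUNT(DISTINCT sponsor) FROM sportClub"
).fetchone()[0]
print(all_sponsors, different_sponsors)  # 3 2
```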

SUM()

Each club has an annual budget. What is the total budget amount for all clubs?

SELECT SUM(annualBudget)

FROM sportClub;

sportClub (sport, contactNo, sponsor, president, annualBudget )

Query returns a table with one row with one column.

Hint: The NRL Salary Cap for 2011 is $4.3m for the 25 highest paid players at each club. 

AVG(), MIN(), MAX()

Find the average, minimum and maximum of the clubs’ annual budgets

SELECT AVG(annualBudget), MIN(annualBudget), MAX(annualBudget)

FROM sportClub;

Query returns one row with three columns.
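Several aggregates can be computed in one pass over the table. The sketch below assumes Python's built-in sqlite3 module and three made-up budgets, and adds SUM to the three functions in the query above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sportClub (sport TEXT, annualBudget REAL)")
# Hypothetical budgets chosen so the arithmetic is easy to check by hand.
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?)",
    [("Rugby", 5000.0), ("Chess", 800.0), ("Rowing", 3200.0)],
)

# One row comes back, with one column per aggregate.
total, average, smallest, largest = conn.execute(
    "SELECT SUM(annualBudget), AVG(annualBudget), "
    "MIN(annualBudget), MAX(annualBudget) FROM sportClub"
).fetchone()
print(total, average, smallest, largest)  # 9000.0 3000.0 800.0 5000.0
```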

Review: Column Name Aliases

Columns can be renamed in the result table using the AS clause to give more meaningful output.

Also useful to avoid displaying system-generated column names for calculated columns (MS Access uses “Expr1”).

SELECT SUM(annualBudget) AS TotalBudget
FROM sportClub;
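The alias becomes the column heading of the result table. A short sketch, assuming Python's built-in sqlite3 module, where the heading can be read from the cursor's description attribute (the rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sportClub (sport TEXT, annualBudget REAL)")
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?)",
    [("Rugby", 5000.0), ("Chess", 800.0)],  # hypothetical rows
)

cur = conn.execute("SELECT SUM(annualBudget) AS TotalBudget FROM sportClub")
# description holds one entry per result column; item 0 is the column name.
alias_name = cur.description[0][0]
total_budget = cur.fetchone()[0]
print(alias_name, total_budget)  # TotalBudget 5800.0
```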

The Bike Database Revisited

Bike name*   Number of riders*   Centre of mass height
Harley       1                   0.724
Harley       2                   0.775
Honda        1                   0.831
Honda        2                   0.881

Road conditions*   Coefficient of friction
Icy                0.1
Wet                0.5
Dry                0.9

Scenario ID*   Bike name   Number of riders   Road conditions   Can stoppie
1              Harley      1                  Dry               false
2              Harley      2                  Dry               false
3              Honda       1                  Dry               true
4              Honda       2                  Dry               true

Bike name*   Wheelbase
Harley       1.588
Honda        1.458

Aliasing Examples

SELECT MIN(wheelbase) AS minWheelbase FROM Bikes;

SELECT MAX(scenarioID) AS maxScenarioID FROM Scenarios;

SELECT AVG(wheelbase) AS avgWheelbase FROM Bikes;

SELECT COUNT(wheelbase) AS smallWheelbases

FROM Bikes

WHERE wheelbase < 1.5;
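The Bikes queries can be run against the two wheelbase values from the bike database. This is a sketch assuming Python's built-in sqlite3 module; the column name bikeName is a hypothetical stand-in, since only wheelbase is named in the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# bikeName is an assumed column name; wheelbase comes from the queries.
conn.execute("CREATE TABLE Bikes (bikeName TEXT, wheelbase REAL)")
conn.executemany(
    "INSERT INTO Bikes VALUES (?, ?)",
    [("Harley", 1.588), ("Honda", 1.458)],
)

min_wheelbase = conn.execute(
    "SELECT MIN(wheelbase) AS minWheelbase FROM Bikes"
).fetchone()[0]
avg_wheelbase = conn.execute(
    "SELECT AVG(wheelbase) AS avgWheelbase FROM Bikes"
).fetchone()[0]
# WHERE filters the rows first; COUNT then counts what is left.
small_wheelbases = conn.execute(
    "SELECT COUNT(wheelbase) AS smallWheelbases FROM Bikes WHERE wheelbase < 1.5"
).fetchone()[0]
print(min_wheelbase, round(avg_wheelbase, 3), small_wheelbases)
```

Only the Honda has a wheelbase under 1.5, so the count is 1, and the average of the two values is 1.523.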

Aggregating Results

The GROUP BY clause is used in conjunction with the aggregate functions to group the result set by one or more columns.

E.g. to find the total value of all orders placed by each customer, we can use GROUP BY to group the rows by customer:

Orders(orderID, orderDate, orderPrice, customer)

SELECT customer, SUM(orderPrice)
FROM Orders
GROUP BY customer;

Aggregating Results Solution

Orders

orderID   orderDate    orderPrice   customer
1         2008/11/12   1000         Hansen
2         2008/10/23   1600         Nilsen
3         2008/09/02   700          Hansen
4         2008/09/03   300          Hansen
5         2008/08/30   2000         Jensen
6         2008/10/04   100          Nilsen

Query result

customer   SUM(orderPrice)
Hansen     2000
Nilsen     1700
Jensen     2000
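The grouped totals can be reproduced from the six Orders rows. A minimal sketch, assuming Python's built-in sqlite3 module; the column types are assumptions, and the result is collected into a dictionary because SQL does not guarantee the order of groups without an ORDER BY clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders "
    "(orderID INTEGER, orderDate TEXT, orderPrice INTEGER, customer TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

# One output row per distinct customer, each with that customer's total.
totals = dict(
    conn.execute(
        "SELECT customer, SUM(orderPrice) FROM Orders GROUP BY customer"
    ).fetchall()
)
print(totals)
```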

Filtering Groups

Individual rows can be filtered using a WHERE clause, BUT groups must be filtered using a HAVING clause.

E.g. suppose we only want to display customer order totals less than $2000:

SELECT customer, SUM(orderPrice)
FROM Orders
GROUP BY customer
HAVING SUM(orderPrice) < 2000;

customer   SUM(orderPrice)
Nilsen     1700
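The HAVING filter is applied after the groups have been totalled, so it sees the sums rather than the individual orders. A sketch assuming Python's built-in sqlite3 module and a cut-down Orders table holding only the two columns the query needs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (orderPrice INTEGER, customer TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [
        (1000, "Hansen"), (1600, "Nilsen"), (700, "Hansen"),
        (300, "Hansen"), (2000, "Jensen"), (100, "Nilsen"),
    ],
)

# Hansen and Jensen both total exactly 2000, so "< 2000" removes them.
small_totals = conn.execute(
    "SELECT customer, SUM(orderPrice) FROM Orders "
    "GROUP BY customer HAVING SUM(orderPrice) < 2000"
).fetchall()
print(small_totals)  # [('Nilsen', 1700)]
```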

In Class Exercise

What is the result of the following query on the Orders table?

SELECT customer, SUM(orderPrice)
FROM Orders
WHERE customer='Hansen' OR customer='Jensen'
GROUP BY customer
HAVING SUM(orderPrice) > 1500;

orderID   orderDate    orderPrice   customer
1         2008/11/12   1000         Hansen
2         2008/10/23   1600         Nilsen
3         2008/09/02   700          Hansen
4         2008/09/03   300          Hansen
5         2008/08/30   2000         Jensen
6         2008/10/04   100          Nilsen

2. Order Functions

Find the first value of the orderPrice column:

SELECT FIRST(orderPrice)
FROM Orders;

Equivalent to:

SELECT orderPrice
FROM Orders
ORDER BY orderID
LIMIT 1;

Find the last value of the orderPrice column:

SELECT LAST(orderPrice)
FROM Orders;

Equivalent to:

SELECT orderPrice
FROM Orders
ORDER BY orderID DESC
LIMIT 1;

Note: FIRST() and LAST() are MS Access functions; the ORDER BY ... LIMIT 1 form is used in databases such as MySQL.
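The ORDER BY ... LIMIT 1 form can be tried on the Orders data. This sketch assumes Python's built-in sqlite3 module; SQLite has no FIRST() or LAST(), so only the equivalent form is shown, on a cut-down Orders table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (orderID INTEGER, orderPrice INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(1, 1000), (2, 1600), (3, 700), (4, 300), (5, 2000), (6, 100)],
)

# Sort ascending by orderID and keep one row: the "first" order.
first_price = conn.execute(
    "SELECT orderPrice FROM Orders ORDER BY orderID LIMIT 1"
).fetchone()[0]
# Sort descending by orderID and keep one row: the "last" order.
last_price = conn.execute(
    "SELECT orderPrice FROM Orders ORDER BY orderID DESC LIMIT 1"
).fetchone()[0]
print(first_price, last_price)  # 1000 100
```

Stating the sort column explicitly is what makes "first" and "last" well defined, since the rows of a table have no inherent order.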

3. String Functions

Functions that operate on strings (varchars):

UCASE() – convert a string to uppercase
LCASE() – convert a string to lowercase
MID() – extract characters from the middle of a string
LEN() – find the length of a string

UCASE()

Persons

personID   lastName    firstName   address        city
1          Hansen      Ola         Timoteivn 10   Sandnes
2          Svendson    Tove        Borgvn 23      Sandnes
3          Pettersen   Kari        Storgt 20      Stavanger

SELECT UCASE(lastName) AS lastName, firstName
FROM Persons;

lastName    firstName
HANSEN      Ola
SVENDSON    Tove
PETTERSEN   Kari

MID()

SELECT MID(city, 1, 4) AS SmallCity
FROM Persons;

Arguments: column name, start character (counting from 1), number of characters to extract.

SmallCity
Sand
Sand
Stav

LEN()

SELECT LEN(Address) as LengthOfAddress FROM Persons

LengthOfAddress

12

9

9
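Function names for strings vary between database systems. The sketch below assumes Python's built-in sqlite3 module, where the SQLite equivalents of UCASE(), MID() and LEN() are spelled UPPER(), SUBSTR() and LENGTH(); the Persons rows are the ones shown in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons "
    "(personID INTEGER, lastName TEXT, firstName TEXT, address TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# UPPER ~ UCASE, SUBSTR(text, start, length) ~ MID, LENGTH ~ LEN.
string_rows = conn.execute(
    "SELECT UPPER(lastName), SUBSTR(city, 1, 4), LENGTH(address) "
    "FROM Persons ORDER BY personID"
).fetchall()
for row in string_rows:
    print(row)
```

The lengths come out as 12, 9 and 9 because the space inside each address is counted as a character.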

4. Date Functions

Functions for manipulating dates:

NOW() – get the current system date and time

Products

prod_Id   productName   unit     unitPrice
1         Jarlsberg     1000 g   10.45
2         Mascarpone    1000 g   32.56
3         Gorgonzola    1000 g   15.67

SELECT productName, unitPrice, NOW() AS perDate
FROM Products;

productName   unitPrice   perDate
Jarlsberg     10.45       10/7/2008 11:25:02 AM
Mascarpone    32.56       10/7/2008 11:25:02 AM
Gorgonzola    15.67       10/7/2008 11:25:02 AM
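The name of the current-time function also depends on the database. As a sketch, assuming Python's built-in sqlite3 module, SQLite has no NOW(); its datetime('now') function plays the same role and returns the current UTC time as text in the form YYYY-MM-DD HH:MM:SS.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products "
    "(prod_Id INTEGER, productName TEXT, unit TEXT, unitPrice REAL)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "Jarlsberg", "1000 g", 10.45),
        (2, "Mascarpone", "1000 g", 32.56),
        (3, "Gorgonzola", "1000 g", 15.67),
    ],
)

# datetime('now') stands in for NOW(); every row gets a timestamp column.
price_rows = conn.execute(
    "SELECT productName, unitPrice, datetime('now') AS perDate "
    "FROM Products ORDER BY prod_Id"
).fetchall()

# Parsing confirms the text really is a date and time in the stated format.
parsed = datetime.strptime(price_rows[0][2], "%Y-%m-%d %H:%M:%S")
print(price_rows[0][0], price_rows[0][1], parsed)
```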

Summary

After today’s lecture you should be able to write or interpret queries that include:

Aggregating functions
Ordering functions
String functions
Date functions