sql206 sql median

Upload: dan-durso

Post on 21-May-2015

Category:

Technology


DESCRIPTION

How to calculate the median in SQL Server.

TRANSCRIPT

Page 1: SQL206 SQL Median

Parts Median 1

Median

SQL Programming

Median query – How to calculate the median

Page 2: SQL206 SQL Median

Notes on Median Slides

• These slides will be part of our upcoming intermediate or advanced SQL queries course.

• The basic concept of using TOP was found on a tek-tips SQL forum.

• At this time we are using Chris Date’s famous parts table. We will add versions for the bookstore database as well.

• This script has been tested with SQL Server only at this time.

Page 3: SQL206 SQL Median

Contact Information

P.O. Box 6142
Laguna Niguel, CA 92607
949-489-1472
http://[email protected]

Copyright 2001-2011. All rights reserved.

Page 4: SQL206 SQL Median

Median Resources

• SQL scripts will be found on box.net at http://tinyurl.com/SQLScripts

• Slides can be viewed on SlideShare: http://www.slideshare.net/OCDatabases

• Follow up [email protected]

Page 5: SQL206 SQL Median

Assumptions

• It is assumed the student is familiar with how to create a database and how to select it for use, if required.

• These statements are not covered in these slides.

Page 6: SQL206 SQL Median

Business Case

• SQL has a function, AVG, which returns the average (arithmetic mean). SQL Server does not have a corresponding function for the median.

• These slides will show how to calculate the median of a dataset.

– The median is the value in a series above which lie 50% of the values and below which lie the other 50%.

– If there is an even number of values in the series, the median is the average of the two innermost values.

• The median has many uses. One common use is in real estate where the median may give us a better feel for the typical prices paid.
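To make the difference between the mean and the median concrete, here is a small sketch in Python. The sale prices are hypothetical values chosen only for illustration; they do not come from the slides.

```python
from statistics import mean, median

# Hypothetical sale prices (illustrative only, not from the slides).
prices = [250_000, 260_000, 275_000, 290_000, 2_400_000]

avg_price = mean(prices)     # pulled upward by the single very expensive sale
mid_price = median(prices)   # the middle value of the sorted list

print(avg_price)
print(mid_price)
```

One unusually expensive sale moves the mean far above what most buyers paid, while the median stays at the middle sale.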

Page 7: SQL206 SQL Median

Approach

• We will calculate the median by using an SQL select of the top 50 percent of a dataset.

• This will be done twice: once to obtain the record 50% of the way down from the top, and again to find the record 50% of the way up from the bottom.

– If there is an odd number of records (including 1), the same row will be retrieved twice, which is fine.

• We will then average the two values returned.
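The approach above can be sketched outside the database as follows. This is an illustration in Python rather than the SQL used in the course; the function name `median_by_halves` is hypothetical. It relies on the fact that SQL Server rounds the row count for TOP ... PERCENT up to the next whole row.

```python
import math

def median_by_halves(values):
    """Average the last row of the lower half and the last row of the upper half."""
    half = math.ceil(len(values) / 2)            # TOP 50 PERCENT rounds the row count up
    lower = sorted(values)[:half]                # top 50 percent, ascending order
    upper = sorted(values, reverse=True)[:half]  # top 50 percent, descending order
    return (max(lower) + min(upper)) / 2

even_case = median_by_halves([12, 17, 12, 14, 12, 19])  # six values
odd_case = median_by_halves([12, 17, 12, 14, 12])       # five values

print(even_case)
print(odd_case)
```

With an odd number of values, both halves end on the same middle row, so averaging the two picks leaves that value unchanged.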

Page 8: SQL206 SQL Median

Create Table

• We will use Chris Date’s famous parts table.

CREATE TABLE Parts (
  part_nbr VARCHAR(5) NOT NULL PRIMARY KEY
, part_name VARCHAR(50) NOT NULL
, part_color VARCHAR(50) NOT NULL
, part_wgt INTEGER NOT NULL
, city_name VARCHAR(50) NOT NULL
);

Page 9: SQL206 SQL Median

Load Data

• Load the following data and/or experiment with your own values…

INSERT INTO Parts (part_nbr, part_name, part_color, part_wgt, city_name)
VALUES ('p1', 'Nut', 'Red', 12, 'London')
, ('p2', 'Bolt', 'Green', 17, 'Paris')
, ('p3', 'Cam', 'Blue', 12, 'Paris')
, ('p4', 'Screw', 'Red', 14, 'London')
, ('p5', 'Cam', 'Blue', 12, 'Paris')
, ('p6', 'Cog', 'Red', 19, 'London')
;

Page 10: SQL206 SQL Median

Calculate the median

• Union the results of the two SELECT TOP queries, then average the two values.

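-- part_wgt is INTEGER, so AVG returns an integer in SQL Server;
-- use avg(wgt * 1.0) to keep a fractional median.
select avg(wgt) as median
from (
    select max(part_wgt) as wgt
    from (select top 50 percent * from parts order by part_wgt asc) a
    union
    select min(part_wgt)
    from (select top 50 percent * from parts order by part_wgt desc) d
) u;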

Page 11: SQL206 SQL Median

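Results

• With the sample data loaded earlier, the query returns a single row: median = 13. The sorted weights are 12, 12, 12, 14, 17, 19, so the two innermost values are 12 and 14.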

Page 12: SQL206 SQL Median

Explanation

1. Use a subquery to select the top 50 percent of the dataset in ascending order. This is the lower half of the values.

2. Give this subquery an alias (a) and, in the outer query, select the last value in the list with max(part_wgt). Assign the column alias wgt to this value.

3. Use a subquery to select the top 50 percent of the dataset in descending order. This is the upper half of the values.

4. Give this subquery an alias (d) and, in the outer query, select the last value in the list with min(part_wgt).

5. Union the results of the two outer queries into another aliased subquery (u).

6. Select from this subquery. Average the values in the union and assign a new column alias of median.
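For readers without SQL Server at hand, the six steps can be tried with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not the course script: SQLite has no TOP ... PERCENT, so LIMIT with half the row count (rounded up) is swapped in as a stand-in, and the table is cut down to the two columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Parts (part_nbr TEXT PRIMARY KEY, part_wgt INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO Parts (part_nbr, part_wgt) VALUES (?, ?)",
    [("p1", 12), ("p2", 17), ("p3", 12), ("p4", 14), ("p5", 12), ("p6", 19)],
)

# Assumption: stand-in for TOP 50 PERCENT, half the row count rounded up.
row_count = conn.execute("SELECT COUNT(*) FROM Parts").fetchone()[0]
half = (row_count + 1) // 2

query = """
SELECT AVG(wgt) AS median                                              -- step 6
FROM (
    SELECT MAX(part_wgt) AS wgt                                        -- step 2
    FROM (SELECT part_wgt FROM Parts ORDER BY part_wgt ASC LIMIT ?) a  -- step 1
    UNION                                                              -- step 5
    SELECT MIN(part_wgt)                                               -- step 4
    FROM (SELECT part_wgt FROM Parts ORDER BY part_wgt DESC LIMIT ?) d -- step 3
) u
"""
result = conn.execute(query, (half, half)).fetchone()[0]
print(result)
```

SQLite's AVG always returns a floating-point value, whereas SQL Server's AVG over an INTEGER column returns an integer, so the two engines can display the same median differently.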
