sql206 sql median
DESCRIPTION
How to calculate the median in SQL Server.
TRANSCRIPT
Parts Median 1
Median
SQL Programming
Median query – How to calculate the median
Notes on Median Slides
• These slides will be part of our upcoming intermediate or advanced SQL queries course.
• The basic concept of using TOP was found on a tek-tips SQL forum.
• At this time we are using Chris Date’s famous parts table. We will add versions for the bookstore database as well.
• This script has been tested with SQL Server only at this time.
Contact Information
P.O. Box 6142
Laguna Niguel, CA 92607
949-489-1472
http://[email protected]
Copyright 2001-2011. All rights reserved.
Median Resources
• SQL scripts will be found on box.net at http://tinyurl.com/SQLScripts
• Slides can be viewed on SlideShare… http://www.slideshare.net/OCDatabases
• Follow up [email protected]
Assumptions
• It is assumed the student is familiar with how to create a database and how to put it in use if required.
• These statements are not covered in these slides.
Business Case
• SQL has a function, AVG, which returns the average (arithmetic mean). It does not have one for the median.
• These slides will show how to calculate the median of a dataset.
– The median is the value in a series above which lie 50% of the values and below which lie the other 50%.
– If there is an even number of values in the series, it is the average of the two middle values.
• The median has many uses. One common use is in real estate, where the median may give us a better feel for the typical prices paid.
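As a small illustration of why the median can describe typical prices better than the mean, the sketch below averages five made-up sale prices, one of which is an outlier. The derived table `sales`, its column `price`, and the figures themselves are hypothetical and not part of the original slides; the VALUES table constructor assumes SQL Server 2008 or later.

```sql
-- Hypothetical sale prices (in thousands); 900 is an outlier.
-- 1.0 * price forces a decimal average instead of an integer one.
SELECT AVG(1.0 * price) AS mean_price
FROM (VALUES (200), (210), (220), (230), (900)) AS sales(price);
-- The mean is 352, which is higher than four of the five prices.
-- The median is the middle value of the sorted series, 220.
```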
Approach
• We will calculate the median by using an SQL select of the top 50 percent of a dataset.
• This will be done twice: once to obtain the record 50% of the way down from the top, and again to find the record 50% of the way up from the bottom.
– If there is an odd number of records (including 1), the same row will be retrieved twice, which is fine.
• We will then average the two values returned.
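The odd-count case can be checked with a quick sketch. The five-value series and the names `t`, `v`, `lower_middle`, and `upper_middle` are hypothetical, chosen only for illustration; the behaviour relied on is that SQL Server rounds a fractional TOP ... PERCENT row count up to the next whole row.

```sql
-- Hypothetical series 1, 2, 3, 4, 5 (odd count).
-- TOP 50 PERCENT of 5 rows is 2.5 rows, which SQL Server rounds up to 3.
SELECT MAX(v) AS lower_middle          -- rows 1, 2, 3 -> 3
FROM (SELECT TOP 50 PERCENT v
      FROM (VALUES (1), (2), (3), (4), (5)) AS t(v)
      ORDER BY v ASC) AS a;

SELECT MIN(v) AS upper_middle          -- rows 5, 4, 3 -> 3
FROM (SELECT TOP 50 PERCENT v
      FROM (VALUES (1), (2), (3), (4), (5)) AS t(v)
      ORDER BY v DESC) AS d;
```

Both queries land on the same middle row, so averaging the two values simply returns that row's value.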
Create Table
• We will use Chris Date’s famous parts table.
CREATE TABLE Parts (
  part_nbr   VARCHAR(5)  NOT NULL PRIMARY KEY
, part_name  VARCHAR(50) NOT NULL
, part_color VARCHAR(50) NOT NULL
, part_wgt   INTEGER     NOT NULL
, city_name  VARCHAR(50) NOT NULL
);
Load Data
• Load the following data and/or experiment with your own values…
INSERT INTO Parts (part_nbr, part_name, part_color, part_wgt, city_name)
VALUES ('p1', 'Nut', 'Red', 12, 'London')
, ('p2', 'Bolt', 'Green', 17, 'Paris')
, ('p3', 'Cam', 'Blue', 12, 'Paris')
, ('p4', 'Screw', 'Red', 14, 'London')
, ('p5', 'Cam', 'Blue', 12, 'Paris')
, ('p6', 'Cog', 'Red', 19, 'London')
;
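Before calculating anything, it may help to list the loaded weights in order so the expected median can be worked out by hand. This is a minimal check against the Parts rows inserted above, not part of the original script.

```sql
-- List the weights in ascending order to find the middle values by eye.
SELECT part_nbr, part_wgt
FROM Parts
ORDER BY part_wgt ASC, part_nbr ASC;
-- Sorted weights: 12, 12, 12, 14, 17, 19.
-- With six rows, the two middle values are 12 and 14, so the expected median is 13.
```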
Calculate the median
• Union the results of the two SELECT TOP queries, then average the two values.
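-- 1.0 * wgt forces a decimal average; AVG over an INTEGER column truncates in SQL Server
select avg(1.0 * wgt) as median
from (
    select max(part_wgt) as wgt
    from (select top 50 percent *
          from parts
          order by part_wgt asc) a
    union
    select min(part_wgt)
    from (select top 50 percent *
          from parts
          order by part_wgt desc) d
) u;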
Results
Explanation
1. Use a subquery to select the top 50 percent of the dataset in ascending order (the lower half of the values).
2. Use a named outer query (table expression) to select the last, and therefore largest, value from this list. Assign a column alias to the max(value).
3. Use a subquery to select the top 50 percent of the dataset in descending order (the upper half of the values).
4. Use a named outer query (table expression) to select the last, and therefore smallest, value from this list with min(value).
5. Union the results of the two named queries into another named query.
6. Select from this named query. Average the values in the union and assign a new column alias of median.
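To see steps 1 to 4 in isolation, the two branches of the union can be run on their own against the Parts data loaded earlier. This is only a sketch for inspecting the intermediate values; the column aliases `lower_middle` and `upper_middle` are hypothetical names not used in the original script.

```sql
-- Steps 1 and 2: lower half in ascending order (12, 12, 12), then its largest value.
SELECT MAX(part_wgt) AS lower_middle   -- 12
FROM (SELECT TOP 50 PERCENT part_wgt
      FROM Parts
      ORDER BY part_wgt ASC) AS a;

-- Steps 3 and 4: upper half in descending order (19, 17, 14), then its smallest value.
SELECT MIN(part_wgt) AS upper_middle   -- 14
FROM (SELECT TOP 50 PERCENT part_wgt
      FROM Parts
      ORDER BY part_wgt DESC) AS d;
```

Averaging 12 and 14 gives the median of 13. When both branches return the same value, UNION collapses them into a single row, and the average of that one row is still the median.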