Graphical Analysis
Why Graph Data?
Graphical methods
  Require very little training
  Easy to use
  Massive amounts of data can be presented more readily
  Can provide an understanding of the distribution of the data
  May be easier to interpret for individuals with less mathematical background than engineers
Graphical methods
Quantitative data (numerical data)
  Cost of a computer (continuous)
  Number of production defects (discrete)
  Weight of a person (continuous)
  Parts produced this month (discrete)
  Temperature of etch bath (continuous)
Graphical tools
  Line charts
  Histograms
  Scatter charts
Graphical methods
Qualitative data (categorical and attribute)
  Type of equipment (manual, automated, semi-automated)
  Operator (Tom, Nina, Jose)
Graphical tools
  Bar charts
  Pie charts
  Pareto charts
Getting Started
Classify data
  Quantitative vs. qualitative
  Continuous or discrete (quantitative)
Choose the right graphical tool
Choose axes and scales to provide the best “view” of the data
Label graphs to eliminate ambiguity
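The classification step can be sketched as a simple lookup. This is an illustrative Python sketch rather than part of the original slides: the name `suggest_tools` and the dictionary keys are hypothetical, and the tool lists only restate the ones given on the preceding slides.

```python
# Hypothetical helper: maps a data classification to the graphical tools
# named on the slides. All names here are illustrative assumptions.
TOOLS_BY_KIND = {
    "quantitative": ["line chart", "histogram", "scatter chart"],
    "qualitative": ["bar chart", "pie chart", "Pareto chart"],
}


def suggest_tools(kind):
    """Return the candidate graphical tools for a data classification."""
    key = kind.strip().lower()
    if key not in TOOLS_BY_KIND:
        raise ValueError(f"unknown data kind: {kind!r}")
    return list(TOOLS_BY_KIND[key])


# Temperature of an etch bath is quantitative (continuous).
print(suggest_tools("quantitative"))
# Operator name is qualitative (categorical).
print(suggest_tools("qualitative"))
```

The lookup only narrows the choice; picking among the candidates still depends on what the graph needs to show.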
Graphical Analysis
Examples
Bar or Column Graph
Displays frequency of observations that fall into nominal categories
[Figure: bar graph, "Color distribution for a random package of M&Ms". X-axis: Color (brown, red, yellow, green, orange, blue). Y-axis: Qty, 0 to 25.]
Line Chart
Shows trends in data at equal intervals
[Figure: line chart of Scan Time (Seconds), 0 to 4.5, by Performance Category (Max Skew Average, Max Pitch Average, Controlled Scan, Freehand Scan, Bright Light, Normal Light, Low Light). Series: CCD1, CCD2, LR, LCCD, CMOS.]
Graphical methods
Acceptable graph
[Figure: "EDC Warehouse Test Results for Read Time, ALL SYSTEMS". X-axis: RFID System (1 to 8). Y-axis: Read Time (secs/read), 0 to 2. Data labels: 0.64, 0.20, 0.52, 0.81, 0.66, N/A, 1.46, 0.88.]
Graphical methods
Better graph
[Figure: "EDC Warehouse Test Results for Read Time, ALL SYSTEMS". X-axis: RFID System (A to H). Y-axis: Read Time (secs/read), 0 to 2. Data labels: 0.88, 1.46, N/A, 0.66, 0.81, 0.52, 0.20, 0.64.]
Graphical Analysis Details
Always label axes with titles and units
Always use chart titles
Use scales that are appropriate to the range of data being plotted
Use legends only when they add value
Use both points and lines on line graphs only if it is appropriate; do not use both if the data is discrete
Histograms
Histograms are pictorial representations of the distribution of a measured quantity or of counted items. A histogram is a quick way to display the average and the amount of variation present.
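The counting behind a histogram can be sketched in a few lines of Python. This is a minimal sketch assuming equal-width bins; the function name `histogram_counts` and the measurements are hypothetical and made up for illustration, not taken from the slides.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value belongs to the last bin, not a bin past the end.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical line-width measurements in micrometers (made-up numbers).
widths = [1.0, 1.2, 1.9, 2.0, 2.1, 2.2, 2.4, 2.9, 3.0, 4.0]
edges, counts = histogram_counts(widths, 3)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:.2f} to {right:.2f}: {'#' * count}")
```

The tallest bin suggests where the data is centered, and the spread of non-empty bins gives a rough sense of the variation.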
Histogram example
The Pareto principle
Dr. Joseph Juran (of total quality management fame) formulated the Pareto Principle after expanding on the work of Vilfredo Pareto, a nineteenth-century economist and sociologist. The Pareto Principle states that a small number of causes is responsible for a large percentage of the effect, usually with about 20 percent of the causes producing about 80 percent of the effect.
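A Pareto chart is built by sorting the causes from largest to smallest and tracking the cumulative percentage. The sketch below shows that calculation in Python; the function name `pareto_table` and the defect counts are hypothetical, chosen only to illustrate the idea.

```python
def pareto_table(counts_by_cause):
    """Sort causes by count (largest first) and add cumulative percentages."""
    total = sum(counts_by_cause.values())
    ordered = sorted(counts_by_cause.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for cause, count in ordered:
        running += count
        rows.append((cause, count, 100.0 * running / total))
    return rows


# Hypothetical defect counts by cause (made-up numbers for illustration).
defects = {"contamination": 10, "scratch": 50, "other": 4,
           "misalignment": 30, "wrong part": 6}
for cause, count, cumulative in pareto_table(defects):
    print(f"{cause:15s} {count:3d} {cumulative:6.1f}%")
```

In this made-up data the two largest causes account for 80 percent of the defects, which is the kind of concentration the principle describes.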
Pareto example
Histogram Example in Excel
[Figure: "Line Width Histogram". X-axis: Line Width (um), bin labels 0.75, 1.17, 1.59, 2.01, 2.43, 2.85, 3.27, 3.69, 4.12, 4.54, 4.96. Y-axis: Frequency, 0 to 70.]
ENGR 112
Fitting Equations to Data
Introduction
Engineers frequently collect paired data in order to understand
  Characteristics of an object
  Behavior of a system
Relationships between paired data are often developed graphically
Mathematical relationships between paired data can provide additional insight
Regression Analysis
Regression analysis is a mathematical analysis technique used to determine something about the relationship between random variables.
Regression Analysis Goal
To develop a statistical model that can be used to predict the value of one variable based on the value of another
Regression Analysis
Regression models are used primarily for the purpose of prediction
Regression models typically involve
  A dependent or response variable
    Represented as y
  One or more independent or explanatory variables
    Represented as x1, x2, …, xn
Regression Analysis
Our focus?
Models with only one explanatory variable
These models are called simple linear regression models
Regression Analysis
A scatter diagram is used to plot an independent variable vs. a dependent variable
[Figure: scatter diagram, "Mail-Order House: Relationship b/w Weight of Mail vs. No. of Orders". X-axis: Weight of Mail (lbs), 0 to 800. Y-axis: No. of Orders (thousands), 0 to 25.]
Regression Analysis
Remember!!
Relationships between variables can take many forms
Selection of the proper mathematical model is influenced by the distribution of the X and Y values on the scatter diagram
Regression Analysis
[Figure: four scatter diagrams, each plotting Y against X]
Regression Analysis Model
SIMPLE LINEAR REGRESSION MODEL
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$\varepsilon_i$ represents the random error in Y for each observation i that occurs
However, both $\beta_0$ and $\beta_1$ are population parameters
Regression Analysis Model
Since we will be working with samples, the previous model becomes
$\hat{Y}_i = b_0 + b_1 X_i$
Where
  $b_0$ = Y intercept (estimate of $\beta_0$)
    Value of Y when X = 0
  $b_1$ = Slope (estimate of $\beta_1$)
    Expected change in Y per unit change in X
  $\hat{Y}_i$ = Predicted (estimated) value of Y
Regression Analysis Model
What happened with the error term?
Unfortunately, it is not gone. We still have errors in the estimated values:
$e_i = Y_i - \hat{Y}_i$
Regression Analysis
Find the straight line that BEST fits the data
Regression Analysis
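[Figure: "Positive Straight-Line Relationship". Data points plotted against X and Y with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$; $b_0$ marks the intercept, $b_1$ the slope ($\Delta y / \Delta x$), and $e_1$ through $e_5$ the errors between the points and the line.]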
Least Squares Method
Mathematical technique that determines the values of $b_0$ and $b_1$
It does so by minimizing the following expression
$\min \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} [Y_i - (b_0 + b_1 X_i)]^2$
Least Squares Method
Resulting equations
$\sum_{i=1}^{n} Y_i = n b_0 + b_1 \sum_{i=1}^{n} X_i$   (1)
$\sum_{i=1}^{n} X_i Y_i = b_0 \sum_{i=1}^{n} X_i + b_1 \sum_{i=1}^{n} X_i^2$   (2)
Equations (1) and (2) are called the “normal equations”
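The step from the minimization to the normal equations is not shown on the slides. A sketch of the standard derivation, writing the sum of squared errors as $S(b_0, b_1)$ (a symbol introduced here for convenience), sets both partial derivatives to zero:

```latex
\begin{aligned}
S(b_0, b_1) &= \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_i) \right]^2 \\
\frac{\partial S}{\partial b_0} &= -2 \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_i) \right] = 0
  \;\Longrightarrow\; \sum_{i=1}^{n} Y_i = n b_0 + b_1 \sum_{i=1}^{n} X_i \\
\frac{\partial S}{\partial b_1} &= -2 \sum_{i=1}^{n} X_i \left[ Y_i - (b_0 + b_1 X_i) \right] = 0
  \;\Longrightarrow\; \sum_{i=1}^{n} X_i Y_i = b_0 \sum_{i=1}^{n} X_i + b_1 \sum_{i=1}^{n} X_i^2
\end{aligned}
```

The first result is equation (1) and the second is equation (2).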
Least Squares Method
Assume the following values
$n = 5,\ \sum x = 2,\ \sum y = 20,\ \sum x^2 = 10,\ \sum xy = 15$
Resulting equations
$5 b_0 + 2 b_1 = 20$   (1)
$2 b_0 + 10 b_1 = 15$   (2)
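The two equations can be solved by hand or with a few lines of code. The Python sketch below solves the normal equations in closed form for the slide's values (n = 5, sum of x = 2, sum of y = 20, sum of x squared = 10, sum of xy = 15). The function name `solve_normal_equations` is hypothetical, and the closed-form expressions are the standard algebraic solution of equations (1) and (2), not something stated on the slides.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_x2, sum_xy):
    """Solve the two normal equations for b0 (intercept) and b1 (slope)."""
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical; slope is undefined")
    # Eliminating b0 from equations (1) and (2) gives the slope.
    b1 = (n * sum_xy - sum_x * sum_y) / denominator
    # Equation (1) rearranged gives the intercept.
    b0 = (sum_y - b1 * sum_x) / n
    return b0, b1


b0, b1 = solve_normal_equations(n=5, sum_x=2, sum_y=20, sum_x2=10, sum_xy=15)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")  # prints "b0 = 3.6957, b1 = 0.7609"
```

Substituting back confirms the result: 5(3.6957) + 2(0.7609) is about 20, and 2(3.6957) + 10(0.7609) is about 15.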
Assessing Fit
How do we know how good a regression model is?
Sum of squares of errors (SSE)
  Good if we have additional models to compare against
Coefficient of determination $r^2$
  A value close to 1 suggests a good fit
  $r^2 = 1 - \frac{SSE}{SST}$
Where do we get these values?
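One way to answer that question is with the standard definitions: SSE is the sum of squared differences between each observed Y and its predicted value, and SST is the sum of squared differences between each observed Y and the mean of Y. The Python sketch below applies them. The function name `r_squared` and the data are hypothetical (made up for illustration); b0 = 2.2 and b1 = 0.6 are the least-squares estimates for that made-up data.

```python
def r_squared(xs, ys, b0, b1):
    """Return (SSE, SST, r2) for the fitted line y_hat = b0 + b1 * x."""
    mean_y = sum(ys) / len(ys)
    # SSE: squared errors between observed and predicted values.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    # SST: squared deviations of observed values from their mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    if sst == 0:
        raise ValueError("all Y values are identical; r2 is undefined")
    return sse, sst, 1.0 - sse / sst


# Hypothetical paired data (made-up numbers for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
sse, sst, r2 = r_squared(xs, ys, b0=2.2, b1=0.6)
print(f"SSE = {sse:.2f}, SST = {sst:.2f}, r2 = {r2:.2f}")
```

For this made-up data the fitted line explains about 60 percent of the variation in Y, so the fit is moderate rather than close to 1.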