Excel Data Analysis Tools Descriptive Statistics Data ribbon Analysis section Data Analysis icon Descriptive Statistics option Does NOT auto update

Upload: helena-amelia-roberts

Post on 22-Dec-2015


Page 1

Excel Data Analysis Tools
• Descriptive Statistics
  – Data ribbon
  – Analysis section
  – Data Analysis icon
  – Descriptive Statistics option
  – Does NOT auto update

Page 2

Statistical Functions
• Descriptive Statistics (DS) versus Functions
  – DS do not auto update; functions do (for the most part)
  – For a quick check, DS is faster; functions take a while to set up
  – If the functions are unknown, DS is easier because it is generated automatically
  – Other?

Descriptive Statistics Functions

Page 3

Excel Statistical Functions
• MEDIAN – returns the middle value of a list of supplied values (once sorted)
  – With an odd number of values, the middle number
  – With an even number of values, the average of the two middle numbers
• MODE.SNGL – returns the most frequently occurring value in a list of supplied numbers
  – What if there's a tie? The first one listed (the lowest if the list is sorted)
• STDEV[.S] – returns the standard deviation of a supplied set of values (which represent a sample of a population)
  – A measure of how spread out the numbers are (the square root of the variance)
• VAR.S – returns the variance of a supplied set of values (which represent a sample of a population)
  – The sum of the squared differences from the mean (average), divided by n − 1 because the values are a sample
• KURT – returns the kurtosis of a data set
  – A measure of the peakedness of the distribution of the data, relative to the normal distribution
• SKEW – returns the skewness of a distribution
  – A measurement of the asymmetry of the distribution about the mean
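The functions above can be approximated outside Excel. The following is a minimal Python sketch, not Excel's implementation: the skewness and kurtosis helpers follow the sample formulas that Excel documents for SKEW and KURT, and the helper names and the `scores` values are assumptions chosen for illustration.

```python
import statistics

def sample_skew(values):
    """Sample skewness, following the formula documented for Excel's SKEW."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation, like STDEV.S
    total = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * total

def sample_excess_kurtosis(values):
    """Sample excess kurtosis, following the formula documented for Excel's KURT."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    total = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * total - correction

def describe(values):
    """Collect the statistics listed on the slide into one dictionary."""
    return {
        "median": statistics.median(values),      # MEDIAN
        "mode": statistics.mode(values),          # MODE.SNGL (first mode on a tie)
        "stdev": statistics.stdev(values),        # STDEV.S
        "variance": statistics.variance(values),  # VAR.S (divides by n - 1)
        "kurtosis": sample_excess_kurtosis(values),
        "skewness": sample_skew(values),
    }

scores = [3, 4, 5, 2, 3, 4, 5, 6, 4, 7]  # illustrative values only
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Like functions in a worksheet, this recomputes whenever it is rerun on new data, which is the behavior the Descriptive Statistics tool lacks.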

Page 4

MODE.MULT

To enter an array formula, first select the range of cells that will hold the function's results.

Type the function into the first cell of the range, then press Ctrl+Shift+Enter. Excel adds the curly braces itself; do not type them.

{=MODE.MULT(A1:A10)}
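To see what a multi-mode result looks like, here is a small Python sketch of the same idea. The function name `mode_mult` is hypothetical, and returning an empty list when no value repeats is this sketch's stand-in for the #N/A error Excel shows in that case.

```python
from collections import Counter

def mode_mult(values):
    """Return every value tied for the highest frequency, in order of first appearance."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest < 2:
        return []  # no value repeats; Excel would show #N/A here
    return [value for value, count in counts.items() if count == highest]

print(mode_mult([1, 2, 2, 3, 3, 4]))  # both 2 and 3 occur twice
```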

Page 5

Other Statistical Functions
• CORREL (equivalent to PEARSON) – returns the correlation coefficient between two sets of values
• RSQ – returns the square of the Pearson product-moment correlation coefficient
• SLOPE – returns the slope of the linear regression line through a supplied series of x- and y-values
• INTERCEPT – calculates the best-fit regression line through a supplied series of x- and y-values and returns the value at which this line intercepts the y-axis
• STEYX – returns the standard error of the predicted y-value for each x in the regression line for a set of supplied x- and y-values
• NORM.S.INV – returns the inverse of the standard normal cumulative distribution
• POISSON.DIST – returns the Poisson distribution

Webpage: http://www.excelfunctions.net/Excel-Statistical-Functions.html
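As a rough illustration of what these functions compute, the sketch below reimplements them in plain Python. The function names are hypothetical, the helpers take x-values first (Excel's SLOPE, INTERCEPT, RSQ and STEYX take the y-values first), and `poisson_pmf` covers only the non-cumulative case of POISSON.DIST.

```python
import math
from statistics import NormalDist, fmean

def _sums(xs, ys):
    """Means and sums of squared and cross deviations shared by the helpers below."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy

def correl(xs, ys):
    """Pearson correlation coefficient (CORREL / PEARSON)."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def rsq(xs, ys):
    """Square of the Pearson correlation coefficient (RSQ)."""
    return correl(xs, ys) ** 2

def slope(xs, ys):
    """Slope of the least-squares regression line (SLOPE)."""
    _, _, sxx, _, sxy = _sums(xs, ys)
    return sxy / sxx

def intercept(xs, ys):
    """Value where the regression line crosses the y-axis (INTERCEPT)."""
    mx, my, sxx, _, sxy = _sums(xs, ys)
    return my - (sxy / sxx) * mx

def steyx(xs, ys):
    """Standard error of the predicted y for each x (STEYX)."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return math.sqrt((syy - sxy ** 2 / sxx) / (len(xs) - 2))

def norm_s_inv(p):
    """Inverse of the standard normal cumulative distribution (NORM.S.INV)."""
    return NormalDist().inv_cdf(p)

def poisson_pmf(k, mean):
    """Poisson probability of exactly k events (POISSON.DIST, non-cumulative)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

xs = [1, 2, 3, 4, 5]  # illustrative values only
ys = [2, 4, 5, 4, 5]
print(slope(xs, ys), intercept(xs, ys), rsq(xs, ys), steyx(xs, ys))
```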

Page 6

Excel Data Analysis ToolPak
• Moving Average
  – Used to smooth out irregularities (peaks and valleys) to easily recognize trends.
• Exponential Smoothing
  – In a sequence of values, calculates a prediction based on a preceding set of values, and on a prior prediction for those values
• Regression
  – Creates a report of the regression statistics based on linear regression through a set of data containing one dependent variable and one or more independent variables.

Page 7

DATP – Data Analysis ToolPak
• On the Data tab, click Data Analysis
  – If the Data Analysis button is missing, load the Analysis ToolPak add-in first
    • http://www.excel-easy.com/data-analysis/analysis-toolpak.html
• Select Moving Average and click OK.

Page 8

Moving Average

Page 9

DATP – Moving Average
• Click in the Input Range box and select the range B2:M2
• Click in the Interval box and type 6
• Click in the Output Range box and select B3
• Click OK

Page 10

Explanation: because we set the interval to 6, the moving average is the average of the previous 5 data points and the current data point. As a result, peaks and valleys are smoothed out. The graph shows an increasing trend. Excel cannot calculate the moving average for the first 5 data points because there are not enough previous data points.

Page 11

• Conclusion: The larger the interval, the more the peaks and valleys are smoothed out. The smaller the interval, the closer the moving averages are to the actual data points.
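The moving-average calculation described on the preceding slides can be sketched in a few lines of Python. This is an illustration under assumptions: the `monthly` values are made up (the slide's data is not in the transcript), and `None` stands in for the #N/A cells Excel produces when there are not enough previous data points.

```python
def moving_average(values, interval):
    """Average of the current point and the (interval - 1) points before it."""
    result = []
    for i in range(len(values)):
        if i + 1 < interval:
            result.append(None)  # not enough previous points yet
        else:
            window = values[i + 1 - interval : i + 1]
            result.append(sum(window) / interval)
    return result

monthly = [10, 14, 9, 16, 12, 18, 15, 21, 17, 24, 20, 27]  # made-up values, 12 periods

print(moving_average(monthly, 6))  # larger interval: smoother, first 5 entries empty
print(moving_average(monthly, 2))  # smaller interval: stays close to the data
```

Comparing the two printed lists shows the trade-off stated in the conclusion: the interval of 6 flattens the swings, while the interval of 2 follows the original values closely.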

Page 12

Exponential Smoothing

Page 13

DATP – Exponential Smoothing
• Same input range, B2:M2
• Click in the Damping factor box and type 0.9. The literature often talks about the smoothing constant alpha (α). The value (1 − α) is called the damping factor.
• Click in the Output Range box and select cell B3
• Click OK.

Page 14

smoothed value = α × previous actual value + (1 − α) × previous smoothed value

Because the damping factor is 0.9, alpha is 0.1, so the previous data point is given a relatively small weight (0.1) while the previous smoothed value is given a large weight (0.9). As a result, peaks and valleys are smoothed out. The graph shows an increasing trend. Excel cannot calculate the smoothed value for the first data point because there is no previous data point. The smoothed value for the second data point equals the previous data point.

Page 15

Conclusion
The smaller alpha is (the larger the damping factor), the more the peaks and valleys are smoothed out. The larger alpha is (the smaller the damping factor), the closer the smoothed values are to the actual data points.
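A minimal Python sketch of the exponential smoothing described above, assuming the behavior stated on the slides: no value for the first data point, the second smoothed value equal to the first data point, and every later value weighting the previous data point by alpha and the previous smoothed value by the damping factor. The function name and the sample values are hypothetical.

```python
def exponential_smoothing(values, damping_factor):
    """Smooth a series; damping_factor is (1 - alpha), as in the ToolPak dialog."""
    if not values:
        return []
    alpha = 1 - damping_factor
    smoothed = [None]  # no previous data point for the first entry
    for i in range(1, len(values)):
        if i == 1:
            smoothed.append(values[0])  # second value equals the first data point
        else:
            smoothed.append(alpha * values[i - 1] + damping_factor * smoothed[i - 1])
    return smoothed

series = [10, 20, 30, 25, 40]  # made-up values for illustration

print(exponential_smoothing(series, 0.9))  # small alpha: heavy smoothing
print(exponential_smoothing(series, 0.2))  # large alpha: tracks the data closely
```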

Page 16

DATP – Regression Analysis
Is there a relation between Quantity Sold (Output) and Price and Advertising (Input)? In other words: can we predict Quantity Sold if we know Price and Advertising?

Page 17

Regression Analysis
• Select the Y Range (A1:A8). This is the response variable (also called the dependent variable).
• Select the X Range (B1:C8). These are the explanatory variables (also called independent variables). These columns must be adjacent to each other.
• Check Labels.
• Select an Output Range.
• Check Residuals.
• Click OK.

Page 18

Summary Output – R Square

R Square equals 0.962, which is a very good fit. 96% of the variation in Quantity Sold is explained by the independent variables Price and Advertising. The closer to 1, the better the regression line (read on) fits the data.
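For a least-squares fit with an intercept, R Square can be computed as one minus the ratio of unexplained variation to total variation. The sketch below shows that calculation; the function name and the sample numbers are hypothetical and are not the data behind the 0.962 on the slide.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total

actual = [10, 12, 15, 19]             # made-up observations
predicted = [10.5, 11.5, 15.5, 18.5]  # made-up fitted values

print(round(r_squared(actual, predicted), 3))
```

A value near 1 means the residuals are small compared with the overall spread of the data, which is what "a very good fit" refers to.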

Page 19

Summary Output – Significance F and P-values

• To check if your results are reliable (statistically significant), look at Significance F (0.001). If this value is less than 0.05, you're OK. If Significance F is greater than 0.05, it's probably better to stop using this set of independent variables. Delete a variable with a high P-value (greater than 0.05) and rerun the regression until Significance F drops below 0.05.

• Most or all P-values should be below 0.05. In our example this is the case. (0.000, 0.001 and 0.005).

Page 20

Sample Output – Coefficients
• The regression line is: y = Quantity Sold = 8536.214 - 835.722 * Price + 0.592 * Advertising.
• In other words, for each unit increase in Price, Quantity Sold decreases by 835.722 units, holding Advertising constant. For each unit increase in Advertising, Quantity Sold increases by 0.592 units, holding Price constant. This is valuable information.
• You can also use these coefficients to do a forecast. For example, if Price equals $4 and Advertising equals $3000, you might be able to achieve a Quantity Sold of about 8536.214 - 835.722 * 4 + 0.592 * 3000 ≈ 6970.

Page 21

Sample Output – Residuals
• The residuals show you how far away the actual data points are from the predicted data points (using the equation).
• For example, the first data point equals 8500 (A2). Using the equation 8536.214 - 835.722 * 2 + 0.592 * 2800, Excel reports a predicted data point of 8523.009, giving a residual of 8500 - 8523.009 = -23.009. (Evaluating the equation by hand with the rounded coefficients shown gives 8522.370, so the reported figure presumably comes from unrounded coefficients.)
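The forecast and residual calculations can be sketched as follows. This uses the coefficients as printed on the slides, which are rounded to three decimals, so the results differ slightly from the slide figures of 6970 and -23.009. The constant and function names are assumptions made for this illustration.

```python
# Coefficients as printed on the slide (rounded to three decimals).
INTERCEPT = 8536.214
PRICE_COEF = -835.722
ADVERTISING_COEF = 0.592

def predict_quantity(price, advertising):
    """Evaluate the regression line for one Price / Advertising pair."""
    return INTERCEPT + PRICE_COEF * price + ADVERTISING_COEF * advertising

def residual(actual_quantity, price, advertising):
    """Actual value minus the value predicted by the regression line."""
    return actual_quantity - predict_quantity(price, advertising)

print(round(predict_quantity(4, 3000), 3))   # forecast for Price 4, Advertising 3000
print(round(residual(8500, 2, 2800), 3))     # residual for the first data point
```

A negative residual means the regression line predicts more than was actually sold for that data point.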

Page 22

Creating a Histogram
• Designate Student Scores
• Decide how scores should be related to grades
  – 10 point scale
  – The values in the “bins” are the upper limits of each bin…
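The "upper limit" rule for bins can be illustrated with a short Python sketch. Each score is counted in the first bin whose value is greater than or equal to the score, and anything above the last bin goes into a "More" bucket, as in the ToolPak's Histogram output. The scores and the bin values for a 10 point scale are made up for illustration.

```python
from bisect import bisect_left

def histogram(values, bins):
    """Count values per bin, where each bin value is that bin's upper limit."""
    bins = sorted(bins)
    counts = [0] * (len(bins) + 1)  # last slot is the "More" bucket
    for value in values:
        # bisect_left finds the first bin whose upper limit is >= value
        counts[bisect_left(bins, value)] += 1
    labels = [str(b) for b in bins] + ["More"]
    return list(zip(labels, counts))

scores = [55, 60, 61, 70, 89, 90, 100]  # made-up student scores
grade_bins = [59, 69, 79, 89]           # upper limits for F, D, C, B; "More" is A

for label, count in histogram(scores, grade_bins):
    print(label, count)
```

Note that a score of 89 lands in the bin labeled 89, while 90 falls through to "More".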

Page 23

Histogram Output