
MgtOp 470—Business Modeling with Spreadsheets

Professor Munson

Topic 6

Monte Carlo Simulation

Set 1: Computer Simulation and Output Analysis

“Spock, I need that analysis now!”
Captain James T. Kirk, sometime in the future...


Let's Make a Deal

[Blank recording sheet for plays 1 through 60, with columns: Play, 1st Card, Revealed Card, Final Card, Prize Card, Prize]


What is Monte Carlo Simulation?

Uncertainty arises due to random variation, lack of knowledge, or error. Computer simulation has to do with using computer models to imitate real life or make predictions. Monte Carlo simulation not only tells you what could happen, but how likely it is to happen.

Working Definition: Monte Carlo simulation is basically a sampling experiment whose purpose is to estimate the distribution of an outcome variable that depends on several probabilistic input variables.

Three Primary Uses:
1. Predict an expected outcome
2. Predict a distribution of outcomes (best case/worst case, risk/reward)
3. Optimization (comparison of decisions)

Two Major Types of Computer Simulation
1. Discrete event: Concerns the modeling of a system as it evolves over time by a representation in which the state variables change instantaneously at separate points in time. Examples: customers proceeding through a drive-through window at a restaurant, workers loading a truck, etc. (Note: continuous simulation models systems that change continuously over time. These typically involve differential equations.) Simulation models that involve the passage of time may or may not involve random inputs.

2. Monte Carlo: Concerns the modeling of a system that employs the use of random inputs. Many authors only define simulations as Monte Carlo if the passage of time plays no substantive role. Others define them more broadly as modeling systems whose relationships can be defined mathematically and entered into a spreadsheet.


Monte Carlo Simulation as Risk Analysis

Risk analysis is part of every decision we make. Simulation answers the question, “What are the risks?” It can address questions such as, “What’s the likelihood that this investment will yield a $1 million return? How much does the inflation rate influence my economic evaluation? What are the odds of being over budget on this project?”

Monte Carlo simulation lets you see all the possible outcomes of your decisions and assess the impact of risk, allowing for better decision making under uncertainty. Desired outcomes: (1) average outcome, (2) probability of a particular set of outcomes (“tail probabilities” are often suitable measures of the risk associated with a decision), (3) worst case/best case.

Monte Carlo simulation performs risk analysis by building models of possible results, substituting a range of values (a probability distribution) for any factor that has inherent uncertainty. It then calculates results over and over, each time using a different set of random values from the probability functions. Depending upon the number of uncertainties and the ranges specified for them, a Monte Carlo simulation could involve thousands or tens of thousands of recalculations before it is complete. Monte Carlo simulation produces distributions of possible outcome values. By comparison, a decision tree only deals with expected values and thus inherently ignores risks.

History

A Monte Carlo method is a technique that involves using random numbers and probability to solve problems. The term was coined by Ulam and Metropolis in the 1940s in reference to games of chance, a popular attraction in Monte Carlo, Monaco. Monte Carlo methods were used during the Manhattan Project, and computerized Monte Carlo simulation was first used by physicists working on nuclear weapons projects in the Los Alamos National Laboratory.


Advantages and Disadvantages of Simulation

Advantages of Computer Simulation
1. Applicable in complex cases where analytical techniques cannot be employed. In general, the larger the number of probabilistic components in the system becomes, the more likely it is that simulation will be the best approach.
2. Provides an experimental laboratory.
3. Applicable where the system itself cannot be built, modified, destroyed, etc. Allows for “exploration of the impossible.”
4. Avoids risk and disruptive experiments with actual systems.
5. Compresses time to reveal long-term effects.
6. Generally less costly than experiments with real-world systems.
7. Promotes creativity (faster & less risk).
8. Ideas come alive with animation and graphs.
9. Can incorporate risk in the decision-making process.
10. Can identify and analyze a large number of possible solutions.
11. A tool for thinking and understanding before taking action.
12. Great tool for sensitivity analysis.
13. Extremely flexible tool.

Disadvantages of Computer Simulation
1. An expert may have to write the computer program. Some programs contain > 10,000 lines of code.
2. Can be costly for data collection, modeling, and analysis.
3. Optimal solutions are not guaranteed.
4. Does not necessarily suggest a solution methodology.
5. May hide critical assumptions that invalidate the model.
6. Random numbers generated and used are only samples from a distribution.
7. Reality can never be exactly modeled, particularly with respect to human reactions.
8. May be difficult to assess uncertainties.
9. Quality modeling is not easy!


CARELESS CODE RECYCLING CAUSES KILLER KANGAS

Mutant Marsupials Take Up Arms Against Australian Air Force

The reuse of some object-oriented code has caused tactical headaches for Australia’s armed forces. As virtual reality simulators assume larger roles in helicopter combat training, programmers have gone to great lengths to increase the realism of their scenarios, including detailed landscapes and—in the case of the Northern Territory’s Operation Phoenix—herds of kangaroos (since disturbed animals might well give away a helicopter’s position).

The head of the Defense Science & Technology Organization’s Land Operations/Simulation division reportedly instructed developers to model the local marsupials’ movements and reactions to helicopters. Being efficient programmers, they just re-appropriated some code originally used to model infantry detachment reactions under the same stimuli, changed the mapped icon from a soldier to a kangaroo, and increased the figures’ speed of movement.

Eager to demonstrate their flying skills for some visiting American pilots, the hotshot Aussies “buzzed” the virtual kangaroos in low flight during a simulation. The kangaroos scattered, as predicted, and the visiting Americans nodded appreciatively...then did a double-take as the kangaroos reappeared from behind a hill and launched a barrage of Stinger missiles at the hapless helicopter. (Apparently the programmers had forgotten to remove that part of the infantry coding.)

The lesson?

Objects are defined with certain attributes, and any new object defined in terms of an old one inherits all the attributes. The embarrassed programmers had learned to be careful when reusing object-oriented code, and the Yanks left with a newfound respect for Australian wildlife.

Simulator supervisors report that pilots from that point onward have strictly avoided kangaroos, just as they were meant to. (June 15, 1999, Melbourne)


Simulation Software

Software Review: http://lionhrtpub.com/orms/surveys/Simulation/Simulation.html

Excel Add-Ins: Crystal Ball, @Risk, Risk Solver, Simtools (free)

Discrete-Event Simulations: Arena, SIMPROCESS, Process Simulator, AGPSS, SIMSCRIPT III, SIMUL8

Steps of Monte Carlo Simulation

1. Create a parametric (base case) model, and determine which input parameters will be uncertain (sensitivity analysis and tornado charts can help) and their respective probability distributions.

2. Generate a set of random inputs.

3. Evaluate the model and store the results.

4. Repeat Steps 2 and 3 up to n trials.

5. Analyze the results using histograms, summary statistics, confidence intervals, etc.
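The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the profit function, the three triangular input distributions, and all of their parameters are hypothetical and are not taken from the course materials.

```python
import random
import statistics


def profit_model(price, unit_cost, demand):
    """Step 1: a hypothetical base-case model (an assumption for illustration)."""
    return (price - unit_cost) * demand


def run_simulation(n_trials, seed=None):
    """Steps 2-4: draw random inputs, evaluate the model, store, repeat."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        # Step 2: one set of random inputs (assumed triangular distributions;
        # random.triangular takes low, high, mode in that order)
        price = rng.triangular(8.0, 12.0, 10.0)
        unit_cost = rng.triangular(4.0, 7.0, 5.0)
        demand = rng.triangular(500, 1500, 1000)
        # Step 3: evaluate the model and store the result
        results.append(profit_model(price, unit_cost, demand))
    return results


def summarize(results):
    """Step 5: basic summary statistics of the stored outputs."""
    return {
        "mean": statistics.mean(results),
        "stdev": statistics.stdev(results),
        "min": min(results),
        "max": max(results),
        "prob_loss": sum(r < 0 for r in results) / len(results),
    }


if __name__ == "__main__":
    print(summarize(run_simulation(1000, seed=1)))
```

Passing a seed makes a run repeatable, which is convenient when comparing alternative assumptions.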


Uncertain Input Parameter Selection

Only the most influential and uncertain parameters should be replaced with probability distributions. A tornado chart evaluates which parameters of a model most affect the valuation. In general, those showing the greatest effect should be assigned probability distributions in the simulation.

Excel Add-In “TopRank”

This program (bundled with the text) creates a non-permanent Excel Add-In that disappears every time that you close Excel. After opening the program, Excel should open and create the add-in for you that can then be used in any other spreadsheets that you open or create. TopRank has a video tutorial that is worth viewing.

General Approach
1. Select the model output (or outputs) of interest by clicking: <Add Output>
2. Place the cursor over each input that you want to perform sensitivity analysis on and click: <Add Input>
3. Set the Min and Max ranges over which to vary the input. The default approach uses percentage changes; click on the function button to use a different approach (e.g., absolute changes or absolute values).
4. Click on <Run What-If Analysis> and then click <RUN>
5. A tornado chart and spider graph will be available in a separate workbook


Example

Profit = (Price − Variable Cost) × [Market Size × Our Share] − Fixed Cost

Parameter        Lower Bound   Most Likely   Upper Bound
Price            140           175           200
Variable Cost    30            40            60
Fixed Cost       150           180           300
Market Size      8             12            20
Our Share        .18           .25           .35

To change to actual Min and Max, click <Add Input> then this button.
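The one-at-a-time calculation behind a tornado chart can be sketched in Python using the profit formula and the bounds from the example table. This is an illustrative sketch, not TopRank's implementation: each input is moved to its lower and upper bound while the others stay at their most likely values, and inputs are ranked by the resulting swing in profit.

```python
# (lower bound, most likely, upper bound) for each input, from the example table
INPUTS = {
    "Price": (140, 175, 200),
    "Variable Cost": (30, 40, 60),
    "Fixed Cost": (150, 180, 300),
    "Market Size": (8, 12, 20),
    "Our Share": (0.18, 0.25, 0.35),
}


def profit(v):
    """Profit = (Price - Variable Cost) x [Market Size x Our Share] - Fixed Cost."""
    margin = v["Price"] - v["Variable Cost"]
    return margin * (v["Market Size"] * v["Our Share"]) - v["Fixed Cost"]


def tornado_rows(inputs):
    """Return (name, lowest profit, highest profit), widest swing first."""
    base = {name: bounds[1] for name, bounds in inputs.items()}
    rows = []
    for name, (low, _, high) in inputs.items():
        at_low = profit({**base, name: low})    # only this input changes
        at_high = profit({**base, name: high})
        rows.append((name, min(at_low, at_high), max(at_low, at_high)))
    rows.sort(key=lambda row: row[2] - row[1], reverse=True)
    return rows


if __name__ == "__main__":
    for name, lo, hi in tornado_rows(INPUTS):
        print(f"{name:14s} {lo:8.1f} {hi:8.1f}")
```

With these bounds, Market Size produces the widest swing and Variable Cost the narrowest.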


Click <Run What-If Analysis>:

[Tornado Graph of Profit: Impact by Input. Bars from top to bottom: Market Size (C8), Our Share (C9), Price (C5), Fixed Cost (C7), Variable Cost (C6). Horizontal axis: Value of Profit, from 50 to 550.]

Other TopRank Issues

You can perform a quick default tornado chart on all inputs that affect the output by simply clicking: <AutoVary Functions>→<Add AutoVary Functions…>→<OK>

With the Add Input dialog displayed, just press <TAB> to step through all inputs that will be changed in your spreadsheet

“Multi-Way” analysis examines the impact of changing more than one input at the same time. Click on <Model Window>, select the inputs you want to analyze (<Ctrl>-click to select several), then right-click and select <Multi-Way>


Creating Simulation Replications in Excel

Three Potential Methods

1. Put the entire model into one row (column), and copy the model down (n – 1) rows (over (n – 1) columns).

2. If only one or two inputs are random, generate a Data Table.

3. Write a macro (e.g., (rel. references)→Paste Values→<Down>).


Two-Variable Data Table

The input values for one variable are listed down one column, while the values for the other variable are listed across one row. The output cell is placed at the intersection of the input column and input row.

Fast Feet Revisited

To add prices ranging from $20 to $45, in $5 increments:

1. Move the sales quantity range underneath the output cell.
2. Put the input prices in the 6 columns to the right of the output cell.
3. Prior to calling the Data Table tool, select the range covering all input cells.
4. The Row input cell will reference the price cell (C5), and the Column input cell will reference the quantity cell (C10).


Random Number Generation

Computers generate pseudo-random numbers, because they are developed according to an algorithm on a finite machine. Eventually, these algorithms either repeat or converge. However, the numbers appear random, and for practical purposes, work well on today’s computers using Excel and other simulation programs.

Random number generators have a starting seed, which is the beginning number for the generation of pseudo-random numbers. In Excel, the default seed for a random number constantly varies (probably based on the clock), which is why a different number may appear in a random number cell every time that you open the same Excel file. However, Excel also has a Random Number Generation tool that allows the user to specify a starting seed. This is extremely useful in simulation because different systems (assumptions) can be compared, where the variation comes from the change in system (assumption), not the change in random numbers.

For example, if you’re trying two different distributions for a particular input variable, then drawing from the same seed should result in both draws being either above or below their respective means.

As explained below, if you want to use seeds for distributions that are not part of Excel’s Random Number Generation tool, you can generate a set of Uniform(0,1) random numbers and use Excel functions to convert those into the distribution of interest.
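The idea of reusing one seeded set of Uniform(0,1) numbers across two candidate distributions can be sketched in Python. This is an illustration only: the seed, the Normal(100, 15) and Uniform(70, 130) alternatives, and the function names are assumptions. Both alternatives are symmetric about 100, so the same uniform value puts both draws on the same side of their means.

```python
import random
from statistics import NormalDist


def uniform_stream(seed, n):
    """A repeatable list of Uniform(0,1) numbers for a given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def normal_from_uniform(u, mean, stdev):
    """Inverse-CDF conversion; requires 0 < u < 1."""
    return NormalDist(mean, stdev).inv_cdf(u)


def uniform_from_uniform(u, a, b):
    """Uniform(a, b) from a Uniform(0,1) value."""
    return a + u * (b - a)


# The same five uniforms drive both alternatives (hypothetical parameters)
us = uniform_stream(seed=12345, n=5)
option_a = [normal_from_uniform(u, 100, 15) for u in us]
option_b = [uniform_from_uniform(u, 70, 130) for u in us]

if __name__ == "__main__":
    for u, a, b in zip(us, option_a, option_b):
        print(f"u={u:.3f}  normal={a:7.2f}  uniform={b:7.2f}")
```

Because each pair of draws comes from the same uniform number, differences in the simulated output reflect the change in distribution rather than a change in the random numbers.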

Generating Random Numbers in Excel

Two Methods for Random Number Generation:
1. Use the Random Number Generation tool in Excel’s Analysis ToolPak Add-In to create a set of static random numbers for certain distributions.
2. Use specific functions to create random variables in cells using the RAND() function. Unless the automatic recalculation feature is suppressed, whenever any cell in the spreadsheet is modified (or <F9> is pressed), the values in any cell containing RAND() will change.


Fitting Data to Distributions

3 Approaches to Input Distributions
1. Take raw data and use that data as the input to your simulation.
   Adv.: good way to validate the model
   Disadv.: you’re saying that the future will exactly mimic the past

2. Empirical distribution: use n possible outcomes (that you have observed) and assign a uniform probability to each outcome (Excel’s Sampling tool does this)
   Adv.: don’t need a distribution assumption
   Disadv.: constrained by the data
   Note: When choosing a distribution based on empirical data, it is generally advisable to widen the range because actual results tend to underestimate the extremes.

3. A parameterized theoretical distribution
   Adv.: irregularities in your empirical distribution are smoothed out and “tails” are included that might not be in the original data
   Disadv.: must estimate a distribution


Empirical Distributions

Sampling Without Replacement

Using SIMTOOLS: the array function

=SHUFFLE(n)<Ctrl><Shift><Enter>

This returns a random ordering of the numbers 1 to n, and should be entered into a row containing n cells. When entered into a row range of fewer than n cells, this function generates random samples from 1 to n without replacement.

To sample from non-integers, the values in a given row range of n cells can be shuffled by entering the array formula:

=INDEX(range of numbers,1,SHUFFLE(n))<Ctrl><Shift><Enter>

Sampling With Replacement

Use Excel’s Sampling tool.

The Sampling analysis tool creates a sample from a population by treating the input range as a population. When the population is too large to process or chart, you can use a representative sample. You can also create a sample that contains only the values from a particular part of a cycle if you believe that the input data is periodic. For example, if the input range contains quarterly sales figures, sampling with a periodic rate of four places the values from the same quarter in the output range.

→Data→Analysis:→Data Analysis→Sampling→<OK>


Input Range Enter the range of data containing the population of values that you want to sample.

Labels Select if the first row or column of your input range contains labels.

Period: If using periodic sampling, enter the desired periodic interval. This simply pulls every period-th value from the input range; it does not draw randomly.

Number of Samples Enter the number of random values that you want in the output column. Each value is drawn from a random position in the input range, and any number can be selected more than once.

Output Range Enter the reference for the upper-left cell of the output table. Data are written in a single column below the cell.
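The three sampling schemes described in this section can be sketched with Python's standard library. This is an analogy rather than a reproduction of SIMTOOLS or the Sampling tool; the function names are hypothetical.

```python
import random


def sample_without_replacement(values, k, rng):
    """Like shuffling and keeping the first k values: no value repeats."""
    return rng.sample(list(values), k)


def sample_with_replacement(values, k, rng):
    """Each draw picks a random position, so a value can appear more than once."""
    return rng.choices(list(values), k=k)


def periodic_sample(values, period):
    """Take every period-th value (the 4th, 8th, ... for period 4); not random."""
    return list(values)[period - 1::period]


if __name__ == "__main__":
    rng = random.Random(2020)               # arbitrary seed for repeatability
    print(sample_without_replacement(range(1, 11), 10, rng))  # a random ordering of 1..10
    print(sample_with_replacement([1.5, 2.5, 3.5], 6, rng))
    print(periodic_sample(range(1, 13), 4))
```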


Excel Functions for Drawing from Random Distributions

Uniform(0, 1): =RAND()

Uniform(a, b): =a+RAND()*(b−a)

Normal(μ, σ): =NORMINV(probability,mean,stdevn)

Weibull(α, β): =β*(−LN(1−probability))^(1/α)

Bernoulli(p): =IF(probability<=p,1,0)

Discrete Uniform(a, b): =RANDBETWEEN(lbound,ubound)

Using SIMTOOLS

Exponential(μ): =EXPOINV(probability,mean)

Lognormal(μ, σ): =LNORMINV(probability,mean,stdevn)

Gamma(μ, σ): =GAMINV(probability,mean,stdevn)

Beta(μ, σ, a, b): =BETA(probability,mean,stdevn,lbound,ubound)
(default lower and upper bounds are 0 and 1, respectively)

Triangular(a, b, c): =TRIANINV(prob,lbound,mostlikely,ubound)

Binomial(n, p): =BINOMINV(probability,#trials,p)

Poisson(λ): =POISINV(probability,mean)

Discrete Distribution: =DISCRINV(probability,values range,probabilities range)
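Several of the formulas above convert a Uniform(0,1) "probability" into a draw from another distribution by inverting its cumulative distribution function. A minimal Python sketch of that idea follows. The function names are hypothetical, the exponential and triangular inverses are the standard textbook forms rather than SIMTOOLS code, and the triangular version assumes low < high.

```python
import math
import random
from statistics import NormalDist


def uniform_inv(u, a, b):
    return a + u * (b - a)


def normal_inv(u, mean, stdev):
    return NormalDist(mean, stdev).inv_cdf(u)      # requires 0 < u < 1


def weibull_inv(u, alpha, beta):
    # Same form as the worksheet formula: beta * (-ln(1 - u))^(1/alpha)
    return beta * (-math.log(1 - u)) ** (1 / alpha)


def bernoulli_inv(u, p):
    return 1 if u <= p else 0


def exponential_inv(u, mean):
    return -mean * math.log(1 - u)


def triangular_inv(u, low, mode, high):
    cut = (mode - low) / (high - low)              # CDF value at the mode
    if u < cut:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1 - u) * (high - low) * (high - mode))


if __name__ == "__main__":
    rng = random.Random(1)                         # arbitrary seed
    u = rng.random()
    print(u, normal_inv(u, 50, 10), triangular_inv(u, 140, 175, 200))
```

Feeding the same u into different inverse functions is what makes seeded comparisons across distributions possible.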


Excel’s Random Number Generator

The Random Number Generation analysis tool fills a range with independent random numbers that are drawn from one of several distributions. Compared to using direct functions such as NORMINV, a starting seed may be specified for comparison across options, and the numbers do not recalculate with each worksheet change.

→Data→Analysis:→Data Analysis→Random Number Generation→<OK>

Number of Variables   Enter the number of columns of values that you want in the output table. Default = 1 (or defined output range).

Number of Random Numbers   Enter the number of data points that you want to see. Each data point appears in a row of the output table. For example, 3 “Variables” and 20 “Random Numbers” = 60 actual random data points. Default = 1 row of numbers (or defined output range).


Parameter Options for the Chosen Distribution


There is also a Patterned distribution, which is actually just a way to generate a deterministic sequence of numbers (similar to Excel’s Fill capabilities). This is characterized by a lower bound and an upper bound, a step, repetition rate for values, and a repetition rate for the sequence. Number of Variables and Number of Random Numbers are not applicable.

Discrete   This is characterized by a value and the associated probability range. The range must contain two columns: The left column contains values, and the right column contains probabilities that are associated with the value in that row. The sum of the probabilities must be 1.

      E       F
 9    Value   Prob
10    2       0.10
11    3       0.35
12    4       0.25
13    5       0.30

In this case, input range = E10:F13.
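Drawing from a discrete distribution like the one in E10:F13 amounts to comparing a Uniform(0,1) number with the cumulative probabilities. A minimal Python sketch follows, using the values and probabilities from the example; the function name is hypothetical and this is not Excel's internal algorithm.

```python
import random
from itertools import accumulate

VALUES = [2, 3, 4, 5]
PROBS = [0.10, 0.35, 0.25, 0.30]        # must sum to 1


def discrete_inv(u, values, probs):
    """Return the first value whose cumulative probability exceeds u."""
    for value, cum in zip(values, accumulate(probs)):
        if u < cum:
            return value
    return values[-1]                   # guards against rounding in the last sum


if __name__ == "__main__":
    rng = random.Random(99)             # arbitrary seed
    draws = [discrete_inv(rng.random(), VALUES, PROBS) for _ in range(10000)]
    for v in VALUES:
        print(v, draws.count(v) / len(draws))
```

Over many draws, the observed relative frequencies should settle near the probabilities in column F.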

Random Seed   Enter an optional integer value from which to generate random numbers. You can reuse this value later to produce the same random numbers.

Output Range   If you specified Number of Variables and Number of Random Numbers, enter the reference for the upper-left cell of the output table. Excel automatically determines the size of the output area and displays a message if the output table will replace existing data. Otherwise, enter the range to be filled with random numbers.

New Worksheet Ply   Click to insert a new worksheet in the current workbook and paste the results starting at cell A1 of the new worksheet. To name the new worksheet, type a name in the box.

New Workbook   Click to create a new workbook in which results are added to a new worksheet.


Run Length

Typical simulations have 500-1000 trials, but they could be much larger.

The precision of any simulation estimate increases as the run length increases. Repeated batches of samples are more like each other (i.e., they show less variability) as the sample size increases; therefore, they are more precise.

There is no definitive optimal number of trials. Smaller samples may not be precise enough. Larger samples take more computer time and space.

Two Methods to Compute Run Length

1. Experimental Method
   A. Pick an initial sample size, say 100.
   B. Perform 5-10 simulations (with different seeds) using this run length and compare the estimates of the outcome measure.
   C. If the estimates are too far apart, increase the run length and go to Step B.

2. Sampled Standard Error Method
   A. Pick an initial sample size, say 100.
   B. Perform the simulation once, and compute the sample standard deviation S of the output values (STDEV in Excel).
   C. Determine the desired confidence interval half-width ±A, i.e., the accuracy that you wish to obtain in estimating the mean. For example, if you wish estimated profit to be no more than ± $5 from the true average profit, A = 5.
   D. Determine your desired confidence level 100(1−α)%, and compute the z-value that leaves α/2 in the upper tail. For example, if you wish 99% confidence (α=.01), enter =NORMSINV(1−(.01/2)), which will yield a z-value of about 2.576.
   E. Compute an estimated required sample size n = (zS/A)^2.
   F. Replicate the simulation an additional n−100 times.
   G. For completeness, it would be a good idea to calculate a new S based on the n simulations and re-compute n in step E. Repeat the procedure if the new required n has grown.
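The sampled standard error method reduces to one formula, n = (zS/A)^2, which can be sketched in Python as follows. The function name and the pilot outputs in the usage line are hypothetical; rounding n up to a whole number of trials is an added assumption.

```python
import math
from statistics import NormalDist, stdev


def required_trials(pilot_outputs, half_width, confidence=0.99):
    """Estimate the run length n = (z * S / A)^2, rounded up."""
    s = stdev(pilot_outputs)                    # Step B: sample standard deviation S
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # Step D: same role as NORMSINV
    return math.ceil((z * s / half_width) ** 2) # Step E


if __name__ == "__main__":
    pilot = [212.0, 198.5, 240.1, 225.7, 187.3, 231.9, 205.4, 219.8]  # made-up outputs
    print(required_trials(pilot, half_width=5, confidence=0.99))
```

If the result exceeds the pilot run length, the difference is the number of additional replications called for in Step F.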


Output Analysis

Primary Types of Output Analysis for Simulations
1. Histogram
2. Summary Statistics
3. Risk Analysis

Example Used Throughout this Section: “Output Analysis.xlsx”

Sample Profit Output from a Simulation with 20 Trials
(Trial numbers are in A4:A23 and profits in B4:B23; the second Profit column holds the histogram bin limits.)

Trial   Profit      Profit
1       $50,386     $10,000
2       $38,888     $20,000
3       $62,023     $30,000
4       ($12,000)   $40,000
5       $24,960     $50,000
6       $24,756     $60,000
7       $12,000     $70,000
8       $14,744
9       $44,421
10      $23,000
11      $25,432
12      $36,987
13      $85,213
14      $54,122
15      $23,222
16      $13,000
17      $55,634
18      $33,352
19      $41,904
20      ($5,235)

Histogram

Typically the first phase of output analysis
Essentially provides a probability distribution of output
Provides a visual representation of central tendency, skewness, dispersion, and outliers (extreme risk)


Excel’s Histogram Tool

→Data→Analysis:→Data Analysis→Histogram→<OK>

Input Range Enter the range containing the data to be analyzed.

Bin Range (Optional: If omitted, Excel creates evenly distributed bins between the min and max values.) A “bin” is an interval of numbers. For each defined bin, a histogram counts the number of input range values that fall within the bin and then graphs it in a bar chart. A defined bin range in Excel is a sequence of increasing numbers representing the boundary values of each bin. The first bin, then, equals −∞ to the first number in the bin range. The second bin equals every value > the first number in the bin range and ≤ the second number in the bin range, etc. Any values counted above the final number in the bin range are given the label “More” in the histogram. (Note: the bins do not have to be equally spaced.) While bin selection may have a logical basis, it may well be based on judgment. Clearly, a different set of bins will change the appearance of the histogram to some degree.

Labels Select if the first row (or column) of your bin range has a label. If checked, this label will print on the tables and charts.


Output Range Enter the reference for (only) the upper-left cell of the output table. Excel automatically determines the size of the output area and displays a message if the table will replace existing data.

New Worksheet Ply   Click to insert a new worksheet in the current workbook and paste the results starting at cell A1 of the new worksheet. To name the new worksheet, type a name in the box.

New Workbook   Click to create a new workbook in which results are added to a new worksheet.

Sample Profit Output from a Simulation with 20 Trials

Trial   Profit      Profit
1       $50,386     $10,000    Profit   Frequency
2       $38,888     $20,000    10000    2
3       $62,023     $30,000    20000    3
4       ($12,000)   $40,000    30000    5
5       $24,960     $50,000    40000    3
6       $24,756     $60,000    50000    2
7       $12,000     $70,000    60000    3
8       $14,744                70000    1
9       $44,421                More     1
10      $23,000
11      $25,432
12      $36,987
13      $85,213
14      $54,122
15      $23,222
16      $13,000
17      $55,634
18      $33,352
19      $41,904
20      ($5,235)

In addition to the frequency table that’s automatically generated, three other output options are available in any combination.

Pareto (sorted histogram) Presents the frequency chart in decreasing order of frequency.

Cumulative Percentage Adds cumulative percentage column (line) to the frequency charts (histogram chart).

Chart Output Select to generate an embedded histogram chart with the output table.
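The bin-counting rule described for the Bin Range (each bin holds values greater than the previous limit and less than or equal to its own limit, with a final “More” bin) can be sketched in Python. This is an illustration of that rule applied to the 20 sample profits, not Excel's own code; the function name is hypothetical.

```python
from bisect import bisect_left

PROFITS = [50386, 38888, 62023, -12000, 24960, 24756, 12000, 14744, 44421, 23000,
           25432, 36987, 85213, 54122, 23222, 13000, 55634, 33352, 41904, -5235]
BIN_LIMITS = [10000, 20000, 30000, 40000, 50000, 60000, 70000]


def excel_style_histogram(data, bin_limits):
    """Count values per bin; the extra last slot is the 'More' bin."""
    counts = [0] * (len(bin_limits) + 1)
    for x in data:
        # bisect_left finds the first limit that is >= x, i.e. prev < x <= limit
        counts[bisect_left(bin_limits, x)] += 1
    return counts


def cumulative_percent(counts):
    """Running share of observations, as in the Cumulative Percentage option."""
    total, running, out = sum(counts), 0, []
    for c in counts:
        running += c
        out.append(100 * running / total)
    return out


if __name__ == "__main__":
    counts = excel_style_histogram(PROFITS, BIN_LIMITS)
    for label, c, pct in zip(BIN_LIMITS + ["More"], counts, cumulative_percent(counts)):
        print(f"{str(label):>6}  {c:2d}  {pct:6.1f}%")
```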


Summary Statistics

→Data→Analysis:→Data Analysis→Descriptive Statistics→<OK>

Profit

Mean                       32340.45
Standard Error             5162.526
Median                     29392
Mode                       #N/A
Standard Deviation         23087.52
Sample Variance            5.33E+08
Kurtosis                   0.411262
Skewness                   0.198427
Range                      97213
Minimum                    -12000
Maximum                    85213
Sum                        646809
Count                      20
Largest(3)                 55634
Smallest(3)                12000
Confidence Level(95.0%)    10805.29

For a range of data, Excel can automatically calculate basic statistics, as shown above. Include the output heading in your input range and check the “Labels in first row” box if you want the heading printed on your report. In this example, no output value appeared more than once, so no mode was provided. The confidence interval was provided for a 5% significance level (which can be changed). The “Kth Largest” (“Kth Smallest”) boxes are selected if you want to include a row in the output table for the kth largest (smallest) value in the data range. A 1 provides the maximum (minimum) in the data set.

The skewness describes the asymmetry of the distribution relative to the mean. A positive skewness indicates that it has a longer right-hand tail. A negative skewness indicates skewness to the left.

Kurtosis describes the peakedness or flatness of the distribution relative to the Normal distribution. A positive value indicates a more peaked distribution, while a negative value indicates a flatter one.
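Several of the Descriptive Statistics rows can be reproduced from the 20 sample profits with Python's standard library. This is a sketch under one stated assumption: the standard library has no Student t distribution, so the t-value for 95% confidence with 19 degrees of freedom (about 2.093) is typed in from a table. The function name is hypothetical.

```python
import math
import statistics

PROFITS = [50386, 38888, 62023, -12000, 24960, 24756, 12000, 14744, 44421, 23000,
           25432, 36987, 85213, 54122, 23222, 13000, 55634, 33352, 41904, -5235]
T_975_DF19 = 2.093   # assumption: t-table value, 0.975 quantile, 19 degrees of freedom


def describe(data, t_value):
    n = len(data)
    s = statistics.stdev(data)              # sample standard deviation
    std_error = s / math.sqrt(n)
    return {
        "mean": statistics.mean(data),
        "standard_error": std_error,
        "median": statistics.median(data),
        "stdev": s,
        "range": max(data) - min(data),
        "count": n,
        "ci_half_width": t_value * std_error,   # the "Confidence Level" row
    }


if __name__ == "__main__":
    for name, value in describe(PROFITS, T_975_DF19).items():
        print(f"{name:15s} {value:12.2f}")
```

The confidence interval for the mean is then the mean plus or minus the half-width.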


Quartiles

First quartile: =QUARTILE(B4:B23,1)
Third quartile: =QUARTILE(B4:B23,3)
Interquartile range (the central 50% of the data): =QUARTILE(B4:B23,3)−QUARTILE(B4:B23,1)
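For comparison, the same quartiles can be computed in Python. As far as I know, Excel's QUARTILE interpolates between data points in the same way as the "inclusive" method of `statistics.quantiles`, so that method is used here; treat the match as an assumption to check against your own worksheet.

```python
import statistics

PROFITS = [50386, 38888, 62023, -12000, 24960, 24756, 12000, 14744, 44421, 23000,
           25432, 36987, 85213, 54122, 23222, 13000, 55634, 33352, 41904, -5235]

# n=4 returns the three cut points: first quartile, median, third quartile
q1, q2, q3 = statistics.quantiles(PROFITS, n=4, method="inclusive")
iqr = q3 - q1   # width of the central 50% of the data

if __name__ == "__main__":
    print("Q1:", q1, "Median:", q2, "Q3:", q3, "IQR:", iqr)
```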

Risk Analysis

Simulations are often conducted to analyze risk in some way, e.g., the probability of being late on a construction project or the probability of losing money. This is where the distribution of the output, not just the mean, becomes particularly important. Multiple runs of the simulation are particularly advised when seeking answers to these types of questions.

Examples

Probability of output being less than $10,000 (% of output < $10,000):
=PERCENTRANK(B4:B23,10000)
(Note that PERCENTRANK interpolates if no value matches.)

Probability of output being greater than $72,000:
=1−PERCENTRANK(B4:B23,72000)

Probability of output being between $5000 and $67,000:
=PERCENTRANK(B4:B23,67000)−PERCENTRANK(B4:B23,5000)

What are the 95% central interval limits (α = .05)? (In other words, what are the .025 and .975 quantiles?)
=PERCENTILE(B4:B23,.025) and =PERCENTILE(B4:B23,.975)

(Note that this is not a 95% “confidence interval.” Instead, we are estimating the proportion of the data that we expect to be within the given limits based upon the results of the simulation. We are defining the interval based upon the central proportion of the data.)
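The same risk questions can be answered in Python by counting trials directly. This sketch uses simple proportions of the 20 trials, which is a deliberately different (and cruder) estimate than PERCENTRANK's interpolated value, so the numbers will not match Excel exactly. The percentile helper uses linear interpolation between sorted data points, which I believe mirrors PERCENTILE; all function names are hypothetical.

```python
PROFITS = [50386, 38888, 62023, -12000, 24960, 24756, 12000, 14744, 44421, 23000,
           25432, 36987, 85213, 54122, 23222, 13000, 55634, 33352, 41904, -5235]


def prob_below(data, x):
    """Share of trials with output below x."""
    return sum(v < x for v in data) / len(data)


def prob_above(data, x):
    """Share of trials with output above x."""
    return sum(v > x for v in data) / len(data)


def prob_between(data, low, high):
    """Share of trials with output strictly between low and high."""
    return sum(low < v < high for v in data) / len(data)


def percentile(data, q):
    """Quantile q (0..1) by linear interpolation between sorted points."""
    ordered = sorted(data)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    if lower + 1 >= len(ordered):
        return float(ordered[-1])
    return ordered[lower] + (pos - lower) * (ordered[lower + 1] - ordered[lower])


if __name__ == "__main__":
    print(prob_below(PROFITS, 10000), prob_above(PROFITS, 72000))
    print(prob_between(PROFITS, 5000, 67000))
    print(percentile(PROFITS, 0.025), percentile(PROFITS, 0.975))
```

With only 20 trials these tail estimates are coarse, which is one reason longer runs are advised for risk questions.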


The Data Analysis “Rank and Percentile” tool provides cumulative distribution information.

→Data→Analysis:→Data Analysis→Rank and Percentile→<OK>

Point   Profit      Rank   Percent
13      $85,213     1      100.00%
3       $62,023     2      94.70%
17      $55,634     3      89.40%
14      $54,122     4      84.20%
1       $50,386     5      78.90%
9       $44,421     6      73.60%
19      $41,904     7      68.40%
2       $38,888     8      63.10%
12      $36,987     9      57.80%
18      $33,352     10     52.60%
11      $25,432     11     47.30%
5       $24,960     12     42.10%
6       $24,756     13     36.80%
15      $23,222     14     31.50%
10      $23,000     15     26.30%
8       $14,744     16     21.00%
16      $13,000     17     15.70%
7       $12,000     18     10.50%
20      ($5,235)    19     5.20%
4       ($12,000)   20     0.00%