unit 6 input modeling

INPUT MODELING UNIT 6

Upload: raksharao

Post on 18-Jan-2017


Page 1: Unit 6 input modeling

INPUT MODELING UNIT 6

Page 2: Unit 6 input modeling

CONTENTS
• Data collection
• Identifying the distribution with data
• Parameter estimation
• Goodness-of-fit tests
• Fitting a non-stationary Poisson process
• Selecting input models without data
• Multivariate and time-series input models
• Uniformity and independence
• Chi-square test
• K-S test

Page 3: Unit 6 input modeling

• Input models are the distributions of the time between arrivals and of service times.
• There are four steps in the development of a useful model of input data:
1. Collect the data from the real system of interest.
2. Identify a probability distribution to represent the input process.
3. Choose the parameters that determine a specific instance of the distribution family.
4. Evaluate the chosen distribution and the associated parameters for goodness of fit.

Page 4: Unit 6 input modeling

1. DATA COLLECTION

• Collect data from the real system of interest.
• This requires a substantial commitment of time and resources.
• When data are not available, expert opinion and knowledge of the process must be used to make educated guesses.
• Even if the model structure is valid, the simulation output will be misleading if the input data are inaccurately collected or inappropriately analysed.

Page 5: Unit 6 input modeling

SUGGESTIONS TO ENHANCE DATA COLLECTION
1. Time spent on planning is time well spent; planning could begin with a practice or pre-observation session.
2. Try to analyse the data as they are being collected.
3. Try to combine homogeneous data sets.
4. Be aware of the possibility of data censoring.
5. Use a scatter diagram to discover whether there is a relationship between two variables.
6. Check for autocorrelation in sequences of observations, such as data collected from successive customers.
7. Keep in mind the difference between input data and output (performance) data.

Page 6: Unit 6 input modeling

2. IDENTIFYING THE DISTRIBUTION WITH DATA
• The shape of the distribution is identified using
  • a frequency distribution
  • a histogram

Page 7: Unit 6 input modeling

STEPS INVOLVED IN CONSTRUCTION OF HISTOGRAM
1. Divide the range of the data into intervals.
2. Label the horizontal axis to conform to the intervals selected.
3. Find the frequency of occurrences within each interval.
4. Label the vertical axis so that the total occurrences can be plotted for each interval.
5. Plot the frequencies on the vertical axis.

Page 8: Unit 6 input modeling

• The number of intervals depends on the number of observations and on the amount of scatter or dispersion in the data.
• The histogram should be neither too ragged (too many intervals) nor too coarse (too few intervals).
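
The histogram construction steps can be sketched in Python as follows. This is a minimal sketch, assuming equal-width intervals and data that are not all identical; the function name `frequency_table` and the sample values are hypothetical and not taken from the slides.

```python
def frequency_table(data, num_intervals):
    """Count observations in equal-width intervals spanning the data range."""
    low, high = min(data), max(data)
    width = (high - low) / num_intervals
    counts = [0] * num_intervals
    for x in data:
        # The largest observation is placed in the last interval.
        index = min(int((x - low) / width), num_intervals - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(num_intervals + 1)]
    return edges, counts


# Hypothetical observations, for illustration only.
sample = [1.2, 0.4, 3.1, 2.2, 0.9, 1.7, 2.8, 0.3, 1.1, 4.0]
edges, counts = frequency_table(sample, 4)

# Text rendering: one row per interval, one star per observation.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"[{left:.2f}, {right:.2f}) {'*' * count}")
```

Rerunning the sketch with a much larger or much smaller `num_intervals` is a quick way to see a ragged or a coarse histogram for the same data.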

Page 9: Unit 6 input modeling

RAGGED AND COARSE HISTOGRAM

Page 10: Unit 6 input modeling

EXAMPLE

Page 11: Unit 6 input modeling

2.2. SELECTING THE FAMILY OF DISTRIBUTION
• The purpose of preparing a histogram is to infer a known pmf or pdf.
• The family of distributions is chosen based on what might arise in the context being investigated, along with the shape of the histogram.
• Many probability distributions exist; a few of the commonly used ones are:

Page 12: Unit 6 input modeling

• Binomial:
  • Models the number of successes in n trials, where the trials are independent with common success probability p.
  • E.g.: the number of defective chips out of n chips.
• Poisson:
  • Models the number of independent events that occur in a fixed amount of time or space.
  • E.g.: the number of customers that arrive at a store in one hour.
• Normal:
  • Models the distribution of a process that can be thought of as the sum of a number of component processes.
  • E.g.: the time to assemble a product, which is the sum of the times required for each assembly operation.

Page 13: Unit 6 input modeling

• Exponential:
  • Models the time between independent events.
  • E.g.: the time between arrivals.
• Gamma:
  • A flexible distribution used to model non-negative random variables.
• Beta:
  • An extremely flexible distribution used to model bounded random variables.
• Weibull:
  • Models the time to failure for components.
• Discrete or continuous uniform:
  • Models complete uncertainty.
• Triangular:
  • Models a process for which only the minimum, most likely, and maximum values of the distribution are known.

Page 14: Unit 6 input modeling

2.3. QUANTILE-QUANTILE PLOTS (Q-Q PLOTS)
• A histogram is not as useful for evaluating the fit of the chosen distribution: when there is a small number of data points the histogram can be ragged, and its appearance depends on the width of the histogram intervals.
• A Q-Q plot is a useful tool for evaluating distribution fit.
• Definition:
  • If X is a random variable with cdf F, then the q-quantile of X is that value γ such that F(γ) = P(X ≤ γ) = q, for 0 < q < 1.
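
The quantile definition can be turned into Q-Q plot coordinates with a short Python sketch. It assumes a normal candidate distribution fitted with the sample mean and standard deviation, and it pairs the j-th smallest observation with the (j - 0.5)/n quantile, which is a common plotting convention rather than something stated on the slide. The helper name `qq_points` and the data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def qq_points(data, inverse_cdf):
    """Pair each ordered observation with the matching quantile of the fitted cdf."""
    ordered = sorted(data)
    n = len(ordered)
    # The j-th smallest value is compared with the (j - 0.5)/n quantile.
    return [(inverse_cdf((j - 0.5) / n), y) for j, y in enumerate(ordered, start=1)]


# Hypothetical observations, for illustration only.
sample = [9.8, 10.3, 10.1, 9.6, 10.0, 9.5, 10.4, 10.2, 9.7, 9.9]
fitted = NormalDist(mean(sample), stdev(sample))
points = qq_points(sample, fitted.inv_cdf)

for theoretical, observed in points:
    print(f"{theoretical:8.3f} {observed:8.3f}")
```

If the candidate distribution fits, the printed pairs fall close to a straight line with slope 1 through the origin; systematic curvature suggests a poor fit.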

Page 15: Unit 6 input modeling

3 PARAMETER ESTIMATION

• After the family of distributions has been selected, the next step is to estimate the parameters of the distribution.
• Preliminary statistics: sample mean and sample variance.
• The sample mean and sample variance are calculated differently depending on whether the data are discrete or continuous.
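
A minimal Python sketch of the preliminary statistics follows. It assumes the usual formulas, mean = Σ fj·Xj / n and S² = (Σ fj·Xj² − n·mean²) / (n − 1), where raw data have every frequency fj equal to 1 and grouped data supply the distinct values (or class midpoints) with their frequencies. The function name `sample_mean_variance` and the numbers are hypothetical.

```python
def sample_mean_variance(values, frequencies=None):
    """Return (sample mean, sample variance) for raw or grouped data."""
    if frequencies is None:
        # Raw data: every observation counts once.
        frequencies = [1] * len(values)
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, values)) / n
    sum_of_squares = sum(f * x * x for f, x in zip(frequencies, values))
    variance = (sum_of_squares - n * mean ** 2) / (n - 1)
    return mean, variance


# The same hypothetical data, once as raw observations and once grouped.
raw_stats = sample_mean_variance([2, 4, 4, 6])
grouped_stats = sample_mean_variance([2, 4, 6], frequencies=[1, 2, 1])
print(raw_stats, grouped_stats)
```

For continuous data recorded in class intervals, passing the interval midpoints as `values` gives an approximation, since the exact observations are no longer available.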

Page 16: Unit 6 input modeling
Page 17: Unit 6 input modeling

SUGGESTED ESTIMATORS:

• Numerical estimates of the distribution parameters are needed to reduce the family of distributions to a specific distribution

• These estimators are the maximum-likelihood estimators based on the raw data.

• The triangular distribution is usually employed when no data are available.
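
As an illustration of reducing a family to a specific distribution, the sketch below computes commonly suggested estimators for three families: the Poisson mean is estimated by the sample mean, the exponential rate by the reciprocal of the sample mean, and the normal parameters by the sample mean and sample variance. The function names and data are hypothetical, and the normal variance here uses the n − 1 divisor (the strict maximum-likelihood version divides by n).

```python
from statistics import mean, variance


def fit_poisson(counts):
    """Estimate the Poisson mean by the sample mean."""
    return mean(counts)


def fit_exponential(times):
    """Estimate the exponential rate by 1 / (sample mean)."""
    return 1.0 / mean(times)


def fit_normal(data):
    """Estimate (mu, sigma^2) by the sample mean and the sample variance."""
    return mean(data), variance(data)


# Hypothetical data, for illustration only.
print(fit_poisson([2, 3, 4]))
print(fit_exponential([0.5, 1.5, 1.0]))
print(fit_normal([2, 4, 4, 6]))
```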

Page 18: Unit 6 input modeling
Page 19: Unit 6 input modeling

4 GOODNESS-OF-FIT TESTS

• Apply goodness-of-fit tests appropriate to the family of distribution selected.
• Use the corresponding estimators for that family and verify the goodness of fit.
• General tests that can be applied are
1. Chi-square test
2. Kolmogorov-Smirnov test

Page 20: Unit 6 input modeling

GOODNESS-OF-FIT TESTS
• Chi-square test: the test procedure starts by arranging the n observations into k class intervals or cells.
• The test statistic is given by χ0² = Σ (Oi − Ei)² / Ei, summed over the k class intervals, where Oi is the observed frequency in the ith class interval and Ei is the expected frequency in that class interval.
• The expected frequency for each class interval is computed as Ei = n·Pi, where Pi is the theoretical, hypothesized probability associated with the ith class interval.

Page 21: Unit 6 input modeling

CHI-SQUARE TEST STEPS
1. Formulate the hypotheses:
   H0: the data belong to a particular candidate distribution.
   H1: the data do not belong to that candidate distribution.
2. Estimate the parameters of this distribution.
3. Calculate the probability Pi of each class interval from the pdf or pmf.
4. Calculate the expected frequency Ei = n·Pi.
5. Calculate χ0² = Σ (Oi − Ei)² / Ei.
6. Find the critical value χ²(α, k−s−1) and reject H0 if χ0² > χ²(α, k−s−1), where
   k is the number of class intervals,
   s is the number of parameters estimated from the data,
   α is the level of significance.
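
The chi-square steps can be sketched in Python as follows. The observed counts and hypothesized probabilities are hypothetical, no parameters are estimated in this example (s = 0), and the critical value 7.815 is the standard table entry for α = 0.05 with 3 degrees of freedom. The function names are illustrative only.

```python
def chi_square_statistic(observed, probabilities):
    """Compute the sum of (Oi - Ei)^2 / Ei with Ei = n * Pi."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(num_intervals, num_estimated_parameters):
    """Degrees of freedom k - s - 1 for the chi-square test."""
    return num_intervals - num_estimated_parameters - 1


# Hypothetical example: 100 observations in 4 equally likely class intervals.
observed = [18, 22, 27, 33]
probabilities = [0.25, 0.25, 0.25, 0.25]

statistic = chi_square_statistic(observed, probabilities)
dof = degrees_of_freedom(len(observed), 0)
critical_value = 7.815  # table value for alpha = 0.05 and 3 degrees of freedom
reject_h0 = statistic > critical_value
print(statistic, dof, reject_h0)
```

Here the statistic is below the critical value, so H0 is not rejected at the 0.05 level for this made-up data set.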

Page 22: Unit 6 input modeling

CHI-SQUARE TEST WITH EQUAL PROBABILITIES

• The number of intervals is k ≤ n/5.
• The probability of each class interval is Pi = p = 1/k.
• The upper limit ai of the ith class interval is computed from F(ai) = i·p.
• For the exponential distribution, the upper limit of the ith class interval is ai = −(1/λ) ln(1 − i·p).
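
A short Python sketch of the equal-probability endpoints for the exponential case is given below. It assumes p = 1/k and solves F(ai) = i·p for the exponential cdf F(x) = 1 − exp(−λx); the function name `exponential_endpoints` and the parameter values are hypothetical.

```python
import math


def exponential_endpoints(rate, num_intervals):
    """Return the k + 1 endpoints of k equal-probability intervals."""
    p = 1.0 / num_intervals
    # Interior endpoints solve 1 - exp(-rate * a_i) = i * p.
    interior = [-math.log(1.0 - i * p) / rate for i in range(1, num_intervals)]
    # The first interval starts at 0 and the last one is unbounded.
    return [0.0] + interior + [math.inf]


rate = 0.5  # hypothetical estimate of lambda
endpoints = exponential_endpoints(rate, 4)
print(endpoints)
```

With the endpoints fixed this way, every class interval has the same expected frequency n/k, so only the observed counts differ between intervals.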

Page 23: Unit 6 input modeling

KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST FOR THE EXPONENTIAL DISTRIBUTION
• The hypotheses can be given as
  H0: the interarrival times (IAT) are exponentially distributed.
  H1: the IAT are not exponentially distributed.
• The data were collected over a period of time from 0 to T.
• If the underlying distribution of the IAT {T1, T2, …, Tn} is exponential, then the arrival times are uniformly distributed on the interval (0, T).

Page 24: Unit 6 input modeling

• Arrival times can be obtained by adding the interarrival times, i.e. T1, T1+T2, T1+T2+T3, …, T1+…+TN.
• The arrival times can then be normalized to the (0, 1) interval so that the K-S test can be applied. On the (0, 1) interval the data points will be T1/T, (T1+T2)/T, (T1+T2+T3)/T, …, (T1+…+TN)/T.

Page 25: Unit 6 input modeling

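• Use these data points to apply the K-S test for uniformity.

• Finally, compute the K-S statistic D and look up the critical value Dα for the chosen level of significance α and sample size N (for large N, Dα is a tabulated constant divided by sqrt(N)), then compare D with Dα to decide whether to reject the null hypothesis.

• If D ≤ Dα, H0 is not rejected: the normalized arrival times are consistent with a uniform distribution, and hence the IAT with an exponential distribution. If D > Dα, H0 is rejected.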
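
The K-S procedure for interarrival times can be sketched in Python as follows. It assumes the usual statistic D = max(D+, D−), with D+ = max(i/N − R(i)) and D− = max(R(i) − (i − 1)/N) over the ordered points R(i). The function names, the interarrival times, and the observation period T are hypothetical, and the critical value has to come from a K-S table.

```python
from itertools import accumulate


def normalized_arrival_times(interarrival_times, total_time):
    """Cumulate the interarrival times and scale them by the period T."""
    return [t / total_time for t in accumulate(interarrival_times)]


def ks_statistic_uniform(points):
    """K-S statistic D for testing uniformity on (0, 1)."""
    ordered = sorted(points)
    n = len(ordered)
    # enumerate() is zero-based, so (i + 1)/n and i/n play the roles of i/N and (i - 1)/N.
    d_plus = max((i + 1) / n - r for i, r in enumerate(ordered))
    d_minus = max(r - i / n for i, r in enumerate(ordered))
    return max(d_plus, d_minus)


# Hypothetical interarrival times observed over the period (0, 6.0).
points = normalized_arrival_times([0.8, 1.5, 0.4, 2.3], 6.0)
d = ks_statistic_uniform(points)
print(points, d)
```

The resulting `d` is then compared with the tabulated Dα for the chosen α and N to decide whether the exponential hypothesis is rejected.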

Page 26: Unit 6 input modeling

P-VALUES AND BEST FITS

• P-value
  • The significance level at which one would just reject H0 for the given value of the test statistic.
  • Therefore, a large p-value indicates a good fit.
  • A small p-value suggests a poor fit.
• Best fit
  • Here the software recommends an input model to the user after evaluating all feasible models.

Page 27: Unit 6 input modeling

FITTING A NON-STATIONARY POISSON PROCESS (NSPP)
• Approaches used are
  • Choose a very flexible model with lots of parameters and fit it with a method such as maximum likelihood.
  • Approximate the arrival rate as being constant over some basic interval of time, such as an hour, a day, or a month, but varying from time interval to time interval.
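
The piecewise-constant approach can be sketched in Python as follows, assuming arrival counts have been recorded for the same time intervals over several observation periods (for example, several days). The estimated rate for an interval is the average count divided by the interval length. The function name `estimate_rates` and the counts are hypothetical.

```python
def estimate_rates(counts_by_day, interval_length):
    """Estimate a constant arrival rate for each time interval."""
    num_days = len(counts_by_day)
    num_intervals = len(counts_by_day[0])
    rates = []
    for j in range(num_intervals):
        # Average the count for interval j over all observed days.
        average_count = sum(day[j] for day in counts_by_day) / num_days
        rates.append(average_count / interval_length)
    return rates


# Hypothetical counts: 3 days, 4 consecutive half-hour intervals per day.
counts_by_day = [
    [6, 10, 14, 8],
    [4, 12, 16, 10],
    [5, 14, 12, 9],
]
rates = estimate_rates(counts_by_day, interval_length=0.5)
print(rates)  # arrivals per hour in each half-hour interval
```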

Page 28: Unit 6 input modeling

SELECTING INPUT MODELS WITHOUT DATA
• There are different ways to obtain information about a process even if data are not available:
• Engineering data
  • Product or process performance ratings provided by the manufacturer.
• Expert opinion
  • Talk to people who are experienced with the process or with similar processes.
• Physical or conventional limitations
  • Most physical processes have physical limits on performance.
• The nature of the process
  • The descriptions of the standard distributions (see 2.2, Selecting the family of distribution) can be used to justify a particular choice.

Page 29: Unit 6 input modeling

MULTIVARIATE AND TIME-SERIES INPUT MODELS

• If we have a finite number of random variables, then we have a multivariate model.
• If we have a (conceptually) infinite sequence of random variables, then we have a time-series model.

Page 30: Unit 6 input modeling

COVARIANCE AND CORRELATION
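
As a minimal sketch of the sample versions of these two quantities, assuming paired observations (x1, y1), …, (xn, yn) and the n − 1 divisor, covariance and correlation can be computed in Python as below. The function names and the data are hypothetical.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance of paired observations (n - 1 divisor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample correlation: covariance scaled by both standard deviations."""
    return sample_covariance(xs, ys) / math.sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys)
    )


# Hypothetical, perfectly linearly related data.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
print(sample_covariance(xs, ys), sample_correlation(xs, ys))
```

The correlation always lies between −1 and 1, which makes it easier to interpret than the covariance, whose size depends on the units of the two variables.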

Page 31: Unit 6 input modeling

TIME SERIES INPUT MODEL

• If X1, X2, X3, … is a sequence of identically distributed but dependent and covariance-stationary random variables, there are a number of time-series models that can be used to represent the process.
• Two models have the characteristic that the autocorrelations take the form ρh = corr(Xt, Xt+h) = ρ^h for h = 1, 2, …:
  • AR(1) model
  • EAR(1) model

Page 32: Unit 6 input modeling

AR MODEL- AUTOREGRESSIVE ORDER 1 MODEL
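
A minimal Python sketch of generating an AR(1) series follows. It assumes the standard form Xt = μ + φ(Xt−1 − μ) + εt with independent normal noise εt of mean 0 and standard deviation `sigma_eps`, |φ| < 1, and a first value drawn from the stationary distribution. The function name `generate_ar1` and the parameter values are hypothetical.

```python
import math
import random


def generate_ar1(n, mu, phi, sigma_eps, seed=None):
    """Generate n values of a stationary AR(1) process (requires |phi| < 1)."""
    rng = random.Random(seed)
    # Stationary start: X1 ~ N(mu, sigma_eps^2 / (1 - phi^2)).
    x = rng.gauss(mu, sigma_eps / math.sqrt(1.0 - phi ** 2))
    series = [x]
    for _ in range(n - 1):
        # Each value is pulled back toward mu and perturbed by fresh noise.
        x = mu + phi * (x - mu) + rng.gauss(0.0, sigma_eps)
        series.append(x)
    return series


series = generate_ar1(n=10, mu=5.0, phi=0.6, sigma_eps=1.0, seed=1)
print(series)
```

The values are normally distributed, so this model suits inputs that can reasonably be treated as normal; the lag-h autocorrelation of the generated series is φ^h.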

Page 33: Unit 6 input modeling

EAR MODEL- EXPONENTIAL AUTOREGRESSIVE ORDER 1 MODEL
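
A minimal Python sketch of generating an EAR(1) series follows. It assumes the standard form in which Xt = φ·Xt−1 with probability φ and Xt = φ·Xt−1 + εt with probability 1 − φ, where εt is exponential with rate λ, 0 ≤ φ < 1, and X1 is exponential with rate λ. The function name `generate_ear1` and the parameter values are hypothetical.

```python
import random


def generate_ear1(n, rate, phi, seed=None):
    """Generate n values of an EAR(1) process (requires 0 <= phi < 1)."""
    rng = random.Random(seed)
    x = rng.expovariate(rate)  # X1 is exponential with the given rate
    series = [x]
    for _ in range(n - 1):
        if rng.random() < phi:
            # With probability phi the previous value is only shrunk.
            x = phi * x
        else:
            # Otherwise a fresh exponential innovation is added.
            x = phi * x + rng.expovariate(rate)
        series.append(x)
    return series


series = generate_ear1(n=10, rate=2.0, phi=0.4, seed=1)
print(series)
```

Every value stays non-negative and the marginal distribution remains exponential with rate λ, which is why this model is used for dependent interarrival or service times; the lag-h autocorrelation is φ^h.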

Page 34: Unit 6 input modeling

END OF UNIT 6
THANK YOU