TUGASAN 3 PN SITI NUR DIYANA

Upload: diyana-mahmud

Post on 24-Oct-2014


TRANSCRIPT

Page 1: TUGASAN 3

TUGASAN 3 PN SITI NUR DIYANA

Page 2: TUGASAN 3

Why clean your data?

• Screening process

  • Detect errors

    • Missing data

    • Outliers

  • Make sure data meets assumptions for analysis

    • Normality

Page 3: TUGASAN 3

2 Types of Screening

1. Preliminary data screening

  • Screen one variable at a time on the entire data set before any analysis

2. In conjunction with statistical analysis

  • Dependent on analysis being performed

Page 4: TUGASAN 3

Data Cleaning: Tips for making your data suitable for analysis

Page 5: TUGASAN 3

Steps

1. Check for missing data

2. Check for normality

3. Remove outliers

4. Check for normality again

5. Transform data

Page 6: TUGASAN 3

Compute into variable

Page 13: TUGASAN 3

Statistical methods include diagnostic hypothesis tests for normality, and a rule of thumb that says a variable is reasonably close to normal if its skewness and kurtosis have values between –1.0 and +1.0.
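The rule of thumb above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS computation: it assumes simple moment-based formulas, with kurtosis reported as excess kurtosis (so a normal distribution scores 0), whereas SPSS applies sample-size adjustments that give slightly different values. The function names and the sample data are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (a normal distribution gives 0 and 0)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew, excess_kurt


def roughly_normal(values, limit=1.0):
    """Rule of thumb: both statistics fall between -limit and +limit."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) <= limit and abs(kurt) <= limit


# Hypothetical samples: one symmetric and bell-like, one with a single extreme value.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed = [1] * 9 + [100]
print(roughly_normal(symmetric), roughly_normal(skewed))
```

The check treats the boundaries -1.0 and +1.0 as acceptable; the slide does not say whether the limits are inclusive, so that choice is an assumption.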

Page 14: TUGASAN 3

What do we do when Data are Missing?

• Listwise (casewise) deletion: uses only complete cases, i.e., cases with no missing value on any variable in the analysis
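As a rough illustration of listwise deletion outside SPSS, the sketch below drops every case that has a missing value on any of the analysis variables. The data set, the variable names (`hours`, `age`), and the use of `None` as the missing-value marker are all assumptions made for the example.

```python
def listwise_delete(cases, variables):
    """Keep only cases with a non-missing value for every listed variable."""
    return [
        case for case in cases
        if all(case.get(v) is not None for v in variables)
    ]


# Hypothetical mini data set; None marks a missing value.
cases = [
    {"id": 1, "hours": 3.5, "age": 21},
    {"id": 2, "hours": None, "age": 34},
    {"id": 3, "hours": 4.0, "age": None},
    {"id": 4, "hours": 2.5, "age": 28},
]

complete = listwise_delete(cases, ["hours", "age"])
print([c["id"] for c in complete])  # [1, 4]
```

Note that two of the four cases are lost here even though each is missing only one value, which is the main cost of this approach.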

Page 15: TUGASAN 3

Step 2 Check for normality

• Still using information from “Explore” in SPSS

• Look at:

  1. Descriptive table

  2. Tests of Normality table

  3. Histogram

  4. Box plot

Page 16: TUGASAN 3

Step 2 Check for normality

• Since the sample size is larger than 50, we use the Kolmogorov-Smirnov test. If the sample size were 50 or less, we would use the Shapiro-Wilk statistic instead.

The null hypothesis for the test of normality states that the actual distribution of the variable is equal to the expected distribution, i.e., the variable is normally distributed. Since the probability associated with the test of normality (< 0.001) is less than or equal to the level of significance (0.01), we reject the null hypothesis and conclude that total hours spent on the Internet is not normally distributed. (Note: we report the probability as < 0.001 instead of .000 to make clear that the probability is not really zero.)
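The logic on this slide can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of SPSS: the function names are hypothetical, the Kolmogorov-Smirnov statistic is computed against a normal curve fitted with the sample mean and standard deviation, and no p-value is calculated here (SPSS reports one with a Lilliefors correction, so the p-value passed to the decision function is assumed to come from the SPSS output).

```python
import math
import statistics


def choose_normality_test(n):
    """Rule from the slide: Kolmogorov-Smirnov above 50 cases, otherwise Shapiro-Wilk."""
    return "Kolmogorov-Smirnov" if n > 50 else "Shapiro-Wilk"


def ks_statistic_vs_normal(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    xs = sorted(values)
    n = len(xs)
    mu = statistics.mean(xs)
    sd = statistics.stdev(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # Normal CDF at x, using the fitted mean and standard deviation.
        f = 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))
        # Compare against the empirical CDF just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def reject_normality(p_value, alpha=0.01):
    """Reject the null hypothesis of normality when p <= alpha."""
    return p_value <= alpha


print(choose_normality_test(120))
print(reject_normality(0.0005))
```

A larger statistic means the data sit further from the fitted normal curve; the decision itself still rests on the p-value compared with the chosen significance level.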

Page 17: TUGASAN 3

Step 2 Check for normality

Page 19: TUGASAN 3

Step 3 Remove outliers

• Remove data points highlighted in box plot

  • Not the best method

• “Schweinle Method”

  • Remove data that is more than 2.5 SD from the mean

Page 20: TUGASAN 3

Step 3 Remove outliers

• “Schweinle Method”

  1. Multiply the SD by 2.5: 0.61463 × 2.5 = 1.536575

  2. Add that value to the mean: 1.536575 + 3.8026 = 5.339175

• Remove any values above 5.339175
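The cutoff arithmetic from this slide can be checked with a short Python sketch. The function name `upper_cutoff` is hypothetical; the mean (3.8026) and SD (0.61463) are the values given on the slide.

```python
def upper_cutoff(mean, sd, multiplier=2.5):
    """Upper outlier cutoff: the mean plus a multiple of the standard deviation."""
    return mean + multiplier * sd


# Values taken from the slide.
cutoff = upper_cutoff(mean=3.8026, sd=0.61463)
print(round(cutoff, 6))  # 5.339175
```

The slide only removes values above the mean; a lower cutoff (mean minus 2.5 SD) would be computed the same way with a negative multiplier.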

Page 21: TUGASAN 3

Step 3 Remove outliers

• SPSS: Data > Select Cases

  • Select “If condition is satisfied”

  • Enter the condition: Variable <= 5.339175

    • SPSS will not analyze data that is over 5.339175

  • Click “Continue” and “OK”
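For readers working outside SPSS, the same selection condition can be sketched in Python. The scores below are hypothetical, and the function name `select_cases` is only borrowed from the menu item for readability. As with the SPSS condition, cases are flagged rather than deleted, so the original data stay intact.

```python
def select_cases(values, cutoff):
    """Return one flag per case: True = included in analysis, False = filtered out."""
    return [v <= cutoff for v in values]


# Hypothetical scores; 5.339175 is the cutoff computed on the previous slide.
scores = [3.2, 4.1, 5.339175, 5.9, 3.8]
flags = select_cases(scores, cutoff=5.339175)
kept = [v for v, keep in zip(scores, flags) if keep]
print(kept)  # [3.2, 4.1, 5.339175, 3.8]
```

A value exactly equal to the cutoff is kept, matching the `<=` in the condition.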

Page 22: TUGASAN 3

Step 4 Check for normality – again!

Page 23: TUGASAN 3

Step 5 Transform data

Page 24: TUGASAN 3

Step 5 Transform data

• Transform data – square root

• SPSS: Transform > Compute Variable

  • Target variable: enter new name (e.g., sqrt)

  • Click on “Arithmetic” under function group

  • Click on “Sqrt” under functions and special variables

  • Click on the up arrow to bring SQRT(?) to the numeric expression box

  • Highlight the variable to be transformed and click the right arrow to replace the (?)

• Explore data again to check for normality
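The square root transformation itself is a one-line computation. The sketch below shows the equivalent in Python, with a guard for negative values as an added assumption (the slide does not discuss them); the function name and input values are hypothetical.

```python
import math


def sqrt_transform(values):
    """Return the square root of each value, for use as a new transformed variable."""
    if any(v < 0 for v in values):
        raise ValueError("square root is undefined for negative values")
    return [math.sqrt(v) for v in values]


print(sqrt_transform([0, 1, 4, 9, 16]))  # [0.0, 1.0, 2.0, 3.0, 4.0]
```

The transformation compresses large values more than small ones, which is why it can reduce positive skew; normality should still be rechecked on the new variable.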

Page 25: TUGASAN 3

Reliability

Page 30: TUGASAN 3

Creating graphic illustration

Page 32: TUGASAN 3

Descriptive analysis

Page 34: TUGASAN 3

Inferential statistics

Page 35: TUGASAN 3

T-test

Page 36: TUGASAN 3

Correlation

Page 37: TUGASAN 3

Two-way ANOVA

Page 38: TUGASAN 3

Regression