regresi logistik i - kusmansadik.files.wordpress.com · 12/11/2018 · 1. gunakan program r untuk...


Logistic Regression I

(Independent Variable: Continuous)

Dr. Kusman Sadik, M.Si

Graduate Study Program

Department of Statistics, IPB, 2018/2019


The logistic regression model is a generalized linear model appropriate for binary outcomes.

The logistic regression model is similar to the more familiar linear regression model in that both models predict an outcome variable from a set of predictor variables (also known as explanatory or independent variables).

In the case of logistic regression, the response variable is a binary or dichotomous variable, which means it can take on only one of two possible values.


To model the probabilities of certain conditions or states (e.g., divorce, disease, resilience, etc.) as a function of some predictor variables. For example, one might want to model whether an individual has diabetes as a function of weight, plasma insulin, and fasting plasma glucose.

To describe differences between individuals from separate groups as a function of some predictor variables, also known as descriptive discriminant analysis. For example, one might want to describe the difference between students who attend public versus private schools as a function of achievement test scores, desired occupation, and socioeconomic status (SES).


To classify individuals into one of two categories on the basis of the predictor variables, also known as predictive discriminant analysis.

This is closely related to descriptive discriminant analysis, but the descriptive information is used to predict group membership or classify individuals into one of two groups.

For example, one may want to predict whether a student is more likely to attend a private school (as opposed to a public school) as a function of achievement test scores, desired occupation, and SES.


In the psychometric field, there are specific applications that are closely tied to predictive discriminant analysis.

For example, one might want to predict the probability that an examinee will correctly answer a test item as a function of race and gender.

These types of studies are known in the psychometric literature as differential item functioning analyses.


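$$\frac{\pi}{1-\pi} = \frac{P(Y=1)}{1-P(Y=1)} = \frac{P(Y=1)}{P(Y=0)} = \text{Odds} = \exp(\alpha + \boldsymbol{\beta}\boldsymbol{X})$$

$$\pi = \frac{\exp(\alpha + \boldsymbol{\beta}\boldsymbol{X})}{1 + \exp(\alpha + \boldsymbol{\beta}\boldsymbol{X})}$$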
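The link between the linear predictor, the odds, and the probability π can be checked numerically in R. The following is a minimal sketch with a single predictor; the values of `alpha`, `beta`, and `x` are illustrative assumptions, not estimates from these slides. It relies on the base R functions `plogis()` (inverse logit) and `qlogis()` (logit).

```r
# Illustrative (assumed) parameter values, not estimates from the slides
alpha <- -2
beta  <- 0.5
x     <- 3

eta  <- alpha + beta * x     # linear predictor: alpha + beta * x
odds <- exp(eta)             # odds = pi / (1 - pi) = exp(alpha + beta * x)
p    <- odds / (1 + odds)    # solving the odds equation for pi

# Base R equivalents: plogis() is the inverse logit, qlogis() is the logit
p_builtin <- plogis(eta)
logit_p   <- qlogis(p)       # recovers the linear predictor eta

print(c(eta = eta, odds = odds, prob = p))
```

Working on the logit scale and converting back with `plogis()` is usually preferable to computing `exp(eta) / (1 + exp(eta))` by hand, because it avoids overflow for large values of the linear predictor.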


The critical value χ²(α, df) is obtained in R with qchisq(α, df, lower.tail=FALSE):

> qchisq(0.05, 1, lower.tail=FALSE)

[1] 3.841459

Thus χ²(α = 0.05, df = 1) = 3.841459
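A critical value and a p-value are two views of the same upper-tail area. The short sketch below shows how `qchisq()` and `pchisq()` invert each other, and that for one degree of freedom the chi-square critical value equals the squared two-sided normal quantile. The variable names (`sig_level`, `dof`, and so on) are chosen here for illustration only.

```r
sig_level <- 0.05   # significance level (alpha)
dof       <- 1      # degrees of freedom

# Upper-tail critical value: P(X > crit) = sig_level for X ~ chi-square(dof)
crit <- qchisq(sig_level, dof, lower.tail = FALSE)

# The same value expressed through the lower tail
crit_lower <- qchisq(1 - sig_level, dof)

# pchisq() inverts qchisq(): the upper-tail area beyond crit is sig_level
tail_area <- pchisq(crit, dof, lower.tail = FALSE)

# With one degree of freedom, the critical value is the squared normal quantile
z_squared <- qnorm(1 - sig_level / 2)^2

print(c(crit = crit, tail_area = tail_area, z_squared = z_squared))
```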


y <- ifelse(sa > 0, 1, 0)   # y = 1 if sa > 0, otherwise y = 0


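# Horseshoe Crab data (Agresti, Section 5.1.3)
dataku <- read.csv(file = "Data.Horseshoe.Crab.csv")

c  <- factor(dataku[, 1])   # color (note: this name shadows base R's c())
s  <- factor(dataku[, 2])   # spine condition
w  <- dataku[, 3]           # width
wt <- dataku[, 4]           # weight
sa <- dataku[, 5]           # number of satellites

# Binary response: y = 1 if the crab has at least one satellite, 0 otherwise
y <- as.integer(sa > 0)

# Logistic regression of y on width
model <- glm(y ~ w, family = binomial(link = "logit"))
summary(model)

# Fitted probabilities ("dugaan" = estimate), rounded to two decimals
dugaan <- round(fitted(model), 2)
data.frame(w, y, dugaan)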


      C  S     W    Wt  Sa
  1   2  3  28.3  3.05   8
  2   3  3  26.0  2.60   4
  3   3  3  25.6  2.15   0
  4   4  2  21.0  1.85   0
  5   2  3  29.0  3.00   1
  6   1  2  25.0  2.30   3
  7   4  3  26.2  1.30   0
  8   2  3  24.9  2.10   0
  9   2  1  25.7  2.00   8
 10   2  3  27.5  3.15   6
 11   1  1  26.1  2.80   5
 12   3  3  28.9  2.80   4
 13   2  1  30.3  3.60   3
...
170   2  3  26.5  2.35   4
171   2  3  26.5  2.75   7
172   3  3  26.1  2.75   3
173   2  2  24.5  2.00   0


         w  sa  y
  1   28.3   8  1
  2   26.0   4  1
  3   25.6   0  0
  4   21.0   0  0
  5   29.0   1  1
  6   25.0   3  1
  7   26.2   0  0
  8   24.9   0  0
...
172   26.1   3  1
173   24.5   0  0


Call:
glm(formula = y ~ w, family = binomial(link = logit))

Coefficients:
            Estimate  Std. Error  z value  Pr(>|z|)
Intercept   -12.3508      2.6287   -4.698  2.62e-06 ***
w             0.4972      0.1017    4.887  1.02e-06 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’

Null deviance    : 225.76 on 172 degrees of freedom
Residual deviance: 194.45 on 171 degrees of freedom
AIC: 198.45
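The numbers in this output can be turned into the usual test statistics by hand. The sketch below copies the rounded values from the summary above and recomputes the Wald statistic for width, the likelihood-ratio statistic from the two deviances, and the estimated odds ratio with a 95% Wald interval. Because the inputs are rounded, the recomputed Wald statistic differs slightly from the printed 4.887. The variable names are chosen here for illustration.

```r
# Rounded values copied from the summary output above
b1        <- 0.4972      # estimated slope for width (w)
se_b1     <- 0.1017      # its standard error
null_dev  <- 225.76      # null deviance, 172 df
resid_dev <- 194.45      # residual deviance, 171 df

# Wald test of H0: slope = 0 (about 4.89 from the rounded inputs)
z_wald <- b1 / se_b1
p_wald <- 2 * pnorm(abs(z_wald), lower.tail = FALSE)

# Likelihood-ratio test: difference in deviances, df = 172 - 171 = 1
G2    <- null_dev - resid_dev          # 31.31
df_lr <- 172 - 171
p_lr  <- pchisq(G2, df_lr, lower.tail = FALSE)
crit  <- qchisq(0.05, df_lr, lower.tail = FALSE)

# Odds ratio for a one-unit increase in width, with a 95% Wald interval
odds_ratio <- exp(b1)
ci_95      <- exp(b1 + c(-1, 1) * qnorm(0.975) * se_b1)

print(c(z_wald = z_wald, G2 = G2, crit = crit, odds_ratio = odds_ratio))
print(ci_95)
```

Both tests lead to the same decision here: the likelihood-ratio statistic of 31.31 far exceeds the critical value 3.841459, so width has a significant effect at the 5% level.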


         w  y  dugaan
  1   28.3  1    0.85
  2   26.0  1    0.64
  3   25.6  0    0.59
  4   21.0  0    0.13
  5   29.0  1    0.89
  6   25.0  1    0.52
  7   26.2  0    0.66
  8   24.9  0    0.51
...
171   26.5  1    0.70
172   26.1  1    0.65
173   24.5  0    0.46
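The fitted probabilities ("dugaan") can be reproduced from the reported coefficients and then used for classification, the third purpose of logistic regression described earlier. The sketch below uses only the rows shown in the table above (the elided rows are not reproduced) and assumes a cutoff of 0.5, which the slides do not specify.

```r
# Rows shown in the table above; the elided rows are not reproduced
shown <- data.frame(
  w      = c(28.3, 26.0, 25.6, 21.0, 29.0, 25.0, 26.2, 24.9, 26.5, 26.1, 24.5),
  y      = c(1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0),
  dugaan = c(0.85, 0.64, 0.59, 0.13, 0.89, 0.52, 0.66, 0.51, 0.70, 0.65, 0.46)
)

# Recompute the fitted probabilities from the reported (rounded) coefficients
shown$recomputed <- round(plogis(-12.3508 + 0.4972 * shown$w), 2)

# Classify as y = 1 when the fitted probability exceeds the assumed cutoff 0.5
cutoff     <- 0.5
shown$pred <- as.integer(shown$dugaan > cutoff)

# Cross-tabulate observed against predicted classes for the displayed rows
print(shown)
print(table(observed = shown$y, predicted = shown$pred))
```

A classification table built from the full data set, rather than these eleven rows, would be needed to judge the model's predictive performance.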


Compare this SAS output with the R output.


1. Use R for the Horseshoe Crabs Revisited data (Agresti, Section 5.1.3).

a. Fit a logistic regression model with Width (x) as the independent variable. Compare the R output with the SAS output in Agresti's book.

b. Fit a logistic regression model with Width (x) and √(Width), i.e. √(x), as the independent variables. Use the Wald test to determine whether both independent variables have a significant effect. What is your conclusion?

c. Compare the models from parts (a) and (b) above. Which model is better? Explain.
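The mechanics of parts (b) and (c) can be sketched as follows. This is not a solution: the data are simulated, and the sample size, predictor range, and true coefficients are all assumptions made here so that the code runs without the crab data file. The model names `m_a` and `m_b` are likewise hypothetical.

```r
set.seed(1)                                # reproducible simulated data, NOT the crab data
n <- 200                                   # assumed sample size
x <- runif(n, 21, 34)                      # assumed predictor range
y <- rbinom(n, 1, plogis(-12 + 0.5 * x))   # assumed true model

# Part (a) style model: x only
m_a <- glm(y ~ x, family = binomial(link = "logit"))

# Part (b) style model: x and sqrt(x)
m_b <- glm(y ~ x + sqrt(x), family = binomial(link = "logit"))

# Wald z tests: one row per coefficient (estimate, std. error, z value, p-value)
wald_b <- coef(summary(m_b))
print(wald_b)

# Part (c) style comparison: AIC and a likelihood-ratio test of the nested models
print(AIC(m_a, m_b))
print(anova(m_a, m_b, test = "Chisq"))
```

Because x and √x are strongly correlated over a narrow range, the individual Wald tests in the larger model can have inflated standard errors, which is one reason to look at the nested-model comparison as well as the coefficient table.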


2. Use R to solve Problems 8.9 (Azen, pp. 212-213).


References

1. Azen, R. and Walker, C.M. (2011). Categorical Data Analysis for the Behavioral and Social Sciences. New York: Routledge, Taylor and Francis Group.

2. Agresti, A. (2002). Categorical Data Analysis, 2nd ed. New York: Wiley.

3. Other relevant references.


Available for download at kusmansadik.wordpress.com


Thank You
