Post on 18-Dec-2015


Functional Methods for Testing Data

[Figure: three-dimensional plot with axes Item 3, Item 4, and Item 29, each running from 0 to 1.]

The data

Each person a = 1,…,N responds to each item i = 1,…,n and makes a binary response u_ai, where 0 indicates “wrong” and 1 indicates “right”.

We want to estimate P_ai, the probability of person a getting item i right.

The model

The response space is an n-dimensional unit hypercube. The data vectors {u_a1,…,u_an} are on the corners.

The vectors of correct response probabilities {P_a1,…,P_an} fall along a smooth curve in response space, the response manifold.

This manifold is, in principle, identifiable from the data, and is therefore not a latent trait.

Item response functions

We can define a smooth charting function that maps each point on the response manifold to a corresponding real number θ. E.g.: arc length.

In this way we establish a metric defining positions on this manifold.

P_i(θ) is the success probability for item i of all those at position θ. This is a smooth function of θ: the item response function for item i.

Three items from an Intro Psych test

The response manifold for 3 test items: 3, 4, and 29.

The curve indicates the possible values of P_ai = P_i(θ).

The circles correspond to 11 fixed values of θ.

[Figure: the response manifold plotted in the unit cube, with axes Item 3, Item 4, and Item 29; the circles are labeled with θ values from -2.5 to 2.5 in steps of 0.5.]

What does “smooth” mean?

If θ has a standard normal distribution, then experience indicates that usually:

• The function P_i(θ) is monotonic.

• It has slopes near 0 for extreme θ values.

• The lower asymptote is positive, and the upper asymptote is one.

• There is only one inflection point.

The three-parameter logistic item response function is smooth

[Figure: a three-parameter logistic item response function, with P(θ) on the vertical axis (0 to 1) and θ on the horizontal axis (-3 to 3).]
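As a rough illustration of why the three-parameter logistic curve counts as smooth in this sense, the sketch below evaluates the standard 3PL form c + (1 - c) / (1 + exp(-a(θ - b))) and checks the properties listed earlier. The parameter values a = 1, b = 0, c = 0.2 and the function names are assumptions chosen for the example, not values from the slides.

```python
import math

def three_pl(theta, a=1.0, b=0.0, c=0.2):
    """3PL item response function: c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def slope(theta, h=1e-5):
    """Central finite-difference approximation to DP(theta)."""
    return (three_pl(theta + h) - three_pl(theta - h)) / (2.0 * h)

# Evaluate on a grid from -3 to 3, matching the range of the plot
grid = [-3.0 + 0.5 * k for k in range(13)]
probs = [three_pl(t) for t in grid]

# Monotonic: each value exceeds the one before it
is_monotone = all(p0 < p1 for p0, p1 in zip(probs, probs[1:]))

print("monotone:", is_monotone)
print("lower asymptote ~", three_pl(-50.0))
print("upper asymptote ~", three_pl(50.0))
print("slopes at -6, 0, 6:", slope(-6.0), slope(0.0), slope(6.0))
```

The slope is largest near θ = b and nearly zero at both extremes, which is the behavior the slides ask of a smooth item response function.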

The challenges

Let the item response functions take whatever shapes are supported by the data.

But control their smoothness in this sense.

Constrain the function values to lie within [0,1].

We want a smooth derivative, too, for the item information function.

The log-odds transformation deals with the [0,1] constraint

We actually estimate

W_i(θ) = log [P_i(θ) / Q_i(θ)], where Q_i(θ) = 1 - P_i(θ).

“Smooth” in terms of W_i(θ) means linear behavior for extreme θ, with small slope on the left and larger positive slope on the right.
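A minimal sketch of how the log-odds transformation removes the [0,1] constraint: any real value of W maps back to a probability strictly between 0 and 1. The function names are hypothetical, and the inverse is written in a numerically stable form that is an implementation choice of this example.

```python
import math

def log_odds(p):
    """W = log(P / Q) with Q = 1 - P; defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_log_odds(w):
    """P = 1 / (1 + exp(-W)), arranged to avoid overflow for large |w|."""
    if w >= 0:
        return 1.0 / (1.0 + math.exp(-w))
    e = math.exp(w)
    return e / (1.0 + e)

# Unconstrained W values always map to valid probabilities
for w in (-30.0, -2.0, 0.0, 3.5, 30.0):
    p = inverse_log_odds(w)
    print(f"W = {w:6.1f}  ->  P = {p:.6f}")
```

Estimating W_i(θ) freely and mapping back with the inverse means no explicit bounds are needed during fitting.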

The log-odds transformation of a three-parameter logistic item response function

[Figure: W(θ) on the vertical axis (-2 to 4) against θ on the horizontal axis (-3 to 3).]

A three-dimensional model for a smooth log-odds function

$$W(\theta) = c_1 + c_2\theta + c_3 \log(e^{\theta} + 1)$$

The function log(e^θ + 1) has the desired behavior at the extremes. The other two terms add vertical shift and tilt as required.
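The behavior at the extremes can be checked numerically. The sketch below assumes the model W(θ) = c1 + c2 θ + c3 log(e^θ + 1) and evaluates its slope, which is c2 + c3 e^θ / (1 + e^θ): close to c2 on the far left and close to c2 + c3 on the far right. The coefficient values and the helper name `softplus` are choices made for this example only.

```python
import math

def softplus(theta):
    """log(exp(theta) + 1), computed stably for large |theta|."""
    return max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))

def w3(theta, c1, c2, c3):
    """Three-dimensional smooth log-odds model."""
    return c1 + c2 * theta + c3 * softplus(theta)

def dw3(theta, c2, c3):
    """Slope of the model: c2 + c3 * e^theta / (1 + e^theta)."""
    return c2 + c3 / (1.0 + math.exp(-theta))

c1, c2, c3 = -0.5, 0.1, 1.2   # illustrative values, not estimates from data
print("left slope  ~", dw3(-40.0, c2, c3))   # approaches c2
print("right slope ~", dw3(40.0, c2, c3))    # approaches c2 + c3
```

With c2 small and c3 positive, this gives the small slope on the left and the larger positive slope on the right that the slides describe.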

Comparing the 3D model and the 3PL log-odds functions

[Figure: W(θ) on the vertical axis (-2 to 4) against θ on the horizontal axis (-3 to 3) for the two models; legend: 3D, 3PL.]

B-spline expansions for W(θ)

But, in fact, we will actually estimate our log-odds functions by

• expanding W(θ) in terms of a set of K B-spline basis functions, while

• smoothing these expansions towards these simpler three-dimensional models.
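To make the expansion concrete, here is a small pure-Python sketch of a B-spline basis evaluated with the Cox-de Boor recursion, and of W(θ) as a weighted sum of the basis values. The order (4, i.e. cubic), the range [-3, 3], and the nine equally spaced interior knots are assumptions of this example; they happen to give K = 13 basis functions, but the slides do not state the knot placement.

```python
def bspline_basis(x, knots, order):
    """Values at x of all B-spline basis functions of the given order (Cox-de Boor)."""
    n_int = len(knots) - 1
    # Order 1: indicator functions of the half-open knot intervals
    b = [1.0 if knots[j] <= x < knots[j + 1] else 0.0 for j in range(n_int)]
    # Close the last non-empty interval on the right so x == knots[-1] is covered
    if x == knots[-1]:
        last = max(j for j in range(n_int) if knots[j] < knots[j + 1])
        b[last] = 1.0
    for k in range(2, order + 1):
        nxt = []
        for j in range(len(knots) - k):
            left_den = knots[j + k - 1] - knots[j]
            right_den = knots[j + k] - knots[j + 1]
            left = (x - knots[j]) / left_den * b[j] if left_den > 0 else 0.0
            right = (knots[j + k] - x) / right_den * b[j + 1] if right_den > 0 else 0.0
            nxt.append(left + right)
        b = nxt
    return b  # length len(knots) - order

ORDER = 4
interior = [-3.0 + 0.6 * k for k in range(1, 10)]        # 9 interior knots
knots = [-3.0] * ORDER + interior + [3.0] * ORDER         # 17 knots -> 13 basis functions

def w_of_theta(theta, coefs):
    """W(theta) = sum_k coefs[k] * B_k(theta)."""
    return sum(c * b for c, b in zip(coefs, bspline_basis(theta, knots, ORDER)))

print(len(bspline_basis(0.3, knots, ORDER)), "basis functions")
print("sum of basis values at 0.3:", sum(bspline_basis(0.3, knots, ORDER)))
```

The basis values at any θ in the range are non-negative and sum to one, so the coefficients act as local control values for W(θ).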

Fitting the data

We use maximum marginal likelihood estimation, using the EM algorithm to maximize

$$\log ML = \sum_a \log \int \prod_i P_i(\theta)^{u_{ai}} Q_i(\theta)^{1-u_{ai}} \, g(\theta) \, d\theta$$

where g(θ) is a prior density on θ, often taken to be the standard normal.

Maximization is with respect to the nK coefficients defining the B-spline expansions of the log-odds functions.
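The criterion itself can be evaluated with simple numerical quadrature. The sketch below computes the log marginal likelihood, the sum over persons of the log of the integral over θ of the product over items of P_i(θ)^u Q_i(θ)^(1-u) times a standard normal g(θ). It only evaluates the criterion; the EM maximization is not shown. The trapezoidal rule on [-5, 5] and all names are assumptions of this example.

```python
import math

def log_marginal_likelihood(responses, item_probs, lo=-5.0, hi=5.0, n_points=201):
    """Log marginal likelihood with a standard normal prior on theta.

    responses:  list of 0/1 lists, one per person.
    item_probs: list of functions theta -> P_i(theta), one per item.
    """
    step = (hi - lo) / (n_points - 1)
    thetas = [lo + k * step for k in range(n_points)]
    weights = [step] * n_points
    weights[0] = weights[-1] = step / 2.0            # trapezoidal end weights
    prior = [math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi) for t in thetas]
    probs = [[p(t) for p in item_probs] for t in thetas]

    total = 0.0
    for u in responses:
        integral = 0.0
        for w, g, p_row in zip(weights, prior, probs):
            like = 1.0
            for u_i, p_i in zip(u, p_row):
                like *= p_i if u_i == 1 else 1.0 - p_i
            integral += w * like * g
        total += math.log(integral)
    return total

# Two hypothetical logistic items and three response patterns
items = [lambda t: 1.0 / (1.0 + math.exp(-t)),
         lambda t: 1.0 / (1.0 + math.exp(-(t - 0.5)))]
data = [[1, 1], [1, 0], [0, 0]]
print(log_marginal_likelihood(data, items))
```

In a full fit, each P_i would come from the inverse log-odds of a B-spline expansion, and the coefficients would be updated to increase this value.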

What about smoothness?

We have defined smoothness here in a less orthodox fashion; it isn’t defined only in terms of the second derivative.

Instead, we define smooth in terms of the size of

$$LW = -\frac{1 - e^{\theta}}{1 + e^{\theta}} D^2 W + D^3 W$$

How did you come up with this?

If W(θ) conforms exactly to the three-dimensional smooth model, then LW(θ) = 0.

In other words, if

$$W(\theta) = c_1 + c_2\theta + c_3 \log(e^{\theta} + 1)$$

then

$$D^3 W = \frac{1 - e^{\theta}}{1 + e^{\theta}} D^2 W$$
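The relation between the third and second derivatives can be verified by differentiating the three-dimensional model directly. The steps below are a worked check, assuming the model W(θ) = c1 + c2 θ + c3 log(e^θ + 1):

```latex
\begin{align*}
W(\theta)    &= c_1 + c_2\theta + c_3\log(e^{\theta}+1) \\
DW(\theta)   &= c_2 + c_3\,\frac{e^{\theta}}{1+e^{\theta}} \\
D^2W(\theta) &= c_3\,\frac{e^{\theta}}{(1+e^{\theta})^{2}} \\
D^3W(\theta) &= c_3\,\frac{e^{\theta}(1-e^{\theta})}{(1+e^{\theta})^{3}}
              = \frac{1-e^{\theta}}{1+e^{\theta}}\,D^2W(\theta)
\end{align*}
```

The constants c1 and c2 vanish after two differentiations, and c3 cancels in the ratio, so the relation holds for every member of the family.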

Our strategy is to define a low-dimensional family of prototype functions that capture what we mean by “smooth.”

Then we represent this family by a linear differential equation.

This differential equation defines a measure of “roughness”, which we penalize.

The more we penalize this kind of roughness, the more we force the fitted functions to be smooth.

In general, if we begin with a linear model of dimension m, we can find a linear differential equation of order m such that all versions of this model will satisfy:

$$D^m W(\theta) = -b_0(\theta) W(\theta) - b_1(\theta) DW(\theta) - \dots - b_{m-1}(\theta) D^{m-1} W(\theta)$$

for some choice of coefficient functions b_j(θ).

We change the equation to a roughness penalty by converting it to operator form:

$$LW(\theta) = b_0(\theta) W(\theta) + b_1(\theta) DW(\theta) + \dots + b_{m-1}(\theta) D^{m-1} W(\theta) + D^m W(\theta) = 0.$$

The roughness penalty

$$PEN(W) = \int [LW(\theta)]^2 \, d\theta$$

measures the departure of W(θ) from this smooth model.
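A numerical sketch of the penalty, assuming the operator LW = D³W - ((1 - e^θ)/(1 + e^θ)) D²W for the three-dimensional model and the penalty defined as the integral of [LW(θ)]². Derivatives are approximated by central differences and the integral by the trapezoidal rule over [-3, 3]; these numerical choices, the test functions, and the names are assumptions of this example.

```python
import math

def lw(w, theta, h=1e-2):
    """L W = D^3 W - ((1 - e^theta) / (1 + e^theta)) * D^2 W, via central differences."""
    d2 = (w(theta + h) - 2.0 * w(theta) + w(theta - h)) / h**2
    d3 = (w(theta + 2 * h) - 2.0 * w(theta + h)
          + 2.0 * w(theta - h) - w(theta - 2 * h)) / (2.0 * h**3)
    coef = (1.0 - math.exp(theta)) / (1.0 + math.exp(theta))
    return d3 - coef * d2

def pen(w, lo=-3.0, hi=3.0, n_points=121):
    """Trapezoidal approximation to the integral of [L W(theta)]^2 over [lo, hi]."""
    step = (hi - lo) / (n_points - 1)
    values = [lw(w, lo + k * step) ** 2 for k in range(n_points)]
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def smooth(t):
    """A member of the three-dimensional family (illustrative coefficients)."""
    return -0.5 + 0.2 * t + 1.5 * math.log(math.exp(t) + 1.0)

def wiggly(t):
    """The same function with an added oscillation."""
    return smooth(t) + 0.3 * math.sin(3.0 * t)

print("penalty, smooth model:", pen(smooth))
print("penalty, wiggly curve:", pen(wiggly))
```

The penalty is essentially zero for the member of the smooth family and large for the oscillating curve, which is what lets it act as a customized roughness measure.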

Roughness-penalized log marginal likelihood

Consequently, we actually maximize

$$\log ML - \lambda \sum_i PEN(W_i)$$

The smoothing parameter λ controls the amount of smoothness in the W_i(θ); the larger it is, the more these will look like the three-dimensional versions.

Some examples

Here are three estimates of the item response functions for items 3, 4, 29, and 96 for an introductory psychology test.

The test had 100 items, and was given to 379 students.

Each function W(θ) is defined by an expansion in terms of 13 B-spline basis functions.

λ = 0

[Figure: four panels (Item 3, Item 4, Item 29, Item 96), each showing the estimated item response function (0 to 1) against θ (-2 to 2).]

λ = 0.01

[Figure: four panels (Item 3, Item 4, Item 29, Item 96), each showing the estimated item response function (0 to 1) against θ (-2 to 2).]

λ = 50

[Figure: four panels (Item 3, Item 4, Item 29, Item 96), each showing the estimated item response function (0 to 1) against θ (-2 to 2).]

What does θ mean?

We have fallen into the habit of calling θ a “latent trait score”.

Actually, it is the value of a function that is chosen more or less arbitrarily to map position along the response manifold.

The assumption of a standard normal distribution is pure convention.

We can choose otherwise.

What charting functions would be more useful?

Three choices of charting functions are especially interesting, and none are “latent” in any sense.

Each leads to interesting diagnostic statistics and graphics.

The arc length charting

Arc length s measures the Euclidean distance traveled along the manifold from its origin at θ_0 to a given position θ:

$$s(\theta) = \int_{\theta_0}^{\theta} \sqrt{\sum_i [DP_i(z)]^2} \, dz$$
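Arc length along the manifold can be approximated numerically by integrating the square root of the summed squared derivatives of the item response functions from θ_0 to θ. In the sketch below the item curves are two-parameter logistic stand-ins with made-up parameters, derivatives are taken by central differences, and the integral uses the trapezoidal rule; all of these are assumptions of the example.

```python
import math

def logistic_item(a, b):
    """A two-parameter logistic curve, used here only as a stand-in for P_i."""
    return lambda theta: 1.0 / (1.0 + math.exp(-a * (theta - b)))

def arc_length(items, theta0, theta, n_steps=2000, h=1e-5):
    """Trapezoidal approximation to the integral of sqrt(sum_i DP_i(z)^2) from theta0 to theta."""
    def speed(z):
        return math.sqrt(sum(((p(z + h) - p(z - h)) / (2.0 * h)) ** 2 for p in items))
    step = (theta - theta0) / n_steps
    total = 0.5 * (speed(theta0) + speed(theta))
    total += sum(speed(theta0 + k * step) for k in range(1, n_steps))
    return total * step

items = [logistic_item(1.0, -1.0), logistic_item(1.5, 0.0), logistic_item(0.8, 1.0)]
for t in (-2.0, 0.0, 2.0):
    print(f"theta = {t:5.1f}  ->  s = {arc_length(items, -3.0, t):.4f}")
```

Because the integrand is non-negative, s increases with θ, and it does not depend on how θ itself was scaled; only the shape of the curve in response space matters.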

Item discrimination in arc length metric

One useful property of arc length is

$$\sum_i [DP_i(s)]^2 = 1$$

Each squared item discrimination is a proportion of total test discrimination, and therefore has a familiar frame of reference.
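The unit-sum property follows from the chain rule. The short derivation below assumes that arc length is defined as the integral of the square root of the summed squared derivatives DP_i(θ), and that this quantity is positive so the change of variable is valid:

```latex
\begin{align*}
\frac{ds}{d\theta} &= \sqrt{\sum_j [DP_j(\theta)]^2}, \\
\frac{dP_i}{ds} &= \frac{DP_i(\theta)}{ds/d\theta}
                 = \frac{DP_i(\theta)}{\sqrt{\sum_j [DP_j(\theta)]^2}}, \\
\sum_i \left(\frac{dP_i}{ds}\right)^{2}
  &= \frac{\sum_i [DP_i(\theta)]^2}{\sum_j [DP_j(\theta)]^2} = 1 .
\end{align*}
```

Each squared slope with respect to s is therefore a non-negative share of a total that is always one.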

Expected score charting

Assuming that expected score is monotonically related to θ (there aren’t too many items like 96), then

$$\sum_i P_i(\theta)$$

provides a metric that is familiar to users and easy for them to interpret.

Expected score is already used extensively as a basis for assessing differential item functioning (DIF).
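A minimal sketch of the expected score charting: the chart value at θ is the sum of the item success probabilities, and it is usable as a metric only if that sum increases with θ. The logistic item curves, their parameters, and the function names below are hypothetical and serve only to illustrate the monotonicity check.

```python
import math

def logistic_item(a, b):
    """A two-parameter logistic curve, used here only as a stand-in for P_i."""
    return lambda theta: 1.0 / (1.0 + math.exp(-a * (theta - b)))

def expected_score(items, theta):
    """Sum of the item success probabilities at theta."""
    return sum(p(theta) for p in items)

def is_monotone_chart(items, grid):
    """True if the expected score strictly increases along the grid."""
    scores = [expected_score(items, t) for t in grid]
    return all(s0 < s1 for s0, s1 in zip(scores, scores[1:]))

grid = [-3.0 + 0.25 * k for k in range(25)]
items = [logistic_item(1.0, -1.0), logistic_item(1.5, 0.0), logistic_item(0.8, 1.0)]

# A hypothetical item whose success probability falls as theta rises
decreasing_item = lambda theta: 1.0 / (1.0 + math.exp(theta))

print("increasing items only:", is_monotone_chart(items, grid))
print("decreasing item alone:", is_monotone_chart([decreasing_item], grid))
```

A few non-monotone items do not break the charting as long as the remaining items dominate the sum, which is the condition the slide states informally.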

ACT Math test for males and females

Three items from a 60 item math test.

Around 2000 examinees.

The male and female response manifolds differ.

[Figure: male (M) and female (F) response manifolds plotted in the unit cube, with axes Item 14, Item 17, and Item 19.]

Differential item functioning for an ACT Math test item

[Figure: Item 17, probability of success (0.1 to 1) against expected score (0 to 60); legend: Male, Female.]

Total change charting

The following total change in probability of success measure is closely related to arc length:

$$c(\theta) = \int_{\theta_0}^{\theta} \sum_i |DP_i(z)| \, dz$$

Some general lessons

Fitting functional models to non-functional data is relatively straightforward.

But we do need to transform constrained functions into unconstrained versions.

We can define smoothness or roughness in customized ways that capture the default or baseline behavior of our estimated functions.

“Latent trait models” aren’t really latent at all.

They express the idea of a one-dimensional subspace for modeling the data.

Differential geometry gives us the appropriate mathematical tools.

There is room for creativity in choosing charting functions.

Looking ahead

There is an intimate connection between designer roughness penalties and the estimation of differential equations from data.

We will use discrete data to estimate a differential equation that describes the data.

References

More technical details on fitting test data with functional models are in

Rossi, N., Wang, X. and Ramsay, J. O. (2002) Nonparametric item response function estimates with the EM algorithm. Journal of Educational and Behavioral Statistics, 27, 291-317.