Item formulation in psychological testing
DESCRIPTION
Designing a psychological test
TRANSCRIPT
CONSTRUCT VALIDITY
3 ASPECTS
SUBSTANTIVE VALIDITY
STRUCTURAL VALIDITY
EXTERNAL VALIDITY
ACCORDING TO LOEVINGER, THESE THREE ASPECTS ARE MUTUALLY EXCLUSIVE, EXHAUSTIVE OF THE POSSIBLE LINES OF EVIDENCE FOR CONSTRUCT VALIDITY, AND MANDATORY. THEY ARE CLOSELY RELATED TO THREE STAGES IN THE TEST CONSTRUCTION PROCESS: CONSTITUTION OF THE POOL OF ITEMS, ANALYSIS OF THE INTERNAL STRUCTURE OF THE POOL OF ITEMS AND CONSEQUENT SELECTION OF ITEMS TO FORM THE SCORING KEY, AND CORRELATION OF TEST SCORES WITH CRITERIA AND OTHER VARIABLES
SUBSTANTIVE VALIDITY PHASE
IS CENTERED ON THE TASKS OF CONSTRUCT CONCEPTUALIZATION AND
DEVELOPMENT OF THE INITIAL ITEM POOL
review of related literature - important because, if the review reveals that there are already good, psychometrically sound measures of the construct, the scale developer must ask whether the new measure is in fact necessary and, if so, why
however, the existence of psychometrically sound measures of the construct does not necessarily preclude the development of a new instrument
construct conceptualization
- the 2nd important function of the review is to develop a clear conceptualization of the target construct
the literature review will reveal alternative conceptualizations of the construct, related constructs
that are potentially important, and potential pitfalls to consider in the scale development process
development of the initial item pool
very critical step in the scale construction
no existing data analytic technique can
remedy serious deficiencies in an item pool
the primary consideration during this step is to generate items sampling ALL content that is potentially relevant to the target construct
overinclusiveness - should characterize
the initial item pool
relevance - refers to the appropriateness of a measure's items for the target construct. when applied to the scale construction process, this principle suggests that all
items in the finished measure should fall within the boundaries of the target construct
representativeness - refers to the degree to which the item pool adequately samples content from all important aspects of the target construct
- item pool should contain items reflecting all content
areas relevant to the target construct
- item pool should include items reflecting all levels of the trait that need to be assessed
classical test theory - usually favors selection of items with
MODERATE endorsement probabilities
item response theory - offers valuable tools for quantifying the "trait level" of the items in the pool
item writing
item clarity
response format
use of simple and straightforward language
avoid trendy or colloquial expressions
avoid complex or convoluted items
writing items with stems (e.g., "i worry about ....")
writing a mixture of positively and negatively keyed items
context neutral
item writing
response format
dichotomous
polytomous
use of anchoring schemes based on:
agreement - strongly disagree to strongly agree
perceived similarity - uncharacteristic of me to characteristic of me
frequency - never to always
pilot testing
helps identify potential problems such as
confusing items or instructions, objectionable content,
or the lack of items in an important content area
STRUCTURAL VALIDITY PHASE
PSYCHOMETRIC EVALUATION OF ITEMS AND PROVISIONAL SCALE DEVELOPMENT
LOEVINGER defined the structural component of construct validity as the "extent to which structural relations between test
items parallel the structural manifestations of the trait being measured"
structural relations between test and nontest manifestations of the target construct should be parallel to the extent
possible - structural fidelity
item selection strategies
rational theoretical approach - scale developer simply writes items that
APPEAR consistent with his or her particular theoretical
construct
pitfalls - discriminant validity suffers
replicated rational selection (MMPI-2) - involves asking many trained raters (who are given a detailed definition of the target construct) to select items from a pool that most clearly tap the construct, given their interpretations of the definition and the items. Only items that achieve a high degree of consensus make the final cut
criterion keyed item selection - items are selected for a scale based solely on their ability to discriminate between individuals from a "normal" group and those
from a prescribed criterion group. Item content is therefore irrelevant
pitfalls - atheoretical; fails to help advance psychological theory in a meaningful way
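The criterion-keyed strategy described above can be sketched in a few lines. This is only an illustration under stated assumptions: the item names, the 0/1 responses, and the `min_diff` cutoff are all hypothetical, and a real application would rely on significance tests and cross-validation rather than a raw difference in endorsement rates.

```python
from typing import Dict, List


def endorsement_rate(responses: List[int]) -> float:
    """Proportion of respondents who endorsed the item (responses coded 0/1)."""
    return sum(responses) / len(responses)


def criterion_key(normal: Dict[str, List[int]],
                  criterion: Dict[str, List[int]],
                  min_diff: float = 0.25) -> Dict[str, str]:
    """Keep items whose endorsement rates differ between groups by at least min_diff.

    Item content is never inspected; only group separation matters.
    The keyed direction is "true" when the criterion group endorses more often.
    """
    key = {}
    for item in normal:
        diff = endorsement_rate(criterion[item]) - endorsement_rate(normal[item])
        if abs(diff) >= min_diff:
            key[item] = "true" if diff > 0 else "false"
    return key


# Hypothetical responses (1 = endorsed) for three items in each group.
normal_group = {"q1": [0, 0, 1, 0], "q2": [1, 0, 1, 0], "q3": [1, 1, 1, 0]}
criterion_group = {"q1": [1, 1, 1, 0], "q2": [1, 0, 0, 1], "q3": [0, 0, 1, 0]}

scoring_key = criterion_key(normal_group, criterion_group)
print(scoring_key)
```

In this toy data, `q2` is dropped because both groups endorse it equally often, regardless of what the item says.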
internal consistency approach to item selection - the goal is to identify relatively
homogeneous scales that demonstrate good discriminant validity
accomplished with some variant of factor or component analysis, often combined with classical and modern
psychometric approaches to hone factor based scales
data collection
psychometric evaluation of items
factor analysis - extremely useful to the scale developer who wishes to create homogeneous scales that exhibit good discriminant validity
ultimately, however, the most important criterion for choosing a factor structure is the psychological meaningfulness of the resultant factors
good candidate items are those that load at least moderately on the primary factor and minimally on
the other factors
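The loading rule above can be expressed as a simple filter. The sketch below assumes a loading matrix has already been estimated elsewhere; the cutoffs `primary_min` and `cross_max` and the example loadings are hypothetical choices for illustration, not values given in these notes.

```python
from typing import Dict, List


def select_items(loadings: Dict[str, List[float]],
                 primary_min: float = 0.40,
                 cross_max: float = 0.25) -> Dict[str, int]:
    """Return {item: index of its primary factor} for items passing both cutoffs.

    An item is kept when its largest absolute loading is at least primary_min
    and every other absolute loading is at most cross_max.
    """
    selected = {}
    for item, row in loadings.items():
        abs_row = [abs(x) for x in row]
        primary = max(range(len(abs_row)), key=abs_row.__getitem__)
        others = [v for i, v in enumerate(abs_row) if i != primary]
        if abs_row[primary] >= primary_min and all(v <= cross_max for v in others):
            selected[item] = primary
    return selected


# Hypothetical two-factor loadings for four candidate items.
example_loadings = {
    "item_1": [0.62, 0.05],  # clean loading on factor 0
    "item_2": [0.48, 0.41],  # cross-loads on both factors
    "item_3": [0.10, 0.55],  # clean loading on factor 1
    "item_4": [0.22, 0.18],  # loads weakly everywhere
}

kept = select_items(example_loadings)
print(kept)
```

A filter like this only narrows the pool; whether the resulting factors are psychologically meaningful remains a judgment call for the scale developer.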
internal consistency and homogeneity
once a reduced pool of candidate items has been identified through factor analysis, additional item level analyses should be conducted to hone the scales
the goal at this stage is to identify a set of items whose intercorrelations match the internal organization of the
target construct
estimator - coefficient alpha, which is a function of 2 parameters:
average interitem correlation
number of items on the scale
alternatives:
examination of interitem correlations (.15 to .50)
conducting confirmatory factor analysis
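To see how alpha depends on those two parameters, here is a minimal sketch using the standardized form of alpha, k * r / (1 + (k - 1) * r), where k is the number of items and r is the average interitem correlation. Using the standardized form is an assumption made for illustration (raw alpha is computed from item variances and covariances), and the correlation matrix below is hypothetical.

```python
from typing import List


def mean_interitem_r(corr: List[List[float]]) -> float:
    """Average of the off-diagonal (upper triangle) correlations."""
    k = len(corr)
    off_diag = [corr[i][j] for i in range(k) for j in range(i + 1, k)]
    return sum(off_diag) / len(off_diag)


def standardized_alpha(mean_r: float, n_items: int) -> float:
    """Standardized coefficient alpha from average interitem r and scale length."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# Hypothetical correlation matrix for a 3-item scale.
corr = [
    [1.0, 0.2, 0.3],
    [0.2, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]

r_bar = mean_interitem_r(corr)
print(round(r_bar, 2), round(standardized_alpha(r_bar, len(corr)), 3))

# Same average correlation, longer scales: alpha rises with length alone.
for k in (10, 20):
    print(k, round(standardized_alpha(0.30, k), 3))
```

With the average correlation held at .30, alpha is about .81 for 10 items and about .90 for 20, which is why a high alpha by itself does not show that a scale is homogeneous and why the interitem correlations are worth inspecting directly.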
item response theory
parameters for:
item difficulty
item discrimination
item difficulty - also known as threshold or location; refers to the point along the trait
continuum at which a given item has a 50 percent probability of being endorsed in the keyed direction
discrimination - reflects the degree of psychometric precision, or information, that an item provides at its difficulty level
the standard error of measurement of a scale is equal to the inverse square root of
information at every point along the trait continuum
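The relation between difficulty, discrimination, information, and standard error can be made concrete with a small sketch. It assumes a two-parameter logistic (2PL) model, which these notes do not name explicitly; `a` stands for discrimination, `b` for difficulty, and the item parameters in the example scale are hypothetical.

```python
import math
from typing import List, Tuple


def p_endorse(theta: float, a: float, b: float) -> float:
    """2PL probability of endorsing the item in the keyed direction."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Item information under the 2PL model: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def standard_error(theta: float, items: List[Tuple[float, float]]) -> float:
    """Standard error at theta: inverse square root of summed item information."""
    total_info = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total_info)


# At theta equal to the item's difficulty, the endorsement probability is 0.5.
print(p_endorse(theta=0.5, a=1.2, b=0.5))

# Hypothetical 3-item scale given as (discrimination, difficulty) pairs.
scale = [(1.5, -1.0), (1.0, 0.0), (2.0, 1.0)]
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(standard_error(theta, scale), 3))
```

Because information peaks at each item's difficulty, the standard error is smallest where the item difficulties cluster and grows toward the extremes of the trait continuum.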
IRT methods have been used to study item bias, or differential item functioning (DIF). the basic goal of DIF analysis is to identify items that yield significantly different difficulty or discrimination parameters across groups of interest, after equating groups with respect to the trait being measured
an IRT application that is potentially relevant to personality is Computerized Adaptive Testing (CAT),
in which items are individually tailored to the trait level of the respondent. A typical CAT selects and administers only those items that provide the most psychometric information at a given ability or trait level, eliminating the need to present items that have a very low or very high likelihood of being endorsed or answered correctly given a particular respondent's trait or ability level
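The item-selection step of a CAT can be sketched as "pick the unadministered item with the most information at the current trait estimate." The sketch assumes a 2PL information function and a hypothetical three-item bank; a full CAT would also re-estimate the trait level after every response and apply a stopping rule, both omitted here.

```python
import math
from typing import Dict, Optional, Set, Tuple


def item_information(theta: float, a: float, b: float) -> float:
    """2PL item information: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def next_item(theta_hat: float,
              bank: Dict[str, Tuple[float, float]],
              administered: Set[str]) -> Optional[str]:
    """Return the unused item that is most informative at theta_hat, or None."""
    candidates = [name for name in bank if name not in administered]
    if not candidates:
        return None
    return max(candidates, key=lambda name: item_information(theta_hat, *bank[name]))


# Hypothetical bank: name -> (discrimination, difficulty).
bank = {"easy": (1.0, -2.0), "medium": (1.0, 0.0), "hard": (1.0, 2.0)}

first = next_item(0.0, bank, administered=set())
# Suppose the trait estimate moves up to 1.5 after the first response.
second = next_item(1.5, bank, administered={first})
print(first, second)
```

With equal discriminations, the most informative item is simply the one whose difficulty lies closest to the current estimate, so items that would almost certainly be endorsed or rejected are never presented.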
EXTERNAL VALIDITY PHASE
VALIDATION AGAINST TEST AND NONTEST CRITERIA
concerned with two basic aspects of construct validity:
convergent and discriminant validity
criterion related validity
whereas the structural phase primarily involves analyses of the items WITHIN the new measure,
the goal of the external phase is to examine whether relations between the new measure and the
important test and nontest criteria are congruent with one's theoretical understanding of the target
construct and its place in the nomological net
convergent validity - extent to which a measure correlates with other measures of the same construct
discriminant validity - extent to which a measure does not correlate with measures of other constructs that are theoretically or empirically distinct
can be assessed with a MULTITRAIT-MULTIMETHOD (MTMM) matrix - in such a matrix
multiple measures of at least two constructs are correlated and arranged to highlight several important
aspects of convergent and discriminant validity
absolute convergent correlations will depend on specific aspects of the measures being correlated
concept of method variance - suggests that self ratings of the same construct generally will correlate more strongly than will self ratings and peer ratings
heterotrait-heteromethod triangles - correlations above and below the convergent correlations
convergent validity correlations
should be higher than the correlations obtained between that
variable and any
other variable having neither trait nor method in common
convergent correlations generally should be higher than the correlations in the heterotrait-monomethod triangles that appear above and to the right of the heteromethod block
the same pattern of trait interrelationships should be shown in all of the heterotrait
triangles
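The comparisons listed above can be checked mechanically once each correlation in the matrix is labeled by trait and method. The following is a minimal sketch, not a full MTMM analysis: the traits ("A", "B"), the methods ("self", "peer"), and every correlation value are hypothetical, and the check covers only the rule that convergent correlations should exceed the heterotrait correlations involving the same measures.

```python
from typing import Dict, Tuple

Measure = Tuple[str, str]  # (trait, method)
Pair = Tuple[Measure, Measure]


def classify(m1: Measure, m2: Measure) -> str:
    """Label a correlation between two distinct measures by its MTMM block."""
    same_trait = m1[0] == m2[0]
    same_method = m1[1] == m2[1]
    if same_trait and not same_method:
        return "convergent"
    if same_method:
        return "heterotrait-monomethod"
    return "heterotrait-heteromethod"


def check_convergent(corr: Dict[Pair, float]) -> Dict[Pair, bool]:
    """For each convergent correlation, test whether it exceeds every
    heterotrait correlation that shares one of its two measures."""
    results = {}
    for (a, b), r in corr.items():
        if classify(a, b) != "convergent":
            continue
        rivals = [abs(r2) for (c, d), r2 in corr.items()
                  if classify(c, d) != "convergent" and ({a, b} & {c, d})]
        results[(a, b)] = all(r > v for v in rivals)
    return results


# Hypothetical correlations for two traits measured by two methods.
A_self, A_peer = ("A", "self"), ("A", "peer")
B_self, B_peer = ("B", "self"), ("B", "peer")
mtmm = {
    (A_self, A_peer): 0.55,  # convergent
    (B_self, B_peer): 0.50,  # convergent
    (A_self, B_self): 0.30,  # heterotrait-monomethod
    (A_peer, B_peer): 0.25,  # heterotrait-monomethod
    (A_self, B_peer): 0.10,  # heterotrait-heteromethod
    (A_peer, B_self): 0.12,  # heterotrait-heteromethod
}

outcome = check_convergent(mtmm)
print(outcome)
```

In this toy matrix both convergent correlations pass; the remaining criterion, that the same pattern of trait interrelationships appears in every heterotrait triangle, is not covered by this sketch.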
criterion related validity
concurrent validity - involves relating a measure to criterion evidence collected at the same time as the measure itself
predictive validity- involves associations with criteria that are assessed at some point in the future
GOALS of criterion related validity:
confirm the new measure's place in the
nomological net
provide an empirical basis for making
inferences from test scores