
AT-SAT: A Test Battery for Screening

Air Traffic Controller Applicants

Gordon Waugh
Laurie Wise

Human Resources Research Organization (HumRRO)

Topics Today

Background of AT-SAT
Overview of AT-SAT tests
Air Traffic Scenarios Test
Score weighting methods
Setting the cut score

Background of AT-SAT

ATCS Selection Background

Previous Selection System
Written aptitude test battery
Academy screening (9 weeks)
On-the-job training

Issues
Test utility after 12-15 years of use
Academy screening cost & fairness

Air Traffic Selection and Training (AT-SAT) Validation Research

Integrated validation study
Job-performance criterion measures
8 new computerized predictor tests (6 hrs)
Concurrent validation study on en route controllers

Overview of AT-SAT Tests

Dial Reading
Letter Factory
AT Scenario
Angles
EQ
Applied Math
Analogies
Scan

Dial Reading Test

[Figure: a temperature dial to be read]

1) 22  2) 24  3) 28  4) 30  5) 45


Letter Factory Test

[Figure: Letter Factory display showing conveyor Belts A-D, the Order Boxes area, the Loading Area, the Quality Control area, the Box Storage Area, an available zone, and lettered boxes]

Which boxes should be in the loading area in order to correctly place all the letters on the belts?

1. One purple box and one orange box
2. One green box, one purple box, and one orange box
3. Two purple boxes, one green box, and one orange box
4. Two green boxes, one orange box, and one purple box

Letter Factory Test - Situational Awareness Questions

Air Traffic Scenarios Test

Angles Test

This test includes two different types of questions:

The first presents a picture of an angle and asks you the measure of that angle in degrees (from 1 to 360).

What is the measure of this angle?

1) 90°  2) 10°  3) 125°  4) 190°

This test includes two different types of questions:

The second provides you with the measure of an angle (from 1 to 360 degrees). It then asks you to select an angle that represents that measure.

Which of the following represents a 10 degree angle?

[Figure: four answer choices, each showing a drawn angle]

Experience Questionnaire

1 - Definitely True
2 - Somewhat True
3 - Neither True nor False
4 - Somewhat False
5 - Definitely False

Your emotions have sometimes prevented you from solving a difficult problem.

Applied Math

The distance from point A to point B is 560 miles. If the aircraft left point A at 8:00 “Zulu”, and flew at 400 kts, what time would the aircraft cross point B?

A. 8:56  B. 9:02  C. 9:24  D. 10:02
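Worked out (treating the 560 miles as nautical miles so the units match the speed in knots): 560 / 400 = 1.4 hours = 1 hour 24 minutes, so the aircraft crosses point B at 8:00 + 1:24 = 9:24 Zulu, answer C.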

Analogies Test

Water: Liquid Ice: ?

(1) Gas  (2) Cube  (3) Solid  (4) Oxygen  (5) Freeze

Visual Analogies:

[Figure: visual analogy item with five pictured answer choices]

Type the identification numbers contained in the data blocks whose lower-line numbers fall beyond the range (360-710):

[Figure: simulated display of data blocks, e.g., V41610, B12810, P45580, T65120, Y85710, K23250, F75560, C20390]

Scan Test

Air Traffic Scenarios Test

Simulation Overview

Description of simulation
Design considerations
Scoring
Instructions and practice

ATST: Description of Simulation

• Display of airspace sector

• Control several planes

• Land at two airports

• Exit at four locations

AT Scenarios in a Nutshell

[Figure: sample AT Scenarios display showing plane data blocks such as F2e, F4D, S1A, F3f, F4B, and M4C]

Plane Icon callouts for the example data block M2B:
Exit/Airport B
Altitude Level 2
Speed Medium
Heading
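If the callouts above are read as a positional code (speed letter, altitude digit, exit/airport letter), a data block such as M2B could be unpacked as in the sketch below; the field order and the Slow/Fast labels are inferences for illustration, not stated in the source.

# Illustrative only: field order inferred from the M2B example above.
SPEEDS = {"S": "Slow", "M": "Medium", "F": "Fast"}

def parse_data_block(block):
    """Split a three-character ATST data block such as 'M2B' into its fields."""
    speed, altitude, destination = block[0], block[1], block[2]
    return {
        "speed": SPEEDS.get(speed, speed),
        "altitude_level": int(altitude),
        "exit_or_airport": destination,
    }

print(parse_data_block("M2B"))
# -> {'speed': 'Medium', 'altitude_level': 2, 'exit_or_airport': 'B'}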

Design Elements
Environment: airspace sector with exits, airports, planes, and controls
Events: planes appear, move, and disappear
Actions: user clicks controls to control planes
Rules: eight rules related to speed, altitude, separation, etc.

• Display of air sector:

• four exits

• two airports

• planes

• Plane controls

• Display of time remaining

Environment

• Direction – 8 headings

• Speed - 3 levels

• Altitude - 4 levels

• Accept handoff

Actions: Plane Controls

• New planes appear

• Planes move

• Planes disappear:

• land, exit, or crash

Events

• Land at slowest speed
• Land at lowest altitude
• Exit at highest altitude
• Land/exit at right place
• Land in correct direction
• Don’t fly over airports
• Keep planes separated
• Don’t crash!

Eight Simple Rules

ATST: Design Considerations

• Measure KSAs

• Simulate wide range of job tasks

• Simulate a key job activity

Possible Approaches

• Maximize score variance

• No disadvantage for computer novices

• Short testing time

Design Goals

• Number of planes on the screen at one time

• Initial speed, level, heading, & location

• N of actions per plane

Difficulty Factors: Planes

• Speed of plane movement
• N and complexity of rules
• N of exits/airports
• N of controls
• N of control levels
• N of things to remember

Difficulty Factors: Other

ATST: Scoring

• Rule violations

• Separation errors

• Crashes

• Elapsed Time

Record the Outcomes

• N of crashes & separation errors

• N of procedural errors

• Percent of successful flights

• Total delay time (handoff & en route)

A Priori Rational Scales

• Moved “flyovers” based on data
• Standardize scores before combining and weighting
• Rescale scores if useful:
• reverse
• fix skewness
• make scores more sensitive to differences at higher levels

Adjust Scales and Scores
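As a rough illustration of the standardize-then-combine step above, here is a minimal sketch; the weights, data, and the exact list of reverse-keyed scales are assumptions for illustration, not the operational AT-SAT values.

import numpy as np

def standardize(x):
    """Convert raw scores to z-scores (mean 0, SD 1)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def combine_atst_outcomes(outcomes, weights):
    """Standardize each rationally scored outcome, reverse-key error and delay
    scales so that higher always means better, and form a weighted composite."""
    n = len(next(iter(outcomes.values())))
    total = np.zeros(n)
    reverse_keyed = {"crash_and_separation_errors", "procedural_errors", "delay_time"}
    for name, raw in outcomes.items():
        z = standardize(raw)
        if name in reverse_keyed:
            z = -z
        total += weights[name] * z
    return total / sum(weights.values())

# Hypothetical pilot data with equal (illustrative) weights.
rng = np.random.default_rng(0)
outcomes = {
    "crash_and_separation_errors": rng.poisson(2, size=100),
    "procedural_errors": rng.poisson(5, size=100),
    "pct_successful_flights": rng.uniform(50, 100, size=100),
    "delay_time": rng.gamma(2.0, 30.0, size=100),
}
weights = {name: 1.0 for name in outcomes}
print(combine_atst_outcomes(outcomes, weights)[:5])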

ATST: Instructions & Practice

• Will computer experience affect scores?

• If so, can you minimize its effect?

• If computer skill is job relevant, it’s not as serious.

Computer Skill Effects

• Reduces effects of:
• learning ability
• computer game skills

Simulation Practice

Instructions with examples

Practice a few things in mini-trial

Give feedback and tips

Several mini-trials

• Add practice trials until more trials add little to:

• reliability

• validity

Post-Instruction Practice

• Pilot test with lots of trials

• Compute trial-total r

• Use last one or more trials with high rs

• Trial-criterion rs are even better if available

Which Trials to Score
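A minimal sketch of the trial-selection rule above, assuming a pilot sample with one ATST score per trial per examinee; the data and the .5 threshold are invented for illustration.

import numpy as np

def trial_total_correlations(trial_scores):
    """For each trial, correlate its scores with the total of the other trials."""
    trial_scores = np.asarray(trial_scores, dtype=float)
    n_trials = trial_scores.shape[1]
    rs = []
    for j in range(n_trials):
        others = [k for k in range(n_trials) if k != j]
        rest_total = trial_scores[:, others].sum(axis=1)
        rs.append(np.corrcoef(trial_scores[:, j], rest_total)[0, 1])
    return np.array(rs)

# Simulated pilot test with many trials; later trials reflect ability more
# strongly, mimicking practice effects.
rng = np.random.default_rng(1)
ability = rng.normal(size=200)
scores = np.column_stack(
    [ability * (0.3 + 0.1 * t) + rng.normal(size=200) for t in range(8)]
)
rs = trial_total_correlations(scores)
keep = [t for t in range(len(rs)) if rs[t] >= 0.5]
print("trial-total rs:", np.round(rs, 2))
print("trials to score:", keep)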

Score Weighting Methods

Final AT-SAT Validity

Validity, corrected for shrinkage, range restriction, and unreliability in the criterion:

Criterion    Validity    N
Composite    .76         1029
CBPM         .78         1032
Ratings      .38         1053

Criteria

Reliability
.80 - CBPM (test-retest)
.71 - Ratings (interrater reliability)
.76 - Composite (mean of CBPM & Ratings)

Composite = .6 * CBPM + .4 * Ratings
r(CBPM, Ratings) = .24
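For reference, the standard correction for unreliability in the criterion (one of the three corrections named above) divides the observed validity by the square root of the criterion reliability; the range-restriction and shrinkage corrections use separate formulas not shown here:

corrected r = observed r / sqrt(criterion reliability)

With the composite reliability of .76, for example, an observed validity of .50 would become .50 / sqrt(.76) ≈ .57 before the other corrections.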

Uncorrected Test Validities

.41  Applied Math
.38  Analogies
.33  Angles
.33  Letter Factory
.32  Air Traffic Scenarios
.27  Time Wall
.25  Planes

Uncorrected Test Validities (cont’d)

.24  Dials
.23  Memory Retest
.19  Memory
.19  Scan
.17  Experiences Questionnaire
.14  Sound

Goals

Find best set of tests for the final battery

Determine scale weights

Measure validity: How well test battery predicts job performance

Decisions to be Made

Which predictors to keep

How to weight predictors

Decision 1: Choosing Predictors

Phase 1: Evaluate predictors according to:
simple validity
incremental validity
fairness
group differences
test administration time

Predictors chosen by group consensus

Decision 1: Choosing Predictors (cont’d)

Phase 2: Optimal weighting algorithm
iterated regression
negative weights set to zero
maximizes R2 while minimizing differences in group means, slopes, and intercepts

Many runs done while varying the relative importance of R2 and group differences (10 parameters)
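A bare-bones sketch of the "iterated regression with negative weights set to zero" idea described above, ignoring the group-difference terms the real algorithm also traded off; the data and names are illustrative assumptions, and predictors are assumed standardized (no intercept).

import numpy as np

def nonnegative_regression_weights(X, y):
    """Iterated least squares: fit, drop predictors whose weight goes negative,
    and refit until all remaining weights are non-negative."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    active = list(range(X.shape[1]))
    weights = np.zeros(X.shape[1])
    while active:
        b, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        if (b >= 0).all():
            weights[active] = b
            break
        active = [p for p, coef in zip(active, b) if coef >= 0]
    return weights

# Illustrative use with simulated standardized predictors and criterion.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))
true_b = np.array([0.4, 0.3, 0.2, 0.0, -0.1, 0.05])
y = X @ true_b + rng.normal(size=500)
print(np.round(nonnegative_regression_weights(X, y), 3))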

Decision 2: How to Weight Predictors

Alternatives Considered
regression weights
validity weights
optimal weights
unit weights

Chosen Weighting Method
mean of validity and optimal weights
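Spelled out (notation introduced here for illustration; validity weights are conventionally proportional to each predictor's criterion correlation, and both weight sets would be put on a comparable scale before averaging):

final weight for predictor j = (validity weight for j + optimal weight for j) / 2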

Conclusions: Optimal Weights

An optimal weighting algorithm can balance the many considerations in selecting and weighting predictors

Optimal weighting tends to exclude too many predictors

Optimal weighting used in conjunction with another weighting method can perform very well

Conclusions: Validity Weights

In the current study, validity weights performed very well, especially when combined with optimal weights

Research is needed to compare the shrinkage (due to overfitting and predictor selection) and R2 of validity weights to other weighting methods

Estimating Shrinkage in R

Shrinkage formula has shortcomings
Shrinkage formula corrects for overfitting but not predictor selection
Validity weighting - only selection shrinkage
Optimal weighting - shrinkage is severe because of extra parameters
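For context, a common shrinkage adjustment of the kind criticized above (the familiar Wherry-type adjusted R2) corrects only for overfitting with a fixed set of k predictors:

adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)

with n examinees and k predictors; nothing in the formula reflects how those k predictors were selected, which is the shortcoming noted on the slide.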

Conclusions: Shrinkage

Research is needed to help estimate shrinkage of R (due to overfitting and predictor selection) using validity weights and optimal weights.

Research is needed to compare validity weights, optimal weights, and regression weights in terms of shrunken R2 (due to overfitting and predictor selection)

Setting the Cut Score

Regression Method

Set cut score on criterion
Compute corresponding score on predictor based on regression line
Works well when R-squared is high and a good criterion is available
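In symbols (introduced here for illustration, following the steps above): with the fitted regression line predicted criterion = a + b * predictor and a chosen criterion cut point y_cut, the corresponding predictor cut score is

predictor cut score = (y_cut - a) / b

The caveat on the slide follows directly: when R-squared is low, the slope b is small and poorly estimated, so the derived predictor cut score becomes unstable.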

Criterion Cut Point Options

Cut score set at the 5th percentile (for example) of the current incumbents’ distribution

Set cut score such that the mean expected performance of candidates passing is at the 60th percentile (for example) of current incumbents

Set cut score at an anchor on the ratings scale (e.g., “acceptable”)

Cut Point Considerations

% of applicants passing
% of incumbents passing
% of each minority group passing
adverse impact ratio
If the pass rate is low:
need many applicants, or
recruiting must target high-quality applicants
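For reference, the adverse impact ratio listed above is conventionally computed as a ratio of pass rates, and the four-fifths rule of the EEOC Uniform Guidelines treats ratios below .80 as evidence of adverse impact:

adverse impact ratio = (pass rate of the focal group) / (pass rate of the group with the highest pass rate)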

Questions?
