Cognitive Modeling in the Large

July 2008

Jerry Ball

Human Effectiveness Directorate

711th Human Performance Wing

Air Force Research Laboratory

2

Cognitive Modeling

• Cognitive models are typically Small-Scale

• Tied closely to Laboratory Experiments & Data

– Popperian approach to science – Hypothesis falsification

• Focus on Cognitive Plausibility of isolated cognitive phenomena

– Isolate phenomenon of interest from possible confounds

– Divide and conquer approach to research (Gray, 2007)

• Focus on Empirical Validation

– Fine-grained validation (e.g. reaction time, error rates)

– Models that fit data and are implemented in a cognitive architecture are assumed to be valid

3

Cognitive Modeling

• Computational Implementation

– Limited complexity

– One shot models with little reuse or maintenance

• publish and move on

– Task specific models that don’t generalize to other tasks

• Little accumulation of reusable components

• Can’t be scaled up to larger or multiple tasks

4

Cognitive Modeling

• Cognitive Modeling tools

– Cognitive Architecture (ACT-R, SOAR, Polyscheme)

• Provides unifying theoretical framework for isolated models

– Attempt to answer Newell’s “20 questions” critique of cognitive science research (cf. Anderson & Lebiere, 2003)

– Lakatosian approach to science – accumulation of scientific evidence across studies and systems

• Imposes cognitive constraints on model development

• Supplemented with accepted “practices” which are not architecturally enforced

• Provides debugging and visualization tools

– WhyNot

– Swim lanes, BOLD Response

5

Cognitive Modeling

• Cognitive Modeling tools

– Model code generation tools

• Automatically generate lower level model code from higher level specifications

– GOMS → ACT-R or SOAR

6

Sample Cognitive Model

• A paper recently published in a leading psychological journal (Salvucci & Taatgen, 2008) used a dual-task model of Reading and Dictation to support its theoretical claims

• Since I’m interested in reading, I checked out the “Reading” model

• The “Reading” model

– Contains exactly 4 productions!

• Attend visual – encode next word

• Lexical access – retrieve encoded word from DM

• Set retrieval cues – retrieve syntax for word from DM

• Attach syntax element – stub production encode next word

– The model cycles thru the same sentence over and over
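The four-production cycle listed above can be pictured with a toy loop. This is a minimal sketch in plain Python, not ACT-R code and not the published model: the `LEXICON` dictionary, the category labels, and the function name `read_sentence` are assumptions made up for illustration. It only shows how little work such a cycle does per word.

```python
# Toy illustration only: plain Python, not ACT-R and not the published model.
# LEXICON and its category labels are invented for this example.
LEXICON = {"the": "determiner", "man": "noun", "hit": "verb", "ball": "noun"}


def read_sentence(words):
    """Run the four-step cycle once per word and return the attached elements."""
    attached = []
    for word in words:
        encoded = word.lower()              # 1. attend visual: encode next word
        if encoded not in LEXICON:          # 2. lexical access: retrieve the word from memory
            continue                        #    (unknown words are simply skipped in this toy)
        syntax = LEXICON[encoded]           # 3. set retrieval cues: retrieve syntax for the word
        attached.append((encoded, syntax))  # 4. attach syntax element (a stub: no structure is built)
    return attached


result = read_sentence("The man hit the ball".split())
print(result)
```

Nothing in the loop builds a representation of the sentence, which is the sense in which such a model is closer to word lookup than to reading.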

7

Sample Cognitive Model

• The “Reading” model is an extreme example, but cognitive models of reading are often little more than word lookup models

– It turns out that the time to move the eye from word to word accounts for most of the time needed in reading

– In fact, the model is slower than the human subjects it was compared to (who actually read the input texts and were tested for comprehension)

• Subj 1 – 480 words per minute normal reading speed, single task

• Subj 2 – 358 wpm

• Model – 298 wpm

– To get good fits in the dual-task model, the authors had to “scale” the results to normal, single task reading speed

(Spelke et al., 1976 study)
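The reading rates above are easier to compare as time per word. The arithmetic below is a small sketch using only the three figures on the slide. The speed ratio it prints is an illustrative comparison, and is not claimed to be the scaling procedure the authors used.

```python
def ms_per_word(wpm):
    """Convert a reading rate in words per minute to milliseconds per word."""
    return 60_000 / wpm


# Reading rates as given on the slide (words per minute).
rates = {"Subj 1": 480, "Subj 2": 358, "Model": 298}

per_word = {name: ms_per_word(wpm) for name, wpm in rates.items()}
speed_ratio = {name: wpm / rates["Model"] for name, wpm in rates.items()}

for name in rates:
    print(f"{name}: {per_word[name]:.0f} ms/word, {speed_ratio[name]:.2f}x model speed")
# Subj 1: 125 ms/word, 1.61x model speed
# Subj 2: 168 ms/word, 1.20x model speed
# Model: 201 ms/word, 1.00x model speed
```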

8

Cognitive Models

• Although extreme in claiming to be a model of reading, the “Reading” model is not unusual in size

• Cognitive models typically contain on the order of tens of productions

• Given the fine-grained nature of ACT-R productions, such models are only capable of modeling extremely limited, isolated cognitive tasks

• Who needs software engineering!

9

Software Engineering

• Focus on Development of Large-Scale, Complex Software Systems

– Multi-Component Systems

– Teams of Developers

– Reuse of Validated Components

– Component Integration

– Verification and Validation of System Functionality

– Life-Cycle Maintenance

– Software engineering tools:

• Configuration management (e.g. subversion)

• Architecture and design tools (e.g. UML)

• Software development environments

10

Software Engineering

• Focus on handling Complexity

– Modularization & Encapsulation

– Hierarchical Decomposition

– Statistical analysis and machine learning

– Automated generation of systems

• Focus on engineering solutions to complex tasks

– Scientific and theoretical issues not typically emphasized

11

Cognitive Modeling in the Large

• Large-Scale

• Multi-Component

• Functional

• Cognitively Plausible

– Theoretically motivated components

• Empirically Valid

– Gross Level Validation – not just input/output behavior

• Computational Implementation

– Significant complexity

12

Bridging the Gap

• Use of Software Engineering techniques

– Modularity & Encapsulation

– Hierarchical Decomposition

– Automated testing (especially regression testing)

– Software development life-cycle management

• Educate Cognitive Modelers in Software Engineering techniques

• Encourage Positive Team Interaction
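As one concrete picture of automated regression testing applied to a model, the sketch below stores previously validated outputs and re-checks them after every change. It is an assumption-laden illustration: `toy_model` is a hypothetical stand-in for a real model run, and the `BASELINE` entries reuse example sentences that appear later in this talk. Only the Python standard library `unittest` module is used.

```python
import unittest

WH_WORDS = {"where", "what", "who", "when", "why", "how"}


def toy_model(sentence):
    """Hypothetical stand-in for a model run: labels an utterance by its surface form."""
    words = sentence.rstrip("?!.").lower().split()
    if sentence.endswith("?"):
        return "wh-question" if words[0] in WH_WORDS else "yes-no-question"
    if sentence.endswith("!"):
        return "imperative"
    return "declarative"


# Previously validated outputs, kept so that later changes can be checked against them.
BASELINE = {
    "The man hit the ball": "declarative",
    "Did the man hit the ball?": "yes-no-question",
    "Where did the man hit the ball?": "wh-question",
    "Hit the ball!": "imperative",
}


class RegressionTest(unittest.TestCase):
    def test_outputs_match_baseline(self):
        for sentence, expected in BASELINE.items():
            with self.subTest(sentence=sentence):
                self.assertEqual(toy_model(sentence), expected)


def run_regression():
    """Run the regression suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_regression()
```

The point of the design is that the baseline grows as the model grows, so a change that breaks an earlier capability is caught immediately rather than at integration time.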

13

Bridging the Gap

• Use of Cognitive Modeling techniques

– Empirical Validation

– Theoretical Motivation

– Taking Cognitive Plausibility seriously

– Using a Cognitive Architecture for development

• Educate Software Engineers in Cognitive Modeling techniques

• Encourage Positive Team Interaction

14

Easier Said Than Done!

• Researchers have vested interests in the particular methodologies and research directions they are pursuing

– Disciplinary boundaries are real!

– Interdisciplinary research is hard!

– The value of interdisciplinary research is underappreciated within disciplines

• If you’re not fitting data the cognitive modeling community will not be interested

• If you can’t process the Penn Treebank the computational linguistics community will not be interested

– Teams of researchers from different disciplines are difficult to manage

• “You obviously need me, but it’s not clear (to me) that I need you”

15

Easier Said Than Done!

• Not clear that cognitive components map to software components

– May be a whole different scale of complexity

– “Real” cognitive components may be much larger than typical software components

• Is human language a single cognitive component or can it be decomposed into smaller cognitive components?

• NLP research suggests that decomposition into separate components for morphology, syntax, semantics, and pragmatics is untenable

– Too much non-determinism within individual components

– Clear evidence of interaction between components

16

Why Does Cognitive Modeling Need Software Engineering?

• Cognitive modeling researchers interested in building large-scale models (many of which are at this workshop) really have no choice but to adopt software engineering techniques

– Large software systems will collapse under their own weight if not adequately engineered

– Complexity cannot be wished away, it must be managed

– Scaling up small-scale models leads to complex integration issues which do not arise in small-scale model development

17

Why Does Software Engineering Need Cognitive Modeling?

• Software engineers interested in engineering solutions to tasks that are inherently human would be advised to consider how humans manage the tasks

• Throwing computational techniques at such tasks is unlikely to succeed

– Computationally explosive, typically NP-hard

– Faster computers and more efficient algorithms will not solve the problem – not even HPC resources

– Humans clearly don’t work that way – maybe there’s a good reason why they don’t

• Evolution leads to workable solutions

18

Case Study: The Synthetic Teammate Project

19

Synthetic Teammate Project

• Goal: Development of a language-capable synthetic teammate able to function as the Air Vehicle Operator (AVO) in a 3-person simulation of a UAV performing reconnaissance missions

– Cognitively Plausible

– Functional

– Empirically Validated

[Figure: the three team stations, labeled AVO, PLO, and DEMPC]

20

System Overview

[Diagram: Text Chat Input and Visual Input; Language Comprehension, Dialog Manager, Situation & Task Model, Language Generation, and Motor Model within the ACT-R Cognitive Architecture; Text Chat Output and Motor Actions]

• Major components developed in ACT-R

21

System Overview

[Diagram repeated: Text Chat Input and Visual Input; Language Comprehension, Dialog Manager, Situation & Task Model, Language Generation, and Motor Model within the ACT-R Cognitive Architecture; Text Chat Output and Motor Actions]

22

Language Comprehension Model

• Under development since 2002

• Initially coded in ACT-R 5

• Ported to ACT-R 6 in 2006

• Described in Ball, Heiberg & Silber (2007)

23

Language Comprehension Model

• Currently handles a fairly wide range of expression types, including:

– Declaratives – “The man hit the ball”

– Questions

• Yes-No Questions – “Did the man hit the ball?”

• Wh Questions – “Where did the man hit the ball?”

– Imperatives – “Hit the ball!”

– Relative Clauses – “The ball that the man hit”

– Wh Clauses – “I know where the man hit the ball”

– Passive constructions – “The ball was hit”

24

Language Comprehension Model

• Additions to model are

– Motivated by functional considerations

– Driven by empirical evidence

– Constrained by well-established cognitive constraints on language processing

– Implemented without computational techniques that are obviously cognitively implausible, e.g.

• Multi-stage processing with a separate part-of-speech tagger

• Non-incremental processing

• Access to right context

25

Language Comprehension Model

• Why might adherence to cognitive constraints actually be a good idea from a software engineering perspective?

– Don’t know what we’re giving up in adopting cognitively implausible techniques

• Microsoft parser – parses input from right to left

– Reasonable engineering decision?

– Can’t operate in real time

– Can’t be integrated with speech recognition

– Cognitive constraints push development in directions which are more likely to be successful given the inherently human nature of language processing

26

System Overview

[Diagram repeated: Text Chat Input and Visual Input; Language Comprehension, Dialog Manager, Situation & Task Model, Language Generation, and Motor Model within the ACT-R Cognitive Architecture; Text Chat Output and Motor Actions]

27

Natural Language Generation

• Computational cognitive model of NLG, combining:

– Optimality Theory (OT) (Prince & Smolensky 1993/2004)

– ACT-R cognitive architecture (Anderson 2007)

• The model:

– Expresses the patterns of human language production through simple, conflicting OT constraints

– Is based on human language usage from UAV simulation experiment transcripts (Cooke & Shope 2005)

– Provides a general computational cognitive account of OT

– Captures the dynamic and probabilistic nature of human language production
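The core idea of Optimality Theory, choosing among candidates by ranked, violable constraints, can be sketched in a few lines. This is an illustration under stated assumptions and not the NLG model described here: the three constraints, the candidate utterances, and both rankings are invented, and evaluation uses a strict deterministic ranking, so it does not capture the probabilistic behavior the slide mentions.

```python
# Each constraint returns a violation count for a candidate utterance.
# Constraints, candidates, and rankings are all invented for this example.

def be_brief(candidate):
    """Penalize every word."""
    return len(candidate.split())


def name_addressee(candidate):
    """Penalize omitting the addressee."""
    return 0 if "AVO" in candidate else 1


def state_value(candidate):
    """Penalize omitting a numeric value."""
    return 0 if any(ch.isdigit() for ch in candidate) else 1


def violation_profile(candidate, ranking):
    """Violation counts ordered from highest-ranked to lowest-ranked constraint."""
    return tuple(constraint(candidate) for constraint in ranking)


def optimal(candidates, ranking):
    """Strict ranking: compare profiles lexicographically and keep the smallest."""
    return min(candidates, key=lambda c: violation_profile(c, ranking))


CANDIDATES = [
    "climb",
    "climb to 3000",
    "AVO climb to 3000",
    "AVO please climb to altitude 3000 now",
]

informative_first = [state_value, name_addressee, be_brief]
brevity_first = [be_brief, state_value, name_addressee]

print(optimal(CANDIDATES, informative_first))  # AVO climb to 3000
print(optimal(CANDIDATES, brevity_first))      # climb
```

Re-ranking the same constraints changes the winner, which is how simple, conflicting constraints can produce different output patterns.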

28

System Overview

[Diagram repeated: Text Chat Input and Visual Input; Language Comprehension, Dialog Manager, Situation & Task Model, Language Generation, and Motor Model within the ACT-R Cognitive Architecture; Text Chat Output and Motor Actions]

29

AVO Task Model

• Fly UAV in performance of reconnaissance missions as a member of a three-person team

• Task model currently capable of flying to a waypoint, but isn’t yet hooked up to the UAV simulation

• Development of AVO Task Model funded by AFOSR grant to the Cognitive Engineering Research Institute (CERI)

– Chris Myers is leading development

• Task model is currently being integrated with the Language Generation model

30

Case Study: The Synthetic Teammate Project

• Each component of the system would individually be a complex cognitive model

• Together they constitute a highly complex model

• Important design decisions

– Which components of the system need to be cognitively plausible?

31

Case Study: The Synthetic Teammate Project

• Language generation component was created by Andrea Heiberg, a linguist, and is not cognitively plausible

– Uses full utterance templates to generate outputs

– Maps linguistic input to output template

– Matches behavior of a single subject very well!

– Cognitively plausible language generation, starting from a non-linguistic mental representation of the situation, would be extremely hard (in my opinion, harder than language comprehension!)
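Template-based generation of the kind described above can be sketched as a lookup from an input pattern to an output template with slots. The talk does not give the actual templates, so everything here is hypothetical: the two patterns, the response wording, the `state` dictionary, and the function name `generate`.

```python
import re

# Hypothetical (pattern, template) pairs; the real system's templates are not given in the talk.
TEMPLATES = [
    (re.compile(r"what is the altitude for (?P<target>[\w-]+)", re.IGNORECASE),
     "Altitude for {target} is {altitude}."),
    (re.compile(r"go to (?P<target>[\w-]+)", re.IGNORECASE),
     "Heading to {target}."),
]


def generate(message, state):
    """Map an incoming message to the first matching full-utterance template."""
    for pattern, template in TEMPLATES:
        match = pattern.search(message)
        if match:
            # Slots are filled from the matched text and from the (assumed) task state.
            return template.format(**{**state, **match.groupdict()})
    return None  # no template matches this input


reply = generate("What is the altitude for H-Area?", {"altitude": 3000})
print(reply)  # Altitude for H-Area is 3000.
```

A scheme like this can match one speaker's habits closely while saying nothing about how a message is built from a non-linguistic representation, which is the plausibility concern raised above.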

32

Case Study: The Synthetic Teammate Project

• Mary Dungan, a summer intern, is currently working on handling variability in the input (e.g. H-Area, HAREA, H area, spellingg)

– Decided to handle variability within ACT-R

• Can’t use readily available off-the-shelf tools

• Mary had to learn how to use ACT-R

• Implementation within ACT-R will also support recognition of multi-word units (e.g. good bye, kick the bucket)

– To model variability, had to eliminate “hard” constraints in word retrieval

• ACT-R must serially compute activations over all words in declarative memory

– With ~2000 words, this causes a significant slowdown (about a second)

• If partial matching is on, performance is even worse
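The trade-off described above, soft matching in exchange for scoring every entry, can be illustrated outside ACT-R. In this sketch, string similarity from Python's standard `difflib.SequenceMatcher` is a stand-in for activation-based retrieval. It is not ACT-R's activation or partial-matching computation, and the lexicon entries, the 0.8 threshold, and the function names are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical lexicon entries, including one multi-word unit.
LEXICON = ["h-area", "altitude", "airspeed", "waypoint", "good bye"]


def normalize(text):
    """Lowercase and drop spaces and hyphens so 'H-Area', 'HAREA', and 'H area' compare equal."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for a retrieval score, not ACT-R activation."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def retrieve(form, lexicon=LEXICON, threshold=0.8):
    """Score every entry (a serial scan, so cost grows with lexicon size) and return the best match."""
    best = max(lexicon, key=lambda entry: similarity(form, entry))
    return best if similarity(form, best) >= threshold else None


for form in ["H-Area", "HAREA", "H area", "altitudee", "goodbye", "xyz"]:
    print(form, "->", retrieve(form))
```

Without a hard constraint to narrow the search, every retrieval touches every entry, which mirrors the slowdown reported for larger vocabularies.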

33

Case Study: The Synthetic Teammate Project

• How do the cognitively plausible components interface with non-cognitively plausible components?

– Considered running components in separate ACT-R modules

• Attempt to modularize the code

• Psychologists on team argued against doing so

– In process of integrating cognitively plausible task model with cognitively implausible language generation model into a single composite model

• Haven’t thought thru the theoretical implications…

– Even though there may be cognitively implausible components, integration of a cognitively plausible component into an end-to-end system is likely to provide insight into the theoretical merit of that component

34

Case Study: The Synthetic Teammate Project

• How important is the science, relative to the engineering?

– Is an end-to-end functional demonstration required?

– If so, is it even feasible?

35

Case Study: The Synthetic Teammate Project

• What is the right level for empirical validation?

• Gross level validation is important for assuring cognitive plausibility, focusing development, and avoiding over-proliferation of functional elements

• Fitting the model to specific data sets before a functional system is in place runs the risk of overfitting in ways that make the system non-generalizable to meet the larger functional requirements

36

Case Study: The Synthetic Teammate Project

• How to manage a team of researchers with different disciplinary backgrounds?

– Most linguists don’t do psychology or software engineering

– Most psychologists don’t do linguistics or software engineering

– Most software engineers don’t do psychology or linguistics

37

Case Study: The Synthetic Teammate Project

• How to keep a team together over the extended timeframe needed for development of a complex software system?

– Dealing with the loss of personnel

– Dealing with personal interaction

• How to manage the expectations of stakeholders without overstating the prospects for success or losing funding?

– Dealing with inflated expectations

– Dealing with funding cuts

38

Summary

• Development of complex cognitive models necessitates introduction of software and systems engineering techniques

• If we are genuinely interested in building complex cognitive models, cross-disciplinary teams willing to work together will be needed

• Historically, cross-disciplinary research has often foundered

– “The quickest way to ruin an NLP project is to hire a linguist!”

– Of course, most NLP projects failed with or without a linguist

39

Summary

• There are no guarantees of success, and lots of ways to fail

• Without taking software/systems engineering and program management seriously, any chance of success will hinge on the capabilities of one individual or a very small group

40

Questions?

41

Ball, J., Heiberg, A. & Silber, R. (2007). Toward a Large-Scale Model of Language Comprehension in ACT-R 6. Proceedings of the 8th International Conference on Cognitive Modeling, 173-179. Ed. by R. Lewis, T. Polk & J. Laird. NY: Psychology Press.

Anderson, J. & Lebiere, C. (2003). The Newell Test for a theory of cognition. Behavioral and Brain Sciences, 26, 587-637.

Gray, W. (2007). Integrated Models of Cognitive Systems. Edited by W. Gray. Oxford: Oxford University Press.

Salvucci, D. & Taatgen, N. (2008). Threaded Cognition: An Integrated Theory of Concurrent Multitasking. Psychological Review, 115, 101-130.

Spelke, E., Hirst, W. & Neisser, U. (1976). Skills of Divided Attention. Cognition, 4, 215-230.

Anderson, J. R. (2007). How Can the Human Mind Occur in the Physical Universe? New York: Oxford University Press.

Cooke, N. & Shope, S. (2005). Synthetic Task Environments for Teams: CERTT’s UAV-STE. Handbook of Human Factors and Ergonomics Methods, 46-1 to 46-6. CRC Press.

Prince, A. & Smolensky, P. (1993/2004). Optimality Theory: Constraint Interaction in Generative Grammar. Oxford: Blackwell.