homework i3 - due tuesday issues? - northeastern university


Page 1: Homework I3 - Due Tuesday Issues? - Northeastern University


Empirical Research Methods in Information Science

IS 4800 / CS6350

Lecture 8 Miscellaneous Measures


Homework I3 - Due Tuesday Issues?

- Conduct a small usability study
  - Descriptive
  - Quantitative
  - At least two tasks
  - At least two measures
  - At least three subjects


Review

Table columns: Descriptive | Demonstration | Correlational | Experimental
Table rows: Number of Variables; Number of IV Levels; Manipulation
Cell values as extracted (cell positions lost): ≥1, ?, ≥2, ≥2; 1, NA; NA, 1; ≥2; NA; NA


Questionnaire Validation - Review

- Reliability
  - Test-retest
  - Internal consistency
- Validity
  - Face
  - Content
  - Criterion
    - Concurrent
    - Predictive
  - Construct
    - Convergent
    - Discriminant


Scales of Measurement - Review

- Nominal
- Ordinal
- Interval
- Ratio


Measures of Center: Decision Rule

- Nominal
  - Mode
- Ordinal
  - Median
- Interval or Ratio, and Normal, and No Outliers
  - Mean
- Else
  - Median


Measures of Spread: Decision Rule

- Nominal, Ordinal
  - No measure of spread
- Interval or Ratio, and Normal, and no outliers
  - SD
- Else
  - IQR
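The two decision rules above can be sketched as a single function. This is only an illustration in Python (the lecture itself uses R): the function name `describe`, its arguments, and the sample data are assumptions, and whether the data are normal or contain outliers is left to the analyst's judgment rather than tested.

```python
from statistics import mean, median, median_low, mode, quantiles, stdev


def describe(values, scale, normal=False, has_outliers=False):
    """Choose center and spread following the decision rules above.

    scale: "nominal", "ordinal", "interval", or "ratio".
    normal / has_outliers: judgments supplied by the analyst (for example
    after inspecting a histogram); this sketch does not test for them.
    """
    if scale == "nominal":
        return {"center": ("mode", mode(values)), "spread": None}
    if scale == "ordinal":
        # Assumption: median_low is used so the result is an observed rank.
        return {"center": ("median", median_low(values)), "spread": None}
    if scale in ("interval", "ratio"):
        if normal and not has_outliers:
            return {"center": ("mean", mean(values)),
                    "spread": ("SD", stdev(values))}
        # Assumption: quartiles use Python's default "exclusive" method;
        # other packages may give slightly different IQR values.
        q1, _, q3 = quantiles(values, n=4)
        return {"center": ("median", median(values)),
                "spread": ("IQR", q3 - q1)}
    raise ValueError(f"unknown scale: {scale!r}")


# Hypothetical task times in seconds, with one obvious outlier.
task_times = [1, 2, 3, 4, 5, 6, 7, 100]
print(describe(task_times, "ratio", normal=False, has_outliers=True))
print(describe(["red", "blue", "red"], "nominal"))
```

With the outlier present, the function falls through to the median and IQR branch, which is the "Else" case of both rules.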


Which measures of center and spread?

Time to Complete


Which measures of center and spread?

[Figure: Favorite Color, with categories Red, Blue, Purple, Yellow, Pink, Orange, Green, Black, Grey, Tan]


Which measures of center and spread?

Number of Errors


Chapter 8

Using Nonexperimental Research

- Observational
- Miscellaneous designs
- Meta-analyses


Observational Research

Nonexperimental Research


Example: Handheld ECAs

- Research Question:
  - Do people exhibit the same nonverbal conversational behavior when talking to a 2” tall character as when talking to another person face-to-face?
- Exercise:
  - Design the study

Example: Handheld ECAs


Results

[Figure: Human vs. HECA compared across Gesture, Gaze, Nods, Postures, and Brows; vertical axis from 0 to 1.8]

Observational Research aka Behavior Coding

Watching people and quantifying their behavior


Defining Behavioral Categories

- Only need enough detail to provide a reliable measure.
  - What is reliability?
  - How to measure it?
- E.g., do people in the student center become ruder 10 minutes before class times?


Defining a behavioral protocol

1. Draft coding manual.
2. Two or more observers code data for pilot subjects.
3. Is Kappa acceptable?
   - No: coders review differences and update the coding manual, then return to step 2.
   - Yes: the coding manual provides a reliable measure; proceed with study.


Developing Behavioral Categories

- Categories must be operationally defined
- Behavioral categories must be clearly defined to avoid ambiguity
  - “flailing arms around” vs.
  - “moved arms from below to above waist and back more than 3 times per minute”


Developing Behavioral Categories

“confused” vs.
(“clicked mouse at least 5 times on inappropriate menu”
 OR “gazed at interface with mouth open AND no mouse clicks or keyboard presses for 5 minutes”)
AND “furrowed brows”
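One way to see why the second wording is operational is that it can be written as a boolean rule that two coders would evaluate the same way. The sketch below is illustrative only: the `Observation` record and its field names are hypothetical, and the thresholds are the ones quoted on the slide.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical per-subject record; field names are illustrative."""
    wrong_menu_clicks: int     # mouse clicks on an inappropriate menu
    gazing_mouth_open: bool    # gazed at interface with mouth open
    idle_minutes: float        # time with no mouse clicks or key presses
    furrowed_brows: bool


def is_confused(obs):
    """Apply the operational definition of "confused" from the slide."""
    clicked_wrong_menu = obs.wrong_menu_clicks >= 5
    frozen = obs.gazing_mouth_open and obs.idle_minutes >= 5
    return (clicked_wrong_menu or frozen) and obs.furrowed_brows


print(is_confused(Observation(6, False, 0.0, True)))   # many wrong clicks, furrowed brows
print(is_confused(Observation(0, True, 2.0, True)))    # idle for only 2 minutes
```

The vague label "confused" leaves the judgment to each observer, whereas every term in the rule is something a coder can count or time.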


Coding Manual

- You should write your behavior identification rules down so that you could give them to someone else to follow reliably.
- You should also write down the sampling and coding methods you will use, as well as your recording instrument (e.g., paper form).


Quantifying Behavior: What is the metric?

- Frequency Method
  - Record the frequency with which a behavior occurs within a time period
- Duration Method
  - Record how long a behavior lasts
- Intervals Method
  - Divide the observation period into several discrete time intervals (e.g., ten 2-minute intervals), and record whether a behavior occurs within each interval


Example: Code Posture Shifts


Example

- Posture shifts
  - Body part
    - Upper body
    - Lower body
    - Both
  - Type
    - Shift
    - Return
  - Energy level
    - 0-100%
- Hand gestures and other communicative behaviors do not count, nor do their effects.

StartTime  EndTime   BodyPart  Type    Energy
00:00:03   00:00:04  Upper     Return  50%
…          …         …         …

- Video reviewed and start/stop/type coded.
- From this, we can compute frequency, duration, or intervals
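As a rough illustration of how the three metrics fall out of start/stop codes, the Python sketch below derives frequency, duration, and interval measures from a list of coded events. Only the first row comes from the table above; the other two rows, the 60-second observation period, the 20-second interval length, and all function names are assumptions made for the example.

```python
def to_seconds(hms):
    """Convert an HH:MM:SS time code to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def frequency(events, period_s):
    """Frequency method: events per second over the observation period."""
    return len(events) / period_s


def total_duration(events):
    """Duration method: total seconds the behavior lasted."""
    return sum(end - start for start, end in events)


def interval_flags(events, period_s, interval_s):
    """Intervals method: did the behavior occur in each interval?

    Assumes every event has end > start.
    """
    flags = []
    for lo in range(0, period_s, interval_s):
        hi = min(lo + interval_s, period_s)
        flags.append(any(start < hi and end > lo for start, end in events))
    return flags


coded = [
    ("00:00:03", "00:00:04"),   # row shown in the table above
    ("00:00:12", "00:00:15"),   # hypothetical
    ("00:00:41", "00:00:42"),   # hypothetical
]
events = [(to_seconds(start), to_seconds(end)) for start, end in coded]

print(frequency(events, 60))            # posture shifts per second
print(total_duration(events))           # seconds spent shifting
print(interval_flags(events, 60, 20))   # one flag per 20-second interval
```

Coding start and stop times once keeps all three options open, so the choice of metric can be made after the video has been reviewed.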


Posture Shifts
Duration, Frequency, or Interval Measures?

             Monologues (0.06/s)        Dialogues (0.07/s)
             ps/s    ps/int  energy     ps/s    ps/int  energy
Inter-dseg   0.340   0.837   0.832      0.332   0.533   0.844
intra-dseg   0.039           0.701      0.053           0.723

Posture shifts with respect to discourse segment


Tools for Coding: ANVIL


Coping With Complexity in Observational Research

- Recording
  - Use a recording device to make a record of behavior for later review
- Time Sampling
  - Scan subjects for a specific period (e.g., 30 seconds), and then record your observations during the next period
- Individual Sampling
  - Select a subject and observe behavior for a given period (e.g., 30 seconds), and then shift to another subject and repeat observations


Coping With Complexity in Observational Research

- Event Sampling
  - Select one behavior for observation and record all instances of that behavior
  - It is best if one behavior can be specified as more important than others


Coping With Complexity in Observational Research

- Ecological momentary assessment
- Intelligent/Context Aware EMA
- What kind of sampling is this?


Smart Rooms – e.g. PlaceLab

Issues in Observational Research

- IRB issues with video/audio recording?
- Behavior vs. Function/Intent


Evaluating Interrater Reliability

- You must establish reliability of observations from multiple observers (interrater reliability)
- Most common/acceptable method for evaluating interrater reliability for a nominal measure, 2 raters
  - Cohen’s Kappa
  - Adjusts the observed agreement for the agreement expected by chance
  - Kappa of 0.70 or more indicates acceptable interrater reliability

The R Project for Statistical Computing Interrater Reliability


Example R data setup for interrater reliability

Time  Judge1    Judge2
1     together  together
2     apart     apart
3     together  together
4     apart     together
5     apart     apart
6     together  together
7     together  together

Kappa

> install.packages("psych")  # one time
> require(psych)             # every session
> wkappa(table(data$Judge1, data$Judge2))
$kappa
[1] 0.6

$weighted.kappa
[1] 0.2
# weighted kappa accounts for the distance of each discrepancy
# (i.e., how bad different disagreements are)
# ignore for now
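To make the calculation behind kappa concrete, here is a minimal Python sketch of unweighted Cohen's kappa, offered as an illustration rather than a replacement for the psych package. It uses only the seven rows listed in the data setup, and the function name `cohen_kappa` is an assumption. For those rows the observed agreement is 6/7 and the chance agreement is 26/49, giving 16/23 (about 0.70); that figure describes these seven rows only and is not meant to reproduce the values printed in the R session.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters coding a nominal variable."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater1)
    # Proportion of items on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's category proportions.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = counts1.keys() | counts2.keys()
    p_chance = sum(counts1[c] * counts2[c] for c in categories) / n ** 2
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


judge1 = ["together", "apart", "together", "apart", "apart", "together", "together"]
judge2 = ["together", "apart", "together", "together", "apart", "together", "together"]

print(round(cohen_kappa(judge1, judge2), 3))   # 0.696 for these seven rows
```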


Other statistics for inter-rater reliability

- Fleiss' kappa
  - Nominal, >2 raters
- Kendall's τ, or Spearman's rho
  - Ordinal, 2 raters (not testing absolute match)
- Pearson correlation coefficient
  - Interval or ratio, 2 raters (not testing absolute match, only whether linearly related)
- Intraclass correlation coefficient
  - Interval or ratio, 2+ raters
  - See ‘icc’ function in ‘irr’ R package.
- See Hallgren article on Bibliography page


Refining a behavioral protocol

1. Draft coding manual.
2. Two or more observers code data for pilot subjects.
3. Is Kappa acceptable?
   - No: coders review differences and update the coding manual, then return to step 2.
   - Yes: the coding manual provides a reliable measure; proceed with study.


Behavior Coding Exercise
Groups of 2-4, one should have a laptop with R

- You are developing a robotic couples counselor.
- You want to determine how couples react to it.
- Given nominal variable to code
- Discuss
  - Meaning
  - Refine values
  - Behavioral correlates
  - Draft coding manual
  - Focus on nonverbal behavior (poor audio)


Behavior Coding Exercise
Groups of 2-4, one should have a laptop with R

- Shown 2-3 minute samples from 3 couples
- Interval sampling, 10s intervals
- Each judge codes behavior
  - Suggest shorthand, e.g., “E” for “Engaged”
  - Annotate Couple ID, landmarks (e.g., start of speaking turn), sample ID


Code!


Behavior Coding Exercise
Groups of 2-4, one should have a laptop with R

- Put into one spreadsheet with one row per observation, one column per judge
- Compute interrater reliability
  - For >2 judges, compute mean of all pair-wise kappas
  - If <0.7, discuss discrepancies and improvements
  - Update coding manual
- Videos shown 2nd time for discussion
- Videos shown 3rd time for 2nd pass coding
- Repeat kappa calcs
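For groups with more than two judges, the mean of all pair-wise kappas can be sketched as follows. This is a Python illustration under stated assumptions: the judge names and codes are hypothetical ("E" for "Engaged" is the shorthand suggested in the exercise, "D" is an invented second value), and the function names are not from the lecture.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters coding a nominal variable."""
    n = len(rater1)
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    p_chance = sum(counts1[c] * counts2[c]
                   for c in counts1.keys() | counts2.keys()) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


def mean_pairwise_kappa(codes_by_judge):
    """Average Cohen's kappa over every pair of judges.

    codes_by_judge maps a judge's name to that judge's codes,
    one code per observation, all in the same order.
    """
    pairs = combinations(sorted(codes_by_judge), 2)
    kappas = [cohen_kappa(codes_by_judge[a], codes_by_judge[b]) for a, b in pairs]
    return sum(kappas) / len(kappas)


# Hypothetical codes for six 10-second intervals, one column per judge.
codes = {
    "judge_a": ["E", "E", "D", "D", "E", "D"],
    "judge_b": ["E", "E", "D", "D", "E", "E"],
    "judge_c": ["E", "D", "D", "D", "E", "D"],
}

result = mean_pairwise_kappa(codes)
print(round(result, 3))
if result < 0.7:
    print("Below 0.7: discuss discrepancies and update the coding manual")
```

With these invented codes the three pair-wise kappas are 2/3, 2/3, and 2/5, so the mean falls below the 0.7 threshold and the group would go back to the coding manual.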


Other approaches to Data Collection
Necessarily non-experimental?

- Naturalistic Observation
  - Unobtrusive observations of subjects’ naturally occurring behavior are made
- Ethnography
  - The researcher becomes immersed in the behavioral or social system being studied. May be conducted as a participant or non-participant observation study
- Sociometry
  - You identify and measure interpersonal relationships within a group


Approaches to Data Collection
Necessarily non-experimental?

- Case History
  - You observe and report on a single case
- Content Analysis
  - You analyze spoken or written records for the occurrence of specific categories of events (e.g., a word or phrase)
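A very small content analysis can be sketched as counting how often the phrases that define each category occur in a set of records. The Python example below is illustrative only: the emails, the category names, and the phrase lists are all hypothetical, and a real study would need a coding manual and a reliability check for the categories.

```python
import re
from collections import Counter


def count_categories(documents, categories):
    """Count occurrences of each category's words or phrases.

    categories maps a category name to the words or phrases that
    define it (hypothetical lists supplied by the researcher).
    """
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for name, phrases in categories.items():
            for phrase in phrases:
                # Match whole words or phrases only, ignoring case.
                pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
                counts[name] += len(re.findall(pattern, lowered))
    return counts


emails = [
    "This policy is stupid. Totally stupid!",
    "Thanks for the update.",
    "What an idiotic waste of time.",
]
categories = {
    "insult": ["stupid", "idiotic"],
    "thanks": ["thanks", "thank you"],
}

counts = count_categories(emails, categories)
print(counts["insult"], counts["thanks"])   # 3 1
```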


Approaches to Data Collection
Necessarily non-experimental?

- Archival Research
  - You use existing records (e.g., police records) as your source of data
- Meta-Analysis
  - Compute overall statistics based on a number of previously-published studies.


Sequential Analysis aka Time-Series Analysis

- B&A say recording sequences of behavior may yield more information than individual events.
  - E.g., interruption followed by grimace followed by rolling eyes


Content Analysis: Defining Characteristics

- Used to analyze a written or spoken record for occurrence of specific behaviors or events
- Archival sources often used as sources for data
- Response categories must be clearly defined
- A method for quantifying behavior must be defined
- Tools exist:
  - http://www.lexicoder.com
  - R package: http://docs.quanteda.io/


Example Study

The CEO of Global Enterprises, Inc. is very worried about the low morale in the company, as evidenced by the amount of flame email she receives. She considers sending every office on a “ropes” course, but to do this would cost the company $10M. She asks you to do a study to tell how well her scheme might actually work in reducing her flame mail.


Meta-Analyses

- Compare/Integrate “all” studies that have investigated a given phenomenon
  - E.g., use of a particular medication for a particular disease
- Common in the literature (esp. medical)
- Very methodical
  - Search for articles
  - Eligibility criteria
  - Statistical analyses


Meta-Analysis

- New terms(?)
  - Level of Significance
  - Effect Size
  - Type I & II errors


Meta-Analyses

- Effect Size
  - Measure of how much difference exists between treatment groups in an experiment
- How to assess as common metric?
  - E.g., compare effect of large monitors on productivity
    - Study 1 measures widgets per day
    - Study 2 measures subjective assessment of managers
  - How to integrate across studies?
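One common way to put studies with different measures on a shared scale is a standardized mean difference such as Cohen's d. The slides do not name a specific metric at this point, so treat the Python sketch below as one possible approach under stated assumptions: all data are invented, the function name is illustrative, and the manager ratings are treated as interval data purely for the sake of the example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_variance = ((n1 - 1) * variance(treatment) +
                       (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_variance)


# Study 1 (hypothetical): widgets per day.
widgets_large = [52, 55, 58, 61, 64]
widgets_small = [48, 51, 54, 57, 60]

# Study 2 (hypothetical): manager ratings on a 1-7 scale.
rating_large = [5, 6, 6, 7, 6]
rating_small = [4, 5, 5, 6, 5]

d_study1 = cohens_d(widgets_large, widgets_small)
d_study2 = cohens_d(rating_large, rating_small)
print(round(d_study1, 2), round(d_study2, 2))
```

Because each d is expressed in standard deviation units, the two values can be compared or averaged even though one study counted widgets and the other collected ratings.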

Meta-analysis example


- Notes:
  - r is a measure of effect size; r^2 is the proportion of variance in the DV accounted for by the IV.
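A small numeric illustration of the note on r and r^2, in Python. The data are invented and the function name `pearson_r` is an assumption; the point is only to show how r^2 is read once r has been computed.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical IV levels and DV scores for five subjects.
iv = [1, 2, 3, 4, 5]
dv = [2, 4, 5, 4, 5]

r = pearson_r(iv, dv)
print(round(r, 3), round(r ** 2, 2))   # 0.775 0.6
```

Here r is about 0.77, so r^2 is 0.60: in these made-up data, 60% of the variance in the DV is accounted for by the IV.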


Homework

- Read Survey measures
  - B&A Ch 9
  - Article on debate re: scale questionnaire measures
- Finish I3 (usability test)
  - Due next class
