csc2130:( empirical(research(methods(for(computer( sciensts(sme/csc2130/01-intro.pdf · this...

1

1

University of Toronto Department of Computer Science

© 2004-5 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license.

CSC2130: Empirical Research Methods for Computer

Scien<sts

Steve Easterbrook [email protected]

Barbara Neves [email protected]

www.cs.toronto.edu/~sme/CSC2130/


© 2012 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license. 2

Course Goals ➜ Mo<vate the need for an empirical basis for research claims

➜ Prepare students for advanced research: Ä Learn how to plan, conduct and report on empirical inves<ga<ons. Ä Understand the key steps of a research project:

Ø  formula<ng research ques<ons, Ø  theory building, Ø  data analysis (using both qualita<ve and quan<ta<ve methods), Ø  building evidence, Ø  assessing validity, Ø  publishing.

➜ Cover the principal empirical methods applicable to human subjects studies in CS Ä controlled experiment, case studies, surveys, archival analysis, ac<on research,

ethnographies,…

➜ Relate these methods to relevant meta-‐theories in the philosophy and sociology of science.

2



Intended Audience

➜ This is an advanced course: Ä assumes a strong grasp of the key research ques<ons in your own research area, and that

you are already doing independent research

➜  Focus: Ä How do people use computer technology? Ä How does this technology (re-‐)shape human ac<vi<es? Ä How can we apply qualita<ve and quan<ta<ve techniques from the behavioural sciences

to help answer these ques<ons?

➜ The course is aimed at students who: Ä …plan to conduct research (in SE, HCI, etc) that demands some empirical valida<on Ä …wish to establish an empirical basis for an exis<ng research programme Ä …wish to apply these techniques in related fields (e.g. Cog Sci, )

➜  Note: we will *not* cover the kinds of experimental techniques used in CS systems areas, nor in medical/biological research

Ä  Focus is on the rela<onship between human ac<vity and computer technology



Format

➜ Seminars: Ä 1 three-‐hour seminar per week Ä Mix of discussion, lecture, student presenta<ons

➜ Readings Ä Major component is discussion of weekly readings Ä Please read the set papers before the seminar

➜ Assessment: Ä 10% Class Par<cipa<on Ä 20% Oral Presenta<on – introduce the readings using no more than two slides

Ø  Slide 1: Key takeaway messages from the paper Ø  Slide 2: Discussion ques<ons

Ä 70% Wriden paper Ø  A cri<cal literature review and study design for a specific research ques<on Ø  preferably related to your own research

3



Course Outline (part 1) 1.   Introduc<on & Orienta<on (today!)

Ä  Intro to philosophy & sociology of science Ä  Role of theory building

2.   Research Design and Ethics Ä  What counts as evidence? Ä  Use of mixed methods Ä  Ethics concerns for human subjects studies

3.   Basics of Doing Research Ä  Finding good research ques<ons Ä  Theory building Ä  Evidence and Measurement Ä  Replica<on Ä  Peer Review



Course Outline (cont) 4.   Laboratory Experiments

Ä  Controlled Experiments Ä  Quasi-‐experiments Ä  Sampling Ä  Confounding Variables

5.   Quan<ta<ve Analysis Ä  Basic Stats Ä  Interpre<ng significance measures Ä  power analysis

6.   Interviews and Observa<on Ä  Conduc<ng Interviews Ä  Par<cipant observa<on Ä  Collec<ng field notes

7.   Qualita<ve Analysis Ä  Coding Strategies Ä  Grounded Theory Ä  Phenomenography

8.   Case Studies Ä  Single and Mul<-‐case Ä  Longitudinal Case Studies

9.   Survey Research Ä  Ques<onnaire Design Ä  Sample Size

10.   Interven<on Methods Ä  Ac<on Research Ä  Pilot Studies Ä  Benchmarking

11.   Publishing and Reviewing Ä  (mock PC mee<ng)

12.   Replica<on and Beyond Ä  Internal and External Replica<on Ä  Biases and Influences Ä  Threats to Validity Ä  When to use empirical methods

4



Is this your research plan?

Step 1: Build a new tool

Step 2: ??

Step 3: Profit



Engineering vs. Science

➜ Tradi<onal View: Scien<sts… Engineers… create knowledge apply that knowledge study the world as it is seek to change the world are trained in scien<fic method are trained in engineering design use explicit knowledge use tacit knowledge are thinkers are doers

➜ More realis<c View Scien<sts… Engineers… create knowledge create knowledge are problem-‐driven are problem-‐driven seek to understand and explain seek to understand and explain design experiments to test theories design devices to test theories prefer abstract knowledge prefer con<ngent knowledge but rely on tacit knowledge but rely on tacit knowledge

Both involve a mix of design and discovery

5



Scien<fic Method

➜  No single “official” scien<fic method

➜  Somehow, scien<sts are supposed to do this:

World Theory

Observation

Validation



Observe!

6



Scien<fic Inquiry

Prior Knowledge (Initial Hypothesis)

Observe (what is wrong with the current theory?)

Theorize (refine/create a better theory)

Design (Design empirical tests

of the theory)

Experiment (manipulate the variables)



7



Some Characteris<cs of Science

➜ Science seeks to improve our understanding of the world.

➜ Explana<ons are based on observa<ons Ä Scien<fic truths must stand up to empirical scru<ny Ä Some<mes “scien<fic truth” must be thrown out in the face of new findings

➜ Theory and observa<on affect one another: Ä Our percep<ons of the world affect how we understand it Ä Our understanding of the world affects how we perceive it

➜ Crea<vity is important Ä Theories, hypotheses, experimental designs Ä Search for elegance, simplicity



All Methods are flawed

➜ E.g. Laboratory Experiments Ä Cannot study large scale sorware development in the lab! Ä Too many variables to control them all!

➜ E.g. Case Studies Ä How do we know what’s true in one project generalizes to others? Ä Researcher chose what ques<ons to ask, hence biased the study

➜ E.g. Surveys Ä Self-‐selec<on of respondents biases the study Ä Respondents tell you what they think they ought to do, not what they actually do

➜ …etc...

8



Strategies to overcome weaknesses

➜ Theory-‐building Ä Tes<ng a hypothesis is pointless (single flawed study!)… Ä …unless it builds evidence for a clearly stated theory

➜ Empirical Induc<on Ä Series of studies over <me… Ä Each designed to probe more aspects of the theory Ä …together build evidence for a clearly stated theory

➜ Mixed Methods Research Ä Use mul<ple methods to inves<gate the same research ques<on Ä Each method compensates for the flaws of the others Ä …together build evidence for a clearly stated theory


© 2012 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license. 16 Source: http://xkcd.com/882/"

9



What is a research contribu<on?

➜ A beder understanding of how people use sorware technology?

➜  Iden<fica<on of problems with the current state-‐of-‐the-‐art?

➜ A characteriza<on of the proper<es of new tools/techniques?

➜ Evidence that approach A is beder than approach B?

How will you validate your claims?"



Meet Stuart Dent

➜ Name: Ä Stuart Dent (a.k.a. “Stu”)

➜ Advisor: Ä Prof. Helen Back

➜ Topic: Ä Merging Stakeholder views in Model Driven

Development

➜ Status: Ä 2 years into his PhD Ä Has built a tool Ä Needs an evalua<on plan

10



Stu’s Evalua<on Plan ➜  Formal Experiment

Ä  Independent Variable: Stu-‐Merge vs. Ra<onal Architect Ä Dependent Variables: Correctness, Speed, Subjec<ve Assessment Ä Task: Merging Class Diagrams from two different stakeholders’ models Ä Subjects: Grad Students in SE Ä H1: “Stu-‐Merge produces correct merges more oren than RA” Ä H2: “Subjects produce merges faster with Stu-‐Merge than with RA” Ä H3: “Subjects prefer using Stu-‐Merge to RA”

➜ Results Ä H1 accepted (strong evidence) Ä H2 & H3 rejected Ä Subjects found the tool unintui<ve



Threats to Validity

➜ Construct Validity Ä What do we mean by a merge? What is correctness? Ä 5-‐point scale for subjec<ve assessment -‐ insufficient discriminatory power

Ø  (both tools scored very low)

➜  Internal Validity Ä Confounding variables: Time taken to learn the tool; familiarity

Ø  Subjects were all familiar with RA, not with Stu-‐merge

➜ External Validity Ä Task representa<veness

Ø  class models were of a toy problem

Ä Subject representa<veness Ø  Grad students as sample of what popula<on?

➜ Theore<cal Reliability Ä Researcher bias

Ø  subjects knew Stu-‐merge was Stu’s own tool

11



What went wrong?

➜ What was the research ques<on? Ä “Is tool A beder than tool B?”

➜ What would count as an answer?

➜ What use would the answer be? Ä How is it a “contribu<on to knowledge”?

➜ How does this evalua<on relate to the exis<ng literature?



Experiments as Clinical Trials

Is drug A better than drug B?

Why would we expect it to be better?

Better at doing what?

Better in what way?

Better in what situations?

Why do we need to know?

What will we do with the

answer?

12



You goda have a theory!

Why would we expect it to be better?



Some Defini<ons

➜  A model is an abstract representa<on of a phenomenon or set of related phenomena Ä  Some details included, others excluded

➜  A theory is a set of statements that explain a set of phenomena Ä  Serves to explain and predict Ä  Precisely defined terminology Ä  Concepts, rela<onships, causal inferences Ä  (opera<onal defini<ons for theore<cal terms)

➜  A hypothesis is a testable statement derived from a theory Ä  A hypothesis is not a theory!

➜  In CS, we have mostly folk theories

13



A simpler defini<on

A (good) Theory is the best explanation of all the available evidence"



The Role of Theory Building

➜ Theories lie at the heart of what it means to do science. Ä Produc<on of generalizable knowledge

➜ Theory provides orienta<on for data collec<on Ä Cannot observe the world without a theore<cal perspec<ve

➜ Theories allow us to compare similar work Ä Theories include precise defini<on for the key terms Ä Theories provide a ra<onale for which phenomena to measure

➜ Theories support analy<cal generaliza<on Ä Provide a deeper understanding of our empirical results Ä …and hence how they apply more generally Ä Much more powerful than sta<s<cal generaliza<on

14



Stu’s Theory ➜ Background Assump<ons

Ä Large team projects, models contributed by many actors Ä Models are fragmentary, capture par<al views Ä Par<al views are inconsistent and incomplete most of the <me

➜ Basic Theory Ä  (Brief summary:) Ä Model merging is an exploratory process, in which the aim is to discover intended

rela<onships between views. ‘Goodness’ of a merge is a subjec<ve judgment. If an adempted merge doesn’t seem ‘good’, many need to change either the models, or the way in which they were mapped together.

Ä [S<ll needs some work]

➜ Derived Hypotheses Ä Useful merge tools need to represent rela<onships explicitly Ä Useful merge tools need to be complete (work for any models, even if inconsistent)



What type of ques<on are you asking? ➜  Existence:

Ä  Does X exist?

➜  Descrip<on & Classifica<on Ä What is X like? Ä What are its proper<es? Ä  How can it be categorized? Ä  How can we measure it? Ä What are its components?

➜  Descrip<ve-‐Compara<ve Ä  How does X differ from Y?

➜  Frequency and Distribu<on Ä  How oren does X occur? Ä What is an average amount of X?

➜  Descrip<ve-‐Process Ä  How does X normally work? Ä  By what process does X happen? Ä What are the steps as X evolves?

➜  Rela<onship Ä  Are X and Y related? Ä  Do occurrences of X correlate with occurrences

of Y?

➜  Causality Ä  Does X cause Y? Ä  Does X prevent Y? Ä What causes X? Ä What effect does X have on Y?

➜  Causality-‐Compara<ve Ä  Does X cause more Y than does Z? Ä  Is X beder at preven<ng Y than is Z? Ä  Does X cause more Y than does Z under one

condi<on but not others?

➜  Design Ä What is an effec<ve way to achieve X? Ä  How can we improve X?

15



What type of ques<on are you asking? ➜  Existence:

Ä  Does X exist?

➜  Descrip<on & Classifica<on Ä What is X like? Ä What are its proper<es? Ä  How can it be categorized? Ä  How can we measure it? Ä What are its components?

➜  Descrip<ve-‐Compara<ve Ä  How does X differ from Y?

➜  Frequency and Distribu<on Ä  How oren does X occur? Ä What is an average amount of X?

➜  Descrip<ve-‐Process Ä  How does X normally work? Ä  By what process does X happen? Ä What are the steps as X evolves?

➜  Rela<onship Ä  Are X and Y related? Ä  Do occurrences of X correlate with occurrences

of Y?

➜  Causality Ä  Does X cause Y? Ä  Does X prevent Y? Ä What causes X? Ä What effect does X have on Y?

➜  Causality-‐Compara<ve Ä  Does X cause more Y than does Z? Ä  Is X beder at preven<ng Y than is Z? Ä  Does X cause more Y than does Z under one

condi<on but not others?

➜  Design Ä What is an effec<ve way to achieve X? Ä  How can we improve X?

Exploratory

Baserate

Correlation

Design

Causal Relationship



Stu’s Research Ques<on(s) ➜ Existence

Ä Does model merging ever happen in prac<ce?

➜ Descrip<on/Classifica<on Ä What are the different types of model merging that occur in prac<ce on large scale

systems?

➜ Descrip<ve-‐Compara<ve Ä How does model merging with explicit representa<on of rela<onships differ from model

merging without such representa<on?

➜ Causality Ä Does an explicit representa<on of the rela<onship between models cause developers to

explore different ways of merging models?

➜ Causality-‐Compara<ve Ä Does the algebraic representa<on of rela<onships in Stu’s tool lead developers to explore

more than do pointcuts in AOM?

Pick just one for now…"

16



Puyng the Ques<on in Context

The Research Question

How does this relate to the established literature?

What new perspectives are you bringing to this field?

What methods are appropriate for answering this question?

Existing Theories

New Concepts

Methodological Choices Empirical Method

Data Collection Techniques

Data Analysis Techniques

Philosophical Context Positivist Constructivist

Critical theory Pragmatist

What will you accept as valid truth?



Many available methods…   Common “in the lab” Methods

●  Controlled Experiments

➜  Ra<onal Reconstruc<ons

➜  Exemplars

➜  Benchmarks

➜  Simula<ons

  Common “in the wild”

Methods

●  Quasi-Experiments ●  Case Studies ●  Survey Research ●  Ethnographies ●  Action Research

❍  Artifact/Archive Analysis (“mining”!)

17



Stu’s Method(s) Selec<on… ➜  Existence

Ä  Does model merging ever happen in prac<ce?

➜  Descrip<on/Classifica<on Ä What are the different types of model merging that

occur in prac<ce on large scale systems?

➜  Descrip<ve-‐Compara<ve Ä  How does model merging with explicit representa<on

of rela<onships differ from model merging without such representa<on?

➜  Causality Ä  Does an explicit representa<on of the rela<onship

between models cause developers to explore different ways of merging models?

➜  Causality-‐Compara<ve Ä  Does the algebraic representa<on of rela<onships in

Stu’s tool lead developers to explore more than do pointcuts in AOM?

Case study

Survey Research

Controlled Experiment

Action Research

Ethnography

?

? ?

?

?

?

?

?

?

?

?

?



Warning

No method is perfect

Don’t get hung up on methodological purity

Pick something and get on with it

Some knowledge is better than none

18

35


© 2004-5 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license.

Okay, but…



Why Build a Tool?

➜ Build a Tool to Test a Theory Ä Tool is part of the experimental materials needed to conduct your study

➜ Build a Tool to Develop a Theory Ä Theory emerges as you explore the tool

➜ Build a Tool to Explain your Theory Ä Theory as a concrete instan<a<on of (some aspect of) the theory

?

csc2130:( empirical(research(methods(for(computer( sciensts(sme/csc2130/01-intro.pdf · this...

Documents