harry collins - testing machines as social prostheses - eurostar 2013

Harry Collins, Cardiff University Testing Machines as Social Prostheses www.eurostarconferences.com @esconfs #esconfs

Upload: eurostar-software-testing-conference

Post on 06-Aug-2015


Cardiff School of Social Sciences

When the clockwork goes wrong

Even cogs are complicated

I have discovered that there is a nice sociological

question even about whether the cogs mesh

It happened after my group commissioned a piece

of software

Whose problem?

Users

Developers

The firm demanded their money, threatening legal

action.

AGILE!!

SPRINTS!!

agile? sprints?

A new sociological problem

I could never have imagined something like

this could happen.

The question of when a program is ‘working’

would make a great PhD project.

And, to be interesting, there would be no

need to look further than who it is who

says the cogs are working

The debugger’s regress

The deeper problem

But there is a deeper problem: does the

machine do a good job?

This question is confounded by another:

Should a computer do what humans in the

same place might do (only better) or should

it do something different?

What is the proper relationship between

machines and people?

HAL 2001

Dave ... I’m afraid I can’t let you do that

ASH Alien

Popular culture

We are surrounded by scare stories of

computers coming to rule us and an easy

anthropomorphism in fiction.

The differences between humans and

machines seem subtle and complicated.

RACHAEL Blade Runner

Mr DATA Star Trek

Our attention is drawn away from

fundamental but very simple aspects

of the relationship that, once pointed

out, can be seen in the familiar

devices we use every day

And show us that humanoids are

fantasy

Artificial intelligence

Humans are social

Foreseeable computers, science fiction

aside, are not.

Thus, for the foreseeable future, computers

will not be able to handle a social

phenomenon like natural language in a

human-like way

Spell-checker

This is clear even in a spell-checker:

Turing Test and Imitation Game

COMPUTER / JUDGE / HUMAN PARTICIPANT

Turing Test

MAN PRETENDS TO BE WOMAN / JUDGE? / WOMAN

Imitation Game

Unsurprisingly, people do not agree over whether

computers can handle natural language. Consider

the Turing Test

ELIZA

When will a computer pass the Turing Test?

An interview with Eric Schmidt, Executive Chairman, Google

August 14, 2013

“Many people in AI believe that we’re close to [a computer passing the Turing

Test] within the next five years,” said Eric Schmidt, Executive Chairman,

Google, speaking at The Aspen Institute on July 16, 2013.

Artificially Intelligent Game Bots Pass the Turing Test on Turing’s

Centenary

Sept. 26, 2012 AUSTIN, Texas —

An artificially intelligent virtual gamer created by computer scientists at The

University of Texas at Austin has won the BotPrize by convincing a panel of

judges that it was more human-like than half the humans it competed

against.

KURZWEIL IS CONFIDENT MACHINES WILL PASS TURING TEST BY 2029

In 1972 experienced psychologists interviewed human paranoid

schizophrenics and ‘PARRY’, a computer designed to generate typical

paranoid text. 33 psychiatrists were shown transcripts of the conversations

but could do no better than guesswork in identifying human and machine

(48%). It became fashionable to claim that The Turing Test was too easy.

Arithmetic

But even arithmetic is embedded in the

social:

Consider the following arithmetical series

that appears to have a definitive

continuation that is nothing to do with the

social

2,4,6,8, …

But ‘reasonable’ continuations could be any number

"10 " (2,4,6,8,10,…)

" 2 " (2,4,6,8,2,4,6,8,…)

" 8 " (2,4,6,8,8,6,4,2,2,4,6,8,…)

" 4 " (2,4,6,8,4,6,8,10,6,8,10,12,…)

" 6 " (2,4,6,8,6,8,10,12,10,12,14,16,…)

" 1 " (2,4,6,8,1,3,5,7,-1,1,3,5,…)

" 3 " (2,4,6,8,3,5,7,9,4,6,8,10,…)

" 5 " (2,4,6,8,5,7,9,11,...)

Or even

" Who "

2,4,6,8, Who do we appreciate?
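The point that the first four terms do not fix the fifth can be illustrated with a short Python sketch. The rule names and functions here are hypothetical illustrations, not anything shown in the talk. Each rule reproduces 2, 4, 6, 8 and then continues in its own way, matching some of the continuations listed above.

```python
from itertools import islice

def count_up():
    """2, 4, 6, 8, 10, ... : keep adding two."""
    n = 2
    while True:
        yield n
        n += 2

def repeat_block():
    """2, 4, 6, 8, 2, 4, 6, 8, ... : repeat the first four terms."""
    while True:
        yield from (2, 4, 6, 8)

def shifted_blocks(step):
    """Blocks of four terms spaced by two; each block starts `step` higher."""
    start = 2
    while True:
        yield from (start, start + 2, start + 4, start + 6)
        start += step

# Hypothetical rule names mapped to generator factories.
RULES = {
    "count_up": count_up,
    "repeat_block": repeat_block,
    "shift_by_1": lambda: shifted_blocks(1),  # 2,4,6,8,3,5,7,9,...
    "shift_by_2": lambda: shifted_blocks(2),  # 2,4,6,8,4,6,8,10,...
    "shift_by_4": lambda: shifted_blocks(4),  # 2,4,6,8,6,8,10,12,...
}

def fifth_terms():
    """Return the fifth term each rule proposes after the shared prefix."""
    results = {}
    for name, make_sequence in RULES.items():
        terms = list(islice(make_sequence(), 5))
        assert terms[:4] == [2, 4, 6, 8]  # every rule agrees on the prefix
        results[name] = terms[4]
    return results
```

Nothing in the code says which rule is the 'right' one; that choice comes from outside the arithmetic.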

Socialisation in the classroom

IQ Tests?

2010

One of the ways in which we

develop our ‘collective tacit

knowledge’

So how does any machine such as a pocket calculator work?

How can it be that there are machines

without social understandings – without

any tacit knowledge – that do scientific

tasks that seem to depend on social

understandings?

There are two answers

1st answer: I ‘repair’ my socially deficient calculator

My height is 69 inches

There are 2.54 centimetres

to the inch

How tall am I in centimetres?

Right or wrong?
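A minimal Python sketch of this 'repair', under stated assumptions: the calculator only multiplies, and deciding how many digits are worth reporting is left to the human. The function names and the convention of rounding to the nearest whole centimetre are assumptions made for illustration.

```python
CM_PER_INCH = 2.54  # conversion factor quoted on the slide

def calculator_answer(height_inches):
    """What an asocial calculator returns: the bare product."""
    return height_inches * CM_PER_INCH

def repaired_answer(height_inches):
    """The human 'repair' (an assumed convention): report the height to the
    nearest whole centimetre, since 69 inches was itself a rounded figure."""
    return round(calculator_answer(height_inches))

raw = calculator_answer(69)     # roughly 175.26
repaired = repaired_answer(69)  # whole centimetres
```

The calculator has no way of knowing that the extra decimal places are out of place in everyday talk about height; the rounding step is where the social judgment enters.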

‘Repair’ makes

ELIZA and PARRY successful

The ability to approximate is somewhere between a ubiquitous

expertise and a specialist expertise

2007

2nd answer: I undertake ‘mimeomorphic actions’

Sometimes humans want to

act as though they

were not social

creatures.

Sometimes we want to

do things in the manner

of asocial machines.

These things are called

Mimeomorphic actions

1998

Examples of mimeomorphic actions

Synchronised

Swimming

Marching

Saluting

Rapid repetition

I’m not a pheasant plucker

I’m a pheasant plucker’s son

And I’m only plucking pheasant

’til the pheasant pluckers come

Mimeomorphic and Polimorphic actions

Mimeomorphic actions are actions that can be reproduced merely by observing and repeating the externally visible behaviours associated with an action, even if that action is not understood.

A stranger or an

artificial stranger (a

machine) can mimic a

mimeomorphic action

With polimorphic actions there is no easy mapping between behaviour and action.

To reproduce a polimorphic action the social embedding of the action must be understood.

Polimorphic and Mimeomorphic actions

Polimorphic actions: actions that can be, and

often must be, ‘many-shaped’ and the shape of

which varies according to the society (polis).

Also the same behaviour can be different

actions

For example, greeting (as opposed to saluting)

Hello

Darling

Greetings are not mimeomorphic

Hello

Darling

What computers can do

Computers are very good at mimicking

mimeomorphic actions. Mostly they are better

than us at these things and we employ

computers, and other machines, to do them for

us where we can.

Polimorphic actions, however, are beyond the

capacity of foreseeable computers. It is we who

have to supply the surrounding penumbra of the

polimorphic.

This is repair etc. E.g. approximating is repair,

making the spell-check decisions is repair

To save misunderstanding

The argument applies equally to learning

machines, neural nets, etc.

They are just very complicated mimickers of

mimeomorphic actions, they are not

embedded in social life.

The nearest thing to socialised software is

programs that continually learn from text on the web

Hello

Darling

To know how to do these things properly

depends on collective tacit knowledge

Social prostheses

To put this another way, a computer, or other machine, is a

social prosthesis.

It is something that fills the place of a missing part in a

social setting.

But a prosthesis does not have to be identical to the original

part

My prostheses

Understanding computers

Testing computers is seeing how they fit into

life

This means understanding the boundary

between the places where they mimic

mimeomorphic actions (or exceed human

capacity for executing mimeomorphic

actions) and the polimorphic contribution

of the humans that surround them

The mimeomorphic aspect is testing whether the cogs spin right

The polimorphic aspect is the interface with society

The importance of the boundary

Understanding the interface might well mean

making a prosthesis that tries to do less

rather than more and leaves more of the

job to the humans

E.g. early spell checkers tried to do too much

– they tried to replace the word rather than

indicate a problem and offer a choice

Same with early medical expert systems:

advice systems are better
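The difference between replacing a word and offering a choice can be sketched in a few lines of Python. This is a hedged illustration, not a description of any real spell checker: the function name `flag_unknown_words`, the toy word list and the similarity cutoff are all assumptions. The suggestions come from the standard library's `difflib.get_close_matches`.

```python
import difflib

def flag_unknown_words(text, vocabulary, max_suggestions=3):
    """Return (word, suggestions) pairs for words not in the vocabulary.

    The text itself is never altered: the human decides whether and how
    to correct each flagged word.
    """
    known = {w.lower() for w in vocabulary}
    flagged = []
    for word in text.split():
        token = word.strip(".,;:!?\"'").lower()
        if token and token not in known:
            suggestions = difflib.get_close_matches(
                token, sorted(known), n=max_suggestions, cutoff=0.6
            )
            flagged.append((token, suggestions))
    return flagged

VOCABULARY = ["i", "saw", "the", "cat", "then", "tea"]  # toy word list (assumed)
original = "I saw teh cat"
report = flag_unknown_words(original, VOCABULARY)
```

The machine does the mimeomorphic part (string comparison against a list); choosing among the suggestions, or rejecting all of them, stays with the person who knows what was meant.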

It seems to me

that computer testing means understanding

the boundary between the mimeomorphic

and the polimorphic and educating users

and designers about how a good boundary

can be accomplished without being too

ambitious

To fulfil that role as well as possible, the

sociology and philosophy will also have to

be understood

For example

The way that polimorphic

actions turn on the tacit

knowledge of social life

1998

Periodic Table of Expertises

2007

How ‘interactional expertise’ can

capture the tacit knowledge associated

with practices and skills even if one

cannot practise them oneself, so that in

designing software either oneself or

one’s ‘agent’ must possess it

1. UBIQUITOUS EXPERTISES

2. SPECIALIST EXPERTISES

   UBIQUITOUS TACIT KNOWLEDGE: Beer-mat Knowledge; Popular Understanding; Primary Source Knowledge

   SPECIALIST TACIT KNOWLEDGE: Interactional Expertise; Contributory Expertise

3. META-EXPERTISES

   EXTERNAL (Transmuted expertises): Ubiquitous Discrimination; Local Discrimination

   INTERNAL (Non-transmuted expertises): Technical Connoisseurship; Downward Discrimination; Referred Expertise

And which of

Relational tacit knowledge

Somatic tacit knowledge

Collective tacit knowledge

can be made explicit and

coded and which cannot

2010

And how to use the Imitation Game to learn about tacit knowledge

MASQUERADE

A

A not A

A

THE END

Interactional Expertise and Imitation Games

COMPUTER / JUDGE / HUMAN PARTICIPANT

MAN PRETENDS TO BE WOMAN / JUDGE? (Should be a woman) / WOMAN

HARRY COLLINS PRETENDS TO BE GW PHYSICIST / JUDGE ALSO GW PHYSICIST / GW PHYSICIST

The blind

Q2) Is a spherical resonant mass detector equally sensitive to radiation from all over the sky?

A2) Yes, unlike cylindrical bar detectors which are most sensitive to gravitational radiation coming from a direction perpendicular to the long axis.

B2) Yes it is.

Q3) State if after a burst of gravitational waves pass by, a bar antenna continues to ring and mirrors of an interferometer continue to oscillate from their mean positions? (only motion in the relevant frequency range is important).

A3) Bars will continue to ring, but the mirrors in the interferometer will not continue to oscillate.

B3) Bars continue to ring; the separation of interferometer mirrors, however, follows the

pattern of the wave in real time.

Q5) A theorist tells you that she has come up with a theory in which a circular ring of particles are displaced by GW so that the circular shape remains the same but the size oscillates about a

mean size. Would it be possible to measure this effect using a laser interferometer?

A5) Yes, but you should analyse the sum of the strains in the two arms, rather than the difference.

You don't even need two arms to detect GWs, provided you can measure the round-trip light travel time along a single arm accurately enough to detect

small changes in its length.

B5) It depends on the direction of the source. There will be no detectable signal if the source lies anywhere on the plane which passes through the

center station and bisects the angle of the two arms. Otherwise there will be a signal, maximised when the source lies along one or other of the two arms.

Q6) Imagine the mirrors of an interferometer are equally but oppositely (electrically) charged. Could the effect of a radio-wave on the interferometer be the same as a gravitational wave?

A6) In principle you could detect the passage of an electromagnetic (EM) wave, but the effect is

different than for a GW. Unlike EM waves, GWs produce quadrupolar deformations. A typical EM wave would change the distance in only one arm

while a typical GW wave would change the distances (in opposite ways) in both, so the differential signal

for the EM wave would be half that for a GW.

B6) Since gravitational waves change the shape of spacetime and radio waves do not, the effect on an interferometer of radio waves can only be to mimic the effects of a gravitational wave, not reproduce

them. An EM wave could, however, produce noise which could be mistaken for a GW under the

circumstances described.

NATURE, 6 July 2006

Experimental configurations

            JUDGES     IMITATES
IDENTIFY    Blind      Sighted
CHANCE      Sighted    Blind

RESPONDENT 1 / JUDGE / RESPONDENT 2 / 4 PHASE 2 JUDGES

JUDGE: So let me start with sport. Are you interested in tennis and do you ever watch it on the television?

RESPONDENT 1: I watch Wimbledon a little bit on the television and occasionally the Australian Open in January

RESPONDENT 2: I like tennis but only watch big tournaments like Wimbledon

JUDGE: So tell me what you think about the Hawk-Eye line judging system

RESPONDENT 1: Not being a tennis professional it is not for me to say if it should or should not be used. It does not really alter viewing

RESPONDENT 2: It adds an other element to the game which could make it more interesting

JUDGE: But I want to know whether you think that the umpire or the players could ever make a better judgment than Hawk-Eye

RESPONDENT 1: I assume it’s the same technology in cricket and in cricket, Hawk-Eye is between two and four mm out. If it is the same for tennis, then it is probably still more accurate than the human eye. If the players are happy with it and the umpires are happy with it then they should continue using Hawk-Eye

RESPONDENT 2: There is always a degree of uncertainty with both people and technology

JUDGE: How accurately would you say a human can judge the flight of a tennis-ball? I mean, would you say they could tell the difference between touch the line and 1mm out 2mm out 1 cm out, 2 cm out, or what, and what would it depend on?

RESPONDENT 1: I think often a tennis player is not in a position to judge accurately as they are not usually parallel with the line. I think that if you set up a test for a line judge with two balls one which landed on the line and one which landed 1mm away from the line, I don't think they could tell the difference. If you think how small 1mm is then it would be so hard for them to judge.

RESPONDENT 2: it would depend on the speed the ball was travelling and the position of the judge relative to the line and obviously the closer the ball is the line the harder it would be to make a judgement. So you would have to judge each call on an individual bases as there are a lot of factors.

PHASE 2 JUDGES:

1) I think respondent 1 gives himself away when he discusses the human judgments on the flight of a tennis ball.

2) I cannot believe a sighted person saying that Hawk-eye does not alter the viewing.

3) The Hawk-Eye questions reveal some quite specific information that I don’t think was published in audio media. Also, the story wasn’t that important that I’d expect it to be picked up by the audio news services provided to the blind.

4) person 2 seems really unfamiliar with hawk-eye, given that they say they watch Wimbledon

Qualitative data

Imitation Game tests with the blind

Quantitative data

[Chart: Blind condition and Sighted condition (IDENTIFY and CHANCE), showing net right guesses, net wrong guesses and "Don’t know" equivalents. Values shown on the slide: 2, 12, 49, 7; 0.86, 0.13; Blind p=0.0000. Identify condition on right.]

Pass Rates 14% and 87%

IR = Proportion net correct guesses (right-wrong)

Not-identified: 14%, 87%

              COLOR-BLIND   P’FECT PITCH   BLIND   SEXUALITY   RELIGION   GENDER f m   GENDER old young
Chance PR     95%           100%           87%     100%        100%       90%          100%
Identify PR   67%           27%            14%     56%         32%        84%          72%
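The relation between the proportion of net correct guesses and the pass rates quoted for the tests with the blind (0.86 with 14%, 0.13 with 87%) can be sketched as below. This is an illustration under assumptions: the slides only imply that the pass rate is one minus that proportion, and both the function names and the choice to count "don't know" judgments in the denominator are assumed here.

```python
def identification_ratio(right, wrong, dont_know):
    """Net correct guesses (right minus wrong) as a share of all judgments.

    Including "don't know" judgments in the denominator is an assumption.
    """
    total = right + wrong + dont_know
    if total == 0:
        raise ValueError("no judgments recorded")
    return (right - wrong) / total

def pass_rate(ir):
    """Share of pretenders not identified, taken here as 1 - IR."""
    return 1.0 - ir

# Proportions quoted on the slides for the tests with the blind.
identify_pr = pass_rate(0.86)  # about 0.14
chance_pr = pass_rate(0.13)    # about 0.87

# Hypothetical counts for illustration only (not data from the study).
example_ir = identification_ratio(right=30, wrong=10, dont_know=10)
```

A high pass rate means the judges did little better than guessing; a low one means the pretenders were usually spotted.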

New method for comparative social analysis

+ ethnicity

Proposed European

comparative project

+ South Africa

How we play the game now Step 1

Judge

Pretender

Non-Pretender

If you are player A you start by playing the judge role and then you

switch between all three roles as convenient

You play with:

B as Pretender and C as Non-Pretender

D as Judge and E as Non-Pretender

F as Pretender and G as Judge

You communicate with a computer program which controls the

games and links all the right players together as they switch from

role to role. You don’t see the players in dashed boxes.

PARTICIPANT has target expertise

JUDGE has target expertise

PARTICIPANT pretends to have target expertise

How we play the game now

X c24

c200 NEW PRETENDER ANSWERS

24 SETS OF NON-PRETENDER ANSWERS

24 sets of questions

c200 NEW DIALOGUES

DISCARD

c200 NEW JUDGMENTS

S1, S2, S3, S4

FILTER

Primary Source Knowledge

Pharmaceutical Science

Science 4 October 2013: Vol. 342 no. 6154 pp. 60-65

Who's Afraid of Peer Review?

John Bohannon

A spoof paper concocted by Science revealed little or

no scrutiny at many open-access journals.

304 versions of spoof wonder drug paper submitted to

open-access journals. More than half of the journals

(157) accepted the paper, failing to react to its fatal and

‘obvious’ flaws.