slides for on the usability of spoken dialogue …kom.aau.dk/~lbl/phd/slides_for_on_the_usability_of...

On the Usability of Spoken Dialogue Systems
Presentation of Ph.D. thesis by Lars Bo Larsen, Aalborg University, Sep. 12, 2003 (33 slides)


Page 1: On the Usability of Spoken Dialogue Systems

Presentation of Ph.D. thesis by Lars Bo Larsen
Aalborg University, Sep. 12, 2003

Page 2: Overview

• Introduction – background
• Definition of usability
• The OVID project
• Objective measures
• Turn-taking and user initiatives
• Perceived and observed task success
• Subjective measures
• Questionnaires for measuring user satisfaction
• Factor Analysis
• Combined analysis using the Paradise scheme
• Summary and conclusions

Page 3: Background

This work has been carried out in two phases:
• The first phase was the experimental work carried out in the Esprit OVID project in 1996-7
• This resulted in a number of reports and publications in 1997-99
• Another important result was a fully annotated dialogue corpus
• The second, more recent phase was in 2002-3, where the results were verified and analysed from a methodical point of view, and new analyses were carried out on the corpus
• Most notably, a new paradigm (called Paradise) had been proposed since the original OVID work
• In between the two phases I mainly worked in the area of multimodal systems and teaching.

Page 4: The Goals of the OVID Project

OVID Technical Annex:
“The partners intend to approach the work via a series of controlled usability trials of the software in a realistic banking service with real bank customers. The results will be an assessment of how bank customers are able to use the automated service without training in its use, to design an optimal user interface dialogue which can accommodate the untrained user.”

Page 5: Usability

ISO’s definition of usability:
• Effectiveness: Accuracy and completeness with which users achieve specified goals.
• Efficiency: Resources expended in relation to the accuracy and completeness with which users achieve goals.
• Satisfaction: Freedom from discomfort, and positive attitudes towards the use of the product.

Usability: “extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.”

Page 6: Usability of Spoken Dialogue Systems (SDS)

The general definition of usability and the associated attributes are of course also valid for SDS. However:
• Due to the complexity of the input processing and the non-persistence of speech, special attention must be paid to:
  • The learnability and memorability of speech-based interfaces
  • The transparency and error-handling capability

For this reason, the methods that have been developed and are well-proven for traditional interfaces cannot be used directly for speech.

Page 7: Usability Measures

Two orthogonal categories of usability measures must be captured simultaneously:

Objective measures: To evaluate the effectiveness and efficiency of the system:
• Observed values of e.g. time to complete tasks, task success rates, error rates, number of help messages, number of user barge-ins, etc.
• Objective measures are directly observable

Subjective measures: To evaluate user preferences:
• User satisfaction, the user’s attitudes towards the overall system, or particular aspects of it
• User attitudes cannot be observed directly - you must ask the users
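
As an illustration of how the objective measures listed above could be derived from logged dialogues, here is a minimal Python sketch. The `DialogueLog` record, its field names and the sample values are hypothetical and only serve the example; they are not taken from the OVID corpus.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DialogueLog:
    """One logged dialogue (hypothetical record layout)."""
    duration_s: float        # time to complete the dialogue, in seconds
    task_completed: bool     # observed task success
    turns: int               # number of user turns
    recognition_errors: int  # misrecognised user turns
    help_messages: int       # help prompts played by the system
    barge_ins: int           # times the user interrupted a prompt


def objective_measures(logs):
    """Aggregate directly observable measures over a set of dialogues."""
    n = len(logs)
    total_turns = sum(d.turns for d in logs)
    return {
        "mean_duration_s": mean(d.duration_s for d in logs),
        "task_success_rate": sum(d.task_completed for d in logs) / n,
        "error_rate_per_turn": sum(d.recognition_errors for d in logs) / total_turns,
        "help_messages_per_dialogue": sum(d.help_messages for d in logs) / n,
        "barge_ins_per_dialogue": sum(d.barge_ins for d in logs) / n,
    }


# Made-up sample data, for illustration only.
logs = [
    DialogueLog(120.0, True, 10, 1, 0, 2),
    DialogueLog(180.0, False, 14, 3, 1, 0),
    DialogueLog(90.0, True, 8, 0, 0, 3),
    DialogueLog(150.0, True, 8, 0, 1, 1),
]
measures = objective_measures(logs)
```

None of these values requires asking the user anything, which is what separates them from the subjective measures.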

Page 8: Requirements to the Dialogue

Based on interviews with banking personnel, the functionality of the home bank was chosen to be:
• Provide balance and information on movements for user accounts.
• User must provide Id and PIN codes for access
• The user must be (or feel) in control of the dialogue
• The service must be equally acceptable to users regardless of gender, age and accent

Furthermore, the dialogue must accommodate the untrained user

Page 9: The Overall Dialogue Model

[Diagram: the overall dialogue model, with the states Id-number, Access Code, Main, Balance and Mini Stat.]

Page 10: A Mixed-Initiative Dialogue Model with Short-Cuts

[Diagram: dialogue task structure for the Main, Balance and Mini Stat tasks. Prompts shown: “Balance or Mini Stat?”, “For which account?”, “Provide balance”, “More accounts?”, “For which account?”, “Provide Mini-Stat”, “More accounts?”, “Wish to continue?”. Legend: system-directed transitions and user-initiated transitions.]
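
A mixed-initiative model of this kind can be sketched as a small state machine in which the system-directed path is always available and the user may also jump ahead. The Python below is only a rough sketch: the state names follow the diagram labels, but the exact transitions, the input labels and in particular the set of short-cuts are assumptions made for illustration, not the transition table of the OVID system.

```python
# System-directed transitions: (state, recognised user input) -> next state.
SYSTEM_DIRECTED = {
    ("id_number", "ok"): "access_code",
    ("access_code", "ok"): "main",
    ("main", "balance"): "balance",
    ("main", "mini_stat"): "mini_stat",
    ("balance", "more_accounts"): "balance",
    ("balance", "done"): "continue",
    ("mini_stat", "more_accounts"): "mini_stat",
    ("mini_stat", "done"): "continue",
    ("continue", "yes"): "main",
    ("continue", "no"): "end",
}

# User-initiated short-cuts (assumed): the user names another service
# directly instead of answering the current system prompt.
SHORT_CUTS = {
    ("balance", "mini_stat"): "mini_stat",
    ("mini_stat", "balance"): "balance",
    ("continue", "balance"): "balance",
    ("continue", "mini_stat"): "mini_stat",
}


def next_state(state, user_input):
    """Return (next state, True if the transition was user-initiated)."""
    key = (state, user_input)
    if key in SYSTEM_DIRECTED:
        return SYSTEM_DIRECTED[key], False
    if key in SHORT_CUTS:
        return SHORT_CUTS[key], True
    return state, False  # input not understood here: stay and re-prompt


def run(inputs, state="id_number"):
    """Run a dialogue and count the user initiatives taken along the way."""
    initiatives = 0
    for user_input in inputs:
        state, user_initiated = next_state(state, user_input)
        initiatives += user_initiated
    return state, initiatives


final_state, initiatives = run(["ok", "ok", "balance", "mini_stat", "done", "no"])
```

Counting the user-initiated transitions per dialogue yields the kind of figure the turn-taking analysis later in the talk relies on.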

Page 11: Objective Measures on the OVID Corpus

The Corpus:
• 700 transcribed dialogues for 310 users who were requested to carry out two pre-defined scenarios.

Speech I/O Quality:
• Speech (concept) recognition performance

Dialogue Symmetry:
• Turn-taking strategy, in particular how and when users took the initiative in the dialogue

Communication Efficiency:
• Timing parameters for overall and subtask performance

Task Effectiveness:
• Task success rates

Page 12: Objective Measures – Timing

All users were required to carry out two scenarios, A and B

The table shows the average time spent in the login subtasks for the first (A1, B1) and second (A2, B2) dialogues

A paired, two-tailed t-test revealed a significant reduction of the time spent in the “Id_number” subtask when comparing the first to the second dialogue (p = 0.03)
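
For readers unfamiliar with the paired t-test, the sketch below shows how the test statistic is formed from per-user differences between the first and the second dialogue. The timing values are made up for illustration and are not the OVID data; only the t statistic and the degrees of freedom are computed here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """t statistic and degrees of freedom for a paired t-test.

    Each position in `first` and `second` must belong to the same user.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical seconds spent in a login subtask, one entry per user.
first_dialogue = [30, 28, 35, 32, 31]
second_dialogue = [27, 27, 30, 31, 30]
t_stat, dof = paired_t(first_dialogue, second_dialogue)
```

The two-tailed p-value then follows from the t distribution with n - 1 degrees of freedom, for example via `scipy.stats.ttest_rel` where SciPy is available.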

Page 13: Objective Measures – Turn-taking

Analysing the turn-taking strategies uncovered a similar trend: users completed the dialogues with a smaller number of turns in the second dialogue

In particular, users were more willing to take the initiative in the dialogue, the more experienced they became.

[Figure: Average number of user initiatives per dialogue for the “A” and “B” scenarios, for the first and second dialogues.]

An unpaired two-tailed t-test shows a significant (p = 0.02) increase in the number of user initiatives relative to the total number of turns for scenario B2 compared to B1.
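
The comparison above is made on the ratio of user initiatives to total turns, with two independent groups of dialogues. A minimal sketch of such an unpaired test follows. It assumes the classic pooled-variance (Student) form, since the slide does not say which variant was used, and the (initiatives, turns) pairs are invented for the example.

```python
from math import sqrt
from statistics import mean, variance


def initiative_ratio(user_initiatives, total_turns):
    """User initiatives relative to the total number of turns."""
    return user_initiatives / total_turns


def unpaired_t(x, y):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2


# Hypothetical (initiatives, turns) pairs for first and second B dialogues.
b1 = [initiative_ratio(i, t) for i, t in [(1, 12), (0, 10), (2, 14), (1, 11)]]
b2 = [initiative_ratio(i, t) for i, t in [(3, 10), (2, 9), (4, 11), (2, 10)]]
t_stat, dof = unpaired_t(b2, b1)
```

A positive statistic means a higher initiative ratio in the second group. `scipy.stats.ttest_ind` performs the same test and also returns the p-value.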

Page 14: Objective Measures – Task Completion

Perceived versus observed task success:
• Although 96% of the users believed that they had completed both scenarios, only 74% actually did so

The number of failed dialogues fell by almost 50% from the first to the second call (a 25% reduction is significant at the 95% confidence level)

Page 15: Objective Measures – Speech Recognition

[Figure: the intervals show the speech recognition performance experienced by a particular proportion of users.]

A recognition performance of 90% roughly corresponds to one error per dialogue
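
The rule of thumb can be checked with a one-line expectation. The dialogue length used below (about ten recognised user utterances) is an assumption inferred from the statement itself; it is not given on the slide.

```latex
% n: recognised user utterances per dialogue (assumed, n \approx 10)
% a: recognition performance, i.e. the fraction recognised correctly
\[
  \mathrm{E}[\text{errors per dialogue}] = n\,(1 - a)
  \qquad\Rightarrow\qquad
  10 \times (1 - 0.90) = 1
\]
```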

Page 16: Objective Measures - Conclusions

A significant reduction of time for the ID_number subtask was observed when comparing durations of the first and second dialogues.

Analyses of the users’ turn-taking strategy for the first and second calls reveal a significant increase in the users’ tendency to take the initiative in the dialogue.

Task completion rates also showed a significant increase from the first to the second dialogue.

These findings are interpreted as signs of system learnability.

----

No differences were identified for users from different demographic groups (gender, region, age)

Page 17: Subjective Measures

The term user satisfaction is used to denote the degree to which the users are satisfied with, or accept, the system performance.

Contrary to (most) objective measures, information about user satisfaction is not directly observable, but must be obtained by asking the users

Often the user is asked to express his/her attitude towards a number of statements about the system, for example using a so-called Likert attitude questionnaire
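
To make the Likert procedure concrete, the sketch below scores one statement on a 7-point scale and reports the mean with a confidence interval. Three things are assumptions for the example: that negatively worded statements are reverse-coded so that 7 is always the favourable end, that a normal approximation is acceptable for the interval, and the answers themselves, which are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

SCALE_MAX = 7  # 7-point Likert scale


def normalise(score, negative):
    """Reverse-code negatively worded statements so that 7 is always best."""
    return SCALE_MAX + 1 - score if negative else score


def mean_with_ci(scores, confidence=0.98):
    """Mean attitude with a normal-approximation confidence interval."""
    m = mean(scores)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(scores) / sqrt(len(scores))
    return m, m - half_width, m + half_width


# Hypothetical raw answers to a negatively worded statement ("confusing").
confusing_raw = [2, 1, 3, 2, 2, 1, 4, 2]
confusing = [normalise(s, negative=True) for s in confusing_raw]
m, low, high = mean_with_ci(confusing)
```

With small samples a t-based interval would be somewhat wider than this normal approximation.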

Page 18: The OVID Questionnaire with 25 Statements

[Chart: Average User Attitudes with 98% confidence intervals. Attitude axis: 1 to 7. Statements: easy to use, knew what to do, friendliness, confusing, use again, reliability, out of control, like voice, concentration, efficiency, flustered, too fast, under stress, voice clear, frustration, prefer human, too complicated, enjoyment, needs improvement, politeness, security, convenient, confidentiality, remember too much, good value, Average. Legend (Category): Attitude, Domain Dep.]

Page 19: Factor Analysis

Factor Analysis (FA) is used to identify the underlying relationships between the statements
• Mathematically, FA resembles Principal Components Analysis (PCA), but:
• In FA, the factors are perceived as the cause of the observed variable scores, i.e. it is the underlying factor structure that has produced (or caused) the observed variable scores.
• In contrast, for PCA, the components are just perceived as aggregates of the observed variable scores
• Furthermore, in PCA all variance is modelled, whereas in FA only the variance the variables have in common (communalities) is considered
• FA has an element of subjective judgment, since the goal is to arrive at a factor set that will provide an interpretation of the observed data
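
The contrast between FA and PCA can be stated compactly. The notation below is the standard textbook form of the orthogonal factor model and is not taken from the slides.

```latex
% Factor analysis: p observed scores x are caused by m < p latent factors f
\[
  x = \mu + \Lambda f + \varepsilon, \qquad
  \operatorname{Cov}(f) = I, \qquad
  \operatorname{Cov}(\varepsilon) = \Psi \ \text{(diagonal)}
\]
\[
  \Sigma = \Lambda \Lambda^{\top} + \Psi, \qquad
  h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2 \quad \text{(communality of variable } i\text{)}
\]
% PCA: the components are linear aggregates of the observed scores,
% and together they account for all of the variance
\[
  \Sigma = V D V^{\top}, \qquad
  z = V^{\top} (x - \mu), \qquad
  \operatorname{tr}(\Sigma) = \sum_{k=1}^{p} d_k
\]
```

Only the common part \(\Lambda \Lambda^{\top}\) is explained by the factors; the unique variances in \(\Psi\) are left out, which is why the explained variance of a factor solution stays below 100%.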

Page 20: Verification of OVID Factors

OVID Factors (Variance):
• F1: Quality of interface/performance (19%): Use Again, Reliability, Efficiency, Prefer Human, Enjoyment, Needs Improvement
• F2: Cognitive load (13%): Concentration, Speed, Under Stress
• F3: Control/Confusion (9%): Know what was expected, Perceived control, Confusion, Flustered, Too Complicated
• F4: Friendliness (8%): Friendly, Polite
• F5: Voice (8%): Liked Voice, Voice clear
Total Explained Variance: 57%

Original CCIR Factors (Variance):
• F1: Quality of interface/performance (21%): Use Again, Efficiency, Reliability, Needs Improvement
• F2: Cognitive effort and Stress (17%): Speed, Under Stress, Concentration, Perceived control
• F3: Conversational model: Voice, Tone prompts, Friendliness
• F4: Fluency: Voice clarity, Politeness, Know what was expected
• F5: Transparency: Ease of use, Prompt helpfulness, Degree of fluster
Total Explained Variance: 74%

Page 21: Six-Factor Structure

When the five domain dependent statements are added, a sixth factor emerges.

Page 22: Correlating Statement Scores

[Chart: Correlation with user attitude to “Use again”. Correlation axis: 0 to 0.75. Statements: knew what to do, too fast, confusing, polite, like voice, voice clear, concentrate, friendly, remember much, stress, control, confidentiality, security, flustered, needs improvement, prefer human, reliability, complicated, easy to use, Efficiency, Frustration, good value, enjoyment, convenient. Annotation: F3: Convenience.]
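
A chart of this kind can be produced by correlating each statement's scores with the scores for “Use again” across users. The sketch below assumes the Pearson correlation coefficient, which the slide does not name explicitly, and uses made-up answers.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical 7-point answers from six users, one list per statement.
use_again = [6, 7, 5, 3, 6, 2]
scores = {
    "easy to use": [6, 6, 5, 3, 7, 2],
    "too fast": [4, 4, 4, 5, 3, 4],
}
correlations = {name: pearson(values, use_again) for name, values in scores.items()}
ranked = sorted(correlations, key=correlations.get)  # weakest correlation first
```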

Page 23: Subjective Measures - Conclusions

The questionnaire used for the subjective measures was shown to be valid and to produce a factor structure similar to that of the original CCIR questionnaire

When the domain dependent statements were included, the factor structure changed and a new factor, “confidentiality”, emerged

Generally, the users were positive towards the OVID home bank service (average score was 5.6, i.e. between “agree” and “strongly agree”)

Similar to the objective measures, no significant differences between the demographic groups were found

Page 24: Combining Objective and Subjective Measures

• The PARADISE (Paradigm for Dialogue System Evaluation) scheme (proposed by Walker et al. from AT&T in 1997) attempts to combine the subjective and objective measures.
• This is done by estimating a performance function with “usability” as the dependent variable and the objective measures as the independent variables
• The performance equation is modelled using Multiple Linear Regression (MLR)
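
For reference, the general form of the PARADISE performance function as published by Walker et al. is given below. The coefficients estimated for OVID are not reproduced in this transcript, so only the symbolic form is shown.

```latex
% kappa: task success; c_i: dialogue cost measures (e.g. time, number of errors)
% alpha, w_i: weights estimated by multiple linear regression
% N(.): z-score normalisation, so that the weights are comparable
\[
  \text{Performance} = \alpha \cdot \mathcal{N}(\kappa) - \sum_{i=1}^{n} w_i \cdot \mathcal{N}(c_i),
  \qquad
  \mathcal{N}(x) = \frac{x - \bar{x}}{\sigma_x}
\]
```

User satisfaction, obtained from the questionnaire, serves as the target that this weighted combination of task success and costs is fitted to.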

Page 25: The PARADISE Model

Kappa attempts to compensate for differences in the complexity of the dialogues
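
To show how Kappa corrects task success for chance agreement, here is a minimal sketch. It follows the definition used in the PARADISE papers, where chance agreement is computed from the column totals of a confusion matrix. The row and column orientation and the matrix values are assumptions made for the example.

```python
def kappa(matrix):
    """Kappa for a square confusion matrix of attribute values.

    matrix[i][j] counts cases where value i was obtained in the dialogue
    and value j was the correct one according to the scenario key.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_agree = sum(matrix[i][i] for i in range(size)) / total
    column_totals = [sum(row[j] for row in matrix) for j in range(size)]
    p_chance = sum((t / total) ** 2 for t in column_totals)
    return (p_agree - p_chance) / (1 - p_chance)


# Made-up confusion matrix for a task with two possible attribute values.
example = [[45, 5], [5, 45]]
k = kappa(example)
```

Because the chance term grows when there are only a few possible values to choose from, the same raw success rate yields a lower Kappa for a simple task than for a complex one.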

Page 26: Applying PARADISE to the OVID Corpus

The correlation of the independent variables was checked before applying MLR. Only the speech recognition and task success parameters turned out to be significant predictors of usability (represented as the F1-factor group)
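
The regression step can be sketched in plain Python. The code below z-scores all variables and solves the normal equations for the weights. It is a bare-bones illustration under those assumptions, with invented data, and not the analysis pipeline used for OVID. Strongly correlated predictors make the normal equations ill-conditioned, which is one reason for checking the correlations first.

```python
from statistics import mean, pstdev


def zscores(values):
    """Normalise to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(predictors, response):
    """Least-squares weights for z-scored predictors and response.

    No intercept is needed because every z-scored variable has zero mean.
    """
    xs = [zscores(p) for p in predictors]
    y = zscores(response)
    xtx = [[sum(a * b for a, b in zip(xi, xj)) for xj in xs] for xi in xs]
    xty = [sum(a * b for a, b in zip(xi, y)) for xi in xs]
    return solve(xtx, xty)


# Hypothetical per-user values: task success (Kappa), recognition
# performance, and the satisfaction score to be predicted.
kappa_values = [0.9, 0.6, 0.8, 0.3, 1.0, 0.5]
recognition = [0.95, 0.85, 0.80, 0.70, 0.90, 0.90]
satisfaction = [6.1, 5.0, 5.4, 3.8, 6.3, 5.2]
weights = fit_mlr([kappa_values, recognition], satisfaction)
```

The resulting weights are standardised coefficients, so their magnitudes can be compared directly across predictors.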

Page 27: The Regression

Page 28: The Resulting Performance Function

The resulting performance function for the OVID experiment, compared with similar PARADISE analyses by Walker et al. at AT&T.

Page 29: Estimation of the User Attitudes

[Chart: Observed and Estimated User Attitudes. Horizontal axis: Users (5 to 30). Vertical axis: User Attitude (F1), 1 to 7. Series: Observed, Estimated, +95% Conf., -95% Conf.]

For verification of the model, the identified parameters are used to estimate the user satisfaction (red line). It is clear that only half of the variance of the observed values (blue) is captured.
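
The share of variance captured by the estimate is the coefficient of determination, R². A short sketch with made-up observed and estimated attitude scores, not the OVID values:

```python
def r_squared(observed, estimated):
    """Fraction of the variance in `observed` explained by `estimated`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical attitude scores for four users.
observed = [4, 5, 6, 7]
estimated = [5, 5, 6, 6]
r2 = r_squared(observed, estimated)
```

A value near 0.5 corresponds to the situation described above, where the model explains about half of the variation in the observed attitudes.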

Page 30: Conclusion on the PARADISE Results

The important question is of course whether any new information was revealed.
• It is hardly surprising that a relationship between ASR performance, task success and user satisfaction can be observed.
• Kappa proved to be a better predictor of usability than a simpler ratio of completed sub-goals. The main function of Kappa is to normalise for task complexity, which in this case it seems to have done.
• There is an (almost surprisingly) good correspondence between the OVID results and those obtained by AT&T
• PARADISE is limited by the requirement for well-defined scenario-based dialogues and a linear relationship between performance measures and usability

Page 31: Summary

• Certain usability aspects must receive special attention, due to the nature of speech, most notably transparency and learnability
• The requirements set up for the OVID dialogue have to a large degree been met. (Exception: Speed)
• The learnability of the OVID dialogue has been demonstrated through measures of the timing and turn-taking strategy
• The validity of the questionnaire used for OVID has been established through factor analysis
• A PARADISE analysis confirmed that speech recognition and task success are important for user satisfaction, and a high correspondence with results obtained elsewhere is shown.
• The important topic of multimodal user interaction and the issue of memorability have not been addressed in this work

Page 32: What is in the Future for Speech?

• Speech as a modality is in a highly competitive “market”, and must simply be better than any other option for people to use it
• Many envisioned “killer applications”, e.g. phone-based home banking, have been taken over by the Web (e.g. 38% of Danish internet users used internet home banking regularly by 2002, while none used speech)
• The methods for measuring user satisfaction have to a large degree been overlooked by the speech community and must receive more attention if speech-based interfaces are to be successful
• Much focus has been on naturalness and user control, but really without any hard proof that this actually leads to higher user satisfaction – learnability might be just as important
• The focus on mobility might provide a breakthrough for speech, especially in combination with other modalities

Page 33: and Finally…

I wish to thank all those of my colleagues at CPK and the OVID team who have helped me in this work, either directly or by taking over some of my other tasks.

I also wish to thank my family for their support

Last, I wish to thank you all for coming here and listening to what I had to say