is one second enough? evaluating qoe for inter-destination multimedia synchronization using human...

Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation. Benjamin Rainer, Stefan Petscharnig, Christian Timmerer, and Hermann Hellwagner. Alpen-Adria-Universität Klagenfurt (AAU), Faculty of Technical Sciences (TEWI), Department of Information Technology (ITEC), Multimedia Communication (MMC), Sensory Experience Lab (SELab). http://blog.timmerer.com, http://selab.itec.aau.at/, http://dash.itec.aau.at, christian.ti[email protected]. Chief Innovation Officer (CIO) at bitmovin GmbH, http://www.bitmovin.com, christian.ti[email protected]. Slides: http://www.slideshare.net/christian.timmerer. QoMEX 2015, May 27, 2015

Upload: christian-timmerer

Post on 23-Jul-2015


TRANSCRIPT

Page 1: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Benjamin Rainer, Stefan Petscharnig, Christian Timmerer, and Hermann Hellwagner

Alpen-Adria-Universität Klagenfurt (AAU), Faculty of Technical Sciences (TEWI), Department of Information Technology (ITEC), Multimedia Communication (MMC), Sensory Experience Lab (SELab)
http://blog.timmerer.com, http://selab.itec.aau.at/, http://dash.itec.aau.at, christian.ti[email protected]

Chief Innovation Officer (CIO) at bitmovin GmbH
http://www.bitmovin.com, christian.ti[email protected]

Slides: http://www.slideshare.net/christian.timmerer

QoMEX 2015, May 27, 2015

Page 2: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Outline
• Motivation
• Our Approach
• Reaction Game for Subjective Quality Assessment
• Evaluation Methodology
• Results
• Conclusions

Page 3: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Motivation
• Watching multimedia content online together while geographically distributed, e.g., sport events, Twitch, online quiz shows, …
• SocialTV scenario featuring real-time communication via text, voice, video
• Inter-Destination Multimedia Synchronization (IDMS)[0]: the playout of media streams at two or more geographically distributed locations in a time-synchronized manner

[Figure: chat between User 1 and User 2: "Goal!", "Did you see the goal?", "Which goal?", "Thanks for the spoiler!"]

[0] M. Montagud, F. Boronat, H. Stokking, R. Brandenburg, "Inter-destination multimedia synchronization: schemes, use cases and standardization," Multimedia Systems, vol. 18, pp. 459-482, 2012.

Page 4: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Motivation (cont'd)
• Geerts et al.: Are we in sync?[1]
– Watching videos online together while using voice and text chat
– Noticeability of asynchronism and its impact on annoyance and togetherness
– Recommendation: 1 second is enough. We don't think so!
• What is the lower threshold on asynchronism for IDMS?
– Alternatively: above which level of asynchronism do users realize that they are not in sync?
• How to assess QoE in SocialTV scenarios?

[1] D. Geerts, et al., "Are we in sync?: synchronization requirements for watching online video together," Proc. of SIGCHI Conference on Human Factors in Computing Systems (CHI '11), pp. 311-314, 2011.

Page 5: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Our Approach
• We adopt a combination of
– Games with a purpose[2]
– Gamification[3]
– Crowdsourcing[4]
• We design and implement a game to evaluate the impact of asynchronism on
– Fairness
– Togetherness
– Annoyance
– QoE

[2] L. von Ahn, L. Dabbish, "Labeling images with a computer game," Proceedings of the SIGCHI Conf. on Human Factors in Computing Systems (CHI'04), pp. 319-326, 2004.
[3] E. D. Mekler, F. Bruhlmann, K. Opwis, A. N. Tuch, "Do points, levels and leaderboards harm intrinsic motivation?: An empirical analysis of common gamification elements," Proceedings of the First International Conference on Gameful Design, Research, and Applications (Gamification'13), pp. 66-73, 2013.
[4] T. Hossfeld, C. Keimel, M. Hirth, B. Gardlo, J. Habigt, K. Diepold, and P. Tran-Gia, "Best Practices for QoE Crowdtesting: QoE Assessment with Crowdsourcing," IEEE Transactions on Multimedia, vol. 16, no. 2, pp. 541-558, 2014.

Page 6: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Reaction Game for Subjective Quality Assessments
• Aligned to use case, synchronization
• Connected to video content, not a full game
• Crowdsourceable (simulated opponent)
• Game idea: collaborative reaction game
– Players have to react to game events
– Collaborative aspect: bonus score whenever both players click within a given time window
– Explicit user feedback (hit, miss, bonus)
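The hit/miss/bonus feedback described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `classify_event`, the shared timeline in milliseconds, and the assumption that the bonus window starts at the event start are all hypothetical.

```python
from typing import Optional


def classify_event(
    click_a_ms: Optional[float],
    click_b_ms: Optional[float],
    event_start_ms: float,
    bonus_window_ms: float,
    window_length_ms: float = 2000.0,
) -> str:
    """Return 'miss', 'hit', or 'bonus' for player A on one game event.

    Click times are given on a shared timeline; None means no click.
    Assumption: both the reaction window and the bonus window open at
    event_start_ms (the slides do not specify this).
    """

    def within(click: Optional[float], length: float) -> bool:
        # A click counts if it falls inside [start, start + length].
        return click is not None and event_start_ms <= click <= event_start_ms + length

    # Player A did not react inside the reaction window at all.
    if not within(click_a_ms, window_length_ms):
        return "miss"
    # Both players reacted inside the shared bonus window.
    if within(click_a_ms, bonus_window_ms) and within(click_b_ms, bonus_window_ms):
        return "bonus"
    return "hit"


if __name__ == "__main__":
    print(classify_event(500, 700, event_start_ms=0, bonus_window_ms=1600))
    print(classify_event(1800, 700, event_start_ms=0, bonus_window_ms=1600))
    print(classify_event(None, 700, event_start_ms=0, bonus_window_ms=1600))
```

In a crowdsourced run, `click_b_ms` would come from the simulated opponent rather than a second human player.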

Page 7: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Game Events

Page 8: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Bonus Score Example

Page 9: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Evaluation Procedure
• Evaluation using the WESP[5] framework
• Structured in five phases
– Explain the experiment
– Gather demographic data
– Get participants used to the procedure
– Play a game round with subsequent evaluation for each test case
– Give feedback on the evaluation process

[5] B. Rainer, M. Waltl, C. Timmerer, "A Web based Subjective Evaluation Platform," Proceedings of the 5th International Workshop on Quality of Multimedia Experience (QoMEX'13), pp. 24-25, 2013.

Page 10: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Crowdsourcing
• Subjective quality assessment using crowdsourcing
– We used the Microworkers[6] crowdsourcing platform and paid 0.5 USD for each successful participation
– Duration about 15 minutes
– Simulated opponent

• Implicit Measures
– Number of browser focus changes
– Number of clicks
– Video playback length
– Score
– Number of pauses
– …

• Explicit Measures
– Fairness
– Togetherness
– Annoyance
– QoE

Slider with a continuous scale from 0 (very low) to 100 (very high) with initial position at 50 (medium)

[6] http://www.microworkers.com

Page 11: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Stimuli and Participants
• Videos: in-game footage of
– inFAMOUS: Second Son[7]
– Knack[8]
• Training phase
– inFAMOUS: Second Son 0 (00:54, 3 events)
• Main evaluation using three video sequences*
– inFAMOUS: Second Son 1 (01:46, 6 events)
– inFAMOUS: Second Son 2 (01:58, 8 events)
– Knack (01:50, 4 events)
• Video sequences pre-cached to avoid any bias caused by stalls
• Display of configurations in random order

Test configuration   Asynchronism [ms]   Window length [ms]   Bonus window [ms]
Training             0                   2000                 2000
Synchronous          0                   2000                 2000
Small Async          400                 2000                 1600
Medium Async         750                 2000                 1250
Big Async            1500                2000                 500

[7] inFAMOUS: Second Son - Sucker Punch, http://infamous-second-son.com/
[8] Knack - SCE Japan Studio, http://www.playstation.com/en-us/games/knack-ps4/

* With a resolution of 720p, 29 fps, and approx. 2 Mbit/s
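The bonus-window column in the table is consistent with subtracting the asynchronism from the window length, i.e., the overlap of two 2000 ms reaction windows shifted against each other. The sketch below checks that reading against the table values; the function name `bonus_window_ms` and the clamping at zero are assumptions for illustration, not taken from the slides.

```python
def bonus_window_ms(window_length_ms: int, asynchronism_ms: int) -> int:
    """Overlap of two equally long reaction windows shifted by the asynchronism.

    Assumption: the overlap cannot be negative, so it is clamped at 0.
    """
    return max(0, window_length_ms - asynchronism_ms)


# Asynchronism per test configuration, as listed in the table (in ms).
ASYNCHRONISM_MS = {
    "Training": 0,
    "Synchronous": 0,
    "Small Async": 400,
    "Medium Async": 750,
    "Big Async": 1500,
}

WINDOW_LENGTH_MS = 2000

if __name__ == "__main__":
    for name, asynchronism in ASYNCHRONISM_MS.items():
        window = bonus_window_ms(WINDOW_LENGTH_MS, asynchronism)
        print(f"{name}: bonus window {window} ms")
```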

Page 12: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Stimuli and Participants (cont'd)
• In total, 89 microworkers participated in the study
– The campaign was restricted to Europe, Northern America, Australia and New Zealand
• We screened out 45 participants, filtering them according to:
– Browser focus change (27)
– Total number of clicks < 1 (16)
– Number of clicks during any event < 1 (2)
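A screening step along these lines could look like the sketch below. The `Session` record and its field names are hypothetical, and two readings are assumed rather than confirmed by the slides: any browser focus change leads to exclusion, and "number of clicks during any event < 1" means the participant never clicked while an event was active.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Session:
    """Hypothetical per-participant record of implicit measures."""
    worker_id: str
    focus_changes: int
    total_clicks: int
    clicks_per_event: List[int] = field(default_factory=list)


def exclusion_reason(s: Session) -> Optional[str]:
    """Return the first matching screening criterion, or None to keep."""
    if s.focus_changes > 0:  # assumption: any focus change excludes
        return "browser focus change"
    if s.total_clicks < 1:
        return "total number of clicks < 1"
    if sum(s.clicks_per_event) < 1:  # assumption: never clicked during an event
        return "number of clicks during any event < 1"
    return None


def screen(sessions: List[Session]) -> Tuple[List[Session], List[Session]]:
    """Split sessions into (kept, dropped) according to the criteria above."""
    kept = [s for s in sessions if exclusion_reason(s) is None]
    dropped = [s for s in sessions if exclusion_reason(s) is not None]
    return kept, dropped


if __name__ == "__main__":
    demo = [
        Session("a", 0, 5, [1, 1]),
        Session("b", 2, 5, [1, 1]),
        Session("c", 0, 0, [0, 0]),
        Session("d", 0, 3, [0, 0]),
    ]
    kept, dropped = screen(demo)
    print([s.worker_id for s in kept], [s.worker_id for s in dropped])
```

Applying the criteria in a fixed order assigns each excluded participant to exactly one category, which matches the per-criterion counts on the slide summing to 45.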

Page 13: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Results: Togetherness & Annoyance

Significant difference in means between
• 0 ms and 750 ms (t = 1.68, p-value = 0.096, alpha = 0.1)
• 400 ms and 750 ms (t = 2.08, p-value = 0.040, alpha = 0.05)

Significant difference in means between
• 400 ms and 750 ms (t = -1.31, p-value = 0.049, alpha = 0.05)

Page 14: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Results: Fairness & QoE

Significant difference in means between
• 400 ms and 750 ms (t = 2.51, p-value = 0.014, alpha = 0.05)
• 400 ms and 1500 ms (t = 1.93, p-value = 0.057, alpha = 0.1)
• For the pairs of test cases (0 ms, 750 ms) and (0 ms, 1500 ms) the p-value is slightly above alpha = 0.1

Significant difference in means between
• 400 ms and 750 ms (t = 1.73, p-value = 0.087, alpha = 0.1)
• 400 ms and 1500 ms (t = 2.1, p-value = 0.039, alpha = 0.05)
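Each comparison on the results slides is reported with the smallest significance level (0.05 or 0.1) that its p-value falls below. The sketch below reproduces that labelling from the reported p-values only; the helper name `significance_level` and the "first list"/"second list" labels are illustrative assumptions, and no raw rating data is used or implied.

```python
from typing import Optional, Sequence

# (slide and list, compared asynchronism levels, reported p-value)
REPORTED = [
    ("slide 13, first list", "0 ms vs 750 ms", 0.096),
    ("slide 13, first list", "400 ms vs 750 ms", 0.040),
    ("slide 13, second list", "400 ms vs 750 ms", 0.049),
    ("slide 14, first list", "400 ms vs 750 ms", 0.014),
    ("slide 14, first list", "400 ms vs 1500 ms", 0.057),
    ("slide 14, second list", "400 ms vs 750 ms", 0.087),
    ("slide 14, second list", "400 ms vs 1500 ms", 0.039),
]


def significance_level(p_value: float, levels: Sequence[float] = (0.05, 0.1)) -> Optional[float]:
    """Return the smallest alpha in `levels` with p_value < alpha, else None."""
    for alpha in sorted(levels):
        if p_value < alpha:
            return alpha
    return None


if __name__ == "__main__":
    for source, pair, p in REPORTED:
        print(f"{source}: {pair}, p = {p}, alpha = {significance_level(p)}")
```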

Page 15: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Results: Game Score
• Drop in score after 400 ms
• Same tendencies as in previous results

Page 16: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Conclusions
• Using a game to evaluate the impact of asynchronism on QoE, fairness, togetherness, and annoyance
• Our evaluation showed that there is significantly
– lower QoE
– lower fairness
– lower togetherness
– higher annoyance
above a threshold T (400 ms ≤ T ≤ 750 ms)
• Future work
– More precise threshold value
– Relationship between QoE and other variables (fairness, togetherness, annoyance)

One second is clearly not enough

Page 17: Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation

Thank you for your attention

... questions, comments, etc. are welcome …

Stefan Petscharnig and Priv.-Doz. Dipl.-Ing. Dr. Christian Timmerer, Associate Professor

Alpen-Adria-Universität Klagenfurt, Department of Information Technology (ITEC)
Universitätsstrasse 65-67, A-9020 Klagenfurt, AUSTRIA

christian.ti[email protected]-klu.ac.at
http://research.timmerer.com/

Tel: +43/463/2700 3621
Fax: +43/463/2700 3699

© Copyright: Christian Timmerer and Stefan Petscharnig