Sensor based physical interaction for embodied playful learning games

Sensor based physical interaction for embodied playful learning games
Master Thesis Project

Nikolaos Poulios
MSc. Computer Science / Multimedia
Vrije Universiteit Amsterdam
Creative Learning Lab – Waag Society
[email protected]
Stud. No.: 2001527

July 2012, Amsterdam

THESIS SUPERVISORS
Prof. Dr. Anton Eliens (Dep. of Computer Science, VU Amsterdam)
Dr. Lora Aroyo (Dep. of Computer Science, VU Amsterdam)
Keimpe de Heer (Internship supervisor, Creative Learning Lab, Waag Society)

Sensor based physical interaction for embodied playful learning games

Nikolaos Poulios

Master Thesis Project for the degree of MSc Computer Science, specialty Multimedia, at the Vrije Universiteit Amsterdam.

July 2012

Abstract

This thesis explores the use of modern sensor technologies for physical interaction in educational games and interactive spaces. More specifically, the thesis studies the potential effect of motion capture and wearable body sensors on educational interactive games in two respects: i) the involvement of the human body and motion in the process of learning and the recall of knowledge (embodied learning), and ii) assisting the development of basic social-emotional competencies through the enhanced social affordances of embodied games. After building a theoretical framework on these two aspects, based on previous research, the thesis presents a range of state-of-the-art commercial and research technologies for real-time motion capture and biofeedback/emotion recognition that could be utilized in educational interactive spaces, along with insights from selected technologies that were tested, and tools/platforms for the development of sensor-based interactive systems. The thesis continues by proposing a generic distributed system architecture for such spaces, separating a sensor device level from an application level. This architecture is then put to the test in the implementation of a prototype virtual board game featuring a motion capture sensor (Microsoft Kinect) and two body sensors (EEG, ECG). The thesis concludes with results from the evaluation of the prototype, followed by general conclusions and ideas for further development.

 

Thesis Supervisors

Prof. Dr. Anton Eliens (Department of Computer Science, VU Amsterdam)

Dr. Lora Aroyo (Department of Computer Science, VU Amsterdam)

Keimpe de Heer (Internship supervisor, Creative Learning Lab, Waag Society)

Preface

This thesis project is the product of almost one year of personal work, driven by a strong interest, developed during my master studies, in sensor-based interactive systems and media art installations. Inspired by the Embodied Playful Learning Theater project description of the Waag Society institute, focusing on a multi-sensor installation for educational games that would assist children in developing their social-emotional competences, I started a research internship there. The Waag Society offered a very good environment for field experimentation, and the project provided an interesting background for academic research. During my internship, I conducted a literature study on educational games and how sensor-based physical interaction could contribute to the enhancement of an embodied gaming experience, also studying available technologies in motion capture and bio-feedback sensors. During this six-month period I had the chance to test some selected sensor technologies and write small code prototypes using them. After the end of my internship, I focused on putting those pieces together in a prototype game that would make use of multiple sensors and blend virtual game mechanics with traditional forms of children's games, the result of which was the NumHop game presented in this document.

This study tries to balance theoretical and applied research, attempting a first systematic approach towards multi-sensor based, physical interaction games and systems in general. I think that the results of this research provide very encouraging signs, and show that much room is left for further study.

N. Poulios

 

 

Acknowledgments

This document is the result of research made for my master thesis project and an internship at the Creative Learning Lab, the educational department of the Waag Society institute for art, science and technology, in Amsterdam, between June and December of 2011. The Creative Learning Lab is focused on innovative ways of learning with the use of creative technology. The aim of the internship was to conduct preliminary research for the Embodied Playful Learning Theatre (EPLT) project. Based on a physical interactive installation, EPLT is meant to be an open platform for research, development, exploitation, testing and support of a range of embodied and multisensory technologies, such as motion tracking and other body sensing technologies, focusing on the development of serious educational games for the training of social-emotional competencies, by safely exposing learners to a virtually simulated conflict situation in which wellbeing plays a crucial role. EPLT is part of the institute's involvement in the COMMIT P4, "virtual worlds for well-being" project*. The COMMIT program is the largest Dutch public-private research programme, bringing together leading researchers in search engines, parallel computing, databases, interaction in context, embedded systems and knowledge technology, aiming to broaden and reinforce the Dutch knowledge infrastructure in ICT and to better position Dutch companies in international competition by connecting the best scientists to high-tech companies.

I would like to thank first of all my supervisors, Prof. Dr. Anton Eliens and Dr. Lora Aroyo, for their guidance and support; my internship supervisor Keimpe de Heer, director of the Creative Learning Lab, for his inspiration, help, and support, along with all the people of the Waag Society institute; and Heidi J. Boisvert, artist in residence at the Waag Society during the period I started my internship, for her help. I would also like to thank Marco Otte, technical director of CAMeRA@VU, the center for advanced media research of the VU Amsterdam, for his help and consultancy, and Tobias Ruf, from the Electronic Imaging Department of the Fraunhofer Institute, for very kindly providing their SHORE (Sophisticated High-speed Object Recognition Engine) SDK for our tests on facial expression analysis.

                                                                                                               

* http://waag.org/nl/project/commit, http://www.commit-nl.nl/projects/virtual-worlds-for-well-being

Table of Contents

Abstract
Preface
Acknowledgments
Chapter 1: Introduction
  1.1 Motivations
  1.2 Problem, Research Questions and approach
  1.3 Summary of Contributions
  1.4 Outline
Chapter 2: Interactive Games and Embodied Learning
  2.1 Games and Conceptual Engagement
  2.2 Social-emotional skills and interactions
  2.3 The role of physical motion interaction
  2.4 The role of bio-feedback sensors
Chapter 3: Sensing motions
  3.1 Basic motion sensors
    3.1.1 Sensing forces
    3.1.2 Detecting motion
    3.1.3 Measuring distance
  3.2 Motion Capture and tracking systems
    3.2.1 Optical Systems
    3.2.2 Non-optical systems
    3.2.3 Motion capture libraries
  3.3 Motion sense in interaction
    Hand Tracking
    Head/Face Tracking
    Eye Tracking
    Nintendo Wii Remote
    Blobo
    Floor boards
    Sony PlayStation Move
    Microsoft Kinect
    Panasonic D-Imager
  3.4 Comparison of motion capture systems for the EPLT installation
Chapter 4: Sensing emotions
  4.1 Speech analysis
  4.2 Facial expressions
  4.3 Body movement/postures
  4.4 Pupil size
  4.5 Bio-feedback sensors
  4.6 Brain Computer Interfaces (BCI)
  4.7 Developing Tools for Multimodal Biofeedback
  4.8 Data representation of emotions
  4.9 Biofeedback Interactions. Thoughts and insights
Chapter 5: Hardware and software platforms for multi-sensor interactive spaces
  5.1 Sensor Hardware Platforms
    Arduino
    .Net Gadgeteer
    Phidgets
    Shimmer
    I-CubeX
  5.2 Interactive software development platforms
    Visual Programming Languages
    Working with sensors
Chapter 6: A generic architecture for multi-sensory interactive systems
  6.1 Architecture Description
  6.2 Use Case
Chapter 7: Prototyping a virtual board game with physical interaction
  7.1 Introduction
  7.2 Preliminary studies
  7.3 NumHopII - The Game
  7.4 Architecture Overview
  7.5 Main components overview
    SensorOSCTransmitter (C++ - cinder)
    NumHopII (C# - Unity3D)
Chapter 8: Evaluation results and conclusions
  8.1 Prototype Game Evaluation
    Microsoft Kinect
    Neurosky Mindwave
    Zephyr HxM
  8.2 General conclusions and further development
  8.3 Summary of research results
Appendix I
  NumHop Game Evaluation Questionnaire
Bibliography
  References

Chapter 1: Introduction

 

The use of computer games in education has been an active field of academic research for the last twenty years, providing considerable evidence to support the positive effects of the use of games on the learning outcomes of students, by increasing their motivation, stimulating their engagement, and by helping students to understand complex concepts by applying them to problem-solving tasks in an explorative environment. Nowadays the image of computer laboratories in schools has become established, aiming to familiarize children with the use of modern technology and providing an alternative medium for educational material.

On the other hand, in the consumer and home entertainment industries, recent innovations in sensor and software technology have led to the development of a constantly growing range of products focused on entertainment and personal wellbeing. Major game consoles like the Nintendo Wii, Microsoft Xbox, and Sony PlayStation have introduced peripherals featuring motion sensors and computer vision algorithms, designed for games in which the gamer interacts in a physical way, using body motion, instead of sitting on a couch using a conventional controller, and which successfully promote physical exercise in an entertaining way. A lot of other products have been introduced based on wearable sensors monitoring body signals during sports training or daily life activities, to be used by athletes or general consumers in order to monitor and analyze their physical state.

Based on the idea of embodied cognition, and the assumption that the participation of the body in the educational process plays an important role in the learning outcomes and personal development of children, this thesis is a study of state-of-the-art sensor technologies and their application to the design of interactive spaces, focused on a broader sense of educational games. The Embodied Playful Learning Theater (EPLT) is a project of the Waag Society institute in Amsterdam for the implementation of such an interactive space. Combining advanced computer vision, motion and wearable body sensor technology with real-time computer graphics projection, active sound and lighting systems, the EPLT is meant to be a highly immersive environment, providing an open platform for research and development of applications and games that are designed to interact based on the body motion, physical and mental state of the user. The goal of these applications is to offer a playful learning experience that will help children to develop their personal, social-emotional skills.

 

As preliminary research for the EPLT project, this thesis aims to build a theoretical and technological background for the development of the platform, by reviewing findings of previous research on educational games, interactive storytelling, and physical Human Computer Interaction, by reviewing modern hardware and software technologies relevant to the project, and by presenting a set of insights gained in the design and evaluation process of interaction concept prototypes.

 

1.1 Motivations    

Motivating young students in the educational process is a continuous and enduring challenge for educators, who try to make learning more active and engaging. At the same time, modern pedagogy places the development of social-emotional competences at its core, creating a growing need in education for tools and instruments to assess and support these skills. Central to the social-emotional competence development of learners is the individual in relation to his or her social environment, using all possible expressive forms. For learners this includes social, pedagogical, didactical and psychological guidance in the development of competencies for societal participation and group dynamics.

Although current computer-based educational games may feature all the qualities needed to be engaging, like rich computer graphics and a story-driven plot, recent technological innovations allow us to consider a dimension previously ignored in gaming: that of the human body. The human body can be seen as part of human cognition, and a medium of self-expression and interaction with the environment and other people; thus its participation is of key importance to the development of the social-emotional competences of learners. On the other hand, the use of body sensors provides us with a way to notice and measure in real time the reflection of our actions, or of stimuli from the surrounding environment, on our body and our physical and mental condition, helping us to understand ourselves and others better.

The conjunction of multi-sensor technologies with virtual worlds and the dynamics of game and play provides excellent possibilities for playful learning, and for training and assessing social-emotional competences in an immersive environment in which learners are engaged in problem-based simulations and conflict situations, within a safe and confined space.

Apart from educational applications, which are the main research topic of this study, this thesis is motivated by a general interest in fusing art, electronic technology, and immersive interactive environments, and in the design of the EPLT as an open platform for research, experimentation and support of multi-sensor technologies in the fields of art and science. Since the work of pioneering composers like Karlheinz Stockhausen, Iannis Xenakis and John Cage, electronic and computer technology has become an essential part of music and what is commonly known today as computer music. Music has many effects related to wellbeing (e.g. relaxation, stimulation, expressing emotion and defining identity), because music is inherently meaningful to human beings. When we listen to music, our mind manipulates musical structures (non-consciously extracted) and this manipulation generates an emotional response. Sensor technology can thus contribute to the creation of computer music and to music cognition research. Similarly, technology is increasingly being used in the visual arts, from the deus ex machina of the Greek tragedies, to modern computerized theater stages, dance performances like the works of Merce Cunningham, where dancing bodies blend on stage with real-time computer graphics, and numerous interactive installations emerging as a new form of art. (See "Digital Performance" in the bibliography for a history of technology in the performing arts.)

 

1.2 Problem, Research Questions and approach

The Embodied Playful Learning Theater will be built on top of a physical installation. After defining the interaction technologies suitable for the context of the project, their integration must be examined with respect to their specifications and those of the physical installation. Additional usage of sensors for monitoring the player's condition requires the development of a common framework for the real-time collection and processing of input data.

The two main research questions of the thesis are:

• Can physical interactions be combined with a virtual environment to enhance a playful gaming experience inside a gaming installation?
  The research answers this question by exploring existing sensor-based human computer interaction technologies and applications, and available software development tools for such gaming installations, and by examining their advantages and limitations.

• What sensor technologies are most applicable for enhancing a playful gaming experience inside the EPLT installation?
  The research answers this question by examining which sensor technologies are most suitable for the context of the games, and which of their characteristics and constraints need to be considered for their application to the EPLT installation.

The research is based on an exploration of the physical interaction domain, through a literature and technology study of existing sensor technologies and software, followed by small cycles of design, development and testing of prototype applications using selected sensor technologies.

 

1.3 Summary of Contributions

The contributions of this thesis are:

• A theoretical background on educational games for the development of the social-emotional competences of learners, based on multi-sensor interactive installations

• A review of state of the art motion capture and biofeedback sensor technologies, as well as a review of software to support the development of interactive applications featuring these sensors

• An architectural proposal for multi-sensor interactive spaces

• Prototype applications for selected sensor technologies, developed as components of the proposed architecture

• Insights from the evaluation of prototypes, and concept ideas for further development based on the evaluated technologies

 

1.4 Outline

Chapter 2 aims to build a theoretical background on embodied learning games for the development of social-emotional competences, by studying and discussing findings and ideas from previous academic research, combining the topics of games in education, the effect of motion-based interaction in games, and the topic of "affective computing" and emotion recognition technologies in physical human computer interaction.

Chapter 3 focuses on motion capture technologies; starting from basic motion sensors, the chapter continues with a review of currently available commercial sensor systems and research innovations. Chapter 4 is a review of the second technological area under research, that of biofeedback wearable sensors and emotion recognition systems, including multimodal emotion recognition software platforms developed at research institutes, as well as standards for the representation, annotation, storage and transmission of human emotions among emotion-aware applications. Chapters 3 and 4 aim to present the range of technologies that could be applied as inputs to an interactive system, summarizing the features and constraints through which one can judge their suitability during the design process of a specific application built on top of the installation.

Chapter 5 presents some examples of hardware sensor platforms, including commercial ready-made solutions as well as platforms for building custom-made sensors. The use of these platforms provides a common device level for larger-scale applications using a large number of sensors, increasing functionality and simplifying the implementation of a network of devices and software. The chapter continues with a presentation of software platforms providing the tools and sets of libraries necessary for the development of interactive applications featuring sensors, multimedia inputs and outputs, and network connectivity.

Chapter 6 provides an architectural proposal for multi-sensor interactive spaces like the EPLT, featuring an application level independent of the input and output device level, and a communication layer between the levels and their components.

Chapter 7 documents prototype code implementing components of the proposed architecture's device level for selected motion capture and biofeedback sensors, and a proof-of-concept game demonstrating a version of the proposed architectural design.

The thesis ends with Chapter 8, presenting insights gathered during the evaluation process of the prototype, ideas for further development of the prototype and other applications, and final conclusions.
Chapter 2: Interactive Games and Embodied Learning

   

This chapter discusses the question of how physical interactions and biofeedback mechanisms can be combined with a virtual environment to enhance a playful gaming experience in an interactive game space. Combining findings of previous research from the topics of game theory, education, social psychology, and human computer interaction (HCI), this chapter aims to build a theoretical framework for physical games designed for interactive spaces, by analyzing how games utilizing motion-sensing controllers and body sensors can contribute to engagement, motivation, and social interaction between children, and the effect on the learning outcomes.

 

2.1 Games and Conceptual Engagement

Educators continuously face the problem of motivating and engaging their students to learn. The main reasons for this problem are believed to be the passive form of tuition in class, and the gap that exists between learning a theory and understanding its practical value. At the same time, younger and younger children are becoming immersed in the consumption of media and the early adoption of technology in their homes. According to studies conducted by the Kaiser Family Foundation (Rideout, Foehr, & Roberts, 2010), Sesame Workshop, and others, recently synthesized in the Cooney Center's report Always Connected: The New Digital Media Habits of Young Children, preschool and primary-grade children typically consume between 4 (for preschoolers) and 7.5 hours (for 8-year-olds) of media on a typical day. More than half of all children under 5 use some type of electronic learning toy, and watch an average of 3.5 hours of television in an average day. By the time they are 8, more than 70% of all children play video games, and 67% use the Internet on a daily basis (Gutnick, Robb, Takeuchi, & Kotler, 2011) [1].

Following this tendency, a large number of academic studies in recent years have focused on the design of interactive games for educational purposes, providing considerable evidence to support the positive effectiveness of educational games on a broad range of learning outcomes. Piaget, through his child development theory, believed in the development of cognitive structures through action and spontaneous play [2]. According to Piaget, constructivist learning is rooted in experimentation, discovery and play, among other factors. Malone and Lepper consider games as intrinsic motivators for learning [3]. Games provide an alternative, more active and experiential method of learning that, supplementary to traditional textbooks, can help students better understand complex concepts and engage with content within contexts of use [4]. Gee [5] argues that schools provide the manual but not the game, and that any gamer will tell you that reading a manual without playing the game is confusing and unproductive. However, while one is playing the game, the manual can provide an important sense of direction and serves to deepen emergent claims. Game dynamics motivate students to compete in achieving better results and immerse them in problem-based simulations where learning becomes a more personal experience. Games featuring rich narrative invite players to inhabit roles and assume identities as they adopt conceptually relevant intentions in a virtual world in which they make choices, develop skills, and experience the impact of their actions as part of a legitimate game role, allowing students to move beyond their classroom identity and become legitimate participants in the game narrative (Barab, Sadler, Heiselt, Hickey and Zuiker, 2007) [6]. Balasubramanian and Wilson (2006) analyzed the findings of numerous studies and found that well-designed educational digital games and simulations can help students to obtain critical problem solving and decision making skills, which are necessary for everyday living [7], and Hake [8] examined pre- and post-test data for over 6,000 students in introductory physics courses and found significantly improved performance for students in classes with substantial use of interactive-engagement methods.

 

2.2 Social-emotional skills and interactions

Besides the conceptual engagement analyzed in the previous section, games can assist in Social and Emotional Learning, offering excellent opportunities for social interactions through which children learn to subordinate desires to social rules, cooperate with others willingly, and engage in socially appropriate behavior: behaviors vital to adjusting well to the demands of school.

Social and Emotional Learning can be defined as the process of acquiring the skills to recognize and manage emotions, develop caring and concern for others, establish positive relationships, make responsible decisions, and handle challenging situations effectively. Social and emotional learning is of key importance in the pedagogical role of schools, preparing young children to become active parts of society. Socially and emotionally balanced children have increased confidence, express themselves and communicate better, form better relationships, take on and persist at challenging tasks, and have an increased capacity for learning. Dr. Maurice Elias, a leading child psychologist, researcher and expert on social-emotional learning from Rutgers University, explains the dangers of omitting social-emotional development and programs from our children's classrooms. He states, "Many of the problems in our schools are the result of social and emotional malfunction and debilitation from which too many children have suffered and continue to bear the consequences. Children in class who are beset by an array of confused or hurtful feelings cannot and will not learn effectively. In the process of civilizing and humanizing our children, the missing piece is, without doubt, social and emotional learning." [9]

Social and emotional skills can be learned and enhanced at any age, but the earlier a person begins social-emotional learning, the greater the advantages. During pre- and early school years, children begin to understand themselves both as individuals and as part of a social world; they are becoming more autonomous, and their cognitive abilities permit them to see how they will fit into their family and group of friends. According to Raver, "from the last two decades of research, it is unequivocally clear that children's emotional and behavioral adjustment is important for their chances of early school success". [10]

Goleman [11] outlines five crucial emotional competencies basic to social and emotional learning:

1. Self and other awareness: understanding and identifying feelings; knowing when one's feelings shift; understanding the difference between thinking, feeling and acting; and understanding that one's actions have consequences in terms of others' feelings.

2. Mood management: handling and managing difficult feelings; controlling impulses; and handling anger constructively.

3. Self-motivation: being able to set goals and persevere towards them with optimism and hope, even in the face of setbacks.

4. Empathy: being able to put yourself "in someone else's shoes" both cognitively and affectively; being able to take someone's perspective; being able to show that you care.

5. Management of relationships: making friends, handling friendships; resolving conflicts; cooperating; collaborative learning and other social skills.

Free and guided play are important for fostering social competence and confidence in children, as well as the self-regulation necessary for managing their own behavior and emotions. In play children learn how to collaborate and negotiate with others, to take turns, and to manage themselves and others. Barnett and Storm [12] also find that play serves as a means for coping with distress. Interactive games in school give us the ability to create a playful environment for social interactions, and the ability to simulate challenging or conflict situations.

Despite the image of social isolation that electronic gaming has for many people, and the concerns and criticism raised against it by teachers, parents, researchers and policymakers, the literature does not provide convincing evidence to this effect. On the contrary, there are a number of studies demonstrating that games often have beneficial effects not only on cognitive skills, but also in affective and social terms (Calvert 2005 [13], Gunter [14]). De Kort and Ijsselsteijn [15], inspired by the realization that gaming is often as much about social interaction as it is about interaction with the game content, review findings of previous research on the psychological experience of social context effects while playing, discuss contingencies between player, co-player(s) and audience and how these are shaped by the physical and media context in which they reside, and the 'sociality characteristics' of game settings in terms of co-located, mediated, and virtual others. In this paper we find several studies reporting on the opportunities electronic games offer for social interaction (e.g. Lazzaro, 2007 [16]), for settings ranging from public interaction (arcades), to semi-public (LAN events), to private (living room at home). In these it has been found that people enjoy playing together or watching others play, sharing comments and enjoying the spectacle and the enhancement of emotional experience that comes from a crowd (Jansz & Marteens, 2005 [17]), and some even argue that it is the social interaction and participation that, to a large extent, explain game enjoyment (Bryce & Rutter, 2003 [18]; Carr et al., 2004 [19]).

When people are playing together, their need to belong is nourished in multiple ways. First, through involvement in a common activity they interact socially, and both the number and quality of social interactions contribute to a person's sense of belonging (Baumeister & Leary, 1995 [20]), resulting in a positive affective state. Second, spending time together makes people aware of being part of each other's social network or group, which generally also brings about positive emotions. Moreover, (unconscious) processes of empathy and mimicry result in a phenomenon called 'emotional contagion', where one person's affective state spreads to that of a second person who is able to perceive his/her facial expressions (Ramanathan & McGill, 2008 [21]). Hence, when one player is visibly enjoying a game, this emotion potentially crosses over to the other. Lastly, the subsequent congruence of feelings engenders an even stronger sense of belonging, through reinforcement and confirmation (Raghunathan & Corfman, 2006 [22]).

Continuing from the same study by De Kort and Ijsselsteijn, the paper notes the social context effects on a player's performance caused by the presence of others. The emotional effects include increased arousal, evaluation apprehension, increased self-awareness, self-evaluation, and increased goal relevance. The effects on performance are moderated by the 'sociality characteristics' [23] of the game setting, by the other person's role (co-actor vs. spectator), relationship and expertise, by performance requirements, and by personal differences. Sociality characteristics are the social affordances of the game content, the gaming interface, and the physical environment in which the game is played. Social affordances include the player's ability to monitor other players' actions, performance and emotions, and the opportunities for verbal and non-verbal communication.

Finally, as the paper suggests, social settings naturally allow not only for experiences of pride and sociability, but also for their negatively toned counterparts: shame, crowding, and social pressure. Thinking of a learning environment such as the EPLT, where children play under the guidance of educators, even these negative emotions are important for learning basic social-emotional competences such as self and others awareness, mood management, and empathy, as referred to above.

 

2.3 The role of physical motion interaction

As discussed in previous sections, children need to be involved in a variety of activities to learn and develop well cognitively, physically, emotionally and socially. These activities include interacting with each other and with adults, moving and exploring, manipulating objects, reading and creating representations, listening to (and then reading) books, engaging in pretend play, conversing, and building relationships. This information about children's needs is the basic reason that early childhood teachers often believe that computers and "screen time" have little place in the early childhood setting; they are correct that technology should not replace these vital experiences of childhood. Rather, technology is most productive in young children's lives when it enhances children's engagement in these activities, as well as their reflections about their actions and experiences. The currently prevalent model for educational games in schools is for a single student, or a very small group of students, to work on one computer. This model offers limited margins for self-expression and socio-collaborative interactions. The use of modern physical interaction interfaces in hybrid reality spaces for learning will have a great impact on both the cognitive and the social-emotional engagement of children.

Modern motion capture sensors allow the player to interact with a game using physical movement, map the player's body movement to that of a virtual character, and also create interactions between virtual and physical objects by embedding sensors in the latter. Motion controllers give us the ability to design interactive spaces where physical exercise and social interaction, characteristics of traditional outdoor children's games like hopscotch or jump rope, merge with those of modern video games, like rich computer graphics and audio, virtual environments, game dynamics, and interactive storytelling.

Murray (Murray 1998) proposes three characteristic values of interactive story experiences: immersion, transformation, and agency. Immersion is the feeling of being present in another place and engaged in the action therein. Transformation is the game experience that allows the players to transform themselves into someone else for the duration of the experience. Agency is the satisfying power to take meaningful action and see the results of our decisions and choices. Motion interaction combined with large projection in an interactive space increases immersion, because it gives the player the feeling that he is standing and moving inside the virtual world. In addition to the feeling of just being present, the player has to follow the action using his body, performing all the necessary actions that the virtual character has to perform, coming to a physical state that the character would have in real action, leading to a more experiential, kinesthetic experience with an increased feeling of transformation and agency. Previous research comparing motion controllers with conventional interaction controllers supports this claim, finding higher levels of engagement when the controller supports natural movement (Lindley, Couteur & Bianchi-Berthouze 2008) [24]. Another study (Bianchi-Berthouze, Kim & Patel) [25] suggests that body movements appear not only to increase players' engagement but also to modify the way they get engaged. By inducing body movement, the device resulted in a higher sense of engagement in the players and mediated a feeling of presence in the digital world. The players appeared to quickly enter into the role suggested by the game, and started to perform task-related motions that were not required or recorded by the game itself. Gaming was no longer only a question of challenge; it was the experience itself that rewarded the players. This supports another factor of engagement, that of fantasy, present in the descriptions of engagement by Malone [26] and Lazzaro [27].

Whereas analytical aesthetics is preoccupied with separating humans into mind and body, a part for thinking and a part for sensing, pragmatist aesthetics insists on their interdependencies in the aesthetic experience. In a pragmatist perspective, aesthetic experience is closely linked not only to the analytic mind nor solely to the bodily experience; aesthetic experience speaks to both. The role of art and design is "to give a satisfyingly integrated expression to both our bodily and intellectual dimensions" [28]. The sensed is without meaning if de-contextualized from the intellectual, and vice versa [29]. Multiple research areas support the embodiment of human cognition: that nearly all cognitive processes are deeply rooted in and derived from the body's interaction with its physical environment (Dourish 2001 [30], Wilson 2002 [31]). Several theorists (Barsalou, 2008 [32]; Glenberg & Kaschak, 2002 [33]) base this premise on research regarding mirror neurons (Rizzolatti & Craighero, 2004) [34]. Located in the premotor cortex, mirror neurons are activated both when perceiving another's actions and when producing actions oneself. These neurons are hypothesized to be integral in understanding and imitating the actions of others. The fact that the very same cells are involved in both action and perception suggests that activating potential actions may be an automatic consequence of perception. Starting from highlighting the importance of the coupling of motor and perceptual processes for the interaction with the environment, and arguing that this might also be important for the mental representation of the world, Hostetter and Alibali [35] study how people use their bodies (i.e. gestures) to express knowledge, supporting one of the claims of embodied cognition's proponents, according to which offline cognition (i.e., cognition that occurs in the absence of relevant environmental input) is perceptually and mechanically based. From this perspective, the ability to represent and manipulate information that is not currently perceptually present is accomplished through the activation of sensorimotor processes. Based on the theory of embodied cognition, Johnson-Glenberg, Birchfield et al. (2010) [36] suggest the idea of embodied learning, according to which learning via movement activates additional modalities (and sensorimotor systems) for crisper and more stable representations of information. These crisper representations, with more modal associative overlap, will be more easily recalled. Better retrieval leads to better performance on assessment measures. Findings from studies conducted on SMALLab, a learning environment based on educational interactive material and physical interaction, support that learning in the embodied interactive environment results in greater knowledge gains over time compared to regular classroom instruction.

Another important point, highlighted by all studies on motion-based controllers, is that controllers that allow natural movement have the potential to offer greater affordances for social interaction [15][24][25][36]. Going back to social-emotional learning, the previous section discussed how the social context effect on a player's performance depends on the social affordances of the game setting, including the game controller, the opportunities for verbal and non-verbal communication, and the ability of spectators to monitor the player's actions, emotions, and performance. Aligned with the view of embodied cognition, emotions cannot be seen solely as a mental state; they are also a physical, bodily state. Emotions can be generated through imagination without physical interaction, but they can also be generated from body movements (Ekman 1972) [37]. Based on that, body postures and motions designed to interact with the game become another modality for stimulating the player's emotions through the game. Additionally, the player has the freedom to express her emotions and communicate with others using her whole body and motion. At the same time, all the in-game action becomes visible to all spectators, who can monitor the player's performance, the physical effort invested, and her emotions, modifying the player's behaviour and increasing her evaluation apprehension. Body interaction between the game and the player, and between the player and co-players or spectators, increases self and others awareness, and gives the ability to train mood management.

 

2.4 The role of bio-feedback sensors

Apart from visible expressions of emotions like speech, facial expressions, and body posture and motion, researchers have been studying the physical responses of the human body during the generation of emotions. Tiny electrical charges, sweat, heat flux, and heartbeat have been measured and studied using wearable body sensors and have been related to emotions (for a review see Lisetti 2004 [38]). Emotions have also been a research topic in HCI. Picard (1995) [39] first coined the term "Affective Computing", describing interactive systems that have the ability to interpret the emotional state of users and adapt their behaviour to them, simulating human empathy. Although several research efforts have followed since then, there has not yet been any commercial application taking advantage of emotion recognition capabilities. Given the complexity of the processes behind the generation and expression of emotions, which makes them very difficult to classify, a lot of researchers avoid the term "emotion recognition", preferring that of "biofeedback mechanisms", criticizing what Boehner et al. [40] call the "informational approach to emotions", in which emotions are represented as clear, discrete states in a machine-readable format. Boehner suggests an alternative view called the "interactional approach", which, instead of trying to develop systems to recognize human emotions, focuses on helping humans understand, experience, and express their emotions through technology.

Considering both the informational and the interactional approach to emotions, the application of biofeedback sensors will contribute to the experience of the EPLT in various ways, and provide the infrastructure for further research on the relationship between bio-signals and emotions, and on affective interaction. First of all, body sensors can provide information about the physical and mental state of the player during the game, contributing to self-awareness. Biofeedback also gives spectators a measurable way to monitor the player's performance and physical effort, giving them an augmented view of the player's state that combines externally visible and internal modalities. Transparency of the physical effort invested, using for example heart rate as a game score factor, can entice participants to compare their energy expenditure over time and with others, fostering competition that motivates them to invest even more effort. The same information can be used to adjust the game challenge based on the player's state, which will contribute to engagement [41].
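To make the previous paragraph concrete, the following minimal C++ sketch shows one possible way to turn a heart rate reading into a score multiplier and a challenge adjustment. It is only an illustration under stated assumptions: the names (evaluateEffort, restingBpm, maxBpm) and the threshold values are hypothetical and are not taken from the NumHop prototype or from any sensor SDK.

// Minimal sketch (C++17): using heart rate as a game score factor.
// The current heart rate is compared against a resting baseline; higher
// physical effort yields a larger score bonus, and the challenge is eased
// when the player approaches an assumed safety ceiling. All values are
// illustrative assumptions.
#include <algorithm>
#include <iostream>

struct EffortFeedback {
    double scoreMultiplier; // applied to points earned this round
    double challengeScale;  // 1.0 = normal difficulty, < 1.0 = easier
};

EffortFeedback evaluateEffort(double currentBpm, double restingBpm, double maxBpm) {
    // Normalised effort in [0, 1]: 0 at rest, 1 at the assumed maximum heart rate.
    double effort = (currentBpm - restingBpm) / (maxBpm - restingBpm);
    effort = std::clamp(effort, 0.0, 1.0);

    EffortFeedback fb;
    fb.scoreMultiplier = 1.0 + effort;             // up to double points for high effort
    fb.challengeScale  = (effort > 0.85) ? 0.7     // back off near the safety ceiling
                                         : 1.0;
    return fb;
}

int main() {
    // Example: resting rate 80 bpm, assumed maximum 190 bpm, current reading 150 bpm.
    EffortFeedback fb = evaluateEffort(150.0, 80.0, 190.0);
    std::cout << "score multiplier: " << fb.scoreMultiplier
              << ", challenge scale: " << fb.challengeScale << "\n";
    return 0;
}

In an actual game the current heart rate would come from the wearable sensor stream, while the resting and maximum rates would be calibrated per player.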

The application of biofeedback mechanisms in games can assist in learning mood management techniques, similar to techniques used in professional sports training. Biathlon, for example, which combines cross-country skiing with rifle shooting, requires special techniques from the athletes to calm down and control their breathing when they arrive at the shooting range, after a very demanding physical effort and with a very high heart rate. Crews and Landers (1993) [42] identified electroencephalographic (EEG) measures of attentional patterns prior to successful golf putts. Pope and Stephens (2011) [43] describe how the concept of physiological modulation of operator input evolved from a physiologically adaptive simulator system that was developed in National Aeronautics and Space Administration (NASA) flight deck research. In this system, the EEG signals of pilots controlled the level of automation in a simulator flight deck. This "closed-loop" testing setup was used to determine what level of automation kept pilots best engaged in the flight task. It was soon realized that, given enough practice, pilots could probably turn the testing system into a training system; that is, they would learn to control their EEG to set the level of automation where they preferred. This becomes essentially an EEG biofeedback training situation. In a similar way that games based on motion-sensing controllers reward players for imitating a skilled performer's overt motor behavior, biofeedback mechanisms can additionally challenge the player to reproduce the expert performer's emotional and cognitive state by setting as a target the psycho-physiological responses exhibited by the expert in the real-world situation.
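The "closed-loop" idea above can be read as a simple feedback rule: an EEG-derived engagement index raises or lowers the level of automation (or game challenge), with some hysteresis so that the level does not oscillate. The C++ sketch below is a hypothetical illustration of that rule only; the index range, thresholds and step size are assumptions and do not come from the NASA system or from any EEG headset SDK.

// Hypothetical closed-loop adapter (C++): an EEG-derived engagement index in
// [0, 100] adjusts an automation/challenge level with hysteresis. Thresholds,
// step size and index range are illustrative assumptions.
#include <algorithm>

class ClosedLoopAdapter {
public:
    // Call once per EEG update (e.g. a periodic engagement value from a headset).
    int update(int engagementIndex) {
        if (engagementIndex < lowThreshold_) {
            // Disengaged: reduce automation (give the player more to do).
            level_ = std::max(minLevel_, level_ - step_);
        } else if (engagementIndex > highThreshold_) {
            // Overloaded: increase automation (take load off the player).
            level_ = std::min(maxLevel_, level_ + step_);
        }
        // Between the thresholds the level is left unchanged, avoiding oscillation.
        return level_;
    }

private:
    int level_ = 5;           // current automation/challenge level
    int minLevel_ = 0;
    int maxLevel_ = 10;
    int step_ = 1;
    int lowThreshold_ = 30;   // below this the player counts as disengaged
    int highThreshold_ = 70;  // above this the player counts as overloaded
};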

Biofeedback sensors can also be used to develop virtual actors demonstrating basic artificial emotional intelligence. Such virtual actors can, for example, motivate players and reward physical effort, or help them calm down. In story-driven interactive games, intelligent virtual actors can enable emotion recognition mechanisms at certain points of the story, either asking players to act out emotions or behaviours, or perceiving players' reactions to game stimuli and triggering the virtual actor's behaviours accordingly, e.g. simulating empathy. Elements like these would increase engagement and enhance the experience of an interactive story in which the player finds herself in an immersive world, inhabited by personality-rich, robustly interactive characters.

Chapter  3:  Sensing  motions    

This chapter is a review of motion sensing systems. Starting from fundamental motion sensors, the chapter continues by presenting state of the art commercial and innovative research systems for motion capture, as well as the motion sensing game controllers introduced in recent years for game consoles. The chapter reviews the main characteristics and functional principles behind these systems, in order to draw conclusions about their suitability for an interactive learning space.

 

3.1  Basic  motion  sensors    

This part of the document is a short presentation of the fundamental motion sensors used in the applications studied further in the document. These sensors are basic electronic components with a very particular function: translating changes in one form of energy into changes in electrical energy. All the presented sensors have been around us for quite a while now, in everyday systems such as automatic sliding doors and lights, alarm systems, cars, and various industrial control systems. In recent years, technological progress has reduced their size and cost, allowing their application in a variety of devices like mobile phones and game controllers, while certain projects have developed frameworks that facilitate and simplify their use in multi-purpose applications, made by the wider range of people involved in designing and programming interactive systems.

3.1.1  Sensing  forces    

Piezoelectric sensors are a category of sensors that use the piezoelectric effect to measure pressure, acceleration, strain or force by converting them into an electrical charge. Piezoelectricity is the ability of some materials, notably crystals and certain ceramics, to generate an electric potential in response to physical stress.

Force-sensing resistors are materials whose resistance changes when force is applied to them. Flexible force sensors are ultra-thin, flexible printed circuits, consisting of two laminated layers of conductive material and pressure-sensitive ink. The resistance of a flexible sensor in a circuit decreases under pressure. Flexible sensors are used to measure forces in a higher range than that of a piezoelectric sensor.
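In practice such a resistor is usually read through a simple voltage divider. The sketch below is only an illustration of that arithmetic, assuming a 10 kOhm fixed resistor, a 5 V supply and a 10-bit analog-to-digital converter; it does not describe any specific product's wiring.

```python
# Illustrative conversion of an ADC reading to FSR resistance, assuming the
# wiring Vcc -- FSR -- [measured node] -- fixed resistor -- GND.

VCC = 5.0           # supply voltage in volts (assumption)
R_FIXED = 10_000.0  # fixed divider resistor in ohms (assumption)
ADC_MAX = 1023      # 10-bit analog-to-digital converter

def fsr_resistance(adc_reading: int) -> float:
    """Return the FSR resistance in ohms for a raw ADC sample."""
    v_out = VCC * adc_reading / ADC_MAX          # voltage at the measured node
    return R_FIXED * (VCC - v_out) / v_out       # divider equation solved for R_fsr

print(round(fsr_resistance(512)))  # roughly 9980 ohms at mid-scale
```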

Capacitance sensors are very sensitive sensors, detecting anything that is conductive or has a dielectric constant different from that of air. Nowadays they are usually found in touch screens, though there are capacitance sensors that can detect the body's charge from distances of up to a meter (such sensors are used by the Theremin musical instrument).

An accelerometer is a sensor that measures the change in the speed of movement, i.e. acceleration. Conceptually, an accelerometer behaves as a damped mass on a spring. When the accelerometer experiences acceleration, the mass is displaced to the point that the spring is able to accelerate the mass at the same rate as the casing. The displacement is then measured to give the acceleration. An accelerometer thus measures weight per unit of (test) mass, a quantity also known as specific force, or g-force. Another way of stating this is that, by measuring weight, an accelerometer measures the acceleration of the free-fall reference frame relative to itself. Accelerometers typically have two or sometimes three axes of measurement.
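Because a static accelerometer measures the gravity vector as specific force, its readings can be turned into a tilt estimate. The short sketch below computes pitch and roll from a three-axis sample with standard trigonometry; axis conventions differ between devices, so the formulas assume x forward, y left, z up, and a sensor that is not otherwise accelerating.

```python
# Estimating static tilt (pitch and roll) from a 3-axis accelerometer sample.
# Assumes the device is not otherwise accelerating, so the sensor sees only gravity.
import math

def tilt_from_accel(ax, ay, az):
    """Return (pitch, roll) in degrees for an x-forward, y-left, z-up convention."""
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    roll = math.atan2(ay, az)
    return math.degrees(pitch), math.degrees(roll)

# A device lying flat measures roughly (0, 0, 1 g):
print(tilt_from_accel(0.0, 0.0, 1.0))    # -> (0.0, 0.0)
# Tipped so that gravity appears partly along x:
print(tilt_from_accel(0.5, 0.0, 0.866))  # -> (about -30.0, 0.0)
```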

Gyroscopes are sensors that measure angular velocity. They are similar to accelerometers, except that they measure how fast the angle of rotation is changing, rather than measuring acceleration in a straight line. Gyroscopes work based on the principle of conservation of angular momentum. Mechanical gyroscopes consist of a high-rate spinning disk whose axle is free to take any orientation, mounted on a set of two gimbals with orthogonal pivot axes, allowing the gyroscope to minimize any external torque and preserve its orientation, regardless of any motion of the platform on which it is mounted.

3.1.2  Detecting  motion    

Photoelectric switches use a light beam hitting a photosensitive target sensor. When a body breaks the beam, passing between the light source and the sensor, the switch is activated.

 Passive  infrared  sensors  measure  infrared  light  radiating  from  objects   in  their   field   of   view.   Apparent  motion   is   detected  when   an   infrared   source  with  one  temperature,   such   as   a  human,   passes   in   front   of   an   infrared   source   with  another  temperature,  such  as  a  wall.  

Magnetic   switches   consist   of   a   very   thin   pair   of   contacts   in   a   protective  housing.  When  exposed  to  a  magnet  they  are  drawn  together  closing  the  switch.    

Hall   effect   sensors  are   transducers   that   change   their   output   voltage   from  low  to  high  when  the  magnetic  field  around  them  changes.  

3.1.3  Measuring  distance    

Most distance sensors use an energy source transmitting a reference signal, and a sensor measuring the signal reflected back from the target to the source, in order to calculate the target's distance. Most applications use (near) infrared light sensors, sending an infrared beam and reading the reflection of the beam off a target. For longer ranges, ultrasonic sensors are used, sending a ping of ultrasonic sound and then timing how long it takes to bounce back. Alternative implementations of distance sensors are based on combinations of magnetic or Hall effect sensors (for very short distances), measuring variations in a reference magnetic field.
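The time-of-flight arithmetic behind an ultrasonic range finder is simple enough to state directly. The sketch below assumes sound travels at roughly 343 m/s in air and that the measured interval covers the round trip to the target and back.

```python
# Converting an ultrasonic echo time into a distance (time-of-flight principle).
SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 degrees Celsius

def distance_from_echo(round_trip_seconds: float) -> float:
    """The ping travels to the target and back, so halve the path length."""
    return SPEED_OF_SOUND * round_trip_seconds / 2.0

print(distance_from_echo(0.01))  # a 10 ms echo corresponds to about 1.7 m
```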

 

3.2  Motion  Capture  and  tracking  systems    

Motion capture (mocap)/tracking is the process of recording or tracking body movement and mapping it onto the movement of a digital model. The mechanics of human body movement has been a topic of scientific interest since antiquity, and today many different disciplines use motion analysis systems to capture the movement and posture of the human body. In clinical research, motion capture has been used to analyze the walking patterns of impaired patients so that they receive the right orthopedic treatment, to monitor the progress of a treatment, and to help in the design of prosthetics. Motion analysis is also widely used in sports to analyze and optimize athletes' movement in order to achieve better performance.

In recent years motion capture systems have been used extensively in cinematography and video games in order to animate computer generated characters with natural human movement, following the recorded moves of an actor inside special studios, replacing the traditional animation method of rotoscoping, in which animators trace over live action film movement frame by frame. Despite the high cost of the special equipment, space and setup required for a motion capture system, such systems are preferred by some productions over traditional animation techniques for their ability to give more realistic results in shorter time, or even in real time.

Motion capture is a very active field of research; today there are many alternative types of systems using different technologies, with differences in accuracy, functional requirements and cost, and their suitability depends on the nature of the project. The range of applications utilizing motion capture is becoming wider, following the progress made in processors, memory chips and sensors regarding their speed, accuracy, size and cost, as well as the progress in the algorithms developed for data processing. The two major categories of motion capture systems are optical and non-optical.

3.2.1  Optical  Systems    

Optical systems work based on data captured from a single or multiple image sensors calibrated to provide overlapping projections, and algorithms to triangulate the 3D position of a subject in space. Most optical systems utilize markers that the cameras can distinguish from the rest of the captured image, in order to determine their position more easily and accurately. The process of motion capture begins with the calibration of the system, in which markers are placed in known positions and every camera's position and lens distortion is calculated accordingly. If two calibrated cameras see a marker, its 3D position can be determined. After calibration of the system, a performer wears markers near each joint of her body so that the motion can be identified from the positions or angles between the markers. The number of cameras required for an optical system depends on the size of the space we need to cover, the desired accuracy, and the number of subjects we need to track at the same time. Typically such a system consists of 6 to 24 high-speed cameras, while there are systems using hundreds of cameras to achieve better accuracy. Optical systems are characterized by the captured image resolution in pixels, the sampling frequency in hertz, and the frame rate, which is balanced between the image resolution and the sampling frequency. Different optical systems use different types of markers.
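The triangulation step underlying all of these systems can be sketched with OpenCV: given the two cameras' 3x4 projection matrices and the marker's image coordinates in both views, cv2.triangulatePoints returns the 3D point in homogeneous coordinates. The projection matrices and coordinates below are synthetic placeholders; a real system obtains them from the calibration procedure described above.

```python
# Triangulating a marker's 3D position from two calibrated camera views.
# The projection matrices here are placeholders; real ones come from calibration.
import numpy as np
import cv2

# 3x4 projection matrices of two cameras (P = K [R | t]); illustrative values only.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))]).astype(np.float64)
P2 = np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])]).astype(np.float64)

# Normalized image coordinates of the same marker seen by camera 1 and camera 2,
# given as 2xN arrays (here a single marker, N = 1).
pts1 = np.array([[0.10], [0.05]], dtype=np.float64)
pts2 = np.array([[0.00], [0.05]], dtype=np.float64)

# Returns a 4xN array of homogeneous coordinates; divide by the last row.
point_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
point_3d = (point_h[:3] / point_h[3]).ravel()
print(point_3d)  # approximately [0.2, 0.1, 2.0] for these synthetic inputs
```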

Passive markers are the simplest type of markers, featuring retro-reflective material that reflects light generated near the camera lens. The camera's threshold is adjusted to sample only the bright reflective markers, ignoring the rest of the captured image. A major advantage of passive markers is that the subject does not need to wear any electronics that might limit her freedom to move. Passive markers are attached directly to the skin or to a specially designed spandex/lycra full body suit. The major disadvantage of passive markers is what is called marker swapping: because all markers are identical, the system might mismatch a marker with its corresponding joint, and avoiding this requires a larger number of cameras.

 

 Figure  1:  Active  marker  motion  capture  system  

 

Active markers are another type of markers. Instead of reflecting light, active markers use LEDs to emit light [Figure 1], increasing the maximum distances and volume for capture. Optical systems using active markers triangulate positions by illuminating one LED at a time very quickly, or multiple LEDs with software that identifies them by their relative position. Refined versions of active markers exist, using time modulation of the amplitude or pulse of the LEDs to provide a marker ID, in order to eliminate marker swapping. Computer processing of the modulated IDs offers cleaner data and less filtered results. This higher accuracy and resolution requires more processing than passive technologies, but the additional processing is done at the camera, improving resolution via subpixel or centroid processing, and providing both high resolution and high speed.

Both technologies mentioned above are mainly used indoors, in special motion capture studios. Passive systems are usually less expensive than active ones and easier to set up, while active systems are more accurate and, after the initial setup, require less time to get results from. Commercial active and passive systems are available from companies like Vicon, Naturalpoint, Qualisys and PhaseSpace, and usually cost between tens and hundreds of thousands of euros.

Semi-passive - Photosensitive markers. Prakash [44] is a motion capture system developed at MIT's Media Lab as an inexpensive alternative (the overall cost is less than 1,000 euros), suitable also for outdoor use and real-time motion capture. Instead of using expensive high-speed cameras, Prakash uses multi-LED high-speed projectors with passive binary films (masks) set in front of them. The light intensity sequencing provides a temporal modulation and the masks provide a spatial modulation. Each beamer projects invisible (near infrared) binary patterns thousands of times per second. Tags with photosensors attached to the scene determine their location by decoding the transmitted space-dependent labels. Apart from their position, tags can compute their own orientation, incident illumination, and reflectance. These tracking tags work in natural lighting conditions and can be imperceptibly embedded in attire or other objects. The system supports an unlimited number of tags in a scene, with each tag uniquely identified to eliminate marker-swapping issues. Since the system eliminates the high-speed camera and the corresponding high-speed image stream, it requires significantly lower data bandwidth. The tags also provide incident illumination data, which can be used to match scene lighting when inserting synthetic elements.

Markerless Motion Capture. Motion capture and computer vision have been very active fields of research during the last 15 years, and there have been a lot of studies aiming to develop markerless motion capture systems, based on the use of a single or multiple cameras and optimized image analysis algorithms, with performance comparable to that of the more expensive commercial systems mentioned previously.

Recently   a   team   from   the  Carnegie  Mellon  University  working  with  Disney  Research   presented   a   system   featuring   small   body-­‐mounted   cameras   to  reconstruct  the  motion  of  a  subject  [45].  Outward-­‐looking  cameras  are  attached  to   the   limbs   of   the   subject,   and   the   joint   angles   and   root   pose   are   estimated  through   non-­‐linear   optimization.     The   optimization   objective   function  incorporates  terms  for  image  matching  error  and  temporal  continuity  of  motion.  Structure-­‐from-­‐motion  is  used  to  estimate  the  skeleton  structure  and  to  provide  initialization   for   the   non-­‐linear   optimization   procedure.   Global   motion   is  estimated  and  drift  is  controlled  by  matching  the  captured  set  of  videos  to  a  3D  reconstruction   of   the   scene   built   from   reference   imagery.   By   estimating   the  camera   poses,   the   global   and   relative   motion   of   an   actor   can   be   captured  outdoors   under   a   wide   variety   of   lighting   conditions   or   in   extended   indoor  regions  without  any  additional  equipment.  


Several other techniques and algorithms have been proposed for markerless motion capture of single or multiple subjects. Most of them use footage from multiple cameras to make a volumetric reconstruction of the body, using background removal, skin color detection, "shape from silhouette" (SFS) and structure from motion methods. The formalism of SFS was introduced by A. Laurentini [46]. By definition, an object lies inside the volume generated by back-projecting its silhouette through the camera center (called the silhouette's cone). With multiple views of the same object at the same time, the intersection of all the silhouettes' cones builds a volume called the "visual hull", which is guaranteed to contain the real object. After the visual hull has been constructed, body pose is estimated by fitting shape models of specific body parts to the volume, or by applying heuristic assumptions on features related to position and establishing the correspondence of joints between successive frames. Markerless motion capture systems based on these methods have been developed by various academic research laboratories, like the BioMotion Lab of Stanford University [47], the University of Utrecht [48] and the Max Planck Institute [49], as well as commercial systems like Organic Motion's solutions.
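The visual hull idea can be illustrated with a tiny voxel-carving sketch: a voxel of a candidate grid is kept only if it projects inside the silhouette of every camera. The projection function below is a placeholder for the calibrated camera model a real system would use.

```python
# Minimal voxel-carving illustration of the "shape from silhouette" idea:
# a voxel survives only if every camera's silhouette contains its projection.
import numpy as np

def carve_visual_hull(voxels, silhouettes, project):
    """voxels: (N, 3) candidate 3D points.
    silhouettes: list of boolean 2D masks, one per camera.
    project: project(points, cam_index) -> (N, 2) integer pixel coordinates.
    Returns the subset of voxels lying inside every silhouette cone."""
    keep = np.ones(len(voxels), dtype=bool)
    for cam_index, mask in enumerate(silhouettes):
        px = project(voxels, cam_index)
        h, w = mask.shape
        inside = (px[:, 0] >= 0) & (px[:, 0] < w) & (px[:, 1] >= 0) & (px[:, 1] < h)
        # Voxels projecting outside the image or outside the silhouette are carved away.
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = mask[px[inside, 1], px[inside, 0]]
        keep &= hit
    return voxels[keep]
```

Pose estimation then proceeds on the carved volume, for instance by fitting body part models as described above.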

3.2.2  Non-­‐optical  systems  

This category includes all motion capture systems that, instead of image sensors, use alternative types of sensors to capture motion. These systems collect data from wearable sensors attached to the subject's body and translate them into motion in space. Their main advantage is that, because they are not based on cameras, they do not require a studio setup, they are more portable, and they can be used outdoors, capturing motion in large areas independently of light conditions. Their main disadvantages are that they are usually less accurate than optical systems and that they might limit the subject's freedom to move and perform.

Inertial  systems  use  miniature  inertial  sensors  attached  to  the  joints  of  the  body,  biomechanical  models  and  sensor  fusion  algorithms  to  translate  data  into  motion.   Starting   from   a   known   position,   inertial   systems   use   wireless  accelerometers   and   gyroscopes,   sending   data   to   a   computer   to   continuously  calculate  the  position,  orientation  and  velocity  of  the  subject  with  full  six  degrees  of   freedom   body   motion.     Their   accuracy   depends   on   the   number   of   sensors  used.  Commercial  inertial  motion  capture  systems  are  available  from  companies  like  XSens  and  Animazoo.  
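A common building block of such sensor fusion is a complementary filter, which trusts the gyroscope over short time scales and the accelerometer's gravity-based tilt estimate over long ones. The sketch below shows the idea for a single axis; the blending coefficient is an arbitrary illustrative choice, not a tuned value.

```python
# Single-axis complementary filter: a minimal example of inertial sensor fusion.
# The blending coefficient 0.98 is an illustrative choice, not a tuned value.
import math

def complementary_filter(angle_deg, gyro_rate_dps, ay, az, dt, k=0.98):
    """Fuse a gyroscope rate (deg/s) with an accelerometer tilt estimate (deg)."""
    gyro_angle = angle_deg + gyro_rate_dps * dt       # integrate the angular rate
    accel_angle = math.degrees(math.atan2(ay, az))    # tilt seen through gravity
    return k * gyro_angle + (1.0 - k) * accel_angle   # blend the two estimates

# Example: start at 0 degrees, rotate at 10 deg/s for one 10 ms step.
angle = 0.0
angle = complementary_filter(angle, 10.0, 0.02, 0.98, dt=0.01)
print(round(angle, 3))  # about 0.12: mostly gyro, gently nudged by the accelerometer
```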

Mechanical or exo-skeleton systems use a skeletal-like structure worn by the subject, consisting either of straight metal or plastic rods linked together with potentiometers articulating the joints, or of flexible sensors measuring joint angles during motion. Mechanical systems are real-time and low cost, but they capture only the relative movement of the subject, requiring an external absolute positioning system, and they might not be comfortable for a performer to wear. Commercial systems like the Gypsy 7 by Animazoo combine gyroscopes and an exo-skeleton to capture absolute and relative motion.

Magnetic systems utilize sensors placed on the body to measure the low-frequency magnetic field generated by a transmitter source. Position and orientation are calculated from the relative magnetic flux of three orthogonal coils on both the transmitter and each receiver. The relative intensity of the voltage or current of the three coils allows these systems to calculate both range and orientation by meticulously mapping the tracking volume. Each sensor captures 6 degrees of freedom, which provides useful results with two-thirds the number of markers required in optical systems; one sensor on the upper arm and one on the lower arm give elbow position and angle. Magnetic systems are low cost but are nowadays rarely used because of their major disadvantages. Since each sensor requires its own (fairly thick) shielded cable, the tether used by magnetic systems can be quite cumbersome. Magnetic systems also have issues with azimuth: if an actor is in a push-up type posture, the system will get confused. Multiple-actor magnetic setups also have problems when two or more actors are in close proximity, as sensors from the different actors interfere with each other, producing distorted results. Finally, magnetic systems react very badly to metal or magnetic fields in the environment, caused by metallic construction materials in buildings or by other electrical appliances in use.

3.2.3  Motion  capture  libraries    

As mentioned before, motion capture is an easier technique for giving realistic motion to virtual characters, and although most motion capture systems require expensive equipment and special studios, independent developers can take advantage of free or commercial libraries available online, which include motion captured data from various human activities, in file formats that can be imported into 3D animation software and mapped to any character model. A quick search for motion capture libraries will return a long list of resources, among them the Carnegie Mellon University motion capture database, freely available at http://mocap.cs.cmu.edu/, http://www.mocapclub.com/, which includes a library from the Motion Capture Society association, and http://mocapdata.com, which is also a large resource of both free and commercial animation files.

3.3  Motion  sense  in  interaction    

In recent years, sensors and principles used in motion capture systems have been applied on a smaller scale to low-cost consumer input devices, providing physical interaction interfaces. During the last five years, all major companies in the video game industry have developed different technologies for games and controllers with motion based interaction. Although sports have always been a popular theme in video games, and game companies started to explore sensor based physical interfaces in the mid-1980s, it was not until recently that technology allowed them to produce wireless and lightweight devices practical enough to use as game controllers. That fact, along with the popularity of large TV screens in today's average living room, has created the basis for games offering more immersion and encouraging gamers' physical activity. Today "exertion games" or "exergames" are a growing market, also attracting people who were not traditionally drawn to video games and considered them a rather passive activity.


This  part  is  a  presentation  of  current  techniques  and  examples  of  devices  for  physical  input  interfaces  and  game  controllers,  based  on  motion  sensors.  

Hand  Tracking    

Designing wearable input interfaces, usually called "data gloves", to allow a user to use her hands and fingers to navigate in a virtual world, use hand gestures, and interact with objects in a more natural way, was one of the first examples of natural user interfaces. The first data glove was created in 1977, and since then a few companies and laboratories have come up with their own implementations. Data gloves use various sensors, such as accelerometers or gyroscopes, to capture hand movement, and flexible sensors for the bending of fingers. Some data gloves use optical fibers attached to the fingers and a photocell as a way to measure bending, since some light escapes the fiber when it is bent. Some data gloves also provide haptic feedback, applying small forces and vibrations to give users a sense of touch.

Data gloves are also used in body motion capture systems, because marker-based solutions are not able to capture such fine detail in finger movement. This technique is called hand-over.

Head/Face  Tracking    

Facial expressions and the movement of small facial muscles are also difficult to capture during body motion capture. For that reason, facial motion capture is done in a separate recording, by attaching many small markers to the actor's face.

In the field of interaction and the gaming industry, head tracking devices exist that allow the computer to set a camera's viewpoint according to the position of the player in space. Commercial systems, like NaturalPoint's TrackIR, use an infrared sensor and active markers attached to the player's head. Other systems, like many head mounted displays for virtual reality, use tilt sensors to track head movement. There are also applications that use a plain camera and automatic face detection algorithms to track the user's position, but because they rely on a plain camera they are less accurate for movement along the depth axis.

Eye  Tracking    

Eye tracking is the process of measuring either the point of gaze of a viewer or the motion of an eye relative to the head. Eye trackers are mostly used in research on the visual system, in psychology, in cognitive linguistics, and also in marketing research, product design and usability testing, to spot elements that attract viewers' gaze and others that do not.

Eye trackers measure rotations of the eye and principally fall into three categories. The first category uses an attachment to the eye, like a contact lens with an embedded mirror or magnetic field sensor. Measurements with tight fitting contact lenses have provided extremely sensitive recordings of eye movement, and magnetic search coils are the method of choice for researchers studying the dynamics and underlying physiology of eye movement. The second category uses electric potentials measured with electrodes placed around the eyes. The eyes are the origin of a steady electric potential field, which can be detected even in total darkness or with the eyes closed. It can be modeled as being generated by a dipole with its positive pole at the cornea and its negative pole at the retina. The electric signal that can be derived using two pairs of contact electrodes placed on the skin around one eye is called the electrooculogram (EOG). If the eyes move from the centre position towards the periphery, the retina approaches one electrode while the cornea approaches the opposing one. This change in the orientation of the dipole, and consequently of the electric potential field, results in a change in the measured EOG signal. Inversely, by analyzing these changes, eye movement can be tracked.

The last and most commonly used category is non-intrusive, optical based systems using the Pupil Centre Corneal Reflection (PCCR) technique. This technique uses a light source to illuminate the eye, causing highly visible reflections, and a camera to capture an image of the eye showing these reflections. Image processing algorithms are then used to identify the reflection of the light source on the cornea and the centre of the pupil. The vector between the two reflections, combined with other geometrical characteristics of the reflections, allows the gaze direction to be determined.
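A common way to turn that pupil-to-glint vector into screen coordinates is a calibration step: the viewer fixates a few known targets and a polynomial mapping is fitted by least squares. The sketch below is a generic illustration of such a calibration, not a particular tracker's algorithm, and assumes at least six calibration targets.

```python
# Fitting a simple calibration mapping from pupil-minus-glint vectors (vx, vy)
# to screen coordinates, using least squares over a few fixation targets.
import numpy as np

def fit_gaze_mapping(vectors, screen_points):
    """vectors, screen_points: (N, 2) arrays from the calibration procedure
    (N >= 6). Returns quadratic mapping coefficients for each screen axis."""
    vx, vy = vectors[:, 0], vectors[:, 1]
    # Design matrix with constant, linear, interaction and quadratic terms.
    A = np.column_stack([np.ones_like(vx), vx, vy, vx * vy, vx**2, vy**2])
    coeffs, *_ = np.linalg.lstsq(A, screen_points, rcond=None)
    return coeffs  # shape (6, 2)

def map_gaze(coeffs, vx, vy):
    """Apply the fitted mapping to one pupil-to-glint vector."""
    features = np.array([1.0, vx, vy, vx * vy, vx**2, vy**2])
    return features @ coeffs  # (screen_x, screen_y)
```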

There are two different illumination setups that can be used with the PCCR technique: bright pupil tracking, where an illuminator is placed close to the optical axis of the imaging device, causing the pupil to appear lit up; and dark pupil tracking, where the illuminator is placed away from the optical axis, causing the pupil to appear darker than the iris. Different factors affect pupil detection with each of the two techniques, such as the age of the subject, light conditions and ethnicity. Some commercial systems, like Tobii eye trackers, can use both techniques, determining the best one during the calibration procedure, in which the viewer is asked to gaze at certain points on screen.

Eye trackers can also be used as an interaction input interface, replacing a mouse for example, allowing the user to control the cursor with her eyes. EyeWriter [50] is a collaborative research project for building an eye tracker from inexpensive materials, along with open source software, developed to empower people who suffer from ALS and other physical disabilities with creative technologies.

Nintendo  Wii  Remote    

In 2006, Nintendo released its now popular Wii video game console. The major innovation of the Wii was its remote game controller, the Wii Remote (Wiimote). The Wiimote features an infrared sensor and an accelerometer that allow it to calculate its position in space and track hand movement. Using the Wiimote, the player is able to aim at items on screen and interact using gestures and natural movement.


Upon its release, the Wiimote gained much attention thanks to its advanced features and quickly became very popular among programming enthusiasts, who wrote software that allowed the device to be used beyond the game console. Since then, the Wiimote has been used in numerous projects as a controller, or as an infrared sensor tracking infrared LEDs attached to other items, for example in a head tracking system like the one previously mentioned.

Blobo    

The Blobo sensor is manufactured by a small company in Finland, which targets games designed for the sensor on PC and Mac platforms. Blobos have the shape of a small ball, which packs an accelerometer and a gyroscope as well as an air pressure sensor that can measure how hard the player squeezes the ball. Its unique ball shape makes it ideal for use in children's games, and its ability to measure pressure and speed could act as another indicator of the player's emotional state.

Floor  boards    

Floorboards equipped with pressure sensors were the first attempt to make an input interface with which a player would use her whole body in game interaction. The first controller of this kind was the Joyboard, released in 1982 for the Atari 2600. In 2007, Nintendo released a modern, wireless version, called the Balance Board, along with a series of fitness games utilizing it, called Wii Fit, for the Wii game console.

Sony  PlayStation  Move    

Sony's motion sensing platform for the PlayStation console includes the PlayStation Eye camera, which is capable of capturing standard video at 60 Hz at 640x480 pixel resolution, or at 120 Hz at 320x240 pixels, along with computer vision and gesture recognition software, and a microphone array for voice location tracking and voice command recognition.

The PlayStation Move motion controller features an orb at its head, which can glow in any of a full range of RGB colors using LEDs. Based on the colors in the user environment captured by the PlayStation Eye camera, the system dynamically selects an orb color that can be distinguished from the rest of the scene. The colored light serves as an active marker, the position of which can be tracked by the camera. The uniform spherical shape and known size of the light also allow the system to accurately determine the controller's distance from the camera through the light's size in the image. The controller also features an accelerometer and a gyroscope, used to track rotation as well as overall motion. An internal magnetometer is also used for calibrating the controller's orientation against the earth's magnetic field, to help correct the cumulative error (drift) of the inertial sensors. The inertial sensors can be used to calculate position in cases where camera tracking is insufficient, such as when the controller is obscured behind the player's back.
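The distance-from-size estimate mentioned above follows from the pinhole camera model: the sphere's apparent diameter in pixels shrinks in proportion to its distance. The focal length and orb diameter below are illustrative assumptions for the sketch, not Sony's specifications.

```python
# Pinhole-camera estimate of a tracked sphere's distance from its image size.
# Focal length and orb diameter below are illustrative assumptions.

FOCAL_LENGTH_PX = 800.0   # camera focal length expressed in pixels (assumed)
ORB_DIAMETER_M = 0.045    # real diameter of the glowing orb in metres (assumed)

def distance_from_diameter(diameter_px: float) -> float:
    """distance = f * real_size / apparent_size, from similar triangles."""
    return FOCAL_LENGTH_PX * ORB_DIAMETER_M / diameter_px

print(round(distance_from_diameter(30.0), 2))  # a 30 px orb is about 1.2 m away
```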


Microsoft  Kinect    

Kinect was Microsoft's answer to its competitors' motion sensing controllers for video game consoles. Initially released as an accessory for the Xbox 360 game console, Kinect was the first consumer device that allowed real-time, markerless, full body 3D motion capture in a room environment. Kinect features a normal RGB camera and a depth sensor, consisting of an infrared laser projector and an infrared camera, capable of capturing 3D video data at 30 Hz at 640x480 pixels. The sensor also includes a 3-axis accelerometer to determine its orientation and a four-microphone array, allowing it to receive voice commands, perform ambient noise reduction, and determine the source location of a sound. The most innovative part of the Kinect, though, is a microprocessor running an algorithm trained, using machine learning on a large set of training images, to track the motion of multiple bodies based on 20 joints per body [Figure 2].

Kinect uses a single depth image [51], which is segmented into a dense probabilistic body part labeling, with the parts defined so that they are spatially localized near the skeletal joints of interest. Reprojecting the inferred parts into world space, the spatial modes of each part's distribution are localized, generating confidence-weighted proposals for the 3D locations of each skeletal joint. The segmentation into body parts is treated as a per-pixel classification task. A very large collection of realistic depth images of humans of many shapes and sizes, in highly varied poses sampled from a large motion capture database, was used to train a deep randomized decision forest classifier that avoids over-fitting. Simple, discriminative depth comparison image features yield 3D translation invariance while maintaining high computational efficiency. Finally, spatial modes of the inferred per-pixel distributions are computed using mean shift, resulting in the 3D joint proposals.
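The depth comparison features referred to here are, as described in [51], differences of depth values at two offsets around a pixel, with the offsets divided by the pixel's own depth so that the feature is roughly invariant to the body's distance from the sensor. The sketch below is a simplified illustration of that idea, not the Kinect's actual implementation.

```python
# Simplified version of the depth-comparison feature described in [51]:
# f(x) = d(x + u / d(x)) - d(x + v / d(x)), computed on a depth image in metres.
import numpy as np

def depth_feature(depth, x, y, u, v, background=10.0):
    """depth: 2D array of depth values; (x, y): pixel; u, v: 2D offsets in
    metre-pixels. Offsets are scaled by 1/depth so the feature is roughly
    depth invariant; probes falling outside the image read a large background value."""
    h, w = depth.shape
    d = depth[y, x]

    def probe(offset):
        px = int(round(x + offset[0] / d))
        py = int(round(y + offset[1] / d))
        if 0 <= px < w and 0 <= py < h:
            return depth[py, px]
        return background

    return probe(u) - probe(v)

# Toy example: a flat surface at 2 m gives a feature value of 0.
flat = np.full((480, 640), 2.0)
print(depth_feature(flat, 320, 240, u=(100.0, 0.0), v=(-100.0, 0.0)))  # -> 0.0
```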

 

 Figure  2  Kinect  tracking  joints  

 


Kinect's real-time, full body motion capture capabilities allow the creation of games and other applications featuring full body physical interaction, based on position determination, collision detection of virtual objects with individual body parts, and the recognition of motion gestures and postures. Kinect truly revolutionized the field of natural user interfaces for gaming and became, upon its release, the fastest selling consumer electronics device ever. As with the release of the Wiimote, it quickly attracted the attention of a large community of programming enthusiasts who wrote open source software allowing the use of Kinect in independent computer applications, followed by a large number of projects found on the internet, including interactive applications, games, installations and robotics utilizing the sensor. After the release on the internet of a large number of impressive examples of uses of the Kinect, companies involved in its development, like PrimeSense and Microsoft, decided to support these efforts by releasing software to facilitate independent project development.

 

Panasonic  D-­‐Imager    

The D-Imager was introduced to the market by Panasonic in 2011, targeting commercial businesses rather than game console end customers. The D-Imager works in a similar way to Kinect, but uses an array of near infrared LED emitters instead of a single laser beam, and the time delay between the LED emitters and the light reflected by the target is measured on a pixel-by-pixel basis (time of flight principle). Using internal processing similar to the Kinect's, the sensor can track 20 joints of a human body in front of it. The comparative advantages of the D-Imager over the Kinect are its increased tracking range (1.2 to 9 meters versus 1.2 to 3.5 meters) and its ability to fully track up to 5 people simultaneously, versus the 2 that Kinect supports. The D-Imager lacks the VGA camera and microphone array found in the Kinect, though. The D-Imager comes with the Omek Beckon Development Suite, which supports multiple development environments, packs of ready to use gestures, and an authoring tool that allows developers to easily record custom gestures and feed them automatically to a machine learning algorithm. These advantages, of course, come at a much higher cost than the Kinect sensor.

 

3.4  Comparison  of  motion  capture  systems  for  the  EPLT  installation    

Commercial marker based, optical motion capture systems have the highest sampling rate performance and have proven to be very robust over the years they have been available. It should be noted, though, that these systems have been designed for high detail motion capture for animation and movies, which have higher performance requirements than game interaction. The disadvantages of marker based systems are: i) the use of markers, which, along with the body sensors that will have to be placed on the player's body and calibrated, will require a lot of time to prepare a player for the game, ii) the very high cost of such systems compared to game consoles' controllers, and iii) the requirement for a number of cameras to be set up and calibrated above the stage [Figure 3], demanding a more permanent setup, while solutions like the Kinect and D-Imager do not require special calibration and are flexible enough to be used from a permanent installation down to a smaller scale classroom setup.

 

 Figure  3  Optical  tracking  setup  using  12  cameras  

 

The advantage of depth sensors like the Kinect and the D-Imager is that the user does not have to wear anything in order to be tracked. Trackers, on the other hand, can also be used to track objects on stage, besides bodies. If a small number of objects needs to be tracked, inertial sensors (accelerometer, gyroscope) can be attached to the object and its motion tracked from an initial position on stage. An alternative solution is to use the Wii Remote's infrared camera the other way around, with the camera set up in a fixed position and infrared LEDs attached to objects, although this might interfere with the body-capturing depth sensors. The other disadvantage, of the Kinect especially, is its limited range of less than 4 meters. Additionally, because it is based on the viewing angle of a single camera, the active tracking area forms a triangle that gets narrower approaching the position of the camera, leaving part of the area in front of the sensor out of sight. Multiple Kinect sensors can be used to cover a larger area, but they have to be positioned very carefully so that they do not interfere with each other, and they add complexity to the development of the application. The 9 meter range of the D-Imager makes it more suitable for larger stages.

Markerless, non-optical motion capture systems present the same difficulty as marker based ones: a large number of sensors must be attached to, and calibrated on, the player's body. Finally, although markerless optical systems seem appealing, the only commercial system found was that of Organic Motion, which was not tested during this research; as a general note, experience has shown that markerless optical systems are not as accurate as the other categories, and their performance may depend strongly on lighting conditions.


Chapter  4:  Sensing  emotions    

   

The vision of machines with emotional intelligence [52] has coexisted with that of artificial intelligence since the invention of the term. It is a popular theme in science fiction literature, featuring androids that understand emotions and have human-like behavior, and aptly raising ethical questions about the use of such technologies. Although we are still quite far from this vision (or nightmare, for some), research laboratories around the world work on developing emotion-sensing technology to support the study of human behavior, affective human computer interaction, and communication between people. Automatic recognition of human affective states is an important research topic for a broad range of applications, including psychology research, computer assisted therapeutic systems, safety monitoring applications, assessment and training systems, user experience studies, marketing research, and automatic affect-based indexing of digital material [53].

Emotion recognition can make social interaction more effective in cases where it is difficult to communicate expressively, for example for people on the autistic spectrum, where an autistic person might outwardly appear calm and relaxed while experiencing a state of emotional or cognitive overload [54], or in everyday social networking applications, where there is a tendency towards text based communication or towards communicating through avatars in virtual worlds.

As with physical interaction interfaces, a lot of studies experiment with the application of physiological sensors to video games and interactive storytelling [55]. Video games are an excellent application area for exploring the benefits and drawbacks of physiological sensor interaction, because the consequences of failure are less severe than in critical control systems, making games a field that bridges laboratory research and commercial systems. It has also been shown that video games can stimulate strong emotional reactions from players, making them an appropriate field for behavior studies, and as gaming has turned into a huge entertainment industry, companies are interested in using physiological feedback for game design evaluation. Explorations into "biofeedback" games, games that make users more aware of their physiological state and train them to control it using game dynamics, started in the early 1980s. In 1984, Thought Technology developed a racing game called CalmPrix [56], utilizing a modified galvanic skin response sensor, followed by other innovative game companies like Atari and Nintendo, which used a variety of body sensors in their own biofeedback games. Some of these games never made it to the market, while others did, but without the expected market success.

As we all know from personal experience, emotions are hard to define and recognize. Despite all our senses and the verbal and non-verbal communication skills we have as humans, it is often hard to immediately recognize someone's emotions: whether they are real or pretended, whether someone is talking seriously or joking, laughing or crying, and so on. The expression of emotions becomes even more complex when analyzed on a global, cross-cultural scale. It is easy to imagine, thus, that emotion recognition is a very difficult task for a computer, especially in real-time applications where the system has to analyze the user's state and give a response within a very narrow time frame. Classic psychological research claims the existence of six basic expressions of emotion that are universally displayed and recognized: happiness, anger, sadness, surprise, disgust, and fear [57]; other studies on emotion recognition also include emotions like despair, interest, irritation and pride [58]. A lot of studies do not accept this categorization of emotions, suggesting that it is not emotions but certain components of emotions that are universally linked with particular communicative displays. Most theorists agree that the two dominant dimensions of emotion can be described as valence (pleasant vs. unpleasant) and arousal (activated vs. deactivated, or excited vs. calm) [54]. Mapping even basic emotions onto these two dimensions is challenging [Figure 4], and emotion recognition systems analyzing single human modalities, like voice or facial expressions, usually suffer either from poor accuracy or from an over-simplified classification of emotions.
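To make the dimensional view concrete, the sketch below places a few basic emotion labels at illustrative (valence, arousal) coordinates and assigns a label to a new measurement by nearest neighbour. The coordinates are assumptions chosen for demonstration, not values taken from the cited studies.

```python
# Illustrative mapping of emotion labels onto the valence/arousal plane and a
# nearest-neighbour lookup. Coordinates are assumed for demonstration only.
import math

EMOTION_COORDS = {
    "happiness": ( 0.8,  0.5),
    "anger":     (-0.6,  0.8),
    "sadness":   (-0.7, -0.5),
    "surprise":  ( 0.2,  0.9),
    "disgust":   (-0.6,  0.3),
    "fear":      (-0.7,  0.7),
    "calm":      ( 0.5, -0.6),
}

def closest_emotion(valence: float, arousal: float) -> str:
    """Return the label whose (valence, arousal) point is nearest."""
    return min(EMOTION_COORDS,
               key=lambda name: math.dist((valence, arousal), EMOTION_COORDS[name]))

print(closest_emotion(0.6, -0.4))  # -> "calm"
```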

 

 Figure  4  Emotions  mapped  on  basic  dimensions  

 

The next part is a presentation of the various sensors used to capture physiological signals that can be associated with the emotional state of a person, along with software for emotion recognition developed in previous research.

 

 


4.1  Speech  analysis    

Speech is the primary method of human communication. Analysis of features extracted from speech characteristics like intensity, pitch, phonetic features, voice segments, pause length, and spectral modeling, along with linguistic analysis based on the keywords used, can be used to draw conclusions about the emotional state of a person [60].
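As a hint of what such features look like in practice, the sketch below computes two of the simplest prosodic cues, short-time energy and an autocorrelation-based pitch estimate, from a raw audio frame using plain numpy. Real systems, such as the toolkits described next, extract far richer feature sets.

```python
# Two elementary prosodic features from a mono audio frame: RMS energy and an
# autocorrelation-based pitch estimate. A toy stand-in for real feature sets.
import numpy as np

def rms_energy(frame: np.ndarray) -> float:
    """Short-time energy, a rough proxy for vocal intensity."""
    return float(np.sqrt(np.mean(frame ** 2)))

def pitch_autocorr(frame: np.ndarray, sr: int, fmin=75.0, fmax=400.0) -> float:
    """Estimate fundamental frequency (Hz) from the autocorrelation peak."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min, lag_max = int(sr / fmax), int(sr / fmin)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return sr / lag

# Synthetic 200 Hz tone sampled at 16 kHz as a quick sanity check.
sr = 16000
t = np.arange(0, 0.04, 1 / sr)
frame = 0.5 * np.sin(2 * np.pi * 200 * t)
print(round(rms_energy(frame), 3), round(pitch_autocorr(frame, sr)))  # ~0.354, ~200
```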

EmoVoice [61], developed by the Human Centered Multimedia Laboratory of the University of Augsburg, is a framework for creating emotional speech corpora and classifiers, and for offline as well as real-time, online speech emotion recognition. The framework is meant to be used by non-experts and therefore comes with an interface for creating one's own personal or application-specific emotion recognizer. EmoVoice is now integrated into the SSI framework (see emotion frameworks).

openEAR [62], developed by the Institute for Human-Machine Communication of the Technische Universität München, is an open source C++ library for speech processing and emotion recognition, combining features for audio recording, feature extraction, and classification of results, along with pre-trained models.

4.2  Facial  expressions  

Facial expression analysis was the first method used for emotion recognition, has been used extensively in multiple studies since then, and is the preferred method for single-modality emotion recognition systems. Facial expressions are the main non-verbal communication tool, providing the most powerful, versatile and natural means of communicating motivational and affective state. Apart from expressing emotion, facial expressions provide important communicative cues during social interaction, such as our level of interest, our desire to take a speaking turn, and continuous feedback signaling understanding of the information conveyed. Facial expression constitutes 55 percent of the effect of a communicated message [63] and is hence a major modality in human communication. Several studies have also shown that ordinary people can detect six emotional facial expressions with an accuracy ranging from 70% to 98%.

In facial expression analysis systems, the face is segmented, focusing on the facial areas of the eyes, eyebrows, mouth and nose. Each of these feature-candidate areas contains features whose boundaries are extracted and stored over time, and the displacement of each feature is then compared to "neutral face" model images to infer the emotion expressed by the subject. Differences between systems usually lie in the number of features tracked and the kind of classifier used.

There are already quite a few systems for facial expression analysis developed by research institutes, and some are available for research or commercially. Examples of such systems are: the SHORE system [64], developed by the Fraunhofer institute; eMotion [65], a project started at the University of Amsterdam, which also includes software to map captured facial expressions onto Second Life avatars; MindReader [66], developed initially at Cambridge University (based on the commercial system of Nevenvision, now acquired by Google); projects of the ibug (intelligent behaviour understanding group) of Imperial College London [67]; and FaceAPI [68] from Seeing Machines. There are also some open source examples of facial feature tracking, using the openCV [69] (open Computer Vision) library and the included Haar classifier. openCV is a library for real-time image analysis and has become one of the standard libraries for computer vision, with C, C++, Python, and Java interfaces, used in robotics and multimedia applications, and included in a lot of frameworks for the development of such applications.
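The Haar-classifier route mentioned above can be sketched in a few lines with openCV's Python interface: a pre-trained frontal-face cascade locates faces, within which further feature cascades (here, the eyes) can then be run as the first step of a facial feature tracking pipeline. The cascade file names used here are the ones bundled with current opencv-python packages.

```python
# Locating faces (and eyes within them) with openCV's bundled Haar cascades,
# as a first step of a facial feature tracking pipeline.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def detect_face_regions(image_bgr):
    """Return a list of (face_box, [eye_boxes]) tuples for one video frame."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    results = []
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 5):
        roi = gray[y:y + h, x:x + w]                    # search for eyes inside the face
        eyes = eye_cascade.detectMultiScale(roi, 1.1, 5)
        results.append(((x, y, w, h), list(eyes)))
    return results

# Example usage on a single image file:
# frame = cv2.imread("player.png")
# print(detect_face_regions(frame))
```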

4.3  Body  movement/postures    

Although a lot has been written about so-called "body language", body movement and posture have not been researched as extensively for emotion recognition as facial expressions and voice analysis. There are, though, some studies questioning the validity of facial expressions as a modality for recognizing affective states, because the face is involved in various functions and many of the famously recognized facial expressions represent only a small subset of the possible expressions; these studies suggest body posture as a very good indicator for certain categories of basic emotions. Most studies, however, have not been able to demonstrate recognition accuracy similar to that of facial expression classifiers, especially those that study emotion recognition from static body postures only. Coulson [70] considered how 6 joint rotations (head bend, chest bend, abdomen twist, shoulder forward/backward, shoulder swing, and elbow bend) could help in recognizing 6 emotions (anger, fear, happiness, sadness, surprise and disgust). Concordance rates for attributions of the 6 emotions ranged from zero for many disgust postures to over 90 percent for some anger and sadness postures. Kleinsmith and Bianchi-Berthouze [71] used four affective dimensions (valence, arousal, potency, and avoidance) instead of discrete emotion categories. In their study there was a 12% error rate for valence, 10% for both arousal and potency, and 11% in the case of avoidance. In their conclusions they report that other types of body motion features may be necessary for achieving better recognition of some affective states, such as fear, and better performance of their model. Other studies that include body motion as a modality [72], tracking features like the quantity of motion and contraction index of the body, the velocity, acceleration and fluidity of the hand's barycenter, and the orientation and approach/avoidance behaviors of two participants towards their interlocutor in an interaction, suggest that body language reflects the participants' level of activation and dominance, but is less informative about their valence (positive vs. negative).
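Two of the features named above can be computed directly from tracked joint positions: quantity of motion as the summed displacement of joints between frames, and a contraction index as how tightly the joints cluster around their centroid relative to a reference pose. The sketch below is a simplified reading of these descriptors, not the exact definitions used in [72].

```python
# Simplified body-motion descriptors computed from tracked joint positions:
# quantity of motion (summed joint displacement) and a contraction index.
import numpy as np

def quantity_of_motion(prev_joints: np.ndarray, joints: np.ndarray) -> float:
    """Sum of Euclidean displacements of all joints between two frames (N, 3)."""
    return float(np.linalg.norm(joints - prev_joints, axis=1).sum())

def contraction_index(joints: np.ndarray, reference_spread: float) -> float:
    """Mean distance of joints from their centroid, normalized by a reference
    spread (e.g. measured in a neutral standing pose); values below 1 mean a
    more contracted posture."""
    centroid = joints.mean(axis=0)
    spread = float(np.linalg.norm(joints - centroid, axis=1).mean())
    return spread / reference_spread

# Toy example with 4 "joints": a small shift produces a small quantity of motion.
prev = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0], [1, 1, 0]], dtype=float)
curr = prev + np.array([0.05, 0.0, 0.0])
print(quantity_of_motion(prev, curr))                  # -> 0.2
print(contraction_index(curr, reference_spread=0.8))   # ~0.88: slightly contracted
```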

Another role of body posture should also be noted. Studies suggest that body posture can actually induce changes in affective states or have a feedback role affecting motivation and emotion. A study by Riskind and Gotay [73], for example, revealed how "subjects who had been temporarily placed in a slumped, depressed physical posture later appeared to develop helplessness more readily, as assessed by their lack of persistence in a standard learned helplessness task, than did subjects who had been placed in an expansive, upright posture." Furthermore, it was shown that posture also had an effect on verbally reported self-perceptions. Another study [74], examining postures as a modality for recognizing emotions, suggests that involving the body in the control of technology facilitates users' expression of their feelings, which in turn gives them an improved experience, i.e., being engaged.

An open source library for analyzing body motion extracted from video is the EyesWeb [75] Expressive Gesture Analysis Library. EyesWeb refers both to the research projects of the InfoMus Lab of the University of Genova on multimodal interactive systems and expressive gesture, and to an open software platform supporting the development of real-time multimodal distributed interactive applications.

4.4  Pupil  size    

Studies have shown that the eye's pupil is significantly larger during both emotionally negative and positive stimuli than during neutral stimuli [76]. Although valence cannot be distinguished from it, pupil size can be used as an additional modality for arousal. Many eye tracker devices are able to measure the pupil's size.

4.5  Bio-­‐feedback  sensors  

Emotion recognition systems based on external modalities like speech, facial expressions, and body posture are more familiar to use, because they accept the same input we as humans do in our everyday interactions with others. The performance of such systems, however, depends on environmental conditions and on the training models that have been used in their machine learning algorithms. Although advanced processing algorithms have been developed to minimize the effect of environmental conditions, such as illumination for facial expression analysis or noise cancellation for speech analysis, training classifiers can be a practically difficult and very time-consuming procedure. A speech analysis system, for example, has to be trained for every different language, and as vocal characteristics change with age, a system trained on adult acoustic models would not be effective with children. From the range of modalities mentioned in the previous section, facial expression analysis has been researched the most and proven to be the most accurate. The use of this technique, though, also introduces some practical constraints, as the camera has to have a clear image of the subject's face and sufficient illumination to be accurate. Additionally, it is easy for someone not to reveal his emotions to the camera, or, as mentioned earlier, autistic persons for example might have difficulty doing so even when they want to express their emotions. For these reasons scientists have also turned to the use of embodied biophysical sensors, monitoring signals that can reveal valuable information not only about someone's physical state, but about the emotional and mental state as well.

The  physiological  signals  usually  monitored  in  behavior  studies  are:  


Heartbeat rate (ECG): Electrocardiography sensors determine heartbeat rate by detecting and amplifying the tiny electrical changes on the skin that are caused when the heart muscle depolarizes, measuring the difference in voltage between two electrodes placed on either side of the heart. There are also optical heartbeat sensors using an infrared LED and a phototransistor placed close to each other, usually with a fingertip or the ear lobe in between. These sensors work based on the fact that when the heart beats there is a quick rush of blood into the tiny blood vessels close to the skin, which makes the tissue less transparent, so less light passes through it to the phototransistor. Changes in heartbeat can give us a clear index of arousal, but the sensors are prone to movement artifacts. An increase in heart rate has been related to fear, and a decrease to anger [38].
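As a rough illustration of how a heart rate value can be derived from such an optical sensor, the sketch below (Python) counts threshold crossings in a window of raw samples and converts them to beats per minute. It is a toy estimator under stated assumptions (a clean signal and an arbitrary mid-range threshold), not a substitute for the filtering and artifact rejection that real sensors apply.

def bpm_from_samples(samples, sample_rate_hz):
    """Estimate beats per minute from raw optical pulse-sensor samples.

    A beat is counted at every upward crossing of a threshold placed
    midway between the minimum and maximum sample value. Real devices
    use filtering and artifact rejection; this is only a sketch.
    """
    threshold = (min(samples) + max(samples)) / 2.0
    beats, above = 0, False
    for value in samples:
        if not above and value > threshold:
            beats += 1          # rising edge counted as one heartbeat
            above = True
        elif above and value < threshold:
            above = False
    duration_min = len(samples) / sample_rate_hz / 60.0
    return beats / duration_min if duration_min > 0 else 0.0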

Galvanic Skin Response (GSR)/Electro Dermal Activity (EDA) both refer to the electrical changes measured at the surface of the skin. EDA sensors usually work by passing a minuscule amount of direct current between two electrodes in contact with the skin. When a person experiences emotional arousal, increased cognitive workload or physical exertion, the brain sends signals to the skin to increase the level of sweating. Sweat is a weak electrolyte and a good conductor, so the filling of the sweat ducts increases the conductance of the applied current. Changes in skin conductance at the surface thus provide a sensitive and convenient measure for assessing the sympathetic arousal changes associated with emotion, cognition and attention.

Skin temperature/Heat flux is the amount of heat that the body emits. Studies have shown that heat flux is effective in detecting context switches. This is because context switches often involve physical movement, which causes the body to warm up and therefore emit heat. Heat flux has also been reported to increase during increased cognitive load [77].

There  are  a  lot  of  companies  today  producing  commercial  wireless,  wearable  biophysical   sensors   transmitting  signals   to  software  running  on  a  smart-­‐phone  or  computer,  for  sports  enthusiasts  who  like  to  monitor  and  keep  track  of  their  exercising   habits.   Most   of   them   do   not   offer   an   open   API   for   application  development   but   in   some   cases   it   is   possible   to   read   the   packets   sent   by   the  sensor  with  custom  libraries.  

 

4.6  Brain  Computer  Interfaces  (BCI)    

Brain computer interfaces are sensors monitoring brain activity in order to translate the user's thoughts or mental state into actions on the computer. The brain's electrical charge is maintained by billions of neurons. Neurons are electrically charged by membrane transport proteins that pump ions across their membranes, and are constantly exchanging ions with the extracellular milieu, for example to propagate action potentials.

Electroencephalography (EEG) is the recording of electrical activity using electrodes attached along the scalp, measuring voltage fluctuations that result from ionic current flows within neurons and are generated by the synchronous activity of thousands or millions of neurons with similar spatial orientation in the brain.

Since its discovery in 1924 by Hans Berger, EEG has been widely used in clinical research in neurology, to diagnose epilepsy, coma, brain death and various encephalopathies. Scalp EEG activity shows oscillations at a variety of frequencies, and researchers have associated certain oscillation frequency ranges and spatial distributions with different states of brain functioning. Although EEG is not the most accurate method to monitor brain activity, its ease of use, portability and low set-up cost have made it the most studied one, and have led to its application in other research fields and in all kinds of experiments where it is interesting to monitor the mental state of the subject. Usually three frequency ranges are used for this purpose (a rough band-power sketch follows the list below):

 

• Theta (4-7 Hz): related to drowsiness
• Alpha (8-13 Hz): related to relaxation
• Beta (>13-30 Hz): related to alertness
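As a rough illustration of how such band powers can be estimated from a raw EEG trace, the sketch below (Python with NumPy) integrates the power spectrum over each band; the sampling rate and band limits are assumptions chosen to match the list above, and commercial headsets normally perform this kind of processing internally, exposing only derived values.

import numpy as np

BANDS = {"theta": (4, 7), "alpha": (8, 13), "beta": (13, 30)}   # Hz, as listed above

def band_powers(samples, sample_rate_hz):
    """Return the relative power of the theta/alpha/beta bands in an EEG trace."""
    samples = np.asarray(samples, dtype=float)
    samples = samples - samples.mean()             # remove the DC offset
    spectrum = np.abs(np.fft.rfft(samples)) ** 2   # power spectrum
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate_hz)
    total = spectrum.sum() or 1.0
    return {name: float(spectrum[(freqs >= lo) & (freqs < hi)].sum() / total)
            for name, (lo, hi) in BANDS.items()}

# Example with synthetic data: one second at 256 Hz dominated by a 10 Hz (alpha) wave
t = np.arange(0, 1, 1.0 / 256)
fake_eeg = np.sin(2 * np.pi * 10 * t) + 0.2 * np.random.randn(t.size)
print(band_powers(fake_eeg, 256))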

 

During the last years, EEG has made its way into human computer interaction research and research towards machines with emotional intelligence, and a small number of companies are working on developing low cost, non-invasive brain computer interface products, like the Emotiv headset, Neurosky's Mindwave, Starlab's Enobio (which combines EEG, ECG and EOG sensors), and OpenEEG [78], a community project created to support open hardware and software solutions. On a consumer level, these interfaces are currently used mainly in gaming and other entertainment applications, since they still prove to be inaccurate and impractical for more critical applications.

 

Functional   near-­infrared   spectroscopy   (fNIRS)   is   an   emerging  technique   for   sensing   brain   activity,   similar   to   the   technique   used   by   optical  heartbeat  sensors  mentioned  earlier  in  the  document.  The  fNIRS    system  is  made  up   of   probes   that   send   light   at   two   wavelengths   in   the   near-­‐infrared   range.  Biological   tissues   are   relatively   transparent   to   light   at   these  wavelengths.   The  main   absorbers   of   the   light   are   oxygenated   hemoglobin   and   deoxygenated  hemoglobin.   These   act   as   relevant   markers   of   hemodynamic   and   metabolic  changes  associated  with  neural   activity   in   the  brain.  The   reflected   light   is   then  picked  up  by  the  detectors  on  the  device.  Depending  on  the  amount  of  light  that  is   reflected,   we   can   get   a   measure   of   brain   activity   in   the   area   beneath   the  sensors.  


Studies in fNIRS [79] report that the hemodynamic response measured in the brain is a slow response which occurs over 5-8 seconds. This currently makes the technique impractical for interaction input interfaces. For the moment there is still no commercial brain computer interface utilizing the fNIRS technique.

 

4.7  Developing  Tools  for  Multimodal  Biofeedback    

As mentioned in the introduction of this chapter, emotion recognition is a difficult task for a computer, and the performance of such systems can vary depending on the state of the interacting person as well as on environmental conditions. In order to increase the reliability of emotion sensing systems, and after gaining experience from developing single-modal analysis systems, modern research examines the application of multi-modal systems [77][80], combining various sensors and data analyses and sharing a final decision level to determine the emotional or affective state of the subject. In this direction there have been a number of projects, with contributions from universities all over Europe, for the development of frameworks and middleware that make it easier for researchers to develop and use multi-modal emotion recognition systems.

CALLAS [81] (Conveying Affectiveness in Leading-edge Living Adaptive System) is a project funded by the European Commission under the 6th Framework Programme, with the participation of many universities around Europe. CALLAS is a framework based on a plug-in multimodal architecture, containing a collection of components for feature extraction from text, audio, video and motion sensors, and for processing emotional aspects in real time, enabling the easy development of applications for art and entertainment. The CALLAS framework also includes its own visual programming authoring tool, CAT.

SEMAINE [82] is also a project funded by the European Commission, under the 7th Framework Programme, aiming to build a Sensitive Artificial Listener: a multimodal dialogue system which can sustain an interaction with a user for some time and react appropriately to the user's non-verbal behavior. The system can take input from video and audio to analyze the user's emotional state. The SEMAINE API is available as open source, supporting C++ and Java; it features the Apache ActiveMQ message broker as an integration layer and can run as a distributed system.

 


The SSI [83] (Social Signal Interpretation) framework is developed by the Human Centered Multimedia research laboratory of the University of Augsburg. It is available as open source, written in C++, and contains tools to record, analyze and recognize human behavior in real time, such as gestures, mimics, head nods, and emotional speech. It also follows a plug-in based design, with a growing collection including, among others, input from the Wii-mote and the Kinect sensor (under development), while it also supports the use of external libraries such as OpenCV, ARToolKit, SHORE, Torch, Speex, and Watson. SSI supports the machine-learning pipeline in its full length and offers a graphical interface that assists a user in collecting their own training corpora and obtaining personalized models. It also features an XML-editor programming environment to draft and run pipelines without special programming skills.

4.8  Data  representation  of  emotions    

Apart from developing special software, a lot of projects have focused on creating standard formats to represent human emotions and share them among emotion-aware applications. These formats can be used, for example, to annotate digital media in order to train models for affective indexing, to collect data to train virtual agents, or to share data between an emotion recognition system and an application, developed by another party, that will animate a virtual avatar of the user accordingly.

MPEG-4 (Part 2, "Visual") contains MPEG-4 FAP [84] (Facial Animation Parameters), a set of 68 parameters to allow the animation of synthetic face models, which can be used in facial expression analysis applications. MPEG-V [85] is a standard under development for a common middle layer format for interaction and visualization among virtual world applications.

EMMA [86] (Extensible Multimodal Annotation Language) is an XML markup language, recommended by the W3C, for containing and annotating the interpretation of user input. It is a wrapper language that can include various kinds of payloads representing interpretations of user input. An interpretation element contains information about the modality upon which the interpretation is based, can indicate start and end timestamps of the interpretation, and has many more attributes. EmotionML [87] is a "plug-in" language, also recommended by the W3C, which can be combined with EMMA to represent human emotions in user input. EmotionML recognizes the fact that there is no single agreed representation of affective states, or of vocabularies to use. Therefore, an emotional state <emotion> can be characterized using four types of descriptions: <category>, <dimensions>, <appraisals>, and <action-tendencies>. An example of an EMMA document carrying EmotionML as its interpretation payload is given below:

<emma:emma xmlns:emma="http://www.w3.org/2003/04/emma" version="1.0">
  <emma:interpretation emma:start="123456789">
    <emotion xmlns="http://www.w3.org/2005/Incubator/emotion">
      <dimensions set="valenceArousalPotency">
        <arousal value="-0.29"/>
        <valence value="-0.22"/>
      </dimensions>
    </emotion>
  </emma:interpretation>
</emma:emma>

HEO [88] (Human Emotion Ontology) is an effort to create an RDF/OWL ontology to represent human emotions, with subclasses and attributes to describe input modalities, dimensions (arousal, valence, dominance), action tendencies and more.

SAIBA [89] (Situation, Agent, Intention, Behavior, Animation) is a running project focusing on the creation of a framework of languages for Embodied Conversational Agents, with three stages representing intent planning, behavior planning and behavior realization. A Function Markup Language (FML), describing intent without referring to physical behavior, mediates between the first two stages, and a Behavior Markup Language (BML), describing the desired physical realization, mediates between the last two stages. BML has behavior elements for the head, torso, face, gaze, body, legs, gesture, speech and lips, and defines attributes for animation, lip and gaze synchronization, gestures, etc.

More   information,   articles   and   tools   can   be   found   on   the   HUMAINE  Association   website   [90],   an   international   community   around   research   on  emotions  and  human-­‐machine  interaction.  

4.9  Biofeedback  Interactions.  Thoughts  and  insights      

The biofeedback mechanisms used in a game are defined by the interactions designed for it: by the questions of what we want to measure, why we want to measure it, and how we use this measurement inside the game. However, designing biofeedback interactions for a game installation that also features physical motion interaction automatically sets some factors of the game setting that have to be considered. Rather than the statistical evaluation of individual emotion recognition techniques, this study focuses on the practical application of sensors in an interactive game space and the challenges presented by that setting. This part of the document discusses some thoughts and ideas derived from the study of the characteristics of the sensors presented previously, and insights from the testing of particular technologies during the research.

 


As discussed in section 2.4, bio-signals collected from sensors can be used in game interaction in basically two ways: i) as a continuously monitored signal, correlated to running variables of the game, such as the difficulty or pace, or monitoring the player's progress towards a desired state perceived as a goal; ii) as signals monitored in relatively small time frames, at specific points of a story-driven game, acting as the sensing mechanisms of virtual agents. Certain signals, like ECG, body temperature/thermal flux, and EEG, lend themselves more to being handled as continuous. Naturally, in a game with intensive physical motion interaction, the values of heart rate, temperature, and skin conductance are expected to increase, which could make their use for emotion recognition purposes problematic.

EEG signals are also interesting to monitor throughout a game. EEG sensors can give an indication of the cognitive load of the player, making it interesting to study the correlation of physical and mental state during a game, seeking additional signs to support the idea of embodied learning. The main problem with EEG sensors is that the signals monitored are very weak and noisy, and the electrodes must be positioned very carefully on the scalp.

Three commercial wireless EEG sensors were tested during the research as possible solutions for the EPLT installation. The first one was the Emotiv EPOC headset. EPOC uses 14 gold plated electrodes that need to be moisturized and placed carefully on the user's scalp. The sensor is able to monitor 4 mental states, 13 conscious thoughts, and facial expressions, and also includes 2 gyroscopes to track head movement. Although EPOC is an interesting piece of hardware and software, it was found not suitable for a public interactive space. Placing all the moisturized electrodes in the right position can take significant time, and it is difficult for the electrodes to maintain their position during a game with physical motion. Additionally, the advanced function of recognizing conscious thoughts also requires a lot of time for both user and machine training. After these findings the research turned to simpler solutions and tested Neurosky's sensors. Along with raw EEG values for 6 frequency ranges, Neurosky's sensors provide two values indicating attention and meditation levels, derived from what the company calls the eSense algorithm. The sensor amplifies the raw brainwave signal and removes the ambient noise and muscle movement; the eSense algorithm is then applied to the remaining signal, resulting in the interpreted eSense attention/meditation meter values. Previous research papers support that the sensor successfully indicates changes in the user's mental state [91][92]. The Mindset was the first sensor tested, featuring a single dry electrode to capture EEG signals and a pair of headphones. During tests it proved difficult to get from the single electrode the perfect signal required for the eSense values to work, demanding a lot of time and patience. Besides the electrode placed on the user's forehead, the sensor has 3 more contacts placed in the left headphone that need to make good contact with the skin. Even when the sensor had a perfect signal, it was difficult to maintain it while moving and jumping. The last sensor tested was Neurosky's latest sensor, called Mindwave [Figure 5]. Mindwave has an improved design over its predecessor, using a single electrode that is wider and more comfortable to wear, and an earlobe clip that ensures good contact with the skin. Testing has shown that the sensor indeed gets a good signal easily and can maintain it even during relatively intense motion.


 

Figure 5: Neurosky Mindwave single dry electrode EEG sensor
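As an indication of how the eSense values mentioned above can be consumed by an application, the sketch below (Python) reads the attention and meditation meters from NeuroSky's ThinkGear Connector service, assuming it is running locally with its JSON socket protocol on the default port (13854); the port, the configuration message and the packet framing are assumptions based on the vendor's socket protocol and may differ between versions.

import json
import socket

HOST, PORT = "127.0.0.1", 13854          # assumed ThinkGear Connector defaults

sock = socket.create_connection((HOST, PORT))
# Ask the connector for parsed JSON packets instead of raw binary output.
sock.sendall(json.dumps({"enableRawOutput": False, "format": "Json"}).encode())

buffer = b""
while True:
    data = sock.recv(4096)
    if not data:
        break
    buffer += data
    while b"\r" in buffer:                # packets assumed to be CR-delimited
        line, buffer = buffer.split(b"\r", 1)
        try:
            packet = json.loads(line)
        except ValueError:
            continue                      # skip partial or garbled packets
        esense = packet.get("eSense")
        if esense:
            print("attention:", esense.get("attention"),
                  "meditation:", esense.get("meditation"))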

 

Pupil size was found not practical to use as a biofeedback mechanism. An immersive environment with continuous visual stimuli and physical motion of the player is expected to affect eye movement and pupil size, and the requirement of a wearable camera at a close distance from the player's eye would limit the sense of freedom to move.

As mentioned in section 4.5, the accuracy of emotion recognition systems based on facial expressions, speech analysis and body postures can be limited by environmental conditions. Thinking specifically of a game with motion interaction, we would expect the user to move a lot within the space, to be at a distance of some meters from the camera, and to adopt postures suggested by the game action. These factors, along with the fast transitions from one emotional state to another experienced during moments that are intense for the progress of the game, make these modalities practical to monitor as continuous signals only for later statistical analysis, and not as signals correlated continuously with runtime properties of the gameplay.

During the research, tests on facial expression analysis were made using the SHORE SDK [64] provided by the Fraunhofer research organization. The SHORE engine is able to detect and analyze multiple faces in a frame, providing gender recognition, an age estimation, and an analysis of facial expressions with indication values for 4 basic emotions: angry, happy, sad and surprised. Tests have shown that SHORE's performance is very high even under low illumination and when the face covers a small area of the frame. Emotion classification, however, proved to be accurate mostly for the emotion of happiness, which is the most obvious one, derived from analyzing how much the subject is smiling.

Skin conductance values have been reported to vary a lot between persons in comparable states, and the detection of a sudden emotional context change is seen as a sudden increase of the value compared to previous ones. This makes skin conductance also more suitable for monitoring over a short time frame in which we want to observe the player's reaction to specific game stimuli.
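A minimal sketch of the kind of relative-change detection described above is given below (Python); the window size and threshold are arbitrary assumptions that would need per-subject tuning, not calibrated values.

from collections import deque

def detect_scr_onsets(samples, window=20, rel_increase=0.15):
    """Yield indices where skin conductance jumps relative to its recent baseline.

    Each sample is compared to the mean of the previous `window` samples;
    a response is flagged when the relative increase exceeds `rel_increase`.
    """
    history = deque(maxlen=window)
    for i, value in enumerate(samples):
        if len(history) == window:
            baseline = sum(history) / window
            if baseline > 0 and (value - baseline) / baseline > rel_increase:
                yield i               # sudden rise relative to recent values
        history.append(value)

# Example with synthetic data: a flat trace with a jump at index 50
trace = [2.0] * 50 + [2.6] * 10 + [2.1] * 40
print(list(detect_scr_onsets(trace)))   # indices around the jump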

 


The ultimate application of emotion recognition systems in story-driven games would be to develop virtual actors demonstrating signs of artificial emotional intelligence by reading input from sensors while interacting with the user. As an example, Self City [93] is a previous project of the Waag Society on gamifying the learning of social-emotional skills. Self City transferred the player into a virtual city in which the player could train her social skills by interacting with other virtual avatars, in simulations of daily social life and conflict scenarios. In these scenarios the player was called, for example, to deal kindly and calmly with an aggressive doorman or with someone who took her place in the ticket queue, or to kindly ask another person for something. The player was guided by another avatar, her personal social skills advisor. Self City was designed in Second Life, an online virtual world in which users interact with each other through avatars. Behind all the avatars of Self City there were educators interacting with the user.

Transferring game scenarios like those of Self City to a multi-sensor interactive space, virtual actors could use emotion recognition systems to sense the player's emotions and trigger corresponding pre-programmed behaviours. A virtual actor could enable emotion recognition at the beginning of an interaction, for example when the player is in close range, sensing whether the player is kind, smiling (facial expressions), talking calmly or using "please" (speech analysis/keyword recognition), or whether the player looks angry or is scared (skin conductance). Monitoring speech and/or facial expressions and recognizing keywords, the virtual actor could detect the end of a phrase or a pause, and use the output of the running emotion recognition algorithms to trigger behaviours based on the story script, rewarding the player, simulating empathy, or acting as if it has been insulted or upset. Although the current state of the above technologies may require the player to overact her emotions, and delayed responses would create an unnatural flow in the interaction, considering the progress made for example in speech and action recognition systems during the last years, implementations of intelligent virtual actors will become more appealing and easier to use for interactive storytelling. For more information on action recognition and intelligent virtual agents see [94][95][96].
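As a purely illustrative sketch of the kind of script-driven triggering described above (Python; all behaviour names, thresholds and input fields are hypothetical, not taken from Self City or any existing engine), a virtual actor could map a snapshot of recognition outputs to a scripted behaviour as follows.

def choose_behaviour(snapshot):
    """Map a snapshot of recognition outputs to a scripted behaviour name.

    `snapshot` is assumed to hold values such as:
      smiling (bool), said_please (bool), arousal (0..1), anger (0..1).
    Thresholds and behaviour names are arbitrary placeholders.
    """
    if snapshot.get("anger", 0.0) > 0.6:
        return "act_insulted"
    if snapshot.get("arousal", 0.0) > 0.7:
        return "show_empathy"            # the player seems stressed or scared
    if snapshot.get("smiling") and snapshot.get("said_please"):
        return "reward_player"
    return "neutral_response"

# Example: the player smiles and says "please" while calm
print(choose_behaviour({"smiling": True, "said_please": True, "arousal": 0.3}))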

 

Closing this chapter, the table below [Table 1] presents a summary of all the biofeedback mechanisms studied, with their main characteristics and constraints:

Biofeedback signal  | Emotions elicited                                   | Characteristics/Constraints
Speech              | Anger, Happiness, Surprise, Sadness, Disgust, Fear  | - Expensive; - Performance depends on training; - Subject to bias
Facial expressions  | Anger, Happiness, Surprise, Sadness, Disgust, Fear  | + Determines displeasure or pleasure; - Requires a clear image of the subject's face; - Performance depends on training; - Subject to bias
Body postures       | Anger, Happiness, Surprise, Sadness, Disgust, Fear  | - Difficult to determine accurately; - Performance depends on training; - Subject to bias
Eye tracking        | Attention (eye movement), Arousal (pupil size)      | + Reliable; - Difficult to measure in a dynamic environment; - Difficult to determine displeasure or pleasure; - Expensive
Heart rate/ECG      | Arousal, Fear, Anger                                | + Familiar, easy to measure; + Cheap; - Lag between onset and stimulus; - Prone to movement artifacts
Skin conductance    | Arousal, Frustration, Surprise, Fear, Anger         | + Minimal lag to stimulus; + Robust to movement; + Easy to measure; + Cheap; - Difficult to determine displeasure or pleasure; - Variable range across subjects
Body heat           | Arousal, Anger, Fear, Cognitive load                | + Easy to measure; + Cheap
Brain signals (EEG) | Drowsiness, Relaxation, Alertness, Attention        | + Determines cognitive load; - Noisy signals; - Prone to movement artifacts; - Raw values are hard to interpret

Table 1: Overview of emotion recognition modalities

 

 

 


Chapter  5:    Hardware  and  software  platforms  for  multi-­‐sensor  interactive  spaces    

5.1  Sensor  Hardware  Platforms    

There is a very large number of companies producing sensors and offering specialized solutions for projects of any nature. As final products designed for a specific use, though, these solutions often introduce restrictions when applied to custom setups and when working together with custom written software. The architectural design of a project featuring multiple sensors requires not only a sensor network that makes sure that all sensors work together without problems, but also a network that can be customized to fit the project's data flow design. The use of sensor platforms meets these two requirements, offering a common standard base between sensors and the freedom to customize their function and connectivity. The following part presents some examples of sensor platforms used today, with different design approaches.

Arduino    

Arduino is an open-source electronics platform. It is designed as a low cost, expandable, multi-purpose prototyping platform based on flexible, easy to use hardware and software. Since its introduction, Arduino has created a very large community sharing support and code; it is used for education in a lot of laboratories around the world, and has become a standard for interaction designers, media artists, and hobbyists.

The basic Arduino platform consists of three parts. The first is the Arduino microcontroller board, which can be built by hand using the provided schematics or purchased preassembled, in different versions and sizes, including versions designed to implement wireless nodes, with an XBee* radio connector and circuitry for battery and charging, or versions like the LilyPad, designed so that it can be sewn onto fabric for wearable applications. The Arduino microcontrollers are based on the Atmel 8-bit AVR family of microcontrollers with RISC architecture.

The second part of the platform is the language and compiler. Arduino's language is based on C and designed to simplify the creation of physical interaction applications, in combination with the use of the third part, the IDE, which is built on Java. The three parts make a platform with a simplified programming language, used to create instructions for a controller, basic enough to be easily used for common programming tasks, yet powerful enough to support complex projects.

Arduino can be expanded with a great variety of add-ons, the Arduino shields as they are called, and with a wide range of motion and environmental sensors, network devices and servomotors, and can be used to implement wireless sensors, tangible interfaces and robots.

*XBee is a ZigBee-enabled radio module commonly used with Arduino. ZigBee is a wireless communication standard designed to be inexpensive and to have low power consumption. Most importantly, ZigBee is particularly well designed for mesh networks, with peer to peer connections, instead of a single router network.

.Net  Gadgeteer    

Following Arduino's success, Microsoft Research recently launched the .NET Gadgeteer open source platform, a microcontroller based on the ARM7 processor, designed to be programmed through Microsoft's .NET (Micro) Framework and C# and expanded through solder-less connection modules. The idea of solder-less connection modules encourages more people without any experience in building circuits to try to build their own gadget prototypes. Since Gadgeteer is a very new platform, and since it uses its own connection standard, the list of available modules/sensors is still limited.

Phidgets    

Phidgets is also a platform based on the concept of Arduino, designed to be even simpler. Phidgets is a line of plug and play building blocks for physical computing that can be connected over USB to a computer and communicate with any application. The Phidgets API handles all the USB communication with the devices, simplifying the communication with applications. Arduino supports the creation of more complex projects, but Phidgets allows you to build simpler prototypes faster, and supports programming in a large variety of programming languages, including high level languages like C# and ActionScript 3, as well as visual programming frameworks like Max/PureData and LabVIEW.

Shimmer

Shimmer is an open source platform for small, wearable, wireless sensors. Shimmer started as a project of Intel Research and is now a division of Realtime Technologies. Unlike the previous hardware platforms presented, which focus on multi-purpose prototype building, Shimmer produces preassembled, highly sophisticated sensors, focusing more on research around Body (or Personal) Area Networks (B/PAN). BAN research aims at the development of wireless distributed systems for the autonomous and remote monitoring of patients in health care.

The Shimmer platform consists of the main unit, a lightweight pack with an MSP430 processor, a battery, Bluetooth and 802.15.4 connectivity, a micro SD memory slot for offline data storage, a tilt sensor and an accelerometer. A variety of motion, biophysical and ambient sensors can be connected to the unit. The firmware of the unit embeds TinyOS, a very light and highly customizable open source operating system specially designed for low power embedded systems and sensor networks. Shimmer supports the development of applications in C# and also provides a LabVIEW library, while every unit is an autonomous node, providing data in raw or semi-processed format, accessible to all applications via custom libraries.

I-­‐CubeX    

I-CubeX is a commercial platform producing a variety of sensors and providing multiple sensor kits for research and interactive projects. I-CubeX provides an API with support for various languages and environments like C++, ActionScript and Max/Jitter, while the sensors can communicate directly with musical keyboard instruments using the MIDI interface. On the platform's website there are a lot of application code examples and sensor kits suggested for a wide range of interactive application categories.

 

5.2  Interactive  software  development  platforms    

This part of the document is a short presentation of various useful frameworks and toolkits for interactive application and data visualization programming. Although a lot of the frameworks mentioned below share many common elements, this list serves two purposes. The first is to cover frameworks written in different languages, so that the reader can find one written in a language familiar to him, or one that better serves his project's requirements. The second purpose is to encourage the reader to visit and explore the websites of the tools mentioned, where previous work of very talented programmers and artists is showcased, often with available source code, which makes them a great source of inspiration for anyone interested in multimedia programming and visual arts.

Processing (Java based) is an open source programming language and environment focusing on graphics and interaction programming. Based on a very minimal environment, Processing was developed as a "software sketchbook" and a tool to teach fundamental computer programming for the visual arts. Processing was the first of a series of frameworks that appeared during recent years, wrapping a growing collection of standard libraries for graphics, image, video and audio manipulation, network libraries, physics engines and many more, while also offering simplified interfaces to all these libraries that make it simple to combine them inside a program.

After the success of Processing, openFrameworks (C++) was released, following the same concept, using C++ to deliver applications with better performance than Processing and access to native C++ libraries, and also offering the ability to develop native applications for the iOS and Android mobile platforms. openFrameworks has built a very large support community and has been used successfully for everything from mobile apps to large and complex interactive installations. Beyond the basic standard libraries wrapped by openFrameworks, users are constantly expanding the list of add-on libraries and components, including libraries for creating tangible interfaces and physical interaction, like the TUIO and TouchLib libraries, and the OpenNI framework, which has already produced a few very interesting projects using the Kinect sensor. Cinder (C++) and Polycode (C++/Lua) are two other open source toolkits similar to openFrameworks.

Visual  Programming  Languages    

Visual programming languages combine traditional coding with tools that allow the user to handle all components as blocks on a canvas. Each block has some kind of input signal, and the code inside the block determines its output. In that way the user controls the flow of data inside a program by virtually wiring signals to the blocks' inputs and outputs. Apart from offering, through this visual schematic, a clearer structure to people with no programming background, visual programming languages also focus more on live, or run-time, coding, allowing the behavior of a block to be changed without requiring recompilation of the whole program.

The most popular visual programming languages are Max, developed by Cycling '74, and PureData, its free open source equivalent, actually developed by one of the initial developers of Max, Miller Puckette. Max and PureData became particularly popular with musicians, since electronic music was one of the first fields utilizing digital technology and programming, and this logic of dataflow programming, wiring together different signals, effects and sensors, was something that musicians were already familiar with from recording studios. Today both tools have a very large collection of patches and programming APIs to integrate different effects and sensors.

Isadora, developed by TroikaTronix, the software branch of Troika Ranch, a media-intensive dance company, is a visual programming language focusing mainly on the manipulation of video and audio for live performances, supporting up to 6 different independent outputs, and also including a C++ SDK to develop custom filters and effects.

Field is a Python based open source toolkit developed by OpenEndedGroup, a team of artists also experienced in interactive installations and working on theatre and dance performances. Field includes a Processing plug-in which replaces the Processing IDE, and through which all Processing libraries can be used in Field. A program written in Field can also include code in other programming languages, including languages that execute inside other applications like Autodesk Maya and Adobe After Effects. Field supports only Mac and Linux platforms.

VVVV   is  another  new  visual  programming   toolkit,   free   for  non-­‐commercial  use,   compatible   with   Windows   platform   only,   using   DirectX   libraries   and  supporting  programming  in  C#.  

QuartzComposer is part of Apple's Xcode framework, for visual programming using native libraries of Mac OS X.

 

 


Working  with  sensors    

For working more specifically with sensors, signal processing and pattern recognition, the most popular applications offering both visual and traditional programming are LabVIEW, by National Instruments, and Simulink, developed by MathWorks.

BioMOBIUS is an open platform, developed by an open community of researchers and by the TRIL Centre, which allows researchers to rapidly develop sophisticated technology solutions for biomedical research. It was developed with the philosophy of providing a common technology platform comprising hardware, software, services and sensors. The BioMOBIUS development environment is based on EyesWeb, and provides support for designing applications based on the Shimmer sensor platform.

Exemplar is an open source kit for programming prototypes using sensors, developed by Stanford University's Human-Computer Interaction Group. Exemplar is a plug-in written for the Eclipse IDE, offering a GUI through which it is possible to visually monitor live sensor signals and manipulate them.

   ROS  (Robot  Operating  System)  is  an  open  source  project  providing  libraries  and   tools   like   device   drivers,   message   passing   middleware,   computer   vision  libraries,  and  more  features  to  support  the  creation  of  robot  applications.  Since  robots  are  an  ensemble  of  sensors  and  motors,  ROS  features  could  also  support  the  creation  of  a  project  utilizing  a  network  of  autonomous  sensor  nodes.  Among  other   sensors,   ROS   now   includes   drivers   and   libraries   for   the   Kinect   sensor,  which   is   a  perfect   solution   for   computer   vision   in   low   cost   robot  projects,   and  has  already  been  used  with  very  interesting  results.    

A result of the combination of ROS with the Kinect sensor is the Point Cloud Library (PCL), a sister project of ROS, which includes state of the art algorithms for 3D point cloud processing: filtering, feature estimation, surface reconstruction and registration, model fitting and segmentation.

 

 

 

 

 

 

 


Chapter  6:    A  generic  architecture  for  multi-­‐sensory  interactive  systems  

6.1  Architecture  Description    

As described in the introduction of this document [see Ch. 1], the EPLT is meant to be an open platform to be used by developers, artists and researchers for the development, experimentation, testing and support of multi-sensor technologies applied to interactive applications. As such, the EPLT should feature a flexible, extendable and scalable architecture that can be adapted according to the application built upon it and the equipment used for the input and output of the interactions. A major characteristic of this architecture, included also in the problem statement of this thesis, is the existence of a common framework for the collection and processing of data from the various wearable sensors used.

Based on the above requirements, formed by the EPLT project's description, and after research conducted by the author on previous related work and on development platforms for interactive applications [see Ch. 5.2], this chapter proposes a generic architecture scheme for multi-sensor interaction spaces. The basic elements of this scheme are shown in the figure below [Figure 6]; it is composed of three basic levels. The world level corresponds to the actual sensors, such as a motion sensor, a heartbeat sensor and an EEG sensor, and to the output generators of the system, such as a projector, speakers and lights.

 

 Figure  6:  A  generic  architecture  for  interactive  spaces  


The device level corresponds to the low-level hardware and software responsible for the collection and transmission of data from the various sensors to the application, and to the sub-systems controlling the output mechanisms used by the application. The application level corresponds to the system accepting data from the sensors as input and processing it into the corresponding output.

The main characteristic of the proposed architecture is that the components composing the different parts of the device and application levels can correspond either to processes running on the same computer, or to processes running distributed over a network of computers, each one implementing a different part of the interactive system. In order to provide this scalability, a messaging service is established between the device and the application level, using the Open Sound Control (OSC) protocol.

OSC is a communication protocol originally developed at the UC Berkeley Center for New Music and Audio Technology (CNMAT) for communication among computers, sound synthesizers, and other multimedia devices, optimized for modern network technology. OSC's advantages include interoperability, accuracy, flexibility, and enhanced organization, featuring open-ended, dynamic, URL-style symbolic naming, symbolic and high-resolution numeric argument data, a pattern matching language to specify multiple recipients of a single message, high resolution time tags, and "bundles" of messages whose effects must occur simultaneously. OSC messages are usually transmitted over the UDP protocol. Due to its flexibility and simplicity, OSC has gained a lot of popularity and has been implemented in a growing list of programming languages and libraries, like the ones presented in the previous chapter, real time sound and media processing environments, software and hardware synthesizers, sound and light consoles, and various tangible interfaces. Other messaging systems were considered during the research, such as the Virtual-Reality Peripheral Network (VRPN) [97]. VRPN is a device-independent and network transparent interface to virtual-reality peripherals, supporting a wide range of controllers. VRPN offers similar functionality to OSC, but OSC was preferred because of the wider range of applications supporting it, including, as mentioned above, applications used by non-programmers.

In the proposed architecture the Input Control System is a set of processes reading data from the actual sensors, which can be connected in various ways, for example through USB, Bluetooth, RF, WiFi etc., and transmitting these data values as OSC messages to the other components of the device and application levels. Although the use of messaging in an application running on a single computer introduces some additional load and latency to the system, it preserves a degree of independence between the device and the application level, increases the reusability of code when extending the system, establishes a basic framework for developing sound and media interactions without any additional code, and sets a communication standard which also allows someone to test the interactive system without using the actual sensor devices or having to go into the low-level details of how the computer reads data from a sensor device.
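A minimal sketch of how an Input Control System process could publish sensor values over OSC is given below (Python, assuming the third-party python-osc package); the addresses, ports and OSC address patterns are illustrative choices, not part of the proposed architecture itself.

from pythonosc.udp_client import SimpleUDPClient   # third-party python-osc package

# Hypothetical endpoints: the game engine and a sound/light controller
# listening for OSC over UDP somewhere on the local network.
game_engine = SimpleUDPClient("192.168.0.10", 9000)
light_desk = SimpleUDPClient("192.168.0.20", 9001)

def publish(heart_rate_bpm, attention, meditation):
    """Forward one snapshot of sensor readings as OSC messages."""
    game_engine.send_message("/sensors/heart_rate", heart_rate_bpm)
    game_engine.send_message("/sensors/eeg/attention", attention)
    game_engine.send_message("/sensors/eeg/meditation", meditation)
    # The same values can be fanned out to any other OSC-capable system.
    light_desk.send_message("/sensors/eeg/attention", attention)

publish(84, 63, 40)   # example values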

 


6.2  Use  Case    

Following a general use case of the EPLT, which demonstrates the architecture on a larger scale as shown in the previous figure [Figure 6], the user enters the installation space, where his first action is to place a set of bio-feedback sensors on his body. Body sensors are sensitive devices and must be placed correctly in order to get good signal quality and correct values. A member of the staff helps the user place the sensors correctly and monitors the sensor signals on a laptop computer, which runs the Input Control System. When all sensors are placed correctly, the staff member gives the user a brief description of what the sensors measure and how, by showing the user a visualization of the collected data. The user then has to become comfortable with the sensors and relax in order to enter the system in a neutral state. When the user is good to go, and everything is set correctly, the staff member can initialize the OSC connection between the Input Control System and the rest of the applications running distributed over the network, and launch the game engine running on another computer backstage. At the application level, apart from the game engine, there can be an application storing all bio-signal data during the session for later review. Independently of the output produced by the game engine, certain sensor values can be mapped directly to channels of another system with a sound console and a DMX controller, controlling the sound and lights of the space. For example, a feedback sound effect of a heart beating, or the bpm of a soundtrack, can be directly synced with the heartbeat rate of the user, and a spotlight can follow the position of the user in space, changing brightness and color according to the attention and meditation levels of the user as they are captured by an EEG sensor. The use of OSC makes the system easy to extend, and allows someone, or a group of people, to design interactions using a variety of software that as a whole will create a rich and immersive interaction experience.
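On the receiving side, any of the distributed components can subscribe to these messages and map them to its own outputs. The sketch below (Python, again assuming the python-osc package; the port, address patterns and mappings are illustrative) listens for heart rate and attention values and converts them into a soundtrack tempo and a light brightness, in the spirit of the mappings described above.

from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

def on_heart_rate(address, bpm):
    # Sync the soundtrack tempo directly to the player's heartbeat.
    print(f"set soundtrack tempo to {bpm} bpm")

def on_attention(address, attention):
    # Map the 0-100 attention meter to a 0-255 DMX brightness value.
    brightness = int(max(0, min(100, attention)) * 255 / 100)
    print(f"set spotlight brightness to {brightness}")

dispatcher = Dispatcher()
dispatcher.map("/sensors/heart_rate", on_heart_rate)
dispatcher.map("/sensors/eeg/attention", on_attention)

# Listen for OSC messages from the Input Control System.
server = BlockingOSCUDPServer(("0.0.0.0", 9000), dispatcher)
server.serve_forever()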

Having the Input Control System running in a different process or system from the application level also makes it easier for the system to handle and recover from errors. During a game session, a sensor might suddenly lose its connection to the system, or lose contact with the body of the player because of a sudden move or a jump, resulting in incorrect values. Errors like these are easier to spot by someone monitoring the signal quality and raw values of the sensors. In case of an error, the staff member can temporarily disable the transmission of sensor data to the rest of the system, leaving it to continue running based on the last valid state, while trying to reset the connection with the particular sensor; or, if the error is critical for the progress of the game, he can decide to pause the game, help the player to get the sensor placed correctly on his body, and resume the game.

An adaptation of the generic scheme presented here is described in more detail in the next chapter, which covers the development of a small-scale prototype demonstrating the use of multiple sensors in a game environment.


Chapter  7:    Prototyping  a  virtual  board  game  with  physical  interaction  

7.1  Introduction    

This chapter describes the prototyping phase of the thesis, which focused on the development of a prototype aiming at the following targets:

• Study the characteristics of certain commercial sensors from a programming perspective, and their performance in interactive applications

• Demonstrate  a  basic  implementation  of  the  architecture  proposed  in  the  previous  chapter  (Ch.  6)    

• Develop a base that would allow the study of data collected from body sensors, in order to observe the ranges of values on which more complex interactions can be designed, and their possible correlation with the concept of Embodied Learning

The main idea of the prototype, conceptualized by the author, was to use a motion capture sensor to create a board game that would blend traditional forms of children's games with modern video games. Characteristics such as dynamic computer graphics, sound effects, fantastic virtual worlds, and the ability to play with someone over a distance have made video games very exciting and engaging for children. On the other hand, traditional games that used to be more popular in the past, for example hopscotch [Figure 7], although they might seem rather simple by today's hi-tech standards, motivated the physical exercise of children while offering a playful experience. Modern devices such as the Microsoft Kinect sensor give us the ability to combine the best parts of both forms of game.

Figure 7: Hopscotch, a classic example of a children's game


7.2  Preliminary  studies    

Designing a board game based on the Kinect sensor first of all required determining the maximum square active skeleton-tracking area available for the board. The first version of the game prototype developed by the author, NumHop, is a minimal game built in C++ and OpenGL with the Microsoft Kinect SDK (beta 1), using the cinder framework (see 5.2), as a preliminary study of the Kinect's active area and performance. In NumHop [Figure 9] the player is called to add numbers up to a certain target by jumping onto the numbered tiles of a 4x4 board that updates itself every 3 seconds. The faster the player reaches the target sum, the more points he gains, while the player loses a game round if he exceeds the target sum.

Kinect's minimum distance for skeleton tracking is approximately 1.3 meters and its maximum approximately 3.8 meters. The sensor has a horizontal field of view of 57°. After testing, the optimal maximum square area was determined to be around 4 m², extending from 1.5 m to 3.5 m away from the sensor. The diagram below [Figure 8] shows the scene model in OpenGL world space and the camera model. According to this, the camera in OpenGL is placed on the positive z-axis, facing towards the negative z-axis, and the Kinect sensor is placed at the origin (0,0,0), facing the positive z-axis. The application uses all tracked joints to draw the skeleton, and the position of the center-of-mass joint of the tracked user to determine the tile the player is standing on. The player has to stand on a tile for at least one second to select it. In its graphical user interface (GUI) the application provides a control through which the position of the camera can be adjusted inside the OpenGL world space. By doing that, the game could also be played in an alternative setup where the virtual camera is placed above the board, and the game is projected onto the actual floor.
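A minimal sketch of the tile-selection logic described above, assuming the 2 m x 2 m board placed between 1.5 m and 3.5 m from the sensor and the one-second dwell rule; class and member names are hypothetical and not taken from the prototype's code.

using System;

// Maps the tracked centre-of-mass position to a tile of the 4x4 board
// (x in [-1, 1] m, z in [1.5, 3.5] m, sensor at the origin looking down +z).
public class TileSelector
{
    const float MinX = -1f, MinZ = 1.5f, Size = 2f;
    const int Tiles = 4;
    const double DwellSeconds = 1.0;  // stand on a tile for one second to select it

    int lastTile = -1;
    double dwell;

    // Feed the centre-of-mass position each frame; returns 0..15 once selected, else -1.
    public int Update(float x, float z, double deltaSeconds)
    {
        int col = (int)Math.Floor((x - MinX) / Size * Tiles);
        int row = (int)Math.Floor((z - MinZ) / Size * Tiles);
        if (col < 0 || col >= Tiles || row < 0 || row >= Tiles)
        {
            lastTile = -1; dwell = 0; return -1;
        }

        int tile = row * Tiles + col;
        dwell = (tile == lastTile) ? dwell + deltaSeconds : 0;
        lastTile = tile;
        return dwell >= DwellSeconds ? tile : -1;
    }
}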

 Figure  8:  Kinect  board  in  OpenGL  scene  


NumHop was presented among other projects of the Waag Society institute at an open day event of the Creative Learning Lab for educators, receiving many good comments on the potential of the Kinect technology and of the prototype itself for educational use.

Figure 9: View of the first NumHop prototype

The preliminary phase of prototyping continued with the development of interfaces and small applications in the cinder framework to test other sensor technologies, including the SHORE facial expression analysis library, provided by the Fraunhofer research organization, the Mindwave EEG Bluetooth sensor by Neurosky (see 4.9), and the Zephyr HxM Bluetooth heart rate sensor. Both Bluetooth sensors use a serial-over-Bluetooth communication protocol to transmit values to the applications. Wrappers developed in cinder for the two sensors were later used in the implementation of the "hardware" level of the next version of NumHop, described in the next section.

7.3  NumHopII  -­‐  The  Game    

Motivated by the concept of NumHop and the good comments the first sample received, the prototyping phase continued with building an enriched version of the game that would feature more interaction with the Kinect and additional biofeedback mechanisms, and that would cover the targets mentioned in the introduction of this chapter. The final prototype presented here uses the Kinect sensor, the Mindwave EEG sensor, and the Zephyr HxM ECG sensor.

The sensors used in the prototype were chosen from those discussed in chapters 3 and 4, based on their suitability for the specified game mechanics, and out of the author's personal curiosity to work with them. The Kinect sensor is the latest, state-of-the-art device in motion capture, with very good performance and ease of use. Brain-computer interfaces are a very new technology, at least at a commercial level, with theoretically very promising features. This character of an emerging technology adds novelty to the game, and the use of an indication of attention and meditation levels is believed to intrigue the player and enhance the gaming experience. Heart rate, on the other hand, has the value that (large enough) fluctuations are internally sensed by the player. The visualization of the heart rate, and the interaction based on it, creates another link between the physically sensed body and the virtual environment, which enhances the feeling of immersion. The combination of the two body sensors is believed to be a good base for further study of the potential relation between physical activity and mental state.

In the final version of NumHop [Figure 10], the player is placed in a large virtual hall. In front of the player, placed on the floor, there is again a board of 16 numbered tiles. The rest of the scene contains 6 teleport chambers placed along the walls of the hall. The player is called to answer questions on simple multiplication tables, for example the result of 6 x 7. The tiles of the board are numbered with values close to the correct result, with at least one containing the correct number. The player then has a few seconds to select his answer by stepping onto a tile. The faster the player responds correctly, the more points he gains. If the player does not respond, the game moves automatically to the next question. If, on the other hand, the player selects a wrong answer, the board moves to the next question, an enemy robot is teleported into the scene through one of the 6 chambers, and it starts approaching the player with bad intentions. The player can defend himself against the robots by activating his superpower (raising both hands above shoulder level) and aiming it at the robots. The player starts the game with a certain level of superpower, which is reduced by use. When, however, the Mindwave sensor that the player wears on his head detects a high level of attention, the superpower level starts to increase and the player can activate it again. If the player runs out of superpower, he has to suffer the robots' hits, which reduce the player's health level. If the player survives an attack, he can step back for a moment and try to relax: when the Mindwave sensor detects a high level of meditation, the health level of the player is increased. The player is given 3 lives at the beginning of the game, and bonus lives can be gained after a number of consecutive correct answers.
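The biofeedback rules above can be summarized in a sketch like the following; the thresholds, recharge rates, and damage values are illustrative assumptions, not the prototype's actual tuning.

using System;

// Sketch of the attention/meditation-driven rules described above.
public class PlayerVitals
{
    public float Health = 100f;
    public float SuperPower = 100f;
    public int Lives = 3;

    const float AttentionThreshold = 70f;   // eSense attention, 0-100
    const float MeditationThreshold = 70f;  // eSense meditation, 0-100
    const float RechargeRate = 10f;         // superpower units per second
    const float HealRate = 5f;              // health units per second
    const float DrainRate = 25f;            // superpower drain while the beam is active
    const float RobotHitDamage = 10f;

    // Called once per frame with the latest sensor values.
    public void Tick(float attention, float meditation, bool firingBeam, float dt)
    {
        if (firingBeam)
            SuperPower = Math.Max(0f, SuperPower - DrainRate * dt);
        else if (attention >= AttentionThreshold)
            SuperPower = Math.Min(100f, SuperPower + RechargeRate * dt);

        if (meditation >= MeditationThreshold)
            Health = Math.Min(100f, Health + HealRate * dt);
    }

    public void OnRobotHit()
    {
        Health -= RobotHitDamage;
        if (Health <= 0f) { Lives--; Health = 100f; }
    }
}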

The GUI element representing the health level also includes the heart rate value in beats per minute (bpm), obtained from the Zephyr HxM sensor that the player is wearing. The heart rate value is not directly connected to any element of the game. Although there was the idea of correlating the heart rate to the update interval of the board, it was finally abandoned, for two reasons. The first is that, because the heart sensor has to be placed under the player's clothes, it might prove impractical to use in a school test environment. The second is that designing a certain interaction based on heart rate requires knowing in advance the expected range of values during the game, knowledge and expertise that was not available at the time of development. The presence of the heart rate value, however, was thought to be useful, as explained earlier: first for observation of the values for further use, and second to see how players respond to this information, for example whether placing the heart rate value of the player on the GUI, where it could be perceived as another form of score, motivates the player to raise his heart rate by moving more intensively.

 


 Figure  10:  View  of  NumHopII  prototype  scene  

7.4  Architecture  Overview    

In order to make this final version of NumHop, with better graphics and more complex game play, the game was redesigned on the Unity3D platform. Unity3D is a game development platform that combines a visual editor and a programming environment, which simplifies the procedure of developing games for multiple platforms, providing a physics engine and, among others, easy-to-use tools to program animations, collision detection, and particle systems. Unity can be programmed using scripts written in C# or JavaScript. The prototype was developed by the author of this document, based on 3D models, textures, sounds, and pieces of code found in Unity's sample projects, tutorials, and the Unity online community.

The prototype consists of two applications. The first one implements the sensors' "device" level [see 6.1] and is developed in cinder; the second one is the game itself, developed in Unity3D. The first application, SensorOSCTransmitter, communicates with the Mindwave and Zephyr HxM sensors and transmits the sensor values to the game in OSC messages. The application also features three oscilloscopes to monitor attention, meditation, and heart rate over time [Figure 12], as well as the ability to store values in a log file. The log file contains the values of: seconds since the log started, attention, meditation, and heart rate (bpm), in columns, in simple CSV (Comma Separated Values) format, so that it can easily be imported into a spreadsheet application for further study and statistical analysis.
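For illustration, the following is a minimal C# sketch of a logger producing the CSV layout described above. The actual transmitter is written in C++ on cinder, and the header row and field names here are assumptions.

using System;
using System.Globalization;
using System.IO;

// Writes one CSV row per call: seconds since the log started, then the three sensor values.
public class SensorLog : IDisposable
{
    readonly StreamWriter writer;
    readonly DateTime start = DateTime.UtcNow;

    public SensorLog(string path)
    {
        writer = new StreamWriter(path);
        writer.WriteLine("seconds,attention,meditation,heartrate_bpm");
    }

    public void Append(float attention, float meditation, float heartRateBpm)
    {
        double seconds = (DateTime.UtcNow - start).TotalSeconds;
        writer.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "{0:F1},{1},{2},{3}", seconds, attention, meditation, heartRateBpm));
    }

    public void Dispose()
    {
        writer.Dispose();
    }
}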

Following an adaptation of the generic architecture described in the previous chapter, the Kinect sensor is connected directly to the game engine and not to the general input device level. This option is better for this particular project, because the game's avatar movement is more responsive to the player's physical movement; it also gives the ability to have a preview screen of the sensor's view inside the game, and it improves the overall performance. The Kinect sensor communicates with the game through the OpenNI framework instead of the Microsoft Kinect SDK. Using OpenNI also allows the deployment of the game to both the Windows and Mac OS X platforms. The diagram below [Figure 11] represents the main components of the two applications forming the prototype, and the connections between them:

 Figure  11:  Prototype's  architecture  overview  


 

7.5  Main  components  overview    

This section provides a brief description of the main components and classes used by the two applications that form the prototype, from bottom to top as they appear in the previous diagram [Figure 11].

 

SensorOSCTransmitter  (C++  -­‐  cinder)    

Mindset is a cinder wrapper for the ThinkGear SDK provided by Neurosky. The class contains functions to connect to and read values from the Mindwave or Mindset EEG sensor using a serial-over-Bluetooth connection.

ZephyrHxM is a class containing functions to connect to and read values from the Zephyr HxM ECG sensor using a serial-over-Bluetooth connection.

ciOscilloscope is a class implementing a simple oscilloscope, drawing the contents of a C++ double-ended queue buffer. The height of the y-axis is automatically adjusted between the minimum and maximum values observed.

OSCSender implements the OSC-over-UDP communication, containing functions to construct OSC messages and bundles and send them to a specific IP address and port. The class is contained in the OSC "block" built into cinder.

 

 

Figure 12: View of the SensorOSCTransmitter application generating a test signal for the values of attention, meditation, and heart rate

 


 

NumHopII  (C#  -­‐  Unity3D)    

OscService is the object that uses the C# OSC and UDPPacketIO libraries and contains the OSCReceiver script, which starts a thread listening for OSC messages on a specific port and initializes two OSCListeners, one for each sensor. OSC messages sent from SensorOSCTransmitter have the following format:

/sensors_transmitter/(value_name)  (float value)

The Mindwave OSCListener listens for messages with the following addresses:

/sensors_transmitter/signal  (Signal  quality  of  mindwave  sensor),  

/sensors_transmitter/attention,  

/sensors_transmitter/meditation,

and the HeartbeatSensor OSCListener for messages with the address

/sensors_transmitter/heartrate  

When a message with one of these addresses is received, the OSCListener calls the corresponding setter function of the PlayerStatus script.
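A sketch of this dispatch step, assuming the addresses listed above; it deliberately avoids reproducing the exact API of the C# OSC and UDPPacketIO libraries and simply stores the latest value per address.

// In the prototype this role is played by the OSCListeners, which forward the
// values to the PlayerStatus script's setter functions.
public class SensorDispatcher
{
    public float SignalQuality;
    public float Attention;
    public float Meditation;
    public float HeartRate;

    public void OnOscMessage(string address, float value)
    {
        switch (address)
        {
            case "/sensors_transmitter/signal":     SignalQuality = value; break;
            case "/sensors_transmitter/attention":  Attention = value;     break;
            case "/sensors_transmitter/meditation": Meditation = value;    break;
            case "/sensors_transmitter/heartrate":  HeartRate = value;     break;
        }
    }
}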

KinectSensor is the object containing all the scripts necessary to communicate with the Kinect sensor via USB, using the OpenNI framework developed by PrimeSense (the wrapper scripts are based on an older version of the zigfu.com wrappers for Unity, no longer available online). Among other scripts, the object contains the OpenNISingleSkeletonController, which is activated after a user has been calibrated by the Kinect by standing in the "Y" pose indicated initially by the player's avatar. OpenNISingleSkeletonController updates joint positions by calling the corresponding function in the OpenNISkeleton of the player object. The object also contains the OpenNIPostureDetector script, in which the program compares the current hand, elbow, and shoulder joint positions. If the script detects the preset posture (both hands raised above shoulder level) it activates the SuperPower script of the Player. OpenNIDepthmapViewer renders a small preview display of the sensor's depth video feed.
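A minimal sketch of the posture test mentioned above, assuming world-space joint positions supplied by the skeleton controller; the margin value and names are assumptions, not the OpenNIPostureDetector internals.

using UnityEngine;

// "Both hands above shoulder level" check used to trigger the superpower.
public static class PostureSketch
{
    const float Margin = 0.05f; // require the hands slightly above the shoulders

    public static bool BothHandsRaised(Vector3 leftHand, Vector3 rightHand,
                                       Vector3 leftShoulder, Vector3 rightShoulder)
    {
        return leftHand.y  > leftShoulder.y  + Margin &&
               rightHand.y > rightShoulder.y + Margin;
    }
}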

Player, besides the 3D avatar model, contains the OpenNISkeleton script, in which joints from the Kinect sensor are mapped to joints of the 3D model, along with various variables controlling the behavior of this mapping, such as offsets, scale of transformations, damping, etc. The PlayerStatus script contains all information about the player, including game play values like score, lives remaining, health status, and superpower level, and the values received from the body sensors. When a value is updated, the PlayerStatus script calls functions to update elements of the game GUI. The Player object also contains a CharacterController script, which creates a capsule collider for the player's avatar, through which the player interacts with the game's board. The SuperPower script activates the rendering of the superpower lightning-effect particle system, starting from the player's hands and extending to the position of the target object found in the player's model tree, followed by a ray-cast collision test that calls the ApplyDamage function of the EnemyDamage script of the robot it collides with.
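A sketch of the beam's hit test using Unity's standard Physics.Raycast; SendMessage is used here so the snippet compiles without the prototype's EnemyDamage class, and the range and damage values are illustrative.

using UnityEngine;

// Casts a ray from the spine towards the aim target and damages the first robot hit.
public class BeamSketch : MonoBehaviour
{
    public Transform spine;    // ray origin, driven by the Kinect skeleton
    public Transform target;   // aim point in front of the player
    public float range = 50f;
    public float damage = 10f;

    public void Fire()
    {
        Vector3 direction = (target.position - spine.position).normalized;
        RaycastHit hit;
        if (Physics.Raycast(spine.position, direction, out hit, range))
        {
            // The prototype calls EnemyDamage.ApplyDamage on the robot that was hit.
            hit.collider.SendMessage("ApplyDamage", damage, SendMessageOptions.DontRequireReceiver);
        }
    }
}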

GameGUI contains an orthographic camera, in addition to the game's main perspective camera, used to render the graphical user interface of the game. The GUI consists of the player's HUD (Head-Up Display) and a panel for changing settings of the game during runtime. The game HUD consists of a text texture showing the current score, four animated circles indicating the current attention, meditation, superpower level, and health / heart rate / lives remaining values, and a text texture that displays the current question of the board to the player.

The game settings panel, handled by the GameSettingsGUI script, is activated by pressing the "p" key. This panel includes various game play parameters, like the board's update interval in seconds and the player's maximum superpower and health levels, parameters to adjust the Kinect's skeleton tracking behavior, like skeleton smoothing, skeleton offset, and transformation scale, which sometimes have to be adjusted depending on the position of the sensor and the environmental noise, and a toggle for the Depthmap Viewer (having the Depthmap Viewer activated all the time reduces the game's performance).

Board contains 16 tile objects. Each tile contains a text texture containing a number, a box collider, and a ButtonManager script. The BoardManager script handles the update of the board and checks the player's answers. BoardManager starts by setting a question for the player, choosing two random numbers between 1 and 9, and then updates a random number of tiles with values around the right answer, making sure at least one contains the right answer. When the player steps on a tile, the tile starts being pressed, and when it reaches its fully pressed position (the duration of the press is adjustable), it calls the BoardManager to check the tile's value against the correct answer.

The function of BoardManager that checks the player's answer handles the basic rules of the game play. In the simplest case, where the player's answer is wrong, the script calls the RobotLauncher object to launch another robot attack on the player. If the answer is correct, the player is awarded points depending on the response time (the faster the response, the more points). The script keeps track of consecutive correct answers, based on which bonus rules can be applied, for example to refill the player's health level after five consecutive correct answers, or to award the player a bonus life. Similarly, the function includes a sensor-fail fallback game mechanism: if the game detects that the EEG sensor, on which the player's superpower and health levels depend, has a bad quality signal, the game raises the superpower level on every correct answer, in order to keep the play running and not leave the player totally vulnerable.
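These rules can be summarized in a sketch like the following; the point formula, the five-answer bonus, and the size of the fallback recharge are assumptions for illustration, not the prototype's actual values.

using System;

// Answer checking with scoring, a consecutive-answer bonus, and the EEG fallback rule.
public class AnswerChecker
{
    public int Score;
    public int ConsecutiveCorrect;

    public Action OnRobotAttack;            // e.g. triggers RobotLauncher / RobotSpawn
    public Action OnBonusLife;
    public Action<float> OnSuperPowerBonus; // fallback recharge when the EEG signal is bad

    public void Check(int answer, int correct, float secondsLeft, float interval, bool eegSignalOk)
    {
        if (answer != correct)
        {
            ConsecutiveCorrect = 0;
            if (OnRobotAttack != null) OnRobotAttack();
            return;
        }

        // The faster the correct response, the more points.
        Score += (int)Math.Ceiling(100f * secondsLeft / interval);
        ConsecutiveCorrect++;

        // Example bonus rule: a bonus life after five consecutive correct answers.
        if (ConsecutiveCorrect % 5 == 0 && OnBonusLife != null) OnBonusLife();

        // Sensor-fail fallback: recharge the superpower on every correct answer.
        if (!eegSignalOk && OnSuperPowerBonus != null) OnSuperPowerBonus(20f);
    }
}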

RobotLauncher contains a path from every teleport chamber in the level to the level's board, and the RobotSpawn script that initializes the robot attacks. When the player chooses a wrong answer, RobotSpawn is called, initializing a Robot object (a prefab in Unity's terminology) inside one of the chambers. After being initialized, the robot follows the corresponding path to the board and, when it reaches its end, follows the player to his current position and starts hitting him (see the RobotBehaviour script in the project's code).
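A simplified spawn-and-chase sketch of this behaviour; it skips the path-following stage from the chamber to the board, and the names and speed are assumptions.

using UnityEngine;

// Spawns a robot in a random chamber and moves it towards the player's current position.
public class RobotSpawnSketch : MonoBehaviour
{
    public GameObject robotPrefab;
    public Transform[] chambers;   // the six teleport chambers
    public Transform player;
    public float speed = 2f;

    GameObject robot;

    // Called on a wrong answer (the prototype's RobotSpawn plays this role).
    public void Spawn()
    {
        Transform chamber = chambers[Random.Range(0, chambers.Length)];
        robot = (GameObject)Instantiate(robotPrefab, chamber.position, chamber.rotation);
    }

    void Update()
    {
        if (robot == null) return;
        // Head for the player's current position; the real robot attacks on arrival.
        robot.transform.position = Vector3.MoveTowards(
            robot.transform.position, player.position, speed * Time.deltaTime);
    }
}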


Chapter  8:    Evaluation  results  and  conclusions    

8.1  Prototype  Game  Evaluation  

This chapter discusses results and conclusions from the testing and evaluation of the final prototype presented in the previous chapter, as well as ideas for further development and study. The prototype was tested in a series of private sessions and an open evaluation session that took place at the Theatrum Anatomicum of the Waag Society, with participants from the people working for the institute. Overall 11 people tested the prototype, of whom 4 played the game using only the Kinect sensor, 5 using the Kinect and Mindwave sensors, and three using all 3 sensors. The evaluation data were collected either through a questionnaire (see Appendix I) or through short interviews and discussion with the participants after they played the game.

In all sessions, both applications of the prototype were running on a single computer, with a dual screen (monitor + projector) setup [Figure 13]. Overall the system proved to be robust, and no major flaws were found that would cause the system to crash, besides some cases where there was an error in the Bluetooth connection between the operating system and the sensors. The advantage of having a separate application handling the communication with the two body sensors is that, even in those cases of connection error, the game was not interrupted, and only the sensors transmitter application had to be restarted in order to re-establish the Bluetooth connection.

During game play, only one bug was found, appearing randomly and not yet fixed, believed to be caused by a race condition between the animation of the board tiles (the push down/release animation) and the updating function of the board's numbers, triggering a board update sooner than the defined time interval, as soon as the player stepped on a tile. In some cases this bug was frustrating for the player, because he would step on a correct number and, before the tile was fully pushed, it would change to another number and question, probably making that tile a wrong answer to the new question.

   Figure  13:  Game  test  session  at  Theatrum  Anatomicum  -­  Waag  Society


Microsoft Kinect

Below we discuss the test results for each sensor in the game, starting with the Microsoft Kinect motion capture sensor. The Kinect sensor worked very reliably during the tests, and there were no cases where the sensor stopped tracking the player, even when she would step out of the field of view of the sensor for a moment and then step back in. A negative point of the game's GUI is that when a user goes out of the field of view of the sensor, the game does not provide any indication to the player to move back into view, apart from the fact that the avatar stops responding to the user's movement. Another negative point is that the avatar movement is different from that found in commercial Xbox games. Although some smoothing techniques found in the software used (OpenNI) were applied, the lack of image noise reduction optimization, and the lack of physical human kinetic motor models applied to the avatar, make the avatar shake even if the player is completely steady, as a result of image noise, or make the avatar take a physically impossible pose, usually when some of the player's joints are out of the field of view of the sensor, when the player moves close to the edges of the board.

The second major problem regarding the avatar and the Kinect, found in earlier testing and partially solved in the final version of the game, has to do with the directivity of the superpower beam and the physical position (height and angle of view) of the sensor. In order to make this problem easier to understand, we first have to explain a little more about how the superpower works inside the game. The player's avatar model hierarchy includes an object named Target. Target is just a point in 3D space, placed approximately at the player's chest height and 50 units in front of the player. When the superpower is activated, the graphics engine renders the superpower particles starting from the hands of the player to the target position, and the physics engine casts a ray from the spine of the player to the target, looking for any robot enemy to hit in between. In order for the target to always be towards the current local forward vector of the player, the target follows the rotation of the spine joint as determined by the Kinect; in other words, the target is always 50 units in front of where the spine of the player points. This rotation is determined according to the position and view of the sensor. Especially for the joint rotation around the x-axis (pitch angle), this causes the problem that, for a given player pose, the angle of rotation varies a lot depending on the height at which the sensor is placed and its angle. As a result, if for example the Kinect is placed at a height lower than the player's spine, so that it looks at the player from below, the superpower beam ends up aiming higher than the desired level, towards the ceiling of the level. In the same way, for a given position of the sensor, the behavior varies depending on the height of the player. Although an attempt at an optimal solution of this geometric problem was made using the Kinect's internal accelerometer and floor determination functions, the results were not reliable enough, leading to the decision to finally overcome the problem by simply ignoring the spine's pitch angle for the target. As a result, even if the player leans forwards or backwards, the target of the superpower beam stays at a given height, at a fixed angle.

During the tests the players were not given a detailed explanation of how the superpower beam works. The only tip given was that it can be activated by raising both hands. Although most players easily became familiar with aiming the beam using their body, some of them were trying to aim only with their hands, without turning their body, which is natural, since in the game it appears that the beam starts from the hands of the player and not from his chest, as it actually does inside the physics engine. Another problem mentioned by participants is that when a robot approaches very close, it might end up behind the player or at an angle larger than 90 degrees on the y-axis, so the player cannot defend himself, because either the Kinect will not detect that the player has raised both his hands (having a profile view of the player), or, of course, if they turn their back, they will not be able to see the screen. The solution to that problem is usually for the player to take a step back and aim again (supposing that she is still within the board's limits), but again this is not the physical response one would have to an enemy at one's back in the physical world. Apart from that problem, the beam follows the player's body smoothly; aiming the beam takes some time to master, but after all it offers an opportunity for the player to master a skill inside the game, which is part of the gaming experience. The activation method of raising both hands appeared to work well for the game, being easy to understand both for the player and for the sensor, not so strict as to limit the player's natural movement, and in most cases it was recognized even at the minimum distance from the sensor, provided that the player is not so tall (< 2.0 m) that at the minimum distance his shoulders are already out of the field of view of the Kinect.

Finally, although the board area is quite limited, and every tile is one step away from the next one, some participants mentioned that the lack of reference points for the tiles on the floor can be confusing at the beginning. In order to overcome this problem, a more advanced and immersive setup can be used, in which another application runs the board mechanism, using a second projector to project the board, mapped onto the actual floor. Alternatively, the current prototype can be modified to a first-person camera perspective, to be played using a head-mounted display.
Although head-mounted displays theoretically provide the total immersion experience of virtual reality, these displays have proved to limit the player's freedom of movement, first because all currently available solutions require a wired connection to the computer, and secondly because they lack the performance to be responsive to the physical speed of movement, requiring the player to adapt his movement to the capabilities of the screen.
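Returning to the aiming workaround described earlier in this section, the final behaviour (target follows only the spine's yaw and keeps a fixed height, ignoring the pitch reported by the Kinect) can be expressed as a small Unity script like the following sketch; the fixed chest height and the script name are assumptions.

using UnityEngine;

// Places the aim target in front of the player using only the spine's yaw rotation.
public class AimTargetSketch : MonoBehaviour
{
    public Transform spine;      // spine joint driven by the Kinect skeleton
    public Transform target;     // aim point used by the beam's ray cast
    public float distance = 50f;
    public float chestHeight = 1.2f;

    void LateUpdate()
    {
        Quaternion yawOnly = Quaternion.Euler(0f, spine.eulerAngles.y, 0f);
        Vector3 forward = yawOnly * Vector3.forward;
        target.position = new Vector3(spine.position.x, chestHeight, spine.position.z)
                          + forward * distance;
    }
}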


Neurosky  Mindwave    

The Mindwave sensor also proved to work well as a piece of hardware for the game. It was rated as quite comfortable to wear; it was easy enough to get the perfect signal required by the sensor for the attention and meditation level determination to work, and it maintained it without problems throughout the game. In some cases the sensor would lose the signal after a jump, but it regained it shortly, without the player having to stop moving. The metaphor of connecting the attention level to the superpower level, and meditation to the health level, was rated very highly as a concept; the effectiveness of the sensor inside the game, though, did not receive very high ratings. The attention and meditation values usually stayed at an average, neutral level throughout the game, and in a lot of cases we had to cheat by adding superpower via the keyboard, in order not to spoil the fun. If a player dedicates more time to mastering the sensor, results will get better, but spontaneously, during the game, the calculations did not seem to raise the attention levels enough for the sensor to play its role in the game very effectively. It should also be noted, though, that all participants in the evaluation were adults, and basic multiplication tables are not so challenging for them. This leaves open the possibility that for young children, for whom the prototype is designed in the first place, these calculations might be a heavier mental load, thus more easily detected by the sensor, raising the effect of the sensor on game play. In any case, the use of the sensor certainly adds some excitement and curiosity about the game, since most people are not yet familiar with brain-computer interface devices, and they want to try them and learn how they work. All participants who played the game without the Mindwave sensor were very curious to try it with the additional sensor, and believed that this would certainly add more fun to the game.

Zephyr  HxM    

As mentioned in the previous chapter, the heart rate was not given an actual role in the game, because it was believed that the use of the heart sensor might be problematic during an evaluation session, basically because the player has to wear it under his shirt and it also has to be slightly moistened to increase its conductivity with the skin. Indeed, given the option, most participants chose not to use the heart sensor (5 out of 8). Nevertheless, the ones who used the sensor rated it as very comfortable to wear, and the sensor worked almost perfectly throughout the game. Since the session was made to test the game and the sensors, participants who used the heart sensor were monitoring their heart rate, and felt motivated to try and raise it, in order also to test the responsiveness of the sensor and their own condition. Again, the integration of the heart sensor into the game is a first step towards gathering and studying heart rate values and ranges, in order to gain knowledge based on which other interactions and game dynamics can be developed. Additionally, new trends in the use of body sensors in daily life push sensor technology forward, and we already have examples of sensors embedded in ordinary accessories, for example a wrist watch with a heartbeat sensor. Devices like these will make the integration of biofeedback sensors into games easier and more practical.


8.2  General  conclusions  and  further  development  

Overall, the prototype presented was rated as a fun gaming experience. Especially participants who had never played a Kinect game before were very excited by their first experience with this technology. The prototype itself, and the general concept of motion-based board games, was believed to have high potential for educational games. As mentioned in the previous chapter, the game's graphics were taken from freely available resources of the Unity game development community. This leaves a lot of room for changes and improvements to make the game environment more suitable and pleasant for younger children. The concept of virtual worlds in a board game can be expanded by introducing additional motion-based interactions. For example, the board could additionally be designed as a floating platform, a kind of flying carpet, which the player navigates through space depending on his position on it, called to follow a track and encounter enemies on his way to finishing the level. In a more permanent setup, taking advantage of the general architecture proposed in chapter 6, the feeling of immersion inside the game's world could be enhanced by introducing additional interactions with vision and sound; for example, in the prototype presented, the hall could be lit with intense red light whenever the player gives a wrong answer and a robot attack is launched, or with intense blue light whenever the player's lightning superpower is activated. In a similar way, the game could change the color of the lights when necessary, to help the player relax for a while, in order to recharge his health level and reduce his heart rate.

An element that would certainly raise the social affordances of the game would be the ability to play a multiplayer game. Although it is technically possible to use a single Kinect sensor to track two players, in practice the active tracking area leaves little space to have two players on a board without crashing into each other. Using multiple Kinect sensors, though, or alternative technologies such as the Panasonic D-Imager or others presented in chapter 3, which extend the tracking area, allows the design of scenarios in which the players either compete with each other on the same task, or even more sophisticated game play in which two or more players are called to synchronize their moves in order to achieve a common goal, also creating a more interesting and fun scene for the people watching them play.

The prototype developed uses one motion capture sensor and two body sensors, one of which did not have a specific role in the game play. After a short presentation about the sensors and the game, and before playing it, some people did not understand exactly how the Mindwave sensor values are used inside the game.
Apart from putting the writer's presentation skills into question, this fact could also mean that, if the game used more sensors with more complex game dynamics, and if it were presented to young children, the game interactions could be hard to understand. On the other hand, all computer games require the player to invest some time playing in order to discover all of the game's mechanisms.


As wearable sensor technology develops, making the sensors more practical to use, the greatest challenge for a multi-sensor game designer is to create a meaningful ambient interaction layer, through which the player will discover and experience the game's mechanisms while playing, rather than requiring an explanation in advance. Models derived from previous research on emotion recognition systems, like those presented in chapter 4, combined with the study of sensor data collected during game play, can assist in reaching this level of meaningful interaction.

The aim of this interaction is to create a playful experience that will assist the player in developing his self-awareness, as well as awareness of others, through the opportunities for social interaction that the game setting creates. This awareness is a step towards creating stable foundations that will support children in communicating and maximizing their learning abilities. At the same time, games like this could be a valuable tool for educators to assess children, offering a better insight into each one of them, helping them to understand and highlight children's difficulties, and assisting them in giving more personalized guidance.

8.3  Summary  of  research  results    

This section revisits the research questions of the thesis, as defined in section 1.2, offering a summary of the research results that attempt to answer those questions and analyze the research problem.

The first research question was whether physical interaction can be combined with a virtual environment to enhance a playful gaming experience. The answer to this question is a definite yes. Chapter 2 analyzes, by reviewing literature and results of previous research, how physical interaction through the use of motion capture controllers and body sensors enhances values of the gaming experience such as immersion, transformation, and agency, as defined by Murray [1997]. Focusing primarily on educational games, the research additionally examines the potential effect of the use of sensors on the creation of a playful learning experience, in terms of the involvement of human body and motion in the process of learning and recall of knowledge (embodied learning), and of assisting the development of basic social emotional competencies, by significantly enhancing the social affordances of games. The prototype presented in chapter 7 acts as a positive confirmation of the research question, providing a basic example of a multi-sensor based physical interaction game, which, as the evaluation results showed, delivers a novel playful gaming experience.

 

The second research question was what sensor technologies are most applicable for enhancing a playful gaming experience inside the Embodied Playful Learning Theater installation. The thesis attempted to answer this question by presenting a range of available technologies of motion capture systems [Chapter 3] and body/bio-feedback sensor systems [Chapter 4], studying their technical characteristics and focusing on the practical use of those systems in a gaming installation. This research question is hard to answer definitively because, theoretically, all the presented technologies are applicable and, depending on the designed interactions, able to enhance the gaming experience. The thesis provides more in-depth details regarding the technologies that the limited research resources allowed to be tested, more specifically the Microsoft Kinect motion capture sensor, the SHORE facial expression analysis library by the Fraunhofer institute, three different models of brain-computer interfaces (Emotiv, Neurosky Mindset and Mindwave), the Zephyr HxM heart sensor, and a Tobii desktop eye tracking system.

Regarding motion capture systems, there was a preference towards marker-less systems, because the use of markers in combination with body sensors would create a rather complex setup that has to be worn by the player and calibrated before he can start playing. Regarding biofeedback mechanisms, after studying their characteristics, the research attempts a more systematic approach towards designing sensor-based interactions for games, by classifying sensor data into two categories: i) signals that are more suitable to be continuously monitored during a game, directly related to game mechanics, and ii) signals that are more appropriate to be sampled on an event basis, connected to certain points of the game flow. This classification helps the interaction designer to choose which technologies are more applicable based on the game mechanics he wants to achieve.

Besides the two research questions, the thesis examined the design of a multi-sensor interactive system from a technical system architecture perspective, as defined in the research problem statement (see 1.2). The main characteristics of this system are extensibility, to be able to support the use of a number of different sensors, and openness, to be able to support the development and deployment of different applications on it. After considering a number of sensor hardware platforms and interactive software development frameworks [Chapter 5], as well as previous sensor-based interactive systems, the thesis proposes a basic, potentially reference, system architecture [Chapter 6]. The main feature of this architecture is the separation of the main interactive application from a common device level that is responsible for connecting to and collecting data from the various sensors. According to the design, the two levels communicate through Open Sound Control messages, an open standard that was found to be supported by a very wide range of the reviewed interaction development frameworks and applications. This concept design was then implemented during the development of the final game prototype, which, apart from the game application, features a separate application to connect to the two body sensors used and transmit their data to the game through OSC. This application was written in modular, cross-platform C++, can be used to serve other applications compatible with the OSC standard, and can also easily be extended to support the use of more sensors.

 

 


Appendix  I  

   

NumHop  Game  Evaluation  Questionnaire    

1.  Have  you  ever  played  a  game  using  the  Kinect  sensor  before?  

         Yes  /  No  

 

2.  How  would  you  rate  the  overall  gaming  experience?  

Very boring
Very exciting

❏   ❏   ❏   ❏   ❏

3. Which game element did you like the most, and why?

 

 

4. Which game element did you not like or find frustrating, and why?

 

 

5.  How  would  you  rate  the  interaction  with  the  Kinect  sensor  (player  movement)?  

Unresponsive - Frustrating

Very responsive - physical movement

❏   ❏   ❏   ❏   ❏      

6.  How  comfortable  was  the  Zephyr  heart  sensor  to  wear?  

Very Annoying

Very Comfortable

❏   ❏   ❏   ❏   ❏  


7.  How  comfortable  was  the  Mindwave  EEG  sensor  to  wear?  

Very Annoying
Very Comfortable

❏   ❏   ❏   ❏   ❏

8. How would you rate the effectiveness of the Mindwave sensor in the game play?

Not effective at all

Very effective and fun

❏   ❏   ❏   ❏   ❏    

 

 

9.  Do  you  have  any  ideas  on  how  the  heart  sensor  could  be  used  effectively  in  the  game  play?  

 

 

10. How would you rate the speed of the game play (board update interval, robot attack speed, etc.)?

Very slow
Very fast

❏   ❏   ❏   ❏   ❏

11. Do you believe that a game like NumHop has potential in an education / school environment?

Not really
Certainly

❏   ❏   ❏   ❏   ❏

12. Other Comments/Suggestions:

 

 

Thank  you  for  your  participation!  

 

 


Bibliography    

Joshua  Noble  (2009).  Programming  Interactivity.  Sebastopol  (U.S.A.):  O’Reilly  Media  

Dan   O’Sullivan   and   Tom   Igoe   (2004).   Physical  Computing.   Boston   (U.S.A.):   Thomson   Course  Technology  

Steve   Dixon   (2007).   Digital   Performance.    Cambridge  (U.S.A.):  The  MIT  Press  

Janet  H.  Murray  (1998).  Hamlet  on  the  Holodeck:  The   future   of   Narrative   in   Cyberspace.  Cambridge  (U.S.A.):  The  MIT  Press  

References    

[1]:  Barron,  B.,  Cayton-­‐Hodges,  G.,  Bofferding,  L.,  Copple,   C.,   Darling-­‐Hammond,   L.,   &   Levine,   M.  (2011).   Take   a   Giant   Step:   A   Blueprint   for  Teaching   Children   in   a   Digital   Age.   New   York:  The   Joan   Ganz   Cooney   Center   at   Sesame  Workshop.  

[2]:  Piaget,  J.  (1973).  To  understand  is  to  invent:  The   future   of   education.   Grossman   Publishers,  New  York.  

[3]:   Malone,   T.W.   and   Lepper,   M.R.   (1987).  Making  learning  fun:  A  taxonomy  of  motivations  for   learning.   In   Snow,   E.   and   Farr,   M.   eds.  Aptitude,  learning,  and  instruction:  Cognitive  and  affective   process   analyses,   Lawrence   Erlbaum,  Hillsdale,  N.J.  

[4]: Melissa Gresalfi, Sasha Barab, Sinem Siyahhan, and Tyler Christensen. Virtual worlds, conceptual understanding, and me: designing for consequential engagement. On The Horizon - The Strategic Planning Resource for Education Professionals, 17(1):21-34.

 

 

 

[5]:  Gee,  J.  P.  (2003).  What  Video  Games  Have  to  Teach   Us   About   Learning   and   Literacy.   New  York:  Palgrave  Macmillan.

[6]:   Barab,   S.A.,   Sadler,   T.,   Heiselt,   C.,   Hickey,   D.  and   Zuiker,   S.   (2007),   ‘‘Relating   narrative,  inquiry,  and  inscriptions:  a  framework  for  socio-­‐scientific   inquiry’’,   Journal   of   Science   Education  and  Technology,  Vol.  16  No.  1,  pp.  59-­‐82.  

[7]:  Balasubramanian,  N.,  &  Wilson,  B.G.   (2006).  Games   and   simulations.   In   C.   Crawford   et   al.,  (Eds.),   ForeSITE,   Vol.   2005,   Proceedings   of  Society  for  Information  Technology  and  Teacher  Education   International   Conference   2006.  Chesapeake  

[8]:   Hake,   R.,   “Interactive-­‐Engagement   vs.  Traditional   Methods:   A   Six-­‐Thousand-­‐Student  Survey   of  Mechanics   Test   Data   for   Introductory  Physics   Courses,”   American   Journal   of   Physics,  Vol.  66,  No.  1,  1998,  p.  64.  

[9]:  Kress,  J.  S.,  &  Elias,  M.  J.  (2006).  School  based  social  and  emotional   learning  programs.   In  K.  A.  Renninger  &  I.  E.  Sigel  (Eds.),  Handbook  of  child  psychology:   Vol.4.   Child   psychology   in   practice  (6th   ed.,   pp.   592-­‐618).  Hoboken,  NJ:   John  Wiley  and  Sons.  

[10]:   Raver   CC.   Emotions   matter:   Making   the  case   for   the   role   of   young   children’s   emotional  development   for   early   school   readiness.   SRCD  Social  Policy  Report  2002;  XVI(3):  3-­‐18.    

[11]:  Goleman,  D.  (1995).  Emotional  intelligence.  New  York:  Bantam  Books.  

[12]:   Barnett   LA,   Storm   B.   Play,   pleasure,   and  pain:   The   reduction   of   anxiety   through   play.  Leisure  Sciences  1981;4(2):161-­‐175.  

[13]:   Calvert,   S.   L.   (2005).   Cognitive   effects   of  video  games.  In:  J.  Raessens  &  J.  Goldstein  (eds.),  Handbook  of  Computer  Game  Studies.  Cambridge,  MA:  MIT  Press,  pp.  125-­‐131.  

[14]:   Gunter,   B.   (2005).   Psychological   effects   of  video  games.  In:  J.  Raessens  &  J.  Goldstein  (eds.),  Handbook  of  Computer  Game  Studies.  Cambridge,  MA:  MIT  Press,  pp.  145-­‐160.  

[15]:  De  Kort,  Y.A.W.,  and  Ijsselsteijn,  W.  A.  2008.  People,   places,   and   play:   a   research   framework  for   digital   game   experience   in   a   socio-­‐spatial  context.  ACM  Comput.  Entertain,  6,   2,  Article  18  (July  2008),  11  pages.  

[16]: Lazzaro, N. (2007). Why We Play: Affect and the Fun of Games: Designing Emotions for Games, Entertainment Interfaces and Interactive Tools. In: The Human-Computer Interaction Handbook: Fundamentals, Evolving Techniques, and Emerging Applications, edited by A. Sears and J.A. Jacko. Lawrence Erlbaum Associates, Mahwah, New Jersey, 2nd Edition, pages 679-700.

[17]:   Jansz,   J.,  &  Martens,  L.   (2005).  Gaming  at  a  LAN   event:   the   social   context   of   playing   video  games.  New  Media  &  Society,  7  (3),  333-­‐355.  

[18]: Bryce, J., & Rutter, J. (2003). The Gendering of Computer Gaming: Experience and Space. In S. Fleming & I. Jones, Leisure Cultures: Investigations in Sport, Media and Technology, Leisure Studies Association, pp. 3-22.

[19]:  Carr,  D.,  Schott,  G.,  Burn,  A.,  &  Buckingham,  D.   (2004).   Doing   game   studies:   A  multi-­‐method  approach  to  the  study  of  textuality,   interactivity,  and   narrative   space.   Media   International  Australia   incorporating   Culture   and   Policy,   No.  110,  19-­‐30.  

[20]:  Baumeister,  R.  F.  &  Leary,  M.  R.  (1995).  The  Need   to   Belong:   Desire   for   Interpersonal  Attachments   as   a   Fundamental   Human  Motivation.   Psychological   Bulletin,   117   (3),   497-­‐529.  

[21]:   Ramanathan,   S.,   &   McGill,   A.   (2008).  Consuming   with   others:   Social   influences   on  moment-­‐to-­‐moment   and   retrospective  evaluations   of   an   experience.   Journal   of  Consumer  Research,  34.  

[22]:  Raghunathan,  R.,  &  Corfman,  K.   (2006),   “Is  Happiness   Shared   Doubled   and   Sadness   Shared  Halved?   Social   Influence   on   Enjoyment   of  Hedonic   Experiences,”   Journal   of   Marketing  Research,  43  (August),  386–94.  

[23]:   Jakobs,   E.,   Fischer,   A.,   &   Manstead,   A.  (1997).   Emotional   experience   as   a   function   of  social   context:   The   role   of   the   other.   Journal   of  Nonverbal  Behavior,  21  (2),  103-­‐130.  

[24]:   Siân   E.   Lindley,   James   Le   Couteur,   and  Nadia  L.  Berthouze.  2008.  Stirring  up  experience  through   movement   in   game   play:   effects   on  engagement  and  social  behaviour.  In  Proceedings  of  the  twenty-­‐sixth  annual  SIGCHI  conference  on  Human   factors   in   computing   systems  (CHI   '08).  ACM,  New  York,  NY,  USA,  511-­‐514.  

[25]:   Bianchi-­‐Berthouze,   N  and  Whan,  WK  and  Patel,   D  (2007)   Does   body   movement  engage  you  more  in  digital  game  play?  And  why?  In:  Lecture  Notes  in  Computer  Science  (including  subseries   Lecture  Notes   in  Artificial   Intelligence  and   Lecture   Notes   in   Bioinformatics).  pp.   102   -­‐  113

[26]:  Malone,  T.W.:  What  makes  computer  games  fun?  Byte  6,  pp.  258–277  (1981)  

[27]:  Lazzaro,  N.:  Why  we  play  games:  Four  keys  to  more  emotion  without  story.  Technical  report,  XEO  Design  Inc  (2004)    

[28]: Shusterman, R. (1992). Pragmatist Aesthetics: Living Beauty, Rethinking Art. Blackwell.

[29]: Marianne Graves Petersen, Ole Sejer Iversen, Peter Gall Krogh, and Martin Ludvigsen. 2004. Aesthetic interaction: a pragmatist's aesthetics of interactive systems. In Proceedings of the 5th conference on Designing interactive systems: processes, practices, methods, and techniques (DIS '04). ACM, New York, NY, USA, 269-276.

[30]:  Dourish,  P.  (2001).  Where  the  action  is:  The  foundations  of  embodied  interaction.  Cambridge,  MA:  MIT  Press.  

[31]:  Wilson,   M.   (2002).   Six   views   of   embodied  cognition.   Psychonomic   Bulletin   &   Review,   9,  625-­‐636.  

[32]:  Barsalou,  L.  W.  (2008).  Grounded  Cognition.  Annual  Review  of  Psychology,  59,  617-­‐645.  

[33]:   Glenberg,   A.   M.,   &   Kaschak,   M.   P.   (2002).  Grounding   language   in   action.   Psychonomic  Bulletin  and  Review,  9,  558-­‐565.  

[34]: Rizzolatti, G., & Craighero, L. (2004). The mirror neuron system. Annual Review of Neuroscience, 27, 169-192.

[35]: Hostetter, A. B., & Alibali, M. W. (2008). Visible embodiment: Gestures as simulated action. Psychonomic Bulletin & Review, 15, 495-514.

[36]: Johnson-Glenberg, M. C., Birchfield, D., Savvides, P. & Megowan-Romanowicz, C. (2010). Semi-virtual Embodied Learning - Real World STEM Assessment. In L. Annetta & S. Bronack (eds.) Serious Educational Game Assessment: Practical Methods and Models for Educational Games, Simulations and Virtual Worlds. pp. 225-241. Sense Publications, Rotterdam.

[37]: Ekman, P. (1972). Emotion in the Human Face. Pergamon Press Inc., New York, USA.

[38]: Christine Lætitia Lisetti and Fatma Nasoz. 2004. Using noninvasive wearable computers to recognize human emotions from physiological signals. EURASIP J. Appl. Signal Process. 2004 (January 2004), 1672-1687.

[39]:   Picard   R.   W.   1995   “Affective   Computing”,  MIT  Technical  Report  #321,  Cambridge,  MA,  USA  

[40]: Boehner  K.,  DePaula  R.,  Dourish  P.,  Sengers  P.   2005   ”Affect:   From   Information   to  Interaction”,   Proceedings   of   the   4th   decennial  conference  on  Critical  computing:  between  sense  and  sensibility,  Aarhus,  Denmark,  ACM  Press

[41]: Florian 'Floyd' Mueller, Darren Edge, Frank Vetere, Martin R. Gibbs, Stefan Agamanolis, Bert Bongers, and Jennifer G. Sheridan. 2011. Designing sports: a framework for exertion games. In Proceedings of the 2011 annual conference on Human factors in computing systems (CHI '11). ACM, New York, NY, USA, 2651-2660.

[42]: Crews,   D.   J.   &   Landers,   D.   M.   (1993).  Electroencephalographic  measures  of  attentional  patterns   prior   to   the   golf   putt.   Medicine   &  Science  in  Sports  &  Exercise.  25(1),  116-­‐126.

[43]: Pope A., Stephens C. (2011). "Movemental": Integrating Movement and the Mental Game. Workshop paper from the CHI 2011 Workshop "Brain and Body Interfaces: Designing for Meaningful Interaction". Available at: http://physiologicalcomputing.net/bbichi2011/Movemental Integrating Movement and the Mental Game.pdf

[44]: Ramesh Raskar, Hideaki Nii, Bert deDecker, Yuki Hashimoto, Jay Summet, Dylan Moore, Yong Zhao, Jonathan Westhues, Paul Dietz, John Barnwell, Shree Nayar, Masahiko Inami, Philippe Bekaert, Michael Noland, Vlad Branzoi, and Erich Bruns. 2007. Prakash: lighting aware motion capture using photosensing markers and multiplexed illuminators. In ACM SIGGRAPH 2007 papers (SIGGRAPH '07). ACM, New York, NY, USA, Article 36.

[45]: Takaaki Shiratori, Hyun Soo Park, Leonid Sigal, Yaser Sheikh, Jessica K. Hodgins. "Motion Capture from Body-Mounted Cameras". ACM Transactions on Graphics, Vol. 30, No. 4 (Proc. ACM SIGGRAPH 2011), July 2011.

[46]: A. Laurentini (February 1994). "The visual hull concept for silhouette-based image understanding". IEEE Trans. Pattern Analysis and Machine Intelligence, pp. 150-162.

[47]: Corazza S., Mündermann L., Andriacchi T. A Framework For The Functional Identification Of Joint Centers Using Markerless Motion Capture, Validation For The Hip Joint. Journal of Biomechanics, 2007.

[48]: L. Xinghan, B. Berendsen, R.T. Tan, R.C. Veltkamp, Dept. of Inf. & Comput. Sci., Utrecht Univ., Utrecht, Netherlands. Human Pose Estimation for Multiple Persons Based on Volume Reconstruction. In: Proc. 2010 20th ICPR. IEEE, 2010, pp. 3591-3594.

[49]: Rosenhahn, B., Brox, T., Kersting, U. G., Smith, A. W., Gurney, J. K., & Klette, R. (2006). A system for marker-less motion capture. Main, 1(1), 45-51. Citeseer.

[50]: http://www.eyewriter.org

[51]: J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, A. Blake. Real-Time Human Pose Recognition in Parts from a Single Depth Image. Microsoft Research Cambridge, 2011.

[52]: R. W. Picard. Toward Machines with Emotional Intelligence. In: IEEE Transactions on Pattern Analysis and Machine Intelligence - Graph Algorithms and Computer Vision Journal, Vol. 23, 10, IEEE Computer Society, 2001, pp. 1175-1191.

[53]: O.A. Schipor, Ş.G. Pentiuc, M.D. Schipor. Towards a multimodal emotion recognition framework to be integrated in a computer based speech therapy system. In: The 6th International Conference on Speech Technology and Human-Computer Dialogue, 2011.

[54]: R.W. Picard. Future affective technology for autism and emotion communication. Phil. Trans. R. Soc. B, December 12, 2009.

[55]: IRIS project: Integrating Research in Interactive Storytelling. http://iris.scm.tees.ac.uk/

[56]: Lennart E. Nacke. Directions in Physiological Game Evaluation and Interaction. In CHI 2011 BBI Workshop Proceedings, Vancouver, BC, Canada. 2011.

[57]:   Ekman,   P,   &   Friesen,   W.   V.   (1978).   The  facial   action   coding   system:   A   technique   for   the  measurement   of   facial   movement.   Palo   Alto:  Consulting  Psychologists  Press.  

[58]:   G.   Castellano,   L.   Kessous,   G.   Caridakis.  Emotion Recognition through Multiple Modalities: Face, Body Gesture, Speech. In : Affect and Emotion in Human-Computer Interaction, Springer Berlin / Heidelberg, 2008. Pp 92-103

[59]: A.   Batliner,   D.   Seppi,   S.   Steidl,   B.   Schuller.  Segmenting   into   Adequate   Units   for   Automatic  Recognition   of   Emotion-­‐Related   Episodes:   A  Speech-­‐Based  Approach.  In  :Advances  in  Human-­‐Computer  Interaction  Volume  2010  (2010)  

[60]:   Anton   Batliner,   Stefan   Steidl,   Dino   Seppi,  and   Bjorn   Schuller.   2010.   Segmenting   into  adequate   units   for   automatic   recognition   of  emotion-­‐related   episodes:   a   speech-­‐based  approach.  Adv.  in  Hum.-­Comp.  Int.  2010,  Article  3  (January  2010),  15  pages.  

[61]:  T.  Vogt,  E.  André  and  N.  Bee,  "EmoVoice  -­‐  A  framework   for   online   recognition   of   emotions  from   voice,"   in  Proceedings   of   Workshop   on  Perception   and   Interactive   Technologies   for  Speech-­‐Based  Systems,  2008.  

[62]:   F.  Eyben,   M.  Wöllmer,   and   B.  Schuller.  openEAR   -­‐   Introducing   the  Munich  Open-­‐Source  Emotion  and  Affect  Recognition  Toolkit.   In:Proc.  4th   International   HUMAINE   Association  Conference   on   Affective   Computing   and  Intelligent   Interaction   2009   (ACII   2009),  Amsterdam,  The  Netherlands,  volume  I,  pp.  576–581.  IEEE,  2009.  10.-­‐12.09.2009.  


[63]:   A.   Mehrabian.   Communication   without  words.  Psychology  Today,  2(4):53–56,  1968.  

[64]: Christian   Kublbeck   and   Andreas   Ernst.  2006.   Face   detection   and   tracking   in   video  sequences   using   the   modified   census  transformation.  Image  Vision  Comput.  24,  6  (June  2006),  564-­‐572.

[65]: Salah,   A.A.,   N.   Sebe,   Th.   Gevers,  Communication   and   automatic   interpretation   of  affect   from   facial   expressions,   in  D.  Gökçay  &  G.  Yıldırım   (eds.),   Affective   Computing   and  Interaction:   Psychological,   Cognitive   and  Neuroscientific  Perspectives,  to  appear.

[66]: Rana el Kaliouby and Peter Robinson. Real-Time Inference of Complex Mental States from Facial Expressions and Head Gestures. In: IEEE International Workshop on Real Time Computer Vision for Human Computer Interaction at CVPR, 2004.

[67]:  Intelligent  Behaviour  Understanding  Group  (iBUG),  Department  of  Computing,  Imperial  College  London  http://ibug.doc.ic.ac.uk/resources/facial-­‐tracker-­‐2011/  

[68]: Seeing Machines. FaceAPI. http://www.seeingmachines.com/product/faceapi/

[69]: Open Computer Vision Library (OpenCV). http://opencv.org

[70]:  Coulson,  M.  (2004)  'Attributing  Emotion  To  Static   Body   Postures:   Recognition   Accuracy,  Confusions,  And  Viewpoint  Dependence.'  Journal  of  Nonverbal  Behavior  28  (2)  117-­‐139

[71]: Kleinsmith   A.,   and   Bianchi-­‐Berthouze   N.,  Recognizing   affective   dimensions   from   body  posture,   In:   Proc.   2nd   Intl   Conf   of   ACII,   LNCS  4738,  Portugal,  pp.  48-­‐58,  2007

[72]:     A.  Metallinou   ,   A.   Katsamanis,  Wang   Yun,  S.Narayanan.   Tracking   changes   in   continuous  emotion  states  using  body  language  and  prosodic  cues.   In:   IEEE   International   Conference   on  Acoustics,   Speech   and   Signal   Processing  (ICASSP),  IEEE.  Prague.  2011.  pp  2288-­‐2291  

[73]:     Riskind,   J.H.,   and   Gotay,   C.C.:   Physical  posture:   Could   it   have   regulatory   or   feedback  effects   on   motivation   and   emotion?   Motivation  and  Emotion  6(3)  (1982).pp  273–298  

[74]: N. Bianchi-Berthouze, P. Cairns, A. Cox, C. Jennett, W. Kim. On posture as a modality for expressing and recognizing emotions. Emotion and HCI workshop at BCS HCI London, September 2006.

[75]: A. Camurri, B. Mazzarino, G. Volpe. Analysis of Expressive Gesture: The EyesWeb Expressive Gesture Processing Library. In: Gesture-Based Communication in Human-Computer Interaction, Lecture Notes in Computer Science, 2004, Volume 2915/2004, 469-470.

[76]:   Timo   Partala   and   Veikko   Surakka.   2003.  Pupil   size   variation   as   an   indication   of   affective  processing.  Int.   J.   Hum.-­‐Comput.   Stud.  59,   1-­‐2  (July  2003),  185-­‐198.  

[77]:   Eija   Haapalainen,   SeungJun   Kim,   Jodi   F.  Forlizzi,   and   Anind   K.   Dey.   2010.   Psycho-­‐physiological   measures   for   assessing   cognitive  load.  In  Proceedings  of  the  12th  ACM  international  conference   on   Ubiquitous   computing  (Ubicomp  '10).  ACM,  New  York,  NY,  USA,  301-­‐310.    

[78]:  OpenEEG.  http://openeeg.sourceforge.net/doc/  

[79]:   Erin   Treacy   Solovey,   Audrey   Girouard,  Krysta   Chauncey,   Leanne   M.   Hirshfield,   Angelo  Sassaroli,  Feng  Zheng,  Sergio  Fantini,  and  Robert  J.K.   Jacob.   2009.   Using   fNIRS   brain   sensing   in  realistic   HCI   settings:   experiments   and  guidelines.   In  Proceedings   of   the   22nd   annual  ACM   symposium   on   User   interface   software   and  technology  (UIST   '09).  ACM,  New  York,  NY,  USA,  157-­‐166.    

[80]:   O.   A.   Schipor,   S.   G.   Pentiuc,   M.   D.   Schipor.  Towards   a   multimodal   emotion   recognition  framework  to  be  integrated  in  a  Computer  Based  Speech   Therapy   System.   In:   6th   Conference   on  Speech   Technology   and   Human-­‐Computer  Dialogue   (SpeD),   IEEE.Brasov.Romania.2011.   pp  1-­‐6.  

[81]: Bertoncini,  M.  and  Cavazza,  M.,  2007.  Emotional  Multimodal  Interfaces  for  Digital  Media:  The  CALLAS  Challenge.  Proceedings  of  HCI  International  2007.

[82]:  Marc  Schröder.  The  SEMAINE  API:  Towards  a   Standards-­‐Based   Framework   for   Building  Emotion-­‐Oriented   Systems.   In:   Advances   in  Human-­‐Computer   Interaction.   Volume   2010  (2010),  Article  ID  319406,  21  pages  

[83]: J. Wagner, F. Lingenfelser, and E. Andre. The Social Signal Interpretation Framework (SSI) for Real Time Signal Processing and Recognition. In: Proceedings of INTERSPEECH 2011, Florence, Italy, 2011.

[84]: F.   Lavagetto   and   R.   Pockaj,   "The   Facial  Animation  Engine:  towards  a  high-­‐level  interface  for   the   design   of   MPEG-­‐4   compliant   animated  faces",   IEEE   Trans.   on   Circuits   and   Systems   for  Video   Technology,   Vol.   9,   n.2,   March   1999,  pp.277-­‐289

[85]:  MPEG-­‐V  (Information  Exchange  with  Virtual  Worlds)  http://mpeg.chiariglione.org/working_documents.htm#MPEG-­‐V  ,  http://www.metaverse1.org/  


[86]: EMMA: Extensible MultiModal Annotation markup language. W3C Recommendation, 10 February 2009. http://www.w3.org/TR/emma/

[87]: Emotion Markup Language (EmotionML) 1.0. W3C Working Draft, 7 April 2011. http://www.w3.org/TR/emotionml/

[88]: Marco  Grassi.  2009.  Developing  HEO  human  emotions   ontology.   In  Proceedings   of   the   2009  joint   COST   2101   and   2102   international  conference   on   Biometric   ID   management   and  multimodal      communication    (BioID_MultiComm  '09),   Julian   Fierrez,   Javier   Ortega-­‐Garcia,   Anna  Esposito,  Andrzej  Drygajlo,  and  Marcos  Faundez-­‐Zanuy  (Eds.).  Springer-­‐Verlag,  Berlin,  Heidelberg,  244-­‐251.

[89]: S.   Kopp,   B.   Krenn,   S.   Marsella,   et   al.,  “Towards   a   common   framework   for  multimodal  generation:   the   behavior   markup   language,”   in  Proceedings  of   the  6th   International  Conference  on  Intelligent  Virtual  Agents  (IVA  ’06),  vol.  4133  of   Lecture  Notes   in   Computer   Science,   pp.   205–217,  2006.

[90]:  HUMAINE.  http://emotion-­‐research.net/  

[91]: Katie   Crowley,   Aidan   Sliney,   Ian   Pitt,   and  Dave   Murphy.   2010.   Evaluating   a   Brain-­‐Computer   Interface   to   Categorise   Human  Emotional  Response.   In  Proceedings  of   the  2010  10th  IEEE  International  Conference  on  Advanced  Learning   Technologies  (ICALT   '10).   IEEE  Computer   Society,   Washington,   DC,   USA,   276-­‐278.  

[92]: Genaro   Rebolledo-­‐Mendez,   Ian   Dunwell,  Erika   A.   Martinez-­‐Miron,   Maria   Dolores   Vargas-­‐Cerdan,   Sara  Freitas,   Fotis   Liarokapis,   and  Alma  R.   Garcia-­‐Gaona.   2009.   Assessing   NeuroSky's  Usability   to   Detect   Attention   Levels   in   an  Assessment  Exercise.   In  Proceedings  of   the  13th  International   Conference   on   Human-­‐Computer  Interaction.   Part   I:   New   Trends,   Julie   A.   Jacko  (Ed.).   Springer-­‐Verlag,   Berlin,   Heidelberg,   149-­‐158.

[93]: Self   City.   Waag   Society,   Stichting  Experimentele   Werkplaatsen   (SEW),   RENN4:  Regionaal   Expertise   Centrum   Noord   Nederland  (cluster   4),   Prof.dr.H.J.M.Hermans,Em,   Radboud  University,Nijmegen.    http://waag.org/projects/selfcity  

[94]: C. S. Pinhanez, Advisor(s) Aaron F. Bobick. (1999). Representation and Recognition of Action in Interactive Spaces. Ph.D. Dissertation. Massachusetts Institute of Technology, Cambridge, MA, USA.

[95]: Sung,  J.,  Ponce,  C.,  Selman,  B.,  and  Saxena,  A.  (2011).   Human   Activity   Detection   from   RGBD  Images.  Artificial   Intelligence  abs/1107.0,  47-­‐55.  Available  at:  http://arxiv.org/abs/1107.0169.

[96]: M.   Mateas.   (2002).  Interactive   Drama,   Art  and   Artificial   Intelligence.   Ph.D.   Dissertation.  Carnegie  Mellon  Univ.,  Pittsburgh,  PA,  USA.  

[97]: R. M. Taylor, II, T. C. Hudson, A. Seeger, H. Weber, J. Juliano, and A. T. Helser. (2001). VRPN: a device-independent, network-transparent VR peripheral system. In Proceedings of the ACM symposium on Virtual reality software and technology (VRST '01). ACM, New York, NY, USA, 55-61.

 
