
Tangible User Interfaces and Reinforcement Learning

(Smart Toys)

An honours thesis presentation by…

Trent Apted <[email protected]>

Supervised by A/Prof Bob Kummerfeld

Smart Internet Technology Research Group

Tangible User Interfaces

• Not just a mouse
– Although he can advance my slides

• Facilitate a more intimate interaction with the user
– Mainly targeted towards children
– Huggable, cute and cuddly
– Develop a relationship with the user
– Play games

Toys - Motivation

• Plush (soft and furry) toys account for around 25% of toy store sales

• Over 17 million Furby toys were sold between October 1998 and December 1999
– They had primitive learning capabilities
– Mostly robot-like in appearance
– They were also relatively cheap (unlike Sony’s Aibo ~$2,000+)

Toys - Challenges

• Want to (cheaply) make a Smart Toy, derived from a plush doll

• Don’t want to adversely affect the original function
– Namely, being soft, cute and cuddly

• Also want to be able to detect the usual ‘plush toy’ interactions
– E.g. squeeze, carry, lie down with

• I am not an engineer…

Reinforcement Learning

• Like training a dog with a ‘clicker’

• Need to associate the reward (click) with behaviour in a nearby temporal window
– How to represent the behaviour
– How to determine the window

• Apply learning that attempts to maximise the total expected future reward

• Many techniques
– Q-learning, TD(λ), Bayesian models, Markov models, neural networks, actor-critic, hierarchical
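As a rough illustration of the first technique named above, here is a minimal sketch of one tabular Q-learning update, where the 'click' is the reward. The state names, action names, and the `alpha` and `gamma` values are assumptions chosen for the example, not taken from the thesis.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) towards r + gamma * max Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

# Hypothetical toy actions and states, for illustration only.
ACTIONS = ["wave", "giggle"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# The user 'clicks' (reward 1.0) shortly after the toy giggled while squeezed.
new_value = q_update(q, "squeezed", "giggle", 1.0, "idle", ACTIONS)
```

With all values starting at zero, a single rewarded step moves the estimate halfway towards the reward because `alpha` is 0.5.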

Reinforcement Learning - Challenges

• Not all techniques can be applied to this scenario
– Infinite: no end to training examples
– Interactive: need to wait for the user to determine the reward
– Discrete: few training examples
– Future use: a (cheap) toy cannot hold a lot of state
– Sensors are unsophisticated (Boolean)

• Also needs to be fun
– Non-determinism
– Anticipate possible actions without stimuli

• It may also not be possible to punish the model
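Two of the constraints above (Boolean sensors, little room for state) and the wish for non-determinism can be sketched together. The sensor names and the epsilon-greedy rule below are illustrative assumptions, not the design used in the thesis: the readings are packed into one small integer, and the toy occasionally picks a random action.

```python
import random

# Hypothetical Boolean sensors, one bit each.
SENSORS = ("squeeze", "carry", "lie_down")

def encode_state(readings):
    """Pack Boolean sensor readings into one small integer (one bit per sensor)."""
    state = 0
    for bit, name in enumerate(SENSORS):
        if readings.get(name, False):
            state |= 1 << bit
    return state

def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy: usually pick the best-valued action, sometimes a random one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

state = encode_state({"squeeze": True, "lie_down": True})
action = choose_action([0.1, 0.7, 0.2], epsilon=0.0, rng=random.Random(0))
```

With three Boolean sensors there are only eight possible states, so a full value table stays tiny. Raising `epsilon` above zero makes the toy less predictable.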

My Contributions – Hardware / Systems

• Design and implementation of the circuitry and sensors

• Integration into a plush toy

• A hardware/software interface (via parallel port) and event model

• Many lessons learnt
– E.g. limitations of high-level hardware (PDA)
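One way to picture an event model over Boolean sensors is to poll a bitmask and report only the bits that changed. The sketch below assumes that reading: the sensor names and sample values are hypothetical, and the actual parallel-port read is left out.

```python
def diff_events(previous, current, names):
    """Compare two polled bitmasks and return (sensor, 'on'/'off') for each changed bit."""
    events = []
    changed = previous ^ current
    for bit, name in enumerate(names):
        mask = 1 << bit
        if changed & mask:
            events.append((name, "on" if current & mask else "off"))
    return events

# Hypothetical sensor-to-bit mapping and a made-up sequence of polled values.
NAMES = ["squeeze", "carry", "lie_down"]
samples = [0b000, 0b001, 0b101, 0b100]

log = []
for prev, cur in zip(samples, samples[1:]):
    log.extend(diff_events(prev, cur, NAMES))
```

Turning levels into change events means the learner only has to react when something happens, not on every poll.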

My Contributions – Software

• Reinforcement learning in the context of a Smart Toy

• Flexible learning architecture for further research and exploration (in other contexts)

• Evaluation of the reinforcement learning techniques implemented

• Implementation of a number of simple games to motivate learning of the toy (fun?)

Some Results and Analysis

• Increasing the state space and re-presenting examples does not help interactive learning

• ‘Snapshot’ environments perform poorly and do not benefit from increasing the learner complexity

• Q-Learning combined with Markov models performs well
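The slides do not say how the Markov model was combined with Q-learning. As a hedged illustration of the Markov side only, the sketch below estimates first-order transition probabilities between states from observed counts. The class name and state names are assumptions.

```python
from collections import Counter, defaultdict

class TransitionModel:
    """First-order Markov model: estimates P(next_state | state) from counts."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, state, next_state):
        self.counts[state][next_state] += 1

    def probability(self, state, next_state):
        total = sum(self.counts[state].values())
        return self.counts[state][next_state] / total if total else 0.0

model = TransitionModel()
# Hypothetical observed interaction sequence.
for s, s2 in [("idle", "squeezed"), ("idle", "squeezed"), ("idle", "carried")]:
    model.observe(s, s2)

p = model.probability("idle", "squeezed")
```

A model like this could let the toy anticipate likely next interactions without new stimuli, which is one of the goals listed under the challenges.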

Future Work

• Improve the abilities of the toy
– There are spare wires, so a speaker would be easy to add
– Speech recognition would be harder

• Wireless
– Remove the tether for more natural interaction
– Power source and increased expense

• Collaboration
– ‘Talking’ to other Smart Toys, collaborating in games
– Collaborative learning

• Examine more learning models

• Psychological / Sociological aspects
