Increasing Self-Awareness for Telepresence Robot Users

Naomi T. Fitter∗ and Maja J. Matarić∗

Abstract— As robots become more common in the classroom environment, opportunities emerge for improved distance learning via telepresence robots. K-12 students might be able to maintain their in-classroom education during extended absences through the use of telepresence robots, but existing robot platforms lack desirable self-awareness capabilities like rear vision and speaking volume awareness. At a high level, our research aims to augment the remote environment awareness of telepresence robot operators without significantly increasing their task load. As an initial proof of concept, we constructed a machine learning pipeline that uses dynamic time warping to classify certain prosocial and antisocial behaviors of people near a robot with an accuracy of 85.8%. In this initial work, a teleoperated TurtleBot2 serves as a proxy for the telepresence robots that will later use this classification ability. This robot responds to classified behaviors by maneuvering in a socially appropriate way and notifying stakeholders of events that could harm the robot. We propose future work to improve and build on these initial efforts.

I. MOTIVATION

Robots are increasingly common in human-populated environments like the classroom. Meanwhile, over a quarter of children in the United States miss significant amounts of school each year [1]. Accordingly, one impactful emergent application of robots in the classroom is enabling absent students to attend school via telepresence robots [2]. The possibility of preserving in-classroom learning experiences for these children is immensely beneficial; children learn both information and social abilities through in-school interactions with peers. At the same time, modern telepresence robots fail to provide important self-awareness of factors like speaking volume and the robot’s surroundings. This is especially problematic in environments with children, where robot bullying events like obscene gestures at robots, blocking of robots’ sensors, and physical abuse of robots have already been reported by pioneering researchers in this field [3], [4].

A few self-awareness strategies can help to resolve robot abuse and social navigation issues in the classroom. One approach is delivering concise information about unseen areas around the robot to the robot operator or authority figures. This information enables a user to take the appropriate action or an authority figure to intervene when necessary. Another self-awareness strategy is to incorporate automatic social navigation (perhaps combined with shared autonomy strategies) into the robot’s movement. In this way, the robot becomes more aware of the appropriate prosocial behaviors or evasive maneuvers to make based on events in its environment like those displayed in Fig. 1.

∗Naomi T. Fitter and Maja J. Matarić are with the Interaction Lab, Department of Computer Science, University of Southern California, Los Angeles, CA {nfitter,mataric}@usc.edu

Fig. 1. Example prosocial and antisocial behaviors that might take place outside the view of the front- and downward-facing cameras of a telepresence robot.

The distinction of these behaviors may vary by individual, so for this initial effort, we rely on the conventions identified in previous work such as [3].

Our research aims are to develop 1) ways to communicate extra situational awareness information to robot operators and other parties without significantly increasing operator task load and 2) automatic social navigation for telepresence robots based on a machine learning-mediated understanding of the robot’s environment.

II. SYSTEM ARCHITECTURE

This project sought to recognize and classify gestures in a telepresence robot’s surroundings using RGB-D camera data. Since robot operators lack a full view of the events taking place in the remote environment where their telepresence robot is located, succinct descriptions of nearby people and actions derived from additional camera views can enhance the telepresence experience and the safety of the robot. Computer vision is necessary in this scenario; directly providing additional camera views to a robot operator would substantially increase their task load while using the system.

To facilitate this forward-looking goal of providing more information to robot operators without overwhelming them, we pursued the following high-level structure for processing, labeling, and sharing information about the behaviors of people co-located with the robot:

1) Video data is sent from a Kinect that is mounted on the robot to an OpenNI project.

2) The OpenNI project extracts relevant skeletal data and feeds this information into a gesture recognition model that is trained prior to running the project using the Gesture Recognition Toolkit’s Dynamic Time Warping functionality [5].

Fig. 2. Illustrations of the confrontational and prosocial behaviors classified in this initial work: kicking (top row), throwing (middle row), and waving (bottom row).

3) The identity of classified motions is used to propagate the appropriate messages to a Robot Operating System (ROS) Node via server-client networking code, where the ROS Node acts as the server [6]. The ROS Node reads the message and decides how the robot should respond (for example, approaching a co-located person in the case of prosocial behaviors or retreating in the case of antisocial behaviors); a minimal sketch of this server-side logic appears after this list. In the case that the recognized motion could be harmful to the robot, the system may also send an email and/or text message to an adult or other nearby supervisor.

4) The robot operator uses a graphical user interface (GUI) that is a web application (living on the same computer as the ROS Node) to control the robot. This GUI also provides the operator with information about events in the remote environment.
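To make step 3 concrete, the following minimal Python sketch shows one way the server side of this pipeline could look: a ROS node that accepts gesture labels over a socket, commands the robot to approach or retreat, and emails a supervisor about potentially harmful events. This is an illustrative sketch only; the port number, the newline-delimited label protocol, the label set, the velocity values, and the notify_supervisor addresses are our assumptions rather than details of the actual implementation.

    import socket
    import smtplib
    from email.message import EmailMessage

    import rospy
    from geometry_msgs.msg import Twist

    ANTISOCIAL = {"kick", "throw"}  # hypothetical label set
    PROSOCIAL = {"wave"}

    def notify_supervisor(label):
        # Stub: send a brief email alert about a potentially harmful event.
        # The SMTP host and addresses below are placeholders.
        msg = EmailMessage()
        msg["Subject"] = "Telepresence robot alert: %s detected" % label
        msg["From"] = "robot@example.edu"
        msg["To"] = "supervisor@example.edu"
        msg.set_content("The robot observed a '%s' gesture nearby." % label)
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(msg)

    def respond(label, cmd_pub):
        # Approach on prosocial gestures; retreat and notify on antisocial ones.
        # Unrecognized labels yield a zero-velocity command (robot holds still).
        twist = Twist()
        if label in PROSOCIAL:
            twist.linear.x = 0.2    # move gently toward the person
        elif label in ANTISOCIAL:
            twist.linear.x = -0.2   # back away
            notify_supervisor(label)
        cmd_pub.publish(twist)

    def main():
        rospy.init_node("gesture_response_server")
        cmd_pub = rospy.Publisher("cmd_vel", Twist, queue_size=1)

        # Act as the server: the gesture recognition client connects and
        # sends one newline-terminated label per classified motion.
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        server.bind(("0.0.0.0", 5005))
        server.listen(1)
        conn, _ = server.accept()
        buf = b""
        while not rospy.is_shutdown():
            data = conn.recv(1024)
            if not data:
                break
            buf += data
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                respond(line.decode().strip().lower(), cmd_pub)

    if __name__ == "__main__":
        main()

In a full system, the raw velocity commands above would likely be mediated by the robot’s navigation stack so that approach and retreat motions remain collision-free.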

III. MOTION CLASSIFICATION

A crucial facet of our system architecture for increasing telepresence user self-awareness was the generation of suitable gesture recognition models. To train the gesture recognition model, we recorded 30 instances of each of four gestures demonstrated by members of the research team. For this initial work, these four motions included kicking, throwing, and waving, as illustrated in Fig. 2, as well as being still (no gesture). Models were trained using the Gesture Recognition Toolkit’s Dynamic Time Warping approach with the default settings. We used 5-fold cross validation to ensure that the generated models would perform well on motion recordings not included in training the model. The relatively successful results of this approach appear in Fig. 3.
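The Gesture Recognition Toolkit implements this classifier in C++; the short Python sketch below illustrates the underlying idea, a nearest-neighbor classifier over dynamic time warping distances between skeletal time series, under our own simplifying assumptions rather than GRT’s actual internals.

    import numpy as np

    def dtw_distance(a, b):
        # Dynamic time warping distance between two gesture recordings,
        # each an array of shape (num_frames, num_features), e.g. flattened
        # skeletal joint positions per frame.
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(a[i - 1] - b[j - 1])
                cost[i, j] = d + min(cost[i - 1, j],       # insertion
                                     cost[i, j - 1],       # deletion
                                     cost[i - 1, j - 1])   # match
        return cost[n, m]

    def classify(sample, templates):
        # 1-nearest-neighbor over DTW distance. `templates` is a list of
        # (label, recording) pairs drawn from the training set.
        return min(templates, key=lambda t: dtw_distance(sample, t[1]))[0]

Under this framing, 5-fold cross validation amounts to holding out one fifth of the 120 recordings at a time, classifying each held-out recording against templates from the remaining folds, and averaging the resulting accuracies.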

The main observed classification errors are labeling a wave as a throw and labeling the prosocial or antisocial actions as non-gestures. Waves are likely misclassified as throws because both begin with raising the hand.

                     Classified Identity
Actual Identity   No Gesture   Kick    Throw   Wave
No Gesture          1.000      0.000   0.000   0.000
Kick                0.133      0.867   0.000   0.000
Throw               0.167      0.000   0.833   0.000
Wave                0.167      0.000   0.100   0.733

Fig. 3. A confusion matrix of the motion classification results. The high values along the diagonal of the matrix indicate promise for using this approach in future work.

The incorrect “no gesture” labels may be reduced by adjusting the null rejection coefficient in the classification function, as sketched below. As more gestures are added to the model, the classification accuracy may decline. Thus, we must continue to seek ways to correctly distinguish similar gestures while maintaining a high overall classification accuracy.
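GRT derives a per-class rejection threshold from the training-time DTW distances and a user-set null rejection coefficient; the exact mechanics are library-specific, so the following sketch (building on the hypothetical dtw_distance helper above) only illustrates the general idea of rejecting a match whose best DTW distance is too large.

    def classify_with_rejection(sample, templates, thresholds, coeff=2.0):
        # Classify `sample` by nearest DTW neighbor, but report "no gesture"
        # when the best match is farther than the matched class's typical
        # within-class distance (`thresholds`, estimated from training data)
        # scaled by the null rejection coefficient `coeff`.
        label, dist = min(
            ((lbl, dtw_distance(sample, rec)) for lbl, rec in templates),
            key=lambda pair: pair[1])
        if dist > coeff * thresholds[label]:
            return "no gesture"  # reject: too far from any trained gesture
        return label

Lowering the coefficient rejects more aggressively, trading missed gestures for fewer false positives; raising it reduces incorrect “no gesture” labels at the risk of forcing dissimilar motions into a gesture class.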

IV. RESULTS AND FUTURE WORK

We developed a pipeline for classifying prosocial and antisocial behaviors by people co-located with a mobile robot base. The pipeline can share concise notifications with remote robot users and/or authority figures. The robot can also respond to prosocial behaviors by approaching a person and to antisocial behaviors by retreating, automatically and without robot operator input. Our future research steps will include increasing the accuracy of the motion classifier and augmenting the set of recognized motions. With this pipeline in place, we will perform future studies to compare different methods of notifying robot operators of events in their surroundings. Ultimately, this work will enable increased self-awareness, sociability, and safety for telepresence robots without detrimentally increasing robot operator task load.

ACKNOWLEDGMENTS

We thank Adrian Sunga, Cole Brossart, Lizzy Worstell, Natalie Mackraz, and Utkash Dubey for their contributions to the early stages of this work.

REFERENCES

[1] U.S. Department of Health and Human Services. (2012) The 2011-12 National Survey of Children’s Health. [Online]. Available: http://www.childhealthdata.org/learn/NSCH

[2] V. A. Newhart, M. Warschauer, and L. Sender, “Virtual inclusion via telepresence robots in the classroom: An exploratory case study,” International Journal of Technologies in Learning, vol. 23, no. 4, pp. 2327–2686, 2016.

[3] D. Brščić, H. Kidokoro, Y. Suehiro, and T. Kanda, “Escaping from children’s abuse of social robots,” in Proc. of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2015, pp. 59–66.

[4] M. Tsang, V. Korolik, S. Scherer, and M. Matarić, “Comparing models for gesture recognition of children’s bullying behaviors,” in Proc. of the IEEE International Conference on Affective Computing and Intelligent Interaction (ACII), 2017, pp. 138–145.

[5] N. Gillian and J. A. Paradiso, “The gesture recognition toolkit,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 3483–3487, 2014.

[6] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, “ROS: An open-source Robot Operating System,” in ICRA Workshop on Open Source Software, vol. 3, no. 3.2, Kobe, Japan, 2009, p. 5.