TRANSCRIPT
Assistive Human-Robot Interaction (HRI)

F.L. Lewis, NAI
Moncrief-O'Donnell Chair, UTA Research Institute (UTARI)
The University of Texas at Arlington, USA
Guest Foreign Professor, Guangdong University of Technology, Guangzhou, China

and Dan Popa, University of Louisville
and Isura Ranatunga, Reza Modares, Bakur AlQaudi
and Shengli Xie, Minyue Fu, Derong Liu

Talk available online at http://www.UTA.edu/UTARI/acs

Supported by: NSF NRI Initiative, ONR
Supported by: China NNSF, China Project 111
3. Devices: "distributed skin sensors"
Integration of multi-modal, multi-resolution MEMS skin sensors to include tactile, thermal, pressure, acceleration, and distance IR sensing.
- Sensor design tuned for pHRI
- Fabrication on flexible substrates
- Robust packaging in Frubber® & laminates
- Efficient wire interconnect schemes
[Figures: Microsensor packaging and interconnects; concept for a microactuator array using piezo actuators; sensors and skin on the PR2 and youBot robots at UTARI; electrohydrodynamic sensor printing (maskless lithography).]
Multi-Modal Skin and Garments for Healthcare and Home Robots
University of Texas at Arlington, NRI Grant No. 1208623
Program Manager: Dr. Paul Werbos, ECCS, ENG, NSF
Dan O. Popa¹, Frank L. Lewis¹, Nicoleta Bugnariu², Woo Ho Lee¹ and Muthu Wijesundara³
¹Department of Electrical Engineering, University of Texas at Arlington; ²University of North Texas Health Science Center; ³UT Arlington Research Institute
Partner Companies: Advanced Arm Dynamics, Hanson Robotics, Inc., National Instruments
1. System Design: "where to place sensors on robot?"
Novel algorithms and methods for optimal placement and data management of such devices on several co-robots.
- Statistical adaptive sampling for sensor selection (a minimal sketch follows this list)
- Sensor fusion based on noise and sensor scaling models
- Optimization algorithms for maximizing robot perception
- New sensor simulation models and robot control algorithms
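To make the statistical-sampling bullet concrete, here is a minimal sketch of greedy, information-driven selection of skin-cell sites. It is an illustrative stand-in rather than the project's actual algorithm: the RBF prior, the noise level, and the log-determinant score are all assumptions.

```python
import numpy as np

def greedy_sensor_selection(candidates, k, noise_var, signal_cov):
    """Greedily pick k sensor sites that maximize an information score.

    candidates: (N, d) array of candidate skin-cell locations.
    noise_var:  per-sensor measurement noise variance.
    signal_cov: kernel giving the prior covariance between two sites.
    """
    N = len(candidates)
    # Prior covariance of the sensed field at all candidate sites.
    K = np.array([[signal_cov(candidates[i], candidates[j])
                   for j in range(N)] for i in range(N)])
    chosen = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(N):
            if i in chosen:
                continue
            idx = chosen + [i]
            S = K[np.ix_(idx, idx)] + noise_var * np.eye(len(idx))
            gain = np.linalg.slogdet(S)[1]   # log-volume of gathered info
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
    return candidates[chosen]

# Example: pick 8 of 100 random sites on a patch, with an RBF prior.
rng = np.random.default_rng(0)
sites = rng.uniform(0.0, 1.0, size=(100, 2))
rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2) / 0.02)
print(greedy_sensor_selection(sites, 8, noise_var=0.01, signal_cov=rbf))
```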
4. Co-Robot Performance: "how does this technology help humans?"
The impact of the new technology on humans will be assessed, including safety, level of assistance to several targeted user groups, ease of use, aesthetics, and therapeutic benefits.
- Clinical testing at UNTHSC and UTARI
- Collaborative work with Advanced Arm Dynamics
[Figure: Robot, tactile sensor array, and the environment in Gazebo.]
[Figure: Diagram of SkinSim, the multi-modal skin simulation environment: Gazebo world, model, and sensor plugins described in SDF; sensor placement for infrared, acceleration, temperature, and tactile sensing on robot, human, and character models; a thermal overlay with temperature profile; meshes collected from SketchUp; and a C++/Python user application receiving the infrared, acceleration, temperature, and tactile outputs.]
Applications: assistance for upper-limb amputees, PR2 teach-by-demonstration, and human interaction data collection.
[Figure: Skin fabrication at UTARI: an array of temperature sensors on a Parylene (polymer) substrate; top and bottom Kapton layers with electric traces and pads, bonded by 50 µm thick pressure-sensitive adhesive tape.]
[Figure: Model Reference Neuroadaptive Controller block diagram, with robustifying signal v(t).]
Neuroadaptive Impedance Control
2. Control and Learning: "both human and robot learn during interaction"
Learning algorithms and adaptive impedance control for efficient use of multimodal sensors to sense human intent and improve the usability of co-robots.
- Online reinforcement learning for pHRI with co-robots wearing skin and garments, given human-centric rewards and cost functions
- Neuroadaptive impedance control with stability and performance guarantees
The robustifying signal is
$$v(t) = -K_z\left(\|\hat Z\|_F + Z_B\right) r$$
where the unknown robot nonlinear function is
$$f(x) = M(q)(\ddot x_m + \Lambda\dot e) + V(q,\dot q)(\dot x_m + \Lambda e) + F(\dot q) + G(q)$$
and the neuroadaptive control takes the form $f_c = \hat W^T\sigma(\hat V^T x) + K_v r - v(t)$.
[Figure: Iterative design loop: task requirements and initial sensor prototypes & robotic hardware feed robot-and-skin simulation; fabrication & integration of skin/garment hardware; interaction learning; and pHRI measurement & simulation. Designs & algorithms are iterated against the perceived impedance.]
Fully Automated Robot vs. Assistive Robot
PR2 meets Isura
Standard Robot Trajectory Tracking Controller
Where is the human?
Impedance Control
- Robot dynamics
- Prescribed error system
- The control torque depends on the impedance model parameters
Human task learning has two components:
1. The human learns a robot dynamics model to compensate for robot nonlinearities.
2. The human learns a task model to properly perform the task.
Inner robot-specific control loop: INDEPENDENT OF TASK
Outer task-specific control loop: INDEPENDENT OF ROBOT DETAILS
Human Performance Factors Studies
Two-loop HRI Design: Robot Control versus Task Control
1. Inner Robot-Specific Control Loop
2. Outer Task-Specific Control Loop
2A. Adaptive Inverse Filter Task Design
2B. Model Reference Adaptive Control Task Design
2C. Reinforcement Learning Task Control for Minimum Human Effort
3. Experiments
Robot control inner loop
Task control outer loop
1. Inner‐Loop Robot Specific Controller
Model‐reference Neuro‐adaptive Control
There is NO prescribed trajectory in the robot control loop design
Make the robot behave like the prescribed model
F.L. Lewis, D.M. Dawson, and C.T. Abdallah, Robot Manipulator Control: Theory and Practice, 2nd ed., revised and expanded, CRC Press, Boca Raton, 2006.
F.L. Lewis, S. Jagannathan, and A. Yesildirek, Neural Network Control of Robot Manipulators and Nonlinear Systems, Taylor and Francis, London, 1999.
A Novel Control Objective Using Neuro‐adaptive Control Techniques
Model-following error formulation:

Robot dynamics: $M(q)\ddot x + V(q,\dot q)\dot x + F(\dot q) + G(q) + f_d = f_c + f_h$

Prescribed robot impedance model: $M_m\ddot x_m + D_m\dot x_m + K_m x_m = f_h$

Model-following error: $e = x_m - x$

Sliding-mode error: $r = \dot e + \Lambda e$

Error dynamics: $M(q)\dot r = -V(q,\dot q)\,r + f(x) + f_d - f_c - f_h$

Unknown robot nonlinear function: $f(x) = M(q)(\ddot x_m + \Lambda\dot e) + V(q,\dot q)(\dot x_m + \Lambda e) + F(\dot q) + G(q)$, with NN input vector $x = [\,e^T\ \dot e^T\ x_m^T\ \dot x_m^T\ \ddot x_m^T\ q^T\ \dot q^T\,]^T$.

There is NO task trajectory here.
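For context on where the admittance trajectories come from, the sketch below integrates the prescribed impedance model $M_m\ddot x_m + D_m\dot x_m + K_m x_m = f_h$ from a measured human force to produce $x_m, \dot x_m, \ddot x_m$. The scalar gains, Euler integration, and step size are illustrative assumptions, not values from the talk.

```python
def admittance_step(x_m, xd_m, f_h, M_m, D_m, K_m, dt):
    """One Euler step of the prescribed impedance (admittance) model
    M_m*xdd_m + D_m*xd_m + K_m*x_m = f_h, driven by the human force."""
    xdd_m = (f_h - D_m * xd_m - K_m * x_m) / M_m
    x_m = x_m + xd_m * dt
    xd_m = xd_m + xdd_m * dt
    return x_m, xd_m, xdd_m

# Example: a constant 1 N push against a 1 kg / 8 Ns/m / 20 N/m impedance.
x, xd = 0.0, 0.0
for _ in range(5000):
    x, xd, xdd = admittance_step(x, xd, f_h=1.0,
                                 M_m=1.0, D_m=8.0, K_m=20.0, dt=0.001)
print(x)   # settles near f_h / K_m = 0.05 m
```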
Control torque: $\tau = J^T(q)\, f_c$

Controller (approximation-based): $f_c = \hat f + K_v r - v(t) - f_h$

Closed-loop error dynamics: $M(q)\dot r = -\left(V(q,\dot q) + K_v\right) r + \tilde f + f_d + v(t)$, so the model-following error is driven by the parameter estimation error.

Unknown nonlinearities parameterized in terms of a function approximator: $f(x) = W^T\sigma(V^T x) + \varepsilon$, with $W, V$ the unknown parameters.

Estimate for the unknown nonlinearities, with estimated parameters $\hat W, \hat V$: $\hat f(x) = \hat W^T\sigma(\hat V^T x)$, giving $\tilde f(x) = f(x) - \hat f(x)$.

Adaptive control structure: $f_c = \hat W^T\sigma(\hat V^T x) + K_v r - v(t) - f_h$

Standard adaptive parameter tuning algorithms:
$$\dot{\hat W} = F\hat\sigma r^T - F\hat\sigma'\hat V^T x\, r^T - \kappa F\|r\|\hat W$$
$$\dot{\hat V} = G x\left(\hat\sigma'^T\hat W r\right)^T - \kappa G\|r\|\hat V$$

Robust control term: $v(t) = -K_z\left(\|\hat Z\|_F + Z_B\right) r$, with NN input vector $x = [\,e^T\ \dot e^T\ x_m^T\ \dot x_m^T\ \ddot x_m^T\ q^T\ \dot q^T\,]^T$.
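A minimal numerical sketch of this adaptive control structure, assuming sigmoid hidden-layer activations; the gain matrices, dimensions, and Euler weight update are illustrative assumptions, not values from the talk.

```python
import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))              # sigmoid activations

def nn_controller_step(W, V, x, r, f_h, Kv, Kz, ZB, F, G, kappa, dt):
    """One step of the model-reference neuroadaptive controller.
    W: (L, n) and V: (d, L) NN weights, x: (d,) NN input, r: (n,)
    sliding-mode error, f_h: (n,) measured human force.
    Returns the control force f_c and the updated weights."""
    s = sigma(V.T @ x)                            # hidden-layer output
    sp = np.diag(s * (1.0 - s))                   # sigmoid Jacobian (L, L)
    f_hat = W.T @ s                               # estimate of f(x)
    # Robustifying term v(t) = -Kz*(||Z||_F + ZB)*r.
    Z_fro = np.sqrt(np.sum(W ** 2) + np.sum(V ** 2))
    v = -Kz * (Z_fro + ZB) * r
    # Adaptive control structure: f_c = W^T sigma(V^T x) + Kv r - v - f_h.
    f_c = f_hat + Kv @ r - v - f_h
    # Standard tuning laws with e-modification term kappa*||r||.
    W_dot = (F @ (np.outer(s, r) - np.outer(sp @ (V.T @ x), r))
             - kappa * np.linalg.norm(r) * F @ W)
    V_dot = (G @ np.outer(x, sp.T @ (W @ r))
             - kappa * np.linalg.norm(r) * G @ V)
    return f_c, W + W_dot * dt, V + V_dot * dt
```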
No task reference trajectory is used here. The robot controller makes the model-following error small. The parameters of the admittance model are not needed.
Model-following error: $e = x_m - x$, with $M_m\ddot x_m + D_m\dot x_m + K_m x_m = f_h$ and $r = \dot e + \Lambda e$.
No task trajectory information is used in this inner-loop robot controller. The inner-loop robot controller makes the model-following error small. The admittance model parameters are not needed; only the admittance model trajectories $x_m, \dot x_m, \ddot x_m$ are needed.
2. Outer Task-Specific Control Loop
2A. Adaptive Inverse Filter Task Design
2B. Model Reference Adaptive Control Task Design
2C. Reinforcement Learning Task Control for Minimum Human Effort
Robot control inner loop
Task control outer loop
Three Outer-Loop Designs (to appear, 2016)
2A. Outer‐loop Task Specific Design #1
Want to find $M(s)$ so that $D(s) = M(s)H(s)$, with $H(s)$ and $D(s)$ unknown.
Adaptive Inverse Control and Wiener Filter
B. Widrow Adaptive inverse filter
For a trajectory-following task, e.g. point-to-point motion control.
Work of Isura Ranatunga
Signal to robot controller
Wiener filter solution in terms of power spectral densities:
$$M(s) = \frac{\Phi_{f_h x_d}(s)}{\Phi_{f_h f_h}(s)} = \frac{D(s)}{H(s)}$$
[Figure: Online adaptive realization of the filter in companion (state-variable) form: the human force $f_h(t)$ drives a chain of integrators $1/s$, and the output $x_m(t)$ is formed as a linear combination with adapted numerator coefficients $b_1, b_2, b_3$ and denominator coefficients $a_1, a_2, a_3$.]
Find the Wiener filter online using adaptive learning.
Ideal filter: $x_d(t) = H^T\theta(t)$
Wiener filter solution: $x_m(t) = \hat H^T(t)\,\theta(t)$
Here $\theta(t)$ is a known regression vector and $H$ the unknown coefficients; the error $x_d - x_m$ drives the tuning.
Kalman filter = continuous-time recursive least squares (CT RLS).
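One way to realize the "Kalman filter = CT RLS" tuning is a recursive-least-squares update of the coefficient estimate $\hat H$, sketched below in discrete time; the forgetting factor and the discretization are assumptions.

```python
import numpy as np

def rls_update(H_hat, P, theta, x_d, lam=0.999):
    """One RLS update of the adaptive inverse (Wiener) filter coefficients.
    H_hat: (L, 1) coefficient estimate, P: (L, L) covariance,
    theta: (L,) known regression vector, x_d: desired filter output."""
    theta = theta.reshape(-1, 1)
    x_m = float(H_hat.T @ theta)          # current filter output
    err = x_d - x_m                       # x_d - x_m drives the tuning
    K = P @ theta / (lam + float(theta.T @ P @ theta))
    H_hat = H_hat + K * err               # coefficient update
    P = (P - K @ theta.T @ P) / lam       # covariance update
    return H_hat, P, x_m
```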
Combined stability analysis of the inner robot control loop and the outer task-following loop. Lyapunov function:
$$L = \tfrac{1}{2} r^T M(q)\, r + \tfrac{1}{2}\,\mathrm{tr}\{\tilde W^T F^{-1}\tilde W\} + \tfrac{1}{2}\,\mathrm{tr}\{\tilde V^T G^{-1}\tilde V\} + \tfrac{1}{2}\,\tilde H^T(t)\, P^{-1}(t)\,\tilde H(t)$$
combining the robot model-following error, the NN weight estimation errors, and the outer-loop inverse adaptive filter error.
十年树木，百年树人 (Shi nian shu mu, bai nian shu ren): "It takes ten years to grow a tree, but a hundred years to cultivate a person."
可是，五年树学生 (Keshi, wu nian shu xuesheng): "But only five years to cultivate a student."
2B. Outer‐loop Task Specific Design #2
Model Reference Adaptive Control
K. Astrom
BUT: in standard MRAC, the controller appears before the unknown plant. Here, the unknown plant (e.g. the human) is BEFORE the controller.
So we need to add a human dynamics identifier.
Unknown human model (basic muscle response model): $H(s) = \dfrac{y_h}{u_c} = \dfrac{b}{s+a}$

Task reference model: $H_m(s) = \dfrac{y_m}{u_c} = \dfrac{b_m}{s+a_m}$, a first-order crossover model for the ideal human-plus-robot system.

Nominal robot impedance model: $H_n(s) = \dfrac{y_p}{u} = \dfrac{b_n}{s+a_n}$, used to generate the prescribed model trajectory $x_m, \dot x_m, \ddot x_m$.

Human factors studies show that AFTER learning, the human-plus-robot system obeys simple first-order, high-bandwidth roll-off dynamics.
Human response estimation error: $\tilde y = y - \hat y$
Parameter estimation errors: $\tilde a = a - \hat a$, $\tilde b = b - \hat b$
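A minimal sketch of one possible human dynamics identifier for the first-order model $\dot y = -a\,y + b\,u_c$ with $a, b$ unknown: a series-parallel gradient estimator driven by the response estimation error $\tilde y = y - \hat y$. The observer and adaptation gains are illustrative assumptions.

```python
def human_identifier_step(y, u_c, y_hat, a_hat, b_hat, dt,
                          lam=5.0, ga=10.0, gb=10.0):
    """One Euler step of a series-parallel gradient identifier for the
    first-order human model y_dot = -a*y + b*u_c (a, b unknown)."""
    y_tilde = y - y_hat                      # response estimation error
    y_hat += (-a_hat * y + b_hat * u_c + lam * y_tilde) * dt
    a_hat += -ga * y_tilde * y * dt          # gradient update for a_hat
    b_hat += gb * y_tilde * u_c * dt         # gradient update for b_hat
    return y_hat, a_hat, b_hat
```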
Combined stability proof of the overall two-loop robot-task system, with Lyapunov function
$$L = \tfrac{1}{2} r^T M(q)\, r + \tfrac{1}{2}\,\mathrm{tr}\{\tilde W^T F^{-1}\tilde W\} + \tfrac{1}{2}\,\mathrm{tr}\{\tilde V^T G^{-1}\tilde V\} + \cdots$$
covering the inner model-tracking error, the NN parameter estimation errors, and the adaptive tuning parameter errors $\tilde a, \tilde b$.
Basic muscle response: a PD controller like that provided by the cerebellum.
2C. Outer‐loop Task Specific Design #3
Reinforcement Learning for minimum human effort
Feedforward assistive control term
[Figure: Outer-loop assistive structure: the human, modeled as a PD element $(K_p + K_d s)$ with nonlinearity $l(\cdot)$ and gain $K_h$ acting on the error between the desired trajectory $x_d$ and the model trajectory $x_m$, exerts force $f_h$ on the prescribed impedance model $(Ms^2 + Bs + K)^{-1}$, whose output is $x_m$.]
Find the robot impedance model parameters $M, B, K$ to minimize the human force effort $f_h$ and the task trajectory-following error $e_d$.
Human force amplifier.
Work of Reza Modares.
The force exerted by the human indicates his discontent: a measure of human intent.
Human model: $(K_d s + K_p)\, f_h = k_e e_d$

Tracking error: $e_d = x_d - x_m \in \mathbb{R}^n$

Augmented errors and states: $\bar e_d = [\,e_d^T\ \dot e_d^T\,]^T = \bar x_d - \bar x_m \in \mathbb{R}^{2n}$, with $\bar x_m = [\,x_m^T\ \dot x_m^T\,]^T \in \mathbb{R}^{2n}$ and $\bar x_d = [\,x_d^T\ \dot x_d^T\,]^T \in \mathbb{R}^{2n}$.
Minimize human effort and tracking error
Performance index:
$$J = \int_t^\infty \left(\bar e_d^T Q_d\, \bar e_d + f_h^T Q_h f_h + u_e^T R\, u_e\right) d\tau$$
Then the control is $u_e = K_1 \bar e_d + K_2 f_h$.
In terms of the augmented state $X$,
$$J = \int_t^\infty \left(X^T Q X + u_e^T R\, u_e\right) d\tau$$
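For reference, the offline baseline that reinforcement learning later replaces: if the augmented dynamics $(A, B)$ were known, the optimal gains would come from solving the algebraic Riccati equation. The sketch below does this with SciPy for a hypothetical 1-DOF augmented state; the state ordering and every number are placeholders, not values from the talk.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical augmented state X = [e_d, de_d/dt, f_h]; all numbers are
# illustrative placeholders.
M, B, K = 1.0, 8.0, 20.0               # prescribed impedance parameters
Ah, Eh = -2.0, 1.5                     # assumed human force dynamics
A = np.array([[0.0,     1.0,    0.0],
              [-K / M, -B / M,  1.0 / M],
              [Eh,      0.0,    Ah]])
Bu = np.array([[0.0], [1.0 / M], [0.0]])
Q = np.diag([10.0, 1.0, 2.0])          # weights on e_d, de_d/dt, f_h
R = np.array([[0.1]])
P = solve_continuous_are(A, Bu, Q, R)  # requires full model knowledge
K_gain = np.linalg.inv(R) @ Bu.T @ P   # optimal control u_e = -K_gain @ X
print(K_gain)
```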
How do we get the human force into the performance index?
Feedback linearization loop.
Robot impedance model plus unknown human model give the overall augmented dynamics.
From the human model $(K_d s + K_p)\, f_h = k_e e_d$, i.e. $K_d \dot f_h + K_p f_h = k_e e_d$,
$$\dot f_h = -K_d^{-1} K_p f_h + k_e K_d^{-1} e_d \equiv A_h f_h + E_h e_d$$
We want an online method to learn the optimal control without knowing the system matrix A.
Optimal Design Always Admits Reinforcement Learning for Real‐time Optimal Adaptive Control
Optimal control is an offline method, based on solving the algebraic Riccati equation (ARE) and knowing all the plant dynamics.
Take enough data along the system trajectory to solve this equation using least squares.
OFF-POLICY Reinforcement Learning needs NO knowledge of the system dynamics.
Off‐policy IRL Bellman equation
( ) ( ) [2 ( ) ] ( ) ( ) ( ) ( )t t t t
T T T T Te e
t t
X t PX t X t PBe d X t Q X t u R u d X t t P X t tt t t+D +D
é ù+ = + + +D +Dê úë ûò ò
Off‐policy term
Finds optimal control gains without using ANY system dynamics
Off-policy Reinforcement Learning needs NO knowledge of the system dynamics.
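A minimal sketch of the policy-evaluation step implied by the off-policy IRL Bellman equation above: one equation per data window, stacked and solved for $P$ and $M = PB$ by least squares, using no knowledge of the system matrix $A$. The data layout and feature ordering are illustrative assumptions, with $e = u - u_e$ taken as the difference between the applied control and the current target policy.

```python
import numpy as np

def irl_policy_evaluation(Xt, Xtp, int_Xe, rho, n, m):
    """Off-policy IRL policy evaluation by least squares over N windows.
    Xt, Xtp : (N, n) augmented states at each window's start and end
    int_Xe  : (N, n*m) window integrals of outer(X, e), with e = u - u_e
    rho     : (N,) window integrals of X^T Q X + u_e^T R u_e
    Returns P (value matrix) and M = P @ B; improved gain is R^-1 @ M.T."""
    def phi(X):                               # quadratic features X_i * X_j
        return np.einsum('ni,nj->nij', X, X).reshape(len(X), -1)
    # The Bellman equation is linear in vec(P) and vec(M):
    Phi = np.hstack([phi(Xt) - phi(Xtp), 2.0 * int_Xe])
    theta, *_ = np.linalg.lstsq(Phi, rho, rcond=None)
    P = theta[:n * n].reshape(n, n)
    P = 0.5 * (P + P.T)                       # enforce symmetry
    M = theta[n * n:].reshape(n, m)           # estimate of P @ B
    return P, M
```

Iterating this evaluation with the gain improvement $K \leftarrow R^{-1}M^T$ converges, under the usual conditions, to the same gains the ARE would give, but from trajectory data alone.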
3. Experimental Results on PR2
3. Experimental Results: work of Isura Ranatunga and Sven Cremer
Point‐to‐point tracking error Human force effort
Future Work
Thanks !!