overview of robot perception - university of texas at austinyukez/cs391r_fall2020/...embodied view...

48
CS391R: Robot Learning (Fall 2020) Overview of Robot Perception 1 Prof. Yuke Zhu Fall 2020

Upload: others

Post on 21-Jan-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020)

Overview of Robot Perception

1

Prof. Yuke Zhu

Fall 2020

Page 2: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 2

Logistics

Office Hours

Instructor: 4-5pm Wednesdays (Zoom) or by appointment

TA: 10:15-11:15am Mondays (Zoom) or by appointment

Presentation Sign-Up: Deadline Today (EOD)

First review due: Wednesday 9:59pm (one review: Mask-RCNN or YOLO)

Student Self-Introduction

Page 3: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 3

Today’s Agenda

● What is Robot Perception?

● Robot Vision vs. Computer Vision

● Landscape of Robot Perception

○ neural network architectures

○ representation learning algorithms

○ state estimation tasks

○ embodiment and active perception

● Quick Review of Deep Learning (if time permits)

Page 4: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 4

[Levine et al. JMLR 2016] [Bohg et al. ICRA 2018][Sa et al. IROS 2014]

Perceive

Act

Perceive

Act

Act

Perceive

A key challenge in Robot Learning is to close the perception-action loop.

Page 5: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 5

What is Robot Perception?

Making sense of the unstructured real world…

• Incomplete knowledge of objects and scene

• Environment dynamics and other agents

• Imperfect actions may lead to failure

Page 6: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 6

Robotic Sensors

Making contact of the physical world through multimodal senses

Page 7: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 7

Robotic Sensors

Making contact of the physical world through multimodal senses

[Source: HKU Advanced Robotics Laboratory]

Page 8: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 8

Robot Vision vs. Computer Vision

[Detectron - Facebook AI Research] [Zeng et al., IROS 2018]

Robot vision is embodied, active, and environmentally situated.

Page 9: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 9

Robot Vision vs. Computer Vision

Robot vision is embodied, active, and environmentally situated.

● Embodied: Robots have physical bodies and experience the world directly. Their

actions are part of a dynamic with the world and have immediate feedback on their

own sensation.

● Active: Robots are active perceivers. It knows why it wishes to sense, and chooses

what to perceive, and determines how, when and where to achieve that perception.

● Situated: Robots are situated in the world. They do not deal with abstract

descriptions, but with the here and now of the world directly influencing the behavior

of the system.

[Brooks 1991; Bajcsy 2018]

Page 10: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 10

Robot Perception: Landscape

What you will learn in the chapter of Robotics and Perception

1. Modalities: neural network architectures designed for different sensory modalities

2. Representations: representation learning algorithms without strong supervision

3. Tasks: state estimation tasks for robot navigation and manipulation

4. Embodiment: active perception for embodied visual intelligence

Page 11: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 11

Robot Perception: Modalities

Pixels (from RGB cameras) Point cloud (from structure sensors)

(x1, y1, z1)(x2, y2, z2)

[Source: PointNet++; Qi et al. 2016]

Time series (from F/T sensors) Tactile data (from the GelSights sensors)

[Source: Calandra et al. 2018][Source: Lee*, Zhu*, et al. 2018]

Page 12: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 12

Robot Perception: Modalities

Week 2: Object Detection (Pixels) Week 3: 3D Point Cloud

More sensory modalities

in later weeks…

How can we design the neural network architectures that can effectively process

raw sensory data in vastly different forms?

Page 13: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 13

Robot Perception: Representations

A fundamental problem in robot perception is to learn the proper representations

of the unstructured world.

[Source: Stanford CS331b]

Page 14: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 14

Robot Perception: Representations

“Solving a problem simply means representing it so as to make

the solution transparent.”

Herbert A. Simon, Sciences of the Artificial

Our secret weapon? Learning

Page 15: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 15

[6.S094, MIT]

Page 16: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 16

Robot Perception: Representations

How can we learn representations of the world with limited supervision?

Structural priors (inductive biases)

Interaction and movement (embodiment)

“Nature”

“Nurture”

+

Week 3 (Thu)

Week 4 (Tue)

babies learning by playing

Page 17: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 17

Robot Perception: Representations

How can we learn representations that fuse multiple sensory modalities together?

[The McGurk Effect, BBC]Is seeing believing?

https://www.youtube.com/watch?v=2k8fHR9jKVM

Page 18: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 18

Robot Perception: Representations

How can we learn representations that fuse multiple sensory modalities together?

[Lee*, Zhu*, et al. 2018]

1 2 3 4 5 6

1 2

Reaching

3 4

Alignment

5 6

Insertion

combining vision and force for manipulation

Week 4 Thu: Multimodal Sensor Fusion

Page 19: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 19

Robot Perception: Tasks

State Representation

Perception &

Computer Vision

Robot Control &

Decision Making

Noisy Sensory Data

Page 20: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 20

Robot Perception: Tasks

State Representation

Perception &

Computer Vision

Robot Control &

Decision Making

Noisy Sensory Data

Localization (Week 5 Tue)

Pose Estimation (Week 5 Thu)

Visual Tracking (Week 6 Tue)

Page 21: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 21

Robot Perception: Tasks

State Representation

Robot Control &

Decision Making

Noisy Sensory Data

Perception &

Computer Visionhttp://www.probabilistic-robotics.org/

Page 22: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 22

Robot Perception: Tasks

: state : observation : action

: transition model (motion model)

: measurement model (observation model)

xt

<latexit sha1_base64="TheT5UxEhsllRmdBZl0X0uTguJY=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqMeiF48V7Qe0oWy2m3bpZhN2J2IJ/QlePCji1V/kzX/jts1BWx8MPN6bYWZekEhh0HW/ncLa+sbmVnG7tLO7t39QPjxqmTjVjDdZLGPdCajhUijeRIGSdxLNaRRI3g7GNzO//ci1EbF6wEnC/YgOlQgFo2il+6c+9ssVt+rOQVaJl5MK5Gj0y1+9QczSiCtkkhrT9dwE/YxqFEzyaamXGp5QNqZD3rVU0YgbP5ufOiVnVhmQMNa2FJK5+nsio5ExkyiwnRHFkVn2ZuJ/XjfF8MrPhEpS5IotFoWpJBiT2d9kIDRnKCeWUKaFvZWwEdWUoU2nZEPwll9eJa2LqndZrd1dVurXeRxFOIFTOAcPalCHW2hAExgM4Rle4c2Rzovz7nwsWgtOPnMMf+B8/gB0eI3t</latexit>

zt

<latexit sha1_base64="/JoDDb3w1Lbpxa+WZO8WaYqU2K8=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqMeiF48V7Qe0oWy2m3bpZhN2J0IN/QlePCji1V/kzX/jts1BWx8MPN6bYWZekEhh0HW/ncLa+sbmVnG7tLO7t39QPjxqmTjVjDdZLGPdCajhUijeRIGSdxLNaRRI3g7GNzO//ci1EbF6wEnC/YgOlQgFo2il+6c+9ssVt+rOQVaJl5MK5Gj0y1+9QczSiCtkkhrT9dwE/YxqFEzyaamXGp5QNqZD3rVU0YgbP5ufOiVnVhmQMNa2FJK5+nsio5ExkyiwnRHFkVn2ZuJ/XjfF8MrPhEpS5IotFoWpJBiT2d9kIDRnKCeWUKaFvZWwEdWUoU2nZEPwll9eJa2LqndZrd1dVurXeRxFOIFTOAcPalCHW2hAExgM4Rle4c2Rzovz7nwsWgtOPnMMf+B8/gB3hI3v</latexit>

ut

<latexit sha1_base64="/eqKUhd47miGuwwmnBPvPmbKGWM=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0mkUI9FLx4r2g9oQ9lsN+3SzSbsToQS+hO8eFDEq7/Im//GbZuDtj4YeLw3w8y8IJHCoOt+O4WNza3tneJuaW//4PCofHzSNnGqGW+xWMa6G1DDpVC8hQIl7yaa0yiQvBNMbud+54lrI2L1iNOE+xEdKREKRtFKD+kAB+WKW3UXIOvEy0kFcjQH5a/+MGZpxBUySY3peW6CfkY1Cib5rNRPDU8om9AR71mqaMSNny1OnZELqwxJGGtbCslC/T2R0ciYaRTYzoji2Kx6c/E/r5dieO1nQiUpcsWWi8JUEozJ/G8yFJozlFNLKNPC3krYmGrK0KZTsiF4qy+vk/ZV1atV6/e1SuMmj6MIZ3AOl+BBHRpwB01oAYMRPMMrvDnSeXHenY9la8HJZ07hD5zPH2/mjeo=</latexit>

p(xt|ut, xt−1)

<latexit sha1_base64="rcCQoP/aEiLLnvtOiarySz7BnQE=">AAAB+3icbVDLSsNAFJ3UV62vWJduBotQQUsihbosunFZwT6gDWEynbRDJw9mbqQl5lfcuFDErT/izr9x2mah1QMXDufcy733eLHgCizryyisrW9sbhW3Szu7e/sH5mG5o6JEUtamkYhkzyOKCR6yNnAQrBdLRgJPsK43uZn73QcmFY/Ce5jFzAnIKOQ+pwS05JrluDp14TFx4XzqpnBhZ2euWbFq1gL4L7FzUkE5Wq75ORhGNAlYCFQQpfq2FYOTEgmcCpaVBoliMaETMmJ9TUMSMOWki9szfKqVIfYjqSsEvFB/TqQkUGoWeLozIDBWq95c/M/rJ+BfOSkP4wRYSJeL/ERgiPA8CDzkklEQM00IlVzfiumYSEJBx1XSIdirL/8lncuaXa817uqV5nUeRxEdoxNURTZqoCa6RS3URhRN0RN6Qa9GZjwbb8b7srVg5DNH6BeMj29hwJQG</latexit>

p(zt|xt)

<latexit sha1_base64="1EdS5PnvuyIMSClqyAfSQ+0cy5M=">AAAB8XicbVBNS8NAEJ34WetX1aOXYBHqpSRSqMeiF48V7Ae2IWy2m3bpZhN2J2Kt/RdePCji1X/jzX/jts1BWx8MPN6bYWZekAiu0XG+rZXVtfWNzdxWfntnd2+/cHDY1HGqKGvQWMSqHRDNBJesgRwFayeKkSgQrBUMr6Z+654pzWN5i6OEeRHpSx5yStBId0np0cenBx/P/ELRKTsz2MvEzUgRMtT9wle3F9M0YhKpIFp3XCdBb0wUcirYJN9NNUsIHZI+6xgqScS0N55dPLFPjdKzw1iZkmjP1N8TYxJpPYoC0xkRHOhFbyr+53VSDC+8MZdJikzS+aIwFTbG9vR9u8cVoyhGhhCquLnVpgOiCEUTUt6E4C6+vEya52W3Uq7eVIq1yyyOHBzDCZTAhSrU4Brq0AAKEp7hFd4sbb1Y79bHvHXFymaO4A+szx9grJC9</latexit>

bel(xt)

<latexit sha1_base64="7dqwAjIevYMGVrenjYTqQEQdk/c=">AAAB73icbVDLSgNBEOz1GeMr6tHLYBDiJexKIB6DXjxGMA9IljA7mU2GzM6uM71iCPkJLx4U8ervePNvnCR70MSChqKqm+6uIJHCoOt+O2vrG5tb27md/O7e/sFh4ei4aeJUM95gsYx1O6CGS6F4AwVK3k40p1EgeSsY3cz81iPXRsTqHscJ9yM6UCIUjKKV2gGXpaceXvQKRbfszkFWiZeRImSo9wpf3X7M0ogrZJIa0/HcBP0J1SiY5NN8NzU8oWxEB7xjqaIRN/5kfu+UnFulT8JY21JI5urviQmNjBlHge2MKA7NsjcT//M6KYZX/kSoJEWu2GJRmEqCMZk9T/pCc4ZybAllWthbCRtSTRnaiPI2BG/55VXSvCx7lXL1rlKsXWdx5OAUzqAEHlShBrdQhwYwkPAMr/DmPDgvzrvzsWhdc7KZE/gD5/MHgmSPow==</latexit>

: belief

State estimation methods: Bayes Filtering

Page 23: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 23

Robot Perception: Tasks

State estimation methods: Bayes Filtering

: state : observation : action

: transition model (motion model)

: measurement model (observation model)

xt

<latexit sha1_base64="TheT5UxEhsllRmdBZl0X0uTguJY=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqMeiF48V7Qe0oWy2m3bpZhN2J2IJ/QlePCji1V/kzX/jts1BWx8MPN6bYWZekEhh0HW/ncLa+sbmVnG7tLO7t39QPjxqmTjVjDdZLGPdCajhUijeRIGSdxLNaRRI3g7GNzO//ci1EbF6wEnC/YgOlQgFo2il+6c+9ssVt+rOQVaJl5MK5Gj0y1+9QczSiCtkkhrT9dwE/YxqFEzyaamXGp5QNqZD3rVU0YgbP5ufOiVnVhmQMNa2FJK5+nsio5ExkyiwnRHFkVn2ZuJ/XjfF8MrPhEpS5IotFoWpJBiT2d9kIDRnKCeWUKaFvZWwEdWUoU2nZEPwll9eJa2LqndZrd1dVurXeRxFOIFTOAcPalCHW2hAExgM4Rle4c2Rzovz7nwsWgtOPnMMf+B8/gB0eI3t</latexit>

zt

<latexit sha1_base64="/JoDDb3w1Lbpxa+WZO8WaYqU2K8=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lEqMeiF48V7Qe0oWy2m3bpZhN2J0IN/QlePCji1V/kzX/jts1BWx8MPN6bYWZekEhh0HW/ncLa+sbmVnG7tLO7t39QPjxqmTjVjDdZLGPdCajhUijeRIGSdxLNaRRI3g7GNzO//ci1EbF6wEnC/YgOlQgFo2il+6c+9ssVt+rOQVaJl5MK5Gj0y1+9QczSiCtkkhrT9dwE/YxqFEzyaamXGp5QNqZD3rVU0YgbP5ufOiVnVhmQMNa2FJK5+nsio5ExkyiwnRHFkVn2ZuJ/XjfF8MrPhEpS5IotFoWpJBiT2d9kIDRnKCeWUKaFvZWwEdWUoU2nZEPwll9eJa2LqndZrd1dVurXeRxFOIFTOAcPalCHW2hAExgM4Rle4c2Rzovz7nwsWgtOPnMMf+B8/gB3hI3v</latexit>

ut

<latexit sha1_base64="/eqKUhd47miGuwwmnBPvPmbKGWM=">AAAB6nicbVBNS8NAEJ3Ur1q/qh69LBbBU0mkUI9FLx4r2g9oQ9lsN+3SzSbsToQS+hO8eFDEq7/Im//GbZuDtj4YeLw3w8y8IJHCoOt+O4WNza3tneJuaW//4PCofHzSNnGqGW+xWMa6G1DDpVC8hQIl7yaa0yiQvBNMbud+54lrI2L1iNOE+xEdKREKRtFKD+kAB+WKW3UXIOvEy0kFcjQH5a/+MGZpxBUySY3peW6CfkY1Cib5rNRPDU8om9AR71mqaMSNny1OnZELqwxJGGtbCslC/T2R0ciYaRTYzoji2Kx6c/E/r5dieO1nQiUpcsWWi8JUEozJ/G8yFJozlFNLKNPC3krYmGrK0KZTsiF4qy+vk/ZV1atV6/e1SuMmj6MIZ3AOl+BBHRpwB01oAYMRPMMrvDnSeXHenY9la8HJZ07hD5zPH2/mjeo=</latexit>

p(xt|ut, xt−1)

<latexit sha1_base64="rcCQoP/aEiLLnvtOiarySz7BnQE=">AAAB+3icbVDLSsNAFJ3UV62vWJduBotQQUsihbosunFZwT6gDWEynbRDJw9mbqQl5lfcuFDErT/izr9x2mah1QMXDufcy733eLHgCizryyisrW9sbhW3Szu7e/sH5mG5o6JEUtamkYhkzyOKCR6yNnAQrBdLRgJPsK43uZn73QcmFY/Ce5jFzAnIKOQ+pwS05JrluDp14TFx4XzqpnBhZ2euWbFq1gL4L7FzUkE5Wq75ORhGNAlYCFQQpfq2FYOTEgmcCpaVBoliMaETMmJ9TUMSMOWki9szfKqVIfYjqSsEvFB/TqQkUGoWeLozIDBWq95c/M/rJ+BfOSkP4wRYSJeL/ERgiPA8CDzkklEQM00IlVzfiumYSEJBx1XSIdirL/8lncuaXa817uqV5nUeRxEdoxNURTZqoCa6RS3URhRN0RN6Qa9GZjwbb8b7srVg5DNH6BeMj29hwJQG</latexit>

p(zt|xt)

<latexit sha1_base64="1EdS5PnvuyIMSClqyAfSQ+0cy5M=">AAAB8XicbVBNS8NAEJ34WetX1aOXYBHqpSRSqMeiF48V7Ae2IWy2m3bpZhN2J2Kt/RdePCji1X/jzX/jts1BWx8MPN6bYWZekAiu0XG+rZXVtfWNzdxWfntnd2+/cHDY1HGqKGvQWMSqHRDNBJesgRwFayeKkSgQrBUMr6Z+654pzWN5i6OEeRHpSx5yStBId0np0cenBx/P/ELRKTsz2MvEzUgRMtT9wle3F9M0YhKpIFp3XCdBb0wUcirYJN9NNUsIHZI+6xgqScS0N55dPLFPjdKzw1iZkmjP1N8TYxJpPYoC0xkRHOhFbyr+53VSDC+8MZdJikzS+aIwFTbG9vR9u8cVoyhGhhCquLnVpgOiCEUTUt6E4C6+vEya52W3Uq7eVIq1yyyOHBzDCZTAhSrU4Brq0AAKEp7hFd4sbb1Y79bHvHXFymaO4A+szx9grJC9</latexit>

What if models are hard to specify? Learning

bel(xt)

<latexit sha1_base64="7dqwAjIevYMGVrenjYTqQEQdk/c=">AAAB73icbVDLSgNBEOz1GeMr6tHLYBDiJexKIB6DXjxGMA9IljA7mU2GzM6uM71iCPkJLx4U8ervePNvnCR70MSChqKqm+6uIJHCoOt+O2vrG5tb27md/O7e/sFh4ei4aeJUM95gsYx1O6CGS6F4AwVK3k40p1EgeSsY3cz81iPXRsTqHscJ9yM6UCIUjKKV2gGXpaceXvQKRbfszkFWiZeRImSo9wpf3X7M0ogrZJIa0/HcBP0J1SiY5NN8NzU8oWxEB7xjqaIRN/5kfu+UnFulT8JY21JI5urviQmNjBlHge2MKA7NsjcT//M6KYZX/kSoJEWu2GJRmEqCMZk9T/pCc4ZybAllWthbCRtSTRnaiPI2BG/55VXSvCx7lXL1rlKsXWdx5OAUzqAEHlShBrdQhwYwkPAMr/DmPDgvzrvzsWhdc7KZE/gD5/MHgmSPow==</latexit>

: belief

Example: Particle Filter Localization

Page 24: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 24

Robot Perception: Embodiment

Input-Output Picture (Susan Hurley, 1998)

Conventional View of Perception

[Action in Perception, Alva Noë 2004]

• Perception is the process of building an internal

representation of the environment

• Perception is input from world to mind, and action

is output from mind to world, thought is the

mediating process.

Page 25: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 25

Robot Perception: Embodiment

Kitten Carousel (Held and Hein, 1963)

Embodied View of Perception

• As the active cat (A) walks, the other cat (P) moves

and perceives the environment passively.

• Only the active cat develops normal perception

through self-actuated movement.

• The passive cat suffers from perception problems,

such as 1) not blinking when objects approach,

and 2) hitting the walls.

Page 26: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 26

Robot Perception: Embodiment

Pebbles (James J. Gibson 1966)

Embodied View of Perception

• Subjects asked to find a reference object among a

set of irregularly-shaped objects

• Three groups

a. Passive observers of one static image (49%)

b. Observers of moving shapes (72%)

c. Interactive observers (99%)

• The ability to condition input signals with actions is

crucial to perception.

Page 27: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 27

Robot Perception: Embodiment

Take-home messages

• Perceptual experiences do not present the sense in the way that a photograph does.

• Perception is developed by an embodied agent through actively exploring in the

physical world.

• “We see in order to move; we move in order to see.” – William Gibson

Page 28: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 28

Robot Perception: Embodiment

Week 6 (Thu) – Active Perception: How can embodied agents (robots) improve perception

based on visual experiences through active exploration?

View

Selection

Physical

Interaction

[Ramakrishnan et al. 2019][Pinto et al. 2016]

Page 29: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 29

Research Frontier: Closing the Perception-Action Loop

Perception ActionRobots

How robots’ intelligent behaviors

are guided by their interactive

perception

How robots develop better perception

from embodied sensorimotor

experiences

Page 30: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 30

Visual Processing Methods

Staged Visual Recognition Pipeline End-to-end Deep Learning

What is new since 1980s?

Page 31: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 31

Quick Review of Deep Learning: Artificial Neurons

Biological Neuron

Computational building block for the brain

Artificial Neuron

Computational building block for the neural network

Note: Many differences exist – be careful with the brain analogies!

[Dendritic Computation, Michael London and Michael Hausser 2015]

Page 32: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 32

Quick Review of Deep Learning: Convolutional Networks

Page 33: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 33

Quick Review of Deep Learning: Fully-Connected Layers

[Source: Stanford CS231N]

What is the dimension of W ?

Page 34: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 34

Quick Review of Deep Learning: Convolutional Layers

[Source: Stanford CS231N]

Page 35: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 35

Quick Review of Deep Learning: Convolutional Layers

[Source: Stanford CS231N]

Page 36: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 36

Quick Review of Deep Learning: Convolutional Layers

[Source: Stanford CS231N]

Page 37: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 37

Quick Review of Deep Learning: Convolutional Layers

[Source: Stanford CS231N]

Page 38: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 38

Quick Review of Deep Learning: Convolutional Layers

[Source: Stanford CS231N]

Page 39: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 39

Quick Review of Deep Learning: Convolutional Layers

[Source: Stanford CS231N]

Page 40: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 40

Quick Review of Deep Learning: Convolutional Layers

[Source: Stanford CS231N]

Page 41: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 41

Quick Review of Deep Learning: Pooling Operations

Page 42: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 42

Quick Review of Deep Learning: Activation Functions

Page 43: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 43

Quick Review of Deep Learning: CNN Architectures

AlexNetVGG-16

ResNetLeNet

Page 44: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 44

Quick Review of Deep Learning: Optimization

Stochastic Gradient Descent (SGD)

θ = θ � ηrθJ(θ;x(i); y(i))

<latexit sha1_base64="BKFxODiGs0UWCdpl4EQg4DJQSQo=">AAACKHicbZDLSgMxFIYzXmu9VV26CRahLiwzIiiIWHQjrhTsBTq1nElTG8xkhuSMWIY+jhtfxY2IIm59EtN2Fmo9EPLx/+eQnD+IpTDoup/O1PTM7Nx8biG/uLS8slpYW6+ZKNGMV1kkI90IwHApFK+iQMkbseYQBpLXg7uzoV+/59qISF1jP+atEG6V6AoGaKV24cTHHkegxzSDXeoPL19BIKGdiRelMRw93KQlsTM4ov0x7LQLRbfsjopOgpdBkWR12S68+p2IJSFXyCQY0/TcGFspaBRM8kHeTwyPgd3BLW9aVBBy00pHiw7otlU6tBtpexTSkfpzIoXQmH4Y2M4QsGf+ekPxP6+ZYPewlQoVJ8gVGz/UTSTFiA5Tox2hOUPZtwBMC/tXynqggaHNNm9D8P6uPAm1vbK3Xz642i9WTrM4cmSTbJES8cgBqZBzckmqhJFH8kzeyLvz5Lw4H87nuHXKyWY2yK9yvr4B7UCkrg==</latexit>

input label

learning rate

weights

Backpropagation

Page 45: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 45

Quick Review of Deep Learning: Features

[Source: Stanford CS231N]

Page 46: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 46

Quick Review of Deep Learning: Implementation

Tutorial coming in late September / early October

Page 47: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 47

Quick Review of Deep Learning: Resources

Online Courses

• CS231N: Convolutional Neural Networks for Visual Recognition

http://cs231n.stanford.edu/

• MIT 6.S191: Introduction to Deep Learning

http://introtodeeplearning.com/

Textbooks:

• Deep Learning. Ian Goodfellow, Yoshua Bengio, Aaron Courville

http://www.deeplearningbook.org/

Page 48: Overview of Robot Perception - University of Texas at Austinyukez/cs391r_fall2020/...Embodied View of Perception • As the active cat (A) walks, the other cat (P) moves and perceives

CS391R: Robot Learning (Fall 2020) 48

Resources

Related courses at UTCS

• CS342: Neural Networks

• CS 376: Computer Vision

• CS 378 Autonomous Driving

• CS 393R: Autonomous Robots

• CS394R: Reinforcement Learning: Theory and Practice

Extended readings:

• Action-based Theories of Perception, Stanford Encyclopedia of Philosophy

• Action in Perception, Alva Noë