Unsupervised Video Object Segmentation for Deep Reinforcement Learning
Machine Learning and Data Analytics Symposium
Doha, Qatar, April 1, 2019
Vikash Goel, Jameson Weng, Pascal Poupart
Pascal: RBC Borealis AI Research Director
• Research institute funded by RBC
• 5 research centers:
– Montreal, Toronto, Waterloo, Edmonton and Vancouver
• 80 researchers:
– Integrated (applied & fundamental) research model
• ML, RL, NLP, computer vision, private AI, knowledge graphs
• We are hiring!
Pascal: ML Professor at U of Waterloo
• Deep Learning
– Automated structure learning, sum-product networks, transfer learning
• Reinforcement learning
– Constrained RL, motion-oriented RL, sport analytics
• NLP
– Conversational agents, machine translation, automated proofreading
• Theory
– Convex relaxations of sum-product networks, characterization of local optima in mixture models, consistent approximate Bayesian techniques
Outline
• Background
– Reinforcement learning: data inefficiency
– Solution: self-supervised learning
• MOREL: Motion-Oriented REinforcement Learning
– Unsupervised object & motion recognition
– Faster policy optimization & interpretability
Reference: Goel, Weng, Poupart (2018) Unsupervised Video Object Segmentation for Deep Reinforcement Learning, NeurIPS.
Reinforcement Learning
Games, robotics, automated trading, autonomous driving, recommender systems, conversational agents, operations research, data center optimization
[Figure: agent-environment loop. The agent sends an action to the environment; the environment returns an observation and a reward.]
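The agent-environment loop can be sketched in a few lines of Python. This is only an illustration: `CoinFlipEnv`, `run_episode` and the `reset`/`step` interface are hypothetical names chosen for this sketch and are not from the talk.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the observation is a random bit and the
    agent is rewarded for choosing the action equal to that bit."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.state = 0

    def reset(self):
        self.t = 0
        self.state = self.rng.randint(0, 1)
        return self.state  # first observation

    def step(self, action):
        reward = 1.0 if action == self.state else 0.0
        self.t += 1
        self.state = self.rng.randint(0, 1)
        done = self.t >= self.horizon
        return self.state, reward, done  # observation, reward, end-of-episode flag


def run_episode(env, policy):
    """Run one episode: observe, act, receive a reward, repeat until done."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


# A policy that copies the observation is optimal in this toy environment.
best_return = run_episode(CoinFlipEnv(horizon=10), lambda obs: obs)
```

In image-based RL the observation would be a frame and the policy a deep neural network, but the control flow stays the same.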
Data Inefficiency
• Most RL successes: simulated environments
• Atari baselines: 40M frames (Schulman et al., 2017)
[Figures: Atari, MuJoCo, VizDoom, Computer Go]
Image-based RL
[Figure: image → deep neural network → actions or values, trained from a sparse reward]
Self-supervised learning
• Auxiliary tasks and objectives
– Future observation/reward prediction
– Past observation prediction (inverse dynamics)
– Observation reconstruction (auto-encoder)
[Figure: agent-environment loop with observation, reward and action]
Image-based RL
• Deep RL:
[Figure: image → deep neural network → actions or values, trained from a sparse reward]
• Self-supervised RL (auxiliary tasks):
[Figure: image → deep neural network → next image, trained from a dense signal]
Prior knowledge
• What do you see?
– Humans: moving objects
– RL agent: sequence of pixels
[Figures: Seaquest, Space Invaders, Breakout]
Discovery of relevant features slows down learning
[Figure: image → deep neural network → actions or values, trained from a sparse reward; the network is annotated with two stages, feature extraction and policy optimization]
Faster Learning
Can we learn a policy that automatically segments moving objects and identifies relevant objects?
[Figures: Seaquest, Space Invaders, Breakout]
Outline
• Background
– Reinforcement learning: data inefficiency
– Solution: self-supervised learning
• MOREL: Motion-Oriented REinforcement Learning
– Unsupervised object & motion recognition
– Faster policy optimization & interpretability
Reference: Goel, Weng, Poupart (2018) Unsupervised Video Object Segmentation for Deep Reinforcement Learning, NeurIPS.
MOREL: Motion-Oriented RL
Phase 1: Unsupervised object segmentation
– Only 1% of the frames (random actions)
Phase 2: Faster policy optimization
– Based on object segmentation and motion
Motion Consistency
• Supervised segmentation: labor-intensive labeling
• Idea: leverage optical flow (structure from motion)
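Why optical flow gives a training signal without labels can be illustrated with a small NumPy sketch: a flow field that explains the motion between two frames lets one frame be rebuilt from the other, so reconstruction error can stand in for a label. The frames, the nearest-neighbour warp and the name `warp_with_flow` are assumptions made for this illustration and are not the talk's implementation.

```python
import numpy as np


def warp_with_flow(frame, flow):
    """Sample `frame` at (y + flow_y, x + flow_x) using nearest-neighbour lookup.

    `flow` has shape (H, W, 2) and stores a (dy, dx) displacement per pixel.
    """
    h, w = frame.shape
    ys, xs = np.indices((h, w))
    src_y = np.clip(ys + np.rint(flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(xs + np.rint(flow[..., 1]).astype(int), 0, w - 1)
    return frame[src_y, src_x]


# Frame 1 contains a 2x2 object; in frame 2 it has moved 3 pixels to the right.
frame1 = np.zeros((8, 8))
frame1[2:4, 2:4] = 1.0
frame2 = np.roll(frame1, shift=3, axis=1)

# Flow defined on frame 1: only the object's pixels move.
flow = np.zeros((8, 8, 2))
flow[2:4, 2:4] = (0, 3)

# Rebuild frame 1 by looking up, in frame 2, where each pixel went.
reconstructed_frame1 = warp_with_flow(frame2, flow)
err_with_flow = np.abs(reconstructed_frame1 - frame1).sum()
err_no_flow = np.abs(frame2 - frame1).sum()
```

The correct flow removes the error on the object's own pixels. The remaining error comes from the region the object newly covers, which a flow defined on frame 1 cannot explain.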
SfM-Net
Vijayanarasimhan, Ricco, Schmid, Sukthankar, Fragkiadaki, SfM-Net: Learning of Structure and Motion from Video, arXiv, 2017.
SfM-Net predictions (KITTI 2015)
Simplified 2D SfM-Net
• No skip connection
• Reconstruction loss: DSSIM (structural dissimilarity)
• Flow regularization: L1 loss
• Curriculum: gradually increase $\lambda_{reg}$ from 0 to 1
$$L_{reconstruct} = \mathrm{DSSIM}$$
$$L_{reg} = \sum_k \left\lVert M^{(k)} \times t_k \right\rVert_1$$
$$L_{all} = L_{reconstruct} + \lambda_{reg} L_{reg}$$
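A minimal NumPy sketch of how the terms on this slide could fit together: a DSSIM reconstruction loss plus an L1 flow regularizer whose weight is ramped from 0 to 1. Several choices here are assumptions rather than details from the talk: SSIM is computed over one global window instead of local windows, DSSIM is taken as (1 − SSIM) / 2, the ramp is linear, each object's flow is modelled as its mask times a 2D translation, and all function names are hypothetical.

```python
import numpy as np


def dssim(x, y, data_range=1.0):
    """Structural dissimilarity, (1 - SSIM) / 2, from global image statistics."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return (1.0 - ssim) / 2.0


def flow_regularizer(masks, translations):
    """L1 norm of the per-object flow, summed over objects.

    masks: (K, H, W) soft object masks; translations: (K, 2) per-object shifts.
    """
    per_object_flow = masks[:, :, :, None] * translations[:, None, None, :]
    return np.abs(per_object_flow).sum()


def lambda_reg(step, ramp_steps):
    """Curriculum weight: linear ramp from 0 to 1 over `ramp_steps` updates."""
    return min(1.0, step / ramp_steps)


def total_loss(reconstruction, target, masks, translations, step, ramp_steps):
    """Reconstruction term plus the curriculum-weighted flow regularizer."""
    reg = flow_regularizer(masks, translations)
    return dssim(reconstruction, target) + lambda_reg(step, ramp_steps) * reg


# Tiny example: two masks, only the first one is active on 4 pixels.
masks = np.zeros((2, 4, 4))
masks[0, :2, :2] = 1.0
translations = np.array([[1.0, -2.0], [3.0, 3.0]])
image = np.linspace(0.0, 1.0, 16).reshape(4, 4)

loss_halfway = total_loss(image, image, masks, translations, step=50, ramp_steps=100)
```

Starting with a weight of 0 lets the network first learn to reconstruct frames, and the growing penalty then pushes it toward sparse flow that is concentrated on the objects that actually move.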
Simplified 2D SfM-Net
[Figure: results on Breakout and Pong. Columns: Frame 1, Frame 2, masks (summed), most salient mask, optical flow.]
Unsupervised object segmentation
[Figure: results on Space Invaders, Beam Rider and Seaquest. Columns: Frame 1, Frame 2, masks (summed), most salient mask, optical flow.]
MOREL: Motion-Oriented RL
Multi-objective: max returns and min segmentation error
Comparison with PPO
• Better: 25 games
• Similar: 25 games
• Worse: 9 games
Comparison with A2C
• Better: 26 games
• Similar: 30 games
• Worse: 3 games
Videos
[Videos: Pong, Breakout, Seaquest, Beamrider]
Performance Curves
[Figure: episode rewards vs. frames for Breakout, Seaquest, Pong and Beamrider]
Ablation Study
[Figure: episode rewards vs. frames for Pong, Breakout, Seaquest and Beamrider]
Conclusion
• MOREL: Motion-Oriented REinforcement Learning
– Unsupervised object & motion recognition
– Faster policy optimization & interpretability
• Future work
– 3D environments, physics-based dynamics, object-oriented RL, model-based RL
Reference: Goel, Weng, Poupart (2018) Unsupervised Video Object Segmentation for Deep Reinforcement Learning, NeurIPS.
RBC Borealis AI
• Graduating soon?
– Join RBC Borealis AI (https://www.borealisai.com)
– Email: [email protected]
• Research Institute
– Fundamental research (publications)
– Applied research (products)
• Topics
– RL: automated trading
– NLP: news filtering, information extraction, text generation
– Computer Vision: satellite-based house valuation
– Privacy: differential privacy
– Knowledge graphs: recommender systems