Combining Reactive and Deliberative Algorithms
CSCI7000: Final Presentation
Maciej Stachura
Dec. 4, 2009
Outline
• Project Overview
• Positioning System
• Hardware Demo
Project Goals
• Combine deliberative and reactive algorithms
• Show stability and completeness
• Demonstrate multi-robot coverage on iCreate robots.
Coverage Problem
• Cover the entire area.
• Deliberative algorithm plans the next point to visit.
• Reactive algorithm pushes the robot to that point.
• Reactive algorithm adds two constraints:
• Maintain communication distance
• Collision avoidance
Proof of Stability
[Lyapunov-style derivation shown as equations on the slide.]
The error decays; therefore the system is stable.
Demo for single vehicle
• Implemented on iCreate.
• 5 points to visit.
• Deliberative Algorithm Selects Point.
• Reactive Algorithm uses potential field to reach point.
• Point reached when within some minimum distance.
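The reactive step described above can be sketched as gradient descent on an attractive/repulsive potential field. This is a minimal illustration, not the project's actual controller: the gains `k_att` and `k_rep`, influence radius `d0`, step size, tolerance, and gradient clipping are all invented for the example.

```python
import math

def potential_field_step(pos, goal, obstacles, step=0.1,
                         k_att=1.0, k_rep=0.5, d0=1.0):
    """One reactive step: descend an attractive/repulsive potential.

    pos, goal: (x, y) tuples; obstacles: list of (x, y) points.
    Gains and the influence radius d0 are illustrative choices.
    """
    # Gradient of the quadratic attractive potential pulls toward the goal.
    gx = k_att * (pos[0] - goal[0])
    gy = k_att * (pos[1] - goal[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 1e-9 < d < d0:
            # Gradient of the standard repulsive potential (Khatib-style);
            # the negative contribution pushes the robot away after descent.
            coef = k_rep * (1.0 / d - 1.0 / d0) / d**3
            gx -= coef * dx
            gy -= coef * dy
    # Clip the gradient so a very close obstacle cannot cause a huge jump.
    norm = math.hypot(gx, gy)
    if norm > 5.0:
        gx, gy = gx * 5.0 / norm, gy * 5.0 / norm
    # Move against the gradient (downhill).
    return (pos[0] - step * gx, pos[1] - step * gy)

def drive_to_point(pos, goal, obstacles, tol=0.05, max_iters=2000):
    """Iterate until within the minimum distance of the goal (as on the slide)."""
    for _ in range(max_iters):
        if math.hypot(pos[0] - goal[0], pos[1] - goal[1]) < tol:
            break
        pos = potential_field_step(pos, goal, obstacles)
    return pos
```

In the project the deliberative layer would supply `goal` (the next coverage point) while this loop plays the role of the reactive layer.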
VIDEO
Multi-robot Case
• 2 Robot Coverage
• Blue is free to move
• Green must stay in communication range.
• MATLAB simulation.
VIDEO
Outline
• Project Overview
• Positioning System
• Hardware Demo
Positioning System
• Problems with Stargazer:
• Periods of no measurement
• Occasional bad measurements
• State Estimation (SPF):
• Combine Stargazer with odometry
• Reject bad measurements
SPF Explanation
• Sigma Point Filter uses Stargazer and odometry measurements to predict robot position.
• Non-Gaussian noise
• Implemented and tested on the robot platform.
• Performs well during periods of no measurements or bad measurements.
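A scalar-state sketch of the sigma-point transform an SPF is built on, plus an innovation gate for rejecting bad measurements. The weights follow the standard unscented transform; the chi-square-style gate and all parameter values are assumptions for illustration, since the slides don't give the filter's details.

```python
import math

def sigma_points(mean, var, kappa=2.0):
    """Sigma points for a scalar state (n = 1), standard unscented transform."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    pts = [mean, mean + spread, mean - spread]
    w0 = kappa / (n + kappa)
    wi = 1.0 / (2 * (n + kappa))
    return pts, [w0, wi, wi]

def unscented_transform(pts, weights, f):
    """Propagate sigma points through a (possibly nonlinear) f and
    recover the transformed mean and variance."""
    ys = [f(p) for p in pts]
    mean = sum(w * y for w, y in zip(weights, ys))
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, ys))
    return mean, var

def accept_measurement(z, pred_mean, pred_var, meas_var, gate=9.0):
    """Reject measurements whose normalized innovation squared exceeds
    the gate (~3 sigma) -- one simple way to drop Stargazer outliers."""
    s = pred_var + meas_var
    return (z - pred_mean) ** 2 / s <= gate
```

During Stargazer dropouts the filter would simply keep predicting from odometry; when a measurement arrives, the gate decides whether to use it in the update.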
Outline
• Project Overview
• Positioning System
• Hardware Demo
Roomba Pac-Man
• Implemented a 5-robot demo along with Jack Elston.
• Re-creation of the Pac-Man game.
• Demonstrates the NetUAS system.
• Showcases most of the concepts from class.
Video
Roomba Pac-Man
• Reactive algorithms:
• Walls of maze
• Potential field
• Deliberative algorithms:
• Ghost planning (enumerate states)
• Collision avoidance
• Game modes
• Decentralized:
• Each ghost ran the planning algorithm
• Collaborated on positions
• Communication:
• 802.11b ad-hoc network
• AODV, no centralized node
Roomba Pac-Man
• Simulation:
• Multi-threaded simulation of robots
• Combine software with hardware
• Probabilistic modelling:
• Sigma Point Filter
• Human/robot interaction:
• Limited human control of Pac-Man
• Autonomous ghosts
• Hardware implementation:
• SBCs running Gentoo
• Experimental verification
Left to Do
• Implement inter-robot potential field.
• Conduct Experiments
• Generalize Theory?
End
Questions?
http://pacman.elstonj.com
A Gradient-Based Approach
Greg Brown
Introduction
Robot State Machine
Gradients for “Grasping” the Object
Gradient for Moving the Object
Convergence
Simulation Results
Continuing Work
Place a single beacon on an object and another at the object’s destination. Multiple robots cooperate to move the object.
Goals:
◦ Minimal/no robot communication
◦ Object has an unknown geometry
◦ Use gradients for reactive navigation
Each robot knows:
◦ Distance/direction to object
◦ Distance/direction to destination
◦ Distance/direction to all other robots
◦ Bumper sensor to detect collision
Robots do not know:
◦ Object geometry
◦ Actions other robots are taking
Related “grasping” work:
◦ Grasping with hand – maximize torque [Liu et al]
◦ Cage objects for pushing [Fink et al]
◦ Tug boats manipulating barge [Esposito]
◦ ALL require known geometry
My hybrid approach:
◦ Even distribution around object
◦ Alternate between convergence and repulsion gradients
◦ Similar to cow-herding example from class.
Pull towards object:
$$\gamma = \|r_i - r_{obj}\|$$
Avoid nearby robots:
$$\beta = \prod_{j=1}^{N}\left[1 - \frac{1+d_c^4}{d_c^4}\cdot\frac{\left(\|r_i-r_j\|^2-d_c^2\right)^2}{\left(\|r_i-r_j\|^2-d_c^2\right)^2+1}\cdot\frac{\operatorname{sign}\!\left(d_c-\|r_i-r_j\|\right)+1}{2}\right]$$
Combined cost function:
$$\mathrm{Cost} = \frac{\gamma^2}{\left(\gamma^{\kappa_c}+\beta\right)^{1/\kappa_c}}$$
Repel from all robots:
$$\mathrm{Cost} = \frac{1}{\left(1+\beta\right)^{1/\kappa_r}},\qquad \beta = \prod_{j=1}^{N}\left(\|r_i-r_j\|^2 - d_r^2\right)$$
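The grasping and repulsion costs above (as best reconstructed from the slides) can be evaluated directly. The values chosen here for `d_r` and the κ shaping exponents are invented for illustration, not taken from the presentation.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def attraction_cost(r_i, r_obj, beta, kappa_c=2.0):
    """Combined 'grasping' cost: gamma^2 / (gamma^kappa + beta)^(1/kappa).

    gamma is the distance to the object; beta encodes proximity to the
    other robots. kappa_c is a shaping exponent (illustrative value).
    """
    gamma = math.sqrt(dist2(r_i, r_obj))
    return gamma ** 2 / (gamma ** kappa_c + beta) ** (1.0 / kappa_c)

def repulsion_cost(r_i, others, d_r=1.0, kappa_r=2.0):
    """Repel-from-all-robots cost: 1 / (1 + beta)^(1/kappa_r), with
    beta = prod_j (||r_i - r_j||^2 - d_r^2)."""
    beta = 1.0
    for r_j in others:
        beta *= dist2(r_i, r_j) - d_r ** 2
    return 1.0 / (1.0 + beta) ** (1.0 / kappa_r)
```

A robot alternating between the two phases would descend `attraction_cost` to converge on the object and `repulsion_cost` to spread out around it.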
Related work:
◦ Formations [Tanner and Kumar]
◦ Flocking [Lindhé et al]
◦ Pushing objects [Fink et al, Esposito]
◦ No catastrophic failure if out of position.
My approach:
◦ Head towards destination in steps
◦ Keep close to object
◦ Communicate “through” object
◦ Maintain orientation
Assuming the forklift on the robot can rotate 360°.
Next step vector. Pull to destination:
$$\gamma_1 = \|r_i - r_{\gamma i}\|,\qquad r_{\gamma i} = r_{ideal_i} + d_m\,\frac{r_{ObjCenter} - r_{ObjDest}}{\|r_{ObjCenter} - r_{ObjDest}\|}$$
Valley perpendicular to travel vector:
$$m = -\frac{r_{ObjCenter_x} - r_{ObjDest_x}}{r_{ObjCenter_y} - r_{ObjDest_y} + 0.0001}$$
$$\gamma_2 = \frac{\left|m\,r_{i_x} - r_{i_y} - m\,r_{\gamma_x} + r_{\gamma_y}\right|}{\sqrt{m^2+1}}$$
$$\mathrm{Cost} = \gamma_1^{\kappa_1}\,\gamma_2^{\kappa_2}$$
[Histogram: number of occurrences vs. time steps to move the object, for 3, 4, 5, and 6 bots.]
Continuing work:
◦ Resolve convergence problems
◦ Noise in sensing
◦ Noise in actuation
[Histogram: number of occurrences vs. time steps, for 3, 4, 5, and 6 bots.]
A Young Modular Robot’s Guide to Locomotion
Ben Pearre
Computer Science
University of Colorado at Boulder, USA
December 6, 2009
Outline
Modular Robots
Learning: The Problem, The Policy Gradient, Domain Knowledge
Contributions: Going Forward, Steering, Curriculum Development
Conclusion
Modular Robots
How to get these to move?
The Learning Problem
Given unknown sensations and actions, learn a task:
◮ Sensations s ∈ R^n
◮ State x ∈ R^d
◮ Action u ∈ R^p
◮ Reward r ∈ R
◮ Policy π(x, θ) = Pr(u | x, θ)
Example policy:
$$u(x,\theta) = \theta_0 + \sum_i \theta_i\,(x-b_i)^T D_i\,(x-b_i) + \mathcal{N}(0,\sigma)$$
What does that mean for locomotion?
Policy Gradient Reinforcement Learning: Finite Difference
Vary θ:
◮ Measure performance J0 of π(θ)
◮ Measure performance J1...n of π(θ + ∆1...nθ)
◮ Solve regression, move θ along gradient.
$$\text{gradient} = \left(\Delta\Theta^T \Delta\Theta\right)^{-1} \Delta\Theta^T J$$
where
$$\Delta\Theta = \begin{bmatrix}\Delta\theta_1\\ \vdots\\ \Delta\theta_n\end{bmatrix},\qquad J = \begin{bmatrix}J_1 - J_0\\ \vdots\\ J_n - J_0\end{bmatrix}$$
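The finite-difference scheme on this slide can be sketched in a few lines: perturb θ, measure performance changes, and solve the regression via the normal equations. The perturbation size, sample count, and tiny dependency-free solver are illustrative choices, not the presenter's implementation.

```python
import random

def solve(a, b):
    """Tiny Gauss-Jordan elimination with partial pivoting; solves a x = b."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(n):
            if r != col and a[col][col]:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def fd_policy_gradient(J, theta, n_samples=100, delta=0.01):
    """Finite-difference gradient estimate, as on the slide:
    g = (dTheta^T dTheta)^-1 dTheta^T dJ, solved by least squares.

    J: performance of a policy rollout; theta: parameter list.
    """
    d = len(theta)
    J0 = J(theta)                      # baseline performance of pi(theta)
    dthetas, dJs = [], []
    for _ in range(n_samples):
        dt = [random.uniform(-delta, delta) for _ in range(d)]
        dthetas.append(dt)
        dJs.append(J([t + e for t, e in zip(theta, dt)]) - J0)
    # Normal equations for the least-squares fit of dJ ~ dTheta . g
    xtx = [[sum(r[i] * r[j] for r in dthetas) for j in range(d)]
           for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(dthetas, dJs)) for i in range(d)]
    return solve(xtx, xty)
```

Each estimate costs `n_samples + 1` rollouts, which is why slide 50's "why is RL slow?" question bites hard as the parameter dimension grows.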
Policy Gradient Reinforcement Learning: Likelihood Ratio
Vary u:
◮ Measure performance J(π(θ)) of π(θ) with noise…
◮ Compute log-probability of generated trajectory Pr(τ |θ)
$$\text{Gradient} = \left\langle \left(\sum_{k=0}^{H} \nabla_\theta \log \pi_\theta(u_k \mid x_k)\right)\left(\sum_{l=0}^{H} r_l\right)\right\rangle$$
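The likelihood-ratio estimator can be demonstrated on a one-step Gaussian policy, where the score function has a closed form. The reward r(u) = −(u − 3)² and all parameter values are a toy stand-in invented for the example, not from the slides.

```python
import random

def lr_gradient(theta, sigma=0.5, episodes=8000):
    """Likelihood-ratio (REINFORCE-style) estimate of d E[r] / d theta
    for a one-step Gaussian policy u ~ N(theta, sigma^2).

    For a Gaussian policy, grad_theta log pi(u) = (u - theta) / sigma^2,
    so each episode contributes score * return, matching the slide's
    < (sum_k grad log pi) (sum_l r_l) > with horizon H = 0.
    """
    total = 0.0
    for _ in range(episodes):
        u = random.gauss(theta, sigma)      # sample an action from the policy
        r = -(u - 3.0) ** 2                 # toy reward: peak at u = 3
        total += ((u - theta) / sigma ** 2) * r
    return total / episodes
```

Unlike the finite-difference scheme, only the action is perturbed (by the policy's own noise), so one rollout per episode suffices; in practice a baseline is usually subtracted from the return to cut the estimator's variance.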
Why is RL slow?
“Curse of Dimensionality”
◮ Exploration
◮ Learning rate
◮ Domain representation
◮ Policy representation
◮ Over- and under-actuation
◮ Domain knowledge
Domain Knowledge
Infinite space of policies to explore.
◮ RL is model-free. So what?
◮ Representation is bias.
◮ Bias search towards “good” solutions
◮ Learn all of physics… and apply it?
◮ Previous experience in this domain?
◮ Policy implemented by <programmer, agent> “autonomous”?
How would knowledge of this domain help?
Dimensionality Reduction
Task learning as domain-knowledge acquisition:
◮ Experience with a domain
◮ Skill at completing some task
◮ Skill at completing some set of tasks?
◮ Taskspace Manifold
Goals
1. Apply PGRL to a new domain.
2. Learn mapping from task manifold to policy manifold.
3. Robot school?
1: Learning to locomote
◮ Sensors: Force feedback on servos? Or not.
◮ Policy: u ∈ R^8 controls servos, u_i = N(θ_i, σ)
◮ Reward: forward speed
◮ Domain knowledge: none
Demo?
1: Learning to locomote
[Plots: 10-step forward speed v and policy parameters θ vs. time step s, for the channels steer bow, steer stern, bow, port fwd, stbd fwd, port aft, stbd aft, stern, and effort.]
2: Learning to get to a target
◮ Sensors: Bearing to goal.
◮ Policy: u ∈ R^8 controls servos
◮ Policy parameters: θ ∈ R^16
$$\mu_i(x,\theta) = \theta_i \cdot s = [\,\theta_{i,0}\;\;\theta_{i,1}\,]\begin{bmatrix}1\\ \phi\end{bmatrix}\tag{1--2}$$
$$u_i = \mathcal{N}(\mu_i,\sigma)\tag{3}$$
$$\nabla_{\theta_i}\log\pi(x,\theta) = \frac{1}{\sigma^2}\,(u_i - \theta_i\cdot s)\cdot s\tag{4}$$
2: Task space → policy space
◮ 16-DOF learning FAIL!
◮ Try simpler task:
◮ Learn to locomote with θ ∈ R^16
◮ Try bootstrapping:
1. Learn to locomote with 8 DOF
2. Add new sensing and control DOF
◮ CHEATING! Why?
[Plot: time to complete task (seconds) vs. task number.]
Curriculum development for manifold discovery?
◮ Etude in Locomotion
◮ Task-space manifold for locomotion: θ ∈ ξ · [ 0 0 1 −1 1 −1 1 1 ]^T
◮ Stop exploring in task nullspace
◮ FAST!
◮ Etude in Steering
◮ Can task be completed on locomotion manifold?
◮ One possible approximate solution uses the bases
$$\begin{bmatrix}0 & 0 & 1 & -1 & 1 & -1 & 1 & 1\\ 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0\end{bmatrix}^T$$
◮ Can second basis be learned?
3: How to teach a robot?
How to teach an animal?
1. Reward basic skills
2. Develop control along useful DOFs
3. Make skill more complex
4. A good solution NOW!
Conclusion
Exorcising the Curse of Dimensionality
◮ PGRL works for low-DOF problems.
◮ Task-space dimension < state-space dimension.
◮ Learn f: task-space manifold → policy-space manifold.