
Sérgio Ronaldo Barros dos Santos (ITA-Brazil)
Sidney Nascimento Givigi Júnior (RMC-Canada)

Cairo Lúcio Nascimento Júnior (ITA-Brazil)

Autonomous Construction of Structures in a Dynamic Environment using Reinforcement Learning

2/25

Introduction

In recent years, there has been growing interest in a class of applications in which mobile robots are used to assemble and build different types of structures.

These applications have traditionally involved humans performing:

The operation of tools and equipment;

The manipulation and transportation of the resources for the manufacture of structures; and

The careful preplanning of the tasks to be executed.

3/25

Introduction

Due to recent advances in the technologies available for UAVs, the problem of autonomous manipulation, transportation and construction is moving into the aerial domain.

Autonomous construction using aerial robots may be useful in several situations, such as:

Reducing the high accident rates of traditional construction;

Enabling construction in extraterrestrial environments or disaster areas; and

Military and logistics applications.


4/25

Quad-rotor Robot

All movements of the quad-rotor can be controlled by changing the speed of each rotor.

An inertial frame and a body-fixed frame, whose origin is at the center of mass of the quad-rotor, are used.
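As general background (not from the slides): on a "+"-configuration quad-rotor, the collective thrust and the three body torques map linearly to the four squared rotor speeds. A minimal sketch of that mixing, with all physical coefficients normalized to 1 and hypothetical sign conventions:

```python
def mix(thrust, tau_roll, tau_pitch, tau_yaw):
    """Map collective thrust and body-torque commands to the squared rotor
    speeds of a '+'-configuration quad-rotor. Arm length and thrust/drag
    coefficients are normalized to 1, and the sign conventions are only
    illustrative (they vary by platform). Returns (front, right, rear, left)."""
    front = thrust / 4 + tau_pitch / 2 - tau_yaw / 4
    rear  = thrust / 4 - tau_pitch / 2 - tau_yaw / 4
    right = thrust / 4 - tau_roll / 2 + tau_yaw / 4
    left  = thrust / 4 + tau_roll / 2 + tau_yaw / 4
    return (front, right, rear, left)
```

Equal rotor speeds give pure thrust; raising the front rotor while lowering the rear one produces a pitch torque, and so on for roll and yaw.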

5/25

Problem Statement

Construction tasks using mobile robots are characterized by three fundamental problems: task planning, motion planning and path tracking.

However, obtaining the task and path planning that define a specific sequence of operations for the construction of different structures is generally very complex.

The task planning, motion planning and low-level controllers for robotic assembly are derived off-line in a simulation environment, using Reinforcement Learning (RL) and heuristic search (A*) algorithms, and the solutions are then ported to an actual quad-rotor.
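As an illustration of the heuristic-search component, here is a minimal A* planner on a 4-connected grid. This is a sketch under assumed interfaces, not the authors' implementation, which plans in the simulated construction environment:

```python
import heapq
import itertools

def astar(grid, start, goal):
    """Minimal A* on a 4-connected grid. Cells with value 1 are obstacles.
    Returns a start-to-goal path as a list of (row, col) cells, or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    tie = itertools.count()                                  # tie-breaker for the heap
    frontier = [(h(start), next(tie), start, 0, None)]       # (f, tie, cell, g, parent)
    parents, best_g = {}, {start: 0}
    while frontier:
        _, _, cur, g, parent = heapq.heappop(frontier)
        if cur in parents:
            continue                       # already expanded at lower cost
        parents[cur] = parent
        if cur == goal:                    # reconstruct path by walking parents
            path = []
            while cur is not None:
                path.append(cur)
                cur = parents[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier, (ng + h(nxt), next(tie), nxt, ng, cur))
    return None
```

The Manhattan heuristic is admissible on a 4-connected grid with unit step costs, so the returned path is optimal.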

6/25

Problem Statement

Proposed Environment

The suggested 3-D structures

This work concentrates on the learning of four different types of 3-D structures: cube, tower, pyramid and wall, similar to those used in the construction of scaffolds, tower cranes, skyscrapers, etc.

7/25

Proposed Solution

• Low-level controllers: Enable the position and path tracking control of the quad-rotor.

• Task planning: Provide the maneuvers and assembly sequence.

• Path planning: Find the optimal path for the robot so that it can navigate through the dynamic environment.
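The three layers above can be pictured as a simple loop in which the task planner drives the path planner, which in turn drives the low-level controller. The composition below is purely illustrative; all interface names are hypothetical:

```python
def construction_loop(task_planner, path_planner, controller, world):
    """Illustrative composition of the three layers: the task planner emits
    the next assembly operation, the path planner turns it into a path
    (a sequence of waypoints), and the low-level controller tracks each
    waypoint. All interfaces here are hypothetical, not the paper's API."""
    for op in task_planner(world):
        path = path_planner(world, op)
        for waypoint in path:
            controller(waypoint)
```

In the paper's setup the first two layers are learned/computed off-line (LA and A*), while the controller runs on the quad-rotor.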

8/25

Experimental Infrastructure

9/25

Experimental Infrastructure

10/25

Reinforcement Learning

The task planning and low-level controllers for robotic assembly were learned by a reinforcement learning algorithm known as Learning Automata (LA).
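The slides do not state which LA update scheme is used; a classical choice is the linear reward-inaction (L_R-I) rule, sketched here with an illustrative learning rate:

```python
def lri_update(probs, action, beta, lam=0.1):
    """Linear reward-inaction (L_R-I) update of an automaton's action
    probabilities. `beta` in [0, 1] is the environment's reward signal and
    `lam` is an illustrative learning rate. On reward, the chosen action's
    probability grows and the others shrink proportionally; when beta == 0
    (inaction), nothing changes. The probabilities always sum to 1."""
    new = list(probs)
    for i in range(len(probs)):
        if i == action:
            new[i] = probs[i] + lam * beta * (1 - probs[i])
        else:
            new[i] = probs[i] - lam * beta * probs[i]
    return new
```

Because unrewarded steps leave the distribution untouched, L_R-I converges toward actions that consistently earn reward.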

11/25

Learning Control of a Quad-rotor

The low-level controllers are adapted in stages, first simultaneously (attitude and height) and afterwards (position and path tracking), using the nonlinear dynamic model of the target quad-rotor built for the X-Plane Flight Simulator and the LA algorithm running in Matlab.

12/25

Learning Control of a Quad-rotor

Some considerations must be taken into account during the learning phase, such as wind and ground effects, as well as the changes in mass and center of gravity of the system produced by different types of payloads.

13/25

Learning Control of a Quad-rotor

A simulation setup is proposed for the training and evaluation of the control parameters under realistic conditions.

14/25

Learning Control of a Quad-rotor

Experimental setup used to test and validate, in simulation, the learned attitude and path-tracking controllers.

15/25

Learning Control of a Quad-rotor

Path tracking and height responses obtained by the quad-rotor during the test of the adapted control laws.

Panels: test in simulation; experimental validation.

16/25

Learning of the Robotic Assembly

The proposed learning system for the autonomous construction of structures: the training process of the task planning is accomplished by a team of automata.

Panels: learning architecture; learning automata.

17/25

Learning of the Robotic Assembly

The proposed total cost function to evaluate the structure construction mode is J(n), a weighted sum of maneuver terms executed during the assembly.

The numeric value of the response quality obtained by the robot during each iteration, I(n), is computed from J(n) and the minimum cost Jmin(n).

[The equations defining J(n) and I(n) are not recoverable from the transcript.]
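Since the original equations were lost in transcription, the following is only a generic illustration of how such a cost and normalized quality could be structured; the terms, weights, and the normalization are hypothetical, not the paper's definitions:

```python
def total_cost(maneuvers, weights):
    """Hypothetical weighted-sum cost J(n): each maneuver type (e.g.
    displacement, rotation, block placement) contributes its count times a
    user-chosen weight. Purely illustrative of the structure of such a cost."""
    return sum(weights[k] * maneuvers[k] for k in maneuvers)

def response_quality(j_n, j_min):
    """One common way to map a cost to a quality in (0, 1]:
    I(n) = Jmin(n) / J(n), where 1 means the best cost found so far.
    This normalization is an assumption, not taken from the slides."""
    return j_min / j_n
```

With such a normalization, lower-cost construction sequences score closer to 1.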

18/25

Learning of the Robotic Assembly

[The piecewise equations defining the reinforcement R(n) in terms of I(n) and the levels Rp and RG, and the common reinforcement Rc(n), which is zero when R(n) = 0 and penalizes assembly errors, are not recoverable from the transcript.]

The value of Rc(n) ∈ [Rp, RG] is understood as a limit established by the user to change the convergence speed during the training process.

A common reinforcement is used to update the action probability distributions of the team of automata.
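One way a single shared signal can drive a team of automata is sketched below, assuming a linear reward-inaction style update for each member; the function and parameter names are illustrative, not the paper's:

```python
import random

def team_step(team_probs, evaluate, lam=0.05):
    """One training step for a team of learning automata sharing a common
    reinforcement. Each automaton samples an action from its own probability
    vector; `evaluate` maps the joint action to one reward beta in [0, 1],
    which every member uses for a linear reward-inaction update (in place).
    All names and the learning rate are illustrative."""
    actions = [random.choices(range(len(p)), weights=p)[0] for p in team_probs]
    beta = evaluate(actions)              # one common reinforcement for all members
    for p, a in zip(team_probs, actions):
        for i in range(len(p)):
            p[i] += lam * beta * ((1 - p[i]) if i == a else -p[i])
    return actions, beta
```

Because every member sees the same reward, the team gravitates jointly toward action combinations that make the whole construction sequence succeed.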

19/25

Learning of the Robotic Assembly

During the learning phase, it can be seen that the knowledge acquired by the system about the assembly of a 3-D structure (a tower) increases with each iteration.

The learned sequence of maneuvers and assembly operations for the construction of a tower is illustrated in the plots below.

20/25

Learning of the Robotic Assembly

Experimental setup used to validate, simultaneously, the learned task planning and the path planning produced by the RL and A* algorithms.

21/25

Learning of the Robotic Assembly

The executed events for the assembly task of a structure.

22/25

Learning of the Robotic Assembly

The trajectory resulting from the learned sequence of maneuvers for assembling the tower was successfully executed by the quad-rotor.

23/25

Conclusions

This method allows the autonomous construction of multiple 3-D structures by a quad-rotor, based on the Learning Automata and A* algorithms.

This approach substantially reduces the effort required to develop the task and motion planning that permit a robot to efficiently assemble and construct multiple 3-D structures.

The use of reinforcement learning for finding different sets of actions to build a 3-D structure is very promising.

24/25

Conclusions

The proposed learning architecture enables an aerial robot to learn a good sequence of maneuvers and assembly operations so that the constraints inherent to the structures and the environment are overcome.

It has been shown that a 3-D structure can be built using the adapted low-level controllers, the learned task planning and the produced path planning.

25/25

Thank you

Questions?
