Learning Classifier Systems Mobile Robot Control

Upload: janel-tyler

Post on 18-Dec-2015


Learning Classifier Systems

Mobile Robot Control

XCS and Implementation

XCS – An LCS variant where classifier fitness is based on the accuracy of prediction, not the prediction itself

Traditional LCS vs XCS

Genetic Algorithm acts on Action Sets

Addition of message list, allowing operation in non-Markov environments

Markov environment/ process – a stochastic process that satisfies the Markov property of being memory-less

Makes XCS easier to analyse

Commonly held that XCS can be more useful than LCS in many applications
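The memory-less property mentioned in the definition of a Markov process can be stated formally. This is only a brief sketch in standard notation; the symbols $s_t$ (state at time $t$) and $a_t$ (action at time $t$) are not from the slides:

```latex
\[
P(s_{t+1} \mid s_t, a_t) \;=\; P(s_{t+1} \mid s_1, a_1, \ldots, s_t, a_t)
\]
```

In other words, the current state and action carry all the information needed to predict the next state. An environment in which two different situations produce the same sensor reading does not satisfy this from the learner's point of view.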

XCS and Implementation

$p_{j+1} = p_j + \beta (P - p_j)$

$k_{j+1} = 1$ if $\varepsilon_j \le \varepsilon_0$

$F_{j+1} = F_j + \beta(\ldots)$

j is a classifier, $\varepsilon$ is the error, $p$ is the prediction and $F$ is the fitness
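The update rules on this slide follow the usual XCS pattern. Below is a minimal Python sketch of one update step, assuming the standard formulation from the XCS literature (prediction, error, accuracy, relative accuracy, fitness). The class and function names and the parameter values `beta`, `eps0`, `alpha` and `nu` are illustrative assumptions, not values taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    prediction: float = 10.0  # p: predicted payoff
    error: float = 0.0        # epsilon: running estimate of the prediction error
    fitness: float = 0.01     # F: accuracy-based fitness


def accuracy(error, eps0=10.0, alpha=0.1, nu=5.0):
    """Accuracy k: 1 inside the tolerance eps0, a power-law falloff outside it."""
    if error <= eps0:
        return 1.0
    return alpha * (error / eps0) ** (-nu)


def update_action_set(action_set, payoff, beta=0.2):
    """Apply one XCS-style update to every classifier in the action set."""
    for cl in action_set:
        # Prediction first, then error against the updated prediction.
        # (Implementations differ on this order; this is one common choice.)
        cl.prediction += beta * (payoff - cl.prediction)
        cl.error += beta * (abs(payoff - cl.prediction) - cl.error)
    # Fitness tracks accuracy relative to the other classifiers in the set.
    accuracies = [accuracy(cl.error) for cl in action_set]
    total = sum(accuracies)
    for cl, k in zip(action_set, accuracies):
        cl.fitness += beta * (k / total - cl.fitness)
```

Because fitness tracks relative accuracy rather than payoff, a classifier that reliably predicts a low reward can still be fit, which is the distinction between XCS and strength-based LCS drawn earlier.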

XCS and Implementation

Control system receives pre-processed binary vectors, suitable for subsequent evolutionary operations

Signals fed in as condition-action pairs, or classifiers

Rewards are attributed according to a fixed table of rewards
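As a rough illustration of how pre-processed binary vectors are compared against classifier conditions, here is a minimal Python sketch. It assumes the ternary condition alphabet commonly used in LCS, where `#` means "don't care"; the sensor threshold, the action names and the example classifiers are hypothetical and not taken from the slides.

```python
def to_binary_vector(distances, threshold=0.3):
    """One bit per range sensor: '1' if an obstacle is closer than `threshold`."""
    return "".join("1" if d < threshold else "0" for d in distances)


def matches(condition, state):
    """A condition matches when every position equals the state bit or is '#'."""
    return len(condition) == len(state) and all(
        c == "#" or c == s for c, s in zip(condition, state)
    )


# Hypothetical classifiers as (condition, action) pairs.
population = [
    ("1##0", "turn_right"),
    ("0###", "forward"),
    ("#1#1", "turn_left"),
]

# Only the first sensor reports a nearby obstacle.
state = to_binary_vector([0.2, 0.9, 0.8, 0.7])
match_set = [(c, a) for c, a in population if matches(c, state)]
```

Here only the first classifier matches, so `turn_right` is the single action proposed for this sensor state.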

Robot control and LCS

Traditional methods

Often require lots of manual parameter tweaking

Tedious process - the operator must constantly evaluate the outcome of the tweaking

Inflexible w.r.t. changes in environment

Changes require refinement of parameters once again

LCS

Real-time learning removes a lot of the manual work and setup process

Quite short time to learn tasks -> Adapts easily to changes in the environment

Computational complexity is less of an obstacle thanks to the exponential growth in computing power

Robot control and LCS

Early pioneer work by Katagami and Yamada in 2000

Used LCS for learning in human-robot interaction

Sped up the initial learning process and removed psychological workload from the operator

Different enhancements and variations have since been proposed

Enhanced LCS, temporal (accuracy-based) classifier systems, combinations of fuzzy systems and LCS, etc.

Great success

Example: 79% reduction in robot localization error using an LCS by Williams and Browne

Mobile Robot Control Using XCS

By F. Tóth et al.

Bratislava, Slovak Republic

Presented at the 2013 International Conference on Process Control

Control of an omni-directional robot using XCS

Both physical and simulated robot

Goals: Move along walls and avoid obstacles

1. Robot learns to move in the correct way

2. NAO robot learns to push a box

3. Traffic junction control

Usage

Data mining (financial data prediction: simulated on-line traders in continuous double-auction markets)

AI controller for games (Mario)

Robot control

Traffic junction control

...

Why use LCS

1. Combines the properties of reinforcement learning and evolutionary algorithms

2. Learns in real time and requires only a short time to learn the desired task

Example: Williams and Browne enhanced a mobile robot localization system with an LCS and reduced the robot's localization error by 79%

Robot task

Robot learned to avoid the barrier and find the correct way to move

explanation

The system receives data from the robot (sensor readings describing the environment and the robot's situation, based on the distance to the desired action and to barriers), preprocessed into binary vectors, which are most suitable for subsequent evolutionary operations. These are fed in as condition-action pairs, i.e. the classifiers.

The inputs from the environment consist of the signals from the sensors, and the matching set of classifiers represents the actions that are available to the robot at the current time. After creating the matching set, the LCS evaluates each action and selects the best one for the robot to perform. (XCS evaluates actions according to the reward from the environment, and the fitness is updated at each time step. Essentially, the fitness is defined by the accuracy of the prediction.)

The reward that the system receives from the environment is attributed according to a fixed table of rewards (Table I).
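The evaluation of the matching set and the table-based reward can be sketched in Python as follows. This is an illustration under stated assumptions: the fitness-weighted prediction array is the usual XCS way of scoring actions, the dictionary layout of a classifier is made up for the example, and the reward values are hypothetical because Table I is not reproduced in this transcript.

```python
import random
from collections import defaultdict

# Hypothetical reward table; the actual values of Table I are not shown here.
REWARDS = {"collision": 0.0, "free_space": 100.0, "following_wall": 1000.0}


def reward_for(outcome):
    """Look up the fixed reward for an observed outcome."""
    return REWARDS[outcome]


def prediction_array(match_set):
    """Fitness-weighted mean payoff prediction for each action in the match set."""
    weighted = defaultdict(float)
    weight = defaultdict(float)
    for cl in match_set:
        weighted[cl["action"]] += cl["prediction"] * cl["fitness"]
        weight[cl["action"]] += cl["fitness"]
    return {a: weighted[a] / weight[a] for a in weighted if weight[a] > 0}


def select_action(match_set, explore_prob=0.0, rng=random):
    """Pick the best-scoring action, or a random one with probability explore_prob."""
    scores = prediction_array(match_set)
    if rng.random() < explore_prob:
        return rng.choice(sorted(scores))
    return max(scores, key=scores.get)
```

With `explore_prob=0.0` the robot always exploits its current knowledge. A learning run would normally use a non-zero value so that untried actions are occasionally sampled.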

moving demonstration

Robot Condition

Nao is an autonomous, programmable humanoid robot developed by Aldebaran Robotics, a French robotics company headquartered in Paris. The robot's development began with the launch of Project Nao in 2004.

Robot Task

The robot is designed to learn the path and learn to push the box in the correct direction

pushing box demonstration

the importance of doing so

Why use LCS

Learning Classifier Systems (LCS) can be used for optimisation in a way that offers substantial promise for application in traffic-responsive signal control systems where the way in which the control responds to variations in traffic flows can be adapted according to measured conditions.

This is important in order to achieve traffic control that is sufficiently flexible to respond rapidly when traffic conditions change in a fundamental way, as occurs at the start of a peak period, without being unduly sensitive to short-term variations in flow.

how it works

Suppose that the roads are oriented north-south and east-west.

Traffic flows within the network are profiled by signals.

Each junction is controlled by an LCS that receives as stimulus a binary string representing the queue length

Binary strings are also used to represent actions

Condition/action pairs, reward and prediction accuracy are used to change the fitness

Performance measurement: vehicles/hours
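One way to picture the binary stimulus is to quantise each approach's queue length into a few bits and concatenate them. The Python sketch below is only an illustration: the two-bit resolution, the maximum queue length of 20 vehicles, the ordering of the approaches and the one-bit action coding are all assumptions, not the encoding used in the cited work.

```python
def encode_queue(length, bits=2, max_length=20):
    """Quantise one queue length into a fixed-width binary string."""
    levels = 2 ** bits
    clipped = int(min(max(length, 0), max_length))
    level = min(clipped * levels // (max_length + 1), levels - 1)
    return format(level, f"0{bits}b")


def junction_stimulus(queues):
    """Concatenate the encoded queues of the four approaches to a junction."""
    return "".join(encode_queue(queues[d]) for d in ("north", "south", "east", "west"))


# Hypothetical one-bit action coding: which axis receives the green phase.
ACTIONS = {"0": "green north-south", "1": "green east-west"}

stimulus = junction_stimulus({"north": 0, "south": 20, "east": 10, "west": 11})
```

The resulting string is what a classifier condition would be matched against, and the classifier's action bit selects the signal phase.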

how it works

demonstration

https://www.youtube.com/watch?v=Ge4rG8ER_CU

References

Mobile Robot Control Using XCS (Filip Tóth, Kristína Rebrová, Gregor Zaťko, Pavol Krasňanský and Boris Rohaľ-Ilkiv)

Towards Distributed Adaptive Control for Road Traffic Junction Signals using Learning Classifier Systems (L Bull, J Sha'Aban, A Tomlinson, JD Addison and BG Heydecker)

Learning Classifier System on a Humanoid NAO Robot in Dynamic Environments (Chang Wang, Pascal Wiggers, Koen Hindriks and Catholijn M. Jonker)

Interactive Classifier System for Real Robot Learning (D Katagami and S Yamada)