Constructing Intelligent Agents via Neuroevolution, by Jacob Schrum (schrum2@cs.utexas.edu)
![Page 2: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/2.jpg)
Motivation
• Intelligent agents are needed
  – Search-and-rescue robots
  – Mars exploration
  – Training simulations
  – Video games
• Insight into nature of intelligence
  – Sufficient conditions for emergence of:
    • Cooperation
    • Communication
    • Multimodal behavior
![Page 3: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/3.jpg)
Talk Outline
• Bio-inspired learning methods
  – Neural networks
  – Evolutionary computation
• My research
  – Learning multimodal behavior
  – Modular networks in Ms. Pac-Man
  – Human-like behavior in Unreal Tournament
• Future work
• Conclusion
![Page 4: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/4.jpg)
Artificial Neural Networks
• Brain = network of neurons
• ANN = abstraction of brain
  – Neurons organized into layers
[Figure: layered network from Inputs to Outputs]
![Page 5: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/5.jpg)
What Can Neural Networks Do?
• In theory, anything!
  – Universal Approximation Theorem
  – $f : [0,1]^N \to [0,1]^M$
• Can’t program: too complicated
• In practice, learning/training is hard
  – Supervised: Backpropagation
  – Unsupervised: Self-Organizing Maps
  – Reinforcement Learning: Temporal-Difference and Evolutionary Computation
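As a rough illustration of a layered network that maps inputs in $[0,1]^N$ to outputs in $[0,1]^M$, the sketch below runs a forward pass through fully connected sigmoid layers. The layer sizes, the weights, and the function names (`sigmoid`, `layer`, `forward`) are assumptions chosen for the example, not taken from the talk.

```python
import math

def sigmoid(x):
    """Logistic activation; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Propagate inputs through a list of (weights, biases) layers."""
    activation = inputs
    for weights, biases in layers:
        activation = layer(activation, weights, biases)
    return activation

# Hypothetical 2-input, 3-hidden, 1-output network with hand-picked weights.
network = [
    ([[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]], [0.0, -0.2, 0.1]),
    ([[1.0, -1.0, 0.5]], [0.05]),
]
outputs = forward([0.2, 0.9], network)
print(outputs)
```

Because every neuron uses a sigmoid, each output is guaranteed to stay strictly between 0 and 1, whatever the weights are.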
![Page 6: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/6.jpg)
Evolutionary Computation
• Computational abstraction of evolution
  – Descent with modification (mutation)
  – Sexual reproduction (crossover)
  – Survival of the fittest (natural selection)
• Evolution + Neural Nets = Neuroevolution
  – Population of neural networks
  – Mutation and crossover modify networks
  – Net used as control policy to evaluate fitness
![Page 7: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/7.jpg)
Neuroevolution Example
Start With Parent Population
![Page 8: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/8.jpg)
Neuroevolution Example
Start With Parent Population
Evaluate and Assign Fitness
100 90 75 61 56 50 31
![Page 9: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/9.jpg)
Neuroevolution Example
Start With Parent Population
Evaluate and Assign Fitness
100 90 75 61 56 50 31
Clone, Crossover and Mutate
To Get Child Population
![Page 10: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/10.jpg)
Neuroevolution Example
Start With Parent Population
Evaluate and Assign Fitness
100 90 75 61 56 50 31
Clone, Crossover and Mutate
Children Are Now the New Parents
Repeat Process: Fitness Evaluations
As the process continues, each successive population improves performance
100 120 69 99 60 83 50
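The generational loop in this example can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method used in the talk: each genome is just a flat list of weights, the fitness function `evaluate` is a made-up toy rather than a network controlling an agent, and the population size, mutation rate, and other parameters are arbitrary.

```python
import random

def evaluate(genome):
    """Toy fitness (assumption): closer to the all-0.5 weight vector is better."""
    return -sum((w - 0.5) ** 2 for w in genome)

def crossover(a, b, rng):
    """Uniform crossover: each weight is copied from one parent at random."""
    return [rng.choice(pair) for pair in zip(a, b)]

def mutate(genome, rng, rate=0.2, scale=0.1):
    """Perturb each weight with probability `rate` using Gaussian noise."""
    return [w + rng.gauss(0.0, scale) if rng.random() < rate else w
            for w in genome]

def evolve(pop_size=20, genome_len=5, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Evaluate and assign fitness, best first.
        ranked = sorted(population, key=evaluate, reverse=True)
        best_per_generation.append(evaluate(ranked[0]))
        parents = ranked[:pop_size // 2]
        # Clone the best parent, then fill up with crossover + mutation.
        children = [list(ranked[0])]
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children  # children are now the new parents
    return best_per_generation

history = evolve()
print(history[0], history[-1])
```

Cloning the best parent unchanged means the best fitness in this sketch can never decrease from one generation to the next; without that clone, a generation could score worse than its predecessor.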
![Page 11: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/11.jpg)
Neuroevolution Applications
F. Gomez and R. Miikkulainen, “2-D Pole Balancing With Recurrent Evolutionary Networks” ICANN 1998
Double Pole Balancing
![Page 12: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/12.jpg)
Neuroevolution Applications
F. Gomez and R. Miikkulainen, “Active Guidance for a Finless Rocket Using Neuroevolution” GECCO 2003
Finless Rocket Control
![Page 13: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/13.jpg)
Neuroevolution Applications
N. Kohl, K. Stanley, R. Miikkulainen, M. Samples, and R. Sherony, "Evolving a Real-World Vehicle Warning System" GECCO 2006
Vehicle Crash Warning System
![Page 14: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/14.jpg)
Neuroevolution Applications
K. O. Stanley, B. D. Bryant, I. Karpov, R. Miikkulainen, "Real-Time Evolution of Neural Networks in the NERO Video Game" AAAI 2006
Training Video Game Agents
http://nerogame.org/
![Page 15: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/15.jpg)
What is Missing?
• NERO agents are specialists
  – Sniping from a distance
  – Aggressively rushing in
• Humans can do all of this, and more
• Multimodal behavior
  – Different behaviors for different situations
• Human-like behavior
  – Preferred by humans
![Page 16: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/16.jpg)
What I Do With Neuroevolution
• Discover complex agent behavior
• Discover multimodal behavior
Contributions:
• Use multi-objective evolution
  – Different objectives for different modes
• Evolve modular networks
  – Networks with modules for each mode
• Human-like behavior
  – Constrain evolution
![Page 17: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/17.jpg)
Pareto-based Multiobjective Optimization
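Imagine a game with two objectives:
  – Damage Dealt
  – Health Remaining

$\vec{v}$ dominates $\vec{u}$, i.e. $\vec{v} \succ \vec{u}$, iff
  1. $\forall i \in \{1, \ldots, n\}\ (v_i \ge u_i)$, and
  2. $\exists i \in \{1, \ldots, n\}\ (v_i > u_i)$

Non-dominated points are best:
$A \subseteq F$ is Pareto optimal iff $A$ contains all points in $F$ s.t. $\forall x \in A\ \neg\exists y \in F\ (y \succ x)$

More than one optimal solution

[Figure labels: "High health but did not deal much damage"; "Dealt a lot of damage, but lost lots of health"; "Tradeoff between objectives"]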
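The dominance relation for this two-objective game can be checked directly in code. The sketch below assumes both objectives are maximized and uses made-up (damage dealt, health remaining) scores; the function names `dominates` and `pareto_front` are illustrative.

```python
def dominates(v, u):
    """True if v is at least as good as u everywhere and strictly better somewhere."""
    return (all(vi >= ui for vi, ui in zip(v, u))
            and any(vi > ui for vi, ui in zip(v, u)))

def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]

# Hypothetical (damage dealt, health remaining) scores for five agents.
scores = [(90, 10), (10, 95), (50, 50), (40, 40), (10, 10)]
front = pareto_front(scores)
print(front)  # → [(90, 10), (10, 95), (50, 50)]
```

Here (40, 40) is dominated by (50, 50) and (10, 10) is dominated by (90, 10), while the three remaining agents each represent a different tradeoff and none beats another on both objectives.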
![Page 18: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/18.jpg)
Non-dominated Sorting Genetic Algorithm II
• Population P with size N; evaluate P
• Use mutation (and crossover) to get P′ of size N; evaluate P′
• Calculate non-dominated fronts of P ∪ P′ (size 2N)
• New population of size N from highest fronts of P ∪ P′
K. Deb, S. Agrawal, A. Pratap, T. Meyarivan, "A Fast Elitist Non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II" PPSN VI, 2000
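The survivor-selection step above might be sketched like this. It is a simplified illustration, not a full NSGA-II: the real algorithm breaks ties within the last admitted front using crowding distance, whereas this sketch simply truncates that front in index order. The scores are made up, and both objectives are assumed to be maximized.

```python
def dominates(v, u):
    """Pareto dominance for maximized objectives."""
    return (all(a >= b for a, b in zip(v, u))
            and any(a > b for a, b in zip(v, u)))

def non_dominated_fronts(scores):
    """Peel off successive non-dominated fronts; returns lists of indices."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(scores[j], scores[i])
                                  for j in remaining))
        fronts.append(front)
        remaining -= set(front)
    return fronts

def select_next_population(parent_scores, child_scores):
    """Keep N individuals from the 2N in the combined population, best fronts first."""
    n = len(parent_scores)
    combined = parent_scores + child_scores
    survivors = []
    for front in non_dominated_fronts(combined):
        # Simplification: truncate the last front in index order
        # (NSGA-II proper would use crowding distance here).
        survivors.extend(front[:n - len(survivors)])
        if len(survivors) == n:
            break
    return [combined[i] for i in survivors]

parents = [(5, 1), (1, 5), (2, 2)]
children = [(6, 1), (3, 3), (1, 1)]
print(select_next_population(parents, children))  # → [(1, 5), (6, 1), (3, 3)]
```

Because parents and children compete in one pool, a good parent survives unless enough better individuals appear, which is what makes the scheme elitist.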
![Page 19: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/19.jpg)
Ms. Pac-Man
• Popular classic game
• Predator-prey scenario
  – Ghosts are predators
  – Until power pill is eaten
• Multimodal behavior needed
  – Running from threats
  – Chasing edible ghosts
  – More?
![Page 20: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/20.jpg)
Modular Networks
• Different areas of brain specialize
  – Structural modularity → functional modularity
• Apply to evolved neural networks
  – Separate module → behavioral mode
• Preference neurons (grey) arbitrate between modules
• Use module with highest preference output
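Arbitration by preference neurons could look roughly like the sketch below. The output layout is an assumption made for the example: each module is taken to occupy a fixed number of policy outputs followed by one preference output, and `choose_action` is a hypothetical helper name.

```python
def choose_action(network_outputs, policy_size):
    """Split flat outputs into modules and act with the most preferred one.

    Each module is assumed to occupy `policy_size` policy outputs followed
    by one preference output.
    """
    module_size = policy_size + 1
    modules = [network_outputs[i:i + module_size]
               for i in range(0, len(network_outputs), module_size)]
    # The last value in each chunk is that module's preference neuron.
    best = max(range(len(modules)), key=lambda m: modules[m][-1])
    return best, modules[best][:policy_size]

# Hypothetical outputs for two modules with 2 policy neurons each.
outputs = [0.1, 0.9, 0.3,   # module 0: policy (0.1, 0.9), preference 0.3
           0.7, 0.2, 0.8]   # module 1: policy (0.7, 0.2), preference 0.8
module, action = choose_action(outputs, policy_size=2)
print(module, action)  # → 1 [0.7, 0.2]
```

Since the preference values are recomputed from the inputs on every time step, the active module can change whenever the situation does.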
![Page 21: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/21.jpg)
Module Mutation
• Let evolution decide how many modules
  – Networks start with one module
  – New modules added by one of several module mutations:
    • Previous
    • Random
    • Duplicate
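A toy sketch of two of these module mutations follows. The slide only names the mutations, so the semantics here are assumptions: "Random" is taken to add a module with fresh random weights, and "Duplicate" to copy an existing module. "Previous" is left out because the slide gives no detail to base it on. A module is represented, hypothetically, as a list of weight rows, one row per output neuron.

```python
import copy
import random

def new_random_module(num_inputs, num_outputs, rng):
    """'Random' mutation (assumed meaning): a module with fresh random weights."""
    return [[rng.uniform(-1, 1) for _ in range(num_inputs)]
            for _ in range(num_outputs)]

def module_mutation_random(modules, rng):
    """Append a randomly initialized module shaped like the first one."""
    num_outputs, num_inputs = len(modules[0]), len(modules[0][0])
    return modules + [new_random_module(num_inputs, num_outputs, rng)]

def module_mutation_duplicate(modules, rng):
    """'Duplicate' mutation (assumed meaning): copy an existing module."""
    return modules + [copy.deepcopy(rng.choice(modules))]

rng = random.Random(1)
genome = [new_random_module(num_inputs=3, num_outputs=2, rng=rng)]  # one module
genome = module_mutation_duplicate(genome, rng)
genome = module_mutation_random(genome, rng)
print(len(genome))  # → 3
```

Under these assumptions a duplicated module initially behaves exactly like its source, so adding it does not disrupt existing behavior; later weight mutations are then free to pull the two copies apart.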
![Page 22: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/22.jpg)
Intelligent Module Usage
• Evolution discovers a novel task division
  – Not programmed
• Dedicates one module to luring (cyan)
• Improves ghost eating when using other module
![Page 23: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/23.jpg)
![Page 24: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/24.jpg)
![Page 25: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/25.jpg)
Comparison With Other Work
| Authors | Method | Game | AVG | MAX |
|---|---|---|---|---|
| Alhejali and Lucas [1] | GP | FourMaze | 16,014 | 44,560 |
| Alhejali and Lucas [2] | GP+Camps | FourMaze | 11,413 | 31,850 |
| My results | Module Mutation Duplicate | FourMaze | 32,647 | 44,520 |
| Brandstetter and Ahmadi [3] | GP | CIG 2011 | 19,198 | 33,420 |
| Recio et al. [4] | ACO | CIG 2011 | 36,031 | 43,467 |
| Alhejali and Lucas [5] | GP+MCTS | CIG 2011 | 32,641 | 69,010 |
| My results | Module Mutation Duplicate | CIG 2011 | 63,299 | 84,980 |
[1] A.M. Alhejali, S.M. Lucas: Evolving diverse Ms. Pac-Man playing agents using genetic programming. UKCI 2010.
[2] A.M. Alhejali, S.M. Lucas: Using a training camp with Genetic Programming to evolve Ms Pac-Man agents. CIG 2011.
[3] M.F. Brandstetter, S. Ahmadi: Reactive control of Ms. Pac Man using information retrieval based on Genetic Programming. CIG 2012.
[4] G. Recio, E. Martín, C. Estébanez, Y. Sáez: AntBot: Ant Colonies for Video Games. TCIAIG 2012.
[5] A.M. Alhejali, S.M. Lucas: Using genetic programming to evolve heuristics for a Monte Carlo Tree Search Ms Pac-Man agent. CIG 2013.
![Page 26: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/26.jpg)
Types of Intelligence
• Evolved intelligent Ms. Pac-Man behavior
  – Surprising module usage
  – Evolution discovers the unexpected
  – Diverse collection of solutions
• Still not human-like
  – Human-like vs. optimal
  – Human intelligence
![Page 27: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/27.jpg)
Modern Game: Unreal Tournament
• 3D world with simulated physics
• Multiple human and software agents interacting
• Agents attack, retreat, explore, etc.
• Multimodal behavior required to succeed
![Page 28: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/28.jpg)
Human-like Behavior: BotPrize
• International competition at CIG conference
• A Turing Test for video game bots
  – Judged as human over 50% of the time to win
  – After 5 years, we won in 2012
• Evolved combat behavior
  – Constrained to be human-like
![Page 29: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/29.jpg)
Guessing Game
• Coleman: ????
• Milford: ????
• Moises: ????
• Lawerence: ????
• Clifford: ????
• Kathe: ????
• Tristan: ????
• Jackie: ????
![Page 30: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/30.jpg)
Judging Game
![Page 31: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/31.jpg)
Player Identities
• Coleman: UT^2 (our winning bot)
• Milford: ICE-2010 (bot)
• Moises: Discordia (bot)
• Lawerence: Native UT2004 bot
• Clifford: w00t (bot)
• Kathe: Human
• Tristan: Human
• Jackie: Native UT2004 bot
![Page 32: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/32.jpg)
Human Subject Study
• Six participants played the judging game
• Recorded extensive post-game interviews
• What criteria do humans claim to judge by?
![Page 33: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/33.jpg)
Lessons Learned
• Don’t be too skilled
  – Evolved with accuracy restrictions
  – Disable elaborate dodging
• Humans are “tenacious”
  – Opponent-relative actions
  – Encourage “focusing” on opponent
• Don’t repeat mistakes
  – Database of human traces to get unstuck
[Figure labels: Enemy, Bot, Item]
![Page 34: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/34.jpg)
Bot Architecture
![Page 35: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/35.jpg)
![Page 36: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/36.jpg)
Future Work
• Evolving teamwork
  – Ghosts must cooperate to eat Ms. Pac-Man
  – Unreal Tournament supports team play
    • Domination, Capture the Flag, etc.
• Interactive evolution
  – Evolve in response to human interaction
    • Adaptive opponents/assistants
    • Evolutionary art
    • Content generation
http://picbreeder.org/
![Page 37: Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu](https://reader036.vdocument.in/reader036/viewer/2022062423/56649e995503460f94b9c790/html5/thumbnails/37.jpg)
Conclusion
• Evolution discovers unexpected behavior
• Modular networks learn multimodal behavior
• Human behavior not optimal
  – Evolution can be constrained to be more human-like
• Many directions for future research