geneticprogramming - poznań university of technologygeneticprogramming krzysztofkrawiec...

96
Genetic Programming Krzysztof Krawiec Laboratory of Intelligent Decision Support Systems Institute of Computing Science, Poznan University of Technology, Poznań, Poland http://www.cs.put.poznan.pl/kkrawiec/ June 10, 2016 –1

Upload: others

Post on 15-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Genetic Programming

Krzysztof Krawiec

Laboratory of Intelligent Decision Support SystemsInstitute of Computing Science Poznan University of Technology Poznań Poland

httpwwwcsputpoznanplkkrawiec

June 10 2016

ndash 1

Introduction

Introductionndash 2

Outline and objectives

1 Introduction GP as a variant of EC2 Specific features of GP3 Variants of GP4 Applications5 Some theory6 Case studies

Introductionndash 3

Recommended reading

Koza J R Genetic Programming On the Programming of Computers by Meansof Natural Selection MIT Press 1992

A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4)httpwwwgp-field-guideorguk

Langdon W B Genetic Programming and Data Structures GeneticProgramming + Data Structures = Automatic Programming Kluwer 1998

Langdon W B amp Poli R Foundations of Genetic Programming Springer-Verlag2002

Riolo R L Soule T amp Worzel B (ed) Genetic Programming Theory andPractice V Springer 2007

Riolo R McConaghy T amp Vladislavleva E (ed) Genetic Programming Theoryand Practice VIII Springer 2010

See httpwwwcsbhamacuk~wblbiblio

Introductionndash 4

Recommended reading

A Field Guide to Genetic Programming httpwwwgp-field-guideorguk[39]

(This presentation uses some figures from the Field Guide)

Introductionndash 5

Background

Backgroundndash 6

Evolutionary Computation (EC)

Heuristic bio-inspired global search algorithms

Operate on populations of candidate solutions

Candidate solutions are encoded as genotypes

Genotypes get decoded into phenotypes when evaluated by the fitness function fbeing optimized

Formulation

plowast = argmaxpisinP

f (p)

where

P is the considered space (search space) of candidate solutions (solutions forshort)

f is a (maximized) fitness function

plowast is an optimal solution (an ideal) that maximizes f

Backgroundndash 7

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 2: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Introduction

Introductionndash 2

Outline and objectives

1 Introduction GP as a variant of EC2 Specific features of GP3 Variants of GP4 Applications5 Some theory6 Case studies

Introductionndash 3

Recommended reading

Koza J R Genetic Programming On the Programming of Computers by Meansof Natural Selection MIT Press 1992

A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4)httpwwwgp-field-guideorguk

Langdon W B Genetic Programming and Data Structures GeneticProgramming + Data Structures = Automatic Programming Kluwer 1998

Langdon W B amp Poli R Foundations of Genetic Programming Springer-Verlag2002

Riolo R L Soule T amp Worzel B (ed) Genetic Programming Theory andPractice V Springer 2007

Riolo R McConaghy T amp Vladislavleva E (ed) Genetic Programming Theoryand Practice VIII Springer 2010

See httpwwwcsbhamacuk~wblbiblio

Introductionndash 4

Recommended reading

A Field Guide to Genetic Programming httpwwwgp-field-guideorguk[39]

(This presentation uses some figures from the Field Guide)

Introductionndash 5

Background

Backgroundndash 6

Evolutionary Computation (EC)

Heuristic bio-inspired global search algorithms

Operate on populations of candidate solutions

Candidate solutions are encoded as genotypes

Genotypes get decoded into phenotypes when evaluated by the fitness function fbeing optimized

Formulation

plowast = argmaxpisinP

f (p)

where

P is the considered space (search space) of candidate solutions (solutions forshort)

f is a (maximized) fitness function

plowast is an optimal solution (an ideal) that maximizes f

Backgroundndash 7

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 3: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Outline and objectives

1 Introduction GP as a variant of EC2 Specific features of GP3 Variants of GP4 Applications5 Some theory6 Case studies

Introductionndash 3

Recommended reading

Koza J R Genetic Programming On the Programming of Computers by Meansof Natural Selection MIT Press 1992

A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4)httpwwwgp-field-guideorguk

Langdon W B Genetic Programming and Data Structures GeneticProgramming + Data Structures = Automatic Programming Kluwer 1998

Langdon W B amp Poli R Foundations of Genetic Programming Springer-Verlag2002

Riolo R L Soule T amp Worzel B (ed) Genetic Programming Theory andPractice V Springer 2007

Riolo R McConaghy T amp Vladislavleva E (ed) Genetic Programming Theoryand Practice VIII Springer 2010

See httpwwwcsbhamacuk~wblbiblio

Introductionndash 4

Recommended reading

A Field Guide to Genetic Programming httpwwwgp-field-guideorguk[39]

(This presentation uses some figures from the Field Guide)

Introductionndash 5

Background

Backgroundndash 6

Evolutionary Computation (EC)

Heuristic bio-inspired global search algorithms

Operate on populations of candidate solutions

Candidate solutions are encoded as genotypes

Genotypes get decoded into phenotypes when evaluated by the fitness function fbeing optimized

Formulation

plowast = argmaxpisinP

f (p)

where

P is the considered space (search space) of candidate solutions (solutions forshort)

f is a (maximized) fitness function

plowast is an optimal solution (an ideal) that maximizes f

Backgroundndash 7

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 4: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Recommended reading

Koza J R Genetic Programming On the Programming of Computers by Meansof Natural Selection MIT Press 1992

A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4)httpwwwgp-field-guideorguk

Langdon W B Genetic Programming and Data Structures GeneticProgramming + Data Structures = Automatic Programming Kluwer 1998

Langdon W B amp Poli R Foundations of Genetic Programming Springer-Verlag2002

Riolo R L Soule T amp Worzel B (ed) Genetic Programming Theory andPractice V Springer 2007

Riolo R McConaghy T amp Vladislavleva E (ed) Genetic Programming Theoryand Practice VIII Springer 2010

See httpwwwcsbhamacuk~wblbiblio

Introductionndash 4

Recommended reading

A Field Guide to Genetic Programming httpwwwgp-field-guideorguk[39]

(This presentation uses some figures from the Field Guide)

Introductionndash 5

Background

Backgroundndash 6

Evolutionary Computation (EC)

Heuristic bio-inspired global search algorithms

Operate on populations of candidate solutions

Candidate solutions are encoded as genotypes

Genotypes get decoded into phenotypes when evaluated by the fitness function fbeing optimized

Formulation

plowast = argmaxpisinP

f (p)

where

P is the considered space (search space) of candidate solutions (solutions forshort)

f is a (maximized) fitness function

plowast is an optimal solution (an ideal) that maximizes f

Backgroundndash 7

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 5: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Recommended reading

A Field Guide to Genetic Programming httpwwwgp-field-guideorguk[39]

(This presentation uses some figures from the Field Guide)

Introductionndash 5

Background

Backgroundndash 6

Evolutionary Computation (EC)

Heuristic bio-inspired global search algorithms

Operate on populations of candidate solutions

Candidate solutions are encoded as genotypes

Genotypes get decoded into phenotypes when evaluated by the fitness function fbeing optimized

Formulation

plowast = argmaxpisinP

f (p)

where

P is the considered space (search space) of candidate solutions (solutions forshort)

f is a (maximized) fitness function

plowast is an optimal solution (an ideal) that maximizes f

Backgroundndash 7

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 6: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Background

Backgroundndash 6

Evolutionary Computation (EC)

Heuristic bio-inspired global search algorithms

Operate on populations of candidate solutions

Candidate solutions are encoded as genotypes

Genotypes get decoded into phenotypes when evaluated by the fitness function fbeing optimized

Formulation

plowast = argmaxpisinP

f (p)

where

P is the considered space (search space) of candidate solutions (solutions forshort)

f is a (maximized) fitness function

plowast is an optimal solution (an ideal) that maximizes f

Backgroundndash 7

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 7: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Evolutionary Computation (EC)

Heuristic bio-inspired global search algorithms

Operate on populations of candidate solutions

Candidate solutions are encoded as genotypes

Genotypes get decoded into phenotypes when evaluated by the fitness function fbeing optimized

Formulation

plowast = argmaxpisinP

f (p)

where

P is the considered space (search space) of candidate solutions (solutions forshort)

f is a (maximized) fitness function

plowast is an optimal solution (an ideal) that maximizes f

Backgroundndash 7

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 8: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Generic evolutionary algorithm

Evolutionary Algorithm

Population P of individuals

Evaluation

Selection

Mutation and recombination

Initialization of population P

Solutionindividual s

f(s)

Output Best solution s+

Termination criteria

Fitness function f

Backgroundndash 8

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 9: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

[Unique] characteristic of EC

Black-box optimization (f primes dependency on the independent variables does nothave to be known or meet any criteria)

Variables do not have to be explicitly defined

Fining an optimum cannot be guaranteed but in practice a well-performingsuboptimal solution is often satisfactory

Importance of crossover a recombination operator that makes the solutionsexchange certain elements (variable values features)

Without crossover EC boils down parallel stochastic local search

Backgroundndash 9

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 10: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

What is genetic programming

What is genetic programmingndash 10

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 11: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Genetic programming

In a nutshell

A variant of EC where the genotypes represent programs ie entities capable ofreading in input data and producing some output data in response to that input

Fitness function f measures the similarity of the output produced by the programto the desired output given as a part of task statement

Standard representation expression trees

Important implication Additional input required by the algorithm (compared to EC)

Set of instructions (programming language of consideration)

Data to run the programs on

What is genetic programmingndash 11

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 12: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Conceptual preliminaries of GP Space of candidate solutions P

Candidate solutions p isin P evolving under the selection pressure of the fitnessfunction f are themselves functions of the form p I rarrO

I and O are respectively the spaces of input data and output data accepted andproduced by programs from P

Cardinality of |P| is typically large or infinite

The set of program inputs I even if finite is usually so large that running eachcandidate solution on all possible inputs becomes intractable

GP algorithms typically evaluate solutions on a sample I prime sub I |I prime| |I | of possibleinputs and fitness is only an approximate estimate of solution quality

The task is given as a set of fitness cases ie pairs (xi yi ) isin I timesO where xiusually comprises one or more independent variables and yi is the output variable

What is genetic programmingndash 12

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 13: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Conceptual preliminaries Fitness function f

In most cases (and most real-world applications of GP) fitness function fmeasures the similarity of the output produced by the program to the desiredoutput given as a part of task statement

Then fitness can be expressed as a monotonous function of the divergence ofprogramrsquos output from the desired one for instance as

f (p) =minussumi

||yi minusp(xi )|| (1)

where

p(xi ) is the output produced by program p for the input data xi

|| middot || is a metric (a norm) in the output space O

i iterates over all fitness cases

What is genetic programmingndash 13

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 14: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Conceptual preliminaries Character of candidate solutions

The candidate solutions in GP are being assembled from elementary entities calledinstructions

A part of formulation of a GP task is then also an instruction set I ie a set ofsymbols used by the search algorithm to compose the programs (candidatesolutions)

Design of I usually requires some background knowledgeIn particular it should comprise all instructions necessary to find solution to theproblem posed (closure)

What is genetic programmingndash 14

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 15: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Genetic programming

Main evolution loop (lsquovanilla GPrsquo)

1 procedure GeneticProgramming(f I ) f - fitness function I - instruction set2 P larrplarrRandomProgram(I ) Initialize population3 repeat Main loop over generations4 for p isinP do Evaluation5 pf larr f (p) pf is a lsquofieldrsquo in program p that stores its fitness6 end for7 P prime larr 0 Next population8 repeat Breeding loop9 p1 larrTournamentSelection(P) First parent10 p2 larrTournamentSelection(P) Second parent11 (o1o2)larrCrossover(p1 p2)12 o1 larrMutation(o1 I )13 o2 larrMutation(o2 I )14 P prime larrP prime cupo1o215 until |P prime|= |P|16 P larrP prime

17 until StoppingCondition(P)18 return argmaxpisinP pf

19 end procedure

What is genetic programmingndash 15

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 16: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Crossover

Crossover exchange of randomly selected subexpressions (subtree swapping crossover)

1 function Crossover(p1p2)2 repeat3 s1 larr Random node in p14 s2 larr Random node in p25 (pprime1p

prime2)larr Swap subtrees rooted in s1 and s2

6 until Depth(pprime1)lt dmax andDepth(pprime2)lt dmax dmax is the tree depthlimit

7 return (pprime1pprime2)

8 end function

What is genetic programmingndash 16

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 17: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Mutation

Mutation replace a randomly selected subexpression with a new randomly generatedsubexpression

1 function Mutation(pI )2 repeat3 slarr Random node in p4 s prime larrRandomProgram(I )5 pprime larr Replace the subtree rooted in s with s prime

6 until Depth(pprime)lt dmax dmax is the tree depth limit7 return pprime

8 end function

What is genetic programmingndash 17

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 18: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Exemplary run Setup

Objective Find program whose output matches x2+x+1 over the range [minus11]Such tasks can be considered as a form of regressionAs solutions are built by manipulating code (instructions) this is referred to assymbolic regression

Fitness sum of absolute errors for x isin minus10minus09 0910In other words the set of fitness cases is

xi -10 -09 0 09 10yi 1 091 1 271 3

What is genetic programmingndash 18

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 19: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Exemplary run Setup

Instruction setNonterminal (function) set + - (protected division) and x all operating onfloatsTerminal set x and constants chosen randomly between -5 and +5

Selection fitness proportionate (roulette wheel) non elitist

Initial pop ramped half-and-half (depth 1 to 2 50 of terminals are constants)(to be explained later)

Parameterspopulation size 450 subtree crossover25 reproduction25 subtree mutation no tree size limits

Termination when an individual with fitness better than 01 found

What is genetic programmingndash 19

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 20: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Initial population (population 0)

What is genetic programmingndash 20

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 21: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Fitness assignment for population 0

Fitness values f(a)=77 f(b)=110 f(c)=1798 f(d)=287

What is genetic programmingndash 21

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 22: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Breeding

Assume

a gets reproduced

c gets mutated (at loci 2)

a and d get crossed-over

a and b get crossed over

What is genetic programmingndash 22

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 23: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Population 1

Population 0

Population 1

Individual d in population 1 has fitness 0What is genetic programmingndash 23

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 24: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Summary of our first glimpse at GP

Summary of our first glimpse at GPndash 24

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 25: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Specific features of GP

The solutions evolving under the selection pressure of the fitness function arethemselves functions (programs)

GP operates on symbolic structures of varying lengthsThere are no variables for the algorithm to operate on (at least in the commonsense)

The program can be tested only on a limited number of fitness cases (tests)

=rArr In contrast to most EC methods that are typically placed in optimizationframework GP is by nature an inductive learning approach that fits into the domain ofmachine learning []

Summary of our first glimpse at GPndash 25

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 26: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

In a broader context

As opposed to typical ML approaches GP is very genericArbitrary programming language arbitrary input and output representation

The syntax and semantic of the programming language of consideration serve asmeans to provide the algorithm with prior knowledge

(common sense knowledge background knowledge domain knowledge)

GP is not the only approach to program induction (but probably the best one )See eg inductive logic programming ILP

GP embodies the ultimate goal of AI to build a system capable ofself-programming (adaptation learning)

Summary of our first glimpse at GPndash 26

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 27: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

What is GP ndash Question revisited

Genetic programming is a branch of computer science studying heuristicalgorithms based on neo-Darwinian principles for synthesizing programs iediscrete symbolic compositional structures that process data

Summary of our first glimpse at GPndash 27

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 28: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Why should GP be considered a viable approach to program synthesis

GP combines two powerful concepts marked in underline in the above definition1 Representing candidate solutions as programs

which in general can conduct any Turing-complete computation (egclassification regression clustering reasoning problem solving etc) and thusenable capturing solutions to any type of problems (whether the task is eglearning optimization problem solving game playing etc)

2 Searching the space of candidate solutions using the lsquomechanicsrsquo borrowed frombiological evolutionwhich is unquestionably a very powerful computing paradigm given that itresulted in life on Earth and development of intelligent beings

Summary of our first glimpse at GPndash 28

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 29: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Consequences of the above definition

Heuristic nature of search

Symbolic program representation

Input sensitivity and inductive character

State-fullness

Unconstrained data types

Summary of our first glimpse at GPndash 29

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 30: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Origins of GP

httpwwwgenetic-programmingcomjohnkozahtml

Home assignment Indentify John Kozarsquos unique visual trait )

Summary of our first glimpse at GPndash 30

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 31: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Life demonstration of GP using ECJ

Life demonstration of GP using ECJndash 31

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 32: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

GP on ECJ

ECJ Evolutionary Computation in Javahttpcsgmuedu~eclabprojectsecj

by Sean Luke Liviu Panait Gabriel Balan Sean Paus Zbigniew Skolicki ElenaPopovici Keith Sullivan et al

Probably the most popular freely available framework for EC with a strongsupport for GP

Licensed under Academic Free License version 30

As of March 2012 version 20

Many other libraries integrate with ECJ

Life demonstration of GP using ECJndash 32

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 33: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Selected ECJ features

GUI with charting

Platform-independent checkpointing and logging

Hierarchical parameter files

Multithreading

Mersenne Twister Random Number Generators (compare tohttpwwwalifecouknonrandom)

Abstractions for implementing a variety of EC forms

Prepared to work in a distributed environment (including so-called island model)

Life demonstration of GP using ECJndash 33

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 34: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

GP-related ECJ features

GP Tree Representations

Set-based Strongly-Typed Genetic Programming

Ephemeral Random Constants

Automatically-Defined Functions and Automatically Defined Macros

Multiple tree forests

Six tree-creation algorithms

Extensive set of GP breeding operators

Grammatical Encoding

Eight pre-done GP application problem domains (ant regression multiplexerlawnmower parity two-box edge serengeti)

Life demonstration of GP using ECJndash 34

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 35: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Exemplary run of ECJ

Standard outputjava ecEvolve -file ecappregressionquinticercparamsThreads breed1 eval1Seed 1427743400Job 0Setting upProcessing GP TypesProcessing GP Node ConstraintsProcessing GP Function SetsProcessing GP Tree Constraints-01306332228659439200164875774146594280653340439694114301402200189629743-0037506348565697010001402771209365470606602806044824949013869498395598084Initializing Generation 0Subpop 0 best fitness of generation Fitness Standardized=11303205 Adjusted=046941292 Hits=10Generation 1Subpop 0 best fitness of generation Fitness Standardized=06804932 Adjusted=059506345 Hits=7

Life demonstration of GP using ECJndash 35

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 36: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Exemplary run The result

The log file produced by the runGeneration 0Best IndividualSubpopulation 0Evaluated trueFitness Standardized=11303205 Adjusted=046941292 Hits=10Tree 0( (sin ( x x)) (cos (+ x x)))Generation 1Best IndividualSubpopulation 0Evaluated trueFitness Standardized=06804932 Adjusted=059506345 Hits=7Tree 0( (rlog (+ (- x x) (cos x))) (rlog (- (cos (cos ( x x))) (- x x))))

Life demonstration of GP using ECJndash 36

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 37: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Exemplary run

The log file produced by the runBest Individual of RunSubpopulation 0Evaluated trueFitness Standardized=008413165 Adjusted=092239726 Hits=17Tree 0( ( ( (- ( ( ( ( x (sin x)) (rlog

x)) (+ (+ (sin x) x) (- x x))) (exp ( x( ( (- ( ( ( ( x x) (rlog x)) (+ (+

(sin x) x) (- x x))) (exp ( x (sin x))))(sin x)) (rlog x)) (exp (rlog x)))))) (sin

x)) (rlog x)) x) (cos (cos ( ( (- ( ((exp (rlog x)) (+ x ( ( (exp (rlog x))(rlog x)) x))) (exp ( ( ( (- (exp (rlogx)) x) (rlog x)) x) (sin ( x x))))) (sinx)) ( x ( ( (- ( ( ( ( x x) (rlogx)) (+ (+ x (+ (+ (sin x) x) (- x x))) (-x x))) (exp ( x (sin x)))) (sin x)) (rlogx)) (exp (rlog x))))) x))))

Life demonstration of GP using ECJndash 37

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 38: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

GP in R

httpcranr-projectorgwebpackagesgprindexhtml

Life demonstration of GP using ECJndash 38

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 39: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

GP in Mathematica

Life demonstration of GP using ECJndash 39

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 40: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Assessment of GP techniques

Assessment of GP techniquesndash 40

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 41: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Criteria for assessing the quality of GP-evolved solutions

Criteria for assessing GP algorithmssuccess rate (percentage of evolutionary runs ended with success)

time-to-success (can be infin)

error of the best-of-run individual

Criteria for assessing programs obtained with GP

error rate (percentage of tests passed)

program size (number of instructions)

execution time

transparency

Assessment of GP techniquesndash 41

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 42: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

GP Benchmarks

Problem Definition (formula)Sextic x6minus2x4+x2

Septic x7minus2x6+x5minusx4+x3minus2x2+xNonic x9+x8+x7+x6+x5+x4+x3+x2+xR1 (x+1)3(x2minusx+1)R2 (x5minus3x3+1)(x2+1)R3 (x6+x5)(x4+x3+x2+x+1)

Assessment of GP techniquesndash 42

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 43: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Even more GP benchmarks

Symbolic Regression Drug Bioavailability [43]

Tower [52] Protein Structure Classification [58]

Boolean Functions Time Series Forecasting [53]

N-Multiplexer [17] N-Majority [17] N-Parity [17] Path-finding and Planning

Generalised Boolean Circuits [12 59] Physical Travelling Salesman [28]

Digital Adder [55] Artificial Ant [17]

Order [9] Lawnmower [18]

Digital Multiplier [55] Tartarus Problem [6]

Majority [9] Maximum Overhang [36]

Classification Circuit Design [29]

mRNA Motif Classification [24] Control Systems

DNA Motif Discovery [25] Chaotic Dynamic Systems Control [26]

Intrusion Detection [11] Pole Balancing [31]

Protein Classification [19] Truck Control [16]

Intertwined Spirals [17]

Predictive Modelling

Mackey-Glass Chaotic Time Series [21]

Financial Trading [5 4 8]

Sunspot Prediction [17]

GeneChip Probe Performance [22]

Prime Number Prediction [54]

Assessment of GP techniquesndash 43

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 44: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

and more

Game-Playing

TORCS Car Racing [50]

Ms PacMan [10]

Othello [27]

Chessboard Evaluation [44]

Backgammon [44]

Mario [49]

NP-Complete Puzzles [14]

Robocode [44]

Rush Hour [44]

Checkers [44]

Freecell [44]

Dynamic Optimisation

Dynamic Symbolic Regression [34 35 51]

Dynamic Scheduling [13]

Traditional Programming

Sorting [15 1]

Assessment of GP techniquesndash 44

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 45: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Case study

Case studyndash 45

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 46: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Case study 1 Evolution of temperature models

Based on Karolina Stanisławska Krzysztof Krawiec Zbigniew W KundzewiczModeling Global Temperature Changes using Genetic Programming ndash A Case Study

Institute of Computing Science Poznan University of Technology Poznan Poland

Institute for Agricultural and Forest Environment Polish Academy of SciencesPoznan Poland and Potsdam Institute for Climate Impact Research PotsdamGermany

Konferencja Algorytmoacutew Ewolucyjnych i Optymalizacji Globalnej KAEiOGSeptember 21 2011

Case studyndash 46

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 47: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

A more detailed view on GP(vanilla GP is not the whole story)

A more detailed view on GP (vanilla GP is not the whole story)ndash 47

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 48: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Where to get [random] candidate solutions from

Every stochastic search method relies on some sampling algorithm(s)

The distribution of randomly generated solutions is important as it impliescertain bias of the algorithm

ProblemsWe donrsquot know the lsquoidealrsquo distribution of GP programsEven if we knew it it may be difficult to design an algorithm that obeys it

The most widely used contemporary initialization methods take care only of thesyntax of generated programs

Mainly height constraint

A more detailed view on GP (vanilla GP is not the whole story)ndash 48

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 49: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Initialization Full method

Specify the maximum tree height hmax

The full method for initializing treesChoose nonterminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 49

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 50: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Initialization Grow method

Specify the maximum tree height hmax

The grow method for initializing treesChoose nonterminal or terminal nodes at random until hmax is reachedThen choose only from terminals

A more detailed view on GP (vanilla GP is not the whole story)ndash 50

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 51: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Homologous crossover for GP

Earliest example one-point crossover [23] identify a common region in theparents and swap the corresponding trees

The common region is the lsquointersectionrsquo of parent trees

A more detailed view on GP (vanilla GP is not the whole story)ndash 51

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 52: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Uniform crossover for GP

Works similarly to uniform crossover in GAs

The offspring is build by iterating over nodes in the common region and flipping acoin to decide from which parent should an instruction be copied [38]

A more detailed view on GP (vanilla GP is not the whole story)ndash 52

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 53: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

How to employ multiple operators for lsquobreedingrsquo

How should the particular operators coexist in an evolutionary process In other words

How should they be superimposed

What should be the lsquopipingrsquo of particular breeding pipelines

A topic surprisingly underexplored in GP (and in EC probably too)

An example Which is betterpopsubpop0speciespipe = ecgpkozaMutationPipelinepopsubpop0speciespipenum-sources = 1popsubpop0speciespipesource0 = ecgpkozaCrossoverPipeline

Orpopsubpop0speciespipenum-sources = 2popsubpop0speciespipesource0 = ecgpkozaCrossoverPipelinepopsubpop0speciespipesource0prob = 09popsubpop0speciespipesource1 = ecgpkozaMutationPipelinepopsubpop0speciespipesource1prob = 01

A more detailed view on GP (vanilla GP is not the whole story)ndash 53

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 54: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

The Challenges for GP

The Challenges for GPndash 54

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 55: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Bloat

The evolving expressions tend to grow indefinitely in sizeFor tree-based representations this growth is typically exponential[-ish]

Evaluation becomes slow algorithm stalls memory overrun likely

One of the most intensely studied topics in GP 240+ papers as of March 2012

The Challenges for GPndash 55

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 56: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Bloat example

Average number of nodes per generation in a typical run of GP solving the Sexticproblem x6minus2x4+x2

(GP dotted line)

The Challenges for GPndash 56

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 57: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Countermeasures for bloat

Constraining tree heightSurprisingly can speed up bloat

Favoring small programsLexicographic parsimony pressure given two equally fit individuals prefer (select)the one represented by a smaller tree

Bloat-aware operators size-fair crossover

The Challenges for GPndash 57

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 58: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Highly non-uniform distribution of program lsquobehaviorsrsquo

Convergence of binary Boolean random linear functions (composed of AND NANDOR NOR 8 bits)

From [20] Langdon W B Cantuacute-Paz E (ed) Random Search is Parsimonious LateBreaking Papers at the Genetic and Evolutionary Computation Conference(GECCO-2002) AAAI 2002 308-315

The Challenges for GPndash 58

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 59: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

High cost of evaluation

Running a program on multiple inputscan be expensive

Particularly for some types of dataeg images

Solutions

Caching of outcomes of subprograms

Parallel execution of programs onparticular fitness cases

Bloat prevention methods

The Challenges for GPndash 59

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 60: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Variants of GP

Variants of GPndash 60

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 61: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Strongly typed GP (STGP)

A way to incorporate prior knowledge and impose a structure on programs [30]

ImplementationProvide a set of typesFor each instruction define the types of its arguments and outcomesMake the operators type-aware

Mutation substitute a random tree of a proper typeCrossover swap trees of compatible1 types

1Compatible belonging to the same lsquoset typersquoVariants of GPndash 61

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 62: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Strongly typed GP in ECJ

For the problem of simple classifiers represented as decision trees

Classifier syntaxClassifier = Class_idClassifier = if_then_else(Condition ClassifierClassifier)Condition = Input_Variable = Constant_Value

Implementation in ECJ parameter filesgptypeasize = 3gptypea0name = classgptypea1name = vargptypea2name = constgptypessize = 0gptcsize = 1gptc0 = ecgpGPTreeConstraintsgptc0name = tc0gptc0fset = f0gptc0returns = class

gpncsize = 4gpnc0 = ecgpGPNodeConstraintsgpnc0name = ncSimpleClassifiergpnc0returns = classgpnc0size = 0gpnc1 = ecgpGPNodeConstraintsgpnc1name = ncCompoundClassifiergpnc1returns = classgpnc1size = 4gpnc1child0 = vargpnc1child1 = constgpnc1child2 = classgpnc1child3 = classgpnc2 = ecgpGPNodeConstraintsgpnc2name = ncVariablegpnc2returns = vargpnc2size = 0gpnc3 = ecgpGPNodeConstraintsgpnc3name = ncConstantgpnc3returns = constgpnc3size = 0

Variants of GPndash 62

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 63: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Linear Genetic Programming

MotivationTree-like structures are not natural for contemporary hardware architectures

Program = a sequence of instructions

Data passed via registers

ProsDirectly portable to machine code fast executionNatural correspondence to standard (GA-like) crossover operator

Applications direct evolution of machine code [32]

Initial register contents

Final register contents

x1

x2 O1 O2

x3

O3 O4 g2

g3

g1r1

r2

r3

r1

r2

r3

Variants of GPndash 63

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 64: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Linear GP

Variants of GPndash 64

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 65: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Stack-based GP

The best-known representative Push and PushGPhampshireedulspectorpushhtml [48]

ProsVery simple syntax program = instruction | literal | ( program )No need to specify the number of registersThe top element of a stack has the natural interpretation of program outcomeNatural possibility of implementing autoconstructive programs [47]Includes certain features that make it Turing-complete (eg YANK instruction)

Variants of GPndash 65

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 66: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Push Example

Program

( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )

Initial stack states

BOOLEAN STACK ()CODE STACK ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR )FLOAT STACK ()INTEGER STACK ()

Stack states after program execution

BOOLEAN STACK ( TRUE )CODE STACK ( ( 2 3 INTEGER 41 52 FLOAT+ TRUE FALSE BOOLEANOR ) )FLOAT STACK ( 93 )INTEGER STACK ( 6 )

httphampshireedulspectorpush3-descriptionhtml

Variants of GPndash 66

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 67: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Other variants of GP

Grammatical Evolution The grammar of the programming language ofconsideration is given as input to the algorithm Individuals encode the choice ofproductions in the derivation tree (which of available alternative productionshould be chosen modulo the number of productions available at given step ofderivation)

Graph-based GPMotivation standard GP cannot reuse subprograms (within a single program)Example Cartesian Genetic Programming

Variants of GPndash 67

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 68: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Other variants of GP

Multiobjective GP The extra objectives canCome with the problemResult from GPrsquos specifics eg use program size as the second (minimized)objectiveBe associated with different tests (eg feature tests [40])

Developmental GP (eg using Push)

Probabilistic GP (a variant of EDA Estimation of Distribution Algorithms)The algorithm maintains a probability distribution P instead of a populationIndividuals are generated from P lsquoon demandrsquoThe results of individualsrsquo evaluation are used to update P

Variants of GPndash 68

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 69: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Simple EDA-like GP PIPE

Probabilistic Incremental Program Evolution [41]

Variants of GPndash 69

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 70: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Selected theoretical results

Selected theoretical resultsndash 70

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 71: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Schemata theorem for GP

Exact formula for the expected number of individuals sampling a schema a thenext generation [37]

Plus later work for other types of crossover

Selected theoretical resultsndash 71

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 72: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Semantic crossover

Selected theoretical resultsndash 72

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 73: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Applications of GP

Applications of GPndash 73

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 74: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Review

GP produced a number of solutions that are human-competitive ie a GPalgorithm automatically solved a problem for which a patent exists2

A recent award-winning work has demonstrated the ability of a GP system toautomatically find and correct bugs in commercially-released software whenprovided with test data3

GP is one of leading methodologies that can be used to lsquoautomatersquo sciencehelping the researchers to find the hidden complex patterns in the observedphenomena4

2Koza J R Keane M A Streeter M J Mydlowec W Yu J Lanza G 2003 Genetic Pro-gramming IV Routine Human-Competitive Machine Intelligence Kluwer Academic Publishers

3Arcuri A Yao X A novel co-evolutionary approach to automatic software bug fixing In WangJ (Ed) 2008 IEEE World Congress on Computational Intelligence IEEE Computational IntelligenceSociety IEEE Press Hong Kong

4Schmidt M Lipson H 3 Apr 2009 Distilling free-form natural laws from experimental dataScience 324 (5923) 81ndash85

Applications of GPndash 74

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 75: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Humies

() Entries were solicited for cash awards for human-competitive results that wereproduced by any form of genetic and evolutionary computation and that were published

Applications of GPndash 75

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 76: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Humies

The conditions to qualify(A) The result was patented as an invention in the past is an improvement over apatented invention or would qualify today as a patentable new invention(B) The result is equal to or better than a result that was accepted as a new scientificresult at the time when it was published in a peer-reviewed scientific journal(C) The result is equal to or better than a result that was placed into a database orarchive of results maintained by an internationally recognized panel of scientificexperts(D) The result is publishable in its own right as a new scientific result mdash independentof the fact that the result was mechanically created(E) The result is equal to or better than the most recent human-created solution to along-standing problem for which there has been a succession of increasingly betterhuman-created solutions(F) The result is equal to or better than a result that was considered an achievementin its field at the time it was first discovered(G) The result solves a problem of indisputable difficulty in its field(H) The result holds its own or wins a regulated competition involving humancontestants (in the form of either live human players or human-written computerprograms)

Applications of GPndash 76

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 77: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Selected Gold Humies

2004 Jason D Lohn Gregory S Hornby Derek S Linden NASA Ames ResearchCenterAn Evolved Antenna for Deployment on NASArsquos Space Technology 5 Mission

httpidesignucscedupapershornby_ec11pdf

Applications of GPndash 77

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 78: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Selected Gold Humies using GP

2009 Stephanie Forrest Claire Le Goues ThanhVu Nguyen Westley WeimerAutomatically finding patches using genetic programming A GeneticProgramming Approach to Automated Software Repair

Applications of GPndash 78

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 79: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Selected Gold Humies using GP

2008 Lee Spector David M Clark Ian Lindsay Bradford Barr Jon KleinGenetic Programming for Finite Algebras

2010 Natalio Krasnogor Paweł Widera Jonathan GaribaldiEvolutionary design of the energy function for protein structure prediction GPchallenge evolving the energy function for protein structure prediction Automateddesign of energy functions for protein structure prediction by means of geneticprogramming and improved structure similarity assessment

2011 Achiya Elyasaf Ami Hauptmann Moshe SipperGA-FreeCell Evolving Solvers for the Game of FreeCell

Applications of GPndash 79

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 80: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Other applications

classification problems in machine learning [] object recognition [ 33] orlearning game strategies [] [2 57] has demonstrated the ability of a GP system to automatically find andcorrect bugs in commercially-released software when provided with test data

In the context of this paper it deserves particular attention that GP is one ofleading methodologies that can be used to lsquoautomatersquo science helping theresearchers to find the hidden complex patterns in the observed phenomena

In this spirit in their seminal paper [42] have shown how GP can be used toinduce scientific laws from experimental data Many other studies havedemonstrated the usefulness of GP for modeling different phenomena includingthose of natural origins [45 3 7 56 46]

See [] for an extensive review of GP applications

Applications of GPndash 80

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 81: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Case study 2 Evolution of feature detectors

Based on Krzysztof Krawiec Bartosz Kukawka and Tomasz Maciejewski (2010)Evolving cascades of voting feature detectors for vehicle detection in satellite imageryIn IEEE Congress on Evolutionary Computation (CEC 2010) Barcelona IEEE Presspages 2392-2399

Applications of GPndash 81

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 82: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Additional resources

Additional resourcesndash 82

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 83: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Additional resources

Evolutionary Computation in Java csgmuedu~eclabprojectsecjGeneric software framework for EA well-prepared to work with GP

The online GP bilbiography wwwcsbhamacuk~wblbiblio

The genetic programming lsquohome pagersquo (a little bit messy but still valuable)httpwwwgenetic-programmingcom

Additional resourcesndash 83

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 84: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

Bibliography

Bibliographyndash 84

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 85: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

A Agapitos and S M LucasEvolving Modular Recursive Sorting AlgorithmsIn Proc EuroGP 2007

A Arcuri and X YaoA novel co-evolutionary approach to automatic software bug fixingIn J Wang editor 2008 IEEE World Congress on Computational Intelligencepages 162ndash168 Hong Kong 1-6 June 2008 IEEE Computational IntelligenceSociety IEEE Press

M Arganis R Val J Prats K Rodriguez R Dominguez and J DolzGenetic programming and standardization in water temperature modellingAdvances in Civil Engineering 2009 2009

A Brabazon M OrsquoNeill and I DempseyAn Introduction to Evolutionary Computation in FinanceIEEE Computational Intelligence Magazine 3(4)42ndash55 2008

R Bradley A Brabazon and M OrsquoNeillDynamic High Frequency Trading A Neuro-Evolutionary ApproachIn Proc EvoWorkshops 2009

G Cuccu and F GomezWhen novelty is not enoughIn Proc EvoApplications 2011

Bibliographyndash 85

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 86: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

M Daga and M C DeoAlternative data-driven methods to estimate wind from waves by inversemodelingNatural Hazards 49(2)293ndash310 May 2009

I Dempsey M OrsquoNeill and A BrabazonAdaptive Trading With Grammatical EvolutionIn Proc CEC 2006

G Durrett F Neumann and U-M OrsquoReillyComputational Complexity Analysis of Simple Genetic Programming On TwoProblems Modeling Isolated Program SemanticsIn Proc FOGA 2011

E Galvaacuten-Loacutepez J Swafford M OrsquoNeill and A BrabazonEvolving a Ms PacMan Controller Using Grammatical EvolutionIn Applications of Evolutionary Computation Springer 2010

J V Hansen P B Lowry R D Meservy and D M McDonaldGenetic Programming for Prevention of Cyberterrorism through Dynamic andEvolving Intrusion DetectionDecision Support Systems 431362ndash1374 2007

Bibliographyndash 86

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 87: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

S Harding J F Miller and W BanzhafDevelopments in Cartesian Genetic Programming self-modifying CGPGPEM 11397ndash439 2010

D Jakobović and L BudinDynamic Scheduling with Genetic ProgrammingIn Proc EuroGP 2006

G Kendall A Parkes and K SpoererA Survey of NP-Complete PuzzlesInternational Computer Games Association Journal 31(1)13ndash34 2008

K E Kinnear JrEvolving a Sort Lessons in Genetic ProgrammingIn Proc of the International Conference on Neural Networks 1993

J KozaA Genetic Approach to the Truck Backer Upper Problem and the Inter-twinedSpiral ProblemIn Proc International Joint Conference on Neural Networks 1992

J R KozaGenetic Programming On the Programming of Computers by Means of NaturalSelectionMIT Press Cambridge MA USA 1992

Bibliographyndash 87

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 88: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

J R KozaGenetic Programming II Automatic Discovery of Reusable ProgramsMIT Press Cambridge Massachusetts May 1994

W Langdon and W BanzhafRepeated Patterns in Genetic ProgrammingNatural Computing 7589ndash613 2008

W B LangdonRandom search is parsimoniousIn E Cantuacute-Paz editor Late Breaking Papers at the Genetic and EvolutionaryComputation Conference (GECCO-2002) pages 308ndash315 New York NY 9-13July 2002 AAAI

W B Langdon and W BanzhafRepeated Sequences in Linear Genetic Programming GenomesComplex Systems 15(4)285ndash306 2005

W B Langdon and A P HarrisonEvolving Regular Expressions for GeneChip Probe Performance PredictionIn Proc PPSN pages 1061ndash1070 2008

W B Langdon and R PoliFoundations of Genetic ProgrammingSpringer-Verlag 2002

Bibliographyndash 88

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 89: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

W B Langdon J Rowsell and A P HarrisonCreating Regular Expressions as mRNA Motifs with GP to Predict Human ExonSplittingIn Proc GECCO 2009

W B Langdon O Sanchez Graillet and A P HarrisonAutomated DNA Motif DiscoveryarXivorg 2010

M Lones A Tyrrell S Stepney and L CavesControlling Complex Dynamics with Artificial Biochemical NetworksIn Proc EuroGP pages 159ndash170 2010

S LucasOthello Competitionhttpprotectkern-1667emrelaxalgovalessexacuk8080othellohtmlOthellohtml 2012[Online accessed 27-Jan-2012]

Bibliographyndash 89

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 90: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

S LucasThe Physical Travelling Salesperson Problemhttpprotectkern-1667emrelaxalgovalessexacukptspptsphtml2012[Online accessed 27ndashJan-2012]

T McConaghyFFX Fast Scalable Deterministic Symbolic Regression TechnologyIn Proc GPTP 2011

D J MontanaStrongly typed genetic programmingBBN Technical Report 7866 Bolt Beranek and Newman Inc 10 MoultonStreet Cambridge MA 02138 USA 7 May 1993

M Nicolau M Schoenauer and W BanzhafEvolving Genes to Balance a PoleIn Proc EuroGP 2010

Bibliographyndash 90

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 91: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

P Nordin and W BanzhafGenetic programming controlling a miniature robotIn E V Siegel and J R Koza editors Working Notes for the AAAI Symposiumon Genetic Programming pages 61ndash67 MIT Cambridge MA USA 10ndash12 Nov1995 AAAI

G Olague and L TrujilloEvolutionary-computer-assisted design of image operators that detect interestpoints using genetic programmingImage and Vision Computing 29(7)484ndash498 2011

M OrsquoNeill A Brabazon and E HembergSubtree Deactivation Control with Grammatical Genetic Programming inDynamic EnvironmentsIn Proc CEC 2008

M OrsquoNeill and C RyanGrammatical Evolution by Grammatical Evolution The Evolution of Grammarand Genetic CodeIn Proc EuroGP pages 138ndash149 Springer-Verlag 5ndash7 Apr 2004

M Paterson Y Peres M Thorup P Winkler and U ZwickMaximum OverhangIn Proc 19th Annual ACM-SIAM Symposium on Discrete Algorithms 2008

Bibliographyndash 91

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 92: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

R PoliExact schema theory for genetic programming and variable-length geneticalgorithms with one-point crossoverGenetic Programming and Evolvable Machines 2(2)123ndash163 June 2001

R Poli and W B LangdonOn the search properties of different crossover operators in genetic programmingIn J R Koza W Banzhaf K Chellapilla K Deb M Dorigo D B Fogel M HGarzon D E Goldberg H Iba and R Riolo editors Genetic Programming1998 Proceedings of the Third Annual Conference pages 293ndash301 University ofWisconsin Madison Wisconsin USA 22-25 July 1998 Morgan Kaufmann

R Poli W B Langdon and N F McPheeA field guide to genetic programmingPublished via httplulucom and freely available athttpwwwgp-field-guideorguk 2008(With contributions by J R Koza)

B J Ross and H ZhuProcedural texture evolution using multiobjective optimizationNew Generation Computing 22(3)271ndash293 2004

Bibliographyndash 92

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 93: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

R P Salustowicz and J SchmidhuberProbabilistic incremental program evolutionEvolutionary Computation 5(2)123ndash141 1997

M Schmidt and H LipsonDistilling free-form natural laws from experimental dataScience 324(5923)81ndash85 3 Apr 2009

S Silva and L VanneschiState-of-the-Art Genetic Programming for Predicting Human Oral Bioavailabilityof DrugsIn Proc 4th International Workshop on Practical Applications of ComputationalBiology and Bioinformatics 2010

M SipperLet the Games EvolveIn Proc GPTP 2011

C Sivapragasam R Maheswaran and V VenkateshGenetic programming approach for flood routing in natural channelsHydrological Processes 22(5)623ndash628 2007

C Sivapragasam N Muttil S Muthukumar and V M ArunPrediction of algal blooms using genetic programmingMarine Pollution Bulletin 60(10)1849ndash1855 2010

Bibliographyndash 93

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 94: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

L SpectorTowards practical autoconstructive evolution Self-evolution of problem-solvinggenetic programming systemsIn R Riolo T McConaghy and E Vladislavleva editors Genetic ProgrammingTheory and Practice VIII volume 8 of Genetic and Evolutionary Computationchapter 2 pages 17ndash33 Springer Ann Arbor USA 20-22 May 2010

L Spector C Perry and J KleinPush 20 programming language descriptionTechnical report School of Cognitive Science Hampshire College Apr 2004

J Togelius S Karakovskiy J Koutnik and J SchmidhuberSuper Mario EvolutionIn Proc IEEE Computational Intelligence and Games 2009

TORCS The Open Car Racing Simulatorhttptorcssourceforgenet 2012

L Vanneschi and G CuccuVariable Size Population for Dynamic Optimization with Genetic ProgrammingIn Proc GECCO 2009

Bibliographyndash 94

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 95: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

E Vladislavleva G Smits and D Den HertogOrder of Nonlinearity as a Complexity Measure for Models Generated by SymbolicRegression via Pareto Genetic ProgrammingIEEE Trans EC 13(2)333ndash349 2009

N Wagner Z Michalewicz M Khouja and R McGregorTime Series Forecasting for Dynamic Environments The DyFor Genetic ProgramModelIEEE Trans EC 2007

J Walker and J MillerPredicting Prime Numbers Using Cartesian Genetic ProgrammingIn Proc EuroGP 2007

J A Walker K Voumllk S L Smith and J F MillerParallel Evolution using Multi-chromosome Cartesian Genetic ProgrammingGPEM 10417ndash445 2009

W-C Wang K-W Chau C-T Cheng and L QiuA comparison of performance of several artificial intelligence methods forforecasting monthly discharge time seriesJournal of Hydrology 374(3-4)294ndash306 2009

Bibliographyndash 95

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography
Page 96: GeneticProgramming - Poznań University of TechnologyGeneticProgramming KrzysztofKrawiec LaboratoryofIntelligentDecisionSupportSystems InstituteofComputingScience,PoznanUniversityofTechnology,Poznań,Poland

W Weimer S Forrest C Le Goues and T NguyenAutomatic program repair with evolutionary computationCommunications of the ACM 53(5)109ndash116 June 2010

P Widera J Garibaldi and N KrasnogorGP challenge Evolving energy function for protein structure predictionGPEM 1161ndash88 2010

T YuHierarchical Processing for Evolving Recursive and Modular Programs UsingHigher-Order Functions and Lambda AbstractionGPEM 2345ndash380 2001

Bibliographyndash 96

  • Introduction
  • Background
  • What is genetic programming
  • Summary of our first glimpse at GP
  • Life demonstration of GP using ECJ
  • Assessment of GP techniques
  • Case study
  • A more detailed view on GP (vanilla GP is not the whole story)
  • The Challenges for GP
  • Variants of GP
  • Selected theoretical results
  • Applications of GP
  • Additional resources
  • Bibliography