Cartesian Genetic Programming

Genetic Programming (GP)

Genetic Programming is concerned with the automatic evolution (as in Darwinian evolution) of computational structures (such as mathematical equations, computer programs, digital circuits, etc.). The best known figure in this field is
John Koza.
The form of GP pioneered by John Koza used a tree representation of computer programs. This was inspired by the artificial intelligence computer language, LISP.

Cartesian Genetic Programming (CGP)

CGP is a highly efficient and flexible form of Genetic Programming that encodes a graph representation of a computer program.
It was invented by Julian Miller in 1999 and was developed from a representation of electronic circuits devised by Julian Miller and Peter Thomson developed a few years earlier.

CGP represents computational structures (mathematical equations, circuits, computer programs etc) as a string of integers. These integers, known as genes determine the functions of nodes in the graph, the connections between nodes, the connections to inputs and the locations in the graph where outputs are taken from.

Using a graph representation is very flexible as many computational structures can be represented as graphs. A good example of this is artificial neural networks (ANNs). These can be easily encoded in CGP. Recent published results of Khan and Miller and Turner and Miller(see the publications linked from this site) show that using CGP to encode and evolve ANNs (CGPANNs) is highly efficient and very competitive with other methods of evolving ANNs. Here is a ten-slide introduction to CGP.
Chapters 1 and 2 of Julian Miller's edited book are available from the link below: chapter1-and-2-CGP-book.pdf

What can CGP be used for?

Machine Learning, Neural Networks, Artificial Intelligence, Data Mining, Financial prediction, Function optimization, Classification, Electronic circuit design, medical diagnostics, evolutionary art and music,..., the list is endless.

Understanding the advantages of CGP

CGP typically finds good solutions very efficiently in few evaluations. Although it uses many generations, it also uses extremely small populations, typically 5, where one is the best one from the previous generation.

In all comparisons CGPs performance has been shown to be highly competitive with other GP methods.

CGP is extremely efficient to decode. It takes one pass though the graph to find which nodes are required. So running data through the graph is very fast as it only visits nodes are are used.
In addition, CGP genotypes tend to decode into phenotypes that are a small fraction of the genotype size. This means solutions found are naturally very small and no parsimony pressure is required.

CGP does not bloat like other forms of GP.

CGP is easy to implement and indeed has been implemented in various languages (see Resources)

CGP, unlike some other forms of GP can easily implement multiple output functions. This allows it to solve such problem as circuits problems, implement neural networks and other graph-based computational systems.

Frankly, there is no reason not to try CGP.

How to choose CGP parameters if you are a non-specialist

Choose one row and a quite large number of columns ( at least 100, but maybe 1000).

If the package you are using has levels-back, make it equal to the number of columns.

Use a 1+4 evolutionary strategy. So, population size is 5 and each generation is formed from the parent (best candidate from previous generation) and four mutated versions of the parent (offspring).

This means that only four new fitness evaluations are required in each generation. Note by checking nodes used, you can find out if any offspring have phenotypes identical to parents. If so, set their fitness to be the parents (and do not increment the evaluation counter)

Either only mutate one active gene, or stop mutating when an active gene has been mutated (single). If you want to choose a mutation rate or probability of gene mutation then choose it so that only a few genes are actually mutated to create an offspring from a parent.

Brian Goldman and Bill Punch wrote a very interesting paper of various mutation mechanisms in CGP and found single to be very good.
Analysis of Cartesian Genetic Programming's Evolutionary Mechanisms

Do not use crossover.

Choose how many fitness evaluations you want to do and let your program determine the number of generations.

Only process used nodes when decoding CGP (Most software implementations do this).

To learn more

Buy the CGP book edited by Julian Miller (cheapest at amazon).
Cartesian Genetic Programming
Download a tutorial by Julian Miller and Simon Harding - note that more recent tutorials are always available from Julian Miller' publication page (go to People tab)
GECCO 2012 Tutorial on CGP.

News highlights

NEWS FLASH: DIFFERENTIABLE CGP (dcCGP). Dario Izzo, Francesco Biscani, and Alessio Mereta of the Advanced Concepts Team, European Space Agency in Netherlands have created a differentiable form of CGP(dCGP). This allows one to obtain high-order Taylor representation of program outputs. The new technique means that back-propagation (as in neural networks) can be used.

dCGP is open source and is able to find the exact form of symbolic expression as well as their constants. The authors demonstrate the use of dCGP to solve a large class of differential equations and to find prime integrals of dynamical systems.

GOLD MEDAL: Zdenek Vasicek and Lukas Sekanina win gold humies ($5000) award at GECCO 2015

Congratulations to Zdenek Vasicek and Lukas Sekanina for their oustanding work on "Evolutionary Approach to Approximate Digital Circuits Design". They used CGP :-)

Zdenek Vasicek wins best paper at EuroGP2015

Congratulations to Zdenek Vasicek who won best paper award at EuroGP2015 for his paper "Cartesian GP in Optimization of Combinational Circuits with Hundreds of Inputs and Thousands of Gates".

Machine Intelligence Ltd.

Simon Harding has founded his own company Machine Intelligence Ltd to exploit the power of CGP for automatic visual defect detection, and process control and optimization. Here is a interesting video showing how software detects faults in a metallic surface.