2.3 The Interaction of Learning and Evolution
This section briefly touches on the rich interactions between
evolution and learning.
Memetic Learning
We can characterise evolution as a form
of global search: it is good at finding promising basins of attraction,
but poor at locating the optimum within a basin. In contrast, many
learning methods are forms of local search and have the opposite
characteristics.
We can get the best of both by combining them, which generally
outperforms either alone [316]. For example,
evolving the initial weights of a neural network and then training
them with gradient descent can be two orders of magnitude faster than
using random initial weights [93]. Methods which
combine global and local search are called memetic algorithms
[118,119,214,212,252,213,240].
See [161] for a self-contained tutorial.
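To make the global/local division concrete, the following is a minimal memetic-algorithm sketch in Python: evolutionary operators propose jumps between basins, while hill climbing refines each candidate within its basin. The Rastrigin test function, the simple hill climber and all parameter values are illustrative assumptions, not details taken from the works cited above.

    import math
    import random

    def rastrigin(x):
        # Highly multimodal: many local basins, global optimum at the origin.
        return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

    def hill_climb(x, steps=50, sigma=0.05):
        # Local search: refine a solution within its current basin of attraction.
        best, best_f = x, rastrigin(x)
        for _ in range(steps):
            cand = [xi + random.gauss(0, sigma) for xi in best]
            f = rastrigin(cand)
            if f < best_f:
                best, best_f = cand, f
        return best

    def memetic(pop_size=30, dims=5, gens=100):
        pop = [[random.uniform(-5.12, 5.12) for _ in range(dims)] for _ in range(pop_size)]
        for _ in range(gens):
            pop = [hill_climb(ind) for ind in pop]        # local refinement of each individual
            pop.sort(key=rastrigin)
            parents = pop[:pop_size // 2]                 # truncation selection
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = random.sample(parents, 2)
                child = [random.choice(g) for g in zip(a, b)]        # uniform crossover
                child = [xi + random.gauss(0, 0.3) for xi in child]  # mutation: the global move
                children.append(child)
            pop = parents + children
        return min(pop, key=rastrigin)

    best = memetic()
    print(best, rastrigin(best))

Removing the hill_climb call leaves a plain genetic algorithm, which typically stalls near, but not at, a basin's optimum; removing the evolutionary loop leaves a local searcher that gets trapped in whichever basin it starts in.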
Darwinian and Lamarckian Evolution
In Lamarckian evolution/inheritance, learning during an individual's
lifetime directly alters genes passed to offspring, so offspring
inherit the result of their parents' learning. This does not occur in
nature but can in computers and has the potential to be more efficient
than Darwinian evolution since the results of learning are not thrown
away. Indeed, Ackley and Littman [3] showed Lamarckian
evolution was much faster on stationary learning tasks, while Sasaki
and Tokoro [243] showed Darwinian evolution is generally
better on non-stationary tasks. See also
[296,315,228,293].
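The algorithmic difference is small but consequential: both schemes evaluate the phenotype after learning, but only Lamarckian evolution writes the result of learning back into the inherited genome. A minimal sketch, assuming bit-string genomes, OneMax fitness and a few greedy bit flips as "lifetime learning" (all illustrative choices, not taken from the studies cited above):

    import random

    GENES = 40

    def fitness(bits):
        return sum(bits)  # OneMax: simply count the 1s

    def learn(bits, trials=5):
        # Lifetime learning: greedy local search via random bit flips.
        bits = list(bits)
        for _ in range(trials):
            i = random.randrange(GENES)
            flipped = list(bits)
            flipped[i] ^= 1
            if fitness(flipped) > fitness(bits):
                bits = flipped
        return bits

    def evolve(lamarckian, pop_size=20, gens=30):
        pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(pop_size)]
        for _ in range(gens):
            scored = []
            for genome in pop:
                learned = learn(genome)
                # Selection always sees the post-learning phenotype...
                score = fitness(learned)
                # ...but only Lamarckian inheritance passes the learned genes on.
                scored.append((score, learned if lamarckian else genome))
            scored.sort(key=lambda s: -s[0])
            parents = [g for _, g in scored[:pop_size // 2]]
            offspring = [[b if random.random() > 0.01 else 1 - b
                          for b in random.choice(parents)]
                         for _ in range(pop_size - len(parents))]
            pop = parents + offspring
        return max(fitness(g) for g in pop)

    print('Darwinian :', evolve(lamarckian=False))
    print('Lamarckian:', evolve(lamarckian=True))

On a stationary task like this one the Lamarckian variant typically converges faster, consistent with Ackley and Littman's result; on a task whose target moves, Darwinian inheritance tends to pay off, consistent with Sasaki and Tokoro's.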
The Baldwin Effect
The Baldwin effect is a two-part dynamic between learning and
evolution which depends on Phenotypic Plasticity (PP): the
ability to adapt (e.g. learn) during an individual's lifetime. The
first aspect is this. Suppose a mutation would have no benefit except
for PP. Without PP, the mutation does not increase fitness, but with
PP it does. Thus PP helps evolution to adopt beneficial mutations; it
effectively smooths the fitness landscape.
A possible example from nature is lactose tolerance in human adults.
At a relatively recent point in human evolution, a mutation arose that
allows adult humans to digest milk. Subsequently, humans learned to
keep animals for milk, which in turn made the mutation more beneficial
and thus more likely to spread.
The smoothing effect on the fitness landscape depends on PP; the
greater the PP the more potential there is for smoothing. All GBML
methods exploit the Baldwin effect to the extent that they have PP.
See [293] §7.2 for a short review of the Baldwin effect
in reinforcement learning.
The second aspect of the Baldwin effect is genetic assimilation.
Suppose PP has a cost (e.g. learning involves making mistakes). If PP
can be replaced by new genes, it will be; for instance a learned
behaviour can become instinctive. This allows learned behaviours to
become inherited without Lamarckian inheritance.
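Both aspects can be seen in a small simulation in the spirit of Hinton and Nowlan's well-known needle-in-a-haystack model. The sketch below is a simplified reconstruction with illustrative parameters (not those of the original study): alleles are 0, 1 or '?' (plastic); only the all-1s genotype is fit; lifetime learning randomly guesses the '?' positions; and earlier success earns higher fitness, so plasticity carries a cost.

    import random

    GENES, TRIALS, POP, GENS = 10, 200, 100, 50

    def lifetime_fitness(genome):
        # A fixed wrong allele can never be learned around.
        if any(a == 0 for a in genome):
            return 1.0
        unknown = genome.count('?')
        for t in range(TRIALS):
            # One learning trial: guess all plastic positions at random.
            if all(random.randint(0, 1) == 1 for _ in range(unknown)):
                # Target found after t trials: earlier is better, so each
                # trial spent guessing is a cost that rewards assimilation.
                return 1.0 + 19.0 * (TRIALS - t) / TRIALS
        return 1.0

    def evolve():
        # Initial alleles: 0 and 1 with probability 0.25 each, '?' with 0.5.
        pop = [random.choices([0, 1, '?'], weights=[1, 1, 2], k=GENES)
               for _ in range(POP)]
        for g in range(GENS):
            weights = [lifetime_fitness(ind) for ind in pop]
            # Fitness-proportionate selection with one-point crossover.
            new = []
            for _ in range(POP):
                a, b = random.choices(pop, weights=weights, k=2)
                cut = random.randrange(GENES)
                new.append(a[:cut] + b[cut:])
            pop = new
            if g % 10 == 0:
                plastic = sum(ind.count('?') for ind in pop) / (POP * GENES)
                print(f'gen {g:3d}: fraction of plastic alleles = {plastic:.2f}')

    evolve()

Without plasticity the landscape is a single spike and selection has nothing to climb; with it, nearly-correct genotypes gain partial credit (the smoothing aspect), and because guessing costs fitness, '?' alleles are gradually replaced by hard-wired 1s (genetic assimilation).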
Turney [280] has connected the Baldwin effect to
inductive bias. All inductive algorithms have a bias and the Baldwin
effect can be seen as a shift from weak to strong bias: when bias is
weak, agents rely on learning; when it is strong, they rely on
instinctive behaviour.
Kovacs and Kerber [157] point out that high
classification accuracy does not imply effective genetic search. To
illustrate, they initialised XCS [301] with random
condition/action rules and disabled evolutionary search. Updates to
estimates of rule utility, however, were made as usual. They found the
system was still able to achieve very high training set accuracy on
the widely-used 6 and 11 multiplexer tasks since ineffective rules
were simply given low weight in decision making, though neither
removed nor replaced. Care is therefore warranted when attributing
good accuracy to genetic search. A limitation of this work is that
test set accuracy was not evaluated.
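The logic of the experiment can be reproduced in miniature. The sketch below is a simplified reconstruction rather than XCS itself: it draws random ternary-condition rules for the 6 multiplexer, never invokes evolutionary search, estimates each rule's utility as its accuracy over the inputs it matches (a batch stand-in for XCS's incremental parameter updates), and classifies by accuracy-weighted voting. The rule count, the probability of '#', and the vote sharpening are all assumptions.

    import itertools
    import random

    def mux6(x):
        # 6 multiplexer: the first 2 bits address one of the remaining 4 bits.
        return x[2 + (2 * x[0] + x[1])]

    def matches(cond, x):
        return all(c == '#' or int(c) == b for c, b in zip(cond, x))

    # A random rule population; evolutionary search stays disabled throughout.
    rules = [([random.choice('##01') for _ in range(6)], random.randint(0, 1))
             for _ in range(2000)]

    inputs = list(itertools.product([0, 1], repeat=6))

    # Estimate each rule's utility as its accuracy on the inputs it matches.
    accuracy = []
    for cond, action in rules:
        matched = [x for x in inputs if matches(cond, x)]
        correct = sum(1 for x in matched if mux6(x) == action)
        accuracy.append(correct / len(matched) if matched else 0.0)

    def predict(x):
        # Accuracy-weighted vote, sharpened so that inaccurate rules carry
        # little weight; they are neither removed nor replaced.
        votes = [0.0, 0.0]
        for (cond, action), a in zip(rules, accuracy):
            if matches(cond, x):
                votes[action] += a ** 10
        return 0 if votes[0] >= votes[1] else 1

    train_acc = sum(predict(x) == mux6(x) for x in inputs) / len(inputs)
    print(f'training accuracy, random rules, no GA: {train_acc:.2%}')

Among a few thousand random rules, some happen to be perfectly accurate on the inputs they match, and the weighting lets them dominate the vote, so training accuracy is high even though genetic search never ran. This is precisely why high accuracy alone is weak evidence of effective evolutionary search.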