2.3 The Interaction of Learning and Evolution
This section briefly touches on the rich interactions between
evolution and learning.
Memetic Learning
We can characterise evolution as a form
of global search: it is good at finding promising basins of attraction,
but poor at locating the optimum within a basin. In contrast, many
learning methods are forms of local search and have the opposite
characteristics.
We can get the best of both by combining them, which generally
outperforms either alone [316]. For example,
evolving the initial weights of a neural network and then training
them with gradient descent can be two orders of magnitude faster than
using random initial weights [93]. Methods which
combine global and local search are called memetic algorithms
[118,119,214,212,252,213,240].
See [161] for a self-contained tutorial.
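To make the global/local division concrete, the following is a minimal memetic-algorithm sketch in Python: evolutionary operators propose jumps between basins, while hill climbing refines each candidate within its basin. The Rastrigin test function, the simple hill climber and all parameter values are illustrative assumptions, not details taken from the works cited above.

    import math
    import random

    def rastrigin(x):
        # Highly multimodal: many local basins, global optimum at the origin.
        return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

    def hill_climb(x, steps=50, sigma=0.05):
        # Local search: refine a solution within its current basin of attraction.
        best, best_f = x, rastrigin(x)
        for _ in range(steps):
            cand = [xi + random.gauss(0, sigma) for xi in best]
            f = rastrigin(cand)
            if f < best_f:
                best, best_f = cand, f
        return best

    def memetic(pop_size=30, dims=5, gens=100):
        pop = [[random.uniform(-5.12, 5.12) for _ in range(dims)] for _ in range(pop_size)]
        for _ in range(gens):
            pop = [hill_climb(ind) for ind in pop]        # local refinement of each individual
            pop.sort(key=rastrigin)
            parents = pop[:pop_size // 2]                 # truncation selection
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = random.sample(parents, 2)
                child = [random.choice(g) for g in zip(a, b)]        # uniform crossover
                child = [xi + random.gauss(0, 0.3) for xi in child]  # mutation: the global move
                children.append(child)
            pop = parents + children
        return min(pop, key=rastrigin)

    best = memetic()
    print(best, rastrigin(best))

Removing the hill_climb call leaves a plain genetic algorithm, which typically stalls near, but not at, a basin's optimum; removing the evolutionary loop leaves a local searcher that gets trapped in whichever basin it starts in.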
Darwinian and Lamarckian Evolution
In Lamarckian evolution/inheritance, learning during an individual's
lifetime directly alters genes passed to offspring, so offspring
inherit the result of their parents' learning. This does not occur in
nature but can in computers and has the potential to be more efficient
than Darwinian evolution since the results of learning are not thrown
away. Indeed, Ackley and Littman [3] showed Lamarckian
evolution was much faster on stationary learning tasks, while Sasaki
and Tokoro [243] showed Darwinian evolution is generally
better on non-stationary tasks. See also
[296,315,228,293].
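The algorithmic difference is small but consequential: both schemes evaluate the phenotype after learning, but only Lamarckian evolution writes the result of learning back into the inherited genome. A minimal sketch, assuming bit-string genomes, OneMax fitness and a few greedy bit flips as "lifetime learning" (all illustrative choices, not taken from the studies cited above):

    import random

    GENES = 40

    def fitness(bits):
        return sum(bits)  # OneMax: simply count the 1s

    def learn(bits, trials=5):
        # Lifetime learning: greedy local search via random bit flips.
        bits = list(bits)
        for _ in range(trials):
            i = random.randrange(GENES)
            flipped = list(bits)
            flipped[i] ^= 1
            if fitness(flipped) > fitness(bits):
                bits = flipped
        return bits

    def evolve(lamarckian, pop_size=20, gens=30):
        pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(pop_size)]
        for _ in range(gens):
            scored = []
            for genome in pop:
                learned = learn(genome)
                # Selection always sees the post-learning phenotype...
                score = fitness(learned)
                # ...but only Lamarckian inheritance passes the learned genes on.
                scored.append((score, learned if lamarckian else genome))
            scored.sort(key=lambda s: -s[0])
            parents = [g for _, g in scored[:pop_size // 2]]
            offspring = [[b if random.random() > 0.01 else 1 - b
                          for b in random.choice(parents)]
                         for _ in range(pop_size - len(parents))]
            pop = parents + offspring
        return max(fitness(g) for g in pop)

    print('Darwinian :', evolve(lamarckian=False))
    print('Lamarckian:', evolve(lamarckian=True))

On a stationary task like this one the Lamarckian variant typically converges faster, consistent with Ackley and Littman's result; on a task whose target moves, Darwinian inheritance tends to pay off, consistent with Sasaki and Tokoro's.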
The Baldwin Effect
The Baldwin effect is a two-part dynamic between learning and
evolution which depends on Phenotypic Plasticity (PP): the
ability to adapt (e.g. learn) during an individual's lifetime. The
first aspect is this. Suppose a mutation would have no benefit except
for PP. Without PP, the mutation does not increase fitness, but with
PP it does. Thus PP helps evolution to adopt beneficial mutations; it
effectively smooths the fitness landscape.
A possible example from nature is lactose tolerance in human adults.
At a relatively recent point in human evolution, a mutation arose that
allows adult humans to digest milk. Subsequently, humans learned to
keep animals for milk, which in turn made the mutation more beneficial
and thus more likely to spread.
The smoothing effect on the fitness landscape depends on PP; the
greater the PP the more potential there is for smoothing. All GBML
methods exploit the Baldwin effect to the extent that they have PP.
See [293] §7.2 for a short review of the Baldwin effect
in reinforcement learning.
The second aspect of the Baldwin effect is genetic assimilation.
Suppose PP has a cost (e.g. learning involves making mistakes). If PP
can be replaced by new genes, it will be; for instance a learned
behaviour can become instinctive. This allows learned behaviours to
become inherited without Lamarckian inheritance.
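Both aspects can be seen in a small simulation in the spirit of Hinton and Nowlan's well-known needle-in-a-haystack model. The sketch below is a simplified reconstruction with illustrative parameters (not those of the original study): alleles are 0, 1 or '?' (plastic); only the all-1s genotype is fit; lifetime learning randomly guesses the '?' positions; and earlier success earns higher fitness, so plasticity carries a cost.

    import random

    GENES, TRIALS, POP, GENS = 10, 200, 100, 50

    def lifetime_fitness(genome):
        # A fixed wrong allele can never be learned around.
        if any(a == 0 for a in genome):
            return 1.0
        unknown = genome.count('?')
        for t in range(TRIALS):
            # One learning trial: guess all plastic positions at random.
            if all(random.randint(0, 1) == 1 for _ in range(unknown)):
                # Target found after t trials: earlier is better, so each
                # trial spent guessing is a cost that rewards assimilation.
                return 1.0 + 19.0 * (TRIALS - t) / TRIALS
        return 1.0

    def evolve():
        # Initial alleles: 0 and 1 with probability 0.25 each, '?' with 0.5.
        pop = [random.choices([0, 1, '?'], weights=[1, 1, 2], k=GENES)
               for _ in range(POP)]
        for g in range(GENS):
            weights = [lifetime_fitness(ind) for ind in pop]
            # Fitness-proportionate selection with one-point crossover.
            new = []
            for _ in range(POP):
                a, b = random.choices(pop, weights=weights, k=2)
                cut = random.randrange(GENES)
                new.append(a[:cut] + b[cut:])
            pop = new
            if g % 10 == 0:
                plastic = sum(ind.count('?') for ind in pop) / (POP * GENES)
                print(f'gen {g:3d}: fraction of plastic alleles = {plastic:.2f}')

    evolve()

Without plasticity the landscape is a single spike and selection has nothing to climb; with it, nearly-correct genotypes gain partial credit (the smoothing aspect), and because guessing costs fitness, '?' alleles are gradually replaced by hard-wired 1s (genetic assimilation).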
Turney [280] has connected the Baldwin effect to
inductive bias. All inductive algorithms have a bias and the Baldwin
effect can be seen as a shift from weak to strong bias: when bias is
weak, agents rely on learning; when it is strong, they rely on
instinctive behaviour.
Kovacs and Kerber [157] point out that high
classification accuracy does not imply effective genetic search. To
illustrate, they initialised XCS [301] with random
condition/action rules and disabled evolutionary search. Updates to
estimates of rule utility, however, were made as usual. They found the
system was still able to achieve very high training set accuracy on
the widely-used 6 and 11 multiplexer tasks since ineffective rules
were simply given low weight in decision making, though neither
removed nor replaced. Care is therefore warranted when attributing
good accuracy to genetic search. A limitation of this work is that
test set accuracy was not evaluated.
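The logic of the experiment can be reproduced in miniature. The sketch below is a simplified reconstruction rather than XCS itself: it draws random ternary-condition rules for the 6 multiplexer, never invokes evolutionary search, estimates each rule's utility as its accuracy over the inputs it matches (a batch stand-in for XCS's incremental parameter updates), and classifies by accuracy-weighted voting. The rule count, the probability of '#', and the vote sharpening are all assumptions.

    import itertools
    import random

    def mux6(x):
        # 6 multiplexer: the first 2 bits address one of the remaining 4 bits.
        return x[2 + (2 * x[0] + x[1])]

    def matches(cond, x):
        return all(c == '#' or int(c) == b for c, b in zip(cond, x))

    # A random rule population; evolutionary search stays disabled throughout.
    rules = [([random.choice('##01') for _ in range(6)], random.randint(0, 1))
             for _ in range(2000)]

    inputs = list(itertools.product([0, 1], repeat=6))

    # Estimate each rule's utility as its accuracy on the inputs it matches.
    accuracy = []
    for cond, action in rules:
        matched = [x for x in inputs if matches(cond, x)]
        correct = sum(1 for x in matched if mux6(x) == action)
        accuracy.append(correct / len(matched) if matched else 0.0)

    def predict(x):
        # Accuracy-weighted vote, sharpened so that inaccurate rules carry
        # little weight; they are neither removed nor replaced.
        votes = [0.0, 0.0]
        for (cond, action), a in zip(rules, accuracy):
            if matches(cond, x):
                votes[action] += a ** 10
        return 0 if votes[0] >= votes[1] else 1

    train_acc = sum(predict(x) == mux6(x) for x in inputs) / len(inputs)
    print(f'training accuracy, random rules, no GA: {train_acc:.2%}')

Among a few thousand random rules, some happen to be perfectly accurate on the inputs they match, and the weighting lets them dominate the vote, so training accuracy is high even though genetic search never ran. This is precisely why high accuracy alone is weak evidence of effective evolutionary search.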