The first part of this article is going to discuss the different kinds
of operators you need for the Genetic Algorithm. There are different ways
of implementing each operator, each with its own strengths and weaknesses.
The second part of the article talks about some of the common problems
which arise as a result of design issues.
Selection Operators
Selection operators are the biggest and most complicated operator
in the whole genetic algorithm. Choosing this operator and fine-tuning
its parameters is the second most important part of the whole algorithm.
The rest of this section is devoted to focusing on describing the two
main selection operators, and analyzing their characteristics.
So you've got your fitness function done such that, given any genome,
you can obtain its fitness. Now you need to randomly choose two genomes
to mate, but you need to choose them in such a way that the more fit ones
are more likely to mate. This needs to be precise so that, probabilistically,
your population is getting better overall after each generation, but is
also allowing for new ideas to propagate. The two main ways of doing this
are using Roulette Wheels or Tournament Style selection. Let's look at
each one individually, then compare them.
Roulette Wheel
Suppose you have your population's fitness figured out, and the
sum of every genome's fitness happens to be 532. Then for each genome,
you want their probability of getting chosen to be their fitness divided
by 532. That way, everyone's probability of getting chosen all adds up
to 100%. If two genomes have the same fitness, then they have the same
probability of getting chosen.
Think of each genome owning sections of a roulette wheel. The better
fit genomes own bigger chunks of the wheel, so they're more likely to
get chosen, but it is always possible for any genome to get picked.
Take a look at this example:
Suppose you have a population of five genomes (not a typical population
size, but work with me). Their fitness is given below:
fitness(g1) = 35
fitness(g2) = 25
fitness(g3) = 21
fitness(g4) = 13
fitness(g5) = 6
Sum of fitness = 100
Probabilities of being chosen:
g1 - 0.35
g2 - 0.25
g3 - 0.21
g4 - 0.13
g5 - 0.06

Imagine that this is spinning and a ball must end on one section for each
parent
Implementing this roulette wheel is quite simple. Choose
a random number between 1 and the sum of the fitness, inclusive. Keep
subtracting each genome's fitness until you get onto that part of the
board. Think of it in terms of our example. The "ball" will
land on a number between 1 and 100. Genome g1 is chosen if the ball lands
on 1 to 35, g2 has 36-60, g3 has 60-81, g4 has 82-94, g5 has 95 to 100.
If you're into pseudocode, here's what it looks like:
ball = Random(1,sum_fitness)
for( n=0; n< pop_size; n++) {
if( fitness[n]<=ball )
{
 choice
= n;
} else {
 ball
= ball - fitness[n]
}
}
return choice
This is done once for each parent. If you happen to choose
the same parent both times, just try it again (don't elilminate the selected
parent when running your selection for the second time, more on that in
the programming discussion).
The Roulette Wheel algorithm is pretty natural - a genome
gets more real-estate of the board when they're better. As a result, the
probability of getting chosen is also quite easy to analyze, as opposed
to the next selection operator.
Tournament Style
This selection operator is given away by its name. You can choose
a group of, say four, genomes. Then pair them off randomly and each genome's
fitness with its partner, the winner is the parent. If any two genomes
happen to have the same fitness, then choose randomly. The tiers look
like this:

Let's analyze the probability of being chosen in a 1-tiered
tournament selection operator. The best genome of the group of four will
always be chosen, since it will always "win" any comparison.
Likewise the worst genome will never get picked because it will always
"lose" any comparison. Look at this ranking of the probability
of getting chosen in a pool of 4:
Probability of being
chosen by Rank (pool of 4 total)
1st: 100%
2nd: 2/3 = 66.7%
3rd: 1/3 = 33.3%
4th: 0%
Now if we were to make this selector 2 tiered, that is
each parent comes from a pool of four rather than two, then the probability
of each rank getting chosen is this (assuming no "ties"):
Probability of being
chosen by Rank (pool of 8 total)
1st: 100%
2nd: 4/7 = 57.1%
3rd: 2/7 = 28.6%
4th: 4/35 = 11.4%
5th: 1/35 = 2.9%
6th: 0%
7th: 0%
8th: 0%
This means that no matter which eight genomes are selected,
the bottom three will never be chosen and the top will always be chosen.
(The numbers are similar if there are any "ties" with fitness,
though there really shouldn't be too many of ties since your fitness function
ought to be precise enough).
Now look at the difference between 1-tiered and 2-tiered
selection processes. Proportionally speaking, the amount of genomes which
are thrown out is more (1/4 < 3/8), but there's more spread of percentage
from the 2nd to 5th place. So there's a better chance that a "good,
but not necessarily the best" genome is going to be picked for mating,
which is exactly what a Genetic Algorithm is supposed to do.
This tournament operator can also be used to replace a fitness
function. Suppose you can't assign a number to each genome, but you can
still compare any two genomes. An example of this would be game-players.
So suppose your genomes code for some sort of Chess player. It might be
hard to assign a number to any player's skill, but since you can have
two players play each other, just take the winner of a series and choose
that one to be mated, or go to the next tier as the case may be.
Comparing both Selection Operators
Let's look at some of the characteristics of both selection operators.
Roulette requires that the fitness be an integer because of the way it's
implemented, and Tournament doesn't even require a numbered fitness just
a comparison. Roulette allows every genome a proportional chance to be
selected for mating, while Tournament always selects the best of a pool
and won't select some of the worst of the pool. Roulette is suceptible
to disproportionate fitness (i.e. a few genomes are a lot better than
other genomes), Tournament is only concerned with overall rank of fitness
and not the actual value of fitness itself. Each operator has its own
characteristics with strengths and weaknesses, and it seems that most
Genetic Algorithms out there employ both these operators pretty equally
depending on the problem they are approaching.
(Dis)Allowing Adultery
Is a genome allowed to mate more than once per generation? This
is certainly a viable option, since eliminating a parent once it has mated
changes the probabilty dynamic throughout the population. It's certainly
easier to implement a Genetic Algorithm by allowing adultery because you
don't have to mark off certain genomes once they've mated. It is my opinion,
however, that you must not allow adultery in the population (and not just
for moral reasons!). If you allow genomes to mate more than once in a
single generation, then the probability of the entire population being
uniform becomes exponentially bigger. Think about it: if a piece of a
genome is truly a good "idea", then it will double with every
generation if there's no adultery. But along the way, there might
be a better chunk which is in only one genome, and it won't be able to
propagate itself because the first idea is already too pervasive. Now
there typically is a good balance of new ideas that come about if you
choose the right selection operator, but what adultery does is advance
propagation too fast. If a chunk of genome, which is better than all the
other ideas that the population has at the time, is propagated through
adultery, say four times in one generation, then that's as if it took
two extra generations for that idea to propagate. Balance is thrown away.
The population is more likely to be too uniform too quickly if adultery
is allowed. Admittedly, the population will converge to a solution faster
with adultery, but the potential creativity of the algorithm is stunted.
Mating Operators
Mating is one of the most basic operations that the Genetic Algorithm
uses. It's such a simple idea that there is really no special methods
to use like in those selection operators. The goal of mating is to take
two parent genomes and somehow mix them to create a child. Since you have
your genomes encoded into a list of integers, then simply take a chunk
from genome and swap it on the other genome.

If your genome is encoded circularly, which is common with
marker-based encoding, then you need to use a double-point based mating.
Here's what that looks like:

The mating operation need not be complicated because you really want
to preserve the elegant "randomness" of mating. Children which
aren't very fit will die out, and children which are better will survive
the next few generations, so you don't need to worry about always making
your mates come up with a "better" child. Remember, you don't
really know what "better" means anyway; so let the algorithm
guess and modify on its own.
Survival Operators
What is each child's destiny? You've selected your parents, you've
mated them, and now you have a child. Where does that child go? The size
of the population doesn't change, so this child needs to replace some
genome somewhere (note: this operator is also referred to as the replacement
operator). There are couple of natural ideas behind the replacement operator.
You could get all of your children from all the matings
and then take the best of both the parents and the children. This method
is referred to as intersection because you're taking the intersection
of two sets: the set of parents and set of children. The intersection
method will guarantee that the population will never get worse, because
if all the children are worse than their parents then the population will
stay the same. This could lead to quick unformity, however, because if
the children are similar to their parents then they will permeate the
population too quickly.
Another method of the killing operator is more along the
lines of the "taking over the family business" destiny. If a
child is better than one of its parents, then replace that parent. This
will also guarantee that the population won't go backward because if none
of the children are better then they die out. The rest of the population
won't be affected much, either. The danger behind this particular form
of replacement is that the population can get into a rut. It's too easy
for none of the mates to get any better and for the population to stagnate.
My solution to this, and this is something that I haven't tested myself
but seems to theoretically work when thinking about the probabilities,
is to give the failing child another chance. Choose a random genome which
is ranked low in the population and test the child's fitness against that
one. If the child isn't good, then kill it off, but otherwise replace
the bottom ranked genome. This seems like a middle-ground between the
"intersection" and the "family destiny" methods, where
only good ideas begin to permeate the population without converging at
an exponential rate.
The last and simplest killing operator is commonly known
as Elitism. This means that a child will replace a random genome in the
population (or maybe just parents), but the top few genomes will be immune
to this replacement. This means that the top genomes of the population
won't get any worse, but the rest of the population might go backward
while trying to find other ideas. This is also a good middle-ground idea
because the elite genomes can keep the good ideas of the population, but
they also provide for plenty of creativity. The main issues that I have
come up in using Elitism is that sometimes a population will latch onto
an elite genome and miss a much better idea. The way you can get around
that problem, however, is by essentially rotating the elite genomes so
that no one genome monopolizes immunity. The Tournament selection operator
is good for this, and if you're using a Roulette Wheel, fitness scaling
is the way to go (more on fitness scaling later).
Mutating Operators
Throughout the algorithm, it is important that you
create some sort of variation in the population. Mutation is the operator
which adds new "ideas" to the pile by changing a few random
genomes. How this works depends on your encoding scheme, but most of the
time the mutation operator changes a few 1's to 0's or changes, or changing
an integer in the list. The mutation operator usually changes one or two
genomes per generation.
For some populations, rather than mutating existing genomes,
sometimes can create a new genome, evaluate its fitness, and if that fitness
is better than the worst genome then replace it. That way there's not
a possibility of making the population get worse, but it's still creative.
If that's not creative enough, then it would make sense to create a few
new genomes and have them replace the worst. The choice depends wholly
on what kind of problem you're solving and what makes the most sense.
Comment on static versus adaptive operators
Now most GA's will choose their operators and parameters at the
beginning and stick with them for the entire run, which typically works
well. It is conceivable to alter the parameters, maybe even change the
operators, partway through the evolution process. Maybe if we notice that
the population is converging too quickly we can put in a fitness scaler.
Or maybe we could turn down the mutation intensity toward the end of the
process. There are plenty of ideas that you could implement partway through
the process. I've never attempted this mostly because it's hard to program
in a different operator in the middle of the algorithm, but it's certainly
possible to do.
These operators are the second layer to the whole GA and
are therefore still crucial to the overall success of the algorithm. One
must take special care as to how each operator is chosen and how each
parameter is given. The good news about this is that once you've got the
operators functioning properly, most the fine-tuning for GA's are in the
fitness function. It's a ton of work to get done at the beginning of development,
but they don't need a lot revision.
I'd like to point out here that there are a number of ideas
in this article that are my own. Most of the operators are standard, but
some of their variations are my own concoction. They have proven to work
well in my own experience, but I have not read any literature to back
their reliability. I guess that's why this is an internet article - I
can write whatever I've learned and not worry about where it came from.
|