| Population Dynamics |
| Research > Genetic Algorithms > Articles > Population Dynamics |
The past few articles have discussed the crucial elements involved in a Genetic Algorithm. This article talks about some features that are optional for GAs, but they are often employed today. Most GAs that I have done have not used species or subpopulations because I have not needed them, but they are certainly quite help in a practical setting. Speciation and Subpopulations are great for employing creativity in order to find multiple solutions to a problem, as well as refining the search space of a given problem. Search Space Revisited Given that the calculation involves raising exponents, you can see that search space is typically enormous. Don't let that worry you. This is one of beauties of Genetic Algorithms: they can take enormous amounts of search space and still find something that would take us humans years to find. Sometimes, however, the search space is simply too big. The key reason for estimating the search space is to be able to push the GA's ability as far as it can go. I first stumbled upon search space issues when I was doing a project for my Computer Science class called ASGA: A Simple Genetic Algorithm. The problem that the algorithm was simple: guessing a word where each genome was a string of letters. I found that the algorithm wouldn't converge almost at all when the word was fifteen letters or more, but if there were fourteen letters or less, the whole population would be done in seconds. Now think about the search space for that problem. Every letter you add to the genome multiplies the search space by 26 (since there are 26 letters of the alphabet). What I found fascinating was that, for a 70-genome population, the breaking point was 14 to the 26th power. Why is that?
Let's look at search space in terms of the two-variable optimization problem from the fitness article. Recall that we are looking for (one of) the highest value in a a given domain of a very complicated function. The graph of the function is as follows.
Here, each genome is 40 binary digits long which encodes for two numbers between 0 and 3. Estimating the search space for this problem is simple: 2 to the 40th power.
It's pretty easy to see that the more bits you add to the genome, the more precise answer you're looking for. So there are two ways you can change the search space in this genome. You can change the number of bits to get a more or less precise answer, or you can limit the overall domain (maybe only look for an answer between 2 and 3). As a combination of the two ideas, you can first make your genomes not very precise, but over a large domain. When you come up with an answer, you can limit the domain, but make them more precise. That way, the search space to the algorithm never changes, but you can progressively zoom into a good solution.
Speciation Speciation can be an important idea to use in a GA if you're trying to keep your population diverse. This is done by taking "similar" genomes and separating them out so that they're likely to mate only within their species. It's sort of like introducing geography into the population. The tough part which requires a keen understanding of your problem is how you define "similar" and "not-similar". Some search spaces don't lend themselves to this; those are usually the spaces which only have one optimal solution. But if you can determine somehow that a set of points "good" for different reasons than another set of points, then you should speciate your population. Take a look at our favorite 2-variable spikey graph.
Above there are six genomes labelled by their location. Notice that if a yellow and a red genome mated, which is very likely considering they are both very fit, their child might end up somewhere in that valley between them. The population would grow a lot slower because there are two groups that have great ideas, but they're cancelling each other out. So now a tough question lends itself: how do you determine a species? Sometimes there isn't a niche going on and every genome is pretty evenly spaced. And also remember that comparing each genome to another in some way is already a quadratic operation - a phrase which makes us Computer Scientists cringe! A satisfactory answer, unfortunately, has not quite been found as far as I've seen. I have read a few articles which use this idea and managed to get it to work using some graduate-level statistical analysis techniques, but those only applied to their specfic problem. Another issue to address is how to implement this. Your selection operator will need to be tailor-made for your speciation, which can be quite hairy if you're using an off-the-shelf library. Subpopulations Since any two initial populations can evolve into different solutions, it would make sense to take multiple populations, evolve them over a number of generations, then merge them together to get a whole new population. This new population will be bursting with creative, refined ideas that can be evaluated against each other to get the best genome. Though I have not done much with subpopulations myself, I have read about their success in the literature. Really the only extra operator that needs to be hashed out is the Merging Operator. There are a number of ways you can combine two populations, the easiest of which is to take the two and add them together, which doubles the size of the overall population. This is not reccomended for the reason mentioned above where the good genomes get diluted. The other way of merging is similar to the killing operator: intersection. Take only the best fit from both populations. According to the literature, this seems to be the best known approach. Speciating and subpopulating a genetic algorithm are stimulating ideas that have proven useful in the practical world. There seems no main reason against using these techniques if you have the opportunity to do so. My only reason for not exploring this branch of GAs was that I was getting great results without them and that it would have been an enormous undertaking to implement either of speciation or subpopulations in the end. According to what I have read, it is my opinion that they are probably worthwhile. The only warning that I would have for those actually developing these techniques is to make sure that you have your population working before you speciate and such. In other words, speciation and subpopulations ought not to be relied upon, they only enhance what you already have. |
| Page last updated: May 24, 2006 |





