Taking Remy's comments and a few of my own critiques, I've done
yet another revision of the tutorials.
I decided to get rid of the section where they navigate through
a custom Java driver. Why do they need to learn about a custom driver?
That section has actually moved to Tutorial 3, which the students
won't have time to do. But it fits really nicely into Lab 3, in case
you were worried.
I put a band-aid solution on poor code in the framework with the
SurvivalOperator class. A larger refactoring
would be nice, but I didn't want to break everything two days before
the big lab.
Hope the lab goes well....
04/13
The last few weeks can be summed up in rewriting Tutorials 1 and
2 for my expert test with Remy. I like the format of these labs. Lab
2 was totally rewritten with a whole new example and more
hand-holding. I'm convinced this is good, though.
Tomorrow I'm testing this all out with Remy.
I made a bunch of edits to the framework as well - all of which
are committed and released.
03/23
I worked primarily on the examples today
The IntGenome example was changed significantly,
and much commenting was done to help out the process.
I've also been working on getting those reporting mechanisms better.
It's still console output, but it's much better this time.
In the writing, I've tried to make things more clear. I've separated
out the Eclipse stuff into sidebars (which I love - THANK YOU FRENS!)
Todo
Change the other examples to be more readable
Apply many of these changes to the other labs.
The expectation for this lab are these
The average student will finish the first lab
The great students will finish the second lab
03/22
I got rid of the "Dummy Genome" everywhere. The whole
concept seemed a little smelly to begin with, anyway.
I changed the GenomeFactory interface
to only one method, createGenome(List<Integer>
data), to alleviate the confusion. The createRandomGenome
method will be implemented by just putting a random list of integers
into the createGenome method. This change
will hopefully make things a lot less confusing, and will cut down
on that evil duplicate code.
Actually, I lied... the GenomeFactory
interface must also support the method genomeSize()
so that the framework can know how many integers to generate.
Moved MersenneTwister into the framework
to cut down on the JAR files being used. I think that's legal...
Made the text output for the Runner to be a little more descriptive.
TODO
Change around the examples to make them a bit more user-friendly
Come up with some airtight examples and exercises.
03/20
I changed the parameters to the Mutate operators
to be a little more specific than "intensity".
Todo
Rewrite and revise the tutorials in a very massive way
Clean up the examples, making them more accessible
Write an introduction for the tutorials so that the users will
know more of what to expect
02/23
I've been up to the wee hours of the morning writing a bazillion
tutorials and other documentation. I recommend you take a glance at
it.
02/19
Finished the Operators Sheet.
Tweaked some Javadoc.
Set up the overall structure for the tutorials, and examples
02/18
Lots of smaller changes to the documentation: navigation bar, more
work on the Operators document, officially posted the date for the
pilot lab, etc.
Wrote the transitivity thing.
To do this weekend
Finish Operators sheet
Finish Examples Guide
Finish the Getting Started and First Algorithm tutorials.
To do this week
Write all the tutorials
Finish off the examples already in place
Write a few new examples
02/15
Did tons of documentation work. I'm getting there....
02/14
Made some small changes to the code, mostly just testing and polishing
Worked a lot on documentation, but there's a lot more work to be
done there
02/10
Finished the javadoc, now working on documentation.
01/30
Javadoc'd every class within the galapagos.framework
package - more to come.
01/14
Wrote another utility method in SuperBean
called missingProperties() which returns
a list of all properties which are null. Used in error checking for
GalapagosRunner
Fixed a lot of things.
Finished List
Runner is working, though some testing will surely find some
bugs
Integration Testing in terms of JUnit is fine, but I prefer
some human testing at this point in terms of integration
Revised List of "Tasks Ahead of Me"
Documentation
JavaDoc for everything
Parameters and Properties Quick Reference
Tutorial
Testing throughout the code which I may have missed.
Begin writing Ant code for the buildfile - issues have already
arisen regarding that.
"Back Burner"
XML Schema - with the missingProperties() method, we can check
for missing properties easier than to enforce a schema. If I see
a very, very good reason to use a schema instead (other than slightly
more descriptive errors).
Code Metrics
3,729 lines of code, 43 packages, 101 classes
Lines of code per method: 3.532(stddev 4.017), maximum 19
89 unit tests (nearly all have multiple assertions)
01/07
Major refactoring: move Factory away from
a property of any operator and get it from Population
PopulationPref was made to serve as holder
for the preferences when the runner generates a StandardPopulation
01/03
I did some reorganization of Comparator
and Factory. This was motivated by adding
Population to the runner.
I had to make a few decisions about the runner
Since, GAMersenneTwister, is not a
JavaBean (since it follows a singleton pattern, so it has no public
constructor), I decided that all algorithms run from the Runner
use GAMersenneTwister - the best random
generator available anyway.
ReportStatus will be Report_End
until the reporting mechanism evolves more.
PopSize, then, is now an attribute
of GalapagosAlgorithm. The setter fires
up the StandardPopulation.generateRandomPopulation(..).
Future refactoring will involve taking two constructors for the
algorithms, one that will generate a StandardPopulation
and another that will simply accept a Population.
01/01
Selection operators and all of the scalers
are supported.
My fancy equals(Object other) from SuperBean
method broke, so I'll need to fix it.
Speciation is supported, so the OperatorSet
is done
Idea: for noSpeciation
and noScaling, allow the user to not specify
that operator and default to it. That is, after the XML parsing, check
for a null and add the appropriate operator
Issues that drive me nuts:
Still can't get PipedInputStream to
work (piping the XSLT Transformer's
OutputStream to the XMLDecoder's
InputStream). I think this might be
because the entire file needs to be buffered, which means that
a Stream object is inappropriate for
XMLDecoder.
I want to separate galapagosTransform.xsl
to separate files for readability, so using <xsl:include>
is what I should do. But, the base URI of this XSLT is not the
document's location, but the location of the project. I need to
access the document's location so that I can resolve a relative
link without knowledge of the XSLT file's absolute location. This
should be easy, but...
I can't decide on how to use the Report
mechanism well. I might export an XML file and use an XSLT to
create an HTML report, but maybe that's beating around the bush.
I need to research JasperReports and
other reporting libraries.
Major tasks ahead of me
Once the parser is done (which is coming close), I need to write
the runner with all of the error-checking bells and whistles.
Write a schema to enforce the runner files.
A truckload of documentation
More thorough integration testing the entire algorithm
Write tutorials
A few code metrics
3,555 lines of code, 41 packages, 96 classes
Lines of code per method: 3.45 (stddev 3.9), maximum 19
81 unit tests (about 100 assertions)
12/29
Implemented an equals(Object) method in
SuperBean, which uses introspection to check all of the properties
(very handy for future java bean stuff).
Runner now supports Mate, Mutate,
and Survival, and some Selection
operators (scalers are the current issue).
Much more to do, but most of it is busywork now that I have the
code patterns all figured out.
Idea: Documentation must include
the XML configuration structure
MAJOR REFACTOR: rewrote the runner's unit tests so that TestCase
classes don't extend each other. This is because all of the child
classes re-run the parent's test methods (slows down the unit testing
execution time).
12/24
Beanified every operator, did a little refactoring.
Implemented the Mate Operators for the runner
12/23
After many iterations and many different decisions, I am now implementing
the runner by using XSLT to transform the Galapagos XML configuration
format to the format that Java's XMLEncoder/Decoder uses.
Began a SuperBean, which all JavaBeans
will be derived from, that has the method hasAllProperties.
This method uses the Introspector to check
all of the properties of the bean against null
and (in the future) will do a deep check of all other SuperBeans.
In working on the transformation process, I have come across a sticky
problem: piping an Outputstream to an Inputstream. So far I'm simply
dumping the information to an intermediate XML file (which may be
the final solution for pedagogical reasons)
11/8
Started the Galapagos Runner with the parser. Currently I'm using
the (very clunky) DOM from org.w3c,
but I'm debating on going to JDOM
for my parsing needs. JAXB
is also on my horizon.
Didn't commit the code yet since I might change libraries.
Working on XML Schema for the input XML file for the runner.
11/7
In Roulette, created a "base case"
for speed-up reasons. If there are only two genomes left unmated,
then grab those two. This bug was discovered when the Window
scaler was placed in and the runSelection
method of Roulette slowed down a ton (i.e.
since one genome had 1 as fitness and the other had 3000, it's pointless
to keep randomly picking because there's only two left).
Incorporated fitness scaling
Every QuantifiableGenome now has a
getScaledFitness() function, with an
appropriate exception message when the fitness has not been scaled.
LogScale and SqrtScale
both have a multiplier parameter to scale the output
WindowScale subtracts the lowest fitness
from the population
Idea: make an exercise that requires
a combination of two scalers (e.g. window and logarithmic)
Idea: make a genetic programming example,
then have some exercises be different variations on it. Need to do
some research on the "textbook" examples of genetic programming.
Idea: do a simple Traveling Salesperson
example.
10/16
In Genome, changed toString()
to report(), since toString()
really shouldn't be used there.
Idea: Once fitness scaling is in place,
write a random generator whose pdf is transformed from a uniform distribution
to an exponential, quadratic, etc. Then apply this to a selection
operator which would rank all genomes and select with bias towards
the top.
10/13
Created nullCheck methods for when the
operators are loaded as JavaBeans to ensure that all parameters are
loaded properly before the algorithm is run
IterativeAlgorithm is now used fully in
the IntGenome example, so the Main
class no longer extends GalapagosAlgorithm
Made Genome's toString()
method abstract, forcing the user to write a useful toString()
method for reporting purposes.
Started on GAReport, a more sophisticated
reporting mechanism
Researched a few new ideas for Pet Problems. Will consult more formal
sources soon.
Designed and began to implement the XML schema for the Runner. Brainstorm
below:
TODO
Move functionality of ReportStatus away
from an enumeration to GAReport which collects
all possible data then reports only what you want, inputing the OutputStream
as a parameter
Add Fitness Scaling to the Roulette selection
operator
Idea: create a tutorial which asks the
user to create a GA for solving general integer solutions to Diophantine
Equations (e.g. x+3y+5z = 30). This would make for a great example
instead of a continuous-space example like SineGenome
Change the abstract toString() method
to another, more descriptive method for reporting later on.
Change SineGenome's toString() value to
something meaningful (i.e. the x-,y-, and z-values)
Created a generateRandomPopulation method
in the framework
The GenomeFactory is now a part of the
Population, so this means that the population
is in charge of creating new genomes, keeping a list of genomes, and
knowing how to compare genomes.
CompGenome and SineGenome
were refactored considerably
The property GenomeSize was delegated
to the user to define when they implement GenomeFactory.
Pulled up GenomeFactory into the abstract
class MateOperator from its children
Document Type Definition of the runner configuration input file
is underway.
9/24
Fixed a selection bug in keeping track of unmated genomes
Made all operators abstract classes instead of interfaces, moved
around common fields
Changed the random number generator to a parameter everywhere
Added a numChildrenPerFamily parameter,
allowing for multiple children per family
Created an iterative algorithm
Began pet problem "SineWave",
which is the optimization of the continuous function
Task List:
Incorporate an Iterative Algorithm into the framework
Fix number formatting fix in SineGenome
Change GenomeFactory to a parameter, not
a property
Move Population out of OperatorSet
Move generateRandomPopulation into the
framework
9/19 Initial Entry
I have been working on this project for about 3 months now and I have
the framework up on its feet with 50 unit tests, two pet problems, and
over 1000 effective lines of code. Most importantly, it all works beautifully!
...time to break it again.