Three major problems make Genetic Programming infeasible or impractical
for real-world problems.
The first is excessive time complexity. In nature, the evolutionary process
can take millions of years, a time frame that is clearly not acceptable for solving
problems on a computer. To apply Genetic Programming to real-world
problems, it is essential that its efficiency be improved.
The second is overfitting, in which an evolved solution fits the training data
well but is inaccurate on data outside it. In a paper [36] for the Federal Reserve
Bank, Neely and Weller state, “a perennial problem with using flexible, powerful
search procedures like Genetic Programming is overfitting, the finding of spurious
patterns in the data. Given the well-documented tendency for the genetic program
to overfit the data it is necessary to design procedures to mitigate this.”
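The overfitting problem described above can be illustrated with a minimal sketch. The data points and both models are hypothetical and chosen only for illustration; they are not taken from the experiments in this work. One model memorizes the noisy training points, while the other captures the assumed underlying trend y = 2x.

```python
# Hypothetical data: the underlying relation is assumed to be y = 2x.
# Training targets carry fixed noise of +/-0.5; test targets are noise-free.
train = [(0.0, 0.5), (1.0, 1.5), (2.0, 4.5), (3.0, 5.5)]
test = [(0.4, 0.8), (1.4, 2.8), (2.4, 4.8), (3.4, 6.8)]


def mean_squared_error(model, points):
    """Average squared difference between predictions and targets."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


def simple_model(x):
    """Captures the underlying trend and ignores the noise."""
    return 2.0 * x


def memorizing_model(x):
    """Returns the target of the nearest training point (fits noise exactly)."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


for name, model in [("simple", simple_model), ("memorizing", memorizing_model)]:
    train_error = mean_squared_error(model, train)
    test_error = mean_squared_error(model, test)
    print(f"{name}: train MSE = {train_error:.2f}, test MSE = {test_error:.2f}")
```

The memorizing model has the lower training error but the higher test error. This gap between training and out-of-sample performance is the symptom that any procedure for mitigating overfitting has to detect.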
The third is the difficulty of determining optimal control parameters for the
Genetic Programming process. These parameters govern the evolutionary process
and include settings such as the size of the population and the number of
generations to be run. In his book [45], Banzhaf describes this problem: “The bad
news is that Genetic Programming is a young field and the effect of using various
combinations of parameters is just beginning to be explored.”
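As a rough illustration of why parameter selection is difficult, the sketch below enumerates combinations of the two control parameters named above. The class name `ControlParameters` and the candidate values are assumptions made for this example, not settings used in this work.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ControlParameters:
    """Hypothetical container for two Genetic Programming control parameters."""
    population_size: int
    generations: int


# Illustrative candidate values only (assumed, not from the source).
population_sizes = [100, 500, 1000]
generation_counts = [25, 50, 100]

# Every combination is a separate configuration that would need its own runs.
grid = [
    ControlParameters(population_size=p, generations=g)
    for p, g in product(population_sizes, generation_counts)
]

print(len(grid), "configurations to evaluate")
```

Even with only two parameters and three candidate values each, there are already nine configurations. Each additional parameter multiplies the count, and each configuration typically needs repeated runs because the evolutionary process is stochastic.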
We address these problems by implementing and testing a number of novel
techniques and improvements to the Genetic Programming process. We conduct
experiments on data sets of varying difficulty to demonstrate the success of
these techniques with a high degree of statistical confidence.