Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic Regression Problems

by Alina Geiger, et al.
University of Mainz

Epsilon-lexicase selection is a parent selection method in genetic programming that has been successfully applied to symbolic regression problems. Recently, combining random subsampling with lexicase selection significantly improved performance in other genetic programming domains such as program synthesis. However, the influence of subsampling on solution quality for real-world symbolic regression problems has not yet been studied. In this paper, we propose down-sampled epsilon-lexicase selection, which combines epsilon-lexicase selection with random subsampling to improve performance in the domain of symbolic regression. We compare down-sampled epsilon-lexicase selection with traditional selection methods on common real-world symbolic regression problems and analyze its influence on the properties of the population over a genetic programming run. We find that down-sampled epsilon-lexicase selection reduces diversity compared to standard epsilon-lexicase selection, which coincides with the high hyperselection rates we observe for it. Further, we find that down-sampled epsilon-lexicase selection outperforms the traditional selection methods on all studied problems. Overall, down-sampled epsilon-lexicase selection improves solution quality by up to 85% in comparison to standard epsilon-lexicase selection.
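To make the combination concrete, the following is a minimal Python sketch of one selection event, not the authors' implementation. It assumes a precomputed matrix `errors[i][j]` holding the absolute error of individual `i` on training case `j`, and it assumes the tolerance epsilon is the median absolute deviation (MAD) of the errors among the current candidates, which is one common variant of epsilon-lexicase selection; the abstract does not state which variant or down-sampling rate is used. The function names `down_sample`, `mad`, and `epsilon_lexicase_select` are hypothetical.

```python
import random
import statistics


def down_sample(n_cases, rate, rng):
    """Draw a random subset of training-case indices (assumed: redrawn every generation)."""
    k = max(1, round(rate * n_cases))
    return rng.sample(range(n_cases), k)


def mad(values):
    """Median absolute deviation, used here as the epsilon tolerance (an assumption)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def epsilon_lexicase_select(errors, cases, rng):
    """Select one parent index using only the given (down-sampled) training cases."""
    candidates = list(range(len(errors)))
    order = list(cases)
    rng.shuffle(order)  # each selection event uses a fresh random case order
    for j in order:
        case_errors = [errors[i][j] for i in candidates]
        # Keep every candidate within epsilon of the best error on this case.
        threshold = min(case_errors) + mad(case_errors)
        candidates = [i for i in candidates if errors[i][j] <= threshold]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)  # ties after all cases are broken at random


# Toy population: individual 0 is clearly best on every case.
errors = [
    [0.0, 0.0, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0],
    [9.0, 9.0, 9.0, 9.0],
]
rng = random.Random(0)
cases = down_sample(n_cases=4, rate=0.5, rng=rng)
parent = epsilon_lexicase_select(errors, cases, rng)
```

In a full run under these assumptions, the subset would be drawn once per generation, each individual would be evaluated only on that subset, and `epsilon_lexicase_select` would be called once per parent needed. Evaluating fewer cases per individual is where the saved computation comes from.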


