The Environmental Discontinuity Hypothesis for Down-Sampled Lexicase Selection

05/31/2022
by Ryan Boldi, et al.

Down-sampling training data has long been shown to improve the generalization performance of a wide range of machine learning systems. Recently, down-sampling has proved effective in genetic programming (GP) runs that use the lexicase parent selection technique. Although this down-sampling procedure has been shown to significantly improve performance across a variety of problems, it does not appear to do so by encouraging adaptability through environmental change. We hypothesize that the random sampling performed every generation causes discontinuities that leave the population unable to adapt to the shifting environment. We investigate modifications to down-sampled lexicase selection that aim to promote incremental environmental change and thereby scaffold evolution, by reducing the number of jarring discontinuities between the environments of successive generations. In our empirical studies, we find that forcing incremental environmental change is not significantly better for evolving solutions to program synthesis problems than simple random down-sampling. In response, we attempt to exacerbate the hypothesized prevalence of discontinuities by using only disjoint down-samples, to see whether this hinders performance. We find that this also does not differ significantly from the performance of regular random down-sampling. These negative results raise new questions about how the composition of sub-samples, which may include synonymous cases, may be expected to influence the performance of machine learning systems that use down-sampling.
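
The three sampling regimes compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `n_swap` parameter, and the way the incremental variant swaps cases are all assumptions made for illustration, and the paper's actual mechanisms may differ.

```python
import random


def random_down_sample(cases, k, rng):
    """Simple random down-sampling: a fresh uniform sample of k cases."""
    return rng.sample(cases, k)


def incremental_down_sample(cases, previous, n_swap, rng):
    """Hypothetical incremental variant: keep most of the previous sample
    and replace n_swap of its cases with cases that were not in it."""
    prev_set = set(previous)
    kept = rng.sample(previous, len(previous) - n_swap)
    unused = [c for c in cases if c not in prev_set]
    return kept + rng.sample(unused, n_swap)


def disjoint_down_sample(cases, previous, k, rng):
    """Hypothetical disjoint variant: draw k cases that share nothing
    with the previous generation's sample."""
    prev_set = set(previous)
    unused = [c for c in cases if c not in prev_set]
    return rng.sample(unused, k)


# Usage: training cases are represented by their indices (an assumption).
rng = random.Random(0)
cases = list(range(100))
prev = random_down_sample(cases, 10, rng)
inc = incremental_down_sample(cases, prev, 2, rng)
dis = disjoint_down_sample(cases, prev, 10, rng)
```

In this sketch, the overlap between successive samples is the quantity being controlled: the incremental variant fixes it at `len(previous) - n_swap` cases, the disjoint variant forces it to zero, and simple random down-sampling leaves it to chance.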

research
01/04/2023

Informed Down-Sampled Lexicase Selection: Identifying productive training cases for efficient problem solving

Genetic Programming (GP) often uses large training sets and requires all...
research
02/08/2023

Down-Sampled Epsilon-Lexicase Selection for Real-World Symbolic Regression Problems

Epsilon-lexicase selection is a parent selection method in genetic progr...
research
06/10/2021

Problem-solving benefits of down-sampled lexicase selection

In genetic programming, an evolutionary method for producing computer pr...
research
04/14/2023

Analyzing the Interaction Between Down-Sampling and Selection

Genetic programming systems often use large training sets to evaluate th...
research
04/04/2023

A Static Analysis of Informed Down-Samples

We present an analysis of the loss of population-level test coverage ind...
research
04/30/2015

Model Selection and Overfitting in Genetic Programming: Empirical Study [Extended Version]

Genetic Programming has been very successful in solving a large area of ...
research
11/06/2021

On pseudo-absence generation and machine learning for locust breeding ground prediction in Africa

Desert locust outbreaks threaten the food security of a large part of Af...
