Guaranteeing Reproducibility in Deep Learning Competitions
To encourage the development of methods with reproducible and robust training behavior, we propose a challenge paradigm in which competitors are evaluated directly on the performance of their learning procedures rather than on pre-trained agents. Because competition organizers re-train the proposed methods in a controlled setting, they can guarantee reproducibility, and by re-training submissions on a held-out set of test environments, they can help ensure generalization beyond the environments on which the methods were developed.
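A minimal sketch of such an organizer-side re-training loop is given below. All names (score_submission, make_env, evaluate, train_fn) are illustrative assumptions, not an API from the paper; the point is only that the submitted *training procedure* is re-run from scratch on held-out environments under organizer-controlled seeds, so the reported score is reproducible by construction.

```python
# Hypothetical sketch of the re-training evaluation protocol described above.
# The callables passed in (train_fn, make_env, evaluate) are assumptions made
# for illustration; they are not part of the proposed competition framework.
import random
from typing import Callable, List, Sequence


def score_submission(
    train_fn: Callable,            # competitor's training procedure: (env, seed) -> agent
    held_out_env_ids: List[str],   # test environments never released to competitors
    make_env: Callable,            # organizer-controlled environment factory
    evaluate: Callable,            # organizer-controlled evaluation: (agent, env) -> float
    seeds: Sequence[int] = (0, 1, 2),
) -> float:
    """Re-train the submitted procedure on each held-out environment under
    fixed seeds and return the mean evaluation score."""
    scores = []
    for env_id in held_out_env_ids:
        for seed in seeds:
            random.seed(seed)                 # controlled setting: organizers fix the randomness
            env = make_env(env_id, seed=seed)
            agent = train_fn(env, seed=seed)  # the procedure is scored, not a pre-trained agent
            scores.append(evaluate(agent, env))
    return sum(scores) / len(scores)
```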