Photometric Redshift Estimation with Convolutional Neural Networks and Galaxy Images: A Case Study of Resolving Biases in Data-Driven Methods

02/21/2022
by   Q. Lin, et al.

Deep learning models are increasingly used in astrophysical studies, yet such data-driven algorithms are prone to producing biased outputs that are detrimental to subsequent analyses. In this work, we investigate two major forms of bias, namely class-dependent residuals and mode collapse, in a case study of estimating photometric redshifts as a classification problem using Convolutional Neural Networks (CNNs) and galaxy images with spectroscopic redshifts. We focus on point estimates and propose a sequence of steps for resolving the two biases based on CNN models, involving representation learning with multi-channel outputs, balancing the training data, and leveraging soft labels. The residuals can be viewed as a function of either spectroscopic redshifts or photometric redshifts; the biases with respect to these two definitions are incompatible and should be treated separately. We suggest that resolving biases in the spectroscopic space is a prerequisite for resolving biases in the photometric space. Experiments show that our methods control biases better than benchmark methods and are robust under varying implementation and training conditions, provided that high-quality data are available. Our methods hold promise for future cosmological surveys that require a good constraint on biases, and may be applied to regression problems and other studies that make use of data-driven models. Nonetheless, the bias-variance trade-off and the demand for sufficient statistics suggest the need for developing better methodologies and optimizing data usage strategies.
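The distinction between residuals viewed in spectroscopic space and in photometric space can be illustrated with a toy example. The sketch below is not the paper's pipeline: the residual definition (z_photo - z_spec) / (1 + z_spec), the helper name `binned_mean_residual`, the uniform redshift range, and the Gaussian scatter are all assumptions chosen for illustration. It bins the same mock estimates once by spectroscopic redshift and once by photometric redshift and compares the mean residual per bin.

```python
import numpy as np

def binned_mean_residual(z_ref, z_photo, z_spec, edges):
    """Mean of (z_photo - z_spec) / (1 + z_spec) in bins of z_ref.

    z_ref selects the space in which the bias is viewed: pass z_spec for
    spectroscopic space or z_photo for photometric space.
    (Hypothetical helper; the residual definition is an assumption.)
    """
    residual = (z_photo - z_spec) / (1.0 + z_spec)
    idx = np.digitize(z_ref, edges) - 1      # bin index for each galaxy
    n_bins = len(edges) - 1
    means = np.full(n_bins, np.nan)          # NaN marks empty bins
    for b in range(n_bins):
        in_bin = idx == b                    # out-of-range galaxies are ignored
        if in_bin.any():
            means[b] = residual[in_bin].mean()
    return means

# Mock sample (assumed): uniform spectroscopic redshifts and estimates that
# are unbiased at fixed z_spec, with Gaussian scatter.
rng = np.random.default_rng(0)
z_spec = rng.uniform(0.0, 0.4, size=20000)
z_photo = z_spec + rng.normal(0.0, 0.03, size=z_spec.size)
edges = np.linspace(0.0, 0.4, 9)

res_vs_spec = binned_mean_residual(z_spec, z_photo, z_spec, edges)
res_vs_photo = binned_mean_residual(z_photo, z_photo, z_spec, edges)

print("mean residual vs z_spec :", np.round(res_vs_spec, 4))
print("mean residual vs z_photo:", np.round(res_vs_photo, 4))
```

In this toy setup the estimates are unbiased in every spectroscopic bin by construction, yet the lowest and highest photometric bins show nonzero mean residuals because the sample is truncated at the edges of the redshift range. This is one simple way in which the two definitions of bias can disagree; the mechanisms studied in the paper may differ.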
