Sacrificing information for the greater good: how to select photometric bands for optimal accuracy

11/17/2015
by Kristoffer Stensbo-Smidt, et al.

Large-scale surveys make huge amounts of photometric data available. Because of the sheer number of objects, spectral data cannot be obtained for all of them. It is therefore important to devise techniques for reliably estimating physical properties of objects from photometric information alone. These estimates are needed to automatically identify interesting objects worth a follow-up investigation, as well as to produce the data required for a statistical analysis of the space covered by a survey. We argue that machine learning techniques are suitable for computing these estimates accurately and efficiently. This study promotes a feature selection algorithm that selects the most informative magnitudes and colours for a given task of estimating physical quantities from photometric data alone. Using k-nearest-neighbours regression, a well-known non-parametric machine learning method, we show that using the selected features significantly increases the accuracy of the estimates compared to using standard features and standard methods. We illustrate the usefulness of the approach by estimating specific star formation rates (sSFRs) and redshifts (photo-z's) using only the broad-band photometry from the Sloan Digital Sky Survey (SDSS). For estimating sSFRs, we demonstrate that our method produces better estimates than traditional spectral energy distribution (SED) fitting. For estimating photo-z's, we show that our method produces more accurate photo-z's than the method employed by SDSS. The study highlights the general importance of performing proper model selection to improve the results of machine learning systems, and shows how feature selection can provide insights into the predictive relevance of particular input features.
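
To make the idea of selecting magnitudes and colours for a k-nearest-neighbours regressor concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract does not state the search strategy, so greedy forward selection scored on a held-out validation set is an assumption. The data are synthetic values in [0, 1] rather than SDSS photometry, and the names `knn_predict`, `validation_mse`, `forward_select` and `make_rows` are hypothetical helpers introduced only for illustration.

```python
import random
from itertools import combinations


def knn_predict(train_X, train_y, query, k):
    """Average the targets of the k training rows closest to `query`."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def validation_mse(train_X, train_y, val_X, val_y, features, k):
    """Mean squared error on the validation rows using only `features`."""
    tr = [[row[j] for j in features] for row in train_X]
    va = [[row[j] for j in features] for row in val_X]
    errors = [(knn_predict(tr, train_y, q, k) - y) ** 2 for q, y in zip(va, val_y)]
    return sum(errors) / len(errors)


def forward_select(train_X, train_y, val_X, val_y, k=5):
    """Greedy forward selection (an assumed strategy, not taken from the paper):
    repeatedly add the feature that lowers validation error the most,
    and stop as soon as no remaining feature improves it."""
    n_features = len(train_X[0])
    selected, best_err = [], float("inf")
    while len(selected) < n_features:
        scored = [
            (validation_mse(train_X, train_y, val_X, val_y, selected + [j], k), j)
            for j in range(n_features)
            if j not in selected
        ]
        err, j = min(scored)
        if err >= best_err:
            break
        selected.append(j)
        best_err = err
    return selected, best_err


def make_rows(n, rng):
    """Synthetic stand-in data: 3 'magnitudes' plus their 3 pairwise 'colours'.
    The target is driven by the colour m0 - m1 plus a little Gaussian noise."""
    X, y = [], []
    for _ in range(n):
        mags = [rng.random() for _ in range(3)]
        colours = [mags[a] - mags[b] for a, b in combinations(range(3), 2)]
        X.append(mags + colours)
        y.append(mags[0] - mags[1] + rng.gauss(0.0, 0.05))
    return X, y


names = ["m0", "m1", "m2"] + [f"m{a}-m{b}" for a, b in combinations(range(3), 2)]
rng = random.Random(0)
train_X, train_y = make_rows(200, rng)
val_X, val_y = make_rows(100, rng)

selected, best_err = forward_select(train_X, train_y, val_X, val_y)
print([names[j] for j in selected], round(best_err, 4))
```

Because the synthetic target depends on a single colour, the first feature picked should be that colour, which mirrors the abstract's point that a well-chosen subset of magnitudes and colours can matter more than using every available band. On real photometry the features would typically need scaling, and the validation split would be replaced by cross-validation.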


