Feature Selection on a Flare Forecasting Testbed: A Comparative Study of 24 Methods

09/30/2021
by Atharv Yeoleka, et al.

The Space-Weather ANalytics for Solar Flares (SWAN-SF) is a multivariate time series benchmark dataset recently created to serve the heliophysics community as a testbed for solar flare forecasting models. SWAN-SF contains 54 unique features, of which 24 are quantitative features computed from the photospheric magnetic field maps of active regions that describe their preceding flare activity. In this study, for the first time, we systematically address the problem of quantifying the relevance of these features to the ambitious task of flare forecasting. We implemented an end-to-end pipeline covering the preprocessing, feature selection, and evaluation phases. We incorporated 24 Feature Subset Selection (FSS) algorithms, including multivariate and univariate, supervised and unsupervised, and wrapper and filter methods. We methodically compared the results of the different FSS algorithms, on both the multivariate time series and the vectorized formats, and tested their correlation and reliability, to the extent possible, by using the selected features for flare forecasting on unseen data, in univariate and multivariate fashions. We conclude our investigation with a report of the best FSS methods in terms of their top-k features and an analysis of the findings. We hope that the reproducibility of our study and the availability of the data will allow future attempts to be compared with our findings and with one another.
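To make the idea of a univariate, supervised filter on the vectorized format concrete, the following is a minimal sketch, not the pipeline used in the study. It assumes a simplified two-class Fisher-style score as a stand-in for an FSS method, mean and standard deviation as the vectorization, and purely synthetic data. The parameter names `informative` and `noise` and all helper functions are hypothetical.

```python
import random
import statistics


def vectorize(mvts):
    """Flatten one multivariate time series (dict: parameter -> list of
    values) into per-parameter summary statistics (an assumed choice)."""
    row = {}
    for name, series in mvts.items():
        row[f"{name}_mean"] = statistics.fmean(series)
        row[f"{name}_std"] = statistics.pstdev(series)
    return row


def fisher_style_score(values, labels):
    """Squared gap between class means over the summed class variances
    (a simplified, unweighted two-class variant)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    between = (statistics.fmean(pos) - statistics.fmean(neg)) ** 2
    within = statistics.pvariance(pos) + statistics.pvariance(neg)
    return between / within if within > 0 else 0.0


def top_k_features(rows, labels, k):
    """Score every vectorized feature independently and keep the k best."""
    names = list(rows[0].keys())
    scores = {n: fisher_style_score([r[n] for r in rows], labels) for n in names}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def make_sample(rng, flaring):
    """Synthetic sample: 'informative' shifts with the label, 'noise' does not."""
    shift = 2.0 if flaring else 0.0
    return {
        "informative": [rng.gauss(shift, 1.0) for _ in range(60)],
        "noise": [rng.gauss(0.0, 1.0) for _ in range(60)],
    }


rng = random.Random(0)
labels = [i % 2 for i in range(40)]  # balanced synthetic labels
rows = [vectorize(make_sample(rng, y)) for y in labels]
ranking = top_k_features(rows, labels, k=2)
print(ranking)
```

A filter of this kind scores each feature independently of any forecasting model, whereas a wrapper would instead train and evaluate a model on candidate subsets. Either way, the resulting top-k list would then be checked by forecasting on unseen data.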

research
05/01/2020

Supervised Feature Subset Selection and Feature Ranking for Multivariate Time Series without Feature Extraction

We introduce supervised feature ranking and feature subset selection alg...
research
03/12/2021

How to Train Your Flare Prediction Model: Revisiting Robust Sampling of Rare Events

We present a case study of solar flare forecasting by means of metadata ...
research
07/19/2021

Topological Attention for Time Series Forecasting

The problem of (point) forecasting univariate time series is considered....
research
11/20/2019

Challenges with Extreme Class-Imbalance and Temporal Coherence: A Study on Solar Flare Data

In analyses of rare-events, regardless of the domain of application, cla...
research
05/20/2019

A Comparative Analysis of Feature Selection Methods for Biomarker Discovery in Study of Toxicant-treated Atlantic Cod (Gadus morhua) Liver

Univariate and multivariate feature selection methods can be used for bi...
research
04/11/2023

The Capacity and Robustness Trade-off: Revisiting the Channel Independent Strategy for Multivariate Time Series Forecasting

Multivariate time series data comprises various channels of variables. T...
research
06/14/2022

Improving Solar Flare Prediction by Time Series Outlier Detection

Solar flares not only pose risks to outer space technologies and astrona...