Pulsars Detection by Machine Learning with Very Few Features
Machine learning (ML) schemes for detecting pulsars are an active research topic, as the data volume of modern surveys grows exponentially. To improve detection performance, the input features of an ML model deserve specific investigation. Existing ML-based pulsar detection studies mainly use two kinds of feature designs: empirical features and statistical features. Owing to the combined effects of multiple features, however, the available features contain redundant and even irrelevant components, which can reduce the accuracy of a pulsar detection model. It is therefore essential to select a subset of relevant features from the set of available candidate features, a process known as feature selection. In this work, two feature selection algorithms, Grid Search (GS) and Recursive Feature Elimination (RFE), are proposed to improve detection performance by removing redundant and irrelevant features. The algorithms were evaluated on the southern High Time Resolution Universe survey (HTRU-S) with five pulsar detection models. The experimental results verify the effectiveness and efficiency of the proposed feature selection algorithms. With GS, a model using only two features reaches a recall as high as 99% and a false positive rate (FPR) as low as 0.65%; with RFE, another model using only three features achieves a recall of 99% and an FPR of 0.16% in pulsar candidate classification. Furthermore, this work investigates the number of features required as well as the pulsars misclassified by our models.
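To make the two selection strategies and the reported metrics concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not spell out how GS works, so the sketch assumes it means exhaustively scoring every feature subset of a given size. RFE is usually driven by model-derived feature rankings; the sketch substitutes a simpler score-based backward elimination. The names `recall_and_fpr`, `grid_search_features`, `backward_elimination`, and the `score_fn` callback are all hypothetical.

```python
from itertools import combinations


def recall_and_fpr(y_true, y_pred):
    """Return (recall, false positive rate) for binary labels, 1 = pulsar."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


def grid_search_features(n_features, k, score_fn):
    """Assumed reading of GS: score every size-k subset and keep the best.

    score_fn is a hypothetical callback that trains and validates a model
    on the given tuple of feature indices and returns a number to maximize.
    """
    best_subset, best_score = None, float("-inf")
    for subset in combinations(range(n_features), k):
        score = score_fn(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


def backward_elimination(n_features, k, score_fn):
    """Score-based stand-in for RFE: drop one feature per round.

    Each round removes the feature whose removal leaves the highest score,
    until only k features remain.
    """
    current = tuple(range(n_features))
    while len(current) > k:
        candidates = [
            tuple(f for f in current if f != dropped) for dropped in current
        ]
        current = max(candidates, key=score_fn)
    return current


if __name__ == "__main__":
    # Toy scorer (an assumption): features 0 and 2 are informative,
    # and every extra feature carries a small penalty.
    def toy_score(subset):
        return len(set(subset) & {0, 2}) - 0.01 * len(subset)

    print(grid_search_features(4, 2, toy_score))
    print(backward_elimination(4, 2, toy_score))
    print(recall_and_fpr([1, 1, 0, 0, 0], [1, 0, 1, 0, 0]))
```

The exhaustive search evaluates every combination, so its cost grows combinatorially with the number of candidate features, whereas the elimination loop needs far fewer evaluations but can miss the best subset. In practice `score_fn` would wrap cross-validated training of one of the detection models.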