Comparing Multi-class, Binary and Hierarchical Machine Learning Classification schemes for variable stars

07/18/2019
by Zafiirah Hosenie, et al.

Upcoming synoptic surveys are set to generate an unprecedented amount of data. This calls for an automated framework that can quickly and efficiently assign classification labels in several new object classification challenges. Using data describing 11 types of variable stars from the Catalina Real-Time Transient Survey (CRTS), we illustrate how to capture the most important information from computed features and describe in detail how to use Information Theory robustly for feature selection and evaluation. We apply three Machine Learning (ML) algorithms and demonstrate how to optimize these classifiers via cross-validation techniques. For the CRTS dataset, we find that the Random Forest (RF) classifier performs best in terms of balanced accuracy and geometric mean. We demonstrate substantially improved classification results by converting the multi-class problem into a binary classification task, achieving a balanced accuracy of ∼99 per cent for the classification of δ-Scuti and Anomalous Cepheids (ACEP). Additionally, we describe how classification performance can be improved by converting a 'flat multi-class' problem into a hierarchical taxonomy. We develop a new hierarchical structure and propose a new set of classification features, enabling the accurate identification of subtypes of Cepheids, RR Lyrae and eclipsing binary stars in CRTS data.
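
The two metrics named in the abstract matter because variable-star classes are typically imbalanced, so plain accuracy can look good while a rare class is mostly missed. The following is a minimal sketch of how such scores are commonly computed, not the authors' code. It assumes the usual definitions: balanced accuracy as the arithmetic mean of per-class recalls and the geometric mean score as the geometric mean of those same recalls; the paper's exact definitions may differ. The function names and the toy labels are hypothetical and are not CRTS results.

```python
from collections import Counter
from math import prod  # requires Python 3.8+


def per_class_recall(y_true, y_pred):
    """Recall per class: correct predictions / true members of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of the per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of the per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


# Toy, made-up labels: a common class and a rare class.
y_true = ["RRab"] * 8 + ["ACEP"] * 2
y_pred = ["RRab"] * 8 + ["RRab", "ACEP"]  # one rare object misclassified

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                        # fraction of all objects correct
print(balanced_accuracy(y_true, y_pred))     # penalizes the missed rare class
print(geometric_mean_score(y_true, y_pred))  # drops to 0 if any class is fully missed
```

In this toy case nine of ten objects are labelled correctly, yet only half of the rare class is recovered, so both class-averaged scores fall well below the plain accuracy.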

research
02/14/2012

Hierarchical Maximum Margin Learning for Multi-Class Classification

Due to myriads of classes, designing accurate and efficient classifiers ...
research
02/27/2020

Imbalance Learning for Variable Star Classification

The accurate automated classification of variable stars into their respe...
research
05/18/2020

Classification of Spam Emails through Hierarchical Clustering and Supervised Learning

Spammers take advantage of email popularity to send indiscriminately uns...
research
08/26/2023

Class Binarization to NeuroEvolution for Multiclass Classification

Multiclass classification is a fundamental and challenging task in machi...
research
08/02/2019

A Visual Technique to Analyze Flow of Information in a Machine Learning System

Machine learning (ML) algorithms and machine learning based software sys...
research
10/08/2013

Feature Selection Strategies for Classifying High Dimensional Astronomical Data Sets

The amount of collected data in many scientific fields is increasing, al...
research
10/26/2022

UnfoldML: Cost-Aware and Uncertainty-Based Dynamic 2D Prediction for Multi-Stage Classification

Machine Learning (ML) research has focused on maximizing the accuracy of...
