A Modulation Layer to Increase Neural Network Robustness Against Data Quality Issues

07/19/2021
by Mohamed Abdelhack, et al.

Data quality is a common problem in machine learning, especially in high-stakes settings such as healthcare. Missing data affects accuracy, calibration, and feature attribution in complex ways. Developers often train models on carefully curated datasets to minimize missing-data bias; however, this reduces the usability of such models in production environments, such as real-time healthcare records. Making machine learning models robust to missing data is therefore crucial for practical application. While some classifiers naturally handle missing data, others, such as deep neural networks, are not designed for unknown values. We propose a novel neural network modification to mitigate the impact of missing data. The approach is inspired by neuromodulation in biological neural networks. Our proposal replaces the fixed weights of a fully connected layer with a function of an additional per-input signal (a reliability score), mimicking the ability of the cortex to up- and down-weight inputs based on the presence of other data. The modulation function is a multi-layer perceptron learned jointly with the main task. We tested our modulating fully connected layer on multiple classification, regression, and imputation problems, and it either improved on or matched the performance of conventional neural network architectures that concatenate reliability to the inputs. Models with modulating layers were also more robust to degradation of data quality when additional missingness was introduced at evaluation time. These results suggest that explicitly accounting for reduced information quality with a modulating fully connected layer can enable the deployment of artificial intelligence systems in real-time settings.
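
To make the idea of reliability-dependent weights concrete, the following is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the abstract does not specify the architecture, so the per-input MLP layout, the tanh activation, the hidden size, and all names (`init_modulator`, `modulated_weights`, `modulated_dense`) are assumptions made for illustration. Training, which the abstract says is done jointly with the main task, is omitted.

```python
import math
import random


def init_modulator(d_in, d_out, hidden=4, seed=0):
    """Hypothetical parameter layout: one tiny MLP per input feature."""
    rnd = random.Random(seed)

    def g():
        return rnd.gauss(0.0, 0.5)

    return {
        "w1": [[g() for _ in range(hidden)] for _ in range(d_in)],
        "b1": [[0.0] * hidden for _ in range(d_in)],
        "w2": [[[g() for _ in range(d_out)] for _ in range(hidden)]
               for _ in range(d_in)],
        "b2": [[g() for _ in range(d_out)] for _ in range(d_in)],
        "bias": [0.0] * d_out,
    }


def modulated_weights(params, i, r_i):
    """Map the reliability score r_i of input i to that input's weight row."""
    h = [math.tanh(r_i * w + b)
         for w, b in zip(params["w1"][i], params["b1"][i])]
    d_out = len(params["bias"])
    return [
        params["b2"][i][o]
        + sum(h[k] * params["w2"][i][k][o] for k in range(len(h)))
        for o in range(d_out)
    ]


def modulated_dense(params, x, r):
    """Forward pass for one example: y = bias + sum_i x_i * w_i(r_i)."""
    y = list(params["bias"])
    for i, (x_i, r_i) in enumerate(zip(x, r)):
        w_i = modulated_weights(params, i, r_i)
        for o in range(len(y)):
            y[o] += x_i * w_i[o]
    return y


params = init_modulator(d_in=3, d_out=2)
x = [1.0, 2.0, 0.0]
# Same feature values, different reliability scores (1.0 = observed, 0.0 = missing).
y_observed = modulated_dense(params, x, r=[1.0, 1.0, 1.0])
y_missing = modulated_dense(params, x, r=[1.0, 0.0, 1.0])
```

In this sketch the same feature vector produces different outputs depending on the reliability scores, because each input's weights are recomputed from its score rather than stored as constants. A conventional baseline of the kind mentioned in the abstract would instead concatenate `r` to `x` and pass both through a layer with fixed weights.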


