
Contextual Classification Using Self-Supervised Auxiliary Models for Deep Neural Networks

by Sebastian Palacio, et al.

Classification problems solved with deep neural networks (DNNs) typically rely on a closed-world paradigm and optimize a single objective (e.g., minimization of the cross-entropy loss). This setup dismisses all kinds of supporting signals that could be used to reinforce the existence or absence of a particular pattern. The increasing need for models that are interpretable by design makes the inclusion of such contextual signals a crucial necessity. To this end, we introduce the notion of Self-Supervised Autogenous Learning (SSAL) models. A SSAL objective is realized through one or more additional targets that are derived from the original supervised classification task, following architectural principles found in multi-task learning. SSAL branches impose low-level priors on the optimization process (e.g., grouping). The ability to use SSAL branches during inference allows models to converge faster, focusing on a richer set of class-relevant features. We show that SSAL models consistently outperform the state of the art while also providing structured predictions that are more interpretable.
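The core mechanism described above, deriving an additional target from the original supervised labels and training on both jointly, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two-group partition of fine classes and the helper name `coarse_labels` are hypothetical, standing in for whatever grouping prior a SSAL branch would impose.

```python
# Hedged sketch: deriving a self-supervised auxiliary target ("grouping")
# from the original supervised labels, in the spirit of a SSAL branch.
# The partition below is an assumed example, not the paper's actual grouping.

def coarse_labels(fine_labels, groups):
    """Map each fine-grained class label to the index of its coarse group."""
    lookup = {c: g for g, cs in enumerate(groups) for c in cs}
    return [lookup[c] for c in fine_labels]

# Example: 6 fine classes partitioned into 2 coarse groups (assumed).
groups = [[0, 1, 2], [3, 4, 5]]
print(coarse_labels([0, 4, 2, 5], groups))  # [0, 1, 0, 1]

# In a multi-task setup, the network would then optimize a joint loss such as
#   L_total = CE(fine_logits, fine_labels) + lambda * CE(coarse_logits, coarse_labels)
# where the coarse head is the auxiliary SSAL branch.
```

Because the auxiliary target is computed deterministically from the existing labels, no extra annotation is required, which is what makes the objective self-supervised with respect to the original task.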

