Exploring new ways: Enforcing representational dissimilarity to learn new features and reduce error consistency

07/05/2023
by Tassilo Wald, et al.

Independently trained machine learning models tend to learn similar features. Given an ensemble of independently trained models, this results in correlated predictions and common failure modes. Previous attempts at decorrelating output predictions or logits have yielded mixed results, in particular because the conflicting optimization objectives reduce individual model accuracy. In this paper, we propose the novel idea of using methods from the field of representational similarity to promote dissimilarity during training, rather than merely measuring the similarity of trained models. To this end, we encourage intermediate representations at different depths to be dissimilar across architectures, with the goal of learning robust ensembles with disjoint failure modes. We show that highly dissimilar intermediate representations result in less correlated output predictions and slightly lower error consistency, yielding higher ensemble accuracy. With this, we shed first light on the connection between intermediate representations and their impact on output predictions.
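As a rough illustration of the idea described in the abstract, the sketch below penalizes the similarity of intermediate representations from two models during training. It is not the authors' code: it assumes a linear-CKA-style similarity measure, flattened (batch, features) activations from paired depths, and a hypothetical trade-off weight; the paper's actual metric, layer pairing, and training setup may differ.

import torch


def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear CKA between two feature matrices of shape (batch, features)."""
    x = x - x.mean(dim=0, keepdim=True)
    y = y - y.mean(dim=0, keepdim=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F), all Frobenius norms
    cross = torch.linalg.norm(y.t() @ x) ** 2
    normalizer = torch.linalg.norm(x.t() @ x) * torch.linalg.norm(y.t() @ y)
    return cross / (normalizer + 1e-8)


def dissimilarity_regularized_loss(logits, targets, feats_a, feats_b, weight=0.1):
    """Task loss plus a penalty that pushes intermediate representations apart.

    feats_a / feats_b: lists of same-depth activations from the two models,
    each flattened to (batch, features). `weight` is a hypothetical
    trade-off hyperparameter, not a value from the paper.
    """
    task_loss = torch.nn.functional.cross_entropy(logits, targets)
    # Average similarity over the paired depths; minimizing it promotes dissimilarity.
    sim = torch.stack([linear_cka(a, b) for a, b in zip(feats_a, feats_b)]).mean()
    return task_loss + weight * sim

Minimizing the similarity term alongside the task loss is what pushes the two models toward dissimilar intermediate features; setting the weight to zero recovers ordinary independent training.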


