Similarity of Neural Architectures Based on Input Gradient Transferability

10/20/2022
by Jaehui Hwang, et al.

In this paper, we aim to design a quantitative similarity function between two neural architectures. Specifically, we define model similarity in terms of input gradient transferability: we generate adversarial samples for two networks and measure each network's average accuracy on the adversarial samples of the other. If the two networks are highly correlated, attack transferability will be high, resulting in a high similarity score. Using this similarity score, we investigate two questions: (1) Which network components contribute to model diversity? (2) How does model diversity affect practical scenarios? We answer the first question through feature importance analysis and clustering analysis. We validate the second question in two scenarios: model ensemble and knowledge distillation. Our findings show that model diversity plays a key role when different neural architectures interact. For example, we find that greater diversity leads to better ensemble performance. We also observe that the effect of the teacher-student relationship on distillation performance depends on the choice of base architecture for the teacher and student networks. We expect our analysis tool to provide a high-level understanding of the differences between neural architectures, as well as practical guidance for working with multiple architectures.
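The similarity measure described above lends itself to a short sketch. Below is a minimal PyTorch illustration, assuming a one-step FGSM attack and a `1 - cross-accuracy` scoring rule; the abstract does not specify the paper's actual attack, hyperparameters, or score normalization, so both are placeholders here, not the authors' exact method.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    """One-step FGSM adversarial examples for `model` (a hypothetical
    stand-in for whichever attack the paper actually uses)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    # Step in the signed-gradient direction, staying in the valid pixel range.
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

@torch.no_grad()
def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

def similarity(model_a, model_b, loader, eps=8 / 255):
    """Similarity of two networks via cross-model attack transferability.

    High accuracy on the other model's adversarial samples means attacks
    do NOT transfer (dissimilar models), so 1 - mean cross-accuracy is
    returned; the paper's exact normalization may differ.
    """
    scores = []
    for x, y in loader:
        adv_a = fgsm_attack(model_a, x, y, eps)  # crafted against model A
        adv_b = fgsm_attack(model_b, x, y, eps)  # crafted against model B
        # Cross-evaluate: how well does each model resist the other's attack?
        scores.append((accuracy(model_b, adv_a) + accuracy(model_a, adv_b)) / 2)
    return 1 - sum(scores) / len(scores)
```

Averaging both transfer directions keeps the score symmetric, so it can serve directly as a pairwise measure for the clustering analysis mentioned above.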


Related research

Multi-level Knowledge Distillation (12/01/2020)
Knowledge distillation has become an important technique for model compr...

FEED: Feature-level Ensemble for Knowledge Distillation (09/24/2019)
Knowledge Distillation (KD) aims to transfer knowledge in a teacher-stud...

Supervision Complexity and its Role in Knowledge Distillation (01/28/2023)
Despite the popularity and efficacy of knowledge distillation, there is ...

Common Knowledge Learning for Generating Transferable Adversarial Examples (07/01/2023)
This paper focuses on an important type of black-box attacks, i.e., tran...

On the Impact of Knowledge Distillation for Model Interpretability (05/25/2023)
Several recent studies have elucidated why knowledge distillation (KD) i...

Sequential Transfer Machine Learning in Networks: Measuring the Impact of Data and Neural Net Similarity on Transferability (03/29/2020)
In networks of independent entities that face similar predictive tasks, ...

Generation and Consolidation of Recollections for Efficient Deep Lifelong Learning (11/17/2017)
Deep lifelong learning systems need to efficiently manage resources to s...
