NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks

12/21/2018
by Fahim Dalvi, et al.

We present a toolkit to facilitate the interpretation and understanding of neural network models. The toolkit provides several methods to identify salient neurons with respect to the model itself or an external task. A user can visualize selected neurons, ablate them to measure their effect on model accuracy, and manipulate them to control the behavior of the model at test time. Such an analysis has the potential to serve as a springboard for various research directions, such as understanding the model, making better architectural choices, distilling models, and controlling data biases.
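As a rough illustration of the ablation idea described in the abstract, the following minimal sketch ranks neurons by saliency with respect to an external task, zeroes out the top-ranked ones, and compares accuracy before and after. It does not use the toolkit's API: the function names (`rank_neurons`, `ablate`, `accuracy`), the synthetic activations, the fixed linear classifier, and the correlation-based ranking are all assumptions chosen for illustration, and the toolkit's own ranking methods may differ.

```python
import numpy as np

def rank_neurons(acts, labels):
    """Rank neurons by |Pearson correlation| with a binary label, most salient first.
    (Hypothetical stand-in for a saliency method; not the toolkit's own ranking.)"""
    centered = acts - acts.mean(axis=0)
    y = labels - labels.mean()
    cov = centered.T @ y
    denom = np.linalg.norm(centered, axis=0) * np.linalg.norm(y) + 1e-12
    return np.argsort(-np.abs(cov / denom))

def ablate(acts, neuron_ids):
    """Return a copy of the activations with the chosen neurons set to zero."""
    out = acts.copy()
    out[:, neuron_ids] = 0.0
    return out

def accuracy(acts, labels, weights):
    """Accuracy of a fixed linear classifier that predicts 1 when acts @ weights > 0."""
    preds = (acts @ weights > 0).astype(int)
    return float((preds == labels).mean())

# Synthetic setup (assumption): 20 "neurons", only neurons 3 and 7 drive the label.
rng = np.random.default_rng(0)
acts = rng.normal(size=(500, 20))
weights = np.zeros(20)
weights[[3, 7]] = [2.0, -1.5]
labels = (acts @ weights > 0).astype(int)

ranking = rank_neurons(acts, labels)
base_acc = accuracy(acts, labels, weights)
top_acc = accuracy(ablate(acts, ranking[:2]), labels, weights)      # ablate most salient
control_acc = accuracy(ablate(acts, ranking[-2:]), labels, weights)  # ablate least salient

print("top-2 neurons:", sorted(ranking[:2].tolist()))
print("accuracy: base", base_acc, "| top-2 ablated", top_acc, "| bottom-2 ablated", control_acc)
```

In this toy setting, ablating the two most salient neurons removes the signal the classifier relies on, while ablating the two least salient neurons serves as a control that leaves accuracy unchanged.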

research
08/30/2021

Neuron-level Interpretation of Deep NLP Models: A Survey

The proliferation of deep neural networks in various domains has seen an...
research
12/21/2018

What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models

Despite the remarkable evolution of deep neural networks in natural lang...
research
05/26/2023

NeuroX Library for Neuron Analysis of Deep NLP Models

Neuron analysis provides insights into how knowledge is structured in re...
research
10/05/2016

LAYERS: Yet another Neural Network toolkit

Layers is an open source neural network toolkit aimed at providing an easy...
research
07/18/2022

MRCLens: an MRC Dataset Bias Detection Toolkit

Many recent neural models have shown remarkable empirical results in Mac...
research
04/20/2022

Inferring ice sheet damage models from limited observations using CRIKit: the Constitutive Relation Inference Toolkit

We examine the prospect of learning ice sheet damage models from observa...
research
09/15/2023

Sparse Autoencoders Find Highly Interpretable Features in Language Models

One of the roadblocks to a better understanding of neural networks' inte...
