Seeing in Words: Learning to Classify through Language Bottlenecks

06/29/2023
by Khalid Saifullah, et al.

Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks. In contrast, humans can explain their predictions using succinct and intuitive descriptions. To incorporate explainability into neural networks, we train a vision model whose feature representations are text. We show that such a model can effectively classify ImageNet images, and we discuss the challenges we encountered when training it.
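To make the idea of a text-valued feature representation concrete, here is a minimal sketch (not the authors' code) of a language-bottleneck classifier: an image encoder emits a short sequence of discrete word tokens, and a text-only classifier predicts the label from those tokens. The module names, vocabulary size, sequence length, and the straight-through Gumbel-softmax relaxation are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE = 10000   # word vocabulary forming the bottleneck (assumed)
SEQ_LEN = 16         # number of bottleneck words per image (assumed)
NUM_CLASSES = 1000   # ImageNet classes

class ImageToWords(nn.Module):
    """Maps an image to logits over words at each bottleneck position."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in for any CNN/ViT backbone
            nn.Conv2d(3, 64, 7, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_words = nn.Linear(64, SEQ_LEN * VOCAB_SIZE)

    def forward(self, images):
        feats = self.backbone(images)
        return self.to_words(feats).view(-1, SEQ_LEN, VOCAB_SIZE)

class WordsToLabel(nn.Module):
    """Predicts the class label from the bottleneck words alone."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, 128)
        self.head = nn.Linear(128, NUM_CLASSES)

    def forward(self, word_probs):
        # word_probs: (batch, SEQ_LEN, VOCAB_SIZE), (soft) one-hot vectors
        embedded = word_probs @ self.embed.weight   # differentiable word lookup
        return self.head(embedded.mean(dim=1))

encoder, classifier = ImageToWords(), WordsToLabel()
images = torch.randn(2, 3, 224, 224)               # dummy batch
logits = encoder(images)
# Straight-through Gumbel-softmax keeps the bottleneck discrete at inference
# time while still letting gradients flow through it during training.
words = F.gumbel_softmax(logits, tau=1.0, hard=True)
class_logits = classifier(words)
print(class_logits.shape)  # torch.Size([2, 1000])
```

The key design point this sketch illustrates is that the classifier never sees the image features directly; all information must pass through the discrete word sequence, which is what makes the intermediate representation human-readable.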

