Learning Semantically Coherent and Reusable Kernels in Convolution Neural Nets for Sentence Classification

08/01/2016
by Madhusudan Lakshmana et al.

State-of-the-art CNN models perform well on sentence classification tasks. This work empirically studies desirable properties of CNNs in these tasks, such as semantic coherence, attention mechanisms, and reusability. Semantically coherent kernels are preferable because they are far more interpretable when explaining the decisions of a learned CNN model. We observe that the learned kernels lack semantic coherence. Motivated by this observation, we propose to learn kernels with semantic coherence using a clustering scheme combined with Word2Vec representations and domain knowledge such as SentiWordNet. We also suggest a technique for visualizing the attention mechanism of CNNs to explain their decisions. The reusability property enables kernels learned on one problem to be applied to another, making learning efficient since only a few additional domain-specific filters may need to be learned. We demonstrate the efficacy of our core ideas, learning semantically coherent kernels and leveraging reusable kernels for efficient learning, on several benchmark datasets. Experimental results show the usefulness of our approach, achieving performance close to state-of-the-art methods while retaining semantic coherence and reusability.
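The clustering idea in the abstract can be illustrated with a minimal sketch: cluster pretrained word embeddings with k-means, so that each centroid can seed one semantically coherent convolution kernel. This is a hypothetical illustration of the general approach, not the authors' exact procedure; the function name, toy data, and parameters are assumptions.

```python
import numpy as np

def coherent_kernel_seeds(word_vecs, n_kernels, n_iter=20, seed=0):
    """Cluster word embeddings with plain k-means; each centroid can
    then initialize one convolution kernel. A sketch of the paper's
    clustering idea, not the authors' exact method."""
    rng = np.random.default_rng(seed)
    X = np.asarray(word_vecs, dtype=float)
    # Start from randomly chosen word vectors as initial centroids.
    centroids = X[rng.choice(len(X), n_kernels, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign each word vector to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned vectors.
        for k in range(n_kernels):
            members = X[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids, labels

# Toy example: two well-separated groups standing in for word vectors
# of two semantic fields (real usage would load Word2Vec embeddings).
rng = np.random.default_rng(1)
vecs = np.vstack([rng.normal(0.0, 0.1, (5, 4)),
                  rng.normal(5.0, 0.1, (5, 4))])
seeds, labels = coherent_kernel_seeds(vecs, n_kernels=2)
```

In a full model, each centroid would initialize one kernel of a sentence-classification CNN, so that every kernel starts out responding to a semantically related group of words rather than an arbitrary direction in embedding space.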


