Efficient Representation for Natural Language Processing via Kernelized Hashcodes

11/10/2017

by Sahil Garg et al.

Kernel similarity functions have been successfully applied in classification models such as Support Vector Machines, Gaussian Processes, and k-Nearest Neighbors (kNN), but they are computationally expensive for Natural Language Processing (NLP) tasks because of the cost of computing kernel similarities between discrete natural language structures. A well-known technique, Kernelized Locality-Sensitive Hashing (KLSH), allows approximate computation of kNN graphs and significantly reduces the number of kernel computations; however, applying KLSH to other classifiers has not been explored. In this paper, we propose to use random subspaces of KLSH codes to construct an efficient representation that preserves the fine-grained structure of the data and is suitable for general classification methods. Further, we propose an approach for optimizing the KLSH model for supervised classification problems by maximizing a variational lower bound on the mutual information between the KLSH codes (feature vectors) and the class labels. We apply the proposed approach to the task of extracting information about bio-molecular interactions from the semantic parsing of scientific papers. Our empirical results on a variety of datasets demonstrate significant improvements over the state of the art.
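The core idea of KLSH is that binary hash bits can be computed from kernel similarities to a small set of reference examples, so that nearby points in the kernel-induced feature space tend to receive similar codes. The following is a minimal sketch of the Kulis-Grauman KLSH construction, using a plain RBF kernel over dense vectors as a stand-in for the paper's convolution kernels over discrete language structures; the function names and the choice of kernel are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Pairwise RBF kernel; a stand-in for a structural NLP kernel.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_klsh(X_ref, n_bits=16, t=4, seed=0):
    """Build KLSH hash weights from the kernel matrix of reference points."""
    rng = np.random.default_rng(seed)
    p = len(X_ref)
    K = rbf_kernel(X_ref, X_ref)
    # Center the kernel matrix and form K^{-1/2} via an eigendecomposition.
    Kc = K - K.mean(0) - K.mean(1)[:, None] + K.mean()
    eigval, eigvec = np.linalg.eigh(Kc)
    eigval = np.clip(eigval, 1e-8, None)
    K_inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    # Each hash bit: a random t-subset of references defines a hyperplane
    # in the (implicit) kernel feature space.
    W = np.zeros((n_bits, p))
    for b in range(n_bits):
        idx = rng.choice(p, size=t, replace=False)
        e = np.zeros(p)
        e[idx] = 1.0
        W[b] = K_inv_sqrt @ e
    return W

def hash_codes(X, X_ref, W):
    # A point's code needs only its kernel similarities to the references.
    K_x = rbf_kernel(X, X_ref)
    return (K_x @ W.T > 0).astype(int)
```

Because hashing a new point requires only p kernel evaluations (one per reference example) rather than comparisons against the full training set, the resulting binary codes, or random subspaces of their bits, can then be fed to a general-purpose classifier.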


