Binary Paragraph Vectors

11/03/2016
by   Karol Grzegorczyk, et al.
0

Recently Le & Mikolov described two log-linear models, called Paragraph Vector, that can be used to learn state-of-the-art distributed representations of documents. Inspired by this work, we present Binary Paragraph Vector models: simple neural networks that learn short binary codes for fast information retrieval. We show that binary paragraph vectors outperform autoencoder-based binary codes, despite using fewer bits. We also evaluate their precision in transfer learning settings, where binary codes are inferred for documents unrelated to the training corpus. Results from these experiments indicate that binary paragraph vectors can capture semantics relevant for various domain-specific documents. Finally, we present a model that simultaneously learns short binary codes and longer, real-valued representations. This model can be used to rapidly retrieve a short list of highly relevant documents from a large document collection.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/07/2019

Vector representations of text data in deep learning

In this dissertation we report results of our research on dense distribu...
research
02/17/2023

Binary Embedding-based Retrieval at Tencent

Large-scale embedding-based retrieval (EBR) is the cornerstone of search...
research
08/20/2021

Additive Polycyclic Codes over 𝔽_4 Induced by Binary Vectors and Some Optimal Codes

In this paper we study the structure and properties of additive right an...
research
06/24/2019

Binary Stochastic Representations for Large Multi-class Classification

Classification with a large number of classes is a key problem in machin...
research
02/18/2018

Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors

Rapid advances in GPU hardware and multiple areas of Deep Learning open ...
research
06/02/2021

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

Learning binary representations of instances and classes is a classical ...
research
03/18/2016

Document Neural Autoregressive Distribution Estimation

We present an approach based on feed-forward neural networks for learnin...

Please sign up or login with your details

Forgot password? Click here to reset