1 Introduction
Embedding methods, such as word embeddings Mikolov et al. (2013); Pennington et al. (2014), have become pillars of many applications that learn from discrete structures. Examples include language modeling Kim et al. (2016), machine translation Sennrich et al. (2015), text classification Zhang et al. (2015), knowledge graph and social network modeling Bordes et al. (2013), and many others Chen et al. (2016). The objective of the embedding module in neural networks is to represent a discrete symbol, such as a word or an entity, with a continuous embedding vector.
At first glance this seems a trivial problem, in which we can directly associate each symbol with a learnable embedding vector, as is done in existing work. To retrieve the embedding vector of a specific symbol, an embedding table lookup is performed. This is equivalent to the following: first we encode each symbol with a "one-hot" encoding vector b ∈ {0, 1}^N (where N is the total number of symbols); then, to generate the embedding vector, we simply multiply the "one-hot" vector by the embedding matrix W ∈ R^{N×d}, i.e. v = bW.

Despite its simplicity, this "one-hot" encoding based embedding approach has several issues. The major issue is that the number of parameters grows linearly with the number of symbols. This becomes very challenging when there are millions or billions of entities in the database, or when there are many symbols with only a few observations each (e.g. under Zipf's law). There is also redundancy in the parameterization, since many symbols may actually be similar to each other. This overparameterization can lead to overfitting, and it also requires a lot of memory, which prevents the model from being deployed to mobile devices. Another issue is purely from the code space utilization perspective, where we find "one-hot" encoding extremely inefficient: its code space utilization rate is almost zero, as log2(N)/N → 0, while log2(N) bits/dimensions of code can effectively represent N symbols.
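The equivalence between an embedding table lookup and the one-hot matrix product can be checked in a few lines (a minimal sketch with arbitrary toy sizes):

```python
import numpy as np

N, d = 5, 3                      # 5 symbols, embedding size 3
rng = np.random.default_rng(0)
W = rng.standard_normal((N, d))  # embedding matrix, one row per symbol

def onehot(i, n):
    b = np.zeros(n)
    b[i] = 1.0
    return b

# Table lookup (row indexing) and the one-hot product give the same vector.
i = 2
assert np.allclose(W[i], onehot(i, N) @ W)
```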
To address these issues, we propose a novel and much more compact coding scheme to replace the "one-hot" encoding. In the proposed approach, we use a K-way D-dimensional code to represent each symbol, where each code has D dimensions and each dimension has a cardinality of K. For example, the concept of cat may be encoded as (5-1-3-7), and the concept of dog may be encoded as (5-1-3-9). The code allocation for each symbol is learned from data so that the codes capture the semantics of symbols, and similar codes may reflect similar meanings. We dub the proposed encoding scheme "KD encoding".
The KD code system is much more compact than its "one-hot" counterpart. To represent a set of symbols of size N, the "KD encoding" only requires K^D ≥ N, i.e. D ≥ log_K N. By increasing K or D by a small amount, we can easily achieve K^D ≫ N, in which case the code is still much more compact. Consider K = 2: the utilization rate of the "KD encoding" is log2(N)/D, which is N/D times more compact than the "one-hot" counterpart.[1]

[1] For example, with a vocabulary of size N and the number of dimensions D set proportional to log2(N), the code is on the order of N/log2(N) times more efficient.
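A quick sanity check of the code-space arithmetic above, using hypothetical values of N, K, and D:

```python
import math

N = 1_000_000                       # hypothetical vocabulary size
K, D = 32, 4                        # a K-way D-dimensional code
assert K ** D >= N                  # 32^4 = 1,048,576 codes cover N symbols

# The minimal D for a given K is ceil(log_K N):
D_min = math.ceil(math.log(N) / math.log(K))
assert D_min == 4

# A one-hot code needs N dimensions; the KD code needs only D of them.
assert N / D == 250_000             # dimension reduction factor
```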
The compactness of the code translates into compactness of the parameterization. Dropping the giant embedding matrix that stores one vector per symbol, the symbol embedding is instead generated by composing a much smaller number of code embedding vectors. This is achieved as follows: first we embed each KD code into a sequence of D code embedding vectors, and then apply a transformation f, which can be based on neural networks, to generate the final symbol embedding. In order to learn meaningful discrete codes that exploit the similarities among symbols, we derive a relaxed discrete optimization algorithm based on stochastic gradient descent (SGD). By adopting the new approach, we reduce the number of embedding parameters from O(N d) to O(K D d_c + C), where d_c is the code embedding size and C is the number of neural network parameters. To validate our idea, we conduct experiments on both synthetic data and a real language modeling task. We achieve a 97% reduction of embedding parameters in the language modeling task while obtaining similar or better performance.
2 The K-way D-dimensional Discrete Encoding
In this section we introduce the "KD encoding" in detail. Specifically, we present methods to generate a symbol embedding from its (given or learned) "KD code", as well as techniques for learning the "KD code" from data.
2.1 The “KD encoding” Framework
In the proposed framework, each symbol is associated with a K-way D-dimensional discrete code. We denote each symbol by s ∈ S, where S is the set of symbols with cardinality N, and each discrete code by c = (c^1, ..., c^D) ∈ C, where each code dimension c^j takes one of K values, i.e. c^j ∈ {1, ..., K}. To connect symbols with discrete codes, a mapping function φ : S → C is used. The learning of this mapping function is introduced later; once fixed, it can be stored as a hash table for fast lookup.
Given the i-th symbol s_i, we retrieve its code via a code lookup, c_i = φ(s_i). The final embedding is generated by first embedding the code into a sequence of code embedding vectors (e_1, ..., e_D), where e_j = W^j_{c^j}, and then applying a differentiable transformation function f, which is learned as well. We introduce the transformation function in the next subsection. Here W^j is the embedding matrix for the j-th code dimension. The overall framework is illustrated in Figure 1.
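The pipeline above (code lookup, per-dimension code embedding, then a transformation f) can be sketched as follows; the sizes, the toy code table, and the use of a plain linear map as f are illustrative assumptions:

```python
import numpy as np

K, D, d_c, d = 10, 4, 6, 10   # hypothetical sizes: 10-way 4-dimensional codes
rng = np.random.default_rng(0)

# One code-embedding matrix per code dimension: W[j] is K x d_c.
W = rng.standard_normal((D, K, d_c))
# A linear map H standing in for the transformation f (an LSTM could be used instead).
H = rng.standard_normal((D * d_c, d))

# Toy code table reusing the cat/dog codes from the introduction.
code_table = {"cat": (5, 1, 3, 7), "dog": (5, 1, 3, 9)}

def embed(symbol):
    code = code_table[symbol]                              # code lookup
    e = np.concatenate([W[j, code[j]] for j in range(D)])  # per-dimension embedding
    return e @ H                                           # transformation f

v_cat, v_dog = embed("cat"), embed("dog")
assert v_cat.shape == (d,) and not np.allclose(v_cat, v_dog)
```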
In order to uniquely identify every symbol, we only need K^D ≥ N, as we can then assign a unique code to each symbol. When K^D = N, the code space is fully utilized, and no symbol can change its code without affecting another symbol. We call this type of code system a compact code. The optimization problem for compact codes can be very difficult, and usually requires approximate combinatorial algorithms such as graph matching Li et al. (2016). Opposite to the compact code is the redundant code system, where K^D ≫ N. Here there is a lot of "empty" code space with no symbol correspondence, so changing the code of one symbol will likely not affect other symbols, since the random collision probability is very small,[2] which makes the code easier to optimize. A redundant code can be achieved by slightly increasing K or D, thanks to the exponential nature of their relation. Hence, for both compact and redundant codes, we have D on the order of log_K N.

[2] For example, we can set K = 100 and D = 10 for a billion symbols; under a random code assignment, the probability of no collision at all is 99.5%.

2.2 Discrete Code Embedding
Since a discrete code has multiple dimensions, we cannot directly use a single embedding lookup to find the symbol embedding as in "one-hot" encoding. Instead, we first map each code into a sequence of code embedding vectors via per-dimension code lookups, and then use a function f that transforms the code embedding vectors into the final symbol embedding vector.
As mentioned above, we associate an embedding matrix W^j ∈ R^{K×d_c} with the j-th dimension of the discrete code. This enables us to turn a discrete code c into a sequence of code embedding vectors (e_1, ..., e_D), where e_j = W^j_{c^j}.
Now, to generate the final embedding vector v, a transformation function f is applied. In this work we consider two types of embedding transformation functions. The first is a linear transformation,

v = H [e_1; e_2; ...; e_D],

where H ∈ R^{d × D d_c} is the linear transformation matrix and [·; ·] denotes concatenation. While this is simple, due to its linear nature the capacity of the generated symbol embeddings can be limited. This motivates us to adopt a nonlinear transformation function based on a recurrent neural network, LSTM
Hochreiter and Schmidhuber (1997), in particular. Assuming the code embedding dimension equals the LSTM hidden dimension, the formulation is as follows:

i_j = σ(W_i e_j + U_i h_{j−1} + b_i)
f_j = σ(W_f e_j + U_f h_{j−1} + b_f)
o_j = σ(W_o e_j + U_o h_{j−1} + b_o)
m_j = f_j ⊙ m_{j−1} + i_j ⊙ tanh(W_m e_j + U_m h_{j−1} + b_m)
h_j = o_j ⊙ tanh(m_j)

where σ(·) and tanh(·) are, respectively, the standard sigmoid and tanh activation functions. Note that the symbol index is omitted for simplicity. The final symbol embedding is computed by summing the LSTM outputs over all code dimensions (with a linear transformation to match dimensions if d ≠ d_c), i.e. v = Σ_{j=1}^{D} h_j.

Lemma 1.
The number of embedding parameters used in the KD encoding is O(K D d_c + C), where C is the number of parameters of the neural network.
The proof is straightforward. There are two types of embedding parameters in the KD encoding: (1) code embedding vectors and (2) neural network parameters. There are K × D code embedding vectors, each with d_c dimensions. As for the number of parameters C in the neural network (the LSTM), it can be treated as constant with respect to the number of symbols, since C is independent of N, provided that certain structure is present in the symbol embeddings. For example, if we assume the symbol embeddings lie within ε-balls around a finite number of centroids in d-dimensional space, only a constant C is required to achieve an ε distance error bound, regardless of the vocabulary size, since the neural network only has to memorize the finite set of centroids.
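The parameter counts in Lemma 1 are easy to check numerically (all sizes hypothetical):

```python
# Embedding parameter counts, ignoring the transformation net of size C.
N, d = 1_000_000, 300        # vocabulary and symbol-embedding sizes (hypothetical)
K, D, d_c = 32, 4, 300       # KD configuration with code-embedding size d_c

onehot_params = N * d        # O(N d): one d-vector per symbol
kd_params = K * D * d_c      # O(K D d_c): K vectors per code dimension

assert onehot_params == 300_000_000
assert kd_params == 38_400   # several thousand times fewer, before adding C
```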
2.3 Discrete Code Learning
The code assignment is important for both parameterization efficiency and generalization, so we want to learn the code allocation function end-to-end from data, in contrast to the hand-coded "one-hot" encoding. In this work, we assume that pretrained embedding vectors v_s ∈ R^d are given for each symbol s, and we learn the discrete codes based on them. Once the codes are learned, we relearn the code embedding parameters, including the transformation function, for the specific task. In the future, we will extend this to the case where such embeddings are not available.
To find the optimal codes, we minimize the squared loss between the given embedding vector and the embedding vector generated from the KD code. This yields the following objective:

argmin Σ_{s ∈ S} || v_s − f(φ(s)) ||²   (1)

where f is a differentiable transformation function (composed with the code embedding lookup) as introduced above.
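Objective (1) can be sketched concretely as follows; the toy sizes, the random code assignment, and the use of the linear transformation as f are illustrative assumptions:

```python
import numpy as np

K, D, d_c, d, N = 4, 3, 5, 8, 20   # hypothetical toy sizes
rng = np.random.default_rng(0)

V = rng.standard_normal((N, d))          # given pretrained symbol embeddings
codes = rng.integers(0, K, size=(N, D))  # current discrete code assignment
W = rng.standard_normal((D, K, d_c))     # code embedding matrices
H = rng.standard_normal((D * d_c, d))    # linear transformation f

def kd_embed(code):
    # Embedding generated from a KD code: per-dimension lookup, then f.
    return np.concatenate([W[j, code[j]] for j in range(D)]) @ H

# Squared reconstruction loss of Eq. (1), summed over all symbols.
loss = sum(np.sum((V[i] - kd_embed(codes[i])) ** 2) for i in range(N))
assert loss > 0 and np.isfinite(loss)
```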
Since each code is discrete, it cannot be directly optimized via stochastic gradient descent like the other parameters, so we need a relaxation in order to learn it effectively via SGD. We observe that each code can be seen as a concatenation of D "one-hot" vectors, i.e. c = (o_1, ..., o_D), where o_j ∈ {0, 1}^K and Σ_k o_j^{(k)} = 1, with o_j^{(k)} being the k-th component of o_j. We can adjust o_j in order to update the code, but it is still non-differentiable. To address this, we relax each o_j from a "one-hot" vector to a continuous vector by applying a tempering softmax:

o_j^{(k)} = exp(π_j^{(k)} / τ) / Σ_{k'} exp(π_j^{(k')} / τ)
where τ is a temperature term; as τ → 0, this approximation becomes exact (except in the case of ties). Similar techniques have been applied in Gumbel-Softmax Jang et al. (2016); Maddison et al. (2016). We show the effects of the temperature in Figure 2.
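A minimal implementation of the tempering softmax, showing that a low temperature yields a nearly one-hot vector while a high temperature yields a nearly uniform one (the logits are arbitrary):

```python
import numpy as np

def tempered_softmax(logits, tau):
    z = logits / tau
    z = z - z.max()                   # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

pi = np.array([1.0, 3.0, 2.0])        # code logits for one dimension, K = 3
hot = tempered_softmax(pi, tau=0.01)  # low temperature: nearly one-hot
warm = tempered_softmax(pi, tau=20.0) # high temperature: nearly uniform

assert hot.argmax() == 1 and hot[1] > 0.999
assert abs(warm.max() - warm.min()) < 0.1
```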
To learn the relaxed code logits π, we gradually decrease the temperature τ during training. When τ is not small enough, o_j is still a smooth vector, so we use the linear combination o_j^T W^j, instead of indexing W^j by argmax_k o_j^{(k)}, to generate the embedding vector for the j-th code dimension.

Note that the tempering softmax approximation is only usefully differentiable when τ is not too small: the gradient vanishes as τ → 0. So at the beginning, when τ is not small enough, we are actually learning continuous codes rather than discrete codes, which may not be desirable; and once τ becomes small enough that we start to learn truly discrete codes, the small τ in turn prevents the codes from further updates, as it makes the gradient vanish.
To address this issue, we take inspiration from the Straight-Through Estimator Bengio et al. (2013). In the forward pass, instead of using the tempering softmax output, which is likely a smooth continuous vector, we take its maximum and turn it into a "one-hot" vector,

ô_j = one_hot(argmax_k o_j^{(k)}),

which corresponds to an exactly discrete code. Using the straight-through estimator is equivalent to using different temperatures in the forward and backward passes: in the forward pass, τ → 0 is used, for which we simply take the argmax; in the backward pass (to compute the gradient), we pretend that a larger τ was used. Although this is a biased gradient estimator, the sign of the gradient remains correct. Compared to using the same temperature in both passes, this always outputs a "one-hot" discrete code, and there is no vanishing-gradient problem as long as the backward temperature does not approach zero.
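The straight-through trick above amounts to a hard argmax in the forward pass and a tempered-softmax Jacobian in the backward pass; a self-contained numpy sketch, with an assumed backward temperature of 1:

```python
import numpy as np

def softmax(z):
    p = np.exp(z - z.max())
    return p / p.sum()

def ste_forward(logits):
    """Forward pass: exact discrete code (tau -> 0), i.e. argmax as one-hot."""
    hard = np.zeros_like(logits)
    hard[logits.argmax()] = 1.0
    return hard

def ste_backward(logits, grad_out, tau_back=1.0):
    """Backward pass: pretend a softmax with temperature tau_back was used."""
    p = softmax(logits / tau_back)
    jac = (np.diag(p) - np.outer(p, p)) / tau_back  # softmax Jacobian
    return jac @ grad_out

pi = np.array([1.0, 3.0, 2.0])
out = ste_forward(pi)
grad = ste_backward(pi, grad_out=np.array([0.0, 1.0, 0.0]))

assert out.tolist() == [0.0, 1.0, 0.0]   # output is always a discrete one-hot
assert np.abs(grad).sum() > 0            # gradient does not vanish
```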
The training procedure is summarized in Algorithm 1, in which the stop_gradient operator will prevent the gradient from backpropagating through it.
3 Experiments
In this section we present both real and synthetic experiments to validate the proposed approach. The first set of experiments is based on language modeling, a fundamental task in NLP that can be formulated as predicting the probability of a sequence of words. Models based on recurrent neural networks with word embeddings Mikolov et al. (2010); Kim et al. (2016) achieve state-of-the-art results, so we base our experiments on them. We use the widely adopted English Penn Treebank Marcus et al. (1993) dataset, which contains 1M words with a vocabulary size of 10K; the training/validation/test split follows the convention of Mikolov et al. (2010). We utilize a standard LSTM Hochreiter and Schmidhuber (1997) with two different model sizes, which trade off model size and accuracy. The larger model has word embedding and LSTM hidden sizes of 1500; the corresponding number for the smaller model is 200. A temperature schedule that decreases τ as a function of the iteration number is used to train the code. We first train the model regularly using the conventional embedding approach to obtain the embedding vectors, which are then used to learn the discrete codes. Once the discrete codes are obtained and fixed, we retrain the model, with the same architecture and hyperparameters, for the code embeddings from scratch.
Table 1 shows the performance of the conventional "one-hot" word embedding against the proposed KD encoding. We present several variants of the KD encoding scheme, distinguished by the combination of (1) the discrete code learning model and (2) the symbol embedding relearning/retraining model. For discrete code learning, we have three cases: random assignment, codes learned with a linear transformation, and codes learned with an LSTM transformation function; the latter two can also be used as the symbol embedding relearning model. First, we observe that discrete code learning is critical for KD encoding, as random discrete codes produce much worse performance. Second, we observe that with appropriate code learning, the test perplexity is similar to or better than in the "one-hot" encoding case, while saving 82%–97% of embedding parameters.


Table 1: Test perplexity (PPL), embedding parameter size (E. Size), and compression rate (C. Rate) for the small and large models.

| Method (code learning + relearning) | Small PPL | E. Size | C. Rate | Large PPL | E. Size | C. Rate |
|---|---|---|---|---|---|---|
| Conventional | 114.53 | 2M | 1 | 84.04 | 15M | 1 |
| Random + Linear | 144.32 | 0.1M | 0.05 | 103.44 | 0.4M | 0.033 |
| Random + LSTM | 147.13 | 0.37M | 0.185 | 119.62 | 0.63M | 0.042 |
| Linear + Linear | 118.40 | 0.1M | 0.05 | 87.42 | 0.4M | 0.033 |
| Linear + LSTM | 111.13 | 0.37M | 0.185 | 88.82 | 0.63M | 0.042 |
| LSTM + Linear | 117.21 | 0.1M | 0.05 | 84.61 | 0.4M | 0.033 |
| LSTM + LSTM | 111.31 | 0.37M | 0.185 | 85.37 | 0.63M | 0.042 |
We also vary K and D to see how they affect performance. As shown in Figures 2(a) and 2(b), small K or D may harm performance (even when K^D ≥ N is satisfied), which suggests that a redundant code may be easier to learn.
To understand the effects of the temperature, and the importance of using discrete code output (i.e., with zero temperature), we create another set of experiments based on synthetic embedding clusters. We generate 10K points that belong to 100 well separated clusters in 10-dimensional space, and use D = 1 with K = 100, which mimics the K-means clustering problem since each code represents a cluster assignment. Both the squared loss and the clustering NMI are shown in Figures 2(c) and 2(d). We observe that the STE with temperature scheduling is much more effective than its counterparts. When the temperature is kept constant, some percentage of codes keeps changing, and the loss as well as the NMI converge to a worse local optimum. When smooth continuous codes are used instead of discrete codes, we observe that the loss first decreases and then increases; this is because its behavior mimics the discrete code output only when the temperature is small enough.

To further inspect the learned codes, we use pretrained GloVe embeddings Pennington et al. (2014), which have better coverage and quality than embeddings pretrained on PTB language modeling. We intentionally use K = 6 and D = 4 (a code space of 1296) for a vocabulary of 10K, so that the model is forced to collide words. Table 2 shows codes learned from the GloVe vectors, demonstrating that similar discrete codes are learned for semantically similar words.


Table 2: Examples of learned codes and the words assigned to them.

| Code | Words |
|---|---|
| 3103 | up when over into time back off set left open half behind quickly starts |
| 3104 | week tuesday wednesday monday thursday friday sunday saturday |
| 3105 | by were after before while past ago close soon recently continued meanwhile |
| 3111 | year month months record fall annual target cuts |

4 Related Work
The idea of using a more efficient coding system dates back to information theory, e.g. error-correcting codes Hamming (1950) and Huffman codes Huffman (1952). However, most embedding techniques, such as word embedding Mikolov et al. (2013); Pennington et al. (2014) and entity embedding Chen et al. (2016), use "one-hot" encoding along with a usually large embedding matrix. Recent work Kim et al. (2016); Sennrich et al. (2015); Zhang et al. (2015) explores character- or subword-based embedding models instead of word embedding models, with good results. However, in those cases the characters or subwords are fixed and given a priori by the language; they may thus carry little semantic meaning and are not available for other types of data. In contrast, we learn the code assignment function from data, and use a fixed length for the code.
The compression of neural networks Han et al. (2015a, b); Chen et al. (2015) has become an important topic, as the number of parameters can be too large and become a bottleneck for deploying models to mobile devices. Our work can also be seen as a way to compress the embedding layer of a neural network. Most existing network compression techniques focus on layers that are shared by all examples, whereas in our setting only one or a few symbols' embeddings are accessed in the embedding layer at a time.
5 Conclusions and Future Work
In this paper, we propose a novel K-way D-dimensional discrete encoding scheme to replace the "one-hot" encoding. The new coding system significantly improves the efficiency of the parameterization, and the reduction in parameters can also mitigate overfitting. To learn semantically meaningful codes, we derive a relaxed discrete optimization technique based on SGD. In our language modeling experiments, the number of free embedding parameters is reduced by 97% while achieving similar or better performance. We are currently working on learning the KD codes on the fly along with a given task, where symbol embeddings are not available beforehand.
References
 Bengio et al. [2013] Yoshua Bengio, Nicholas Léonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
 Bordes et al. [2013] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems, pages 2787–2795, 2013.

 Chen et al. [2016] Ting Chen, Lu-An Tang, Yizhou Sun, Zhengzhang Chen, and Kai Zhang. Entity embedding-based anomaly detection for heterogeneous categorical events. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pages 1396–1403. AAAI Press, 2016.
 Chen et al. [2015] Wenlin Chen, James Wilson, Stephen Tyree, Kilian Weinberger, and Yixin Chen. Compressing neural networks with the hashing trick. In International Conference on Machine Learning, pages 2285–2294, 2015.
 Hamming [1950] Richard W Hamming. Error detecting and error correcting codes. Bell Labs Technical Journal, 29(2):147–160, 1950.
 Han et al. [2015a] Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149, 2015a.
 Han et al. [2015b] Song Han, Jeff Pool, John Tran, and William Dally. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems, pages 1135–1143, 2015b.
 Hochreiter and Schmidhuber [1997] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
 Huffman [1952] David A Huffman. A method for the construction of minimum-redundancy codes. Proceedings of the IRE, 40(9):1098–1101, 1952.
 Jang et al. [2016] Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with Gumbel-Softmax. arXiv preprint arXiv:1611.01144, 2016.
 Kim et al. [2016] Yoon Kim, Yacine Jernite, David Sontag, and Alexander M Rush. Character-aware neural language models. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, pages 2741–2749. AAAI Press, 2016.
 Li et al. [2016] Xiang Li, Tao Qin, Jian Yang, Xiaolin Hu, and Tie-Yan Liu. LightRNN: Memory and computation-efficient recurrent neural networks. In Advances in Neural Information Processing Systems, pages 4385–4393, 2016.
 Maddison et al. [2016] Chris J Maddison, Andriy Mnih, and Yee Whye Teh. The concrete distribution: A continuous relaxation of discrete random variables. arXiv preprint arXiv:1611.00712, 2016.
 Marcus et al. [1993] Mitchell P Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. Building a large annotated corpus of english: The penn treebank. Computational linguistics, 19(2):313–330, 1993.
 Mikolov et al. [2010] Tomáš Mikolov, Martin Karafiát, Lukáš Burget, Jan Černockỳ, and Sanjeev Khudanpur. Recurrent neural network based language model. In Eleventh Annual Conference of the International Speech Communication Association, 2010.
 Mikolov et al. [2013] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pages 3111–3119, 2013.

 Pennington et al. [2014] Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pages 1532–1543, 2014.
 Sennrich et al. [2015] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909, 2015.
 Shu and Nakayama [2017] Raphael Shu and Hideki Nakayama. Compressing word embeddings via deep compositional code learning. arXiv preprint arXiv:1711.01068, 2017.
 Zhang et al. [2015] Xiang Zhang, Junbo Zhao, and Yann LeCun. Characterlevel convolutional networks for text classification. In Advances in neural information processing systems, pages 649–657, 2015.