Efficient Purely Convolutional Text Encoding

08/03/2018
by Szymon Malik, et al.

In this work, we focus on a lightweight convolutional architecture that creates fixed-size vector embeddings of sentences. Such representations are useful for building NLP systems, including conversational agents. Our work derives from a recently proposed recursive convolutional architecture for auto-encoding text paragraphs at byte level. We propose alterations that significantly reduce training time and the number of parameters while improving auto-encoding accuracy. Finally, we evaluate the representations created by our model on tasks from the SentEval benchmark suite, and show that our model can serve as a better, yet fairly low-resource, alternative to popular bag-of-words embeddings.
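The abstract does not spell out the architecture, but the core idea of a purely convolutional, byte-level encoder producing fixed-size embeddings can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in, not the authors' model: the embedding dimensions, filter counts, and random weights are illustrative only, and global max pooling is used as one common way to obtain a length-independent vector.

```python
import random

def byte_embed(text, dim=8, seed=0):
    # Hypothetical byte-embedding table (256 byte values -> dim);
    # fixed random values stand in for learned parameters.
    rng = random.Random(seed)
    table = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(256)]
    return [table[b] for b in text.encode("utf-8")]  # (length, dim)

def conv_relu(x, filters, width=3):
    # Valid 1-D convolution over byte positions, followed by ReLU.
    out = []
    for i in range(len(x) - width + 1):
        window = [v for row in x[i:i + width] for v in row]
        out.append([max(0.0, sum(w * v for w, v in zip(f, window)))
                    for f in filters])
    return out  # (length - width + 1, n_filters)

def encode(text, dim=8, n_filters=16, width=3, seed=0):
    # Sketch of a purely convolutional byte-level encoder:
    # byte embeddings -> 1-D convolution -> global max pooling,
    # which yields a fixed-size vector for any input length.
    rng = random.Random(seed + 1)
    filters = [[rng.gauss(0, 1) for _ in range(width * dim)]
               for _ in range(n_filters)]
    h = conv_relu(byte_embed(text, dim, seed), filters, width)
    return [max(col) for col in zip(*h)]  # (n_filters,)
```

Regardless of sentence length, `encode` returns a vector of `n_filters` values, which is the property that makes such embeddings directly usable as fixed-size sentence representations in downstream classifiers.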

Related research

Byte-Level Recursive Convolutional Auto-Encoder for Text (02/06/2018)
This article proposes to auto-encode text at byte-level using convolutio...

Word Embeddings: A Survey (01/25/2019)
This work lists and describes the main recent strategies for building fi...

Self-Supervised Audio-and-Text Pre-training with Extremely Low-Resource Parallel Data (04/10/2022)
Multimodal pre-training for audio-and-text has recently been proved to b...

Learning Multilingual Word Embeddings Using Image-Text Data (05/29/2019)
There has been significant interest recently in learning multilingual wo...

Static Fuzzy Bag-of-Words: a lightweight sentence embedding algorithm (04/06/2023)
The introduction of embedding techniques has pushed forward significantl...

Learning Better Internal Structure of Words for Sequence Labeling (10/29/2018)
Character-based neural models have recently proven very useful for many ...

Learning Compressed Sentence Representations for On-Device Text Processing (06/19/2019)
Vector representations of sentences, trained on massive text corpora, ar...
