
Character-based Neural Embeddings for Tweet Clustering

03/15/2017
by Svitlana Vakulenko, et al.
WU (Vienna University of Economics and Business)
MODUL Technology GmbH
TU Wien

In this paper, we show how character-based neural networks can improve the performance of tweet clustering. The proposed approach overcomes the limitations caused by vocabulary explosion in word-based models and allows seamless processing of multilingual content. Our evaluation results and code are available online at https://github.com/vendi12/tweet2vec_clustering
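
The vocabulary problem mentioned in the abstract can be illustrated with a small sketch. It is not the authors' model and involves no neural network. The tweets, the whitespace tokenization, and the helper names `unseen_words` and `unseen_chars` are all assumptions made for illustration. The sketch only shows that a noisy, previously unseen tweet may contain many out-of-vocabulary words while still being composed entirely of known characters.

```python
# Toy illustration (hypothetical data and helper names, not the paper's code):
# compare how much of a new tweet is "unseen" at word level vs. character level.

def unseen_words(train_texts, text):
    """Return tokens of `text` that never occur in `train_texts` (whitespace split)."""
    vocab = {tok for t in train_texts for tok in t.split()}
    return [tok for tok in text.split() if tok not in vocab]

def unseen_chars(train_texts, text):
    """Return the sorted characters of `text` that never occur in `train_texts`."""
    alphabet = {ch for t in train_texts for ch in t}
    return sorted({ch for ch in text if ch not in alphabet})

# Made-up "training" tweets and a new tweet with informal spelling.
train_tweets = [
    "loving the new phone",
    "such a fun phone",
    "the phone is great",
]
new_tweet = "luvin the nu fone"

print("unseen words:", unseen_words(train_tweets, new_tweet))
print("unseen characters:", unseen_chars(train_tweets, new_tweet))
```

In this toy setting, three of the four tokens in the new tweet are missing from the word vocabulary, whereas every character has been seen before. A character-level model therefore always has some representation for the input, which is one plausible reading of why the approach sidesteps vocabulary growth.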


