A Short Survey on Sense-Annotated Corpora for Diverse Languages and Resources

02/13/2018
by Tommaso Pasini, et al.

With the advancement of research in word sense disambiguation and deep learning, large sense-annotated datasets are increasingly important for training supervised systems. However, gathering high-quality sense-annotated data for as many instances as possible is an arduous task. This has led to the proliferation of automatic and semi-automatic methods for overcoming the so-called knowledge-acquisition bottleneck. In this paper, we present an overview of currently available sense-annotated corpora, both manually and automatically constructed, for various languages and resources (i.e., WordNet, Wikipedia, BabelNet). General statistics and specific features of each sense-annotated dataset are also provided.
