Introducing RONEC -- the Romanian Named Entity Corpus

09/03/2019
by Stefan Daniel Dumitrescu, et al.

We present RONEC, the Named Entity Corpus for the Romanian language. The corpus contains over 26,000 entities in 5,000 annotated sentences, belonging to 16 distinct classes. The sentences have been extracted from a copyright-free newspaper, covering several styles. This corpus represents the first initiative in the Romanian language space specifically targeted at named entity recognition. It is available in BRAT and CoNLL-U Plus formats, and it is free to use and extend at github.com/dumitrescustefan/ronec.
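Since the corpus is distributed in CoNLL-U Plus, a reader may want a rough idea of how such a file can be consumed. The following is a minimal sketch, not the authors' tooling: it reads the `# global.columns` header that CoNLL-U Plus files declare and counts entity mentions per class. The column name `NER`, the BIO-style tags, the class labels, and the sample sentences are all assumptions made for illustration; the actual column layout and tag scheme should be checked against the files in the repository.

```python
from collections import Counter

# Hypothetical sample in CoNLL-U Plus style. The "NER" column name, the
# BIO-style tags and the class labels are assumptions for illustration only;
# they are not taken from the corpus files.
SAMPLE = """\
# global.columns = ID FORM NER
1\tIon\tB-PERSON
2\tlocuiește\tO
3\tîn\tO
4\tBucurești\tB-GPE
5\t.\tO

1\tBanca\tB-ORG
2\tNațională\tI-ORG
3\ta\tO
4\tanunțat\tO
5\t.\tO
"""


def read_conllu_plus(text):
    """Parse CoNLL-U Plus text into a list of sentences.

    Each sentence is a list of dicts keyed by the names declared in the
    '# global.columns' header line.
    """
    columns = None
    sentences, current = [], []
    for line in text.splitlines():
        if line.startswith("# global.columns"):
            # Column names follow the '=' sign, separated by whitespace.
            columns = line.split("=", 1)[1].split()
        elif line.startswith("#"):
            continue  # other comment lines (sentence ids, raw text, ...)
        elif not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
        else:
            if columns is None:
                raise ValueError("missing '# global.columns' header")
            current.append(dict(zip(columns, line.split("\t"))))
    if current:
        sentences.append(current)
    return sentences


def count_entities(sentences, tag_column="NER"):
    """Count entity mentions per class, one per 'B-' tag (assumed BIO scheme)."""
    counts = Counter()
    for sentence in sentences:
        for token in sentence:
            tag = token[tag_column]
            if tag.startswith("B-"):
                counts[tag[2:]] += 1
    return counts


sentences = read_conllu_plus(SAMPLE)
entity_counts = count_entities(sentences)
```

Reading the column names from the header rather than hard-coding positions keeps the sketch usable if the real files declare more columns or a different name for the entity column; only the `tag_column` argument would need to change.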


