An Annotated Corpus of Emerging Anglicisms in Spanish Newspaper Headlines

04/06/2020
by Elena Alvarez-Mellado, et al.

The extraction of anglicisms (lexical borrowings from English) is relevant both for lexicographic purposes and for downstream NLP tasks. In this paper we present (1) a corpus of 21,570 newspaper headlines written in European Spanish and annotated with emergent anglicisms, and (2) a conditional random field (CRF) baseline model with handcrafted features for anglicism extraction. We describe the headline corpus, the annotation tagset and guidelines, and the CRF model, which can serve as a baseline for the task of detecting anglicisms. This work is a first step towards the creation of an anglicism extractor for Spanish newswire.
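
To make the idea of a CRF with handcrafted features more concrete, the following is a minimal sketch of how a tokenized headline might be turned into per-token feature dictionaries and BIO tags. It is not the authors' feature set: the function names, the `ENG` label, the example headline, and every individual feature (including the check for the letters k and w, which are uncommon in native Spanish words) are assumptions chosen for illustration. A list of feature dictionaries per sentence is the kind of input that CRF toolkits such as sklearn-crfsuite accept, but no toolkit is used here.

```python
from typing import Dict, List, Tuple


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build an illustrative (hypothetical) feature dict for token i."""
    tok = tokens[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "lower": tok.lower(),
        "is_upper": tok.isupper(),
        "is_title": tok.istitle(),
        "is_digit": tok.isdigit(),
        "prefix3": tok[:3].lower(),
        "suffix3": tok[-3:].lower(),
        # Assumed heuristic: k and w are uncommon in native Spanish words.
        "has_k_or_w": any(c in "kw" for c in tok.lower()),
    }
    # Context features from the neighbouring tokens.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of headline
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of headline
    return feats


def headline_to_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Feature dicts for every token in one headline."""
    return [token_features(tokens, i) for i in range(len(tokens))]


def spans_to_bio(n_tokens: int, spans: List[Tuple[int, int]],
                 label: str = "ENG") -> List[str]:
    """Convert (start, end) token spans, end exclusive, into BIO tags."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = f"B-{label}"
        for j in range(start + 1, end):
            tags[j] = f"I-{label}"
    return tags


# Hypothetical headline, not taken from the corpus.
tokens = ["El", "streaming", "gana", "terreno"]
features = headline_to_features(tokens)
tags = spans_to_bio(len(tokens), [(1, 2)])
print(list(zip(tokens, tags)))
```

Pairs of `features` and `tags` for each headline would form the training data for a sequence labeler; which features actually help on this task is an empirical question that the sketch does not address.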


