Recognizing the vocabulary of Brazilian popular newspapers with a free-access computational dictionary

04/19/2019
by Maria José Finatto, et al.

We report an experiment that checks how well a set of words from popular written Portuguese is identified by two versions of a computational dictionary of Brazilian Portuguese, DELAF PB 2004 and DELAF PB 2015. This dictionary is freely available for linguistic analyses of Brazilian Portuguese and for other research, which justifies a critical study of it. The vocabulary comes from the PorPopular corpus, made up of texts from the popular newspapers Diário Gaúcho (DG) and Massa! (MA). From DG, we retained a set of texts with 984,465 words (tokens), published in 2008, with the spelling used before the Portuguese Language Orthographic Agreement adopted in 2009. From MA, we examined newspapers from 2012, 2014 and 2015, with 215,776 words (tokens), all with the new spelling. The checking involved: a) generating lists of words (types) occurring in DG and MA; b) comparing them with the entry lists of both versions of DELAF PB; c) assessing the coverage of this vocabulary; d) proposing ways of incorporating the items not covered. The results show that an average of 19% of the types in DG were not found in DELAF PB 2004 or 2015. In MA, this average is 13% [...] recognizing the words.
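Steps a) to c) above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tokenization rule, the function names, and the sample data are all assumptions made for illustration. It also assumes that each dictionary line begins with the inflected form followed by a comma, and the sample entries use simplified, invented tags rather than real DELAF PB content.

```python
import re
from collections import Counter


def word_types(text):
    """Step a): count lowercased word types (letters, optional internal hyphens)."""
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    return Counter(tokens)


def dictionary_forms(lines):
    """Collect word forms from dictionary lines.

    Assumption: the inflected form is the text before the first comma.
    """
    forms = set()
    for line in lines:
        line = line.strip()
        if line:
            forms.add(line.split(",", 1)[0].lower())
    return forms


def coverage(types, forms):
    """Steps b) and c): list the types missing from the dictionary and their share."""
    missing = sorted(t for t in types if t not in forms)
    share = len(missing) / len(types) if types else 0.0
    return missing, share


# Invented sample data, not taken from PorPopular or DELAF PB.
sample_text = "O guri comprou pão. O guri voltou."
sample_dict = ["o,o.DET", "comprou,comprar.V", "pão,pão.N", "voltou,voltar.V"]

types = word_types(sample_text)
missing, share = coverage(types, dictionary_forms(sample_dict))
print(missing, round(share, 2))
```

In this toy case one of five types ("guri") is absent from the sample dictionary, so the uncovered share is 0.2. The share is computed over types rather than tokens, which matches how the abstract reports its percentages.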

research
09/21/2020

Accent Estimation of Japanese Words from Their Surfaces and Romanizations for Building Large Vocabulary Accent Dictionaries

In Japanese text-to-speech (TTS), it is necessary to add accent informat...
research
04/13/2018

Neologisms on Facebook

In this paper, we present a study of neologisms and loan words frequentl...
research
12/14/2019

LScDC-new large scientific dictionary

In this paper, we present a scientific corpus of abstracts of academic p...
research
08/24/2023

Probabilistic Method of Measuring Linguistic Productivity

In this paper I propose a new way of measuring linguistic productivity t...
research
05/31/2016

Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler

Specialized dictionaries are used to understand concepts in specific dom...
research
07/02/2022

VocabulARy: Learning Vocabulary in AR Supported by Keyword Visualisations

Learning vocabulary in a primary or secondary language is enhanced when ...
