Automatic Extraction of the Romanian Academic Word List: Data and Methods

07/29/2023
by Ana-Maria Bucur, et al.

This paper presents the methodology and data used for the automatic extraction of the Romanian Academic Word List (Ro-AWL). Academic word lists are useful in both L2 and L1 teaching contexts, but no such resource has existed for Romanian so far. Ro-AWL was generated by combining methods from corpus and computational linguistics with L2 academic writing approaches. We use two types of data: (a) existing data, such as the Romanian Frequency List based on the ROMBAC corpus, and (b) self-compiled data, such as the expert academic writing corpus EXPRES. To construct the academic word list, we follow the methodology used to build the Academic Vocabulary List for English. The distribution of Ro-AWL features (general distribution, POS distribution) across four disciplinary datasets is in line with previous research. Ro-AWL is freely available and can be used for teaching, research, and NLP applications.
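
The abstract does not spell out the selection criteria, so the following is only a minimal sketch of the kind of filtering an Academic Vocabulary List style procedure involves: a lemma is kept if it is clearly more frequent in the academic corpus than in the general corpus (a frequency-ratio check) and if it occurs in enough disciplines (a range check). The function names, the thresholds `min_ratio` and `min_disciplines`, the discipline labels, and the toy counts are all assumptions made for illustration; they are not the values or data used for Ro-AWL, and further criteria such as dispersion are left out.

```python
from collections import Counter


def per_million(counts, total):
    """Normalize raw counts to frequency per million tokens."""
    return {w: c * 1_000_000 / total for w, c in counts.items()}


def academic_candidates(academic_by_discipline, general_counts,
                        min_ratio=1.5, min_disciplines=3):
    """Return lemmas that pass a frequency-ratio check and a range check.

    academic_by_discipline: dict mapping discipline -> Counter of lemma counts
    general_counts: Counter of lemma counts in the general corpus
    min_ratio, min_disciplines: illustrative thresholds (assumptions)
    """
    # Pool the per-discipline counts into one academic frequency list.
    academic_total = Counter()
    for counts in academic_by_discipline.values():
        academic_total.update(counts)

    acad_pm = per_million(academic_total, sum(academic_total.values()))
    gen_pm = per_million(general_counts, sum(general_counts.values()))

    selected = []
    for lemma, freq in acad_pm.items():
        # Ratio check: the lemma must be clearly more frequent in academic
        # text. Lemmas absent from the general corpus pass automatically.
        general_freq = gen_pm.get(lemma, 0.0)
        if general_freq > 0 and freq / general_freq < min_ratio:
            continue
        # Range check: the lemma must occur in enough disciplines.
        present = sum(
            1 for counts in academic_by_discipline.values()
            if counts.get(lemma, 0) > 0
        )
        if present < min_disciplines:
            continue
        selected.append(lemma)
    return sorted(selected)


# Toy data with hypothetical discipline labels, for illustration only.
toy_academic = {
    "discipline_a": Counter({"analiză": 5, "și": 50, "metodă": 3}),
    "discipline_b": Counter({"analiză": 4, "și": 40, "metodă": 2}),
    "discipline_c": Counter({"analiză": 6, "și": 60}),
    "discipline_d": Counter({"analiză": 5, "și": 50, "metodă": 1}),
}
toy_general = Counter({"și": 500, "analiză": 2, "metodă": 1, "casă": 50})

if __name__ == "__main__":
    print(academic_candidates(toy_academic, toy_general))
```

In this toy run the function word "și" is filtered out because it is about equally frequent in both corpora, while "analiză" and "metodă" are kept. Raising `min_disciplines` to 4 would also drop "metodă", which is missing from one of the four toy datasets.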


