Vulgaris: Analysis of a Corpus for Middle-Age Varieties of Italian Language

10/12/2020
by Andrea Zugarini, et al.

Italian is a Romance language with roots in Vulgar Latin. Modern Italian began to emerge in Tuscany around the 14th century, a development mainly attributed to the works of Dante Alighieri, Francesco Petrarca and Giovanni Boccaccio, who are among the most acclaimed authors of medieval Tuscany. However, because of the past fragmentation of its territory, Italy has been characterized by a wide variety of dialects, which are often only loosely related to each other. Italian has absorbed influences from many of these dialects, as well as from other languages, owing to the dominion of other nations, such as Spain and France, over portions of the country. In this work we present Vulgaris, a project aimed at studying a corpus of Italian textual resources by authors from different regions, spanning the period between 1200 and 1600. Each composition is associated with its author, and authors are also grouped into families, i.e., groups sharing similar stylistic or chronological characteristics. Hence, the dataset is not only a valuable resource for studying the diachronic evolution of Italian and the differences between its dialects, but is also useful for investigating stylistic differences between individual authors. We provide a detailed statistical analysis of the data and a corpus-driven study of dialectology and diachronic varieties.


