Connecting a French Dictionary from the Beginning of the 20th Century to Wikidata

06/22/2022
by Pierre Nugues, et al.

The Petit Larousse illustré is a French dictionary first published in 1905. Its division into two main parts, one on language and one on history and geography, marks a major milestone in French lexicography, and the dictionary is also a repository of general knowledge from this period. Although many entries from 1905 retain their value, some descriptions are now more historical than contemporary. They are nonetheless significant for analyzing and understanding cultural representations from this time. Comparing these entries with more recent information, or verifying them, would require tedious manual work. In this paper, we describe a new lexical resource in which we connected all the dictionary entries of the history and geography part to current data sources. To do this, we linked each of these entries to a Wikidata identifier. Using the Wikidata links, we can more easily automate the identification, comparison, and verification of historically situated representations. We give a few examples of how to process Wikidata identifiers, and we carry out a small analysis of the entities described in the dictionary to outline possible applications. The resource, i.e. the annotation of 20,245 dictionary entries with Wikidata links, is available from GitHub: https://github.com/pnugues/petit_larousse_1905/
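
As an illustration of what processing Wikidata identifiers can look like, here is a minimal Python sketch. It is not the authors' code, and it does not assume any particular file format for the released resource: the `entries` mapping, the headword spellings, and the helper names `is_valid_qid`, `entity_url`, and `build_instance_of_query` are all hypothetical. The sketch validates identifiers, turns them into Wikidata page URLs, and builds a SPARQL query asking for the class of each entity (property P31, "instance of"), which is one possible starting point for analyzing the kinds of entities a dictionary describes.

```python
import re

# Wikidata item identifiers are the letter Q followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9]\d*")


def is_valid_qid(qid: str) -> bool:
    """Return True if the string has the shape of a Wikidata item identifier."""
    return QID_PATTERN.fullmatch(qid) is not None


def entity_url(qid: str) -> str:
    """Return the Wikidata page URL for an item identifier."""
    if not is_valid_qid(qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    return f"https://www.wikidata.org/wiki/{qid}"


def build_instance_of_query(qids: list[str]) -> str:
    """Build a SPARQL query returning the classes (P31) of the given items."""
    for qid in qids:
        if not is_valid_qid(qid):
            raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item ?class WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  ?item wdt:P31 ?class .\n"
        "}"
    )


# Hypothetical headword-to-identifier mapping, for illustration only.
# Q90 is Paris and Q142 is France in Wikidata.
entries = {"PARIS": "Q90", "FRANCE": "Q142"}

query = build_instance_of_query(list(entries.values()))
print(entity_url(entries["PARIS"]))
print(query)
```

The resulting query string could then be submitted to the Wikidata Query Service endpoint (https://query.wikidata.org/sparql); the network call is left out here so that the sketch runs offline.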


