The Logoscope: a Semi-Automatic Tool for Detecting and Documenting French New Words

10/25/2018
by Ingrid Falk, et al.

In this article we present the design and implementation of the Logoscope, the first tool specifically developed to detect new words in French, document them, and make them publicly accessible through a web interface. This semi-automatic tool collects new words daily by browsing the online versions of well-known French newspapers such as Le Monde, Le Figaro, L'Equipe, Libération, La Croix, and Les Échos. In contrast to existing tools, which are essentially dedicated to dictionary development, the Logoscope attempts to give a more complete account of the context in which the new words occur. In addition to the commonly provided morpho-syntactic information, it also provides information about the textual and discursive contexts of the word creation; in particular, it automatically determines the (journalistic) topics of the text containing the new word. We first give a general overview of the tool. We then describe the approach taken, discuss the linguistic background that guided our design decisions, and present the computational methods used to implement it.

research
06/14/2018

Automatic Language Identification for Romance Languages using Stop Words and Diacritics

Automatic language identification is a natural language processing probl...
research
01/05/2022

Some Strategies to Capture Karaka-Yogyata with Special Reference to apadana

In today's digital world language technology has gained importance. Seve...
research
01/12/2020

Detecting New Word Meanings: A Comparison of Word Embedding Models in Spanish

Semantic neologisms (SN) are defined as words that acquire a new word me...
research
02/16/2021

Decidability for Sturmian words

We show that the first-order theory of Sturmian words over Presburger ar...
research
01/05/2021

Political Depolarization of News Articles Using Attribute-aware Word Embeddings

Political polarization in the US is on the rise. This polarization negat...
research
05/24/2018

WSD-algorithm based on new method of vector-word contexts proximity calculation via epsilon-filtration

The problem of word sense disambiguation (WSD) is considered in the arti...
research
05/19/2023

Persian Typographical Error Type Detection using Many-to-Many Deep Neural Networks on Algorithmically-Generated Misspellings

Digital technologies have led to an influx of text created daily in a va...
