A Modular Approach for Multilingual Timex Detection and Normalization using Deep Learning and Grammar-based methods

04/27/2023
by   Nayla Escribano, et al.
0

Detecting and normalizing temporal expressions is an essential step for many NLP tasks. While a variety of methods have been proposed for detection, best normalization approaches rely on hand-crafted rules. Furthermore, most of them have been designed only for English. In this paper we present a modular multilingual temporal processing system combining a fine-tuned Masked Language Model for detection, and a grammar-based normalizer. We experiment in Spanish and English and compare with HeidelTime, the state-of-the-art in multilingual temporal processing. We obtain best results in gold timex normalization, timex detection and type recognition, and competitive performance in the combined TempEval-3 relaxed value metric. A detailed error analysis shows that detecting only those timexes for which it is feasible to provide a normalization is highly beneficial in this last metric. This raises the question of which is the best strategy for timex processing, namely, leaving undetected those timexes for which is not easy to provide normalization rules or aiming for high coverage.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/20/2022

Multilingual Normalization of Temporal Expressions with Masked Language Models

The detection and normalization of temporal expressions is an important ...
research
10/28/2021

ÚFAL at MultiLexNorm 2021: Improving Multilingual Lexical Normalization by Fine-tuning ByT5

We present the winning entry to the Multilingual Lexical Normalization (...
research
08/31/2021

Automatic Rule Generation for Time Expression Normalization

The understanding of time expressions includes two sub-tasks: recognitio...
research
04/07/2021

GrammarTagger: A Multilingual, Minimally-Supervised Grammar Profiler for Language Education

We present GrammarTagger, an open-source grammar profiler which, given a...
research
03/19/2023

Bangla Grammatical Error Detection Using T5 Transformer Model

This paper presents a method for detecting grammatical errors in Bangla ...
research
12/05/2020

Codeswitched Sentence Creation using Dependency Parsing

Codeswitching has become one of the most common occurrences across multi...

Please sign up or login with your details

Forgot password? Click here to reset