BERT got a Date: Introducing Transformers to Temporal Tagging

09/30/2021
by Satya Almasian et al.

Temporal expressions in text play a significant role in language understanding, and correctly identifying them is fundamental to various retrieval and natural language processing systems. Previous works have slowly shifted from rule-based to neural architectures, which tag expressions with higher accuracy. However, neural models cannot yet distinguish between different expression types at the same level as their rule-based counterparts. In this work, we aim to identify the most suitable transformer architecture for joint temporal tagging and type classification, and to investigate the effect of semi-supervised training on the performance of these systems. Based on our study of token classification variants and encoder-decoder architectures, we present a transformer encoder-decoder model built on the RoBERTa language model as our best-performing system. By supplementing training resources with weakly labeled data from rule-based systems, our model surpasses previous works in temporal tagging and type classification, especially on rare classes. Our code and pre-trained models are available at: https://github.com/satya77/Transformer_Temporal_Tagger
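The abstract describes two model families: token classification variants and encoder-decoder architectures. Below is a minimal sketch of the simpler family, temporal tagging plus type classification framed as RoBERTa token classification with the Hugging Face transformers library. The BIO label set over the four TIMEX3 types (DATE, TIME, DURATION, SET) and the "roberta-base" checkpoint are assumptions for illustration, not the authors' exact configuration; the classification head here is freshly initialized and would need fine-tuning on labeled data before its predictions mean anything. See the linked repository for the released models.

```python
# Minimal sketch: temporal tagging + type classification as token
# classification with RoBERTa (Hugging Face transformers).
# Label scheme and checkpoint are illustrative assumptions, not the
# authors' exact setup; their released models live in the repo above.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# BIO tags over the four TIMEX3 types, plus "O" for non-temporal tokens.
LABELS = ["O",
          "B-DATE", "I-DATE",
          "B-TIME", "I-TIME",
          "B-DURATION", "I-DURATION",
          "B-SET", "I-SET"]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base",
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
)  # the classification head is randomly initialized; fine-tune before use

text = "The meeting was moved from last Tuesday to September 30, 2021."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Greedy per-token decoding of the predicted BIO tags.
predictions = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predictions):
    print(f"{token}\t{LABELS[label_id]}")
```

Note that the paper's best-performing system is the other family, an encoder-decoder built from RoBERTa, rather than a single tagging head like the one sketched here; the token-classification form is shown only because it is the more compact of the two to illustrate.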


research · 01/31/2021
BNLP: Natural language processing toolkit for Bengali language
BNLP is an open source language processing toolkit for Bengali language ...

research · 03/14/2022
PERT: Pre-training BERT with Permuted Language Model
Pre-trained Language Models (PLMs) have been widely used in various natu...

research · 10/23/2022
On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?
Pre-trained language models (PLMs) have been deployed in many natural la...

research · 11/03/2020
A Benchmark of Rule-Based and Neural Coreference Resolution in Dutch Novels and News
We evaluate a rule-based (Lee et al., 2013) and neural (Lee et al., 2018...

research · 05/03/2020
Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction
This paper investigates how to effectively incorporate a pre-trained mas...

research · 05/19/2020
Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text
Although temporal tagging is still dominated by rule-based systems, ther...

research · 03/20/2022
g2pW: A Conditional Weighted Softmax BERT for Polyphone Disambiguation in Mandarin
Polyphone disambiguation is the most crucial task in Mandarin grapheme-t...
