A Linguistic Investigation of Machine Learning based Contradiction Detection Models: An Empirical Analysis and Future Perspectives

10/19/2022
by Maren Pielka, et al.

We analyze two Natural Language Inference data sets with respect to their linguistic features. The goal is to identify those syntactic and semantic properties that are particularly hard for a machine learning model to comprehend. To this end, we also investigate the differences between a crowd-sourced, machine-translated data set (SNLI) and a collection of text pairs from internet sources. Our main findings are that the model has difficulty recognizing the semantic importance of prepositions and verbs, which emphasizes the importance of linguistically aware pre-training tasks. Furthermore, it often does not comprehend antonyms and homonyms, especially when their meaning depends on the context. Incomplete sentences are another problem, as are longer paragraphs and rare words or phrases. The study shows that automated language understanding requires a more informed approach, utilizing as much external knowledge as possible throughout the training process.
