Compositional Evaluation on Japanese Textual Entailment and Similarity

08/09/2022
by   Hitomi Yanaka, et al.
0

Natural Language Inference (NLI) and Semantic Textual Similarity (STS) are widely used benchmark tasks for compositional evaluation of pre-trained language models. Despite growing interest in linguistic universals, most NLI/STS studies have focused almost exclusively on English. In particular, there are no available multilingual NLI/STS datasets in Japanese, which is typologically different from English and can shed light on the currently controversial behavior of language models in matters such as sensitivity to word order and case particles. Against this background, we introduce JSICK, a Japanese NLI/STS dataset that was manually translated from the English dataset SICK. We also present a stress-test dataset for compositional inference, created by transforming syntactic structures of sentences in JSICK to investigate whether language models are sensitive to word order and case particles. We conduct baseline experiments on different pre-trained language models and compare the performance of multilingual models when applied to Japanese and other languages. The results of the stress-test experiments suggest that the current pre-trained language models are insensitive to word order and case marking.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/11/2023

Towards preserving word order importance through Forced Invalidation

Large pre-trained language models such as BERT have been widely used as ...
research
05/24/2023

SETI: Systematicity Evaluation of Textual Inference

We propose SETI (Systematicity Evaluation of Textual Inference), a novel...
research
06/19/2023

Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models

Natural Language Inference (NLI) tasks involving temporal inference rema...
research
01/14/2021

SICKNL: A Dataset for Dutch Natural Language Inference

We present SICK-NL (read: signal), a dataset targeting Natural Language ...
research
04/12/2022

Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change

Morphological and syntactic changes in word usage (as captured, e.g., by...
research
05/15/2023

Sensitivity and Robustness of Large Language Models to Prompt in Japanese

Prompt Engineering has gained significant relevance in recent years, fue...
research
09/08/2022

Towards explainable evaluation of language models on the semantic similarity of visual concepts

Recent breakthroughs in NLP research, such as the advent of Transformer ...

Please sign up or login with your details

Forgot password? Click here to reset