CLiMP: A Benchmark for Chinese Language Model Evaluation

01/26/2021
by Beilei Xiang, et al.

Linguistically informed analyses of language models (LMs) contribute to the understanding and improvement of these models. Here, we introduce the corpus of Chinese linguistic minimal pairs (CLiMP), which can be used to investigate what knowledge Chinese LMs acquire. CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phenomena. The MPs are semi-automatically generated, and human agreement with the labels in CLiMP is 95.8%. We evaluate 11 different LMs on CLiMP, covering n-grams, LSTMs, and Chinese BERT. We find that classifier-noun agreement and verb complement selection are the phenomena on which models generally perform best. However, models struggle the most with the ba construction, binding, and filler-gap dependencies. Overall, Chinese BERT achieves an average accuracy of 81.8%, while the performances of LSTMs and 5-grams are only moderately above chance level.
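The evaluation behind a minimal-pair benchmark is simple: show the LM both sentences of a pair and count the model as correct when it assigns higher probability to the acceptable one. Below is a minimal sketch of that scoring step, assuming a Hugging Face causal Chinese LM; the model name and the classifier-noun example pair are illustrative assumptions, not the CLiMP evaluation code or data.

```python
# Minimal sketch: score both sentences of a minimal pair with a causal LM
# and check whether the grammatical one receives the higher log-probability.
# The model name and the example pair are illustrative, not from CLiMP.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "uer/gpt2-chinese-cluecorpussmall"  # any causal Chinese LM would do
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Total log-probability of a sentence under the causal LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the mean cross-entropy
        # over the predicted tokens; multiply back to get the sum.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

# Hypothetical classifier-noun agreement pair: 本 (ben) is the correct
# classifier for 书 "book"; 只 (zhi) is not.
good = "我买了一本书。"
bad = "我买了一只书。"
print("prefers grammatical:", sentence_log_prob(good) > sentence_log_prob(bad))
```

For a masked LM such as Chinese BERT, a pseudo-log-likelihood (masking and scoring each token in turn) would play the same role as the causal log-probability above.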

Related research

10/21/2022
SLING: Sino Linguistic Evaluation of Large Language Models
To understand what kinds of linguistic knowledge are encoded by pretrain...

12/02/2019
BLiMP: A Benchmark of Linguistic Minimal Pairs for English
We introduce The Benchmark of Linguistic Minimal Pairs (shortened to BLi...

10/31/2022
Do LSTMs See Gender? Probing the Ability of LSTMs to Learn Abstract Syntactic Rules
LSTMs trained on next-word prediction can accurately perform linguistic ...

10/16/2020
Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets
Although large-scale pretrained language models, such as BERT and RoBERT...

09/22/2021
Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models
Prior work has shown that structural supervision helps English language ...

03/02/2022
Discontinuous Constituency and BERT: A Case Study of Dutch
In this paper, we set out to quantify the syntactic capacity of BERT in ...

09/19/2019
Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment
Extensive research has recently shown that recurrent neural language mod...
