-
CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model
In this paper, we introduce the Chinese corpus from CLUE organization, C...
-
Pre-Training with Whole Word Masking for Chinese BERT
Bidirectional Encoder Representations from Transformers (BERT) has shown...
-
AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation
Ancient Chinese is the essence of Chinese culture. There are several nat...
-
IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding
Although Indonesian is known to be the fourth most frequently used langu...
-
CLiMP: A Benchmark for Chinese Language Model Evaluation
Linguistically informed analyses of language models (LMs) contribute to ...
-
Zero-Shot Entity Linking by Reading Entity Descriptions
We present the zero-shot entity linking task, where mentions must be lin...
CLUE: A Chinese Language Understanding Evaluation Benchmark
We introduce CLUE, a Chinese Language Understanding Evaluation benchmark. It contains eight different tasks, including single-sentence classification, sentence-pair classification, and machine reading comprehension. We evaluate CLUE on a number of existing full-network pre-trained models for Chinese. We also include a small hand-crafted diagnostic test set designed to probe specific linguistic phenomena, some unique to Chinese, using different models. Along with CLUE, we release a large, clean, crawled raw-text corpus that can be used for model pre-training. We release CLUE, the baselines, and the pre-training dataset on GitHub.
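As a minimal sketch (not CLUE's official evaluation script), a multi-task benchmark of this kind typically reports per-task accuracy and a single leaderboard score as the mean across tasks; the function names below are illustrative:

```python
# Hypothetical benchmark-style scoring: each task contributes an
# accuracy, and the overall score is the unweighted mean across tasks.

def task_accuracy(predictions, labels):
    """Fraction of examples where the predicted label matches the gold label."""
    assert len(predictions) == len(labels) and labels
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)

def benchmark_score(per_task_results):
    """Mean accuracy over tasks.

    per_task_results maps task name -> (predictions, gold labels).
    """
    accuracies = [task_accuracy(p, g) for p, g in per_task_results.values()]
    return sum(accuracies) / len(accuracies)

# Toy usage with made-up task names and labels:
results = {
    "single_sentence": ([0, 1, 1], [0, 1, 0]),   # 2/3 correct
    "sentence_pair":   ([1, 1], [1, 1]),          # 2/2 correct
}
score = benchmark_score(results)
```

Averaging unweighted across tasks means each task counts equally regardless of its test-set size, which is how many leaderboards summarize heterogeneous task suites.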