Evaluating the Generation Capabilities of Large Chinese Language Models

08/09/2023
by Hui Zeng, et al.

This paper presents CG-Eval, the first comprehensive evaluation of the generation capabilities of large Chinese language models across a wide range of academic disciplines. Model performance was assessed on the ability to generate accurate and relevant responses to different types of questions in six disciplines: Science and Engineering, Humanities and Social Sciences, Mathematical Calculations, Medical Practitioner Qualification Examination, Judicial Examination, and Certified Public Accountant Examination. The paper also presents Gscore, a composite index derived from a weighted sum of multiple metrics, which measures the quality of a model's generated text against a reference. The test data and test results can be found at http://cgeval.besteasy.com/.
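To make the idea of a weighted-sum composite index concrete, here is a minimal Python sketch. It is not the paper's Gscore definition: the abstract does not name the component metrics or their weights, so the metric names (`metric_a`, `metric_b`), the weights, and the function `weighted_composite` are all hypothetical placeholders. The sketch also assumes each component score is already normalized to the range [0, 1].

```python
from typing import Dict


def weighted_composite(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-metric scores into one index via a normalized weighted sum.

    Both arguments are keyed by metric name. The names and weights are
    illustrative assumptions, not the values used by CG-Eval.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same metrics")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total weight keeps the result on the same scale
    # as the inputs even when the weights do not sum to 1.
    return sum(weights[name] * scores[name] for name in scores) / total_weight


if __name__ == "__main__":
    # Hypothetical scores of one generated answer against its reference.
    example_scores = {"metric_a": 0.5, "metric_b": 1.0}
    example_weights = {"metric_a": 1.0, "metric_b": 3.0}
    print(weighted_composite(example_scores, example_weights))
```

Normalizing by the total weight is a design choice in this sketch; a definition whose weights already sum to 1 would give the same result without the division.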

research
06/15/2023

CMMLU: Measuring massive multitask language understanding in Chinese

As the capabilities of large language models (LLMs) continue to advance,...
research
08/19/2023

FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models

Large language models (LLMs) have demonstrated exceptional performance i...
research
07/24/2023

Performance of Large Language Models in a Computer Science Degree Program

Large language models such as ChatGPT-3.5 and GPT-4.0 are ubiquitous and...
research
06/05/2023

Benchmarking Large Language Models on CMExam – A Comprehensive Chinese Medical Exam Dataset

Recent advancements in large language models (LLMs) have transformed the...
research
07/17/2023

ChatGPT is Good but Bing Chat is Better for Vietnamese Students

This study examines the efficacy of two SOTA large language models (LLMs...
research
04/20/2023

Safety Assessment of Chinese Large Language Models

With the rapid popularity of large language models such as ChatGPT and G...
research
11/18/2022

Mindel C. Cheps: Counted, dead or alive

In the 1958 paper "Shall we count the living or the dead", Canadian phys...
