GPTEval: A Survey on Assessments of ChatGPT and GPT-4

08/24/2023
by Rui Mao, et al.

The emergence of ChatGPT has generated much speculation in the press about its potential to disrupt social and economic systems. Its astonishing language ability has aroused strong curiosity among scholars about its performance in different domains. Many studies have evaluated the abilities of ChatGPT and GPT-4 across different tasks and disciplines. However, a comprehensive review summarizing the collective assessment findings is lacking. The objective of this survey is to thoroughly analyze prior assessments of ChatGPT and GPT-4, focusing on their language and reasoning abilities, scientific knowledge, and ethical considerations. The survey also examines existing evaluation methods and offers several recommendations for future research on evaluating large language models.


