Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination

06/10/2023
by Xuan-Quy Dao, et al.

This study presents a comprehensive analysis of ChatGPT's mathematical abilities in answering multiple-choice questions from the Vietnamese National High School Graduation Examination (VNHSGE) across a range of topics and difficulty levels. The dataset comprised 250 questions divided into four levels: knowledge (K), comprehension (C), application (A), and high application (H), and spanned ten themes covering diverse mathematical concepts. The results show that ChatGPT's performance varies with difficulty level and topic. It performed best on Level (K) questions, with an accuracy of 83%; as the difficulty level rose, its accuracy fell to as low as 10%. ChatGPT answered questions well on topics including exponential and logarithmic functions, geometric progression, and arithmetic progression, but had difficulty with questions on derivatives and applications, spatial geometry, and Oxyz spatial calculus. Additionally, the study compared ChatGPT's results with those of Vietnamese students on the VNHSGE and with its performance on other mathematics exams. ChatGPT performed best on SAT Math, with a success rate of 70%, followed by VNHSGE mathematics (58.8%). Its success rates were lower on other exams, such as AP Statistics, GRE Quantitative, AMC 10, AMC 12, and AP Calculus BC. These results suggest that ChatGPT has the potential to be an effective teaching tool for mathematics, but more work is needed to improve its handling of graphical data and to address the challenges posed by increasingly difficult questions.

