Performance Comparison of Large Language Models on VNHSGE English Dataset: OpenAI ChatGPT, Microsoft Bing Chat, and Google Bard

07/05/2023
by Xuan-Quy Dao, et al.

This paper compares the performance of three large language models (LLMs), OpenAI ChatGPT, Microsoft Bing Chat (BingChat), and Google Bard, on the VNHSGE English dataset. BingChat, Bard, and ChatGPT (GPT-3.5) score 92.4%, 86%, and 79.2%, respectively, so BingChat outperforms both Bard and ChatGPT. BingChat and Bard can therefore serve as alternatives to ChatGPT while ChatGPT is not yet officially available in Vietnam. The results also indicate that all three models outperform Vietnamese students in English language proficiency. These findings contribute to the understanding of the potential of LLMs in English language education, and the strong performance of ChatGPT, BingChat, and Bard demonstrates their potential as effective tools for teaching and learning English at the high school level.
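
To make the comparison in the abstract above concrete, the following is a minimal Python sketch that ranks the three reported scores and computes each model's gap to the best score in percentage points. Only the three percentages come from the abstract; the names reported_accuracy, rank_models, and gaps_to_best are illustrative assumptions and are not taken from the paper.

```python
# Scores (%) as reported in the abstract; everything else here is illustrative.
reported_accuracy = {
    "BingChat": 92.4,
    "Bard": 86.0,
    "ChatGPT (GPT-3.5)": 79.2,
}


def rank_models(scores):
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def gaps_to_best(scores):
    """Return each model's gap to the top score, in percentage points."""
    best = max(scores.values())
    # Round to one decimal to avoid floating-point noise in the differences.
    return {model: round(best - score, 1) for model, score in scores.items()}


if __name__ == "__main__":
    for model, score in rank_models(reported_accuracy):
        print(f"{model}: {score:.1f}%")
    print(gaps_to_best(reported_accuracy))
```

Under these figures, Bard trails BingChat by 6.4 percentage points and ChatGPT (GPT-3.5) trails it by 13.2. The sketch says nothing about statistical significance, which would require the per-question results from the full text.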


research
06/27/2023

Evaluating GPT-3.5 and GPT-4 on Grammatical Error Correction for Brazilian Portuguese

We investigate the effectiveness of GPT-3.5 and GPT-4, two large languag...
research
06/12/2023

Lost in Translation: Large Language Models in Non-English Content Analysis

In recent years, large language models (e.g., Open AI's GPT-4, Meta's LL...
research
07/17/2023

ChatGPT is Good but Bing Chat is Better for Vietnamese Students

This study examines the efficacy of two SOTA large language models (LLMs...
research
04/09/2023

Can ChatGPT and Bard Generate Aligned Assessment Items? A Reliability Analysis against Human Performance

ChatGPT and Bard are AI chatbots based on Large Language Models (LLM) th...
research
08/19/2019

The Natural Selection of Words: Finding the Features of Fitness

We introduce a dataset for studying the evolution of words, constructed ...
research
01/16/2023

PromptShots at the FinNLP-2022 ERAI Tasks: Pairwise Comparison and Unsupervised Ranking

This report describes our PromptShots submissions to a shared task on Ev...
research
05/22/2023

DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules

Existing large language models (LLMs) that mainly focus on Standard Amer...
