Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts

08/21/2023
by Fan Gao, et al.

Large Language Models (LLMs) have achieved significant success across a range of natural language processing (NLP) tasks, including question answering, summarization, and machine translation. While LLMs excel at general tasks, their efficacy in domain-specific applications remains underexplored. In addition, LLM-generated text sometimes exhibits problems such as hallucination and disinformation. In this study, we assess the ability of LLMs to produce concise survey articles in the NLP area of computer science, focusing on 20 chosen topics. Automated evaluations indicate that GPT-4 outperforms GPT-3.5 when benchmarked against the ground truth. Furthermore, four human evaluators provide insights from six perspectives across four model configurations. Through case studies, we demonstrate that while GPT often yields commendable results, it also has shortcomings, such as incomplete information and lapses in factual accuracy.


