Evaluating ChatGPT as a Recommender System: A Rigorous Approach

09/07/2023
by Dario Di Palma, et al.

Large language models (LLMs) have recently gained popularity thanks to their impressive natural language capabilities. They contribute significantly to language-related tasks, and prompt-based learning makes them adaptable to specific tasks, helping to unlock their full potential and improving precision and generalization. Research communities are actively exploring their applications, with ChatGPT receiving particular recognition. Despite extensive research on large language models, their potential in recommendation scenarios remains largely unexplored. This study aims to fill this gap by investigating ChatGPT's capabilities as a zero-shot recommender system. Our goals include evaluating its ability to use user preferences for recommendations, to reorder existing recommendation lists, to leverage information from similar users, and to handle cold-start situations. We assess ChatGPT's performance through comprehensive experiments on three datasets (MovieLens Small, Last.FM, and Facebook Book), and we compare it against standard recommendation algorithms and other large language models, such as GPT-3.5 and PaLM-2. To measure recommendation effectiveness, we employ widely used evaluation metrics: Mean Average Precision (MAP), Recall, Precision, F1, normalized Discounted Cumulative Gain (nDCG), Item Coverage, Expected Popularity Complement (EPC), Average Coverage of Long Tail (ACLT), Average Recommendation Popularity (ARP), and Popularity-based Ranking-based Equal Opportunity (PopREO). By thoroughly exploring ChatGPT's abilities in recommender systems, our study aims to contribute to the growing body of research on the versatility and potential applications of large language models. Our experiment code is available on the GitHub repository: https://github.com/sisinflab/Recommender-ChatGPT
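
To make the accuracy metrics named in the abstract concrete, the following is a minimal sketch of how Precision, Recall, average precision (the per-user term that MAP averages over users), and nDCG are commonly computed at a cutoff k with binary relevance. It is not taken from the authors' repository: the function names, the binary-relevance assumption, and the choice to normalize average precision by min(|relevant|, k) are illustrative assumptions, and the paper's exact definitions may differ.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Average of precision values at each rank where a relevant item occurs.

    Assumption: normalized by min(|relevant|, k); other conventions exist.
    """
    hits = 0
    score = 0.0
    for i, item in enumerate(ranked[:k]):
        if item in relevant:
            hits += 1
            score += hits / (i + 1)
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of the list divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: one user's recommendation list and held-out items.
ranked_items = ["a", "b", "c", "d"]
relevant_items = {"a", "c"}
scores = {
    "precision@3": precision_at_k(ranked_items, relevant_items, 3),
    "recall@3": recall_at_k(ranked_items, relevant_items, 3),
    "ap@3": average_precision_at_k(ranked_items, relevant_items, 3),
    "ndcg@3": ndcg_at_k(ranked_items, relevant_items, 3),
}
```

The sketch assumes the ranked list contains no duplicate items. MAP would then be the mean of `average_precision_at_k` over all evaluated users.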

