ChatLog: Recording and Analyzing ChatGPT Across Time

04/27/2023
by Shangqing Tu, et al.

While there is abundant research evaluating ChatGPT on natural language understanding and generation tasks, few studies have investigated how ChatGPT's behavior changes over time. In this paper, we collect a coarse-to-fine temporal dataset called ChatLog, consisting of two parts that are updated monthly and daily. ChatLog-Monthly is a dataset of 38,730 question-answer pairs collected every month, including questions from both reasoning and classification tasks. ChatLog-Daily, on the other hand, consists of ChatGPT's responses to the same 1,000 long-form generation questions, collected every day. We conduct comprehensive automatic and human evaluations to provide evidence that ChatGPT's behavior evolves over time. We further analyze the characteristics of ChatGPT that remain unchanged over time by extracting its knowledge and linguistic features. We find some stable features that improve the robustness of a RoBERTa-based detector on new versions of ChatGPT. We will continuously maintain our project at https://github.com/THU-KEG/ChatLog.
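To make the idea of a "stable feature" concrete, the sketch below shows one way to check whether a feature of daily responses stays roughly constant over time. It is an illustration under stated assumptions, not the authors' method: the two features (word count and type-token ratio), the function names, the coefficient-of-variation criterion, and the 0.05 threshold are all hypothetical choices that do not come from the paper.

```python
from statistics import mean, pstdev

# Hypothetical surface features; the paper's actual knowledge and
# linguistic features are not specified in this abstract.
def word_count(text: str) -> float:
    return float(len(text.split()))

def type_token_ratio(text: str) -> float:
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

FEATURES = {"word_count": word_count, "type_token_ratio": type_token_ratio}

def daily_feature_means(responses_by_day):
    """Map each day to the mean of every feature over that day's responses."""
    return {
        day: {name: mean(fn(r) for r in responses) for name, fn in FEATURES.items()}
        for day, responses in responses_by_day.items()
    }

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0

def stable_features(responses_by_day, threshold=0.05):
    """Return features whose daily means vary by at most `threshold` (assumed cutoff)."""
    per_day = daily_feature_means(responses_by_day)
    stable = []
    for name in FEATURES:
        series = [per_day[day][name] for day in sorted(per_day)]
        if coefficient_of_variation(series) <= threshold:
            stable.append(name)
    return stable

# Toy data: responses to the same two questions on two days.
toy = {
    "2023-04-01": ["a b c d", "a b c d"],
    "2023-04-02": ["a b c d", "a a c d"],
}
print(stable_features(toy))
```

On this toy input, word count is identical on both days while the type-token ratio drifts, so only the former passes the assumed threshold. A real analysis would use the released ChatLog-Daily responses and a richer feature set.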


