Using ChatGPT for Entity Matching

05/05/2023
by Ralph Peeters, et al.

Entity Matching is the task of deciding whether two entity descriptions refer to the same real-world entity. State-of-the-art entity matching methods often rely on fine-tuning Transformer models such as BERT or RoBERTa. Two major drawbacks of using these models for entity matching are that (i) the models require significant amounts of fine-tuning data to reach good performance and (ii) the fine-tuned models are not robust to out-of-distribution entities. In this paper, we investigate using ChatGPT for entity matching as a more robust, training data-efficient alternative to traditional Transformer models. We perform experiments along three dimensions: (i) general prompt design, (ii) in-context learning, and (iii) provision of higher-level matching knowledge. We show that ChatGPT is competitive with a fine-tuned RoBERTa model, reaching an average zero-shot performance of 83% F1 on a challenging matching task on which RoBERTa requires 2000 training examples to reach similar performance. Adding in-context demonstrations to the prompts further improves the F1 by up to 5 percentage points. Finally, we show that guiding the zero-shot model by stating higher-level matching rules leads to similar gains as providing in-context examples.
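
The prompting setup described in the abstract can be illustrated with a small sketch. The code below is not the paper's actual prompt design; it is a minimal illustration of zero-shot and few-shot (in-context demonstration) entity-matching prompts, assuming the pre-1.0 openai-python ChatCompletion client and a gpt-3.5-turbo model. The prompt wording, demonstration pairs, and helper names are hypothetical.

```python
# Minimal sketch of zero-shot / few-shot entity matching with ChatGPT.
# Assumes the pre-1.0 openai-python client; prompts and examples are illustrative.
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: key supplied by the caller

PROMPT_TEMPLATE = (
    "Do the following two product descriptions refer to the same real-world product? "
    "Answer with 'Yes' or 'No'.\n"
    "Product 1: {left}\n"
    "Product 2: {right}"
)

# Optional in-context demonstrations (few-shot); these pairs are made up.
DEMONSTRATIONS = [
    ("DYMO D1 Tape 12mm x 7m black on white", "DYMO D1 12 mm x 7 m, black/white", "Yes"),
    ("DYMO D1 Tape 12mm x 7m black on white", "DYMO D1 Tape 9mm x 7m black on blue", "No"),
]

def match(left: str, right: str, use_demonstrations: bool = False) -> str:
    """Ask the chat model whether two entity descriptions match."""
    messages = [{"role": "system", "content": "You are a product matching assistant."}]
    if use_demonstrations:
        # Few-shot: prepend labeled demonstration pairs as prior chat turns.
        for demo_left, demo_right, label in DEMONSTRATIONS:
            messages.append({"role": "user",
                             "content": PROMPT_TEMPLATE.format(left=demo_left, right=demo_right)})
            messages.append({"role": "assistant", "content": label})
    # The actual pair to be decided.
    messages.append({"role": "user",
                     "content": PROMPT_TEMPLATE.format(left=left, right=right)})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",   # assumption: the ChatGPT model exposed via the API
        temperature=0,           # deterministic answers for matching decisions
        messages=messages,
    )
    return response.choices[0].message["content"].strip()

print(match("Nikon D3500 DSLR 24.2MP", "Nikon D3500 Digital SLR Camera Body"))
```

Higher-level matching knowledge, the paper's third dimension, would amount to adding rules to the system or user message (e.g. which attributes matter for deciding a match) instead of labeled demonstration pairs.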

Related research

09/04/2021
Robust fine-tuning of zero-shot models
Large pre-trained models such as CLIP offer consistent accuracy across a...

10/22/2022
Exploring The Landscape of Distributional Robustness for Question Answering Models
We conduct a large empirical evaluation to investigate the landscape of ...

07/05/2023
Improving Address Matching using Siamese Transformer Networks
Matching addresses is a critical task for companies and post offices inv...

10/24/2020
Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation
Models pretrained with self-supervised objectives on large text corpora ...

03/29/2023
Zero-shot Clinical Entity Recognition using ChatGPT
In this study, we investigated the potential of ChatGPT, a large languag...

10/12/2022
Are Sample-Efficient NLP Models More Robust?
Recent work has observed that pre-trained models have higher out-of-dist...

09/12/2023
Characterizing Latent Perspectives of Media Houses Towards Public Figures
Media houses reporting on public figures, often come with their own bias...
