Can GPT-4 Support Analysis of Textual Data in Tasks Requiring Highly Specialized Domain Expertise?

06/24/2023
by Jaromír Šavelka, et al.

We evaluated the capability of generative pre-trained transformers (GPT-4) in the analysis of textual data in tasks that require highly specialized domain expertise. Specifically, we focused on the task of analyzing court opinions to interpret legal concepts. We found that GPT-4, prompted with annotation guidelines, performs on par with well-trained law student annotators. We observed that, with a relatively minor decrease in performance, GPT-4 can perform batch predictions, leading to significant cost reductions. However, employing chain-of-thought prompting did not lead to noticeably improved performance on this task. Further, we demonstrated how to analyze GPT-4's predictions to identify and mitigate deficiencies in annotation guidelines, and subsequently improve the performance of the model. Finally, we observed that the model is quite brittle, as small formatting-related changes in the prompt had a high impact on the predictions. These findings can be leveraged by researchers and practitioners who engage in semantic/pragmatic annotation of texts in tasks requiring highly specialized domain expertise.

research
05/08/2023

Unlocking Practical Applications in Legal Domain: Evaluation of GPT for Zero-Shot Semantic Annotation of Legal Texts

We evaluated the capability of a state-of-the-art generative pre-trained...
research
06/05/2023

LexGPT 0.1: pre-trained GPT-J models with Pile of Law

This research aims to build generative language models specialized for t...
research
10/31/2020

Effective Approach to Develop a Sentiment Annotator For Legal Domain in a Low Resource Setting

Analyzing the sentiments of legal opinions available in Legal Opinion Te...
research
06/04/2021

Annotation Curricula to Implicitly Train Non-Expert Annotators

Annotation studies often require annotators to familiarize themselves wi...
research
10/22/2021

Rethinking Generalization Performance of Surgical Phase Recognition with Expert-Generated Annotations

As the area of application of deep neural networks expands to areas requ...
research
07/20/2023

Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

Recent work has shown that language models' (LMs) prompt-based learning ...
research
05/03/2023

SCOTT: Self-Consistent Chain-of-Thought Distillation

Large language models (LMs) beyond a certain scale, demonstrate the emer...
