An Effective Data Creation Pipeline to Generate High-quality Financial Instruction Data for Large Language Model

07/31/2023
by Ziao Wang, et al.

In the early era of large language models, generating a high-quality financial dataset is critical for fine-tuning a large language model on finance-related tasks. This paper therefore presents a carefully designed data creation pipeline for that purpose. Specifically, we initiate a dialogue between an AI investor and a financial expert using ChatGPT and incorporate feedback from human financial experts to refine the dataset. The pipeline yields a robust instruction-tuning dataset comprising 103k multi-turn chats. Extensive experiments are conducted on this dataset, with an external GPT-4 serving as the judge of model performance. The promising experimental results verify that our approach leads to significant advances in generating accurate, relevant, and financial-style responses from AI models, thus providing a powerful tool for applications within the financial sector.
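
The role-played dialogue step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompts, the function names (`simulate_dialogue`, `echo_stub`), and the `chat_fn` interface are all hypothetical, and an offline stub stands in for a real chat model call so the example runs without network access. Human expert feedback and GPT-4 judging are not modeled here.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
# Assumed interface: (system_prompt, history) -> reply text.
ChatFn = Callable[[str, List[Message]], str]

# Hypothetical role prompts; the paper's actual prompts are not given in the abstract.
INVESTOR_PROMPT = "You are an investor asking questions about finance."
EXPERT_PROMPT = "You are a financial expert answering an investor."


def simulate_dialogue(chat_fn: ChatFn, seed_topic: str, n_turns: int = 2) -> List[Message]:
    """Alternate investor questions and expert answers for n_turns rounds."""
    history: List[Message] = []
    for _ in range(n_turns):
        # The investor role asks a question given the topic and the chat so far.
        question = chat_fn(f"{INVESTOR_PROMPT} Topic: {seed_topic}", history)
        history.append({"role": "investor", "content": question})
        # The expert role answers with the full history in view.
        answer = chat_fn(EXPERT_PROMPT, history)
        history.append({"role": "expert", "content": answer})
    return history


def echo_stub(system_prompt: str, history: List[Message]) -> str:
    """Offline stand-in for a real chat model call; returns a canned label."""
    speaker = "Q" if "investor asking" in system_prompt else "A"
    return f"{speaker}{len(history) // 2 + 1}"


if __name__ == "__main__":
    for msg in simulate_dialogue(echo_stub, "bond yields"):
        print(msg["role"], msg["content"])
```

In practice `echo_stub` would be replaced by a function that calls a chat model, and the returned message list would be filtered or revised before being stored as one multi-turn training example.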


research
07/31/2023

FinVis-GPT: A Multimodal Large Language Model for Financial Chart Analysis

In this paper, we propose FinVis-GPT, a novel multimodal large language ...
research
09/11/2023

TeGit: Generating High-Quality Instruction-Tuning Data with Text-Grounded Task Design

High-quality instruction-tuning data is critical to improving LLM capabi...
research
07/22/2023

FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models

Financial risk prediction plays a crucial role in the financial sector. ...
research
07/08/2023

Can LLMs be Good Financial Advisors?: An Initial Study in Personal Decision Making for Optimized Outcomes

Increasingly powerful Large Language Model (LLM) based chatbots, like Ch...
research
08/07/2023

Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue

Recent advances in Large Language Models (LLMs) have achieved remarkable...
research
08/11/2023

Self-Alignment with Instruction Backtranslation

We present a scalable method to build a high quality instruction followi...
research
05/24/2023

RefGPT: Reference -> Truthful Customized Dialogues Generation by GPTs and for GPTs

General chat models, like ChatGPT, have attained impressive capability t...
