Company Similarity using Large Language Models

08/15/2023
by Dimitrios Vamvourellis, et al.

Identifying companies with similar profiles is a core task in finance, with a wide range of applications in portfolio construction, asset pricing, and risk attribution. When a rigorous definition of similarity is lacking, financial analysts usually resort to 'traditional' industry classifications such as the Global Industry Classification Standard (GICS), which assign a unique category to each company at different levels of granularity. Because they are discrete, however, GICS classifications do not allow companies to be ranked by similarity. In this paper, we explore the ability of pre-trained and fine-tuned large language models (LLMs) to learn company embeddings from the business descriptions reported in SEC filings. We show that GICS classifications can be reproduced using the embeddings as features. We also benchmark these embeddings on various machine learning and financial metrics, and conclude that companies that are similar according to the embeddings are also similar in terms of financial performance metrics, including return correlation.
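
To illustrate why continuous embeddings allow a ranking that discrete industry labels cannot, here is a minimal sketch of similarity ranking over company vectors. It assumes cosine similarity as the measure, which the abstract does not specify. The company names, the three-dimensional vectors, and the helper functions `cosine_similarity` and `rank_similar` are hypothetical placeholders, not the paper's data or method. In practice the vectors would come from an LLM applied to business descriptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_similar(target, embeddings):
    """Return (name, score) pairs for all other companies, most similar first."""
    query = embeddings[target]
    scores = [
        (name, cosine_similarity(query, vec))
        for name, vec in embeddings.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for LLM-derived company embeddings.
toy_embeddings = {
    "BankA": [0.9, 0.1, 0.0],
    "BankB": [0.8, 0.2, 0.1],
    "SoftwareC": [0.1, 0.9, 0.3],
    "UtilityD": [0.0, 0.2, 0.9],
}

ranking = rank_similar("BankA", toy_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Unlike a shared GICS code, which only says whether two companies fall in the same category, the scores give a full ordering of peers for any target company.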

research
07/11/2023

Named entity recognition using GPT for identifying comparable companies

For both public and private firms, comparable companies analysis is wide...
research
07/18/2023

Company2Vec – German Company Embeddings based on Corporate Websites

With Company2Vec, the paper proposes a novel application in representati...
research
06/18/2023

CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification

In the investment industry, it is often essential to carry out fine-grai...
research
12/03/2022

Harnessing label semantics to extract higher performance under noisy label for Company to Industry matching

Assigning appropriate industry tag(s) to a company is a critical task in...
research
07/08/2023

Incorporating Deep Q – Network with Multiclass Classification Algorithms

In this study, we explore how Deep Q-Network (DQN) might improve the fun...
research
05/10/2022

Turtle Score – Similarity Based Developer Analyzer

In day-to-day life, a highly demanding task for IT companies is to find ...
research
03/20/2023

Learning Semantic Text Similarity to rank Hypernyms of Financial Terms

Over the years, there has been a paradigm shift in how users access fina...
