Company2Vec – German Company Embeddings based on Corporate Websites

07/18/2023
by Christopher Gerling, et al.

This paper proposes Company2Vec, a novel application of representation learning. The model analyzes business activities from unstructured company website data using Word2Vec and dimensionality reduction. Company2Vec preserves semantic language structures and thus creates efficient company embeddings, even for fine-grained industries. These semantic embeddings can be used for various applications in banking. Because companies and words are directly related in the embedding space, semantic business analytics become possible (e.g. retrieving the top-n words for a company). Furthermore, industry prediction is presented both as a supervised learning application and as an evaluation method. The vectorized structure of the embeddings allows company similarities to be measured with the cosine distance. Company2Vec hence offers a more fine-grained comparison of companies than the standard industry labels (NACE). This property is relevant for unsupervised learning tasks such as clustering: an alternative industry segmentation is shown with k-means clustering on the company embeddings. Finally, the paper proposes three algorithms for (1) firm-centric, (2) industry-centric and (3) portfolio-centric peer-firm identification.
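
To illustrate how company similarity can be measured with the cosine distance, here is a minimal sketch. It is not the paper's implementation: the company names, the three-dimensional vectors, and the helper functions `cosine_similarity_matrix` and `top_n_peers` are all made up for illustration. It ranks peers by a plain nearest-neighbour lookup, which should not be read as any of the three peer-firm algorithms the paper proposes.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarity between row vectors."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero vectors.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def top_n_peers(names, embeddings, target, n=2):
    """Hypothetical helper: the n companies most similar to `target`."""
    idx = names.index(target)
    sims = cosine_similarity_matrix(embeddings)[idx]
    # Sort by descending similarity and skip the target itself.
    order = [int(i) for i in np.argsort(-sims) if i != idx]
    return [(names[i], float(sims[i])) for i in order[:n]]


# Toy, hand-written vectors standing in for real company embeddings.
names = ["bakery_a", "bakery_b", "bank_a", "bank_b"]
embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.8],
])

peers = top_n_peers(names, embeddings, "bakery_a", n=2)
for name, similarity in peers:
    print(f"{name}: {similarity:.3f}")
```

Cosine distance is one minus the cosine similarity, so the nearest peers are the ones with the highest similarity. The same embedding matrix could be passed to a clustering routine such as k-means to obtain an alternative industry segmentation.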


research
08/15/2023

Company Similarity using Large Language Models

Identifying companies with similar profiles is a core task in finance wi...
research
06/18/2023

CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification

In the investment industry, it is often essential to carry out fine-grai...
research
11/27/2019

Learning a faceted customer segmentation for discovering new business opportunities at Intel

For sales and marketing organizations within large enterprises, identify...
research
03/10/2017

Deep Learning in Customer Churn Prediction: Unsupervised Feature Learning on Abstract Company Independent Feature Vectors

As companies increase their efforts in retaining customers, being able t...
research
12/03/2022

Harnessing label semantics to extract higher performance under noisy label for Company to Industry matching

Assigning appropriate industry tag(s) to a company is a critical task in...
research
09/11/2020

Supervised learning for the prediction of firm dynamics

Thanks to the increasing availability of granular, yet high-dimensional,...
