IITK at the FinSim Task: Hypernym Detection in Financial Domain via Context-Free and Contextualized Word Embeddings

07/22/2020
by Vishal Keswani, et al.

In this paper, we present our approaches for the FinSim 2020 shared task on "Learning Semantic Representations for the Financial Domain". The goal of this task is to classify financial terms into the most relevant hypernym (or top-level) concept in an external ontology. We use both context-dependent and context-independent word embeddings in our analysis. Our systems use Word2vec embeddings trained from scratch on the corpus (financial prospectuses in English) along with pre-trained BERT embeddings. We divide the test dataset into two subsets based on a domain rule. For the first subset, we use unsupervised distance measures to classify each term. For the second subset, we use simple supervised classifiers such as Naive Bayes on top of the embeddings to arrive at a final prediction. Finally, we combine the results from the two subsets. Our system ranks first on both metrics, mean rank and accuracy.
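The unsupervised step and the two evaluation metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the label names, and the three-dimensional vectors are all invented for illustration, and cosine similarity stands in for whichever distance measures the paper actually uses. In practice the vectors would come from Word2vec or BERT.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_hypernyms(term_vec, label_vecs):
    """Return all candidate labels, most similar to the term first."""
    return sorted(
        label_vecs,
        key=lambda label: cosine(term_vec, label_vecs[label]),
        reverse=True,
    )

def mean_rank_and_accuracy(rankings, gold):
    """Mean rank of the gold label (1 = best) and top-1 accuracy."""
    ranks = [ranking.index(g) + 1 for ranking, g in zip(rankings, gold)]
    mean_rank = sum(ranks) / len(ranks)
    accuracy = sum(r == 1 for r in ranks) / len(ranks)
    return mean_rank, accuracy

# Toy data (hypothetical): hand-made vectors, not real embeddings.
LABEL_VECS = {
    "Bonds": [1.0, 0.0, 0.0],
    "Swap": [0.0, 1.0, 0.0],
    "Funds": [0.0, 0.0, 1.0],
}
TERM_VECS = {
    "convertible bond": [0.9, 0.1, 0.0],
    "interest rate swap": [0.1, 0.8, 0.1],
    "hedge fund": [0.5, 0.0, 0.4],  # deliberately closer to "Bonds" than "Funds"
}
GOLD = ["Bonds", "Swap", "Funds"]

if __name__ == "__main__":
    rankings = [rank_hypernyms(vec, LABEL_VECS) for vec in TERM_VECS.values()]
    print(mean_rank_and_accuracy(rankings, GOLD))
```

In this toy setup the third term is misranked on purpose, so the gold label lands in second place. That shows how mean rank still gives partial credit where accuracy gives none.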


research
08/21/2021

Yseop at FinSim-3 Shared Task 2021: Specializing Financial Domain Learning with Phrase Representations

In this paper, we present our approaches for the FinSim-3 Shared Task 20...
research
09/30/2021

DICoE@FinSim-3: Financial Hypernym Detection using Augmented Terms and Distance-based Features

We present the submission of team DICoE for FinSim-3, the 3rd Shared Tas...
research
07/13/2021

Exploiting Network Structures to Improve Semantic Representation for the Financial Domain

This paper presents the participation of the MiniTrue team in the FinSim...
research
09/29/2021

EDGAR-CORPUS: Billions of Tokens Make The World Go Round

We release EDGAR-CORPUS, a novel corpus comprising annual reports from a...
research
04/19/2017

Predicting Role Relevance with Minimal Domain Expertise in a Financial Domain

Word embeddings have made enormous inroads in recent years in a wide var...
research
11/21/2018

Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings

Domain generation algorithms (DGAs) are frequently employed by malware t...
research
03/20/2023

Learning Semantic Text Similarity to rank Hypernyms of Financial Terms

Over the years, there has been a paradigm shift in how users access fina...
