Language Model-driven Negative Sampling

03/09/2022
by   Mirza Mohtashim Alam, et al.
2

Knowledge Graph Embeddings (KGEs) encode the entities and relations of a knowledge graph (KG) into a vector space with a purpose of representation learning and reasoning for an ultimate downstream task (i.e., link prediction, question answering). Since KGEs follow closed-world assumption and assume all the present facts in KGs to be positive (correct), they also require negative samples as a counterpart for learning process for truthfulness test of existing triples. Therefore, there are several approaches for creating negative samples from the existing positive ones through a randomized distribution. This choice of generating negative sampling affects the performance of the embedding models as well as their generalization. In this paper, we propose an approach for generating negative sampling considering the existing rich textual knowledge in KGs. representations of the entities inside a KG. Particularly, a pre-trained Language Model (LM) is utilized to obtain the contextual representation of symbolic entities. Our approach is then capable of generating more meaningful negative samples in comparison to other state of the art methods. Our comprehensive evaluations demonstrate the effectiveness of the proposed approach across several benchmark datasets for like prediction task. In addition, we show cased our the functionality of our approach on a clustering task where other methods fall short.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset