Local Embeddings for Relational Data Integration

09/03/2019
by   Riccardo Cappuzzo, et al.
0

Integrating information from heterogeneous data sources is one of the fundamental problems facing any enterprise. Recently, it has been shown that deep learning based techniques such as embeddings are a promising approach for data integration problems. Prior efforts directly use pre-trained embeddings or simplistically adapt techniques from natural language processing to obtain relational embeddings. In this work, we propose algorithms for obtaining local embeddings that are effective for data integration tasks on relational data. We make three major contributions. First, we describe a compact graph-based representation that allows the specification of a rich set of relationships inherent in relational world. Second, we propose how to derive sentences from such graph that effectively describe the similarity across elements (tokens, attributes, rows) across the two datasets. The embeddings are learned based on such sentences. Finally, we propose a diverse collection of criteria to evaluate relational embeddings and perform extensive set of experiments validating them. Our experiments show that our system, EmbDI, produces meaningful results for data integration tasks and our embeddings improve the result quality for existing state of the art methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/28/2019

Inducing Relational Knowledge from BERT

One of the most remarkable properties of word embeddings is the fact tha...
research
12/17/2022

Relational Sentence Embedding for Flexible Semantic Matching

We present Relational Sentence Embedding (RSE), a new paradigm to furthe...
research
12/16/2021

Unsupervised Matching of Data and Text

Entity resolution is a widely studied problem with several proposals to ...
research
07/23/2020

Graph integration of structured, semistructured and unstructured data for data journalism

Nowadays, journalism is facilitated by the existence of large amounts of...
research
09/11/2019

Recognizing Variables from their Data via Deep Embeddings of Distributions

A key obstacle in automated analytics and meta-learning is the inability...
research
10/11/2022

Relational Embeddings for Language Independent Stance Detection

The large majority of the research performed on stance detection has bee...
research
01/17/2020

Siamese Graph Neural Networks for Data Integration

Data integration has been studied extensively for decades and approached...

Please sign up or login with your details

Forgot password? Click here to reset