LEAPME: Learning-based Property Matching with Embeddings

10/05/2020
by Daniel Ayala, et al.

Data integration tasks such as the creation and extension of knowledge graphs involve the fusion of heterogeneous entities from many sources. Matching and fusing such entities also requires matching and combining their properties (attributes). However, previous schema matching approaches mostly focus on only two sources and often rely on simple similarity measures. They therefore struggle in challenging use cases such as the integration of heterogeneous product entities from many sources. We present a new machine learning-based property matching approach called LEAPME (LEArning-based Property Matching with Embeddings) that uses numerous features of both property names and instance values. The approach makes heavy use of word embeddings to better capture the domain-specific semantics of both property names and instance values. Supervised machine learning helps exploit the predictive power of these word embeddings. Our comparative evaluation against five baselines on several multi-source datasets with real-world data shows the high effectiveness of LEAPME. We also show that our approach remains effective even when it is trained on data from another domain (transfer learning).
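To make the general idea concrete, here is a minimal sketch of supervised property matching over pairs of properties, each given as a name plus some instance values. It is not the LEAPME implementation: the abstract does not specify the feature set or the classifier, so everything below is an assumption made for illustration. In particular, `toy_embed` is a hypothetical hash-based placeholder that carries no semantics (a real system would look up pretrained word vectors), the three feature groups are only one plausible choice, and plain logistic regression stands in for whatever supervised model the authors use.

```python
import hashlib
import math
from difflib import SequenceMatcher

DIM = 16  # toy embedding size (assumption); real pretrained vectors are larger


def toy_embed(token):
    """Deterministic hash-based vector: a placeholder, NOT a semantic embedding."""
    digest = hashlib.sha256(token.lower().encode("utf-8")).digest()
    return [byte / 255.0 - 0.5 for byte in digest[:DIM]]


def mean_embedding(texts):
    """Average the token vectors over all whitespace-separated tokens."""
    tokens = [tok for text in texts for tok in text.split()]
    if not tokens:
        return [0.0] * DIM
    vectors = [toy_embed(tok) for tok in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def pair_features(prop_a, prop_b):
    """Feature vector for a pair of (name, instance_values) properties."""
    (name_a, values_a), (name_b, values_b) = prop_a, prop_b
    name_diff = [abs(x - y) for x, y in
                 zip(mean_embedding([name_a]), mean_embedding([name_b]))]
    value_diff = [abs(x - y) for x, y in
                  zip(mean_embedding(values_a), mean_embedding(values_b))]
    name_sim = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
    return name_diff + value_diff + [name_sim]


def predict(model, features):
    """Match probability from a logistic model given as (weights, bias)."""
    weights, bias = model
    z = sum(w * x for w, x in zip(weights, features)) + bias
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=500, lr=0.5):
    """Plain stochastic gradient descent on the logistic loss."""
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        for features, label in zip(X, y):
            error = predict(model, features) - label
            weights = [w - lr * error * x for w, x in zip(model[0], features)]
            model = (weights, model[1] - lr * error)
    return model


if __name__ == "__main__":
    # Hypothetical product properties from different sources.
    weight_a = ("weight", ["1.2 kg", "900 g"])
    weight_b = ("product weight", ["1.1 kg", "850 g"])
    color_a = ("color", ["black", "silver"])
    color_b = ("colour", ["black", "red"])
    labeled_pairs = [(weight_a, weight_b, 1), (color_a, color_b, 1),
                     (weight_a, color_b, 0), (color_a, weight_b, 0)]
    X = [pair_features(a, b) for a, b, _ in labeled_pairs]
    y = [label for _, _, label in labeled_pairs]
    model = train_logistic(X, y)
    print(predict(model, pair_features(weight_a, weight_b)))
```

Framing the task as classification over property pairs means properties from any number of sources can be compared with one trained model, rather than aligning exactly two schemas at a time. With real embeddings in place of `toy_embed`, the same pipeline could in principle be trained on labeled pairs from one domain and applied to another.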
