Evaluating approaches for supervised semantic labeling

01/29/2018
by Natalia Ruemmele, et al.

Relational data sources are still one of the most popular ways to store enterprise or Web data; however, relational schemas lack a well-defined semantic description. A common ontology provides a way to represent the meaning of a relational schema and can facilitate the integration of heterogeneous data sources within a domain. Semantic labeling is achieved by mapping attributes from the data sources to the classes and properties in the ontology. We formulate this as a multi-class classification problem in which previously labeled data sources are used to learn rules for labeling new data sources. The majority of existing approaches for semantic labeling have focused on data integration challenges such as naming conflicts and semantic heterogeneity. In addition, machine learning approaches typically have issues around class imbalance, lack of labeled instances, and the relative importance of attributes. To address these issues, we develop a new machine learning model with engineered features as well as two deep learning models which do not require extensive feature engineering. We evaluate our new approaches against the state of the art.
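To make the multi-class formulation concrete, the following is a minimal sketch and not the authors' model. It assumes each attribute (column) is summarized by a few hand-picked features and is assigned the label of the nearest class centroid. The feature set, the nearest-centroid classifier, the sample columns, and the label names such as `Person.name` are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import dist


def column_features(values):
    """Summarize a non-empty column (list of strings) as a small feature vector.

    These four features are illustrative assumptions, not the paper's feature set.
    """
    n = len(values)
    chars = "".join(values)
    total_chars = max(len(chars), 1)
    return (
        sum(v.replace(".", "", 1).isdigit() for v in values) / n,  # share of numeric cells
        sum(c.isalpha() for c in chars) / total_chars,             # share of letter characters
        sum("@" in v for v in values) / n,                         # share of cells containing '@'
        min(sum(len(v) for v in values) / n / 20.0, 1.0),          # mean cell length, scaled to [0, 1]
    )


def train(labeled_columns):
    """Average the feature vectors of the training columns for each semantic label."""
    grouped = defaultdict(list)
    for values, label in labeled_columns:
        grouped[label].append(column_features(values))
    return {
        label: tuple(sum(dim) / len(vectors) for dim in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, values):
    """Return the label whose centroid is closest to the new column's features."""
    feats = column_features(values)
    return min(centroids, key=lambda label: dist(feats, centroids[label]))


# Hypothetical, previously labeled attributes: (column values, ontology label).
training = [
    (["Alice Smith", "Bob Jones", "Carol White"], "Person.name"),
    (["Dan Brown", "Eve Black"], "Person.name"),
    (["alice@example.com", "bob@example.org"], "Person.email"),
    (["19.99", "5.00", "120.50"], "Product.price"),
]
centroids = train(training)

# Label attributes from a new, unlabeled data source.
print(predict(centroids, ["carol@example.net", "dave@example.com"]))
print(predict(centroids, ["3.50", "42"]))
```

Each ontology class or property becomes one target class, so labeling a new source reduces to one prediction per attribute. A classifier this simple does nothing about the class imbalance or scarce labels that the abstract raises.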


