An Efficient Metric of Automatic Weight Generation for Properties in Instance Matching Technique

02/12/2015
by   Md. Hanif Seddiqui, et al.

The proliferation of heterogeneous data sources in semantic knowledge bases intensifies the need for an automatic instance matching technique. However, the efficiency of instance matching is often influenced by the weight of a property associated with instances. Automatic weight generation is a non-trivial but important task in instance matching, and identifying an appropriate metric for generating the weight of a property automatically is therefore a formidable task. In this paper, we investigate an approach to generating weights automatically by considering two hypotheses: (1) the weight of a property is directly proportional to the ratio of the number of its distinct values to the number of instances that contain the property, and (2) the weight is also proportional to the ratio of the number of distinct values of a property to the number of instances in a training dataset. The basic intuition behind our approach is the classical theory of information content, which holds that infrequent words are more informative than frequent ones. Our mathematical model derives a metric for generating property weights automatically, which is applied in an instance matching system to produce reconciled instances efficiently. Our experiments and evaluations show the effectiveness of the proposed metric of automatic weight generation for properties in an instance matching technique.
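
The two hypotheses can be illustrated with a minimal Python sketch. The abstract does not state the derived metric, so the product of the two ratios used here is only an assumption chosen for illustration, not the authors' formula. The function name `property_weights`, the dictionary-based instance representation, and the sample data are likewise hypothetical, and each property is assumed to hold a single hashable value per instance.

```python
from collections import defaultdict


def property_weights(instances):
    """Return an illustrative weight for each property.

    instances: list of dicts mapping property name -> value.
    ASSUMPTION: the weight is the product of the two ratios named in the
    hypotheses; the paper's actual metric may combine them differently.
    """
    total = len(instances)
    if total == 0:
        return {}

    distinct_values = defaultdict(set)  # property -> set of observed values
    containing = defaultdict(int)       # property -> instances that have it

    for instance in instances:
        for prop, value in instance.items():
            distinct_values[prop].add(value)
            containing[prop] += 1

    weights = {}
    for prop, values in distinct_values.items():
        # Hypothesis (1): distinct values / instances containing the property
        ratio_containing = len(values) / containing[prop]
        # Hypothesis (2): distinct values / all instances in the training set
        ratio_total = len(values) / total
        weights[prop] = ratio_containing * ratio_total
    return weights


# Hypothetical training data: four instances with partly missing properties.
instances = [
    {"name": "Alice", "country": "JP"},
    {"name": "Bob", "country": "JP"},
    {"name": "Carol", "country": "BD", "email": "c@example.org"},
    {"name": "Dave"},
]

for prop, weight in sorted(property_weights(instances).items()):
    print(f"{prop}: {weight:.3f}")
```

In this toy dataset, `name` takes a different value in every instance and receives the highest weight, while `country` repeats a value and `email` appears in only one instance, so both score lower. This matches the stated intuition that infrequent values are more informative than frequent ones.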
