Unsupervised Object Matching for Relational Data

10/09/2018
by Tomoharu Iwata, et al.

We propose an unsupervised object matching method for relational data, which finds matchings between objects in different relational datasets without correspondence information. For example, the proposed method matches documents in different languages in multi-lingual document-word networks without dictionaries or alignment information. The proposed method assumes that each object has a latent vector, and the probability of its neighboring objects is modeled by the inner product of the latent vectors, where the neighbors are generated by short random walks over the relations. The latent vectors are estimated by maximizing the likelihood of the neighbors for each dataset, so that they capture the hidden structural information of each object in the given relational dataset. Then, the proposed method linearly projects the latent vectors of all datasets onto a common latent space by matching their distributions while preserving the structural information. The projection matrix is estimated by minimizing the distance between the latent vector distributions with an orthogonality regularizer. To represent the distributions effectively, we use the kernel embedding of distributions, which holds high-order moment information about a distribution as an element of a reproducing kernel Hilbert space and enables us to calculate the distance between distributions without density estimation. The orthogonality regularizer preserves the structural information encoded in the latent vectors. We demonstrate the effectiveness of the proposed method with experiments using real-world multi-lingual document-word relational datasets and multiple user-item relational datasets.
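The distribution-matching step can be sketched as follows. This is a minimal illustrative example, not the authors' implementation: it assumes the per-dataset latent vectors have already been learned (e.g., by a random-walk/skip-gram style model), uses a Gaussian-kernel maximum mean discrepancy as the kernel-embedding distance, and learns a single projection matrix for the second dataset with an orthogonality penalty. Function names and hyperparameters are hypothetical.

```python
# Hypothetical sketch of the projection/distribution-matching step.
# X: latent vectors of dataset 1 (kept as the common space), shape (n, d).
# Y: latent vectors of dataset 2, shape (m, d), to be projected by W.
import torch

def gaussian_kernel(a, b, gamma=1.0):
    # Pairwise RBF kernel k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2).
    return torch.exp(-gamma * torch.cdist(a, b).pow(2))

def mmd2(a, b, gamma=1.0):
    # Squared maximum mean discrepancy between the empirical distributions
    # of a and b, computed from kernel evaluations (no density estimation).
    return (gaussian_kernel(a, a, gamma).mean()
            + gaussian_kernel(b, b, gamma).mean()
            - 2.0 * gaussian_kernel(a, b, gamma).mean())

def fit_projection(X, Y, lam=1.0, gamma=1.0, steps=500, lr=1e-2):
    # Learn W minimizing MMD^2(X, Y W) + lam * ||W^T W - I||_F^2,
    # i.e., match the distributions while keeping W close to orthogonal.
    d = Y.shape[1]
    W = torch.eye(d, requires_grad=True)
    opt = torch.optim.Adam([W], lr=lr)
    I = torch.eye(d)
    for _ in range(steps):
        opt.zero_grad()
        loss = mmd2(X, Y @ W, gamma) + lam * ((W.T @ W - I) ** 2).sum()
        loss.backward()
        opt.step()
    return W.detach()
```

After fitting, objects in the two datasets can be matched by nearest-neighbor search between X and Y @ W in the shared latent space.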
