Retrieval Interaction Machine for Tabular Data Prediction

08/11/2021
by Jiarui Qin, et al.

Prediction over tabular data is an essential task in many data science applications, such as recommender systems, online advertising, and medical treatment. Tabular data is structured into rows and columns, with each row representing a data sample and each column a feature attribute. Both the rows and the columns carry useful patterns that can improve prediction performance. However, most existing models focus on cross-column patterns and overlook cross-row patterns, because they process each sample independently. In this work, we propose a general learning framework named Retrieval Interaction Machine (RIM) that exploits both cross-row and cross-column patterns in tabular data. Specifically, RIM first uses search engine techniques to efficiently retrieve rows of the table that are useful for predicting the label of the target row. It then uses feature interaction networks to capture the cross-column patterns among the target row and the retrieved rows and to make the final label prediction. We conduct extensive experiments on 11 datasets covering three important tasks: CTR prediction (classification), top-n recommendation (ranking), and rating prediction (regression). Experimental results show that RIM achieves significant improvements over state-of-the-art models and various baselines, demonstrating its effectiveness.
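
The retrieve-then-predict idea described above can be illustrated with a toy sketch. This is not the RIM implementation. It assumes a simple count of shared feature values in place of search-engine ranking, and a plain average of retrieved labels in place of the feature interaction network. The function names (retrieve_rows, aggregate_labels) and the sample table are hypothetical and chosen only for illustration.

```python
# Toy sketch of cross-row retrieval for tabular prediction.
# Assumption: overlap counting stands in for a real search-engine ranker,
# and label averaging stands in for a learned feature interaction network.

def retrieve_rows(target, table, k=2):
    """Return up to k rows of `table` sharing the most feature values with `target`."""
    def overlap(row):
        # Number of columns whose value in `row` equals the target's value.
        return sum(
            1 for col, val in target.items()
            if row["features"].get(col) == val
        )

    scored = [(overlap(row), idx) for idx, row in enumerate(table)]
    # Highest overlap first; ties are broken by original row order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    # Rows with no shared value are never returned.
    return [table[idx] for score, idx in scored[:k] if score > 0]


def aggregate_labels(retrieved):
    """Average the labels of the retrieved rows, or None if nothing was retrieved."""
    if not retrieved:
        return None
    return sum(row["label"] for row in retrieved) / len(retrieved)


# Hypothetical labeled table: each row is a sample, each key a feature column.
table = [
    {"features": {"user": "u1", "item": "i1", "city": "NYC"}, "label": 1},
    {"features": {"user": "u1", "item": "i2", "city": "NYC"}, "label": 0},
    {"features": {"user": "u2", "item": "i1", "city": "LA"}, "label": 1},
    {"features": {"user": "u3", "item": "i3", "city": "SF"}, "label": 0},
]

# Target row whose label is unknown.
target = {"user": "u1", "item": "i1", "city": "LA"}

retrieved = retrieve_rows(target, table, k=2)
mean_label = aggregate_labels(retrieved)
print([row["features"] for row in retrieved], mean_label)
```

In this sketch the averaged label is only a crude cross-row signal. In a setup closer to the one the abstract describes, the retrieved rows and their labels would be passed, together with the target row, to a model that learns the cross-column interactions.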
