Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing

07/17/2020
by Ricardo Ñanculef, et al.

Semantic hashing is an emerging technique for large-scale similarity search that represents high-dimensional data with similarity-preserving binary codes, which enable efficient indexing and search. It has recently been shown that variational autoencoders with Bernoulli latent representations parametrized by neural nets can be successfully trained to learn such codes in supervised and unsupervised scenarios, improving on more traditional methods thanks to their ability to handle the binary constraints architecturally. However, the scenario where labels are scarce has not yet been studied. This paper investigates how robust hashing methods based on variational autoencoders are to a lack of supervision, focusing on two semi-supervised approaches currently in use. The first augments the variational autoencoder's training objective to jointly model the distribution over the data and the class labels. The second exploits the annotations to define an additional pairwise loss that enforces consistency between similarity in the code (Hamming) space and similarity in the label space. Our experiments show that both methods can significantly increase the quality of the hash codes. The pairwise approach can exhibit an advantage when the number of labelled points is large. However, we found that this method degrades quickly and loses its advantage as the number of labelled samples decreases. To circumvent this problem, we propose a novel supervision method in which the model uses its own predicted label distributions to implement the pairwise objective. Compared to the best baseline, this procedure yields similar performance in fully supervised settings but improves the results significantly when labelled data is scarce. Our code is made publicly available at https://github.com/amacaluso/SSB-VAE.
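
To make the pairwise idea in the abstract concrete, the following is a minimal NumPy sketch of a loss that ties Hamming distance between codes to agreement in label space. It is an illustration under stated assumptions, not the loss used in the paper or in the linked repository: the contrastive form, the margin value, and the function names hamming_distances and pairwise_consistency_loss are all hypothetical. Label agreement is taken as the inner product of two label distributions, so the same function accepts one-hot annotations or, as in the self-supervised variant the abstract describes, the model's own predicted class probabilities.

```python
import numpy as np


def hamming_distances(codes):
    """Pairwise Hamming distances between the rows of a {0,1} code matrix."""
    codes = np.asarray(codes, dtype=np.int64)
    ones = codes.sum(axis=1)
    # For binary vectors a, b: d(a, b) = |a| + |b| - 2 * <a, b>
    return ones[:, None] + ones[None, :] - 2 * (codes @ codes.T)


def pairwise_consistency_loss(codes, label_probs, margin=2.0):
    """Contrastive-style loss (an assumed form, for illustration only).

    Pairs that agree in label space are penalised by their Hamming distance;
    pairs that disagree are penalised when closer than `margin` bits.
    """
    label_probs = np.asarray(label_probs, dtype=float)
    dist = hamming_distances(codes).astype(float)
    # Soft agreement: probability that two points share a class.
    sim = label_probs @ label_probs.T
    pull = sim * dist
    push = (1.0 - sim) * np.maximum(0.0, margin - dist)
    # Average over ordered pairs, excluding each point paired with itself.
    off_diag = ~np.eye(len(dist), dtype=bool)
    return float((pull + push)[off_diag].mean())


# Three points: the first two share class 0, the third is class 1.
labels = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
good_codes = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0]])
bad_codes = np.array([[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]])

loss_good = pairwise_consistency_loss(good_codes, labels)  # 0.0
loss_bad = pairwise_consistency_loss(bad_codes, labels)    # 2.0
```

Codes that place same-class points together and other points at least margin bits apart incur zero loss, while codes that invert this arrangement are penalised. In a trainable model the hard codes would be replaced by a differentiable relaxation (for example, Bernoulli probabilities), which this sketch does not attempt.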

