Engineering a Simplified 0-Bit Consistent Weighted Sampling

03/30/2018
by Edward Raff, et al.

The Min-Hashing approach to sketching has become an important tool in data analysis, search, and classification. For real-valued datasets, the ICWS algorithm is the seminal and widely used approach, and it provides state-of-the-art performance in this problem space. However, ICWS becomes computationally burdensome as the sketch size K increases. We develop a new Simplified approach to the ICWS algorithm that yields speedups of over 20x compared to the standard algorithm. We demonstrate the validity of our approach empirically on multiple datasets, showing that our new Simplified CWS obtains the same quality of results while being an order of magnitude faster.
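To make the cost that grows with K concrete, the following is a minimal sketch of the standard ICWS baseline in its 0-bit form, where only the selected feature index is kept for each of the K hashes. It is not the Simplified CWS method described above, whose details are not given here. The function name `icws_sketch`, the dense-vector input, and the seeded random tables are assumptions made for illustration.

```python
import numpy as np

def icws_sketch(x, K, seed=0):
    """0-bit ICWS sketch of a non-negative vector x (illustrative baseline).

    Assumption: the random variables are drawn as dense (K, D) tables from a
    shared seed so that every vector hashed with the same seed sees the same
    draws, which is what makes the sampling consistent.
    """
    x = np.asarray(x, dtype=float)
    D = x.shape[0]
    nz = np.flatnonzero(x > 0)          # only positive weights participate
    if nz.size == 0:
        raise ValueError("x must contain at least one positive weight")

    rng = np.random.default_rng(seed)
    r = rng.gamma(2.0, 1.0, size=(K, D))       # r ~ Gamma(2, 1)
    c = rng.gamma(2.0, 1.0, size=(K, D))       # c ~ Gamma(2, 1)
    beta = rng.uniform(0.0, 1.0, size=(K, D))  # beta ~ Uniform(0, 1)

    r_nz, c_nz, b_nz = r[:, nz], c[:, nz], beta[:, nz]
    t = np.floor(np.log(x[nz]) / r_nz + b_nz)  # quantized log-weight
    ln_y = r_nz * (t - b_nz)                   # ln y = r * (t - beta)
    ln_a = np.log(c_nz) - ln_y - r_nz          # ln a = ln c - ln y - r

    # For each of the K hashes, keep the index with the smallest a.
    # The 0-bit variant discards t and stores the index alone.
    return nz[np.argmin(ln_a, axis=1)]

# Usage: the fraction of matching entries approximates the generalized
# Jaccard similarity sum(min(x, y)) / sum(max(x, y)).
sx = icws_sketch([1.0, 2.0, 0.0, 3.0], K=256)
sy = icws_sketch([2.0, 1.0, 1.0, 3.0], K=256)
estimate = np.mean(sx == sy)
```

Each hash touches every non-zero feature, so the work is proportional to K times the number of non-zero entries. This is the cost that grows with the sketch size.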

