The Bitwise Hashing Trick for Personalized Search

10/18/2019
by Braddock Gaskill, et al.

Many real-world problems require fast and efficient lexical comparison of large numbers of short text strings. Search personalization is one such domain. We introduce feature bit vectors built with the hashing trick to improve relevance in personalized search and other personalization applications. We present results for several lexical hashing and comparison methods, which are applied to a user's historical behavior and used to predict future behavior. Using a single bit per dimension instead of a floating-point value reduces data structure size by an order of magnitude while preserving or even improving quality. We use real data to simulate a search personalization task. On this task, a simple method for combining bit vectors improves compute time by an order of magnitude with only a small decrease in accuracy.
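
To make the idea concrete, here is a minimal Python sketch of a hashed feature bit vector. The abstract does not specify the tokenizer, hash function, vector width, or combining method, so everything here is an assumption for illustration: character trigrams, MD5 as a stable hash, 1024 bits, bitwise OR to combine a user's history, and the popcount of a bitwise AND as the comparison score. The function names are hypothetical and do not come from the paper.

```python
import hashlib

NUM_BITS = 1024  # assumed vector width; the abstract does not state one


def tokens(text):
    """Split a short string into lowercase character trigrams (assumed tokenizer)."""
    s = text.lower()
    if not s:
        return []
    # Strings shorter than three characters yield a single short token.
    return [s[i:i + 3] for i in range(max(len(s) - 2, 1))]


def bit_vector(text, num_bits=NUM_BITS):
    """Hashing trick with one bit per dimension: each token sets one bit."""
    vec = 0  # a Python int used as an arbitrary-width bit field
    for tok in tokens(text):
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % num_bits
        vec |= 1 << index
    return vec


def combine(vectors):
    """Merge several bit vectors with bitwise OR (an assumed combining method)."""
    out = 0
    for v in vectors:
        out |= v
    return out


def overlap(a, b):
    """Count the bits set in both vectors (popcount of the bitwise AND)."""
    return bin(a & b).count("1")


# Build one profile from a user's past queries, then score candidates against it.
history = ["red running shoes", "trail running shoes"]
profile = combine(bit_vector(q) for q in history)

related = overlap(profile, bit_vector("running socks"))
unrelated = overlap(profile, bit_vector("garden hose"))
print(related, unrelated)
```

Because the whole history collapses into a single integer, scoring a candidate costs one AND and one popcount regardless of how many past queries the profile contains. That is one plausible reading of how combining bit vectors could save compute time; the trade-off is that distinct tokens can collide on the same bit, which is where a small loss in accuracy may come from.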


research
08/15/2011

Training Logistic Regression and SVM on 200GB Data Using b-Bit Minwise Hashing and Comparisons with Vowpal Wabbit (VW)

We generated a dataset of 200 GB with 10^9 features, to test our recent ...
research
01/11/2021

Number Parsing at a Gigabyte per Second

With disks and networks providing gigabytes per second, parsing decimal ...
research
08/03/2011

Accurate Estimators for Improving Minwise Hashing and b-Bit Minwise Hashing

Minwise hashing is the standard technique in the context of search and d...
research
10/18/2019

b-Bit Sketch Trie: Scalable Similarity Search on Integer Sketches

Recently, randomly mapping vectorial data to strings of discrete symbols...
research
03/30/2018

Engineering a Simplified 0-Bit Consistent Weighted Sampling

The Min-Hashing approach to sketching has become an important tool in da...
research
02/08/2019

Binarized Knowledge Graph Embeddings

Tensor factorization has become an increasingly popular approach to know...
research
01/20/2015

DeepHash: Getting Regularization, Depth and Fine-Tuning Right

This work focuses on representing very high-dimensional global image des...
