
-
A Constant-time Adaptive Negative Sampling
Softmax classifiers with a very large number of classes naturally occur ...
-
Active Sampling Count Sketch (ASCS) for Online Sparse Estimation of a Trillion Scale Covariance Matrix
Estimating and storing the covariance (or correlation) matrix of high-di...
-
Learning Sampling Distributions Using Local 3D Workspace Decompositions for Motion Planning in High Dimensions
Earlier work has shown that reusing experience from prior motion plannin...
-
SOLAR: Sparse Orthogonal Learned and Random Embeddings
Dense embedding models are commonly deployed in commercial search engine...
-
Distributed Tera-Scale Similarity Search with MPI: Provably Efficient Similarity Search over billions without a Single Distance Computation
We present SLASH (Sketched LocAlity Sensitive Hashing), an MPI (Message ...
-
Bloom Origami Assays: Practical Group Testing
We study the problem usually referred to as group testing in the context...
-
Climbing the WOL: Training for Cheaper Inference
Efficient inference for wide output layers (WOLs) is an essential yet ch...
-
STORM: Foundations of End-to-End Empirical Risk Minimization on the Edge
Empirical risk minimization is perhaps the most influential idea in stat...
-
A One-Pass Private Sketch for Most Machine Learning Tasks
Differential privacy (DP) is a compelling privacy definition that explai...
-
Privacy Adversarial Network: Representation Learning for Mobile Data Privacy
The remarkable success of machine learning has fostered a growing number...
-
Sub-linear RACE Sketches for Approximate Kernel Density Estimation on Streaming Data
Kernel density estimation is a simple and effective method that lies at ...
-
Angular Visual Hardness
Although convolutional neural networks (CNNs) are inspired by the mechan...
-
FourierSAT: A Fourier Expansion-Based Algebraic Framework for Solving Hybrid Boolean Constraints
The Boolean SATisfiability problem (SAT) is of central importance in com...
-
LSH-sampling Breaks the Computation Chicken-and-egg Loop in Adaptive Stochastic Gradient Estimation
Stochastic Gradient Descent or SGD is the most popular optimization algo...
-
Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products
In the last decade, it has been shown that many hard AI tasks, especiall...
-
Adaptive Learned Bloom Filter (Ada-BF): Efficient Utilization of the Classifier
Recent work suggests improving the performance of Bloom filter by incorp...
-
RAMBO: Repeated And Merged Bloom Filter for Multiple Set Membership Testing (MSMT) in Sub-linear time
Approximate set membership is a common problem with wide applications in...
-
Semantic Similarity Based Softmax Classifier for Zero-Shot Learning
Zero-Shot Learning (ZSL) is a classification task where we do not have e...
-
Revisiting Consistent Hashing with Bounded Loads
Dynamic load balancing lies at the heart of distributed caching. Here, t...
-
Using Local Experiences for Global Motion Planning
Sampling-based planners are effective in many real-world applications su...
-
SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems
Deep Learning (DL) algorithms are the central focus of modern machine le...
-
RACE: Sub-Linear Memory Sketches for Approximate Near-Neighbor Search on Streaming Data
We demonstrate the first possibility of a sub-linear memory sketch for s...
-
Compressing Gradient Optimizers via Count-Sketches
Many popular first-order optimization methods (e.g., Momentum, AdaGrad, ...
-
Better accuracy with quantified privacy: representations learned via reconstructive adversarial network
The remarkable success of machine learning, especially deep learning, ha...
-
Probabilistic Blocking with An Application to the Syrian Conflict
Entity resolution seeks to merge databases as to remove duplicate entrie...
-
Extreme Classification in Log Memory
We present Merged-Averaged Classifiers via Hashing (MACH) for K-classifi...
-
MISSION: Ultra Large-Scale Feature Selection using Count-Sketches
Feature selection is an important challenge in machine learning. It play...
-
Scaling-up Split-Merge MCMC with Locality Sensitive Sampling (LSS)
Split-Merge MCMC (Monte Carlo Markov Chain) is one of the essential and ...
-
Unique Entity Estimation with Application to the Syrian Conflict
Entity resolution identifies and removes duplicate entities in large, no...
-
FLASH: Randomized Algorithms Accelerated over CPU-GPU for Ultra-High Dimensional Similarity Search
We present FLASH (Fast LSH Algorithm for Similarity search accelerat...
-
Accelerating Dependency Graph Learning from Heterogeneous Categorical Event Streams via Knowledge Transfer
Dependency graph, as a heterogeneous graph representing the intrinsic re...
-
Arrays of (locality-sensitive) Count Estimators (ACE): High-Speed Anomaly Detection via Cache Lookups
Anomaly detection is one of the frequent and important subroutines deplo...
-
A New Unbiased and Efficient Class of LSH-Based Samplers and Estimators for Partition Function Computation in Log-Linear Models
Log-linear models are arguably the most successful class of graphical mo...
-
Revisiting Winner Take All (WTA) Hashing for Sparse Datasets
WTA (Winner Take All) hashing has been successfully applied in many larg...
-
Scalable and Sustainable Deep Learning via Randomized Hashing
Current deep learning architectures are growing larger in order to learn...
-
2-Bit Random Projections, NonLinear Estimators, and Approximate Near Neighbor Search
The method of random projections has become a standard tool for machine ...
-
Asymmetric Minwise Hashing
Minwise hashing (Minhash) is a widely popular indexing scheme in practic...
-
Improved Asymmetric Locality Sensitive Hashing (ALSH) for Maximum Inner Product Search (MIPS)
Recently it was shown that the problem of Maximum Inner Product Search (...
-
In Defense of MinHash Over SimHash
MinHash and SimHash are the two widely adopted Locality Sensitive Hashin...
-
Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS)
We present the first provably sublinear time algorithm for approximate M...
-
Graph Kernels via Functional Embedding
We propose a representation of graph as a functional object derived from...
-
A New Space for Comparing Graphs
Finding a new mathematical representations for graph, which allows direc...
-
Training Logistic Regression and SVM on 200GB Data Using b-Bit Minwise Hashing and Comparisons with Vowpal Wabbit (VW)
We generated a dataset of 200 GB with 10^9 features, to test our recent ...
-
Hashing Algorithms for Large-Scale Learning
In this paper, we first demonstrate that b-bit minwise hashing, whose es...