
NearOptimal Explainable kMeans for All Dimensions
Many clustering algorithms are guided by certain cost functions such as ...
read it

Multiway Online Correlated Selection
We give a 0.5368competitive algorithm for edgeweighted online bipartit...
read it

Approximation Algorithms for Orthogonal Nonnegative Matrix Factorization
In the nonnegative matrix factorization (NMF) problem, the input is an ...
read it

A Model for Ant Trail Formation and its Convergence Properties
We introduce a model for ant trail formation, building upon previous wor...
read it

Kernel Density Estimation through Density Constrained Near Neighbor Search
In this paper we revisit the kernel density estimation problem: given a ...
read it

Instance Based Approximations to Profile Maximum Likelihood
In this paper we provide a new efficient algorithm for approximately com...
read it

Improved Algorithms for Edge Colouring in the WStreaming Model
In the Wstreaming model, an algorithm is given O(n polylog n) space and...
read it

Nearest Neighbor Search for Hyperbolic Embeddings
Embedding into hyperbolic space is emerging as an effective representati...
read it

A Simple Sublinear Algorithm for Gap Edit Distance
We study the problem of estimating the edit distance between two nchara...
read it

The Bethe and Sinkhorn Permanents of Low Rank Matrices and Implications for Profile Maximum Likelihood
In this paper we consider the problem of computing the likelihood of the...
read it

A General Framework for Symmetric Property Estimation
In this paper we provide a general framework for estimating symmetric pr...
read it

Storyboard: Optimizing Precomputed Summaries for Aggregation
An emerging class of data systems partition their data and precompute ap...
read it

New lower bounds for Massively Parallel Computation from query complexity
Roughgarden, Vassilvitskii, and Wang (JACM 18) recently introduced a nov...
read it

Retrieving Top Weighted Triangles in Graphs
Pattern counting in graphs is a fundamental primitive for many network a...
read it

Efficient Profile Maximum Likelihood for Universal Symmetric Property Estimation
Estimating symmetric properties of a distribution, e.g. support size, co...
read it

The OneWay Communication Complexity of Dynamic Time Warping Distance
We resolve the randomized oneway communication complexity of Dynamic Ti...
read it

Hierarchical Clustering for Euclidean Data
Recent works on Hierarchical Clustering (HC), a wellstudied problem in ...
read it

Recovery Guarantees for Quadratic Tensors with Limited Observations
We consider the tensor completion problem of predicting the missing entr...
read it

A sampling framework for counting temporal motifs
Pattern counting in graphs is fundamental to network science tasks, and ...
read it

Local Density Estimation in High Dimensions
An important question that arises in the study of high dimensional vecto...
read it

HashingBasedEstimators for Kernel Density in High Dimensions
Given a set of points P⊂R^d and a kernel k, the Kernel Density Estimate ...
read it

Hierarchical Clustering better than AverageLinkage
Hierarchical Clustering (HC) is a widely studied problem in exploratory ...
read it

MultiResolution Hashing for Fast Pairwise Summations
A basic computational primitive in the analysis of massive datasets is s...
read it

Hierarchical Clustering with Structural Constraints
Hierarchical clustering is a popular unsupervised data analysis method. ...
read it

On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings
Edit distance is a fundamental measure of distance between strings and h...
read it

MultiCommodity Flow with InNetwork Processing
Modern networks run "middleboxes" that offer services ranging from netwo...
read it

On Finding Dense Common Subgraphs
We study the recently introduced problem of finding dense common subgrap...
read it

Fully Dynamic AlmostMaximal Matching: Breaking the Polynomial Barrier for WorstCase Time Bounds
Despite significant research efforts, the stateoftheart algorithm for...
read it

Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers
We introduce a criterion, resilience, which allows properties of a datas...
read it

A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics
We study the Stochastic Gradient Langevin Dynamics (SGLD) algorithm for ...
read it

Learning from Untrusted Data
The vast majority of theoretical results in machine learning and statist...
read it

Smoothed Analysis of Tensor Decompositions
Low rank tensor decompositions are a powerful tool for learning generati...
read it