
Generalization bounds via distillation
This paper theoretically investigates the following empirical phenomenon...

On the Approximation Power of Two-Layer Networks of Random ReLUs
This paper considers the following question: how well can depth-two ReLU...

Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics
Why do biased predictions arise? What interventions can prevent them? We...

Detecting Foodborne Illness Complaints in Multiple Languages Using English Annotations Only
Health departments have been deploying text classification systems for t...

Cross-Lingual Text Classification with Minimal Resources by Transferring a Sparse Teacher
Cross-lingual text classification alleviates the need for manually label...

On the proliferation of support vectors in high dimensions
The support vector machine (SVM) is a well-established classification me...

Contrastive learning, multi-view redundancy, and linear models
Self-supervised learning is an empirically successful approach to unsupe...

Statistical Query Lower Bounds for Tensor PCA
In the Tensor PCA problem introduced by Richard and Montanari (2014), on...

Ensuring Fairness Beyond the Training Data
We initiate the study of fair classifiers that are robust to perturbatio...

Classification vs regression in overparameterized regimes: Does the loss function matter?
We compare classification and regression tasks in the overparameterized ...

Contrastive estimation reveals topic posterior information to linear models
Contrastive learning is an approach to representation learning that util...

A New Framework for Query Efficient Active Imitation Learning
We seek to align agent policy with human expert behavior in a reinforcem...

Weakly Supervised Attention Networks for Fine-Grained Opinion Mining and Public Health
In many review classification applications, a fine-grained analysis of t...

Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform
Companies increasingly expose machine learning (ML) models trained over ...

Leveraging Just a Few Keywords for Fine-Grained Aspect Detection Through Weakly Supervised Co-Training
User-generated reviews can be decomposed into fine-grained segments (e.g...

Unbiased estimators for random design regression
In linear regression we wish to estimate the optimum linear least square...

A gradual, semi-discrete approach to generative network training via explicit Wasserstein minimization
This paper provides a simple procedure to fit generative networks to tar...

A cryptographic approach to black box adversarial machine learning
We propose an ensemble technique for converting any classifier into a co...

Diameter-based Interactive Structure Search
In this work, we introduce interactive structure search, a generic frame...

How many variables should be entered in a principal component regression equation?
We study least squares linear regression over N uncorrelated Gaussian fe...

Two models of double descent for weak features
The "double descent" risk curve was recently proposed to qualitatively d...

Consistent Risk Estimation in High-Dimensional Linear Regression
Risk estimation is at the core of many learning systems. The importance ...

Reconciling modern machine learning and the bias-variance trade-off
The question of generalization in machine learning: how algorithms are ...

Benefits of overparameterization with EM
Expectation Maximization (EM) is among the most popular algorithms for m...

Correcting the bias in least squares regression with volume-rescaled sampling
Consider linear regression where the examples are generated by an unknow...

Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate
Many modern machine learning models are trained to achieve zero or near...

Tail bounds for volume sampled linear regression
The n × d design matrix in a linear regression problem is given, but the...

On the Connection between Differential Privacy and Adversarial Robustness in Machine Learning
Adversarial examples in machine learning has been a topic of intense res...

Non-Gaussian information from weak lensing data via deep learning
Weak lensing maps contain information beyond two-point statistics on sma...

Mixing time estimation in reversible Markov chains from a single sample path
The spectral gap γ of a finite, ergodic, and reversible Markov chain is ...

Anomaly Detection on Graph Time Series
In this paper, we use variational recurrent neural network to investigat...

Time Series Compression Based on Adaptive Piecewise Recurrent Autoencoder
Time series account for a large proportion of the data stored in financi...

Time Series Forecasting Based on Augmented Long Short-Term Memory
In this paper, we use recurrent autoencoder model to predict the time se...

Greedy Approaches to Symmetric Orthogonal Tensor Decomposition
Finding the symmetric and orthogonal decomposition (SOD) of a tensor is ...

Parameter identification in Markov chain choice models
This work studies the parameter identification problem for the Markov ch...

Successive Rank-One Approximations for Nearly Orthogonally Decomposable Symmetric Tensors
Many idealized problems in signal processing, machine learning and stati...

Linear regression without correspondence
This article considers algorithmic and statistical aspects of linear reg...

Kernel Approximation Methods for Speech Recognition
We study large-scale kernel methods for acoustic modeling in speech reco...

Global analysis of Expectation Maximization for mixtures of two Gaussians
Expectation Maximization (EM) is among the most popular algorithms for e...

Search Improves Label for Active Learning
We investigate active learning with access to two distinct oracles: Labe...

Scalable Nonlinear Learning with Adaptive Polynomial Expansions
Can we effectively learn a nonlinear representation in time comparable t...

Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
We present a new algorithm for the contextual bandit learning problem, w...

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity
Overcomplete latent representations have been very popular for unsupervi...

Loss minimization and parameter estimation with heavy tails
This work studies applications and generalizations of a simple estimatio...

A Tensor Approach to Learning Mixed Membership Community Models
Community detection is the task of detecting hidden communities from obs...

Learning Sparse Low-Threshold Linear Classifiers
We consider the problem of learning a nonnegative linear classifier wit...

Analysis of a randomized approximation scheme for matrix multiplication
This note gives a simple analysis of a randomized approximation scheme f...

Tensor decompositions for learning latent variable models
This work considers a computationally and statistically efficient parame...

Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental probl...

Convergence Rates for Differentially Private Statistical Estimation
Differential privacy is a cryptographically-motivated definition of priv...