
Benchmarking Semi-supervised Federated Learning
Federated learning promises to use the computational power of edge devic...
Boundary thickness and robustness in learning models
Robustness of machine learning models to various adversarial and non-adv...
Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization
In distributed second order optimization, a standard strategy is to aver...
Good linear classifiers are abundant in the interpolating regime
Within the machine learning community, the widely-used uniform convergen...
Precise expressions for random projections: Low-rank approximation and randomized Newton
It is often desirable to reduce the dimensionality of a large dataset by...
Multiplicative noise and heavy tails in stochastic optimization
Although stochastic optimization is central to modern machine learning, ...
A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent
This article characterizes the exact asymptotics of random Fourier featu...
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
We introduce AdaHessian, a second order stochastic optimization algorith...
Determinantal Point Processes in Randomized Numerical Linear Algebra
Randomized Numerical Linear Algebra (RandNLA) uses randomness to develop...
Error Estimation for Sketched SVD via the Bootstrap
In order to compute fast approximations to the singular value decomposit...
Forecasting Sequential Data using Consistent Koopman Autoencoders
Recurrent neural networks are widely used on time series data, yet such ...
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms
The statistical analysis of Randomized Numerical Linear Algebra (RandNLA...
Stochastic Normalizing Flows
We introduce stochastic normalizing flows, an extension of continuous no...
Improved guarantees and a multiple-descent curve for the Column Subset Selection Problem and the Nyström method
The Column Subset Selection Problem (CSSP) and the Nyström method are am...
Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data
In many applications, one works with deep neural network (DNN) models tr...
ZeroQ: A Novel Zero Shot Quantization Framework
Quantization is a promising approach for reducing the inference time and...
Exact expressions for double descent and implicit regularization via surrogate random design
Double descent refers to the phase transition that is exhibited by the g...
LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data
We apply methods from randomized numerical linear algebra (RandNLA) to d...
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Quantization is an effective method for reducing memory footprint and in...
Running Alchemist on Cray XC and CS Series Supercomputers: Dask and PySpark Interfaces, Deployment Options, and Data Transfer Times
Newly developed interfaces for Python, Dask, and PySpark enable the use ...
Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings
Graph embeddings, a class of dimensionality reduction techniques designe...
Bootstrapping the Operator Norm in High Dimensions: Error Estimation for Covariance Matrices and Sketching
Although the operator (spectral) norm is one of the most widely used met...
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Transformer based architectures have become de-facto models used for a r...
The Difficulties of Addressing Interdisciplinary Challenges at the Foundations of Data Science
The National Science Foundation's Transdisciplinary Research in Principl...
On Linear Convergence of Weighted Kernel Herding
We provide a novel convergence analysis of two popular sampling algorith...
Statistical guarantees for local graph clustering
Local graph clustering methods aim to find small clusters in very large ...
Bayesian experimental design using regularized determinantal point processes
In experimental design, we are given n vectors in d dimensions, and our ...
Residual Networks as Nonlinear Systems: Stability Analysis using Linearization
We regard pre-trained residual networks (ResNets) as nonlinear systems a...
Distributed estimation of the inverse Hessian by determinantal averaging
In distributed optimization and distributed numerical linear algebra, we...
Physics-informed Autoencoders for Lyapunov-stable Fluid Flow Prediction
In addition to providing high-profile successes in computer vision and n...
JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks
It has been demonstrated that very simple attacks can fool highly-sophis...
OverSketched Newton: Fast Convex Optimization for Serverless Systems
Motivated by recent developments in serverless systems for large-scale m...
Inefficiency of K-FAC for Large Batch Size Training
In stochastic optimization, large batch training can leverage parallel r...
Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data
In many applications, it is important to reconstruct a fluid flow field,...
Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression
In experimental design, we are given a large collection of vectors, each...
Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks
Given two or more Deep Neural Networks (DNNs) with the same or similar a...
Traditional and Heavy-Tailed Self Regularization in Neural Network Models
Random Matrix Theory (RMT) is applied to analyze the weight matrices of ...
On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent
Increasing the mini-batch size for stochastic gradient descent offers si...
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Random Matrix Theory (RMT) is applied to analyze weight matrices of Deep...
Newton-MR: Newton's Method Without Smoothness or Convexity
Establishing global convergence of the classical Newton's method has lon...
Distributed Second-order Convex Optimization
Convex optimization problems arise frequently in diverse machine learnin...
Alchemist: An Apache Spark <=> MPI Interface
The Apache Spark framework for distributed computation is popular in the...
Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist
Apache Spark is a popular system aimed at the analysis of large data set...
Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap
Over the course of the past decade, a variety of randomized algorithms h...
GPU Accelerated Sub-Sampled Newton's Method
First order methods, which solely rely on gradient information, are comm...
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Large batch size training of Neural Networks has been shown to incur acc...
Out-of-sample extension of graph adjacency spectral embedding
Many popular dimensionality reduction procedures have out-of-sample exte...
Lectures on Randomized Numerical Linear Algebra
This chapter is based on lectures on Randomized Numerical Linear Algebra...
Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization
Parallel computing has played an important role in speeding up convex op...
A Berkeley View of Systems Challenges for AI
With the increasing commoditization of computer vision, speech recogniti...
