
-
Landscape of Sparse Linear Network: A Brief Investigation
Network pruning, or sparse network, has a long history and practical sign...
-
Optimal Quantization for Batch Normalization in Neural Network Deployments and Beyond
Quantized Neural Networks (QNNs) use low bit-width fixed-point numbers f...
-
Intervention Generative Adversarial Networks
In this paper we propose a novel approach for stabilizing the training p...
-
An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization
The evaluation of hyperparameters, neural architectures, or data augment...
-
Communication Efficient Decentralized Training with Multiple Local Updates
Communication efficiency plays a significant role in decentralized optim...
-
Distillation ≈ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized Neural Network
Distillation is a method to transfer knowledge from one model to another...
-
A Stochastic Proximal Point Algorithm for Saddle-Point Problems
We consider saddle point problems whose objective functions are the aver...
-
A General Analysis Framework of Lower Complexity Bounds for Finite-Sum Optimization
This paper studies the lower bound complexity for the optimization probl...
-
Towards Better Generalization: BP-SVRG in Training Deep Neural Networks
Stochastic variance-reduced gradient (SVRG) is a classical optimization ...
-
On the Convergence of FedAvg on Non-IID Data
Federated learning enables a large number of edge computing devices to l...
-
A Gram-Gauss-Newton Method Learning Overparameterized Deep Neural Networks for Regression Problems
First-order methods such as stochastic gradient descent (SGD) are curren...
-
A Unified Framework for Regularized Reinforcement Learning
We propose and study a general framework for regularized Markov decision...
-
Lipschitz Generative Adversarial Nets
In this paper we study the convergence of generative adversarial network...
-
Do Subsampled Newton Methods Work for High-Dimensional Data?
Subsampled Newton methods approximate Hessian matrices through subsampli...
-
Hierarchical Attention: What Really Counts in Various NLP Tasks
Attention mechanisms in sequence to sequence models have shown great abi...
-
Interpolatron: Interpolation or Extrapolation Schemes to Accelerate Optimization for Deep Neural Networks
In this paper we explore acceleration techniques for large scale nonconv...
-
A Unifying Framework for Convergence Analysis of Approximate Newton Methods
Many machine learning models are reformulated as optimization problems. ...
-
An Efficient Character-Level Neural Machine Translation
Neural machine translation aims at building a single large neural networ...
-
A Proximal Stochastic Quasi-Newton Algorithm
In this paper, we discuss the problem of minimizing the sum of two conve...
-
Wishart Mechanism for Differentially Private Principal Components Analysis
We propose a new input perturbation mechanism for publishing a covarianc...
-
Nonconvex Penalization in Sparse Estimation: An Approach Based on the Bernstein Function
In this paper we study nonconvex penalization using Bernstein functions ...
-
A Parallel algorithm for X-Armed bandits
The target of X-armed bandit problem is to find the global maximum of an...
-
A Scalable and Extensible Framework for Superposition-Structured Models
In many learning tasks, structural models usually lead to better interpr...
-
Adjusting Leverage Scores by Row Weighting: A Practical Approach to Coherent Matrix Completion
Low-rank matrix completion is an important problem with extensive real-w...
-
Group Orbit Optimization: A Unified Approach to Data Normalization
In this paper we propose and study an optimization problem over a matrix...
-
The Bernstein Function: A Unifying Framework of Nonconvex Penalization in Sparse Estimation
In this paper we study nonconvex penalization using Bernstein functions....
-
The Matrix Ridge Approximation: Algorithms and Applications
We are concerned with an approximation problem for a symmetric positive ...
-
Compound Poisson Processes, Latent Shrinkage Priors and Bayesian Nonconvex Penalization
In this paper we discuss Bayesian nonconvex penalization for sparse lear...
-
Kinetic Energy Plus Penalty Functions for Sparse Estimation
In this paper we propose and study a family of sparsity-inducing penalty...
-
A Scalable CUR Matrix Decomposition Algorithm: Lower Time Complexity and Tighter Bound
The CUR matrix decomposition is an important extension of Nyström approx...
-
Bayesian Multicategory Support Vector Machines
We show that the multi-class support vector machine (MSVM) proposed by L...
-
EP-GIG Priors and Applications in Bayesian Sparse Learning
In this paper we propose a novel framework for the construction of spars...
-
Coherence Functions with Applications in Large-Margin Classification Methods
Support vector machines (SVMs) naturally embody sparseness due to their ...