Rong Ge

research

∙ 06/01/2023

A Uniform Confidence Phenomenon in Deep Learning and its Implications for Calibration

Despite the impressive generalization capabilities of deep neural networ...

0 Muthu Chidambaram, et al. ∙

research

∙ 05/18/2023

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

We focus on the task of learning a single index model σ(w^⋆· x) with res...

0 Alex Damian, et al. ∙

research

∙ 04/03/2023

Depth Separation with Multilayer Mean-Field Networks

Depth separation – why a deeper network is more powerful than a shallowe...

0 Yunwei Ren, et al. ∙

research

∙ 03/14/2023

Do Transformers Parse while Predicting the Masked Word?

Pre-trained language models have been shown to encode linguistic structu...

0 Haoyu Zhao, et al. ∙

research

∙ 02/24/2023

Hiding Data Helps: On the Benefits of Masking for Sparse Coding

Sparse coding refers to modeling a signal as sparse linear combinations ...

0 Muthu Chidambaram, et al. ∙

research

∙ 02/01/2023

Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression

In deep learning, often the training process finds an interpolator (a so...

0 Mo Zhou, et al. ∙

research

∙ 10/24/2022

Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup

Mixup is a data augmentation technique that relies on training using ran...

0 Muthu Chidambaram, et al. ∙

research

∙ 10/07/2022

Understanding Edge-of-Stability Training Dynamics with a Minimalist Example

Recently, researchers observed that gradient descent for deep neural net...

0 Xingyu Zhu, et al. ∙

research

∙ 10/03/2022

Plateau in Monotonic Linear Interpolation – A "Biased" View of Loss Landscape for Deep Networks

Monotonic linear interpolation (MLI) - on the line connecting a random i...

6 Xiang Wang, et al. ∙

research

∙ 05/18/2022

Customizing ML Predictions for Online Algorithms

A popular line of recent research incorporates ML advice in the design o...

0 Keerti Anand, et al. ∙

research

∙ 10/14/2021

Towards Understanding the Data Dependency of Mixup-style Training

In the Mixup training paradigm, a model is trained using convex combinat...

0 Muthu Chidambaram, et al. ∙

research

∙ 09/23/2021

Outlier-Robust Sparse Estimation via Non-Convex Optimization

We explore the connection between outlier-robust high-dimensional statis...

1 Yu Cheng, et al. ∙

research

∙ 06/11/2021

Understanding Deflation Process in Over-parametrized Tensor Decomposition

In this paper we study the training dynamics for gradient flow on over-p...

0 Rong Ge, et al. ∙

research

∙ 02/04/2021

A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network

While over-parameterization is widely believed to be crucial for the suc...

0 Mo Zhou, et al. ∙

research

∙ 10/22/2020

Beyond Lazy Training for Over-parameterized Tensor Decomposition

Over-parametrization is an important technique in training neural networ...

0 Xiang Wang, et al. ∙

research

∙ 10/08/2020

Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks

Hessian captures important properties of the deep neural network loss la...

0 Yikai Wu, et al. ∙

research

∙ 09/30/2020

Efficient sampling from the Bingham distribution

We give an algorithm for exact sampling from the Bingham distribution p(x...

0 Rong Ge, et al. ∙

research

∙ 06/30/2020

Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

Learning-to-learn (using optimization algorithms to learn a new optimize...

0 Xiang Wang, et al. ∙

research

∙ 06/29/2020

Optimization Landscape of Tucker Decomposition

Tucker decomposition is a popular technique for many data analysis and m...

0 Abraham Frandsen, et al. ∙

research

∙ 06/29/2020

Extracting Latent State Representations with Linear Dynamics from Rich Observations

Recently, many reinforcement learning techniques were shown to have prov...

0 Abraham Frandsen, et al. ∙

research

∙ 05/12/2020

Energy-Aware DNN Graph Optimization

Unlike existing work in deep neural network (DNN) graphs optimization fo...

3 Yu Wang, et al. ∙

research

∙ 05/04/2020

High-Dimensional Robust Mean Estimation via Gradient Descent

We study the problem of high-dimensional robust mean estimation in the p...

0 Yu Cheng, et al. ∙

research

∙ 04/16/2020

Spectral Learning on Matrices and Tensors

Spectral methods have been the mainstay in several domains such as machi...

164 Majid Janzamin, et al. ∙

research

∙ 11/08/2019

Estimating Normalizing Constants for Log-Concave Distributions: Algorithms and Lower Bounds

Estimating the normalizing constant of an unnormalized probability distr...

0 Rong Ge, et al. ∙

research

∙ 09/26/2019

Mildly Overparametrized Neural Nets can Memorize Training Data Efficiently

It has been observed (Zhang et al., 2016) that deep neural networks ca...

0 Rong Ge, et al. ∙

research

∙ 06/14/2019

Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets

Mode connectivity is a surprising phenomenon in the loss landscape of de...

3 Rohith Kuditipudi, et al. ∙

research

∙ 06/11/2019

Faster Algorithms for High-Dimensional Robust Covariance Estimation

We study the problem of estimating the covariance matrix of a high-dimen...

0 Yu Cheng, et al. ∙

research

∙ 05/01/2019

Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization

Variance reduction techniques like SVRG provide simple and fast algorith...

0 Rong Ge, et al. ∙

research

∙ 04/29/2019

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure

There is a stark disparity between the step size schedules used in pract...

12 Rong Ge, et al. ∙

research

∙ 02/13/2019

Stochastic Gradient Descent Escapes Saddle Points Efficiently

This paper considers the perturbed stochastic gradient descent algorithm...

20 Chi Jin, et al. ∙

research

∙ 02/11/2019

A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm

In this note, we derive concentration inequalities for random vectors wi...

16 Chi Jin, et al. ∙

research

∙ 02/02/2019

Understanding Composition of Word Embeddings via Tensor Decomposition

Word embedding is a powerful tool in natural language processing. In thi...

0 Abraham Frandsen, et al. ∙

research

∙ 11/29/2018

Simulated Tempering Langevin Monte Carlo II: An Improved Proof using Soft Markov Chain Decomposition

A key task in Bayesian machine learning is sampling from distributions t...

0 Rong Ge, et al. ∙

research

∙ 11/23/2018

High-Dimensional Robust Mean Estimation in Nearly-Linear Time

We study the fundamental problem of high-dimensional mean estimation in ...

0 Yu Cheng, et al. ∙

research

∙ 10/16/2018

Learning Two-layer Neural Networks with Symmetric Inputs

We give a new algorithm for learning a two-layer neural network under a ...

0 Rong Ge, et al. ∙

research

∙ 03/28/2018

Non-Convex Matrix Completion Against a Semi-Random Adversary

Matrix completion is a well-studied problem with many machine learning a...

0 Yu Cheng, et al. ∙

research

∙ 03/25/2018

Minimizing Nonconvex Population Risk from Rough Empirical Risk

Population risk---the expectation of the loss over the sampling mechanis...

0 Chi Jin, et al. ∙

research

∙ 02/14/2018

Stronger generalization bounds for deep nets via a compression approach

Deep nets generalize well despite having more parameters than the number...

0 Sanjeev Arora, et al. ∙

research

∙ 01/15/2018

Global Convergence of Policy Gradient Methods for Linearized Control Problems

Direct policy gradient methods for reinforcement learning and continuous...

0 Maryam Fazel, et al. ∙

research

∙ 11/01/2017

Learning One-hidden-layer Neural Networks with Landscape Design

We consider the problem of learning a one-hidden-layer neural network: w...

0 Rong Ge, et al. ∙

research

∙ 10/07/2017

Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo

A key task in Bayesian statistics is sampling from distributions that ar...

0 Rong Ge, et al. ∙

research

∙ 06/18/2017

On the Optimization Landscape of Tensor Decompositions

Non-convex optimization with local search heuristics has been widely use...

0 Rong Ge, et al. ∙

research

∙ 04/03/2017

No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis

In this paper we develop a new framework that captures the common landsc...

0 Rong Ge, et al. ∙

research

∙ 03/02/2017

How to Escape Saddle Points Efficiently

This paper shows that a perturbed form of gradient descent converges to ...

0 Chi Jin, et al. ∙

research

∙ 03/02/2017

Generalization and Equilibrium in Generative Adversarial Nets (GANs)

We show that training of generative adversarial network (GAN) may not ha...

0 Sanjeev Arora, et al. ∙

research

∙ 12/28/2016

Provable learning of Noisy-or Networks

Many machine learning applications use latent variable models to explain...

0 Sanjeev Arora, et al. ∙

research

∙ 10/28/2016

Homotopy Analysis for Tensor PCA

Developing efficient and guaranteed nonconvex algorithms has been an imp...

0 Anima Anandkumar, et al. ∙

research

∙ 09/29/2016

DynIMS: A Dynamic Memory Controller for In-memory Storage on HPC Systems

In order to boost the performance of data-intensive computing on HPC sys...

0 Pengfei Xuan, et al. ∙

research

∙ 05/27/2016

Provable Algorithms for Inference in Topic Models

Recently, there has been considerable progress on designing algorithms w...

0 Sanjeev Arora, et al. ∙

research

∙ 05/24/2016

Matrix Completion has No Spurious Local Minimum

Matrix completion is a basic machine learning problem that has wide appl...

0 Rong Ge, et al. ∙
