Cohen et al. (2021) empirically study the evolution of the largest eigen...
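For context, the quantity tracked in that line of work is the largest eigenvalue of the training-loss Hessian ("sharpness"). A minimal NumPy sketch of how it is typically estimated, via power iteration on Hessian-vector products; the toy quadratic loss and iteration count are assumptions for illustration, not the paper's setup:

```python
import numpy as np

def top_hessian_eigenvalue(hess_vec, d, iters=100, rng=np.random.default_rng(0)):
    """Estimate the largest Hessian eigenvalue ("sharpness") by power iteration,
    using only Hessian-vector products."""
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        hv = hess_vec(v)
        v = hv / np.linalg.norm(hv)
    return v @ hess_vec(v)

# Toy quadratic loss 0.5 * w^T H w, so the Hessian is the fixed matrix H.
H = np.diag([5.0, 1.0, 0.5])
print(top_hessian_eigenvalue(lambda v: H @ v, d=3))  # approx. 5.0
```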
In Reinforcement Learning (RL), enhancing sample efficiency is crucial, ...
Sharpness-Aware Minimization (SAM) is an optimizer that takes a descent ...
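For reference, a minimal NumPy sketch of the basic SAM update rule (ascend to a nearby worst-case point, then apply that point's gradient at the original weights); the toy objective, step size, and radius `rho` are illustrative assumptions, not the setting analyzed above:

```python
import numpy as np

# Toy objective, used only to illustrate the update rule.
def loss(w):
    return 0.5 * np.sum(w ** 2) + np.sin(3 * w).sum()

def grad(w):
    return w + 3 * np.cos(3 * w)

def sam_step(w, lr=0.1, rho=0.05):
    """One SAM step: perturb the weights along the gradient to radius rho,
    then descend using the gradient evaluated at the perturbed point."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent (perturbation) direction
    g_sharp = grad(w + eps)                      # gradient at the perturbed weights
    return w - lr * g_sharp                      # applied to the original weights

w = np.array([2.0, -1.5])
for _ in range(100):
    w = sam_step(w)
print(loss(w))
```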
We investigate how pair-wise data augmentation techniques like Mixup aff...
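As background, a minimal sketch of standard pairwise Mixup in NumPy: each example is convexly combined with a randomly paired example from the same batch, and the labels are mixed with the same weight. The Beta parameter `alpha` and the random within-batch pairing are generic choices, not necessarily those studied in the paper:

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=np.random.default_rng(0)):
    """Pairwise Mixup: blend inputs and labels of randomly paired examples."""
    lam = rng.beta(alpha, alpha)          # mixing coefficient
    perm = rng.permutation(len(x))        # random pairing within the batch
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4))           # toy batch: 8 examples, 4 features
y = np.eye(3)[rng.integers(0, 3, 8)]      # one-hot labels for 3 classes
x_mix, y_mix = mixup_batch(x, y)
```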
We study convergence lower bounds of without-replacement stochastic grad...
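To make the algorithm concrete, a sketch of without-replacement (random-reshuffling) SGD on a least-squares finite sum; the problem instance and step size are illustrative assumptions, and the sketch shows only the algorithm, not the lower-bound constructions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 64, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def component_grad(x, i):
    """Gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
lr = 0.01
for epoch in range(50):
    order = rng.permutation(n)   # fresh permutation each epoch (no replacement)
    for i in order:              # every component is used exactly once per epoch
        x -= lr * component_grad(x, i)
print(0.5 * np.mean((A @ x - b) ** 2))
```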
We uncover how SGD interacts with batch normalization and can exhibit un...
Stochastic gradient descent-ascent (SGDA) is one of the main workhorses ...
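As a reference point, a minimal sketch of simultaneous stochastic gradient descent-ascent on a toy strongly-convex–strongly-concave saddle problem; the objective, noise model, and step size are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def grads(x, y, noise=0.1):
    """Stochastic gradients of f(x, y) = 0.5*x**2 + x*y - 0.5*y**2."""
    gx = x + y + noise * rng.standard_normal()   # df/dx plus noise
    gy = x - y + noise * rng.standard_normal()   # df/dy plus noise
    return gx, gy

x, y = 2.0, -1.0
lr = 0.05
for _ in range(2000):
    gx, gy = grads(x, y)
    x, y = x - lr * gx, y + lr * gy   # descend in x, ascend in y (simultaneous update)
print(x, y)                           # approaches the saddle point (0, 0)
```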
In distributed learning, local SGD (also known as federated averaging) a...
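For orientation, a sketch of local SGD (federated averaging) with M workers each taking H local steps between averaging rounds on per-worker least-squares data; the data split, step size, and H are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
M, H, d, n_per = 4, 10, 5, 32     # workers, local steps, dimension, samples per worker
data = [(rng.standard_normal((n_per, d)), rng.standard_normal(n_per)) for _ in range(M)]

def local_sgd_round(x, lr=0.01):
    """One communication round: each worker runs H local SGD steps starting from
    the shared iterate, then the server averages the resulting local iterates."""
    local_iterates = []
    for A, b in data:
        x_m = x.copy()
        for _ in range(H):
            i = rng.integers(len(b))                # sample one local example
            x_m -= lr * (A[i] @ x_m - b[i]) * A[i]  # local SGD step
        local_iterates.append(x_m)
    return np.mean(local_iterates, axis=0)          # federated averaging

x = np.zeros(d)
for _ in range(100):
    x = local_sgd_round(x)
```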
We propose matrix norm inequalities that extend the Recht-Ré (2012) conj...
It is known that Θ(N) parameters are sufficient for neural networks to m...
We study the implicit bias of gradient flow (i.e., gradient descent with...
The universal approximation property of width-bounded networks has been ...
We study without-replacement SGD for solving finite-sum optimization pro...
Transformer networks use pairwise attention to compute contextual embedd...
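For concreteness, a sketch of the pairwise (scaled dot-product) self-attention map that produces contextual embeddings; the single head, absence of masking, and toy dimensions are simplifying assumptions:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.
    X: (n, d) token embeddings; returns (n, d) contextual embeddings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])            # pairwise attention scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ V                                # mix values by attention weights

rng = np.random.default_rng(0)
n, d = 6, 8                                           # 6 tokens, embedding dimension 8
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Z = self_attention(X, Wq, Wk, Wv)                     # (6, 8) contextual embeddings
```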
Attention-based Transformer architecture has enabled significant advance...
Despite the widespread adoption of Transformer models for NLP tasks, the...
Recently, a residual network (ResNet) with a single residual block has b...
We study universal finite sample expressivity of neural networks, define...
We provide a theoretical algorithm for checking local optimality and esc...
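As generic background only (not the paper's test, which is tailored to the network structure), a sketch of checking second-order stationarity and escaping a strict saddle by stepping along a negative-curvature direction of the Hessian; the tolerances, step size, and toy function are assumptions:

```python
import numpy as np

def check_and_escape(grad, hess, w, g_tol=1e-6, curv_tol=1e-6, step=0.1):
    """If the gradient is (near) zero, inspect the Hessian: a nonnegative spectrum
    certifies a second-order stationary point; otherwise step along the most
    negative eigenvector, which decreases the quadratic approximation of the loss."""
    g = grad(w)
    if np.linalg.norm(g) > g_tol:
        return w - step * g, "gradient step"
    eigvals, eigvecs = np.linalg.eigh(hess(w))        # eigenvalues in ascending order
    if eigvals[0] >= -curv_tol:
        return w, "second-order stationary"
    v = eigvecs[:, 0]                                 # most negative curvature direction
    return w + step * v, "escaped saddle"

# Toy example: f(w) = w0^2 - w1^2 has a strict saddle at the origin.
grad = lambda w: np.array([2 * w[0], -2 * w[1]])
hess = lambda w: np.diag([2.0, -2.0])
w, status = check_and_escape(grad, hess, np.zeros(2))
print(w, status)
```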
We investigate the loss surface of deep linear and nonlinear neural netw...
We study the error landscape of deep linear and nonlinear neural network...