
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data
Weak supervision has shown promising results in many natural language pr...

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
The Lottery Ticket Hypothesis suggests that an over-parametrized network...

Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach
Multi-agent reinforcement learning (MARL) becomes more challenging in th...

COUnty aggRegation mixup AuGmEntation (COURAGE) COVID-19 Prediction
The global spread of COVID-19, the disease caused by the novel coronavir...

Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach
Adversarial training has been shown to improve the generalization perfor...

Token-wise Curriculum Learning for Neural Machine Translation
Existing curriculum learning approaches to Neural Machine Translation (N...

Reinforcement Learning for Adaptive Mesh Refinement
Large-scale finite element simulations of complex physical systems gover...

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Numerous empirical evidences have corroborated the importance of noise i...

Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach
Reliable automatic evaluation of dialogue systems under an interactive e...

A Hypergradient Approach to Robust Regression without Correspondence
We consider a regression problem, where the correspondence between input...

Doubly Robust Off-Policy Learning on Low-Dimensional Manifolds by Deep Neural Networks
Causal inference explores the causation between actions and the conseque...

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data
Fine-tuned pre-trained language models can suffer from severe miscalibra...

Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach
Fine-tuned pre-trained language models (LMs) achieve enormous success in...

How Important is the Train-Validation Split in Meta-Learning?
Meta-learning aims to perform fast adaptation on a new task through lear...

Residual Network Based Direct Synthesis of EM Structures: A Study on One-to-One Transformers
We propose using machine learning models for the direct synthesis of on...

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
We study the open-domain named entity recognition (NER) problem under di...

The flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R
This paper describes an R package named flare, which implements a family...

The huge Package for High-dimensional Undirected Graph Estimation in R
We describe an R package named huge which provides easy-to-use functions...

Towards Understanding Hierarchical Learning: Benefits of Neural Representations
Deep neural networks can empirically perform efficient hierarchical lear...

Deep Reinforcement Learning with Smooth Policy
Deep neural networks have been widely adopted in modern reinforcement le...

Transformer Hawkes Process
Modern data acquisition routinely produce massive amounts of event seque...

Differentiable Top-k Operator with Optimal Transport
The top-k operation, i.e., finding the k largest or smallest elements fr...

Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? – A Neural Tangent Kernel Perspective
Deep residual networks (ResNets) have demonstrated better generalization...

Statistical Guarantees of Generative Adversarial Networks for Distribution Estimation
Generative Adversarial Networks (GANs) have achieved great success in un...

On Computation and Generalization of Generative Adversarial Imitation Learning
Generative Adversarial Imitation Learning (GAIL) is a powerful and pract...

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Transfer learning has fundamentally changed the landscape of natural lan...

Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing
Many multi-domain neural machine translation (NMT) models achieve knowle...

On Generalization Bounds of a Family of Recurrent Neural Networks
Recurrent Neural Networks (RNNs) have been widely applied to sequential ...

Towards Understanding the Importance of Shortcut Connections in Residual Networks
Residual Network (ResNet) is undoubtedly a milestone in deep learning. R...

Towards Understanding the Importance of Noise in Training Neural Networks
Numerous empirical evidence has corroborated that the noise plays a cruc...

Meta Learning with Relational Information for Short Sequences
This paper proposes a new meta-learning method named HARMLESS (HAwkes...

Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds
Deep neural networks have revolutionized many real world applications, d...

Inductive Bias of Gradient Descent based Adversarial Training on Separable Data
Adversarial training is a principled approach for training robust neural...

On Scalable and Efficient Computation of Large Scale Optimal Transport
Optimal Transport (OT) naturally arises in many machine learning applica...

On Computation and Generalization of GANs with Spectrum Control
Generative Adversarial Networks (GANs), though powerful, is hard to trai...

Learning to Defense by Learning to Attack
Adversarial training provides a principled approach for training robust ...

Provable Gaussian Embedding with One Observation
The success of machine learning methods heavily relies on having an appr...

On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond
Our paper proposes a generalization error bound for a general family of ...

On Landscape of Lagrangian Functions and Stochastic Search for Constrained Nonconvex Optimization
We study constrained nonconvex optimization problems in machine learning...

Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization
Asynchronous momentum stochastic gradient descent algorithms (Async-MSGD...

Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models
We propose a nonparametric method for detecting nonlinear causal relatio...

Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization
Stochastic optimization naturally arises in machine learning. Efficient ...

Toward Deeper Understanding of Nonconvex Stochastic Optimization with Momentum using Diffusion Approximations
Momentum Stochastic Gradient Descent (MSGD) algorithm has been widely ap...

Misspecified Nonconvex Statistical Optimization for Phase Retrieval
Existing nonconvex statistical optimization theory and methods crucially...

Deep Hyperspherical Learning
Convolution as inner product has been the founding basis of convolutiona...

On Quadratic Convergence of DC Proximal Newton Algorithm for Nonconvex Sparse Learning in High Dimensions
We propose a DC proximal Newton algorithm for solving nonconvex regulari...

Online Factorization and Partition of Complex Networks From Random Walks
Finding the reduced-dimensional structure is critical to understanding c...

Homotopy Parametric Simplex Method for Sparse Learning
High dimensional sparse learning has imposed a great computational chall...

Dropping Convexity for More Efficient and Scalable Online Multiview Learning
Multiview representation learning is very popular for latent factor anal...

Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization
We propose a general theory for studying the geometry of nonconvex objec...
Tuo Zhao
Verified profile
Assistant Professor at Georgia Institute of Technology