
Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap
This paper presents a new model-free algorithm for episodic finite-horiz...
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
This paper studies model-based bandit and reinforcement learning (RL) wi...
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness
Consider a prediction setting where a few inputs (e.g., satellite images...
Meta-learning Transferable Representations with a Single Target Domain
Recent works found that fine-tuning and joint training, two popular appro...
Beyond Lazy Training for Over-parameterized Tensor Decomposition
Over-parametrization is an important technique in training neural networ...
Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling
Document-level relation extraction (RE) poses new challenges compared to...
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data
Self-training algorithms, which train a model to fit pseudolabels predic...
Entity and Evidence Guided Relation Extraction for DocRED
Document-level relation extraction is a challenging task which requires ...
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
We consider the dynamic of gradient descent for learning a two-layer neu...
Simplifying Models with Unlabeled Output Data
We focus on prediction problems with high-dimensional outputs that are s...
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization
Real-world large-scale datasets are heteroskedastic and imbalanced – lab...
Active Online Domain Adaptation
Online machine learning systems need to adapt to domain shifts. Meanwhil...
Individual Calibration with Randomized Forecasting
Machine learning applications often require calibrated predictions, e.g....
Self-training Avoids Using Spurious Features Under Domain Shift
In unsupervised domain adaptation, existing theory focuses on situations...
Federated Accelerated Stochastic Gradient Descent
We propose Federated Accelerated Stochastic Gradient Descent (FedAc), a ...
Model-based Adversarial Meta-Reinforcement Learning
Meta-reinforcement learning (meta-RL) aims to learn from multiple traini...
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
The noise in stochastic gradient descent (SGD) provides a crucial implic...
MOPO: Model-based Offline Policy Optimization
Offline reinforcement learning (RL) refers to the problem of learning po...
Robust and On-the-fly Dataset Denoising for Image Classification
Memorization in over-parameterized neural networks could severely hurt g...
Optimal Regularization Can Mitigate Double Descent
Recent empirical and theoretical studies have shown that many learning a...
The Implicit and Explicit Regularization Effects of Dropout
Dropout is a widely-used regularization technique, often required to obt...
Understanding Self-Training for Gradual Domain Adaptation
Machine learning systems must adapt to data distributions that evolve ov...
Variable-Viewpoint Representations for 3D Object Recognition
For the problem of 3D object recognition, researchers using deep learnin...
Bootstrapping the Expressivity with Model-based Planning
We compare the model-free reinforcement learning with the model-based ap...
Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
For linear classifiers, the relationship between (normalized) output mar...
Verified Uncertainty Calibration
Applications such as weather forecasting and personalized medicine deman...
Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling
Imitation learning, followed by reinforcement learning algorithms, is a ...
A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning
The aim of multi-task reinforcement learning is twofold: (1) efficientl...
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Stochastic gradient descent with a large initial learning rate is a wide...
Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
Deep learning algorithms can fare poorly when the training dataset suffe...
On the Performance of Thompson Sampling on Logistic Bandits
We study the logistic bandit, in which rewards are binary with success p...
Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
Existing Rademacher complexity bounds for neural networks rely only on n...
Fixup Initialization: Residual Learning Without Normalization
Normalization layers are a staple in state-of-the-art deep neural networ...
On the Margin Theory of Feedforward Neural Networks
Past works have shown that, somewhat surprisingly, over-parametrization ...
Algorithmic Framework for Model-based Reinforcement Learning with Theoretical Guarantees
While model-based reinforcement learning has empirically been shown to s...
Approximability of Discriminators Implies Diversity in GANs
While Generative Adversarial Networks (GANs) have empirically produced i...
Seeing Neural Networks Through a Box of Toys: The Toybox Dataset of Visual Object Transformations
Deep convolutional neural networks (CNNs) have enjoyed tremendous succes...
Optimal Design of Process Flexibility for General Production Systems
Process flexibility is widely adopted as an effective strategy for respo...
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors
Motivations like domain adaptation, transfer learning, and feature learn...
Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
We show that the (stochastic) gradient descent algorithm provides an imp...
Algorithmic Regularization in Over-parameterized Matrix Recovery
We study the problem of recovering a low-rank matrix X^* from linear meas...
Learning One-hidden-layer Neural Networks with Landscape Design
We consider the problem of learning a one-hidden-layer neural network: w...
On the Optimization Landscape of Tensor Decompositions
Non-convex optimization with local search heuristics has been widely use...
Generalization and Equilibrium in Generative Adversarial Nets (GANs)
We show that training of generative adversarial network (GAN) may not ha...
Provable learning of Noisy-or Networks
Many machine learning applications use latent variable models to explain...
Identity Matters in Deep Learning
An emerging design principle in deep learning is that each layer of a de...
Finding Approximate Local Minima Faster than Gradient Descent
We design a non-convex second-order optimization algorithm that is guara...
A Non-generative Framework and Convex Relaxations for Unsupervised Learning
We give a novel formal theoretical framework for unsupervised learning w...
Gradient Descent Learns Linear Dynamical Systems
We prove that gradient descent efficiently converges to the global optim...
Provable Algorithms for Inference in Topic Models
Recently, there has been considerable progress on designing algorithms w...
Tengyu Ma
verified profile
Assistant Professor of Computer Science and Statistics at Stanford University