Jiantao Jiao

research

∙ 09/18/2023

Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration

Safe Reinforcement Learning (RL) aims to find a policy that achieves hig...

0 Jinning Li, et al. ∙

research

∙ 09/07/2023

Noisy Computing of the 𝖮𝖱 and 𝖬𝖠𝖷 Functions

We consider the problem of computing a function of n variables using noi...

0 Banghua Zhu, et al. ∙

research

∙ 06/21/2023

On the Optimal Bounds for Noisy Computing

We revisit the problem of computing with noisy information considered in...

0 Banghua Zhu, et al. ∙

research

∙ 06/04/2023

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Reinforcement learning from human feedback (RLHF) has emerged as a relia...

0 Banghua Zhu, et al. ∙

research

∙ 06/03/2023

On Optimal Caching and Model Multiplexing for Large Model Inference

Large Language Models (LLMs) and other large foundation models have achi...

0 Banghua Zhu, et al. ∙

research

∙ 06/01/2023

Doubly Robust Self-Training

Self-training is an important technique for solving semi-supervised lear...

0 Banghua Zhu, et al. ∙

research

∙ 05/19/2023

Online Learning in a Creator Economy

The creator economy has revolutionized the way individuals can profit th...

0 Banghua Zhu, et al. ∙

research

∙ 02/12/2023

Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits

We consider the sequential decision-making problem where the mean outcom...

0 Nived Rajaraman, et al. ∙

research

∙ 01/30/2023

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

We propose A-Crab (Actor-Critic Regularized by Average Bellman error), a...

0 Hanlin Zhu, et al. ∙

research

∙ 01/27/2023

Online Learning in Stackelberg Games with an Omniscient Follower

We study the problem of online learning in a two-player decentralized co...

0 Geng Zhao, et al. ∙

research

∙ 01/26/2023

Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons

We provide a theoretical framework for Reinforcement Learning with Human...

0 Banghua Zhu, et al. ∙

research

∙ 11/01/2022

Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian

Offline reinforcement learning (RL), which refers to decision-making fro...

0 Paria Rashidinejad, et al. ∙

research

∙ 11/01/2022

Beyond the Best: Estimating Distribution Functionals in Infinite-Armed Bandits

In the infinite-armed bandit problem, each arm's average reward is sampl...

0 Yifei Wang, et al. ∙

research

∙ 05/30/2022

Minimax Optimal Online Imitation Learning via Replay Estimation

Online imitation learning is the problem of how best to mimic expert dem...

0 Gokul Swamy, et al. ∙

research

∙ 05/24/2022

Byzantine-Robust Federated Learning with Optimal Statistical Rates and Privacy Guarantees

We propose Byzantine-robust federated learning protocols with nearly opt...

7 Banghua Zhu, et al. ∙

research

∙ 04/05/2022

Jump-Start Reinforcement Learning

Reinforcement learning (RL) provides a theoretical framework for continu...

0 Ikechukwu Uchendu, et al. ∙

research

∙ 02/02/2022

Robust Estimation for Nonparametric Families via Generative Adversarial Networks

We provide a general framework for designing Generative Adversarial Netw...

9 Banghua Zhu, et al. ∙

research

∙ 12/21/2021

Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

Policy optimization methods are one of the most widely used classes of R...

0 Tianhao Wu, et al. ∙

research

∙ 07/08/2021

Computational Benefits of Intermediate Rewards for Hierarchical Planning

Many hierarchical reinforcement learning (RL) applications have empirica...

0 Yuexiang Zhai, et al. ∙

research

∙ 06/18/2021

MADE: Exploration via Maximizing Deviation from Explored Regions

In online reinforcement learning (RL), efficient exploration remains par...

4 Tianjun Zhang, et al. ∙

research

∙ 06/07/2021

Securing Secure Aggregation: Mitigating Multi-Round Privacy Leakage in Federated Learning

Secure aggregation is a critical component in federated learning, which ...

0 Jinhyun So, et al. ∙

research

∙ 03/22/2021

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Offline (or batch) reinforcement learning (RL) algorithms seek to learn ...

41 Paria Rashidinejad, et al. ∙

research

∙ 02/25/2021

Provably Breaking the Quadratic Error Compounding Barrier in Imitation Learning, Optimally

We study the statistical limits of Imitation Learning (IL) in episodic M...

0 Nived Rajaraman, et al. ∙

research

∙ 01/19/2021

Minimax Off-Policy Evaluation for Multi-Armed Bandits

We study the problem of off-policy evaluation in the multi-armed bandit ...

0 Cong Ma, et al. ∙

research

∙ 01/12/2021

Linear Representation Meta-Reinforcement Learning for Instant Adaptation

This paper introduces Fast Linearized Adaptive Policy (FLAP), a new meta...

0 Matt Peng, et al. ∙

research

∙ 10/12/2020

SLIP: Learning to Predict in Unknown Dynamical Systems with Long-Term Memory

We present an efficient and practical (polynomial time) algorithm for on...

0 Paria Rashidinejad, et al. ∙

research

∙ 09/13/2020

Toward the Fundamental Limits of Imitation Learning

Imitation learning (IL) aims to mimic the behavior of an expert policy i...

7 Nived Rajaraman, et al. ∙

research

∙ 05/28/2020

Robust estimation via generalized quasi-gradients

We explore why many recently proposed robust estimation problems are eff...

0 Banghua Zhu, et al. ∙

research

∙ 01/21/2020

When does the Tukey median work?

We analyze the performance of the Tukey median estimator under total var...

0 Banghua Zhu, et al. ∙

research

∙ 09/19/2019

Generalized Resilience and Robust Statistics

Robust statistics traditionally focuses on outliers, or perturbations in...

0 Banghua Zhu, et al. ∙

research

∙ 09/18/2019

Barracuda: The Power of ℓ-polling in Proof-of-Stake Blockchains

A blockchain is a database of sequential events that is maintained by a ...

0 Giulia Fanti, et al. ∙

research

∙ 01/27/2019

Deconstructing Generative Adversarial Networks

We deconstruct the performance of GANs into three components: 1. Formu...

0 Banghua Zhu, et al. ∙

research

∙ 01/24/2019

Theoretically Principled Trade-off between Robustness and Accuracy

We identify a trade-off between robustness and accuracy that serves as a...

5 Hongyang Zhang, et al. ∙

research

∙ 11/19/2018

Stackelberg GAN: Towards Provable Minimax Equilibrium via Multi-Generator Architectures

We study the problem of alleviating the instability issue in the GAN tra...

8 Hongyang Zhang, et al. ∙

research

∙ 09/18/2018

Concentration Inequalities for the Empirical Distribution

We study concentration inequalities for the Kullback–Leibler (KL) diver...

0 Jay Mardia, et al. ∙

research

∙ 05/02/2018

Minimax redundancy for Markov chains with large state space

For any Markov source, there exist universal codes whose normalized code...

0 Kedar Shriram Tatwawadi, et al. ∙

research

∙ 05/02/2018

Lower bounds on the minimax redundancy for Markov chains with large state space

For any Markov source, there exist universal codes whose normalized code...

0 Kedar Shriram Tatwawadi, et al. ∙

research

∙ 02/23/2018

Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance

We present Local Moment Matching (LMM), a unified methodology for symmet...

0 Yanjun Han, et al. ∙

research

∙ 02/22/2018

Entropy Rate Estimation for Markov Chains with Large State Space

Estimating the entropy based on data is one of the prototypical problems...

0 Yanjun Han, et al. ∙

research

∙ 12/19/2017

Approximate Profile Maximum Likelihood

We propose an efficient algorithm for approximate computation of the pro...

0 Dmitri S. Pavlichin, et al. ∙

research

∙ 11/23/2017

The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal

We analyze the Kozachenko–Leonenko (KL) nearest neighbor estimator for ...

0 Jiantao Jiao, et al. ∙

research

∙ 11/06/2017

Optimal rates of entropy estimation over Lipschitz balls

We consider the problem of minimax estimation of the entropy of a densit...

0 Yanjun Han, et al. ∙

research

∙ 07/05/2017

Estimating the Fundamental Limits is Easier than Achieving the Fundamental Limits

We show through case studies that it is easier to estimate the fundament...

0 Jiantao Jiao, et al. ∙

research

∙ 11/03/2016

Demystifying ResNet

The Residual Network (ResNet), proposed in He et al. (2015), utilized sh...

0 Sihan Li, et al. ∙

research

∙ 09/26/2014

Beyond Maximum Likelihood: from Theory to Practice

Maximum likelihood is the most widely used statistical estimation techni...

0 Jiantao Jiao, et al. ∙

research

∙ 07/29/2011

Minimax-Optimal Bounds for Detectors Based on Estimated Prior Probabilities

In many signal detection and classification problems, we have knowledge ...

0 Jiantao Jiao, et al. ∙
