
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Off-policy policy evaluation (OPE) is the problem of estimating the onli...

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
Reinforcement learning (RL) methods have been shown to be capable of lea...

Model-Based Reinforcement Learning in Contextual Decision Processes
We study the sample complexity of model-based reinforcement learning in ...

Provably Efficient Q-Learning with Low Switching Cost
We take initial steps in studying PAC-MDP algorithms with limited adapti...

Minimax Confidence Interval for Off-Policy Evaluation and Policy Optimization
We study minimax methods for off-policy evaluation (OPE) using value-fun...

Markov Decision Processes with Continuous Side Information
We consider a reinforcement learning (RL) setting in which the agent int...

Repeated Inverse Reinforcement Learning
We introduce a novel repeated Inverse Reinforcement Learning problem: th...

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
This paper studies systematic exploration for reinforcement learning wit...

Neural Network Architecture Optimization through Submodularity and Supermodularity
Deep learning models' architectures, including depth and width, are key ...

Optimizing Recurrent Neural Networks Architectures under Time Constraints
Recurrent neural network (RNN)'s architecture is a key factor influencin...

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
We study the problem of off-policy value evaluation in reinforcement lea...

Word Embedding based Correlation Model for Question/Answer Matching
With the development of community based question answering (Q&A) service...

On Polynomial Time PAC Reinforcement Learning with Rich Observations
We study the computational tractability of provably sample-efficient (PA...

Hierarchical Imitation and Reinforcement Learning
We study the problem of learning policies over long time horizons. We pr...

Image Classification Based on Quantum KNN Algorithm
Image classification is an important task in the field of machine learni...

Why Do Neural Response Generation Models Prefer Universal Replies?
Recent advances in sequence-to-sequence learning reveal a purely data-dr...

Cooperative Deep Reinforcement Learning for Multiple Groups NB-IoT Networks Optimization
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based tec...

Cooperative Deep Reinforcement Learning for Multiple-Group NB-IoT Networks Optimization
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based tec...

Provably efficient RL with Rich Observations via Latent State Decoding
We study the exploration problem in episodic MDPs with rich observations...

Deep Reinforcement Learning for Real-Time Optimization in NB-IoT Networks
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based tec...

Information-Theoretic Considerations in Batch Reinforcement Learning
Value-function approximation methods that operate in batch mode have fou...

On Value Functions and the Agent-Environment Boundary
When function approximation is deployed in reinforcement learning (RL), ...

The development and evaluation of the SmartAbility Android Application to detect users' abilities
The SmartAbility Android Application recommends Assistive Technology (AT...

Online Supervised Learning for Traffic Load Prediction in Framed-ALOHA Networks
Predicting the current backlog, or traffic load, in framed-ALOHA network...

From Importance Sampling to Doubly Robust Policy Gradient
We show that policy gradient (PG) and its variance reduction variants ca...

Minimax Weight and Q-Function Learning for Off-Policy Evaluation
We provide theoretical investigations into off-policy evaluation in rein...

Scale Match for Tiny Person Detection
Visual object detection has achieved unprecedented advance with the ris...

Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison
We prove performance guarantees of two algorithms for approximating Q* i...

RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
This paper presents a deep reinforcement learning algorithm for online a...
Nan Jiang