
Minimax Model Learning
We present a novel off-policy loss function for learning a transition mo...

CURE: Code-Aware Neural Machine Translation for Automatic Program Repair
Automatic program repair (APR) is crucial to improve software reliabilit...

Model-free Representation Learning and Exploration in Low-rank MDPs
The low rank MDP has emerged as an important model for studying represen...

SM+: Refined Scale Match for Tiny Person Detection
Detecting tiny objects (e.g., less than 20 x 20 pixels) in large-scale ...

Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency
We offer a theoretical characterization of off-policy evaluation (OPE) i...

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function
We consider the problem of local planning in fixed-horizon Markov Decisi...

Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking
Unmanned Aerial Vehicle (UAV) offers lots of applications in both commer...

Quantifying spatial homogeneity of urban road networks via graph neural networks
The spatial homogeneity of an urban road network (URN) measures whether ...

Language Generation via Combinatorial Constraint Satisfaction: A Tree Search Enhanced Monte-Carlo Approach
Generating natural language under complex constraints is a principled fo...

Multi-SQL: An extensible multi-model data query language
Big data management aims to establish data hubs that support data in mul...

A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting
Recently, Wang et al. (2020) showed a highly intriguing hardness result ...

Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
This paper studies regret minimization with randomized value functions i...

Batch Value-function Approximation with Only Realizability
We solve a long-standing problem in batch reinforcement learning (RL): l...

A Question Type Driven and Copy Loss Enhanced Framework for Answer-Agnostic Neural Question Generation
The answer-agnostic question generation is a significant and challenging...

Q^* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison
We prove performance guarantees of two algorithms for approximating Q^* i...

RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
This paper presents a deep reinforcement learning algorithm for online a...

Minimax Confidence Interval for Off-Policy Evaluation and Policy Optimization
We study minimax methods for off-policy evaluation (OPE) using value-fun...

Scale Match for Tiny Person Detection
Visual object detection has achieved unprecedented advance with the ris...

Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Off-policy policy evaluation (OPE) is the problem of estimating the onli...

Minimax Weight and Q-Function Learning for Off-Policy Evaluation
We provide theoretical investigations into off-policy evaluation in rein...

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
Reinforcement learning (RL) methods have been shown to be capable of lea...

From Importance Sampling to Doubly Robust Policy Gradient
We show that policy gradient (PG) and its variance reduction variants ca...

Online Supervised Learning for Traffic Load Prediction in Framed-ALOHA Networks
Predicting the current backlog, or traffic load, in framed-ALOHA network...

On Value Functions and the Agent-Environment Boundary
When function approximation is deployed in reinforcement learning (RL), ...

Provably Efficient Q-Learning with Low Switching Cost
We take initial steps in studying PAC-MDP algorithms with limited adapti...

Information-Theoretic Considerations in Batch Reinforcement Learning
Value-function approximation methods that operate in batch mode have fou...

The development and evaluation of the SmartAbility Android Application to detect users' abilities
The SmartAbility Android Application recommends Assistive Technology (AT...

Provably efficient RL with Rich Observations via Latent State Decoding
We study the exploration problem in episodic MDPs with rich observations...

Deep Reinforcement Learning for Real-Time Optimization in NB-IoT Networks
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based tec...

Model-Based Reinforcement Learning in Contextual Decision Processes
We study the sample complexity of model-based reinforcement learning in ...

Cooperative Deep Reinforcement Learning for Multiple-Group NB-IoT Networks Optimization
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based tec...

Cooperative Deep Reinforcement Learning for Multiple Groups NB-IoT Networks Optimization
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based tec...

Why Do Neural Response Generation Models Prefer Universal Replies?
Recent advances in sequence-to-sequence learning reveal a purely data-dr...

Image Classification Based on Quantum KNN Algorithm
Image classification is an important task in the field of machine learni...

On Polynomial Time PAC Reinforcement Learning with Rich Observations
We study the computational tractability of provably sample-efficient (PA...

Hierarchical Imitation and Reinforcement Learning
We study the problem of learning policies over long time horizons. We pr...

Markov Decision Processes with Continuous Side Information
We consider a reinforcement learning (RL) setting in which the agent int...

Repeated Inverse Reinforcement Learning
We introduce a novel repeated Inverse Reinforcement Learning problem: th...

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
This paper studies systematic exploration for reinforcement learning wit...

Neural Network Architecture Optimization through Submodularity and Supermodularity
Deep learning models' architectures, including depth and width, are key ...

Optimizing Recurrent Neural Networks Architectures under Time Constraints
Recurrent neural network (RNN)'s architecture is a key factor influencin...

Word Embedding based Correlation Model for Question/Answer Matching
With the development of community based question answering (Q&A) service...

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
We study the problem of off-policy value evaluation in reinforcement lea...
Nan Jiang