
Adaptive Federated Learning and Digital Twin for Industrial Internet of Things
Industrial Internet of Things (IoT) enables distributed intelligent serv...
Learning the Linear Quadratic Regulator from Nonlinear Observations
We introduce a new problem setting for continuous control called the LQR...
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
Direct policy gradient methods for reinforcement learning are a successf...
Information Theoretic Regret Bounds for Online Nonlinear Control
This work studies the problem of sequential control in an unknown, nonli...
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
In order to deal with the curse of dimensionality in reinforcement learn...
Provably Efficient Model-based Policy Adaptation
The high sample complexity of reinforcement learning challenges its use ...
Constrained episodic reinforcement learning in concave-convex and knapsack settings
We propose an algorithm for tabular episodic reinforcement learning with...
Arbitrary Style Transfer via Multi-Adaptation Network
Arbitrary style transfer is a significant topic with both research value...
Exploration in Action Space
Parameter space exploration methods with black-box optimization have rec...
Corruption Robust Exploration in Episodic Reinforcement Learning
We initiate the study of multistage episodic reinforcement learning und...
Policy Poisoning in Batch Reinforcement Learning and Control
We study a security threat to batch reinforcement learning and control w...
Optimal Sketching for Kronecker Product Regression and Low Rank Approximation
We study the Kronecker product regression problem, in which the design m...
Imitation Learning as f-Divergence Minimization
We address the problem of imitation learning with multimodal demonstrat...
Provably Efficient Imitation Learning from Observation Alone
We study Imitation Learning (IL) from Observations alone (ILFO) in large...
Efficient Model-free Reinforcement Learning in Metric Spaces
Model-free Reinforcement Learning (RL) algorithms such as Q-learning [Wa...
Outcome-Driven Clustering of Acute Coronary Syndrome Patients using Multi-Task Neural Network with Attention
Cluster analysis aims at separating patients into phenotypically heterog...
Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective
Black-box optimizers that explore in parameter space have often been sho...
Model-Based Reinforcement Learning in Contextual Decision Processes
We study the sample complexity of model-based reinforcement learning in ...
Contextual Memory Trees
We design and study a Contextual Memory Tree (CMT), a learning memory co...
Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning
In this paper, we propose to combine imitation and reinforcement learnin...
Dual Policy Iteration
Recently, a novel class of Approximate Policy Iteration (API) algorithms...
Recurrent Predictive State Policy Networks
We introduce Recurrent Predictive State Policy (RPSP) networks, a recurr...
Sketching for Kronecker Product Regression and P-splines
TensorSketch is an oblivious linear sketch introduced in Pagh'13 and lat...
Predictive-State Decoders: Encoding the Future into Recurrent Networks
Recurrent neural networks (RNNs) are a vital modeling technique that rel...
Risk-Aware Algorithms for Adversarial Contextual Bandits
In this work we consider adversarial contextual bandits with risk constr...
