
Exponential Family Estimation via Adversarial Dynamics Embedding
We present an efficient algorithm for maximum likelihood estimation (MLE...
ConQUR: Mitigating Delusional Bias in Deep Q-learning
Delusional bias is a fundamental source of error in approximate Q-learni...
Domain Aggregation Networks for Multi-Source Domain Adaptation
In many real-world applications, we want to exploit multiple source data...
GenDICE: Generalized Offline Estimation of Stationary Values
An important problem that arises in reinforcement learning and Monte Car...
The Value Function Polytope in Reinforcement Learning
We establish geometric and topological properties of the space of value ...
A Geometric Perspective on Optimal Representations for Reinforcement Learning
This paper proposes a new approach to representation learning based on g...
Batch Stationary Distribution Estimation
We consider the problem of approximating the stationary distribution of ...
Kernel Exponential Family Estimation via Doubly Dual Embedding
We investigate penalized maximum log-likelihood estimation for exponenti...
On the Global Convergence Rates of Softmax Policy Gradient Methods
We make three contributions toward better understanding policy gradient ...
Striving for Simplicity in Off-policy Deep Reinforcement Learning
Reflecting on the advances of off-policy deep reinforcement learning (RL...
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Trust region methods, such as TRPO, are often used to stabilize policy o...
Improving Policy Gradient by Exploring Underappreciated Rewards
This paper presents a novel form of policy gradient for model-free reinf...
Bridging the Gap Between Value and Policy Based Reinforcement Learning
We establish a new connection between value and policy based reinforceme...
Adaptive Monte Carlo via Bandit Allocation
We consider the problem of sequentially choosing between a set of unbias...
Stochastic Neural Networks with Monotonic Activation Functions
We propose a Laplace approximation that creates a stochastic unit from a...
Learning Bayesian Nets that Perform Well
A Bayesian net (BN) is more than a succinct way to encode a probabilisti...
Monte Carlo Matrix Inversion Policy Evaluation
In 1950, Forsythe and Leibler (1950) introduced a statistical technique ...
Generalized Conditional Gradient for Sparse Estimation
Structured sparsity is an important modeling tool that expands the appli...
Convex Relaxations of Bregman Divergence Clustering
Although many convex relaxations of clustering have been proposed in the...
Monte Carlo Inference via Greedy Importance Sampling
We present a new method for conducting Monte Carlo inference in graphica...
Boltzmann Machine Learning with the Latent Maximum Entropy Principle
We present a new statistical learning paradigm for Boltzmann machines ba...
Maximum Margin Bayesian Networks
We consider the problem of learning Bayesian network classifiers that ma...
Regularizers versus Losses for Nonlinear Dimensionality Reduction: A Factored View with New Convex Relaxations
We demonstrate that almost all nonparametric dimensionality reduction m...
Convex Structure Learning for Bayesian Networks: Polynomial Feature Selection and Approximate Ordering
We present a new approach to learning the structure and parameters of a ...
Rank/Norm Regularization with Closed-Form Solutions: Application to Subspace Clustering
When data is sampled from an unknown subspace, principal component analy...
Smoothed Action Value Functions for Learning Gaussian Policies
State-action value functions (i.e., Q-values) are ubiquitous in reinforc...
Planning and Learning with Stochastic Action Sets
In many practical uses of reinforcement learning (RL) the set of actions...
Variational Rejection Sampling
Learning latent variable models with stochastic variational inference is...
Understanding the impact of entropy in policy learning
Entropy regularization is commonly used to improve policy optimization i...
Understanding the impact of entropy on policy optimization
Entropy regularization is commonly used to improve policy optimization i...
Learning to Generalize from Sparse and Underspecified Rewards
We consider the problem of learning from sparse and underspecified rewar...
Advantage Amplification in Slowly Evolving Latent-State Environments
Latent-state environments with long horizons, such as those faced by rec...
AlgaeDICE: Policy Gradient from Arbitrary Experience
In many real-world applications of reinforcement learning (RL), interact...
Learning to Combat Compounding-Error in Model-Based Reinforcement Learning
Despite its potential to improve sample complexity versus model-free app...
Variational Inference for Deep Probabilistic Canonical Correlation Analysis
In this paper, we propose a deep probabilistic multi-view model that is ...
Energy-Based Processes for Exchangeable Data
Recently there has been growing interest in modeling sets with exchangea...
