
Exponential Family Estimation via Adversarial Dynamics Embedding
We present an efficient algorithm for maximum likelihood estimation (MLE...
04/27/2019 ∙ by Bo Dai, et al.

Domain Aggregation Networks for Multi-Source Domain Adaptation
In many real-world applications, we want to exploit multiple source data...
09/11/2019 ∙ by Junfeng Wen, et al.

The Value Function Polytope in Reinforcement Learning
We establish geometric and topological properties of the space of value ...
01/31/2019 ∙ by Robert Dadashi, et al.

A Geometric Perspective on Optimal Representations for Reinforcement Learning
This paper proposes a new approach to representation learning based on g...
01/31/2019 ∙ by Marc G. Bellemare, et al.

Kernel Exponential Family Estimation via Doubly Dual Embedding
We investigate penalized maximum log-likelihood estimation for exponenti...
11/06/2018 ∙ by Bo Dai, et al.

Striving for Simplicity in Off-policy Deep Reinforcement Learning
Reflecting on the advances of off-policy deep reinforcement learning (RL...
07/10/2019 ∙ by Rishabh Agarwal, et al.

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Trust region methods, such as TRPO, are often used to stabilize policy o...
07/06/2017 ∙ by Ofir Nachum, et al.

Improving Policy Gradient by Exploring Underappreciated Rewards
This paper presents a novel form of policy gradient for model-free reinf...
11/28/2016 ∙ by Ofir Nachum, et al.

Bridging the Gap Between Value and Policy Based Reinforcement Learning
We establish a new connection between value and policy based reinforceme...
02/28/2017 ∙ by Ofir Nachum, et al.

Adaptive Monte Carlo via Bandit Allocation
We consider the problem of sequentially choosing between a set of unbias...
05/13/2014 ∙ by James Neufeld, et al.

Stochastic Neural Networks with Monotonic Activation Functions
We propose a Laplace approximation that creates a stochastic unit from a...
01/01/2016 ∙ by Siamak Ravanbakhsh, et al.

Learning Bayesian Nets that Perform Well
A Bayesian net (BN) is more than a succinct way to encode a probabilisti...
02/06/2013 ∙ by Russell Greiner, et al.

Monte Carlo Matrix Inversion Policy Evaluation
In 1950, Forsythe and Leibler (1950) introduced a statistical technique ...
10/19/2012 ∙ by Fletcher Lu, et al.

Generalized Conditional Gradient for Sparse Estimation
Structured sparsity is an important modeling tool that expands the appli...
10/17/2014 ∙ by Yaoliang Yu, et al.

Convex Relaxations of Bregman Divergence Clustering
Although many convex relaxations of clustering have been proposed in the...
09/26/2013 ∙ by Hao Cheng, et al.

Monte Carlo Inference via Greedy Importance Sampling
We present a new method for conducting Monte Carlo inference in graphica...
01/16/2013 ∙ by Dale Schuurmans, et al.

Boltzmann Machine Learning with the Latent Maximum Entropy Principle
We present a new statistical learning paradigm for Boltzmann machines ba...
10/19/2012 ∙ by Shaojun Wang, et al.

Maximum Margin Bayesian Networks
We consider the problem of learning Bayesian network classifiers that ma...
07/04/2012 ∙ by Yuhong Guo, et al.

Regularizers versus Losses for Nonlinear Dimensionality Reduction: A Factored View with New Convex Relaxations
We demonstrate that almost all nonparametric dimensionality reduction m...
06/27/2012 ∙ by Yaoliang Yu, et al.

Convex Structure Learning for Bayesian Networks: Polynomial Feature Selection and Approximate Ordering
We present a new approach to learning the structure and parameters of a ...
06/27/2012 ∙ by Yuhong Guo, et al.

Rank/Norm Regularization with Closed-Form Solutions: Application to Subspace Clustering
When data is sampled from an unknown subspace, principal component analy...
02/14/2012 ∙ by Yao-Liang Yu, et al.

Smoothed Action Value Functions for Learning Gaussian Policies
State-action value functions (i.e., Q-values) are ubiquitous in reinforc...
03/06/2018 ∙ by Ofir Nachum, et al.

Planning and Learning with Stochastic Action Sets
In many practical uses of reinforcement learning (RL) the set of actions...
05/07/2018 ∙ by Craig Boutilier, et al.

Variational Rejection Sampling
Learning latent variable models with stochastic variational inference is...
04/05/2018 ∙ by Aditya Grover, et al.

Understanding the impact of entropy in policy learning
Entropy regularization is commonly used to improve policy optimization i...
11/27/2018 ∙ by Zafarali Ahmed, et al.

Understanding the impact of entropy on policy optimization
Entropy regularization is commonly used to improve policy optimization i...
11/27/2018 ∙ by Zafarali Ahmed, et al.

Learning to Generalize from Sparse and Underspecified Rewards
We consider the problem of learning from sparse and underspecified rewar...
02/19/2019 ∙ by Rishabh Agarwal, et al.

Advantage Amplification in Slowly Evolving Latent-State Environments
Latent-state environments with long horizons, such as those faced by rec...
05/29/2019 ∙ by Martin Mladenov, et al.

AlgaeDICE: Policy Gradient from Arbitrary Experience
In many real-world applications of reinforcement learning (RL), interact...
12/04/2019 ∙ by Ofir Nachum, et al.
Dale Schuurmans
Professor in the Department of Computing Science at the University of Alberta; Research Scientist at Google Brain