Regularized Off-Policy TD-Learning

06/06/2020
by Bo Liu, et al.

We present a novel l_1-regularized, off-policy convergent TD-learning method (termed RO-TD) that learns sparse representations of value functions with low computational complexity. The algorithmic framework underlying RO-TD integrates two key ideas: off-policy convergent gradient TD methods, such as TDC, and a convex-concave saddle-point formulation of non-smooth convex optimization, which enables first-order solvers and feature selection using online convex regularization. We give a detailed theoretical and experimental analysis of RO-TD, with experiments that illustrate its off-policy convergence, sparse feature selection capability, and low computational cost.
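To make the two ingredients concrete, the following is a minimal sketch, not the RO-TD algorithm itself. It pairs one common form of the TDC update with a soft-thresholding (proximal) step for the l_1 penalty, which is a simpler stand-in for the saddle-point solver the abstract describes. The function names, step sizes (`alpha`, `beta`), penalty weight (`lam`), and importance ratio (`rho`) are all assumptions chosen for illustration.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def tdc_l1_step(theta, w, phi, reward, phi_next,
                gamma=0.9, alpha=0.05, beta=0.1, lam=0.01, rho=1.0):
    """One illustrative TDC update followed by l_1 soft-thresholding.

    theta: value-function weights; w: auxiliary TDC weights.
    phi, phi_next: feature vectors of the current and next state.
    rho: importance-sampling ratio (1.0 means on-policy data).
    NOTE: a proximal-gradient stand-in, not the paper's saddle-point method.
    """
    # TD error under the current linear value estimate.
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    phi_w = dot(phi, w)

    # TDC gradient-correction step on theta, then shrink toward zero.
    new_theta = [
        soft_threshold(
            th + alpha * rho * (delta * p - gamma * pn * phi_w),
            alpha * lam,
        )
        for th, p, pn in zip(theta, phi, phi_next)
    ]

    # Auxiliary weights track the expected TD error given the features.
    new_w = [wi + beta * rho * (delta - phi_w) * p for wi, p in zip(w, phi)]
    return new_theta, new_w


if __name__ == "__main__":
    theta, w = [0.0, 0.0], [0.0, 0.0]
    theta, w = tdc_l1_step(theta, w, phi=[1.0, 0.0], reward=1.0,
                           phi_next=[0.0, 1.0])
    print(theta, w)
```

In this sketch the shrinkage step sets small weights exactly to zero, which is the mechanism behind sparse feature selection in l_1-regularized methods generally. How RO-TD itself achieves this is specified in the full paper, not here.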


