
Attention Actor-Critic algorithm for Multi-Agent Constrained Cooperative Reinforcement Learning
In this work, we consider the problem of computing optimal actions for R...

Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach
In this paper, with a view toward fast deployment of locomotion gaits in...

Hindsight Experience Replay with Kronecker Product Approximate Curvature
Hindsight Experience Replay (HER) is one of the efficient algorithms to s...

A reinforcement learning approach to hybrid control design
In this paper we design hybrid control policies for hybrid systems whose...

Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations
With the research into development of quadruped robots picking up pace, ...

A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks
We consider the problem of energy management in microgrid networks. A mi...

Gait Library Synthesis for Quadruped Robots via Augmented Random Search
In this paper, with a view toward fast deployment of learned locomotion ...

Hierarchical Average Reward Policy Gradient Algorithms
Option-critic learning is a general-purpose reinforcement learning (RL) ...

A Convergent Off-Policy Temporal Difference Algorithm
Learning the value function of a given policy (target policy) from the d...

Generalized Speedy Q-learning
In this paper, we derive a generalization of the Speedy Q-learning (SQL)...

Solution of Two-Player Zero-Sum Game by Successive Relaxation
We consider the problem of two-player zero-sum game. In this setting, th...

Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots
In this work, we provide a simulation framework to perform systematic st...

Reinforcement Learning in Non-Stationary Environments
Reinforcement learning (RL) methods learn optimal decisions in the prese...

Second Order Value Iteration in Reinforcement Learning
Value iteration is a fixed point iteration technique utilized to obtain ...

Actor-Critic Algorithms for Constrained Multiagent Reinforcement Learning
In cooperative stochastic games multiple agents work towards learning jo...

Successive Over Relaxation Q-Learning
In a discounted reward Markov Decision Process (MDP) the objective is to...

An Online Sample Based Method for Mode Estimation using ODE Analysis of Stochastic Approximation Algorithms
One of the popular measures of central tendency that provides better rep...

Design, Development and Experimental Realization of a Quadrupedal Research Platform: Stoch
In this paper, we present a complete description of the hardware design ...

Memory-based Deep Reinforcement Learning for Obstacle Avoidance in UAV with Limited Environment Knowledge
This paper presents our method for enabling a UAV quadrotor, equipped wi...

Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives
Humans and animals are believed to use a very minimal set of trajectorie...

Random directions stochastic approximation with deterministic perturbations
We introduce deterministic perturbation schemes for the recently propose...

An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method
In this paper, we provide two new stable online algorithms for the probl...

Asynchronous stochastic approximations with asymptotically biased errors and deep multiagent learning
Asynchronous stochastic approximations are an important class of model-f...

A Cross Entropy based Optimization Algorithm with Global Convergence Guarantees
The cross entropy (CE) method is a model based search method to solve op...

An Incremental Off-policy Search in a Model-free Markov Decision Process Using a Single Sample Path
In this paper, we consider a modified version of the control problem in ...

RLWS: A Reinforcement Learning based GPU Warp Scheduler
The Streaming Multiprocessors (SMs) of a Graphics Processing Unit (GPU) ...

A unified decision making framework for supply and demand management in microgrid networks
This paper considers two important problems - on the supply-side and dem...

Conditions for Stability and Convergence of Set-Valued Stochastic Approximations: Applications to Approximate Value and Fixed point Iterations
The main aim of this paper is the development of easily verifiable suffi...

Novel Sensor Scheduling Scheme for Intruder Tracking in Energy Efficient Sensor Networks
We consider the problem of tracking an intruder using a network of wirel...

Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids
We consider the problem of minimizing the difference in the demand and t...

Analysis of gradient descent methods with non-diminishing, bounded errors
The main aim of this paper is to provide an analysis of gradient descent...

Shaping Proto-Value Functions via Rewards
In this paper, we combine task-dependent reward shaping and task-indepen...

Stability of Stochastic Approximations with `Controlled Markov' Noise and Temporal Difference Learning
In this paper we present a `stability theorem' for stochastic approximat...

Two Timescale Stochastic Approximation with Controlled Markov noise and Off-policy temporal difference learning
We present for the first time an asymptotic convergence analysis of two ...

Stochastic recursive inclusion in two timescales with an application to the Lagrangian dual problem
In this paper we present a framework to analyze the asymptotic behavior ...

A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions
In this paper the stability theorem of Borkar and Meyn is extended to in...

Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games
We consider the problem of finding stationary Nash equilibria (NE) in a ...
Shalabh Bhatnagar
Professor and Chair, Dept of Computer Science and Automation at Indian Institute of Science