
Hardware as Policy: Mechanical and Computational Co-Optimization using Deep Reinforcement Learning

by Tianjian Chen, et al.

Deep Reinforcement Learning (RL) has shown great success in learning complex control policies for a variety of applications in robotics. However, in most such cases, the hardware of the robot has been considered immutable, modeled as part of the environment. In this study, we explore the problem of learning hardware and control parameters together in a unified RL framework. To achieve this, we propose to model aspects of the robot's hardware as a "mechanical policy", analogous to and optimized jointly with its computational counterpart. We show that, by modeling such mechanical policies as auto-differentiable computational graphs, the ensuing optimization problem can be solved efficiently by gradient-based algorithms from the Policy Optimization family. We present two such design examples: a toy mass-spring problem, and a real-world problem of designing an underactuated hand. We compare our method against traditional co-optimization approaches, and also demonstrate its effectiveness by building a physical prototype based on the learned hardware parameters.
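To make the "mechanical policy" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical static mass-spring toy: a computational policy with a single gain `w` maps a goal position to a motor force, and a mechanical policy with a single spring stiffness `k` maps that force to an equilibrium displacement. Both stages are written as one differentiable computation, so a single gradient covers `w` and `k` together. The names `Dual`, `computational_policy`, `mechanical_policy`, `GOAL`, `EFFORT_COST`, and `K_MIN` are illustrative assumptions, and plain projected gradient descent with a tiny forward-mode autodiff class stands in for the Policy Optimization algorithms the abstract refers to.

```python
class Dual:
    """Forward-mode autodiff value carrying partial derivatives w.r.t. (w, k)."""

    def __init__(self, val, grad=(0.0, 0.0)):
        self.val = val
        self.grad = grad

    @staticmethod
    def lift(x):
        # Promote plain numbers to constants with zero gradient.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val,
                    tuple(a + b for a, b in zip(self.grad, o.grad)))
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val,
                    tuple(a - b for a, b in zip(self.grad, o.grad)))

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val,
                    tuple(a * o.val + self.val * b
                          for a, b in zip(self.grad, o.grad)))
    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    tuple((a * o.val - self.val * b) / (o.val ** 2)
                          for a, b in zip(self.grad, o.grad)))


GOAL = 0.5          # hypothetical target displacement
EFFORT_COST = 0.01  # hypothetical penalty weight on motor force
K_MIN = 1.0         # hypothetical lower bound on manufacturable stiffness


def computational_policy(w, goal):
    """Software side: map the goal to a motor force command."""
    return w * goal


def mechanical_policy(action, k):
    """Hardware side: spring equilibrium displacement under the motor force."""
    return action / k


def loss_and_grad(w, k):
    """Return the loss and its gradient (dL/dw, dL/dk) through both policies."""
    w_d = Dual(w, (1.0, 0.0))
    k_d = Dual(k, (0.0, 1.0))
    action = computational_policy(w_d, GOAL)
    x = mechanical_policy(action, k_d)
    err = x - GOAL
    loss = err * err + EFFORT_COST * (action * action)
    return loss.val, loss.grad


def co_optimize(w=1.0, k=10.0, lr=0.2, steps=2000):
    """Jointly update the control gain and the spring stiffness."""
    for _ in range(steps):
        _, (g_w, g_k) = loss_and_grad(w, k)
        w -= lr * g_w
        k = max(K_MIN, k - lr * g_k)  # project onto the feasible stiffness range
    return w, k


if __name__ == "__main__":
    w_opt, k_opt = co_optimize()
    print("gain:", w_opt, "stiffness:", k_opt,
          "loss:", loss_and_grad(w_opt, k_opt)[0])
```

In this sketch the stiffness is treated exactly like a policy weight: it receives a gradient from the same loss and is updated in the same loop, with only a projection step to respect the assumed hardware bound.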



