Interpretability in Contact-Rich Manipulation via Kinodynamic Images

by Ioanna Mitsioni, et al.

Deep Neural Networks (NNs) have been widely used in contact-rich manipulation tasks to model complicated contact dynamics. However, NN-based models are often difficult to decipher, which can lead to seemingly inexplicable behaviors and unidentifiable failure cases. In this work, we address the interpretability of NN-based models by introducing kinodynamic images. We propose a methodology that creates images from the kinematic and dynamic data of a contact-rich manipulation task. Our formulation visually reflects the task's state by encoding its kinodynamic variations and temporal evolution. By using images as the state representation, we enable the application of interpretability modules that were previously limited to vision-based tasks. We use this representation to train convolution-based networks, and we extract interpretations of the model's decisions with Grad-CAM, a technique that produces visual explanations. Our method is versatile and can be applied to any classification problem in manipulation that uses synchronous features, making it possible to visually interpret which parts of the input drive the model's decisions and to distinguish its failure modes. We evaluate this approach on two examples of real-world contact-rich manipulation, pushing and cutting, with known and unknown objects. Finally, we demonstrate that our method enables both detailed visual inspections of sequences in a task and high-level evaluations of a model's behavior and tendencies. Data and code for this work are available at
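As a rough illustration of the two ingredients named in the abstract, the sketch below turns a window of synchronous kinematic and dynamic readings into a small grayscale image and computes a Grad-CAM heatmap from a convolutional layer's feature maps and gradients. It is not the authors' implementation: the function names `kinodynamic_image` and `grad_cam`, the per-feature min-max scaling, and the features-by-time layout are all assumptions made for the example. The Grad-CAM part follows the standard formulation (channel weights from spatially averaged gradients, then a ReLU over the weighted sum); the final division by the maximum is only a common visualization convenience.

```python
import numpy as np


def kinodynamic_image(window, lo=None, hi=None):
    """Map a (T, F) window of synchronous features to an (F, T) image in [0, 1].

    Hypothetical encoding: each feature (e.g. a position or a force channel)
    is min-max scaled and becomes one image row; columns follow time.
    """
    window = np.asarray(window, dtype=np.float64)
    if window.ndim != 2:
        raise ValueError("expected a (timesteps, features) array")
    # Use per-feature bounds from the window unless fixed bounds are given.
    lo = window.min(axis=0) if lo is None else np.asarray(lo, dtype=np.float64)
    hi = window.max(axis=0) if hi is None else np.asarray(hi, dtype=np.float64)
    # Constant features would divide by zero, so give them a unit span.
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = np.clip((window - lo) / span, 0.0, 1.0)
    return scaled.T  # rows: features, columns: time


def grad_cam(activations, gradients):
    """Grad-CAM heatmap from (K, H, W) feature maps and class-score gradients."""
    activations = np.asarray(activations, dtype=np.float64)
    gradients = np.asarray(gradients, dtype=np.float64)
    if activations.shape != gradients.shape or activations.ndim != 3:
        raise ValueError("expected two arrays of identical (K, H, W) shape")
    # Channel importance: gradients averaged over the spatial dimensions.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of feature maps, keeping only positive evidence (ReLU).
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    peak = cam.max()
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    return cam / peak if peak > 0 else cam
```

In practice the activations and gradients would come from a trained convolutional classifier (for example via framework hooks on its last convolutional layer), and the heatmap would be upsampled to the size of the input image so that highlighted regions can be read as specific features at specific times.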





Accurate Vision-based Manipulation through Contact Reasoning

Planning contact interactions is one of the core challenges of many robo...

Probabilistic Model Learning and Long-term Prediction for Contact-rich Manipulation Tasks

Learning dynamics models is an essential component of model-based reinfo...

COCOI: Contact-aware Online Context Inference for Generalizable Non-planar Pushing

General contact-rich manipulation problems are long-standing challenges ...

Data-Driven Model Predictive Control for Food-Cutting

Modelling of contact-rich tasks is challenging and cannot be entirely so...

Contact-Rich Manipulation of a Flexible Object based on Deep Predictive Learning using Vision and Tactility

We achieved contact-rich flexible object manipulation, which was difficu...

Modelling and Learning Dynamics for Robotic Food-Cutting

Data-driven approaches for modelling contact-rich tasks address many of ...

Autonomously Learning to Visually Detect Where Manipulation Will Succeed

Visual features can help predict if a manipulation behavior will succeed...