Kinova Gemini: Interactive Robot Grasping with Visual Reasoning and Conversational AI

09/03/2022
by Hanxiao Chen, et al.

To apply recent advances in robotics and AI to delicate collaboration between humans and machines, we propose Kinova Gemini, an original robotic system that integrates conversational AI dialogue and visual reasoning so that the Kinova Gen3 lite robot can help people retrieve objects or complete perception-based pick-and-place tasks. When a person walks up to the Kinova Gen3 lite, Kinova Gemini can fulfill the user's requests in three different applications: (1) It starts a natural dialogue with the user, retrieves the requested objects, and hands them over one by one. (2) It detects diverse objects with YOLO v3 and recognizes the color attributes of each item, then either asks through the dialogue whether the user wants it grasped or lets the user choose which specific item is required. (3) It applies YOLO v3 to recognize multiple objects and lets the user choose two items for perception-based pick-and-place tasks such as "Put the banana into the bowl", using visual reasoning and conversational interaction.
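As an illustration of the third application, the sketch below shows one simple way a spoken command such as "Put the banana into the bowl" could be matched against detector output to pick a source and a target object. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Detection` record, the `box_center` and `resolve_pick_and_place` helpers, and the rule that the first mentioned label is the object to pick and the second is the destination are all hypothetical, and the detections are hand-written stand-ins for what a detector like YOLO v3 might report.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Detection:
    """Hypothetical record for one detected object (not from the paper)."""
    label: str                       # class name reported by the detector
    color: str                       # color attribute estimated for the object
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def box_center(box: Tuple[int, int, int, int]) -> Tuple[float, float]:
    """Return the pixel center of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def resolve_pick_and_place(
    command: str, detections: List[Detection]
) -> Optional[Tuple[Detection, Detection]]:
    """Map a command to (object to pick, object to place it in).

    Assumption: the first detected label mentioned in the command is the
    object to pick, and the second is the destination. Returns None when
    fewer than two distinct detected labels appear in the command.
    """
    text = command.lower()

    # Keep only the first detection for each label.
    first_by_label = {}
    for det in detections:
        first_by_label.setdefault(det.label.lower(), det)

    # Record where each detected label is mentioned as a whole word.
    mentioned = []
    for label, det in first_by_label.items():
        match = re.search(r"\b" + re.escape(label) + r"\b", text)
        if match:
            mentioned.append((match.start(), det))

    if len(mentioned) < 2:
        return None

    mentioned.sort(key=lambda pair: pair[0])
    return mentioned[0][1], mentioned[1][1]


# Hand-written example detections standing in for real detector output.
scene = [
    Detection("bowl", "white", (200, 120, 320, 220)),
    Detection("banana", "yellow", (40, 60, 140, 110)),
    Detection("cup", "red", (350, 80, 410, 160)),
]

result = resolve_pick_and_place("Put the banana into the bowl", scene)
if result is not None:
    source, target = result
    print(f"pick {source.color} {source.label} at {box_center(source.box)}")
    print(f"place into {target.color} {target.label} at {box_center(target.box)}")
```

A real system would additionally need to convert pixel coordinates into robot workspace coordinates and to handle ambiguous requests, for example by asking a follow-up question in the dialogue when several objects share a label.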


research
02/15/2023

Commonsense Reasoning for Conversational AI: A Survey of the State of the Art

Large, transformer-based pretrained language models like BERT, GPT, and ...
research
10/06/2022

CoGrasp: 6-DoF Grasp Generation for Human-Robot Collaboration

Robot grasping is an actively studied area in robotics, mainly focusing ...
research
07/01/2020

Interactive Path Reasoning on Graph for Conversational Recommendation

Traditional recommendation systems estimate user preference on items fro...
research
11/02/2020

An ontology-based chatbot for crises management: use case coronavirus

Today is the era of intelligence in machines. With the advances in Artif...
research
11/22/2021

Building Goal-Oriented Dialogue Systems with Situated Visual Context

Most popular goal-oriented dialogue agents are capable of understanding ...
research
07/24/2023

simPLE: a visuotactile method learned in simulation to precisely pick, localize, regrasp, and place objects

Existing robotic systems have a clear tension between generality and pre...
research
05/28/2015

Like Partying? Your Face Says It All. Predicting the Ambiance of Places with Profile Pictures

To choose restaurants and coffee shops, people are increasingly relying ...
