Prototypical Contrastive Transfer Learning for Multimodal Language Understanding

07/12/2023
by Seitaro Otsuki, et al.

Although domestic service robots are expected to assist individuals who require support, they currently cannot interact smoothly with people through natural language. For example, given the instruction "Bring me a bottle from the kitchen," it is difficult for such robots to identify the intended bottle in an indoor environment. Most conventional models have been trained on real-world datasets that are labor-intensive to collect, and they have not fully leveraged simulation data through a transfer learning framework. In this study, we propose a novel transfer learning approach for multimodal language understanding called Prototypical Contrastive Transfer Learning (PCTL), which uses a new contrastive loss called Dual ProtoNCE. We apply PCTL to the task of identifying target objects in domestic environments according to free-form natural language instructions. To validate PCTL, we built new real-world and simulation datasets. Our experiments demonstrated that PCTL outperformed existing methods: PCTL achieved an accuracy of 78.1%, whereas simple fine-tuning achieved an accuracy of 73.4%.
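The abstract does not spell out the Dual ProtoNCE objective, but ProtoNCE-style losses contrast each embedding against cluster prototypes rather than individual instances. The PyTorch sketch below illustrates one plausible reading in which simulation and real-world embeddings are each pulled toward prototypes of the other domain; the function names, the cross-domain pairing, and the equal weighting of the two terms are our assumptions for illustration, not the paper's definition.

```python
# Hypothetical sketch of a ProtoNCE-style prototypical contrastive loss.
# Assumes L2-normalized embeddings and precomputed (e.g., k-means) prototypes
# per domain; this is illustrative, not the authors' implementation.

import torch
import torch.nn.functional as F

def proto_nce(z, prototypes, assignments, temperature=0.1):
    """InfoNCE between embeddings and their assigned cluster prototypes.

    z:           (N, D) L2-normalized embeddings
    prototypes:  (K, D) L2-normalized cluster centroids
    assignments: (N,)   index of each embedding's prototype
    """
    logits = z @ prototypes.t() / temperature    # (N, K) cosine similarities
    return F.cross_entropy(logits, assignments)  # pull each z toward its prototype

def dual_proto_nce(z_sim, z_real, protos_sim, protos_real,
                   assign_sim_to_real, assign_real_to_sim, temperature=0.1):
    """Assumed cross-domain variant: simulation embeddings are contrasted
    against real-domain prototypes and vice versa, encouraging a shared
    embedding space across the two domains."""
    loss_s2r = proto_nce(z_sim, protos_real, assign_sim_to_real, temperature)
    loss_r2s = proto_nce(z_real, protos_sim, assign_real_to_sim, temperature)
    return 0.5 * (loss_s2r + loss_r2s)

# Example usage with random data: 4 embeddings per domain, 3 prototypes each.
z_sim, z_real = [F.normalize(torch.randn(4, 16), dim=1) for _ in range(2)]
protos_sim, protos_real = [F.normalize(torch.randn(3, 16), dim=1) for _ in range(2)]
a_s2r = torch.randint(0, 3, (4,))
a_r2s = torch.randint(0, 3, (4,))
loss = dual_proto_nce(z_sim, z_real, protos_sim, protos_real, a_s2r, a_r2s)
```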


Related research

06/11/2019
DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain
This paper describes our competing system to enter the MEDIQA-2019 compe...

11/05/2020
Language Model is All You Need: Natural Language Understanding as Question Answering
Different flavors of transfer learning have shown tremendous impact in a...

12/23/2019
A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects
In this study, we focus on multimodal language understanding for fetchin...

06/17/2019
Understanding Natural Language Instructions for Fetching Daily Objects Using GAN-Based Multimodal Target-Source Classification
In this paper, we address multimodal language understanding for unconstr...

07/02/2021
Target-dependent UNITER: A Transformer-Based Multimodal Language Comprehension Model for Domestic Service Robots
Currently, domestic service robots have an insufficient ability to inter...

03/21/2019
Inferring Compact Representations for Efficient Natural Language Understanding of Robot Instructions
The speed and accuracy with which robots are able to interpret natural l...

07/14/2023
Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks
This paper describes a domestic service robot (DSR) that fetches everyda...
