lamBERT: Language and Action Learning Using Multimodal BERT

04/15/2020
by   Kazuki Miyazawa, et al.

Recently, the bidirectional encoder representations from transformers (BERT) model has attracted much attention in natural language processing owing to its high performance on language understanding tasks. The BERT model learns language representations that can be adapted to various tasks via unsupervised pre-training on a large corpus. This study proposes the language and action learning using multimodal BERT (lamBERT) model, which enables the learning of language and actions by 1) extending the BERT model to multimodal representations and 2) integrating it with reinforcement learning. To verify the proposed model, an experiment is conducted in a grid environment in which the agent must understand language to act properly. The lamBERT model obtained higher rewards in both multitask and transfer settings than comparison models, such as a convolutional neural network-based model and the lamBERT model without pre-training.
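The core idea in the abstract can be sketched in a few lines: a single Transformer encoder consumes both language tokens and visual (grid-observation) tokens, and the fused representation feeds a policy head for reinforcement learning. The sketch below, in PyTorch, is an illustration of that idea only; all layer sizes, the modality-type embeddings, and the mean-pooling policy head are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class MultimodalBertPolicy(nn.Module):
    """Minimal lamBERT-style sketch: one shared Transformer encoder over a
    concatenated language + vision token sequence, with a policy head on top.
    Dimensions and components are illustrative, not taken from the paper."""

    def __init__(self, vocab_size=100, n_cell_types=25, d_model=64,
                 n_heads=4, n_layers=2, n_actions=4):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)       # instruction tokens
        self.cell_emb = nn.Embedding(n_cell_types, d_model)     # grid cells as token ids
        self.type_emb = nn.Embedding(2, d_model)                # 0 = language, 1 = vision
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.policy_head = nn.Linear(d_model, n_actions)

    def forward(self, word_ids, cell_ids):
        # Embed each modality and add a modality-type embedding.
        w = self.word_emb(word_ids) + self.type_emb(torch.zeros_like(word_ids))
        v = self.cell_emb(cell_ids) + self.type_emb(torch.ones_like(cell_ids))
        # Joint self-attention over the concatenated multimodal sequence.
        h = self.encoder(torch.cat([w, v], dim=1))
        # Mean-pool the fused sequence and score the discrete actions.
        return self.policy_head(h.mean(dim=1))

policy = MultimodalBertPolicy()
words = torch.randint(0, 100, (1, 6))   # e.g. a tokenized instruction
cells = torch.randint(0, 25, (1, 9))    # e.g. a 3x3 egocentric grid view
logits = policy(words, cells)
print(logits.shape)  # torch.Size([1, 4])
```

In an RL loop, the action logits would feed a categorical policy (e.g. REINFORCE or PPO); the paper's contribution is that the same encoder can also be pre-trained BERT-style before being fine-tuned with reward, which is what the comparison against the model "without pre-training" tests.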


