Grounded Language Acquisition From Object and Action Imagery

09/12/2023
by James Robert Kubricht, et al.

Deep learning approaches to natural language processing have made great strides in recent years. While these models produce symbols that convey vast amounts of diverse knowledge, it is unclear how such symbols are grounded in data from the world. In this paper, we explore the development of a private language for visual data representation by training emergent language (EL) encoders/decoders in both i) a traditional referential game environment and ii) a contrastive learning environment that uses a within-class matching training paradigm. An additional classification layer, combining neural machine translation and random forest classification, was used to transform symbolic representations (sequences of integer symbols) into class labels. These methods were applied in two experiments, one on object recognition and one on action recognition. For object recognition, a set of sketches produced by human participants from real imagery was used (Sketchy dataset); for action recognition, 2D trajectories were generated from 3D motion capture systems (MOVI dataset). To interpret the symbols produced for the data in each experiment, gradient-weighted class activation mapping (Grad-CAM) methods were used to identify pixel regions indicating semantic features that contribute evidence towards symbols in the learned languages. Additionally, a t-distributed stochastic neighbor embedding (t-SNE) method was used to investigate the embeddings learned by the CNN feature extractors.


