Unsupervised Lexical Acquisition of Relative Spatial Concepts Using Spoken User Utterances

by Rikunari Sagara, et al.

This paper proposes methods for the unsupervised lexical acquisition of relative spatial concepts from spoken user utterances. A robot with a flexible spoken dialog system must be able to acquire linguistic representations, and their meanings, that are specific to its environment through interactions with humans, as children do. Relative spatial concepts (e.g., front and right) are widely used in daily life; however, when a robot learns such concepts, it is not obvious which object serves as the reference object. We therefore propose methods by which a robot without prior knowledge of words can learn relative spatial concepts. The methods are formulated using a probabilistic model that simultaneously estimates the proper reference objects and the distributions representing the concepts. The experimental results show that relative spatial concepts, and a phoneme sequence representing each concept, can be learned even when the robot does not know which located object is the reference object. Additionally, we show that two processes in the proposed method improve the estimation accuracy of the concepts: generating candidate word sequences with a class n-gram and selecting word sequences using location information. Furthermore, we show that clues to reference objects improve accuracy even though the number of candidate reference objects increases.
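
The idea of estimating reference objects and concept distributions simultaneously can be illustrated with a small sketch. This is not the model from the paper: it assumes a single concept, represents it as an isotropic 2-D Gaussian over the displacement from reference object to target, and fits it with plain expectation-maximization, treating the reference object of each scene as a latent variable. The function names (`fit_concept`, `make_scenes`), the synthetic data, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_pdf(d, mean, var):
    """Density of an isotropic 2-D Gaussian evaluated at displacement d."""
    dx, dy = d[0] - mean[0], d[1] - mean[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * var)) / (2.0 * math.pi * var)


def fit_concept(scenes, iters=50):
    """EM over the unknown reference object of each scene (illustrative only).

    scenes: list of (target_xy, [candidate_xy, ...]) pairs.
    Returns (mean, var, responsibilities).
    """
    # Displacement of the target relative to every candidate reference object.
    disps = [[(t[0] - c[0], t[1] - c[1]) for c in cands] for t, cands in scenes]
    flat = [d for ds in disps for d in ds]
    # Initialise the concept from the average displacement over all candidates.
    mean = (sum(d[0] for d in flat) / len(flat),
            sum(d[1] for d in flat) / len(flat))
    var = 1.0
    resp = []
    n = len(disps)
    for _ in range(iters):
        # E-step: posterior probability that each candidate is the reference.
        resp = []
        for ds in disps:
            w = [gaussian_pdf(d, mean, var) + 1e-300 for d in ds]
            total = sum(w)
            resp.append([wi / total for wi in w])
        # M-step: re-estimate the concept from responsibility-weighted displacements.
        pairs = [(r, d) for ds, rs in zip(disps, resp) for d, r in zip(ds, rs)]
        mx = sum(r * d[0] for r, d in pairs) / n
        my = sum(r * d[1] for r, d in pairs) / n
        sq = sum(r * ((d[0] - mx) ** 2 + (d[1] - my) ** 2) for r, d in pairs)
        mean, var = (mx, my), max(sq / (2.0 * n), 1e-6)
    return mean, var, resp


def make_scenes(n, seed=0, offset=(1.0, 0.0), noise=0.05, n_objects=3):
    """Synthetic scenes: the target sits at `offset` from one hidden reference."""
    rng = random.Random(seed)
    scenes, truth = [], []
    for _ in range(n):
        cands = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(n_objects)]
        k = rng.randrange(n_objects)  # hidden index of the true reference object
        target = (cands[k][0] + offset[0] + rng.gauss(0, noise),
                  cands[k][1] + offset[1] + rng.gauss(0, noise))
        scenes.append((target, cands))
        truth.append(k)
    return scenes, truth


if __name__ == "__main__":
    scenes, truth = make_scenes(200)
    mean, var, resp = fit_concept(scenes)
    picked = [max(range(len(r)), key=r.__getitem__) for r in resp]
    accuracy = sum(p == t for p, t in zip(picked, truth)) / len(truth)
    print("estimated mean displacement:", mean)
    print("reference-object accuracy:", accuracy)
```

In this toy setting, the learner is never told which of the three objects is the reference, yet the responsibilities computed in the E-step and the Gaussian re-estimated in the M-step sharpen each other over the iterations. A fuller treatment along the lines of the abstract would also need multiple concepts and the word-sequence side of the problem, which this sketch leaves out.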

