Sensor Transformation Attention Networks

08/03/2017
by Stefan Braun, et al.

Recent work on encoder-decoder models for sequence-to-sequence mapping has shown that integrating both temporal and spatial attention mechanisms into neural networks substantially increases system performance. In this work, we report on the application of an attentional signal not to temporal or spatial regions of the input, but instead as a method of switching among the inputs themselves. We evaluate the particular role of attentional switching in the presence of dynamic noise in the sensors, and demonstrate how the attentional signal responds dynamically to changing noise levels in the environment to achieve increased performance on both audio and visual tasks in three commonly used datasets: TIDIGITS, Wall Street Journal, and GRID. Moreover, the proposed sensor transformation network architecture naturally introduces a number of advantages that merit exploration, including ease of adding new sensors to existing architectures, attentional interpretability, and increased robustness in a variety of noisy environments not seen during training. Finally, we demonstrate that the sensor selection attention mechanism of a model trained only on the small TIDIGITS dataset can be transferred directly to a pre-existing larger network trained on the Wall Street Journal dataset, maintaining functionality of switching between sensors to yield a dramatic reduction of error in the presence of noise.
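The core idea — an attention signal that switches among sensors rather than over regions of one input — can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the linear per-sensor transformations, the scalar attention scorer, and all dimensions are assumptions chosen for brevity (the paper uses learned transformation networks).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: two sensors, raw feature size 8, shared feature size 4.
n_sensors, d_in, d_model = 2, 8, 4

# Per-sensor transformation into a shared feature space (linear here for brevity).
W_t = rng.standard_normal((n_sensors, d_in, d_model))
# Attention scorer: one scalar relevance score per transformed sensor feature.
w_a = rng.standard_normal(d_model)

def sensor_attention(inputs):
    """inputs: (n_sensors, d_in) -> fused feature (d_model,), weights (n_sensors,)."""
    # Map each sensor's raw features into the shared space.
    feats = np.stack([inputs[s] @ W_t[s] for s in range(n_sensors)])  # (S, d_model)
    scores = feats @ w_a                  # one score per sensor
    weights = softmax(scores)             # attention distribution over sensors
    fused = weights @ feats               # attention-weighted merge of sensor features
    return fused, weights

fused, weights = sensor_attention(rng.standard_normal((n_sensors, d_in)))
```

In a trained model the scorer would learn to assign low weight to a sensor whose features look noise-corrupted, so the fused representation dynamically favors the cleaner input — the switching behavior the abstract describes.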

