
WSRNet: Joint Spotting and Recognition of Handwritten Words

by George Retsinas, et al.

In this work, we present a unified model that handles both Keyword Spotting and Word Recognition with the same network architecture. The proposed network comprises a non-recurrent CTC branch and a Seq2Seq branch that is further augmented with an Autoencoding module. Training with the resulting joint loss boosts recognition performance, while the Seq2Seq branch is used to create efficient word representations. We show how to further process these representations with binarization and a retraining scheme to obtain compact and highly efficient descriptors suitable for keyword spotting. Numerical results validate the usefulness of the proposed architecture: our method outperforms the previous state-of-the-art in keyword spotting and achieves results close to those of the leading methods for word recognition.
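To make the idea of compact binary descriptors concrete, here is a minimal sketch of how binarized word representations could be compared for keyword spotting. It is an illustration only, not the authors' method. The abstract does not say how binarization or matching is done, so thresholding each dimension at zero and ranking by Hamming distance are assumptions. The function names (`binarize`, `hamming`, `rank_by_query`) and the toy 4-dimensional descriptors are hypothetical.

```python
from typing import List, Sequence


def binarize(descriptor: Sequence[float], threshold: float = 0.0) -> int:
    """Pack a real-valued descriptor into an int, one bit per dimension.

    Assumption: a dimension maps to 1 when its value exceeds `threshold`.
    """
    bits = 0
    for value in descriptor:
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_query(query: Sequence[float],
                  words: List[Sequence[float]]) -> List[int]:
    """Return word indices ordered from most to least similar to the query."""
    query_code = binarize(query)
    codes = [binarize(w) for w in words]
    return sorted(range(len(words)), key=lambda i: hamming(query_code, codes[i]))


# Toy example with made-up 4-dimensional descriptors (hypothetical values).
query = [0.9, -0.2, 0.4, -0.7]
word_descriptors = [
    [-0.5, 0.3, -0.1, 0.8],   # differs from the query in every bit
    [0.7, -0.1, 0.2, -0.3],   # same sign pattern as the query
    [0.6, 0.4, 0.1, -0.9],    # differs in one bit
]
ranking = rank_by_query(query, word_descriptors)
print(ranking)  # [1, 2, 0]
```

With binary codes, each comparison reduces to an XOR and a bit count. That is one reason binarized descriptors are attractive for retrieval over large word collections.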

