
Seq-2-Seq based Refinement of ASR Output for Spoken Name Capture

03/29/2022
by Karan Singla, et al.
INTERACTIONS LLC

Capturing person names from human speech is a difficult task in human-machine conversations. In this paper, we propose a novel approach to capturing person names from caller utterances in response to the prompt "say and spell your first/last name". Inspired by work on spelling correction, disfluency removal, and text normalization, we propose a lightweight Seq-2-Seq system that generates the spelling of a name from varied user input. Our proposed method outperforms a strong baseline based on an LM-driven rule-based approach.

