Seq-2-Seq based Refinement of ASR Output for Spoken Name Capture

03/29/2022
by Karan Singla, et al.

Capturing person names from human speech is a difficult task in human-machine conversations. In this paper, we propose a novel approach to capturing person names from caller utterances in response to the prompt "say and spell your first/last name". Inspired by work on spell correction, disfluency removal, and text normalization, we propose a lightweight Seq-2-Seq system that generates a name spelling from varying user input. Our proposed method outperforms a strong baseline based on an LM-driven rule-based approach.


