Multiple topic identification in human/human conversations

12/18/2018
by   X. Bost, et al.
0

The paper deals with the automatic analysis of real-life telephone conversations between an agent and a customer of a customer care service (ccs). The application domain is the public transportation system in Paris and the purpose is to collect statistics about customer problems in order to monitor the service and decide priorities on the intervention for improving user satisfaction. Of primary importance for the analysis is the detection of themes that are the object of customer problems. Themes are defined in the application requirements and are part of the application ontology that is implicit in the ccs documentation. Due to variety of customer population, the structure of conversations with an agent is unpredictable. A conversation may be about one or more themes. Theme mentions can be interleaved with mentions of facts that are irrelevant for the application purpose. Furthermore, in certain conversations theme mentions are localized in specific conversation segments while in other conversations mentions cannot be localized. As a consequence, approaches to feature extraction with and without mention localization are considered. Application domain relevant themes identified by an automatic procedure are expressed by specific sentences whose words are hypothesized by an automatic speech recognition (asr) system. The asr system is error prone. The word error rates can be very high for many reasons. Among them it is worth mentioning unpredictable background noise, speaker accent, and various types of speech disfluencies. As the application task requires the composition of proportions of theme mentions, a sequential decision strategy is introduced in this paper for performing a survey of the large amount of conversations made available in a given time period. The strategy has to sample the conversations to form a survey containing enough data analyzed with high accuracy so that proportions can be estimated with sufficient accuracy. Due to the unpredictable type of theme mentions, it is appropriate to consider methods for theme hypothesization based on global as well as local feature extraction. Two systems based on each type of feature extraction will be considered by the strategy. One of the four methods is novel. It is based on a new definition of density of theme mentions and on the localization of high density zones whose boundaries do not need to be precisely detected. The sequential decision strategy starts by grouping theme hypotheses into sets of different expected accuracy and coverage levels. For those sets for which accuracy can be improved with a consequent increase of coverage a new system with new features is introduced. Its execution is triggered only when specific preconditions are met on the hypotheses generated by the basic four systems. Experimental results are provided on a corpus collected in the call center of the Paris transportation system known as ratp. The results show that surveys with high accuracy and coverage can be composed with the proposed strategy and systems. This makes it possible to apply a previously published proportion estimation approach that takes into account hypothesization errors .

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/21/2018

Multiple topic identification in telephone conversations

This paper deals with the automatic analysis of conversations between a ...
research
03/12/2022

A combined approach to the analysis of speech conversations in a contact center domain

The ever more accurate search for deep analysis in customer data is a re...
research
09/15/2017

"How May I Help You?": Modeling Twitter Customer Service Conversations Using Fine-Grained Dialogue Acts

Given the increasing popularity of customer service dialogue on Twitter,...
research
08/21/2019

Towards Better Understanding of Spontaneous Conversations: Overcoming Automatic Speech Recognition Errors With Intent Recognition

In this paper, we present a method for correcting automatic speech recog...
research
10/07/2020

WER we are and WER we think we are

Natural language processing of conversational speech requires the availa...
research
03/16/2023

Trustera: A Live Conversation Redaction System

Trustera, the first functional system that redacts personally identifiab...
research
05/13/2017

Annotating and Modeling Empathy in Spoken Conversations

Empathy, as defined in behavioral sciences, expresses the ability of hum...

Please sign up or login with your details

Forgot password? Click here to reset