SmartSales: Sales Script Extraction and Analysis from Sales Chatlog

by Hua Liang, et al.

In modern sales applications, automatic script extraction and management greatly reduce the human labor needed to collect winning sales scripts, which can substantially boost the sales success rate and can be shared across sales teams. In this work, we present the SmartSales system, which helps both sales representatives and managers obtain sales insights from large-scale sales chatlogs. SmartSales consists of three modules: 1) Customer frequently asked questions (FAQ) extraction enriches the FAQ knowledge base by harvesting high-quality customer question-answer pairs from the chatlog. 2) Customer objection response helps salespeople identify typical customer objections and the corresponding winning sales scripts, and search for proper sales responses to a given customer objection. 3) Sales manager dashboard helps sales managers monitor whether a specific sales representative or team follows the sales standard operating procedures (SOP). The proposed prototype system is powered by state-of-the-art conversational intelligence techniques and has been running on the Tencent Cloud to serve sales teams from several different areas.




1 Introduction

In sales teams, some sales representatives consistently attain or exceed their sales goals while others do not. The conventional way to improve the performance of a sales team is to figure out what makes good salespeople stand out by manually analyzing hundreds or thousands of human-to-human chatlogs between salespeople and customers, and then to summarize the winning sales scripts or other sales tips for team training [9, 8, 5]. However, this labor-intensive procedure requires the participation of sales experts and tedious human analysis, which makes it undesirable for modern fast-paced, high-growth sales teams.

Recent technological advances in natural language processing (NLP) have made it possible to induce the skeleton or key semantics of human conversations automatically [16, 3, 14]. Compared with relying on human experts to recognize winning sales scripts and analyze large-scale sales chatlogs, automatic mining and analysis of winning sales scripts with advanced NLP techniques would greatly accelerate the performance improvement of sales teams. To this end, we build an easy-to-use system named SmartSales that is capable of handling hundreds of thousands or even millions of sales chatlogs and helps both sales representatives and managers figure out the best practices for increasing the sales success rate. SmartSales is a web application that consists of three modules: Customer FAQ extraction, Customer objection response mining and search, and Sales manager dashboard.

SmartSales has been deployed on the Tencent Cloud and can also be seamlessly embedded in existing customer relationship management (CRM) systems. We verify the performance of SmartSales according to feedback from an online education sales team and an automobile sales team.

2 System Architecture

Figure 1: The user interface of SmartSales (the UI elements are translated to English from Chinese). The users start from an entrance page (①) that includes file uploading and task selection (②-⑤). The selected tasks correspond to the different modules of SmartSales, which are elaborated in Sec 2.2.

2.1 Modules

Customer FAQ Extraction

As a replacement for manual FAQ accumulation, this module offers sales teams a much easier and more efficient way to build an FAQ knowledge base. Question-answer pair retrieval followed by human post-checking would be one of the cornerstones of knowledge acquisition as new customers emerge. According to real customer feedback, this module reduces the human cost during knowledge base cold-start by up to 97% and 87% for the sales teams of an international express company and an online education company, respectively.

Customer Objection Response

Customer objections are the concerns that cause a prospect to hesitate (at best) or abandon a purchase (at worst). Even professionals might feel nervous when facing customer objections. One common resolution is to go through the chatlog, highlight the typical customer objections that come up again and again, and then create a plan to answer them. However, the sales chatlog grows rapidly for a large sales group with many prospective customers, and the customer objections might change across the different stages of the sales cycle. The proposed module serves as an assistant that helps salespeople determine typical customer objections and gain actionable insights from good responses during periodic sales retrospectives or performance reviews. As illustrated in Fig 3, after trivial customer messages are filtered out, semantically similar customer queries and the corresponding sales responses are assembled in the same cluster. The clusters of customer queries are ordered by frequency and semantic relevance. For each cluster, we offer keywords that highlight the core semantics of the customer queries. With the proposed module, sales representatives can figure out the typical customer objections and the best practices for responding to them, or search for a successful sales response to a particular customer objection.

Sales Manager Dashboard

To improve sales performance, sales managers should understand the strengths and weaknesses of their team in order to develop training programs for the sales staff. The proposed module helps managers review sales performance with a “query trigger-response spotlight” paradigm. The paradigm consists of a set of pre-defined rules that specify the sales standard operating procedures (SOP), i.e. “what the sales should do in the given situation”. For example,

  • When a customer hesitates due to affordability, the sales should mention a payment policy such as “pay by installments”.

  • When facing a new customer, the sales should promote the best-selling products.

The customer query triggers and sales response spotlights are formulated as a set of rules, keyword matching and query intention classification models, which are tailored for different sales groups.
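A single “query trigger-response spotlight” rule can be sketched as follows. This is a minimal illustration that uses keyword matching only; the `SopRule` class, the keyword lists, and the example messages are hypothetical and are not taken from the deployed system, which also relies on intent classification models.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SopRule:
    """One SOP rule: a customer-side trigger and a sales-side spotlight."""
    name: str
    trigger_keywords: Tuple[str, ...]    # hypothetical cues in the customer message
    spotlight_keywords: Tuple[str, ...]  # hypothetical cues expected in the sales reply


def contains_any(text: str, keywords: Tuple[str, ...]) -> bool:
    """Case-insensitive substring match against any of the keywords."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def check_rule(rule: SopRule, customer_msg: str, sales_reply: str) -> Optional[bool]:
    """Return None if the rule is not triggered; otherwise whether the reply follows it."""
    if not contains_any(customer_msg, rule.trigger_keywords):
        return None
    return contains_any(sales_reply, rule.spotlight_keywords)


# Hypothetical rule mirroring the affordability example above.
AFFORDABILITY = SopRule(
    name="affordability",
    trigger_keywords=("too expensive", "can't afford"),
    spotlight_keywords=("installment",),
)

turns = [
    ("This course is too expensive for me.", "You can pay by installments."),
    ("I can't afford it right now.", "It is our best course."),
    ("When does the class start?", "Next Monday."),
]
outcomes = [check_rule(AFFORDABILITY, customer, sales) for customer, sales in turns]
```

Returning `None` for untriggered turns keeps them out of any later SOP statistics, so only situations where the rule applies are counted.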

We verify the module functionality and effectiveness of customer objection response and sales manager dashboard with the chatlog from an automobile sales team and an online education sales team.

2.2 Interface

The SmartSales system is a web application that is deployed on the Tencent Cloud and can be easily integrated into the customers’ CRM systems (footnote 1: the demonstration video for SmartSales can be viewed in …). We show the user interface and user interaction of SmartSales in Fig 1. The users start from an entrance page that includes chatlog file uploading and task selection. After uploading the chatlog in the csv format, the users can select a specific task with the “task starts” operation (①) and view the details of the finished tasks in the “task list” section. The selected task on the entrance page corresponds to one of the modules of the system (②-⑤). For the customer FAQ extraction module (②), the extracted FAQs are shown plainly on a new webpage that requires no further user interaction. For the customer objection response module, the response mining sub-module corresponds to the information table of the clustered customer messages (③); the users can view the clustered sales responses in a popup window (⑥) by clicking on a specific message cluster. The users can also gain sales insights by searching for existing sales responses in the search bar (④). On the sales manager dashboard page (⑤), sales managers can review the execution ratio of the sales SOP from different viewpoints, i.e. trigger view, team view and staff view, by interacting with the tab bar.
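The three dashboard viewpoints can be read as the same execution ratio grouped by different keys. The sketch below assumes a hypothetical record layout (one record per triggered SOP rule, with `trigger`, `team`, `staff`, and `followed` fields); the actual data model of the system is not described in the paper.

```python
from collections import defaultdict
from typing import Dict, List


def execution_ratio(records: List[dict], view: str) -> Dict[str, float]:
    """Share of triggered SOP rules that were followed, grouped by `view`.

    `view` is one of the hypothetical record fields "trigger", "team", or "staff".
    """
    followed = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        key = record[view]
        total[key] += 1
        followed[key] += int(record["followed"])
    return {key: followed[key] / total[key] for key in total}


# Hypothetical records; each one is a situation in which an SOP rule was triggered.
records = [
    {"trigger": "affordability", "team": "A", "staff": "s1", "followed": True},
    {"trigger": "affordability", "team": "A", "staff": "s2", "followed": False},
    {"trigger": "new_customer", "team": "B", "staff": "s3", "followed": True},
    {"trigger": "new_customer", "team": "A", "staff": "s1", "followed": True},
]

by_trigger = execution_ratio(records, "trigger")
by_team = execution_ratio(records, "team")
by_staff = execution_ratio(records, "staff")
```

Switching tabs in such a design would only change the grouping key, not the underlying records.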

Figure 2: The workflow of the customer FAQ extraction module.
Figure 3: The NLP pipeline for customer objection response module.

3 AI Engine

QA pair mining

As depicted in Fig 2, the QA pair mining module comprises a question extractor and an answer extractor. The question extractor aims to identify meaningful customer queries and is formulated as a multi-label classifier over three aspects: semantic integrity, chitchat filtering, and legal customer inquiry (footnote 2: a valid question should be labeled “yes” for all three aspects). It is modeled as a one-layer MLP classifier over the CLS vector of a syntax-enhanced BERT [10, 15] encoder. After the valid customer queries are recognized, we model the answer extractor with a prompt-based BERT matching model. The input of the answer extractor is a dialog snippet $D = \{u_1, u_2, \dots, u_n\}$ with $n$ utterances, in which $u_1$ denotes the valid customer query recognized by the question extractor while $u_2, \dots, u_n$ denote the subsequent utterances, which include messages from both the customer and the sales. We feed the dialog snippet with soft prompt templates [11, 4, 13] into a BERT encoder. For simplicity, suppose $D$ contains only 4 utterances: the input sequence is then built from the CLS token, SEP tokens, the utterances, and prompt template tokens that are inserted before each of the two candidate answers. We then derive an answer score for each candidate. We select the answer with the highest score for $u_1$ only if the score is larger than a threshold, i.e. 0.75; otherwise we posit that no valid answer exists in the chatlog for $u_1$. Offline experiments show that the proposed QA extractor outperforms multiple QA extraction baselines [1, 6]. The training data for QA extraction are annotated by crowdsourced workers on an internal Tencent platform comparable to AMT.
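The two decision rules in this paragraph (a question is valid only if all three aspects are “yes”, and an answer is accepted only if its score exceeds 0.75) can be sketched as below. The aspect label names and function names are assumptions made for illustration, and the scores are placeholders for the outputs of the BERT matching model.

```python
from typing import Dict, List, Optional, Tuple

# Label names are assumptions; the paper names the three aspects only in prose.
ASPECTS = ("semantic_integrity", "not_chitchat", "legal_inquiry")
ANSWER_THRESHOLD = 0.75  # threshold stated in the text


def is_valid_question(labels: Dict[str, bool]) -> bool:
    """A question is valid only if every aspect is labeled 'yes' (True)."""
    return all(labels.get(aspect, False) for aspect in ASPECTS)


def select_answer(
    scored_candidates: List[Tuple[str, float]],
    threshold: float = ANSWER_THRESHOLD,
) -> Optional[str]:
    """Return the top-scoring candidate if its score is above the threshold, else None."""
    if not scored_candidates:
        return None
    best_text, best_score = max(scored_candidates, key=lambda pair: pair[1])
    return best_text if best_score > threshold else None


# Placeholder scores standing in for the matching model's outputs.
accepted = select_answer([("We offer installments.", 0.91), ("Hello!", 0.12)])
rejected = select_answer([("Hello!", 0.40), ("One moment please.", 0.75)])
```

Note that a candidate scoring exactly 0.75 is rejected here, since the text requires the score to be larger than the threshold.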

Utterance Semantic Clustering

After filtering trivial information in the dialog (Fig 3), we represent the utterances as dense vectors using pretrained sentence encoders [12], which are compatible with efficient query retrieval on GPUs [7]. Then we group semantically similar utterances with the K-means algorithm. In each cluster, we set the semantic centroid query as the anchor and use a dedicated BERT sentence-pair matching model [1], pretrained on an internal query-query matching corpus, to filter out the utterances with low relevance to the anchor query.
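The anchor-and-filter step inside one cluster can be sketched as below. Two assumptions are made loudly here: the “semantic centroid query” is interpreted as the utterance whose vector is closest to the cluster mean, and plain cosine similarity with a hypothetical cutoff of 0.8 stands in for the BERT sentence-pair matching model, which is not publicly specified. The toy 2-d vectors replace real sentence embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_anchor(vectors: List[Sequence[float]]) -> int:
    """Index of the vector closest (by cosine) to the mean of the cluster."""
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return max(range(len(vectors)), key=lambda i: cosine(vectors[i], centroid))


def filter_cluster(
    utterances: List[str],
    vectors: List[Sequence[float]],
    min_relevance: float = 0.8,  # hypothetical cutoff, not from the paper
) -> Tuple[str, List[str]]:
    """Return the anchor utterance and the utterances relevant enough to keep."""
    anchor = pick_anchor(vectors)
    kept = [
        utterance
        for utterance, vector in zip(utterances, vectors)
        # Cosine similarity is a stand-in for the BERT sentence-pair matcher.
        if cosine(vector, vectors[anchor]) >= min_relevance
    ]
    return utterances[anchor], kept


utterances = ["It costs too much", "The price is high", "Too pricey for me", "Where are you located?"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
anchor_text, kept = filter_cluster(utterances, vectors)
```

In this toy cluster the off-topic location question is dropped, while the three price-related utterances remain grouped around the anchor.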

Keywords and Semantic Classifier

We use the open-source topmine [2] toolkit to extract keywords from customer or sales utterances in the customer objection response module. For trigger and spotlight detection in the sales manager dashboard module, the intent classifier is tailored to each sales group and has a domain-specific intent vocabulary. For example, for sales teams in the online education domain, the pre-defined intents include “affordability”, “competitors”, “eagerness to learn”, “curriculum”, “lack of time”, etc. The intent classifier also supports domain-specific keyword matching, e.g. “new energy car”, “driver assistance”, “intelligent vehicle” for car sales teams.

4 Conclusion

We propose the SmartSales system, which uses advanced NLP technologies to automatically extract and analyze winning sales scripts from large customer-sales chatlogs for both sales representatives and managers. SmartSales consists of three modules: 1) customer FAQ extraction enriches the FAQ knowledge base by mining question-answer pairs from the chatlog; 2) customer objection response helps the sales figure out the typical customer objections and the best practices for responding to them; 3) sales manager dashboard depicts the performance of different sales staff or teams. The system has been running on the Tencent Cloud for several sales teams from different domains.


References

  • [1] J. Devlin, M. Chang, K. Lee, and K. Toutanova (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186.
  • [2] A. El-Kishky, Y. Song, C. Wang, C. R. Voss, and J. Han (2014) Scalable topical phrase mining from text corpora. Proceedings of the VLDB Endowment 8 (3).
  • [3] B. Gliwa, I. Mochol, M. Biesek, and A. Wawer (2019) SAMSum corpus: a human-annotated dialogue dataset for abstractive summarization. In Proceedings of the 2nd Workshop on New Frontiers in Summarization, Hong Kong, China, pp. 70–79.
  • [4] K. Hambardzumyan, H. Khachatrian, and J. May (2021) WARP: Word-level Adversarial ReProgramming. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 4921–4933.
  • [5] R. H. Humphrey and B. Ashforth (1994) Cognitive scripts and prototypes in service encounters. Advances in services marketing and management 3 (C), pp. 175–199.
  • [6] Q. Jia, M. Zhang, S. Zhang, and K. Q. Zhu (2020) Matching questions and answers in dialogues from online forums. In ECAI 2020, pp. 2046–2053.
  • [7] J. Johnson, M. Douze, and H. Jégou (2019) Billion-scale similarity search with GPUs. IEEE Transactions on Big Data 7 (3), pp. 535–547.
  • [8] T. W. Leigh and P. F. McGraw (1989) Mapping the procedural knowledge of industrial sales personnel: a script-theoretic investigation. Journal of Marketing 53 (1), pp. 16–34.
  • [9] S. M. Leong, P. S. Busch, and D. R. John (1989) Knowledge bases and salesperson effectiveness: a script-theoretic analysis. Journal of Marketing Research 26 (2), pp. 164–178.
  • [10] Z. Li, Q. Zhou, C. Li, K. Xu, and Y. Cao (2021) Improving BERT with syntax-aware local attention. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Online, pp. 645–653.
  • [11] G. Qin and J. Eisner (2021) Learning how to ask: querying LMs with mixtures of soft prompts. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, pp. 5203–5212.
  • [12] N. Reimers and I. Gurevych (2019) Sentence-BERT: sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, pp. 3982–3992.
  • [13] T. Schick and H. Schütze (2021) Exploiting cloze-questions for few-shot text classification and natural language inference. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Online, pp. 255–269.
  • [14] K. Sun, D. Yu, J. Chen, D. Yu, Y. Choi, and C. Cardie (2019) DREAM: a challenge data set and models for dialogue-based reading comprehension. Transactions of the Association for Computational Linguistics 7, pp. 217–231.
  • [15] Z. Xu, D. Guo, D. Tang, Q. Su, L. Shou, M. Gong, W. Zhong, X. Quan, D. Jiang, and N. Duan (2021) Syntax-enhanced pre-trained model. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 5412–5422.
  • [16] D. Yu, K. Sun, C. Cardie, and D. Yu (2020) Dialogue-based relation extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, pp. 4927–4940.