
Extracting Similar Questions From Naturally-occurring Business Conversations

by Xiliang Zhu et al.

Pre-trained contextualized embedding models such as BERT are a standard building block in many natural language processing systems. We demonstrate that the sentence-level representations produced by some off-the-shelf contextualized embedding models have a narrow distribution in the embedding space, and thus perform poorly for the task of identifying semantically similar questions in real-world English business conversations. We describe a method that uses appropriately tuned representations and a small set of exemplars to group questions of interest to business users in a visualization that can be used for data exploration or employee coaching.
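The "narrow distribution" the abstract describes (often called anisotropy) means that embeddings of even unrelated sentences point in roughly the same direction, so pairwise cosine similarities are uniformly high and uninformative. The sketch below is a hypothetical illustration of that effect with synthetic vectors, not the paper's method or data: isotropic vectors are contrasted with vectors shifted by a shared offset that mimics a narrow cone in the embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_pairwise_cosine(X):
    """Average cosine similarity over all distinct pairs of rows in X."""
    n = X.shape[0]
    sims = [cosine(X[i], X[j]) for i in range(n) for j in range(i + 1, n)]
    return sum(sims) / len(sims)

# Isotropic embeddings: directions spread over the whole sphere,
# so unrelated pairs have near-zero cosine similarity on average.
iso = rng.normal(size=(200, 64))

# Anisotropic embeddings: a large shared offset dominates every vector,
# mimicking the narrow distribution the abstract attributes to some
# off-the-shelf sentence representations. All pairs now look "similar".
offset = rng.normal(size=64) * 10.0
aniso = rng.normal(size=(200, 64)) + offset

print(f"isotropic  mean pairwise cosine: {mean_pairwise_cosine(iso):.3f}")
print(f"anisotropic mean pairwise cosine: {mean_pairwise_cosine(aniso):.3f}")
```

In the anisotropic case the mean pairwise cosine is close to 1 regardless of content, which is why a similarity threshold over such raw embeddings cannot separate semantically similar questions from unrelated ones; appropriately tuned representations restore the spread that makes cosine similarity discriminative.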
