On Generality and Knowledge Transferability in Cross-Domain Duplicate Question Detection for Heterogeneous Community Question Answering

Duplicate question detection is an ongoing challenge in community question answering because semantically equivalent questions can have significantly different words and structures. In addition, the identification of duplicate questions can reduce the resources required for retrieval, when the same questions are not repeated. This study compares the performance of deep neural networks and gradient tree boosting, and explores the possibility of domain adaptation with transfer learning to improve the under-performing target domains for the text-pair duplicates classification task, using three heterogeneous datasets: general-purpose Quora, technical Ask Ubuntu, and academic English Stack Exchange. Ultimately, our study exposes the alternative hypothesis that the meaning of a "duplicate" is not inherently general-purpose, but rather is dependent on the domain of learning, hence reducing the chance of transfer learning through adapting to the domain.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/06/2019

Towards Domain Adaptation from Limited Data for Question Answering Using Deep Neural Networks

This paper explores domain adaptation for enabling question answering (Q...
research
08/07/2017

ISS-MULT: Intelligent Sample Selection for Multi-Task Learning in Question Answering

Transferring knowledge from a source domain to another domain is useful,...
research
01/08/2019

Supervised Transfer Learning for Product Information Question Answering

Popular e-commerce websites such as Amazon offer community question answ...
research
11/06/2019

Unsupervised Domain Adaptation of Contextual Embeddings for Low-Resource Duplicate Question Detection

Answering questions is a primary goal of many conversational systems or ...
research
07/01/2020

Transferability of Natural Language Inference to Biomedical Question Answering

Biomedical question answering (QA) is a challenging problem due to the s...
research
11/22/2020

Cross-Domain Generalization Through Memorization: A Study of Nearest Neighbors in Neural Duplicate Question Detection

Duplicate question detection (DQD) is important to increase efficiency o...
research
11/02/2017

Hi, how can I help you?: Automating enterprise IT support help desks

Question answering is one of the primary challenges of natural language ...

Please sign up or login with your details

Forgot password? Click here to reset