An Unsupervised Model with Attention Autoencoders for Question Retrieval
Question retrieval is a crucial subtask for community question answering. Previous research focus on supervised models which depend heavily on training data and manual feature engineering. In this paper, we propose a novel unsupervised framework, namely reduced attentive matching network (RAMN), to compute semantic matching between two questions. Our RAMN integrates together the deep semantic representations, the shallow lexical mismatching information and the initial rank produced by an external search engine. For the first time, we propose attention autoencoders to generate semantic representations of questions. In addition, we employ lexical mismatching to capture surface matching between two questions, which is derived from the importance of each word in a question. We conduct experiments on the open CQA datasets of SemEval-2016 and SemEval-2017. The experimental results show that our unsupervised model obtains comparable performance with the state-of-the-art supervised methods in SemEval-2016 Task 3, and outperforms the best system in SemEval-2017 Task 3 by a wide margin.
READ FULL TEXT