Learning to Reformulate the Queries on the WEB
The inability of naive users to formulate appropriate queries is a fundamental problem for web search engines. Helping users issue more effective queries is therefore an important way to improve user satisfaction. One effective approach is query reformulation, which generates new, more effective queries from the query the user has just issued. Previous research typically generates words and phrases related to the original query. Because the definition of query reformulation is quite general, it is difficult to develop a uniform term-based approach to the problem. This paper uses readily available data, in particular over one billion anchor phrases in the ClueWeb09 corpus, to learn an end-to-end encoder-decoder model that automatically generates effective queries. Following successful research on sequence-to-sequence models, we employ a character-level convolutional neural network with max-pooling as the encoder and an attention-based recurrent neural network as the decoder. The whole model is learned in an unsupervised, end-to-end manner. Experiments on TREC collections show that the reformulated queries automatically generated by the proposed solution can significantly improve retrieval performance.
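To make the encoder and attention steps named in the abstract concrete, the following is a minimal sketch of a forward pass in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the alphabet, filter count, filter width, pooling size, random untrained weights, the ReLU nonlinearity, and the dot-product form of attention are all hypothetical choices, and the decoder state is a placeholder for what a recurrent decoder would produce.

```python
import math
import random


def one_hot_chars(text, alphabet):
    """Encode each character as a one-hot vector (unknown characters become all zeros)."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    vectors = []
    for ch in text:
        vec = [0.0] * len(alphabet)
        if ch in index:
            vec[index[ch]] = 1.0
        vectors.append(vec)
    return vectors


def conv1d(inputs, filters, width):
    """Valid 1-D convolution over time with ReLU; each filter is a flat list of width * in_dim weights."""
    outputs = []
    for t in range(len(inputs) - width + 1):
        window = [x for vec in inputs[t:t + width] for x in vec]
        outputs.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return outputs


def max_pool(features, size):
    """Per-channel max over non-overlapping windows of `size` time steps (assumed pooling scheme)."""
    pooled = []
    for start in range(0, len(features) - size + 1, size):
        block = features[start:start + size]
        pooled.append([max(channel) for channel in zip(*block)])
    return pooled


def attention(decoder_state, encoder_states):
    """Dot-product attention (assumed form): softmax weights and the weighted context vector."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoder_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(decoder_state)
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context


# Hypothetical hyperparameters, chosen only to keep the example small.
random.seed(0)
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
NUM_FILTERS, WIDTH, POOL = 4, 3, 2
filters = [
    [random.uniform(-0.5, 0.5) for _ in range(WIDTH * len(ALPHABET))]
    for _ in range(NUM_FILTERS)
]

chars = one_hot_chars("web search", ALPHABET)   # 10 characters
conv = conv1d(chars, filters, WIDTH)            # 10 - 3 + 1 = 8 time steps
encoded = max_pool(conv, POOL)                  # 8 / 2 = 4 encoder states
decoder_state = [0.1] * NUM_FILTERS             # placeholder for a decoder RNN hidden state
weights, context = attention(decoder_state, encoded)
```

Here `encoded` plays the role of the encoder output and `weights` shows how much a single decoding step would attend to each pooled position. A trained model would learn the filters and the decoder parameters jointly from query-like text such as anchor phrases.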