Learning Representations and Agents for Information Retrieval
A goal shared by artificial intelligence and information retrieval is to create an oracle, that is, a machine that can answer our questions, no matter how difficult they are. A more limited, but still instrumental, version of this oracle is a question-answering system, in which an open-ended question is given to the machine, and an answer is produced based on the knowledge it has access to. Such systems already exist and are increasingly capable of answering complicated questions. This progress can be partially attributed to the recent success of machine learning and to the efficient methods for storing and retrieving information, most notably through web search engines. One can imagine that this general-purpose question-answering system can be built as a billion-parameters neural network trained end-to-end with a large number of pairs of questions and answers. We argue, however, that although this approach has been very successful for tasks such as machine translation, storing the world's knowledge as parameters of a learning machine can be very hard. A more efficient way is to train an artificial agent on how to use an external retrieval system to collect relevant information. This agent can leverage the effort that has been put into designing and running efficient storage and retrieval systems by learning how to best utilize them to accomplish a task. ...
READ FULL TEXT