A Multi-Resolution Word Embedding for Document Retrieval from Large Unstructured Knowledge Bases

02/02/2019
by Tolgahan Cakaloglu, et al.

Deep language models that learn hierarchical representations have proved to be powerful tools for natural language processing, text mining, and information retrieval. However, representations that perform well for retrieval must capture semantic meaning at different levels of abstraction, or context-scopes. In this paper, we propose a new method for generating multi-resolution word embeddings that represent documents at multiple resolutions in terms of context-scopes. To investigate its performance, we use the Stanford Question Answering Dataset (SQuAD) and the Question Answering by Search And Reading (QUASAR) dataset in an open-domain question-answering setting, where the first task is to find documents useful for answering a given question. To this end, we first compare the retrieval performance of various text-embedding methods and then give an extensive empirical comparison of various non-augmented base embeddings with and without the multi-resolution representation. We argue that multi-resolution word embeddings are consistently superior to their original counterparts, and that deep residual neural models specifically trained for retrieval can yield further significant gains when used to augment those embeddings.
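
The retrieval task described above (finding documents useful for a given question) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how the multi-resolution embedding is built, so the `concat_resolutions` helper, the toy vectors, and the recall@k metric are all assumptions made for illustration. The helper simply concatenates vectors that are imagined to come from different context-scopes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def concat_resolutions(*views):
    """Hypothetical stand-in for a multi-resolution embedding:
    join one vector per context-scope into a single longer vector."""
    combined = []
    for view in views:
        combined.extend(view)
    return combined


def retrieve(question_vec, doc_vecs, k):
    """Return the indices of the k documents most similar to the question."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(question_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(question_vecs, doc_vecs, gold, k):
    """Fraction of questions whose gold document appears in the top k results."""
    hits = sum(
        1 for q, g in zip(question_vecs, gold) if g in retrieve(q, doc_vecs, k)
    )
    return hits / len(question_vecs)


# Toy data (made up): each embedding joins a "narrow" and a "wide" scope vector.
docs = [
    concat_resolutions([1.0, 0.0], [0.9, 0.1]),
    concat_resolutions([0.0, 1.0], [0.1, 0.9]),
    concat_resolutions([1.0, 1.0], [0.5, 0.5]),
]
questions = [
    concat_resolutions([0.9, 0.1], [1.0, 0.0]),
    concat_resolutions([0.1, 0.9], [0.0, 1.0]),
]
gold = [0, 1]  # index of the document that answers each question

print(retrieve(questions[0], docs, k=2))
print(recall_at_k(questions, docs, gold, k=1))
```

Comparing embedding methods in this setting then amounts to swapping the function that produces the vectors and reporting recall at a fixed k for each.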

