Using Holographically Compressed Embeddings in Question Answering

07/14/2020
by   Salvador E. Barbosa, et al.
0

Word vector representations are central to deep learning natural language processing models. Many forms of these vectors, known as embeddings, exist, including word2vec and GloVe. Embeddings are trained on large corpora and learn the word's usage in context, capturing the semantic relationship between words. However, the semantics from such training are at the level of distinct words (known as word types), and can be ambiguous when, for example, a word type can be either a noun or a verb. In question answering, parts-of-speech and named entity types are important, but encoding these attributes in neural models expands the size of the input. This research employs holographic compression of pre-trained embeddings, to represent a token, its part-of-speech, and named entity type, in the same dimension as representing only the token. The implementation, in a modified question answering recurrent deep learning network, shows that semantic relationships are preserved, and yields strong performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/01/2019

Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations

Contextualized word representations are able to give different represent...
research
01/11/2017

Question Analysis for Arabic Question Answering Systems

The first step of processing a question in Question Answering(QA) System...
research
02/02/2019

A Multi-Resolution Word Embedding for Document Retrieval from Large Unstructured Knowledge Bases

Deep language models learning a hierarchical representation proved to be...
research
06/09/2017

Learning to Embed Words in Context for Syntactic Tasks

We present models for embedding words in the context of surrounding word...
research
10/04/2017

Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl

We present DepCC, the largest to date linguistically analyzed corpus in ...
research
06/23/2021

Clinical Named Entity Recognition using Contextualized Token Representations

The clinical named entity recognition (CNER) task seeks to locate and cl...
research
12/19/2022

Multi hash embeddings in spaCy

The distributed representation of symbols is one of the key technologies...

Please sign up or login with your details

Forgot password? Click here to reset