Learning Word Embeddings with Domain Awareness

06/07/2019
by   Guoyin Wang, et al.

Word embeddings are traditionally trained on a large corpus in an unsupervised setting, with no specific design for incorporating domain knowledge. This can lead to unsatisfactory performance when the training data originate from heterogeneous domains. In this paper, we propose two novel mechanisms for domain-aware word embedding training, namely domain indicator and domain attention, which integrate domain-specific knowledge into the widely used skip-gram (SG) and continuous bag-of-words (CBOW) models, respectively. Both methods are based on a joint learning paradigm and ensure that words in a target domain receive intensive focus when training on a source-domain corpus. Qualitative and quantitative evaluations confirm the validity and effectiveness of our models. Compared to baseline methods, our methods are particularly effective in near-cold-start scenarios.
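
The abstract does not spell out how domain attention enters CBOW, so the following is only an illustrative sketch under stated assumptions, not the authors' formulation. Standard CBOW averages the context word vectors uniformly; here a hypothetical `domain_query` vector scores each context word, and a softmax over those scores replaces the uniform average. The function names, the dot-product scoring, and the toy vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cbow_uniform_context(context_vecs):
    """Standard CBOW: the context representation is the plain average."""
    n = len(context_vecs)
    dim = len(context_vecs[0])
    return [sum(vec[i] for vec in context_vecs) / n for i in range(dim)]


def cbow_attention_context(context_vecs, domain_query):
    """Hypothetical variant (an assumption, not the paper's formula):
    weight each context word by its dot-product affinity to a
    domain query vector, then take the weighted sum."""
    weights = softmax([dot(vec, domain_query) for vec in context_vecs])
    dim = len(context_vecs[0])
    ctx = [
        sum(w * vec[i] for w, vec in zip(weights, context_vecs))
        for i in range(dim)
    ]
    return ctx, weights


# Toy 2-dimensional context vectors and a made-up domain query.
context = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
domain_query = [2.0, 0.0]

uniform = cbow_uniform_context(context)
attended, weights = cbow_attention_context(context, domain_query)
```

In this toy setup the first context word aligns best with the assumed domain query, so it receives the largest weight and pulls the context representation toward itself, whereas the uniform average treats all three words equally.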

research
09/21/2017

Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts

Word embedding is a Natural Language Processing (NLP) technique that aut...
research
10/06/2022

Domain-Specific Word Embeddings with Structure Prediction

Complementary to finding good general word embeddings, an important ques...
research
05/25/2018

Lifelong Domain Word Embedding via Meta-Learning

Learning high-quality domain word embeddings is important for achieving ...
research
03/05/2019

Improving Cross-Domain Chinese Word Segmentation with Word Embeddings

Cross-domain Chinese Word Segmentation (CWS) remains a challenge despite...
research
12/28/2017

Corpus specificity in LSA and Word2vec: the role of out-of-domain documents

Latent Semantic Analysis (LSA) and Word2vec are some of the most widely ...
research
05/05/2022

Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation

Open-domain conversational systems are assumed to generate equally good ...
research
10/05/2020

LEAPME: Learning-based Property Matching with Embeddings

Data integration tasks such as the creation and extension of knowledge g...
