During the course of human language evolution, the semantic meanings of words keep evolving with time. The understanding of evolving semantics enables us to capture the true meaning of the words in different usage contexts, and thus is critical for various applications, such as machine translation. While it is naturally promising to study word semantics in a time-aware manner, traditional methods to learn word vector representation do not adequately capture the change over time. To this end, in this paper, we aim at learning time-aware vector representation of words through dynamic word embedding modeling. Specifically, we first propose a method that captures time-specific semantics and across-time alignment simultaneously in a way that is robust to data sparsity. Then, we solve the resulting optimization problem using a scalable coordinate descent method. Finally, we perform the empirical study on New York Times data to learn the temporal embeddings and develop multiple evaluations that illustrate the semantic evolution of words, discovered from news media. Moreover, our qualitative and quantitative tests indicate that the our method not only reliably captures the semantic evolution over time, but also onsistently outperforms state-of-the-art temporal embedding approaches on both semantic accuracy and alignment quality.
03/02/2017 ∙ by Zijun Yao, et al. ∙ 0 ∙ share
Zijun Yaois this you? claim profile