Unsupervised Hypernym Detection by Distributional Inclusion Vector Embedding
Modeling hypernymy, such as poodle is-a dog, is an important generalization aid for many NLP tasks, such as entailment, relation extraction, and question answering. Supervised learning from labeled hypernym sources, such as WordNet, limits the coverage of these models; learning hypernyms from unlabeled text can address this limitation. Existing unsupervised methods either do not scale to large vocabularies or yield unacceptably poor accuracy. This paper introduces distributional inclusion vector embedding (DIVE), a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings learned by modeling diversity of word context with specialized negative sampling. In an experimental evaluation more comprehensive than any in the previous literature of which we are aware (11 datasets, scored with multiple existing as well as newly proposed metrics), we find that our method can provide up to double or triple the precision of previous unsupervised methods, and also sometimes outperforms previous semi-supervised methods, yielding many new state-of-the-art results.
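To make the idea of distributional inclusion over non-negative vectors concrete, the sketch below computes a simple asymmetric inclusion score: the fraction of a candidate hyponym's vector mass that is also covered by the candidate hypernym. This is an illustrative measure in the style of earlier inclusion-based scores, not necessarily any of the scoring metrics evaluated in the paper. The function name `inclusion_score` and the toy vectors are assumptions made for illustration; they are not learned DIVE embeddings.

```python
def inclusion_score(hypo, hyper):
    """Fraction of the hyponym vector's mass that the hypernym vector covers.

    Both inputs are equal-length sequences of non-negative numbers.
    Illustrative only: not claimed to be a scoring function from the paper.
    """
    if len(hypo) != len(hyper):
        raise ValueError("vectors must have the same length")
    if any(x < 0 for x in hypo) or any(x < 0 for x in hyper):
        raise ValueError("vectors must be non-negative")
    total = sum(hypo)
    if total == 0:
        return 0.0  # an all-zero vector carries no evidence either way
    covered = sum(min(a, b) for a, b in zip(hypo, hyper))
    return covered / total


# Toy, hand-written vectors (hypothetical values, not learned embeddings).
poodle = [2.0, 1.0, 0.0, 0.0]
dog = [3.0, 2.0, 1.0, 0.5]

forward = inclusion_score(poodle, dog)   # is poodle included in dog?
backward = inclusion_score(dog, poodle)  # is dog included in poodle?
print(forward > backward)
```

The score is asymmetric by design: the general term covers the specific term's dimensions, but not the reverse, which is the direction signal that inclusion-based hypernym detection relies on.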