When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models

by Changlong Yu, et al.

We address hypernymy detection, i.e., deciding whether an is-a relationship holds between a pair of words (x, y), with the help of large textual corpora. Conventional approaches to this task are usually categorized as either pattern-based or distributional. Recent studies suggest that pattern-based approaches are superior when Hearst pairs are extracted at large scale and fed to the model, and when the sparsity of unseen (x, y) pairs is alleviated. However, these approaches break down in one specific kind of sparsity: when x or y does not appear in any pattern. This paper is the first to quantify how often such cases occur, and it shows that they are non-negligible. We also demonstrate that distributional methods are well suited to compensating for pattern-based ones in these cases. We devise a complementary framework in which a pattern-based model and a distributional model each handle the cases they are better suited to. On several benchmark datasets, our framework achieves competitive improvements, and a case study shows that it is more interpretable.
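
The routing idea behind such a complementary framework can be illustrated with a minimal sketch. This is not the authors' implementation: the toy counts, the toy vectors, the PPMI pattern score, the cosine fallback, and all function names (`pattern_score`, `distributional_score`, `is_covered`, `score`) are assumptions made purely for illustration. The sketch only shows the dispatch: use the pattern-based score when both words occur in some extracted Hearst pair, and fall back to a distributional score when x or y does not.

```python
import math
from collections import Counter

# Hypothetical Hearst-pattern extractions: (hyponym, hypernym) -> count.
hearst_counts = Counter({
    ("dog", "animal"): 5,
    ("cat", "animal"): 4,
    ("oak", "tree"): 3,
})

# Hypothetical distributional vectors (toy values, not trained embeddings).
vectors = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.0],
    "animal": [0.7, 0.3, 0.1],
    "puppy": [0.95, 0.05, 0.0],
    "oak": [0.0, 0.1, 0.9],
    "tree": [0.1, 0.2, 0.8],
}

# Every word that takes part in at least one extracted pattern pair.
pattern_vocab = {w for pair in hearst_counts for w in pair}


def pattern_score(x, y):
    """Positive PMI of (x, y) over the Hearst pair counts (assumed scorer)."""
    total = sum(hearst_counts.values())
    joint = hearst_counts[(x, y)]
    if joint == 0:
        return 0.0
    p_x = sum(c for (a, _), c in hearst_counts.items() if a == x) / total
    p_y = sum(c for (_, b), c in hearst_counts.items() if b == y) / total
    return max(0.0, math.log((joint / total) / (p_x * p_y)))


def distributional_score(x, y):
    """Cosine similarity, used here only as a stand-in distributional scorer."""
    u, v = vectors[x], vectors[y]
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_covered(x, y):
    """True when both words occur in at least one extracted pattern pair."""
    return x in pattern_vocab and y in pattern_vocab


def score(x, y):
    """Route the pair to the model suited to it; return (source, value)."""
    if is_covered(x, y):
        return "pattern", pattern_score(x, y)
    return "distributional", distributional_score(x, y)


if __name__ == "__main__":
    for pair in [("dog", "animal"), ("puppy", "animal"), ("dog", "tree")]:
        print(pair, score(*pair))
```

In this sketch the two scores are not on a comparable scale, so a real system would need to calibrate or jointly train the two models. Cosine similarity is also symmetric, so it captures relatedness rather than the direction of an is-a relation; it stands in for whatever distributional model is actually used.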
