Novel Aficionados and Doppelgängers: a referential task for semantic representations of individual entities
In human semantic cognition, proper names (names which refer to individual entities) are harder to learn and retrieve than common nouns. This seems to be the case for machine learning algorithms too, but the linguistic and distributional reasons for this behaviour have not been investigated in depth so far. To tackle this issue, we show that the semantic distinction between proper names and common nouns is reflected in their linguistic distributions by employing an original task for distributional semantics, the Doppelgänger test, an extensive set of models, and a new dataset, the Novel Aficionados dataset. The results indicate that the distributional representations of different individual entities are less clearly distinguishable from each other than those of common nouns, an outcome which intriguingly mirrors human cognition.
READ FULL TEXT