Semantic properties of English nominal pluralization: Insights from word embeddings
Many languages grammaticalize semantic distinctions in nominal pluralization. For example, plural markers may apply only to human nouns. English does not appear to make such distinctions. Using distributional semantics, we show that English nominal pluralization nevertheless exhibits semantic clusters. For instance, the pluralizations of fruit words are more similar to one another than to the pluralizations of words from other semantic classes. Reducing the meaning shift in plural formation to the addition of an abstract plural meaning is therefore too simplistic. We introduce a semantically informed method, CosClassAvg, that outperforms pluralization methods in distributional semantics which assume that plural formation amounts to adding a fixed plural vector. Compared with our approach, FRACSS, a method from compositional distributional semantics, predicted plural vectors that were more similar to the corpus-extracted plural vectors in direction but not in vector length. A modeling study reveals that this difference between the semantic spaces predicted by CosClassAvg and FRACSS carries over to how well a computational model of the listener can understand previously unencountered plural forms. Mappings from word forms, represented with triphone vectors, to predicted semantic vectors are more productive when CosClassAvg-generated rather than FRACSS-generated semantic vectors serve as gold-standard vectors.
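The contrast between a fixed plural vector and a semantically informed shift can be illustrated with a minimal sketch. It is not the CosClassAvg implementation, whose details the abstract does not give. It assumes only that a class-aware method averages the singular-to-plural shift within a semantic class, while the baseline averages it over all nouns. The three-dimensional vectors, the words, and the class labels are invented for illustration, and the averages are computed in-sample, which a real evaluation would avoid by holding out the test word.

```python
import math

# Toy 3-dimensional "embeddings"; every value here is invented for illustration.
SINGULAR = {
    "apple":  [1.0, 0.0, 0.0],
    "banana": [0.9, 0.1, 0.0],
    "car":    [0.0, 1.0, 0.0],
    "truck":  [0.1, 0.9, 0.0],
}
PLURAL = {
    "apple":  [1.0, 0.0, 1.0],
    "banana": [1.0, 0.1, 0.9],
    "car":    [0.0, 1.5, 0.2],
    "truck":  [0.1, 1.3, 0.3],
}
# Hypothetical semantic class labels.
SEMANTIC_CLASS = {
    "apple": "fruit", "banana": "fruit",
    "car": "vehicle", "truck": "vehicle",
}


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def mean(vectors):
    # Component-wise average of a non-empty list of equal-length vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Shift vector per noun: plural minus singular.
shift = {w: sub(PLURAL[w], SINGULAR[w]) for w in SINGULAR}

# Baseline: one fixed plural vector, averaged over all nouns.
global_shift = mean(list(shift.values()))

# Class-aware variant: one average shift per semantic class.
class_shift = {
    c: mean([shift[w] for w in shift if SEMANTIC_CLASS[w] == c])
    for c in set(SEMANTIC_CLASS.values())
}

# For each noun, compare both predictions with the "observed" plural vector.
scores = {}
for w in SINGULAR:
    pred_global = add(SINGULAR[w], global_shift)
    pred_class = add(SINGULAR[w], class_shift[SEMANTIC_CLASS[w]])
    scores[w] = (cosine(pred_global, PLURAL[w]), cosine(pred_class, PLURAL[w]))

for w, (cos_global, cos_class) in scores.items():
    print(f"{w:7s} fixed vector: {cos_global:.3f}   class average: {cos_class:.3f}")
```

Because the toy plural shifts were constructed to differ by class, the class-averaged predictions point closer to the target vectors than the fixed-vector predictions. Cosine similarity measures direction only, so a comparison of vector lengths, which the abstract treats separately, would need an additional norm-based measure.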