The Grammar-Learning Trajectories of Neural Language Models
The learning trajectories of linguistic phenomena provide insight into the nature of linguistic representation, beyond what can be gleaned from inspecting the behavior of an adult speaker. To apply a similar approach to the analysis of neural language models (NLMs), it is first necessary to establish that different models are similar enough in the generalizations they make. In this paper, we show that NLMs with different initializations, architectures, and training data acquire linguistic phenomena in a similar order, despite differing in their final performance on the data. Leveraging these findings, we compare the relative performance on different phenomena at varying learning stages with that of simpler reference models. Results suggest that NLMs exhibit consistent “developmental” stages. Initial analysis of these stages reveals clusters of phenomena (notably morphological ones) whose performance progresses in unison, suggesting potential links between their acquired representations.
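One way to make "acquire linguistic phenomena in a similar order" concrete is to rank the phenomena by each model's accuracy at a given checkpoint and compare the two rankings. The sketch below does this with a Spearman rank correlation. It is illustrative only: the choice of Spearman correlation, the phenomenon names, and all accuracy values are assumptions made for this example, not the metric or data reported in the paper.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-phenomenon accuracies of two models at one checkpoint.
# Model B is weaker overall but orders the phenomena the same way.
model_a = {"agreement": 0.90, "binding": 0.70, "islands": 0.55, "npi": 0.60}
model_b = {"agreement": 0.80, "binding": 0.62, "islands": 0.51, "npi": 0.58}

phenomena = sorted(model_a)
rho = spearman([model_a[p] for p in phenomena],
               [model_b[p] for p in phenomena])
print(f"rank correlation between models: {rho:.2f}")
```

Because rank correlation ignores absolute accuracy, two models can differ in final performance and still score highly here, which is the distinction the abstract draws between acquisition order and end performance.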