Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

by Ethan Wilcox et al.

Humans can learn structural properties of a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing controlled experiments that probe models' syntactic nominal number and verbal argument structure generalizations for tokens seen as few as two times during training. Second, we assess invariance properties of learned representations: the ability of a model to transfer syntactic generalizations from a base context (e.g., a simple declarative active-voice sentence) to a transformed context (e.g., an interrogative sentence). We test four models trained on the same dataset: an n-gram baseline, an LSTM, and two LSTM variants trained with explicit structural supervision (Dyer et al., 2016; Choe and Charniak, 2016). We find that in most cases, the neural models are able to induce the proper syntactic generalizations after minimal exposure, often from just two examples during training, and that the two structurally supervised models generalize more accurately than the LSTM model. All neural models are able to leverage information learned in base contexts to drive expectations in transformed contexts, indicating that they have learned some invariance properties of syntax.
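The targeted evaluations described above rest on a simple measurement: a model's surprisal (negative log probability) should be lower for a sentence that satisfies a syntactic dependency than for a minimally different sentence that violates it. A minimal sketch of this comparison, using a toy add-one-smoothed bigram model in place of the paper's n-gram baseline (the toy corpus, function names, and minimal pair here are illustrative assumptions, not the paper's materials):

```python
import math
from collections import Counter

# Toy training corpus (illustrative stand-in for the paper's training data).
corpus = [
    "the dog barks",
    "the dogs bark",
    "a dog barks",
    "the cat sees the dog",
]

# Collect bigram and left-context counts with sentence-boundary markers.
bigrams = Counter()
contexts = Counter()
vocab = set()
for sent in corpus:
    toks = ["<s>"] + sent.split() + ["</s>"]
    vocab.update(toks)
    for prev, cur in zip(toks, toks[1:]):
        bigrams[(prev, cur)] += 1
        contexts[prev] += 1

V = len(vocab)

def surprisal(sentence):
    """Total surprisal -sum log2 P(w_i | w_{i-1}) under add-one smoothing."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(toks, toks[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + V)
        total += -math.log2(p)
    return total

# Minimal pair probing nominal-number agreement: the model should assign
# lower surprisal to the agreeing continuation.
grammatical = surprisal("the dog barks")
ungrammatical = surprisal("the dog bark")
assert grammatical < ungrammatical
```

The same comparison scales to the paper's setting by swapping in an LSTM or structurally supervised model's per-token log probabilities; the success criterion (lower total surprisal on the grammatical member of the pair) is unchanged.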


Related papers:
- Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State
- Overestimation of Syntactic Representation in Neural Language Models
- Representation of Constituents in Neural Language Models: Coordination Phrase as a Case Study
- Structural Supervision Improves Learning of Non-Local Grammatical Dependencies
- Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis
- Language model acceptability judgements are not always robust to context
- Lexicosyntactic Inference in Neural Models
