We expose a surprising failure of generalization in auto-regressive larg...
We aim to better understand the emergence of 'situational awareness' in ...
Reinforcement learning from human feedback (RLHF) is a technique for tra...
Work on scaling laws has found that large language models (LMs) show pre...
Pretrained language models often generate outputs that are not in line w...
The potential for pre-trained large language models (LLMs) to use natura...
Computational simulations are a popular method for testing hypotheses ab...
Language models (LMs) are pretrained to imitate internet text, including...
Aligning language models with preferences can be posed as approximating ...
The availability of large pre-trained models is changing the landscape o...
Reinforcement learning (RL) is frequently employed in fine-tuning large ...
Machine learning is shifting towards general-purpose pretrained generati...
Communication is compositional if complex signals can be represented as ...
Neural language models can be successfully trained on source code, leadi...
Compositionality is an important explanatory target in emergent communic...
This paper explores a novel approach to achieving emergent compositional...
This paper presents our contribution to PolEval 2019 Task 6: Hate speech...
We describe a variant of the Child-Sum Tree-LSTM deep neural network (Tai et...