In this work, we develop and release Llama 2, a collection of pretrained...
We present a theory for the previously unexplained divergent behavior no...
Pretrained multilingual large language models have typically used heuris...
Current image generation models struggle to reliably produce well-formed...
Large language models (LLM) trained using the next-token-prediction
obje...
Large language models (LLMs) have shown exceptional performance on a var...
There have been a lot of interest in the scaling properties of Transform...
Large language models have been shown to achieve remarkable performance
...
Recent neural network-based language models have benefited greatly from
...
There remain many open questions pertaining to the scaling behaviour of
...
Most widely-used pre-trained language models operate on sequences of tok...
The research community has proposed copious modifications to the Transfo...
Task-oriented dialogue systems help users accomplish tasks such as booki...
Neural networks have recently achieved human-level performance on variou...
Task-oriented dialog presents a difficult challenge encompassing multipl...
Transfer learning, where a model is first pre-trained on a data-rich tas...
Deep learning (DL) creates impactful advances following a virtuous recip...
Recurrent Neural Networks (RNNs) are used in state-of-the-art models in
...
We present Deep Voice 3, a fully-convolutional attention-based neural
te...
We present Deep Voice 3, a fully-convolutional attention-based neural
te...
Deep neural networks have enabled progress in a wide variety of applicat...
Recurrent Neural Networks (RNN) are widely used to solve a variety of
pr...
Modern deep neural networks have a large number of parameters, making th...
We show that an end-to-end deep learning approach can be used to recogni...