Dawn Drain

research

∙ 02/15/2023

The Capacity for Moral Self-Correction in Large Language Models

We test the hypothesis that language models trained with reinforcement l...

Deep Ganguli, et al.

research

∙ 12/15/2022

Constitutional AI: Harmlessness from AI Feedback

As AI systems become more capable, we would like to enlist their help to...

Yuntao Bai, et al.

research

∙ 11/04/2022

Measuring Progress on Scalable Oversight for Large Language Models

Developing safe and useful general-purpose AI systems will require us to...

Samuel R. Bowman, et al.

research

∙ 09/24/2022

In-context Learning and Induction Heads

"Induction heads" are attention heads that implement a simple algorithm ...

Catherine Olsson, et al.

research

∙ 09/21/2022

Toy Models of Superposition

Neural networks often pack many unrelated concepts into a single neuron ...

Nelson Elhage, et al.

research

∙ 08/29/2022

Exploring and Evaluating Personalized Models for Code Generation

Large Transformer models achieved the state-of-the-art status for Natura...

Andrei Zlotchevski, et al.

research

∙ 08/23/2022

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

We describe our early efforts to red team language models in order to si...

Deep Ganguli, et al.

research

∙ 07/11/2022

Language Models (Mostly) Know What They Know

We study whether language models can evaluate the validity of their own ...

Saurav Kadavath, et al.

research

∙ 05/21/2022

Scaling Laws and Interpretability of Learning from Repeated Data

Recent large language models have been trained on vast datasets, but als...

Danny Hernandez, et al.

research

∙ 04/12/2022

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

We apply preference modeling and reinforcement learning from human feedb...

Yuntao Bai, et al.

research

∙ 02/15/2022

Predictability and Surprise in Large Generative Models

Large-scale pre-training has recently emerged as a technique for creatin...

Deep Ganguli, et al.

research

∙ 12/01/2021

A General Language Assistant as a Laboratory for Alignment

Given the broad capabilities of large language models, it should be poss...

Amanda Askell, et al.

research

∙ 09/17/2021

Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy

Statistical language modeling and translation with transformers have fou...

Colin B. Clement, et al.

research

∙ 08/06/2021

Distilling Transformers for Neural Cross-Domain Search

Pre-trained transformers have recently clinched top spots in the gamut o...

Colin B. Clement, et al.

research

∙ 05/19/2021

DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons

The joint task of bug localization and program repair is an integral par...

Dawn Drain, et al.

research

∙ 04/16/2021

Generating Bug-Fixes Using Pretrained Transformers

Detecting and fixing bugs are two of the most important yet frustrating ...

Dawn Drain, et al.

research

∙ 04/12/2021

Generating Code with the Help of Retrieved Template Functions and Stack Overflow Answers

We approach the important challenge of code autocompletion as an open-do...

Dawn Drain, et al.

research

∙ 10/07/2020

PyMT5: multi-mode translation of natural language and Python code with transformers

Simultaneously modeling source code and natural language has many exciti...

Colin B. Clement, et al.

research

∙ 09/17/2020

GraphCodeBERT: Pre-training Code Representations with Data Flow

Pre-trained models for programming language have achieved dramatic empir...

Daya Guo, et al.

research

∙ 09/11/2020

Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers

Unit testing represents the foundational basis of the software testing p...

Michele Tufano, et al.

research

∙ 09/11/2020

Unit Test Case Generation with Transformers

Automated Unit Test Case generation has been the focus of extensive lite...

Michele Tufano, et al.
