Xander Davies | DeepAI


Featured Co-authors

Xin Chen
162 publications
Dorsa Sadigh
102 publications
Dylan Hadfield-Menell
38 publications
Anca Dragan
33 publications
Gabriel Kreiman
32 publications
David Bau
31 publications
David Krueger
31 publications
Erdem Bıyık
27 publications
Tomasz Korbak
18 publications
Peter Hase
17 publications
Thomas Krendl Gilbert
15 publications

research

∙ 09/12/2023

Circuit Breaking: Removing Model Behaviors with Targeted Ablation

Language models often exhibit behaviors that improve performance on a pr...

Maximilian Li, et al.

research

∙ 07/27/2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Reinforcement learning from human feedback (RLHF) is a technique for tra...

Stephen Casper, et al.

research

∙ 07/07/2023

Discovering Variable Binding Circuitry with Desiderata

Recent work has shown that computation in language models may be human-u...

Xander Davies, et al.

research

∙ 03/20/2023

Sparse Distributed Memory is a Continual Learner

Continual learning is a problem for artificial neural networks that thei...

Trenton Bricken, et al.

research

∙ 03/10/2023

Unifying Grokking and Double Descent

A principled understanding of generalization in deep learning may requir...

Xander Davies, et al.