research ∙ 06/27/2022
Parametrically Retargetable Decision-Makers Tend To Seek Power
If capable AI agents are generally incentivized to seek power in service...
research ∙ 06/23/2022
On Avoiding Power-Seeking by Artificial Intelligence
We do not know how to align a very intelligent AI agent's behavior with ...
research ∙ 06/23/2022
Formalizing the Problem of Side Effect Regularization
AI objectives are often hard to specify properly. Some approaches tackle...
research ∙ 06/11/2020
Avoiding Side Effects in Complex Environments
Reward function specification can be difficult, even in simple environme...
research ∙ 12/03/2019
Optimal Farsighted Agents Tend to Seek Power
Some researchers have speculated that capable reinforcement learning (RL...
research ∙ 02/26/2019