
Legible Normativity for AI Alignment: The Value of Silly Rules

by Dylan Hadfield-Menell, et al.
UC Berkeley

It has become commonplace to assert that autonomous agents will have to be built to follow human rules of behavior: social norms and laws. But human laws and norms are complex and culturally varied systems, and in many cases agents will have to learn the rules. This requires autonomous agents to have models of how human rule systems work so that they can make reliable predictions about rules. In this paper we contribute to the building of such models by analyzing an overlooked distinction between important rules and what we call silly rules, that is, rules with no discernible direct impact on welfare. We show that silly rules render a normative system both more robust and more adaptable in response to shocks to perceived stability. They make normativity more legible for humans, and can increase legibility for AI systems as well. For AI systems to integrate into human normative systems, we suggest, it may be important for them to have models that include representations of silly rules.



