Editing a classifier by rewriting its prediction rules

12/02/2021
by Shibani Santurkar, et al.

We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our approach requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments and modifying it to ignore spurious features. Our code is available at https://github.com/MadryLab/EditingClassifiers.
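To make the idea of "rewriting a prediction rule" concrete, below is a minimal, hypothetical sketch of one way such an edit could look: a rank-one update to a single linear layer so that the features of a new concept are mapped to a chosen target output. The layer sizes, variable names, and update rule are illustrative assumptions for this sketch, not the authors' procedure; their actual implementation is in the linked repository.

# Hypothetical sketch of a direct weight edit (not the paper's implementation).
import torch
import torch.nn as nn

torch.manual_seed(0)

d_in, d_out = 512, 10
layer = nn.Linear(d_in, d_out, bias=False)  # layer whose prediction rule we rewrite

# Stand-ins for activations that would come from real images:
# "key" = features of the concept whose handling we want to change,
# "value" = the output we want that concept to produce after the edit.
k_new = torch.randn(d_in)
v_target = layer(torch.randn(d_in)).detach()

with torch.no_grad():
    W = layer.weight                      # shape (d_out, d_in)
    residual = v_target - W @ k_new       # what the current rule gets wrong on k_new
    # Rank-one update: maps k_new exactly to v_target while leaving directions
    # orthogonal to k_new untouched.
    W += torch.outer(residual, k_new) / (k_new @ k_new)

# After the edit the layer sends the new concept's features to the target output.
print(torch.allclose(layer(k_new), v_target, atol=1e-5))

In practice the key and value vectors would come from activations of real examples (e.g., the concept to be remapped and the concept whose treatment it should inherit) rather than random tensors, and the update would be constrained so the layer's behavior on other inputs is preserved.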
