
-
Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition
Adversarially trained models exhibit a large generalization gap: they ca...
-
Approximating How Single Head Attention Learns
Why do models often attend to salient words, and how does this evolve th...
-
Limitations of Post-Hoc Feature Alignment for Robustness
Feature alignment is an approach to improving robustness to distribution...
-
Measuring Mathematical Problem Solving With the MATH Dataset
Many intellectual endeavors require mathematical problem solving, but th...
-
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming
Convex relaxations have emerged as a promising approach for verifying de...
-
Measuring Massive Multitask Language Understanding
We propose a new test to measure a text model's multitask accuracy. The ...
-
Aligning AI With Shared Human Values
We show how to assess a language model's knowledge of basic concepts of ...
-
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
We introduce three new robustness benchmarks consisting of naturally occ...
-
Robust estimation via generalized quasi-gradients
We explore why many recently proposed robust estimation problems are eff...
-
Identifying Statistical Bias in Dataset Replication
Dataset replication is a useful tool for assessing whether improvements ...
-
Rethinking Bias-Variance Trade-off for Generalization of Neural Networks
The classical bias-variance trade-off predicts that bias decreases and v...
-
When does the Tukey median work?
We analyze the performance of the Tukey median estimator under total var...
-
A Benchmark for Anomaly Segmentation
Detecting out-of-distribution examples is important for safety-critical ...
-
Generalized Resilience and Robust Statistics
Robust statistics traditionally focuses on outliers, or perturbations in...
-
Testing Robustness Against Unforeseen Adversaries
Considerable work on adversarial defense has studied robustness to a fix...
-
Natural Adversarial Examples
We introduce natural adversarial examples -- real-world, unmodified, and...
-
Transfer of Adversarial Robustness Between Perturbation Types
We study the transfer of adversarial robustness of deep neural networks ...
-
FrAngel: Component-Based Synthesis with Control Structures
In component-based program synthesis, the synthesizer generates a progra...
-
Semidefinite relaxations for certifying robustness to adversarial examples
Despite their impressive performance on diverse tasks, neural networks f...
-
Stronger Data Poisoning Attacks Break Data Sanitization Defenses
Machine learning models trained on data from the outside world can be co...
-
Troubling Trends in Machine Learning Scholarship
Collectively, machine learning (ML) researchers are engaged in the creat...
-
Sever: A Robust Meta-Algorithm for Stochastic Optimization
In high dimensions, most machine learning methods are brittle to even a ...
-
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation
This report surveys the landscape of potential security threats from mal...
-
Certified Defenses against Adversarial Examples
While neural networks have achieved high accuracy on standard image clas...
-
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers
We introduce a criterion, resilience, which allows properties of a datas...
-
Learning from Untrusted Data
The vast majority of theoretical results in machine learning and statist...
-
Concrete Problems in AI Safety
Rapid progress in machine learning and artificial intelligence (AI) has ...
-
Unsupervised Risk Estimation Using Only Conditional Independence Structure
We show how to estimate a model's test error from unlabeled data, on dis...
-
The Statistics of Streaming Sparse Regression
We present a sparse analogue to stochastic gradient descent that is guar...