
Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition
Adversarially trained models exhibit a large generalization gap: they ca...
Approximating How Single Head Attention Learns
Why do models often attend to salient words, and how does this evolve th...
Limitations of Post-Hoc Feature Alignment for Robustness
Feature alignment is an approach to improving robustness to distribution...
Measuring Mathematical Problem Solving With the MATH Dataset
Many intellectual endeavors require mathematical problem solving, but th...
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming
Convex relaxations have emerged as a promising approach for verifying de...
Measuring Massive Multitask Language Understanding
We propose a new test to measure a text model's multitask accuracy. The ...
Aligning AI With Shared Human Values
We show how to assess a language model's knowledge of basic concepts of ...
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
We introduce three new robustness benchmarks consisting of naturally occ...
Robust estimation via generalized quasi-gradients
We explore why many recently proposed robust estimation problems are eff...
Identifying Statistical Bias in Dataset Replication
Dataset replication is a useful tool for assessing whether improvements ...
Rethinking Bias-Variance Tradeoff for Generalization of Neural Networks
The classical bias-variance tradeoff predicts that bias decreases and v...
When does the Tukey median work?
We analyze the performance of the Tukey median estimator under total var...
A Benchmark for Anomaly Segmentation
Detecting out-of-distribution examples is important for safety-critical ...
Generalized Resilience and Robust Statistics
Robust statistics traditionally focuses on outliers, or perturbations in...
Testing Robustness Against Unforeseen Adversaries
Considerable work on adversarial defense has studied robustness to a fix...
Natural Adversarial Examples
We introduce natural adversarial examples: real-world, unmodified, and...
Transfer of Adversarial Robustness Between Perturbation Types
We study the transfer of adversarial robustness of deep neural networks ...
FrAngel: Component-Based Synthesis with Control Structures
In component-based program synthesis, the synthesizer generates a progra...
Semidefinite relaxations for certifying robustness to adversarial examples
Despite their impressive performance on diverse tasks, neural networks f...
Stronger Data Poisoning Attacks Break Data Sanitization Defenses
Machine learning models trained on data from the outside world can be co...
Troubling Trends in Machine Learning Scholarship
Collectively, machine learning (ML) researchers are engaged in the creat...
Sever: A Robust Meta-Algorithm for Stochastic Optimization
In high dimensions, most machine learning methods are brittle to even a ...
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation
This report surveys the landscape of potential security threats from mal...
Certified Defenses against Adversarial Examples
While neural networks have achieved high accuracy on standard image clas...
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers
We introduce a criterion, resilience, which allows properties of a datas...
Learning from Untrusted Data
The vast majority of theoretical results in machine learning and statist...
Concrete Problems in AI Safety
Rapid progress in machine learning and artificial intelligence (AI) has ...
Unsupervised Risk Estimation Using Only Conditional Independence Structure
We show how to estimate a model's test error from unlabeled data, on dis...
The Statistics of Streaming Sparse Regression
We present a sparse analogue to stochastic gradient descent that is guar...
