Because "out-of-the-box" large language models are capable of generating...
Recent efforts at explaining the interplay of memorization and generaliz...
Deep Boltzmann machines (DBMs), one of the first “deep” learning methods...
Modern image classification is based upon directly predicting model clas...
Large web-sourced multimodal datasets have powered a slew of new methods...
Despite the tremendous success in text-to-image generative models, local...
A central notion in practical and theoretical machine learning is that o...
As their size increases, Large Language Models (LLMs) are natural candi...
Existing approaches for improving generalization in deep reinforcement l...
Learning features from data is one of the defining characteristics of de...
The recent success of neural networks as implicit representation of data...
It is notoriously difficult to train Transformers on small datasets; typ...
The process of revising (or constructing) a policy immediately prior to ...
While supervised learning assumes the presence of labeled data, we may h...
Predicting how distributions over discrete variables vary over time is a...
In recent years, NLP practitioners have converged on the following pract...
This work studies the design of neural networks that can process the wei...
In their seminal work, Nayyar et al. (2013) showed that imperfect inform...
Function approximation (FA) has been a critical component in solving lar...
Owing to the prohibitive costs of generating large amounts of labeled da...
Researchers investigating example hardness have increasingly focused on ...
Steganography is the practice of encoding secret information into innocu...
Neural network weights are typically initialized at random from univaria...
Bound propagation methods, when combined with branch and bound, are amon...
Algorithms designed for single-agent reinforcement learning (RL) general...
Randomized smoothing (RS) has been shown to be a fast, scalable techniqu...
Many recent state-of-the-art (SOTA) optical flow models use finite-step ...
We present a differentiable rigid-body-dynamics simulator for robotics t...
Although convolutional networks have been the dominant architecture for ...
Many tasks in deep learning involve optimizing over the inputs to a netw...
In recent years, the ML community has seen surges of interest in both ad...
Certified robustness is a desirable property for deep neural networks in...
Deep equilibrium networks (DEQs) are a new class of models that eschews ...
We empirically show that the test error of deep networks can be estimate...
Analyzing the worst-case performance of deep neural networks against inp...
Many machine learning tasks involve subpopulation shift where the testin...
While reinforcement learning (RL) is gaining popularity in energy system...
To assess generalization, machine learning scientists typically either (...
Large optimization problems with hard constraints arise in many settings...
Recent work has highlighted several advantages of enforcing orthogonalit...
Recent works in neural network verification show that cheap incomplete v...
We empirically demonstrate that full-batch gradient descent on neural ne...
Modern policy gradient algorithms, notably Proximal Policy Optimization ...
The use of cash bail as a mechanism for detaining defendants pre-trial i...
A central problem in machine learning and statistics is to model joint d...
As machine learning and algorithmic decision making systems are increasi...
Modularity maximization has been a fundamental tool for understanding th...
Probabilistic inference in pairwise Markov Random Fields (MRFs), i.e. co...
When designing controllers for safety-critical systems, practitioners of...
Under a commonly-studied "backdoor" poisoning attack against classificat...