
Quantifying and Improving Transferability in Domain Generalization
Out-of-distribution generalization is one of the key challenges when tra...
Posterior Differential Regularization with f-divergence for Improving Model Robustness
We address the problem of enhancing model robustness through regularizat...
OLALA: Object-Level Active Learning Based Layout Annotation
In layout object detection problems, the ground-truth datasets are const...
Stronger and Faster Wasserstein Adversarial Attacks
Deep models, while being extremely flexible and accurate, are surprising...
Newton-type Methods for Minimax Optimization
Differential games, in particular two-player sequential games (a.k.a. mi...
FedMGDA+: Federated Learning meets Multi-objective Optimization
Federated learning has emerged as a promising, massively distributed way...
Density Deconvolution with Normalizing Flows
Density deconvolution is the task of estimating a probability density fu...
Interpretable Contrastive Learning for Networks
Contrastive learning (CL) is an emerging analysis approach that aims to ...
Showing Your Work Doesn't Always Work
In natural language processing, a recently popular line of work explores...
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
Large-scale pretrained language models such as BERT have brought signif...
Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space
Invariance (defined in a general sense) has been one of the most effecti...
Complete Hierarchy of Relaxation for Constrained Signomial Positivity
In this article, we prove that the Sums-of-AM/GM Exponential (SAGE) rela...
Optimality and Stability in Non-Convex-Non-Concave Min-Max Optimization
Convergence to a saddle point for convex-concave functions has been stud...
Unsupervised Multilingual Alignment using Wasserstein Barycenter
We study unsupervised multilingual alignment, the problem of finding wor...
Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits
Public vulnerability databases such as CVE and NVD account for only 60 s...
Convergence Behaviour of Some Gradient-Based Methods on Bilinear Zero-Sum Games
Min-max formulations have attracted great attention in the ML community ...
Convergence Behaviour of Some Gradient-Based Methods on Bilinear Games
Min-max optimization has attracted much attention in the machine learnin...
Understanding Adversarial Robustness: The Trade-off between Minimum and Average Margin
Deep models, while being extremely versatile and accurate, are vulnerabl...
Tails of Triangular Flows
Triangular maps are a construct in probability theory that allows the tr...
Distributional Reinforcement Learning for Efficient Exploration
In distributional reinforcement learning (RL), the estimated distributio...
Sum-of-Squares Polynomial Flow
Triangular map is a recent construct in probability theory that allows o...
Convex-constrained Sparse Additive Modeling and Its Extensions
Sparse additive modeling is a class of effective methods for performing ...
Dropout with Expectation-linear Regularization
Dropout, a simple and effective way to train deep neural networks, has l...
Additive Approximations in High Dimensional Nonparametric Regression via the SALSA
High dimensional nonparametric regression is an inherently difficult pro...
Generalized Conditional Gradient for Sparse Estimation
Structured sparsity is an important modeling tool that expands the appli...
Petuum: A New Platform for Distributed Machine Learning on Big Data
What is a systematic way to efficiently apply a wide spectrum of advance...
Regularizers versus Losses for Nonlinear Dimensionality Reduction: A Factored View with New Convex Relaxations
We demonstrate that almost all nonparametric dimensionality reduction m...
