
Deformable DETR: Deformable Transformers for End-to-End Object Detection
DETR has been recently proposed to eliminate the need for many hand-desi...
Benign Overfitting and Noisy Features
Modern machine learning often operates in the regime where the number of...
Robust Learning Rate Selection for Stochastic Optimization via Splitting Diagnostic
This paper proposes SplitSGD, a new stochastic optimization algorithm wi...
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
We introduce a new pre-trainable generic representation for visual-lingu...
Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing
SLOPE is a relatively new convex optimization procedure for high-dimensi...
Quantifying Intrinsic Uncertainty in Classification via Deep Dirichlet Mixture Networks
With the widespread success of deep neural networks in science and techn...
Statistical Inference for Online Learning and Stochastic Approximation via Hierarchical Incremental Gradient Descent
Stochastic gradient descent (SGD) is an immensely popular approach for o...
Statistical Inference for the Population Landscape via Moment Adjusted Stochastic Gradients
Modern statistical inference tasks often require iterative optimization ...
When Does the First Spurious Variable Get Selected by Sequential Regression Procedures?
Applied statisticians use sequential regression procedures to produce a ...
Private False Discovery Rate Control
We provide the first differentially private algorithms for controlling t...
False Discoveries Occur Early on the Lasso Path
In regression settings where explanatory variables have very low correla...
Communication-Efficient False Discovery Rate Control via Knockoff Aggregation
The false discovery rate (FDR), the expected fraction of spurious disco...
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
We derive a second-order ordinary differential equation (ODE) which is t...
Weijie Su
