
Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model
Masked Language Model (MLM) framework has been widely adopted for self-s...

Jointly Modeling Intra- and Inter-transaction Dependencies with Hierarchical Attentive Transaction Embeddings for Next-item Recommendation
A transaction-based recommender system (TBRS) aims to predict the next i...

Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization
We consider the setting of distributed empirical risk minimization where...

Statistical Adaptive Stochastic Gradient Methods
We propose a statistical adaptive procedure called SALSA for automatical...

Understanding the Role of Momentum in Stochastic Gradient Methods
The use of momentum in stochastic gradient methods has become a widespre...

Joint Computation and Communication Design for UAV-Assisted Mobile Edge Computing in IoT
Unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) syste...

Using Statistics to Automate Stochastic Optimization
Despite the development of numerous adaptive optimizers, tuning the lear...

Multi-Level Composite Stochastic Optimization via Nested Variance Reduction
We consider multi-level composite optimization problems where each mappi...

A Stochastic Composite Gradient Method with Incremental Variance Reduction
We consider the problem of minimizing the composition of a smooth (nonco...

Hyperbolic Interaction Model For Hierarchical Multi-Label Classification
Different from the traditional classification tasks which assume mutual ...

Label-aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification
Extreme multi-label text classification (XMTC) aims at tagging a documen...

Secrecy Energy Efficiency Maximization for UAV-Enabled Mobile Relaying
This paper investigates the secrecy energy efficiency (SEE) maximization...

Learning SMaLL Predictors
We present a new machine learning technique for training small resource...

Smoothed Dual Embedding Control
We revisit the Bellman optimality equation with Nesterov's smoothing tec...

DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization
Machine learning with big data often involves large optimization models....

Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms
We consider empirical risk minimization of linear predictors with convex...

Stochastic Variance Reduction Methods for Policy Evaluation
Policy evaluation is a crucial step in many reinforcement-learning proce...

Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss
We consider distributed convex optimization problems originated from sam...

A Proximal Stochastic Gradient Method with Progressive Variance Reduction
We consider the problem of minimizing the sum of two convex functions: o...

A Randomized Nonmonotone Block Proximal Gradient Method for a Class of Structured Nonlinear Programming
We propose a randomized nonmonotone block proximal gradient (RNBPG) meth...

On the Complexity Analysis of Randomized Block-Coordinate Descent Methods
In this paper we analyze the randomized block-coordinate descent (RBCD) ...

A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem
We consider solving the ℓ_1-regularized least-squares (ℓ_1-LS) problem i...
Lin Xiao