Xiangru Lian

research

∙ 06/05/2022

E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles

Unmanned Aerial Vehicles (UAVs) based video text spotting has been exten...

Zhenyu Hu, et al.

research

∙ 06/11/2021

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

Games are abstractions of the real world, where artificial agents learn ...

Daochen Zha, et al.

research

∙ 02/04/2021

1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed

Scalable training of large models (like BERT and GPT-3) requires careful...

Hanlin Tang, et al.

research

∙ 08/26/2020

APMSqueeze: A Communication Efficient Adam-Preconditioned Momentum SGD Algorithm

Adam is the important optimization algorithm to guarantee efficiency and...

Hanlin Tang, et al.

research

∙ 03/09/2020

Stochastic Recursive Momentum for Policy Gradient Methods

In this paper, we propose a novel algorithm named STOchastic Recursive M...

Huizhuo Yuan, et al.

research

∙ 12/31/2019

Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex Compositional Optimization

Stochastic compositional optimization arises in many important machine l...

Huizhuo Yuan, et al.

research

∙ 07/17/2019

DeepSqueeze: Decentralization Meets Error-Compensated Compression

Communication is a key bottleneck in distributed training. Recently, an ...

Hanlin Tang, et al.

research

∙ 05/15/2019

DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression

A standard approach in large scale machine learning is distributed stoch...

Hanlin Tang, et al.

research

∙ 10/15/2018

Revisit Batch Normalization: New Understanding from an Optimization View and a Refinement via Composition Optimization

Batch Normalization (BN) has been used extensively in deep learning to a...

Xiangru Lian, et al.

research

∙ 03/19/2018

D^2: Decentralized Training over Decentralized Data

While training a machine learning model using multiple workers, each of ...

Hanlin Tang, et al.

research

∙ 10/18/2017

Asynchronous Decentralized Parallel Stochastic Gradient Descent

Recent work shows that decentralized parallel stochastic gradient descent...

Xiangru Lian, et al.

research

∙ 05/25/2017

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent

Most distributed machine learning systems nowadays, including TensorFlow...

Xiangru Lian, et al.

research

∙ 06/27/2015

Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization

Asynchronous parallel implementations of stochastic gradient (SG) have b...

Xiangru Lian, et al.
