
Distributed Stochastic Consensus Optimization with Momentum for Nonconvex Nonsmooth Problems
While many distributed optimization algorithms have been proposed for so...

TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems
Deep learning inference on embedded devices is a burgeoning field with m...

DoubleEnsemble: A New Ensemble Method Based on Sample Reweighting and Feature Selection for Financial Data Analysis
Modern machine learning models (such as deep neural networks and boostin...

Kalman Filtering Attention for User Behavior Modeling in CTR Prediction
Click-through rate (CTR) prediction is one of the fundamental tasks for ...

Loosely Coupled Federated Learning Over Generative Models
Federated learning (FL) was proposed to achieve collaborative machine le...

The Deep Learning Galerkin Method for the General Stokes Equations
The finite element method, finite difference method, finite volume metho...

Decoupled Modified Characteristic Finite Element Method with Different Subdomain Time Steps for Nonstationary Dual-Porosity-Navier-Stokes Model
In this paper, we develop the numerical theory of decoupled modified cha...

LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Speech synthesis (text to speech, TTS) and recognition (automatic speech...

ACFD: Asymmetric Cartoon Face Detector
Cartoon face detection is a more challenging task than human face detect...

Approximation Algorithms for Clustering with Dynamic Points
In many classic clustering problems, we seek to sketch a massive data se...

Neural Architecture Optimization with Graph VAE
Due to their high computational efficiency on a continuous space, gradie...

Improved Algorithms for Convex-Concave Minimax Optimization
This paper studies minimax optimization problems min_x max_y f(x,y), whe...

Exploration by Maximizing Rényi Entropy for Zero-Shot Meta RL
Exploring the transition dynamics is essential to the success of reinfor...

ASFD: Automatic and Scalable Face Detector
In this paper, we propose a novel Automatic and Scalable Face Detector (...

Meta-Embeddings Based On Self-Attention
Creating meta-embeddings for better performance in language modelling ha...

Convolutional Spectral Kernel Learning
Recently, nonstationary spectral kernels have drawn much attention, owi...

PA-Cache: Learning-based Popularity-Aware Content Caching in Edge Networks
With the aggressive growth of smart environments, a large amount of data...

Online Algorithms for Multi-shop Ski Rental with Machine Learned Predictions
We study the problem of augmenting online algorithms with machine learne...

Schema2QA: Answering Complex Queries on the Structured Web with a Neural Model
Virtual assistants today require every website to submit skills individu...

Resource Sharing in the Edge: A Distributed Bargaining-Theoretic Approach
The growing demand for edge computing resources, particularly due to inc...

Let's Share: A Game-Theoretic Framework for Resource Sharing in Mobile Edge Clouds
Mobile edge computing seeks to provide resources to different delaysens...

Neuron Interaction Based Representation Composition for Neural Machine Translation
Recent NLP studies reveal that substantial linguistic information can be...

Learning-Assisted Competitive Algorithms for Peak-Aware Energy Scheduling
In this paper, we study the peak-aware energy scheduling problem using t...

Fast Learning of Temporal Action Proposal via Dense Boundary Generator
Generating temporal action proposals remains a very challenging problem,...

Algorithms and Adaptivity Gaps for Stochastic k-TSP
Given a metric (V,d) and a root ∈ V, the classic k-TSP problem is to find...

Metric Classification Network in Actual Face Recognition Scene
In order to make facial features more discriminative, some new models ha...

Optimizing Speech Recognition For The Edge
While most deployed speech recognition systems today still run on server...

Automated Spectral Kernel Learning
The generalization performance of kernel methods is largely determined b...

Learning Vector-valued Functions with Local Rademacher Complexity
We consider a general family of problems of which the output space admit...

Learning Guided Convolutional Network for Depth Completion
Dense depth perception is critical for autonomous driving and other robo...

NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization
We study the problem of large-scale network embedding, which aims to lea...

Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Recent works on implicit regularization have shown that gradient descent...

Distributed Learning with Random Features
Distributed learning and random projections are the most common techniqu...

TS-RNN: Text Steganalysis Based on Recurrent Neural Networks
With the rapid development of natural language processing technologies, ...

Policy Search by Target Distribution Learning for Continuous Control
We observe that several existing policy gradient methods (such as vanill...

Robust Variational Autoencoder
Machine learning methods often need a large amount of labeled training d...

Anti-Confusing: Region-Aware Network for Human Pose Estimation
In this work, we propose a novel framework named Region-Aware Network (R...

Automatic Target Recognition Using Discrimination Based on Optimal Transport
The use of distances based on optimal transportation has recently shown ...

Information Aggregation for Multi-Head Attention with Routing-by-Agreement
Multi-head attention is appealing for its ability to jointly extract dif...

Outcome-Driven Clustering of Acute Coronary Syndrome Patients using Multi-Task Neural Network with Attention
Cluster analysis aims at separating patients into phenotypically heterog...

Context-Aware Self-Attention Networks
Self-attention models have shown their flexibility in parallel computation ...

Efficient Cross-Validation for Semi-Supervised Learning
Manifold regularization, such as Laplacian regularized least squares (La...

On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning
Generalization error (also known as the out-of-sample error) measures ho...

Max-Diversity Distributed Learning: Theory and Algorithms
We study the risk performance of distributed learning for the regulariza...

gl2vec: Learning Feature Representation Using Graphlets for Directed Networks
Learning network representations has a variety of applications, such as ...

DSFD: Dual Shot Face Detector
Recently, Convolutional Neural Network (CNN) has achieved great success ...

Multi-Head Attention with Disagreement Regularization
Multi-head attention is appealing for the ability to jointly attend to i...

Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers
Online Multi-Object Tracking (MOT) from videos is a challenging computer...

A Fast Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...

An Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...
Jian Li
Assistant Professor at Institute for Interdisciplinary Information Sciences (IIIS, previously ITCS), Tsinghua University