
ACFD: Asymmetric Cartoon Face Detector
Cartoon face detection is a more challenging task than human face detect...
Approximation Algorithms for Clustering with Dynamic Points
In many classic clustering problems, we seek to sketch a massive data se...
Neural Architecture Optimization with Graph VAE
Due to their high computational efficiency on a continuous space, gradie...
Improved Algorithms for Convex-Concave Minimax Optimization
This paper studies minimax optimization problems min_x max_y f(x,y), whe...
Exploration by Maximizing Rényi Entropy for Zero-Shot Meta RL
Exploring the transition dynamics is essential to the success of reinfor...
ASFD: Automatic and Scalable Face Detector
In this paper, we propose a novel Automatic and Scalable Face Detector (...
Meta-Embeddings Based On Self-Attention
Creating meta-embeddings for better performance in language modelling ha...
Convolutional Spectral Kernel Learning
Recently, non-stationary spectral kernels have drawn much attention, owi...
PA-Cache: Learning-based Popularity-Aware Content Caching in Edge Networks
With the aggressive growth of smart environments, a large amount of data...
Online Algorithms for Multi-shop Ski Rental with Machine Learned Predictions
We study the problem of augmenting online algorithms with machine learne...
Schema2QA: Answering Complex Queries on the Structured Web with a Neural Model
Virtual assistants today require every website to submit skills individu...
Resource Sharing in the Edge: A Distributed Bargaining-Theoretic Approach
The growing demand for edge computing resources, particularly due to inc...
Let's Share: A Game-Theoretic Framework for Resource Sharing in Mobile Edge Clouds
Mobile edge computing seeks to provide resources to different delay-sens...
Neuron Interaction Based Representation Composition for Neural Machine Translation
Recent NLP studies reveal that substantial linguistic information can be...
Learning-Assisted Competitive Algorithms for Peak-Aware Energy Scheduling
In this paper, we study the peak-aware energy scheduling problem using t...
Fast Learning of Temporal Action Proposal via Dense Boundary Generator
Generating temporal action proposals remains a very challenging problem,...
Algorithms and Adaptivity Gaps for Stochastic k-TSP
Given a metric (V,d) and a root ∈ V, the classic k-TSP problem is to find...
Metric Classification Network in Actual Face Recognition Scene
In order to make facial features more discriminative, some new models ha...
Optimizing Speech Recognition For The Edge
While most deployed speech recognition systems today still run on server...
Automated Spectral Kernel Learning
The generalization performance of kernel methods is largely determined b...
Learning Vector-valued Functions with Local Rademacher Complexity
We consider a general family of problems of which the output space admit...
Learning Guided Convolutional Network for Depth Completion
Dense depth perception is critical for autonomous driving and other robo...
NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization
We study the problem of large-scale network embedding, which aims to lea...
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Recent works on implicit regularization have shown that gradient descent...
Distributed Learning with Random Features
Distributed learning and random projections are the most common techniqu...
TS-RNN: Text Steganalysis Based on Recurrent Neural Networks
With the rapid development of natural language processing technologies, ...
Policy Search by Target Distribution Learning for Continuous Control
We observe that several existing policy gradient methods (such as vanill...
Robust Variational Autoencoder
Machine learning methods often need a large amount of labeled training d...
Anti-Confusing: Region-Aware Network for Human Pose Estimation
In this work, we propose a novel framework named Region-Aware Network (R...
Automatic Target Recognition Using Discrimination Based on Optimal Transport
The use of distances based on optimal transportation has recently shown ...
Information Aggregation for Multi-Head Attention with Routing-by-Agreement
Multi-head attention is appealing for its ability to jointly extract dif...
Outcome-Driven Clustering of Acute Coronary Syndrome Patients using Multi-Task Neural Network with Attention
Cluster analysis aims at separating patients into phenotypically heterog...
Context-Aware Self-Attention Networks
Self-attention models have shown their flexibility in parallel computation ...
Efficient Cross-Validation for Semi-Supervised Learning
Manifold regularization, such as Laplacian regularized least squares (La...
On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning
Generalization error (also known as the out-of-sample error) measures ho...
Max-Diversity Distributed Learning: Theory and Algorithms
We study the risk performance of distributed learning for the regulariza...
gl2vec: Learning Feature Representation Using Graphlets for Directed Networks
Learning network representations has a variety of applications, such as ...
DSFD: Dual Shot Face Detector
Recently, Convolutional Neural Network (CNN) has achieved great success ...
Multi-Head Attention with Disagreement Regularization
Multi-head attention is appealing for the ability to jointly attend to i...
Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers
Online Multi-Object Tracking (MOT) from videos is a challenging computer...
A Fast Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...
An Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...
A Game-Theoretic Approach to Multi-Objective Resource Sharing and Allocation in Mobile Edge Clouds
Mobile edge computing seeks to provide resources to different delay-sens...
Quickest Detection of Dynamic Events in Networks
The problem of quickest detection of dynamic events in networks is studi...
Network Classification in Temporal Networks Using Motifs
Network classification has a variety of applications, such as detecting ...
BRITS: Bidirectional Recurrent Imputation for Time Series
Time series are widely used as signals in many classification/regression...
A PTAS for a Class of Stochastic Dynamic Programs
We develop a framework for obtaining polynomial time approximation schem...
ARUM: Polar Coded HARQ Scheme based on Incremental Channel Polarization
A hybrid ARQ (HARQ) scheme for polar code, which is called active-bit re...
ε-Coresets for Clustering (with Outliers) in Doubling Metrics
We study the problem of constructing ε-coresets for the (k, z)-clusterin...
Stochastic Gradient Hamiltonian Monte Carlo with Variance Reduction for Bayesian Inference
Gradient-based Monte Carlo sampling algorithms, like Langevin dynamics a...
Jian Li
Assistant Professor at Institute for Interdisciplinary Information Sciences (IIIS, previously ITCS), Tsinghua University