
Multimodal Learning for Hateful Memes Detection
Memes are pixel-based multimedia documents containing images and express...
Contrastive Weight Regularization for Large Minibatch SGD
The minibatch stochastic gradient descent method (SGD) is widely applied...
Generating universal language adversarial examples by understanding and enhancing the transferability across neural models
Deep neural network models are vulnerable to adversarial attacks. In man...
Neural Network Training Techniques Regularize Optimization Trajectory: An Empirical Study
Modern deep neural network (DNN) trainings utilize various training tech...
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Variance reduction techniques have been successfully applied to temporal...
Boosting One-Point Derivative-Free Online Optimization via Residual Feedback
Zeroth-order optimization (ZO) typically relies on two-point feedback to...
Unsupervised Cross-lingual Image Captioning
Most recent image captioning works are conducted in English as the major...
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
This paper presents a novel framework to build a voice conversion (VC) s...
Normalization Techniques in Training DNNs: Methodology, Analysis and Application
Normalization techniques are essential for accelerating the training and...
Exploring the Hierarchy in Relation Labels for Scene Graph Generation
By assigning each relationship a single label, current approaches formul...
Small-floating Target Detection in Sea Clutter via Visual Feature Classifying in the Time-Doppler Spectra
It is challenging to detect small-floating object in the sea clutter for...
A Benchmark for Studying Diabetic Retinopathy: Segmentation, Grading, and Transferability
People with diabetes are at risk of developing an eye disease called dia...
Spatiotemporal Attention Model for Tactile Texture Recognition
Recently, tactile sensing has attracted great interest in robotics, espe...
Event-based Stereo Visual Odometry
Event-based cameras are bio-inspired vision sensors whose pixels work in...
The Complexity of the Partition Coloring Problem
Given a simple undirected graph G=(V,E) and a partition of the vertex se...
IBM Federated Learning: an Enterprise Framework White Paper V0.1
Federated Learning (FL) is an approach to conduct machine learning witho...
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle
Although SGD with random reshuffle has been widely used in machine learn...
Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble
Although neural networks have achieved prominent performance on many natu...
Improving the Convergence Rate of One-Point Zeroth-Order Optimization using Residual Feedback
Many existing zeroth-order optimization (ZO) algorithms adopt two-point ...
Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels
Learning latent representations of registered meshes is useful for many ...
Recognizing Chinese Judicial Named Entity using BiLSTM-CRF
Named entity recognition (NER) plays an essential role in natural langua...
Generative Tweening: Long-term Inbetweening of 3D Human Motions
The ability to generate complex and realistic human body animations at s...
Momentum with Variance Reduction for Nonconvex Composition Optimization
Composition optimization is widely applied in nonconvex machine learning...
Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Scans
Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causi...
An Investigation into the Stochasticity of Batch Whitening
Batch Normalization (BN) is extensively employed in various network arch...
GFTE: Graph-based Financial Table Extraction
Tabular data is a crucial form of information expression, which can orga...
Motion-Attentive Transition for Zero-Shot Video Object Segmentation
In this paper, we present a novel Motion-Attentive Transition Network (M...
Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart for Nonconvex Optimization
Various types of parameter restart schemes have been proposed for accele...
TiFL: A Tier-based Federated Learning System
Federated Learning (FL) enables learning a shared model across many clie...
Reanalysis of Variance Reduced Temporal Difference Learning
Temporal difference (TD) learning is a popular algorithm for policy eval...
Chinese Named Entity Recognition Augmented with Lexicon Memory
Inspired by a concept of content-addressable retrieval from cognitive sc...
HybridAlpha: An Efficient Approach for Privacy-Preserving Federated Learning
Federated learning has emerged as a promising approach for collaborative...
DR-GAN: Conditional Generative Adversarial Network for Fine-Grained Lesion Synthesis on Diabetic Retinopathy Images
Diabetic retinopathy (DR) is a complication of diabetes that severely af...
Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization
Two types of zeroth-order stochastic algorithms have recently been desig...
Faster Stochastic Algorithms via History-Gradient Aided Batch Size Adaptation
Various schemes for adapting batch size have been recently proposed to a...
Distributed SGD Generalizes Well Under Asynchrony
The performance of fully synchronized distributed systems has faced a bo...
Towards Federated Graph Learning for Collaborative Financial Crimes Detection
Financial crime is a large and growing problem, in some way touching alm...
A unified variance-reduced accelerated gradient method for convex optimization
We propose a novel randomized incremental gradient algorithm, namely, VA...
Iterative Normalization: Beyond Standardization towards Efficient Whitening
Batch Normalization (BN) is ubiquitously employed for accelerating neura...
AED-Net: An Abnormal Event Detection Network
It is challenging to detect the anomaly in crowded scenes for quite a lo...
Momentum Schemes with Stochastic Variance Reduction for Nonconvex Composite Optimization
Two new stochastic variance-reduced algorithms named SARAH and SPIDER ha...
Hybrid coarse-fine classification for head pose estimation
Head pose estimation, which computes the intrinsic Euler angles (yaw, pi...
The square root rule for adaptive importance sampling
In adaptive importance sampling, and other contexts, we have unbiased an...
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Stochastic gradient descent (SGD) has been found to be surprisingly effe...
On the Continuity of Rotation Representations in Neural Networks
In neural networks, it is often desirable to work with various represent...
MR-GAN: Manifold Regularized Generative Adversarial Networks
Despite the growing interest in generative adversarial networks (GANs), ...
SpiderBoost: A Class of Faster Variance-reduced Algorithms for Nonconvex Optimization
There has been extensive research on developing stochastic variance redu...
Cubic Regularization with Momentum for Nonconvex Optimization
Momentum is a popular technique to accelerate the convergence in practic...
Toward Understanding the Impact of Staleness in Distributed Machine Learning
Many distributed machine learning (ML) systems adopt the non-synchronous...
Elastic Neural Networks for Classification
In this work we propose a framework for improving the performance of any...
Yi Zhou
Assistant Researcher at Fudan University School of Mathematical Sciences, Professor of School of Mathematical Sciences, Fudan University