
Off-Policy Reinforcement Learning with Delayed Rewards
We study deep reinforcement learning (RL) algorithms with delayed reward...

MCGNet: Partial Multi-view Few-shot Learning via Meta-alignment and Context Gated-aggregation
In this paper, we propose a new challenging task named as partial multi...

Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records
With the successful adoption of machine learning on electronic health re...

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity
Quality-Diversity (QD) is a concept from Neuroevolution with some intrig...

Learning Guidance Rewards with Trajectory-space Smoothing
Long-term temporal credit assignment is an important challenge in deep r...

Unsupervised Self-training Algorithm Based on Deep Learning for Optical Aerial Images Change Detection
Optical aerial images change detection is an important task in earth obs...

SYMPAIS: SYMbolic Parallel Adaptive Importance Sampling for Probabilistic Program Analysis
Probabilistic software analysis aims at quantifying the probability of a...

Bayesian Policy Search for Stochastic Domains
AI planning can be cast as inference in probabilistic models, and probab...

Probabilistic Programs with Stochastic Conditioning
We tackle the problem of conditioning probabilistic programs on distribu...

Near-Optimal MNL Bandits Under Risk Criteria
We study MNL bandits, which is a variant of the traditional multi-armed ...

Efficient Competitive Self-Play Policy Optimization
Reinforcement learning from self-play has recently reported many success...

Generating Adjacency Matrix for Video-Query based Video Moment Retrieval
In this paper, we continue our work on Video-Query based Video Moment re...

Pooling Regularized Graph Neural Network for fMRI Biomarker Analysis
Understanding how certain brain regions relate to a specific neurologica...

Graph Neural Network for Video-Query based Video Moment Retrieval
In this paper, we focus on Video Query based Video Moment Retrieval (VQ...

Multinomial Logit Bandit with Low Switching Cost
We study multinomial logit bandit with limited adaptivity, where the alg...

Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design
Motivated by practical needs such as large-scale learning, we study the ...

Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity
In this paper we consider the problem of learning an ϵ-optimal policy fo...

Bi-direction Context Propagation Network for Real-time Semantic Segmentation
Spatial details and context correlations are two types of critical infor...

Adaptive Double-Exploration Tradeoff for Outlier Detection
We study a variant of the thresholding bandit problem (TBP) in the conte...

MultiIF: An Approach to Anomaly Detection in Self-Driving Systems
Autonomous driving vehicles (ADVs) are implemented with rich software fu...

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
We study the reinforcement learning problem in the setting of finite-hor...

Collaborative Top Distribution Identifications with Limited Interaction
We consider the following problem in this paper: given a set of n distri...

Guardauto: A Decentralized Runtime Protection System for Autonomous Driving
Due to the broad attack surface and the lack of runtime protection, pote...

Stochastically Differentiable Probabilistic Programs
Probabilistic programs with mixed support (both continuous and discrete ...

Soft-Root-Sign Activation Function
The choice of activation function in deep networks has a significant eff...

Anypath Routing Protocol Design via Q-Learning for Underwater Sensor Networks
As a promising technology in the Internet of Underwater Things, underwat...

Domain Adaptive Adversarial Learning Based on Physics Model Feedback for Underwater Image Enhancement
Owing to refraction, absorption, and scattering of light by suspended pa...

Exploiting Operation Importance for Differentiable Neural Architecture Search
Recently, differentiable neural architecture search methods significantl...

Furnishing Your Room by What You See: An End-to-End Furniture Set Retrieval Framework with Rich Annotated Benchmark Dataset
Understanding interior scenes has attracted enormous interest in compute...

Subcarrier Assignment Schemes Based on Q-Learning in Wideband Cognitive Radio Networks
Subcarrier assignment is of crucial importance in wideband cognitive rad...

Cross-Scale Residual Network for Multiple Tasks: Image Super-resolution, Denoising, and Deblocking
In general, image restoration involves mapping from low quality images t...

Temporal Action Localization using Long Short-Term Dependency
Temporal action localization in untrimmed videos is an important but dif...

Comb Convolution for Efficient Convolutional Architecture
Convolutional neural networks (CNNs) are inherently suffering from massi...

Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support
Universal probabilistic programming systems (PPSs) provide a powerful an...

√(n)-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank
In this paper, we consider the problem of online learning of Markov deci...

Dual-reference Age Synthesis
Age synthesis has received much attention in recent years. State-of-the...

Graph Neural Network for Interpreting Task-fMRI Biomarkers
Finding the biomarkers associated with ASD is helpful for understanding ...

Exploration via Hindsight Goal Generation
Goal-oriented reinforcement learning has recently been a practical frame...

HGC: Hierarchical Group Convolution for Highly Efficient Neural Network
Group convolution works well with many deep convolutional neural network...

Thresholding Bandit with Optimal Aggregate Regret
We consider the thresholding bandit problem, whose goal is to find arms ...

Tight Regret Bounds for Infinite-armed Linear Contextual Bandits
Linear contextual bandit is a class of sequential decision making proble...

Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits
Best arm identification (or, pure exploration) in multi-armed bandits is...

Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits
We study the linear contextual bandit problem with finite action sets. W...

LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models
We develop a new Low-level, First-order Probabilistic Programming Langua...

On Asymptotically Tight Tail Bounds for Sums of Geometric and Exponential Random Variables
In this note we prove bounds on the upper and lower probability tails of...

Efficient Interpretation of Deep Learning Models Using Graph Structure and Cooperative Game Theory: Application to ASD Biomarker Discovery
Discovering imaging biomarkers for autism spectrum disorder (ASD) is cri...

On Exploration, Exploitation and Learning in Adaptive Importance Sampling
We study adaptive importance sampling (AIS) as an online learning proble...

Dynamic Assortment Optimization with Changing Contextual Information
In this paper, we study the dynamic assortment optimization problem unde...

Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy
When learning from a batch of logged bandit feedback, the discrepancy be...

Dynamic Assortment Selection under the Nested Logit Models
We study a stylized dynamic assortment planning problem during a selling...
Yuan Zhou