
Ecological Reinforcement Learning
Much of the current work on reinforcement learning studies episodic sett...
The State of AI Ethics Report (June 2020)
These past few months have been especially challenging, and the deployme...
Response by the Montreal AI Ethics Institute to the European Commission's Whitepaper on AI
In February 2020, the European Commission (EC) published a white paper e...
Domain Randomization for Active Pose Estimation
Accurate state estimation is a fundamental component of robotic control....
Gradient Surgery for Multi-Task Learning
While deep learning and deep reinforcement learning (RL) systems have de...
ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots
ROBEL is an open-source platform of cost-effective robots designed for r...
Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning
We present relay policy learning, a method for imitation and reinforceme...
The Ingredients of Real-World Robotic Reinforcement Learning
The success of reinforcement learning for real world robotics has been, ...
Soft Actor-Critic Algorithms and Applications
Model-free deep reinforcement learning (RL) algorithms have been success...
Learning in Markov Decision Processes under Constraints
We consider reinforcement learning (RL) in Markov Decision Processes (MD...
Automatically Composing Representation Transformations as a Means for Generalization
How can we build a learner that can capture the essence of what makes a ...
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
Deep reinforcement learning can learn effective policies for a wide rang...
Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence
This paper develops a unified framework, based on iterated random operat...
Guiding Policies with Language via Meta-Learning
Behavioral skills or policies for autonomous agents are conventionally l...
Guided Meta-Policy Search
Reinforcement learning (RL) algorithms have demonstrated promising resul...
Befriending The Byzantines Through Reputation Scores
We propose two novel stochastic gradient descent algorithms, ByGARS and ...
Unsupervised Curricula for Visual Meta-Reinforcement Learning
In principle, meta-reinforcement learning algorithms leverage experience...
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
In this work, we take a representation learning perspective on hierarchi...
Evolutionary Multitasking for Single-objective Continuous Optimization: Benchmark Problems, Performance Metric, and Baseline Results
In this report, we suggest nine test problems for multitask single-obje...
Evolutionary Multitasking for Multiobjective Continuous Optimization: Benchmark Problems, Performance Metrics and Baseline Results
In this report, we suggest nine test problems for multitask multiobjec...
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Dexterous multi-fingered hands are extremely versatile and provide a gen...
Genetic Transfer or Population Diversification? Deciphering the Secret Ingredients of Evolutionary Multitask Optimization
Evolutionary multitasking has recently emerged as a novel paradigm that ...
Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
Imitation learning is an effective approach for autonomous systems to ac...
Multiple View Reconstruction of Calibrated Images using Singular Value Decomposition
Calibration in a multi camera network has widely been studied for over s...
From Query-By-Keyword to Query-By-Example: LinkedIn Talent Search Approach
One key challenge in talent search is to translate complex criteria of a...
Diversity is All You Need: Learning Skills without a Reward Function
Intelligent creatures can explore their environments and learn useful sk...
Addressing Expensive Multiobjective Games with Postponed Preference Articulation via Memetic Coevolution
This paper presents algorithmic and empirical contributions demonstratin...
Virtual-Mobile-Core Placement for Metro Network
Traditional highly-centralized mobile core networks (e.g., Evolved Packe...
A Fixed Point Theorem for Iterative Random Contraction Operators over Banach Spaces
Consider a contraction operator T over a Banach space X with a fixed po...
Meta-Reinforcement Learning of Structured Exploration Strategies
Exploration is a fundamental challenge in reinforcement learning (RL). M...
Unsupervised Meta-Learning for Reinforcement Learning
Meta-learning is a powerful tool that builds on multi-task learning to l...
Adversarial Reinforcement Learning for Observer Design in Autonomous Systems under Cyber Attacks
Complex autonomous control systems are subjected to sensor failures, cyb...
Learning Actionable Representations with Goal-Conditioned Policies
Representation learning is a central challenge across a range of machine...
Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost
Dexterous multi-fingered robotic hands can perform a wide range of manip...
AIR5: Five Pillars of Artificial Intelligence Research
In this article, we provide an overview of what we consider to be some ...
Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms
Recursive stochastic algorithms have gained significant attention in the...
Rank Reduction in Bimatrix Games
The rank of a bimatrix game is defined as the rank of the sum of the pay...
On the Computation of Strategically Equivalent Rank-0 Games
It has been well established that in a bimatrix game, the rank of the ma...
Canada Protocol: an ethical checklist for the use of Artificial Intelligence in Suicide Prevention and Mental Health
Introduction: To improve current public health strategies in suicide pre...
Distributed SGD Generalizes Well Under Asynchrony
The performance of fully synchronized distributed systems has faced a bo...
A Multi-Task Gradient Descent Method for Multi-Label Learning
Multi-label learning studies the problem where an instance is associated...
Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy
Recent studies have revealed that neural network-based policies can be e...
BISTRO: Berkeley Integrated System for Transportation Optimization
This article introduces BISTRO, a new open source transportation plannin...
Learning To Reach Goals Without Reinforcement Learning
Imitation learning algorithms provide a simple and straightforward appro...
On the Coverage Performance of Boolean-Poisson Cluster Models for Wireless Sensor Networks
In this paper, we consider wireless sensor networks (WSNs) with sensor n...
Reciprocal Collision Avoidance for General Nonlinear Agents using Reinforcement Learning
Finding feasible and collision-free paths for multiple nonlinear agents ...
Coverage Improvement of Wireless Sensor Networks via Spatial Profile Information
This paper considers a wireless sensor network deployed to sense an envi...
Response by the Montreal AI Ethics Institute to the Santa Clara Principles on Transparency and Accountability in Online Content Moderation
In April 2020, the Electronic Frontier Foundation (EFF) publicly called ...
Unified Characterization Platform for Emerging NVM Technology: Neural Network Application Benchmarking Using off-the-shelf NVM Chips
In this paper, we present a unified FPGA based electrical testbench for...
Accelerating Online Reinforcement Learning with Offline Datasets
Reinforcement learning provides an appealing formalism for learning cont...
Abhishek Gupta
Research Scientist at the School of Computer Science and Engineering at Nanyang Technological University since 2016, Research Fellow at Rolls-Royce @ NTU Corporate Lab at Nanyang Technological University from 2015-2016, Research Fellow at SIMTech-NTU Joint Lab on Complex Systems at Nanyang Technological University from 2014-2015, Research Scholar and Teaching Assistant at The University of Auckland from 2011-2014, Computational Engineer (Part-time Consultancy) at ZEUS NUMERIX from 2011-2013, (Ph.D.) Engineering Science at University of Auckland 2011-2014