Xiao Li

  • Latent Space Factorisation and Manipulation via Matrix Subspace Projection

    This paper proposes a novel method for factorising the information in the latent space of an autoencoder (AE), to improve the interpretability of the latent space and facilitate controlled generation. When trained on a dataset with labelled attributes, we can produce a latent vector which separates information encoding the attributes from other characteristic information, and also disentangles the attribute information. This then allows us to manipulate each attribute of the latent representation individually without affecting others. Our method, matrix subspace projection, is simpler than the state-of-the-art adversarial network approaches to latent space factorisation. We demonstrate the utility of the method for attribute manipulation tasks on the CelebA image dataset and the E2E text corpus.

    07/26/2019 ∙ by Xiao Li, et al. ∙ 17 share

  • Synthesizing 3D Shapes from Silhouette Image Collections using Multi-projection Generative Adversarial Networks

    We present a new weakly supervised learning-based method for generating novel category-specific 3D shapes from unoccluded image collections. Our method only requires silhouette annotations from unoccluded, category-specific objects. It does not require access to the object's 3D shape, multiple observations per object from different views, intra-image pixel correspondences, or any view annotations. Key to our method is a novel multi-projection generative adversarial network (MP-GAN) that trains a 3D shape generator to be consistent with multiple 2D projections of the 3D shapes, without direct access to these 3D shapes. This is achieved through multiple discriminators that encode the distribution of 2D projections of the 3D shapes seen from different views. Additionally, to determine the view information for each silhouette image, we also train a view prediction network on visualizations of 3D shapes synthesized by the generator. We iteratively alternate between training the generator and training the view prediction network. We validate our multi-projection GAN on both synthetic and real image datasets. Furthermore, we also show that multi-projection GANs can aid in learning other high-dimensional distributions from lower-dimensional training datasets, such as material-class-specific spatially varying reflectance properties from images.

    06/10/2019 ∙ by Xiao Li, et al. ∙ 4 share

  • A Debiased MDI Feature Importance Measure for Random Forests

    Tree ensembles such as Random Forests have achieved impressive empirical success across a wide variety of applications. To understand how these models make predictions, people routinely turn to feature importance measures calculated from tree ensembles. It has long been known that Mean Decrease Impurity (MDI), one of the most widely used measures of feature importance, incorrectly assigns high importance to noisy features, leading to systematic bias in feature selection. In this paper, we address the feature selection bias of MDI from both theoretical and methodological perspectives. Based on the original definition of MDI by Breiman et al. for a single tree, we derive a tight non-asymptotic bound on the expected bias of MDI importance of noisy features, showing that deep trees have higher (expected) feature selection bias than shallow ones. However, it is not clear how to reduce the bias of MDI using its existing analytical expression. We derive a new analytical expression for MDI, and based on this new expression, we are able to propose a debiased MDI feature importance measure using out-of-bag samples, called MDI-oob. For both the simulated data and a genomic ChIP dataset, MDI-oob achieves state-of-the-art performance in feature selection from Random Forests for both deep and shallow trees.

    06/26/2019 ∙ by Xiao Li, et al. ∙ 1 share

  • Automata Guided Hierarchical Reinforcement Learning for Zero-shot Skill Composition

    An obstacle that prevents the wide adoption of (deep) reinforcement learning (RL) in control systems is its need for a large number of interactions with the environment in order to master a skill. The learned skill usually generalizes poorly across domains, and re-training is often necessary when presented with a new task. We present a framework that combines techniques from formal methods with hierarchical reinforcement learning (HRL). The set of techniques we provide allows for convenient specification of tasks with complex logic, learns hierarchical policies (meta-controller and low-level controllers) with well-defined intrinsic rewards using any RL method, and is able to construct new skills from existing ones without additional learning. We evaluate the proposed methods in a simple grid world simulation as well as in simulation on a Baxter robot.

    10/31/2017 ∙ by Xiao Li, et al. ∙ 0 share

  • A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks

    Reward engineering is an important aspect of reinforcement learning. Whether or not the user's intentions can be correctly encapsulated in the reward function can significantly impact the learning outcome. Current methods rely on manually crafted reward functions that often require parameter tuning to obtain the desired behavior. This operation can be expensive when exploration requires systems to interact with the physical world. In this paper, we explore the use of temporal logic (TL) to specify tasks in reinforcement learning. A TL formula can be translated to a real-valued function that measures its level of satisfaction against a trajectory. We take advantage of this function and propose temporal logic policy search (TLPS), a model-free learning technique that finds a policy that satisfies the TL specification. A set of simulated experiments is conducted to evaluate the proposed approach.

    09/27/2017 ∙ by Xiao Li, et al. ∙ 0 share

  • Reinforcement Learning With Temporal Logic Rewards

    Reinforcement learning (RL) depends critically on the choice of reward functions used to capture the desired behavior and constraints of a robot. Usually, these are handcrafted by an expert designer and represent heuristics for relatively simple tasks. Real-world applications typically involve more complex tasks with rich temporal and logical structure. In this paper we take advantage of the expressive power of temporal logic (TL) to specify complex rules the robot should follow, and incorporate domain knowledge into learning. We propose Truncated Linear Temporal Logic (TLTL) as a specification language that is arguably well suited for robotics applications, together with quantitative semantics, i.e., robustness degree. We propose an RL approach to learn tasks expressed as TLTL formulae that uses their associated robustness degree as reward functions, instead of manually crafted heuristics trying to capture the same specifications. We show in simulated trials that learning is faster and policies obtained using the proposed approach outperform the ones learned using heuristic rewards in terms of the robustness degree, i.e., how well the tasks are satisfied. Furthermore, we demonstrate the proposed RL approach in a toast-placing task learned by a Baxter robot.

    12/11/2016 ∙ by Xiao Li, et al. ∙ 0 share

  • A Hierarchical Reinforcement Learning Method for Persistent Time-Sensitive Tasks

    Reinforcement learning has been applied to many interesting problems such as the famous TD-gammon and the inverted helicopter flight. However, little effort has been put into developing methods to learn policies for complex persistent tasks and tasks that are time-sensitive. In this paper, we take a step towards solving this problem by using signal temporal logic (STL) as task specification, and taking advantage of the temporal abstraction feature that the options framework provides. We show via simulation that a relatively easy-to-implement algorithm that combines STL and options can learn a satisfactory policy with a small number of training cases.

    06/20/2016 ∙ by Xiao Li, et al. ∙ 0 share

  • Convergence Analysis of Alternating Nonconvex Projections

    We consider the convergence properties of the alternating projection algorithm (a.k.a. alternating projections), which has been widely utilized to solve many practical problems in machine learning, signal and image processing, communication and statistics. In this paper, we formalize two properties of proper, lower semi-continuous and semi-algebraic sets: the three point property for all possible iterates and the local contraction property that serves as the nonexpansiveness property of the projector, but only for iterates that are close enough to each other. Then, by exploiting the geometric properties of the objective function around its critical point, i.e., the Kurdyka-Lojasiewicz (KL) property, we establish a new convergence analysis framework to show that if one set satisfies the three point property and the other one obeys the local contraction property, the iterates generated by alternating projections form a convergent sequence that converges to a critical point. We complete this study by providing a convergence rate which depends on the explicit expression of the KL exponent. As a byproduct, we use our new analysis framework to recover the linear convergence rate of alternating projections onto closed convex sets. To illustrate the power of our new framework, we provide a new convergence result for a class of concrete applications: alternating projections for designing structured tight frames that are widely used in sparse representation, compressed sensing and communication. We believe that our new analysis framework can be applied to guarantee the convergence of alternating projections when utilized for many other nonconvex and nonsmooth sets.

    02/12/2018 ∙ by Zhihui Zhu, et al. ∙ 0 share

  • RoboCup 2D Soccer Simulation League: Evaluation Challenges

    We summarise the results of the RoboCup 2D Soccer Simulation League in 2016 (Leipzig), including the main competition and the evaluation round. The evaluation round held in Leipzig confirmed the strength of the RoboCup-2015 champion (WrightEagle, i.e. WE2015) in the League, with only the eventual finalists of the 2016 competition capable of defeating WE2015. An extended, post-Leipzig, round-robin tournament which included the top 8 teams of 2016, as well as WE2015, with over 1000 games played for each pair, placed WE2015 third behind the champion team (Gliders2016) and the runner-up (HELIOS2016). This establishes WE2015 as a stable benchmark for the 2D Simulation League. We then contrast two ranking methods and suggest two options for future evaluation challenges. The first one, "The Champions Simulation League", is proposed to include 6 previous champions, directly competing against each other in a round-robin tournament, with a view to systematically tracing the advancements in the League. The second proposal, "The Global Challenge", aims to increase the realism of the environmental conditions during the simulated games, by simulating specific features of different participating countries.

    06/14/2017 ∙ by Mikhail Prokopenko, et al. ∙ 0 share

  • Automata Guided Reinforcement Learning With Demonstrations

    Tasks with complex temporal structures and long horizons pose a challenge for reinforcement learning agents due to the difficulty in specifying the tasks in terms of reward functions as well as large variances in the learning signals. We propose to address these problems by combining temporal logic (TL) with reinforcement learning from demonstrations. Our method automatically generates intrinsic rewards that align with the overall task goal given a TL task specification. The policy resulting from our framework has an interpretable and hierarchical structure. We validate the proposed method experimentally on a set of robotic manipulation tasks.

    09/17/2018 ∙ by Xiao Li, et al. ∙ 0 share

  • Nonconvex Robust Low-rank Matrix Recovery

    In this paper we study the problem of recovering a low-rank matrix from a number of random linear measurements that are corrupted by outliers taking arbitrary values. We consider a nonsmooth nonconvex formulation of the problem, in which we enforce the low-rank property explicitly by using a factored representation of the matrix variable and employ an ℓ_1-loss function to robustify the solution against outliers. Under the Gaussian measurement model, we show that with a number of measurements that is information-theoretically optimal and even when a constant fraction (which can be up to almost half) of the measurements are arbitrarily corrupted, the resulting optimization problem is sharp and weakly convex. Consequently, we show that when initialized close to the set of global minima of the problem, a SubGradient Method (SubGM) with geometrically diminishing step sizes will converge linearly to the ground-truth matrix. We demonstrate the performance of the SubGM for the nonconvex robust low-rank matrix recovery problem with various numerical experiments.

    09/24/2018 ∙ by Xiao Li, et al. ∙ 0 share

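Several of the abstracts above use the robustness degree of a temporal logic formula, a real-valued measure of how well a trajectory satisfies the specification, as a reward signal. The sketch below illustrates that idea under stated assumptions: it uses the common max/min quantitative semantics for "eventually" and "always", not the TLTL or STL definitions from the papers themselves, and every name in it (`eventually`, `always`, `in_goal`, `clear_of_obstacle`, the goal and obstacle geometry) is hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
# A predicate maps a state to a real number; a positive value means it holds.
Predicate = Callable[[Point], float]


def eventually(pred: Predicate, traj: Sequence[Point]) -> float:
    """Robustness of 'pred holds at some step': best value along the trajectory."""
    return max(pred(p) for p in traj)


def always(pred: Predicate, traj: Sequence[Point]) -> float:
    """Robustness of 'pred holds at every step': worst value along the trajectory."""
    return min(pred(p) for p in traj)


# Hypothetical task geometry, chosen only for illustration.
GOAL, GOAL_RADIUS = (5.0, 0.0), 1.0
OBSTACLE, OBSTACLE_RADIUS = (2.0, 0.0), 0.5


def in_goal(p: Point) -> float:
    # Positive when p lies inside the goal disc.
    return GOAL_RADIUS - math.dist(p, GOAL)


def clear_of_obstacle(p: Point) -> float:
    # Positive when p lies outside the obstacle disc.
    return math.dist(p, OBSTACLE) - OBSTACLE_RADIUS


def robustness(traj: Sequence[Point]) -> float:
    """Robustness of (eventually in_goal) AND (always clear_of_obstacle).

    Conjunction is taken as the minimum of the two sub-formula values.
    Assumes a non-empty trajectory.
    """
    return min(eventually(in_goal, traj), always(clear_of_obstacle, traj))


safe = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.5), (3.0, 1.0), (4.0, 0.5), (5.0, 0.0)]
unsafe = [(0.0, 0.0), (2.0, 0.0), (5.0, 0.0)]  # passes through the obstacle centre

print(robustness(safe))    # positive: the specification is satisfied
print(robustness(unsafe))  # negative: the obstacle constraint is violated
```

In a reinforcement learning setting, a value like `robustness(traj)` could serve as the return for an episode, so that the sign indicates satisfaction and the magnitude indicates the margin. How the papers listed here define and use the robustness degree in detail is not covered by this sketch.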