Xi Chen

Software Engineer at Microsoft

  • Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design

    Flow-based generative models are powerful exact likelihood models with efficient sampling and inference. Despite their computational efficiency, flow-based models generally have much worse density modeling performance compared to state-of-the-art autoregressive models. In this paper, we investigate and improve upon three limiting design choices employed by flow-based models in prior work: the use of uniform noise for dequantization, the use of inexpressive affine flows, and the use of purely convolutional conditioning networks in coupling layers. Based on our findings, we propose Flow++, a new flow-based model that is now the state-of-the-art non-autoregressive model for unconditional density estimation on standard image benchmarks. Our work has begun to close the significant performance gap that has so far existed between autoregressive models and flow-based models. Our implementation is available at https://github.com/aravind0706/flowpp.

    02/01/2019 ∙ by Jonathan Ho, et al.

  • Boundary-Aware Network for Fast and High-Accuracy Portrait Segmentation

    Compared with other semantic segmentation tasks, portrait segmentation requires both higher precision and faster inference speed. However, this problem has not been well studied in previous works. In this paper, we propose a lightweight network architecture, called Boundary-Aware Network (BANet), which selectively extracts detail information in the boundary area to produce high-quality segmentation output at real-time (>25 FPS) speed. In addition, we design a new loss function, called refine loss, which supervises the network with image-level gradient information. Our model is able to produce finer segmentation results that have richer details than the annotations.

    01/12/2019 ∙ by Xi Chen, et al.

  • Experimental Implementation of a Quantum Autoencoder via Quantum Adders

    Quantum autoencoders allow for reducing the amount of resources in a quantum computation by mapping the original Hilbert space onto a reduced space with the relevant information. Recently, it was proposed to employ approximate quantum adders to implement quantum autoencoders in quantum technologies. Here, we carry out the experimental implementation of this proposal in the Rigetti cloud quantum computer employing up to three qubits. The experimental fidelities are in good agreement with the theoretical prediction, thus demonstrating the feasibility of realizing quantum autoencoders via quantum adders in state-of-the-art superconducting quantum technologies.

    07/27/2018 ∙ by Yongcheng Ding, et al.

  • Learning from Demonstration in the Wild

    Learning from demonstration (LfD) is useful in settings where hand-coding behaviour or a reward function is impractical. It has succeeded in a wide range of problems but typically relies on artificially generated demonstrations or specially deployed sensors and has not generally been able to leverage the copious demonstrations available in the wild: those that capture behaviour that was occurring anyway using sensors that were already deployed for another purpose, e.g., traffic camera footage capturing demonstrations of natural behaviour of vehicles, cyclists, and pedestrians. We propose video to behaviour (ViBe), a new approach to learning models of road user behaviour that requires as input only unlabelled raw video data of a traffic scene collected from a single, monocular, uncalibrated camera with ordinary resolution. Our approach calibrates the camera, detects relevant objects, tracks them through time, and uses the resulting trajectories to perform LfD, yielding models of naturalistic behaviour. We apply ViBe to raw videos of a traffic intersection and show that it can learn purely from videos, without additional expert knowledge.

    11/08/2018 ∙ by Feryal Behbahani, et al.

  • Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning

    Recent advances in deep reinforcement learning algorithms have shown great potential and success for solving many challenging real-world problems, including the game of Go and robotic applications. Usually, these algorithms need a carefully designed reward function to guide training in each time step. However, in the real world, it is non-trivial to design such a reward function, and the only signal available is usually obtained at the end of a trajectory, also known as the episodic reward or return. In this work, we introduce a new algorithm for temporal credit assignment, which learns to decompose the episodic return back to each time step in the trajectory using deep neural networks. With this learned reward signal, the learning efficiency can be substantially improved for episodic reinforcement learning. In particular, we find that expressive language models such as the Transformer can be adopted for learning the importance and the dependency of states in the trajectory, therefore providing high-quality and interpretable learned reward signals. We have performed extensive experiments on a set of MuJoCo continuous locomotive control tasks with only episodic returns and demonstrated the effectiveness of our algorithm.

    05/31/2019 ∙ by Yang Liu, et al.

  • Evaluating Protein Transfer Learning with TAPE

    Protein modeling is an increasingly popular area of machine learning research. Semi-supervised learning has emerged as an important paradigm in protein modeling due to the high cost of acquiring supervised protein labels, but the current literature is fragmented when it comes to datasets and standardized evaluation techniques. To facilitate progress in this field, we introduce the Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. We curate tasks into specific training, validation, and test splits to ensure that each task tests biologically relevant generalization that transfers to real-life scenarios. We benchmark a range of approaches to semi-supervised protein representation learning, which span recent work as well as canonical sequence learning techniques. We find that self-supervised pretraining is helpful for almost all models on all tasks, more than doubling performance in some cases. Despite this increase, in several cases features learned by self-supervised pretraining still lag behind features extracted by state-of-the-art non-neural techniques. This gap in performance suggests a huge opportunity for innovative architecture design and improved modeling paradigms that better capture the signal in biological sequences. TAPE will help the machine learning community focus effort on scientifically relevant problems. Toward this end, all data and code used to run these experiments are available at https://github.com/songlab-cal/tape.

    06/19/2019 ∙ by Roshan Rao, et al.

  • Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules

    A key challenge in leveraging data augmentation for neural network training is choosing an effective augmentation policy from a large search space of candidate operations. Properly chosen augmentation policies can lead to significant generalization improvements; however, state-of-the-art approaches such as AutoAugment are computationally infeasible to run for the ordinary user. In this paper, we introduce a new data augmentation algorithm, Population Based Augmentation (PBA), which generates nonstationary augmentation policy schedules instead of a fixed augmentation policy. We show that PBA can match the performance of AutoAugment on CIFAR-10, CIFAR-100, and SVHN, with three orders of magnitude less overall compute. On CIFAR-10 we achieve a mean test error of 1.46%, which is a slight improvement upon the current state-of-the-art. The code for PBA is open source and is available at https://github.com/arcelien/pba.

    05/14/2019 ∙ by Daniel Ho, et al.

  • HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation

    Recently, crowdsourcing has emerged as an effective paradigm for human-powered, large-scale problem solving in various domains. However, the task requester usually has a limited budget, so it is desirable to have a policy that allocates the budget wisely to achieve better quality. In this paper, we study the principle of information maximization for active sampling strategies in the framework of HodgeRank, an approach based on Hodge Decomposition of pairwise ranking data with multiple workers. The principle exhibits two scenarios of active sampling: Fisher information maximization that leads to unsupervised sampling based on a sequential maximization of graph algebraic connectivity without considering labels; and Bayesian information maximization that selects samples with the largest information gain from prior to posterior, which gives a supervised sampling involving the labels collected. Experiments show that the proposed methods boost the sampling efficiency as compared to traditional sampling schemes and are thus valuable to practical crowdsourcing experiments.

    11/16/2017 ∙ by Qianqian Xu, et al.

  • Deep Inception-Residual Laplacian Pyramid Networks for Accurate Single Image Super-Resolution

    By exploiting contextual information over large image regions in an efficient way, the deep convolutional neural network has shown impressive performance for single image super-resolution (SR). In this paper, we propose a deep convolutional network that cascades well-designed inception-residual blocks within the deep Laplacian pyramid framework to progressively restore the missing high-frequency details of high-resolution (HR) images. By optimizing our network structure, the trainable depth of the proposed network gains a significant improvement, which in turn improves super-resolving accuracy. As the network depth increases, however, the saturation and degradation of training accuracy continues to be a critical problem. To address this, we propose an effective two-stage training strategy, in which we first use images downsampled from the ground-truth HR images as the optimal objective to train the inception-residual blocks in each pyramid level with an extremely high learning rate enabled by gradient clipping, and then use the ground-truth HR images to fine-tune all the pre-trained inception-residual blocks to obtain the final SR model. Furthermore, we present a new loss function operating in both image space and local rank space to optimize our network for exploiting the contextual information among different output components. Extensive experiments on benchmark datasets validate that the proposed method outperforms existing state-of-the-art SR methods in terms of the objective evaluation as well as the visual quality.

    11/15/2017 ∙ by Yongliang Tang, et al.

  • Constructing multi-modality and multi-classifier radiomics predictive models through reliable classifier fusion

    Radiomics aims to extract and analyze large numbers of quantitative features from medical images and is highly promising in staging, diagnosing, and predicting outcomes of cancer treatments. Nevertheless, several challenges need to be addressed to construct an optimal radiomics predictive model. First, the predictive performance of the model may be reduced when features extracted from an individual imaging modality are blindly combined into a single predictive model. Second, because many different types of classifiers are available to construct a predictive model, selecting an optimal classifier for a particular application is still challenging. In this work, we developed multi-modality and multi-classifier radiomics predictive models that address the aforementioned issues in currently available models. Specifically, a new reliable classifier fusion strategy was proposed to optimally combine output from different modalities and classifiers. In this strategy, modality-specific classifiers were first trained, and an analytic evidential reasoning (ER) rule was developed to fuse the output score from each modality to construct an optimal predictive model. One public data set and two clinical case studies were used to validate model performance. The experimental results indicated that the proposed ER-rule-based radiomics models outperformed the traditional models that rely on a single classifier or simply use combined features from different modalities.

    10/04/2017 ∙ by Zhiguo Zhou, et al.

  • A Note on Tight Lower Bound for MNL-Bandit Assortment Selection Models

    In this note we prove a tight lower bound for the MNL-bandit assortment selection model that matches the upper bound given in (Agrawal et al., 2016a,b) for all parameters, up to logarithmic factors.

    09/18/2017 ∙ by Xi Chen, et al.
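
For readers unfamiliar with the setting of the last note, the following is a minimal sketch of the standard multinomial logit (MNL) choice model that underlies assortment selection. It is not taken from the note itself: the function names, the preference weights `v`, the revenues `r`, and the brute-force search are illustrative assumptions, and the no-purchase weight is normalized to 1. In the bandit version of the problem the weights `v` are unknown and must be learned from observed purchases; the sketch only shows the full-information quantity a learner is trying to maximize.

```python
from itertools import combinations


def mnl_choice_probs(assortment, v):
    """Purchase probabilities under the MNL model for one offered assortment.

    v maps each item to a positive preference weight (hypothetical values).
    The no-purchase option has weight 1 and is stored under the key None.
    """
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {i: v[i] / denom for i in assortment}
    probs[None] = 1.0 / denom
    return probs


def expected_revenue(assortment, v, r):
    """Expected revenue of an assortment: sum of r[i] * P(customer buys i)."""
    probs = mnl_choice_probs(assortment, v)
    return sum(r[i] * probs[i] for i in assortment)


def best_assortment(v, r, capacity):
    """Brute-force search over all assortments of size 1..capacity.

    Exponential in the number of items; meant only for tiny illustrations.
    """
    items = list(v)
    candidates = (
        s for k in range(1, capacity + 1) for s in combinations(items, k)
    )
    return max(candidates, key=lambda s: expected_revenue(s, v, r))
```

A possible use: with `v = {"a": 1.0, "b": 1.0}` and unit revenues, offering both items gives each a purchase probability of 1/3, with the remaining 1/3 going to the no-purchase option.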