Kashyap Chitta

is this you? claim profile


  • Adaptive Semantic Segmentation with a Strategic Curriculum of Proxy Labels

    Training deep networks for semantic segmentation requires annotation of large amounts of data, which can be time-consuming and expensive. Unfortunately, these trained networks still generalize poorly when tested in domains not consistent with the training data. In this paper, we show that by carefully presenting a mixture of labeled source domain and proxy-labeled target domain data to a network, we can achieve state-of-the-art unsupervised domain adaptation results. With our design, the network progressively learns features specific to the target domain using annotation from only the source domain. We generate proxy labels for the target domain using the network's own predictions. Our architecture then allows selective mining of easy samples from this set of proxy labels, and hard samples from the annotated source domain. We conduct a series of experiments with the GTA5, Cityscapes and BDD100k datasets on synthetic-to-real domain adaptation and geographic domain adaptation, showing the advantages of our method over baselines and existing approaches.

    11/08/2018 ∙ by Kashyap Chitta, et al. ∙ 18 share

    read it

  • Learning Sampling Policies for Domain Adaptation

    We address the problem of semi-supervised domain adaptation of classification algorithms through deep Q-learning. The core idea is to consider the predictions of a source domain network on target domain data as noisy labels, and learn a policy to sample from this data so as to maximize classification accuracy on a small annotated reward partition of the target domain. Our experiments show that learned sampling policies construct labeled sets that improve accuracies of visual classifiers over baselines.

    05/19/2018 ∙ by Yash Patel, et al. ∙ 0 share

    read it

  • Targeted Kernel Networks: Faster Convolutions with Attentive Regularization

    We propose Attentive Regularization (AR), a method to constrain the activation maps of kernels in Convolutional Neural Networks (CNNs) to specific regions of interest (ROIs). Each kernel learns a location of specialization along with its weights through standard backpropagation. A differentiable attention mechanism requiring no additional supervision is used to optimize the ROIs. Traditional CNNs of different types and structures can be modified with this idea into equivalent Targeted Kernel Networks (TKNs), while keeping the network size nearly identical. By restricting kernel ROIs, we reduce the number of sliding convolutional operations performed throughout the network in its forward pass, speeding up both training and inference. We evaluate our proposed architecture on both synthetic and natural tasks across multiple domains. TKNs obtain significant improvements over baselines, requiring less computation (around an order of magnitude) while achieving superior performance.

    06/01/2018 ∙ by Kashyap Chitta, et al. ∙ 0 share

    read it

  • Deep Probabilistic Ensembles: Approximate Variational Inference through KL Regularization

    In this paper, we introduce Deep Probabilistic Ensembles (DPEs), a scalable technique that uses a regularized ensemble to approximate a deep Bayesian Neural Network (BNN). We do so by incorporating a KL divergence penalty term into the training objective of an ensemble, derived from the evidence lower bound used in variational inference. We evaluate the uncertainty estimates obtained from our models for active learning on visual classification, consistently outperforming baselines and existing approaches.

    11/06/2018 ∙ by Kashyap Chitta, et al. ∙ 0 share

    read it

  • Large-Scale Visual Active Learning with Deep Probabilistic Ensembles

    Annotating the right data for training deep neural networks is an important challenge. Active learning using uncertainty estimates from Bayesian Neural Networks (BNNs) could provide an effective solution to this. Despite being theoretically principled, BNNs require approximations to be applied to large-scale problems, and have not been used widely by practitioners. In this paper, we introduce Deep Probabilistic Ensembles (DPEs), a scalable technique that uses a regularized ensemble to approximate a deep BNN. We conduct a series of active learning experiments to evaluate DPEs on classification with the CIFAR-10, CIFAR-100 and ImageNet datasets, and semantic segmentation with the BDD100k dataset. Our models consistently outperform baselines and previously published methods, requiring significantly less training data to achieve competitive performances.

    11/08/2018 ∙ by Kashyap Chitta, et al. ∙ 0 share

    read it

  • Less is More: An Exploration of Data Redundancy with Active Dataset Subsampling

    Deep Neural Networks (DNNs) often rely on very large datasets for training. Given the large size of such datasets, it is conceivable that they contain certain samples that either do not contribute or negatively impact the DNN's performance. If there is a large number of such samples, subsampling the training dataset in a way that removes them could provide an effective solution to both improve performance and reduce training time. In this paper, we propose an approach called Active Dataset Subsampling (ADS), to identify favorable subsets within a dataset for training using ensemble based uncertainty estimation. When applied to three image classification benchmarks (CIFAR-10, CIFAR-100 and ImageNet) we find that there are low uncertainty subsets, which can be as large as 50 These subsets are identified and removed with ADS. We demonstrate that datasets obtained using ADS with a lightweight ResNet-18 ensemble remain effective when used to train deeper models like ResNet-101. Our results provide strong empirical evidence that using all the available data for training can hurt performance on large scale vision tasks.

    05/29/2019 ∙ by Kashyap Chitta, et al. ∙ 0 share

    read it