
Out-of-Distribution Detection Using Neural Rendering Generative Models
Out-of-distribution (OoD) detection is a natural downstream task for deep generative models, due to their ability to learn the input probability distribution. There are two main classes of approaches for OoD detection using deep generative models, based either on a likelihood measure or on the reconstruction loss. However, both approaches fail to carry out OoD detection effectively, especially when the OoD samples have smaller variance than the training samples. For instance, both flow-based and VAE models assign higher likelihood to images from SVHN when trained on CIFAR-10 images. We use a recently proposed generative model known as the neural rendering model (NRM) and derive metrics for OoD detection. We show that the NRM unifies both approaches, since it provides a likelihood estimate and also carries out reconstruction in each layer of the neural network. Among various measures, we found the joint likelihood of latent variables to be the most effective one for OoD detection. Our results show that, when trained on CIFAR-10, lower likelihood (of latent variables) is assigned to SVHN images. Additionally, we show that this metric is consistent across other OoD datasets. To the best of our knowledge, this is the first work to show consistently lower likelihood for OoD data with smaller variance using deep generative models.
07/10/2019 ∙ by Yujia Huang, et al.
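The failure mode described above, where a density model assigns higher likelihood to lower-variance OoD data, can be reproduced in a toy setting. A minimal sketch, with a simple Gaussian standing in for a deep generative model (all distributions and sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training" distribution: high-variance in-distribution data.
train = rng.normal(0.0, 2.0, size=10_000)
mu, sigma = train.mean(), train.std()

def log_likelihood(x, mu, sigma):
    """Per-sample Gaussian log-density under the fitted model."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

in_dist = rng.normal(0.0, 2.0, size=1_000)   # same distribution as training
ood     = rng.normal(0.0, 0.5, size=1_000)   # OoD with *smaller* variance

# The smaller-variance OoD samples concentrate near the mode, so the model
# assigns them *higher* average likelihood -- the failure the abstract
# describes for flow and VAE models on SVHN vs CIFAR-10.
print(log_likelihood(in_dist, mu, sigma).mean())
print(log_likelihood(ood, mu, sigma).mean())
```

The point of the sketch is that likelihood alone ranks the concentrated OoD data above the in-distribution data, which is why the NRM work looks for other metrics.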

From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference
Nonlinearity is crucial to the performance of a deep (neural) network (DN). To date there has been little progress in understanding the menagerie of available nonlinearities, but recently progress has been made on understanding the role played by piecewise affine and convex nonlinearities like the ReLU and absolute value activation functions and max-pooling. In particular, DN layers constructed from these operations can be interpreted as max-affine spline operators (MASOs) that have an elegant link to vector quantization (VQ) and K-means. While this is good theoretical progress, the entire MASO approach is predicated on the requirement that the nonlinearities be piecewise affine and convex, which precludes important activation functions like the sigmoid, hyperbolic tangent, and softmax. This paper extends the MASO framework to these and an infinitely large class of new nonlinearities by linking deterministic MASOs with probabilistic Gaussian Mixture Models (GMMs). We show that, under a GMM, piecewise affine, convex nonlinearities like ReLU, absolute value, and max-pooling can be interpreted as solutions to certain natural "hard" VQ inference problems, while sigmoid, hyperbolic tangent, and softmax can be interpreted as solutions to corresponding "soft" VQ inference problems. We further extend the framework by hybridizing the hard and soft VQ optimizations to create a β-VQ inference that interpolates between hard, soft, and linear VQ inference. A prime example of a β-VQ DN nonlinearity is the swish nonlinearity, which offers state-of-the-art performance in a range of computer vision tasks but was developed ad hoc by experimentation. Finally, we validate with experiments an important assertion of our theory, namely that DN performance can be significantly improved by enforcing orthogonality in its linear filters.
10/22/2018 ∙ by Randall Balestriero, et al.
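The β-VQ interpolation described above can be seen concretely in the swish nonlinearity x · sigmoid(βx): as β → ∞ it hardens to ReLU ("hard" VQ), and as β → 0 it flattens toward the linear map x/2. A minimal numerical sketch (grid and tolerances are arbitrary choices):

```python
import numpy as np

def sigmoid(x):
    # Clip the argument for numerical stability at large |beta * x|.
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60.0, 60.0)))

def swish(x, beta):
    """Swish nonlinearity: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)

x = np.linspace(-5, 5, 1001)

# beta -> infinity: the sigmoid gate hardens to a 0/1 indicator,
# so swish approaches ReLU (the "hard" VQ solution).
hard = swish(x, beta=50.0)
relu = np.maximum(x, 0.0)
print(np.max(np.abs(hard - relu)))    # small

# beta -> 0: the gate flattens to 1/2, so swish approaches x/2
# (the "linear" VQ solution).
linear = swish(x, beta=1e-6)
print(np.max(np.abs(linear - x / 2)))  # small
```

Intermediate β values give the "soft" regime in between, which is the sense in which swish sits inside the β-VQ family.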

Semi-Supervised Learning via New Deep Network Inversion
We exploit a recently derived inversion scheme for arbitrary deep neural networks to develop a new semi-supervised learning framework that applies to a wide range of systems and problems. The approach outperforms current state-of-the-art methods on MNIST, reaching 99.14% test set accuracy while using only 5 labeled examples per class. Experiments with one-dimensional signals highlight the generality of the method. Importantly, our approach is simple, efficient, and requires no change in the deep network architecture.
11/12/2017 ∙ by Randall Balestriero, et al.

Semi-Supervised Learning with the Deep Rendering Mixture Model
Semi-supervised learning algorithms reduce the high cost of acquiring labeled training data by using both labeled and unlabeled data during learning. Deep Convolutional Networks (DCNs) have achieved great success in supervised tasks and as such have been widely employed in semi-supervised learning. In this paper we leverage the recently developed Deep Rendering Mixture Model (DRMM), a probabilistic generative model that models latent nuisance variation and whose inference algorithm yields DCNs. We develop an EM algorithm for the DRMM to learn from both labeled and unlabeled data. Guided by the theory of the DRMM, we introduce a novel non-negativity constraint and a variational inference term. We report state-of-the-art performance on MNIST and SVHN and competitive results on CIFAR-10. We also probe deeper into how a DRMM trained in a semi-supervised setting represents latent nuisance variation using synthetically rendered images. Taken together, our work provides a unified framework for supervised, unsupervised, and semi-supervised learning.
12/06/2016 ∙ by Tan Nguyen, et al.

A Probabilistic Framework for Deep Learning
We develop a probabilistic framework for deep learning based on the Deep Rendering Mixture Model (DRMM), a new generative probabilistic model that explicitly captures variations in data due to latent task nuisance variables. We demonstrate that max-sum inference in the DRMM yields an algorithm that exactly reproduces the operations in deep convolutional neural networks (DCNs), providing a first-principles derivation. Our framework provides new insights into the successes and shortcomings of DCNs as well as a principled route to their improvement. DRMM training via the Expectation-Maximization (EM) algorithm is a powerful alternative to DCN backpropagation, and initial training results are promising. Classification based on the DRMM and other variants outperforms DCNs in supervised digit classification, training 2-3x faster while achieving similar accuracy. Moreover, the DRMM is applicable to semi-supervised and unsupervised learning tasks, achieving results that are state-of-the-art in several categories on the MNIST benchmark and comparable to the state of the art on the CIFAR-10 benchmark.
12/06/2016 ∙ by Ankit B. Patel, et al.

DeepCodec: Adaptive Sensing and Recovery via Deep Convolutional Neural Networks
In this paper we develop a novel computational sensing framework for sensing and recovering structured signals. When trained on a set of representative signals, our framework learns to take undersampled measurements and recover signals from them using a deep convolutional neural network. In other words, it learns a transformation from the original signals to a near-optimal number of undersampled measurements and the inverse transformation from measurements to signals. This is in contrast to traditional compressive sensing (CS) systems that use random linear measurements and convex optimization or iterative algorithms for signal recovery. We compare our new framework with ℓ_1-minimization from the phase transition point of view and demonstrate that it outperforms ℓ_1-minimization in the regions of the phase transition plot where ℓ_1-minimization cannot recover the exact solution. In addition, we experimentally demonstrate how learning measurements enhances the overall recovery performance, speeds up training of the recovery framework, and leads to fewer parameters to learn.
07/11/2017 ∙ by Ali Mousavi, et al.

Learned D-AMP: Principled Neural Network Based Compressive Image Recovery
Compressive image recovery is a challenging problem that requires fast and accurate algorithms. Recently, neural networks have been applied to this problem with promising results. By exploiting massively parallel GPU processing architectures and oodles of training data, they can run orders of magnitude faster than existing techniques. However, these methods are largely unprincipled black boxes that are difficult to train and oftentimes specific to a single measurement matrix. It was recently demonstrated that iterative sparse-signal-recovery algorithms can be "unrolled" to form interpretable deep networks. Taking inspiration from this work, we develop a novel neural network architecture that mimics the behavior of the denoising-based approximate message passing (D-AMP) algorithm. We call this new network Learned D-AMP (LDAMP). The LDAMP network is easy to train, can be applied to a variety of different measurement matrices, and comes with a state-evolution heuristic that accurately predicts its performance. Most importantly, it outperforms the state-of-the-art BM3D-AMP and NLR-CS algorithms in terms of both accuracy and run time. At high resolutions, and when used with sensing matrices that have fast implementations, LDAMP runs over 50× faster than BM3D-AMP and hundreds of times faster than NLR-CS.
04/21/2017 ∙ by Christopher A. Metzler, et al.

Learning to Invert: Signal Recovery via Deep Convolutional Networks
The promise of compressive sensing (CS) has been offset by two significant challenges. First, real-world data is not exactly sparse in a fixed basis. Second, current high-performance recovery algorithms are slow to converge, which limits CS to either non-real-time applications or scenarios where massive back-end computing is available. In this paper, we attack both of these challenges head-on by developing a new signal recovery framework we call DeepInverse that learns the inverse transformation from measurement vectors to signals using a deep convolutional network. When trained on a set of representative images, the network learns both a representation for the signals (addressing challenge one) and an inverse map approximating a greedy or convex recovery algorithm (addressing challenge two). Our experiments indicate that the DeepInverse network closely approximates the solution produced by state-of-the-art CS recovery algorithms yet is hundreds of times faster in run time. The trade-off for the ultrafast run time is a computationally intensive, offline training procedure typical of deep networks. However, the training needs to be completed only once, which makes the approach attractive for a host of sparse recovery problems.
01/14/2017 ∙ by Ali Mousavi, et al.
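The DeepInverse idea of learning an inverse map from measurements to structured signals can be sketched with a linear stand-in for the deep convolutional network. In this toy (all dimensions and the subspace signal model are illustrative assumptions), a least-squares "decoder" learned from training pairs recovers structured signals exactly from far fewer measurements than the ambient dimension:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 64, 16, 4          # signal dim, measurement dim, latent dim

# Structured training signals: random combinations of k fixed atoms,
# i.e. signals living in a k-dimensional subspace.
atoms = rng.normal(size=(n, k))
X_train = atoms @ rng.normal(size=(k, 5000))      # (n, 5000)

Phi = rng.normal(size=(m, n)) / np.sqrt(m)        # random linear measurements
Y_train = Phi @ X_train                           # (m, 5000)

# "Training": learn the inverse map y -> x by least squares.
D, *_ = np.linalg.lstsq(Y_train.T, X_train.T, rcond=None)

# Recover a fresh structured signal from its undersampled measurements.
x_test = atoms @ rng.normal(size=k)
x_hat = D.T @ (Phi @ x_test)
rel_err = np.linalg.norm(x_hat - x_test) / np.linalg.norm(x_test)
print(rel_err)   # near machine precision: the learned inverse exploits structure
```

The deep network in the paper plays the role of this decoder for signals whose structure is not a linear subspace, which is exactly why a learned nonlinear map is needed.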

Consistent Parameter Estimation for LASSO and Approximate Message Passing
We consider the problem of recovering a vector β_o ∈ R^p from n random and noisy linear observations y = Xβ_o + w, where X is the measurement matrix and w is noise. The LASSO estimate is given by the solution to the optimization problem β̂_λ = argmin_β (1/2)‖y − Xβ‖_2^2 + λ‖β‖_1. Among the iterative algorithms that have been proposed for solving this optimization problem, approximate message passing (AMP) has attracted attention for its fast convergence. Despite significant progress in the theoretical analysis of the estimates of LASSO and AMP, little is known about their behavior as a function of the regularization parameter λ, or the threshold parameters τ^t. For instance, the following basic questions have not yet been studied in the literature: (i) How does the size of the active set ‖β̂_λ‖_0/p behave as a function of λ? (ii) How does the mean square error ‖β̂_λ − β_o‖_2^2/p behave as a function of λ? (iii) How does ‖β^t − β_o‖_2^2/p behave as a function of τ^1, ..., τ^{t−1}? Answering these questions will help in addressing practical challenges regarding the optimal tuning of λ or τ^1, τ^2, .... This paper answers these questions in the asymptotic setting and shows how these results can be employed in deriving simple and theoretically optimal approaches for tuning the parameters τ^1, ..., τ^t for AMP or λ for LASSO. It also explores the connection between the optimal tuning of the parameters of AMP and the optimal tuning of LASSO.
11/03/2015 ∙ by Ali Mousavi, et al.
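The AMP iteration referred to above, with soft thresholding and the Onsager correction term, can be sketched as follows. The threshold schedule τ^t here is a simple heuristic (twice the estimated residual noise level), exactly the kind of parameter whose optimal tuning the paper studies; problem sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 250, 500, 25
X = rng.normal(size=(n, p)) / np.sqrt(n)          # unit-norm-ish columns
beta_o = np.zeros(p)
beta_o[rng.choice(p, k, replace=False)] = 3 * rng.normal(size=k)
y = X @ beta_o + 0.01 * rng.normal(size=n)

def soft(u, tau):
    """Soft thresholding: the prox operator of tau * ||.||_1."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

beta = np.zeros(p)
z = y.copy()
for t in range(30):
    tau = 2.0 * np.sqrt(np.mean(z**2))            # heuristic tau^t schedule
    pseudo = beta + X.T @ z                       # pseudo-data
    beta = soft(pseudo, tau)
    # Onsager correction: (1/delta) * z * <eta'> with delta = n/p and
    # <eta'> = ||beta||_0 / p, which simplifies to z * ||beta||_0 / n.
    z = y - X @ beta + (z / n) * np.count_nonzero(beta)

mse = np.sum((beta - beta_o) ** 2) / p            # ||beta^t - beta_o||_2^2 / p
print(mse)
```

The Onsager term is what distinguishes AMP from plain iterative soft thresholding and is responsible for the fast convergence the abstract mentions.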

A Deep Learning Approach to Structured Signal Recovery
In this paper, we develop a new framework for sensing and recovering structured signals. In contrast to compressive sensing (CS) systems that employ linear measurements, sparse representations, and computationally complex convex/greedy algorithms, we introduce a deep learning framework that supports both linear and mildly nonlinear measurements, that learns a structured representation from training data, and that efficiently computes a signal estimate. In particular, we apply a stacked denoising autoencoder (SDA) as an unsupervised feature learner. The SDA enables us to capture statistical dependencies between the different elements of certain signals and improve signal recovery performance as compared to the CS approach.
08/17/2015 ∙ by Ali Mousavi, et al.

oASIS: Adaptive Column Sampling for Kernel Matrix Approximation
Kernel matrices (e.g., Gram or similarity matrices) are essential for many state-of-the-art approaches to classification, clustering, and dimensionality reduction. For large datasets, the cost of forming and factoring such kernel matrices becomes intractable. To address this challenge, we introduce a new adaptive sampling algorithm called Accelerated Sequential Incoherence Selection (oASIS) that samples columns without explicitly computing the entire kernel matrix. We provide conditions under which oASIS is guaranteed to exactly recover the kernel matrix with an optimal number of columns selected. Numerical experiments on both synthetic and real-world datasets demonstrate that oASIS achieves performance comparable to state-of-the-art adaptive sampling methods at a fraction of the computational cost. The low runtime complexity of oASIS and its low memory footprint enable the solution of large problems that are simply intractable using other adaptive methods.
05/19/2015 ∙ by Raajen Patel, et al.
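The adaptive column-sampling idea can be sketched with a simple greedy rule: sample next the column whose kernel diagonal entry is worst predicted by the current Nyström-style approximation, computing only sampled columns rather than the full matrix. This is only in the spirit of oASIS, not its exact selection rule or update scheme; the bandwidth and sizes are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
Xd = rng.normal(size=(200, 5))
BW = 10.0                                   # RBF bandwidth (arbitrary choice)

def kernel_col(i):
    """One column of the RBF kernel matrix, computed on demand."""
    d2 = np.sum((Xd - Xd[i]) ** 2, axis=1)
    return np.exp(-d2 / BW)

N = Xd.shape[0]
diag = np.ones(N)                           # RBF kernel diagonal is all ones
selected = [0]
C = kernel_col(0)[:, None]                  # sampled columns, shape (N, s)

for _ in range(24):
    W = C[selected, :]                      # s x s core matrix
    # Diagonal of the current Nystrom approximation C W^+ C^T.
    pred = np.einsum('ij,ji->i', C @ np.linalg.pinv(W), C.T)
    resid = np.abs(diag - pred)
    resid[selected] = -np.inf               # never resample a column
    j = int(np.argmax(resid))               # adaptive pick: worst-predicted entry
    selected.append(j)
    C = np.hstack([C, kernel_col(j)[:, None]])

# Build the full kernel only here, to score the sketch.
D2 = ((Xd[:, None, :] - Xd[None, :, :]) ** 2).sum(-1)
K = np.exp(-D2 / BW)
K_hat = C @ np.linalg.pinv(C[selected, :]) @ C.T
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print(rel_err)
```

The key property shared with oASIS is that the selection criterion needs only the sampled columns and the (cheap) kernel diagonal, never the full matrix.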
Richard G. Baraniuk
Professor at Rice University, Founder and Director at OpenStax, Founder and Director at Connexions