Martin Wistuba

  • Automated Image Data Preprocessing with Deep Reinforcement Learning

    Data preparation, i.e. the process of transforming raw data into a format that can be used for training effective machine learning models, is a tedious and time-consuming task. For image data, preprocessing typically involves a sequence of basic transformations such as cropping, filtering, rotating or flipping images. Currently, data scientists decide manually based on their experience which transformations to apply in which particular order to a given image data set. Besides constituting a bottleneck in real-world data science projects, manual image data preprocessing may yield suboptimal results as data scientists need to rely on intuition or trial-and-error approaches when exploring the space of possible image transformations and thus might not be able to discover the most effective ones. To mitigate the inefficiency and potential ineffectiveness of manual data preprocessing, this paper proposes a deep reinforcement learning framework to automatically discover the optimal data preprocessing steps for training an image classifier. The framework takes as input sets of labeled images and predefined preprocessing transformations. It jointly learns the classifier and the optimal preprocessing transformations for individual images. Experimental results show that the proposed approach not only improves the accuracy of image classifiers, but also makes them substantially more robust to noisy inputs at test time.

    06/15/2018 ∙ by Tran Ngoc Minh, et al.

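    The paper's framework is not reproduced here; the following is a minimal sketch of the core idea, with the reinforcement learning component reduced to an epsilon-greedy bandit for brevity. The transformation set, the placeholder reward, and all names are illustrative assumptions, not the authors' implementation.

    ```python
    import numpy as np

    # Hypothetical set of predefined preprocessing transformations (stand-ins
    # for the paper's crop/filter/rotate/flip operations).
    TRANSFORMS = {
        "identity": lambda img: img,
        "flip_lr":  lambda img: img[:, ::-1],
        "flip_ud":  lambda img: img[::-1, :],
        "crop":     lambda img: img[2:-2, 2:-2],
    }

    Q = {name: 0.0 for name in TRANSFORMS}    # estimated value of each action
    counts = {name: 0 for name in TRANSFORMS}

    def choose_action(eps=0.1):
        """Epsilon-greedy choice over preprocessing actions."""
        if np.random.rand() < eps:
            return np.random.choice(list(TRANSFORMS))
        return max(Q, key=Q.get)

    def update(action, reward):
        """Incremental mean update of the chosen action's value estimate."""
        counts[action] += 1
        Q[action] += (reward - Q[action]) / counts[action]

    # Skeleton of the joint loop: in the paper, the reward would reflect the
    # validation accuracy of the classifier trained on transformed images.
    for step in range(100):
        action = choose_action()
        reward = np.random.rand()   # placeholder for the accuracy-based reward
        update(action, reward)
    print(max(Q, key=Q.get))        # transformation currently judged best
    ```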

  • Adversarial Phenomenon in the Eyes of Bayesian Deep Learning

    Deep Learning models are vulnerable to adversarial examples, i.e. images obtained via deliberate imperceptible perturbations, such that the model misclassifies them with high confidence. However, class confidence by itself is an incomplete picture of uncertainty. We therefore use principled Bayesian methods to capture model uncertainty in prediction for observing adversarial misclassification. We provide an extensive study with different Bayesian neural networks attacked in both white-box and black-box setups. The behaviour of the networks for noise, attacks and clean test data is compared. We observe that Bayesian neural networks are uncertain in their predictions for adversarial perturbations, a behaviour similar to the one observed for random Gaussian perturbations. Thus, we conclude that Bayesian neural networks can be considered for detecting adversarial examples.

    11/22/2017 ∙ by Ambrish Rawat, et al.

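    A minimal sketch of the detection idea, assuming an approximate posterior sampled via repeated stochastic forward passes (e.g. MC dropout). The dummy model below is an illustrative stand-in, not one of the networks studied in the paper.

    ```python
    import numpy as np

    def predictive_entropy(stochastic_predict, x, n_samples=50):
        """Monte Carlo estimate of predictive uncertainty.

        stochastic_predict(x) must return class probabilities from one draw
        of the approximate posterior, e.g. a forward pass with dropout on.
        """
        probs = np.stack([stochastic_predict(x) for _ in range(n_samples)])
        mean_probs = probs.mean(axis=0)   # posterior-averaged prediction
        return -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=-1)

    # Dummy stochastic model: a fresh weight draw per call simulates
    # sampling from the posterior over network weights.
    rng = np.random.default_rng(0)
    def dummy_predict(x):
        logits = x @ rng.normal(size=(x.shape[-1], 3))
        e = np.exp(logits - logits.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    x = rng.normal(size=(5, 10))
    entropy = predictive_entropy(dummy_predict, x)
    # Inputs whose entropy exceeds a threshold calibrated on clean
    # validation data would be flagged as potentially adversarial.
    print(entropy)
    ```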

  • Bank Card Usage Prediction Exploiting Geolocation Information

    We describe the solution of team ISMLL for both tasks of the ECML-PKDD 2016 Discovery Challenge on Bank Card Usage. Our solution rests on three pillars: gradient boosted decision trees as a strong regression and classification model, an intensive search for good hyperparameter configurations, and strong features that exploit geolocation information. This approach achieved the best performance on the public leaderboard for the first task and a decent fourth position for the second task.

    10/13/2016 ∙ by Martin Wistuba, et al.

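    A minimal sketch of the three pillars using scikit-learn; the team likely used a different gradient boosting implementation and a more intensive hyperparameter search, and the haversine distance is just one plausible way to exploit geolocation.

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GridSearchCV

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance in km -- a typical geolocation feature."""
        lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
        a = (np.sin((lat2 - lat1) / 2) ** 2
             + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371.0 * np.arcsin(np.sqrt(a))

    # Synthetic stand-in for the challenge data; pillar 3 is represented by
    # e.g. the distance between a home branch and a card-usage location.
    rng = np.random.default_rng(0)
    home = rng.uniform([47.0, 18.5], [48.0, 19.5], size=(200, 2))
    used = rng.uniform([47.0, 18.5], [48.0, 19.5], size=(200, 2))
    X = np.column_stack([rng.normal(size=(200, 4)),
                         haversine_km(home[:, 0], home[:, 1],
                                      used[:, 0], used[:, 1])])
    y = rng.integers(0, 2, size=200)

    # Pillars 1 and 2: gradient boosted trees plus a hyperparameter search.
    search = GridSearchCV(
        GradientBoostingClassifier(),
        param_grid={"n_estimators": [100, 300],
                    "max_depth": [3, 5],
                    "learning_rate": [0.05, 0.1]},
        cv=3,
    )
    search.fit(X, y)
    print(search.best_params_)
    ```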

  • Time-Series Classification Through Histograms of Symbolic Polynomials

    Time-series classification has attracted considerable research attention due to the various domains where time-series data are observed, ranging from medicine to econometrics. Traditionally, the focus of time-series classification has been on short time-series data composed of a unique pattern with intraclass pattern distortions and variations, while recently there have been attempts to focus on longer series composed of various local patterns. This study presents a novel method which can detect local patterns in long time-series via fitting local polynomial functions of arbitrary degrees. The coefficients of the polynomial functions are converted to symbolic words via equivolume discretizations of the coefficients' distributions. The symbolic polynomial words enable the detection of similar local patterns by assigning the same words to similar polynomials. Moreover, a histogram of the frequencies of the words is constructed from each time-series' bag of words. Each row of the histogram provides a new representation for the corresponding series and encodes the existence of local patterns and their frequencies. Experimental evidence demonstrates that our method delivers outstanding results compared to the state-of-the-art baselines, exhibiting the best classification accuracies on all the datasets and statistically significant improvements in the absolute majority of experiments.

    07/24/2013 ∙ by Josif Grabocka, et al.

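    A compact sketch of the pipeline described above: fit a polynomial per sliding window, discretize each coefficient into equal-frequency (equivolume) bins to form symbolic words, and count word frequencies in a histogram. Window size, degree, and bin count are illustrative choices, not the paper's settings.

    ```python
    import numpy as np

    def symbolic_polynomial_histogram(series, window=20, degree=3, n_bins=4):
        """Bag-of-words histogram over symbolic polynomial words (sketch)."""
        x = np.arange(window)
        # Fit one polynomial per sliding window.
        coefs = np.array([np.polyfit(x, series[i:i + window], degree)
                          for i in range(len(series) - window + 1)])
        words = np.zeros_like(coefs, dtype=int)
        for j in range(coefs.shape[1]):
            # Equal-frequency binning of each coefficient's distribution.
            edges = np.quantile(coefs[:, j],
                                np.linspace(0, 1, n_bins + 1)[1:-1])
            words[:, j] = np.digitize(coefs[:, j], edges)
        # Encode each row of symbols as a single word id, then count.
        ids = sum(words[:, j] * n_bins ** j for j in range(words.shape[1]))
        return np.bincount(ids, minlength=n_bins ** words.shape[1])

    series = np.sin(np.linspace(0, 20, 300)) + 0.1 * np.random.randn(300)
    hist = symbolic_polynomial_histogram(series)
    print(hist[hist > 0])   # frequencies of the observed symbolic words
    ```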

  • Finding Competitive Network Architectures Within a Day Using UCT

    The design of neural network architectures for a new data set is a laborious task which requires human deep learning expertise. In order to make deep learning available to a broader audience, automated methods for finding a neural network architecture are vital. Recently proposed methods can already achieve human expert level performance. However, these methods have run times of months or even years of GPU computing time, ignoring hardware constraints as faced by many researchers and companies. We propose the use of Monte Carlo planning in combination with two different UCT (upper confidence bound applied to trees) derivations to search for network architectures. We adapt the UCT algorithm to the needs of network architecture search by proposing two ways of sharing information between different branches of the search tree. In an empirical study we demonstrate that this method finds competitive networks for MNIST, SVHN and CIFAR-10 in just a single GPU day. Extending the search time to five GPU days, we are able to outperform human architectures and our competitors which consider the same types of layers.

    12/20/2017 ∙ by Martin Wistuba, et al.

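    For reference, the UCB1 rule at the heart of UCT, in a toy form; the children and rewards below are hypothetical, and the paper's information-sharing extensions are not shown.

    ```python
    import math

    def ucb1(node_stats, total_visits, c=math.sqrt(2)):
        """Select the child maximizing the UCB1 score.

        node_stats maps child -> (visit_count, total_reward); the score
        balances exploitation (mean reward) against exploration.
        """
        def score(child):
            n, w = node_stats[child]
            if n == 0:
                return float("inf")   # always try unvisited children first
            return w / n + c * math.sqrt(math.log(total_visits) / n)
        return max(node_stats, key=score)

    # Toy usage: children could be candidate layers to append to a partially
    # built architecture; the reward would be validation accuracy.
    stats = {"conv3x3": (10, 7.8), "conv5x5": (5, 3.1), "maxpool": (0, 0.0)}
    print(ucb1(stats, total_visits=15))   # -> "maxpool" (still unvisited)
    ```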

  • Learning Features For Relational Data

    Feature engineering is one of the most important but tedious tasks in data science projects. This work studies the automation of feature learning for relational data. We first prove theoretically that learning relevant features from relational data for a given predictive analytics problem is NP-hard. However, we show empirically that an efficient rule-based approach, which predefines transformations a priori based on heuristics, can extract very useful features from relational data. Indeed, the proposed approach outperforms the state-of-the-art solutions by a significant margin. We further introduce a deep neural network which automatically learns appropriate transformations of relational data into a representation that predicts the target variable well, instead of relying on transformations predefined by users. In an extensive experiment with Kaggle competitions, the proposed methods won late medals. To the best of our knowledge, this is the first time an automated system has won medals in Kaggle competitions involving complex relational data.

    01/16/2018 ∙ by Hoang Thanh Lam, et al.

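    One plausible instance of the rule-based approach, sketched with pandas: aggregate numeric columns of a child table with a fixed set of functions and join the results back to the parent. The two-table example and column names are hypothetical.

    ```python
    import pandas as pd

    # Hypothetical relational data: orders (child) linked to customers (parent).
    customers = pd.DataFrame({"customer_id": [1, 2, 3], "age": [34, 51, 28]})
    orders = pd.DataFrame({"customer_id": [1, 1, 2, 3, 3, 3],
                           "amount": [10.0, 25.0, 5.0, 7.5, 3.0, 12.0]})

    # Transformations predefined a priori: a fixed set of aggregations over
    # each numeric child column, joined back onto the parent table.
    aggs = (orders.groupby("customer_id")["amount"]
            .agg(["count", "mean", "max", "sum"])
            .add_prefix("amount_")
            .reset_index())
    features = customers.merge(aggs, on="customer_id", how="left")
    print(features)
    ```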

  • Adversarial Robustness Toolbox v0.2.2

    Adversarial examples have become an indisputable threat to the security of modern AI systems based on deep neural networks (DNNs). The Adversarial Robustness Toolbox (ART) is a Python library designed to support researchers and developers in creating novel defence techniques, as well as in deploying practical defences for real-world AI systems. Researchers can use ART to benchmark novel defences against the state of the art. For developers, the library provides interfaces which support the composition of comprehensive defence systems using individual methods as building blocks. ART supports machine learning models (and DNNs specifically) implemented in any of the most popular deep learning frameworks (TensorFlow, Keras, PyTorch). Currently, the library is primarily intended to improve the adversarial robustness of visual recognition systems; however, future releases are envisioned to comprise adaptations to other data modes such as speech, text or time series. The ART source code is released under an MIT license. The release includes code examples and extensive documentation to help researchers and developers get started quickly.

    07/03/2018 ∙ by Maria-Irina Nicolae, et al.

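    A usage sketch of the benchmarking workflow ART supports. Note that the class and module names below follow later ART releases (art.estimators, art.attacks.evasion) and may differ from the v0.2.2 layout described here; the synthetic data and model are placeholders.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from art.estimators.classification import SklearnClassifier
    from art.attacks.evasion import FastGradientMethod

    # Placeholder data and model standing in for a real visual recognizer.
    X = np.random.rand(100, 20).astype(np.float32)
    y = np.random.randint(0, 2, size=100)
    model = LogisticRegression().fit(X, y)

    classifier = SklearnClassifier(model=model)   # ART wrapper around the model

    # Craft adversarial examples with the Fast Gradient Method and measure
    # the accuracy drop -- the kind of benchmark ART is designed to support.
    attack = FastGradientMethod(estimator=classifier, eps=0.1)
    X_adv = attack.generate(x=X)
    acc = (classifier.predict(X_adv).argmax(axis=1) == y).mean()
    print(f"accuracy under attack: {acc:.2f}")
    ```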

  • Scalable Multi-Class Bayesian Support Vector Machines for Structured and Unstructured Data

    We introduce a new Bayesian multi-class support vector machine by formulating a pseudo-likelihood for a multi-class hinge loss in the form of a location-scale mixture of Gaussians. We derive a variational-inference-based training objective for gradient-based learning. Additionally, we employ an inducing point approximation which scales inference to large data sets. Furthermore, we develop hybrid Bayesian neural networks that combine standard deep learning components with the proposed model to enable learning for unstructured data. We provide empirical evidence that our model outperforms competing methods with respect to both training time and accuracy in classification experiments on 68 structured and two unstructured data sets. Finally, we highlight the key capability of our model to yield prediction uncertainty for classification by demonstrating its effectiveness in the tasks of large-scale active learning and detection of adversarial images.

    06/07/2018 ∙ by Martin Wistuba, et al.

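    For context, the binary-case building block behind this construction (due to Polson and Scott, 2011) is the following mixture identity; the paper's contribution is a multi-class analogue, which is not reproduced here.

    ```latex
    % Hinge pseudo-likelihood as a location-scale mixture of Gaussians:
    L(y_n \mid f(x_n)) = e^{-2\max(1 - y_n f(x_n),\,0)}
      = \int_0^\infty \frac{1}{\sqrt{2\pi\lambda_n}}
        \exp\!\left( -\frac{(1 + \lambda_n - y_n f(x_n))^2}{2\lambda_n} \right)
        \mathrm{d}\lambda_n
    ```

    Conditioning on the latent scale \lambda_n makes the likelihood Gaussian in f(x_n), which is what enables a variational-inference training objective and the inducing-point approximation.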

  • NeuNetS: An Automated Synthesis Engine for Neural Network Design

    The application of neural networks to a vast variety of practical problems is transforming the way AI is applied in practice. Pre-trained neural network models available through APIs, together with the capability to custom-train pre-built architectures on customer data, have made the consumption of AI by developers much simpler and resulted in broad adoption of these complex models. While pre-built network models exist for certain scenarios, meeting the constraints unique to each application requires AI teams to develop custom neural network architectures that trade off accuracy against memory footprint within the tight constraints of their use cases. However, only a small proportion of data science teams have the skills and experience needed to create a neural network from scratch, and the demand far exceeds the supply. In this paper, we present NeuNetS: an automated neural network synthesis engine for custom neural network design that is available as part of IBM's AI OpenScale product. NeuNetS is available for both text and image domains and can build neural networks for specific tasks in a fraction of the time it takes today with human effort, and with accuracy similar to that of human-designed AI models.

    01/17/2019 ∙ by Atin Sood, et al.

  • Inductive Transfer for Neural Architecture Optimization

    The recent advent of automated neural network architecture search has led to several methods that outperform state-of-the-art human-designed architectures. However, these approaches are computationally expensive, in extreme cases consuming GPU years. We propose two novel methods which aim to expedite this optimization problem by transferring knowledge acquired from previous tasks to new ones. First, we propose a novel neural architecture selection method which employs this knowledge to identify strong and weak characteristics of neural architectures across datasets. Thus, these characteristics do not need to be rediscovered in every search, remedying a major weakness of current state-of-the-art methods. Second, we propose a method for learning curve extrapolation to determine if a training process can be terminated early. In contrast to existing work, we propose to learn from the learning curves of architectures trained on other datasets to improve the prediction accuracy for novel datasets. On five different image classification benchmarks, we empirically demonstrate that both of our orthogonal contributions independently lead to an acceleration, without any significant loss in accuracy.

    03/08/2019 ∙ by Martin Wistuba, et al.

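    A sketch of the baseline that the second contribution improves on: extrapolating a single partial learning curve with a saturating power law. The paper additionally learns from curves of architectures trained on other datasets; the function family and initial values here are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def pow_law(t, a, b, c):
        """A common learning-curve family: accuracy rises and saturates."""
        return c - a * t ** (-b)

    def extrapolate(partial_curve, horizon):
        """Fit a power law to a partial curve and predict the final value."""
        t = np.arange(1, len(partial_curve) + 1, dtype=float)
        (a, b, c), _ = curve_fit(pow_law, t, partial_curve,
                                 p0=(0.5, 0.5, 0.9), maxfev=10_000)
        return pow_law(float(horizon), a, b, c)

    # Synthetic partial curve: 20 observed epochs of validation accuracy.
    curve = 0.9 - 0.5 * np.arange(1, 21, dtype=float) ** -0.7
    print(extrapolate(curve, horizon=200))   # predicted accuracy at epoch 200
    ```

    If the predicted final accuracy falls below the best architecture found so far, the training run can be terminated early, saving GPU time.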

  • A Survey on Neural Architecture Search

    The growing interest in both the automation of machine learning and deep learning has inevitably led to the development of automated methods for neural architecture optimization. The choice of the network architecture has proven to be critical, and many advances in deep learning spring from its immediate improvements. However, deep learning techniques are computationally intensive and their application requires a high level of domain knowledge. Therefore, even partial automation of this process would help make deep learning more accessible to both researchers and practitioners. With this survey, we provide a formalism which unifies and categorizes the landscape of existing methods along with a detailed analysis that compares and contrasts the different approaches. We achieve this via a discussion of common architecture search spaces and architecture optimization algorithms based on principles of reinforcement learning and evolutionary algorithms along with approaches that incorporate surrogate and one-shot models. Additionally, we address the new research directions which include constrained and multi-objective architecture search as well as automated data augmentation, optimizer and activation function search.

    05/04/2019 ∙ by Martin Wistuba, et al.
