Combinets: Learning New Classifiers via Recombination

02/10/2018 · by Matthew Guzdial, et al. · Georgia Institute of Technology

Problems with few examples of a new class of objects prove challenging to most classifiers. One solution is to reuse existing data through transfer methods such as one-shot learning or domain adaptation. However, these approaches require an explicit hand-authored or learned definition of how reuse can occur. We present an approach called conceptual expansion that learns how to reuse existing machine-learned knowledge when classifying new cases. We evaluate our approach by adding new classes of objects to the CIFAR-10 dataset and varying the number of available examples of these new classes.


Introduction

Modern deep learning systems perform well with large amounts of training data on known classes but often struggle otherwise. This is a general issue given the invention or discovery of novel classes, rare or elusive classes, or the imagining of fantastical classes. For example, if a new traffic sign were invented tomorrow it would have a severe, negative impact on autonomous driving efforts until there were enough examples for a learning system to recognize this new class with confidence.

Deep learning success has depended more on the size of datasets than on the strength of algorithms [Pereira, Norvig, and Halevy2009]. We observe that a significant amount of training data exists for many classes. But there are also many novel, rare, or fantastical classes with insufficient data that can be understood as derivations or combinations of existing classes. For example, consider a pegasus, a fantastical creature that appears to be a horse with wings, and therefore can be thought of as a combination of a horse and a bird. If we suddenly discovered a pegasus and only had a few pictures, we could not train a typical neural network classifier to recognize a pegasus as a new class, nor a generative adversarial network to create new pegasus images. However, we might be able to approximate both models given appropriate models trained on horse and bird data.

Various approaches exist to reuse knowledge from larger datasets for problems with smaller datasets, such as zero-shot and transfer learning. In these approaches, knowledge from a source model trained on a large dataset is applied to a target problem, either by retraining the network on the target dataset [Levy and Markovitch2012] or by leveraging sufficiently general or authored features to handle new classes [Xian, Schiele, and Akata2017]. The latter of these two approaches is not guaranteed to perform well depending on the source and target problems, and the former is limited in terms of what final target models can be learned.

Combinational creativity is the type of creativity humans employ when combining ideas [Boden2004]. Many algorithms exist that attempt to represent this process, but they have historically required hand-authored graphical representations of input concepts, with combination only occurring across symbolic values [Fauconnier2001]. A neural network is a large, complex graph of numeric values derived from data. If combinational creativity techniques can be applied to recombine trained neural networks, they could potentially address the pegasus problem and improve few-shot recognition and generation of new classes without the introduction of outside knowledge or heuristics.

We introduce a novel representation, conceptual expansion, that allows for the recombination of an arbitrary number of learned models into a final model without additional training. In the domains of image recognition and image generation we demonstrate how recombination via conceptual expansion outperforms standard transfer learning approaches for fixed neural network architectures. The remainder of the paper is organized as follows: first we discuss related work and differentiate this technique from similar approaches for few-shot problems. Second, we discuss the conceptual expansion representation in detail and the search-based approach we employ to construct them in this paper. Third, we present a variety of experiments to demonstrate the limitations and advantages of the approach. We end with conclusions and future work.

Related Work

Combinational Creativity

Combinational creativity represents a particular set of approaches for knowledge reuse through recombining existing knowledge and concepts for the purposes of inventing novel concepts [Boden2004]. There have been many approaches to combinational creativity over the years. Case-based reasoning (CBR) represents a general AI problem-solving approach that relies on the storage, retrieval, and adaptation of existing solutions [De Mantaras et al.2005]. The adaptation function has led to a large class of combinational creativity approaches [Wilke and Bergmann1998, Fox and Clarke2009, Manzano, Ontanón, and Plaza2011]. These techniques tend to be domain-dependent, for example for the problems of text generation or tool creation [Hervás and Gervás2006, Sizov, Öztürk, and Aamodt2015]. Murdock and Goel [Murdock and Goel2001] combine reinforcement learning with case-based reasoning, which aligns with our work to combine combinational creativity and machine learning research. However, their technique does not combine classes.

The area of belief revision, which models how beliefs change, includes functions to merge prior existing beliefs with new beliefs [Cojan and Lieber2009, Konieczny and Pérez2011, Fox and Clarke2009]. Amalgams represent an extension of this belief-merging process that looks to output the simplest combination [Ontañón and Plaza2010]. The mathematical notion of convolution has been applied to blend weights between two neural nets in work that parallels our desire to combine combinational creativity and machine learning, but with inconclusive results [Thagard and Stewart2011].

Conceptual blending is perhaps the most popular computational creativity technique, though it has traditionally been limited to hand-authored input [Fauconnier2001]. Li et al. [Li et al.2012] introduced goals to conceptual blending, which parallels our usage of training data to derive the structure of a combination. However, conceptual blending only deals with symbolic values, which makes it ill-suited to machine-learned models. Visual blending [Cunha et al.2017] blends components of images using conceptual blending and parallels our use of combinational creativity with generative adversarial networks; however, it requires hand-defined components and combines images instead of models. Guzdial and Riedl [Guzdial and Riedl2016] utilized conceptual blending to recombine machine-learned models of video game level design by treating all numbers as ordinal values, but their approach does not generalize to neural networks.

Combinational creativity algorithms tend to have many possible valid outputs. This is typically viewed as undesirable, with general heuristics or constraints designed to pick a single correct combination from this set [Fauconnier2001, Ontañón and Plaza2010]. This limits the potential output of these approaches; we instead employ a domain-specific heuristic criterion to find an optimal combination.

Knowledge Reuse in Neural Networks

A wide range of prior approaches exist for the reuse or transfer of knowledge in neural networks, such as zero-shot, one-shot, and few-shot learning [Xian, Schiele, and Akata2017, Fei-Fei, Fergus, and Perona2006], domain adaptation [Daumé III2009], and transfer learning [Lampert, Nickisch, and Harmeling2009, Wang and Hebert2016]. These approaches either require an additional set of features for transfer or depend upon backpropagation to refine learned features from some source domain to a target domain. In the former case these additional transfer features can be hand-authored [Lampert, Nickisch, and Harmeling2009, Kulis, Saenko, and Darrell2011, Ganin et al.2016] or learned [Levy and Markovitch2012, Norouzi et al.2013, Mensink, Gavves, and Snoek2014, Ba et al.2015, Elhoseiny et al.2017]. In the case of requiring additional training, these approaches can freeze all weights of a network aside from a final classification layer or can tune all the weights of the network with standard training approaches [Wong and Gales2016, Li et al.2017]. As an alternative, one can author an explicit model of transfer such as metaphors [Levy and Markovitch2012] or hypotheses [Kuzborskij and Orabona2013]. To the best of our knowledge no work exists that attempts few-shot training of generative adversarial networks (GANs), though some work exists exploring the space between distributions of classes [Cheong and Teo2018].

Kuzborskij et al. [Kuzborskij, Orabona, and Caputo2013] investigate the same n to n+1 multiclass transfer learning problem as our image classification experiments, and make use of a combination of existing trained classifiers. However, their approach makes use of support vector machines with a small feature set and only allows for linear combinations. Rebuffi et al. [Rebuffi et al.2017] extended this work to convolutional neural nets, but it still requires retraining via backpropagation. Chao et al. [Chao et al.2016] demonstrated that average visual features can be used for zero-shot learning, which represents a domain-independent zero-shot learning measure that does not require human authoring or additional training.

One alternative to reusing learned knowledge in neural networks is to extend a dataset to new classes using query expansions and the web [Divvala, Farhadi, and Guestrin2014, Yao et al.2017]. However, we are interested in problems in which no additional training data exists, even online, because the class in question is new, fantastical, or rare, and in how existing learned features can be adapted.

Conceptual Expansion

Imagine tomorrow we discover that a pegasus exists. Initially we lack enough images of this newly discovered flying horse to build a traditional classifier or image generator. However, suppose we have neural network classifiers and generators trained on classes including horses and birds. Conceptual expansion allows us to reuse the learned features from machine learned model(s) to produce new models without additional training or additional transfer features.

The intuition behind conceptual expansion is that it allows us to derive a high-dimensional, parameterized search space from an arbitrary number of pretrained input models, where each point of this search space is a new model that can be understood as being some degree of combination or variation of the input models. Each point of this space (each combined model) is a valid conceptual expansion. We consider the case where a class (e.g. pegasus) is a combination of other classes (e.g. horse and bird) and the learned features of models of those existing classes can be recombined to create the features of a model of the new class. In these cases, we hypothesize that conceptual expansions can represent models one cannot necessarily discover using conventional machine learning techniques with the available data. Furthermore, we hypothesize that these conceptual expansion models may perform better on specific tasks than standard models in cases with small amounts of available data, such as identifying or generating new classes of objects. We can use a heuristic informed by this small amount of training data to guide the search for our final conceptual expansion. This process is inspired by the human ability to make conceptual leaps, but is not intended as an accurate recreation.

A conceptual expansion of concept X is represented as the following function:

CE^X(F, A) = a_1 * f_1 + a_2 * f_2 + ... + a_n * f_n        (1)

where F = {f_1, ..., f_n} is the set of all mapped features and A = {a_1, ..., a_n} is a filter representing what of, and what amount of, each mapped feature f_i should be represented in the final conceptual expansion. In the ideal case X = CE^X(F, A) (e.g. a combined model of birds and horses equals our ideal pegasus model). The exact shape of each a_i depends upon the feature representation. If features are symbolic, a_i can have values of either 0 or 1 (including the mapped feature or not), or vary from 0 to 1 if features are numeric or ordinal. Note that for numeric values one may choose a different range (e.g. -1 to 1) dependent on the domain. If features are matrices, as in a neural net, each a_i is also a matrix; in the case of matrices the multiplication is an element-wise multiplication or Hadamard product. As an example, in the case of neural image recognition, f_1, ..., f_n are the variables in a convolutional neural network learned via backpropagation. Deriving a conceptual expansion is the process of finding an F and A for known features such that CE^X(F, A) optimizes a given objective or heuristic towards some target concept X.
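To make Equation 1 concrete, the following sketch (our own illustration; the variable names, shapes, and values are assumptions, not drawn from the paper) combines two hypothetical learned weight vectors and biases with element-wise filters, as a pegasus model might combine horse and bird features:

import numpy as np

# Hypothetical learned parameters for the horse and bird output units,
# e.g. one row of a final fully-connected layer plus its bias.
f_horse_w, f_horse_b = np.random.randn(64), 0.1
f_bird_w, f_bird_b = np.random.randn(64), -0.2

# Filters (alphas): a matrix (here a vector) per weight feature and a
# scalar per bias feature, initialized uniformly to 0.5.
a_horse_w, a_bird_w = np.full(64, 0.5), np.full(64, 0.5)
a_horse_b, a_bird_b = 0.5, 0.5

# Equation 1, with Hadamard (element-wise) products for the matrix features:
pegasus_w = a_horse_w * f_horse_w + a_bird_w * f_bird_w
pegasus_b = a_horse_b * f_horse_b + a_bird_b * f_bird_b

Searching the space of conceptual expansions then amounts to adjusting the alphas (and which features are mapped) rather than retraining the underlying weights.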

In this representation, the space of conceptual expansions is a multidimensional, parameterized search space over possible combinations of our input models. There exists an infinite number of possible conceptual expansions for non-symbolic features, which makes naïvely deriving this representation ill-advised. Instead, as is typical in combinational creativity approaches, we first derive a mapping. The mapping determines what particular knowledge—in this case the weights and biases of a neural network—will be combined to address a novel case. It will serve as an informed starting point that can then be optimized in the space of possible conceptual expansions.

The mapping is the collection of existing class knowledge we will combine from our knowledge base to initially represent the novel class knowledge, along with the degree of inclusion of each class. In the aforementioned pegasus example it is unlikely one would have a trained image recognition model that only recognized horses and birds. For example, the widely used CIFAR-10 dataset [Krizhevsky and Hinton2009] contains ten classes, including horses and birds. If we were to use the CIFAR-10 dataset as our starting point, some of the information from some of the ten classes might be useful, but much of it likely is not. We differentiate between these two cases with the mapping. The mapping allows us to select only the portions of a model that will contribute to the recognition of the new class, which can then be used to determine a starting point for searching the space of conceptual expansions.

Given a mapping, we construct an initial conceptual expansion (a set of F and an A) that is iterated upon to optimize for domain-specific notions of quality (in the pegasus example, image recognition accuracy). We discuss the creation of the mapping in Section Mapping Construction and the refinement of the conceptual expansion in Section Conceptual Expansion Search.

Mapping Construction

Constructing the initial mapping is relatively straightforward. As input we assume we have an existing trained model or models (CifarNet trained on CIFAR-10 for the purposes of this example [Krizhevsky and Hinton2009]) and data for a novel class (whatever pegasus images we have). We construct a mapping with the novel class data by looking at how the model or models in our knowledge base perform on the data for the novel class. The mapping is constructed according to the ratio of the new images classified into each of the old classes. For example, suppose we have a CifarNet trained on CIFAR-10 and we additionally have four pegasus images. Further suppose that CifarNet classifies two of the four pegasus images as a horse and two as a bird. We construct a mapping of: f_1 consisting of the weights and biases associated with the horse class, and f_2 consisting of the weights and biases associated with the bird class. We initialize the alpha values for both variables to all be 0.5 (the classification ratio), meaning a floating point value for the biases and a matrix for the weights. This leads to a final classification for our pegasus images that relies on half of the weights of the horse class and half of the weights of the bird class.
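A minimal sketch of this mapping step, assuming a generic classifier with a Keras-style predict call (the function name, the predict interface, and the return format are our assumptions):

import numpy as np

def build_mapping(model, new_class_images, num_classes=10):
    # Classify each novel-class image with the existing model and record
    # the ratio of images assigned to each existing class.
    predictions = [int(np.argmax(model.predict(img[None, ...]))) for img in new_class_images]
    counts = np.bincount(predictions, minlength=num_classes)
    ratios = counts / counts.sum()
    # Keep only classes that absorbed at least one novel image; these ratios
    # seed the initial alpha values, e.g. {horse: 0.5, bird: 0.5} in the example above.
    return {cls: float(r) for cls, r in enumerate(ratios) if r > 0}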

Conceptual Expansion Search

input : available data data, an initial model model, a mapping m, and a score score
output : the maximum expansion found according to the heuristic
1  maxE ← DefaultExpansion(model) + m;
2  maxScore ← score;
3  v ← {maxE};
4  improving ← 0;
5  while improving < 10 do
6      n ← maxE.GetNeighbor(v);
7      v ← v ∪ {n};
8      s ← Heuristic(n, data);
9      oldMax ← maxScore; maxScore, maxE ← max((maxScore, maxE), (s, n));
10     improving ← oldMax < maxScore ? 0 : improving + 1;
11 return maxE;
Algorithm 1 Conceptual Expansion Search

The space of potential conceptual expansions is massive, and the mapping construction stage gives us an initial starting point in this space from which to search. We present the pseudocode for the Conceptual Expansion Search in Algorithm 1. Line 1 finds a default expansion plus the mapping information. The exact nature of this depends on the final network architecture. For example, the mapping may overwrite the entirety of the network if the input models and final model have the same architecture, or just the final classification layer if not (as in the case of adding an additional class). This initial conceptual expansion will be a linear combination of the existing knowledge, but the final conceptual expansion need not be a linear combination. The default expansion is an expansion equivalent to the original model(s), in that each variable is replaced by an expanded variable with its original value and an alpha of 1.0 (or a matrix of 1.0s). This means that the initial expansion is functionally identical to the original model, beyond any weights impacted by the mapping.

Once we have a mapping we search for a set of F and A for which the conceptual expansion performs well on a domain-specific measure (e.g. pegasus classification accuracy). For the purposes of this paper we implement a greedy optimization search that checks a fixed number of neighbors before the search ends. The GetNeighbor function randomly selects between one of the following: altering a single element of a single a_i, replacing all of the values of a single a_i, replacing the values of an f_i with a randomly selected alternative feature, or adding an additional feature and corresponding random alpha to an expanded variable. The final output of this process is the maximum-scoring conceptual expansion found during the search. For the purposes of clarity we refer to these conceptual expansions of neural networks as combinets.
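A rough sketch of this greedy loop and one of the neighbor moves follows (our own simplification: only the single-element alpha perturbation is shown, the visited list of Algorithm 1 is omitted, and the alphas are assumed to be numpy arrays):

import copy
import random

def get_neighbor(expansion):
    # expansion: dict mapping variable names to lists of (feature, alpha) pairs.
    neighbor = copy.deepcopy(expansion)
    var = random.choice(list(neighbor.keys()))
    feature, alpha = random.choice(neighbor[var])
    idx = tuple(random.randrange(dim) for dim in alpha.shape)
    alpha[idx] = random.uniform(0.0, 1.0)  # alter a single element of a single alpha
    return neighbor

def expansion_search(initial_expansion, heuristic, data, patience=10):
    # Greedy search mirroring Algorithm 1: stop after `patience` non-improving neighbors.
    best, best_score = initial_expansion, heuristic(initial_expansion, data)
    since_improvement = 0
    while since_improvement < patience:
        candidate = get_neighbor(best)
        score = heuristic(candidate, data)
        if score > best_score:
            best, best_score, since_improvement = candidate, score, 0
        else:
            since_improvement += 1
    return best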

We use this relatively simple, stochastic search for our initial investigation of conceptual expansions, as our focus in this paper is the representation, not the optimization method. It is possible that alternative means of searching the space of conceptual expansions may find better conceptual expansions and improve on the baselines we establish in the next section.

Fox          100                 50                  10                  5                   1
             11th      orig.     11th      orig.     11th      orig.     11th      orig.     11th      orig.
combinet     34.03.5   81.82.2   26.05.2   81.591.9  28.33.5   79.11.6   23.08.5   80.61.2   12.09.8   80.77.2
standard     7.02.7    62.04     0.00.0    62.17     0.00.0    62.34     0.00.0    62.44     0.00.0    76.443.5
transfer     5.04.3    87.20.5   0.00.0    87.90.2   0.00.0    88.10.4   0.00.0    87.70.2   0.00.0    88.01.1
zero-shot    11.00.7   86.20.4   11.01.0   86.20.8   9.62.3    86.20.2   10.04.6   86.01.4   6.03.3    83.22.5

Plain        100                 50                  10                  5                   1
             11th      orig.     11th      orig.     11th      orig.     11th      orig.     11th      orig.
combinet     53.010.0  84.03.6   45.77.6   84.27.8   31.322.0  83.92.4   28.312.6  82.32.2   23.017.4  84.02.4
standard     50.07.7   62.54     42.03.2   62.18     16.012.8  61.67     0.00.0    62.27     0.00.0    62.27
transfer     4.53.0    86.92     0.00.0    86.91     0.00.0    86.96     0.00.0    87.20     0.00.0    87.20
zero-shot    23.00.7   86.20.5   23.61.1   86.20.3   222.8     86.113.9  18.63.8   83.73.4   15.67.3   82.72.9

Table 1: Average test accuracy for the first experiment. The orig. column displays the accuracy on the 10,000 test images for the original 10 classes of CIFAR-10. The 11th column displays the accuracy on the CIFAR-100 test images of the newly added class.

CifarNet Experiments

In this section we present a series of experiments meant to demonstrate the strengths and limitations of conceptual expansions for image classification with deep neural networks. We chose CIFAR-10 and CIFAR-100 [Krizhevsky and Hinton2009] as the domain for this approach as these represent well-understood datasets. It is not our goal to achieve state of the art on CIFAR-10 or CIFAR-100; we instead use these datasets to construct problems in which a system must identify images of a class not present in some initial training set given limited training data on the novel class. For the deep neural network model we chose CifarNet [Krizhevsky and Hinton2009], again due to existing understanding of its performance on the more traditional applications of these datasets. We intentionally choose not to make use of a larger dataset like ImageNet [Deng et al.2009] or a larger architecture, as we aim to compare how our approach constructs final features given a limited set of input features, compared to other approaches that transfer features. We do not include a full description of CifarNet but note that it is a two-layer convolutional neural net with three fully-connected layers.

For each experiment, we ran our conceptual expansion search algorithm ten times and took the most successful conceptual expansion found across the ten runs in terms of training accuracy. We did this to ensure we had found a near optimal conceptual expansion. We note that this approach was still many times faster than initially training the CifarNet on CIFAR-10 with backpropagation.

Our first experiment expands a CifarNet trained on CIFAR-10 to recognize one additional class selected from CIFAR-100 that is not in CIFAR-10. We vary the size of slices of the training data for the newly introduced class, which allows us to evaluate the performance of recombination via conceptual expansions under a variety of controlled conditions. Our second experiment fully expands a CifarNet model trained on CIFAR-10 to recognize the one-hundred classes of CIFAR-100 with limited training data. Finally, we investigate the running example throughout this paper: expanding a CifarNet model trained on CIFAR-10 to classify pegasus images.

CIFAR-10 + Fox/Plain

For our initial experiment we chose to add fox and plain (as in a grassy field) recognition to the CifarNet, as these classes exist within CIFAR-100 but not within CIFAR-10 (CIFAR-10 is made up of the classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck). We chose foxes and plains for this initial case study because they represent illustrative examples of conceptual expansion performance: a number of classes in CIFAR-10 can be argued to be similar to foxes, but no classes are similar to plains.

For training data we drew from the 50,000 training examples for the ten classes of CIFAR-10, adding a varying number of training instances of fox or plain. For test data we made use of the full 10,000-image CIFAR-10 test set and the 100 samples in the CIFAR-100 test set for each class. For each size of training data slice (i.e. 1, 5, 10, 50, and 100) we constructed five unique random slices. We chose five for consistency across all the differently sized slices, given that there was a maximum of 500 training images for fox and plain and our largest slice size was 100. We present the average test accuracy across all approaches and with all sample sizes in Table 1. This table shows results when we provide five slices of fox or plain images in quantities of 1, 5, 10, 50, or 100. For each slice size, we provide the accuracy on the original CIFAR-10 images and the accuracy of identifying the 11th class (either fox or plain).

We evaluate against three baselines. Our first baseline (standard) trains CifarNet with backpropagation, with stratified batches, on the 10,000 CIFAR-10 images and the newly introduced foxes or plains. This baseline makes the assumption that the new class was part of the same domain as the other classes, as in [Daumé III2009]. For our second baseline we took inspiration from transfer learning and student-teacher models [Wong and Gales2016, Li et al.2017, Furlanello et al.2017], and train an initial CifarNet on only the CIFAR-10 data and then retrain the classification layers to predict the eleventh class with the newly available data. We note that transfer learning typically involves training on a larger dataset, such as ImageNet, then retraining the final classification layer; however, we wished to compare how these different approaches alter the same initial features towards classifying the new class. For our third baseline we drew on the zero-shot approach outlined in [Chao et al.2016], using the average activation of the trained CifarNet on the training data to derive feature classification vectors. In all cases we trained the models until convergence.
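For the zero-shot baseline, our reading of the average-activation idea is a nearest-prototype classifier over network activations; the sketch below is that interpretation only (the cosine-similarity choice and the function names are our assumptions, not the exact procedure of Chao et al.):

import numpy as np

def class_prototypes(features_by_class):
    # Average the network activations of each class's training images
    # to obtain one prototype vector per class (including the new class).
    return {c: np.mean(feats, axis=0) for c, feats in features_by_class.items()}

def predict_class(feature, prototypes):
    # Assign the class whose prototype is most similar to the test feature.
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    return max(prototypes, key=lambda c: cos(feature, prototypes[c]))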

There exist many other transfer approaches, but they tend to require additional human authoring of transfer methods or features, or an additional dataset to draw from. We focus on comparing the behavior of these approaches in terms of altering or leveraging learned features, and making use of these other approaches would only make this comparison less clear.

As can be seen in Table 1, the combinet consistently outperforms the baselines at recognizing the newly added eleventh class. We note that the expected CifarNet test accuracy for CIFAR-10 is 85%. Combinets achieve the best accuracy on the newly added class while only losing a small amount of accuracy on average on the 10 original classes. The combinet loss in CIFAR-10 accuracy was almost always due to overgeneralizing. The transfer approach did slightly better than the expected CIFAR-10 accuracy, but this matches previously reported accuracy improvements from retraining [Furlanello et al.2017].

Foxes clearly confused the baselines, leading to no correctly identified test foxes for the standard or transfer baselines at the lowest sample sizes. Compared to plains, foxes had significant overlap in terms of features with cats and dogs. With these smaller sample sizes, transfer and standard were unable to learn or adapt suitable discriminatory features. Comparatively, the conceptual expansion approach was capable of combining existing features into new features that could more successfully discriminate between these classes. The zero-shot approach did not require additional training and instead made use of secondary features to make predictions, which was more consistent, but still not as successful as our approach in classifying the new class.

Note that combinets do not always outperform these other approaches. For example, the standard approach beats out combinets, getting an average of 83% accuracy with access to all 500 plain training images, while the combinet only achieves an accuracy of roughly 50%. This suggests that combinets are only suited to problems with low training data.

Samples    combiGAN             combi+N              combi+T              Naive                Transfer
           I          KL        I          KL        I          KL        I          KL        I          KL
500        3.83±0.32  0.33      4.61±0.22  0.28      3.05±0.23  0.31      2.98±0.25  0.33      3.38±0.19  1.05
100        4.23±0.15  0.10      4.38±0.37  0.29      4.40±0.19  0.43      1.76±0.04  0.33      3.26±0.23  0.36
50         4.05±0.24  0.22      4.03±0.35  0.12      1.69±0.05  2.36      1.06±0.00  10.8      3.97±0.22  0.21
10         4.67±0.44  0.44      4.79±0.28  0.13      3.06±0.19  1.20      1.20±0.01  10.8      4.40±0.19  0.11

Table 2: Summary of results for the GAN experiments. I is the inception score and KL is the KL divergence between the classification distributions of generated and real fox images.

Expanding CIFAR-10 to CIFAR-100

For the prior experiments we added a single eleventh class from CIFAR-100 to a CifarNet trained on CIFAR-10. This experiment looks at the problem of expanding a trained CifarNet from classifying the ten classes of the CIFAR-10 dataset to the one-hundred classes of the CIFAR-100 dataset.

For this experiment we limited our training data to ten randomly chosen samples of each CIFAR-100 class. We slightly altered our approach to account for the change in task, constructing an initial mapping for each class individually as if we were expanding a CifarNet to just that eleventh class. We utilized the same baselines as in the first experiment, given the same 1,000-image training set.

We note that one would not typically utilize CifarNet for this task. Even given access to all 50,000 training samples of CIFAR-100, a CifarNet trained using backpropagation only achieves around 30% test accuracy on CIFAR-100. We mean to show the relative scale of accuracy before and after conceptual expansion, not to achieve state of the art on CIFAR-100 with the full dataset. We tested on the 10,000 test samples available for CIFAR-100.

The average test accuracies across all 100 classes are as follows: the combinet achieves 11.13%, the naive baseline achieves 1.20%, the transfer baseline achieves 6.43%, and the zero-shot baseline achieves 4.10%. We note that our approach is the only one to do better than chance, and that it significantly outperforms all the baselines. However, no approach comes anywhere near the 30% accuracy that could be achieved with full training data for this architecture.

Pegasus

We return to our running example of an image recognition system that can recognize a pegasus. Unfortunately we lack actual images of a pegasus. To approximate this we collected fifteen photo-realistic, open-use pegasus images from Flickr. Using the same combinet approach as in the above two experiments, we tested with a 10-5 training/test split and a 5-10 training/test split. For the former we recognized 4 of the 5 test pegasus images (80% accuracy) with 80% CIFAR-10 accuracy, and for the latter we recognized 5 of the 10 test pegasus images (50% accuracy) with 82% CIFAR-10 accuracy.

DCGAN Experiment

In this section we demonstrate the application of conceptual expansions to generative adversarial networks (GANs). Specifically, we demonstrate the ability to use conceptual expansions to find GANs that can generate images of a class without traditional training on images of that class. We also demonstrate how our approach can take as input an arbitrary number of initial neural networks, instead of the single network of the classification experiments. We make use of the DCGAN [Radford, Metz, and Chintala2015] as the GAN architecture for this experiment, as it has known performance on a number of tasks. We make use of the CIFAR-100 dataset from the prior section and in addition use the Caltech-UCSD Birds-200-2011 [Wah et al.2011], CAT [Zhang, Sun, and Tang2008], Stanford Dogs [Khosla et al.2011], FGVC Aircraft [Maji et al.2013], and Stanford Cars [Krause et al.2013] datasets. We make use of these five datasets as they represent five of the ten CIFAR-10 classes, but with significantly more images and images of higher quality. Sharing the majority of classes between experiments allows us to draw comparisons between results.

We trained a DCGAN on each of these datasets until convergence, then used all five of these models as the original knowledge base for combinets. Specifically, we built mappings by testing the proportion of training samples the discriminator of each GAN classified as real. We then built a combinet discriminator for the target class from the discriminators of each GAN. Finally we built a combinet generator from the generators of each GAN, using the combinet discriminators as the heuristic, in traditional GAN fashion, for the conceptual expansion search. We nickname these combinet discriminators and generators combiGANs. As above, we made use of the fox images of CIFAR-100 as our novel class training data, varying the number of available images.
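A sketch of the discriminator-based mapping step, under the assumption that each pretrained discriminator exposes a predict call returning a real/fake probability (the 0.5 threshold and the use of these proportions to seed the expansion's alphas are our assumptions):

import numpy as np

def gan_mapping(discriminators, new_class_images, threshold=0.5):
    # For each pretrained GAN, compute the proportion of novel-class images
    # its discriminator classifies as real; these proportions seed the
    # combiGAN expansion analogously to the classification-ratio mapping above.
    proportions = {}
    for name, disc in discriminators.items():
        scores = np.array([float(disc.predict(img[None, ...])) for img in new_class_images])
        proportions[name] = float((scores > threshold).mean())
    return proportions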

We built two baselines: (1) a naive baseline, in which we trained the DCGAN on the available fox images in the traditional manner; and (2) a transfer baseline, in which we took a DCGAN trained on the Stanford Dogs dataset and retrained it on the fox images. We also built two variations of combiGAN: (1) a variant in which we used the discriminator of the naive baseline as the heuristic for the combinet generator (Combi+N), and (2) the same as the last, but using the transfer baseline discriminator (Combi+T). We further built a baseline trained on the Stanford Dogs, CAT, and fox images simultaneously as in [Cheong and Teo2018], but found that it did not improve over the other baselines, and thus we omit it to save space. We do not include the zero-shot approach of the prior section as it is only suitable for classification tasks.

Figure 1: Most fox-like output according to our model for each baseline and sample size.

CombiGAN Results

We made use of two metrics: the inception score [Salimans et al.2016] and the Kullback-Leibler (KL) divergence between the classification distributions of generated and true images. We acknowledge that the inception score was originally designed for ImageNet; since we do not train on ImageNet, we cannot use it as an objective score, but we can use it as a comparative metric of objectness. For the second metric we desired some way to represent how fox-like the generated images were. Thus we made use of the standard classifier trained on 500 foxes, though in theory we could have made use of any classifier. We compare the distribution over classes of real CIFAR-100 fox images and of the fake images with the KL divergence. We generated 10,000 images from each GAN to compute each metric. We summarize the results of this experiment in Table 2.
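One way this fox-likeness measure could be computed is sketched below (the direction of the KL divergence, the smoothing constant, and the classifier interface are our assumptions):

import numpy as np

def class_distribution(classifier, images, num_classes=11):
    # Histogram of predicted classes over a set of images, normalized to a distribution.
    preds = [int(np.argmax(classifier.predict(img[None, ...]))) for img in images]
    counts = np.bincount(preds, minlength=num_classes).astype(float)
    return counts / counts.sum()

def kl_divergence(p, q, eps=1e-8):
    # KL(p || q) between discrete distributions, smoothed to avoid log(0).
    p, q = p + eps, q + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Hypothetical usage: compare real fox images against GAN samples.
# real_dist = class_distribution(fox_classifier, real_fox_images)
# fake_dist = class_distribution(fox_classifier, generated_images)
# fox_likeness = kl_divergence(real_dist, fake_dist)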

We note that in almost all cases our approach or one of its variations (combi+N and combi+T) outperforms the two baselines. In the case with 10 training images the transfer baseline beats our approach on our fox-like measure, but its 0.11 differs only slightly from the 0.13 combi+N value. In Figure 1, we include the most fox-like image in terms of classifier confidence from the training samples (real) and from each baseline's output. We note that the combiGAN output had a tendency to retain face-like features, while the transfer baseline tended to revert to fuzzy blobs.

Discussion and Limitations

Conceptual expansions of neural networks—combinets and combiGANs—outperform standard approaches on problems with limited data without additional knowledge engineering. We refer to this approach generally as conceptual expansion, which is inspired by the human ability to make conceptual leaps by combining existing knowledge. Our contributions in this paper are an initial exploration of conceptual expansion of neural networks; we speculate that more sophisticated optimization search routines than the one provided in this paper may achieve greater improvements.

We anticipate that the future performance of conceptual expansions will depend upon the extent to which the existing knowledge base contains information relevant to the new problem, and on the ability of the optimization function to find helpful conceptual expansions. We note that one choice of optimization function could be human intuition, and we have had success hand-designing conceptual expansions for sufficiently small problems.

Conceptual expansions appear less dependent on training data than existing transfer learning approaches, as evidenced by the comparative performance of the approach with little training data. This is further evidenced by those instances where conceptual expansion outperformed itself with less training data. We anticipate further exploration of this in future work. We expect these results to generalize to other domains, but recognize our choice of datasets as a potential limiting factor. CIFAR-10 and CIFAR-100 have very low resolution images (32x32 RGB images). Further, we do not make use of traditional data augmentation techniques such as noising or horizontal flips of the images. We note once again that we chose these datasets for our experiments to focus on feature adaptation.

Conclusions

We present conceptual expansion, an approach to produce recombined versions of existing machine learned deep neural net models. We ran four experiments of this approach compared to common baselines, and found we were able to achieve greater accuracy with less data. Our technique relies upon a flexible representation of recombination of existing knowledge that allows us to represent new knowledge as a combination of particular knowledge from existing cases. To our knowledge this represents the first attempt at applying a model of combinational creativity to neural networks.

Acknowledgments

We gratefully acknowledge the NSF for supporting this research under NSF award 1525967.

References

  • [Ba et al.2015] Ba, L. J.; Swersky, K.; Fidler, S.; and Salakhutdinov, R. 2015. Predicting deep zero-shot convolutional neural networks using textual descriptions. In International Conference on Computer Vision, 4247–4255.
  • [Boden2004] Boden, M. A. 2004. The creative mind: Myths and mechanisms. Psychology Press.
  • [Chao et al.2016] Chao, W.-L.; Changpinyo, S.; Gong, B.; and Sha, F. 2016. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild. In European Conference on Computer Vision, 52–68. Springer.
  • [Cheong and Teo2018] Cheong, B., and Teo, H. 2018. Can we train dogs and humans at the same time? gans and hidden distributions. Technical report, Stanford.
  • [Cojan and Lieber2009] Cojan, J., and Lieber, J. 2009. Belief merging-based case combination. In ICCBR, 105–119. Springer.
  • [Cunha et al.2017] Cunha, J. M.; Gonçalves, J.; Martins, P.; Machado, P.; and Cardoso, A. 2017. A pig, an angel and a cactus walk into a blender: A descriptive approach to visual blending. arXiv preprint arXiv:1706.09076.
  • [Daumé III2009] Daumé III, H. 2009. Frustratingly easy domain adaptation. arXiv preprint arXiv:0907.1815.
  • [De Mantaras et al.2005] De Mantaras, R. L.; McSherry, D.; Bridge, D.; Leake, D.; Smyth, B.; Craw, S.; Faltings, B.; Maher, M. L.; Cox, M. T.; Forbus, K.; et al. 2005. Retrieval, reuse, revision and retention in case-based reasoning. The Knowledge Engineering Review 20(3):215–240.
  • [Deng et al.2009] Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; and Fei-Fei, L. 2009. Imagenet: A large-scale hierarchical image database. In CVPR09.
  • [Divvala, Farhadi, and Guestrin2014] Divvala, S. K.; Farhadi, A.; and Guestrin, C. 2014. Learning everything about anything: Webly-supervised visual concept learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3270–3277.
  • [Elhoseiny et al.2017] Elhoseiny, M.; Zhu, Y.; Zhang, H.; and Elgammal, A. 2017. Zero shot learning from noisy text description at part precision. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • [Fauconnier2001] Fauconnier, G. 2001. Conceptual blending and analogy. The analogical mind: Perspectives from cognitive science 255–286.
  • [Fei-Fei, Fergus, and Perona2006] Fei-Fei, L.; Fergus, R.; and Perona, P. 2006. One-shot learning of object categories. IEEE transactions on pattern analysis and machine intelligence 28(4):594–611.
  • [Fox and Clarke2009] Fox, J., and Clarke, S. 2009. Exploring approaches to dynamic adaptation. In Proceedings of the 3rd International DiscCoTec Workshop on Middleware-Application Interaction, 19–24. ACM.
  • [Furlanello et al.2017] Furlanello, T.; Lipton, Z. C.; Amazon, A.; Itti, L.; and Anandkumar, A. 2017. Born again neural networks. In NIPS Workshop on Meta Learning.
  • [Ganin et al.2016] Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; and Lempitsky, V. 2016. Domain-adversarial training of neural networks. The Journal of Machine Learning Research 17(1):2096–2030.
  • [Guzdial and Riedl2016] Guzdial, M., and Riedl, M. 2016. Learning to blend computer game levels. In Seventh International Conference on Computational Creativity.
  • [Hervás and Gervás2006] Hervás, R., and Gervás, P. 2006. Case-based reasoning for knowledge-intensive template selection during text generation. In European Conference on Case-Based Reasoning, 151–165. Springer.
  • [Khosla et al.2011] Khosla, A.; Jayadevaprakash, N.; Yao, B.; and Fei-Fei, L. 2011. Novel dataset for fine-grained image categorization. In First Workshop on Fine-Grained Visual Categorization, IEEE Conference on Computer Vision and Pattern Recognition.
  • [Konieczny and Pérez2011] Konieczny, S., and Pérez, R. P. 2011. Logic based merging. Journal of Philosophical Logic 40(2):239–270.
  • [Krause et al.2013] Krause, J.; Stark, M.; Deng, J.; and Fei-Fei, L. 2013. 3d object representations for fine-grained categorization. In 4th International IEEE Workshop on 3D Representation and Recognition (3dRR-13).
  • [Krizhevsky and Hinton2009] Krizhevsky, A., and Hinton, G. 2009. Learning multiple layers of features from tiny images. Citeseer.
  • [Kulis, Saenko, and Darrell2011] Kulis, B.; Saenko, K.; and Darrell, T. 2011. What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, 1785–1792. IEEE.
  • [Kuzborskij and Orabona2013] Kuzborskij, I., and Orabona, F. 2013. Stability and hypothesis transfer learning. In International Conference on Machine Learning, 942–950.
  • [Kuzborskij, Orabona, and Caputo2013] Kuzborskij, I.; Orabona, F.; and Caputo, B. 2013. From n to n+ 1: Multiclass transfer incremental learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3358–3365.
  • [Lampert, Nickisch, and Harmeling2009] Lampert, C. H.; Nickisch, H.; and Harmeling, S. 2009. Learning to detect unseen object classes by between-class attribute transfer. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, 951–958. IEEE.
  • [Levy and Markovitch2012] Levy, O., and Markovitch, S. 2012. Teaching machines to learn by metaphors. Technion-Israel Institute of Technology, Faculty of Computer Science.
  • [Li et al.2012] Li, B.; Zook, A.; Davis, N.; and Riedl, M. O. 2012. Goal-driven conceptual blending: A computational approach for creativity. In Proceedings of the 2012 International Conference on Computational Creativity, Dublin, Ireland, 3–16.
  • [Li et al.2017] Li, J.; Seltzer, M. L.; Wang, X.; Zhao, R.; and Gong, Y. 2017. Large-scale domain adaptation via teacher-student learning. arXiv preprint arXiv:1708.05466.
  • [Maji et al.2013] Maji, S.; Kannala, J.; Rahtu, E.; Blaschko, M.; and Vedaldi, A. 2013. Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151.
  • [Manzano, Ontanón, and Plaza2011] Manzano, S.; Ontanón, S.; and Plaza, E. 2011. Amalgam-based reuse for multiagent case-based reasoning. In International Conference on Case-Based Reasoning, 122–136. Springer.
  • [Mensink, Gavves, and Snoek2014] Mensink, T.; Gavves, E.; and Snoek, C. G. 2014. Costa: Co-occurrence statistics for zero-shot classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2441–2448.
  • [Murdock and Goel2001] Murdock, J., and Goel, A. 2001. Meta-case-based reasoning: Using functional models to adapt case-based agents. Case-based reasoning research and development 407–421.
  • [Norouzi et al.2013] Norouzi, M.; Mikolov, T.; Bengio, S.; Singer, Y.; Shlens, J.; Frome, A.; Corrado, G. S.; and Dean, J. 2013. Zero-shot learning by convex combination of semantic embeddings. arXiv preprint arXiv:1312.5650.
  • [Ontañón and Plaza2010] Ontañón, S., and Plaza, E. 2010. Amalgams: A formal approach for combining multiple case solutions. In Case-Based Reasoning. Research and Development. Springer. 257–271.
  • [Pereira, Norvig, and Halevy2009] Pereira, F.; Norvig, P.; and Halevy, A. 2009. The unreasonable effectiveness of data. IEEE Intelligent Systems 24(2):8–12.
  • [Radford, Metz, and Chintala2015] Radford, A.; Metz, L.; and Chintala, S. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
  • [Rebuffi et al.2017] Rebuffi, S.-A.; Kolesnikov, A.; Sperl, G.; and Lampert, C. H. 2017. icarl: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
  • [Salimans et al.2016] Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; and Chen, X. 2016. Improved techniques for training gans. In Advances in Neural Information Processing Systems, 2234–2242.
  • [Sizov, Öztürk, and Aamodt2015] Sizov, G.; Öztürk, P.; and Aamodt, A. 2015. Evidence-driven retrieval in textual cbr: bridging the gap between retrieval and reuse. In International Conference on Case-Based Reasoning, 351–365. Springer.
  • [Thagard and Stewart2011] Thagard, P., and Stewart, T. C. 2011. The aha! experience: Creativity through emergent binding in neural networks. Cognitive science 35(1):1–33.
  • [Wah et al.2011] Wah, C.; Branson, S.; Welinder, P.; Perona, P.; and Belongie, S. 2011. The caltech-ucsd birds-200-2011 dataset. Technical Report CNS-TR-2011-001, California Institute of Technology.
  • [Wang and Hebert2016] Wang, Y.-X., and Hebert, M. 2016. Learning to learn: Model regression networks for easy small sample learning. In European Conference on Computer Vision, 616–634. Springer.
  • [Wilke and Bergmann1998] Wilke, W., and Bergmann, R. 1998. Techniques and knowledge used for adaptation during case-based problem solving. In International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, 497–506. Springer.
  • [Wong and Gales2016] Wong, J. H., and Gales, M. J. 2016. Sequence student-teacher training of deep neural networks. International Speech Communication Association.
  • [Xian, Schiele, and Akata2017] Xian, Y.; Schiele, B.; and Akata, Z. 2017. Zero-shot learning-the good, the bad and the ugly. arXiv preprint arXiv:1703.04394.
  • [Yao et al.2017] Yao, Y.; Zhang, J.; Shen, F.; Hua, X.; Xu, J.; and Tang, Z. 2017. Exploiting web images for dataset construction: A domain robust approach. IEEE Transactions on Multimedia 19(8):1771–1784.
  • [Zhang, Sun, and Tang2008] Zhang, W.; Sun, J.; and Tang, X. 2008. Cat head detection-how to effectively exploit shape and texture features. In European Conference on Computer Vision, 802–816. Springer.