1 Introduction
Graph-based representations are powerful tools for analyzing real-world structured data that encapsulates pairwise relationships between its parts [Defferrard et al., 2016; Zambon et al., 2018]. One fundamental challenge arising in the analysis of graph-based data is to represent discrete graph structures as numeric features that preserve the topological information. Motivated by the recent successes of deep learning in computer vision problems, many researchers have devoted their efforts to generalizing Convolutional Neural Networks (CNNs) [Vinyals et al., 2015; Krizhevsky et al., 2017] to the graph domain. These neural networks on graphs are now widely known as Graph Convolutional Networks (GCNs) [Kipf and Welling, 2016], and have proven to be an effective way to extract highly meaningful statistical features for graph classification problems [Defferrard et al., 2016].
Generally speaking, most existing state-of-the-art graph convolutional networks are developed based on two strategies, i.e., a) the spectral strategy and b) the spatial strategy. Specifically, approaches based on the spectral strategy employ the property of the convolution operator in the graph Fourier domain, which is rooted in spectral graph theory [Bruna et al., 2013]
. By transforming the graph into the spectral domain through the eigenvectors of the Laplacian matrix, these methods perform the filtering operation by multiplying the graph signal by a series of filter coefficients [Bruna et al., 2013; Rippel et al., 2015; Henaff et al., 2015]. Unfortunately, most spectral-based approaches demand that the graph structures be of the same size, and cannot be applied to graphs with different sizes and Fourier bases. As a result, approaches based on the spectral strategy are usually applied to vertex classification tasks. Methods based on the spatial strategy, on the other hand, generalize the convolution operation to the spatial structure of a graph by propagating features between neighboring vertices [Vialatte et al., 2016; Duvenaud et al., 2015; Atwood and Towsley, 2016]. Since spatial-based approaches are not restricted to a fixed graph structure, these methods can be directly applied to graph classification problems. Unfortunately, most existing spatial-based methods perform relatively poorly on graph classification. The reason for this ineffectiveness is that these methods tend to directly sum up the extracted local-level vertex features from the convolution operation into global-level graph features through a SumPooling layer. As a result, the local topological information residing on the vertices of a graph may be discarded.
To address this shortcoming of graph convolutional networks associated with SumPooling, a number of methods focusing on local-level vertex information have been proposed. For instance, Niepert et al. [2016] developed a different graph convolutional network by reordering the vertices and converting each graph into a fixed-sized vertex grid structure, on which standard one-dimensional CNNs can be directly used. Zhang et al. [2018] developed a novel Deep Graph Convolutional Neural Network model to preserve more vertex information through global graph topologies. Specifically, they propose a new SortPooling layer that transforms the extracted vertex features of unordered vertices from the spatial graph convolution layers into a fixed-sized vertex grid structure.
Then a traditional convolution operation can be performed by sliding a fixed-sized filter over the vertex grid structures to further learn the topological information. The aforementioned methods focus more on local-level vertex features and outperform state-of-the-art graph convolutional network models on graph classification tasks. However, they tend to sort the vertex order based on the local structure descriptor of each individual graph. As a result, they cannot easily reflect accurate topological correspondence information between graph structures. Furthermore, these approaches also lead to significant information loss, which usually occurs when they form a fixed-sized vertex grid structure and vertices with lower rankings are discarded. In summary, developing effective methods to preserve the structural information residing in graphs remains a significant challenge.
To overcome the shortcomings of the aforementioned methods, we propose a new graph convolutional network model, namely the Aligned Vertex Convolutional Network, to learn multi-scale features from local-level vertices for graph classification. One key innovation of the proposed model is the identification of transitively aligned vertices between graphs. That is, given three vertices $v_p$, $v_q$ and $v_r$ from three sample graphs, if $v_p$ and $v_q$ are aligned and $v_q$ and $v_r$ are aligned, the proposed model guarantees that $v_p$ and $v_r$ are also aligned. More specifically, the new model utilizes this transitive alignment procedure to transform different graphs into fixed-sized aligned vertex grid structures with consistent vertex orders. Overall, the main contributions are threefold.
First, we propose a new vertex matching method to transitively align the vertices of graphs. We show that this matching procedure can establish reliable vertex correspondence information between graphs by gradually minimizing the inner-vertex-cluster sum of squares over the vertices of all graphs through a k-means clustering method.
Second, with the transitive alignment information over a family of graphs to hand, we show how graphs of arbitrary sizes can be mapped into fixed-sized aligned vertex grid structures. The resulting Aligned Vertex Convolutional Network model is defined by adopting fixed-sized one-dimensional convolution filters that slide over the entire set of ordered aligned vertices in the grid structure. We show that the proposed model can effectively learn the multi-scale characteristics residing in the local-level vertex features for graph classification. Moreover, since all the original vertex information is mapped into the aligned vertex grid structure through the transitive alignment, the grid structure not only precisely integrates the structural correspondence information but also minimises the loss of structural information residing on local-level vertices. As a result, the proposed model addresses the shortcomings of information loss and imprecise information representation arising in existing graph convolutional networks associated with SortPooling or SumPooling.
Third, we empirically evaluate the performance of the proposed model on graph classification problems. Experiments on benchmark graph datasets demonstrate the effectiveness of the proposed model.
2 Transitive Vertex Alignment Method
One main objective of this work is to convert graphs of arbitrary sizes into fixed-sized aligned vertex grid structures, so that a fixed-sized convolution filter can directly slide over the grid structures and learn local-level structural features from the vertices. To this end, we need to identify the correspondence information between graphs.
In this section, we introduce a new matching method to transitively align the vertices of graphs. We commence by designating a family of prototype representations that encapsulate the principal characteristics over all vectorial vertex representations in a set of graphs $\mathbf{G}$. Assume there are $n$ vertices over all graphs in $\mathbf{G}$, and that the associated $K$-dimensional vectorial representations of these vertices are $\mathbf{R} = \{R_1, \ldots, R_n\}$. We employ k-means [Witten et al., 2011] to locate $M$ centroids over all representations in $\mathbf{R}$. Specifically, given $M$ clusters $\Omega = \{c_1, \ldots, c_M\}$, the aim of k-means is to minimize the objective function
$$\arg\min_{\Omega} \sum_{j=1}^{M} \sum_{R_i \in c_j} \| R_i - \mu_j \|_2^2, \qquad (1)$$
where $\mu_j$ is the mean of the vertex representations belonging to the $j$-th cluster $c_j$. Since Eq. (1) minimizes the sum of the squared Euclidean distances between the vertex points and the centroid of each cluster $c_j$, the set of centroids $\{\mu_1, \ldots, \mu_M\}$ can be seen as a family of $K$-dimensional prototype representations that encapsulate representative characteristics over all graphs in $\mathbf{G}$.
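As a concrete illustration, the prototype computation above can be sketched as a plain Lloyd-style k-means over the stacked vertex representations. This is a minimal sketch, not the paper's implementation; the matrix `R`, the prototype count `M`, and the iteration budget are illustrative assumptions:

```python
import numpy as np

def compute_prototypes(R, M, n_iters=100, seed=0):
    """Locate M prototype representations (centroids) over the
    n x K matrix R of vertex representations via Lloyd's k-means,
    minimizing the objective of Eq.(1)."""
    rng = np.random.default_rng(seed)
    # initialize the centroids with M randomly chosen vertex representations
    mu = R[rng.choice(len(R), size=M, replace=False)].astype(float)
    for _ in range(n_iters):
        # assign each vertex representation to its nearest centroid
        dists = np.linalg.norm(R[:, None, :] - mu[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each centroid as the mean of its cluster
        for j in range(M):
            if np.any(labels == j):
                mu[j] = R[labels == j].mean(axis=0)
    return mu, labels
```

The returned `mu` plays the role of the prototype set $\{\mu_1, \ldots, \mu_M\}$ to which all graph vertices are later aligned.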
To establish the correspondence information between the graph vertices over all graphs in $\mathbf{G}$, we align the vectorial vertex representations of each graph to the prototype representations $\{\mu_1, \ldots, \mu_M\}$. Our alignment is similar to that introduced in [Bai et al., 2015] for point matching in a pattern space. Specifically, for each sample graph $G_p \in \mathbf{G}$ with the associated $K$-dimensional vectorial representation $R^p_i$ of each vertex $v_i$, we compute a vertex-level affinity matrix in terms of the Euclidean distances between the two sets of points as
$$A^p(i,j) = \| R^p_i - \mu_j \|_2, \qquad (2)$$
where $A^p$ is a $|V_p| \times M$ matrix, and each element $A^p(i,j)$ represents the distance between the vectorial representation of vertex $v_i$ and the $j$-th prototype representation $\mu_j$. If $A^p(i,j)$ is the smallest element in row $i$, the vertex $v_i$ is aligned to the $j$-th prototype representation. Note that for each graph there may be two or more vertices aligned to the same prototype representation. We record the correspondence information using the vertex-level correspondence matrix
$$C^p(i,j) = \begin{cases} 1 & \text{if } A^p(i,j) \text{ is the smallest element in row } i, \\ 0 & \text{otherwise.} \end{cases} \qquad (3)$$
For a pair of graphs $G_p$ and $G_q$, if their vertices $v_i$ and $v_j$ are aligned to the same prototype representation, we say that $v_i$ and $v_j$ possess similar characteristics and are also aligned. Thus, we can identify the transitive alignment information between the vertices of all graphs in $\mathbf{G}$ by aligning their vertices to the same set of prototype representations. The alignment process is equivalent to assigning the vectorial representation of each vertex to the mean $\mu_j$ of its cluster $c_j$. Thus, the proposed alignment procedure can be seen as an optimization process that gradually minimizes the inner-vertex-cluster sum of squares over the vertices of all graphs through k-means, and can establish reliable vertex correspondence information over all graphs.
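The per-graph alignment step above reduces to two small matrix computations. The following sketch assumes `Rp` holds the vertex representations of one graph (one row per vertex) and `mu` the prototype representations; the function name and shapes are illustrative:

```python
import numpy as np

def correspondence_matrix(Rp, mu):
    """Align the vertices of one graph (rows of Rp, shape |V| x K)
    to the prototype representations mu (shape M x K)."""
    # Affinity matrix of pairwise Euclidean distances (Eq.(2) style)
    A = np.linalg.norm(Rp[:, None, :] - mu[None, :, :], axis=2)
    # One-hot indicator of each vertex's nearest prototype (Eq.(3) style)
    C = np.zeros_like(A)
    C[np.arange(len(Rp)), A.argmin(axis=1)] = 1.0
    return A, C
```

Because every graph is aligned against the same `mu`, two vertices from different graphs whose rows of `C` select the same column are transitively aligned.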
3 Learning Vertex Convolutional Networks
In this section, we develop a new vertex convolutional network model for graph classification. Our idea is to employ the transitive alignment information over a family of graphs to convert arbitrarily sized graphs into fixed-sized aligned vertex grid structures. We then define a vertex convolution operation by adopting a set of fixed-sized one-dimensional convolution filters on the grid structure. With the new vertex convolution operation to hand, the proposed model can transform the original aligned vertex grid structure into a new grid structure with a reduced number of packed aligned vertices, i.e., the multi-scale vertex features learned through the convolution operation are packed into the new grid structure. Finally, we employ a Softmax layer to read the extracted vertex features and predict the graph class.
3.1 Aligned Vertex Grid Structures of Graphs
In this subsection, we show how to convert graphs of different sizes into fixed-sized aligned vertex grid structures. For each sample graph $G_p$ from the graph set $\mathbf{G}$ defined earlier, assume each of its vertices is represented as a $c$-dimensional feature vector. Then the features of all the $|V_p|$ vertices can be encoded using a $|V_p| \times c$ feature matrix $X^p$. If the graphs in $\mathbf{G}$ are vertex-attributed, $X^p$ can be the one-hot encoding matrix of the vertex labels. For unattributed graphs, we propose to use the vertex degree as the vertex label. Based on the transitive alignment method defined in Section 2, we commence by identifying the family of $K$-dimensional prototype representations $\{\mu_1, \ldots, \mu_M\}$ of $\mathbf{G}$. For each graph $G_p$, we compute the vertex-level correspondence matrix $C^p$, where the rows and columns of $C^p$ are indexed by the vertices of $G_p$ and the prototype representations, respectively. With $C^p$ to hand, we compute the aligned vertex feature matrix for $G_p$ as
$$\hat{X}^p = (C^p)^\top X^p, \qquad (4)$$
where $\hat{X}^p$ is an $M \times c$ matrix and each row of $\hat{X}^p$ represents the feature of a corresponding aligned vertex. Since $\hat{X}^p$ is computed by mapping the original feature information of each vertex to that of the new aligned vertices indexed by the corresponding prototypes, it encapsulates all the original vertex feature information of $G_p$.
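The mapping of Eq. (4) is a single matrix product. The toy example below (hypothetical values, 3 vertices and 2 prototypes) illustrates how the features of vertices aligned to the same prototype are aggregated into one row of the aligned feature matrix:

```python
import numpy as np

def aligned_vertex_features(Xp, Cp):
    """Eq.(4): map the |V| x c vertex feature matrix Xp onto the M
    aligned vertices, given the |V| x M correspondence matrix Cp."""
    return Cp.T @ Xp  # M x c aligned vertex feature matrix

# Toy example: vertices 0 and 2 are aligned to prototype 0, vertex 1 to prototype 1
Cp = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 0.0]])
Xp = np.array([[1.0, 2.0],
               [3.0, 4.0],
               [5.0, 6.0]])
X_hat = aligned_vertex_features(Xp, Cp)  # rows indexed by prototypes
```

Note that no vertex is dropped: every row of `Xp` contributes to exactly one row of `X_hat`, which is the property the text relies on to avoid information loss.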
To construct the fixed-sized aligned vertex grid structure for each graph $G_p$, we need to establish a consistent vertex order for all graphs in $\mathbf{G}$. Since the vertices are all aligned to the same prototype representations, the vertex order can be determined by reordering the prototype representations. To this end, we construct a prototype graph that captures the pairwise similarity between the prototype representations, and then reorder the prototype representations based on their degree in this graph. This process is equivalent to sorting the prototypes in order of average similarity to the remaining ones. Specifically, for the $K$-dimensional prototype representations $\{\mu_1, \ldots, \mu_M\}$, we compute the prototype graph as $G_R = (V_R, E_R)$, where each vertex $v_j \in V_R$ represents the prototype representation $\mu_j$ and each edge $(v_i, v_j) \in E_R$ is weighted by the similarity between the pair of prototype representations $\mu_i$ and $\mu_j$. The similarity between two vertices of $G_R$ is computed as
$$s(\mu_i, \mu_j) = \exp\!\left(-\| \mu_i - \mu_j \|_2\right). \qquad (5)$$
The degree of each prototype representation $\mu_j$ is $D(\mu_j) = \sum_{i=1}^{M} s(\mu_i, \mu_j)$. We sort the $K$-dimensional prototype representations according to their degrees $D(\mu_j)$, and then rearrange the columns of each correspondence matrix $C^p$ (and hence the rows of $\hat{X}^p$) accordingly.
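The prototype reordering can be sketched in a few lines. This assumes the exponential similarity reconstructed above for Eq. (5); the function name and the descending sort order are illustrative choices:

```python
import numpy as np

def prototype_order(mu):
    """Order the M prototypes by their degree in the fully connected
    prototype graph whose edge weights are exp(-||mu_i - mu_j||)."""
    d = np.linalg.norm(mu[:, None, :] - mu[None, :, :], axis=2)
    S = np.exp(-d)              # pairwise similarity matrix s(mu_i, mu_j)
    degree = S.sum(axis=0)      # D(mu_j) = sum_i s(mu_i, mu_j)
    return np.argsort(-degree)  # prototype indices, most central first
```

Applying the returned index array to the columns of every correspondence matrix gives all graphs the same consistent vertex order.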
Finally, note that, to construct reliable grid structures for graphs, we employ the depth-based (DB) representations as the vectorial vertex representations used to compute the required vertex-level correspondence matrix $C^p$. The DB representation of each vertex is defined by measuring the entropies on a family of $h$-layer expansion subgraphs rooted at that vertex [Bai and Hancock, 2014], where the parameter $h$ varies from $1$ to $L$. It has been shown that such an $L$-dimensional DB representation encapsulates a rich entropy content flow from each local vertex to the global graph structure, as a function of depth. The process of computing the correspondence matrix associated with the DB representations is shown in the appendix file. When we vary the largest layer of the expansion subgraphs from $1$ to $L$ (i.e., $h \leq L$), we compute the final aligned vertex grid structure for each graph $G_p$ as
$$\bar{X}^p = \frac{1}{L} \sum_{h=1}^{L} \hat{X}^p_h, \qquad (6)$$
where $\hat{X}^p_h$ denotes the aligned vertex feature matrix computed with the layer-$h$ DB representations, and $\bar{X}^p$ is also an $M \times c$ matrix, the same size as $\hat{X}^p$. Clearly, Eq. (6) transforms the original graphs of arbitrary sizes into new aligned vertex grid structures with the same number of vertices. Moreover, note that the aligned vertex grid structure also preserves the original vertex feature information through the $M \times c$ aligned vertex feature matrices.
3.2 The Aligned Vertex Convolutional Network
In this subsection, we develop a new Aligned Vertex Convolutional Network model that learns local-level vertex features for graph classification. The model is defined by adopting a set of fixed-sized one-dimensional convolution filters on the aligned vertex grid structures and sliding each filter over the ordered aligned vertices to learn features, in a manner analogous to the standard convolution operation. Specifically, for each graph $G_p$ and its associated aligned vertex grid structure $\bar{X}^p$ (i.e., $M$ aligned vertices each with $c$ feature channels), we denote the element of $\bar{X}^p$ in the $e$-th row and $h$-th column as $\bar{X}^p_{e,h}$, i.e., the $h$-th feature channel of the $e$-th aligned vertex. We pass $\bar{X}^p$ to the convolution layer. Assume the size of the receptive field is $k$, i.e., the size of the one-dimensional convolution filter is $k$; the vertex convolution operation associated with stride $1$ takes the form
$$Z_{e,h} = f\!\left( \sum_{s=1}^{c} \sum_{j=1}^{k} W^{h}_{j,s}\, \bar{X}^p_{e+j-1,\,s} + b_h \right), \qquad (7)$$
where $Z_{e,h}$ is the element in the $e$-th row and $h$-th column of the new grid structure after the convolution operation, the parameter $e$ satisfies $1 \leq e \leq M - k + 1$, $W^{h}_{j,s}$ is the $j$-th element of the $h$-th convolution filter that maps the $s$-th feature channel of $\bar{X}^p$ to the $h$-th feature channel of $Z$, $b_h$ is the bias of the $h$-th convolution filter, and $f(\cdot)$ is the activation function.
An example of the vertex convolution operation defined by Eq. (7) is shown in Figure 1. The vertex convolution operation consists of two computational steps. In the first step, the convolution filter is applied to map the $e$-th aligned vertex as well as its neighboring vertices into a new feature value, associated with all the feature channels of these vertices. Figure 1.(1) exhibits this process. Here, assume the vertex index is $e = 2$ and the convolution filter size is $k = 3$, and focus on the 2nd aligned vertex of the grid structure. The convolution filter represented by the red lines first maps each feature channel of the 2nd aligned vertex as well as its neighboring vertices (the 1st and the 3rd) into a new single value, and then sums up the values computed through all the channels as one feature channel of the output. Moreover, we need to slide the convolution filter over all the aligned vertices, and this requires three filter positions, represented by the green, red and blue lines respectively. The weights of the three filters are shared, i.e., they are in fact the same filter. In the second step, the ReLU activation function associated with the bias $b_h$ is applied, and the final result $Z_{e,h}$ is output.
To further extract the multi-scale features of a graph from its aligned vertex grid structure, we stack multiple vertex convolution layers defined as follows
$$Z^{t}_{e,h} = f\!\left( \sum_{s=1}^{c_{t-1}} \sum_{j=1}^{k} W^{t,h}_{j,s}\, Z^{t-1}_{e+j-1,\,s} + b^{t}_{h} \right), \qquad (8)$$
where $Z^{0}$ is the input aligned vertex grid structure $\bar{X}^p$, and the corresponding notation is listed in Table 1. After a number of vertex convolution operations, we employ a Softmax layer that reads the features computed by the vertex convolution layers and predicts the graph class for graph classification.
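The stacked layers above can be sketched as a plain NumPy loop. This is a minimal sketch of the one-dimensional vertex convolution assuming stride 1 and a `k x c x H` filter bank; the function name and shapes are illustrative, not the paper's implementation:

```python
import numpy as np

def vertex_conv(X, W, b, relu=True):
    """One vertex convolution layer in the spirit of Eqs.(7)-(8):
    slide a length-k one-dimensional filter over the ordered vertices.
    X : M x c grid (M aligned vertices, c feature channels)
    W : k x c x H filter bank (H output channels), b : length-H bias
    """
    M, c = X.shape
    k, _, H = W.shape
    Z = np.empty((M - k + 1, H))
    for e in range(M - k + 1):
        patch = X[e:e + k, :]  # the e-th vertex and its k-1 neighbors
        # sum over the k positions and all c channels for every filter
        Z[e] = np.tensordot(patch, W, axes=([0, 1], [0, 1])) + b
    return np.maximum(Z, 0.0) if relu else Z
```

Each call shrinks the grid from `M` to `M - k + 1` aligned vertices, which matches the text's description of packing vertex features into a smaller grid; stacking calls yields the multi-scale features of Eq. (8).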
Symbol  Definition

$v_e$  the $e$-th vertex
$Z^{t}_{e,h}$  the $h$-th feature channel of vertex $v_e$ in layer $t$
$W^{t,h}_{s}$  the filter that maps the $s$-th feature channel in layer $t-1$ to the $h$-th feature channel in layer $t$
$W^{t,h}_{j,s}$  the $j$-th element of the filter that maps the $s$-th feature channel in layer $t-1$ to the $h$-th feature channel in layer $t$
$b^{t}_{h}$  the bias of the $h$-th filter in layer $t$
$f(\cdot)$  the activation function, e.g., the ReLU function
$c_{t-1}$  the number of filters in layer $t-1$
Datasets  MUTAG  PROTEINS  D&D  GatorBait  Reeb  IMDB-B  IMDB-M  RED-B
Max # vertices  
Mean # vertices  
# graphs  
# vertex labels  
# classes  
Description  Bioinformatics  Bioinformatics  Bioinformatics  Vision  Vision  Social  Social  Social
Discussions: Compared to existing state-of-the-art graph convolutional networks, the proposed Aligned Vertex Convolutional Network (AVCN) model has a number of advantages.
First, unlike the Neural Graph Fingerprint Network (NGFN) model [Duvenaud et al., 2015] and the Diffusion Convolution Neural Network (DCNN) model [Atwood and Towsley, 2016], which both employ a SumPooling layer to directly sum up the extracted local-level vertex features from the convolution operation into global-level graph features, the proposed AVCN model focuses on learning local structural features through the proposed aligned vertex grid structure. Specifically, Figure 1 indicates that the vertex convolution operation of the proposed AVCN model converts the original aligned vertex grid structure into a new grid structure by packing the aligned vertex features from the original grid into the new one. Thus, the new grid structure can be seen as an extracted aligned vertex grid structure with a reduced number of aligned vertices. As a result, the proposed AVCN model can gradually extract multi-scale local-level vertex features through a number of stacked vertex convolution layers, and encapsulates more significant local structural information than the NGFN and DCNN models associated with SumPooling.
Second, similar to the proposed AVCN model, both the PATCHY-SAN based Graph Convolution Neural Network (PSGCNN) model [Niepert et al., 2016] and the Deep Graph Convolution Neural Network (DGCNN) model [Zhang et al., 2018] need to rearrange the vertex order of each graph structure and transform each graph into a fixed-sized vertex grid structure. Unfortunately, both the PSGCNN and DGCNN models sort the vertices of each graph based on a local structural descriptor, ignoring consistent vertex correspondence information between different graphs. By contrast, the proposed AVCN model employs a transitive vertex alignment procedure to transform each graph into an aligned fixed-sized vertex grid structure. As a result, only the proposed AVCN model can integrate precise structural correspondence information over all graphs under investigation.
Third, when the PSGCNN and DGCNN models form fixed-sized vertex grid structures, some vertices with lower rankings are discarded, which in turn leads to significant information loss. By contrast, the aligned vertex grid structures required by the proposed AVCN model encapsulate all the original vertex features of the original graphs. As a result, the proposed AVCN model overcomes the shortcoming of information loss arising in the PSGCNN and DGCNN models.
4 Experiments
In this section, we compare the performance of the proposed AVCN model with both state-of-the-art graph kernels and deep learning methods on graph classification problems over eight standard graph datasets. These datasets are abstracted from bioinformatics, computer vision and social networks. A selection of statistics of these datasets is shown in Table 2.
Experimental Setup: We evaluate the performance of the proposed AVCN model on graph classification problems against a) six alternative state-of-the-art graph kernels and b) six alternative state-of-the-art deep learning methods for graphs. Specifically, the graph kernels include 1) the Jensen-Tsallis q-difference kernel (JTQK) [Bai et al., 2014], 2) the Weisfeiler-Lehman subtree kernel (WLSK) [Shervashidze et al., 2010], 3) the shortest path graph kernel (SPGK) [Borgwardt and Kriegel, 2005], 4) the shortest path kernel based on core variants (CORE SP) [Nikolentzos et al., 2018], 5) the random walk graph kernel (RWGK) [Kashima et al., 2003], and 6) the graphlet count kernel (GK) [Shervashidze et al., 2009]. The deep learning methods include 1) the deep graph convolutional neural network (DGCNN) [Zhang et al., 2018], 2) the PATCHY-SAN based convolutional neural network for graphs (PSGCNN) [Niepert et al., 2016], 3) the diffusion convolutional neural network (DCNN) [Atwood and Towsley, 2016], 4) the deep graphlet kernel (DGK) [Yanardag and Vishwanathan, 2015], 5) the graph capsule convolutional neural network (GCCNN) [Verma and Zhang, 2018], and 6) the feature-driven anonymous walk embeddings (AWE) [Ivanov and Burnaev, 2018].
Datasets  MUTAG  PROTEINS  D&D  GatorBait  Reeb  IMDB-B  IMDB-M  RED-B

AVCN  
JTQK  
WLSK  
SPGK  
CORE SP  
GK  
RWGK 
Datasets  MUTAG  PROTEINS  D&D  IMDB-B  IMDB-M  RED-B

AVCN  
DGCNN  
PSGCNN  
DCNN  
GCCNN  
DGK  
AWE 
For the experiments, the proposed AVCN model uses the same network structure on all graph datasets. Specifically, we fix the number of channels of each vertex convolution operation and the number $M$ of prototype representations, so that the vertex numbers of the aligned vertex grid structures for the graphs in any dataset are all equal to $M$. To extract different hierarchical multi-scale local vertex features, we propose to input the aligned vertex grid structure of each graph to a family of parallel stacked vertex convolution layers with different convolution filter sizes; each vertex convolution layer consists of parallel vertex convolution filters, and is followed by a fully-connected layer of hidden units. An example of the architecture of the proposed AVCN model is shown in Figure 2. We set the stride of each filter to 1. With the extracted patterns learned from the parallel stacked vertex convolution layers to hand, we concatenate them and add a new fully-connected layer followed by a Softmax layer to learn the graph class. We apply dropout to the fully-connected layer, and employ the rectified linear unit (ReLU) as the activation function for the convolution layers. The only hyperparameters that need to be optimized are the learning rate, the number of epochs, and the batch size for the mini-batch gradient descent algorithm. To optimize the AVCN model, we utilize Stochastic Gradient Descent with the Adam updating rules. Finally, note that our AVCN model needs to construct the prototype representations to identify the transitive vertex alignment information over all graphs. In this evaluation, we propose to compute the prototype representations from both the training and testing graphs. Thus, our model is an instance of transductive learning [Gammerman et al., 1998], where all graphs are used to compute the prototype representations but the class labels of the testing graphs are not used during the training process. For our model, we perform 10-fold cross-validation to compute the classification accuracies, with nine folds for training and one fold for testing. For each dataset, we repeat the experiment 10 times and report the average classification accuracies and standard errors in Table 3.
For the alternative kernel methods, we set the maximum subtree height parameter for both the WLSK and JTQK kernels based on the previous empirical studies in the original papers. For each alternative graph kernel, we perform 10-fold cross-validation associated with the LIBSVM implementation of C-Support Vector Machines (C-SVM) to compute the classification accuracies. We repeat the experiment 10 times for each kernel and dataset, and report the average classification accuracies and standard errors in Table 3. Note that for some kernels we directly report the best results from the corresponding original papers, since the evaluation of these kernels followed the same setting as ours. On the other hand, for the alternative deep learning methods, we report the best results for the PSGCNN and DGK models from their original papers; these methods were evaluated under the same setting as the proposed AVCN model. For the DCNN model, we report the best results from the work of Zhang et al. [Zhang et al., 2018], following the same setting as ours. For the AWE model, we report the classification accuracies of the feature-driven AWE, since the authors have stated that this variant of the AWE model can achieve competitive performance on labeled datasets. Finally, note that although the PSGCNN model can leverage additional edge features, most of the graph datasets and the alternative methods do not leverage edge features; thus, we do not report the results associated with edge features in this evaluation. The classification accuracies and standard errors for each deep learning method are shown in Table 4. Since the alternative deep learning methods have not been evaluated on the Reeb and GatorBait datasets abstracted from computer vision by any author, we do not include the accuracies of these methods on those datasets.
Experimental Results and Discussions: Table 3 and Table 4
indicate that the proposed AVCN model outperforms the alternative state-of-the-art methods, both the graph kernels and the deep learning methods for graphs. Specifically, among the alternative graph kernels, only the accuracy of the SPGK kernel on the IMDB-M dataset is slightly higher than that of the proposed AVCN model. Among the alternative deep learning methods, only the accuracies of the GCCNN model on the PROTEINS dataset and the AWE model on the IMDB-M dataset are slightly higher than those of the proposed AVCN model. The reasons for this effectiveness are threefold. First, the alternative graph kernels are typical examples of R-convolution kernels and are based on measuring similarity between any pair of substructures, ignoring the correspondence information between the substructures. By contrast, the proposed model, through its aligned vertex grid structure, incorporates the transitive alignment information between graphs, and thus better reflects graph characteristics. Furthermore, the C-SVM classifier associated with graph kernels can only be seen as a shallow learning framework [Zhang et al., 2015]. By contrast, the proposed model provides an end-to-end deep learning architecture, and thus better learns graph characteristics. Second, similar to the alternative graph kernels, none of the alternative deep learning methods can integrate the correspondence information between graphs into the learning architecture. In particular, the PSGCNN and DGCNN models need to reorder the vertices, and some vertices may be discarded, leading to information loss. By contrast, the associated aligned vertex grid structures preserve all the information of the original graphs. Third, unlike the proposed model, the DCNN model needs to sum up the extracted local-level vertex features into global-level graph features. By contrast, the proposed model can learn richer multi-scale local-level vertex features. The experiments demonstrate the effectiveness of the proposed model.
5 Conclusion
In this paper, we have developed a new Aligned Vertex Convolutional Network model for graph classification. The proposed model not only integrates precise structural correspondence information between graphs but also minimises the loss of structural information residing on local-level vertices. Experiments demonstrate the effectiveness of the proposed vertex convolutional network model.
References
 [Atwood and Towsley2016] James Atwood and Don Towsley. Diffusion-convolutional neural networks. In Proceedings of NIPS, pages 1993–2001, 2016.
 [Bai and Hancock2014] Lu Bai and Edwin R. Hancock. Depth-based complexity traces of graphs. Pattern Recognition, 47(3):1172–1186, 2014.
 [Bai et al.2014] Lu Bai, Luca Rossi, Horst Bunke, and Edwin R. Hancock. Attributed graph kernels using the Jensen-Tsallis q-differences. In Proceedings of ECML-PKDD, pages 99–114, 2014.
 [Bai et al.2015] Lu Bai, Luca Rossi, Zhihong Zhang, and Edwin R. Hancock. An aligned subtree kernel for weighted graphs. In Proceedings of ICML, pages 30–39, 2015.
 [Borgwardt and Kriegel2005] Karsten M. Borgwardt and Hans-Peter Kriegel. Shortest-path kernels on graphs. In Proceedings of the IEEE International Conference on Data Mining, pages 74–81, 2005.
 [Bruna et al.2013] Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. Spectral networks and locally connected networks on graphs. CoRR, abs/1312.6203, 2013.
 [Defferrard et al.2016] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of NIPS, pages 3837–3845, 2016.
 [Duvenaud et al.2015] David K. Duvenaud, Dougal Maclaurin, Jorge AguileraIparraguirre, Rafael GómezBombarelli, Timothy Hirzel, Alán AspuruGuzik, and Ryan P. Adams. Convolutional networks on graphs for learning molecular fingerprints. In Proceedings of NIPS, pages 2224–2232, 2015.
 [Gammerman et al.1998] Alexander Gammerman, Katy S. Azoury, and Vladimir Vapnik. Learning by transduction. In Proceedings of UAI, pages 148–155, 1998.
 [Henaff et al.2015] Mikael Henaff, Joan Bruna, and Yann LeCun. Deep convolutional networks on graph-structured data. CoRR, abs/1506.05163, 2015.
 [Ivanov and Burnaev2018] Sergey Ivanov and Evgeny Burnaev. Anonymous walk embeddings. In Proceedings of ICML, pages 2191–2200, 2018.
 [Kashima et al.2003] Hisashi Kashima, Koji Tsuda, and Akihiro Inokuchi. Marginalized kernels between labeled graphs. In Proceedings of ICML, pages 321–328, 2003.
 [Kipf and Welling2016] Thomas N. Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. CoRR, abs/1609.02907, 2016.
 [Krizhevsky et al.2017] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. Commun. ACM, 60(6):84–90, 2017.
 [Niepert et al.2016] Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. Learning convolutional neural networks for graphs. In Proceedings of ICML, pages 2014–2023, 2016.
 [Nikolentzos et al.2018] Giannis Nikolentzos, Polykarpos Meladianos, Stratis Limnios, and Michalis Vazirgiannis. A degeneracy framework for graph similarity. In Proceedings of IJCAI, pages 2595–2601, 2018.
 [Rippel et al.2015] Oren Rippel, Jasper Snoek, and Ryan P. Adams. Spectral representations for convolutional neural networks. In Proceedings of NIPS, pages 2449–2457, 2015.

 [Shervashidze et al.2009] N. Shervashidze, S. V. N. Vishwanathan, K. Mehlhorn, T. Petri, and K. M. Borgwardt. Efficient graphlet kernels for large graph comparison. Journal of Machine Learning Research, 5:488–495, 2009.
 [Shervashidze et al.2010] Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research, 1:1–48, 2010.
 [Verma and Zhang2018] Saurabh Verma and ZhiLi Zhang. Graph capsule convolutional neural networks. CoRR, abs/1805.08090, 2018.
 [Vialatte et al.2016] Jean-Charles Vialatte, Vincent Gripon, and Grégoire Mercier. Generalizing the convolution operator to extend CNNs to irregular domains. CoRR, abs/1606.01166, 2016.
 [Vinyals et al.2015] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show and tell: A neural image caption generator. In Proceedings of CVPR, pages 3156–3164, 2015.
 [Witten et al.2011] Ian H. Witten, Eibe Frank, and Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2011.
 [Yanardag and Vishwanathan2015] Pinar Yanardag and S. V. N. Vishwanathan. Deep graph kernels. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia, August 1013, 2015, pages 1365–1374, 2015.

[Zambon et al.2018]
Daniele Zambon, Cesare Alippi, and Lorenzo Livi.
Concept drift and anomaly detection in graph streams.
IEEE Trans. Neural Netw. Learning Syst., 29(11):5592–5605, 2018.  [Zhang et al.2015] ShiXiong Zhang, Chaojun Liu, Kaisheng Yao, and Yifan Gong. Deep neural support vector machines for speech recognition. In Proceedings of ICASSP, pages 4275–4279, 2015.
 [Zhang et al.2018] Muhan Zhang, Zhicheng Cui, Marion Neumann, and Yixin Chen. An endtoend deep learning architecture for graph classification. In Proceedings of AAAI, 2018.