1 Motivation and significance
Convolutional neural networks (CNNs) are a powerful and versatile tool in big data analysis and computer vision [nature_DL]. Their application has been widely promoted across research fields by the availability of open-source deep learning frameworks (DLFs) like TensorFlow, Caffe, PyTorch or the Microsoft Cognitive Toolkit. In ground-based astroparticle physics experiments, where large amounts of image-like data need to be analysed, the application of CNNs has likewise come into focus.
This data is often hexagonally sampled, which poses an initial obstacle for the application of CNNs: DLFs cannot process hexagonally sampled data out of the box. Solutions to this problem have been presented in several applicability studies [Feng2016; Holch2017; Huennefeld2017; Erdmann2018; Mangano2018; Shilon2018]. Most of these solutions transform the hexagonally sampled data to an approximate representation on a rectangular grid via preprocessing such as rebinning, interpolation, oversampling or axis shearing. HexagDLy, on the other hand, provides a native solution for processing hexagonally sampled data. It relies on a specific addressing scheme for hexagonally sampled data that allows convolution and pooling operations on hexagonal grids to be constructed from methods provided by PyTorch (https://pytorch.org/) [Paszke2017]. HexagDLy thereby aims to exploit the benefits of directly processing hexagonally sampled data, most notably reduced computing resources [Mersereau1979], more efficient image processing operators [Staunton1989] and higher angular resolution [Staunton1990]. In the context of CNNs, Hoogeboom et al. have already demonstrated the advantages of applying hexagonal convolutions, such as improved accuracies due to the reduced anisotropy of hexagonal filters [Hoogeboom2018]. With HexagDLy, hexagonal convolutions are available as open-source software with a focus on user-friendliness. It facilitates access to CNNs for any kind of hexagonally sampled data, which, in addition to ground-based astroparticle physics, can be found in other research fields like ecology [Birch2007] or numerical climate modelling [Sahr2011; Satoh2014].
In the following, Sec. 2 describes the software, including its capabilities and the requirements on the input format. The application of HexagDLy is illustrated with an example in Sec. 3, followed by a comparative study of hexagonal and square convolution kernels in Sec. 4. Potential benefits of using hexagonal convolutions in ground-based astroparticle physics are outlined in Sec. 5.
2 Software description
HexagDLy provides convolution operations on hexagonal grids built on PyTorch routines. Given the required input format for these routines, an addressing scheme has to be chosen to map the hexagonally sampled data to Cartesian tensors. The convolution and pooling operations are then adapted accordingly to reflect the hexagonal structure of the original data, which is also conserved in the output. This is done by constructing custom hexagonal kernels that are applied in combination with a strict padding and striding scheme. These main ideas behind HexagDLy are outlined below. Please see Table 1 for the repository and software dependencies.
2.1 Input Format
In order to map hexagonally sampled data to Cartesian tensors, different addressing schemes can be applied (see, for example, [Hoogeboom2018]). HexagDLy uses the scheme that allows for the most efficient data storage. As a hexagonal grid can be interpreted as two overlaid rectangular grids, the data points can be combined into a single square-grid array by aligning the two rectangular parts. The procedure is illustrated in Figure 1: the hexagonal array in Cartesian coordinates is first rotated to achieve a vertical alignment of neighbouring elements (called pixels hereafter), which allows the data to be separated into columns. The pixels are then aligned horizontally by shifting every second column upwards by half the distance between neighbouring pixels, resulting in a square-grid array with rows and columns. Counting rows from top to bottom and columns from left to right yields the indices for each element in the input tensor, each of which corresponds to a certain pixel in the hexagonal array. Tensor elements that do not have a corresponding counterpart in the hexagonal array have to be filled with an arbitrary value.
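The addressing scheme can be sketched in a few lines of plain Python. The coordinate conventions, the fill value and the function name below are illustrative assumptions for this example, not part of HexagDLy's API: pixel centres are given in the rotated frame of Fig. 1, with columns spaced by sqrt(3)/2 times the pixel pitch and every second column shifted upwards by half a pitch.

```python
# Illustrative sketch of the addressing scheme of Sec. 2.1 in plain Python.
import math

SPACING = 1.0                            # centre-to-centre pixel distance
COL_PITCH = math.sqrt(3) / 2 * SPACING   # horizontal distance between columns
FILL = 0.0                               # value for cells without a counterpart

def hex_to_tensor(points):
    """Map (x, y, value) pixel centres to a dense row-major 2D list.

    Assumes the grid is already rotated as in Fig. 1, so that columns are
    vertical, y grows downwards, and every second column sits half a pitch
    lower before it is shifted upwards to align the rows.
    """
    cells = {}
    for x, y, v in points:
        col = round(x / COL_PITCH)
        row = round(y / SPACING - 0.5 * (col % 2))  # undo the half-pixel shift
        cells[(row, col)] = v
    n_rows = max(r for r, _ in cells) + 1
    n_cols = max(c for _, c in cells) + 1
    return [[cells.get((r, c), FILL) for c in range(n_cols)]
            for r in range(n_rows)]
```

For a small patch of pixels this yields a dense array in which each tensor element indexes one hexagonal pixel; positions without a counterpart receive the fill value.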
2.2 Hexagonal Kernels
The implemented convolution operations use kernels on the hexagonal grid that have a six-fold rotational symmetry, i.e. kernels of hexagonal shape. The geometry of a kernel is therefore described by its size alone, a single integer corresponding to the number of layers of neighbouring elements around its central element. Internally, HexagDLy constructs these hexagonal kernels from rectangular sub-kernels as illustrated in Fig. 2. The illustrated kernel of size 2 consists of three sub-kernels, each representing a set of equal-length columns of the hexagonal kernel. The spatial relations between these columns are accounted for via defined horizontal dilations.
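The column structure behind this decomposition can be written down explicitly. The helper names below are hypothetical and HexagDLy's internal bookkeeping may differ; the sketch only reproduces the geometry described above:

```python
# A hexagonal kernel of size n covers 3*n*(n+1) + 1 elements, arranged in
# 2n+1 columns whose lengths grow from n+1 at the edges to 2n+1 at the
# centre. Columns of equal length form one rectangular sub-kernel.
def hex_kernel_columns(size):
    """Column lengths of a hexagonal kernel of the given size."""
    return [2 * size + 1 - abs(i - size) for i in range(2 * size + 1)]

def hex_kernel_elements(size):
    """Total number of elements covered by the kernel."""
    return 3 * size * (size + 1) + 1

def sub_kernels(size):
    """Group column indices by column length: one group per sub-kernel."""
    groups = {}
    for col, length in enumerate(hex_kernel_columns(size)):
        groups.setdefault(length, []).append(col)
    return groups
```

For size 2 this reproduces Fig. 2: column lengths [3, 4, 5, 4, 3], i.e. 19 elements split over three sub-kernels (the centre column of length 5, two columns of length 4 and two of length 3).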
2.3 Convolution Operations
Since a hexagonal kernel is constructed out of multiple rectangular sub-kernels, a single hexagonal convolution operation is realised as a combination of multiple convolutions of the input tensor with these sub-kernels. As described in Sec. 2.1, columns of the hexagonal array are shifted to match the tensor format required by PyTorch. The individual sub-convolutions therefore have to be adapted to account for this shift. This is achieved by a dedicated scheme for the padding and slicing of the input tensor, in which the number of rows and columns that are padded or sliced for each sub-convolution depends on the size of the input tensor as well as on the size of the hexagonal kernel and the applied stride. To conserve the hexagonal structure of the data, only symmetric strides in equally sized steps along the three symmetry axes of the hexagonal grid are performed, starting from the top left cell. Figure 3 illustrates the individual steps of this procedure for the convolution of a toy tensor with a hexagonal kernel of size 1.
It is important to note that a kernel is always centred on a pixel that is part of the actual input tensor and not of the padded rows and columns. To conserve the data format used by HexagDLy, steps that would lead to an output with columns of unequal length are omitted. Figure 4 illustrates this padding and convolution-element selection for different strides and kernel sizes, including one such case in which a step is omitted.
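HexagDLy realises the operation with padded and sliced sub-convolutions; an equivalent but much slower way to see what a size-1 hexagonal convolution computes is to gather each pixel's six neighbours directly on the addressed square-grid array. The two offset sets below assume that the odd-numbered columns are the ones shifted upwards; this is an illustrative sketch, not HexagDLy code.

```python
# "Gather the six neighbours" view of a size-1 hexagonal convolution
# (stride 1, zero padding) on the addressed square-grid array. The offset
# sets assume odd-numbered columns are the shifted ones.
OFFSETS_EVEN_COL = [(-1, 0), (1, 0), (-1, -1), (0, -1), (-1, 1), (0, 1)]
OFFSETS_ODD_COL = [(-1, 0), (1, 0), (0, -1), (1, -1), (0, 1), (1, 1)]

def hex_conv_size1(tensor, center_w=1.0, neighbor_w=1.0):
    """Convolve a 2D nested list with a size-1 hexagonal kernel."""
    n_rows, n_cols = len(tensor), len(tensor[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            acc = center_w * tensor[r][c]
            offsets = OFFSETS_ODD_COL if c % 2 else OFFSETS_EVEN_COL
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:  # zero padding
                    acc += neighbor_w * tensor[rr][cc]
            out[r][c] = acc
    return out
```

Convolving a single unit pixel with this kernel spreads it over the pixel and its six hexagonal neighbours, mirroring the size-1 example of Fig. 3.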
2.4 Software Functionalities
HexagDLy provides two- and three-dimensional hexagonal convolution operations. In the three-dimensional case, the input data is expected to have a hexagonal layout in the xy-plane, while data points along the z-axis are assumed to be equidistant. This makes it possible, for example, to process time-resolved data of a two-dimensional detector with a hexagonal layout. Pooling methods are implemented analogously to the convolution operations: the PyTorch-based sub-convolutions are replaced by the corresponding pooling methods and their outputs are combined with aggregation functions, while the padding and striding scheme remains identical. By adopting the PyTorch API, these operations can easily be incorporated into CNN models defined in PyTorch. Furthermore, it is possible to define custom hexagonal kernels with fixed values for each kernel element, making it possible to manually implement structure-detecting kernels or to perform data processing such as smoothing on hexagonally sampled data. Examples that demonstrate the functionalities and usage of the methods provided by HexagDLy are available in the online repository in the form of Jupyter notebooks (see Tab. 1).
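As an illustration of how pooling reuses the convolution scheme, the sketch below applies a max aggregation over the same size-1 hexagonal neighbourhood used for convolutions. As before, the offset sets assume odd-numbered columns are the shifted ones, and the code is a standalone illustration rather than part of HexagDLy:

```python
# Size-1 hexagonal max pooling (stride 1) on the addressed square-grid
# array: same neighbourhood as the size-1 convolution, but with a max
# aggregation instead of a weighted sum. Illustrative sketch only.
def hex_maxpool_size1(tensor):
    even = [(-1, 0), (1, 0), (-1, -1), (0, -1), (-1, 1), (0, 1)]
    odd = [(-1, 0), (1, 0), (0, -1), (1, -1), (0, 1), (1, 1)]
    n_rows, n_cols = len(tensor), len(tensor[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            vals = [tensor[r][c]]
            for dr, dc in (odd if c % 2 else even):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    vals.append(tensor[rr][cc])
            out[r][c] = max(vals)
    return out
```

A single bright pixel is propagated to itself and its six hexagonal neighbours, just as the convolutional variant spreads its weight.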
3 Illustrative Example
To outline the application of HexagDLy, a set of examples covering basic use cases is provided along with the HexagDLy source code in the online repository. An illustrative way to demonstrate the functioning and capabilities of HexagDLy is to perform hexagonal operations on hexagonally sampled shapes that themselves exhibit a six-fold symmetry. Figure 5 shows the result of convolving an image displaying hexagonal shapes with a hexagonal kernel. It can clearly be seen that the six-fold symmetry of the original shapes on the hexagonal grid is conserved in the output.
For an example of how to use HexagDLy in a CNN, please see the provided Jupyter notebooks in the online repository (see Tab. 1).
4 Comparing Hexagonal and Square Convolution Kernels
As outlined in Sec. 1, a hexagonal sampling of two-dimensional data allows for more efficient data processing than a square-grid sampling. Starting with hexagonally sampled data, a conversion to a square-grid representation therefore implies less efficient data processing. Additionally, resampling hexagonally sampled data to a square grid can introduce sampling artefacts and often requires an increase in resolution to reduce distortions.
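The resolution trade-off can be made concrete with a minimal resampling sketch. The studies cited in Sec. 1 use various interpolation methods; the nearest-neighbour lookup and the function below are simplifying assumptions chosen only for illustration:

```python
# Resample hexagonally sampled values onto an n x n square grid by
# nearest-neighbour lookup. With small n, several square cells share one
# hexagonal pixel (sampling artefacts); raising n reduces the distortion
# at the cost of a quadratically growing number of pixels.
import math

def resample_to_square(points, n, extent):
    """points: iterable of (x, y, value); the grid covers [0, extent]^2."""
    step = extent / n
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            cx, cy = (j + 0.5) * step, (i + 0.5) * step
            # pick the value of the closest hexagonal pixel centre
            _, v = min((math.hypot(x - cx, y - cy), v) for x, y, v in points)
            row.append(v)
        grid.append(row)
    return grid
```

Doubling the grid resolution in each dimension, as done for the "large" data set below, quadruples the number of pixels while reducing such duplication artefacts.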
In the context of deep learning, the effects of resampling and the reduced processing efficiency can have a significant influence on the process of designing, optimising and applying CNN-based algorithms. While the applied resampling method is an independent parameter that can be optimised, an increase in resolution demands more computer storage and implies larger convolution kernels or more convolution layers to retain a given receptive field. In combination, these effects can influence the performance of a CNN significantly. This is demonstrated in the following by comparing the performance of CNNs that are trained for the same task but use either hexagonal or square-grid operations on hexagonal or resampled data, respectively.
For the presented experiment, a data set was created with images of four different hexagonal shapes at random positions on a hexagonal grid, overlaid with Gaussian noise. This data set was then interpolated to a square grid of the same resolution (small) as well as to a square grid with four times the number of pixels (large). An example of such a hexagonal shape with the corresponding resampled images is shown in Fig. 6. Two CNN models with the same architecture were set up, the only difference being the use of hexagonal (hCNN, small) or square-grid operations (sCNN, small). These two models have two convolutional and three fully connected layers with a total of k learnable parameters. A third CNN model with three convolutional and three fully connected layers and a total of M learnable parameters (sCNN, large) was set up and trained on the large square-grid data. The full implementation of the CNN models and the data set are provided in a Jupyter notebook in the online repository. The three CNNs were trained for epochs on images per class with a self-adjusting learning rate. This was repeated times with the training data being regenerated and the models being reinitialised in each iteration.
Figure 6 shows the resulting learning curves for all iterations for each CNN model. It can be seen that the hCNN reliably reaches accuracy after a few epochs of training. Both sCNNs, on the other hand, show a generally worse learning behaviour. Although they are both able to achieve accuracy in some cases, only in (small) and (large) of all iterations do the models reach accuracies above random-guessing performance.
This toy example illustrates the advantages of directly processing hexagonally sampled data in terms of reliability and accuracy. The difference in performance between the two sCNNs demonstrates that the effects of resampling can be partly compensated by increasing the resolution of the resampled data and correspondingly extending the CNN capacity. However, even with two orders of magnitude more learnable parameters, the performance of the hCNN is not reached. Even though the performance difference between hCNN and sCNN may not be as significant in a realistic application, natively processing hexagonally sampled data is generally expected to be the most efficient approach. However, the current implementation of hexagonal operations in HexagDLy produces a significant computational overhead compared to the corresponding square-grid operations in PyTorch. This can increase the processing time of an hCNN implemented with HexagDLy, but it does not affect the advantages of applying hexagonal convolutions as outlined above.
5 Impact
Hexagonally sampled data is common in ground-based astroparticle physics experiments like the High Energy Stereoscopic System (H.E.S.S.), the Pierre Auger Observatory or IceCube, where large areas have to be covered efficiently with a limited number of detectors. This can be achieved by arranging the detectors on a hexagonal grid, as it allows for the densest tiling of a two-dimensional Euclidean plane and for optimal sampling of circularly band-limited signals. In these experiments, data is taken at high rates and is mostly background-dominated. Additionally, the data can cover a large parameter space, e.g. when multiple telescopes take data simultaneously. Therefore, advanced data processing algorithms are used to analyse this data, and the application of machine learning techniques has already become a standard in this respect [Ohm2009; Aartsen2015]. Following the progress in the field of machine learning, CNNs represent a promising means to further improve data analyses for astroparticle physics experiments. By providing convolution and pooling operations that can be directly applied to hexagonally sampled data, HexagDLy offers a user-friendly environment to explore the applicability of CNNs for these experiments. Since no preprocessing is required, the initial effort for the application of CNNs can be significantly reduced compared to other approaches.
The increasing scale and sensitivity of future observatories like the Cherenkov Telescope Array [CTA2017] will result in much larger data sets that need to be analysed. This will pose additional challenges for the analysis in terms of performance and resources. The methods provided by HexagDLy can help to address these challenges.
6 Conclusions
Following the growing interest in CNNs, increasing efforts to adapt convolution operations to non-Cartesian data can be observed, for example for spherical data [Taco2018; Perraudin2018] and non-Euclidean manifolds [Masci2015]. Besides [Hoogeboom2018], HexagDLy presents a solution for hexagonally sampled data. With a focus on flexibility and user-friendliness, HexagDLy provides convolution and pooling operations on hexagonal grids. It is based on PyTorch and makes use of the torch.nn module for the implementation of these operations. In combination with a special data addressing scheme, it facilitates access to CNNs for hexagonally sampled data. By taking advantage of the benefits of directly processing hexagonally sampled data, HexagDLy aims to promote research based on the applicability of CNNs, e.g. in ground-based astroparticle physics. Currently, HexagDLy is used in a study on the applicability of CNNs for the analysis of data from the H.E.S.S. experiment. A report on first results is in preparation.
Acknowledgements
This software project evolved as part of a research study to explore deep learning algorithms in the analysis of imaging Cherenkov telescope data within the H.E.S.S. collaboration. We acknowledge the support of the whole collaboration in this project. In particular we want to thank our colleagues Matthias Büchele, Kathrin Egberts, Tobias Fischer, Manuel Kraus, Thomas Lohse, Ullrich Schwanke, Idan Shilon and Gerrit Spengler for fruitful discussions that promoted the development of HexagDLy.
References
 (1) Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436–444. doi:10.1038/nature14539.
 (2) Q. Feng, T. T. Y. Lin, The analysis of VERITAS muon images using convolutional neural networks, in: Proceedings of the International Astronomical Union, Vol. 12, 2016.
 (3) T. L. Holch, I. Shilon, M. Büchele, T. Fischer, S. Funk, N. Groeger, D. Jankowsky, T. Lohse, U. Schwanke, P. Wagner, Probing convolutional neural networks for event reconstruction in γ-ray astronomy with Cherenkov telescopes, PoS ICRC2017 (2018) 795. arXiv:1711.06298, doi:10.22323/1.301.0795.
 (4) M. Huennefeld, Deep learning in physics exemplified by the reconstruction of muon-neutrino events in IceCube, PoS ICRC2017 (2018) 1057. doi:10.22323/1.301.1057.
 (5) M. Erdmann, J. Glombitza, D. Walz, A deep learning-based reconstruction of cosmic ray-induced air showers, Astroparticle Physics 97 (2018) 46–53. doi:10.1016/j.astropartphys.2017.10.006.
 (6) S. Mangano, C. Delgado, M. Bernardos, M. Lallena, J. J. R. Vázquez, Extracting gamma-ray information from images with convolutional neural network methods on simulated Cherenkov Telescope Array data, in: ANNPR 2018, LNAI 11081, pp. 243–254, 2018. arXiv:1810.00592, doi:10.1007/978-3-319-99978-4.
 (7) I. Shilon, M. Kraus, M. Büchele, K. Egberts, T. Fischer, T. L. Holch, T. Lohse, U. Schwanke, C. Steppa, S. Funk, Application of deep learning methods to analysis of imaging atmospheric Cherenkov telescopes data, Astroparticle Physics 105 (2019) 44–53. doi:10.1016/j.astropartphys.2018.10.003.
 (8) A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, A. Lerer, Automatic differentiation in PyTorch, in: NIPS-W, 2017.
 (9) R. M. Mersereau, The processing of hexagonally sampled two-dimensional signals, Proceedings of the IEEE 67 (6) (1979) 930–949. doi:10.1109/PROC.1979.11356.
 (10) R. Staunton, The design of hexagonal sampling structures for image digitization and their use with local operators, Image and Vision Computing 7 (3) (1989) 162–166. doi:10.1016/0262-8856(89)90040-1.
 (11) R. C. Staunton, N. Storey, A comparison between square and hexagonal sampling methods for pipeline image processing, Proc. SPIE 1194 (1990).
 (12) E. Hoogeboom, J. W. T. Peters, T. S. Cohen, M. Welling, HexaConv, arXiv:1803.02108.
 (13) C. Birch, S. P. Oom, J. A. Beecham, Rectangular and hexagonal grids used for observation, experiment and simulation in ecology, Ecological Modelling 206 (2007) 347–359. doi:10.1016/j.ecolmodel.2007.03.041.
 (14) K. Sahr, Hexagonal discrete global grid systems for geospatial computing, Archives of Photogrammetry, Cartography and Remote Sensing 22 (2011) 363–376.
 (15) M. Satoh, H. Tomita, H. Yashiro, H. Miura, C. Kodama, T. Seiki, A. T. Noda, Y. Yamada, D. Goto, M. Sawada, T. Miyoshi, Y. Niwa, M. Hara, T. Ohno, S.-i. Iga, T. Arakawa, T. Inoue, H. Kubokawa, The non-hydrostatic icosahedral atmospheric model: description and development, Progress in Earth and Planetary Science 1 (1) (2014) 18. doi:10.1186/s40645-014-0018-1.
 (16) S. Ohm, C. van Eldik, K. Egberts, γ/hadron separation in very-high-energy γ-ray astronomy using a multivariate analysis method, Astroparticle Physics 31 (2009) 383–391. arXiv:0904.1136, doi:10.1016/j.astropartphys.2009.04.001.
 (17) M. G. Aartsen et al., Search for dark matter annihilation in the Galactic Center with IceCube-79, European Physical Journal C 75 (2015) 492. doi:10.1140/epjc/s10052-015-3713-1.
 (18) The Cherenkov Telescope Array Consortium: B. S. Acharya, I. Agudo, I. A. Samarai, R. Alfaro, J. Alfaro, C. Alispach, R. Alves Batista, J.-P. Amans, et al., Science with the Cherenkov Telescope Array, arXiv:1709.07997.
 (19) T. S. Cohen, M. Geiger, J. Köhler, M. Welling, Spherical CNNs, in: International Conference on Learning Representations, 2018.
 (20) N. Perraudin, M. Defferrard, T. Kacprzak, R. Sgier, DeepSphere: Efficient spherical convolutional neural network with HEALPix sampling for cosmological applications, arXiv:1810.12186.
 (21) J. Masci, D. Boscaini, M. M. Bronstein, P. Vandergheynst, ShapeNet: Convolutional neural networks on non-Euclidean manifolds, arXiv:1501.06297.
Current code version
Nr. | Code metadata description | Metadata
C1 | Current code version | 2.0.1
C2 | Permanent link to code/repository used for this code version | https://github.com/ai4iacts/hexagdly/tree/2.0.1
C3 | Legal Code License | MIT
C4 | Code versioning system used | git
C5 | Software code languages, tools, and services used | Python 3, PyTorch
C6 | Compilation requirements, operating environments & dependencies | Tested on Linux and Mac OS, Python 3.6, PyTorch 0.4
C7 | If available, link to developer documentation/manual | https://github.com/ai4iacts/hexagdly/README.md
C8 | Support email for questions | steppa@uni-potsdam.de, holchtim@physik.hu-berlin.de