1 Introduction
Understanding the principles guiding neuronal organization has been a major goal in neuroscience. The ability to reconstruct individual neuronal arbors is necessary, but not sufficient, to achieve this goal: understanding how neurons of the same and different types position themselves relative to one another requires reconstructing the arbors of multiple neurons that share similar molecular and/or physiological features within the same brain. Such denser reconstructions may allow the field to answer some of the fundamental questions of neuroanatomy: do cells of the same type tile across the lateral dimensions by avoiding each other? To what extent do the organizational principles within a brain region extend across the whole brain? While dense reconstruction of electron microscopy images provides a solution [1, 2], its field of view has been too limited for studying region-wide and brain-wide organization.
Recent advances in tissue clearing [3, 4] and light microscopy enable a fast and versatile approach to this problem. In particular, oblique light-sheet microscopy can image thousands of individual neurons at once across the entire mouse brain at 0.406 μm × 0.406 μm lateral resolution [5]. Moreover, by registering reconstructed neurons from multiple brains with different patterns of neuronal gene expression to a common coordinate framework such as the Allen Mouse Brain Atlas [6], it is possible to study neuronal structure and organization across many brain regions and neuronal cell classes. Therefore, this method may soon produce hundreds of full-brain images, each containing hundreds of sparsely labeled neurons. However, scaling neuronal reconstruction to such large sets is not trivial. The gold standard of manual reconstruction is a tedious and labor-intensive process, with a single neuronal reconstruction taking a few hours. This makes automated reconstruction the most viable alternative. Recently, many automated methods have appeared for the reconstruction of neurons from light microscopy images, including methods based on supervised learning with neural networks as well as other approaches [7, 8, 9, 10, 11, 12]. Common problems include slow training and/or reconstruction speeds, a tendency for topological mistakes despite high voxelwise accuracy, and vulnerability to rare but important imaging artifacts such as stitching misalignments and microscope stage jumps. Here, we propose a supervised learning method based on a convolutional neural network architecture to address these shortcomings. In particular, we suggest (i) an objective function that penalizes topological errors more heavily, (ii) a data augmentation framework that increases robustness against multiple imaging artifacts, and (iii) a distributed scheme for scalability. Training data augmentation for addressing microscopy image defects was initially demonstrated for automated tracing of neurons in electron microscopy images [13]. Here, we adapt this approach to sparse light microscopy images. The U-Net architecture [14, 15] has recently received significant interest, especially in the analysis of biomedical images. By segmenting all the voxels of an input patch rather than only a central portion of it, the U-Net can learn robust segmentation rules faster and decreases memory and storage requirements. In this paper, we train a 3D U-Net convolutional network on a set of manually traced neuronal arbors. To overcome challenges caused by artifacts producing apparent discontinuities in the arbors, we propose a fast, connectivity-based regularization technique. While approaches that increase topological consistency exist [16, 17], they are either too slow for petascale images or are not part of an online training procedure. Our approach is a simple, differentiable modification of the cost function, and its computational overhead scales linearly with the voxel count of the input patch. On the other hand, while these regularization techniques can enforce proper connectivity, the training set contains relatively few examples of the various imaging artifacts. To increase the number of such examples, we simulate the artifacts through data augmentations and present these simulations under a unified framework. Taken together, our approach produces a significant increase in the topological accuracy of neuronal reconstructions on a test set.
In addition to accuracy, an efficient, scalable implementation is necessary for reconstructing petavoxel-sized image datasets. We maintain scalability and increase throughput by using a distributed framework for reconstructing neurons from brain images, in which the computation can be spread across multiple GPU instances. Finally, because data transfers are a substantial bottleneck, we augment data at runtime, which avoids memory issues and significantly increases the throughput rate. We report segmentation speeds exceeding 300 gigavoxels per hour and linear speedups in the presence of additional GPUs.
2 Methods
2.1 Convolutional neural network regularization through digital topology techniques
To create the training set, we obtain volumetric reconstructions of the manually traced neuronal arbors by a topology-preserving inflation of the traces [18]. We use a 3D U-Net convolutional neural network architecture [14, 15, 13] to learn to segment the neurons from this volumetric training set. Since neuronal morphology is ultimately represented and analyzed as a tree structure, we consider the branching pattern of the segmented neuron more important than its voxelwise accuracy. Hence, to penalize topological changes between the ground truth and the prediction at training time, we binarize the network output by thresholding, identify all non-simple points in this binarized patch based on digital connectivity [19] (points whose addition or removal changes the object's topology, e.g., causing splits or mergers), and assign larger weights to them in the binary cross-entropy cost function

$L(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} w_i \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right],$   (1)

where $w_i = \omega > 1$ if voxel $i$ is non-simple while $w_i = 1$ otherwise, $N$ is the number of voxels, and $y$ and $\hat{y}$ are the label image and the predicted segmentation, respectively. Note that the simpleness of a voxel depends only on its immediate neighborhood, and therefore this operation scales linearly with the patch size.
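As an illustration, the topologically weighted cross-entropy of Eq. (1) can be sketched as below. This is a minimal NumPy sketch, not the paper's implementation: the non-simple mask is assumed to be computed separately by a digital-topology test on the thresholded prediction [19], and `omega` is a hypothetical weight value.

```python
import numpy as np

def weighted_bce(label, pred, nonsimple_mask, omega=5.0, eps=1e-7):
    """Binary cross-entropy with larger weights on non-simple voxels.

    label, pred: float arrays with values in [0, 1].
    nonsimple_mask: boolean array marking voxels that are non-simple
    in the thresholded prediction (computed elsewhere).
    omega: illustrative extra weight for topologically critical voxels.
    """
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    w = np.where(nonsimple_mask, omega, 1.0)
    bce = -(label * np.log(pred) + (1.0 - label) * np.log(1.0 - pred))
    return float(np.mean(w * bce))
```

With an all-false mask this reduces to the ordinary mean binary cross-entropy; marking a voxel as non-simple strictly increases its contribution to the loss.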
2.2 Simulation of image artifacts through data augmentations
Data augmentation is a technique that augments the base training data with predefined transformations of it. By creating statistical invariances (e.g., against rotation) within the dataset or by overrepresenting rarely occurring artifacts, augmentation can increase the robustness of the learned algorithm. Motivated by the fact that 3D microscopy is prone to several image artifacts, we follow a unified framework for data augmentation. In particular, our formalism requires explicit models of the underlying artifacts, and of the desired reconstruction in their presence, to augment the original training set with simulations of these artifacts.
We define the class of “artifact-generating” transformations as pairs $T = (T_a, T_l)$ such that if $(a, l)$ is a raw/label training pair, then so is $(T_a(a), T_l(l))$, where $T_a$ acts on a raw image $a$ and $T_l$ acts on its corresponding label image $l$. For example, the common augmentation step of rotation by $\theta$ can be realized by $T_a$ and $T_l$ both rotating their arguments by $\theta$. Data augmentation adds these rotated raw/label image pairs to the original training set (Fig. 1).
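The paired-transformation framework can be sketched as follows; the function names are illustrative, not from the paper's code. Each artifact is a pair of callables applied jointly to a raw/label training example, and a simple rotation uses the same map for both members of the pair.

```python
import numpy as np

def rotate90(volume):
    """Rotate a 3D volume by 90 degrees about the z-axis; the same map
    serves as both the raw-image and the label transformation."""
    return np.rot90(volume, k=1, axes=(0, 1))

def augment(pairs, transforms):
    """Extend a list of (raw, label) pairs with every transformed copy.

    pairs: list of (raw, label) arrays.
    transforms: list of (t_raw, t_label) callable pairs.
    """
    out = list(pairs)
    for raw, label in pairs:
        for t_raw, t_label in transforms:
            out.append((t_raw(raw), t_label(label)))
    return out
```

Artifact-specific augmentations then differ only in supplying distinct raw and label transformations instead of a shared one.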
Occluded branches: Branch occlusions can be caused by photobleaching or an absence of a fluorophore. We model the artifact-generating transformation for an absent fluorophore as $T = (T_a, \mathrm{id})$, where

$T_a(a)(x) = a(x) - a(x_0)\,\mathrm{PSF}(x - x_0),$   (2)

such that $\mathrm{id}$ denotes the identity transformation, $x_0$ denotes the position of the absent fluorophore, and $\mathrm{PSF}$ is its corresponding point-spread function. Here, we approximate the $\mathrm{PSF}$ of a fluorophore with a multivariate Gaussian.
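A minimal sketch of this occlusion augmentation, assuming an isotropic Gaussian PSF and illustrative parameter values; the label is left untouched, matching the identity label transform.

```python
import numpy as np

def gaussian_psf(shape, center, sigma):
    """Isotropic Gaussian point-spread function on a voxel grid,
    normalized to 1 at its center."""
    grids = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
    d2 = sum((g - c) ** 2 for g, c in zip(grids, center))
    return np.exp(-d2 / (2.0 * sigma ** 2))

def occlude(raw, center, sigma=2.0):
    """Simulate an absent fluorophore by subtracting its approximate
    PSF contribution from the raw image; intensities are clipped at 0."""
    psf = gaussian_psf(raw.shape, center, sigma)
    return np.clip(raw - raw[tuple(center)] * psf, 0.0, None)
```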
Duplicate sections: The stage of a scanning 3D microscope can intermittently stall, which can duplicate the imaging of a tissue section. The artifact-generating transformation for stage stalling is given by $T = (T_a, \mathrm{id})$, where

$T_a(a)(x, y, z) = \begin{cases} a(x, y, z_0), & (x, y) \in R \text{ and } z = z_0 + 1 \\ a(x, y, z), & \text{otherwise} \end{cases}$   (3)

for the region $R$ and the plane $z = z_0$, such that $T_a$ duplicates the slice $z_0$ in a rectangular neighborhood $R$.
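The duplicated-section augmentation reduces to a slice copy; the region bounds and the stalled plane below are hypothetical parameters, and the label transform is the identity.

```python
import numpy as np

def duplicate_section(raw, z0, region):
    """Copy slice z0 into slice z0 + 1 inside a rectangular region,
    mimicking a stalled stage that images the same section twice.

    region: ((x0, x1), (y0, y1)) half-open bounds of the affected patch.
    """
    (x0, x1), (y0, y1) = region
    out = raw.copy()
    out[x0:x1, y0:y1, z0 + 1] = raw[x0:x1, y0:y1, z0]
    return out
```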
Dropped sections: Similar to the stalling of the stage, jumps that result in missed sections can occur intermittently. The corresponding artifact-generating transformation is given by $T = (T_a, T_l)$, where

$T_a(a)(x, y, z) = \begin{cases} a(x, y, z + d), & (x, y) \in R \text{ and } z \ge z_0 \\ a(x, y, z), & \text{otherwise} \end{cases}$   (4)

and

$T_l(l)(x, y, z) = l(x, y, s(z)) \quad \text{for } (x, y) \in R,$   (5)

such that $s(z) = z$ for $z < z_0$, $s(z) = z + d$ for $z \ge z_0 + \delta$, and $s$ maps the $\delta$ slices in between onto the $\delta + d$ underlying slices, which downsamples the region along $z$ to maintain partial connectivity in the label. Hence, $T_a$ skips a small region given by $d$ at $z_0$, and $T_l$ is the corresponding desired transformation on the label image.
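A simplified sketch of the dropped-section effect on the raw image; for brevity this version applies the skip to the whole field of view rather than a subregion, and zero-fills the trailing slices rather than cropping. All parameter values are illustrative.

```python
import numpy as np

def drop_sections(raw, z0, d):
    """Skip d slices starting at plane z0 (a simulated stage jump):
    content above the jump shifts down by d, and the trailing d
    slices are zero-filled for simplicity."""
    out = np.zeros_like(raw)
    out[:, :, :z0] = raw[:, :, :z0]                      # unaffected slices
    out[:, :, z0:raw.shape[2] - d] = raw[:, :, z0 + d:]  # shifted content
    return out
```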
Stitching misalignment: Misalignments can occur between 3D image stacks, potentially causing topological breaks and mergers between neuronal branches. The corresponding artifact-generating transformation is given by $T = (T_a, T_l)$, where

$T_a(a)(x, y, z) = \begin{cases} a(x + \delta, y, z), & z \ge z_0 \\ a(x, y, z), & \text{otherwise} \end{cases}$   (6)

and

$T_l(l)(x, y, z) = l(x + s(z), y, z),$   (7)

such that $s$ is a shear transform on $l$, with $s(z) = 0$ for $z < z_0 - w$, $s(z) = \delta$ for $z \ge z_0$, and $s(z)$ interpolating linearly in between. Hence, $T_a$ translates a region of $a$ to simulate a stitching misalignment, and $T_l$ shears a region of width $w$ around the discontinuity to maintain 18-connectivity in the label.
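A sketch of the misalignment pair described above: the raw image is translated abruptly at the stack boundary, while the label is sheared gradually over a small window so branches stay connected. The shift, boundary plane, and shear width are illustrative parameters.

```python
import numpy as np

def misalign(raw, z0, shift):
    """Translate all slices z >= z0 by `shift` (> 0) voxels along x,
    simulating an abrupt stitching misalignment between stacks."""
    out = np.zeros_like(raw)
    out[:, :, :z0] = raw[:, :, :z0]
    out[shift:, :, z0:] = raw[:-shift, :, z0:]
    return out

def shear_label(label, z0, shift, width):
    """Apply the same total shift to the label, but spread it over
    `width` slices below z0 so the labeled branches stay connected."""
    out = np.zeros_like(label)
    for z in range(label.shape[2]):
        if z < z0 - width:
            s = 0
        elif z < z0:
            s = round(shift * (z - (z0 - width)) / width)
        else:
            s = shift
        if s > 0:
            out[s:, :, z] = label[:-s, :, z]
        else:
            out[:, :, z] = label[:, :, z]
    return out
```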
Light scattering: Light scattering by the cleared tissue can create an inhomogeneous intensity profile and blur the image. To simulate this artifact, we assume the scatter has a homogeneous profile and is anisotropic due to the oblique light-sheet. We approximate these characteristics with a Gaussian kernel $K_\sigma$. In addition, the global inhomogeneous intensity profile is simulated with an additive constant $c$. Thus, the corresponding artifact-generating transformation is given by $T = (T_a, \mathrm{id})$, where

$T_a(a) = K_\sigma * a + c.$   (8)
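The scattering augmentation amounts to an anisotropic Gaussian blur plus an intensity offset; the sketch below uses a separable kernel built from scratch so it stays self-contained, with illustrative sigma values (larger along the third axis) and offset.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalized 1D Gaussian kernel of half-width `radius`."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def scatter(raw, sigma=(1.0, 1.0, 2.0), offset=0.1):
    """Blur with a separable anisotropic Gaussian and add a constant
    offset, simulating scatter-induced blur and an inhomogeneous
    intensity baseline."""
    out = raw.astype(float)
    for axis, s in enumerate(sigma):
        k = gaussian_kernel1d(s, radius=int(3 * s))
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="same"), axis, out)
    return out + offset
```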
2.3 Fully automated, scalable tracing
To optimize the pipeline for scalability, we store images as parcellated HDF5 datasets. For training, a file server streams these images to the GPU server, which performs data augmentations on the fly to minimize storage space requirements. For deploying the trained neural network, the file server similarly streams the datasets to a GPU server for segmentation. Once the segmentation is completed, the neuronal morphology is reconstructed automatically from the segmented image using the UltraTracer neuron tracing tool within the Vaa3D software package [7].
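Parcellated access can be sketched as iterating over the patch corners of a chunked volume; in practice the patches would be read from HDF5 chunks (e.g., via h5py) and streamed to the GPU, but the traversal logic is the same. All names and parameters here are illustrative.

```python
import itertools

def iter_patches(shape, patch, step):
    """Yield the start corners of patches tiling a volume of the given
    shape, stepping by `step` along each axis; this is the access
    pattern used to stream a parcellated dataset patch by patch."""
    starts = [range(0, max(n - p, 0) + 1, s)
              for n, p, s in zip(shape, patch, step)]
    yield from itertools.product(*starts)
```

A non-unit overlap (step smaller than patch) would let adjacent segmentation patches be blended to avoid seam artifacts.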
3 Experimental Procedure
In our experiments, we used a dataset of 54 manually traced neurons imaged using oblique light-sheet microscopy. These morphological annotations were dilated while preserving topology to train the neural network for segmentation. We partitioned the dataset into training, validation, and test sets by randomly choosing 25, 8, and 21 neurons, respectively. The neural network was implemented in PyTorch [20] and trained with the Adam optimizer [21]. Training and reconstruction were conducted on two Intel Xeon Silver 4116 CPUs, 256 GB of RAM, and two NVIDIA GeForce GTX 1080 Ti GPUs.

4 Results
4.1 Topologically accurate reconstruction
To quantify the topological accuracy of the network on light-sheet microscopy data, we define the topological error as the number of non-simple points that must be added or removed from a prediction to obtain its corresponding label. Specifically, for binary images $y$ (label) and $\hat{y}$ (reconstruction), let $W(y)$ denote a topology-preserving warping of $y$ that minimizes the voxelwise disagreements between the warped image and $\hat{y}$ [17, 11], let $W(y) \wedge \hat{y}$ denote the binary image whose foreground is common to both $W(y)$ and $\hat{y}$, and let $|\cdot|$ denote the number of foreground voxels. We quantify the agreement between a reconstruction and label using the Jaccard index as

$J(y, \hat{y}) = \dfrac{|W(y) \wedge \hat{y}|}{|W(y)| + |\hat{y}| - |W(y) \wedge \hat{y}|}.$   (9)
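For illustration, the voxelwise part of this score can be sketched as a plain Jaccard index of two binary volumes; this sketch omits the topology-preserving warping step, which the full metric applies to the label first.

```python
import numpy as np

def jaccard(label, pred):
    """Jaccard index (intersection over union) of two binary volumes.
    The warping step of the full topological metric is omitted here."""
    label = label.astype(bool)
    pred = pred.astype(bool)
    inter = np.logical_and(label, pred).sum()
    union = np.logical_or(label, pred).sum()
    return 1.0 if union == 0 else float(inter) / float(union)
```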
We compared this score across different U-Net configurations: without any augmentations or regularization, with the augmentations only, with the topological regularization only, and with both the topological regularization and the augmentations. The U-Net trained with augmentations and topological regularization performed significantly better than the one trained without either (Figs. 2 and 3).
4.2 Neuron reconstruction is efficient and scalable
To quantify the efficiency of the distributed framework, we measured the framework's throughput for augmenting data, training on the data, and segmenting the data. Augmentation performed at 35.2 ± 9.2 gigavoxels per hour, while training performed at 16.8 ± 0.2 megavoxels per hour. Segmentation performed at 348.8 ± 1.9 gigavoxels per hour. Both segmentation and training showed a linear speedup with an additional GPU. For an entire mouse brain, neuronal reconstruction would take about 23 hours on a single GPU.
5 Discussion
In this paper, we proposed an efficient, scalable, and accurate algorithm for reconstructing neuronal anatomy from light microscopy images of the whole brain. Our method employs topological regularization and simulates the discontinuity-producing image artifacts inherent to the imaging system. These techniques help maintain the topological correctness of the trace (skeleton) representations of neuronal arbors.
While we demonstrated the merit of our approach on neuronal images obtained by oblique lightsheet microscopy, our methods address some of the problems common to most 3D fluorescence microscopy techniques. Therefore, we hope that some of our methods will be useful for multiple applications. Combined with the speed and precision of oblique lightsheet microscopy, the distributed and fast nature of our approach enables the production of a comprehensive database of neuronal anatomy across many brain regions and cell classes. We believe that these aspects will be useful in discovering different cortical cell types as well as understanding the anatomical organization of the brain.
References
 [1] Winfried Denk and Heinz Horstmann, “Serial block-face scanning electron microscopy to reconstruct three-dimensional tissue nanostructure,” PLoS Biology, vol. 2, no. 11, pp. e329, 2004.
 [2] Moritz Helmstaedter, Kevin L. Briggman, Srinivas C. Turaga, Viren Jain, H. Sebastian Seung, and Winfried Denk, “Connectomic reconstruction of the inner plexiform layer in the mouse retina,” Nature, vol. 500, no. 7461, pp. 168, 2013.
 [3] Kwanghun Chung, Jenelle Wallace, Sung-Yon Kim, Sandhiya Kalyanasundaram, Aaron S. Andalman, Thomas J. Davidson, Julie J. Mirzabekov, Kelly A. Zalocusky, Joanna Mattis, Aleksandra K. Denisin, et al., “Structural and molecular interrogation of intact biological systems,” Nature, vol. 497, no. 7449, pp. 332, 2013.
 [4] Etsuo A. Susaki, Kazuki Tainaka, Dimitri Perrin, Fumiaki Kishino, Takehiro Tawara, Tomonobu M. Watanabe, Chihiro Yokoyama, Hirotaka Onoe, Megumi Eguchi, Shun Yamaguchi, et al., “Whole-brain imaging with single-cell resolution using chemical cocktails and computational analysis,” Cell, vol. 157, no. 3, pp. 726–739, 2014.
 [5] Arun Narasimhan, Kannan Umadevi Venkataraju, Judith Mizrachi, Dinu F. Albeanu, and Pavel Osten, “Oblique light-sheet tomography: fast and high resolution volumetric imaging of mouse brains,” bioRxiv, 2017.
 [6] Ed S. Lein, Michael J. Hawrylycz, Nancy Ao, Mikael Ayres, Amy Bensinger, Amy Bernard, Andrew F. Boe, Mark S. Boguski, Kevin S. Brockway, Emi J. Byrnes, et al., “Genome-wide atlas of gene expression in the adult mouse brain,” Nature, vol. 445, no. 7124, pp. 168, 2007.
 [7] Hanchuan Peng, Zhi Zhou, Erik Meijering, Ting Zhao, Giorgio A. Ascoli, and Michael Hawrylycz, “Automatic tracing of ultra-volumes of neuronal images,” Nature Methods, vol. 14, no. 4, pp. 332, 2017.
 [8] Engin Türetken, Germán González, Christian Blum, and Pascal Fua, “Automated reconstruction of dendritic and axonal trees by global optimization with geometric priors,” Neuroinformatics, vol. 9, no. 2–3, pp. 279–302, 2011.
 [9] Yu Wang, Arunachalam Narayanaswamy, Chia-Ling Tsai, and Badrinath Roysam, “A broadly applicable 3D neuron tracing method based on open-curve snake,” Neuroinformatics, vol. 9, no. 2–3, pp. 193–217, 2011.
 [10] Engin Türetken, Fethallah Benmansour, and Pascal Fua, “Automated reconstruction of tree structures using path classifiers and mixed integer programming,” in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012, pp. 566–573.
 [11] Uygar Sümbül, Sen Song, Kyle McCulloch, Michael Becker, Bin Lin, Joshua R. Sanes, Richard H. Masland, and H. Sebastian Seung, “A genetic and computational approach to structurally classify neuronal types,” Nature Communications, vol. 5, no. 3512, 2014.
 [12] Rohan Gala, Julio Chapeton, Jayant Jitesh, Chintan Bhavsar, and Armen Stepanyants, “Active learning of neuron morphology for accurate automated tracing of neurites,” Frontiers in Neuroanatomy, vol. 8, pp. 37, 2014.
 [13] Kisuk Lee, Jonathan Zung, Peter Li, Viren Jain, and H. Sebastian Seung, “Superhuman accuracy on the SNEMI3D connectomics challenge,” arXiv:1706.00120, 2017.
 [14] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, Nassir Navab, Joachim Hornegger, William M. Wells, and Alejandro F. Frangi, Eds., Cham, 2015, pp. 234–241, Springer International Publishing.
 [15] Özgün Çiçek, Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox, and Olaf Ronneberger, “3D U-Net: Learning dense volumetric segmentation from sparse annotation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2016, pp. 424–432.
 [16] Kevin Briggman, Winfried Denk, H. Sebastian Seung, Moritz N. Helmstaedter, and Srinivas C. Turaga, “Maximin affinity learning of image segmentation,” in Advances in Neural Information Processing Systems, 2009, pp. 1865–1873.
 [17] Viren Jain, Benjamin Bollmann, Mark Richardson, Daniel R. Berger, Moritz N. Helmstaedter, Kevin L. Briggman, Winfried Denk, Jared B. Bowden, John M. Mendenhall, Wickliffe C. Abraham, et al., “Boundary learning by optimization with topological constraints,” in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010, pp. 2488–2495.
 [18] Uygar Sümbül, Aleksandar Zlateski, Ashwin Vishwanathan, Richard H. Masland, and H. Sebastian Seung, “Automated computation of arbor densities: a step toward identifying neuronal cell types,” Frontiers in Neuroanatomy, vol. 8, pp. 139, 2014.
 [19] Gilles Bertrand and Grégoire Malandain, “A new characterization of three-dimensional simple points,” Pattern Recognition Letters, vol. 15, no. 2, pp. 169–175, 1994.
 [20] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer, “Automatic differentiation in PyTorch,” NIPS, 2017.
 [21] Diederik P. Kingma and Jimmy Ba, “Adam: A method for stochastic optimization,” arXiv:1412.6980, 2014.