all things xbrain
Methods for resolving the 3D microstructure of the brain typically start by thinly slicing and staining the brain, and then imaging each individual section with visible light photons or electrons. In contrast, X-rays can be used to image thick samples, providing a rapid approach for producing large 3D brain maps without sectioning. Here we demonstrate the use of synchrotron X-ray microtomography (μCT) for producing mesoscale (1 μm³) resolution brain maps from millimeter-scale volumes of mouse brain. We introduce a pipeline for μCT-based brain mapping that combines methods for sample preparation, imaging, automated segmentation of image volumes into cells and blood vessels, and statistical analysis of the resulting brain structures. Our results demonstrate that X-ray tomography promises rapid quantification of large brain volumes, complementing other brain mapping and connectomics efforts.
Using the 2-BM synchrotron beamline at the Advanced Photon Source (APS), we obtained tomography data from cubic mm volumes of brain tissue (Fig. 1). Samples were fixed with aldehydes, stained with osmium, and embedded in plastic, making them compatible with subsequent large-volume EM. Stacks of projection images were acquired by rotating the sample through uniformly spaced rotation angles from 0 to 180 degrees and measuring the propagation of X-rays through the sample at each rotation angle (Fig. 1b). Radiographs were recorded with an indirect detection system consisting of a thin scintillator, which converts the transmitted X-rays into visible light (Fig. 1c). The light is then focused by an objective lens onto a charge-coupled device (CCD) array, producing images with an equivalent pixel size of () at the sample plane. After calibrating the instrument, collecting the main dataset studied in this paper (10.6 gigavoxels) took approximately six minutes. To obtain high-contrast images, data acquisition was performed in propagation-based phase contrast mode by increasing the distance between the detector and sample to several tens of centimeters, using a pink beam (ΔE/E = ) set to 30 keV. To reconstruct a 3D image volume from the projection data, phase retrieval was performed on each projection using the well-established Paganin algorithm, followed by volume reconstruction using the open-source TomoPy package. The resulting image volume provides the data for our segmentation and analysis methods.
To quantify the resolution of our reconstructed X-ray image volumes, we obtained digitally vignetted sub-fields from regions with brain tissue (signal) and without (background), and computed their respective 2D Fourier power spectra. The signal power spectrum (SPS) for regions with brain tissue is five times larger than the noise power spectrum (NPS) computed in background regions at a half-period resolution of about 1.31 μm in the XY or transverse plane, and 0.95 μm in the XZ or vertical plane (Fig. 2a). This near-isotropic spatial resolution simplifies data analysis when compared to other imaging approaches that may give very non-isotropic values of spatial resolution in XY versus XZ.
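This resolution estimate can be reproduced with a short script. The sketch below (our own illustrative implementation, not the paper's code) radially averages the 2D Fourier power spectra of a signal patch and a background patch, then reports the half-period of the finest frequency at which the signal spectrum still exceeds the noise spectrum by the factor-of-five criterion used above:

```python
import numpy as np

def half_period_resolution(signal_patch, background_patch, pixel_size_um,
                           snr_threshold=5.0):
    """Estimate resolution as the half-period of the finest spatial
    frequency at which the radially averaged signal power spectrum still
    exceeds the background (noise) spectrum by `snr_threshold`."""
    def radial_power(img):
        f = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
        cy, cx = np.array(f.shape) // 2
        y, x = np.indices(f.shape)
        r = np.hypot(y - cy, x - cx).astype(int)
        counts = np.bincount(r.ravel())
        return np.bincount(r.ravel(), f.ravel()) / np.maximum(counts, 1)

    sps, nps = radial_power(signal_patch), radial_power(background_patch)
    n = min(len(sps), len(nps), min(signal_patch.shape) // 2)
    good = np.nonzero(sps[1:n] >= snr_threshold * nps[1:n])[0] + 1
    if len(good) == 0:
        return None                      # no frequency passes the test
    freq = good.max() / (min(signal_patch.shape) * pixel_size_um)  # cycles/um
    return 0.5 / freq                    # half-period in um
```

Applying this separately to XY and XZ patches yields the per-plane resolution figures quoted above.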
We estimate that voxels inside cells are on average dB (mean ± std) brighter than their immediate surround (see Methods). At this contrast level, it is possible to discern the location and size of cells in the sample. Blood vessels are also visible in the sample and provide even stronger contrast than cell bodies, making them much easier to track. This signal strength suggests that we should be able to segment the tissue into cell bodies and blood vessels, which we validate with our automated techniques.
After collecting μCT data, we performed ultra-thin sectioning and electron microscopic imaging of the same sample. EM confirmed the identity of the cell bodies, myelinated axons, and blood vessels corresponding to those annotated in the μCT dataset (Fig. 2c), suggesting that details seen in the X-ray dataset are bona fide and not spurious results of our imaging and processing pipelines. In addition, we noticed no changes in the microtome sectioning properties of the epon-embedded brain tissue, nor did we see any obvious signs of irradiation-induced structural damage in the scanning electron micrographs obtained from these sections. Structures like synapses and mitochondria are still clearly evident (Fig. 2c). This is consistent with our calculated radiation dose of about 3 kGy during the collection of the X-ray tomography data; this dose is well below the dose affecting the dissolution rate of radiation-sensitive polymers like poly(methyl methacrylate) or PMMA (1000 kGy), and the dose at which glutaraldehyde-fixed wet chromosomes start to show mass loss (70 MGy). Our results confirm that μCT and EM can be coupled to produce multi-resolution brain maps.
A good dataset, at minimum, should allow human annotators to clearly see the structures of interest and, in turn, reliably annotate them. We thus measured human annotators' ability to find and label cell bodies and blood vessels in multi-view projections (orthogonal 2D projection planes) of the 3D image data. Two expert annotators (A0 and A1) were instructed to label the boundaries of all of the cells and vessels in a small volume of X-ray image data with ITK-Snap. When provided 3D context, the pixel-level agreement (precision, recall) between annotators for cell bodies and blood vessels was and , for V2 and V3 respectively (see Methods). We further measured the discrepancy between the centroids of cell bodies across both annotators and found nearly perfect agreement. While precise manual segmentation of the boundaries of cell bodies and vessels is challenging, we observe that the object-level agreement between annotations of the centers of cell bodies is high, showing that performance on the detection task is nearly identical. We conclude that the data is of sufficient quality to segment cell bodies and vessels with automated methods.
The datasets afforded by X-ray tomography are too large to be analyzed by humans. Therefore, we developed automatic 3D segmentation algorithms to extract cells and vessels from the image volumes. We created a suite of tools for extracting and visualizing mesoscale maps from 3D stacks of X-ray images (Fig. 3). This set of tools, X-BRAIN (X-ray Brain Reconstruction, Analytics, and Inference for Neuroanatomy), consists of image processing and computer vision methods for preprocessing and artifact removal, cell detection (location and size), and vessel segmentation. We also provide methods for large-scale analyses of these data to compute relevant statistics on the reconstructed maps of the cells and vessels. X-BRAIN is implemented in Matlab and Python, and both code and data are openly available through docs.neurodata.io/xbrain, providing a community resource for the automated segmentation and quantification of mesoscale brain anatomy.
Our main image processing and computer vision pipeline (Steps 1-2 in Fig. 3) consists of methods for segmenting blood vessels and detecting the location and size of cells in the volume. In the initial step of our workflow, we train a classifier to predict the probability that each brain voxel belongs to one of three classes: cell body, blood vessel, and background (other). To do this, we use a tool called ilastik to sparsely annotate data and build a random forest classifier using intensity, edge, and gradient features computed on the image volume. This classification procedure returns three probability maps, which collectively provide, for each voxel, the probability that the voxel is a cell, a vessel, or lies in the background (output of ilastik in Step 1 of Fig. 3). This procedure provides an easy and intuitive estimate of which voxels correspond to cell bodies and blood vessels.
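The voxel classification step can be approximated outside of ilastik. The sketch below uses scikit-learn's RandomForestClassifier with a few hand-rolled intensity, edge, and gradient features; the feature set and parameters are illustrative stand-ins for ilastik's feature bank, not its actual implementation:

```python
import numpy as np
from scipy import ndimage
from sklearn.ensemble import RandomForestClassifier

def voxel_features(vol):
    """Per-voxel intensity, edge, and gradient features (a hand-rolled
    stand-in for ilastik's feature set)."""
    feats = [vol,
             ndimage.gaussian_filter(vol, sigma=1.0),               # smoothed intensity
             ndimage.gaussian_gradient_magnitude(vol, sigma=1.0),   # edge strength
             ndimage.gaussian_laplace(vol, sigma=1.0)]              # blob-ness
    return np.stack([f.ravel() for f in feats], axis=1)

def train_voxel_classifier(vol, sparse_labels):
    """sparse_labels: 0 = unlabeled, 1 = cell, 2 = vessel, 3 = background."""
    X, y = voxel_features(vol), sparse_labels.ravel()
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[y > 0], y[y > 0])          # train only on annotated voxels
    return clf

def probability_maps(clf, vol):
    """One probability volume per annotated class."""
    P = clf.predict_proba(voxel_features(vol))
    return [P[:, i].reshape(vol.shape) for i in range(P.shape[1])]
```

Sparse annotation followed by dense prediction is what makes this step practical: only a handful of labeled voxels are needed to produce full probability volumes.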
The simplest way to convert a probability map to a (binary) segmentation is to threshold the probabilities and group the resulting structures that pass this test into connected components. In the case of vessel segmentation, we can employ this procedure with minimal tweaks. To segment vessels in the sample, we threshold the vessel probability map and then apply simple morphological filtering operations to clean and smooth the resulting binary data (see Methods). Visual inspection and subsequent quantification of the precision and recall of vessel segmentation suggest a high degree of accuracy through this simple post-processing of the ilastik outputs.
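A minimal version of this threshold-and-clean vessel segmentation might look as follows; the threshold, structuring elements, and minimum component size are illustrative placeholders for the tuned values described in Methods:

```python
import numpy as np
from scipy import ndimage

def segment_vessels(vessel_prob, threshold=0.5, min_size=100):
    """Threshold the vessel probability map, clean with simple
    morphological filtering, and drop tiny connected components.
    Parameter values here are illustrative, not the tuned ones."""
    binary = vessel_prob > threshold
    binary = ndimage.binary_opening(binary)    # remove isolated specks
    binary = ndimage.binary_closing(binary)    # fill small gaps
    labels, n = ndimage.label(binary)          # connected components
    sizes = ndimage.sum(binary, labels, index=range(1, n + 1))
    keep_labels = 1 + np.nonzero(np.asarray(sizes) >= min_size)[0]
    return np.isin(labels, keep_labels)
```

Because vessels form large, elongated components, the size filter alone removes most speckle without disturbing the vasculature.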
Applying the same thresholding procedure used for vessel segmentation to the segmentation of cells is difficult because neurons and blood vessels are often densely packed. In this case, simple thresholding-based approaches tend to group clusters of cells and vessels together (see Supp. Materials Fig. 2). We thus developed an algorithm for cell detection (Step 2 in Fig. 3), which produces estimates of the centroids and radii of detected cells. Our method iteratively selects a new candidate centroid based upon the correlation between the probability map and a (fixed-radius) spherical template, which serves to enforce our biologically inspired shape prior. We use a frequency-based approach to convolve a spherical template with the cell probability map and greedily select “hotspots” which are likely to contain cell bodies (see Methods for further details). Our method leverages prior knowledge of the approximate size and spherical shape of cells to select sphere-like objects from the pre-filtered probabilities, resolving situations where neurons and blood vessels appear in close proximity.
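The greedy template-matching idea can be sketched as follows. This toy version uses direct convolution rather than the paper's frequency-based (FFT) implementation, and the radius and stopping threshold are illustrative stand-ins for the tuned hyperparameters:

```python
import numpy as np
from scipy import ndimage

def spherical_template(radius):
    """Normalized binary ball used as the cell-shape prior."""
    r = int(radius)
    z, y, x = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1]
    ball = (x * x + y * y + z * z <= r * r).astype(float)
    return ball / ball.sum()

def detect_cells(cell_prob, radius=4, stop=0.4, max_cells=10000):
    """Greedily pick the voxel whose neighborhood best matches a spherical
    template, then suppress a cube around it so it cannot be re-selected.
    `radius` and `stop` are illustrative stand-ins for tuned values."""
    tmpl = spherical_template(radius)
    prob = cell_prob.astype(float).copy()
    centroids = []
    for _ in range(max_cells):
        corr = ndimage.convolve(prob, tmpl, mode='constant')
        idx = np.unravel_index(np.argmax(corr), corr.shape)
        if corr[idx] < stop:                 # no sphere-like hotspot left
            break
        centroids.append(tuple(int(i) for i in idx))
        sl = tuple(slice(max(int(c) - radius, 0), int(c) + radius + 1)
                   for c in idx)
        prob[sl] = 0.0                       # non-maximum suppression
    return centroids
```

The shape prior is what separates touching cells and vessels: a vessel segment correlates poorly with a sphere even where its probability is high.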
After finding the centroids of all detected cells, we can efficiently estimate their sizes. To do this, we center a small spherical template at the detected center of each cell and estimate the cell size by varying the template size. When the template can no longer be inscribed within the cell body, we observe a sharp decay in the correlation. Thus, we compute the correlation between the probability map and the template while increasing the diameter of the spherical template, find the maximum decrease in correlation, and select this diameter as our estimate of the cell size. This operation has low complexity and can be performed on the entire (cubic mm) dataset on a single workstation. Once we have detected cells, estimating the diameter of the cell body is a simple one-dimensional fitting problem.
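The size-estimation step reduces to a one-dimensional sweep over template radii. A sketch (the range of radii is an illustrative assumption):

```python
import numpy as np

def estimate_cell_radius(cell_prob, center, radii=range(2, 10)):
    """Sweep a growing spherical template over a detected centroid and
    return the radius just before the sharpest drop in mean probability,
    i.e. the largest sphere still inscribed in the cell body."""
    z0, y0, x0 = center
    z, y, x = np.ogrid[:cell_prob.shape[0],
                       :cell_prob.shape[1],
                       :cell_prob.shape[2]]
    dist2 = (z - z0)**2 + (y - y0)**2 + (x - x0)**2
    scores = [cell_prob[dist2 <= r * r].mean() for r in radii]
    drops = np.diff(scores)                     # correlation change per step
    return list(radii)[int(np.argmin(drops))]   # radius before largest decrease
```

Since each cell requires only a handful of masked means, the full dataset can be processed on one workstation, as noted above.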
To optimize each stage of our segmentation pipeline, we performed an exhaustive grid search to find the set of hyperparameters (i.e., threshold parameters for cell/vessel detection, the size of the spherical template, and the stopping criterion for the cell finder) that maximize a combination of the precision and recall (the F-score) between our algorithm’s output and manually annotated data from volume V1 (Fig. 4a-b). After tuning our cell detection algorithm to find the best set of hyperparameters, we obtained a precision and recall of on the same volume. Our initial results on this training volume and visual inspection of large-scale runs (Fig. 4c) suggest that our methods provide reliable maps of the cells and vessels in the sample.
The image data varies across space, due to various details of the imaging and reconstruction pipeline. Therefore, it is important to test that our segmentation algorithm works reliably across regions previously unseen during classifier training. We thus labeled and tested our cell detection algorithm on two additional test cubes, V2 and V3 (Fig. 4b), that are spatially disjoint from V1 and from each other. V2 served as an initial test set, since we added some sparse training data from this volume when training our ilastik classifier. V3 served as a held-out test set, as the location of this cube was unknown before tuning and running the algorithm on the entire dataset. After obtaining ground truth labels, we ran X-BRAIN on V2 and V3, using the set of parameters selected by optimizing our method on V1. The precision and recall were and , for V2 and V3, respectively. These results suggest that X-BRAIN generalizes well across different regions of the sample and is robust to fluctuations in brightness and contrast.
The variation in training and test volume performance can be partially explained by fluctuations in brightness introduced during tomographic image reconstruction. To understand the connection between fluctuations in contrast and the difficulty of the cell detection problem, we computed the SNR across multiple cells within each of the labeled volumes. The mean and standard deviation of the signal-to-noise ratio (SNR) between cells and their background in the three volumes were V1 =, V2 =, and V3 =. As expected, the precision and recall (for cell detection) appear to be correlated with the variance of the SNR in the volume (providing a measure of fluctuations in contrast). In particular, we obtain the lowest precision and recall for V2, and indeed, this volume exhibits the highest variance in the contrast between cells and their background. Even in light of these fluctuations in brightness, our sensitivity analysis (Fig. 4a) and results (Fig. 4b) on training and test volumes suggest that X-BRAIN generalizes well across different regions of the volume. Furthermore, when we visually inspect our large-scale results (Fig. 4c), we find a good correspondence between cells and vessels that are visible in slices and those detected by our algorithms. These results suggest that our algorithms are robust and can be applied at large scale.
We applied our pipeline to segment vessels and detect cells in a cubic mm sample ( voxels) of excised brain tissue collected from mouse somatosensory cortex (Fig. 4). To apply X-BRAIN to large datasets, we created an analytics workflow that uses (but does not require) the LONI Pipeline environment to automatically distribute jobs across a cluster. Our workflow is parallelized by dividing the large dataset into small data blocks which can be processed independently, based upon a user-specified graphical (XML-based) description of the dependencies between the various algorithms. Running our analytics pipeline on a cubic mm sample took approximately six hours on a small 48-core cluster (see Methods). As a result, we detected cells over the extent of the analyzed sample (0.42 mm³), which suggests a density of cells per mm³.
To compute the spatially varying density of cells, we applied a robust non-parametric approach for density estimation. Adopting a non-parametric approach enables us to obtain an accurate estimate of the distribution without making any restrictive assumption about its form. In particular, we rely on the popular k-nearest neighbors (kNN) density estimation algorithm [23, 24], which estimates a distribution using only distances between the samples (cells) and their nearest neighbors. When applied to the entire volume, we calculated an average density of cells per mm³ (Fig. 5a). These results are comparable to other studies that estimate an average of cells per mm³ in mouse cortex, both in terms of our average and the spread in the distribution. These density estimates provide important information about the spatially varying distribution of cells within the sample.
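The kNN estimator evaluates density at a query point as k divided by N times the volume of the smallest ball reaching the k-th nearest sample. A minimal sketch (k = 5 is an arbitrary illustrative choice, not the paper's setting):

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_density(points, query, k=5):
    """k-nearest-neighbor density estimate in 3D: k over (N times the
    volume of the ball reaching the k-th neighbor)."""
    tree = cKDTree(points)
    dist, _ = tree.query(query, k=k)
    r = dist[..., -1]                          # distance to k-th neighbor
    ball_vol = (4.0 / 3.0) * np.pi * r**3      # 3D ball volume
    return k / (len(points) * ball_vol)
```

Because the ball radius adapts to the local sampling, the estimate tracks spatial variation in cell density without assuming any parametric form.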
The location of cell bodies, relative to one another and relative to the vasculature, is important for studying diseases that afflict the brain. Thus, we developed automated tools to compute distances between detected cell centers (cell-to-cell distances, Fig. 5b) and distances between each cell and the closest segmented vessel (cell-to-vessel distances, Fig. 5c). Cell-to-vessel distances are spread between 10-40 μm, with very few cells exceeding this distance (). In contrast, the cell-to-cell distances appear to be much more concentrated, with a strong peak at and much smaller variance (). The distribution of distances between cells and vessels (Fig. 5b) aligns with previous results [25, 26] and confirms the accuracy of our approach for large-scale analysis. We further estimated that the fractional volume of vessels in the sample was . This estimate is in agreement with previous studies [27, 25, 26], which estimate the fractional density of vessels in the cortex to range from . These results further confirm that our methods can be used to compute information about the relationship between cells and vasculature in the brain.
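Both distance statistics are inexpensive to compute from the segmentation outputs, for example with a k-d tree for cell-to-cell distances and a Euclidean distance transform for cell-to-vessel distances (a sketch, not the paper's implementation):

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import cKDTree

def cell_to_cell_distances(centroids):
    """Distance from each cell centroid to its nearest neighboring cell."""
    tree = cKDTree(centroids)
    dist, _ = tree.query(centroids, k=2)       # k=1 is the point itself
    return dist[:, 1]

def cell_to_vessel_distances(centroids_vox, vessel_mask, voxel_um=1.0):
    """Distance from each centroid to the nearest segmented-vessel voxel,
    via a Euclidean distance transform of the non-vessel region."""
    dt = ndimage.distance_transform_edt(~vessel_mask) * voxel_um
    idx = tuple(np.asarray(centroids_vox, dtype=int).T)
    return dt[idx]
```

The distance transform is computed once per volume, so looking up thousands of cells costs essentially nothing beyond that single pass.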
To complement our analysis tools, we developed methods to produce and visualize mesoscale maps, with the cellular density and vasculature as their output. These methods are integrated into the NeuroData framework; thus, after running a sample through our pipeline, users can download different descriptions of the neuroanatomy, either alone or combined with the image data, to help reveal relevant structures in the images. Using these multiple modes of visualization (Fig. 5d), we identified a 3D structure with extremely high cell density clustered at the bottom of the sample (Layer 6). We confirmed this structure in 3D visualizations (left), as well as in X-ray micrographs, the cell probability maps, and our estimate of the cell density (right). All of these representations provide information and descriptions of the data that can be used to further visualize and quantify its neuroanatomy. The combination of dense reconstructions of cells and blood vessels provides a unique approach for studying the joint distribution of brain cytoarchitecture and vasculature.
We have shown that μCT can be used to rapidly quantify mesoscale neuroanatomy in a millimeter-scale sample without sectioning. Our results demonstrate how osmium-stained and plastic-embedded brains, in conjunction with a synchrotron X-ray source, produce sufficient contrast and resolution to automatically detect blood vessels and cell bodies. Our approach to automated anatomy is uniquely poised to provide detailed, large, mesoscale maps of the brain.
Our current approach does not yet provide enough resolution and contrast to resolve neural processes or reliably disambiguate cell types. In some cases we can resolve large cellular processes, like apical dendrites, from which we can discern that a cell is indeed a neuron, and potentially a pyramidal neuron. The resolution of μCT can be enhanced towards the nanometer scale, which would allow us to distinguish neuronal and non-neuronal types by shape. To complement cell-type classification by shape, it is possible to develop genetic or immunohistochemical approaches for differential labeling of distinct neuronal and non-neuronal cell types for μCT. Future applications of these techniques to enhance resolution and cell typing will enable a more detailed understanding of brain architecture through μCT.
High-resolution approaches threaten to damage the specimen, as the X-ray dose increases quartically with the resolution. For instance, beam damage can induce changes in sample geometry while the tomogram is being acquired, leading to reconstruction artifacts and the degradation of spatial resolution. However, the effect of radiation damage is greatly reduced in cold samples. As an example, the critical dose for mass loss in plastics (PMMA) is increased from 35 MGy at room temperature to 600 MGy when operating at a temperature of 113 K. X-rays have the potential for sub- nm resolution 3D imaging of frozen hydrated brain biopsies with no chemical modification or plastic embedding. Additional parameters of the imaging setup, including photon energy, coherence, and the optics can also be optimized to minimize damage while increasing resolution.
Our imaging data exhibit ring artifacts that result from inhomogeneities in the signal source. The beamline used in our current study (2-BM) uses a double multi-layer monochromator (DMM) to select a narrow bandwidth of the X-ray beam (ΔE/E = ). Multi-layers are known to introduce a significant amount of structure in the flat field (the image without a sample). Such structure, when convolved with fluctuations of the source, generates imperfections in the flat-field correction and produces ring artifacts that are difficult to correct, despite the use of ring-artifact removal algorithms during image conversion (left, Fig. 1c). Although our segmentation methods are relatively robust to these low-frequency artifacts, improvements to the tomography setup are important to improve data quality. Preliminary datasets collected at beamline 32-ID, a beamline that uses a short-period undulator (U18) instead of a DMM, yield radiographs that are almost artifact-free (see Supp. Fig. 3). Future improvements in source stability promise to yield tomographic reconstructions of considerably higher quality.
Our current analysis pipeline is only designed to provide automatic segmentation of cells and vasculature. We currently do not address other aspects of the cytoarchitecture and vasculature important in neuroanatomy, such as cell shapes and neurites. Leveraging both computational methods and other histological preparations, we could develop more advanced approaches for distinguishing different cell types in X-ray tomograms. With significant further development and a massive-scale ground truthing effort, it should be possible to quantify the shape and morphology of neurons, as well as track neurites.
Limited ground truth data restricts the complexity of the methods that we can apply. With more training data, we could leverage more advanced nonlinear classification strategies, such as convolutional neural networks, for segmentation and axon tracing. Such approaches have been shown to achieve state-of-the-art performance for identifying synapses and segmenting cell bodies in EM data [31, 32]. Finally, improvements in spatial resolution will help with the challenging problem of resolving adjacent neural structures as separate objects, leading to more straightforward and robust approaches for cell detection.
One of the advantages of our staining method is that we can apply μCT and automated EM imaging to the same sample. Our results suggest that both techniques provide complementary images of the same sample without requiring modifications to existing EM techniques. There are several potential advantages to combining these approaches: (i) providing large-volume (but coarser-resolution) inputs for EM reconstructions of synaptic connectivity, (ii) providing fiducials from inside the plastic block prior to deformation by cutting and thin-section imaging for EM, and (iii) providing ‘pre-segmentation’ of many cellular components to aid existing EM segmentation algorithms. Combining the advantages of μCT and EM promises to make both approaches stronger.
The high precision and recall of our algorithms suggest that our segmentation and cell detection methods can be used to reliably and quickly survey data volumes and identify cells and vessels in the sample. We can use these methods to build more systematic studies of regions of interest with EM, once the large-scale structure is identified using μCT. Information about where cells and vessels lie can be used as a prior in segmentation algorithms (in EM) and also to improve subsequent registration and alignment. As our pipeline for X-ray image analysis has been integrated into NeuroData, we can readily combine existing EM analysis pipelines in NeuroData with our methods to analyze the same dataset with μCT and EM. These results can be combined to create a multi-modal brain map that contains information about the cytoarchitectural and cerebrovascular properties of a sample, in addition to the fine-scale information about processes and synapses afforded by EM.
Knowledge about the macro-scale organization of the brain, such as Brodmann maps, has been based primarily on human anatomists working with thin, sparsely labeled slices of brains. However, with developments in large-scale connectomics with EM [34, 35] and the techniques we present here for μCT, far larger and more comprehensive datasets become possible; datasets so large that no human could possibly digest them. The capabilities of synchrotron-source X-ray microscopy, combined with staining approaches for whole-brain preparation, offer the possibility of imaging entire brains at the mesoscale. With these capabilities, it should become possible to obtain brain maps in a new, data-driven fashion, enabling the massive-scale quantification of a broad set of effects related to disease, development, and learning in the brain.

We developed methods to image, segment, and analyze the neuroanatomical structure of brain volumes to quantify neuroanatomy with μCT. Our methods consist of three main parts: (1) sample preparation, (2) μCT and 3D image reconstruction, and (3) automatic segmentation and analysis of brain volumes.
To prepare the samples used in this paper, we used previous techniques for large-volume EM. Mice were anesthetized and transcardially perfused with aldehydes (2 percent PFA and 2 percent glutaraldehyde), stained with heavy metals (osmium, uranium, and lead), dehydrated, and embedded in plastic (EPON). The main dataset analyzed in this paper is taken from mouse somatosensory cortex (S1).
After preparing the sample, we used synchrotron-based μCT to image 3D volumes of brain tissue at micron isotropic resolution. We subsequently ultra-thin sectioned the same tissue block with our approach to automated electron microscopy and collected low-resolution EM micrographs ( nm pixel resolution). In these low-resolution EM images, we identified the same pattern of cell bodies and vasculature that was found in the equivalent volume of X-ray data. Fine-resolution EM micrographs (3 nm pixel resolution) were then collected to identify synapses in the EM volumes. Since these labeling approaches are species-independent (i.e., they do not depend on transgenic strategies), we can apply this approach to human (and other) brain biopsies.
To collect the μCT datasets described here, we utilized the 2-BM beamline at the Advanced Photon Source (APS) with exposure times of second per projection and projections. The 2-BM beamline uses a thick LuAG:Ce scintillator to convert the propagation-enhanced X-ray wavefield into visible light, which a microscope objective magnifies onto a visible-light scientific CMOS camera (pco.edge camera with pixels). When using a 10X objective, this yields a projection with a pixel size of at the sample plane. We utilized propagation-based phase-contrast X-ray imaging to obtain high-contrast tomograms of millimeter-sized regions of plastic-embedded and metal-stained mouse brain. Imaging a volume at 1 micron isotropic resolution takes approximately six minutes and requires no volume alignment or registration post-acquisition. The X-ray energy bandwidth was about eV, which means that the data are largely free of the “beam hardening” effect that otherwise complicates medical imaging with laboratory X-ray sources (the X-ray spectrum changes with depth in tissue as the lower-energy X-rays are absorbed, leading to changes in the Fresnel fringes used to obtain phase contrast). We are thus able to obtain data around times faster than with laboratory sources, and with higher image quality.
Datasets were collected in Hierarchical Data Format (HDF) files using the Data Exchange schema developed for synchrotron data. Data processing and image reconstruction were performed using the TomoPy toolbox, an open-source Python package developed at the Advanced Photon Source (APS) for tomographic data analysis. We first normalized the projection images with the incident X-ray measurements to suppress artifacts originating from imperfections in the detection process. A wavelet-Fourier filter was used to further suppress these artifacts, with ten wavelet levels and an offset suppression value of two. We used a Paganin-type single-step phase retrieval algorithm to retrieve the phase of the transmitted X-ray signal. The location of the rotation center was estimated either automatically, using an optimization approach that minimizes the entropy of the reconstructions, or manually when signal-to-noise levels were high. The tomographic reconstructions were performed using the GridRec algorithm, a fast implementation of the conventional filtered back-projection method.
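GridRec produces the same reconstruction as filtered back-projection (FBP), just computed faster via Fourier gridding. The underlying idea can be sketched in a few lines of NumPy for a single 2D slice; this conceptual toy ignores phase retrieval, ring removal, and the gridding tricks that make GridRec fast:

```python
import numpy as np
from scipy.ndimage import rotate

def project(img, thetas_deg):
    """Parallel-beam forward projection: rotate the slice, sum along rays."""
    return np.stack([rotate(img, -t, reshape=False, order=1).sum(axis=0)
                     for t in thetas_deg])

def fbp(sino, thetas_deg):
    """Minimal filtered back-projection: ramp-filter each projection in
    Fourier space, then smear it back across the image at its angle."""
    n = sino.shape[1]
    ramp = np.abs(np.fft.fftfreq(n))                 # ramp filter |nu|
    filtered = np.real(np.fft.ifft(np.fft.fft(sino, axis=1) * ramp, axis=1))
    recon = np.zeros((n, n))
    for t, row in zip(thetas_deg, filtered):
        recon += rotate(np.tile(row, (n, 1)), t, reshape=False, order=1)
    return recon * np.pi / (2 * len(thetas_deg))     # overall scale
```

In the real pipeline these steps map onto TomoPy's normalization, stripe removal, phase retrieval, rotation-center search, and GridRec reconstruction calls.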
Each image reconstructed in TomoPy is pixels ( isotropic) and is initially stored with 32-bit float precision. We utilized the multiple-image processor tool in Fiji (ImageJ) to convert the bit depth of each μCT image to 8 bits. By computing the average number of bits of information in each pixel of the original image, we confirmed that an 8-bit depth was sufficient to capture the information in the μCT stack. Visual inspection also confirmed this choice of bit depth, with no visible loss of data quality due to quantization. Following this, we applied an automatic contrast enhancement filter to each image in the stack in Fiji.
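The float-to-8-bit conversion amounts to rescaling each slice into the 0-255 range. A sketch (the percentile clipping is an illustrative choice; Fiji's converter uses the current display range instead):

```python
import numpy as np

def to_uint8(slice32, low_pct=0.05, high_pct=99.95):
    """Rescale a 32-bit float reconstruction slice to 8 bits, clipping
    extreme percentiles so outliers do not compress the useful range."""
    lo, hi = np.percentile(slice32, [low_pct, high_pct])
    scaled = np.clip((slice32 - lo) / (hi - lo), 0.0, 1.0)
    return (scaled * 255).astype(np.uint8)
```

This single step accounts for most of the 95 GB to 10 GB reduction reported below, since it quarters the bytes per voxel before masking.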
To reduce the size of the data and the complexity of subsequent processing steps, we applied a semi-automated masking algorithm to identify regions in each image where the biological specimen is present (versus background pixels). Our method requires a user to draw a mask on a small number of images in the dataset, and then finds a smooth interpolation of the mask between these labeled frames. This toolbox utilizes the roipoly tool in Matlab, which provides a GUI for drawing a polygon around the brain tissue in each image. The semi-automated masking step is a small part of the overall workflow and is only performed once per dataset. After reducing the bit depth and masking the data, the dataset is reduced from 95 GB to only 10 GB.
The image volume that we analyze in this paper is voxels, which corresponds to a volume of microns ( cubic millimeters). As the sample is rotated within the field of view (sample plane), we compute that the unmasked voxels represent a volume of approximately cubic mm.
We uploaded the raw and masked images into an open-access platform for neuroimaging datasets called NeuroData (www.neurodata.io). We also store the annotations and segmentations resulting from our analysis in the NeuroData spatial database to facilitate rapid access, dissemination, and analysis of the data. This framework allows researchers to freely download arbitrary volumes of raw data, manual labels, or automated annotations for algorithm development or analysis. Users may also query the metadata of detected cells within a volume, which enables rapid knowledge extraction from the X-ray datasets and statistical analyses at scale.
Our methodology, end-to-end pipeline, algorithms, data, and data derivatives are all open source and available for others to reproduce and leverage for further scientific discovery. As described above, we facilitate this open access via integration with the NeuroData ecosystem, which provides tools and infrastructure to store, visualize, parse, and analyze big neuroscience data. Both the raw data and its derivatives are freely available on NeuroData for download and visualization (Fig. 4 in Supp. Materials).
To compute human-to-human agreement and evaluate the performance of our methods, we developed tools to compare segmentations at both the pixel and the object level. Detected pixels/objects that do not appear in the manual segmentation are counted as false positives, and manually identified pixels/objects not found by the automatic segmentation algorithm are counted as false negatives (misses). In all of our evaluations, we compute the precision (P), recall (R), and f score

f_β = (1 + β²) · P · R / (β² · P + R),

where the weight β trades off the importance of precision and recall. When evaluating the performance of our methods for detecting cells (object-level errors), we compute matches between two sets of centroids by identifying cell pairs in the different segmentations that are nearest neighbors. If the matching centroids are within a fixed distance of one another, we label them a match and remove both cells from consideration to avoid duplicate assignments. The matching process iterates until all possible matches are found, and precision and recall are then computed. For cell detection, we use β = 1, which places equal weight on precision and recall. In the case of the pixel-level segmentation of vessels, however, we observe that optimizing an f score with a different weighting produces more accurate results (confirmed by visual inspection).
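The matching procedure can be sketched as follows. This is a simplified greedy implementation (the exact matching rules and distance cutoff in the original pipeline may differ), paired with the f score computation:

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_centroids(detected, truth, max_dist):
    """Greedily pair the closest remaining centroids within max_dist,
    removing each matched pair so no cell is matched twice."""
    d = cdist(np.asarray(detected, float), np.asarray(truth, float))
    tp = 0
    while d.size and d.min() <= max_dist:
        i, j = np.unravel_index(np.argmin(d), d.shape)
        tp += 1
        d = np.delete(np.delete(d, i, axis=0), j, axis=1)
    return tp, len(detected) - tp, len(truth) - tp   # tp, fp, fn

def f_beta(tp, fp, fn, beta=1.0):
    """F-score with precision/recall weighting beta."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r) if p + r else 0.0

detected = [(0, 0, 0), (10, 10, 10), (50, 50, 50)]
truth = [(1, 0, 0), (10, 11, 10)]
tp, fp, fn = match_centroids(detected, truth, max_dist=3.0)
# tp = 2, fp = 1, fn = 0
```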
To obtain ground truth datasets for quantifying the performance of our algorithms and assessing human-to-human agreement, we used a total of four trained annotators (A0, A1, A2, A3) and five novices to label different sub-volumes (V0, V1, V2, V3) of our image dataset using ITK-SNAP.
Two of the trained experts (A0, A1) and the five novices labeled cells and vessels in V1, a micron cube of data ( voxels). Annotator A0 was instructed to produce a saturated reconstruction, in which all cells and vessels are fully labeled. A1 produced a saturated segmentation of a sub-volume of V1, which we denote V0. To compute human-to-human agreement, we computed the voxel-wise precision and recall between V0-A0 (treated as ground truth) and V0-A1, which we find to be for cell bodies and for vessels. While precision is high in both cases, recall is much lower, because A1's labels underestimate A0's; we verified this by dilating A1's labels until the f score between the two annotations was maximized, which yields a precision of for cell bodies and for vessels.
We then computed the agreement between these annotators in detecting cell centroids. We first cleaned each segmentation to ensure all cells are disconnected from one another. We then applied a connected component algorithm and found the center of mass of each component to estimate the centroid of each cell. We matched centroids across the two annotations and computed object-level precision and recall. When ignoring cells along the boundaries of the volume, there are no cells identified by A1 that are not identified by A0 and only one cell identified by A0 that was not identified by A1. Thus, human-to-human agreement is nearly perfect when asked to identify cell centers.
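The centroid-extraction steps above can be sketched with scipy.ndimage; this is a minimal illustration, not the authors' exact code:

```python
import numpy as np
from scipy import ndimage

def cell_centroids(segmentation):
    """Label disconnected cell segments and return the center
    of mass of each component as an (n, 3) array."""
    binary = segmentation > 0
    labels, n = ndimage.label(binary)
    if n == 0:
        return np.empty((0, 3))
    return np.array(ndimage.center_of_mass(binary, labels, list(range(1, n + 1))))

# toy volume with two separated cell-like blobs
vol = np.zeros((20, 20, 20), dtype=np.uint8)
vol[2:5, 2:5, 2:5] = 1
vol[12:15, 12:15, 12:15] = 1
c = cell_centroids(vol)
# two centroids, at (3, 3, 3) and (13, 13, 13)
```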
In addition to computing human agreement, we acquired additional volumes for testing our algorithms. For these purposes, we had another expert annotator (A2) densely label the cells and partially label the vessels in a training volume (V2). Annotator A1 then edited all cells in this volume; we denote the result V2-A12. Finally, to test our methods, we had an external party randomly select a sub-volume, at a location unknown to the authors of this manuscript, to be used as a hold-out test volume. Annotators A0 and A3 then iteratively refined a common estimate of the cell centroids in this volume (V3); we refer to this annotation as V3-A03. V3 is used as a hold-out test set to evaluate the accuracy of our cell detection method.
To quantify the time required to label the centroids of cell bodies, we recruited five subjects with no previous experience to label the centers of cell bodies in 3D. Each subject was instructed to label as many cells as possible in thirty minutes. The average number of cells that these subjects labeled was and the median was . These results suggest that a novice can label the centroids of around cells in one hour. In practice, we find that it takes experts around 5 hours to reliably label all cell centers in a volume. From estimates of the cell density in mouse cortex, we expect around 120,000 cells per cubic millimeter; manually annotating all cells in a cubic mm would therefore require a projected 1,200 person-hours, or 50 days of around-the-clock labeling.
To estimate the intrinsic difficulty of separating cells from their background, we calculated the ratio of image intensity between cells and their exteriors. To do this, we sampled 10 cells every 25 slices () in each of the three manually annotated volumes (V1, V2, V3) using ITK-SNAP. We placed a small circular marker within the cell's boundary and a second marker outside of the cell, in a location where the cell's boundary is clearly resolved. This generated 30 samples in both V1 and V2, and 89 samples in V3, of the brightness inside (signal) and outside (noise) each cell. We then computed the signal-to-noise ratio (SNR) for each cell as

SNR (dB) = 20 · log10(μ_s / μ_n),

where μ_s (signal) and μ_n (noise) are the mean values of the labeled pixels within and outside of the labeled cell, respectively. The mean and standard deviation of the SNR (dB) across each subvolume are: V1 = , V2 = , V3 = . We observe the largest variance in SNR in V2 and the lowest average SNR in V3; the training volume V1 has the highest mean and lowest variance of the three volumes. Our estimates of SNR appear to be predictive of the difficulty of the segmentation task, and are thus correlated with the accuracy of our segmentation results on the different volumes.
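A minimal sketch of this computation, assuming the 20·log10 amplitude-ratio convention for decibels (the exact convention used in the original analysis is an assumption here):

```python
import numpy as np

def snr_db(volume, signal_mask, noise_mask):
    """SNR in decibels from the mean intensity inside the cell (signal)
    versus a nearby exterior region (noise): 20*log10(mu_s / mu_n)."""
    mu_s = volume[signal_mask].mean()
    mu_n = volume[noise_mask].mean()
    return 20.0 * np.log10(mu_s / mu_n)

# toy slice: bright cell interior on a dimmer background
rng = np.random.default_rng(0)
vol = rng.normal(100.0, 1.0, size=(32, 32))
sig = np.zeros_like(vol, dtype=bool); sig[10:14, 10:14] = True
noi = np.zeros_like(vol, dtype=bool); noi[20:24, 20:24] = True
vol[sig] += 100.0                  # interior roughly twice as bright
val = snr_db(vol, sig, noi)
# roughly 20*log10(200/100), i.e. about 6 dB
```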
To facilitate biological interpretation and knowledge extraction from μCT datasets, we developed a suite of computer vision and image processing methods to segment and analyze mesoscale data. We refer to this set of tools as X-BRAIN (X-ray Brain Reconstruction, Analytics, and Inference for Neuroscience). We first give an overview of the modules in X-BRAIN, and then describe each component of the toolkit in more detail.
Data download: Using NeuroData web services, we provide scripts for accessing subvolumes of the raw and masked data.
Segmentation: Using the ilastik classification tool, we produce a three-class probability map which encodes the probability of each voxel belonging to the class ‘cell,’ ‘blood vessel,’ or ‘other’ (background). We use these probability maps to obtain a segmentation of the vessels and cells in the volume.
Estimating the size and location of cell bodies: To estimate the position of cells in the volume, we developed an iterative cell detection approach that applies a fast 3D convolution method to the ilastik probability maps. We then estimate the size of each cell from its detected center.
Methods for computing spatial statistics: We apply a robust density estimation technique to detected centroids, which provides a non-parametric estimate of the underlying density of cells in the sample. Additionally, we provide methods to compute the distance between detected cells and vessels in addition to spatially-varying vessel density measures.
Data upload: The raw and masked image datasets, the cell and vessel probability maps, and the 3D segmentations are uploaded to NeuroData spatial databases to allow for later queries and analysis.
(1) Computing probability maps with ilastik.
The first step of our segmentation pipeline is to perform pixel-level classification on the (2D) X-ray images, which provides the posterior probabilities that each voxel is a cell, a vessel, or lies in the background (other). We applied a tool called ilastik, which trains a random forest classifier from sparse (manual) annotations of class labels in the data. Ilastik provides an interactive method to compute and examine feature channels; we selected a variety of patch-based texture features at different scales for this analysis.
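This workflow can be approximated in a few lines with scikit-learn: train a random forest on sparse scribbles over multiscale intensity features. This is a simplified stand-in for ilastik's richer feature set, not the tool itself:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.ensemble import RandomForestClassifier

def multiscale_features(img, sigmas=(1, 2, 4)):
    """Per-pixel features: raw intensity plus Gaussian smoothings at
    several spatial scales, as an (n_pixels, n_features) array."""
    feats = [img] + [gaussian_filter(img, s) for s in sigmas]
    return np.stack(feats, axis=-1).reshape(-1, len(feats))

# toy image: a bright blob ('cell') on a noisy background
rng = np.random.default_rng(0)
img = rng.normal(0.0, 0.1, (32, 32))
img[8:16, 8:16] += 1.0

# sparse scribbles, as a user would draw in ilastik (-1 = unlabeled)
scribble = np.full((32, 32), -1)
scribble[10:14, 10:14] = 1        # 'cell' scribble
scribble[0:6, 0:6] = 0            # 'background' scribble

X = multiscale_features(img)
mask = scribble.reshape(-1) >= 0  # train only on annotated pixels
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X[mask], scribble.reshape(-1)[mask])
prob = clf.predict_proba(X)[:, 1].reshape(32, 32)   # 'cell' probability map
```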
(2) Vessel segmentation. After computing the vessel probability map with ilastik, we threshold the probability map (see Fig. 4b for the sensitivity of our algorithm to different choices of threshold), dilate the resulting binary output, and then remove spurious connected components based on a minimum size threshold. After applying these simple morphological filtering operations, we find that the resulting segmentation has higher agreement with the (manually segmented) ground truth than the labels produced by a second manual annotator.
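These morphological filtering operations can be sketched with scipy.ndimage; the threshold, dilation, and size parameters below are illustrative placeholders, not the tuned values from the pipeline:

```python
import numpy as np
from scipy import ndimage

def segment_vessels(prob_map, threshold=0.5, dilation_iters=1, min_size=50):
    """Threshold a vessel probability map, dilate the binary result,
    and drop connected components smaller than min_size voxels."""
    binary = ndimage.binary_dilation(prob_map > threshold,
                                     iterations=dilation_iters)
    labels, n = ndimage.label(binary)
    sizes = ndimage.sum(binary, labels, list(range(1, n + 1)))
    keep_labels = 1 + np.flatnonzero(np.asarray(sizes) >= min_size)
    return np.isin(labels, keep_labels)

# toy map: one long 'vessel' and one speck of noise
p = np.zeros((10, 10, 40))
p[5, 5, 5:35] = 0.9     # vessel: 30 voxels before dilation
p[1, 1, 1] = 0.9        # isolated speck
seg = segment_vessels(p, threshold=0.5, dilation_iters=1, min_size=20)
```

The size filter removes the isolated speck while keeping the tube-like component, mirroring the spurious-component removal described above.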
(3) Greedy sphere finding approach for cell detection. While ilastik provides a good starting point for identifying cell body locations, individual cells and vessels are often hard to distinguish by simply thresholding the probability map. To separate these components into their constituent parts (cells and vessels), we developed a greedy approach that is similar in spirit to matching pursuit algorithms for sparse signal recovery. The main idea behind our approach is to iteratively refine our estimate of a cell's position and then “remove” that cell from the data. We first create a spherical template with a diameter roughly equal to that of a cell; the exact choice of this parameter was learned through a hyperparameter search (see below). We use a 3D FFT to convolve the spherical template with the cell probability map produced by ilastik. This produces a “sphere map” that yields high responses in regions likely to contain cell bodies. At each step of our algorithm, we select the global maximum of the sphere map as the centroid of the next detected cell. After finding this cell, we zero out the probability map in this region so that a candidate cell cannot be selected in the same location again, and repeat this matching procedure until convergence. We define convergence as the point at which the correlation between the probability map and our template drops below a user-specified threshold, or the maximum number of iterations is reached.
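The greedy procedure can be sketched as follows; this is a simplified implementation, and the convergence threshold and template radius here are illustrative rather than the values learned by the hyperparameter search:

```python
import numpy as np
from scipy.signal import fftconvolve

def sphere_template(radius):
    """Binary ball of the given radius, normalized to unit sum."""
    r = int(radius)
    z, y, x = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1]
    ball = (z * z + y * y + x * x <= r * r).astype(float)
    return ball / ball.sum()

def greedy_cell_finder(prob_map, radius=3, stop_corr=0.5, max_cells=1000):
    """Iteratively pick the global maximum of the template response,
    record it as a cell centroid, zero out that region, and repeat."""
    prob = prob_map.copy()
    template = sphere_template(radius)
    centroids = []
    for _ in range(max_cells):
        response = fftconvolve(prob, template, mode='same')
        peak = np.unravel_index(np.argmax(response), response.shape)
        if response[peak] < stop_corr:      # convergence criterion
            break
        centroids.append(peak)
        z, y, x = peak                      # zero out the detected cell
        zz, yy, xx = np.mgrid[:prob.shape[0], :prob.shape[1], :prob.shape[2]]
        prob[(zz - z) ** 2 + (yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2] = 0.0
    return centroids

# toy probability map with two well-separated 'cells'
p = np.zeros((20, 20, 20))
zz, yy, xx = np.mgrid[:20, :20, :20]
for c in [(5, 5, 5), (14, 14, 14)]:
    p[(zz - c[0]) ** 2 + (yy - c[1]) ** 2 + (xx - c[2]) ** 2 <= 9] = 1.0
found = greedy_cell_finder(p, radius=3, stop_corr=0.5)
```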
(4) Hyperparameter searches. We developed a tool to run hyperparameter searches over our methods to maximize performance on the ground truth volume V1. After exploring the parameter space, we ran a grid search over the most critical parameters (cell size, dilation, and threshold cutoff) to find a stable, optimal point. We selected the parameters (cell size: 18, dilation: 8, threshold: 0.47) that maximized the harmonic mean of precision and recall. Because voxels on the edge of a volume are inherently ambiguous for both human and machine annotators, we disregard objects at the edge (of both the detected and ground-truth volumes) when computing precision and recall scores throughout this manuscript, to ensure the most representative results.
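The grid search itself is straightforward to sketch; the objective below is a toy stand-in for evaluating the f score against the ground-truth volume:

```python
import itertools

def grid_search(evaluate, grid):
    """Exhaustively score every parameter combination and return
    the best-scoring setting and its score."""
    best, best_score = None, float('-inf')
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = evaluate(**params)
        if score > best_score:
            best, best_score = params, score
    return best, best_score

# hypothetical objective, peaked near (cell_size=18, threshold=0.47)
def toy_f1(cell_size, threshold):
    return 1.0 - abs(cell_size - 18) / 20 - abs(threshold - 0.47)

grid = {'cell_size': [14, 16, 18, 20],
        'threshold': [0.27, 0.47, 0.67]}
best, score = grid_search(toy_f1, grid)
# best == {'cell_size': 18, 'threshold': 0.47}
```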
(5) Non-parametric density estimation. To compute the density of detected cells within a volume, we applied a k-nearest neighbor (kNN) density estimation algorithm [23, 24], which estimates the density using only the distances between the samples (cells) and their kth nearest neighbor. More concretely, we define the distance between a centroid vector x and a matrix X of centroids as

d_k(x, X) = ||x − x_(k)||,

where x_(k) is the kth nearest neighbor to x contained in the columns of X. The value of the empirical distribution at x is then estimated using the following consistent estimator:

p̂(x) = k / (n · v_D · d_k(x, X)^D),

where D = 3, v_D is the volume of the unit ball in D dimensions, n is the number of detected cells, and X contains the centroids of the rest of the detected cells in the sample. We compute this quantity over a 3D grid, where the volume of each bin in the sample grid is . We selected this bin size to ensure that detected cells lie in roughly a single grid point, a choice we further confirmed by visually inspecting the resulting density estimates. After computing the density for each 3D bin in our grid, we normalize to obtain a proper probability density function. Finally, we compute an estimate of the number of cells per cubic mm. The intuition behind this approach is that in regions with a higher density of samples, the distance d_k(x, X) is very small, and thus the probability of generating a sample at this location is large. In practice, we choose k to grow with the number of samples n while k/n converges to 0 as n grows, which guarantees that the estimates asymptotically converge to the exact point estimates of the distribution.
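The standard kNN density estimator — k over n times the volume of the ball reaching the kth nearest neighbor — can be sketched with a k-d tree; this minimal illustration omits the grid binning and normalization steps:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_density(centroids, query_points, k):
    """Consistent kNN density estimate in 3D: k / (n * volume of the
    ball whose radius is the distance to the kth nearest neighbor)."""
    centroids = np.asarray(centroids, float)
    n = len(centroids)
    tree = cKDTree(centroids)
    d_k = tree.query(query_points, k=k)[0][:, -1]   # kth-NN distance
    ball_volume = 4.0 / 3.0 * np.pi * d_k ** 3
    return k / (n * ball_volume)

# dense cluster vs sparse region: density should be higher in the cluster
rng = np.random.default_rng(0)
dense = rng.uniform(0, 10, size=(500, 3))
sparse = rng.uniform(0, 100, size=(50, 3))
cells = np.vstack([dense, sparse])
queries = np.array([[5.0, 5.0, 5.0], [80.0, 80.0, 80.0]])
rho = knn_density(cells, queries, k=10)
```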
After validating and benchmarking our algorithms, we scaled our processing to the entire dataset of interest (x voxels: 610-2010, y: 1-2480, z: 390-2014, resolution: 0) using the LONI processing environment. Leveraging LONI allows us to quickly build interfaces to algorithms written by different research groups in different languages and assemble them into a cohesive pipeline. These algorithms have well-defined interfaces and can also be repackaged for use in a different meta-scheduler environment. When running at scale, we first chunked our data into small cuboids meeting our computational constraints, and then ran each block through the pipeline. NeuroData was used to retrieve and store data; image data was requested for each computed block and the results were written to a spatially co-registered annotation channel. Each block was retrieved with sufficient padding to provide edge context; we processed these blocks in parallel and uploaded the resulting detections to NeuroData. We also implemented a merging strategy to account for cells near block boundaries: briefly, we eliminate putative detections that touch a block edge or that overlap an object already present in the database, further reducing edge effects.
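The chunking-with-padding strategy can be sketched as follows; this is a minimal illustration, with the block sizes chosen arbitrarily and the NeuroData I/O omitted:

```python
import numpy as np

def padded_blocks(shape, block, pad):
    """Yield (core_slices, padded_slices) pairs that tile a volume:
    each block is fetched with extra context, processed, and the
    result is cropped back to the core region before upload."""
    for z0 in range(0, shape[0], block[0]):
        for y0 in range(0, shape[1], block[1]):
            for x0 in range(0, shape[2], block[2]):
                core = tuple(slice(o, min(o + b, s))
                             for o, b, s in zip((z0, y0, x0), block, shape))
                padded = tuple(slice(max(c.start - pad, 0), min(c.stop + pad, s))
                               for c, s in zip(core, shape))
                yield core, padded

shape = (100, 100, 100)
blocks = list(padded_blocks(shape, block=(50, 50, 50), pad=8))
# the core regions tile the volume exactly once
covered = np.zeros(shape, dtype=int)
for core, padded in blocks:
    covered[core] += 1
```

Cropping results back to the core region before upload keeps detections from being double-counted across neighboring padded blocks.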
This research used resources of the U.S. Department of Energy (DOE) Office of Science User Facilities, operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357. Support was provided by: NIH U01MH109100 (ELD, HLF, DG, XX, CJ, NK, and KK), the IARPA MICRONS project (NK), an educational fellowship from the Johns Hopkins University Applied Physics Laboratory (WGR), the Defense Advanced Research Projects Agency (DARPA) SIMPLEX program through SPAWAR contract N66001-15-C-4041, and DARPA GRAPHS N66001-14-1-4028. The sample shown in the Supp. Materials was graciously provided by Haruo Mizutani. The authors thank Jordan Matelsky from JHU and NeuroData for his help in creating 3D visualizations of the data in Blender, Norman Clark and Olivia Grebski for their assistance in annotating training volumes, and Francesco De Carlo (Argonne) and Pavan Ramkumar (Northwestern) for useful discussions regarding this work.
The authors declare that they have no competing financial interests.
Correspondence and requests for materials should be addressed to E.L.D. and N.K. (email: email@example.com, firstname.lastname@example.org).
| Annotation | # Cells | Cell area | Volume | Cell density |