Optical coherence tomography (OCT) enables non-invasive, cross-sectional imaging of the eye’s fundus in the axial direction. By depicting the retina’s anatomical layers, it is of high diagnostic value in ophthalmology. OCT angiography (OCTA) adds further diagnostic value by visualizing the microvasculature of the retina and choroid. For example, microaneurysms caused by diabetic retinopathy can be identified in OCTA images [2828-01].
One popular, purely image-based OCTA technique relies on the reflectance of flowing red blood cells varying over time, while the signal of the surrounding tissue is almost stationary. By decorrelating two or more OCT scans acquired over a short period of time at the very same location, vessels are enhanced and form an OCTA scan [2828-02]. While a single OCT image, also called a B-scan, is acquired rapidly by combining 1-D axial scans in the lateral direction, acquiring a stack of B-scans is considerably slower.
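The decorrelation principle can be illustrated with a short NumPy sketch. It uses the amplitude-decorrelation form common in the OCTA literature as an assumption; the exact formulation used for the data in this work is not stated. The function name and the stabilizing constant `eps` are hypothetical.

```python
import numpy as np


def amplitude_decorrelation(bscans, eps=1e-8):
    """Decorrelation map from N >= 2 repeated B-scans.

    bscans: array-like of shape (N, axial, lateral) with OCT amplitudes
    acquired at the same location. Returns an (axial, lateral) map that is
    close to 0 for stationary tissue and larger where the signal changes.
    """
    a = np.asarray(bscans, dtype=np.float64)
    if a.ndim != 3 or a.shape[0] < 2:
        raise ValueError("expected at least two repeated B-scans")
    first, second = a[:-1], a[1:]
    # Normalized product of consecutive amplitudes: 1 for identical pixels.
    corr = (first * second) / (0.5 * (first ** 2 + second ** 2) + eps)
    # Average over all consecutive pairs, then invert.
    return 1.0 - corr.mean(axis=0)
```

Pixels whose amplitude is identical in all repeats yield a value near 0, whereas pixels with fluctuating amplitude, as expected inside vessels, yield larger values. Pixels with near-zero amplitude in both scans also yield high values in this simple form, which is one reason practical pipelines mask the noise floor.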
As OCTA is derived from image changes between successively acquired B-scans, sources of image differences other than blood flow can result in artifacts. One cause of such artifacts is motion, which leads to a misalignment of the B-scans of consecutive volumes. The motion can originate from pulsation, tremor and microsaccades, which the subject executes involuntarily. Misalignment can lead to high decorrelation values that manifest as white stripes in en-face projections (along the axial direction) of the OCTA volume [2828-03]. Another source of artifacts, appearing as black lines in the projection, is blinking, which results in lost OCT scans and therefore no OCTA image at that location.
More elaborate OCTA algorithms, such as split-spectrum amplitude-decorrelation angiography, can compensate for motion in the axial direction but still rely on several aligned structural B-scans and quarter the axial resolution [2828-04]. A recent algorithm consists of a deep learning (DL) pipeline for the extraction of angiographic information from a stack of OCT B-scans [2828-05]. As both methods rely on matching B-scans, they are prone to the aforementioned artifacts. We propose a new generative DL-based strategy that replaces detected erroneous OCTA B-scans using an image-to-image translation architecture operating on a single OCT scan. Filling these blind spots of conventional algorithms can increase the usability for physicians and improve further processing such as vessel segmentation.
2 Materials and methods
2.1 Defect scan detection
The first step in replacing defective OCTA scans is a detection algorithm, which triggers the OCT-OCTA translation network for the identified angiographic scans.
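How the detection result drives the replacement can be sketched as follows. This is a minimal illustration, not the implementation used in this work: `replace_defect_scans`, `defect_mask` and the `translate` callable (standing in for the trained OCT-OCTA network) are hypothetical names, and both volumes are assumed to be stored with the B-scan index first.

```python
import numpy as np


def replace_defect_scans(octa_volume, oct_volume, defect_mask, translate):
    """Replace flagged OCTA B-scans by scans generated from the OCT volume.

    octa_volume, oct_volume: arrays of shape (n_bscans, height, width).
    defect_mask: boolean array of shape (n_bscans,), True for defect scans.
    translate: callable mapping one OCT B-scan to one OCTA B-scan
               (hypothetical stand-in for the translation network).
    """
    repaired = np.array(octa_volume, dtype=np.float64, copy=True)
    for index in np.flatnonzero(defect_mask):
        # Only flagged positions are touched; intact scans stay unchanged.
        repaired[index] = translate(oct_volume[index])
    return repaired
```

Intact scans are copied unchanged, so the network is only evaluated at the flagged positions.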
Threshold-based detection is commonly used; we therefore adapt this approach to cover the aforementioned artifacts. The summed flow signal of the OCTA scans is calculated for each B-scan and compared to a lower and an upper limit. Instead of global thresholds, local lower and upper limits are determined from the mean and the variance of the summed flow signal, which are calculated over the nearest B-scans and the scan itself. For the lower bound, a broader neighborhood is considered and its parameter was selected less strictly, as this threshold only filters blank scans. With a higher sensitivity to local changes, the upper bound detects motion-corrupted scans and outliers.
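A local-threshold detector of this kind can be sketched as below. The exact limit formulas and parameter values are not recoverable from the text, so the sketch assumes limits of the form local mean minus or plus a multiple of the local standard deviation; the window half-widths (15 and 5) and the factors (3.0 and 2.0) are purely hypothetical.

```python
import numpy as np


def local_stats(signal, half_width):
    """Mean and standard deviation over each scan and its nearest neighbors."""
    mean = np.empty(signal.size)
    std = np.empty(signal.size)
    for i in range(signal.size):
        window = signal[max(0, i - half_width): i + half_width + 1]
        mean[i] = window.mean()
        std[i] = window.std()
    return mean, std


def detect_defect_scans(octa_volume, low_half=15, up_half=5,
                        low_factor=3.0, up_factor=2.0):
    """Boolean mask of defect B-scans in a volume (n_bscans, height, width)."""
    volume = np.asarray(octa_volume, dtype=np.float64)
    flow = volume.reshape(volume.shape[0], -1).sum(axis=1)  # summed flow signal
    # Broad neighborhood and loose factor: only blank scans fall below it.
    low_mean, low_std = local_stats(flow, low_half)
    # Narrow neighborhood: sensitive to local jumps such as motion stripes.
    up_mean, up_std = local_stats(flow, up_half)
    too_dark = flow < low_mean - low_factor * low_std
    too_bright = flow > up_mean + up_factor * up_std
    return too_dark | too_bright
```

Because a defect scan is part of its own neighborhood, it inflates the local standard deviation; the factors therefore have to stay well below the maximum deviation a single outlier can reach in a window of the chosen size.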
2.2 Data set and processing
In the next step, we process our data such that it can be used by the OCT-OCTA translation network. The data for this project consists of 106 volumetric OCTA scans, each generated from 2 structural volumes. The volume size is 500 × 500 × 465/433 pixels (number of B-scans, lateral and axial) over an area of 3 × 3 mm or 6 × 6 mm for 54 and 52 volumes, respectively. The data is split into training, validation and test sets of 91, 15 and 10 volumes and zero-padded to a common axial size of 480 pixels to compensate for the varying dimensionality. Each B-scan is normalized to the range of 0 to 1 using the global maximum value.
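The padding and normalization can be sketched in a few lines of NumPy. The sketch assumes that the axial direction is the last array axis, that the zeros are appended at the end of that axis, and that the global maximum is taken per volume; none of these details is specified in the text, and the function name is hypothetical.

```python
import numpy as np


def pad_and_normalize(volume, axial_size=480):
    """Zero-pad the axial (last) axis and scale intensities to [0, 1].

    volume: non-negative array of shape (n_bscans, lateral, axial).
    """
    volume = np.asarray(volume, dtype=np.float64)
    missing = axial_size - volume.shape[-1]
    if missing < 0:
        raise ValueError("volume is deeper than the target axial size")
    # Pad only the axial axis; np.pad fills with zeros by default.
    pad_width = [(0, 0)] * (volume.ndim - 1) + [(0, missing)]
    padded = np.pad(volume, pad_width)
    peak = padded.max()
    # Divide by the global maximum; an all-zero volume is returned as is.
    return padded / peak if peak > 0 else padded
```

With axial sizes of 465 and 433 pixels, this appends 15 and 47 zero rows, respectively.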
The information on vessels in OCT scans likely does not depend on the global B-scan but is rather encoded locally. Therefore, we crop the B-scans along the lateral axis into patches of size 480 × 128 pixels to reduce the input size for the neural network (NN). During training, we randomly select 100 patches per OCT-OCTA B-scan pair that passed the detection algorithm to augment the data. At inference, the input B-scan is cropped into 5 overlapping patches, such that the first and last 8 columns of each patch can be removed when composing the outputs.
Patches in which the depicted structures were cropped by the top or bottom image margins due to the curvature of the retina were removed.
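The patch handling can be sketched as follows for a B-scan stored as (axial, lateral) = (480, 500). The evenly spaced patch positions and the rule that the outermost patches keep their outer 8 columns (so that the image border is not lost) are assumptions; all function names are hypothetical.

```python
import numpy as np

PATCH_WIDTH = 128
MARGIN = 8  # columns dropped at each inner patch border


def random_patches(oct_bscan, octa_bscan, n=100, rng=None):
    """Training: n random, spatially matching OCT/OCTA patch pairs."""
    rng = np.random.default_rng() if rng is None else rng
    starts = rng.integers(0, oct_bscan.shape[1] - PATCH_WIDTH + 1, size=n)
    return [(oct_bscan[:, s:s + PATCH_WIDTH], octa_bscan[:, s:s + PATCH_WIDTH])
            for s in starts]


def split_bscan(bscan, n_patches=5):
    """Inference: evenly spaced, overlapping patches and their start columns."""
    starts = np.linspace(0, bscan.shape[1] - PATCH_WIDTH, n_patches)
    starts = starts.round().astype(int)
    return [bscan[:, s:s + PATCH_WIDTH] for s in starts], starts


def compose(patches, starts, width):
    """Stitch network outputs, discarding the margins of inner patch borders."""
    out = np.zeros((patches[0].shape[0], width), dtype=patches[0].dtype)
    for patch, s in zip(patches, starts):
        left = 0 if s == 0 else MARGIN
        at_end = s + PATCH_WIDTH == width
        right = PATCH_WIDTH if at_end else PATCH_WIDTH - MARGIN
        out[:, s + left:s + right] = patch[:, left:right]
    return out
```

For 500 lateral columns, the patches start at columns 0, 93, 186, 279 and 372, so neighboring patches overlap by 35 columns and the trimmed patches still cover the full width.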
2.3 OCT-OCTA translation
After the detection and outlier rejection, the remaining data is used to train a supervised NN for image-to-image translation.
U-Nets [2828-06] have been shown to be suitable for such domain transforms. The layout of the developed U-Net is illustrated in Fig. 1. In the encoding section, after an initial convolution, two blocks are used, each of which halves the spatial input resolution. Each block consists of 2 dense blocks, each of which concatenates its input with its output along the channel direction after a convolution, as input for the next layer. The transition block learns a 1 × 1 convolution and reduces the width and height by average pooling (2 × 2). The latent space is decoded by 2 decoding layers with an upsampling operation and a final 1 × 1 convolution with a sigmoid activation. Each decoding layer is built from a residual layer followed by a nearest-neighbor interpolation (not in the last layer) and a convolution to avoid upsampling artifacts. If not stated otherwise, the kernel size is 3 × 3.
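The resolution-changing and concatenating operations of this layout can be illustrated without a DL framework. The following NumPy sketch only covers 2 × 2 average pooling, nearest-neighbor upsampling and the dense concatenation along the channel axis; the learned convolutions are omitted, and the function names and the channel-first layout (channels, height, width) are assumptions.

```python
import numpy as np


def avg_pool_2x2(x):
    """Halve height and width of a (channels, height, width) tensor."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def upsample_nearest_2x(x):
    """Double height and width by repeating each pixel (nearest neighbor)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def dense_concat(block_input, block_output):
    """Dense connection: stack a block's input and output along channels."""
    return np.concatenate([block_input, block_output], axis=0)
```

Applied to a 480 × 128 patch, two pooling steps give a 120 × 32 latent resolution, and two upsampling steps restore the input size. Nearest-neighbor interpolation followed by a convolution is a common alternative to transposed convolutions, which are known to produce checkerboard patterns.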
The training process minimizes the L2 norm between the generated patch and the ground truth. For a few training epochs, a further preprocessing step smooths the target OCTA images with a 33 3 median filter to reduce the speckle noise in the target data and to enforce a stronger focus on vessel information. In addition, related work suggests that speckle noise is also reduced in DL-generated OCTA scans [2828-05].
The evaluation is threefold. First, we evaluate the generated OCTA B-scans from the test data by comparing them with the ground truth at intact locations. Secondly, we look at the en-face projection of the generated angiographic volume and its reference. Due to the anatomical structure of the choroid, the large number of blood vessels leads to noisy OCTA signals with hardly any recognizable individual vessels in the ground truth. In projections of our OCTA data, this leads to a reduced contrast of the vessels in the upper layers. Therefore, we segment the vitreous as upper bound and the inner plexiform layer as lower bound in two volumes using the OCTSEG tool [2828-07] and only use the area in between for the evaluation of the projection. Finally, the detected defect scans are replaced and evaluated.
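A projection restricted to the segmented slab can be sketched as follows. The sketch assumes a mean projection (the projection type is not stated), a volume layout of (n_bscans, lateral, axial), and boundaries given as axial pixel indices per A-scan; the function name is hypothetical.

```python
import numpy as np


def enface_projection(volume, upper, lower):
    """Mean projection along the axial axis between two layer boundaries.

    volume: array of shape (n_bscans, lateral, axial).
    upper, lower: integer arrays of shape (n_bscans, lateral) with the axial
    index of the upper (inclusive) and lower (exclusive) boundary.
    """
    depth = np.arange(volume.shape[-1])
    # True inside the slab; broadcasts to the full volume shape.
    mask = (depth >= upper[..., None]) & (depth < lower[..., None])
    counts = np.maximum(mask.sum(axis=-1), 1)  # avoid division by zero
    return (volume * mask).sum(axis=-1) / counts
```

Signal below the lower boundary, such as the noisy choroidal flow, does not contribute to the resulting en-face image.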
3 Results
The first row of Tab. 1 shows the mean absolute error (MAE), the mean square error (MSE) and the structural similarity (SSIM) between the generated scan and the ground truth, averaged over all intact test B-scans. Example B-scans are depicted in Fig. 2. The top left image shows the input OCT scan and, to its right, the target angiographic image. In the row below, the output image and the absolute difference to the ground truth are displayed. As expected from Liu et al. [2828-05], the result shows reduced noise.
| |MAE|MSE|SSIM|
|Seg. proj. 1:|0.0096|0.0001|0.9978|
|Seg. proj. 2:|0.0138|0.0002|0.9896|
The image metrics of the segmented projections are listed in the second and third rows of Tab. 1, and the metrics of the unsegmented volumes in the last row. The 3rd and 4th rows of Fig. 2 show the segmented projections of the OCT, target and output volumes. Given the less noisy structure, it can be concluded that not only speckle noise but also smaller vessels are diminished. Nonetheless, all major vessels are extracted accurately.
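For reference, the three metrics can be computed as in the following sketch for images normalized to [0, 1]. MAE and MSE follow their standard definitions. The SSIM shown here is a simplified single-window (global) variant with the commonly used constants; the reported values were most likely obtained with a windowed SSIM, so this function is an illustrative assumption and will not reproduce them exactly.

```python
import numpy as np


def mae(a, b):
    """Mean absolute error between two images."""
    return float(np.mean(np.abs(a - b)))


def mse(a, b):
    """Mean square error between two images."""
    return float(np.mean((a - b) ** 2))


def global_ssim(a, b, data_range=1.0):
    """SSIM evaluated once over the whole image (no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(numerator / denominator)
```

Since the intensities lie between 0 and 1, the MSE is never larger than the MAE, which is consistent with the small MSE values in Tab. 1.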
4 Discussion and conclusion
Based on a single intact OCT scan, the NN can approximate useful OCTA images. Major vessels have sharper edges and seemingly better contrast. Smaller vessels are not extracted accurately, which might be rooted in the fact that a single structural scan is insufficient for such details. As a result, inserted scans can be identified when compared to their neighborhood. Furthermore, our approach still needs a correct input B-scan; otherwise, the artifacts will remain.
As our approach only requires a single scan at locations with artifacts, together with the network and its preprocessing, it can be integrated easily into the OCTA imaging workflow and can improve the image quality.
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.