DeepFocus: a Few-Shot Microscope Slide Auto-Focus using a Sample Invariant CNN-based Sharpness Function

by Adrian Shajkofci, et al.

Autofocus (AF) methods are extensively used in biomicroscopy, for example to acquire timelapses, where the imaged objects tend to drift out of focus. AF algorithms determine an optimal distance by which to move the sample back into the focal plane. Current hardware-based methods require modifying the microscope, and image-based algorithms either rely on many images to converge to the sharpest position or need training data and models specific to each instrument and imaging configuration. Here we propose DeepFocus, an AF method we implemented as a Micro-Manager plugin, and characterize its convolutional neural network (CNN)-based sharpness function, which we observed to be depth co-variant and sample-invariant. Sample invariance allows our AF algorithm to converge to an optimal axial position within as few as three iterations using a model trained once for use with a wide range of optical microscopes and a single instrument-dependent calibration stack acquisition of a flat (but arbitrary) textured object. From experiments carried out on both synthetic and experimental data, we observed an average precision, given 3 measured images, of 0.30 ± 0.16 µm with a 10×, NA 0.3 objective. We foresee that this performance and the low number of images required will help limit photodamage during acquisitions of light-sensitive samples.




1 Introduction

Modern microscopy techniques rely on many components that are remotely controllable. This allows implementing control loops that limit the need for human-supervised operation. Auto-focusing systems, in particular, are used extensively in the acquisition of timelapses in developmental or cellular biology or to automatically image slides in a slide scanner. In the former application, imaged specimens tend to drift from the focal plane over time because of specimen growth, flow of the medium, or motion caused by temperature changes. In the latter case, variability in the mounting of the slides requires per-slide adjustment.

AF systems seek to determine the optimal shift by which to adjust the axial position to maximize image sharpness. AF solutions can be hardware-based (e.g., laser-based sensing of the sample drift [liron_laser_2006] or phase detection by an auxiliary sensor [silvestri_rapid_2017]) or image-based, which do not require any modification of the optical path of the microscope since a focus score is retrieved from the image itself [sun_autofocusing_2004].

We can classify image-based AF algorithms into two categories. The first comprises AF methods that iteratively minimize a one-dimensional objective function, the focus score, to move the object to the point at which it is sharpest. Because the output of the function is not predictable and depends on the sample, the AF has to acquire tens to hundreds of images at different axial positions in order to converge to a non-local optimum [sun_autofocusing_2004]. A high number of image acquisitions can be damaging for the sample, especially in fluorescence microscopy [magidson_circumventing_2013].

Additionally, existing objective functions only give a meaningful result in the neighborhood of the focal plane and lose information (i.e., the gradient of the curve is zero) farther away from it. Furthermore, depending on the software implementation and the imaging modality, the acquisition of hundreds of images can take up to several minutes. The second category comprises single-shot AF techniques, which need only one or a few images. Thanks to end-to-end CNNs, they take an image as input and directly deduce the optimal shift needed to be in focus [wei_neural_2018, jiang_transform-_2018, pinkard_deep_2019]. The drawback of these direct methods is that a long, computationally intensive CNN training with a microscope objective-specific training data set must be repeated whenever the optical system changes. Furthermore, these methods are not directly available in open microscope control software such as Micro-Manager [edelstein_computer_2010].

Figure 1: The object may lie outside of the DOF and appear blurry. Here we quantified the blur using DeepFocus and a high-pass filter (HPF) score for two different images. With the HPF, the score curve changes shape and slope when different objects are presented under the microscope, and it carries no depth information far from focus. With DeepFocus, the curves for both images are similar in shape and retain information about depth over the whole region.

In this paper, we propose a local, CNN-based focus scoring function that remains nearly invariant when imaging different types of samples or modalities on any given microscope. We developed a correlation-based AF algorithm that takes advantage of the broad shape and unimodal minimum of this function, which speeds up convergence and remains effective even when the imaged object is far from the focal plane (several times the DOF, see Fig. 1). Since our CNN method does not require a microscope-specific training data set beyond a single stack of an arbitrary object, it is plug-and-play.

This paper is organized as follows. In Section 2, we present the blurriness scoring function, the calibration process, and the AF algorithm. In Section 3, we experimentally verify the scoring function's assumed invariance to a variety of samples and characterize its performance with respect to the number of images and in comparison to common AF scoring functions, using both simulated and experimentally acquired data. We discuss our findings and conclude in Section 4.

2 Methods

2.1 Problem statement

We consider a specimen, modeled as a 2D manifold in 3D space (such as a thin microscopy slide), that we wish to image with a widefield microscope in bright field, fluorescence, or phase contrast. The entire specimen, or some regions of the FOV, can be out of focus and outside of the DOF (see Fig. 1). We assume the microscope has a motorized stage for adjusting the focus. We aim at finding the optimal axial shift by which to adjust the sample position such that it is in focus. We seek a solution that (i) does not require a manually selected reference image to be matched (such that the method can be used both for maintaining focus in live timelapses and for imaging collections of fixed samples), (ii) requires a minimal number of images (to limit photodamage), and (iii) does not require imaging calibration specimens (PSF measurement beads, etc.) or large-scale, microscope-specific training.

2.2 Method description

The principle behind our proposed algorithm is to measure a blurriness score for a few (N) images acquired at different focus positions z_i, i = 1, …, N, resulting in a set of pairs (z_i, s_i), and to determine the necessary focal shift Δz such that the measured pairs match a microscope objective-specific, sample-invariant, depth-blurriness response curve c(z) using cross-correlation. The curve invariance assumption has been similarly used by the model-based curve fitting approach of [yazdanfar_simple_2008].
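This matching idea can be sketched in a few lines of NumPy. The snippet below is illustrative only, not the paper's implementation: the function name `best_shift`, the parabolic toy curves, and the brute-force search over candidate shifts are all our own stand-ins.

```python
import numpy as np

def best_shift(z_cal, c_cal, z_meas, s_meas, shifts):
    """Illustrative sketch: find the focal shift dz that best aligns the
    measured (z_i, s_i) pairs with a sampled calibration curve c(z)."""
    errors = []
    for dz in shifts:
        # Sample the calibration curve at the shifted measurement positions.
        c_at = np.interp(z_meas + dz, z_cal, c_cal)
        errors.append(np.sum((s_meas - c_at) ** 2))
    return shifts[int(np.argmin(errors))]

# Toy example: a parabolic "blurriness" curve with its minimum at z = 0,
# and three measurements from a sample sitting 2 units away from focus.
z_cal = np.linspace(-10, 10, 201)
c_cal = z_cal ** 2
z_meas = np.array([-1.0, 0.0, 1.0])
s_meas = (z_meas + 2.0) ** 2  # the sample's sharpest plane is at z = -2
dz = best_shift(z_cal, c_cal, z_meas, s_meas, np.linspace(-5, 5, 101))
```

The recovered `dz` aligns the measured scores with the calibration curve; the stage move then follows from the sign convention of the axial axis.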

For this approach to work, we need a focus estimation function that is invariant to the sample shape or texture (sample invariance) but co-variant with the sample's axial position, and that remains sufficiently informative beyond the immediate vicinity of the focal plane. To this end, we chose an estimator of the local optical properties of the microscope objective [shajkofci_semi-blind_2018]. Briefly, it relies on a trained CNN to regress the parameters of a Zernike polynomial PSF model [von_zernike_beugungstheorie_1934], given a blurry image patch as input. Here, we use the estimated Zernike coefficient corresponding to defocus as a blurriness score, which provides, given an image acquired at a depth z as input, a local blurriness score s(z) for that position.

The trained CNN [shajkofci_semi-blind_2018] does not require re-training when used on different microscopes or different microscope objectives, and produces a curve whose shape (up to an axial scaling) is invariant to the sample (an aspect we verify experimentally in Section 3.1). To determine the axial scaling, which is instrument-dependent, we require a calibration step consisting in the acquisition of a full stack of an arbitrary planar and textured object. This yields a blurriness curve c(z) that we center with its minimum at the origin.
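The centering of the calibration curve could look like the following sketch. We substitute a simple quadratic fit near the discrete minimum for the Moffat fit used later in the paper; the function name, the synthetic curve, and the stack positions are all illustrative assumptions.

```python
import numpy as np

def center_calibration(z, scores):
    """Illustrative calibration: given blurriness scores measured over a
    full stack of a flat textured object, shift the axial coordinates so
    that the curve's minimum (the focal plane) sits at the origin."""
    i = int(np.argmin(scores))
    # Refine the focus position with a quadratic fit around the minimum
    # (a stand-in for a Moffat fit of the full curve).
    lo, hi = max(i - 2, 0), min(i + 3, len(z))
    a, b, _ = np.polyfit(z[lo:hi], scores[lo:hi], 2)
    z_focus = -b / (2 * a)  # vertex of the fitted parabola
    return z - z_focus      # re-centered axial coordinates

z = np.linspace(0, 20, 41)       # stage positions of the stack (um)
scores = (z - 12.3) ** 2 + 0.5   # synthetic curve, focus near 12.3 um
z_centered = center_calibration(z, scores)
```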

We now describe our proposed AF algorithm, which follows the structure illustrated in Fig. 2 and is summarized in the following steps:

  1. Fit the calibration curve c to a Moffat distribution [moffat_theoretical_1969] and extract its FWHM. Set k = 0, let z_0 be the initial focal plane position, and initialize a GSS algorithm with an interval around z_0 whose width is given by the FWHM. Acquire images at the initial GSS positions and compute, using the CNN, the corresponding blurriness scores s(z_i).

  2. Check the convexity of the measured scores by fitting them to a quadratic polynomial. If the goodness of fit of the polynomial is higher than that of a linear fit, go to Step 6. Otherwise go to Step 3.

  3. Increment k. Update the GSS triplet and move to a new axial position z_k.

  4. Acquire an image at the current axial position z_k.

  5. Compute, using the CNN, the blurriness score s(z_k) and go to Step 2.

  6. Compute using cross-correlation the local optimal shift Δz minimizing the squared distance between the measured scores and the calibration curve:

     Δz* = argmin_Δz Σ_i ( s(z_i) − c(z_i + Δz) )²

  7. Move the sample by Δz*, averaged over the ROIs in the image plane.
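The GSS bracket update of Steps 3–5 can be sketched as follows, with image acquisition replaced by a toy blurriness function. Every name and constant here is our own illustration, not the plugin's code.

```python
import numpy as np

PHI = (np.sqrt(5) - 1) / 2  # inverse golden ratio, ~0.618

def golden_section_step(a, b, f, cache):
    """One golden-section search (GSS) update: shrink the bracket [a, b]
    around the minimum of f, caching evaluations where possible (in the
    real AF, each evaluation is an acquisition plus a CNN score)."""
    c = b - PHI * (b - a)
    d = a + PHI * (b - a)
    for x in (c, d):
        if x not in cache:
            cache[x] = f(x)
    # Keep the sub-interval that still brackets the minimum.
    return (a, d) if cache[c] < cache[d] else (c, b)

# Toy unimodal blurriness curve with its minimum at z = 1.7 (arbitrary units).
f = lambda z: (z - 1.7) ** 2
a, b, cache = -5.0, 5.0, {}
for _ in range(20):  # the real AF stops much earlier via the convexity check
    a, b = golden_section_step(a, b, f, cache)
```

In the algorithm above, the convexity check of Step 2 lets the loop exit after only a few such updates, once enough points lie in the quadratic basin around the focal plane.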

Figure 2: Flowchart of the AF algorithm (see Section 2.2).
Focus score functions compared: DeepFocus (ours), Tenengrad [sun_secrets_2010], EWC [hanghang_tong_blur_2004], and WS [liebling_autofocus_2004].
Table 1: Using the same experimental conditions as in Fig. 3, we quantified scoring-function performance in terms of the SD of the score across samples (lower is better) and the conditional entropy between the focus score and the axial distance (lower is better) over the whole range. On experimental acquisitions, DeepFocus outperforms all other tested functions in terms of SD. Additionally, our method has the lowest conditional entropy for both modalities and is thus the most informative.

3 Experiments

3.1 Characterization of regression invariance to image diversity

Since our AF algorithm relies on the invariance of the focus score to the type of imaged sample, we investigated whether our proposed CNN indeed satisfies this condition and whether other (existing) focus metrics could be substituted.

We gathered images from the evaluation dataset of [shajkofci_semi-blind_2018] and blurred them with Gaussian PSFs mimicking a 10×, NA 0.3 objective at several points over a range of depths. In addition, we acquired stacks of fixed rat brain slices tagged with three fluorescent stains using a widefield transmission light microscope with a 10×, NA 0.3 objective. We then computed the focus scores using DeepFocus and other methods, including HPF, LAPV, SML [nayar_shape_1990], Tenengrad [sun_secrets_2010], EWC [hanghang_tong_blur_2004], and WS [liebling_autofocus_2004], which cover a broad range of focus measures, as reviewed in [price_comparison_1994, sun_autofocusing_2004, mateos-perez_comparative_2012, ali_analysis_2018].
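As a point of reference for the hand-crafted baselines, the Tenengrad score can be computed in a few lines of NumPy. This is our own minimal sketch (explicit Sobel correlation, edge padding), not the authors' implementation:

```python
import numpy as np

def tenengrad(img):
    """Classical Tenengrad focus measure: mean squared Sobel gradient
    magnitude, a common hand-crafted sharpness score."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    pad = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros(img.shape, dtype=float)
    gy = np.zeros(img.shape, dtype=float)
    # Correlate with the horizontal (kx) and vertical (kx.T) Sobel kernels.
    for i in range(3):
        for j in range(3):
            sub = pad[i:i + img.shape[0], j:j + img.shape[1]]
            gx += kx[i, j] * sub
            gy += kx.T[i, j] * sub
    return float(np.mean(gx ** 2 + gy ** 2))

# A sharp checkerboard scores high; a flat (defocused) image scores zero.
sharp = np.indices((32, 32)).sum(axis=0) % 2 * 1.0
blurry = 0.5 * np.ones((32, 32))
```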

In Table 1, we report the average SD of the score over all input images. Using the experimental dataset, our method had the lowest average SD (on a normalized scale where 1 and 0 are the blurriest and sharpest values, respectively). We noticed, as illustrated in Fig. 3 (a) and (b), that DeepFocus' SD increased farther from focus (i.e., when the acquired pictures contain medium-to-high blur). A low SD implies that the score curve is similar across different types of imaged specimens. The other methods had markedly higher SDs, confirming the variance of these focus metrics with image diversity.

(a) Synthetic: DeepFocus
(b) Real: DeepFocus
(c) Synthetic: Tenengrad [sun_secrets_2010]
(d) Real: Tenengrad [sun_secrets_2010]
(e) Synthetic: WS [liebling_autofocus_2004]
(f) Real: WS [liebling_autofocus_2004]
Figure 3: Comparison of the outputs of different sharpness scoring functions as a function of the axial position, centered at the origin. We used as input synthetically blurred images (left) and stacks of fluorescent rat brain tissue imaged with a 10×, NA 0.3 objective (right). With DeepFocus, the SD of the score around the focal plane is lower than with the other scoring functions. Additionally, the scoring functions other than DeepFocus convey no depth information (their output no longer varies with depth) far from the focal plane.

3.2 Characterization of information measure of the scoring function

We next investigated how robustly our proposed DeepFocus measure can report (de)focus information as the distance from focus is increased up to 10 times the DOF. We observed (Fig. 3) that focus metrics other than ours were unable to give any information about the axial distance beyond a certain range, as they reach a plateau value that no longer varies as the position is increased further. Since the gradient in such plateau regions is small, minimization algorithms cannot converge quickly. To quantify these visual observations regarding the uncertainty of recovering the axial distance z from a given score s, we computed the conditional entropy

H(Z | S) = − Σ_{s∈𝒮} Σ_{z∈𝒵} p(s, z) log p(z | s),

where S and Z are random variables representing the calibration blurriness score and the axial distance, 𝒮 and 𝒵 their support sets, and p(s | z) the probability of a score s given the distance z (with p(s, z) = p(s | z) p(z)). A high conditional entropy value implies a high uncertainty in detecting the right position for a given score. The results, compiled in Table 1, reveal that DeepFocus had a conditional entropy smaller than that obtained with any of the other scoring functions. In the case of experimental acquisitions, we again observed an improvement in terms of entropy over the other methods. We further determined the threshold distance beyond which no distance information can be inferred from the image, i.e., when the image is too blurry for the AF to converge. DeepFocus retained depth information over a range equivalent, using the diffraction-limited DOF formula, to 11 times the DOF with a 10× objective. In comparison, metrics such as WS and SML achieved ranges of only 4 and 7 times the DOF, respectively.
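The conditional entropy above can be estimated from sampled (score, distance) pairs via a joint histogram. The sketch below makes our own assumptions (bin count, synthetic score curves) and is only meant to show why a plateauing score yields a higher H(Z | S) than one that keeps varying with depth:

```python
import numpy as np

def conditional_entropy(s, z, bins=16):
    """Histogram estimate of H(Z | S): the uncertainty remaining about
    the axial distance z once the blurriness score s is known (bits)."""
    joint, _, _ = np.histogram2d(s, z, bins=bins)
    p_sz = joint / joint.sum()
    p_s = p_sz.sum(axis=1, keepdims=True)  # marginal over distance bins
    with np.errstate(divide="ignore", invalid="ignore"):
        # -sum p(s, z) * log2 p(z | s); nan terms come from empty bins.
        h = -np.nansum(p_sz * np.log2(p_sz / p_s))
    return float(h)

rng = np.random.default_rng(0)
z = rng.uniform(-10.0, 10.0, 10_000)
informative = z ** 2                  # score keeps varying with |z|
plateau = np.clip(z ** 2, 0.0, 1.0)   # saturates far from focus
```

A score that saturates (like the plateauing metrics in Fig. 3) maps most distances into a single score bin, so knowing the score barely narrows down the position.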

3.3 Characterization of the AF error as a function of the number of acquisitions

We finally investigated how accurately DeepFocus could retrieve the focal distance as a function of the number of images acquired. We used 100 blurred images from the dataset generated in Section 3.1 with a known in-focus position and computed its distance to the output position of the AF. We also compared our method to other autofocus scoring functions (for which we used a bounded Brent's method as the optimizer). The results are summarized in Fig. 4.
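The bounded Brent baseline can be run with SciPy's `minimize_scalar`. The quadratic `blur` function here is a toy stand-in for an image-based focus score; in a real AF loop each evaluation would cost one image acquisition, which is why the evaluation count matters.

```python
from scipy.optimize import minimize_scalar

# Toy stand-in for an image-based focus score with its optimum at z = 2.4
# (arbitrary units); every evaluation would be one image acquisition.
blur = lambda z: (z - 2.4) ** 2

# Bounded Brent's method, as used for the baseline scoring functions.
res = minimize_scalar(blur, bounds=(-10.0, 10.0), method="bounded")
# res.x is the focus estimate; res.nfev counts the (costly) evaluations.
```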

Figure 4: Comparison of the AF error using three different AF scoring functions for 100 samples. We quantified the distance between the theoretical focal plane position and the AF output as a function of the number of AF iterations, each of which requires an additional input image. DeepFocus reaches a low error within 4 iterations. With 8 iterations or more, the other methods are on par with or more accurate than ours.

We observed that our proposed AF converged rapidly (3 iterations), while the two other focus functions needed more than twice as many images to reach a similar focus accuracy. With 8 iterations or more, we did not notice better accuracy with our method compared to Tenengrad or HPF.

4 Discussion and conclusion

In our experiments, we showed that the variance of the score over multiple images was usually lower with DeepFocus than with other focus scoring functions, especially near the focal plane. Our explanation is that the CNN, already known to be translation-invariant [lecun_learning_2012], has been trained specifically to recognize the PSF parameters without discriminating on the input image type and position. By contrast, hand-crafted features such as HPF are computed from content-based calculations and differ from one image to another. When the image is acquired at a large distance from the focal plane, we noticed a loss of spatial features in the acquired image, due to the large FWHM of the PSF that degraded it. However, we were able to retrieve depth information from images up to 2.5 times farther away from the focal plane than with other methods. This can mostly be explained by the fact that DeepFocus computes features from a 128×128 px window, while gradient-based methods use a much smaller window, such as 3×3 or 5×5.

In summary, we developed an AF method based on the combination of a CNN scoring function and optimization algorithms that rely on the invariance of that scoring function. We showed that DeepFocus is robust to changes amongst samples, which enables the retrieval of the optimal axial shift using a correlation-based optimization process that needs as few as 3 images to converge. Our method is currently limited to imaging thin samples, and further work will investigate the procedure for thicker objects. We implemented the calibration step and AF algorithm as two plugins (Java with a PyTorch [paszke_automatic_2017] backend) for the Micro-Manager microscopy acquisition engine [edelstein_computer_2010], which we will make available upon acceptance.

5 References