Ultracold atomic gases are unique systems that allow studying few- and many-body physics in a highly precise and tunable manner. The atomic ensembles are exquisitely isolated from the surroundings as they are held in an ultra-high vacuum environment; therefore, probing them is almost always restricted to the analysis of their optical response. The most widely used probing technique is absorption imaging, where a collimated resonant laser beam is passed through the cloud, and the shadow cast by the atoms is recorded by a digital camera1. The spatial atomic distribution is then extracted from the position-dependent absorption coefficient. The coherence length of the probe beam is typically much longer than the distances between optical interfaces in the experiment; hence, unwanted reflections interfere and generate characteristic patterns of stripes and Newton’s rings in the recorded image. These patterns make it difficult to distinguish the signal from the non-uniform background.
The standard solution is to employ a double-exposure scheme: the first exposure is performed while the atoms are present, and the second, reference exposure is performed shortly afterwards without the atoms. The exposure without atoms can be done either by waiting for the atoms to move out of the frame or by optically pumping them into a dark state. The line-of-sight integrated optical density (OD) image is formed by subtracting the logarithms of the pixel counts in the two frames, with and without the atoms. However, due to acoustic noise and other dynamical processes, the noise patterns in the two images are typically not identical. This results in a residual structured noise pattern in the final image (Fig. 1a). The reduced signal-to-noise ratio caused by the fringes is particularly problematic in low-OD images. Linear approaches for background completion were recently suggested2, but we have found empirically that they are sensitive to small changes in the noise pattern that evolve over time.
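The double-exposure construction described above can be illustrated with a short sketch. It is not the analysis code used in this work: the function name `optical_density`, the synthetic stripe pattern, and all numerical values are assumptions chosen only to show how a small fringe shift between the two exposures leaves a structured residual.

```python
import numpy as np

def optical_density(atoms_frame, reference_frame):
    """OD = ln(reference counts) - ln(counts with atoms), pixel by pixel."""
    return np.log(reference_frame) - np.log(atoms_frame)

# Synthetic example (all values hypothetical).
y, x = np.mgrid[0:64, 0:64]

def probe(phase):
    # Uniform illumination modulated by a weak stripe pattern.
    return 1000.0 * (1.0 + 0.05 * np.sin(0.4 * x + phase))

# Gaussian cloud with a low peak OD, imprinted on the first exposure.
cloud_od = 0.1 * np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 6.0 ** 2))
atoms_frame = probe(0.0) * np.exp(-cloud_od)

# Identical fringes in both exposures cancel in the OD image,
od_ideal = optical_density(atoms_frame, probe(0.0))
# whereas a small fringe shift between exposures leaves structured noise.
od_shifted = optical_density(atoms_frame, probe(0.3))

residual_ideal = np.abs(od_ideal - cloud_od).max()
residual_shifted = np.abs(od_shifted - cloud_od).max()
```

In this toy example the residual of the shifted case is a sizeable fraction of the peak OD of the cloud, which is the regime where fringes are most harmful.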
In this work, we tackle the noisy background problem using machine learning, a term describing a set of algorithms that effectively perform specific tasks by relying on patterns and inference. Among these, deep learning refers to a class of models in which information propagates through multiple layered structures, enabling the translation of a given input into a prediction. The use of deep learning has become widespread in recent years for problems where an analytic mapping does not exist or when numeric solutions are intractable3; 4; 5; 6. Image completion is an excellent example of such an application, particularly in a scenario where there are typical recurrent but varying patterns in the image. Machine learning techniques were also used for the optimization of cooling sequences of ultracold atoms7; 8; 9; 10 and to execute related numerical calculations11. They were also suggested12 and demonstrated13 to be useful for fluorescence detection of pinned atoms and ions.
Here we report on a new approach for absorption imaging that uses a deep neural network (DNN) to generate an ideal reference frame from a single image that includes the atomic absorption signal. The reference image is constructed by masking out the part of the image containing the atomic shadow and using the network for image completion of the background. We demonstrate the new method with data acquired with an ultracold gas and show that the images captured by the single-exposure technique feature lower noise levels and therefore allow for a more accurate extraction of physical observables. In addition to the improvement in the data quality, our single-shot approach simplifies the experimental sequence and relaxes the hardware requirements on the camera. The DNN model successfully adapts to both short- and long-term variations, and therefore it constitutes a robust solution.
The experiments are conducted with a quantum degenerate Fermi gas of atoms with an equal mixture of the two lowest energy states in the manifold at a magnetic field of G. Our experimental system and cooling procedure are the same as described in Refs. 14; 15. The frames without atoms were deliberately captured over a period of seven months to test the DNN in realistic conditions. We acquired data with atomic clouds under different conditions by modifying the evaporation cooling sequence. For training and validation of the DNN, we also acquired images without atoms. To this end, we set the initial position of the optical transfer trap to about away from the location of the atoms in the magnetic trap, hence no atoms are shuttled to the position where the images are recorded. In all cases, the first exposure was taken between after the optical dipole trap was turned off abruptly.
The images are taken with a laser tuned to the cycling transition in the manifold, at a wavelength of nm. The laser linewidth is about , much narrower than the natural linewidth of MHz. The illumination is pulsed for and recorded by a bit CCD camera16. The reference frame (for the conventional absorption imaging) is recorded with a second pulse given after , when the atoms have already moved out of the camera field of view. We also capture “dark frames” with no illumination at all, which serve as the zero references. The dark images do not have to be taken often since they only account for any remaining light which is not due to the probe beam and for electronic noise in the camera. Prior to analyzing the two images in the conventional absorption imaging technique, we correct for small differences which may exist between the intensity of the illumination in the two exposures. These differences are typically a few percent. The second exposure is taken only in order to compare our technique with the conventional method and is required neither for the application of the DNN nor for its training.
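The dark-frame subtraction and the correction for unequal illumination between the two exposures can be sketched as follows. This is only one plausible way to perform the correction, under the assumption that the outer border of the frame contains no atomic signal; the function `corrected_od` and its `border` parameter are hypothetical names introduced for the illustration.

```python
import numpy as np

def corrected_od(atoms_raw, ref_raw, dark, border=8):
    """Double-exposure OD with dark-frame subtraction and intensity matching.

    `border` (hypothetical) is the width in pixels of the atom-free frame edge
    used to estimate the illumination ratio between the two exposures.
    """
    atoms = atoms_raw.astype(float) - dark
    ref = ref_raw.astype(float) - dark

    # Select the outer border, assumed to contain no atomic signal.
    edge = np.ones(atoms.shape, dtype=bool)
    edge[border:-border, border:-border] = False

    # Rescale the reference so both exposures have equal mean border intensity.
    scale = atoms[edge].mean() / ref[edge].mean()
    return np.log(scale * ref) - np.log(atoms)

# Hypothetical example: no atoms, second pulse 3% brighter than the first.
rng = np.random.default_rng(0)
dark = np.full((64, 64), 100.0)
light = 2000.0 + 50.0 * rng.random((64, 64))
od_no_atoms = corrected_od(light + dark, 1.03 * light + dark, dark)
```

Without the rescaling step, the brighter second pulse in this example would appear as a uniform OD offset of ln(1.03), which would bias the extracted number of atoms.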
Two physical observables that are commonly used in ultracold atomic experiments are the temperature and number of atoms. In the presented results, the number of atoms in the cloud and its temperature are controlled by changing the final trap depth in the optical evaporation. We extract the observables from the momentum distribution, which is measured after of a ballistic expansion. To extract the observables, we fit the OD images with1
where denotes the Jonquière’s polylogarithm function, is the fugacity, and accounts for any remaining constant background in the OD image. From the fugacity, we extract the relative temperature , with being the Fermi temperature, and is the geometrically-averaged trapping frequency, which we measure and rescale according to the trapping laser power. The number of atoms, , is obtained by integrating over the fitted momentum distribution.
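As an illustration of how a fitted fugacity translates into a relative temperature, the sketch below uses the textbook relation for an ideal Fermi gas in a harmonic trap, T/T_F = [-6 Li_3(-ζ)]^(-1/3), with ζ the fugacity. We assume here that this standard relation applies; the function names and the numerical integration scheme are choices made for the example and are not taken from the analysis code.

```python
import math

def neg_polylog(s, z, steps=20000):
    """Return -Li_s(-z) for z > 0 and s >= 1 via the Fermi-Dirac integral

        -Li_s(-z) = (1 / Gamma(s)) * integral_0^inf t^(s-1) / (exp(t) / z + 1) dt,

    evaluated with composite Simpson's rule on a truncated interval.
    """
    log_z = math.log(z)
    t_max = max(log_z, 0.0) + 50.0  # the integrand is negligible beyond this point
    h = t_max / steps               # `steps` must be even for Simpson's rule

    def f(t):
        return t ** (s - 1) / (math.exp(t - log_z) + 1.0)

    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0 / math.gamma(s)

def t_over_tf(fugacity):
    """T/T_F = [-6 Li_3(-fugacity)]^(-1/3) for a harmonically trapped ideal Fermi gas."""
    return (6.0 * neg_polylog(3, fugacity)) ** (-1.0 / 3.0)
```

Larger fugacities correspond to deeper degeneracy: in this relation a fugacity of 1 gives T/T_F of about 0.57, and the ratio decreases monotonically as the fugacity grows.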
DNN architecture and training.
DNNs establish a pipeline where the input (the information in the masked OD image, in our case) undergoes multiple convolutional transformations and dimensional variations. These transformations distill the features of the underlying spatial pattern, and their result is the prediction of the DNN. The network is trained to optimally recover the structure of the illumination in the region where the atomic signal appears. The training phase is performed using images captured without atoms, which therefore constitute the “ground truth” for the unsupervised reconstruction. At each optimization step, the prediction of the network is compared to the ground truth values in the masked area, and the weights of the model are varied to minimize the loss, i.e., the mean squared error (L2 norm) between the ground truth and the prediction. At the end of the training, we obtain an optimized model ready for prediction (inference) on new images with atoms. The network produces an ideal reference regardless of whether atoms appeared in the original image or not, because the relevant region is masked out. Since the involved convolutions are relatively simple, the evaluation of the model for inference on new inputs is rapid, and a trained network is therefore straightforward to integrate into the infrastructure of another calculation.
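The loss evaluated at each optimization step can be sketched as a mean squared error restricted to a central window of the frame. This is a minimal illustration and not the training code itself: the function name `central_square_mse` and the `side` parameter are hypothetical, and the window is taken to be a centred square for simplicity.

```python
import numpy as np

def central_square_mse(prediction, ground_truth, side):
    """Mean squared error over the central `side` x `side` square of two frames."""
    h, w = ground_truth.shape
    top, left = (h - side) // 2, (w - side) // 2
    window = (slice(top, top + side), slice(left, left + side))
    diff = prediction[window] - ground_truth[window]
    return float(np.mean(diff ** 2))

# Hypothetical 8 x 8 frames: an error outside the window does not contribute,
# while a uniform offset of 0.5 inside it gives a loss of 0.25.
truth = np.zeros((8, 8))
outside_error = truth.copy()
outside_error[0, 0] = 5.0
loss_outside = central_square_mse(outside_error, truth, side=4)
loss_offset = central_square_mse(truth + 0.5, truth, side=4)
```

Restricting the loss to the window means that the optimizer spends its capacity on the region that has to be reconstructed rather than on the visible periphery.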
From the raw images we subtract the dark frames, and then take the logarithm of their pixel values. The convolutional network is an autoencoder of a U-net architecture17. The input to the network is the OD image cropped to pixels around the position of the atoms, from which we mask out the central circle with a diameter of pixels that may include an absorption signal if atoms are present. This mask diameter is at least a factor of two larger than the size of the typical atomic cloud, to ensure that there is no absorption signal in the region used by the DNN to predict the background. For training, we use a generator to iterate over the stored TIFF images, apply the mask on the input, and feed the DNN with batches of frames18. To evaluate the DNN on an atomic frame, we store the model inference as a binary file and subtract the input frame to obtain the atomic OD. By minimizing the loss over the square circumscribing the masked region (dashed cyan square in Fig. 1a), we ensure continuity at the corners, where the background is unmasked. Effectively of the loss is dedicated to image duplication rather than completion, in order to eliminate any offset between the input and output frames, which might be translated into an error in the number of atoms.
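The masking and batching steps can be sketched as follows. The sketch works on in-memory arrays rather than stored TIFF files, and the frame size, mask diameter, batch size, and fill value are hypothetical placeholders rather than the values used for the actual network; `circular_mask` and `batch_generator` are names introduced for the illustration.

```python
import numpy as np

def circular_mask(size, diameter):
    """Boolean array, True inside a centred circle of the given diameter."""
    c = (size - 1) / 2.0
    y, x = np.ogrid[:size, :size]
    return (x - c) ** 2 + (y - c) ** 2 <= (diameter / 2.0) ** 2

def batch_generator(frames, mask, batch_size, fill_value=0.0):
    """Yield (masked_input, target) batches from atom-free log-intensity frames."""
    for start in range(0, len(frames), batch_size):
        target = frames[start:start + batch_size]
        masked = target.copy()
        masked[:, mask] = fill_value  # hide the region the network must complete
        yield masked, target

# Hypothetical sizes, chosen only for the example.
size, diameter = 64, 32
mask = circular_mask(size, diameter)
frames = np.random.default_rng(1).normal(size=(10, size, size))
batches = list(batch_generator(frames, mask, batch_size=4))

# Fraction of the circumscribing square that lies outside the circle
# (close to 1 - pi/4 for a finely sampled circle).
top = (size - diameter) // 2
square = mask[top:top + diameter, top:top + diameter]
unmasked_fraction = 1.0 - square.mean()
```

The last lines evaluate how much of the circumscribing square is unmasked. For these pixels the target equals the input, so a loss computed over the square partly rewards plain duplication of the visible corners.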
The feed-forward network consists of about parameters arranged in layers. These parameters were optimized by running over frames captured without atoms, with additional images for loss validation, comparing the network output to the original central part of each image, and minimizing the mean squared error loss function. We used the Adam optimizer19 and Glorot initialization20 for the parameter optimization, applying batch normalization21. For this application, labeling of the input frames is unnecessary as the network output is compared directly against its input before masking. The only prior knowledge is the absence of atoms in the peripheral region and, only for the training set, also in the central area. Notably, generative adversarial networks22, which were found to be very successful in natural-scene image completion tasks, might be counterproductive in this case, as there is a given unique ground truth.
DNN performance on the validation set.
First, we examine the residual noise in inferences on the validation set, which was not used for training and does not include an atomic signal. The convergence of the model is depicted in Fig. 2, where we present the decay of the residual loss during the training process for both the training (purple) and validation (black) datasets. The decay in both datasets on a log-log scale is sub-power-law. It crosses the reference level, set by the average double-shot residual noise (dashed red line), after approximately training epochs, which mainly points to a reliable extraction of the bias, although noise features still exist. In principle, the training should continue as long as the validation loss decreases. In practice, the loss decay slows dramatically after a few hundred epochs, and we therefore stop the training after epochs. An example of image completion without atoms is displayed in Fig. 1, with the DNN input (1a) and the corresponding prediction of the network (1b), which closely resembles the original data (1c). Notably, there are no significant spatial correlations in the difference between the desired and the predicted frame (1d).
The lowest residual error is optical-depth root mean squared error (ODRMSE), for the whole validation dataset captured intermittently over seven months. As most of the residual error results from the inner circle of the square output image (see Fig. 1d), a fair comparison for the loss is against of the averaged double-shot error, indicated by a dashed line in Fig. 2. Also considering this factor, this value is (ODRMSE), slightly higher than the minimal validation loss obtained during the first epochs.
Fig. 3 is a histogram of the residual loss after epochs. Variations in the probe laser power over the course of the acquisition result in multiple features that appear in the double-shot histogram (lower panel). The absence of these features in the single-shot data (upper panel) attests to the robustness of our technique to these long-term variations. On the right we compare the DNN single-frame technique (upper panels) and the conventional double-exposure scheme (lower panels) with several representative examples, whose noise is marked by corresponding colors and letters in the histogram. The results on the validation set show that the single-exposure approach achieves lower residual noise levels and deals better with variations in the imaging conditions when compared to the conventional double-shot scheme. The residual noise of the single-exposure technique can, in principle, be further reduced by additional training, but to assess whether the time invested in such prolonged training is worthwhile, one should take into account whether it has a measurable effect on physical observables.
Single-shot imaging evaluation.
In this section we present single-shot absorption images of a quantum degenerate fermionic potassium gas under different conditions. A typical analysis of a low-OD image following a ballistic expansion from a -deep trap is shown in Fig. 4. In panel (4a), we present the inner square part of the input log image. In this example, there are approximately atoms, hence the atomic signal is hardly discernible from the background to the naked eye. When this input is subtracted from the network prediction (4b), a clean OD image is obtained (4c). As a comparison, panel (4d) shows the conventional absorption image obtained from two exposures in the same experiment. Evidently, the single-shot approach eliminates the remaining fringe pattern and yields an overall better OD image. More examples for different trap depths are presented in the upper panel of Fig. 5, and show the same behaviour regardless of the atomic conditions.
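The data flow of the single-shot evaluation (mask the input, predict the background, subtract the input) can be sketched with a toy predictor in place of the trained network. The stand-in `column_mean_fill` is explicitly not a DNN: it only works here because the synthetic fringes are assumed to depend on a single coordinate. All names and numerical values in the sketch are hypothetical.

```python
import numpy as np

def single_shot_od(log_frame, mask, predict_background):
    """OD = predicted atom-free log-intensity minus the measured log-intensity."""
    masked = np.where(mask, 0.0, log_frame)
    return predict_background(masked, mask) - log_frame

def column_mean_fill(masked, mask):
    """Toy stand-in for the trained network (NOT a DNN): fill each masked pixel
    with the mean of the unmasked pixels in the same column."""
    counts = (~mask).sum(axis=0)  # assumes every column has unmasked pixels
    col_means = np.where(mask, 0.0, masked).sum(axis=0) / counts
    filled = masked.copy()
    filled[mask] = np.broadcast_to(col_means, masked.shape)[mask]
    return filled

# Synthetic single exposure: vertical stripes plus a weak Gaussian cloud.
y, x = np.mgrid[0:64, 0:64]
background = 1000.0 * (1.0 + 0.05 * np.sin(0.4 * x))
cloud_od = 0.1 * np.exp(-((x - 31.5) ** 2 + (y - 31.5) ** 2) / (2 * 4.0 ** 2))
log_frame = np.log(background * np.exp(-cloud_od))

# Mask radius four times the cloud width, so the periphery is nearly atom-free.
mask = (x - 31.5) ** 2 + (y - 31.5) ** 2 <= 16.0 ** 2
od = single_shot_od(log_frame, mask, column_mean_fill)
max_error = np.abs(od - cloud_od).max()
```

Outside the mask the output vanishes by construction, so any atomic signal that leaks beyond the mask is lost, which is why the mask has to be comfortably larger than the cloud.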
Effect on physical observables.
In Fig. 6 we plot the number of atoms and temperature for different trap depths as extracted by the single-shot (purple diamonds) and the two-exposure (black circles) techniques. Importantly, the new technique does not introduce any systematic error in the extraction of these important observables. The error bars represent the shot-to-shot variation in the experimental conditions combined with the fitting extraction error. Since both of these terms are of a similar magnitude, it is hard to observe the improvement in the single-exposure technique. To emphasize this improvement, we present in the insets only the relative fitting extraction error, averaged over the experimental realizations at each trap depth. We find that the extraction uncertainty of both observables is smaller when using the single-exposure technique.
IV Summary and outlook
We have demonstrated single-shot absorption imaging based on background completion by a deep convolutional network. We have shown that this approach can accurately reconstruct atomic density profiles and yield smaller errors on the extracted physical quantities, compared to the standard double-exposure technique. Single-shot imaging lifts the need for fast cameras and facilitates multi-frame acquisitions. The corresponding simplification directly enables simpler and cleaner designs for new cold-atom systems. We have also demonstrated the ability of the DNN to adapt to variations in the working conditions that develop over time.
Our network can be improved in several aspects. First, the masked area can be enlarged to achieve even better robustness. Also, by training the network over random patches in the uncropped OD image, the position-dependency of the result can be further reduced. Another interesting direction is the implementation of an online learning scheme, where images are routinely added to the dataset and the model is continuously updated between inferences.
The trained network and its generating scripts are publicly available as an open-source Python software package at http://absDL.github.io to facilitate their deployment by other experimental groups. Using the provided repository, single-shot imaging can be realized on any imaging apparatus, after training the parameters on locally acquired images.
This research was supported by the Israel Science Foundation (ISF) grant No. 1779/19, and by the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel, grant No. 2018264. The GeForce TITAN V used for the local network training was donated by the Nvidia Corporation. Remote training power was granted by the Google Cloud Platform research credits program. G.N. would like to thank Amit Oved for inspiring discussions.
- Ketterle and Zwierlein (2008) W. Ketterle and M. W. Zwierlein, Making, probing and understanding ultracold Fermi gases, La Rivista del Nuovo Cimento , 247 (2008).
- Niu et al. (2018) L. Niu, X. Guo, Y. Zhan, X. Chen, W. M. Liu, and X. Zhou, Optimized fringe removal algorithm for absorption images, Applied Physics Letters 113, 144103 (2018).
- LeCun et al. (2015) Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521, 436 (2015).
- Biamonte et al. (2017) J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Quantum machine learning, Nature 549, 195 (2017).
- Mehta et al. (2019) P. Mehta, M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, and D. J. Schwab, A high-bias, low-variance introduction to machine learning for physicists, Physics Reports 810, 1 (2019).
- Carleo et al. (2019) G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Machine learning and the physical sciences, Reviews of Modern Physics 91, 10.1103/revmodphys.91.045002 (2019).
- Wigley et al. (2016) P. B. Wigley, P. J. Everitt, A. van den Hengel, J. W. Bastian, M. A. Sooriyabandara, G. D. McDonald, K. S. Hardman, C. D. Quinlivan, P. Manju, C. C. N. Kuhn, I. R. Petersen, A. N. Luiten, J. J. Hope, N. P. Robins, and M. R. Hush, Fast machine-learning online optimization of ultra-cold-atom experiments, Scientific Reports 6, 10.1038/srep25890 (2016).
- Tranter et al. (2018) A. D. Tranter, H. J. Slatyer, M. R. Hush, A. C. Leung, J. L. Everett, K. V. Paul, P. Vernaz-Gris, P. K. Lam, B. C. Buchler, and G. T. Campbell, Multiparameter optimisation of a magneto-optical trap using deep learning, Nature Communications 9, 10.1038/s41467-018-06847-1 (2018).
- Nakamura et al. (2019) I. Nakamura, A. Kanemura, T. Nakaso, R. Yamamoto, and T. Fukuhara, Non-standard trajectories found by machine learning for evaporative cooling of 87Rb atoms, Optics Express 27, 20435 (2019).
- Barker et al. (2019) A. J. Barker, H. Style, K. Luksch, S. Sunami, D. Garrick, F. Hill, C. J. Foot, and E. Bentine, Applying machine learning optimization methods to the production of a quantum gas, arXiv preprint (2019), http://arxiv.org/abs/1908.08495v2 .
- Pilati and Pieri (2019) S. Pilati and P. Pieri, Supervised machine learning of ultracold atoms with speckle disorder, Scientific Reports 9, 10.1038/s41598-019-42125-w (2019).
- Picard et al. (2019) L. R. B. Picard, M. J. Mark, F. Ferlaino, and R. van Bijnen, Deep learning-assisted classification of site-resolved quantum gas microscope images, Measurement Science and Technology 31, 025201 (2019).
- Ding et al. (2019) Z.-H. Ding, J.-M. Cui, Y.-F. Huang, C.-F. Li, T. Tu, and G.-C. Guo, Fast high-fidelity readout of a single trapped-ion qubit via machine-learning methods, Physical Review Applied 12, 10.1103/physrevapplied.12.014038 (2019).
- Shkedrov et al. (2018) C. Shkedrov, Y. Florshaim, G. Ness, A. Gandman, and Y. Sagi, High-sensitivity rf spectroscopy of a strongly interacting Fermi gas, Physical Review Letters 121, 10.1103/physrevlett.121.093402 (2018).
- Ness et al. (2018) G. Ness, C. Shkedrov, Y. Florshaim, and Y. Sagi, Realistic shortcuts to adiabaticity in optical transfer, New Journal of Physics 20, 095002 (2018).
- PCO pixelfly usb (ICX285AL CCD).
- Ronneberger et al. (2015) O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical image segmentation, in Lecture Notes in Computer Science (Springer International Publishing, 2015) pp. 234–241.
- Chollet et al. (2015) F. Chollet et al., Keras, https://keras.io (2015).
- Kingma and Ba (2014) D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv preprint (2014), http://arxiv.org/abs/1412.6980v9 .
- Glorot and Bengio (2010) X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in Proceedings of the thirteenth international conference on artificial intelligence and statistics (2010) pp. 249–256.
- Ioffe and Szegedy (2019) S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, arXiv preprint (2019), http://arxiv.org/abs/1502.03167v3 .
- Goodfellow et al. (2014) I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (Curran Associates, Inc., 2014) pp. 2672–2680.