Training Auto-encoder-based Optimizers for Terahertz Image Reconstruction

07/02/2019 · by Tak Ming Wong et al.

Terahertz (THz) sensing is a promising imaging technology for a wide variety of different applications. Extracting the interpretable and physically meaningful parameters for such applications, however, requires solving an inverse problem in which a model function determined by these parameters needs to be fitted to the measured data. Since the underlying optimization problem is nonconvex and very costly to solve, we propose learning the prediction of suitable parameters from the measured data directly. More precisely, we develop a model-based autoencoder in which the encoder network predicts suitable parameters and the decoder is fixed to a physically meaningful model function, such that we can train the encoding network in an unsupervised way. We illustrate numerically that the resulting network is more than 140 times faster than classical optimization techniques while making predictions with only slightly higher objective values. Using such predictions as starting points of local optimization techniques allows us to converge to better local minima about twice as fast as optimization without the network-based initialization.


1 Introduction

Terahertz (THz) imaging is an emerging sensing technology with a great potential for hidden object imaging, contact-free analysis, non-destructive testing and stand-off detection in various application fields, including semi-conductor industry, biological and medical analysis, material and quality control, safety and security [1, 2, 3].

The physically interpretable quantities relevant to the aforementioned applications, however, cannot always be measured directly. Instead, in a THz imaging system, each pixel contains implicit information about such quantities, making the inverse problem of inferring these physical quantities a challenging problem with high practical relevance.

As we will discuss in Section 2, at each pixel location the relation between the desired (unknown) parameters θ = (A, μ, σ, φ), i.e., the field amplitude A, the position of the surface μ, the width of the reflected pulse σ, and the phase φ, and the actual (vector-valued) measurement y can be modelled via the equation y ≈ f(θ), where

    f(θ)(z) = A · sinc(σ(z − μ)) · exp(−i(ωz − φ)),    (1)
    sinc(t) = sin(πt)/(πt) for t ≠ 0, and sinc(0) = 1,    (2)

with ω a system constant, and z ∈ R^{n_z} a device-dependent sampling grid. Thus, the crucial step in THz imaging is the solution of optimization problems of the form

    min_θ L(f(θ), y)    (3)

at each pixel, possibly along with additional regularizers on the unknown parameters θ. Even with simple choices of the loss function L, such as an ℓ2-squared loss, the resulting fitting problem is highly nonconvex, and computing global solutions becomes rather expensive. Considering that the number of pixels, i.e., of optimization problems (3) to be solved, is typically on the order of hundreds of thousands to millions, even local first-order or quasi-Newton methods become quite costly: for example, running the built-in Trust-Region solver of MATLAB® to reconstruct a THz image takes over 170 minutes.
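To make the per-pixel fitting concrete, the following sketch fits a simplified, real-valued sinc-shaped pulse (a stand-in for the complex model (1); the grid, parameter values, and initialization are illustrative assumptions, not taken from the paper) with SciPy's trust-region-reflective least-squares solver, a Python analogue of MATLAB's Trust-Region method:

```python
import numpy as np
from scipy.optimize import least_squares

def model(params, z):
    # Simplified real-valued pulse: amplitude A, peak position mu, width sigma.
    # The actual model (1) is complex-valued; this envelope is only a sketch.
    A, mu, sigma = params
    return A * np.sinc(sigma * (z - mu))   # np.sinc is the normalized sinc

z = np.linspace(-10.0, 10.0, 91)           # assumed device sampling grid
y = model([2.0, 1.5, 0.8], z)              # synthetic "measurement" for the demo

# Trust-region-reflective least-squares fit of one pixel.
res = least_squares(lambda p: model(p, z) - y,
                    x0=[1.8, 1.2, 0.9],    # initialization is critical in practice
                    method="trf")
```

Note that the initialization `x0` is chosen close to the true parameters; the nonconvexity of the fit makes the result highly sensitive to this starting point, which is precisely the difficulty the paper addresses.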

In this paper, we propose to train a neural network to solve the optimization problems (3) directly. We formulate the training of the network as a model-based autoencoder (AE), which allows us to train the corresponding network on real data in an unsupervised way, i.e., without ground truth. We demonstrate that the resulting optimization network yields parameters that result in only slightly higher losses than actually running an optimization algorithm, while being more than 140 times faster. Moreover, we demonstrate that our network can serve as an excellent initialization scheme for classical optimizers: by using the network's prediction as a starting point for a gradient-based optimizer, we obtain lower losses and converge more than twice as fast as classical optimization approaches, while retaining all theoretical guarantees of the respective minimization algorithm.

This paper is organized as follows: Section 2 gives more details on how THz imaging systems work. Section 3 summarizes related work on learned optimizers, machine learning for THz imaging, and model-based autoencoders. Section 4 describes model-based AEs in detail, contrasting them with classical supervised learning approaches, before Section 5 summarizes our implementation. Section 6 compares the proposed approaches to classical (optimization-based) reconstruction techniques in terms of speed and accuracy, before Section 7 draws conclusions.

2 THz Imaging Systems

There are several approaches to realizing THz imaging, e.g., femtosecond-laser-based scanning systems [4, 5], synthetic aperture systems [6, 7], and hybrid systems [8]. A typical approach to THz imaging is based on the Frequency Modulated Continuous Wave (FMCW) concept [7], which uses actively frequency-modulated THz signals to sense reflections from the object. The reflected energy and the phase shifts due to the signal path length make 3D THz imaging possible.

In Figure 1, the setup of our electronic FMCW-THz 3D imaging system is shown. More details on the THz imaging system are described in [7].

Figure 1: THz 3D imaging geometry. Both transmitter (Tx) and receiver (Rx) are mounted on the same platform. The imaging unit, consisting of Tx, Rx and optical components, is moved along the x and y directions using stepper motors and linear stages. This imaging unit takes a depth profile of the object at each lateral position in order to acquire a full THz 3D image.

In this paper, we denote by ỹ(x, y, t) the measured demodulated time-domain signal of the reflected electric field amplitude of the FMCW system at lateral position (x, y). In FMCW radar signal processing, this continuous-wave temporal signal is converted into the frequency domain by a Fourier transform [9, 10]. Since the linear frequency sweep has a unique frequency at each spatial position in z-direction, the converted frequency-domain signal directly relates to the spatial azimuth (z-direction) domain signal

    y(x, y, z) = FFT_t { ỹ(x, y, t) }.    (4)

The resulting 3D image y is complex-valued data in the spatial domain, representing the per-pixel complex reflectivity of THz energy. The quantities N_x, N_y, N_z denote the discretization in vertical, horizontal, and depth direction, respectively. Equivalently, we may represent y by considering the real and imaginary parts as two separate channels, resulting in a 4D real data tensor in R^{N_x × N_y × N_z × 2}.

Since the system is calibrated by amplitude normalization with respect to an ideal metallic reflector, a rectangular frequency signal response is expected. After the FFT in (4), the z-direction signal envelope is, as a continuous spatial signal amplitude, an ideal sinc function, giving rise to the physical model given in (1) in the introduction.
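The relation between a rectangular frequency response and a sinc-shaped spatial envelope can be checked numerically. This is a minimal sketch with assumed dimensions, not the system's actual signal chain:

```python
import numpy as np

# A calibrated FMCW sweep ideally has a flat (rectangular) response over the
# swept band; its FFT along the sweep axis gives the z-direction signal (4),
# whose magnitude envelope is a sinc-like (Dirichlet) function.
n_t = 1024
signal = np.zeros(n_t, dtype=complex)
signal[:256] = 1.0                     # ideal rectangular band occupancy (assumed)

z_signal = np.fft.fft(signal)          # frequency sweep -> z-domain, cf. (4)
envelope = np.abs(z_signal)            # peaks at z = 0, decays like a sinc
```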

In (1), the electric field amplitude A is the reflection coefficient of the material, which depends on the complex dielectric constant of the material and helps to identify and classify materials. The depth position μ is the position at which maximum reflection occurs, i.e., the position of the surface reflecting the THz energy. The width σ of the reflected pulse includes information on the dispersion characteristics of the material. The phase φ of the reflected wave depends on the ratio of the real to imaginary parts of the dielectric properties of the material. Thus, the parameters (A, μ, σ, φ) contain important information about the geometry as well as the material of the imaged object, which is of interest in a wide variety of applications.

3 Related Work

Due to the revolutionary success (convolutional) neural networks have had on computer vision problems over the last decade, researchers have extended the fields of application of neural networks significantly. A particularly interesting concept is to learn the solution of complex, possibly nonconvex, optimization problems. Different lines of research have considered directly learning the optimizer itself, e.g., modelled as a recurrent neural network [11], or unrolling optimization algorithms and learning the incremental steps, e.g., in the form of parameterized proximal operators [12]. Further hybrid approaches include optimization problems in the network's architecture, e.g., [13], or combine optimizers with networks that have been trained individually [14, 15]. The recent work of Moeller et al. [16] trains a network to predict descent directions for a given energy in order to obtain provable convergence results for the learned optimizer.

Objectives similar to the one arising in the training of our model-based AEs are considered, for instance, for solving inverse problems with deep image priors [17] or deep decoders [18]. These works, however, consider the input to the network to be fixed random noise and have to solve an optimization problem for the network's weights for each inverse problem, such that they are regularization-by-parametrization approaches rather than learned optimizers.

The most closely related prior work is the 3D face reconstruction network of Tewari et al. [19]. They aimed at finding a semantic code vector from a given facial image such that feeding this code vector into a rendering engine yields an image similar to the input image itself. While this problem had long been addressed using optimization algorithms [20] (also known under the name of analysis-by-synthesis approaches), Tewari et al. [19] replaced the optimizer with a neural network and kept the original cost function to train the network in an unsupervised way. The resulting structure resembles an AE in which the decoder is fixed to the forward model, and was therefore coined a model-based AE. As we will discuss in the next section, the idea of model-based AEs generalizes far beyond 3D face reconstruction and can be used to boost the THz parameter identification problem significantly.

Finally, a recent work [21] has exploited deep learning techniques in terahertz imaging, but the considered application, super-resolving the THz amplitude image by training a convolutional neural network on synthetically blurred images, is not directly related to our proposed approach.

4 A Model-Based Autoencoder for THz Image Reconstruction

Let us denote the THz input data by y ∈ R^{N_x × N_y × N_z × 2}, and consider our four unknown parameters to be N_x × N_y matrices, allowing each parameter to change at each pixel. Under slight abuse of notation, we interpret all operations in (1) pointwise and again identify complex values with two real values in order to have f(θ) ∈ R^{N_x × N_y × N_z × 2}, where z denotes the depth sampling grid. Concatenating all four matrix-valued parameters into a single parameter tensor θ ∈ R^{N_x × N_y × 4}, our goal can be formalized as finding θ such that f(θ) ≈ y.

A classical supervised machine learning approach to problems with a known forward operator is illustrated in Figure 2 for the example of THz image reconstruction: the explicit forward model is used to simulate a large set of images from known parameters, which can subsequently be used as training data for predicting the parameters via a neural network with weights Φ. Such supervised approaches with simulated training data are frequently used in other image reconstruction areas, e.g., super-resolution [22, 23] or image deblurring [24, 25]. The accuracy of networks trained on simulated data, however, crucially relies on precise knowledge of the forward model and the simulated noise. Slight deviations thereof can significantly degrade a network's performance, as demonstrated in [26], where deep denoising networks trained on Gaussian noise were outperformed by BM3D when applied to realistic sensor noise.

Instead of pursuing the supervised learning approach described above, we replace θ in the optimization approach (3) by a suitable network N(y; Φ) that depends on the raw input data y and learnable parameters Φ, and that can be trained in an unsupervised way on real data. Assuming we have multiple examples y_i of THz data, and choosing the loss function L in (3) as an ℓ2-squared loss, gives rise to the unsupervised training problem

    min_Φ Σ_i ‖ f(N(y_i; Φ)) − y_i ‖²₂.    (5)

As illustrated in Figure 3, this training resembles an AE architecture: the input to the network is the data y, which gets mapped to parameters θ that, when fed into the model function f, ought to reproduce y again.

Opposed to the straightforward supervised learning approach, the proposed approach (5) has two significant advantages:

  • It allows us to train the network in an unsupervised way, i.e., on real data, and therefore learn to deal with measurement-specific distortions.

  • The cost function in (5) implicitly handles the scaling of the different parameters and therefore circumvents the problem of defining meaningful cost functions on the parameter space: simple parameter discrepancies such as ‖θ₁ − θ₂‖ for two different parameter sets θ₁ and θ₂ largely depend on the scaling of the individual parameters and might even be meaningless, e.g., for cyclic parameters such as the phase offset φ.
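The objective (5) is easy to express in code. Below is a minimal sketch using a simplified real-valued stand-in for the model (1) and a hypothetical "oracle" encoder in place of the trained network; the grid and parameter values are illustrative assumptions:

```python
import numpy as np

z = np.linspace(-10.0, 10.0, 91)            # assumed depth sampling grid

def f(theta):
    # Simplified per-pixel decoder: the FIXED physical model, not a learned one.
    # (The paper's model (1) is complex-valued; this real sinc is a sketch.)
    A, mu, sigma = theta
    return A * np.sinc(sigma * (z - mu))

def ae_loss(encoder, batch):
    # Unsupervised objective (5): feed each measured signal y through the
    # encoder, decode with the model f, and penalize the reconstruction error.
    # No ground-truth parameters are needed, only the measured signals.
    return sum(np.sum((f(encoder(y)) - y) ** 2) for y in batch)

true_theta = np.array([2.0, 1.5, 0.8])
batch = [f(true_theta)]                     # synthetic "measurement"
oracle = lambda y: true_theta               # hypothetical perfect encoder
bad = lambda y: np.array([1.0, 0.0, 1.0])   # a poor encoder for comparison
```

A perfect encoder drives the loss to zero, while any mismatch between predicted and true parameters shows up directly as reconstruction error, which is exactly what makes unsupervised training possible.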

Figure 2: Classical supervised learning strategy with simulated data: the forward model f (e.g., from (1)) is used to simulate data, which can subsequently be fed into a network trained to reproduce the simulation parameters in a supervised way.
Figure 3: A model-based AE for THz image reconstruction: the input data y is fed into a network whose parameters are trained in such a way that feeding the network's prediction into the model function f again reproduces the input data y. Such an architecture resembles an AE with a learnable encoder and a model-based decoder, and allows unsupervised training on real data.

5 Encoder Network Architecture and Training

5.1 Data Preprocessing

As illustrated by the magnitude of an exemplary measured THz signal shown in Figure 4, the THz energy is mainly focused in the main lobe and first side-lobes of the sinc function. Because the physical model remains valid in close proximity to the main lobe only, we preprocess the data by cropping a small window out of the full range of measurements per pixel, where, at each pixel, the cropping window is centered at the position at which the magnitude of the signal is maximal.
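A sketch of this peak-centered cropping; the window width and the border handling are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def crop_window(signal, width):
    # Center a fixed-width window at the per-pixel magnitude peak (main lobe).
    # 'width' is assumed odd so the peak sits in the middle of the window.
    peak = int(np.argmax(np.abs(signal)))
    half = width // 2
    # Clamp so the window never runs past the signal's borders (assumption).
    lo = max(0, min(peak - half, len(signal) - width))
    return signal[lo:lo + width], lo
```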

As discussed above, we represent the THz data as a 4D real tensor in R^{N_x × N_y × N_z × 2}, where N_z now denotes the size of the cropping window.

Figure 4: Magnitude of the measured THz signal at a sample point. The main lobe and major side-lobes are contained in the cropping window, which is colored in gray.

5.2 Encoder Architecture and Training

For the encoder network we pick a spatially decoupled architecture using convolutions along the depth dimension only, leading to a signal-by-signal reconstruction mechanism that allows a high degree of parallelism and therefore maximizes the reconstruction speed on a GPU. The specific architecture (illustrated in Figure 5) applies a first set of convolutional filters to the real and imaginary parts separately, before concatenating the activations and applying three further convolutional filters to the concatenated structure. We apply batch normalization (BN) [27] after each convolution and use leaky rectified linear units (LeReLU) [28] as activations. Finally, a fully connected layer reduces the dimension to the desired four output parameters per pixel. To ensure that the amplitude is physically meaningful, i.e., non-negative, we apply an absolute value function to the first component. Interestingly, this choice compared favorably to a plain rectified linear unit when the network is trained.
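The activation and the output mapping described above can be sketched as follows; the LeReLU slope and the plain-array representation are assumptions for illustration:

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    # Leaky ReLU used after each convolution; the slope value is an assumption.
    return np.where(x > 0, x, slope * x)

def output_head(raw):
    # Map the 4 raw outputs of the final fully connected layer to (A, mu, sigma, phi).
    # The absolute value on the first component enforces a physically meaningful,
    # non-negative amplitude (the paper's choice over a plain ReLU).
    A, mu, sigma, phi = raw
    return np.array([abs(A), mu, sigma, phi])
```

Note that, unlike a ReLU, the absolute value keeps a nonzero gradient for negative pre-activations, which is one plausible reason it trained better.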

Figure 5: Architecture of the encoding network that predicts the parameters: at each pixel, the real and imaginary parts are extracted, convolved, concatenated, and processed via three convolutional layers and one fully connected layer. To obtain physically meaningful (non-negative) amplitudes, we apply an absolute value function to the first component.

We train our model by optimizing (5) with the Adam optimizer [29] on a subset of the pixels from a real (measured) THz image for 1200 epochs; the remaining pixels serve as a validation set. The batch size and the initial learning rate are held fixed, with the learning rate reduced by a factor of 0.99 every 20 epochs.
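The learning-rate schedule can be expressed as a small step-decay function; the base learning rate is left as a parameter, since its value is not reproduced here:

```python
def learning_rate(epoch, base_lr, decay=0.99, step=20):
    # Step decay: multiply the base rate by 0.99 once every 20 epochs,
    # as stated in the training setup.
    return base_lr * decay ** (epoch // step)
```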

Figure 6: The average losses of the training and validation sets over 1200 epochs on a decibel scale illustrate that there is almost no generalization gap between training and validation.

Figure 6 illustrates the decay of the training and validation losses over 1200 epochs. As we can see, the validation loss nicely resembles the training loss with almost no generalization gap.

6 Numerical Experiments

Figure 9: Objects of the evaluated datasets: (a) MetalPCB dataset; (b) StepChart dataset.

We evaluate the proposed model-based AE on two datasets acquired with the setup described in Section 2, namely the MetalPCB dataset and the StepChart dataset. The MetalPCB dataset is measured from a nearly planar copper target etched on a circuit board (Figure 9a), which includes metal and PCB material regions, at the standard size scale of the USAF target MIL-STD-150A [30]. The StepChart dataset is based on an aluminum object (Figure 9b) with sharp edges, used to evaluate the distance measurement accuracy on a 3D object. Both datasets are preprocessed as described in Section 5.1.

In order to evaluate the optimization quality on different materials and structures, the MetalPCB dataset is evaluated in three regions: the PCB region, a local region containing PCB material only; the Metal region, a local region containing copper material only; and the All region, the entire image area. Similarly, the StepChart dataset is evaluated in three regions: the Edges region, containing the physical edges; the Steps region, the central planar region of each step; and the All region, the entire image area. This segmentation is done because the THz measurements of the highly specular aluminum target result in strong multi-path interference artifacts at the edges, which should be investigated separately.

The proposed model-based AE is trained on the MetalPCB dataset only, while parameter inference is performed for both the MetalPCB and the StepChart dataset. This cross-referencing between the two datasets verifies whether the proposed AE method models the physical behavior of the system without overfitting to a specific dataset or recorded material.

To compare with classical optimization methods, the parameters are estimated using the Trust-Region Algorithm (TRA) [31], as implemented in MATLAB®. The TRA requires a proper definition of the parameter ranges. Furthermore, it is very sensitive to the initial parameter set. We therefore carefully select the initial parameters by sequentially estimating them from the source data (see [7] for more details). Still, the optimization may result in parameter sets with significant loss values; see Section 6.2.

The trained encoder network is independent of any initialization scheme, as it directly predicts optimal parameters from the input data. While the network alone gives remarkably good results with significantly lower runtimes than the optimization method, there is no guarantee that the network's predictions are critical points of the energy to be minimized. This motivates using the encoder network as an initialization scheme for the TRA, specifically because the TRA guarantees a monotonic decrease of the objective function, such that running the TRA on top of the network can only improve the results. We abbreviate this approach AE+TRA for the rest of this paper.

To fairly compare the three approaches, the optimization time of the TRA and the inference time of the AE are both recorded on an Intel® i7-8700K CPU, while the AE is trained on an NVIDIA® GTX 1080 GPU.

6.1 Loss and timing

Dataset (Region)    Measurement            TRA        AE        AE+TRA
MetalPCB (All)      Average loss           693.9      886.3     442.2
MetalPCB (PCB)      Average loss           589.0      872.6     589.0
MetalPCB (Metal)    Average loss           519.6      446.1     115.7
StepChart (All)     Average loss           3815.1     5148.3    3675.3
StepChart (Edges)   Average loss           4860.4     6309.1    2015.7
StepChart (Steps)   Average loss           1152.5     2015.7    1150.3
MetalPCB            Training time (sec.)   none       9312.8    9312.8
MetalPCB            Run time (sec.)        10391.2    73.5 †    4854.7 ‡
StepChart           Run time (sec.)        3463.9     22.8 †    1712.4 ‡

† Inference time only.
‡ Run time is the sum of AE inference and TRA optimization time.

Table 1: Loss and timing of the Trust-Region Algorithm (TRA), the proposed model-based AE, and the combined AE+TRA approach.

In Table 1, the average loss in (5) and the timings are shown for the Trust-Region Algorithm (TRA), the autoencoder (AE), and the joint AE+TRA approach, respectively. We can see that, while the proposed encoder network achieves a lower average loss than the TRA in the metal region of the MetalPCB dataset, it yields higher average losses than the TRA overall on both datasets. It is encouraging to see that, although the AE was trained on the MetalPCB dataset, its relative performance in comparison to the TRA does not decay too significantly when changing to an entirely unseen dataset with a different material, with the AE loss being moderately higher than the TRA loss on both the MetalPCB and the StepChart dataset. If such a sacrifice in accuracy is acceptable, the speed-up in runtime is tremendous, with the AE being over 140 times faster than the TRA (both methods evaluated on a CPU). Note that even the sum of training and inference time is smaller for the proposed AE than the runtime of the TRA on the MetalPCB dataset.

Interestingly, the combined AE+TRA approach of initializing the TRA with the encoder network’s prediction leads to better losses than the TRA alone in all regions. Additionally, the AE-initialized TRA converged more than 2 times faster due to the stopping criterion being reached earlier.

We note that the losses of all approaches are significantly higher for the StepChart dataset than for the MetalPCB dataset. This is because the aluminum StepChart object (Figure 9b) has a more complex physical structure than the MetalPCB object, which results in a mixture of scattered THz pulses due to multi-path interference effects in all object regions. Incorporating such effects into the reflection model (1) could therefore be an interesting direction for future research toward better explaining the measured data with the physical model.

6.2 Quality Assessment of THz Images

Figure 14: Comparison of the THz intensity for the MetalPCB dataset: (a) intensity image extracted from the source data without any model-based processing (in red: the pixel line used for plots (c) and (d)); (b) intensity image extracted by the proposed AE+TRA approach (in red: the same pixel line); (c) intensity extracted along the horizontal line in the copper region; (d) per-pixel loss of the TRA, AE, and AE+TRA approaches along the horizontal line in the copper region.

In THz imaging, the intensity image, which is equal to the squared amplitude A², is the most important criterion for quality assessment. Note that the intensity can be inferred directly from the data by considering that (1) yields

    A² = f(θ)(μ) · conj(f(θ)(μ)),    (6)

where conj(·) denotes the complex conjugate. As we illustrate in Figure 14, the model-based approach is not only capable of extracting all relevant parameters, i.e., A, μ, σ, and φ; compared to values directly extracted from the source data, the resulting intensity is also more homogeneous in homogeneous material regions. The inhomogeneity of the directly extracted intensity results from the very low depth of field of THz imaging systems in general, combined with the slight non-planarity of the MetalPCB target. As depicted in Figure 14c, the intensity variations along the selected line in the homogeneous copper region are reduced by all three model-based methods, i.e., TRA, AE, and AE+TRA. However, due to the critical dependence on the initial parameters (see the discussion at the beginning of Section 6), the TRA results exhibit significant amplitude fluctuations and loss values (Figure 14d) in two horizontal sub-regions. The proposed AE and AE+TRA methods, by contrast, deliver superior results with respect to the main quality measures applied in THz imaging, i.e., intensity homogeneity and model-fitting loss. Still, the AE approach shows a few isolated extreme loss values, while the AE+TRA method's loss values are consistently low along the selected line in the homogeneous copper region.
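The conjugate-product computation of the intensity is straightforward to express in code; this is a sketch on synthetic complex values, not the paper's pipeline:

```python
import numpy as np

def intensity(y_complex):
    # Per-pixel intensity A^2 via the complex conjugate: multiplying a complex
    # value by its conjugate yields its squared magnitude (imaginary part zero).
    return (y_complex * np.conj(y_complex)).real
```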

7 Conclusions and Future Work

In this paper, we propose a model-based autoencoder for THz image reconstruction. Compared to a classical Trust-Region optimizer, the proposed autoencoder reaches objective values within a small margin of those of the optimizer while being more than 140 times faster. Using the network's prediction as an initialization for a gradient-based optimization scheme improves the results over a plain optimization scheme in terms of objective values while still being about twice as fast. We believe these are very promising results for training optimizers and initialization schemes for parameter identification problems in general, exploiting the idea of model-based autoencoders for unsupervised learning.

References

  • [1] Wai Lam Chan, Jason Deibel, and Daniel M Mittleman. Imaging with terahertz radiation. Reports on progress in physics, 70(8):1325, 2007.
  • [2] Christian Jansen, Steffen Wietzke, Ole Peters, Maik Scheller, Nico Vieweg, Mohammed Salhi, Norman Krumbholz, Christian Jördens, Thomas Hochrein, and Martin Koch. Terahertz imaging: applications and perspectives. Appl. Opt., 49(19):E48–E57, 2010.
  • [3] Peter H Siegel. Terahertz technology. IEEE Transactions on microwave theory and techniques, 50(3):910–928, 2002.
  • [4] Ken B Cooper, Robert J Dengler, Nuria Llombart, Bertrand Thomas, Goutam Chattopadhyay, and Peter H Siegel. THz imaging radar for standoff personnel screening. IEEE Transactions on Terahertz Science and Technology, 1(1):169–182, 2011.
  • [5] Binbin B Hu and Martin C Nuss. Imaging with terahertz waves. Optics letters, 20(16):1716–1718, 1995.
  • [6] K McClatchey, MT Reiten, and RA Cheville. Time resolved synthetic aperture terahertz impulse imaging. Applied physics letters, 79(27):4485–4487, 2001.
  • [7] Jinshan Ding, Matthias Kahl, Otmar Loffeld, and Peter Haring Bolívar. THz 3-D image formation using SAR techniques: simulation, processing and experimental results. IEEE Transactions on Terahertz Science and Technology, 3(5):606–616, 2013.
  • [8] M. Kahl, A. Keil, J. Peuser, T. Löffler, M. Pätzold, A. Kolb, T. Sprenger, B. Hils, and P. Haring Bolívar. Stand-off real-time synthetic imaging at mm-wave frequencies. In Passive and Active Millimeter-Wave Imaging XV, volume 8362, page 836208, 2012.
  • [9] David C Munson and Robert L Visentin. A signal processing view of strip-mapping synthetic aperture radar. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(12):2131–2147, 1989.
  • [10] Merrill Ivan Skolnik. Radar Handbook. McGraw-Hill, 1970.
  • [11] Marcin Andrychowicz, Misha Denil, Sergio Gomez Colmenarejo, Matthew W. Hoffman, David Pfau, Tom Schaul, and Nando de Freitas. Learning to learn by gradient descent by gradient descent. In Proc. Int. Conf. on Neural Information Processing Systems (NIPS), 2016.
  • [12] Erich Kobler, Teresa Klatzer, Kerstin Hammernik, and Thomas Pock. Variational networks: Connecting variational methods and deep learning. In Proc. German Conf. Pattern Recognition (GCPR), 2017.
  • [13] Brandon Amos and J. Zico Kolter. Optnet: Differentiable optimization as a layer in neural networks. In Proc. Int. Conf. on Machine Learning, 2017.
  • [14] J-H. Chang, C-L. Li, B. Poczos, B.V.K. Vijaya Kumar, and A.C. Sankaranarayanan. One network to solve them all — solving linear inverse problems using deep projection models. In Proc. IEEE Int. Conf. on Computer Vision, 2017.
  • [15] T. Meinhardt, M. Moeller, C. Hazirbas, and D. Cremers. Learning proximal operators: Using denoising networks for regularizing inverse imaging problems. In Proc. IEEE Int. Conf. on Computer Vision, 2017.
  • [16] M. Moeller, T. Möllenhoff, and D. Cremers. Controlling neural networks via energy dissipation, 2019. Online at https://arxiv.org/abs/1904.03081.
  • [17] D. Ulyanov, A. Vedaldi, and V.S. Lempitsky. Deep image prior. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2018.
  • [18] R. Heckel and P. Hand. Deep decoder: Concise image representations from untrained non-convolutional networks. In Int. Conf. on Learning Representations, 2019.
  • [19] Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, and Christian Theobalt. MoFA: Model-based deep convolutional face autoencoder for unsupervised monocular reconstruction. In Proc. IEEE Int. Conf. on Computer Vision, 2017.
  • [20] Volker Blanz and Thomas Vetter. A morphable model for the synthesis of 3d faces. In Proc. SIGGRAPH, pages 187–194, New York, NY, USA, 1999. ACM Press/Addison-Wesley Publishing Co.
  • [21] Zhenyu Long, Tianyi Wang, ChengWu You, Zhengang Yang, Kejia Wang, and Jinsong Liu. Terahertz image super-resolution based on a deep convolutional neural network. Applied Optics, 58(10):2731–2735, 2019.
  • [22] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Learning a deep convolutional network for image super-resolution. In Proc. Europ. Conf. Computer Vision, pages 184–199. Springer, 2014.
  • [23] Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. Accurate image super-resolution using very deep convolutional networks. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 1646–1654, 2016.
  • [24] Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 3883–3891, 2017.
  • [25] Christian J Schuler, Michael Hirsch, Stefan Harmeling, and Bernhard Schölkopf. Learning to deblur. IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 38(7):1439–1451, 2016.
  • [26] T. Plötz and S. Roth. Benchmarking denoising algorithms with real photographs. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2017.
  • [27] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proc. Int. Conf. on Machine Learning, 2015.
  • [28] Xavier Glorot, Antoine Bordes, and Yoshua Bengio. Deep sparse rectifier neural networks. In Proc. Int. Conf. on Artificial Intelligence and Statistics (AISTATS), pages 315–323, 2011.
  • [29] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. Int. Conf. on Learning Representations, 2015.
  • [30] Military Standard. Photographic lenses, 1959.
  • [31] Thomas F Coleman and Yuying Li. An interior trust region approach for nonlinear minimization subject to bounds. SIAM Journal on optimization, 6(2):418–445, 1996.