VIFB: A Visible and Infrared Image Fusion Benchmark

02/09/2020 ∙ by Xingchen Zhang, et al. ∙ Shanghai Jiao Tong University ∙ Imperial College London

Visible and infrared image fusion is one of the most important areas in image processing because of its numerous applications. While much progress has been made in recent years on developing fusion algorithms, the field lacks a code library and benchmark that can gauge the state of the art. In this paper, after briefly reviewing recent advances in visible and infrared image fusion, we present a visible and infrared image fusion benchmark (VIFB), which consists of 21 image pairs, a code library of 20 fusion algorithms, and 13 evaluation metrics. We also carry out large-scale experiments within the benchmark to understand the performance of these algorithms. By analyzing qualitative and quantitative results, we identify effective algorithms for robust image fusion and give some observations on the status and future prospects of this field. The benchmark, including the dataset, code library, evaluation metrics, and results, is available upon request.




1 Introduction

Image fusion combines information from different images into a single image that is more informative and better suited to subsequent processing. Figure 1 shows an example in which the fused image reveals both the target and more detail about the scene. Many image fusion algorithms have been proposed; based on the level at which fusion takes place, they can generally be divided into pixel-level, feature-level, and decision-level approaches. Image fusion can also be performed either in the spatial domain or in a transform domain. Based on application area, image fusion technology can be grouped into several types, namely medical image fusion [11, 39], multi-focus image fusion [36, 23, 41], remote sensing image fusion [9], multi-exposure image fusion [30, 32], and visible and infrared image fusion [26]. Among these types, visible and infrared image fusion is one of the most frequently used, because it can be applied in many applications, for instance object tracking [42, 43, 15], object detection [35], and biometric recognition [13, 1].

Figure 1: The benefit of visible and infrared image fusion. The people around the car cannot be seen in the visible image because the car lights cause overexposure. Although they can be seen in the infrared image, the infrared image lacks detail about the scene. After fusion, the fused image contains enough detail and the people are also visible.
| Name | Image/Video pairs | Image type | Resolution | Year | Results | Code library |
| --- | --- | --- | --- | --- | --- | --- |
| OSU Color-Thermal Database | 6 video pairs | RGB, Infrared | 320 × 240 | 2005 | No | No |
| TNO | 63 image pairs | Multispectral | Various | 2014 | No | No |
| VLIRVDIF | 24 video pairs | RGB, Infrared | 720 × 480 | 2019 | No | No |
| VIFB | 21 image pairs | RGB, Infrared | Various | 2020 | Yes | Yes |

Table 1: Details of some existing visible and infrared image fusion datasets.
Figure 2: The infrared and visible dataset in VIFB. The dataset includes 21 pairs of infrared and visible images. The first, third, and fifth rows contain RGB images, while the second, fourth, and sixth rows present the corresponding infrared images. In other words, in each pair of images, the top one is the visible image and the bottom one is the infrared image.

However, current research on visible and infrared image fusion suffers from several problems that severely hinder the development of the field. First, there is no well-recognized visible and infrared image fusion dataset that can be used to compare performance under the same standard. It is therefore common for different papers to use different images in their experiments, which makes it difficult to compare the performance of different algorithms. Second, although the source code of many image fusion algorithms has been made publicly available, for example FusionGAN [28] and DenseFuse [16], the input and output formats of most algorithms differ, which makes large-scale performance evaluation inconvenient. Third, it is crucial to evaluate state-of-the-art fusion algorithms in order to demonstrate their strengths and weaknesses and to help identify future research directions for designing more robust algorithms. Many evaluation metrics have been proposed for fused images, but none of them is better than all the others. As a result, researchers normally choose the few metrics that support their methods, which further makes it difficult to compare performance objectively.

To solve these issues, in this work we build a visible and infrared image fusion benchmark (VIFB) that includes 21 pairs of visible and infrared images, 20 publicly available fusion algorithms and 13 evaluation metrics to facilitate the evaluation task.

The main contributions of this paper lie in the following aspects:

  • Dataset. We created a dataset containing 21 pairs of visible and infrared images. These image pairs cover a wide range of environments and conditions, such as indoor, outdoor, low illumination, and over-exposure. Therefore, the dataset is able to test the generalization ability of fusion algorithms.

  • Code library. We collected 20 recent image fusion algorithms and integrated them into a code library, which can be easily used to run the algorithms and compare their performance. Most of these algorithms were published in the last five years. An interface is provided so that any image fusion algorithm written in Matlab can be integrated into VIFB easily.

  • Comprehensive performance evaluation. We implemented 13 evaluation metrics in VIFB to comprehensively compare fusion performance. We ran the 20 collected algorithms on the proposed dataset and performed a comprehensive comparison of those algorithms. All the results are made available for interested readers to use.

| Method | Year | Journal/Conference | Category |
| --- | --- | --- | --- |
| MSVD [31] | 2011 | Defence Science Journal | Multi-scale |
| GFF [21] | 2013 | IEEE Transactions on Image Processing | Multi-scale |
| MST_SR [25] | 2015 | Information Fusion | Hybrid |
| RP_SR [25] | 2015 | Information Fusion | Hybrid |
| NSCT_SR [25] | 2015 | Information Fusion | Hybrid |
| CBF [34] | 2015 | Signal, Image and Video Processing | Multi-scale |
| ADF [2] | 2016 | IEEE Sensors Journal | Multi-scale |
| GFCE [45] | 2016 | Applied Optics | Multi-scale |
| HMSD_GF [45] | 2016 | Applied Optics | Multi-scale |
| Hybrid_MSD [46] | 2016 | Information Fusion | Hybrid |
| TIF [3] | 2016 | Infrared Physics & Technology | Saliency-based |
| GTF [26] | 2016 | Information Fusion | Other |
| FPDE [4] | 2017 | International Conference on Information Fusion | Subspace-based |
| IFEVIP [44] | 2017 | Infrared Physics & Technology | Other |
| VSMWLS [29] | 2017 | Infrared Physics & Technology | Saliency-based |
| DLF [19] | 2018 | International Conference on Pattern Recognition | DL-based |
| LatLRR [17] | 2018 | arXiv | Saliency-based |
| CNN [22] | 2018 | International Journal of Wavelets, Multiresolution and Information Processing | DL-based |
| MGFF [5] | 2019 | Circuits, Systems, and Signal Processing | Multi-scale |
| ResNet [18] | 2019 | Infrared Physics & Technology | DL-based |

Table 2: Visible and infrared image fusion algorithms that have been integrated in VIFB.
| Name | Meaning | +/- |
| --- | --- | --- |
| AG | Average gradient | + |
| CE | Cross entropy | - |
| EI | Edge intensity | + |
| FD | Figure definition | + |
| MI | Mutual information | + |
| PSNR | Peak signal-to-noise ratio | + |
| EN | Entropy | + |
| RW | Relatively wrap | - |
| RMSE | Root mean squared error | - |
| Q^{AB/F} | Edge based similarity measurement | + |
| SF | Spatial frequency | + |
| SSIM | Structural similarity index measure | + |
| SD | Standard deviation | + |

Table 3: Evaluation metrics implemented in VIFB. A '+' means that a large value indicates good performance, while a '-' means that a small value indicates good performance.

2 Related Work

In this section, we briefly review recent visible and infrared image fusion algorithms. In addition, we summarize existing visible and infrared image datasets.

2.1 Visible-infrared fusion methods

In recent years, many visible and infrared image fusion methods have been proposed. Before deep learning was introduced to the image fusion community, the main image fusion methods could generally be grouped, according to their underlying theories, into several categories: multi-scale transform-, sparse representation-, subspace-, and saliency-based methods, hybrid models, and other methods.

In the past few years, a number of image fusion methods based on deep learning have emerged [12, 20, 24, 27]. Deep learning can help to solve several important problems in image fusion. First, deep learning can provide better features than handcrafted ones. Second, deep learning can learn adaptive weights for image fusion, which is crucial in many fusion rules. Regarding methods, convolutional neural networks (CNN) [10, 23, 41, 39, 32], generative adversarial networks (GAN) [28], Siamese networks [22], and autoencoders [16] have been explored for image fusion. Apart from image fusion methods, image quality assessment, which is critical in evaluating fusion performance, has also benefited from deep learning [40]. It is foreseeable that image fusion technology will continue to develop in the direction of machine learning and that an increasing number of research results will appear.

2.2 Existing dataset

Although research on image fusion has been under way for many years, there is still no well-recognized and commonly used dataset in the visible and infrared image fusion community. This differs from visual tracking, where several well-known benchmarks have been proposed and widely used, such as OTB [37, 38] and VOT [14]. It is therefore common for different image pairs to be used in the visible and infrared image fusion literature, which makes objective comparison difficult.

At the moment, there are several existing visible and infrared image fusion datasets, including the OSU Color-Thermal Database [6], the TNO Image Fusion Dataset, and VLIRVDIF [7]. The main information about these datasets is summarized in Table 1. Apart from OSU, the number of image pairs in TNO and VLIRVDIF is not small. However, the lack of a code library, evaluation metrics, and results on these datasets makes it difficult to gauge the state of the art based on them.

Figure 3: Qualitative comparison of 20 methods on the fight image pair shown in Fig. 2.
Figure 4: Qualitative comparison of 20 methods on the manlight image pair shown in Fig. 1 and Fig. 2.

3 Visible and Infrared Image Fusion Benchmark

3.1 Dataset

The dataset in VIFB, which is a test set, includes 21 pairs of visible and infrared images. These images cover a wide range of environments and working conditions, such as indoor, outdoor, low illumination, and over-exposure. Each pair of visible and infrared images has been strictly registered to make sure that image fusion can be performed successfully. The dataset contains various image resolutions, such as 320 × 240, 630 × 460, 512 × 184, and 452 × 332. Some examples of images in the dataset are given in Fig. 2. The images were collected by the authors from the Internet and from a fusion tracking dataset [15].
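Because the pairs come in several resolutions and mix RGB with single-channel infrared images, a loader would typically verify that the two images of a pair share the same height and width before fusing them. The sketch below illustrates such a check with NumPy. The function names `check_pair` and `to_gray` are hypothetical and are not part of VIFB, and matching dimensions are only a necessary condition for registration, not proof of it.

```python
import numpy as np

# BT.601 luminance weights, a common choice for RGB-to-gray conversion (an assumption here).
_LUMA = np.array([0.299, 0.587, 0.114])


def to_gray(img):
    """Return a 2-D float image; RGB inputs are reduced with luminance weights."""
    img = np.asarray(img, dtype=np.float64)
    if img.ndim == 3:
        return img[..., :3] @ _LUMA
    return img


def check_pair(visible, infrared):
    """True when both images have the same height and width (channels may differ)."""
    return np.asarray(visible).shape[:2] == np.asarray(infrared).shape[:2]


# A 320 x 240 pair is stored as (height, width) = (240, 320) in NumPy.
visible = np.zeros((240, 320, 3), dtype=np.uint8)
infrared = np.zeros((240, 320), dtype=np.uint8)
print(check_pair(visible, infrared), to_gray(visible).shape)
```

A shape check of this kind only catches gross mismatches; verifying true geometric alignment would need a dedicated registration measure.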

Figure 5: Quantitative comparisons of nine metrics of 20 methods on 21 image pairs shown in Fig. 2. From 1 to 21 in the horizontal axis: carLight, carShadow, carWhite, elecbike, fight, kettle, labMan, man, manCall, manCar, manlight, manWalking, manwithbag, nightCar, peopleshadow, running, snow, tricycle, walking, walking2, walkingnight.

3.2 Baseline algorithms

In recent years, many algorithms have been proposed to perform visible and infrared image fusion. However, only some of the papers provide source code. Moreover, the released codes have different input and output interfaces and may require different running environments. These factors hinder the use of these codes to produce results and compare performance.

In the VIFB benchmark, we integrate 20 recently published visible-infrared image fusion algorithms: MSVD [31], GFF [21], MST_SR [25], RP_SR [25], NSCT_SR [25], CBF [34], ADF [2], GFCE [45], HMSD_GF [45], Hybrid_MSD [46], TIF [3], GTF [26], FPDE [4], IFEVIP [44], VSMWLS [29], DLF [19], LatLRR [17], CNN [22], MGFF [5], and ResNet [18]. Table 2 lists more details about these algorithms. At the moment, all image fusion algorithms integrated in VIFB are written in Matlab. Note that some algorithms can only fuse grayscale images, while others can fuse color images.

These algorithms cover almost every category of visible-infrared fusion algorithm, and most of them were proposed in the last five years, so they represent the development of the field to some extent. We will continue to add more algorithms to VIFB in the future.

For the convenience of users, we designed an interface for integrating fusion algorithms written in Matlab into VIFB. Using this interface, any visible-infrared fusion algorithm written in Matlab can be added to the benchmark easily. For methods whose source code is not written in Matlab, we also designed an interface that imports their fusion results into VIFB, so that their results can be compared with those of other algorithms.
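The idea of a uniform integration interface can be illustrated with a short sketch. VIFB itself is written in Matlab and its actual interface is not described here, so everything below is a hypothetical Python analogue: the `REGISTRY` dictionary, the `register` decorator, the `fuse_per_channel` adapter, and the `run_all` helper are invented for illustration. The adapter shows one plausible way, assumed rather than taken from the paper, to let a grayscale-only method handle a color visible image by fusing each channel with the infrared image.

```python
from typing import Callable, Dict

import numpy as np

FusionFn = Callable[[np.ndarray, np.ndarray], np.ndarray]

# Hypothetical registry mapping a method name to its fusion function.
REGISTRY: Dict[str, FusionFn] = {}


def register(name: str) -> Callable[[FusionFn], FusionFn]:
    """Decorator that adds a fusion function to the registry under `name`."""
    def decorator(fn: FusionFn) -> FusionFn:
        REGISTRY[name] = fn
        return fn
    return decorator


def fuse_per_channel(gray_fn: FusionFn) -> FusionFn:
    """Adapt a grayscale-only method to color inputs, one channel at a time."""
    def wrapped(visible: np.ndarray, infrared: np.ndarray) -> np.ndarray:
        if visible.ndim == 2:
            return gray_fn(visible, infrared)
        channels = [gray_fn(visible[..., c], infrared)
                    for c in range(visible.shape[-1])]
        return np.stack(channels, axis=-1)
    return wrapped


@register("average")
@fuse_per_channel
def average_fusion(visible: np.ndarray, infrared: np.ndarray) -> np.ndarray:
    """Toy placeholder (pixel-wise mean); not one of the 20 VIFB methods."""
    return 0.5 * (visible.astype(np.float64) + infrared.astype(np.float64))


def run_all(visible: np.ndarray, infrared: np.ndarray) -> Dict[str, np.ndarray]:
    """Run every registered method on one image pair."""
    return {name: fn(visible, infrared) for name, fn in REGISTRY.items()}


visible = np.full((4, 6, 3), 100, dtype=np.uint8)
infrared = np.full((4, 6), 200, dtype=np.uint8)
print({name: img.shape for name, img in run_all(visible, infrared).items()})
```

With every method behind the same two-argument signature, the benchmark loop and the metric code no longer need to know which algorithm produced a fused image.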

3.3 Evaluation metrics

Numerous evaluation metrics for visible-infrared image fusion have been proposed, such as mutual information, spatial frequency, and cross entropy. However, none of them is better than all the others. In the literature, authors therefore normally choose and present the few evaluation metrics that support their methods, which may not allow performance to be compared comprehensively. In VIFB, we implement 13 evaluation metrics. All of these metrics can be computed conveniently for each method in VIFB, which makes it easy to compare performance. All evaluation metrics implemented in VIFB are listed in Table 3. Here we introduce only a few of them; the other metrics will be described in the supplementary material.

  • Mutual information (MI).
    MI [33] measures the amount of information that is transferred from the source images to the fused image. It is defined as:

    $$\mathrm{MI} = \mathrm{MI}_{A,F} + \mathrm{MI}_{B,F},$$

    where $\mathrm{MI}_{A,F}$ and $\mathrm{MI}_{B,F}$ denote the information transferred from the visible and infrared images to the fused image, respectively. The subscript $F$ denotes the fused image. Specifically, $\mathrm{MI}_{X,F}$ is defined as follows:

    $$\mathrm{MI}_{X,F} = \sum_{x,f} p_{X,F}(x,f) \log \frac{p_{X,F}(x,f)}{p_X(x)\,p_F(f)},$$

    where $X = A$ is for the visible image and $X = B$ is for the infrared image, $p_X(x)$ and $p_F(f)$ are the marginal histograms of source image $X$ and fused image $F$, respectively, and $p_{X,F}(x,f)$ is the joint histogram of source image $X$ and fused image $F$. A large MI value means good fusion performance, since considerable information is transferred to the fused image.

  • Root mean squared error (RMSE).
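    RMSE is defined as:

    $$\mathrm{RMSE} = \frac{\mathrm{RMSE}_{A,F} + \mathrm{RMSE}_{B,F}}{2},$$

    where $\mathrm{RMSE}_{A,F}$ denotes the dissimilarity between the visible and fused images and $\mathrm{RMSE}_{B,F}$ is the dissimilarity between the infrared and fused images. $\mathrm{RMSE}_{X,F}$ is defined as:

    $$\mathrm{RMSE}_{X,F} = \sqrt{\frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \bigl(X(m,n) - F(m,n)\bigr)^2},$$

    where $X = A$ is for the visible image and $X = B$ is for the infrared image, and $M$ and $N$ are the width and height of the images, respectively. If the fused image has a small amount of error and distortion, the RMSE value will be small.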

  • Spatial frequency (SF).
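    SF [8] measures the gradient distribution of an image and thus reveals its detail and texture. For a fused image $F$ of size $M \times N$, it is defined as:

    $$\mathrm{SF} = \sqrt{\mathrm{RF}^2 + \mathrm{CF}^2},$$

    where $\mathrm{RF} = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=2}^{N} \bigl(F(i,j) - F(i,j-1)\bigr)^2}$ and $\mathrm{CF} = \sqrt{\frac{1}{MN} \sum_{i=2}^{M} \sum_{j=1}^{N} \bigl(F(i,j) - F(i-1,j)\bigr)^2}$. A large SF value indicates rich edges and textures, and thus good fusion performance.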
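The three metrics described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the VIFB implementation (which is in Matlab): images are assumed to be 8-bit grayscale arrays, MI uses 256-bin histograms and base-2 logarithms, RMSE averages the two source-to-fused errors, and SF divides the squared differences by the total number of pixels. The function names are hypothetical.

```python
import numpy as np


def _mi_pair(x, f, bins=256):
    """Mutual information (in bits) between a source image x and fused image f."""
    joint, _, _ = np.histogram2d(x.ravel(), f.ravel(), bins=bins,
                                 range=[[0, 256], [0, 256]])
    p_xf = joint / joint.sum()              # joint histogram
    p_x = p_xf.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    p_f = p_xf.sum(axis=0, keepdims=True)   # marginal of f, shape (1, bins)
    nz = p_xf > 0                           # skip empty bins to avoid log(0)
    return float(np.sum(p_xf[nz] * np.log2(p_xf[nz] / (p_x @ p_f)[nz])))


def mutual_information(visible, infrared, fused):
    """MI = MI(visible, fused) + MI(infrared, fused)."""
    return _mi_pair(visible, fused) + _mi_pair(infrared, fused)


def rmse(visible, infrared, fused):
    """Mean of the two root mean squared errors against the fused image."""
    f = np.asarray(fused, dtype=np.float64)

    def pair(x):
        return np.sqrt(np.mean((np.asarray(x, dtype=np.float64) - f) ** 2))

    return float(0.5 * (pair(visible) + pair(infrared)))


def spatial_frequency(fused):
    """SF = sqrt(RF^2 + CF^2), with differences normalized by the pixel count."""
    f = np.asarray(fused, dtype=np.float64)
    rf2 = np.sum((f[:, 1:] - f[:, :-1]) ** 2) / f.size  # horizontal differences
    cf2 = np.sum((f[1:, :] - f[:-1, :]) ** 2) / f.size  # vertical differences
    return float(np.sqrt(rf2 + cf2))


# Tiny sanity check: a two-level image fused with itself.
img = np.array([[0, 255], [0, 255]], dtype=np.uint8)
print(mutual_information(img, img, img), rmse(img, img, img), spatial_frequency(img))
```

In this check the image has two equally likely gray levels, so each MI term equals the 1-bit entropy of the image, the RMSE is zero, and only the horizontal differences contribute to SF.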

More information about evaluation metrics can be found in [27].

4 Experiments

This section presents experimental results on the VIFB dataset. Sections 4.1 and 4.2 present the qualitative and quantitative performance comparisons, respectively. Section 4.3 compares the runtime of each algorithm. All experiments were performed on a desktop equipped with an NVIDIA GTX 1080Ti GPU and an i7-8700K CPU. The default parameters reported by the authors of each algorithm are used. Because of the page limit, only part of the results is presented here. More fusion results will be provided in the supplementary material.

4.1 Qualitative performance comparison

Qualitative evaluation methods are important in fusion quality assessment; they assess the quality of fused images on the basis of the human visual system. Figure 3 presents the qualitative performance comparison of the 20 fusion methods on the fight image pair. In this image pair, several people are in the shadow of a car and thus cannot be seen clearly in the visible image, although they can be seen in the infrared image. In almost all fused images these people can be seen. However, the fused images obtained by some algorithms contain more artifacts. These include ADF, CBF, DLF, FPDE, GFCE, HMSD_GF, Hybrid_MSD, IFEVIP, MST_SR, MSVD, NSCT_SR, ResNet, RP_SR, TIF, and VSMWLS. In addition, the fused images produced by GTF, ResNet, and RP_SR do not preserve much of the detail contained in the visible image. Figure 3 indicates that the fused images obtained by GFF and MGFF look more natural to the human eye and preserve more details.

Figure 4 shows the qualitative comparison of the 20 methods on the manlight image pair. In many fused images, the people around the car are still invisible or unclear, such as in those produced by ADF, CNN, GFCE, HMSD_GF, Hybrid_MSD, and IFEVIP. Some other fused images contain artifacts that are not present in the original images, such as those obtained by CBF, GFCE, and NSCT_SR. The results indicate that DLF, FPDE, MGFF, and MST_SR give better subjective fusion performance for the manlight case.

4.2 Quantitative performance comparison

Table 4 presents the average values of the 13 evaluation metrics for all methods on the 21 image pairs. The LatLRR method obtains the best overall performance, with 4 best values and 3 second-best values. NSCT_SR, GFCE, and IFEVIP also show relatively good overall performance. However, the table clearly indicates that no fusion method is dominant, in the sense of beating the other methods on all or most evaluation metrics. Moreover, none of the three deep learning-based methods, namely DLF, CNN, and ResNet, shows very competitive overall performance. This differs from the fields of tracking and detection, which are almost dominated by deep learning-based approaches.

To further show the quantitative comparison of the fusion performance of different methods, the results of nine metrics for all 20 methods on the 21 image pairs are presented in Figure 5.

MSVD (0,0,1) 3.489 1.454 35.595 6.698 4.579 1.986 64.596 0.139 0.499 0.024 12.276 1.433 105.939
GFF (0,0,0) 5.301 1.268 54.959 7.124 6.592 2.578 64.952 0.136 0.699 0.022 17.210 1.423 105.115
MST_SR (0,2,1) 5.796 0.975 60.158 7.341 7.106 2.926 65.479 0.147 0.729 0.020 18.489 1.401 108.806
RP_SR (0,1,1) 6.280 1.021 64.275 7.350 7.903 2.410 65.756 0.161 0.720 0.018 20.905 1.348 112.318
NSCT_SR (3,0,0) 6.419 0.896 67.156 7.393 7.740 3.228 65.608 0.126 0.671 0.019 19.053 1.291 105.731
CBF (0,4,0) 7.092 0.979 73.863 7.319 8.720 2.248 65.305 0.122 0.632 0.020 20.179 1.184 105.378
ADF (0,0,0) 4.971 1.676 50.702 6.950 6.353 1.984 65.000 0.136 0.540 0.023 15.234 1.380 111.400
GFCE (2,4,1) 7.401 1.899 76.443 7.255 9.070 1.855 68.277 0.261 0.684 0.011 21.983 1.153 158.797
HMSD_GF (0,0,0) 6.205 1.156 64.570 7.282 7.544 2.517 65.886 0.163 0.757 0.018 19.610 1.404 115.725
Hybrid_MSD (0,0,0) 6.069 1.237 62.842 7.307 7.467 2.671 65.175 0.144 0.722 0.021 19.376 1.415 107.297
TIF (0,1,0) 4.527 1.782 46.746 6.781 5.531 1.737 65.105 0.138 0.462 0.022 14.294 1.411 112.537
GTF (1,0,0) 4.284 1.258 43.473 6.504 5.573 2.022 66.272 0.130 0.295 0.017 14.690 1.383 106.106
FPDE (0,0,0) 5.129 1.714 52.419 6.999 6.472 1.997 65.167 0.146 0.512 0.022 15.066 1.360 113.056
IFEVIP (2,1,1) 4.962 1.286 51.558 6.955 6.073 2.363 69.722 0.259 0.613 0.008 15.796 1.403 138.318
VSMWLS (0,0,0) 5.564 1.380 56.727 7.017 7.046 2.048 64.687 0.148 0.595 0.023 17.429 1.428 109.096
DLF (1,0,1) 3.806 1.353 38.309 6.719 4.936 2.066 64.619 0.139 0.499 0.023 12.323 1.473 106.321
LatLRR (4,3,1) 9.004 1.736 93.131 6.917 11.045 1.665 68.887 0.252 0.724 0.010 29.488 1.189 155.134
CNN (0,1,0) 5.783 1.033 60.000 7.332 7.159 2.711 65.627 0.163 0.758 0.019 18.750 1.403 114.730
MGFF (0,0,0) 5.830 1.313 60.543 7.120 7.206 1.799 64.551 0.133 0.524 0.024 17.903 1.418 106.243
ResNet (0,1,0) 3.564 1.392 36.304 6.708 4.522 2.046 64.603 0.137 0.505 0.023 11.124 1.470 106.053
Table 4: Average evaluation metric values of all methods on 21 image pairs. The best three values in each metric are denoted in red, green and blue, respectively. The three numbers after the name of each method denote the number of best value, second best value and third best value, respectively. Best viewed in color.
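The triplet after each method name can be derived mechanically from the averaged metric values once the direction of each metric (Table 3) is known. The sketch below shows one way to do this. The function name `podium_counts` is hypothetical, and the numbers in the usage example are toy values chosen for illustration, not values from Table 4; ties are resolved by input order, which is an arbitrary choice.

```python
def podium_counts(scores, larger_is_better):
    """Count best, second-best, and third-best placements per method.

    scores: {metric: {method: average value}}
    larger_is_better: {metric: True for '+' metrics, False for '-' metrics}
    Returns {method: (n_best, n_second, n_third)}.
    """
    methods = {m for per_method in scores.values() for m in per_method}
    counts = {m: [0, 0, 0] for m in methods}
    for metric, per_method in scores.items():
        ranked = sorted(per_method, key=per_method.get,
                        reverse=larger_is_better[metric])
        for place, method in enumerate(ranked[:3]):
            counts[method][place] += 1
    return {m: tuple(c) for m, c in counts.items()}


# Toy values for four placeholder methods (not taken from Table 4).
toy_scores = {
    "SD": {"A": 10.0, "B": 30.0, "C": 20.0, "D": 5.0},   # '+': larger is better
    "RMSE": {"A": 0.1, "B": 0.4, "C": 0.2, "D": 0.3},    # '-': smaller is better
}
toy_direction = {"SD": True, "RMSE": False}
print(podium_counts(toy_scores, toy_direction))
```

Counting podium placements is only a coarse summary: it ignores how large the gaps between methods are, so it is best read together with the raw values.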

4.3 Runtime comparison

The runtime of all algorithms integrated in VIFB is listed in Table 5. The runtime of image fusion methods varies significantly from one method to another, even among methods in the same category. For instance, both TIF and LatLRR are saliency-based methods, but the runtime of LatLRR is more than 2000 times that of TIF. In addition, multi-scale methods are generally fast, and deep learning-based algorithms are slower than the others even with the help of a GPU. The fastest deep learning-based method, ResNet, takes 2.89 seconds to fuse one image pair. It should be mentioned that none of the three deep learning-based algorithms in VIFB updates its model online; they use pretrained models instead.

One important application area of visible and infrared image fusion is RGB-infrared fusion tracking [42, 43], where tracking speed is vital for practical applications. As pointed out in [42], if an image fusion algorithm is very time-consuming, like LatLRR [17] and NSCT_SR [25], it is not feasible to develop a real-time fusion tracker based on it. In fact, most image fusion algorithms listed in Table 5 are computationally expensive for tracking purposes.

| Method | Average runtime (s) | Category |
| --- | --- | --- |
| MSVD [31] | 0.21 | Multi-scale |
| GFF [21] | 0.21 | Multi-scale |
| MST_SR [25] | 0.56 | Hybrid |
| RP_SR [25] | 0.58 | Hybrid |
| NSCT_SR [25] | 34.29 | Hybrid |
| CBF [34] | 7.30 | Multi-scale |
| ADF [2] | 0.23 | Multi-scale |
| GFCE [45] | 0.27 | Multi-scale |
| HMSD_GF [45] | 0.26 | Multi-scale |
| Hybrid_MSD [46] | 4.18 | Hybrid |
| TIF [3] | 0.05 | Saliency-based |
| GTF [26] | 1.91 | Other |
| FPDE [4] | 0.29 | Subspace-based |
| IFEVIP [44] | 0.11 | Other |
| VSMWLS [29] | 0.45 | Saliency-based |
| DLF [19] | 4.98 | DL-based |
| LatLRR [17] | 107.77 | Saliency-based |
| CNN [22] | 22.43 | DL-based |
| MGFF [5] | 0.31 | Multi-scale |
| ResNet [18] | 2.89 | DL-based |

Table 5: Runtime of algorithms in VIFB (seconds per image pair).
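The runtime figures in Table 5 can be turned into slowdown ratios and throughput with simple arithmetic, as the sketch below shows. The runtimes are copied from Table 5; the helper names and the 25 frames-per-second budget used as a stand-in for "real time" are assumptions made for illustration, not values from the paper.

```python
# Average runtime in seconds per image pair, copied from Table 5.
RUNTIME_S = {
    "MSVD": 0.21, "GFF": 0.21, "MST_SR": 0.56, "RP_SR": 0.58,
    "NSCT_SR": 34.29, "CBF": 7.30, "ADF": 0.23, "GFCE": 0.27,
    "HMSD_GF": 0.26, "Hybrid_MSD": 4.18, "TIF": 0.05, "GTF": 1.91,
    "FPDE": 0.29, "IFEVIP": 0.11, "VSMWLS": 0.45, "DLF": 4.98,
    "LatLRR": 107.77, "CNN": 22.43, "MGFF": 0.31, "ResNet": 2.89,
}


def slowdown(slow, fast):
    """How many times longer `slow` takes than `fast`."""
    return RUNTIME_S[slow] / RUNTIME_S[fast]


def pairs_per_second(method):
    """Throughput implied by the average runtime."""
    return 1.0 / RUNTIME_S[method]


def methods_within(budget_s):
    """Methods whose average runtime fits a per-pair time budget, fastest first."""
    fits = [m for m, t in RUNTIME_S.items() if t <= budget_s]
    return sorted(fits, key=RUNTIME_S.get)


print(f"LatLRR / TIF slowdown: {slowdown('LatLRR', 'TIF'):.0f}x")
print(f"TIF throughput: {pairs_per_second('TIF'):.0f} pairs/s")
# Assumed real-time budget of 25 frames per second, i.e. 0.04 s per pair.
print("Methods within 0.04 s:", methods_within(1 / 25))
```

Under this assumed budget even the fastest method in the table, TIF at 0.05 s per pair, falls slightly short, which is consistent with the observation that most of the listed algorithms are expensive for tracking.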

5 Concluding Remarks

In this paper, we present a visible and infrared image fusion benchmark (VIFB), which includes a dataset of 21 image pairs, a code library consisting of 20 algorithms, 13 evaluation metrics, and all results. To the best of our knowledge, this is the first visible and infrared image fusion benchmark to date. The benchmark facilitates a better understanding of state-of-the-art image fusion approaches and provides a platform for gauging new methods.

We carried out large-scale experiments based on VIFB to evaluate the performance of all integrated fusion algorithms, and our experimental results lead to several observations on the status of visible and infrared image fusion. First, unlike some other fields in computer vision where deep learning is almost the dominant approach, such as object tracking and detection, different kinds of methods are still frequently used in visible and infrared image fusion. Second, although the number of deep learning-based image fusion methods is increasing, their performance does not yet show superiority over non-learning algorithms. However, because of its strong representation ability and end-to-end property, we believe that deep learning-based image fusion will be an important research direction in the future. Third, the computational efficiency of visible and infrared image fusion algorithms still needs to be improved before they can be used in real-time applications such as tracking and detection.

We will continue to extend the dataset and code library of VIFB with more image pairs and fusion algorithms. We will also implement more evaluation metrics in VIFB. We hope that VIFB can serve as a good starting point for researchers who are interested in visible and infrared image fusion.


References

  • [1] Syed Mohd Zahid Syed Zainal Ariffin, Nursuriati Jamil, and Puteri Norhashimah Megat Abdul Rahman. Can thermal and visible image fusion improves ear recognition? In Proceedings of the 8th International Conference on Information Technology, pages 780–784, 2017.
  • [2] Durga Prasad Bavirisetti and Ravindra Dhuli. Fusion of infrared and visible sensor images based on anisotropic diffusion and karhunen-loeve transform. IEEE Sensors Journal, 16(1):203–209, 2016.
  • [3] Durga Prasad Bavirisetti and Ravindra Dhuli. Two-scale image fusion of visible and infrared images using saliency detection. Infrared Physics & Technology, 76:52–64, 2016.
  • [4] Durga Prasad Bavirisetti, Gang Xiao, and Gang Liu. Multi-sensor image fusion based on fourth order partial differential equations. In 2017 20th International Conference on Information Fusion (Fusion), pages 1–9. IEEE, 2017.
  • [5] Durga Prasad Bavirisetti, Gang Xiao, Junhao Zhao, Ravindra Dhuli, and Gang Liu. Multi-scale guided image and video fusion: A fast and efficient approach. Circuits, Systems, and Signal Processing, 38(12):5576–5605, Dec 2019.
  • [6] James W Davis and Vinay Sharma. Background-subtraction using contour-based fusion of thermal and visible imagery. Computer vision and image understanding, 106(2-3):162–182, 2007.
  • [7] Andreas Ellmauthaler, Carla L Pagliari, Eduardo AB da Silva, Jonathan N Gois, and Sergio R Neves. A visible-light and infrared video database for performance evaluation of video/image fusion methods. Multidimensional Systems and Signal Processing, 30(1):119–143, 2019.
  • [8] Ahmet M Eskicioglu and Paul S Fisher. Image quality measures and their performance. IEEE Transactions on communications, 43(12):2959–2965, 1995.
  • [9] Hassan Ghassemian. A review of remote sensing image fusion methods. Information Fusion, 32:75–89, 2016.
  • [10] Haithem Hermessi, Olfa Mourali, and Ezzeddine Zagrouba. Convolutional neural network-based multimodal image fusion via similarity learning in the shearlet domain. Neural Computing and Applications, pages 1–17, 2018.
  • [11] Alex Pappachen James and Belur V Dasarathy. Medical image fusion: A survey of the state of the art. Information Fusion, 19:4–19, 2014.
  • [12] Xin Jin, Qian Jiang, Shaowen Yao, Dongming Zhou, Rencan Nie, Jinjin Hai, and Kangjian He. A survey of infrared and visual image fusion methods. Infrared Physics & Technology, 85:478–501, 2017.
  • [13] Seong G. Kong, Jingu Heo, Besma R. Abidi, Joonki Paik, and Mongi A. Abidi. Recent advances in visual and infrared face recognition - A review. Computer Vision and Image Understanding, 97(1):103–135, 2005.
  • [14] Matej Kristan, Jiri Matas, Aleš Leonardis, Tomas Vojir, Roman Pflugfelder, Gustavo Fernandez, Georg Nebehay, Fatih Porikli, and Luka Čehovin. A novel performance evaluation methodology for single-target trackers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(11):2137–2155, Nov 2016.
  • [15] Chenglong Li, Xinyan Liang, Yijuan Lu, Nan Zhao, and Jin Tang. Rgb-t object tracking: benchmark and baseline. Pattern Recognition, page 106977, 2019.
  • [16] Hui Li and Xiaojun Wu. Densefuse: A fusion approach to infrared and visible images. IEEE Transactions on Image Processing, 28(5):2614–2623, 2018.
  • [17] Hui Li and Xiaojun Wu. Infrared and visible image fusion using latent low-rank representation. arXiv preprint arXiv:1804.08992, 2018.
  • [18] Hui Li, Xiao-Jun Wu, and Tariq S Durrani. Infrared and visible image fusion with resnet and zero-phase component analysis. Infrared Physics & Technology, 102:103039, 2019.
  • [19] Hui Li, Xiao-Jun Wu, and Josef Kittler. Infrared and visible image fusion using a deep learning framework. 24th International Conference on Pattern Recognition, 2018.
  • [20] Shutao Li, Xudong Kang, Leyuan Fang, Jianwen Hu, and Haitao Yin. Pixel-level image fusion: A survey of the state of the art. Information Fusion, 33:100–112, 2017.
  • [21] Shutao Li, Xudong Kang, and Jianwen Hu. Image fusion with guided filtering. IEEE Transactions on Image processing, 22(7):2864–2875, 2013.
  • [22] Yu Liu, Xun Chen, Juan Cheng, Hu Peng, and Zengfu Wang. Infrared and visible image fusion with convolutional neural networks. International Journal of Wavelets, Multiresolution and Information Processing, 16(03):1850018, 2018.
  • [23] Yu Liu, Xun Chen, Hu Peng, and Zengfu Wang. Multi-focus image fusion with a deep convolutional neural network. Information Fusion, 36:191–207, 2017.
  • [24] Yu Liu, Xun Chen, Zengfu Wang, Z Jane Wang, Rabab K Ward, and Xuesong Wang. Deep learning for pixel-level image fusion: Recent advances and future prospects. Information Fusion, 42:158–173, 2018.
  • [25] Yu Liu, Shuping Liu, and Zengfu Wang. A general framework for image fusion based on multi-scale transform and sparse representation. Information Fusion, 24:147–164, 2015.
  • [26] Jiayi Ma, Chen Chen, Chang Li, and Jun Huang. Infrared and visible image fusion via gradient transfer and total variation minimization. Information Fusion, 31:100–109, 2016.
  • [27] Jiayi Ma, Yong Ma, and Chang Li. Infrared and visible image fusion methods and applications: A survey. Information Fusion, 45:153–178, 2019.
  • [28] Jiayi Ma, Wei Yu, Pengwei Liang, Chang Li, and Junjun Jiang. FusionGAN: A generative adversarial network for infrared and visible image fusion. Information Fusion, 48(June 2018):11–26, 2019.
  • [29] Jinlei Ma, Zhiqiang Zhou, Bo Wang, and Hua Zong. Infrared and visible image fusion based on visual saliency map and weighted least square optimization. Infrared Physics & Technology, 82:8–17, 2017.
  • [30] Kede Ma, Kai Zeng, and Zhou Wang. Perceptual quality assessment for multi-exposure image fusion. IEEE Transactions on Image Processing, 24(11):3345–3356, 2015.
  • [31] VPS Naidu. Image fusion technique using multi-resolution singular value decomposition. Defence Science Journal, 61(5):479–484, 2011.
  • [32] K Ram Prabhakar, V Sai Srikar, and R Venkatesh Babu. Deepfuse: A deep unsupervised approach for exposure fusion with extreme exposure image pairs. In 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, pages 4724–4732, 2017.
  • [33] Guihong Qu, Dali Zhang, and Pingfan Yan. Information measure for performance of image fusion. Electronics letters, 38(7):313–315, 2002.
  • [34] B. K. Shreyamsha Kumar. Image fusion based on pixel significance using cross bilateral filter. Signal, Image and Video Processing, 9(5):1193–1204, Jul 2015.
  • [35] Helene Torresan, Benoit Turgeon, Clemente Ibarra-Castanedo, Patrick Hebert, and Xavier P Maldague. Advanced surveillance systems: combining video and thermal imagery for pedestrian detection. In Thermosense XXVI, volume 5405, pages 506–516. International Society for Optics and Photonics, 2004.
  • [36] Zhaobin Wang, Yide Ma, and Jason Gu. Multi-focus image fusion using pcnn. Pattern Recognition, 43(6):2003–2016, 2010.
  • [37] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang. Online object tracking: A benchmark. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2411–2418, 2013.
  • [38] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang. Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(9):1834–1848, 2015.
  • [39] Kaijian Xia, Hongsheng Yin, and Jiangqiang Wang. A novel improved deep convolutional neural network model for medical image fusion. Cluster Computing, pages 1–13, 2018.
  • [40] Qingsen Yan, Dong Gong, and Yanning Zhang. Two-stream convolutional networks for blind image quality assessment. IEEE Transactions on Image Processing, 28(5):2200–2211, 2018.
  • [41] Xiang Yan, Syed Zulqarnain Gilani, Hanlin Qin, and Ajmal Mian. Unsupervised Deep Multi-focus Image Fusion. pages 1–11, 2018.
  • [42] Xingchen Zhang, Gang Xiao, Ping Ye, Dan Qiao, Junhao Zhao, and Shengyun Peng. Object fusion tracking based on visible and infrared images using fully convolutional siamese networks. In Proceedings of the 22nd International Conference on Information Fusion. IEEE, 2019.
  • [43] Xingchen Zhang, Ping Ye, Shengyun Peng, Jun Liu, Ke Gong, and Gang Xiao. Siamft: An rgb-infrared fusion tracking method via fully convolutional siamese networks. IEEE Access, 7:122122–122133, 2019.
  • [44] Yu Zhang, Lijia Zhang, Xiangzhi Bai, and Li Zhang. Infrared and visual image fusion through infrared feature extraction and visual information preservation. Infrared Physics & Technology, 83:227–237, 2017.
  • [45] Zhiqiang Zhou, Mingjie Dong, Xiaozhu Xie, and Zhifeng Gao. Fusion of infrared and visible images for night-vision context enhancement. Applied optics, 55(23):6480–6490, 2016.
  • [46] Zhiqiang Zhou, Bo Wang, Sun Li, and Mingjie Dong. Perceptual fusion of infrared and visible images through a hybrid multi-scale decomposition with gaussian and bilateral filters. Information Fusion, 30:15–26, 2016.