Infrared and visible image fusion using latent low-rank representation
Infrared and visible image fusion is an important problem in the field of image fusion and has been widely applied in many fields. To better preserve the useful information of the source images, in this paper we propose a novel image fusion method based on latent low-rank representation (LatLRR), which is simple and effective. To the best of our knowledge, this is the first time that LatLRR is introduced to image fusion. Firstly, the source images are decomposed into low-rank parts (global structure) and saliency parts (local structure) by LatLRR. Then, the low-rank parts are fused by the weighted-average strategy, and the saliency parts are simply fused by the sum strategy. Finally, the fused image is obtained by combining the fused low-rank part and the fused saliency part. Experimental comparisons show that the proposed method achieves better fusion performance than state-of-the-art fusion methods in both subjective and objective evaluation. The code of our fusion method is available at https://github.com/exceptionLi/imagefusion_Infrared_visible_latlrr
In the multi-sensor image fusion field, infrared and visible image fusion is an important task. It has been widely used in many applications, such as surveillance, object detection and target recognition. The main purpose of image fusion is to generate a single image which contains the complementary information from multiple images of the same sceneShutao2017 . In infrared and visible image fusion, a key problem is to extract the salient objects from the infrared image and the visible image, and many fusion methods have been proposed in recent years.
As is well known, the most commonly used methods in image fusion are multi-scale transforms, such as the discrete wavelet transform (DWT)DWT2005 , contourlet transformcontourlet2010 , shift-invariant shearlet transformshearlet2014 and quaternion wavelet transformquaternion wavelet2013 , etc. Because the conventional transform methods lack sufficient detail-preservation ability, Lou et al.nonshearlet2017 proposed a fusion method based on contextual statistical similarity and the nonsubsampled shearlet transform, which can obtain the local structure information of the source images and achieves good fusion performance. For infrared and visible image fusion, Bavirisetti et al.twoscale2016 proposed a fusion method based on two-scale decomposition and saliency detection: they used a mean filter and a median filter to extract the base layers and detail layers, and used visual saliency to obtain weight maps; the fused image is then obtained by combining these three parts. Besides the above methods, Zhang et al.morphological2017 proposed a morphological-gradient-based fusion method. This method uses different morphological gradient operators to obtain the focus region, the defocus region and the focus-boundary region, respectively, and the fused image is obtained by an appropriate fusion strategy. In addition, Luo et al.Luo2017JVCIR proposed a novel image fusion method based on HOSVD and edge intensity, and a fuzzy-transform-based fusion method was proposed by Manchanda et al.Manchanda2018 . Both methods obtain good fusion performance in subjective and objective evaluation.
Recently, with the rise of compressed sensing, image fusion methods based on representation learning have attracted great attention. The most common representation learning method is sparse representation (SR). Zong et al.medicalclassified2017 proposed a novel medical image fusion method based on SR. They used Histogram of Oriented Gradients (HOG) features to classify the image patches and learned several sub-dictionaries. Then this method used the l1-norm and the choose-max strategy to reconstruct the fused image. In addition, there are many methods that combine SR with other tools, such as the pulse coupled neural network (PCNN)ivsr2014 , low-rank representation (LRR)lrr2017 and the shearlet transformivsidt2016 . In the sparse domain, joint sparse representationjsrsd2017 and cosparse representationcao2017 were also applied to the image fusion field.
Although the SR-based fusion methods obtain good fusion performance, they still have drawbacks, such as a limited ability to capture global structure. To address this problem, we introduce a new representation learning technique, latent low-rank representation (LatLRR)latentLrr2011 , into the infrared and visible image fusion task. Unlike low-rank representation (LRR)Lrr2010 , LatLRR can extract both the global structure information and the local structure information from the source images.
Liu et al.csr2016 proposed a fusion method based on convolutional sparse representation (CSR). CSR is different from deep learning methods, but the features it extracts are still deep features. In addition, Liu et al.cnn2017 also proposed a convolutional neural network (CNN)-based fusion method. Image patches containing different blur versions of the input image are used to train the network and obtain a decision map, and the fused image is obtained from the decision map and the source images. However, these deep-learning-based methods still have a drawback: the network is difficult to train when the training data are insufficient, especially in the infrared and visible image fusion task.
In this paper, we propose a novel fusion method based on LatLRR for infrared and visible image fusion. The source images are decomposed into low-rank parts (global structure) and saliency parts (local structure) by LatLRR. Then we use different fusion strategies to fuse the low-rank parts and the saliency parts. Finally, the fused low-rank part and the fused saliency part are used to reconstruct the fused image. The experimental results demonstrate that our proposed method achieves better fusion performance than other fusion methods.
This paper is structured as follows. In Section 2, we give a brief introduction to the LatLRR theory. In Section 3, the proposed LatLRR-based image fusion method is introduced in detail. The experimental results are shown in Section 4. Finally, Section 5 draws the conclusions.
In 2010, Liu et al.Lrr2010 proposed the LRR theory, but this representation method can not preserve the local structure information. So in 2011, the authors proposed the LatLRR theorylatentLrr2011 , by which both the global structure and the local structure can be extracted from raw data.
In referencelatentLrr2011 , the LatLRR problem is reduced to solving the following optimization problem,

min_{Z,L,E} ||Z||_* + ||L||_* + λ||E||_1,  s.t.  X = XZ + LX + E    (1)

where λ > 0 is the balance coefficient, ||·||_* denotes the nuclear norm, which is the sum of the singular values of a matrix, and ||·||_1 is the l1-norm. X denotes the observed data matrix, Z is the low-rank coefficient matrix, L is the saliency coefficient (projection) matrix, and E is the sparse noisy part. Eq.(1) is solved by the inexact Augmented Lagrangian Multiplier (ALM) method. The low-rank part XZ and the saliency part LX are then obtained from Eq.(1).
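The core operation inside the inexact ALM solver is singular value thresholding, the proximal operator of the nuclear norm terms in Eq.(1). A minimal sketch of just this operator (not the full solver) in Python/NumPy:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: solves
    argmin_X  tau * ||X||_*  +  0.5 * ||X - M||_F^2
    by soft-thresholding the singular values of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Shrink each singular value toward zero by tau, clipping at zero.
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```

Each ALM iteration applies this operator to update Z and L, and an elementwise soft-threshold to update the sparse term E.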
In this section, the proposed fusion method is presented in detail. The fusion processing of the low-rank parts and the saliency parts is introduced in the next subsections.
The LatLRR decomposition operation is shown in Fig.1. The low-rank part and the saliency part of the input image X are obtained by LatLRR.
As shown in Fig.1, X is the input image, and Xlrr and Xs denote the low-rank part and the saliency part obtained by LatLRR. When we get the low-rank parts and saliency parts of each source image, we use a fusion strategy to fuse these two parts, respectively. The framework of the proposed fusion method is shown in Fig.2.
The two source images are denoted as I_1 (infrared) and I_2 (visible). Firstly, the low-rank part Ilrr_k and the saliency part Is_k of each source image are obtained by LatLRR, where k ∈ {1, 2}. Then the low-rank parts and the saliency parts are fused by the weighted-average strategy and the sum strategy, respectively. Finally, the fused image F is reconstructed by adding the fused low-rank part Flrr and the fused saliency part Fs.
The low-rank parts of the source images contain more global structure and brightness information. So, in our fusion method, we use the weighted-average strategy to obtain the fused low-rank part. The fused low-rank part Flrr is calculated by Eq.(2),

Flrr(i, j) = w_1 × Ilrr_1(i, j) + w_2 × Ilrr_2(i, j)    (2)

where (i, j) denotes the corresponding position of a coefficient in Ilrr_1, Ilrr_2 and Flrr. And w_1 and w_2 represent the weight values for the coefficients of Ilrr_1 and Ilrr_2, respectively. To preserve the global structure and brightness information, and to reduce the redundant information, in this paper we choose w_1 = 0.5 and w_2 = 0.5.
The saliency parts contain the local structure information and saliency features, as shown in Fig.3.
As we can see from the example in Fig.3, the saliency part Is_1 contains more saliency features from I_1, and the same holds for Is_2 and I_2. The saliency features from the source images are complementary information and need to be contained in the fused image without loss. So in Eq.(3), we simply use the sum strategy to fuse the saliency parts,

Fs(i, j) = k_1 × Is_1(i, j) + k_2 × Is_2(i, j)    (3)

where (i, j) denotes the corresponding position of a coefficient in Is_1, Is_2 and Fs. And k_1 and k_2 represent the weight values for the coefficients of Is_1 and Is_2, respectively. To preserve more local structure and saliency features in the fused image, in this paper we choose k_1 = 1 and k_2 = 1 in Eq.(3). In the next section, we will explain why we use the sum strategy and choose k_1 = k_2 = 1.
In this section, we explain why we simply use the sum strategy to fuse the saliency parts. We choose the coefficients of the two saliency parts along the same row, as shown in Fig.4(a) and Fig.4(b). Fig.4(c) plots the saliency-part values along that row.
As shown in Fig.4(a) and Fig.4(b), the white line denotes the row plotted in Fig.4(c). In Fig.4(c), the blue line and the orange line indicate the coefficients of the infrared saliency part and the visible saliency part, respectively. In the first and second red boxes, the infrared saliency part contains more saliency features, which means its coefficients are greater than those of the visible saliency part at the corresponding positions, where the values of the visible saliency part are close to 0, as shown in Fig.4(c). In the third red box, the visible saliency part contains more saliency features, which means its coefficients are greater than those of the infrared saliency part at the corresponding positions, where the values of the infrared saliency part are very small.
The fused saliency part and its coefficients are shown in Fig.5. Fig.5(a) is the fused saliency part obtained by the sum strategy, and the red line in Fig.5(b) indicates the coefficients of the fused saliency part along the same row.
As discussed in Section 3.2, the saliency features from the source images need to be contained in the fused image without loss. If we chose the weighted-average strategy, the saliency features would be attenuated. If we choose the sum strategy to fuse the saliency parts, the saliency features are contained in the fused saliency part without loss. Furthermore, complementary features reinforce each other under the sum strategy, as shown in Fig.5(b).
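A toy numerical illustration of this argument (the coefficient values below are invented to mimic the complementary rows of Fig.4; they are not taken from the paper):

```python
import numpy as np

# Complementary saliency coefficients along one row: where the infrared
# part responds strongly, the visible part is close to 0, and vice versa.
ir_row  = np.array([0.8, 0.7, 0.0, 0.1])
vis_row = np.array([0.0, 0.1, 0.9, 0.0])

fused_sum = ir_row + vis_row              # sum strategy, k1 = k2 = 1
fused_avg = 0.5 * (ir_row + vis_row)      # weighted-average alternative

# Because the parts are complementary, the sum keeps every saliency peak
# at (or above) its original strength, while averaging halves each peak.
```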
We summarize the proposed fusion method based on LatLRR as follows:
1) The source images I_k are decomposed by LatLRR to obtain the low-rank parts Ilrr_k and the saliency parts Is_k, where k ∈ {1, 2}.
2) The low-rank parts are fused by the weighted-average strategy (w_1 = w_2 = 0.5) and the saliency parts by the sum strategy (k_1 = k_2 = 1). Then the fused low-rank part Flrr and the fused saliency part Fs are obtained.
3) Finally, the fused image F is obtained by Eq.(4),

F(i, j) = Flrr(i, j) + Fs(i, j)    (4)
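Assuming the LatLRR projection matrix of each source image has already been learned by the ALM solver (here passed in as `proj1`, `proj2`, hypothetical inputs standing in for the solver in the released code), the three steps above can be sketched as:

```python
import numpy as np

def latlrr_decompose(X, proj):
    """Split image X into its saliency part proj @ X and the remaining
    low-rank part. The sparse noise part E is assumed negligible and
    folded into the low-rank part in this sketch."""
    saliency = proj @ X
    lowrank = X - saliency
    return lowrank, saliency

def fuse(i1, i2, proj1, proj2):
    """Steps 1)-3): decompose both images, fuse each part, add them."""
    lrr1, s1 = latlrr_decompose(i1, proj1)
    lrr2, s2 = latlrr_decompose(i2, proj2)
    f_lrr = 0.5 * lrr1 + 0.5 * lrr2   # weighted average, Eq.(2)
    f_s = s1 + s2                     # sum strategy, Eq.(3)
    return f_lrr + f_s                # reconstruction, Eq.(4)
```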
In this section, first of all, we introduce the detailed experimental settings. Then the subjective and objective methods are adopted to assess the fusion performance. Finally, the experimental results are analyzed visually and quantitatively.
Firstly, the source infrared and visible images were collected from ivwls2017 and TNO2014 . Our test set contains 21 pairs of source imagesGithubHuiLi . Since there are too many image pairs to show all of them, we present several examples in Fig.6.
Then several novel and classical fusion methods are compared with the proposed method, including: cross bilateral filter fusion method(CBF)cbf2013 , discrete cosine harmonic wavelet transform fusion method(DCHWT)dchwt2012 , the joint-sparse representation model(JSR)jsr2013 , the JSR model with saliency detection fusion method(JSRSD)jsrsd2017 , the gradient transfer fusion method(GTF)gtf2016 , convolutional sparse representation(CSR)csr2016 and deep convolutional neural network-based method(CNN)cnn2017 .
In our experiments, the parameter λ in LatLRR is set to 0.8. The weight value for fusing the low-rank parts is 0.5, and the weight value for fusing the saliency parts is 1. The evaluation methods for fusion performance are introduced in the next sections.
All the experiments are implemented in MATLAB R2016a on a 3.2 GHz Intel(R) Core(TM) CPU with 12 GB RAM.
The fusion results obtained by the compared methods and the proposed method are shown in Fig.7 - 8. We choose two pairs of infrared and visible images (“stree” and “ca”) to assess the fusion performance by subjective evaluation.
As shown in Fig.7 and Fig.8, Fig.7(c-j) and Fig.8(c-j) are the fused images obtained by the compared fusion methods and the proposed method. As we can see from Fig.7, the fused image obtained by the proposed method preserves more detail of the windows and the chair in the red and green boxes, respectively. In Fig.8, the fused image obtained by the proposed method also contains more glass detail in the red box.
In summary, the fused images obtained by CBF and DCHWT contain more artifacts, and their saliency features are not clear. The fused images obtained by JSR, JSRSD, GTF, CSR and CNN contain many ringing artifacts around the saliency features, and their detail information is also not clear. In contrast, the fused images obtained by the proposed fusion method contain more saliency features and preserve more detail information. Compared with the above fusion methods, the fused images obtained by the proposed method look more natural to human observers, and the proposed method achieves better subjective fusion performance.
For a quantitative comparison between the proposed method and the other fusion methods, four quality metrics are utilized. These are: QabfQabf ; Nabfdchwt2012 , which denotes the rate of noise or artifacts added to the fused image by the fusion process; the sum of the correlations of differences (SCD)scd2015 ; and a modified structural similarity (SSIMa).
In our experiment, the SSIMa is calculated by Eq.(5),

SSIMa(F) = (SSIM(F, I_1) + SSIM(F, I_2)) × 0.5    (5)

where SSIM(·) represents the structural similarity operation, F is the fused image, and I_1, I_2 are the source images. The value of SSIMa denotes the ability of structural preservation.
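A minimal sketch of this metric, using a single-window (global) SSIM rather than the usual sliding-window implementation; the constants c1 and c2 are the standard stabilizers for intensities scaled to [0, 1]:

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Single-window SSIM over the whole image (a simplification of
    the windowed SSIM commonly used in practice)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def ssim_fused(f, i1, i2):
    """Modified structural similarity of Eq.(5): average of the SSIM
    between the fused image and each source image."""
    return 0.5 * (ssim_global(f, i1) + ssim_global(f, i2))
```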
The fusion performance of a fused image is better with larger values of Qabf, SCD and SSIMa. On the contrary, the fusion performance is better when the value of Nabf, which measures the noise or artifacts introduced by fusion, is smaller.
The average values of Qabf, Nabf, SCD and SSIMa over the 21 pairs of source images are shown in Table 1. More experimental data are provided in our supplementary material file.
In Table 1, the best values of the quality metrics are indicated in bold. As we can see, the proposed method achieves the best values of Nabf, SCD and SSIMa.
These values indicate that the fused images obtained by the proposed method are more natural and contain less artificial information. From the objective evaluation, our fusion method has better fusion performance than the compared methods.
In this paper, we proposed a simple and effective infrared and visible image fusion method based on latent low-rank representation. Firstly, the source images are decomposed into low-rank parts and saliency parts, which contain the global structure and the local structure information, respectively. Then the fused low-rank part is obtained by the weighted-average strategy and the fused saliency part by the sum strategy, with different weight values for the two parts. Finally, the fused image is reconstructed by adding the fused low-rank part and the fused saliency part. We use both subjective and objective methods to evaluate the proposed method, and the experimental results show that it exhibits better fusion performance than the compared methods.
Liu G, Yan S. Latent Low-Rank Representation for subspace segmentation and feature extraction[C]// International Conference on Computer Vision. IEEE Computer Society, 2011:1615-1622.
Liu G, Lin Z, Yu Y. Robust Subspace Segmentation by Low-Rank Representation[C]// International Conference on Machine Learning. DBLP, 2010:663-670.