Abstract

An algorithm based on a pulse-coupled neural network (PCNN) constructed in the Tetrolet transform domain is proposed for the fusion of visible and passive millimeter wave images in order to effectively identify concealed targets. The Tetrolet transform is applied to build the framework of the multiscale decomposition because of its high degree of sparsity. Meanwhile, a Laplacian pyramid is used to decompose the low-pass band of the Tetrolet transform to improve the approximation performance. In addition, a maximum criterion based on the regional average gradient is applied to fuse the top layers, while the maximum absolute values are selected for the other layers. Furthermore, an improved PCNN model is employed to enhance the contour features of the hidden targets, and the fusion results of the high-pass band are obtained from the firing times. Finally, the inverse Tetrolet transform is exploited to obtain the fused results. Objective evaluation indexes, such as information entropy, mutual information, and the edge preservation metric $Q^{AB/F}$, are adopted for evaluating the quality of the fused images. The experimental results show that the proposed algorithm is superior to other image fusion algorithms.

1. Introduction

Both active and passive modes are used to detect concealed objects. The active detection mode usually relies on the strong penetrability of special rays, and this irradiation can easily damage the tested material and harm human health. On the contrary, the passive detection mode plays an important role in the field of threat precaution because of its safety. It depends on the spectral radiation difference between the targets of interest and their surroundings to recognize concealed objects. For example, a metal gun hidden on the abdomen of a man is shown in Figure 1, which is labeled to explain a passive imaging scene. When the man passes through a passive millimeter wave (PMMW) system, the gun reflects the brightness temperature of the cold air in the millimeter waveband. Meanwhile, pixels with clearly different values are generated to describe the brightness-temperature information in the PMMW image, which leads to an obvious contrast between the gun and the human body. PMMW imaging produces interpretable imagery without irradiating the targets. It has the capability to penetrate low-visibility conditions and some obstacles such as textile materials [1]. Therefore, concealed objects under clothing can reasonably be identified by a PMMW imaging system. The target signature formed in PMMW images differs from the surroundings, which enables automated target detection [2].

The radiometer array captures the radiated energy restricted by the antenna aperture, so every pixel of a PMMW image actually reflects a weighted average of the regional radiation in the millimeter wave (MMW) band. Low-resolution images are usually obtained because of the diffraction limit and the low signal level. Meanwhile, the sensitivity of the sensor and the environmental radiation are the key factors affecting the feature expression of the concealed objects. As a result, the imaging quality is insufficient to support follow-up tasks such as target recognition and localization. There is an inevitable limitation in detecting concealed targets with any single type of sensor or method. Additionally, the visible image has good readability and rich scene details but cannot reveal the concealed objects. The infrared sensor obtains thermal radiation information of the hidden targets but is inferior to the PMMW sensor when detecting metal objects [2]. Multisource information fusion has maintained strong vitality because it obtains multidimensional information about the targets of interest under complex viewing conditions. It is effective in improving the validity and accuracy of recognizing concealed objects by providing a comprehensive description of a scene. The fused results integrate complementary and redundant information from the source images and describe the targets more completely than any single source image [3]. Song et al. [4] proposed a region-based algorithm built on the Expectation-Maximization (EM) and Normalized Cut (Ncut) algorithms: a region growing algorithm extracts the potential target regions, and a statistical model with a Gaussian mixture distortion produces the fused image. Xiong et al. [5] proposed an algorithm based on clustering and the nonsubsampled contourlet transform (NSCT), in which the fused image is obtained by taking the inverse NSCT of the fused coefficients. These fusion algorithms are meaningful for obtaining concealed information.

The fused images contain more comprehensive and accurate information than any single image and are widely exploited in military, medical, remote sensing, and machine vision applications [9]. In particular, multiscale transforms are usually applied to achieve a sparse representation of the source images, and the final images are obtained by fusion according to certain rules. Several types of multiscale transform exist, such as the discrete wavelet transform (DWT) [10], the Curvelet transform (CT) [11], the NSCT, and the Tetrolet transform (TT) [12, 13]. The DWT is suitable for dealing with point singularities but is limited in describing linear singularities, whereas the CT is suitable for approximating closed curves. The NSCT not only inherits the anisotropy of the CT but also adds multidirectionality and translation invariance. The TT performs a sparse decomposition of the source images owing to its excellent capability for multiscale geometric analysis. Krommweh first showed that the TT describes geometric structure better than the DWT, CT, and NSCT [13]. Shi et al. presented a hybrid method for image approximation using the TT and wavelets [14]; the core of the algorithm is a further sparse representation of the low-pass band in the TT domain. After that, some scholars began to explore introducing the TT into multisource image fusion. For example, Huang et al. proposed different rules for the low-pass and high-pass coefficients: local region gradient information determines the low-pass fusion coefficients, and a larger regional edge-information measurement factor selects the better high-pass coefficients [15]. Shen et al. proposed an improved TT-based algorithm for fusing infrared and visible images [16], in which an optimization algorithm named compressive sampling matching pursuit (CoSaMP) determines the fusion coefficients. However, PMMW images contain relatively little information owing to the detection principle, and the CoSaMP algorithm usually causes a certain loss of useful information. Yan et al. introduced a regional gradient into the fusion process in the TT domain [17]; the fused results are better than those of algorithms based on the wavelet transform and principal component analysis (PCA). However, the low-frequency coefficients of the TT still contain a small amount of detail such as edge and corner features; if these details are neglected, the fused results lose many of the targets' contours. Zhang et al. proposed a Laplacian pyramid for decomposing the low-frequency portion of the TT and showed that the Laplacian pyramid improves the capability of describing details [18]; the resulting algorithm performs well when fusing multichannel satellite cloud images. If the source images have similar structural characteristics, this method preserves image edges and curvature geometric structure well. However, because of the characteristic difference between visible and PMMW images, the contour features of the concealed objects can easily be submerged in the background.

The pulse-coupled neural network (PCNN), known as a third-generation neural network, was developed by Eckhorn et al. in 1990 and was founded on experimental observations of synchronous pulse bursts in the cat and monkey visual cortex [19]. Although the PCNN achieves excellent results, PCNN-based fusion methods are complex and inefficient when dealing with different source images. Wang et al. pointed out that the number of channels and parameters of the PCNN limits its application [20]. Many researchers have improved the original PCNN model to make it more appropriate for image fusion. For example, Deng and Ma proposed an improved PCNN model and set the initial parameters from the maximum gray value of the normalized source images [21]. Chenhui and Jianying [22] decomposed the visible and PMMW images in the multiband wavelet domain; the fusion rule for the low-frequency coefficients is based on local variance, and the high-frequency coefficients are fused with an adaptive PCNN. Xiong et al. [23] adopted the CT to obtain coefficients at different scales. The potential target regions of the PMMW image help determine the fusion coefficients, which takes advantage of the particular ability of the PMMW image to present metal targets. Thus the fusion coefficients are determined from the features of the PMMW image using a region growing algorithm, and the improved PCNN is applied at the fine scales, which enhances the integration of the important information of the visible and PMMW images. However, the result of region growing is restricted by three major factors, namely, the initial growth point, the growth criterion, and the terminating condition, which directly affect the success rate of the potential target extraction.

In this work, we adopt a generic framework for fusing the source images instead of extracting the target region of the PMMW image. Both the TT and the improved PCNN are applied to fuse the visible and PMMW images with different rules, and the PCNN is used to enhance the clarity and contrast of the hidden targets. The rest of this paper is organized as follows. The principles of the TT and the PCNN are illustrated in Section 2. The proposed fusion method is described in Section 3. The results and analysis of the experiments are presented in Section 4. Finally, Section 5 concludes the work.

2. The Theory of the TT and PCNN

2.1. The Theory of the TT

The TT possesses a small support domain and avoids the Gibbs phenomenon at image edges. The five basic tetromino shapes used by the TT are shown in Figure 2.

Suppose a source image is expressed as $a = (a[i,j])_{i,j=0}^{N-1}$, $N = 2^J$, $J \in \mathbb{N}$. The decomposition process of the TT is as follows:

(I) Primary Decomposition. The low-pass image $a^{r-1}$ is divided into $4 \times 4$ blocks $Q_{i,j}$, $i,j = 0, 1, \dots, N/4 - 1$.

(II) Tetrominoes Selection. For every admissible tetromino covering $c = 1, \dots, 117$ of a block, the low-pass coefficients are defined as
$$a^{(c)}[s] = \sum_{(i,j) \in I_s^{(c)}} \epsilon[0, L(i,j)]\, a^{r-1}[i,j], \quad s = 0, 1, 2, 3,$$
and the three high-pass coefficients for $l = 1, 2, 3$ are given by
$$w_l^{(c)}[s] = \sum_{(i,j) \in I_s^{(c)}} \epsilon[l, L(i,j)]\, a^{r-1}[i,j], \quad s = 0, 1, 2, 3,$$
where $I_s^{(c)}$ is the $s$th tetromino of covering $c$, $L$ is the bijection that maps the four pixels of $I_s^{(c)}$ onto $\{0, 1, 2, 3\}$, and $\epsilon$ is the $4 \times 4$ Haar matrix. Thus the covering is chosen as
$$c^* = \arg\min_{c} \sum_{l=1}^{3} \sum_{s=0}^{3} \bigl| w_l^{(c)}[s] \bigr|.$$
An optimal Tetrolet decomposition in the first phase is $\{a^{(c^*)}, w_1^{(c^*)}, w_2^{(c^*)}, w_3^{(c^*)}\}$ (a simplified code sketch of this selection step is given after Figure 3).

(III) Rearranging Coefficients. The low-pass coefficients of each block are rearranged into a $2 \times 2$ block so that a new low-pass image is formed. Then steps (I) and (II) are repeated on this image for further sparse representation.

(IV) Image Reconstruction. The image is reconstructed from the low-pass coefficients, the high-pass coefficients, and the corresponding coverings.

The flow chart of the TT is shown in Figure 3.
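To make step (II) concrete, the following Python sketch (a simplified illustration, not the authors' code) selects the better of two hand-coded candidate coverings of a single 4x4 block by minimizing the l1-norm of the high-pass coefficients; the real TT enumerates all 117 admissible tetromino coverings.

import numpy as np

# 4-point Haar analysis matrix (epsilon): row 0 produces the low-pass value of a
# tetromino, rows 1-3 produce its three high-pass (detail) values.
EPSILON = 0.5 * np.array([[1,  1,  1,  1],
                          [1,  1, -1, -1],
                          [1, -1,  1, -1],
                          [1, -1, -1,  1]], dtype=float)

# Two hand-coded candidate coverings of a 4x4 block. Each covering is a list of
# four tetrominoes, and each tetromino lists its four (row, col) positions in the
# order given by the bijection L. The real TT enumerates 117 admissible coverings.
SQUARE_COVERING = [[(0, 0), (0, 1), (1, 0), (1, 1)],
                   [(0, 2), (0, 3), (1, 2), (1, 3)],
                   [(2, 0), (2, 1), (3, 0), (3, 1)],
                   [(2, 2), (2, 3), (3, 2), (3, 3)]]
BAR_COVERING = [[(0, 0), (0, 1), (0, 2), (0, 3)],
                [(1, 0), (1, 1), (1, 2), (1, 3)],
                [(2, 0), (2, 1), (2, 2), (2, 3)],
                [(3, 0), (3, 1), (3, 2), (3, 3)]]
CANDIDATE_COVERINGS = [SQUARE_COVERING, BAR_COVERING]

def analyze_block(block, covering):
    """Return (low, high) coefficients of a 4x4 block for one covering."""
    low = np.empty(4)         # one low-pass value per tetromino
    high = np.empty((3, 4))   # three detail values per tetromino
    for s, tetromino in enumerate(covering):
        pixels = np.array([block[i, j] for (i, j) in tetromino], dtype=float)
        coeffs = EPSILON @ pixels
        low[s], high[:, s] = coeffs[0], coeffs[1:]
    return low, high

def best_covering(block):
    """Step (II): choose the covering minimizing the l1-norm of the details."""
    costs = [np.abs(analyze_block(block, c)[1]).sum() for c in CANDIDATE_COVERINGS]
    c_star = int(np.argmin(costs))
    return c_star, analyze_block(block, CANDIDATE_COVERINGS[c_star])

if __name__ == "__main__":
    block = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 block
    c_star, (low, high) = best_covering(block)
    print("selected covering:", c_star, "low-pass coefficients:", low)

Extending the sketch to all 117 coverings and repeating it block by block, together with the 2x2 rearrangement of step (III), would yield one level of the TT.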

2.2. The Theory of the PCNN

The neuron model of the PCNN is described as follows [20]:
$$F_{ij}(n) = e^{-\alpha_F} F_{ij}(n-1) + V_F \sum_{kl} M_{ij,kl} Y_{kl}(n-1) + S_{ij},$$
$$L_{ij}(n) = e^{-\alpha_L} L_{ij}(n-1) + V_L \sum_{kl} W_{ij,kl} Y_{kl}(n-1),$$
$$U_{ij}(n) = F_{ij}(n)\bigl(1 + \beta L_{ij}(n)\bigr),$$
$$Y_{ij}(n) = \begin{cases} 1, & U_{ij}(n) > \theta_{ij}(n), \\ 0, & \text{otherwise}, \end{cases}$$
$$\theta_{ij}(n) = e^{-\alpha_\theta} \theta_{ij}(n-1) + V_\theta Y_{ij}(n),$$
where $S_{ij}$ and $F_{ij}$ denote the external input stimulus and the feedback input of neuron $(i,j)$, respectively; $U_{ij}$ and $\theta_{ij}$ represent the internal activity of the neuron and the dynamic threshold, respectively; and $L_{ij}$ is the linking item. $Y_{ij}$ denotes the pulse output of neuron $(i,j)$. $M$ and $W$ denote the connection weights between the current neuron and the surrounding neurons; $\beta$ is the linking strength (linking coefficient); $\alpha_F$, $\alpha_L$, and $\alpha_\theta$ are the attenuation time constants; and $V_F$, $V_L$, and $V_\theta$ denote the inherent voltage potentials of $F$, $L$, and $\theta$, respectively.

The complexity limits the application of the PCNN in the field of image fusion. Most of the parameters are difficult to set because they change with the source images, and they are commonly adjusted through extensive experiments and experience. The PCNN relies on the synchronous pulse distribution phenomenon to give rise to pixel changes. Because the mathematically coupled characteristic of the PCNN tends to override its biological characteristics, an improved PCNN and a corresponding parameter-setting rule are used to weaken this coupling [21]. We adopt this optimized model for fusing the high-pass coefficients; in the improved model, a normalized parameter realizes the weak coupling connection. The spatial frequency (SF) is suitable for motivating the PCNN directly [6]: it reflects the gradient features of images in the transform domain and is therefore considered an effective external input stimulus of the PCNN. Let $H_{i,j}^{l,k}$ represent the coefficient located at $(i,j)$ in the $k$th subband at the $l$th decomposition level. The model parameters are then set from the spatial frequency of the high-pass domain and the maximum gray value of the normalized source images, and the remaining constants are fixed to empirical values.
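As a rough illustration (not the authors' implementation) of an SF-driven, weakly coupled PCNN, the following Python sketch computes the spatial frequency of a subband in a sliding window and runs a simplified PCNN whose feedback channel is tied directly to the normalized stimulus; the parameter values, the 3x3 linking kernel, and the fixed iteration count (used here in place of the cross-entropy stopping test described below) are assumptions.

import numpy as np
from scipy.ndimage import convolve   # 3x3 linking-field convolution

def spatial_frequency(band, win=3):
    """Local spatial frequency of a subband in a sliding win x win window."""
    band = np.asarray(band, dtype=float)
    rf = np.zeros_like(band)
    cf = np.zeros_like(band)
    rf[:, 1:] = (band[:, 1:] - band[:, :-1]) ** 2   # row-wise first difference
    cf[1:, :] = (band[1:, :] - band[:-1, :]) ** 2   # column-wise first difference
    kernel = np.ones((win, win)) / float(win * win)
    return np.sqrt(convolve(rf, kernel) + convolve(cf, kernel))

def pcnn_firing_times(stimulus, beta=0.2, alpha_theta=0.2, v_theta=20.0,
                      iterations=200):
    """Simplified, weakly coupled PCNN; returns the accumulated firing times.

    The feedback channel is tied directly to the normalized stimulus, the linking
    input is the 3x3 neighbourhood response of the previous pulses, and the
    threshold decays and is recharged by v_theta whenever a neuron fires.
    """
    s = np.asarray(stimulus, dtype=float)
    s = s / (s.max() + 1e-12)                        # normalized external input
    w = np.array([[0.707, 1.0, 0.707],
                  [1.0,   0.0, 1.0],
                  [0.707, 1.0, 0.707]])
    y = np.zeros_like(s)
    theta = np.ones_like(s)
    fire_count = np.zeros_like(s)
    for _ in range(iterations):
        link = convolve(y, w)                        # linking item L
        u = s * (1.0 + beta * link)                  # internal activity U
        y = (u > theta).astype(float)                # pulse output Y
        theta = np.exp(-alpha_theta) * theta + v_theta * y
        fire_count += y                              # accumulated firing times
    return fire_count

In the proposed scheme the SF map of each high-pass subband would serve as the stimulus, and the accumulated firing times returned here would drive the coefficient selection of Step 5 in Section 3.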

In addition, the iterative process of the PCNN is terminated as soon as the cross entropy becomes larger than that of the previous iteration.

3. The TT-PCNN

No transform achieves a complete approximation of all image details because of the inherent limitations of multiscale transforms, and these details usually contain important features of the targets. We therefore use the Laplacian pyramid to decompose the low-pass band of the source images in the TT domain. The remaining details of the concealed objects usually exist in the top layer of the Laplacian pyramid, so a rule based on the average gradient is applied to fuse and enhance the details of objects that are sensitive to human vision. In addition, the detailed features of the hidden targets in the PMMW images, which are degraded by factors such as the imaging mechanism and electronic noise, are important to subsequent recognition. We therefore take the high-pass coefficients of the PMMW image as the input of the PCNN in advance; the enhancement behavior of the PCNN strengthens the details of the hidden targets, which benefits subsequent target recognition. Additionally, we fuse the high-pass coefficients of the visible and PMMW images based on the SF. The TT-PCNN framework and fusion rules are shown in Figure 4.
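As a minimal sketch of the low-pass processing chain, the following Python code builds and collapses a Laplacian pyramid with simple Gaussian smoothing and dyadic resampling; the kernel width and the number of levels are illustrative assumptions, not the settings used in the experiments.

import numpy as np
from scipy.ndimage import gaussian_filter

def build_laplacian_pyramid(image, levels=3, sigma=1.0):
    """Decompose an image into `levels` band-pass layers plus a coarse top layer."""
    pyramid, current = [], np.asarray(image, dtype=float)
    for _ in range(levels):
        smoothed = gaussian_filter(current, sigma)
        down = smoothed[::2, ::2]                               # dyadic down-sampling
        up = np.kron(down, np.ones((2, 2)))[:current.shape[0], :current.shape[1]]
        pyramid.append(current - gaussian_filter(up, sigma))    # band-pass detail layer
        current = down
    pyramid.append(current)                                     # top (coarsest) layer
    return pyramid

def collapse_laplacian_pyramid(pyramid, sigma=1.0):
    """Inverse operation: rebuild the image from the pyramid layers."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        up = np.kron(current, np.ones((2, 2)))[:detail.shape[0], :detail.shape[1]]
        current = gaussian_filter(up, sigma) + detail
    return current

In Step 2 below, such a decomposition is applied to the low-pass subbands of the visible and PMMW images, and Step 3 uses the collapse operation on the fused layers.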

Step 1. Decompose the source images into low-pass and high-pass subbands via the TT; the resulting coefficients are denoted as the high-pass coefficients $H_A$ and $H_B$ and the low-pass coefficients $L_A$ and $L_B$ of the visible image $A$ and the PMMW image $B$, respectively.

Step 2. $L_A$ and $L_B$ are decomposed by the Laplacian pyramid into $LP_A$ and $LP_B$, respectively. A fusion rule based on the regional gradient is used for fusing the top layers of $LP_A$ and $LP_B$. Suppose that $T_A(i,j)$ is the value of $LP_A$ located at $(i,j)$ in the top layer and that the regional window $\Omega_{ij}$ centered at $(i,j)$ has size $M \times N$; $T_B(i,j)$ and the fused value $T_F(i,j)$ are defined analogously. The regional average gradient is expressed as
$$\bar{G}(i,j) = \frac{1}{M \times N} \sum_{(m,n) \in \Omega_{ij}} \sqrt{\frac{\Delta_x T(m,n)^2 + \Delta_y T(m,n)^2}{2}},$$
where $\Delta_x T$ and $\Delta_y T$ are the first-order differences of $T$ in the horizontal and vertical directions. So the fusion rule of the top layer is described as
$$T_F(i,j) = \begin{cases} T_A(i,j), & \bar{G}_A(i,j) \ge \bar{G}_B(i,j), \\ T_B(i,j), & \text{otherwise.} \end{cases}$$
In addition, the rule of choosing the largest absolute value is used to fuse the values of the other layers of $LP_A$ and $LP_B$ (a code sketch of these rules follows Step 6).

Step 3. Apply the inverse Laplacian pyramid to obtain the fused low-pass band $L_F$.

Step 4. Enhance the target area with the improved PCNN. The high-pass coefficients $H_B$ of the PMMW image are taken as the external stimulus of the PCNN, while the other parameters keep the same settings as in Section 2.2.

Step 5. Fuse the high-pass coefficients. The SF is computed as in Section 2.2 over sliding windows and used as the input of the improved PCNN. The fusion rule is designed as
$$H_F^{l,k}(i,j) = \begin{cases} H_A^{l,k}(i,j), & T_A^{l,k}(i,j) \ge T_B^{l,k}(i,j), \\ H_B^{l,k}(i,j), & \text{otherwise,} \end{cases}$$
where $T^{l,k}(i,j)$ denotes the firing time of each coefficient, which is accumulated as
$$T_{ij}(n) = T_{ij}(n-1) + Y_{ij}(n)$$
(see the sketch after Step 6).

Step 6. Use the selected coefficients to reconstruct the fused image via the inverse TT.
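Purely for illustration, the following Python sketch implements the selection rules of Steps 2 and 5 for coefficient arrays of equal size: the regional-average-gradient rule and the maximum-absolute-value rule for the Laplacian pyramid layers, and the firing-time rule for the high-pass coefficients. The firing-time maps are assumed to come from the improved PCNN of Section 2.2, and the 3x3 window size is an assumption.

import numpy as np
from scipy.ndimage import uniform_filter

def regional_average_gradient(layer, win=3):
    """Average gradient of each win x win region (activity measure of Step 2)."""
    layer = np.asarray(layer, dtype=float)
    gx = np.zeros_like(layer)
    gy = np.zeros_like(layer)
    gx[:, :-1] = layer[:, 1:] - layer[:, :-1]      # first difference, horizontal
    gy[:-1, :] = layer[1:, :] - layer[:-1, :]      # first difference, vertical
    grad = np.sqrt((gx ** 2 + gy ** 2) / 2.0)
    return uniform_filter(grad, size=win)          # regional mean of the gradient

def fuse_top_layer(top_a, top_b, win=3):
    """Step 2, top layer: keep the coefficient whose region has the larger average gradient."""
    mask = regional_average_gradient(top_a, win) >= regional_average_gradient(top_b, win)
    return np.where(mask, top_a, top_b)

def fuse_other_layers(layer_a, layer_b):
    """Step 2, other layers: maximum-absolute-value selection."""
    return np.where(np.abs(layer_a) >= np.abs(layer_b), layer_a, layer_b)

def fuse_high_pass(high_a, high_b, fire_a, fire_b):
    """Step 5: keep the high-pass coefficient whose neuron fired more often."""
    return np.where(fire_a >= fire_b, high_a, high_b)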

4. Experimental Results and Performance Analysis

4.1. Evaluation Criteria

The existing metrics are classified into three categories: statistics-based, information-based, and human-visual-system-based. Selecting metrics with small mutual correlation benefits the objectivity of the evaluation [24]. The statistics-based metrics are easily influenced by the pseudoedges of targets, so we evaluate the fusion performance with information-based and human-visual-system-based metrics. The information-based evaluation indexes mainly contain information entropy (IE) and mutual information (MI) [25]. Moreover, $Q^{AB/F}$ is a representative metric in the evaluation system based on human vision since it has a strong correlation with other human-visual-system-based metrics [26]. These formulas are as follows:

IE:
$$IE = -\sum_{g} p(g) \log_2 p(g).$$

MI:
$$MI = MI_{AF} + MI_{BF}, \quad MI_{XF} = \sum_{x,f} p_{XF}(x,f) \log_2 \frac{p_{XF}(x,f)}{p_X(x)\, p_F(f)}.$$

$Q^{AB/F}$:
$$Q^{AB/F} = \frac{\sum_{i,j} \bigl( Q^{AF}(i,j)\, w^{A}(i,j) + Q^{BF}(i,j)\, w^{B}(i,j) \bigr)}{\sum_{i,j} \bigl( w^{A}(i,j) + w^{B}(i,j) \bigr)},$$
where $p$ is the probability mass function of the gray levels of an image; $p_{XF}$, $p_X$, and $p_F$ are obtained by simple normalization of the joint and marginal histograms of the input image $X$ and the fused image $F$; $Q^{AF}$ and $Q^{BF}$ are the edge preservation values of the source images $A$ and $B$ with respect to $F$; and $w^{A}$ and $w^{B}$ reflect the perceptual importance of the corresponding edge elements. IE reflects the amount of average information in the fused image. MI reflects the detailed information obtained from the source images, whereas the $Q^{AB/F}$ metric measures the amount of edge information transferred from the source images into the fused result. In addition, a larger value of these metrics means a better fusion result.
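As a small sketch (assuming 8-bit gray-level images), IE and MI can be computed from gray-level histograms as follows; the $Q^{AB/F}$ metric additionally requires Sobel edge strength and orientation maps and is not reproduced here.

import numpy as np

def information_entropy(image, bins=256):
    """IE: Shannon entropy of the gray-level histogram of the fused image."""
    hist, _ = np.histogram(image, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(src, fused, bins=256):
    """MI between one source image and the fused image (joint-histogram estimate)."""
    joint, _, _ = np.histogram2d(np.ravel(src), np.ravel(fused),
                                 bins=bins, range=[[0, bins], [0, bins]])
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)          # marginal of the source image
    p_y = p_xy.sum(axis=0, keepdims=True)          # marginal of the fused image
    nz = p_xy > 0
    return np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz]))

def fusion_mutual_information(src_a, src_b, fused, bins=256):
    """MI metric for fusion assessment: MI(A, F) + MI(B, F)."""
    return mutual_information(src_a, fused, bins) + mutual_information(src_b, fused, bins)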

The source images, provided by ThermoTrex Corporation, are shown in Figure 5. Three soldiers carrying a gun and a grenade are displayed in Figure 5(a). Because of its limited penetrability, the visible image contains no information about the targets under clothing, but it contains rich environmental details about the imaging scene. In contrast, Figure 5(b) is the PMMW image. The bright part of the PMMW image reflects the location and shape information of the concealed objects. The outlines of the gun and the grenade are detected by the MMW radiation owing to its penetrability, while the contours of the three soldiers are heavily blurred and it is difficult to recognize the lawn in the PMMW image. In the following subsections, different transforms and fusion rules are used to obtain the fused results in order to demonstrate the effectiveness of the proposed algorithm.

4.2. The First Group of the Fused Results

The first group of fused results is obtained from the PMMW and visible images. Figure 6 illustrates the source images and the fusion results obtained by different transforms. The results achieved by the DWT, CT, NSCT, TT, and TT-PCNN are displayed in Figures 6(a)–6(e), and the fusion rule adopted for the reference transforms is the same as that described for the TT [15]. As can be seen from Figures 6(a)–6(e), the five methods successfully fuse the PMMW and visible images, and all the fused images contain both the concealed-object information and the background information. However, the fused result obtained by the DWT has many artifacts because of the lack of shift invariance, and the contour of the gun is slightly blurred by the pseudo-Gibbs phenomenon. The CT, NSCT, and TT achieve better performance than the DWT. The CT has a superior ability to depict edge details, so the concealed gun retains complete structural features for recognition; however, when the background characteristics of the source images differ significantly, the CT usually decreases the image contrast. Owing to the shift invariance of the NSCT, the pseudo-Gibbs phenomenon is eliminated successfully, but, limited by the fusion rules, the concealed targets have low contrast, which seriously affects risk identification. Since the TT has a superior capacity to describe smooth regions and local details, its fused result is better than those of the above methods. The proposed method provides the best visual effect: almost all the useful information about the concealed objects is transferred to the fused image, and fewer artifacts are introduced during the fusion process. Table 1 shows the evaluation results of the five methods. The IE of the fused images obtained by the DWT and the TT is larger than that of the TT-PCNN because of the introduction of invalid information. The MI obtained by the TT-PCNN is the maximum, which illustrates that the fused image extracts more information from the original images. Furthermore, the $Q^{AB/F}$ of the TT-PCNN is the maximum, which indicates that the proposed algorithm preserves the detailed information and extracts more edge information from the source images effectively. The objective evaluation agrees with the visual observation.

4.3. The Second Group of the Fused Results

The fusion results of the NSCT-PCNN, CT-PCNN, NSCT, and TT-PCNN are displayed in Figures 7(a)–7(d). As can be seen from Figures 7(a)–7(d), all of the methods successfully fuse the PMMW and visible images, and all the fused images still contain the concealed-target information. However, the fused image obtained by the CT-PCNN still has low contrast because of the background differences between the source images, which is a common problem of CT-based methods, while the NSCT-PCNN and NSCT achieve better performance than the CT-PCNN. The pseudo-Gibbs phenomenon is eliminated owing to the shift invariance of the NSCT. It is shown that the PCNN is conducive to enhancing the details of the targets of interest, so the PCNN is beneficial to the fusion of visible and PMMW images. Nevertheless, the concealed objects and the background have low contrast; in particular, the grenade is difficult to discriminate. The TT-PCNN provides better visual effects, and the detailed information of the gun and the grenade is preserved well. Table 2 shows the evaluation results of the four methods. The IE of the fused image achieved by the TT-PCNN is the second largest, which means that the fused result inherits a large amount of information from the source images. The MI and $Q^{AB/F}$ of the fused image obtained by the TT-PCNN gain the largest values, which demonstrates that the proposed algorithm extracts abundant image information from the source images and achieves high contrast.

4.4. The Third Group of the Fused Results

As shown in Figure 8, the source images and fused results are displayed. Figures 8(a) and 8(b) are the visible image and the PMMW image, respectively. A single 94-GHz radiometer with a scanning 24-inch dish antenna is used to detect the MMW energy of concealed weapons [27]. As can be seen from Figures 8(c)–8(f), all of the methods successfully synthesize the target information and the background information, but the contrast of the fused image based on the CT-PCNN is relatively low. The NSCT and NSCT-PCNN methods improve the fusion effect and achieve high contrast; however, these two methods still enhance useless information such as the radiated information of the dress zipper. The TT-PCNN synthesizes the PMMW and visible images, highlights the information of the concealed weapons, and suppresses the invalid information. The objective evaluation of the fused results is listed in Table 3. The TT-PCNN obtains the largest metric values compared with the other algorithms, which proves that the fused result of the proposed method contains abundant target information and preserves the object features well.

5. Conclusion

In this paper, an improved PCNN for the fusion of PMMW and visible images is proposed in the Tetrolet domain. The improved PCNN model is simpler and more adaptive, with fewer parameters. We first adopt the improved PCNN to strengthen the high-pass coefficients of the PMMW image in order to enhance the contours of the concealed targets. A Laplacian pyramid is then introduced for the decomposition of the low-pass band after the TT. Next, the SF is applied to motivate the improved PCNN neurons, so that the flexible multiresolution of the TT is combined with the global coupling and pulse synchronization characteristics of the PCNN. Finally, three groups of experiments are conducted to evaluate the fusion performance. The results show that the proposed algorithm has superior performance in fusing visible and PMMW images: the fused results have high contrast, remarkable target information, and rich background information. The proposed method is also suitable for fusing infrared and visible images and is superior to the other fusion algorithms in terms of visual quality and quantitative evaluation.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

This work is supported by the Postdoctoral Fund of Jiangsu Province under Grant no. 1302027C, the Natural Science Foundation of Jiangsu Province under Grant no. 15KJB510008, and the State Key Laboratory of Millimeter Waves under Project no. K201714. The support of the Image Processing Lab of Jiangsu University of Science and Technology is acknowledged. Thanks are due to Dr. Qu for publishing the related program on the Internet and to Dr. Larry, Dr. Merit, and Dr. Philip, who collected a large number of PMMW images.