Abstract

For compressed sensing (CS) reconstruction of hyperspectral images (HSI), the 3D total variation (3DTV) is a powerful regularization term that encodes the local spatial-spectral smoothness prior. The term is computed by assuming a sparse structure on the gradient maps along the spatial and spectral directions. In some real scenes, however, the gradient maps of an HSI along different directions are not always sparse. In many cases, TV constraints imposed on the gradient maps themselves (i.e., differences of differences) are more effective. In this paper, instead of imposing sparsity on the gradient maps directly, CS reconstruction for HSI is formulated as an optimization problem with a novel regularization term named multi-TV (MTV), which combines the sparsity prior on the gradient maps with TV projections of the gradient maps along the other directions. We also develop a practical algorithm based on the alternating direction method of multipliers (ADMM) to solve the resulting optimization problem efficiently. The proposed MTV term can easily replace the conventional 3DTV term and be embedded into hyperspectral CS (HCS) reconstruction to improve its performance. Experimental results show that, compared with similar state-of-the-art methods, the proposed MTV term significantly improves reconstruction precision for HCS.

1. Introduction

Hyperspectral images (HSI) differ from conventional color images, which contain only R, G, and B channels, in that they usually consist of tens to hundreds of grayscale band images. This abundance of spatial and spectral information makes HSI an important technique in many fields, such as precision agriculture [1], mineral detection and exploration [2], environmental monitoring [3], military reconnaissance [4], and others [5]. However, the huge data volume caused by increasing resolution places tremendous pressure on the onboard imaging system in terms of power consumption, memory, and transmission.

Compressed sensing (CS) [6] is a new imaging technique that effectively reduces power consumption during hyperspectral data acquisition. The key challenge in applying CS is how to reconstruct the original image from a small amount of observation data. CS reconstruction is therefore a typical HSI restoration task, like superresolution [7], denoising [8, 9], and fusion [10]. Metzler et al. [11] pointed out that a denoising algorithm seeks to remove additive white Gaussian noise, while a CS reconstruction algorithm seeks to recover a structured signal acquired using a small number of randomized measurements.

In the restoration of a grayscale image, one of the most popular priors is the spatial smoothness prior. The spatial local smoothness prior means that an image rarely has drastic edges and consists mostly of spatially smooth regions. Applying the spatial smoothness prior in CS reconstruction amounts to adding the well-known 2D total variation (2DTV) regularization term [12], which constrains the smoothness along the horizontal and vertical spatial directions [13].

3DTV, which is composed of 2DTV and a spectral smoothness constraint, is the most widely used regularization term in HSI restoration [14–20]. The spectral smoothness prior refers to the fact that adjacent band images, usually collected by an imaging spectrometer with the same parameter settings, have similar values. Therefore, 2DTV models the spatial sparse structure of HSI, whereas spectral differential minimization ensures the smoothness of the spectrum. Although it has been successfully utilized in various tasks, 3DTV regularization still does not sufficiently exploit the insightful sparsity structure of HSI. Many recent methods achieve high-precision reconstruction by combining TV-based terms with other regularization terms [14, 18, 21]; other methods intend to maximize performance by improving the form of the constraint terms [15–17, 19, 20, 22, 23]. In general, approaches based on multiple regularization terms perform better than approaches using the TV regularization term alone. However, they have to handle optimization problems with multiple regularization terms and carefully control the hyperparameters balancing the contributions of these terms. In this paper, we focus on reconstruction methods with a single TV regularization term for hyperspectral compressed sensing (HCS).

Hyperspectral TV (HTV) [24] imposes TV constraints on the image of each channel and can be regarded as a generalization of the standard color TV [25]. To exploit spatial-spectral correlation, a weighted version of HTV, spectral-spatial adaptive hyperspectral TV (SSAHTV) [24], was proposed for hyperspectral denoising. Anisotropic spectral-spatial total variation (ASSTV) [15] and joint tensor/reweighted 3DTV (JTenRe3DTV) [17] are both weighted forms of 3DTV. ASSTV treats the HSI as a spatial-spectral volume, and three weights adjust the smoothness constraints in the vertical, horizontal, and spectral directions. JTenRe3DTV obtains comparable results by employing a tensor Tucker decomposition to describe the global spatial-spectral correlation among all HSI bands, while the local smooth structures in both spatial and spectral modes are enhanced by an updatable weight matrix. Spatial-domain spectral residual total variation (SSRTV) [26] considers the residual texture information in the spectral variation image: it calculates the difference between the pixel values of adjacent bands and then applies 2DTV to the residual image. Denoising-compressed sensing by regularization (DCSR) [27] combines total variation regularization with a nonlocal self-similarity constraint.

The above methods explore the sparse structure of the difference images (gradient maps) along the horizontal, vertical, and band directions but do not take into account smoothness priors on the gradient maps themselves. In many cases, a smoothness constraint on a gradient map is even more effective than the sparsity of the gradient map itself. Aggarwal and Majumdar introduced a hyperspectral denoising algorithm based on spatial-spectral total variation (SSTV) [16], which calculates spatial differences on the gradient map taken along the spectral direction. SSTV can restore a desirable HSI without any weights and achieves higher denoising performance than HTV. Instead of imposing sparsity on the gradient maps themselves, enhanced 3DTV (E-3DTV) [19] proposed a regularization term that exploits the low-rankness of the gradient maps along different directions. However, E-3DTV ignores the spatial and spectral characteristics of HSI. Hybrid spatial-spectral total variation (HSSTV) [20], which is built on HTV and SSTV, combines spatial TV and spatial-spectral TV constraints through a weighting coefficient. Although HSSTV simultaneously handles both direct local spatial differences and local spatial-spectral differences of an HSI, it ignores the sparsity of the spectral difference image, and a single weight can hardly cope with the complexity of real scenes. In addition, the horizontal difference of the gradient map along the vertical direction may also be sparse.

Based on the above discussion, we propose a new multi-TV (MTV) collaborative regularization term for HCS reconstruction. In summary, the three main contributions are as follows:

(1) We propose a new operator to enhance the spatial smoothness prior structure of an HSI. The proposed operator can simultaneously reflect both horizontal and vertical sparsity. The new operator is also applicable to low-level vision tasks on natural or hyperspectral images, such as denoising, deblurring, and superresolution.

(2) The multiple-TV collaborative regularization term, which includes spatial, spectral, and spatial-spectral TV, is proposed to better exploit hyperspectral prior knowledge. To allow the proposed MTV to be used as the single regularization term, we coordinate the multiple TV operators by weighting, which makes the model easy to optimize. We also suggest an empirical method for setting the multiple weights.

(3) To make the algorithm simple and fast to implement, we develop an efficient solver for the corresponding model based on the well-known alternating direction method of multipliers (ADMM) [28]. Closed-form updating equations are deduced for each step, so the algorithm can be implemented efficiently.

The remainder of this paper is organized as follows. Section 2 introduces the compressed sampling model of the HSI and reviews related TV-based models. Section 3 presents the proposed MTV term and the MTV-based reconstruction model. Section 4 details the optimization procedure based on ADMM. Section 5 presents experimental results and discusses various aspects of the proposed method. Finally, conclusions are given in Section 6.

2. Compressed Sampling Model and Related TV-Based Models

2.1. Hyperspectral Compressed Sampling Model

In this paper, let $\mathcal{X} \in \mathbb{R}^{M \times N \times B}$ represent an HSI, where $M$ and $N$ denote the spatial height and width, respectively, and $B$ is the number of bands in the HSI. For notational convenience, we treat $\mathcal{X}$ as a vector $x \in \mathbb{R}^{nB}$ by stacking its columns on top of one another, where $n = MN$ is the number of pixels in each band.

The goal of CS is to restore the original images from as little observation data as possible. Therefore, a random sampling matrix $\Phi \in \mathbb{R}^{m \times nB}$ is designed for collecting a small number of compressed measurements $y \in \mathbb{R}^{m}$ with global information, where $m \ll nB$ and $\rho = m/(nB)$ denotes the sampling rate (SR). The compressed sampling process is defined as follows:

$$ y = \Phi x. \quad (1) $$

The compressed operator can be instantiated as $\Phi = DPH$, where $D$ is a random downsampling operator, $P$ is a random permutation matrix, and $H$ is the Walsh-Hadamard transform [17, 19]. Such a sampling matrix satisfies the restricted isometry property (RIP) and has been successfully used in various compressed sensing problems [19, 29, 30]. Recovering $x$ from the observed compressed measurements $y$ is an ill-posed inverse problem [31], and therefore various kinds of prior information must be exploited.
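As a rough illustration of the sampling model, the sketch below applies the three factors of $\Phi = DPH$ to a toy vector with NumPy. The signal length, measurement count, normalization, and the use of an explicit Hadamard matrix are assumptions made only for readability; a practical implementation would apply a fast Walsh-Hadamard transform instead of forming $H$.

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(0)

n_total = 1024                                       # toy length of the vectorized HSI (power of 2)
m = 102                                              # number of measurements, SR ~ 0.1

x = rng.standard_normal(n_total)                     # vectorized HSI (toy stand-in)
H = hadamard(n_total) / np.sqrt(n_total)             # normalized Walsh-Hadamard transform
perm = rng.permutation(n_total)                      # random permutation P
rows = rng.choice(n_total, size=m, replace=False)    # random downsampling D

y = (H @ x)[perm][rows]                              # y = D P H x
print(y.shape)                                       # (102,)
```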

2.2. Related TV-Based Models

There are various constraint priors for hyperspectral image restoration, such as sparsity, TV, and low-rankness. The most widely used in CS reconstruction are the sparsity and smoothness priors (i.e., TV regularization). In this paper, we focus on the development of smoothness prior information, that is, multidirectional TV regularization. The reconstruction problem with TV regularization can be described in the following form:

$$ \min_{x}\; \frac{1}{2}\left\| y - \Phi x \right\|_2^2 + \lambda\,\mathrm{TV}(x), \quad (2) $$

where the first term summarizes the data fidelity, $\mathrm{TV}(x)$ is a regularization term based on TV, and $\lambda$ is a tradeoff parameter between the fidelity and TV terms.

In previous work, HTV [24] modeled the band-by-band hyperspectral TV regularization term as follows:

$$ \mathrm{HTV}(\mathcal{X}) = \sum_{b=1}^{B}\left( \left\| D_h X_b \right\|_1 + \left\| D_v X_b \right\|_1 \right), \quad (3) $$

where $D_h$ and $D_v$ denote the difference operators along the horizontal and vertical directions, respectively, and $X_b$ stands for the $b$th band of the hyperspectral image. Note that only anisotropic TV is discussed in this paper, without regard to the isotropic case; therefore, the $\ell_1$ norm is used in (3). The neglect of spectral constraints makes it difficult for HTV to achieve high reconstruction performance.

ASSTV [15] adds a gradient along the spectral direction and assigns different weights to the gradients in different directions:

$$ \mathrm{ASSTV}(x) = \tau_h \left\| D_h x \right\|_1 + \tau_v \left\| D_v x \right\|_1 + \tau_s \left\| D_s x \right\|_1, \quad (4) $$

where, with a slight abuse of notation, $D_h$, $D_v$, and $D_s$ act on the vectorized HSI $x$, and $D_s$ defines the difference operator along the spectral direction. To achieve excellent performance, ASSTV must adjust the regularization parameters $\tau_h$, $\tau_v$, and $\tau_s$ carefully.

JTenRe3DTV [17] then extends TV regularization to tensor space:

$$ \mathrm{Re3DTV}(\mathcal{X}) = \sum_{d \in \{h, v, s\}} \left\| \mathcal{W}_d \odot \left( \nabla_d \mathcal{X} \right) \right\|_1, \quad (5) $$

where $\mathcal{W}_d$ is an updatable weight for the different directions, $\nabla_d$ denotes the difference along direction $d$, and $\odot$ is the elementwise product.

These methods investigate the TV form of HSI only from the data itself. In recent years, some works have focused on the properties of the TV transform domain of HSI. SSTV [16] explores the spectral smoothness prior on the spatial-difference gradient maps, or equivalently the spatial smoothness prior on the spectral-difference gradient map:

$$ \mathrm{SSTV}(x) = \left\| D_s D_h x \right\|_1 + \left\| D_s D_v x \right\|_1. \quad (6) $$

Since HSI has low-rank characteristics and the TV transform is a linear operation, E-3DTV [19] considers the gradient maps to be low-rank as well and proposes a reconstruction method based on low-rank matrix factorization.

HSSTV [20] builds a new TV form by integrating HTV and SSTV in a weighted way:

$$ \mathrm{HSSTV}(x) = \left\| D_s D_h x \right\|_1 + \left\| D_s D_v x \right\|_1 + \omega \left( \left\| D_h x \right\|_1 + \left\| D_v x \right\|_1 \right), \quad (7) $$

where $\omega$ is a weight that adjusts the relative importance of direct spatial piecewise smoothness to spatial-spectral piecewise smoothness. However, the omission of spectral smoothness in HSSTV hinders further improvement of reconstruction performance.

3. MTV Model

Based on the aforementioned analysis, hyperspectral images exhibit different degrees of smoothness along multiple directions, and a single-direction TV operator is not enough to describe the sparsity of the gradient domain. Figure 1 shows the sparsity of the single-direction gradient maps compared to a combined gradient map, together with their statistical comparison, on the Cuprite HSI (data described in Section 5). For convenience of display, we only show the original image of the 100th band and its gradient maps with a moderate zoom. Comparing Figures 1(b)–1(d), the spatial gradient maps still retain rich content information, while the spectral-direction gradient map is closer to random noise. This is because near the 100th band the spectrum changes more smoothly, so its gradient map is sparser. According to the statistical results in Figure 1(h), however, the spatial gradient map is sparser than the spectral one. In previous works, the description of spatial local smoothness depends only on horizontal and vertical TV. However, we find from Figure 1(h) that the vertical difference of the horizontal gradient map (defined as $D_v D_h x$) has stronger sparsity than the gradient map in a single direction ($D_h x$ or $D_v x$): the sorted discrete gradient coefficients of $D_v D_h x$ give a sparser representation than those of $D_h x$ or $D_v x$. Our MTV model is therefore proposed to describe the spatial local smoothness prior with $D_h x$, $D_v x$, and $D_v D_h x$ jointly. In addition, the statistical comparison also shows that such multilayer (composed) TV operators capture the smoothness of the image better than single-direction ones. TV operators with different directions and layers describe the smoothness of hyperspectral images from various perspectives; however, how to combine these operators so as to accurately describe the smooth sparsity of hyperspectral images is the key concern.
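To make the comparison behind Figure 1(h) concrete, the following NumPy sketch computes the three spatial gradient maps and their sorted coefficient profiles on a smooth synthetic cube. The cube is a hypothetical stand-in, not the Cuprite data, so the printed numbers only illustrate the computation, not the paper's statistics.

```python
import numpy as np

# Smooth synthetic stand-in for an HSI cube of shape (rows, cols, bands)
r, c, b = np.meshgrid(np.arange(64), np.arange(64), np.arange(32), indexing="ij")
cube = np.sin(r / 9.0) * np.cos(c / 7.0) + 0.01 * b

d_h  = np.diff(cube, axis=1)        # horizontal gradient map D_h x
d_v  = np.diff(cube, axis=0)        # vertical gradient map D_v x
d_vh = np.diff(d_h, axis=0)         # vertical difference of the horizontal gradient map D_v D_h x

def sorted_coefficients(g):
    """Absolute gradient coefficients sorted in descending order (cf. Figure 1(h))."""
    return np.sort(np.abs(g).ravel())[::-1]

for name, g in [("D_h", d_h), ("D_v", d_v), ("D_v D_h", d_vh)]:
    p = sorted_coefficients(g)
    print(f"{name}: mean |coeff| = {p.mean():.4f}, median = {np.median(p):.4f}")
```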

Considering that hyperspectral images have smooth characteristics along different directions, we unify the TV constraints in all directions and propose a new regularization model, MTV, for HSI reconstruction:

$$ \mathrm{MTV}(x) = \left\| A x \right\|_1, \qquad A = \begin{bmatrix} \tau_1 D_h^{T} & \tau_2 D_v^{T} & \tau_3 \left( D_v D_h \right)^{T} & \tau_4 D_s^{T} & \tau_5 \left( D_s D_h \right)^{T} & \tau_6 \left( D_s D_v \right)^{T} \end{bmatrix}^{T}, \quad (8) $$

where $\tau_1, \dots, \tau_6$ are weights that adjust the relative importance of the six difference operations and $(\cdot)^{T}$ denotes the transpose of a matrix or vector. In MTV, $\tau_1$, $\tau_2$, and $\tau_3$ correspond to local spatial differences, $\tau_4$ corresponds to the local spectral difference, and $\tau_5$ and $\tau_6$ correspond to local spatial-spectral differences. Although MTV can more appropriately describe the various kinds of local smoothness of hyperspectral images, the many weights in (8) are difficult to set in practical applications. In Section 5, we demonstrate an empirical approach to setting the weights; in fact, according to our setting rules, we only need to carefully select the parameter $\tau_4$.
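Under the reconstruction of (8) given above, the MTV value can be evaluated with simple finite differences. The sketch below uses forward differences via `np.diff` and hypothetical weights, and it ignores boundary handling, so it should be read as an illustration of the six terms rather than as the authors' implementation.

```python
import numpy as np

def mtv_terms(cube, taus):
    """Six weighted gradient maps of the (assumed) MTV term for an HSI cube
    of shape (rows, cols, bands): spatial, spectral, and spatial-spectral."""
    t1, t2, t3, t4, t5, t6 = taus
    d_h = np.diff(cube, axis=1)            # horizontal difference
    d_v = np.diff(cube, axis=0)            # vertical difference
    d_s = np.diff(cube, axis=2)            # spectral difference
    return [t1 * d_h,
            t2 * d_v,
            t3 * np.diff(d_h, axis=0),     # vertical difference of the horizontal gradient map
            t4 * d_s,
            t5 * np.diff(d_s, axis=1),     # horizontal difference of the spectral gradient map
            t6 * np.diff(d_s, axis=0)]     # vertical difference of the spectral gradient map

def mtv(cube, taus):
    """Anisotropic (l1) MTV value, i.e., the sum of absolute coefficients of all six maps."""
    return sum(np.abs(g).sum() for g in mtv_terms(cube, taus))

cube = np.random.default_rng(0).random((32, 32, 16))
print(mtv(cube, taus=[1.0, 1.0, 10.0, 0.1, 10.0, 10.0]))   # hypothetical weights tau_1..tau_6
```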

4. Optimization Computing Procedure

Recovering $x$ from $y$ using the proposed MTV regularization can be described as the following optimization problem:

$$ \min_{x}\; \frac{1}{2}\left\| y - \Phi x \right\|_2^2 + \lambda\,\mathrm{MTV}(x) = \min_{x}\; \frac{1}{2}\left\| y - \Phi x \right\|_2^2 + \lambda\left\| A x \right\|_1. \quad (9) $$

Optimization problem (9) is nonsmooth and difficult to solve directly. We employ ADMM to handle this unconstrained optimization problem. First, let us define the auxiliary variables $v = x$ and $z = Ax$; then optimization problem (9) can be converted into the following constrained problem:

$$ \min_{x, v, z}\; \frac{1}{2}\left\| y - \Phi v \right\|_2^2 + \lambda \left\| z \right\|_1 \quad \text{s.t.} \quad v = x, \;\; z = Ax. \quad (10) $$

The augmented Lagrangian function of problem (10) is

$$ L\left( x, v, z, \gamma_1, \gamma_2 \right) = \frac{1}{2}\left\| y - \Phi v \right\|_2^2 + \lambda \left\| z \right\|_1 + \gamma_1^{T}\left( v - x \right) + \gamma_2^{T}\left( z - Ax \right) + \frac{\beta}{2}\left\| v - x \right\|_2^2 + \frac{\beta}{2}\left\| z - Ax \right\|_2^2, \quad (11) $$

where $\gamma_1$ and $\gamma_2$ denote the Lagrange multipliers and $\beta$ is a positive penalty constant.

Each iteration of ADMM fixes all but one of the variables and minimizes over the remaining one. For the $x$-subproblem, given that we run the optimization over the variable $x$, the terms of the Lagrangian function (11) that do not contain this variable are not taken into account [32]. The reduced optimization function then becomes

$$ x_{k+1} = \arg\min_{x}\; \gamma_{1,k}^{T}\left( v_k - x \right) + \gamma_{2,k}^{T}\left( z_k - Ax \right) + \frac{\beta}{2}\left\| v_k - x \right\|_2^2 + \frac{\beta}{2}\left\| z_k - Ax \right\|_2^2, \quad (12) $$

where the subscript $k$ denotes the iteration. Optimization subproblem (12) can be treated as solving the following linear system:

$$ \left( I + A^{T} A \right) x_{k+1} = v_k + \frac{\gamma_{1,k}}{\beta} + A^{T}\!\left( z_k + \frac{\gamma_{2,k}}{\beta} \right), \quad (13) $$

where $A^{T}$ denotes the transpose of the matrix $A$.

This linear system can be solved by off-the-shelf techniques such as the preconditioned conjugate gradient method.
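For concreteness, the sketch below builds the six stacked difference operators of (8) as sparse matrices on a tiny toy cube and solves the linear system (13) with SciPy's conjugate gradient routine. The cube size, the weights, the periodic boundary handling (used only so that the composed operators stay square), and the right-hand side are all illustrative assumptions rather than the settings of the paper.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def cyclic_diff(n):
    """n x n forward-difference matrix with periodic boundary conditions."""
    return sp.csr_matrix(np.roll(np.eye(n), 1, axis=1) - np.eye(n))

M, N, B = 16, 16, 8                                   # tiny toy cube (rows, cols, bands)
I_M, I_N, I_B = (sp.identity(k) for k in (M, N, B))
d_M, d_N, d_B = cyclic_diff(M), cyclic_diff(N), cyclic_diff(B)

# Difference operators acting on x = cube.ravel(order="F")
D_v = sp.kron(I_B, sp.kron(I_N, d_M))                 # vertical
D_h = sp.kron(I_B, sp.kron(d_N, I_M))                 # horizontal
D_s = sp.kron(d_B, sp.kron(I_N, I_M))                 # spectral

taus = [1.0, 1.0, 10.0, 0.1, 10.0, 10.0]              # hypothetical tau_1..tau_6
blocks = [D_h, D_v, D_v @ D_h, D_s, D_s @ D_h, D_s @ D_v]
A = sp.vstack([t * blk for t, blk in zip(taus, blocks)]).tocsr()

# x-update of Eq. (13): (I + A^T A) x = rhs, solved by conjugate gradients
rng = np.random.default_rng(0)
rhs = rng.standard_normal(M * N * B)                  # stand-in for v + gamma_1/beta + A^T(z + gamma_2/beta)
x_new, info = cg(sp.identity(M * N * B) + A.T @ A, rhs)
print("CG converged:", info == 0)
```

In the actual algorithm, the right-hand side is assembled from $v_k$, $z_k$, and the multipliers exactly as in (13); the sampling operator does not appear in this step.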

To compute $v$, the optimization subproblem to be solved is

$$ v_{k+1} = \arg\min_{v}\; \frac{1}{2}\left\| y - \Phi v \right\|_2^2 + \gamma_{1,k}^{T}\left( v - x_{k+1} \right) + \frac{\beta}{2}\left\| v - x_{k+1} \right\|_2^2, \quad (14) $$

whose solution is

$$ v_{k+1} = \left( \Phi^{T}\Phi + \beta I \right)^{-1}\left( \Phi^{T} y + \beta x_{k+1} - \gamma_{1,k} \right), \quad (15) $$

where $I$ is an identity matrix and $\Phi^{T}$ indicates the adjoint of $\Phi$.

The optimization subproblem with respect to $z$ can be written as

$$ z_{k+1} = \arg\min_{z}\; \lambda \left\| z \right\|_1 + \gamma_{2,k}^{T}\left( z - A x_{k+1} \right) + \frac{\beta}{2}\left\| z - A x_{k+1} \right\|_2^2, \quad (16) $$

whose solution is the well-known soft threshold [33]:

$$ z_{k+1} = \mathcal{S}_{\lambda/\beta}\!\left( A x_{k+1} - \frac{\gamma_{2,k}}{\beta} \right), \qquad \mathcal{S}_{\tau}(t) = \operatorname{sign}(t)\max\left( |t| - \tau,\, 0 \right), \quad (17) $$

where $\mathcal{S}_{\tau}(\cdot)$ denotes the soft-threshold function applied elementwise.
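As a quick illustration, a NumPy version of this shrinkage step might look as follows; the variable names (`A`, `x_new`, `gamma2`, `beta`, `lam`) follow the reconstruction above and are assumptions rather than the authors' code.

```python
import numpy as np

def soft_threshold(t, tau):
    """Elementwise soft-thresholding S_tau(t) = sign(t) * max(|t| - tau, 0)."""
    return np.sign(t) * np.maximum(np.abs(t) - tau, 0.0)

# z-update of Eq. (17), assuming A, x_new, gamma2, beta, and lam are already defined:
# z_new = soft_threshold(A @ x_new - gamma2 / beta, lam / beta)
```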

According to the ADMM method, the multipliers associated with the constraints $v = x$ and $z = Ax$ are updated as follows:

$$ \gamma_{1,k+1} = \gamma_{1,k} + \beta\left( v_{k+1} - x_{k+1} \right), \qquad \gamma_{2,k+1} = \gamma_{2,k} + \beta\left( z_{k+1} - A x_{k+1} \right). \quad (18) $$

Summarizing the aforementioned descriptions, we arrive at the main algorithm for MTV-based reconstruction, as presented in Algorithm 1.

Inputs: compressed measurements $y$, compressed operator $\Phi$,
   weights $\tau_1, \dots, \tau_6$, parameters $\lambda$ and $\beta$, initial values $x_0$, $v_0$, $z_0$, $\gamma_{1,0}$, $\gamma_{2,0}$, $k = 0$
Output: reconstructed HSI $x$
1. While a stopping criterion is not satisfied do
2.   Update $x_{k+1}$ by solving the linear system (13)
3.   Update $v_{k+1}$ by the closed-form solution (15)
4.   Update $z_{k+1}$ by the soft thresholding (17)
5.   Update $\gamma_{1,k+1}$ by (18)
6.   Update $\gamma_{2,k+1}$ by (18)
7.   $k \leftarrow k + 1$
End

5. Results

In this section, we quantitatively and visually substantiate the effectiveness of the proposed MTV method for HCS on several HSI datasets. To thoroughly evaluate the performance of the proposed approach, we compare our method with five recent competing CS reconstruction approaches: ASSTV [15], SSTV [16], JTenRe3DTV [17], E-3DTV [19], and HSSTV [20]. All the experiments were performed with MATLAB (R2020b) on a workstation running Windows 10 with a dual-core Intel 4 GHz CPU and 32 GB RAM.

To sufficiently assess the performance of all compared approaches, three quantitative picture quality indices (PQIs) are employed: mean peak signal-to-noise ratio (MPSNR), mean structural similarity (MSSIM), and mean spectral angle mapper (MSAM). MPSNR and MSSIM are two conventional spatial-based metrics; they evaluate the similarity between the reconstructed and original images based on mean squared error and structural consistency, respectively. The larger the MPSNR and MSSIM values are, the better the reconstruction tends to be. The MPSNR (measured in dB) is defined as the average peak signal-to-noise ratio (PSNR) over all bands:

$$ \mathrm{MPSNR} = \frac{1}{B}\sum_{b=1}^{B} 10\log_{10}\frac{n\,\max\left( x_b \right)^2}{\left\| x_b - \hat{x}_b \right\|_2^2}, \quad (19) $$

where $x_b$ and $\hat{x}_b$ correspond to the original and reconstructed $b$th band image vectors and $\max(x_b)$ is the peak value of $x_b$. MSSIM is expressed as

$$ \mathrm{MSSIM} = \frac{1}{B}\sum_{b=1}^{B} \mathrm{SSIM}\left( x_b, \hat{x}_b \right), \quad (20) $$

where $\mathrm{SSIM}(x_b, \hat{x}_b)$ is defined as the structural similarity between $x_b$ and $\hat{x}_b$. The details of the SSIM function can be found in [34].

MSAM calculates the average angle between the spectral vectors of the target HSI and the reference HSI across all spatial pixels, and it is defined as follows:

$$ \mathrm{MSAM} = \frac{1}{n}\sum_{i=1}^{n} \arccos\frac{\left\langle s_i, \hat{s}_i \right\rangle}{\left\| s_i \right\|_2 \left\| \hat{s}_i \right\|_2}, \quad (21) $$

where $s_i$ and $\hat{s}_i$ are the $i$th spectral vectors of the original and reconstructed HSI, respectively. Different from the former two measures, the smaller the MSAM value is, the better the algorithm performance.
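A compact way to compute the three PQIs for cubes of shape (rows, cols, bands) is sketched below. It relies on scikit-image's PSNR/SSIM routines, whose `data_range` convention (peak-to-peak range of the reference) may differ slightly from the peak-value form in (19), so it should be treated as an approximation of the paper's exact metrics.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def mpsnr_mssim_msam(ref, rec):
    """Band-averaged PSNR/SSIM and pixel-averaged spectral angle (cf. Eqs. (19)-(21))."""
    bands = ref.shape[2]
    dr = ref.max() - ref.min()
    mpsnr = np.mean([peak_signal_noise_ratio(ref[..., b], rec[..., b], data_range=dr)
                     for b in range(bands)])
    mssim = np.mean([structural_similarity(ref[..., b], rec[..., b], data_range=dr)
                     for b in range(bands)])
    s, s_hat = ref.reshape(-1, bands), rec.reshape(-1, bands)
    cos = np.sum(s * s_hat, axis=1) / (np.linalg.norm(s, axis=1) *
                                       np.linalg.norm(s_hat, axis=1) + 1e-12)
    msam = np.mean(np.arccos(np.clip(cos, -1.0, 1.0)))
    return mpsnr, mssim, msam
```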

Four popular real-world HSI datasets are selected to substantiate the effectiveness of our approach, i.e., the Cuprite, Samson (http://www.escience.cn/people/feiyunZHU/Dataset_GT.html), IndianPines (http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes), and SalinasA datasets. As the original images of these datasets are too large, which is very expensive in terms of computational cost, we cropped a subcube from each dataset and removed the bands seriously polluted by water absorption and atmospheric effects. The RGB composites of these datasets are shown in the first column of Figure 2.

5.1. Parameter Selection

In our MTV model, there are 7 parameters: $\lambda$ and $\tau_1$ to $\tau_6$. According to the defining equation (8) of MTV and optimization problem (9), however, $\lambda$ can be viewed as a common scale factor for all the parameters $\tau_1$ to $\tau_6$. Therefore, we only need to analyze the sensitivity of $\tau_1$ to $\tau_6$ and keep $\lambda$ at a fixed value; that is, $\lambda$ can be merged into $\tau_1$ to $\tau_6$. In the following experiments, $\lambda$ is held fixed. Even so, there are still 6 parameters to be set. In this subsection, we focus on an empirical method for setting the parameters $\tau_1$ to $\tau_6$. With this empirical method, the selection of the 6 parameters reduces to the setting of the single parameter $\tau_4$.
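The absorption of $\lambda$ is immediate from the weighted-sum form of (8): writing $D_1, \dots, D_6$ generically for the six (possibly composed) difference operators,

$$ \lambda\,\mathrm{MTV}(x) = \lambda\sum_{i=1}^{6}\tau_i \left\| D_i x \right\|_1 = \sum_{i=1}^{6}\left( \lambda\tau_i \right)\left\| D_i x \right\|_1, $$

so any fixed $\lambda$ simply rescales $\tau_1, \dots, \tau_6$.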

Parameters $\tau_1$, $\tau_2$, and $\tau_3$ control the importance of local spatial smoothness; $\tau_4$ adjusts the local spectral smoothness, while $\tau_5$ and $\tau_6$ correspond to the local spatial-spectral smoothness. By observing the norms of the gradients along the different directions of hyperspectral images, we found that the gradient norm along the spectral direction is one order of magnitude higher than that along the spatial directions and two orders of magnitude higher than that along the spatial-spectral directions. Therefore, we simply tie the weights as $\tau_1 = \tau_2 = \tau_3 = k_1 \tau_4$ and $\tau_5 = \tau_6 = k_2 \tau_4$, where $k_1$ describes the ratio between the spatial weights and the spectral weight, and $k_2$ describes the ratio between the spatial-spectral weights and the spectral weight. Figure 3 shows the variation of MPSNR with $k_1$ and $k_2$. It can be seen from the figure that the weight relationship among the three kinds of directions mirrors the magnitudes of the gradient norms. The MPSNR reaches its maximum value when $k_1$ is set to 10 and $k_2$ is set to 100; therefore, in the subsequent experiments, we set $k_1 = 10$ and $k_2 = 100$. In this way, our MTV model only needs to set the parameter $\tau_4$ according to the different sampling rates.

Figure 4 shows the effect of the parameter $\tau_4$ on MPSNR for the Samson dataset when the sampling rate (SR) is 0.01, 0.05, 0.1, and 0.2. As can be seen from Figure 4, the optimal value of $\tau_4$ varies as the sampling rate changes. For example, at a sampling rate of 0.01, the reconstruction performance is optimal when $\tau_4$ is set to 0.04, while for the 0.1 sampling rate, the value of $\tau_4$ corresponding to the highest reconstruction accuracy is 0.02. Therefore, we fit a curve to the optimal values obtained from the results in Figure 4, which is shown in Figure 5. The 2nd-degree fitting curve matches the experimental results of Figure 4 more effectively. In the following experiments, $\tau_4$ is set according to the 2nd-degree fitting curve for the different sampling rates on the above four hyperspectral datasets.
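The fitting step itself is a standard least-squares polynomial fit; a NumPy sketch is given below. Only the optimal values 0.04 (at SR = 0.01) and 0.02 (at SR = 0.1) are stated in the text; the other two entries are placeholders added solely so that the quadratic fit runs, so the resulting coefficients are illustrative only.

```python
import numpy as np

sr       = np.array([0.01, 0.05, 0.10, 0.20])     # sampling rates tested in Figure 4
tau4_opt = np.array([0.04, 0.03, 0.02, 0.015])    # 0.03 and 0.015 are hypothetical placeholders

coeffs = np.polyfit(sr, tau4_opt, deg=2)          # 2nd-degree fitting curve (Figure 5)
tau4_at = np.poly1d(coeffs)

print("tau_4 at SR = 0.003:", tau4_at(0.003))     # extrapolated value used in Section 5.2
```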

5.2. Quantitative Comparison

In this subsection, we compare the reconstruction results of all the compared methods in terms of the above three PQIs. For a fair comparison, all methods adopt the same randomly permuted Hadamard transform as the compressed operator. For the competing methods' parameters, we used their default settings or tuned the values to guarantee their best possible performance. We compared the reconstruction performance at five sampling rates, i.e., 0.003, 0.01, 0.05, 0.1, and 0.2. Among them, the sampling rate of 0.003 is used to verify the validity of the weight fitting in the previous subsection.

Tables 1–4 show the aforementioned three indices of the HSI reconstructed by all the compared methods. For all five sampling rates and four hyperspectral datasets, the proposed method achieves significantly improved performance in terms of the three PQIs compared with the other competing methods. In particular, at the 0.01 sampling rate, MTV improves MPSNR by nearly 3 dB over the best of the other methods; for example, its MPSNR is 4.7 dB higher than that of HSSTV on the SalinasA dataset. Additionally, at the 0.003 sampling rate, $\tau_4$ is predicted by the fitted curve, yet the proposed method still achieves excellent performance. All these findings demonstrate the power of MTV modeling for HCS reconstruction.

5.3. Visual Comparison

In terms of visual quality, pseudocolour images composed of three representative bands of the four HSIs at a sampling rate of 0.01 are presented in Figure 2. From these figures and the MPSNR values, we can see that most of the reconstruction methods yield poor results because only 1% of the data is available. Although ASSTV obtains a high MPSNR, its reconstructed image clearly appears overfitted. MTV combines a variety of TV constraints and can still recover a clearer and sharper image. It is evident that the MTV method preserves edge details, texture information, and image fidelity better than the other competing methods.

To more intuitively observe the recovery of spectral information, the pixel at spatial position (10, 50) in each of the four datasets under the 0.01 sampling rate is selected. Figure 6 plots the original spectral curves and the spectral curves reconstructed by the various algorithms; some local areas in each subfigure are enlarged in the corner for better visualization. Overall, spectral information is better preserved on IndianPines and SalinasA than on Cuprite and Samson for all reconstruction methods. However, a comparison of the locally enlarged spectral curves shows that MTV can still closely approach the original spectral curves on the Cuprite and Samson datasets. On the SalinasA dataset, the spectral curve recovered by HSSTV is superior to that of the other competing methods, which is consistent with the MSAM index in Table 4.

5.4. Robustness Analysis

In this subsection, we discuss the robustness of the proposed MTV algorithm to noisy images. Additive white Gaussian noise with different standard deviations was added to the original hyperspectral images. Then, the sampling matrix was employed to compress the noisy images. Finally, the original data were reconstructed by the proposed MTV algorithm. Table 5 shows the reconstruction performance on the four datasets under different sampling rates.

As the noise standard deviation decreases, the MPSNR of the noisy image gradually increases, and the reconstruction accuracy at each sampling rate also increases with the decreasing standard deviation. It can be seen from the table that the proposed MTV reconstruction algorithm is still effective for the reconstruction of noisy images. This means that a small number of observations of a noisy image can still be reconstructed well, and the quality of the reconstructed image is better than that of the original noisy image, indicating that MTV performs a certain amount of denoising during CS reconstruction. Of course, the denoising performance of MTV cannot be compared with that of a dedicated denoising algorithm; after all, CS reconstruction works from only a very small amount of observation data.

In Table 5, we also observed an unexpected phenomenon: although the reconstructed MPSNR is higher than that of the original noisy image in most cases, the reconstruction accuracy at the high sampling rate (0.2) is not significantly higher than that at the low sampling rate (0.05) and even decreases. This is because the fitting of the optimal $\tau_4$ value was performed in the noiseless case, and different noise levels can affect the fitting. Therefore, in future work, we will focus on jointly fitting the parameters $\tau_1$ to $\tau_6$ to both the sampling rate and the noise level.

5.5. Computational Complexity Analysis

As for the computational complexity of the proposed MTV algorithm, the most time-consuming part is computing $x_{k+1}$ in step 2 of Algorithm 1, which requires solving the linear system (13) involving $A^{T}A$. Since $A$ stacks sparse difference operators, each product with $A$ or $A^{T}$ costs on the order of the number of voxels $nB$, and this cost, repeated within the iterative solver, dominates the overall complexity of the proposed MTV algorithm. Furthermore, we use the reconstruction time to evaluate the complexity of the compared algorithms, which is reported in Table 6. Note that these timings were obtained with MATLAB R2021a on a laptop with an Intel Core 2.6 GHz CPU and 32 GB RAM.

The six reconstruction algorithms employed in our experiments share a similar structural framework, so their computational complexities are similar. Only the E-3DTV algorithm has a slightly longer running time, and its increased computational cost is caused by the low-rank decomposition it applies to the gradient maps.

From the above results, the proposed MTV method achieves more satisfactory performance than the other state-of-the-art methods in both spatial information recovery and spectral curve reconstruction. All the experiments demonstrate that the proposed MTV method is an effective HCS reconstruction algorithm, and we expect the MTV regularizer to serve as a general tool for improving performance on general HSI restoration tasks.

6. Conclusions

In this paper, we have proposed a new constrained optimization approach to the HCS reconstruction task. Specifically, the proposed approach constructs a novel spatial smoothness constraint and models the spatial, spectral, and spatial-spectral correlations of an HSI using multiple collaborative prior constraints, named MTV. Extensive experiments demonstrate that such an approach significantly outperforms other state-of-the-art approaches both qualitatively and quantitatively.

Regarding the weight assignment of the TV regularization terms in each direction, we fitted the relationship between the spectral-direction weight and the sampling rate. In addition, we specified a fixed relationship between the weights of the other directions and the spectral direction, which is derived from observations of hyperspectral data. Of course, learning the weights from image features would be more conducive to improving reconstruction accuracy, but the disadvantage is that a large amount of hyperspectral data would be required and the computational complexity would increase significantly. This will be our next research interest.

In addition, it could be meaningful to incorporate other prior information, including low-rankness and nonlocal tensor sparsity, into MTV, which would likely further improve the reconstruction performance of HCS. This is also a focus of our future research.

Data Availability

The Cuprite and Samson data used to support the findings of this study are available from the corresponding author upon request. The IndianPines and SalinasA data used to support the findings of this study are available at http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This research was funded by the Key Projects of Natural Science Research of Universities of Anhui Province, grant numbers KJ2018A0483 and KJ2020A0703; the Overseas Visiting and Research Project for Excellent Young Key Talents in Higher Education Institutions in Anhui Province, grant number gxgwfx2019056; the Talent Research Start-up Fund Project of Tongling University (2022tlxyrc35); and the Natural Science Research Project of Tongling University (2022tlxy45). The authors would like to thank the anonymous reviewers for their insightful comments, which helped improve the quality of this paper.