Abstract

Statistical tools have changed significantly in the past decade. The maximum likelihood method, when applied directly, often provides an inaccurate solution because of its unsuitable properties in this setting and causes problems of fit. The Ordinary Least Squares (OLS) model is more reliable in specific situations where the estimation is based on the slope only; methods that depend on both the slope and the intercept are therefore recommended, and these can be described as thresholding rules. Another often preferred method is the posterior mean (PM) technique. This procedure depends on two parts, the likelihood and the prior distribution, where the prior plays a significant role in the estimation. In this article, the standard elastic-net distribution is assumed as the prior; it consists of two components, a normal distribution and a double exponential distribution. This model is used because wavelet tools have different levels of resolution, so it may provide a more accurate estimate of the wavelet coefficients, which might otherwise be estimated with a normal or double exponential distribution alone. Some properties of the elastic-net penalty have been introduced and discussed previously; further properties of this distribution are introduced here. In addition, two models based on the elastic-net method and involving a point-mass prior are demonstrated. The first model combines a normal likelihood with the elastic-net distribution as prior, while the other combines a double exponential likelihood with the elastic-net distribution as prior. Moreover, level-dependent components are estimated at each resolution level. A simulation study is carried out using the Markov chain Monte Carlo (MCMC) tool to estimate the underlying features, and real data are modelled using the proposed methods. A stationary wavelet basis is also applied. As a result, the proposed procedure reduces the noise level, which is helpful since noise often corrupts real data and is a significant cause of most numerical estimation problems.

1. Introduction

The traditional statistical tools start with the ordinary least squares method, which is used to estimate unknown parameters. However, this method provides inaccurate results because noise corrupts the real data. Techniques have improved to reduce this problem and improve reconstruction, and more complex statistical tools are now available. The reconstruction of unknown parameters becomes more accurate once the posterior distributions of ridge, lasso, and other penalized forms are implemented. One of these penalized forms is the elastic-net technique, which was introduced in [1]; subsequently, in 2017, Hassan and his supervisor Robert wrote about implementing the elastic-net penalty as a distribution. Before that, the prior parameters were fixed on the basis of prior knowledge or given information. Nowadays, the elastic-net distribution is available together with some of its properties. This distribution allows the prior parameters to be estimated with a flexible method such as MCMC. There are two reasons for using the penalized form as a probability distribution. The first is that the distributional form allows estimation of the distribution parameters rather than keeping them fixed. The second is that the wavelet coefficients are centered and peaked at zero [2]; this statement is true for some wavelet coefficients, but not all. Therefore, it is believed that wavelet coefficients need to be thresholded at some levels and shrunk at other levels. The SElastic-net has two parts: the first is normal, which allows shrinkage of the wavelet coefficients, and the second is the double exponential distribution, which allows them to be thresholded. In this article, two methods are used. The first has been used earlier by Hassan with his supervisor Robert [3, 4]; the other is newly proposed. The difference between them is that, in the first, the variance is estimated using the wavelet coefficients at their finest scale; see [5] for more details. Combining the normal likelihood with this prior gives a quadratic equation, meaning that there are two candidate solutions for the wavelet coefficient estimate. In the second method, where the variance is given an exponential prior, the marginal likelihood changes from a normal distribution to a double exponential distribution [6]. This procedure leads to a cubic equation, so the solution becomes more complicated; however, the solution for this combination of the double exponential likelihood and the standard elastic net can be obtained with a numerical method. Now, suppose the problem is given by the following equation, where the observation is general, the parameter is unknown, and the noise is assumed to be independently distributed. The relation between the observation and the parameter is linear up to a small, approximately normal perturbation; this describes the impact of the noise. Using the normal distribution alone as a likelihood function provides an inaccurate solution, since the maximum likelihood estimate equals the observation itself, showing no difference between the observation and the estimate. Moreover, the normal likelihood on its own provides a solution based on the slope only. A more flexible prior is therefore needed, such as the standard elastic net, which has two parts, a normal and a double exponential distribution, combined with the normal likelihood.
Figure 1 shows a plot of smooth data and the corresponding wavelet coefficients, where the magnitudes of the wavelet coefficients at the finest level are small and can be regarded as centered around zero; note that these can be thresholded. However, the wavelet coefficients at higher levels are large and appear to be away from zero, and these coefficients can be shrunk. Several authors have published articles on the Bayesian approach to the wavelet method, such as [7], who solved the model using a single Weibull prior based on two methods and compared the model with various mixture priors. Reference [8] used the wavelet method in nonparametric regression with a kernel function based on the normal distribution. The monographs [9, 10] can be consulted for extensive details. Vidakovic and Ruggeri used the double exponential distribution as the likelihood after placing an exponential prior on the noise variance. This article aims to generalise the Bayesian Adaptive Multiresolution Shrinker (BAMS) method of Vidakovic and Ruggeri, in which level-independent densities are applied. Wavelet bases were introduced by Daubechies [11] and are clearly explained in [12]. The nondecimated technique is studied in [13, 14]. Lawton studied complex-valued wavelet transforms for subband decomposition and provided an application [15]. Background on MCMC can be found in [16, 17]. This article is structured as follows: the standard elastic-net distribution is introduced in Section 2, and some of its properties are given in Section 3. The modelling is provided in Section 4, and Section 5 presents the Bayesian approaches and the corresponding technical arguments. Section 6 explains the complex-valued wavelet basis. A simulation study investigating the estimation properties is provided in Section 7, and Section 8 applies the proposed rules to real data. A final summary and conclusions are presented in Section 9.

2. Standard Elastic-Net Distribution

Definition 1. Let a random variable have a probability density function defined as follows:where the normalizing term is a function of the two scale parameters such that the density integrates to one, andwhere the two parameters are the scale parameters. This distribution is called the standard elastic-net distribution; it has mean 0 and, under the stated conditions on the parameters, its variance is given as follows:where the two functions appearing are the standard normal density and the standard normal cumulative distribution function, respectively.
Figure 2 shows plots of the elastic-net distribution for different parameter values. It can be seen that, in the limiting case, the shape of the distribution approaches a normal bell curve. This standard elastic net is denoted SElastic-net.

Remark 1. (1) Setting one of the scale parameters to zero in (1) yields the double exponential distribution with mean zero and the corresponding variance. (2) Setting the other scale parameter to zero in (1) yields the normal distribution with mean zero and the corresponding variance. Figure 3 shows the cumulative distribution function, which is S-shaped, increasing, and continuous. As the parameter increases, more wavelet coefficients are thresholded.
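To make these limiting cases concrete, the following short R sketch plots an unnormalized density of the assumed form exp(-lambda1 |x| - lambda2 x^2), which reduces to the double exponential kernel when lambda2 = 0 and to the normal kernel when lambda1 = 0. The normalizing constant of the SElastic-net density is omitted, so this is only an illustrative sketch of the shape, not the exact formula in (1).

# Illustrative sketch only: unnormalized elastic-net kernel,
# assumed proportional to exp(-lambda1 * |x| - lambda2 * x^2).
selastic_kernel <- function(x, lambda1, lambda2) {
  exp(-lambda1 * abs(x) - lambda2 * x^2)
}
x <- seq(-4, 4, length.out = 401)
plot(x, selastic_kernel(x, 1, 1), type = "l", ylab = "unnormalized density")
lines(x, selastic_kernel(x, 1, 0), lty = 2)  # double exponential kernel (lambda2 = 0)
lines(x, selastic_kernel(x, 0, 1), lty = 3)  # normal kernel (lambda1 = 0)
legend("topright", lty = 1:3, legend = c("elastic net", "double exponential", "normal"))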

3. Some Properties

The cumulative distribution is given as follows:

For then

The SElastic-net hazard rate function is given by the following equation:

For more details, see Appendix A. For then

Figure 4 shows plots of the hazard rate function for different parameter values; its shape changes with the parameters and reflects products that fail early. Hence, the SElastic-net distribution is a continuous distribution, symmetric about zero, and is nonzero over the entire real line. The moment generating function of the SElastic-net distribution is given by the following equation:where .

4. Modelling

Let the model be defined as follows:where the first term represents the observation, the second refers to the true signal, and the error is an additive Gaussian term with mean zero and fixed variance; the likelihood then takes the following form:

Hence, the relationship between the observation and the true signal is linear and is affected by white noise. More precisely, the observed data are known and recorded together with a level of noise, and changing this level can lead to different estimates. Let the prior of the wavelet coefficients be defined as follows:where

The main reason for choosing this prior is that the elastic-net distribution is flexible and reliable, since the wavelet coefficients require different degrees of shrinkage or thresholding. For more details, see [3].
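As an illustration of this model, the following R sketch (using the wavethresh package) generates a noisy version of a test signal, computes its wavelet coefficients, and estimates the noise standard deviation from the finest-level coefficients by the usual median absolute deviation rule. The test signal, noise level, and wavelet choice here are arbitrary illustrations rather than the settings used in the simulations.

library(wavethresh)

set.seed(1)
n     <- 128
truth <- DJ.EX(n = n)$heavi               # Heavisine test signal
truth <- truth / sd(truth)                # rescale to unit variance
sigma <- 0.5                              # illustrative noise level
y     <- truth + rnorm(n, 0, sigma)       # additive Gaussian error model

ywd       <- wd(y, filter.number = 5, family = "DaubLeAsymm")  # wavelet coefficients
finest    <- accessD(ywd, level = nlevelsWT(ywd) - 1)          # finest-level coefficients
sigma_hat <- mad(finest)                  # robust estimate of the noise standard deviation
sigma_hat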

5. Bayesian Approaches

5.1. Model I

Here the posterior distribution can be written as follows:where the denominator can be ignored because it does not contain any information about the parameter of interest. The predictive distribution of , when , is defined as follows:

The predictive of , when , is denoted as follows:

The distribution of the parameter when is

For more details, see Appendix B. Figure 5 shows the wavelet coefficient distribution for different parameter values and its flat shape in the limit. This demonstrates the behavior of the distribution when it is close to normal. As the parameter increases towards 1, the shape becomes flat owing to the variance of the wavelet coefficients, while the shape of the distribution becomes narrow as the parameter approaches zero, since the wavelet coefficients are centered around and peaked at zero. The expectation of , when , is given as follows:

For more details, see Appendix C. The Bayes estimator of , when , is given as follows:

Rules with a more desirable shape result from a prior that contains a point mass at zero; this prior is given by the following equation:

The marginal is

The corresponding Bayes rule is given by the following equation:

Figure 6 shows the rule in (22); the shape of the rule indicates that it is a shrinkage rule. It is seen that the rule shrinks the wavelet coefficients more sharply as the parameter increases. This behavior occurs because the normal component dominates. More precisely, the posterior distribution is based on a likelihood and a prior that both involve the normal distribution, so it produces a smooth shape. The same behavior can be seen in the ordinary least squares and ridge rules. Nevertheless, the numerator and denominator depend on the elastic net.
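To illustrate the general structure of such a point-mass rule, the following R sketch computes the posterior mean for a simplified stand-in model in which the slab part of the prior is taken to be normal rather than the elastic-net density, so that the marginals have a simple closed form. It is not the rule in (22); it only shows how the point mass at zero turns a smooth shrinker into a rule that behaves almost like thresholding near zero.

# Simplified illustration: y | beta ~ N(beta, sigma^2),
# beta ~ (1 - gamma) * (point mass at 0) + gamma * N(0, tau^2).
# The elastic-net slab is replaced by a normal slab for tractability.
pm_rule <- function(y, sigma = 1, tau = 2, gamma = 0.5) {
  m1 <- dnorm(y, 0, sqrt(sigma^2 + tau^2))      # marginal density under the slab
  m0 <- dnorm(y, 0, sigma)                      # marginal density under the point mass
  post_slab <- gamma * m1 / (gamma * m1 + (1 - gamma) * m0)
  post_slab * (tau^2 / (sigma^2 + tau^2)) * y   # posterior mean of beta
}
y <- seq(-6, 6, length.out = 401)
plot(y, pm_rule(y), type = "l", ylab = "posterior mean")
abline(0, 1, lty = 3)                           # 45-degree line for reference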

5.2. Model II

Assuming that the difference between the empirical wavelet coefficients and the unknown true wavelet coefficients can be interpreted as a Gaussian error, the model is given as follows:with density given by the following equation:

This expression can be found in many existing articles, such as [18], where the variance is computed from the finest level of wavelet coefficients [12]. Here, however, the prior on the variance is assumed to be an exponential distribution with the given parameter, so that the density of the variance is defined as follows:

The marginal likelihood is then obtained as follows:with density defined as follows:as proven in [6, 19]. The prior distribution of the unknown coefficients is again assumed to be the standard elastic-net distribution, given by the following equation:

Then, the posterior distribution can be written as follows:

The predictive distribution of , when , is defined as follows:

The predictive of , when , is denoted as follows:

The predictive of , when , is denoted as follows:

For more details, see Appendix D. Figure 7 shows the plot of the wavelet coefficient distribution under Model II, with its flat shape as the parameter tends towards zero. The expectation of , when , is given as follows:

Similarly, the expectation of , when , is denoted as follows:

For more details, see Appendix E. The posterior mean of , when , is given as follows:

Rules with a more desirable shape result from a prior that contains a point mass at zero; this prior is given by the following equation:

The marginal is

The corresponding Bayes rule is given by the following equation:

Figure 8 shows plots of the rule in (39). It provides a thresholding rule, since wavelet coefficients close to zero are pushed towards zero, because the numerator and denominator depend on the elastic net. In addition, the posterior includes the double exponential distribution as its likelihood. According to [2, 20], a thresholding rule is recommended on empirical grounds.
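The scale-mixture step used in Model II, in which an exponential prior on the noise variance turns the normal likelihood into a double exponential marginal, can be checked numerically. The following R sketch performs this Monte Carlo check; the rate parametrization rate = 1/(2 b^2), giving a double exponential (Laplace) marginal with scale b, is used here purely for illustration.

# Monte Carlo check: a normal with an exponentially distributed variance
# has a double exponential (Laplace) marginal.
set.seed(2)
b   <- 1                                   # illustrative Laplace scale
v   <- rexp(1e5, rate = 1 / (2 * b^2))     # exponential prior on the variance
eps <- rnorm(1e5, mean = 0, sd = sqrt(v))  # normal draw given the variance

hist(eps, breaks = 100, freq = FALSE, main = "", xlab = "error")
curve(exp(-abs(x) / b) / (2 * b), add = TRUE, lwd = 2)  # Laplace density overlay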

6. Complex-Valued Wavelets Basis

Wavelet bases have been used widely in past decades, as extensive simulation results show. Wavelet methods treat the wavelet coefficients term by term, and the wavelet coefficients are thresholded or shrunk. There are many wavelet bases, the simplest being the Haar wavelet. In this article, a complex wavelet basis is used, where the real wavelet coefficients are treated and the imaginary wavelet coefficients are left unchanged. The complex wavelet technique was introduced by [15], who explained the complex-valued versions of the Daubechies wavelets. Subsequently, complex wavelets have become more popular over the past two decades, and several articles considering complex-valued wavelets have been published, such as [21-24]. A simple explanation of the method can be found in [25], where the model treats the complex-valued wavelet coefficients by a bivariate shrinkage rule leaving the phase undamaged. The complex-valued wavelet method can be explained briefly as follows: suppose we have random values, which are the observations, and unknown parameters; the model can thus be written as follows:where the error is a random variable. The corresponding wavelet-domain model is given by the following equation:where the first vector contains the wavelet coefficients of the observations, the second the wavelet coefficients of the unknown parameters, and the third the wavelet coefficients of the noise, each obtained by applying the wavelet transform matrix determined by the choice of wavelet basis. Barber and Nason modelled the complex-valued wavelet coefficients as follows:where the covariance matrix involves the real and the imaginary wavelet coefficients; see [25] for more detail. The complex-valued wavelet method requires a wavelet basis with at least 3 vanishing moments.

For the complex-valued Daubechies basis, there are four solutions, but two are negligible because two of them are real extremal-phase wavelets and the others form a complex-valued conjugate pair. Figure 9(a) shows the plot of the wavelet basis with the stated number of vanishing moments, while Figure 9(b) shows the real and imaginary wavelet coefficients. It can be seen that the real and imaginary wavelet coefficients are slightly smoother than the wavelet function of the Daubechies wavelet with the same number of vanishing moments. The comparison procedure can be explained by the following stepwise technique:
(1) Compute the wavelet coefficients, both the real and the imaginary parts.
(2) Choose the parameters in Models I and II that give the minimum mean squared error (MMSE), by comparing the estimated signal with the true test function.
(3) Run step (2) for 1000 iterations.
(4) Replicate this procedure 1000 times to calculate the average mean squared error (AMSE).

Hence, the real wavelet coefficients are estimated by the proposed methods, while the imaginary wavelet coefficients remain unchanged. Furthermore, the "LinaMayrand" wavelet basis with filter number 5.1 is used to recover the test function. The proposed methods are compared with the complex multiwavelet style (CMWS) using universal hard and soft thresholding, and nondecimated transforms are applied.
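A minimal R sketch of this decomposition, assuming the LinaMayrand complex-valued wavelets in the wavethresh package behave as documented, is given below. It only extracts the real and imaginary parts of the coefficients; the estimation of the real parts by Models I and II is not reproduced here.

library(wavethresh)

set.seed(3)
y <- DJ.EX(n = 128)$heavi + rnorm(128, sd = 0.5)   # illustrative noisy signal

# Complex-valued decomposition with the LinaMayrand 5.1 wavelet.
ywd <- wd(y, filter.number = 5.1, family = "LinaMayrand")
d   <- accessD(ywd, level = 5)   # complex-valued coefficients at one level
Re(d)                            # real parts, to be estimated by the proposed rules
Im(d)                            # imaginary parts, left unchanged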

7. Simulation Results

This section presents simulation experiments to assess the finite-sample performance of the proposed methods. The simulated data sets consist of the standard Heavisine test signal of Donoho and Johnstone (1994) and Nason and Silverman (1994), corrupted by independent Gaussian noise. The test signal is rescaled to have unit variance. The degree of noise is measured by the ratio of the standard deviations of the signal and the noise, referred to as the root signal-to-noise ratio (SNR). The nondecimated wavelet transform has also been considered. The real and imaginary wavelet coefficients are computed; the real coefficients are estimated, while the imaginary wavelet coefficients remain unchanged, and the underlying signal is then calculated from the real coefficients. Tables 1 and 2 show the AMSE and associated standard errors obtained when denoising signals using the two new methods, Models I and II, compared with the CMWS universal hard and soft thresholding of [25]. The results in Tables 1 and 2 are based on 1000 simulated data sets of 64 and 128 equally spaced points with different levels of noise. In each case, no thresholding was done below level 3. We have compared all of our methods to a range of techniques using real-valued wavelets on the same simulated data sets as in Tables 1 and 2. Across all of the simulations, the method of [25] was found to give the smallest AMSE.

A selection of results is presented in Tables 1 and 2, where results are shown for CMWS-hard, CMWS-soft, and the proposed methods, Models I and II. For all of these competing procedures, considerable effort was made to tune their parameters to give results as effective as possible. Real-valued multiwavelet shrinkage and, to a lesser extent, cross-validation were found to be highly sensitive to the choice of primary resolution; all other methods were found to be robust to this choice. As the level of noise increases, the AMSE increases, and the CMWS-soft method provides better results than the other methods; more precisely, the CMWS method of [25] reduces the AMSE. Tables 1 and 2 show the AMSE results when the nondecimated wavelet decomposition was used. Both tables compare the proposed methods with CMWS-hard and CMWS-soft with five vanishing moments. The CMWS-soft method provides better results when the data size equals 64 and 128; moreover, as the data size increases, the AMSE decreases, since larger data sets contain more information about the signal. This pattern has also been observed in [26].

This article considers the following ways of estimating the unknown signal: CMWS-hard, CMWS-soft, Model I, and Model II. More precisely, the simulation was replicated 1000 times, with 1000 runs used to estimate the underlying signal, and the average mean squared error (AMSE) was calculated aswhere the estimate of the underlying signal from the jth replicate is compared with the truth; R is the number of replicates and n is the length of the signal. Algorithm 1 shows the main idea of calculating the AMSE. The underlying signal is estimated using the corresponding rule. The cthresh method was applied using the R package "wavethresh." For Models I and II, Algorithm 2 below explains the idea; a runnable sketch of the AMSE computation is given after the algorithm listings.

Algorithm 1: Computation of the AMSE.
Result: AMSE
Let the test signal and the noise level be given
for j = 1 to R do
 Generate a noisy data set
 Compute the estimate of the underlying signal
 Compute the mean squared error for replicate j
end
Compute the AMSE as the average of the R mean squared errors

Algorithm 2: MCMC estimation for Models I and II.
Result: MSE
Initialization of the model and prior parameters
Compute the wavelet transform and the initial quantities
for r = 1 to R do
 for m = 1 to M do
  for j = 0 to J - 1 do
   Compute the real wavelet coefficients at level j
   for i = 1 to M do
    Generate a candidate value from a Gaussian distribution
    Compute the acceptance quantities
    if the acceptance condition holds then
     accept the candidate and update the current values
    else
     keep the current values
    end
   end
  end
  Update the hyperparameters (Robert's method)
 end
 Compute the MSE
end
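A minimal runnable R version of Algorithm 1, using ordinary universal thresholding from wavethresh as a stand-in estimator (the proposed Models I and II would replace the threshold() call), might look as follows; the test signal, noise level, and number of replicates are illustrative only.

library(wavethresh)

set.seed(4)
R_reps <- 100                               # number of replicates (illustrative)
n      <- 128
truth  <- DJ.EX(n = n)$heavi
truth  <- truth / sd(truth)                 # unit-variance test signal
sigma  <- 0.2                               # noise standard deviation

mse <- numeric(R_reps)
for (j in 1:R_reps) {
  y      <- truth + rnorm(n, 0, sigma)                    # generate a noisy data set
  ywd    <- wd(y, filter.number = 5, family = "DaubLeAsymm")
  est    <- wr(threshold(ywd, policy = "universal"))      # stand-in estimator
  mse[j] <- mean((est - truth)^2)                         # MSE for replicate j
}
amse <- mean(mse)                           # average mean squared error
amse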

Tables 1 and 2 show the results of the different methods for various levels of noise (0.1, 0.2, 0.5, 0.9); the CMWS-soft method provides better results than the other methods. However, Model II gives better results than Model I, since in Model II the data are used to estimate the variance through its exponential prior.

8. Application to Real Data

In this section, real data are analysed to investigate the performance of the methodology. Figure 9 shows the popular motorcycle crash data introduced by [27]; this data set is available in R in the package MASS. The "Time" points are not regularly spaced. Figure 10 shows the plot of 128 points of the motorcycle crash data.

Figure 11 shows the reconstructions produced by the different methods. Panel (a) shows the thresholding of the CMWS-hard method, where it is seen that this method works well, although there is a misfit at the top of the panel. Problems in the fit can also be seen in panel (b), which is produced by CMWS-soft. Panels (c) and (d) are recovered by Models I and II, respectively, where in Model I the parameters are fixed using the results in Tables 1 and 2. In this method, the parameters are , and , while in Model II the parameters are fixed at and . These parameters are chosen using the MMSE criterion, from the recovery of the test function. As seen, there is only a slight difference between the reconstructions of Models I and II.
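For reference, the motorcycle data can be loaded and a basic reconstruction produced in R as follows. Taking the first 128 acceleration values is only one simple way to obtain a dyadic-length series, and the real-valued universal thresholding shown here is a stand-in for the reconstructions in Figure 11 rather than the CMWS or Model I and II fits.

library(MASS)         # provides the mcycle data
library(wavethresh)

y   <- mcycle$accel[1:128]                 # first 128 acceleration values
ywd <- wd(y, filter.number = 5, family = "DaubLeAsymm")
fit <- wr(threshold(ywd, policy = "universal", type = "hard"))

plot(mcycle$times[1:128], y, xlab = "Time", ylab = "Acceleration")
lines(mcycle$times[1:128], fit, lwd = 2)   # thresholded reconstruction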

9. Conclusion

This article has investigated two new models for denoising real-valued signals using the elastic-net distribution with complex-valued wavelets. Two models are proposed: the first has a normal distribution as the likelihood, while the second has a double exponential distribution as the likelihood. In the first model, the variance is calculated from the wavelet coefficients at the finest level, while in the second model the variance is estimated using an exponential distribution. The complex wavelet coefficients are computed, the imaginary coefficients are left unchanged, and the underlying signal is estimated using the real coefficients. Extensive simulation results are shown in Tables 1 and 2, where the different methods are compared using the AMSE; the method of [25] works well, providing a more reasonable reconstruction than the proposed methods and reducing the AMSE, by as much as 0.0028, when the noise variance equals 0.1. However, these methods do not offer smooth solutions for the real data, whose points are not regularly spaced. Our methods provide slightly better reconstructions than the other methods for such data, although the CMWS method is simple, easy to use, and much faster. The proposed methods show smoothing features in the reconstruction when the data are not equally spaced.

Appendix

A

The cumulative function is given by the following equation:

The cumulative function is given by the following equation:where and .

B

The predictive distribution of , when , is defined as follows:

The predictive of , when , is denoted as follows:

C. The Mathematical Equations for the Expectation of βi

The expectation of , when , is denoted as follows:

D. The Mathematical Equations for the Expectation of βi with Different Values of yi

The predictive distribution of , when , is defined as follows:

The predictive of , when , is denoted as follows:

E

The expectation of , when , is given by the following equation:

Data Availability

The data used to support the study are included in the paper.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This study was funded by Taif University Researchers Supporting Project no. TURSP-2020/279, Taif University, Taif, Saudi Arabia.