Abstract
The proportionate affine projection sign subband adaptive filter (PAP-SSAF) outperforms the affine projection sign subband adaptive filter (AP-SSAF) in echo cancellation. Still, the robustness of the PAP-SSAF algorithm is insufficient under unknown environmental conditions, and a fixed step size forces a trade-off between low steady-state misalignment and fast convergence. To address both problems, we propose a normalized combination of PAP-SSAF (NCPAP-SSAF) based on a normalized adaptation scheme. A power-normalized adaptive rule for the mixing parameter is proposed to further improve the performance of the NCPAP-SSAF algorithm. Taking the l1-norm of the subband error as the cost function, the mixing parameter that controls the combination is obtained with Nesterov’s accelerated gradient (NAG) method, which shortens the time needed to reach it. We also analyze the computational complexity and memory requirements to illustrate the rationality of our method. In brief, our study contributes a novel adaptive filter algorithm that accelerates convergence, reduces the steady-state error, and improves robustness, and it can therefore be used to improve the performance of echo cancellation. In future research, we will optimize the combination structure and simplify unnecessary calculations to reduce the algorithm’s computational complexity.
1. Introduction
Adaptive filters are important components of many signal processing applications, such as echo cancellation, system identification, and channel equalization [1, 2]. Echo cancellation is the process of extracting clean signals from echo-corrupted signals. The adaptive filter for an echo cancellation system is generally designed in the frequency domain, because a time-domain design may require an impractically long filter. Among the classical algorithms, the least mean squares (LMS), normalized LMS (NLMS), and filtered-x LMS (FxLMS) algorithms are prominent because they are simple and reliable [3, 4]. However, their performance degrades for colored inputs, especially speech signals [5]. By exploiting multiple regression, the affine projection algorithm (APA) improves convergence, but at the cost of high computational complexity. The normalized subband adaptive filter (NSAF) also speeds up convergence [6]. This kind of algorithm is derived from the principle of minimum disturbance, and it handles colored input signals by decomposing them with analysis filter banks [7]. Das and Trivedi showed that the convergence rate can also be improved by using the proportionate normalization method in the adaptive filter. However, the sparseness of the impulse response still affects its performance [8].
In real life, the noise is complex and often non-Gaussian, and many adaptive algorithms suffer a reduced convergence rate in impulsive noise because they rely on the l2-norm optimization criterion [9]. The family of sign algorithms, which includes the sign subband adaptive filter (SSAF), can resist impulsive noise. However, the convergence of SSAF is very slow and cannot be accelerated by increasing the number of subbands. In addition, if the impulse response of the echo path is sparse, the convergence speed of SSAF decreases further. To speed up convergence, Ni et al. proposed the variable regularization parameter SSAF (VRP-SSAF) [10, 11].
In recent years, researchers have proposed many modified SSAF algorithms to accelerate convergence. By adopting the idea of multiple regression, the APA shows better performance, which has inspired further studies. Reference [7] proposes an AP-SSAF algorithm that uses multiple previous input vectors to update the weight vector. Yu and Zhao found that filter performance decreases when all the subbands use the same weighting factor [12]. To solve this problem, they proposed the individual-weighting-factor SSAF (IWF-SSAF), which allocates an individual weighting factor to each subband. However, in many situations the echo path impulse responses are sparse, so these modified algorithms converge slowly [13]. For sparse echo paths, these algorithms can incorporate a gain distribution matrix into their adaptation. Consequently, to account for the sparsity of the impulse responses, the SSAF, AP-SSAF, and IWF-SSAF were extended to P-SSAF, PAP-SSAF, and IWF-IP-SSAF, respectively [14–17].
Because they use a fixed step size, the standard SSAF algorithm and its modified versions must trade off a fast convergence rate against low steady-state misalignment. To solve this problem, many variable step-size algorithms have been proposed [18–20], but all of them need to incorporate a priori information into the learning mechanism. However, such information is difficult to obtain in the real world.
In addition to the variable step-size algorithms mentioned above, the combination method, which combines two filters with different step sizes through a mixing parameter, balances the convergence rate against the steady-state error [21]. It is also called a convex combination because the mixing parameter ranges between 0 and 1. This algorithm uses stochastic gradient descent to determine the optimal mixing parameter. The improved convex combination normalized subband adaptive filter (ICNSAF) can achieve the desired performance without information about the subband noise power. Considering impulsive noise, Lu et al. proposed a novel combination of AP-SSAF filters, which transfers coefficient weights to obtain fast convergence during the transition stage [22]. By applying the convex combination scheme to IWF-SSAF and cyclically feeding the weight vector of the combined filter back to both component filters, Yu et al. proposed the combined IWF-SSAF with weight feedback. Although these combination algorithms improve the performance of adaptive filters to some extent, two problems remain. First, they do not adapt well with respect to the step size of the mixing parameter, which is a major factor in the adaptability of the filter. To adjust this step size correctly, we need to consider characteristics of the filtering scheme such as the input signal power, the additive noise power, and the step sizes of the component filters [23]. Second, they rarely consider the sparsity of the impulse response, which weakens the robustness of the filters in this situation [24, 25].
In this paper, the normalized combination of PAP-SSAF (NCPAP-SSAF) is proposed to deal with these defects; it adjusts the mixing parameter by means of power normalization. Compared with other adaptive filter algorithms, the proposed NCPAP-SSAF algorithm is robust in impulsive noise environments, which is confirmed by the simulation results. In contrast to the standard PAP-SSAF algorithm, the proposed algorithm has the following characteristics:
(i) To accelerate convergence and improve robustness against impulsive noise, the l1-norm of the subband error is used as the cost function. The mixing parameter of the combination is then obtained with Nesterov’s accelerated gradient (NAG) method.
(ii) The NCPAP-SSAF algorithm normalizes the step size of the mixing parameter so that it is independent of the signal-to-noise ratio (SNR). This makes the step size easy to select and gives robust behavior under unknown environmental conditions such as the “double-talk” scenario.
2. Background of PAP-SSAF
The algorithm proposed in this paper improves the convergence rate and steady-state error of the PAP-SSAF algorithm. Therefore, it is necessary to introduce PAP-SSAF first.
We first describe the signal model of a typical echo canceller. The input signal vector $\mathbf{u}(n)$ is filtered through the unknown impulse response $\mathbf{w}_o(n) = [w_0(n), w_1(n), \ldots, w_{L-1}(n)]^T$ to produce the desired signal, where L is the length of the impulse response. This process can be described as $d(n) = \mathbf{u}^T(n)\mathbf{w}_o(n) + v(n)$, where v(n) represents the background noise, the superscript T denotes transposition, and $\mathbf{u}(n) = [u(n), u(n-1), \ldots, u(n-L+1)]^T$. We define N as the number of subbands, d(n) as the microphone signal, and u(n) as the far-end signal in the described adaptive filter structure. First, the analysis filters split d(n) and u(n) into N subband signals $d_i(n)$ and $u_i(n)$, where i = 0, 1, …, N − 1. After the subband input signal $u_i(n)$ passes through the adaptive filter $\mathbf{w}(k)$, we get the subband output signal $y_i(n)$. The index n refers to the original sequences, and k indexes the decimated sequences. The signals obtained after N-fold decimation are $d_{i,D}(k)$ and $y_{i,D}(k)$. The decimated subband error signal can be expressed as $e_{i,D}(k) = d_{i,D}(k) - \mathbf{u}_i^T(k)\mathbf{w}(k)$, where $\mathbf{w}(k)$ is the tap-weight vector of the adaptive filter. We use μ to denote the step size of the filter. Then the update of the SSAF can be formulated as in the following equation:
where εA represents the a posteriori subband error and sgn[·] is the sign function. δ is the regularization factor, a small constant that avoids division by zero. Inspired by the APA, Ni et al. [7] proposed a method, called AP-SSAF, that uses several previous input vectors to update the tap-weight vector. In each subband, we collect the L most recent desired subband signals to form the i-th desired subband signal vector. Similarly, we collect the subband input vectors to form the input signal matrix.
In AP-SSAF, it is necessary to obtain the prior subband error and the posterior subband error. Prior error is referred to as in the following equation:
Posteriori error is referred to as in the following equation:where,
To formulate the AP-SSAF, the following constrained optimization problem is solved:
where $\|\cdot\|_1$ and $\|\cdot\|_2$ represent the l1-norm and l2-norm, respectively. By using the Lagrange multiplier method, the constrained optimization problem can be replaced by an unconstrained one; in the resulting update, the a priori subband error vector eA is used in place of the a posteriori subband error vector εA in equation (1). Accordingly, the update equation of AP-SSAF is as follows:
In the real world, there is the fact that the echo path impulse response is usually sparse and most of the filter coefficients are extremely close to zero. To solve this problem, the work in [26] combined the proportionate idea with the AP-SSAF, which is called the PAP-SSAF, and the updated equation of the PAP-SSAF is as follows:
In equation (9), $\mathbf{G}(k) = \mathrm{diag}[g_0(k), g_1(k), \ldots, g_{M-1}(k)]$ is a proportionate diagonal matrix, where M is the length of the adaptive filter. Many algorithms have been proposed to calculate this diagonal matrix [27]. Among them, a typical method is robust to the characteristics of the impulse response. This method has been used in the PAP-SSAF, and it is used in our research as well. The diagonal elements of G(k) are calculated as follows:
3. Proposed NCPAP-SSAF
3.1. The Algorithm Design of NCPAP-SSAF
The step size has a great influence on convergence performance. With a large step size, the adaptive filter converges quickly but has a larger steady-state error. With a small step size, convergence is slow but the steady-state error is small. The basic principle of the convex combination algorithm is to combine two filters with different step sizes that update independently, so that the combined filter inherits the advantages of both.
For simplicity, Figure 1 shows the structure of the convex combination method for a single subband. $\mathbf{w}_1(k)$ denotes the filter vector with a large step size and $\mathbf{w}_2(k)$ denotes that with a small step size. The subband output signal of each filter is $y_{i,D,j}(k) = \mathbf{u}_i^T(k)\mathbf{w}_j(k)$ and the subband error is $e_{i,D,j}(k) = d_{i,D}(k) - y_{i,D,j}(k)$, where i = 0, 1, …, N − 1 and j = 1, 2. In the combined filter structure, the two filters do not affect each other and update independently. They update according to the following:

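We obtain the final output by combining the subband outputs of the two filters as follows:
$$y_{i,D}(k) = \lambda(k)\, y_{i,D,1}(k) + [1 - \lambda(k)]\, y_{i,D,2}(k),$$
where λ(k) is the mixing parameter. Thus, the overall error can be expressed as follows:
$$e_{i,D}(k) = d_{i,D}(k) - y_{i,D}(k) = \lambda(k)\, e_{i,D,1}(k) + [1 - \lambda(k)]\, e_{i,D,2}(k).$$
The weight vector of the combined filter is as follows:
$$\mathbf{w}(k) = \lambda(k)\, \mathbf{w}_1(k) + [1 - \lambda(k)]\, \mathbf{w}_2(k).$$
Since λ(k) ∈ [0, 1], this kind of combination is called a convex combination and is commonly used in combination filters. λ(k) is calculated from an auxiliary variable α(k) by the sigmoid function as follows:
$$\lambda(k) = \frac{1}{1 + e^{-\alpha(k)}}.$$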
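The convex combination described here can be illustrated with a minimal Python sketch. It assumes the usual sigmoid map from α(k) to λ(k) and that λ(k) weights the first (large step size) filter; the function names `mixing_parameter`, `combine_outputs`, and `combine_weights` are hypothetical and chosen only for this illustration.

```python
import math


def mixing_parameter(alpha):
    """Sigmoid map from the auxiliary variable alpha(k) to lambda(k) in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-alpha))


def combine_outputs(y1, y2, lam):
    """Convex combination of the two component-filter outputs of one subband."""
    return lam * y1 + (1.0 - lam) * y2


def combine_weights(w1, w2, lam):
    """Convex combination of the two tap-weight vectors, element by element."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(w1, w2)]


if __name__ == "__main__":
    lam = mixing_parameter(0.0)  # alpha = 0 weights both filters equally
    print(lam, combine_outputs(1.0, 3.0, lam))
    print(combine_weights([1.0, 0.0], [0.0, 1.0], lam))
```

Because λ(k) stays strictly between 0 and 1, the combined output always lies between the two component outputs, which is what makes the combination convex.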
The main problem in designing a convex composite filter is how to find the appropriate value of α(k) to make the mean square error of the error signal minimized. For a lossless filter bank, the power of the output error is equal to the sum of the powers of the subband errors [28]. The traditional convex combination algorithm uses the stochastic gradient method to determine the mixing parameter. However, its convergence performance is unsatisfactory, so that the filter cannot track the system quickly. In order to solve this problem and improve the capability of impulse noise suppression, we use the Nesterov’s accelerated gradient (NAG) method to determine the mixing parameter.
The key idea of Nesterov’s accelerated gradient algorithm is illustrated in Figure 2. NAG can be unfolded into two steps. First, we calculate the update vector for α(k) from the previous value α(k − 1) and the gradient evaluated at the anticipated next position of the parameter. Evaluating the cost J(k) at this position gives a rough idea of where the parameter is going to be, so we can look “ahead” by calculating the gradient not with respect to the current parameter α(k − 1) but with respect to its approximate future position. Second, we update the parameter α(k) to complete the iteration. As a result, the convergence does not stall before the parameter has moved past the region of a local optimum.

This method minimizes the cost function in equation (19), which is the l1-norm of the subband error [29]:
$$J(k) = \sum_{i=0}^{N-1} \left| e_{i,D}(k) \right|.$$
The update equation for α(k) follows the NAG recursion, in which a momentum term accumulates the previous updates, γ is a constant called the momentum factor that ranges between zero and one, and μα > 0 is the step size. NAG reduces to the stochastic gradient method when γ = 0. Reference [16] points out that α(k) is limited to a symmetric interval [−α+, α+] to maintain a minimum level of adaptation. The experimental results in [16] indicate that α+ should be set to 4, which restricts λ(k) to the range [0.018, 0.982].
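As an illustration of how such an update might look, the following Python sketch applies one textbook NAG step to α(k) with the l1-norm of the combined subband error as the cost. It is a sketch under stated assumptions, not the paper's exact recursion: the look-ahead form, the variable name `velocity`, the function names, and the default values of `mu_alpha` and `gamma` are all assumptions made for this example.

```python
import math

ALPHA_MAX = 4.0  # bound alpha+ on alpha(k) quoted in the text


def sigmoid(alpha):
    return 1.0 / (1.0 + math.exp(-alpha))


def sign(x):
    return (x > 0) - (x < 0)


def l1_cost_gradient(alpha, e1, e2):
    """Derivative of sum_i |e_i| w.r.t. alpha, with e_i = lam*e1_i + (1-lam)*e2_i."""
    lam = sigmoid(alpha)
    total = 0.0
    for a, b in zip(e1, e2):
        combined = lam * a + (1.0 - lam) * b
        total += sign(combined) * (a - b)
    # d(lam)/d(alpha) = lam * (1 - lam) for the sigmoid
    return lam * (1.0 - lam) * total


def nag_step(alpha, velocity, e1, e2, mu_alpha=1.0, gamma=0.9):
    """One NAG step: gradient taken at the look-ahead point, then clamped."""
    lookahead = alpha - gamma * velocity
    velocity = gamma * velocity + mu_alpha * l1_cost_gradient(lookahead, e1, e2)
    alpha = max(-ALPHA_MAX, min(ALPHA_MAX, alpha - velocity))
    return alpha, velocity


if __name__ == "__main__":
    alpha, velocity = 0.0, 0.0
    # Filter 1 has the smaller subband errors, so alpha should grow.
    for _ in range(10):
        alpha, velocity = nag_step(alpha, velocity, [0.1, 0.1], [1.0, 1.0])
    print(alpha, sigmoid(alpha))
```

With `gamma=0.0` the step reduces to a plain stochastic gradient update, which matches the remark in the text that NAG and the stochastic gradient method coincide when γ = 0.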
3.2. Power Normalized Rule for Adapting the Mixing Parameter
When the value of the mixing parameter step μα is reasonable, the update equation of α(k) can provide good performance for the whole system. However, the value of μα is related to many factors of the filter, such as input signals and additive noise power and the step size of the adaptive filter. Therefore, it is necessary to normalize the mixing parameter step μα. Substituting equations (19) and (20) into the update equation (18), we get the following:
In equation (21), eD(k) = [e0,D(k), e1,D(k), …, eN−1,D(k)]T denotes the overall error vector, and yD,j(k) = [y0,D,j(k), y1,D,j(k), …, yN−1,D,j(k)]T (j = 1, 2) denotes the output of each filter.
When N = 1, the filter can be seen as a full-band filter, and no analysis or synthesis filtering of the signal is needed. The adaptive rule of the mixing parameter is then equivalent to the sign-error LMS (SLMS) algorithm, where μαλ(k)[1 − λ(k)] is the time-varying step size and the input signal is [e0,D,1(k) − e0,D,2(k)]. This equivalence holds because the output of the combination filter can be expressed as follows:
So the overall combination scheme can be seen as a two-layer adaptive filter. In the first layer, the two component filters operate independently according to their own rules. In the second layer, the output errors of the two component filters are taken as input so that the norm of the overall output error is minimized. Since [e0,D,1(k) − e0,D,2(k)] is the input signal at this layer, it makes sense to use the above adaptive scheme. Reference [30] proved that the ε-normalized SLMS (NSLMS) performs better than SLMS. Accordingly, replacing SLMS with NSLMS in the update of α(k) further improves the system performance. After normalization, the update equation of α(k) is as follows:
However, the instantaneous value [e0,D,1(k) − e0,D,2(k)]² is an inaccurate estimate of the input signal power, so the calculation is not stable. The algorithm can be improved effectively when a power estimate of the input signal is used instead of its instantaneous value, as in equation (24). When N = 1, the filter is a full-band filter, and the instantaneous value can be replaced by its power estimate:
where η is the forgetting factor, which is close to 1 (for example, 0.99). When N > 1, the update rule of the mixing parameter has the same form as the SSAF. Its step size is μαλ(k)[1 − λ(k)] and its subband input signal is eD,1(k) − eD,2(k) = [e0,D,1(k) − e0,D,2(k), e1,D,1(k) − e1,D,2(k), …, eN−1,D,1(k) − eN−1,D,2(k)]. The output of the combined filter can be expressed as follows:
which supports the above conclusions. In (26), yD(k) = [y0,D(k), y1,D(k), …, yN−1,D(k)]T. Comparing the update equation (21) with the standard SSAF, we can see that the step size of the former has not been normalized. Therefore, normalizing the step size of α(k) can improve the convergence performance of the overall filter. By analogy with the weight update of the SSAF, we get the normalized update of α(k) as follows:
As in the case N = 1, the instantaneous value [e0,D,1(k) − e0,D,2(k)]² does not estimate the power of the second-layer input signal well, and better behavior is obtained from the following equation:
Comparing the cases N = 1 and N > 1, we find that the normalized update rules differ because the normalization is performed differently: the former normalizes by the power estimate, while the latter normalizes by the square root of the power estimate.
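The two normalization variants can be sketched in Python as follows. This is a hedged illustration of the idea only: it assumes a first-order recursive power estimate with forgetting factor η, sums the squared error differences over all subbands for N > 1, and adds a small constant `eps` to avoid division by zero. The function names and `eps` are assumptions of this example.

```python
import math


def update_power(sigma, e1, e2, eta=0.99):
    """Recursive power estimate of the second-layer input e1 - e2 (assumed form)."""
    diff_energy = sum((a - b) ** 2 for a, b in zip(e1, e2))
    return eta * sigma + (1.0 - eta) * diff_energy


def normalized_step(mu_alpha, lam, sigma, n_subbands, eps=1e-8):
    """Effective step size of the alpha(k) update after power normalization."""
    base = mu_alpha * lam * (1.0 - lam)
    if n_subbands == 1:
        # Full-band case: normalize by the power estimate itself.
        return base / (sigma + eps)
    # Subband case: normalize by the square root of the power estimate.
    return base / (math.sqrt(sigma) + eps)


if __name__ == "__main__":
    sigma = 0.0
    for _ in range(500):
        sigma = update_power(sigma, [0.2, 0.1, 0.3, 0.2], [1.0, 0.9, 1.1, 1.0])
    print(sigma, normalized_step(1.0, 0.5, sigma, n_subbands=4))
```

Smoothing the power over time in this way avoids the large step-size swings that a single noisy error sample would cause.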
3.3. Stability Analysis of NCPAP-SSAF
This subsection analyzes the stability of NCPAP-SSAF by examining the convergence of the algorithm. We carry out the Taylor series expansion of ei,D(k + 1) and obtain the following equation [31]:
where o(k) stands for the higher-order infinitesimal terms of the Taylor series. Rewriting the first-order terms of the Taylor expansion gives the following:
Substituting equations (12) and (13) into the updated equation (31), we get the following:
By sorting out the preceding equation (16), we can find out the following relations:
Substituting equations (32) and (33) into the update equation (31), we get the following:
And α(k) can be calculated from equation (21):
From equations (34) and (35), we obtain equation (36) when the number of subbands N is assumed to be 1:
For an ideal filter, eD,1(k) tends to 0 as k tends to ∞. Then we can rewrite equation (36) as follows:
Therefore, we obtain the following equation:
We conclude that the NCPAP-SSAF algorithm converges when the mixing parameter satisfies the following condition:
3.4. Computational Complexity and Memory Requirement
To further illustrate the rationality of NCPAP-SSAF, we analyze its computational complexity and memory requirements. The computational complexity, counted in multiplications, is summarized in Table 1. All of these subband adaptive filtering algorithms require 3NL multiplications, and P represents the order of the analysis filter (synthesis filter). The computational costs of most of the algorithms have been summarized in [16], so we mainly analyze CAP-SSAF and NCPAP-SSAF. Since both CAP-SSAF and NCPAP-SSAF use two filters, the tap-weight update requires 2MP + 4M/N multiplications and the subband error calculation requires MP + 6M/N multiplications. For both CAP-SSAF and NCPAP-SSAF, the combined tap-weight vector can be rewritten as $\mathbf{w}(k) = \mathbf{w}_2(k) + \lambda(k)[\mathbf{w}_1(k) - \mathbf{w}_2(k)]$, and thus they both require M/N multiplications for it. For each weight vector update, CAP-SSAF requires 2MP + 4M/N + (M + 5)/N + 3NL + 1 multiplications, and NCPAP-SSAF requires 2MP + 6M/N + (M + 6)/N + 3NL + 1 multiplications.
Besides, in Table 2, we analyze and compare the memory requirement of various algorithms. The NCPAP-SSAF combines two complete AP-SSAF filters and 8 independent parameters which are [α(k), λ(k), μα, σ(k), , η]. For M = 512, L = 64, N = 4, P = 4, it needs 2M(NP + 1) + N(3L + 6P + 1)+ 12 = 18288 words to save the parameters of AP-SSAF and other NAG parameters.
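The quoted memory figure can be checked with a short Python calculation. The function name `ncpap_ssaf_memory_words` is hypothetical; the expression and the parameter values are taken directly from the text.

```python
def ncpap_ssaf_memory_words(M, N, P, L):
    """Evaluate 2M(NP + 1) + N(3L + 6P + 1) + 12 from the text."""
    return 2 * M * (N * P + 1) + N * (3 * L + 6 * P + 1) + 12


if __name__ == "__main__":
    # 2*512*17 = 17408, 4*217 = 868, plus 12 gives 18288 words.
    print(ncpap_ssaf_memory_words(M=512, N=4, P=4, L=64))
```

The first term, which scales with the filter length M, dominates the total, so the memory cost is driven mainly by storing two full sets of filter data.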
4. Simulation Experiments and Results
4.1. Setting Up the Environment of Experiment
In this section, we will conduct two kinds of experiments. Firstly, the experiment of parameter analysis will show us the influence of key parameters on the performance of the algorithm and whether their performance is consistent with the theoretical expectation. Then we simulate the echo cancellation experiment to compare the performance of our proposed algorithm and other methods. The results of the experiments will verify that the proposed algorithm brings an improvement in both accuracy and convergence speed.
The sparse echo path used in the following experiments is shown in Figure 3. Both the sparse echo path and the adaptive filter have 512 coefficients, and the sampling rate is 8 kHz. In realistic communication scenarios, the impulse response of the echo path is affected by environmental factors. Hence, we simulate an abrupt change of the echo path by shifting its impulse response to the right by 12 samples in the middle of each experiment, that is, at half of the total number of iterations. Therefore, in the following experiments, the algorithms restart their convergence at the halfway point. The sparsity of a vector can be described as follows [32]:

By substituting the echo path vector and vector length into the formula, we find that the sparsity ζ is 0.6078.
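The sparsity formula itself is not reproduced in this text. As a hedged illustration, the sketch below uses a sparseness measure that is common in the proportionate adaptive filtering literature, based on the ratio of the l1-norm to the l2-norm; whether it matches the formula of [32] exactly is an assumption, and the function name `sparsity` is hypothetical.

```python
import math


def sparsity(w):
    """Assumed measure: L/(L - sqrt(L)) * (1 - ||w||_1 / (sqrt(L) * ||w||_2))."""
    length = len(w)
    l1 = sum(abs(x) for x in w)
    l2 = math.sqrt(sum(x * x for x in w))
    root = math.sqrt(length)
    return length / (length - root) * (1.0 - l1 / (root * l2))


if __name__ == "__main__":
    spike = [0.0] * 511 + [1.0]   # a single nonzero tap: maximally sparse
    flat = [1.0] * 512            # all taps equal: not sparse at all
    print(sparsity(spike), sparsity(flat))
```

Under this measure the value ranges from 0 for a vector with equal-magnitude entries to 1 for a single nonzero tap, so a value such as 0.6078 would indicate a moderately sparse echo path.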
There are two types of input signals: speech segments and an AR(1) process. The AR(1) signal is obtained by filtering a zero-mean white Gaussian random sequence through the first-order system $H(z) = 1/(1 - 0.9z^{-1})$, with a signal length of 6 × 10^4 samples. Meanwhile, we use independent white Gaussian noise with a 30 dB signal-to-noise ratio and strong impulsive noise with a −10 dB signal-to-interference ratio as the system background noise and system output noise, respectively. The Bernoulli-Gaussian model is used to generate the impulsive noise as z(k) = ω(k)n(k), where n(k) is white Gaussian noise with zero mean and variance δ, and ω(k) is a Bernoulli process with occurrence probability P{ω(k) = 1} = Pr and P{ω(k) = 0} = 1 − Pr.
Double-talk is very common in echo cancellation. To simulate this scenario, a speech signal sampled at 8 kHz is added at the near end in the simulation. Figure 4 shows the signals of the double-talk scenario. Figure 4(a) is the near-end speech and Figure 4(b) is the far-end speech in all of the following experiments. To ensure a fair comparison, the following parameters are set identically in all algorithms: number of subbands N = 4, affine projection order L = 4, and forgetting factor η = 0.99.
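The two synthetic signals described in this paragraph can be generated with a few lines of Python. This is a minimal sketch using only the standard library; the function names, the seeds, and the unit variance of the driving white noise are assumptions of this example, and scaling to the stated SNR and signal-to-interference ratio is left out.

```python
import random


def ar1_input(n, pole=0.9, seed=0):
    """White Gaussian noise filtered by H(z) = 1 / (1 - pole * z^-1)."""
    rng = random.Random(seed)
    samples, previous = [], 0.0
    for _ in range(n):
        previous = pole * previous + rng.gauss(0.0, 1.0)  # u(n) = 0.9 u(n-1) + x(n)
        samples.append(previous)
    return samples


def bernoulli_gaussian(n, pr, variance, seed=1):
    """Impulsive noise z(k) = omega(k) * n(k) with P{omega(k) = 1} = pr."""
    rng = random.Random(seed)
    std = variance ** 0.5
    return [rng.gauss(0.0, std) if rng.random() < pr else 0.0 for _ in range(n)]


if __name__ == "__main__":
    u = ar1_input(60000)
    z = bernoulli_gaussian(60000, pr=0.01, variance=100.0)
    print(len(u), sum(1 for s in z if s != 0.0))
```

With Pr = 0.01 and 6 × 10^4 samples, roughly 600 samples carry an impulse, while the rest of the impulsive noise sequence is exactly zero.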

Figure 4: Signals of the double-talk scenario. (a) Near-end speech. (b) Far-end speech.
We performed 50 independent Monte Carlo runs in each simulation and obtained the final results by averaging over the 50 runs. The normalized mean square deviation (NMSD, in dB) is used to evaluate the convergence performance of the adaptive filters. It is defined as follows:
$$\mathrm{NMSD}(k) = 10\log_{10}\frac{\|\mathbf{w}_o - \mathbf{w}(k)\|_2^2}{\|\mathbf{w}_o\|_2^2}.$$
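A small Python sketch of this performance measure follows. It assumes that NMSD is the squared norm of the weight error relative to the squared norm of the true echo path, expressed in dB; the function name `nmsd_db` and the handling of a zero error are choices made for this example.

```python
import math


def nmsd_db(w_true, w_est):
    """NMSD in dB: 10*log10(||w_true - w_est||^2 / ||w_true||^2)."""
    error_energy = sum((a - b) ** 2 for a, b in zip(w_true, w_est))
    path_energy = sum(a * a for a in w_true)
    if error_energy == 0.0:
        return float("-inf")  # perfect identification
    return 10.0 * math.log10(error_energy / path_energy)


if __name__ == "__main__":
    w_true = [1.0, 0.0, 0.0, 0.0]
    print(nmsd_db(w_true, [0.0, 0.0, 0.0, 0.0]))  # all-zero initial filter
    print(nmsd_db(w_true, [0.9, 0.0, 0.0, 0.0]))  # 10 % error on the main tap
```

An all-zero initial filter gives 0 dB, and the curve moves toward more negative values as the estimate approaches the true echo path.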
4.2. Momentum Parameter Analysis
Figures 5 and 6 show the curves of the NCPAP-SSAF with different values of γ at SNR = 20 dB and Pr = 0.001 for the AR(1) input.


In Figure 5, we can see that different values of the momentum factor γ lead to different evolutions of the mixing parameter λ(k). The x-axis represents the number of iterations of the algorithm in the experiment, and the y-axis represents the value of NMSD. A larger momentum factor makes NAG choose more quickly between the large-step and small-step filters. We can also see that λ(k) is limited to the range from 0.018 to 0.982 because the absolute value of α(k) does not exceed 4. In Figure 6, PAP-SSAF and NCPAP-SSAF perform much better than NLMS in terms of steady-state error and convergence speed. NAG is equivalent to the stochastic gradient method when γ = 0. It can be seen from the figure that a “convergence pause” exists in the NCPAP-SSAF when γ = 0, which is marked by the circle. As γ increases, the pause shrinks progressively to zero. Consequently, using NAG accelerates the convergence of the convex combination algorithm, and γ = 0.99 gives a satisfactory acceleration.
4.3. AR(1) Input
We carried out four different groups of experiments with AR(1) signal used as the input signal. In the following experiments, the μα of NCPAP-SSAF is set to 1.
Figure 7 shows the convergence of several algorithms at SNR = 20 dB and Pr = 0.001. The value of μα for ICNSAF and CAP-SSAF is set to 10. As can be seen from the figure, PAP-SSAF converges better than IP-SSAF with the same step size, which is due to the use of the affine projection. At the same time, the poor convergence of VSS-SSAF shows that a proportionate step-size matrix is necessary for a sparse echo path; otherwise, the algorithm does not converge effectively. Compared with the algorithms without combination, the convex combination algorithms NCPAP-SSAF, CAP-SSAF, and ICNSAF all improve performance, albeit to varying degrees. They achieve both fast convergence and low steady-state error, which is the goal of the algorithm design. Among these three convex combination algorithms, the proposed NCPAP-SSAF performs significantly better than the others. Its convergence rate is the same as that of the large-step PAP-SSAF, and its steady-state error is the same as that of the small-step PAP-SSAF. In addition, owing to the NAG method, NCPAP-SSAF does not exhibit the “convergence pause” phenomenon during convergence. NCPAP-SSAF has a convex combination structure with a normalized mixing step size, which contains two PAP-SSAF filters with different step sizes. Its performance should be attributed not only to this structure but also to the faster convergence of PAP-SSAF itself.

Figure 8 shows the convergence performance of several algorithms at SNR = 20 dB and Pr = 0.01. The values of μα for ICNSAF and CAP-SSAF are set to 100. The behavior of these algorithms is approximately the same as in Figure 7. NCPAP-SSAF resists impulsive noise well, and its performance is clearly better than that of the other algorithms. Note that NCPAP-SSAF uses the same mixing-parameter step size in Figures 7 and 8 and achieves good performance in both. This means that the normalized step size proposed in this paper makes the combined filter insensitive to the impulsive noise level; hence, the algorithm has good robustness. As Pr increases, the steady-state errors of the other algorithms increase to varying degrees.

Figures 9 and 10 show the convergence performance of these algorithms for the AR(1) input under low-noise conditions. In Figure 9, SNR = 30 dB and Pr = 0.001; in Figure 10, SNR = 30 dB and Pr = 0.01. Compared with Figures 7 and 8, the steady-state errors of these algorithms decrease to varying degrees as the SNR increases. NCPAP-SSAF achieves good results in both experiments with the same mixing-parameter step size. Comparing Figures 9 and 10, we can see that as the probability of impulsive noise increases, the convergence speed and steady-state performance of several algorithms degrade, but the algorithms still maintain a good ability to resist impulsive noise. Overall, all the algorithms perform better at the higher SNR than in the previous two experiments.


4.4. Speech Input
Taking a speech signal as input, four independent experiments were carried out. Figures 11 and 12 show the single-talk case, and Figures 13 and 14 show the double-talk case. Speech input is considered because its correlation is much greater than that of white noise passed through a first-order system.




The value of μα at NCPAP-SSAF in these experiments is set at 0.05. The performance of the NCPAP algorithm is discussed and analyzed in the following.
Figures 11 and 12 show under different SNRs the convergence of several algorithms without near-end voice. The values of μα at ICNSAF and CAP-SSAF are set at 100 in Figure 11, and the values of μα at ICNSAF and CAP-SSAF are set at 5000 in Figure 12.
As shown in both figures, the NCPAP-SSAF algorithm performs much better in terms of both convergence speed and steady-state error. For PAP-SSAF with μ = 0.05, the filter converges quickly to a certain level (about 12 dB) and then remains there, because the larger step size prevents the adaptive filter from converging further. For PAP-SSAF with μ = 0.005, the convergence is too slow and the tracking performance too poor, although a smaller steady-state error is obtained.
This indicates that the NCPAP-SSAF algorithm adaptively follows the filter with the large step size at first, continues to converge as the filter approaches the steady state, and achieves a steady-state error similar to that of the PAP-SSAF with μ = 0.005. In addition, in the two experiments, the ICNSAF and CAP-SSAF algorithms need different values of μα to obtain their best filtering results. Because its mixing-parameter step size is normalized, the NCPAP-SSAF algorithm obtains its best filtering result without adjusting μα, which improves the robustness of the system and reduces the influence of external factors. In the case of single-talk speech input, the NCPAP-SSAF algorithm combines the two independent filters effectively and has good robustness.
Figures 13 and 14 show the convergence of several algorithms with a near-end voice under different SNR, respectively. The values of μα at ICNSAF and CAP-SSAF are set at 100 in Figure 13, and the values of μα at ICNSAF and CAP-SSAF are set at 5000 in Figure 14.
It can be seen from both figures that NCPAP-SSAF has obvious performance advantages, fast convergence speed, and small steady-state error, and it is not disturbed by near-end voice.
First, comparing PAP-SSAF with μ = 0.05 and PAP-SSAF with μ = 0.005, it can be clearly seen that the former has a large steady-state error and is disturbed by the near-end speech, which causes the filter to diverge to some extent. Although the latter reaches a smaller steady-state error, its convergence is too slow, especially when the echo path changes. As for NCPAP-SSAF, its convergence curve in the initial convergence stage almost coincides with that of PAP-SSAF with μ = 0.05. This indicates that NCPAP-SSAF adaptively chooses the filter with the larger step size, which is consistent with the design goal. Next, NCPAP-SSAF keeps a fast convergence speed as the filter gradually reaches the steady state, and the filter remains stable when the near-end speech appears, which indicates that NCPAP-SSAF has a certain ability to resist interference. Consistent with the previous results, NCPAP-SSAF achieves good filtering performance and improves the robustness of the system while using the same mixing-parameter step size in both experiments. Compared with Figures 11 and 12, it can be seen that as the SNR increases, the performance of all adaptive filtering algorithms improves to a certain extent, especially in terms of the steady-state error. We therefore conclude that the proposed algorithm is more robust to double-talk than the other two combination filter methods, owing to the use of the l1-norm as the cost function.
5. Discussion
The proposed method has some reference value for echo cancellation. With a convex combination of two independent filters, the NCPAP-SSAF algorithm exhibits superior filtering performance. However, some issues still need to be addressed in later work. One limitation is that the computational complexity of the algorithm is similar to that of other algorithms based on combination filters but much larger than that of a single filter. This can be improved by upgrading the combination structure and simplifying unnecessary calculations. Moreover, all the experiments in this paper are carried out in the MATLAB simulation environment, and the impulse noise is simulated with a Bernoulli-Gaussian distribution, which is not consistent with impulse noise in the real world.
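For reference, Bernoulli-Gaussian impulse noise of the kind mentioned above is commonly modelled as the product of a Bernoulli occurrence process and a zero-mean Gaussian amplitude process. The sketch below follows that common model in Python rather than MATLAB. The function name and the parameters `prob` and `sigma` are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np

def bernoulli_gaussian_noise(n_samples, prob, sigma, rng=None):
    """Generate impulse noise as (Bernoulli occurrence) x (Gaussian amplitude).

    prob: probability that an impulse occurs at a given sample (assumed value)
    sigma: standard deviation of the impulse amplitude (assumed value)
    """
    rng = np.random.default_rng() if rng is None else rng
    occurs = rng.random(n_samples) < prob             # True with probability prob
    amplitude = rng.normal(0.0, sigma, n_samples)     # zero-mean Gaussian samples
    return occurs * amplitude                         # zero where no impulse occurs

# Example: impulses on roughly 1% of samples
noise = bernoulli_gaussian_noise(10_000, prob=0.01, sigma=10.0,
                                 rng=np.random.default_rng(0))
```

Each impulse in this model lasts exactly one sample and occurs independently of its neighbours, which is one reason it differs from measured impulse noise.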
6. Conclusion
In this paper, we design a new combination structure, called NCPAP-SSAF, for the affine projection sign subband adaptive filtering algorithm. The proposed power normalization of the mixing-parameter step size improves the robustness of the algorithm. We carry out echo cancellation simulations to validate the effectiveness of the proposed algorithm. First, we test the influence of the momentum factor on the mixing parameter. Then, we compare the performance of the proposed algorithm with that of other methods. The simulation results show that the NCPAP-SSAF algorithm is, to a certain extent, unaffected by stationary noise and impulse noise, and that it accurately obtains the optimal combination parameters under different conditions and thus the optimal filtering performance. In the case of double-talk speech, NCPAP-SSAF maintains a faster convergence speed and a smaller steady-state error, resists near-end speech interference to a certain degree, and shows strong robustness. Compared with the other algorithms, the proposed method accelerates convergence, reduces the steady-state error, and improves robustness. In future research, we will improve the combination structure and simplify unnecessary calculations to reduce the computational complexity of the algorithm.
Abbreviations
LMS: Least mean squares
NLMS: Normalized least mean squares
APA: Affine projection algorithm
NSAF: Normalized subband adaptive filter
SSAF: Sign subband adaptive filter
VRP-SSAF: Variable regularization parameter SSAF
AP-SSAF: Affine projection sign subband adaptive filter
IWF-SSAF: Individual-weighting-factor SSAF
P-SSAF: Proportionate SSAF
ICNSAF: Improved convex combination normalized subband adaptive filter
PAP-SSAF: Proportionate affine projection sign subband adaptive filter
SLMS: Sign-error LMS
NAG: Nesterov’s accelerated gradient
Data Availability
The data were generated according to the method described in this paper.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grants 61350009 and 61179045.