Abstract

The Feasibility Pump is an effective heuristic method for solving mixed integer optimization problems. In this paper, the algorithm is adapted to finding the sparse representation of signals affected by Laplacian noise. Two adaptations of the algorithm, regularized and nonregularized, are proposed, tested, and compared against the regularized least absolute deviation (RLAD) model. The obtained results show that the addition of the regularization factor always improves the algorithm. The regularized version of the algorithm also gives better results than the RLAD model in all cases. The Feasibility Pump recovers the sparse representation with good accuracy, while requiring a very small computation time compared with other mixed integer methods.

1. Problem Formulation

Sparse representations are among the most important linear representation methods. They are directly related to the compressed sensing problem, but can also be used in machine learning and in signal or image processing. As presented in [1], many types of algorithms have been proposed for computing them. Sparse representations allow the identification of only a few relevant features or atoms, which can be used to represent the signal of interest with high accuracy. They can also help separate different types of signals well, which makes them appropriate for denoising and classification problems.

The sparse representation of a signal $y \in \mathbb{R}^m$, using a dictionary $D \in \mathbb{R}^{m \times n}$, consists in finding a solution $x \in \mathbb{R}^n$ having the smallest possible number of nonzero coefficients to the system $y = Dx$. In practice, the equality constraint is relaxed and replaced by the minimization of the data misfit measure $\|y - Dx\|$. We adopt here the $\ell_1$ norm, although usually the norm used for the error minimization is $\ell_2$. The $\ell_1$ norm is more robust against outliers and is optimal for Laplacian noise [2]. It is also a norm that promotes sparsity (the error between the original signal and the obtained representation has only a few significant elements). Bounding the number of nonzero coefficients (denoted $\|x\|_0$) by an imposed sparsity level $s$, the sparse representation is found by solving
$$\min_{x} \; \|y - Dx\|_1 \quad \text{s.t.} \quad \|x\|_0 \le s. \tag{1}$$

An alternative formulation of the problem is obtained by minimizing the number of nonzero coefficients and requiring the data misfit measure to be under a selected threshold $\varepsilon$:
$$\min_{x} \; \|x\|_0 \quad \text{s.t.} \quad \|y - Dx\|_1 \le \varepsilon. \tag{2}$$

Because the norm used to bound the number of nonzero coefficients in (1) is $\ell_0$, Mixed Integer Programming (MIP) algorithms such as Branch and Bound and Feasibility Pump can be used [3].

Mixed Integer Programs are optimization problems that contain both integer and continuous variables. The most used MIP algorithms are Branch and Bound [4, 5] and Cutting Planes [6]. Another popular variant is the Branch and Cut [7] algorithm, which combines the two. These algorithms can offer good solutions for problems with integer constraints, but they are often very time consuming. The Feasibility Pump, initially proposed in [8, 9], is a MIP heuristic method that generates two sequences of solutions, one satisfying the linear constraints and one satisfying the integer constraints. These sequences are generated until a solution that satisfies both conditions is found. The Feasibility Pump starts from an initial solution and then proceeds through several iterations to minimize the distance between the two solutions, first by solving a linear optimization problem with the integer constraints removed, in order to obtain the solution that satisfies the linear constraints, then by a rounding step, in order to obtain the solution that satisfies the integer constraints. The algorithm is prone to cycling, and several approaches have been proposed to avoid it [10–14]. The Feasibility Pump offers a much better execution time (several orders of magnitude faster than Branch and Bound or Branch and Cut), but the quality of the solution is not as good; there is no guarantee that the optimum is attained. The Feasibility Pump has been proposed for both linear and nonlinear problems [15, 16].

In order for the Feasibility Pump algorithm to be used, we introduce binary variables $b_i \in \{0, 1\}$, $i = 1, \dots, n$, where $b_i$ shows whether atom $i$ of the dictionary is used to represent the signal $y$.

Problem (1) becomes
$$\begin{aligned} \min_{x, b} \quad & \|y - Dx\|_1 \\ \text{s.t.} \quad & \sum_{i=1}^{n} b_i \le s, \\ & -M b_i \le x_i \le M b_i, \quad i = 1, \dots, n, \\ & b_i \in \{0, 1\}, \quad i = 1, \dots, n. \end{aligned} \tag{3}$$
The big-$M$ trick is used as in [17], where $M$ is a preset parameter bounding the magnitude of the elements of $x$.

In this paper, the Feasibility Pump is adapted in order to obtain the sparse solution of problem (3). A regularization term is also added to the objective in order to improve the convergence of the algorithm. Section 2 presents our Feasibility Pump algorithm and the implementation details. Section 3 is dedicated to experimental results showing the behavior of our algorithm and to comparisons with the regularized least absolute deviation (RLAD) model, which consists of a linear programming problem. We show that our regularized Feasibility Pump algorithm is consistently better than its nonregularized version and than RLAD. Section 4 presents the conclusions and ideas for future research.

2. Algorithm

Feasibility Pump [8] is a MIP heuristic that tries to minimize the difference between the solution of the relaxed linear programming problem (where the integer condition is dropped) and a solution that satisfies the integer conditions.

Algorithm 1 presents our approach for solving problem (1) via its reformulation (3). In what follows, we explain its steps and the refinements that we have made to standard Feasibility Pump.

Data: dictionary $D$, signal to represent $y$, sparsity level $s$, maximum number of iterations Iter,
weight parameters $\alpha$, $\beta$, $\lambda$ (the latter for SFPreg only)
Result: a feasible solution $x$
Solve (4)/(9) for $x$ and $b$ using a linear programming solver
Use rounding procedure to obtain binary vector $\tilde{b}$
while number of iterations < Iter do
    Solve the problem (5)/(10) for the vectors $x$ and $b$
    if $b$ is integer then
        return $x$
    end
    Use rounding procedure to obtain vector $\tilde{b}$
    if cycle is detected then
        Use perturbation (7) on $\tilde{b}$
    end
    Update the value of $\alpha$ using (6)
end

As usual, the starting point for the Feasibility Pump algorithm is the solution of problem (3) when $b$ is a vector with values in the box $[0, 1]^n$, instead of a binary one. We also introduce an auxiliary variable $t \in \mathbb{R}^m$ to bound the data misfit measure. The resulting problem is
$$\begin{aligned} \min_{x, b, t} \quad & \mathbf{1}^T t \\ \text{s.t.} \quad & \sum_{i=1}^{n} b_i \le s, \\ & -M b_i \le x_i \le M b_i, \quad i = 1, \dots, n, \\ & -t \le y - Dx \le t, \\ & 0 \le b_i \le 1, \quad i = 1, \dots, n. \end{aligned} \tag{4}$$
This is a linear programming problem that can be easily solved with one of the many existing algorithms. We have used the MATLAB function linprog, based on the dual simplex algorithm.
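As an illustration, the following sketch (our reading of (4), not the authors' published code) assembles the problem in the inequality form accepted by linprog, stacking the variables as z = [x; b; t]:

% Minimal sketch of the LP relaxation (4) for linprog (assumed inputs:
% D of size m x n, signal y, sparsity level s, big-M constant M).
[m, n] = size(D);
f = [zeros(2*n,1); ones(m,1)];                  % minimize 1'*t
A = [zeros(1,n),  ones(1,n),  zeros(1,m);       % sum(b) <= s
     eye(n),     -M*eye(n),   zeros(n,m);       %  x - M*b <= 0
    -eye(n),     -M*eye(n),   zeros(n,m);       % -x - M*b <= 0
     D,           zeros(m,n), -eye(m);          %  D*x - t <= y
    -D,           zeros(m,n), -eye(m)];         % -D*x - t <= -y
bvec = [s; zeros(2*n,1); y; -y];
lb = [-Inf(n,1); zeros(n,1); zeros(m,1)];       % 0 <= b, 0 <= t
ub = [ Inf(n,1); ones(n,1);  Inf(m,1)];         % b <= 1
z = linprog(f, A, bvec, [], [], lb, ub);
x = z(1:n);  b = z(n+1:2*n);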

This real vector $b$ is then rounded to obtain a binary vector $\tilde{b}$, by setting to 1 the entries of $\tilde{b}$ corresponding to the $s$ largest absolute values of $x$ and the others to 0. This is in contrast with the usual rounding to the nearest integer used by Feasibility Pump, but is justified by the fact that a smaller representation error is obtained in the long run by taking the maximum allowed number of nonzero coefficients.
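A minimal sketch of this rounding procedure, assuming x and the sparsity level s from above:

% Rounding sketch: set to 1 the entries of btilde corresponding to the
% s largest magnitudes of x, and to 0 all the others.
[~, idx] = sort(abs(x), 'descend');
btilde = zeros(n, 1);
btilde(idx(1:s)) = 1;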

In the ideal case where the vector $b$ is already binary, the algorithm stops. Note that the first constraint of (4) is satisfied; hence the number of coefficients with a value of 1 cannot exceed $s$. The other stopping condition is when the maximum number of iterations is reached. In this case, $b$ is rounded as described above and the coefficients of $x$ that correspond to zero coefficients in $\tilde{b}$ are also set to zero.

In each iteration of Feasibility Pump, the vector $b$ and the tentative solution $x$ are improved by solving
$$\begin{aligned} \min_{x, b, t} \quad & \alpha \, \mathbf{1}^T t + (1 - \alpha) \|b - \tilde{b}\|_1 \\ \text{s.t.} \quad & \text{the constraints of (4)}, \end{aligned} \tag{5}$$
where $\tilde{b}$ is the current integer vector and $0 < \alpha < 1$ is a weight. This linear program is also solved with linprog. The iteration step has an objective that combines the representation error with a term that forces the new solution to be near the current integer vector $\tilde{b}$, with the aim of bringing the solution nearer to a binary vector. This kind of modification is proposed in [18, 19]. After each iteration, the parameter $\alpha$ is reduced, so that the integer condition weighs more than the error objective. The reduction of $\alpha$ is done by multiplication with a value $\beta < 1$:
$$\alpha_{k+1} = \beta \alpha_k. \tag{6}$$
Typically, a large $\beta$ gives the smallest error, but at the expense of a larger execution time, while a smaller $\beta$ offers faster results, but with a larger error.
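Since $\tilde{b}$ is binary and $0 \le b \le 1$, the distance $\|b - \tilde{b}\|_1$ is linear in $b$, so (5) reuses the constraints of (4) and changes only the cost vector. A sketch, with A, bvec, lb, ub as in the previous snippet:

% Iteration step (5): |b_i - btilde_i| equals b_i when btilde_i = 0 and
% 1 - b_i when btilde_i = 1, so the distance contributes the linear cost
% (1 - 2*btilde) on b (the constant term does not affect the minimizer).
f = [zeros(n,1); (1-alpha)*(1 - 2*btilde); alpha*ones(m,1)];
z = linprog(f, A, bvec, [], [], lb, ub);
x = z(1:n);  b = z(n+1:2*n);
alpha = beta * alpha;    % update (6), with 0 < beta < 1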

Although this strategy usually leads towards a binary $b$, it can also lead to cycling, which we consider to occur at an iteration if
(i) the point $\tilde{b}$ has already been visited in previous iterations;
(ii) the distance $\|b - \tilde{b}\|_1$ is under a certain threshold.

To break the cycle, a weak perturbation is used for the coefficients of $\tilde{b}$:
$$\tilde{b}_i \leftarrow \tilde{b}_i + \rho_i, \quad i = 1, \dots, n, \tag{7}$$
where the $\rho_i$ are small random values. The resulting $\tilde{b}$ has real values and is used in the linear program for the next iteration. This is done to induce a switching of the coefficients in the next vector $\tilde{b}$, which will again have binary values.
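A sketch of the cycle test and of the perturbation (7); visited, tol, and pert are hypothetical names for the list of past vectors $\tilde{b}$, the distance threshold, and the perturbation magnitude:

% Cycle detection and perturbation sketch (initialize visited = zeros(n,0)
% before the loop).
cycle = (~isempty(visited) && any(all(visited == btilde, 1))) ...
        || norm(b - btilde, 1) < tol;
visited = [visited, btilde];                 % record the current btilde
if cycle
    % weak random perturbation (7); btilde becomes real-valued
    btilde = btilde + pert * (rand(n,1) - 0.5);
end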

We name Sparse Feasibility Pump (SFP) the algorithm as described above.

Because the matrix $D$ can be ill conditioned, a second implementation, named Sparse Feasibility Pump with regularization (SFPreg), is proposed, where an $\ell_1$ regularization term is added to the objective. Regularization terms in the Feasibility Pump are also considered in [20]. For the initial step, instead of (3) we aim to solve
$$\begin{aligned} \min_{x, b} \quad & \|y - Dx\|_1 + \lambda \|x\|_1 \\ \text{s.t.} \quad & \text{the constraints of (3)}, \end{aligned} \tag{8}$$
where $\lambda > 0$ is a given constant. Similar to the transformation of (3) into (4), we implement (8) by introducing a new variable $t$ to bound the error and a variable $u$ to bound $|x|$ and by relaxing the binary constraint on $b$, thus obtaining
$$\begin{aligned} \min_{x, b, t, u} \quad & \mathbf{1}^T t + \lambda \mathbf{1}^T u \\ \text{s.t.} \quad & \sum_{i=1}^{n} b_i \le s, \\ & -M b_i \le x_i \le M b_i, \quad i = 1, \dots, n, \\ & -t \le y - Dx \le t, \\ & -u \le x \le u, \\ & 0 \le b_i \le 1, \quad i = 1, \dots, n. \end{aligned} \tag{9}$$

Regularization also changes the form of the iteration step. The parameter $\alpha$ is now used as a weight for the regularized error. The regularized version of (5) is
$$\begin{aligned} \min_{x, b, t, u} \quad & \alpha \left( \mathbf{1}^T t + \lambda \mathbf{1}^T u \right) + (1 - \alpha) \|b - \tilde{b}\|_1 \\ \text{s.t.} \quad & \text{the constraints of (9)}. \end{aligned} \tag{10}$$

Adding the regularization term does not change the nature of the problem and the same algorithms as for SFP are used.
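With the same variable stacking, the regularized problems (9) and (10) only append the block $u$ and its two coupling constraints. A sketch, reusing A, bvec, lb, ub from the snippet for (4):

% SFPreg sketch: extend z to [x; b; t; u], where -u <= x <= u makes
% sum(u) an upper bound on ||x||_1.
Areg = [A, zeros(size(A,1), n);
        eye(n),  zeros(n,n), zeros(n,m), -eye(n);   %  x - u <= 0
       -eye(n),  zeros(n,n), zeros(n,m), -eye(n)];  % -x - u <= 0
breg  = [bvec; zeros(2*n,1)];
lbreg = [lb; zeros(n,1)];  ubreg = [ub; Inf(n,1)];
% initial step (9):
f9  = [zeros(2*n,1); ones(m,1); lambda*ones(n,1)];
% iteration step (10):
f10 = [zeros(n,1); (1-alpha)*(1 - 2*btilde); ...
       alpha*ones(m,1); alpha*lambda*ones(n,1)];
z = linprog(f10, Areg, breg, [], [], lbreg, ubreg);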

We note that, unlike the Branch and Bound algorithms from [17], Algorithm 1 is not guaranteed to provide the exact solution of (1). However, it is significantly faster.

3. Results

For our tests, we use randomly generated dictionaries with condition numbers of 1000 and 100000; like in [17], we focus on not so well conditioned matrices because they are more difficult to handle by sparse representation algorithms. The test signals are obtained as $y = D x^{\star} + v$, where the solutions $x^{\star}$ have $s$ nonzero coefficients generated randomly following a Gaussian distribution with zero mean and unit variance, placed in random positions; the noise $v$ is Laplacian and its variance is chosen such that the signal to noise ratios (SNR) have values 10, 20, 30 dB, and $\infty$ (the noiseless case). For each condition number, sparsity level $s$, and SNR, we generate 10 distinct dictionaries and solutions $x^{\star}$.
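One possible way to generate such data in MATLAB (a sketch under our assumptions, not necessarily the authors' exact generator; kappa denotes the prescribed condition number and SNR the target value in dB):

% Test-data sketch: dictionary with prescribed condition number via SVD,
% s-sparse Gaussian solution, Laplacian noise scaled to the target SNR.
[U, ~] = qr(randn(m));  [V, ~] = qr(randn(n));
sv = logspace(0, -log10(kappa), m);            % singular values 1 ... 1/kappa
D = U * [diag(sv), zeros(m, n - m)] * V';      % assumes m <= n
xstar = zeros(n, 1);
xstar(randperm(n, s)) = randn(s, 1);           % Gaussian nonzeros, random support
yclean = D * xstar;
w = rand(m, 1) - 0.5;
v = -sign(w) .* log(1 - 2*abs(w));             % unit-scale Laplacian samples
v = v * norm(yclean) / (norm(v) * 10^(SNR/20)); % enforce the SNR in dB
y = yclean + v;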

For the computation of the representation error, the relative error
$$e = \frac{\|y - Dx\|_1}{\|y\|_1} \tag{11}$$
is used (where now $x$ is the computed solution), in accordance with the formulation of the initial problem (1). The name of the algorithm may be added as a subscript, like in $e_{\mathrm{SFP}}$.

We have implemented SFP and SFPreg as shown in Algorithm 1. The initial weight is set to $\alpha = 0.5$ and is multiplied by an update factor $\beta$ at each iteration; other initial weights are used in [18, 21]; by taking $\alpha = 0.5$ we aim to start with equal emphasis on the number of nonzero coefficients and on the representation error; this choice appears also to speed up the convergence. The number of iterations is set to Iter = 1000. We run SFPreg with 50 equally spaced values of the regularization parameter $\lambda$ from 0 to 1. The value for which the error is the smallest is considered the best.

Our two algorithms are compared with the regularized least absolute deviation (RLAD) model [2]
$$\min_{x} \; \|y - Dx\|_1 + \lambda \|x\|_1, \tag{12}$$
solved with linprog. Note that this is a lasso-style relaxed problem, adapted to the $\ell_1$ error norm. Problem (12) is solved for different values of $\lambda$ until the solution has a number of nonzero coefficients equal or close to $s$. After that, problem (12) is solved without the regularization term, keeping fixed the support already found; hence, the solution is optimal for that support. This linear programming approach is generically named RLAD.
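For reference, (12) also reduces to a linear program in the same style, with t bounding the residual and u bounding |x|; a sketch over z = [x; t; u]:

% RLAD sketch: min ||y - D*x||_1 + lambda*||x||_1.
f = [zeros(n,1); ones(m,1); lambda*ones(n,1)];
A = [ D,      -eye(m),     zeros(m,n);   %  D*x - t <= y
     -D,      -eye(m),     zeros(m,n);   % -D*x - t <= -y
      eye(n),  zeros(n,m), -eye(n);      %  x - u <= 0
     -eye(n),  zeros(n,m), -eye(n)];     % -x - u <= 0
bvec = [y; -y; zeros(2*n,1)];
lb = [-Inf(n,1); zeros(m,1); zeros(n,1)];
z = linprog(f, A, bvec, [], [], lb, []);
x = z(1:n);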

The algorithms are implemented in MATLAB and tested on a computer with a 6-core 3.4 GHz processor and 32 GB of RAM.

In Figure 1 we show the effect of the regularization factor $\lambda$ in SFPreg on the error (11). The values for SFP (which corresponds to $\lambda = 0$) are not shown, since they are much larger (with the exception of the noiseless case); in increasing SNR order, they are 0.51, 0.232, and 0.108 for the noisy cases. It can be seen that the smaller the SNR, the more significant the error reduction. Although it is hard to give a general recipe on how $\lambda$ should be chosen, it is visible that the error is small for a fairly large range of values. Hence, it is enough to try only a few of them and take the one with the best error.

Figures 2 and 3 display the difference in relative errors between RLAD and SFPreg for different matrix condition numbers. It can be seen that the differences are significant. In order to obtain the same number of coefficients, RLAD is forced by the regularization to focus more on sparsity than on the error reduction. Both algorithms recover the solution exactly in the noiseless case.

The relative errors (11) for the SFPreg algorithm are displayed in Figures 4 and 5. Obviously, the errors depend on the SNR. The error values are consistently at about the noise level, showing that SFPreg behaves well.

The relative recovery errors, computed with
$$e_x = \frac{\|x^{\star} - x\|_1}{\|x^{\star}\|_1}, \tag{13}$$
are displayed for the SFPreg algorithm in Figures 6 and 7. The recovery errors also depend on the SNR. For low perturbation the signal is recovered with very good accuracy.

The means of the relative errors (11) obtained by running the tests are displayed in Tables 1 and 2. Table 1 shows the results for matrices with a condition number of 1000 and Table 2 for matrices with a condition number of 100000. The first value in each cell is the relative error for SFP, the second (in larger font) is for SFPreg, and the last is for RLAD. It can be seen that SFPreg always has smaller mean errors than RLAD. The SFPreg approach gives an important improvement, in some cases reducing the error by almost an order of magnitude compared to SFP. Also, there are many cases in which the representation error of SFP is much larger than that of RLAD.

In fact, the SFPreg algorithm has smaller representation errors than the RLAD algorithm in all cases with noise, for both condition numbers. It can be seen that the conditioning of the matrix does not affect the methods based on regularization.

Turning now to the complexity of the algorithms, the mean number of iterations is 3.77 for SFP and 3.14 for SFPreg. So, regularization not only helps the algorithm to give better results but also leads to slightly faster convergence.

The number of iterations depends on the selection of $\alpha$. A small $\alpha$ makes SFP and SFPreg converge faster but can cause a higher error. The update factor $\beta$ also affects the speed and accuracy of the algorithms. The values reported above seem to provide a good compromise.

The mean execution time of SFPreg is 0.416 seconds, while SFP requires 0.348 seconds. This is an important improvement compared to solving formulation (3) with the Branch and Cut algorithm. As reported in [17], Branch and Cut requires between 3 and 10000 seconds to find the solution for the same problem. So, SFPreg is significantly faster and still obtains results that are nearly optimal.

4. Conclusions

In this paper an adaptation of the Feasibility Pump has been proposed to address the sparse representation problem with the $\ell_1$ norm of the error. As seen from the experimental results, the addition of the regularization term brings a considerable improvement to the algorithm. The Feasibility Pump gives a good solution in a very small number of iterations. Thus, the Feasibility Pump proves to be a good algorithm for finding sparse representations of signals. Due to regularization and the big-$M$ trick, it limits the magnitude of the coefficients and is thus appropriate also for ill-conditioned problems. Future research will focus on improving the randomization step, using different regularization terms, and adapting the algorithm to other reformulations of sparse problems.

Data Availability

The data used for experiments have been generated as described in the paper, using random numbers. We expect similar results if the data are generated again using the same method.

Conflicts of Interest

The authors declare that they have no conflicts of interest.