Abstract

The set packing problem (SPP) is a significant NP-hard combinatorial optimization problem with extensive applications. In this paper, we encode the set packing problem as the maximum weighted independent set (MWIS) problem and solve the encoded problem with an efficient algorithm designed for the MWIS problem. We compare the independent set-based method with the state-of-the-art algorithms for the set packing problem on 64 standard benchmark instances. The experimental results show that the independent set-based method is superior to the existing algorithms in terms of both the quality of the solutions and the running time needed to obtain them.

1. Introduction

The set packing problem is a classical combinatorial optimization problem that has been studied extensively in recent years. In the set packing problem, there is a set OB of n objects and a collection of m exclusive constraints O1, …, Om, each defined over some of the objects of OB. Each object j ∈ OB is associated with a positive weight cj. The aim of the set packing problem is to find a packing that maximizes the total weight of the selected objects without violating any constraint. The problem arises in various fields, such as routing and scheduling trains at intersections in railway operations [1], selecting winning bids in combinatorial auctions [2], scheduling surgical operations [3], and scheduling and transmitting packets in communication networks [4], among many others.

The set packing problem is an NP-hard problem [5]. Algorithms for this problem fall into two categories: exact and inexact ones. In [6–8], new facets were identified for the polyhedron of the problem, which strengthen the solution of the relaxed problem. Rossi and Smriglio used a branch-and-cut algorithm to solve set packing problems [9]. Kwon et al. proposed an approach for ex-post evaluation of approximate solutions obtained by a well-known simple greedy method for set packing [10]. Kolokolov and Zaozerskaya found polynomial upper bounds on the average number of iterations for the L-class enumeration algorithm and the first Gomory cutting plane algorithm [11]. Landete et al. presented an alternative formulation for the set packing problem in a higher dimension; the addition of a new family of binary variables allowed the authors to find new valid inequalities, some of which were shown to be facets of the polytope in the higher dimension [12]. However, the computational time required by exact approaches generally increases exponentially with the size of the problem, so they can only obtain optimal solutions for relatively small-scale instances. To solve larger-scale instances, heuristic algorithms [13–17], which play an important role in obtaining high-quality solutions to combinatorial optimization problems in a reasonable time [18–22], have been designed. For the set packing problem, Rönnqvist proposed a combination of Lagrangian relaxation and the subgradient method to solve the cutting stock problem, which is an application of the set packing problem [23]. Chandra and Halldórsson proposed a combination of greedy algorithms and local search methods to tackle special cases of the set packing problem [24]. Lau and Goh presented a greedy algorithm for solving the set packing problem [25]. In [26], a greedy randomized adaptive search procedure (GRASP) was proposed, and railway problem instances and random instances were tested to measure its effectiveness. In [27], Gandibleux et al. proposed an ant colony optimization (ACO) method, and random instances were used to evaluate the ACO algorithm. Guo et al. presented a simulated annealing heuristic with three local moves for solving the bidding problem, which can be modelled as the set packing problem [28]. In [29], the set packing problem was modelled as an unconstrained quadratic binary program and Tabu search was proposed to solve it. Later, two improved versions of ACO were proposed in [30, 31]. In [32], an approximation algorithm based on local search methods was proposed. EA/G [33] is a recently proposed evolutionary algorithm that was applied to random instances. Chaurasia et al. presented an evolutionary algorithm-based hyperheuristic framework for solving the set packing problem [34]. Chaurasia and Kim proposed an evolutionary algorithm-based hyperheuristic framework that incorporates dynamic selection of parameters [35]. In [36], a decomposition technique based on constraint partitioning was proposed to exploit the semiblock-angular structure of the set packing problem and to solve the original problem by solving the subproblems of the obtained structure.

Although several heuristic search algorithms have been used to solve the set packing problem, their solution quality still needs to be improved. In particular, the average solution obtained by the existing algorithms differs considerably from the optimal solution. In other words, the solutions obtained by the existing algorithms are not very stable and vary substantially from run to run. Therefore, a new idea is proposed to solve this problem, which is to transform the set packing problem into the maximum weighted independent set problem. As a result, any solving method proposed for the maximum weighted independent set problem can be used to solve the SPP via its independent set formulation. The idea of encoding a problem into an equivalent problem is appealing for two reasons. First, we can solve the set packing problem without designing dedicated set packing algorithms. Second, we can make full use of maximum weighted independent set algorithms to better solve the set packing problem. Furthermore, we can even use different maximum weighted independent set algorithms to enlarge the scale of solvable instances of the set packing problem. In this paper, we solve the set packing problem by using an efficient algorithm proposed previously for the maximum weighted independent set problem. To our knowledge, maximum weighted independent set-based methods have not been used for the set packing problem before.

The rest of this article is structured as follows. In Section 2, we introduce the method of encoding the set packing problem as the maximum weighted independent set problem. Then, in Section 3, we briefly review the algorithm DLSWCC (diversion local search based on weighted configuration checking) that we use for the set packing problem. The experimental results on standard benchmarks and their analysis are presented in Section 4. The conclusions and perspectives for future work are given in Section 5.

2. The Encoding Method

In this section, we introduce the concepts used in this article and the encoding method. According to the definitions in [26, 27], the set packing problem can be described as follows. Given an object set OB = {1, …, n}, where each object i ∈ OB is associated with a positive weight ci, we use Oj to represent a set of exclusive constraints between some objects of OB, where j ∈ J = {1, …, m}. A packing PA is a subset of OB such that |PA ∩ Oj| ≤ 1 for all j ∈ J, i.e., at most one object of Oj can be in PA.

The aim of the set packing problem is to obtain a packing that maximizes the total weight of the contained objects without violating any constraint. Formally, the set packing problem can be formulated as follows:

\[ \max \; z = \sum_{i \in OB} c_i x_i, \tag{1} \]
\[ \text{s.t.} \quad \sum_{i \in OB} o_{i,j} x_i \le 1, \quad \forall j \in J, \tag{2} \]
\[ x_i \in \{0, 1\}, \quad \forall i \in OB, \tag{3} \]
\[ o_{i,j} \in \{0, 1\}, \quad \forall i \in OB, \; \forall j \in J, \tag{4} \]

where xi is a binary variable indicating whether object i is in PA (xi = 1) or not (xi = 0), ci is the weight of object i, and oi,j is a binary indicator of whether object i belongs to the exclusive constraint set Oj (oi,j = 1) or not (oi,j = 0). Formula (1) is the objective function, which maximizes the total weight of the objects belonging to PA. Constraint (2) establishes that at most one object of Oj can be in PA. Constraints (3) and (4) restrict xi and oi,j to binary values.
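As a minimal sketch of this formulation, the following Python snippet evaluates the objective and checks the "at most one object per constraint" condition for a candidate packing. The toy instance and the function names `is_packing` and `packing_weight` are hypothetical and serve only as an illustration.

```python
def is_packing(packing, constraints):
    """Return True if at most one object of every constraint set O_j is selected."""
    chosen = set(packing)
    return all(len(chosen & set(o_j)) <= 1 for o_j in constraints)


def packing_weight(packing, weights):
    """Total weight of the selected objects, i.e., the objective value."""
    return sum(weights[i] for i in packing)


# Hypothetical toy instance: 4 objects, 2 exclusive constraints.
weights = {1: 5, 2: 3, 3: 4, 4: 2}
constraints = [{1, 2}, {2, 3, 4}]

print(is_packing({1, 3}, constraints), packing_weight({1, 3}, weights))
print(is_packing({3, 4}, constraints))  # objects 3 and 4 share a constraint
```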

The set packing problem can be conveniently encoded as the maximum weighted independent set problem. To see this, we first review some basic notation associated with the MWIS problem. Given an undirected graph G = (Vt, Eg, w), Vt = {1, …, n} is the vertex set, Eg ⊂ Vt × Vt is the edge set, w: Vt → Z+ is the vertex weighting function, and each vertex v ∈ Vt is assigned a positive integer weight w(v). An independent set IS of G is a subset of Vt such that no pair of vertices in IS is linked by an edge in Eg, i.e., ∀u, v ∈ IS, (u, v) ∉ Eg. The weight of an independent set IS of G is the sum of the weights of all contained vertices, i.e., w(IS) = Σ_{v ∈ IS} w(v). Then, the maximum weighted independent set problem is to find an independent set with the maximum sum of weights of the contained vertices.

For a set packing problem instance with object set OB = {1, …, n} and exclusive constraints Oj, j ∈ J = {1, …, m}, where each object u ∈ OB is associated with a positive weight cu, we construct a maximum weighted independent set instance (conflict graph) G = (Vt, Eg, w) as follows:
(i) For each object u ∈ OB, define a vertex u ∈ Vt whose weight is equal to cu. That is, Vt = {1, …, n} and, for all u ∈ Vt, w(u) = cu.
(ii) For the exclusive constraints, define the edge matrix Eg by

\[ e_{uv} = \begin{cases} 1, & \text{if } u \ne v \text{ and } \exists j \in J \text{ such that } o_{u,j} = o_{v,j} = 1, \\ 0, & \text{otherwise.} \end{cases} \tag{5} \]

It is easy to see that an edge euv links two vertices u and v if the objects u and v are in the same exclusive constraint, which indicates that the two objects are in conflict and cannot be accepted at the same time. From the edge matrix, we can easily check which objects are included in the same exclusive constraint.
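The encoding can be sketched in a few lines of Python. The function name `build_conflict_graph`, the edge-set representation (instead of a full edge matrix), and the toy instance are assumptions made for illustration; the instance is not the one shown in Figure 1.

```python
from itertools import combinations


def build_conflict_graph(weights, constraints):
    """Encode a set packing instance as a weighted conflict graph.

    Returns (vertex_weights, edges): one vertex per object, and one edge
    for every pair of objects that share an exclusive constraint.
    """
    vertex_weights = dict(weights)  # the vertex weight equals the object weight
    edges = set()
    for o_j in constraints:
        for u, v in combinations(sorted(o_j), 2):
            edges.add((u, v))  # u < v, so each edge is stored once
    return vertex_weights, edges


# Hypothetical instance: 4 objects, 2 exclusive constraints.
weights = {1: 5, 2: 3, 3: 4, 4: 2}
constraints = [{1, 2}, {2, 3, 4}]
vertex_weights, edges = build_conflict_graph(weights, constraints)
print(sorted(edges))
```

Storing edges as a set of ordered pairs keeps the sketch short; a dense 0/1 matrix would carry the same information.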

According to the above transformation, it is not difficult to see that a maximum weighted independent set IS = {v1, …, vr} of the conflict graph G = (Vt, Eg, w) corresponds to a feasible set packing PA = {1, …, r} whose objects have the maximum total weight without violating any constraint. So, any solving algorithm for the maximum weighted independent set problem can be applied to tackle the set packing problem.
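On a tiny instance, this correspondence can be checked by exhaustive enumeration. The sketch below is only a sanity check under assumed data: the instance and the helper `best_subset` are hypothetical, and the enumeration is exponential, so it is not a practical solver.

```python
from itertools import combinations


def best_subset(items, weights, feasible):
    """Exhaustively find a maximum-weight feasible subset (tiny instances only)."""
    best, best_w = set(), 0
    for r in range(1, len(items) + 1):
        for subset in combinations(items, r):
            w = sum(weights[i] for i in subset)
            if w > best_w and feasible(set(subset)):
                best, best_w = set(subset), w
    return best, best_w


weights = {1: 5, 2: 3, 3: 4, 4: 2}
constraints = [{1, 2}, {2, 3, 4}]
# Conflict graph: one edge per pair of objects sharing a constraint.
edges = {(u, v) for o in constraints for u, v in combinations(sorted(o), 2)}


def is_packing(s):
    return all(len(s & o) <= 1 for o in constraints)


def is_independent(s):
    return not any(u in s and v in s for u, v in edges)


spp = best_subset(sorted(weights), weights, is_packing)
mwis = best_subset(sorted(weights), weights, is_independent)
print(spp, mwis)  # both searches return the same set and weight
```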

To further illustrate the encoding, we give an example, i.e., a set packing problem instance with 6 objects and 4 constraints in Figure 1(a). The corresponding conflict graph G = (Vt, Eg, w) for this set packing problem instance is shown in Figure 1(b), where each object is represented by a vertex whose weight is equal to that of the object. An edge links two vertices if they are included in the same constraint. It is clear that the optimal solution to the maximum weighted independent set problem defined by the conflict graph is the vertex set {v1, v3, v6}, which represents the set of objects {1, 3, 6} with a maximum weight of 29.

3. Diversion Local Search Based on Weighted Configuration Checking for SPP

Given a set packing problem instance, we can encode it as a maximum weighted independent set problem instance according to the previous section. So, any algorithm for the maximum weighted independent set problem can be used to solve the set packing problem. The maximum weighted independent set problem has two equivalent problems, namely, the minimum weighted vertex cover (MWVC) problem and the maximum weighted clique (MWC) problem. Methods for solving the MWVC problem can be directly applied to tackle the MWIS problem, because the complement Vt\C of a vertex cover C is an independent set, so a minimum weighted vertex cover yields a maximum weighted independent set. In this paper, an efficient local search algorithm called “Diversion Local Search based on Weighted Configuration Checking (DLSWCC)” is used to solve the MWIS problem [37]. The algorithm is highly efficient for the minimum weighted vertex cover problem and, through the encoding, for the set packing problem. Let us briefly review the key components of the DLSWCC algorithm. For a comprehensive description, readers can refer to [37].
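The link between the MWVC and MWIS problems rests on complementation: the vertices left out of a vertex cover form an independent set, and the two weights sum to the total vertex weight. A minimal Python illustration follows; the graph, the weights, and the function names are hypothetical.

```python
def is_vertex_cover(cover, edges):
    """Every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


def is_independent_set(s, edges):
    """No edge has both endpoints in s."""
    return not any(u in s and v in s for u, v in edges)


def cover_to_independent_set(cover, vertices):
    """The complement of a vertex cover is an independent set."""
    return set(vertices) - set(cover)


# Hypothetical weighted graph.
vertices = {1, 2, 3, 4}
edges = {(1, 2), (2, 3), (2, 4), (3, 4)}
weights = {1: 5, 2: 3, 3: 4, 4: 2}

cover = {2, 4}
independent = cover_to_independent_set(cover, vertices)
cover_weight = sum(weights[v] for v in cover)
independent_weight = sum(weights[v] for v in independent)
print(independent, cover_weight, independent_weight)
```

Because the two weights always add up to the total weight of all vertices, minimizing the cover weight is the same as maximizing the weight of the complementary independent set.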

3.1. Dynamic Scoring Strategy

The quality of a candidate solution can be improved by selecting appropriate vertices to add to or remove from it, which in turn improves the performance of the local search algorithm. DLSWCC uses a dynamic scoring strategy to evaluate the benefit of adding a vertex to, or deleting a vertex from, the candidate solution. This strategy relies on a dynamic edge weighting mechanism. The definition of dynamic edge weight is given below.

Definition 1 (dynamic edge weight). Given an undirected graph G = (Vt, Eg, w), each edge e ∈ Eg is assigned a weight denoted by dynmc_w(e), and the weight is dynamically updated during the local search.
Specifically, we abide by the following two rules to update edge weights.
W_Rule1: the dynmc_w(e) of each edge e ∈ Eg is initialized as 1.
W_Rule2: the dynmc_w(e) is increased by 1 if edge e is not covered by the candidate solution at the end of each loop.
On the basis of Definition 1, assuming that the vertex subset C ⊆ Vt is a candidate solution, the Boolean function cover(e, C) is used to indicate whether edge e ∈ Eg is covered by candidate solution C, i.e., whether at least one of e’s endpoints is a member of C. The quality of the candidate solution C is measured by cost(C), which is defined by the following formula:

\[ cost(C) = \sum_{e \in Eg,\; cover(e, C) = \text{false}} dynmc\_w(e). \tag{6} \]

From Formula (6), we can see that cost(C) represents the sum of the weights of the edges left uncovered by candidate solution C. The candidate solution C is feasible if cost(C) is equal to 0.
The score of a vertex is crucial for choosing which vertex to add to or remove from the candidate solution. In algorithm DLSWCC, the authors use score(v) to denote the score of vertex v, which is computed from the difference between cost(C) and cost(C′); the exact expression is given in [37]. Here, C is the current candidate solution; if v belongs to C, then C′ is the candidate solution after removing vertex v from C; otherwise, C′ is the candidate solution after adding vertex v to C.
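The following Python sketch illustrates cost(C) and a vertex score on a small graph. The cost function follows the description of Formula (6). The score function is an assumption: it uses the plain difference cost(C) − cost(C′), whereas the formula in [37] may additionally normalise by the vertex weight. The graph and all names are hypothetical.

```python
def cost(candidate, edges, dynmc_w):
    """Sum of the dynamic weights of the edges not covered by the candidate."""
    return sum(dynmc_w[e] for e in edges
               if e[0] not in candidate and e[1] not in candidate)


def score(v, candidate, edges, dynmc_w):
    """Change in cost when the membership of v in the candidate is flipped.

    Assumption: the unnormalised difference cost(C) - cost(C') is used;
    the original formula may also scale by the vertex weight.
    """
    flipped = candidate ^ {v}  # remove v if present, add it otherwise
    return cost(candidate, edges, dynmc_w) - cost(flipped, edges, dynmc_w)


edges = [(1, 2), (2, 3), (2, 4), (3, 4)]
dynmc_w = {e: 1 for e in edges}  # W_Rule1: every edge weight starts at 1
C = {2}

print(cost(C, edges, dynmc_w))      # only edge (3, 4) is uncovered
print(score(3, C, edges, dynmc_w))  # adding vertex 3 covers edge (3, 4)
print(score(2, C, edges, dynmc_w))  # removing vertex 2 uncovers three edges
```

Under this assumed form, a positive score means that flipping the vertex reduces the uncovered weight, and a negative score means that it increases it.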

3.2. Weighted Configuration Checking Strategy

The cycling problem refers to revisiting a scenario that has just been visited during the local search phase. Cycling traps the algorithm in a local optimum, wastes time, and reduces the performance of the algorithm. Many researchers have worked on how to avoid the cycling problem. In [38], Cai et al. proposed the configuration checking (CC) strategy, which takes the environmental information of a vertex into account to avoid cycling. Up to now, the CC strategy has been successfully used to tackle many combinatorial optimization problems, e.g., the minimum vertex cover, SAT, MaxSAT, and set cover problems [39–41].

However, applying the CC strategy directly to the MWVC problem prevents some promising vertices from being added into the candidate solution, thus misleading the search. That is, the original CC strategy is too restrictive. DLSWCC therefore uses a variant of the CC strategy, namely, the weighted configuration checking (WCC) strategy. The concept of weighted configuration is given below.

Definition 2 (weighted configuration). Given an undirected graph G = (Vt, Eg), each edge is associated with a weight, and C is the candidate solution. The weighted configuration of a vertex v is defined as the states of all of v’s neighbours together with the weights of the edges linking v to its neighbours.
In order to implement the weighted configuration checking (WCC) strategy, we use an array wcnfg to record whether the weighted configuration of each vertex has changed since it last left C. Each element of the array is a binary variable. For a vertex v, wcnfg[v] = 1 indicates that the weighted configuration of vertex v has changed, and wcnfg[v] = 0 indicates that it has not. We update the wcnfg array according to the following four rules:
(i) WCC_Rule1: in the initialization phase, the wcnfg value of each vertex is set to 1.
(ii) WCC_Rule2: if vertex v is removed from C, then the wcnfg value of v is set to 0 and the wcnfg value of each neighbour of v is set to 1.
(iii) WCC_Rule3: if vertex v is added into C, then the wcnfg value of each neighbour of v is set to 1.
(iv) WCC_Rule4: if the weight dynmc_w[e] of edge e is updated, then the wcnfg values of the two vertices u and v linked by edge e are set to 1.
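The four update rules translate almost directly into code. The Python sketch below uses a dictionary in place of the wcnfg array; the function names and the adjacency structure `neighbours` are hypothetical names chosen for illustration.

```python
def init_wcnfg(vertices):
    """WCC_Rule1: every vertex starts with wcnfg = 1."""
    return {v: 1 for v in vertices}


def on_remove(v, wcnfg, neighbours):
    """WCC_Rule2: v left C, so reset v and flag all of its neighbours."""
    wcnfg[v] = 0
    for u in neighbours[v]:
        wcnfg[u] = 1


def on_add(v, wcnfg, neighbours):
    """WCC_Rule3: v entered C, so flag all of its neighbours."""
    for u in neighbours[v]:
        wcnfg[u] = 1


def on_weight_update(edge, wcnfg):
    """WCC_Rule4: the weight of an edge changed, so flag both endpoints."""
    u, v = edge
    wcnfg[u] = 1
    wcnfg[v] = 1


# Hypothetical path graph 1 - 2 - 3.
neighbours = {1: {2}, 2: {1, 3}, 3: {2}}
wcnfg = init_wcnfg(neighbours)
on_remove(2, wcnfg, neighbours)   # vertex 2 may not re-enter C for now
state_after_remove = dict(wcnfg)
on_add(3, wcnfg, neighbours)      # a neighbour changed, vertex 2 is allowed again
print(state_after_remove, wcnfg)
```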

3.3. Vertex Selection Strategy

In this subsection, we introduce the vertex selection strategy, which combines the dynamic scoring strategy with the weighted configuration checking strategy. Before describing this strategy, we introduce the concept of age. The age of a vertex is the number of iterations that have elapsed since the vertex’s state last changed.

In the local search phase, we use the following two rules to select suitable vertices to add to or remove from the candidate solution.
(i) Rmv_Rule: the vertex with the highest score is selected from the candidate solution; if several vertices share the highest score, the one with the greatest age is selected. Then, the wcnfg values of the selected vertex and its neighbours are modified according to WCC_Rule2.
(ii) Add_Rule: the vertex with the highest score and a wcnfg value of 1 is selected from the vertices outside the candidate solution; if several vertices qualify, the one with the greatest age is selected. Then, the wcnfg values of its neighbours are modified according to WCC_Rule3.
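The two selection rules can be sketched as follows. The dictionaries `score`, `age`, and `wcnfg`, the function names, and the sample values are hypothetical; scores are assumed to be precomputed.

```python
def select_to_remove(candidate, score, age):
    """Rmv_Rule: highest score in C, ties broken by the greatest age."""
    return max(candidate, key=lambda v: (score[v], age[v]))


def select_to_add(vertices, candidate, score, age, wcnfg):
    """Add_Rule: highest score outside C among vertices with wcnfg == 1,
    ties broken by the greatest age. Returns None if no vertex qualifies."""
    allowed = [v for v in vertices if v not in candidate and wcnfg[v] == 1]
    if not allowed:
        return None
    return max(allowed, key=lambda v: (score[v], age[v]))


# Hypothetical state of a search over four vertices.
vertices = [1, 2, 3, 4]
candidate = {1, 2}
score = {1: -2, 2: -1, 3: 2, 4: 2}
age = {1: 3, 2: 1, 3: 5, 4: 7}
wcnfg = {1: 1, 2: 1, 3: 1, 4: 0}

removed = select_to_remove(candidate, score, age)
added = select_to_add(vertices, candidate, score, age, wcnfg)
print(removed, added)
```

In this example, vertex 4 ties with vertex 3 on score and is older, but it is skipped because its weighted configuration has not changed (wcnfg = 0).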

3.4. DLSWCC Algorithm

In this subsection, we review the main idea of the DLSWCC algorithm; the corresponding pseudocode is shown in Algorithm 1. First, we construct the initial solution C by a greedy method. Then, a perturbing approach is applied to the initial solution C to improve its quality. We use w(C) to represent the objective value of the candidate solution C. We use UB to record the objective value of the best solution found so far and initialize UB to w(C). Clearly, if a better solution exists, its objective value must be less than UB. In the DLSWCC algorithm, once the initial candidate solution has been built, we remove vertices from the candidate solution until it becomes infeasible and its objective value is less than UB. Then, we exchange vertices in C with vertices in Vt\C according to Rmv_Rule and Add_Rule until C is a feasible solution. At this stage, if a better solution is found, the value of UB is updated. At the end of each loop, the algorithm checks whether each edge is covered by the current solution; if not, it increases the weight dynmc_w of each uncovered edge by 1, thus giving the “hard to cover” edges a better chance of being covered by the new candidate solution in future iterations and helping the algorithm escape from local optima.

Algorithm 1: DLSWCC.
initialize wcnfg array according to WCC_Rule1;
initialize the dynmc_w of each edge as 1;
initialize the score of each vertex as the degree of the vertex;
initialize the candidate solution C greedily;
UB ⟵ w(C);
C* ⟵ C;
iter ⟵ 0;
while stop criterion is not satisfied do
  while C covers all edges do
    UB ⟵ w(C);
    C* ⟵ C;
    v ⟵ the vertex x with the greatest score in C, breaking ties in favor of the oldest one;
    C ⟵ C\{v};
    update wcnfg array according to WCC_Rule2;
  end while
  v ⟵ the vertex x with the greatest score in C that is not in tabu_list, breaking ties in favor of the oldest one;
  C ⟵ C\{v};
  update wcnfg array according to WCC_Rule2;
  clear tabu_list;
  while C leaves some edges uncovered do
    v ⟵ the vertex x with the greatest score not in C and with wcnfg[x] = 1, breaking ties in favor of the oldest one;
    if w(C) + w(v) ≥ UB then break;
    C ⟵ C ∪ {v};
    update wcnfg array according to WCC_Rule3;
    dynmc_w[e] ⟵ dynmc_w[e] + 1, for each edge e uncovered by C;
    update wcnfg array according to WCC_Rule4;
    add v into tabu_list;
  end while
  iter ⟵ iter + 1;
end while
return C*;
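To make the control flow of Algorithm 1 concrete, here is a deliberately simplified Python sketch for the minimum weighted vertex cover problem. It is not the implementation of [37]: the score is assumed to be the plain cost difference, scores are recomputed from scratch instead of being maintained incrementally, the greedy construction ignores vertex weights, and the stop criterion is a fixed iteration count. The function name `dlswcc_sketch` and the test graph are hypothetical.

```python
def dlswcc_sketch(vertices, edges, w, max_iter=1000):
    """Simplified local search for minimum weighted vertex cover."""
    vertices = sorted(vertices)
    nbrs = {v: set() for v in vertices}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    dynmc_w = {e: 1 for e in edges}   # W_Rule1
    wcnfg = {v: 1 for v in vertices}  # WCC_Rule1
    age = {v: 0 for v in vertices}    # iteration of the last state change

    def uncovered(C):
        return [e for e in edges if e[0] not in C and e[1] not in C]

    def cost(C):
        return sum(dynmc_w[e] for e in uncovered(C))

    def score(v, C):
        # Assumed form: plain difference cost(C) - cost(C').
        return cost(C) - cost(C ^ {v})

    def weight(C):
        return sum(w[v] for v in C)

    def pick(cands, C):
        # Greatest score; ties go to the oldest vertex (smallest timestamp).
        return max(cands, key=lambda v: (score(v, C), -age[v]))

    def remove(v, C, it):
        C.discard(v)
        wcnfg[v] = 0                  # WCC_Rule2
        age[v] = it
        for u in nbrs[v]:
            wcnfg[u] = 1

    C = set()
    while uncovered(C):               # greedy initial cover
        C.add(pick([v for v in vertices if v not in C], C))
    best, ub = set(C), weight(C)
    tabu = set()
    for it in range(1, max_iter + 1):
        while C and not uncovered(C):  # record the cover, then shrink it
            best, ub = set(C), weight(C)
            remove(pick(C, C), C, it)
        cands = [v for v in C if v not in tabu] or list(C)
        if cands:
            remove(pick(cands, C), C, it)
        tabu.clear()
        while uncovered(C):
            allowed = [v for v in vertices if v not in C and wcnfg[v] == 1]
            if not allowed:
                break
            v = pick(allowed, C)
            if weight(C) + w[v] >= ub:
                break
            C.add(v)
            age[v] = it
            for u in nbrs[v]:          # WCC_Rule3
                wcnfg[u] = 1
            for e in uncovered(C):     # W_Rule2 and WCC_Rule4
                dynmc_w[e] += 1
                wcnfg[e[0]] = wcnfg[e[1]] = 1
            tabu.add(v)
    return best, ub


edges = [(1, 2), (2, 3), (2, 4), (3, 4)]
w = {1: 5, 2: 3, 3: 4, 4: 2}
cover, ub = dlswcc_sketch([1, 2, 3, 4], edges, w)
independent_set = {1, 2, 3, 4} - cover  # solution of the encoded MWIS problem
print(cover, ub, independent_set)
```

The returned cover is always feasible because the best solution is recorded only when every edge is covered; its complement is the independent set, and hence the packing, of the encoded instance.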

4. Computational Results

In this section, we report extensive experimental results obtained by using the DLSWCC algorithm to solve the set packing problem, encoded as a maximum weighted independent set problem, on a large number of standard set packing benchmarks. Further, DLSWCC is compared with several state-of-the-art algorithms proposed in the literature. Finally, we test the effectiveness of the dynamic scoring strategy and the weighted configuration checking strategy.

4.1. Reference Algorithms and Experimental Protocol

We compare DLSWCC with the current best solving approaches, i.e., CPLEX, the GRASP approach [26], the ACO approach [27], and the EA/G approach [33]. In this study, our DLSWCC algorithm is implemented in C and executed on a computer with an Intel(R) Xeon(R) E7-4830 CPU at 2.13 GHz. The system used to execute the ACO and GRASP approaches [27] is a Pentium III at 800 MHz. The EA/G approach [33] is implemented in C and executed on a 3.0 GHz Core 2 Duo system with 2 GB RAM running Fedora 12. For each instance, the cutoff condition for each execution of DLSWCC is either a cutoff time of 3600 s or a maximum of 1000000 iterations. Like the EA/G [33], GRASP [26], and ACO [27] approaches, DLSWCC is run 16 times independently on each instance.

Railway problem instances and random instances are the two main types of standard benchmarks for the set packing problem [26]. As far as we know, the railway problem instances were not made public because they contain confidential data related to French railways. Only the random instance data are public. Therefore, we report the experimental results of the DLSWCC algorithm on random instances only.

4.2. Comparison with State-of-the-Art Algorithms

Tables 1 and 2 provide the instance characteristics and the results found by CPLEX method (CPLEX), GRASP method [26] (GRASP), ACO method [27] (ACO), EA/G method [33] (EA/G), and our DLSWCC method (DLSWCC). Table 1 shows the experimental results of small-scale problem instances with 100 and 200 variables, while Table 2 shows the experimental results of medium-scale problem instances with 500 and 1000 variables.

In Tables 1 and 2, column Var indicates the number of variables, column Cnst indicates the number of constraints, and column Density indicates the percentage of nonnull elements in the constraint matrix. Column M_One indicates the number of elements in the largest set Oj, where j ∈ {1, …, m}, and column Weight gives the range of object weights in each instance. Note that the instances whose ranges are [1-1] are instances of the unicost set packing problem. The CPLEX solver can solve all small-scale instances (those with at most 200 variables). In Table 1, column Opt indicates the optimal solutions found by CPLEX and column TET indicates the time needed to obtain the optimal solution. For the medium-scale instances (those with 500 and 1000 variables), CPLEX cannot solve all of them. In such cases, we report the best known value in Table 2. When CPLEX cannot obtain the optimal solution, the best solution value found is marked by an asterisk (*). For the GRASP, ACO, EA/G, and DLSWCC methods, column Best indicates the best solution found, column Avrg indicates the average solution quality, and column ATET indicates the average execution time in seconds over 16 runs. Column hit gives the number of DLSWCC runs that reach its best value. Results of the CPLEX, GRASP, and ACO methods are taken from [27]; results of the EA/G approach are taken from [33]. The bold values indicate the best solution values obtained among the compared algorithms; bold values in Tables 3 and 4 have the same meaning.
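The per-instance statistics reported in the tables can be computed from the objective values of the independent runs as sketched below. The function name `summarise_runs` and the sample values are hypothetical and are not taken from the tables; since the set packing problem is a maximization problem, the best run is the one with the largest value.

```python
from statistics import mean


def summarise_runs(values):
    """Compute the Best, Avrg, and hit statistics for one instance."""
    best = max(values)                 # best objective value over all runs
    return {
        "Best": best,
        "Avrg": mean(values),          # average solution quality
        "hit": values.count(best),     # number of runs reaching the best value
    }


# Hypothetical objective values of four runs on one instance.
runs = [29, 29, 28, 29]
summary = summarise_runs(runs)
print(summary)
```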

Tables 1 and 2 clearly show that the DLSWCC method is superior to the EA/G, GRASP, and ACO methods in solution quality. Out of a total of 56 instances, DLSWCC obtained a better best solution than EA/G on 3 instances and the same best solution on the rest. In terms of average solution quality, DLSWCC is superior to EA/G on 39 instances and the same as EA/G on the rest. DLSWCC obtained a better best solution than ACO on 7 instances and the same best solution on the rest. In terms of average solution quality, DLSWCC is superior to ACO on 42 instances and the same as ACO on the rest. DLSWCC obtained a better best solution than GRASP on 13 instances and the same best solution on the rest. In terms of average solution quality, DLSWCC is superior to GRASP on 32 instances and the same as GRASP on the rest. On the whole, in terms of the best solution, the DLSWCC method is superior to all three comparison algorithms (EA/G, ACO, and GRASP) simultaneously on 2 instances. Similarly, in terms of the average solution quality, the DLSWCC method is superior to all three comparison algorithms on 30 instances. More significantly, in Table 2, DLSWCC sometimes gives better values than CPLEX when the CPLEX values are marked with an asterisk.

Note that the GRASP and ACO methods [27] were run on a Pentium III at 800 MHz and the EA/G method was run on a 3.0 GHz system under Fedora 12, both of which differ from the system used to run DLSWCC. Therefore, the running times cannot be compared accurately, and we only make a rough comparison. Our approach is faster than the EA/G, GRASP, and ACO methods on the majority of the instances.

There are also 8 random instances with 2000 variables (large-scale instances), which were not used to test the ACO algorithm in [27]. However, in [26, 33], these instances were used to test the GRASP and EA/G algorithms. Table 3 shows the results of CPLEX, GRASP, EA/G, and DLSWCC on these instances. The results for CPLEX and GRASP are taken from [26]. When CPLEX cannot obtain the optimal solution, the best solution value found is marked by an asterisk (*). For the GRASP algorithm, the corresponding column is marked “yes” if it obtains the best known solution and “no” otherwise. Out of the 8 instances, GRASP obtains the best known values on 5 instances. DLSWCC, on the other hand, obtains values as good as or better than the best known values on all instances. On 2 instances, DLSWCC even improves the best known values. In terms of running time, our approach is slightly slower than EA/G.

4.3. Comparison of Different Versions of DLSWCC

To study the effectiveness of the dynamic scoring strategy and the weighted configuration checking strategy, we compare DLSWCC with three other alternative algorithms named DLSWCC_STATIC, DLSNOCC, and DLSECC.

In DLSWCC_STATIC, the scoring method works with a static scoring strategy, i.e., the weight of each edge will not be updated. DLSNOCC works without the weighted configuration checking strategy, i.e., it selects the vertex with the greatest score, breaking ties in favor of the oldest one during the adding procedure. DLSECC works with a straightforward extension of the configuration checking strategy instead of the weighted configuration checking strategy. We tested the three algorithms on large-scale instances over 16 runs with different random seeds per instance. The results are summarised in Table 4.

From Table 4, by comparing the experimental results of DLSNOCC and DLSECC, we can see the effectiveness of the configuration checking strategy. By comparing the experimental results of DLSECC and DLSWCC, we can see the effectiveness of the weighted configuration checking strategy. By comparing the experimental results of DLSWCC_STATIC and DLSWCC, we can see the effectiveness of the dynamic scoring strategy.

From the above comparison, we conclude that the DLSWCC algorithm is superior to the EA/G, ACO, and GRASP algorithms mainly because it adopts the weighted configuration checking strategy and the dynamic scoring strategy, which effectively prevent the cycling problem and keep the algorithm from becoming trapped in local optima.

5. Conclusions

The set packing problem is a significant combinatorial optimization problem with many real applications. In this paper, we have studied, for the first time to our knowledge, the method of solving the SPP by encoding it as the maximum weighted independent set problem and tackling it with an existing maximum weighted independent set algorithm (DLSWCC). Compared with the current best solving algorithms for the SPP (EA/G, GRASP, and ACO), our method yields the best results. In terms of both the best solution and the average solution, our method has clear advantages over the comparison methods. In terms of solving time, a rough comparison across different hardware indicates that our method is faster than the comparison methods on the majority of the small- and medium-scale instances, although it is slightly slower than EA/G on the large-scale instances.

In future work, our method can be extended to solve other combinatorial optimization problems, such as the dominating set problem [42, 43], the generalized vertex cover problem [44], the maximum diversity problem [45], the maximum edge weighted clique problem [46], and the multiobjective unconstrained binary quadratic programming problem [47].

Data Availability

The set packing problem data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant nos. 61806082, 61972063, and 61976050, the National Social Foundation of China under Grant nos. 19BJY246 and 20BTJ062, the Certificate of China Postdoctoral Science Foundation under Grant no. 2019M651208, and the Jilin Education Department 13th Five-Year Science and Technology Project under no. JJKH20190726KJ.