Improved Strategy for High-Utility Pattern Mining Algorithm
Algorithm 4
The Search-IMP procedure.
Input: α: an itemset; α-D: the α projected database; Primary(α): the primary items of α; Secondary(α): the secondary items of α; MinU: a user-specified threshold.
Output: the set of high-utility itemsets that are extensions of α.
(1) Record the transaction ID and item index of each item in two lists (TransList and ItemList) by scanning
(2) for each item i ∈ Primary(α) do
(3)     β = α ∪ {i};
(4)     Scan α-D to calculate u(β) and create β-D; // uses transaction merging
(5)     if u(β) ≥ MinU then output β;
(6)     Calculate su(β, z) and lu(β, z) for all items z ∈ Secondary(α) by scanning β-D once, using two utility-bin arrays;
(7)     Primary(β) = {z ∈ Secondary(α) | su(β, z) ≥ MinU};
(8)     Secondary(β) = {z ∈ Secondary(α) | lu(β, z) ≥ MinU};
(9)     count0 = number of items in Secondary(α);
(10)    count1 = number of items in Secondary(β);
(11)    while count0 - count1 > 0 do
(12)        Remove each item z ∉ Secondary(β) from the transactions in β-D;
(13)        Calculate su(β, z) and lu(β, z) for all items z ∈ Secondary(β) by scanning β-D;
(14)        Secondary(β) = {z | z ∈ Secondary(β) ∧ lu(β, z) ≥ MinU};
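
The iterative pruning in steps (8) to (14) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions. Each projected transaction is modelled as a hypothetical pair (prefix_utility, items), where items lists the (item, utility) pairs that can still extend β. The local utility lu(β, z) is taken to be the sum, over the transactions containing z, of the prefix utility plus the utilities of all remaining items. The names local_utilities and refine_secondary are invented for illustration. Because the listing ends at step (14), the way count0 and count1 are refreshed inside the loop is also an assumption. The su computation and the two index lists from step (1) are left out.

```python
from collections import defaultdict


def local_utilities(projected_db):
    """Accumulate lu(beta, z) for every item z in one scan (a dict acts as the utility bin)."""
    lu = defaultdict(int)
    for prefix_utility, items in projected_db:
        # Assumed bound: utility of beta in this transaction plus all remaining item utilities.
        transaction_bound = prefix_utility + sum(u for _, u in items)
        for item, _ in items:
            lu[item] += transaction_bound
    return lu


def refine_secondary(projected_db, secondary_alpha, min_u):
    """Shrink Secondary(beta) repeatedly until a pass removes no further item."""
    lu = local_utilities(projected_db)
    # Step (8): first filter of the secondary items.
    secondary_beta = {z for z in secondary_alpha if lu[z] >= min_u}
    # Steps (9)-(10): sizes before and after filtering.
    count0, count1 = len(secondary_alpha), len(secondary_beta)
    while count0 - count1 > 0:  # step (11)
        # Step (12): drop items that are no longer secondary.
        projected_db = [
            (prefix_utility, [(i, u) for i, u in items if i in secondary_beta])
            for prefix_utility, items in projected_db
        ]
        # Step (13): recompute local utilities on the reduced database.
        lu = local_utilities(projected_db)
        # Assumption: the counters are refreshed like this; the listing stops before showing it.
        count0 = count1
        # Step (14): keep only items whose local utility still reaches MinU.
        secondary_beta = {z for z in secondary_beta if lu[z] >= min_u}
        count1 = len(secondary_beta)
    return projected_db, secondary_beta


# Toy projected database: (utility of beta in the transaction, remaining items).
example_db = [
    (5, [("a", 4), ("b", 6)]),
    (5, [("a", 3), ("c", 2)]),
    (2, [("b", 10), ("d", 9)]),
]
```

On this toy database with a threshold of 24, the single filter of step (8) keeps a and b. Once c and d are removed from the transactions, the recomputed local utility of a falls to 23, and on the next pass that of b falls to 23 as well, so the loop ends with an empty secondary set. This is the kind of tighter pruning that repeating the scan is meant to achieve.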