Research Article
A Deep Random Forest Model on Spark for Network Intrusion Detection
Input: training dataset D = {(x1, y1), (x2, y2), …, (xm, ym)};
       x: potentially anomalous sample.
Output: H(x): voting result for sample x;
        CPRF: deep random forest {PRFi | i = 1, 2, …, N}.
(1)  CPRF = ∅
(2)  Initialize hyperparameters: tolerance t and sliding window size winSize
(3)  D′ = FeatureGrained(D) // D′ is the newly generated feature vector
(4)  i = 1 // layer i of the cascaded PRF
(5)  do
(6)      PRFi = ∅
(7)      for j = 1, 2, …, T do
(8)          D′j ⟵ BootstrapSampling(D′)
(9)          Treej ⟵ DecisionTree(D′j)
(10)         PRFi = PRFi ∪ {Treej}
(11)     end for
(12)     if (tolerance ≥ t)
(13)         CPRF = CPRF ∪ {PRFi}
(14)     else
(15)         break
(16)     i = i + 1
(17) while (TRUE)
(18) H(x) = VotingMethod(x) // the last layer votes for classification
(19) Return CPRF
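The cascade construction above can be illustrated with a minimal single-machine sketch. It is not the Spark implementation described in the article, and it rests on several assumptions: one-split trees on a randomly chosen feature stand in for full decision trees, the multi-grained feature step (3) is omitted, each layer passes its vote fractions to the next layer as extra features, and the test in step (12) is read as "the gain in training accuracy over the previous layer is at least `tol`". The names `fit_stump`, `fit_forest`, `vote_fractions`, `fit_cascade`, `predict`, `tol`, `max_layers`, and `seed` are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, rng):
    """One-split tree on a random feature (assumed stand-in for a full decision tree)."""
    f = rng.randrange(len(rows[0]))
    majority = Counter(labels).most_common(1)[0][0]
    best = (-1, None, majority, majority)  # (correct count, threshold, left label, right label)
    for thr in sorted({r[f] for r in rows}):
        left = Counter(y for r, y in zip(rows, labels) if r[f] <= thr)
        right = Counter(y for r, y in zip(rows, labels) if r[f] > thr)
        if not right:  # threshold at the maximum value gives no real split
            continue
        (l_lab, l_n), (r_lab, r_n) = left.most_common(1)[0], right.most_common(1)[0]
        if l_n + r_n > best[0]:
            best = (l_n + r_n, thr, l_lab, r_lab)
    _, thr, l_lab, r_lab = best
    return lambda r: l_lab if thr is None or r[f] <= thr else r_lab

def fit_forest(rows, labels, n_trees, rng):
    """Steps (6)-(11): grow each tree on its own bootstrap sample."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def vote_fractions(forest, row, classes):
    """Fraction of trees voting for each class, in the order given by `classes`."""
    votes = Counter(tree(row) for tree in forest)
    return [votes[c] / len(forest) for c in classes]

def fit_cascade(rows, labels, n_trees=25, tol=0.01, max_layers=5, seed=0):
    """Steps (4)-(17): add layers while the accuracy gain is at least `tol`."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    cascade, best_acc = [], 0.0
    feats = [list(r) for r in rows]
    for _ in range(max_layers):  # bounded loop instead of while (TRUE)
        forest = fit_forest(feats, labels, n_trees, rng)
        probs = [vote_fractions(forest, r, classes) for r in feats]
        hits = sum(classes[p.index(max(p))] == y for p, y in zip(probs, labels))
        acc = hits / len(labels)  # training accuracy, a simplification
        if cascade and acc - best_acc < tol:
            break  # gain too small: discard this layer and stop
        cascade.append(forest)
        best_acc = acc
        # Next layer sees the original features plus this layer's vote fractions.
        feats = [list(r) + p for r, p in zip(rows, probs)]
    return cascade, classes

def predict(cascade, classes, row):
    """Step (18): the last layer's vote decides the class."""
    feats = list(row)
    for forest in cascade:
        probs = vote_fractions(forest, feats, classes)
        feats = list(row) + probs
    return classes[probs.index(max(probs))]

# Toy data: one feature, two well-separated classes.
X = [[0], [1], [2], [3], [10], [11], [12], [13]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
cascade, classes = fit_cascade(X, y)
print(len(cascade), predict(cascade, classes, [0]), predict(cascade, classes, [13]))
```

The first layer is always kept, so the cascade is never empty. A held-out or cross-validated accuracy estimate would be a more faithful stopping signal than training accuracy, which is used here only to keep the sketch short.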