Research Article

A Deep Random Forest Model on Spark for Network Intrusion Detection

Algorithm 1: FS-DPRF.
Input: training dataset D = {(x1, y1), (x2, y2), …, (xm, ym)};
  x: potentially anomalous sample.
Output: H (x): voting result for sample x;
  CPRF: deep random forest, CPRF = {PRFi | i = 1, 2, …, N}.
(1) CPRF = {∅}
(2) Initialize hyperparameters: tolerance t and slice window size winSize
(3) D′ = Feature Grained (D) // D′ is the newly generated feature vector
(4) i = 1 // layer i of the cascaded PRF
(5) do
(6)   PRFi = {∅}
(7)   for j = 1, 2, …, T do
(8)     Dj ⟵ Bootstrap sampling (D′)
(9)     Treej ⟵ decision tree (Dj)
(10)     PRFi += {Treej}
(11)   end for
(12)   if (tolerance ≥ t)
(13)     CPRF += {PRFi}
(14)   else
(15)     break
(16)   end if
(17)   i = i + 1
(18) while (TRUE)
(19) H (x) = voting method (x) // the last layer votes for classification
(20) Return CPRF
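
The control flow of Algorithm 1 can be illustrated with a minimal, single-machine Python sketch. It is not the authors' Spark implementation and rests on several assumptions: the feature-graining step is omitted (the input is taken to be D′ already), a one-split decision stump stands in for a full decision tree, "tolerance" is interpreted as the gain in validation accuracy over the previous layer, the first layer is always kept, and a hypothetical `max_layers` cap replaces the unbounded `while (TRUE)`. All function names are illustrative.

```python
import random
from collections import Counter


def train_stump(data):
    """Fit a one-split decision stump (assumed stand-in for a decision tree)."""
    best = None
    for f in range(len(data[0][0])):
        for thr in sorted({x[f] for x, _ in data}):
            left = [y for x, y in data if x[f] <= thr]
            right = [y for x, y in data if x[f] > thr]
            if not left or not right:
                continue  # not a real split
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority label
        label = Counter(y for _, y in data).most_common(1)[0][0]
        return lambda x: label
    _, f, thr, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= thr else r_lab


def train_forest(data, n_trees, rng):
    """Build one layer PRF_i: each tree is trained on a bootstrap sample."""
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]  # draw m items with replacement
        forest.append(train_stump(sample))
    return forest


def vote(forest, x):
    """Majority vote of all trees in one layer."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]


def accuracy(forest, data):
    return sum(vote(forest, x) == y for x, y in data) / len(data)


def train_cascade(train, valid, n_trees=5, t=0.01, max_layers=10, seed=0):
    """Grow layers until the validation gain drops below t (assumed criterion)."""
    rng = random.Random(seed)
    cascade, best_acc = [], 0.0
    for _ in range(max_layers):  # assumed cap instead of while (TRUE)
        forest = train_forest(train, n_trees, rng)
        acc = accuracy(forest, valid)
        if cascade and acc - best_acc < t:
            break  # gain below tolerance: stop growing the cascade
        cascade.append(forest)
        best_acc = acc
    return cascade


def predict(cascade, x):
    """H(x): the last layer votes for the classification."""
    return vote(cascade[-1], x)


# Toy data: one feature, label 1 when the value is at least 5.
toy = [([float(v)], int(v >= 5)) for v in range(10)]
cascade = train_cascade(toy, toy)
label = predict(cascade, [7.0])
```

Note that, following the listing, each layer here is trained independently on bootstrap samples of D′; how class information is passed between layers is not specified in Algorithm 1 and is therefore left out of the sketch.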