Research Article

CFSBFDroid: Android Malware Detection Using CFS + Best First Search-Based Feature Selection

Algorithm 5

Input: Training dataset
(1)    Let “S” be the size of the training dataset and f_k(x) the loss function on the data instance indexed by “k,” so that the overall objective is the average loss f(x) = (1/S) Σ_{k=1}^{S} f_k(x).
(2)    When the dataset is large, plain gradient descent may be infeasible because computing the full gradient ∇f(x) over all instances is too costly. Stochastic gradient descent (SGD) provides a lighter-weight alternative: at each iteration, instead of calculating ∇f(x), SGD randomly selects an index “k” from the dataset and uses ∇f_k(x) as an unbiased estimator of ∇f(x).
(3)    More generally, at each iteration a mini-batch µ of b instances is drawn at random and x is updated as x ← x − (η/b) Σ_{k∈µ} ∇f_k(x), where b is the mini-batch size and η is a positive scalar representing the learning rate (step size). This generalized stochastic algorithm is called mini-batch SGD. The computational cost per iteration is O(b), so when the mini-batch size is small, each iteration is computationally light. A minimal implementation sketch is given after the algorithm.
Output: Classified Instances
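The following is a minimal sketch of the mini-batch SGD update described in steps (1)–(3), assuming the training data are held in NumPy arrays; the names minibatch_sgd, grad_fn, eta, and batch_size are illustrative placeholders and not part of the original algorithm.

import numpy as np

def minibatch_sgd(X, y, grad_fn, eta=0.05, batch_size=32, epochs=10, seed=None):
    # X, y       : training instances and labels
    # grad_fn    : callback returning the mini-batch gradient, grad_fn(w, X_batch, y_batch)
    # eta        : learning rate (step size), a positive scalar
    # batch_size : mini-batch size b; the per-iteration cost is O(b)
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)                      # parameter vector x
    for _ in range(epochs):
        order = rng.permutation(n_samples)        # random sampling of instances
        for start in range(0, n_samples, batch_size):
            batch = order[start:start + batch_size]
            g = grad_fn(w, X[batch], y[batch])    # unbiased estimate of the full gradient
            w = w - eta * g                       # update step: x <- x - eta * g
    return w

def logistic_grad(w, Xb, yb):
    # Gradient of the average logistic loss over one mini-batch (illustrative example).
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    return Xb.T @ (p - yb) / len(yb)

A call such as w = minibatch_sgd(X_train, y_train, logistic_grad) would return a learned weight vector; the actual classifier and feature set used in the paper are not reproduced here.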