Research Article

Failure Analysis of Static Analysis Software Module Based on Big Data Tendency Prediction

Algorithm 2

 Input: training data defectData; labels defectLabel corresponding to the training data; test data testData; number of input-layer nodes inputSize of the sDSAE; numbers of hidden-layer nodes hiddenSizeL1, hiddenSizeL2, hiddenSizeL3, and hiddenSizeL4; weight decay coefficient lambda; sparse regularization parameter beta; masking rate noiseRatio of the masking noise; maximum number of iterations maxIter for minimizing the cost function; weight decay coefficient LogisticLambda of the logistic regression classifier; and maximum number of iterations LogisticMaxIter of the logistic regression classifier.
 Output: predicted defect tendency label predLabel, predicted defect tendency probability value predScore.
(1)The software defect data defectData are preprocessed to obtain the processed training data trainData. Preprocessing mainly consists of removing invalid data and standardizing the data, where standardization rescales the training data so that it conforms to the standard normal distribution (zero mean and unit variance);
(2)Take the processed training data trainData as the input of the first layer of sDSAE and train to obtain the first-order feature sae1Features of the defect data;
(3)Take the first-order feature sae1Features of the defect data as the input of the second layer of sDSAE and train to obtain the second-order feature sae2Features of the defect data;
(4)In the same way as in Step 3, obtain the third-order features sae3Features and the fourth-order features sae4Features of the defect data;
(5)The features of each order obtained in Steps 2–4, together with the labels of the software defect data, are used as the input of the logistic regression classifier to construct a software defect prediction model;
(6)Fine-tune the constructed prediction models with the backpropagation algorithm and gradient descent to optimize the network parameters of each prediction model;
(7)The test data testData are preprocessed in the same way as the training data and then input to each trained prediction model to obtain the probability value predScore of the predicted defect tendency;
(8)If predScore ≥ 0.5, then predLabel = 1; otherwise, predLabel = 0.
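
The data-handling parts of Algorithm 2 (Steps 1, 7, and 8, plus the masking noise controlled by noiseRatio) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names fit_standardizer, standardize, mask_noise, and to_labels are hypothetical, standardization is assumed to be a per-feature z-score fitted on the training data only, and masking noise is assumed to zero each input entry independently with probability noiseRatio. Training of the sDSAE layers and of the logistic regression classifier (Steps 2 to 6) is not shown.

```python
import math
import random


def fit_standardizer(rows):
    """Return per-feature (mean, std) computed from the training rows only."""
    n = len(rows)
    stats = []
    for col in zip(*rows):
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        # Assumption: a constant feature keeps scale 1.0 to avoid division by zero.
        stats.append((mean, math.sqrt(var) or 1.0))
    return stats


def standardize(rows, stats):
    """Apply z-score scaling with previously fitted statistics (Steps 1 and 7)."""
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def mask_noise(row, noise_ratio, rng):
    """Masking noise (assumed form): zero each entry with probability noise_ratio."""
    return [0.0 if rng.random() < noise_ratio else v for v in row]


def to_labels(pred_scores, threshold=0.5):
    """Step 8: predLabel = 1 if predScore >= threshold, otherwise 0."""
    return [1 if s >= threshold else 0 for s in pred_scores]


# Usage: fit on the training data, then reuse the same statistics for the test data.
train_data = [[1.0, 10.0], [3.0, 10.0]]
stats = fit_standardizer(train_data)
train_z = standardize(train_data, stats)
test_z = standardize([[2.0, 12.0]], stats)
pred_label = to_labels([0.5, 0.49, 0.9])
```

Reusing the training statistics for testData is one reading of "preprocessed in the same way as the training data" in Step 7; it keeps the test features on the same scale the trained layers expect.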