Research Article

Detecting Overlapping Data in System Logs Based on Ensemble Learning Method

Figure 2

Data distribution visualized using t-SNE. The x-coordinate represents the high-dimensional data distribution (Gaussian distribution). The y-coordinate represents the low-dimensional data distribution (t-distribution). Blue points represent normal data, and red points represent anomalous data. (a) The distribution of all system log data in the test set. (b) The distribution of test samples with fuzziness greater than 0.65. (c) The distribution of test samples with fuzziness less than 0.65.