| Type | Algorithm | Year | Characteristics | Reference |
|------|-----------|------|-----------------|-----------|
| Incremental learning | VFDT | 2000 | Leaf nodes are replaced with split nodes as data arrive; the algorithm uses little memory and time. | [21] |
| | HAT | 2009 | Combines Hoeffding trees with sliding-window techniques, so there is no need to predict when concept drift occurs in the data stream. | [22] |
| | OHT | 2014 | Uses the misclassification rate to control node splitting; concept drift is handled based on misclassified classes and false-alarm rates. | [23] |
| | Hoeffding-ID | 2016 | Combines Bayes' theorem with traditional Hoeffding trees; new subtrees continuously replace old ones during classification, so the classifier maintains high accuracy and adapts to concept drift in the data stream. | [24] |
| Cluster-based | CluStream | 2003 | Extends the traditional clustering algorithm BIRCH to the data-stream setting; highly flexible and scalable, but sensitive to outliers. | [25] |
| | DenStream | 2006 | Uses micro-clusters to capture summary information about a data stream; can find clusters of arbitrary shape and handle noisy objects. | [26] |
| | IEBC | 2014 | Integrates a clustering framework with labeled data streams using sliding-window and data-labeling techniques; strong in clustering quality and concept-drift detection, but can only process labeled data. | [27] |
| | MuDi-Stream | 2016 | Solves the multi-density clustering problem in concept-drifting data streams with a hybrid method based on grids and micro-clusters, but is unsuitable for high-dimensional data streams. | [28] |
| Ensemble learning | AWE | 2003 | Maintains a fixed number k of classifiers; a new classifier is trained in batch mode on newly arrived data, the k most accurate classifiers form the ensemble, and each is weighted by its accuracy. | [29] |
| | AE | 2011 | Mainly addresses noise in data-stream mining; an ensemble framework combining horizontal and vertical integration, but with high time complexity. | [30] |
| | EM | 2013 | Automatically detects concept drift and novel classes in the data stream, but only handles concept drift under dynamic feature sets. | [31] |
| | CLAM | 2016 | Uses a class-based ensemble classifier to efficiently classify recurring and novel classes in data streams, but cannot classify multi-class data. | [32] |
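The Hoeffding-tree methods in the table (VFDT, HAT, OHT, Hoeffding-ID) share one core mechanism: a leaf is converted into a split node only once the observed advantage of the best splitting attribute over the runner-up exceeds the Hoeffding bound &epsilon; = &radic;(R&sup2; ln(1/&delta;) / 2n). A minimal sketch of that split test follows; the function names and the default &delta; are illustrative choices, not taken from any cited implementation:

```python
import math

def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """With probability 1 - delta, the true mean of a random variable
    with range `value_range` lies within this epsilon of the mean
    observed over n independent samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

def should_split(gain_best: float, gain_second: float,
                 n_seen: int, delta: float = 1e-7,
                 value_range: float = 1.0) -> bool:
    """VFDT-style test: split a leaf when the observed gain gap between
    the best attribute and the runner-up exceeds the Hoeffding bound."""
    eps = hoeffding_bound(value_range, delta, n_seen)
    return (gain_best - gain_second) > eps

# The same gain gap does not justify a split after few examples,
# but does after many, because the bound shrinks as n grows.
print(should_split(0.30, 0.25, n_seen=100))    # loose bound -> False
print(should_split(0.30, 0.25, n_seen=50000))  # tight bound -> True
```

This is why the family is incremental: the decision to split is backed by a probabilistic guarantee computed from running statistics, so the tree never needs to revisit past examples.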