Research Article

An Efficient Outlier Detection Approach for Streaming Sensor Data Based on Neighbor Difference and Clustering

Algorithm 3: Outlier sources identification based on correlation (OSIC).
Input: the streaming sensor data after outlier detection
Output: outliers and their sources
(01)if there is no outlier in current window then
(02) exit
(03)else
(04)if the data instance is detected as point outlier then
(05)  label its source as error
(06)end if
(07)if the data instance is detected as jump outlier then
(08)  label its source as event
(09)end if
(10)if the data instance is detected as collective outlier or contextual outlier then
(11)  calculate the correlation coefficient
(12)  if correlation coefficient > th1 then
(13)   if more than half of attributes are labeled as outliers then
(14)    label its source as event
(15)   else
(16)    label its source as error
(17)   end if
(18)  else if |correlation coefficient| > th2 then
(19)   read the correlated variables of the data instances
(20)   predict the mean and variance of normal values with 10-fold cross-validation
(21)   if the outlier is out of the predicted range then
(22)    label its source as error
(23)   else
(24)    label its source as suspected event
(25)   end if
(26)  else
(27)   label the source of collective outliers as unknown
(28)   label the source of contextual outliers as normal
(29)  end if
(30)end if
(31)end if
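
The decision logic of OSIC can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `OutlierType`, `identify_source`, and `label_window` are hypothetical. The default values of `th1` and `th2` are placeholders, since the listing does not fix them. The correlation coefficient and the result of the predicted-range test (steps 19 to 21) are assumed to be computed elsewhere and passed in. The sketch also reads the final else branch as the case where the correlation exceeds neither threshold, which is an assumption about the intended nesting.

```python
from enum import Enum
from typing import List, Optional


class OutlierType(Enum):
    POINT = "point"
    JUMP = "jump"
    COLLECTIVE = "collective"
    CONTEXTUAL = "contextual"


def identify_source(
    outlier_type: OutlierType,
    corr: Optional[float] = None,
    flagged_attrs: int = 0,
    total_attrs: int = 1,
    in_predicted_range: Optional[bool] = None,
    th1: float = 0.8,  # placeholder threshold, not taken from the paper
    th2: float = 0.5,  # placeholder threshold, not taken from the paper
) -> str:
    """Return the source label for one detected outlier."""
    # Point outliers are attributed to sensor errors, jump outliers to events.
    if outlier_type is OutlierType.POINT:
        return "error"
    if outlier_type is OutlierType.JUMP:
        return "event"

    # Collective and contextual outliers need the correlation coefficient.
    if corr is None:
        raise ValueError("corr is required for collective/contextual outliers")

    if corr > th1:
        # Strong correlation: an event if more than half of the attributes
        # are labeled as outliers, otherwise an error.
        return "event" if flagged_attrs > total_attrs / 2 else "error"

    if abs(corr) > th2:
        # Weaker correlation: compare against the range predicted from the
        # correlated variables (the prediction itself is done elsewhere).
        if in_predicted_range is None:
            raise ValueError("in_predicted_range is required in this branch")
        return "suspected event" if in_predicted_range else "error"

    # Assumed reading: no usable correlation.
    if outlier_type is OutlierType.COLLECTIVE:
        return "unknown"
    return "normal"


def label_window(detections: List[dict]) -> List[str]:
    """Label every outlier in the current window; an empty window yields []."""
    return [identify_source(**d) for d in detections]
```

For example, `label_window([])` returns an empty list, which corresponds to the early exit in steps 01 and 02, and `identify_source(OutlierType.CONTEXTUAL, corr=0.1)` returns `"normal"` under the placeholder thresholds.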