Research Article

A Data Mining Method Using Deep Learning for Anomaly Detection in Cloud Computing Environment

Algorithm 2

Compute
Input: data set X
k: number of nearest neighbors
θ: threshold for LOF
N: number of data blocks
Output: the data in X for which lof (di) > θ
Initialize a Hadoop Job
Set TaskMapReduce class
Logically divide X into N data blocks.
In each TaskMapReduce (one per data block):
FirstMapper
input: one data block of X
output <key, value> = <di, [(ok, dis (di, ok)), k-dis (di)]>
for each data di, i = 1, 2, ..., m do
Calculate disij = distance (di, dj), j = 1, ... , m
Sort disij of di
for each disij of di do
if &
add dj and disij to the k-distinct-neighbor record (ok, dis (di, ok))
end
Calculate k-distinct-distance record k-dis (di)
end
FirstReducer
input <key, value> = <di, [(ok, dis (di, ok)), k-dis (di)]>
output <key, value> = <di, [(ok, dis (di, ok)), k-dis (di)]>
SecondMapper
input <key, value> = <di, [(ok, dis (di, ok)), k-dis (di)]>
output <key, value> = <di, (ok, reach-dis (di, ok))>
for ok ∈ k-distinct-neighbor do
  if k-dis (di) < dis (di, ok)
    reach-dis (di, ok) = dis (di, ok)
  else
    reach-dis (di, ok) = k-dis (di)
end
SecondReducer
input <key, value> = <di, (ok, reach-dis (di, ok))>
output <key, value> = <di, lrd (di)>
for value do
  lrd (di) = |k-distinct-neighbor| / Σ reach-dis (di, ok),
  where the sum runs over ok ∈ k-distinct-neighbor
end
ThirdMapper
input < key, value> = < di, lrd (di) >
output < key, value> = < di (lof (di) > θ), lof (di) >
for ok ∈ k-distinct-neighbor do
  lof (di) = (Σ lrd (ok) / lrd (di)) / |k-distinct-neighbor|,
  where the sum runs over ok ∈ k-distinct-neighbor
end
if lof (di) > θ
  output <di, lof (di)>
ThirdReducer
input < key, value> = < di (lof (di) > θ), lof(di) >
output <key, value> = <di, lof (di)>, sorted by lof (di)
for value do
  Sort the records by lof (di) and record the result
end
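
The three stages of Algorithm 2 can be illustrated with a minimal single-machine sketch in Python. It is not the Hadoop implementation: plain dictionaries stand in for the <key, value> pairs, one call processes one data block, and the function names (first_mapper, second_mapper, second_reducer, third_mapper, third_reducer, lof_outliers) are hypothetical. The sketch assumes Euclidean distance, a block with at least two points and no duplicates, and a descending sort in the last stage. The listing writes the reachability comparison with k-dis (di); the sketch instead uses the standard LOF reachability distance, max(k-dis (ok), dis (di, ok)), which takes the k-distance of the neighbor.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
# (neighbors, k_dis): neighbors is a list of (neighbor index, distance)
Record = Tuple[List[Tuple[int, float]], float]


def first_mapper(block: Sequence[Point], k: int) -> Dict[int, Record]:
    """Find the k-distinct-neighbors and the k-distinct-distance of every point."""
    records: Dict[int, Record] = {}
    for i, di in enumerate(block):
        # Distances from di to every other point in the block, sorted ascending.
        dists = sorted((math.dist(di, dj), j) for j, dj in enumerate(block) if j != i)
        distinct = sorted({d for d, _ in dists})
        # k-th smallest distinct distance (or the largest one if fewer exist).
        k_dis = distinct[min(k, len(distinct)) - 1]
        neighbors = [(j, d) for d, j in dists if d <= k_dis]
        records[i] = (neighbors, k_dis)
    return records


def second_mapper(records: Dict[int, Record]) -> Dict[int, List[Tuple[int, float]]]:
    """Reachability distance, standard LOF form: max(k-dis(ok), dis(di, ok))."""
    return {
        i: [(j, max(records[j][1], d)) for j, d in neighbors]
        for i, (neighbors, _) in records.items()
    }


def second_reducer(reach: Dict[int, List[Tuple[int, float]]]) -> Dict[int, float]:
    """Local reachability density: neighbor count over summed reachability."""
    return {i: len(vals) / sum(r for _, r in vals) for i, vals in reach.items()}


def third_mapper(records: Dict[int, Record], lrd: Dict[int, float],
                 theta: float) -> Dict[int, float]:
    """Compute lof(di) and keep only the points with lof(di) > theta."""
    flagged: Dict[int, float] = {}
    for i, (neighbors, _) in records.items():
        lof = sum(lrd[j] / lrd[i] for j, _ in neighbors) / len(neighbors)
        if lof > theta:
            flagged[i] = lof
    return flagged


def third_reducer(flagged: Dict[int, float]) -> List[Tuple[int, float]]:
    """Sort the flagged points by LOF (descending order is an assumption)."""
    return sorted(flagged.items(), key=lambda kv: kv[1], reverse=True)


def lof_outliers(block: Sequence[Point], k: int, theta: float) -> List[Tuple[int, float]]:
    """Chain the stages for one data block; returns (point index, lof) pairs."""
    records = first_mapper(block, k)
    lrd = second_reducer(second_mapper(records))
    return third_reducer(third_mapper(records, lrd, theta))
```

Calling lof_outliers(points, k, theta) on a block with a tight cluster and one distant point should return only the distant point when theta is set moderately above 1, since points inside a uniform cluster have LOF close to 1. In a real Hadoop job each function would read and write key-value pairs between stages rather than share in-memory dictionaries.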