Research Article

Using an Efficient Detection Method to Prevent Personal Data Leakage for Web-Based Smart City Platforms

Table 6

Percentage of file similarity.

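| File similarity percentage | Comparison time (min) | Scanning time (min) | No. of files | No. of nonrepeated personal data | Percentage of nonrepeated personal data (A) | Percentage of time (B) | Value of weight W2 |
|---|---|---|---|---|---|---|---|
| | (A) | (B) | | (C) | (D = C/14,562) | (E = (A + B)/352) | (D × D/E) |
| 99% | 4 | 228 | 5,770 | 14,439 | 99.16% | 65.91% | 1.49 |
| 97% | 5.5 | 185 | 4,551 | 14,363 | 98.63% | 54.12% | 1.80 |
| 95% | 7 | 157 | 3,927 | 14,309 | 98.26% | 46.59% | 2.07 |
| 93% | 9 | 144 | 3,603 | 14,053 | 96.50% | 43.47% | 2.14 |
| 91% | 11.5 | 136 | 3,370 | 13,856 | 95.15% | 41.90% | 2.16 |
| 90% | 13 | 131 | 3,248 | 13,807 | 94.82% | 40.91% | 2.20 |
| 85% | 19 | 129 | 2,705 | 13,410 | 92.09% | 42.05% | 2.02 |
| 80% | 27 | 125 | 2,332 | 13,067 | 89.73% | 43.18% | 1.86 |
| 75% | 35 | 122 | 2,031 | 12,506 | 85.88% | 44.60% | 1.65 |
| 70% | 44 | 117 | 1,795 | 11,735 | 80.59% | 45.74% | 1.42 |
| 65% | 55 | 113 | 1,664 | 11,518 | 79.10% | 47.73% | 1.31 |
| 60% | 72 | 104 | 1,575 | 11,006 | 75.58% | 50.00% | 1.14 |
| 55% | 88 | 97 | 1,503 | 10,458 | 71.82% | 52.56% | 0.98 |
| 50% | 104 | 92 | 1,443 | 10,291 | 70.67% | 55.68% | 0.90 |
| Full-scan mode | 0 | 352 | 8,449 | 14,562 | 100.00% | 100.00% | 1.00 |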
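
The derived columns of Table 6 follow directly from the formulas given in its header: D = C/14,562, E = (A + B)/352, and W2 = D × D/E, where 14,562 and 352 are the full-scan record count and full-scan time in minutes. The following is a minimal sketch that recomputes them from the raw columns. The function name `weight_w2`, the constant names, and the `ROWS` layout are illustrative assumptions, not part of the original work.

```python
# Illustrative recomputation of the derived columns in Table 6.
# All names here are hypothetical; only the formulas and the data
# values come from the table itself.

FULL_SCAN_RECORDS = 14_562   # nonrepeated personal data found by a full scan
FULL_SCAN_MINUTES = 352      # total time of a full scan, in minutes


def weight_w2(comparison_min, scanning_min, nonrepeated_records):
    """Return (D, E, W2) as defined in the Table 6 header.

    D  = C / 14,562        fraction of nonrepeated personal data recovered
    E  = (A + B) / 352     fraction of full-scan time consumed
    W2 = D * D / E         weight that rewards coverage and penalizes time
    """
    d = nonrepeated_records / FULL_SCAN_RECORDS
    e = (comparison_min + scanning_min) / FULL_SCAN_MINUTES
    return d, e, d * d / e


# threshold -> (comparison time A, scanning time B, nonrepeated records C)
ROWS = {
    "99%": (4, 228, 14_439),
    "97%": (5.5, 185, 14_363),
    "95%": (7, 157, 14_309),
    "93%": (9, 144, 14_053),
    "91%": (11.5, 136, 13_856),
    "90%": (13, 131, 13_807),
    "85%": (19, 129, 13_410),
    "80%": (27, 125, 13_067),
    "75%": (35, 122, 12_506),
    "70%": (44, 117, 11_735),
    "65%": (55, 113, 11_518),
    "60%": (72, 104, 11_006),
    "55%": (88, 97, 10_458),
    "50%": (104, 92, 10_291),
    "Full-scan mode": (0, 352, 14_562),
}

# Threshold with the largest W2 among all rows.
best_threshold = max(ROWS, key=lambda name: weight_w2(*ROWS[name])[2])

if __name__ == "__main__":
    for name, (a, b, c) in ROWS.items():
        d, e, w2 = weight_w2(a, b, c)
        print(f"{name:>14}  D={d:7.2%}  E={e:7.2%}  W2={w2:.2f}")
    print("Largest W2 at:", best_threshold)
```

Under these formulas the largest W2 falls at the 90% file-similarity threshold, which matches the peak value of 2.20 in the table.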