Research Article

Big Data Privacy Preservation Using Principal Component Analysis and Random Projection in Healthcare

Algorithm 1

Input: ARFF file with the original data values.
Output: ARFF file with the perturbed data after applying Enhanced Random Projection Perturbation.
(1) Read the data from the input ARFF file as a matrix P of dimension M × N (M = number of samples, N = number of original features).
(2) Apply feature selection to dataset P using Principal Component Analysis and remove the low-ranked features.
(3) Save the reduced dataset as J.
(4) Take dataset J, of dimension M × V (M = number of samples, V = number of reduced features).
(5) Initialize a random 2D matrix S of size V × R, where R is the new (reduced) dimension.
(6) Normalize the columns of S so that each column is a unit-length vector.
(7) Compute the matrix product A = JS. A is the final output, with dimension M × R.
(8) Save the resulting perturbed data to an ARFF file.
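
The numerical core of the steps above (steps 2 to 7) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names pca_reduce and random_projection are hypothetical, the PCA step is assumed to project the data onto its top V principal components, the entries of S are assumed to be drawn from a standard normal distribution (the algorithm only says "random"), and ARFF reading and writing (steps 1 and 8) are replaced by a synthetic matrix.

```python
import numpy as np


def pca_reduce(P, V):
    """Project P (M x N) onto its top-V principal components, giving J (M x V)."""
    centered = P - P.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:V].T


def random_projection(J, R, rng):
    """Multiply J (M x V) by a random V x R matrix S with unit-length columns."""
    V = J.shape[1]
    # Assumption: Gaussian entries; the source does not name a distribution.
    S = rng.standard_normal((V, R))
    # Step 6: scale every column of S to unit Euclidean length.
    S /= np.linalg.norm(S, axis=0, keepdims=True)
    # Step 7: A = J S has dimension M x R.
    return J @ S, S


rng = np.random.default_rng(0)
P = rng.normal(size=(100, 10))             # stand-in for the ARFF input, M=100, N=10
J = pca_reduce(P, V=6)                     # steps 2-4: M x V
A, S = random_projection(J, R=4, rng=rng)  # steps 5-7: M x R
print(J.shape, A.shape)                    # (100, 6) (100, 4)
```

Choosing R smaller than V compresses the data further while perturbing it. The values V = 6 and R = 4 here are arbitrary and chosen only to show the shapes involved.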