Research Article
Big Data Privacy Preservation Using Principal Component Analysis and Random Projection in Healthcare
| Input | ARFF file containing the original data values. |
| Output | ARFF file containing the perturbed data produced by Enhanced Random Projection Perturbation. |
| (1) | Read the data from the input ARFF file into a matrix P of dimension M × N (M = number of samples, N = number of original features). |
| (2) | Apply feature selection to P using Principal Component Analysis and discard the low-ranked features. |
| (3) | Save the reduced dataset as J. |
| (4) | Take dataset J, of dimension M × V (M = number of samples, V = number of reduced features). |
| (5) | Initialize a random 2D matrix S of size V × R, where R is the new (reduced) dimension. |
| (6) | Normalize the columns of S so that each is a unit-length vector. |
| (7) | Compute the matrix product A = JS. A is the final output, of dimension M × R. |
| (8) | Save the resulting perturbed data to an ARFF file. |
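Steps (2) through (7) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it assumes that "discarding low-ranked features" means keeping the scores on the top V principal components, it assumes Gaussian entries for S (the algorithm says only "random"), and it uses synthetic data in place of the ARFF input and output. The function names `pca_reduce` and `random_projection` are hypothetical.

```python
import numpy as np


def pca_reduce(P, V):
    """Return J (M x V): the scores of P (M x N) on its top-V principal components.

    Assumption: the PCA-based feature selection step keeps the V
    highest-ranked components and drops the rest.
    """
    centered = P - P.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:V].T


def random_projection(J, R, rng):
    """Return (A, S), where S is V x R with unit-length columns and A = J S."""
    # Assumption: Gaussian entries; the source does not name a distribution.
    S = rng.standard_normal((J.shape[1], R))
    S /= np.linalg.norm(S, axis=0, keepdims=True)  # step (6): normalize columns
    return J @ S, S                                # step (7): A is M x R


rng = np.random.default_rng(0)
P = rng.standard_normal((100, 10))         # synthetic stand-in for the ARFF data (M=100, N=10)
J = pca_reduce(P, V=6)                     # steps (2)-(4): M x V
A, S = random_projection(J, R=4, rng=rng)  # steps (5)-(7): M x R
```

Because S is not square when R < V, the product A = JS cannot be inverted exactly to recover J, which is the property the perturbation relies on.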