Research Article
Utilizing Selected Di- and Trinucleotides of siRNA to Predict RNAi Activity
Algorithm 1: The calculation process of the threshold.
Input: A data set consisting of the feature set extracted from the siRNA sequence and the experimentally determined siRNA activities. The features are first sorted by the variable importance in descending order. The initial values of … and … are 1 and …, respectively.
Output: The optimal features.
The data set is divided into ten parts. Nine parts are used as the training set and the rest are used as a testing set. We build a Random Forest model using the feature set and the training set and then predict the testing siRNAs using the model. The correlation coefficient between the observed and predicted siRNA activities is recorded.
while … and … do
    Calculate the prediction accuracy using … according to the first step.
    if … then
        …
    else
        …
    end if
    …
end while
…
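The loop in Algorithm 1 can be illustrated with a minimal Python sketch. Several things here are assumptions rather than the article's method: the exact loop conditions and update rules are not recoverable from the text above, so the sketch simply evaluates every prefix of the importance-ranked features and keeps the prefix with the highest correlation; all ten parts are rotated as the testing set and the predictions pooled; and `fit_predict`, `cv_correlation`, `select_features`, and `nearest_neighbour` are hypothetical names. The nearest-neighbour model is only a toy stand-in for the Random Forest model.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cv_correlation(X, y, columns, fit_predict, n_folds=10):
    """Correlation between observed and predicted activities.

    Assumption: each of the n_folds parts serves once as the testing set
    and the pooled predictions are correlated with the observations.
    """
    n = len(y)
    preds = [0.0] * n
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        X_train = [[X[i][c] for c in columns] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [[X[i][c] for c in columns] for i in test_idx]
        for i, p in zip(test_idx, fit_predict(X_train, y_train, X_test)):
            preds[i] = p
    return pearson(y, preds)


def select_features(X, y, ranked_columns, fit_predict, n_folds=10):
    """Return the best prefix of ranked_columns and its correlation.

    Assumption: ranked_columns is already sorted by variable importance
    in descending order, and the best prefix is the one that maximizes
    the cross-validated correlation.
    """
    best_k, best_r = 0, float("-inf")
    for k in range(1, len(ranked_columns) + 1):
        r = cv_correlation(X, y, ranked_columns[:k], fit_predict, n_folds)
        if r > best_r:
            best_k, best_r = k, r
    return ranked_columns[:best_k], best_r


def nearest_neighbour(X_train, y_train, X_test):
    """Toy stand-in model (not a Random Forest): copy the closest row's activity."""
    out = []
    for row in X_test:
        dists = [sum((a - b) ** 2 for a, b in zip(row, t)) for t in X_train]
        out.append(y_train[dists.index(min(dists))])
    return out
```

Any regressor can be supplied through `fit_predict`, provided it takes the training features, the training activities, and the testing features, and returns one prediction per testing row. A Random Forest implementation such as scikit-learn's `RandomForestRegressor` could be wrapped this way.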