Research Article
Gene Sequence Clustering Based on the Profile Hidden Markov Model with Differential Identifiability
Algorithm 2
Gene sequence clustering algorithm based on DI-PHMM (DI-GSCA).
| | Input: number of clusters, training sequence data, DI parameter, and number of iteration rounds (optional) |
| | Output: index of the cluster to which each sequence belongs |
| (1) | … |
| (2) | for … in … |
| (3) | for … in … |
| (4) | //Calculate the score of the sequence for each PHMM |
| (5) | … |
| (6) | //Assign the sequence to the cluster with the highest score |
| (7) | … |
| (8) | for … in … |
| (9) | if (…): |
| (10) | … |
| (11) | else |
| (12) | //The privacy parameter is assigned according to whether the number of iteration rounds is fixed |
| (13) | //Construct a new cluster center sub-model |
| (14) | … |
| (15) | //Degree of change of the model since the last iteration (divergence distance) |
| (16) | … |
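The control flow of Algorithm 2 (score every sequence against every cluster model, assign it to the best-scoring cluster, rebuild each cluster-center model with noise, then measure how much the models changed) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the DI-PHMM itself: the cluster model is a plain residue-frequency table standing in for a PHMM, the noise is Laplace noise with scale `1 / eps_t` standing in for the DI mechanism, the per-round budget split (even split when the number of rounds is fixed, halving of the remaining budget otherwise) is hypothetical, and the initialization and the L1 change measure are likewise assumed. All function and parameter names are illustrative.

```python
import math
import random

ALPHABET = "ACGT"


def build_model(seqs, noise_scale, rng):
    """Stand-in for a cluster-center PHMM: (optionally noisy) residue frequencies."""
    counts = {ch: 1.0 for ch in ALPHABET}  # add-one smoothing
    for s in seqs:
        for ch in s:
            counts[ch] += 1.0
    if noise_scale > 0:
        for ch in ALPHABET:
            # Laplace(0, b) noise as the difference of two exponentials with mean b.
            noise = (rng.expovariate(1.0 / noise_scale)
                     - rng.expovariate(1.0 / noise_scale))
            counts[ch] = max(counts[ch] + noise, 1e-6)  # keep counts positive
    total = sum(counts.values())
    return {ch: c / total for ch, c in counts.items()}


def score(seq, model):
    """Average log-probability per residue (assumes a non-empty sequence)."""
    return sum(math.log(model[ch]) for ch in seq) / len(seq)


def model_change(old, new):
    """L1 distance between two models, standing in for the divergence distance."""
    return sum(abs(old[ch] - new[ch]) for ch in ALPHABET)


def cluster(seqs, k, budget, rounds=None, tol=1e-3, max_rounds=50, seed=0):
    rng = random.Random(seed)
    # Assumed initialization: one noise-free model per evenly spaced seed sequence.
    models = [build_model([seqs[j * len(seqs) // k]], 0.0, rng) for j in range(k)]
    labels = [0] * len(seqs)
    remaining = budget
    n_rounds = rounds if rounds is not None else max_rounds
    for _ in range(n_rounds):
        # Assign each sequence to the cluster whose model scores it highest.
        labels = [max(range(k), key=lambda j: score(s, models[j])) for s in seqs]
        # Hypothetical budget split, depending on whether the round count is fixed.
        if rounds is not None:
            eps_t = budget / rounds
        else:
            eps_t = remaining / 2
            remaining -= eps_t
        # Rebuild each cluster-center model and accumulate the total change.
        change = 0.0
        for j in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == j]
            new = build_model(members, 1.0 / eps_t, rng) if members else models[j]
            change += model_change(models[j], new)
            models[j] = new
        if rounds is None and change < tol:
            break
    return labels
```

Calling `cluster(seqs, k=2, budget=100.0, rounds=2)` on a list of DNA strings returns one cluster index per input sequence. A real implementation would replace `build_model` and `score` with PHMM training and alignment scoring, and calibrate the noise to the DI parameter rather than to the assumed `1 / eps_t` scale.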