Research Article
n-Gram-Based Text Compression
Algorithm 2
Pseudocode of the four_gram_compression procedure.
input: the four-gram string, here st4
output: the encoded stream
(1) index = find(st4, four_gram_dict)
(2) if index is valid (st4 is found in four_gram_dict) then
(3)     force_trigram_compression(st3)
(4)     outputstring += compress(index, 4)
(5)     delete content of st4
(6) end
(7) else
(8)     st3 += first gram of st4
(9)     delete first gram of st4
(10)    if number of grams of st3 = 3 then
(11)        trigram_compression(st3)
(12)    end
(13) end
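The control flow of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The class name FourGramEncoder, the (n, index) token format standing in for compress(index, n), the "raw" fallback token, and the simplified bodies of trigram_compression and force_trigram_compression are all hypothetical, since this section does not define them. Dictionaries are assumed to map space-joined grams to integer indices, and a failed lookup is assumed to return -1.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical token format: (n, index) for a dictionary hit, ("raw", gram) otherwise.
Token = Union[Tuple[int, int], Tuple[str, str]]


class FourGramEncoder:
    """Illustrative state holder; attribute names mirror the pseudocode."""

    def __init__(self, four_gram_dict: Dict[str, int], trigram_dict: Dict[str, int]):
        self.four_gram_dict = four_gram_dict
        self.trigram_dict = trigram_dict
        self.st3: List[str] = []        # grams that fell out of a failed four-gram lookup
        self.output: List[Token] = []   # stands in for outputstring

    def _flush_st3(self) -> None:
        # Placeholder for the lower-order steps (assumption): emit a trigram
        # code when st3 holds a known trigram, otherwise emit its grams raw.
        if not self.st3:
            return
        index = self.trigram_dict.get(" ".join(self.st3), -1)
        if len(self.st3) == 3 and index >= 0:
            self.output.append((3, index))
        else:
            self.output.extend(("raw", gram) for gram in self.st3)
        self.st3.clear()

    def trigram_compression(self) -> None:
        self._flush_st3()

    def force_trigram_compression(self) -> None:
        self._flush_st3()

    def four_gram_compression(self, st4: List[str]) -> None:
        # Assumes st4 is non-empty.
        index = self.four_gram_dict.get(" ".join(st4), -1)   # step (1)
        if index >= 0:                                       # step (2)
            self.force_trigram_compression()                 # step (3)
            self.output.append((4, index))                   # step (4)
            st4.clear()                                      # step (5)
        else:
            self.st3.append(st4.pop(0))                      # steps (8)-(9)
            if len(self.st3) == 3:                           # step (10)
                self.trigram_compression()                   # step (11)


enc = FourGramEncoder({"a b c d": 7}, {"x y z": 2})
st4 = ["x", "a", "b", "c"]
enc.four_gram_compression(st4)   # miss: "x" moves from st4 to st3
st4.append("d")
enc.four_gram_compression(st4)   # hit: st3 is flushed first, then the four-gram code is emitted
print(enc.output)
```

In this sketch, the pending grams in st3 are flushed before the four-gram code is emitted, so the output preserves the order of the input text, which matches the ordering of steps (3) and (4) in the pseudocode.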