Research Article

Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine

Table 9

Model comparison.

| Model | Modeling concept | Complex symptoms (a) | Retrieval (b) | Overall performance (c) |
| --- | --- | --- | --- | --- |
| Jaccard similarity | Similarity matching | × | √ (0.61) | × |
| Word2vec with cosine | Similarity matching | × | √ (0.95) | × |
| Encoder-Classification (our) | Text classification | √ (0.89) | √ (0.98) | √ (0.86) |
| Encoder-Decoder (our) | Sequence generation | √ (0.87) | × | √ (0.86) |
| DNorm | Similarity matching | × | √ (0.99) | × |
| Transition-based model | NER | × | × | × |
| Bi-LSTM-CNNs-CRF | NER | × | × | × |
| BERT-based ranking | Similarity matching | × | √ (0.99) | × |
| BERT-UniLM (our) | Sequence generation | √ (0.90) | × | √ (0.89) |
| BERT-Classification (our) | Text classification | √ (0.92) | √ (0.99) | √ (0.91) |

Note. (a) is the ability to handle complex symptoms; a model with this ability is evaluated by F1-score. (b) is the ability to retrieve normalized symptoms; a model with this ability is evaluated by top-10 recall. (c) is the overall performance in normalizing both single and complex symptoms; a model that can handle both is evaluated by F1-score. √ indicates that the model has the ability or can be evaluated for overall performance. × indicates that the model lacks the ability or cannot be evaluated for overall performance.
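
The two metrics named in the note can be illustrated with a minimal sketch. It assumes that top-10 recall counts a query as a hit when its gold normalized symptom appears among the first ten ranked candidates, and that F1 is micro-averaged over sets of predicted and gold normalized symptoms. The article does not state its exact averaging scheme, so both definitions are assumptions here, and the function names `recall_at_k` and `micro_f1` are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int = 10) -> float:
    """Fraction of queries whose gold label is among the top-k ranked candidates."""
    if not gold:
        return 0.0
    hits = sum(1 for candidates, g in zip(ranked, gold) if g in candidates[:k])
    return hits / len(gold)


def micro_f1(predicted: Sequence[Set[str]], gold: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 over sets of normalized symptoms (assumed definition)."""
    true_pos = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration, not taken from the article).
ranked_candidates = [["headache", "dizziness"], ["cough", "fever"]]
gold_labels = ["dizziness", "nausea"]
retrieval_score = recall_at_k(ranked_candidates, gold_labels, k=10)

predicted_sets = [{"headache", "fever"}, {"cough"}]
gold_sets = [{"headache"}, {"cough", "phlegm"}]
f1_score = micro_f1(predicted_sets, gold_sets)
```

On this toy data, one of the two queries has its gold label in the candidate list, and two of the three predicted symptoms match two of the three gold symptoms, so precision and recall are equal. A macro-averaged F1 would be an equally plausible reading of the note and could give different values.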