Evidence-Based Complementary and Alternative Medicine

Research Article

Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine

Table 9

Model comparison.


Model	Modeling concept	Complex symptoms ^a	Retrieval ^b	Overall performance ^c

Jaccard similarity	Similarity matching	×	√(0.61)	×
Word2vec with cosine	Similarity matching	×	√(0.95)	×
Encoder-Classification (ours)	Text classification	√(0.89)	√(0.98)	√(0.86)
Encoder-Decoder (ours)	Sequence generation	√(0.87)	×	√(0.86)
DNorm	Similarity matching	×	√(0.99)	×
Transition-based model	NER	×	×	×
Bi-LSTM-CNNs-CRF	NER	×	×	×
BERT-based ranking	Similarity matching	×	√(0.99)	×
BERT-UniLM (ours)	Sequence generation	√(0.90)	×	√(0.89)
BERT-Classification (ours)	Text classification	√(0.92)	√(0.99)	√(0.91)

Note. ^a Ability to handle complex symptoms; models with this ability are evaluated by F1-score. ^b Ability to retrieve normalized symptoms; models with this ability are evaluated by top-10 recall. ^c Overall performance in normalizing both single symptoms and complex symptoms; models able to handle both are evaluated by F1-score. √ indicates that the model has the ability or can be evaluated for overall performance. × indicates that the model does not have the ability or cannot be evaluated for overall performance.
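The two metrics named in the note can be illustrated with a minimal sketch. It assumes that top-10 recall is the fraction of queries whose gold normalized symptom appears among the first 10 ranked candidates, and that F1 is micro-averaged over sets of normalized symptoms; the table does not state the exact averaging scheme, so both definitions are assumptions. The function names `recall_at_k` and `micro_f1` and the symptom strings are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int = 10) -> float:
    """Fraction of queries whose gold label appears in the top-k candidates.

    Assumes exactly one gold normalized symptom per query.
    """
    if not gold:
        return 0.0
    hits = sum(1 for candidates, g in zip(ranked, gold) if g in candidates[:k])
    return hits / len(gold)


def micro_f1(predicted: Sequence[Set[str]], gold: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 over sets of normalized symptoms (assumed scheme).

    A complex symptom may normalize to several labels, so each item is a set.
    """
    true_pos = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: two queries, each with a ranked candidate list.
ranked_candidates = [["headache", "dizziness"], ["cough", "fever"]]
gold_labels = ["dizziness", "insomnia"]
top10 = recall_at_k(ranked_candidates, gold_labels, k=10)

# Hypothetical complex symptom that normalizes to two labels.
f1 = micro_f1([{"headache", "dizziness"}], [{"headache", "fever"}])
```

Under these assumptions, a model marked × in the retrieval column simply produces no ranked candidate list, so `recall_at_k` cannot be computed for it, which matches how the table leaves those cells unevaluated.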