Research Article
Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine
Figure 4
Examples of the BERT-UniLM and BERT-Classification models. (a) The model structure is consistent with BERT, with 12 transformer blocks. Following BERT's embedding composition, the input combines the segment embeddings of segments 1 and 2 with the character embeddings of the original symptom. The token output layer produces the normalized symptom, in character or label form, through a fully connected layer with a softmax function. SOS marks the start of the sequence, and EOS marks the end of the sequence. This model was trained with the sequence-to-sequence method of UniLM. (b) This normalization model is also based on BERT; in contrast to (a), a fully connected layer with a sigmoid function serves as the output layer.
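The difference between the two output layers in the caption can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: the hidden size of 768 matches standard BERT-base, but the vocabulary and label counts are hypothetical placeholders, and the weights are random rather than trained. It contrasts (a) a softmax head that yields a probability distribution over characters at each token position with (b) a sigmoid head that scores each candidate label independently.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
# Hypothetical sizes: BERT-base hidden size, a character vocabulary,
# and a set of normalized-symptom labels.
hidden_size, vocab_size, num_labels = 768, 21128, 2000

# (a) BERT-UniLM-style head: fully connected layer + softmax over the
#     character vocabulary, applied per token position so the normalized
#     symptom can be generated character by character.
W_gen = rng.normal(0.0, 0.02, (hidden_size, vocab_size))
h_token = rng.normal(0.0, 1.0, (1, hidden_size))  # one token's encoder output
char_probs = softmax(h_token @ W_gen)             # distribution over characters

# (b) BERT-Classification-style head: fully connected layer + sigmoid,
#     scoring every candidate normalized-symptom label independently.
W_cls = rng.normal(0.0, 0.02, (hidden_size, num_labels))
h_cls = rng.normal(0.0, 1.0, (1, hidden_size))    # sentence-level encoder output
label_scores = sigmoid(h_cls @ W_cls)             # per-label scores in (0, 1)

print(round(float(char_probs.sum()), 6))          # softmax probabilities sum to 1
print(bool(label_scores.min() > 0 and label_scores.max() < 1))
```

The softmax head forces the per-position probabilities to sum to 1, which suits generating one character at a time, while the sigmoid head leaves each label score independent, which suits picking a normalized symptom from a fixed label set.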