Research Article
Exploiting Syntactic and Semantic Information for Textual Similarity Estimation
Figure 5
Attention weights corrected by Gaussian probabilities. (a) A traditional attention mechanism: the word “new” receives the same weight at both positions where it appears, which is inconsistent with the intuition that adjacent words should be more critical. (b) The Gaussian distribution along the x-axis. (c) The attention values after correction by the Gaussian distribution, where the first “new” is more critical than the second “new.”
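The correction described in the caption can be illustrated with a minimal sketch. It rests on several assumptions that the caption does not confirm: that the x-axis indexes word positions, that the correction multiplies each raw attention weight by a Gaussian density centered on a reference position, and that the result is renormalized to sum to 1. The function names, the example sentence, the raw weights, and the `center` and `sigma` values are all hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def gaussian_corrected_attention(weights, center, sigma=1.0):
    """Scale each attention weight by a Gaussian over positions, then renormalize.

    Assumption: a multiplicative correction followed by renormalization.
    The article's exact formula may differ.
    """
    scaled = [w * gaussian_pdf(i, center, sigma) for i, w in enumerate(weights)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical sentence in which "new" appears twice with equal raw attention.
tokens = ["new", "york", "is", "a", "new", "city"]
raw = [0.30, 0.20, 0.05, 0.05, 0.30, 0.10]

# Hypothetical reference position (index 1) and spread.
corrected = gaussian_corrected_attention(raw, center=1, sigma=1.5)

for token, before, after in zip(tokens, raw, corrected):
    print(f"{token:>5}  raw={before:.2f}  corrected={after:.3f}")
```

Under these assumptions, the two occurrences of "new" start with identical raw weights, but the occurrence nearer the reference position keeps a larger share after correction, mirroring the contrast between panels (a) and (c).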