Based on the encoder-decoder framework, this method introduces reference information to guide the model to generate a more descriptive sentence for a given image. The generated sentences sound more natural, but their expression remains somewhat rigid and overall performance is weak.
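As a rough illustration of the idea above, the following sketch conditions a toy decoder on both image features and an embedding of the reference sentence by concatenation. All names, dimensions, and the concatenation-based fusion are illustrative assumptions, not the cited method's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_image(pixels, W_img):
    # Toy "encoder": project flattened pixels to a feature vector.
    return np.tanh(pixels @ W_img)

def decode_step(img_feat, ref_feat, h, W_in, W_h):
    # One decoder step conditioned on the image features AND a
    # reference-sentence embedding (the extra guidance signal),
    # fused here by simple concatenation (an assumption).
    x = np.concatenate([img_feat, ref_feat])
    return np.tanh(x @ W_in + h @ W_h)

d_img, d_ref, d_h = 8, 8, 16
W_img = rng.normal(size=(32, d_img))
W_in = rng.normal(size=(d_img + d_ref, d_h))
W_h = rng.normal(size=(d_h, d_h))

pixels = rng.normal(size=32)
ref_feat = rng.normal(size=d_ref)   # embedding of the reference sentence
img_feat = encode_image(pixels, W_img)

h = np.zeros(d_h)
for _ in range(3):                  # three decoding steps
    h = decode_step(img_feat, ref_feat, h, W_in, W_h)
print(h.shape)  # (16,)
```

In a real captioner each decoder state would also feed a softmax over the vocabulary; the sketch only shows where the reference signal enters.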
To extract keywords from the original documents, this method proposes a keyword extraction algorithm based on a probabilistic neural network and a visual attention mechanism. It is strong at extracting context-rich keyword information, but the algorithm is time-consuming and its complexity still needs to be optimized.
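A minimal sketch of attention-based keyword scoring: each candidate word is scored by softmax attention against a query vector. This is a generic stand-in for the cited attention mechanism; the word list, vectors, and query here are fabricated for illustration.

```python
import numpy as np

def attention_keyword_scores(word_vecs, query):
    # Softmax attention: words whose vectors align with the query
    # receive higher weights; these weights serve as keyword scores.
    logits = word_vecs @ query
    exp = np.exp(logits - logits.max())   # subtract max for stability
    return exp / exp.sum()

words = ["neural", "network", "the", "attention"]  # toy candidates
rng = np.random.default_rng(1)
word_vecs = rng.normal(size=(4, 6))   # toy word embeddings
query = rng.normal(size=6)            # toy query/context vector

scores = attention_keyword_scores(word_vecs, query)
top_keyword = words[int(scores.argmax())]  # highest-scoring candidate
print(scores.sum())                        # softmax weights sum to 1
```

The softmax pass is O(n) per query, which hints at why repeated attention over long documents becomes the time-consuming step noted above.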
This method generates perceptual text information from both visual and textual information, and implements a generic machine translation model by controlling the proportion of visual information within the overall multimodal information.
It makes full use of the multimodal text information and achieves higher accuracy in identifying the semantic meaning of new words. However, the dataset it uses is small, and the available data needs to be expanded to enhance the expressive power of the model.
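The idea of controlling the proportion of visual information can be sketched as a scalar gate that blends the two modality features. The gate `alpha` below is a hypothetical simplification; the actual method's gating mechanism may be learned and vector-valued.

```python
import numpy as np

def fuse(text_feat, visual_feat, alpha):
    # Blend the two modalities; alpha is the proportion of visual
    # information in the fused multimodal representation.
    # alpha = 0.0 reduces the model to text-only translation.
    return (1.0 - alpha) * text_feat + alpha * visual_feat

text_feat = np.ones(4)     # toy text feature vector
visual_feat = np.zeros(4)  # toy visual feature vector

fused_text_only = fuse(text_feat, visual_feat, alpha=0.0)  # pure text
fused_balanced = fuse(text_feat, visual_feat, alpha=0.5)   # equal mix
print(fused_balanced)  # [0.5 0.5 0.5 0.5]
```

Setting `alpha` to zero recovers a purely text-based translator, which is how such a gate yields a "generic" model as a special case.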