Abstract
With the rapid development of Internet technology and the wide use of image acquisition equipment, the number of digital artwork images is growing explosively. Retrieval of near-similar artwork images has broad application prospects in copyright infringement detection, trademark registration, and other scenarios. However, compared with traditional images, artwork images are highly similar and complex, so retrieval accuracy often fails to meet demand. To solve these problems, an intelligent artwork image retrieval method based on wavelet transform and the counter propagation neural network (WTCPN) is proposed. Firstly, the original artwork image is replaced by the low-frequency subimage obtained by wavelet transform, which removes redundant information, reduces the dimension of the data, and suppresses random noise. Secondly, so that the network assigns different winning units in the competition layer to different classes of patterns, the counter propagation neural network is improved by setting a maximum number of wins for each neuron. Experimental results show that the proposed method improves the accuracy of image retrieval, and the recognition accuracy on the verification set exceeds 91%.
1. Introduction
Since the beginning of the 21st century, computer technology has improved rapidly, and the popularization of imaging technology has led to explosive growth in the number of digital artwork images. Large image collections contain many near-similar images, and the retrieval of near-similar images has broad application prospects, such as detecting copyright infringement of works of art on the Internet and checking whether new trademarks are similar to existing ones [1–6]. In recent years, near-similar image retrieval has become an important branch of image retrieval and has attracted increasing attention.
Image retrieval is currently the fastest-growing information retrieval application, and the number of image queries handled by the major search engines worldwide is increasing very quickly. As a result, near-similar image retrieval has attracted extensive attention at home and abroad and has become a key research issue in the information industry and in digital media technology. Near-similar image detection uses an image as the query [7–10] and retrieves all database images that are wholly or partially similar to the query image. Among the many methods for near-similar image detection, feature-based, index-based, and content-based detection are the most commonly used. Content-based image retrieval has now entered a new research stage, in which retrieval based on visual features uses machine learning to extract image features [11–13]. The artificial neural network (ANN), a machine learning technique [14, 15], is a widely used model that can extract higher-level image features. An ANN is an artificially constructed network that realizes a certain function. It is a theoretical mathematical model of the neural network of the human brain and an information processing system built by imitating the structure and function of that network. It combines logical reasoning, fuzzy processing, and accurate calculation, and it is a complex network formed by connecting a large number of simple neurons.
Traditional neural networks are trained with the BP algorithm, and studies of the BP algorithm have revealed several problems [16–18]. (1) The error correction signal becomes smaller and smaller as it propagates from the top layer to the bottom layer, so the gradient vanishes. (2) It is only guaranteed to converge to a local minimum. In addition, most data in practice are unlabeled, whereas the BP algorithm can only be trained on labeled data. Therefore, it is difficult to apply the BP algorithm widely in practical environments. The counter propagation network (CPN) is an artificial neural network proposed by Robert Hecht-Nielsen, an American scholar, in 1987.
To solve the problem of poor retrieval accuracy for near-similar artwork images, an intelligent artwork image retrieval method based on wavelet transform and the counter propagation neural network (WTCPN) is proposed. The main contributions of this paper are as follows:
(1) Replacing the original artwork image with the low-frequency subimage obtained by wavelet transform removes redundant information, reduces the dimension of the data, and reduces the irrelevant factors that interfere with the recognition performance of CPN.
(2) The CPN is improved by setting a maximum number of wins for each neuron, which effectively overcomes the instability of the competition layer of CPN and avoids local minima in the training process.
2. Literature Review
Content-based image retrieval is the research direction of this paper. Dubey et al. [19] realized a multichannel decoded local binary pattern by means of an adder and a decoder, which can be used effectively for content-based image retrieval and is superior to other multichannel-based methods in average retrieval precision and average retrieval rate. Alshehri [20] proposed an image retrieval method based on BP neural network prediction, which classified and predicted the retrieved data by fuzzy inference with the neural network. However, as analyzed above, the development of traditional neural networks is limited by the problems of the BP algorithm. Recently, traditional neural networks have been replaced by new network models, such as the Boltzmann machine, the convolutional neural network (CNN), and the residual neural network (RNN), which have promoted the further development of this field.
Filip et al. [21] proposed a CNN image retrieval method that requires no manual annotation. It generalizes max pooling and average pooling through a trainable generalized-mean pooling layer and showed better image retrieval performance on the Oxford Buildings and Holiday datasets. Liu et al. [22] addressed the retrieval of large-scale web image resources and realized a simple and effective image indexing framework by combining a pretrained large-scale convolutional neural network with a structured support vector machine. Because this method targets large-scale image retrieval, its retrieval speed is greatly improved, but its retrieval accuracy is not high. Rajkumar and Sudhamani [23] proposed an image retrieval system based on a residual neural network and used the Euclidean distance to measure similarity. The retrieval test was carried out on a dataset of 50,000 web images in 250 categories. Experimental results show that, compared with Google's random query image retrieval system, the performance of the proposed system is improved by 15%. Wang et al. [24] proposed a large-scale similar image retrieval method based on a deep neural network, which learns multilevel nonlinear transformations in a deep framework to obtain high-level image features and achieved good retrieval results.
Digital artwork images usually have very high dimensions, so retrieval algorithms need a long time and a large amount of computation. Moreover, because artwork images are points in a high-dimensional space, feature classification becomes difficult when these points are not compactly distributed. To solve these problems, this paper uses the wavelet transform, which not only reduces the dimension of the artwork image but also filters out high-frequency interference, highlights the main features of the image, and yields a low-dimensional image suitable for neural network recognition. By combining the main features extracted by the wavelet transform with the CPN, a higher retrieval recognition rate is achieved.
3. Image Retrieval Based on Wavelet Transform and Improved CPN Network
3.1. Operation Principle and Learning Algorithm of CPN
CPN has a standard three-layer structure, and the neurons of adjacent layers are fully connected. Figure 1 shows the topology of the CPN, which is composed of the input layer, the competition layer, and the output layer. The input layer and the competition layer form a feature mapping network, and the competition layer and the output layer constitute a basic competitive network.

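Let n, m, and z be the numbers of neurons in the input layer, the competition layer, and the output layer, respectively. The input vector of the CPN is represented by X:
$$X = (x_1, x_2, \ldots, x_n)^T$$
After the competition, the output of the competition layer is expressed by Y:
$$Y = (y_1, y_2, \ldots, y_m)^T, \quad y_j \in \{0, 1\}, \quad j = 1, 2, \ldots, m$$
The output of the network is represented by O:
$$O = (o_1, o_2, \ldots, o_z)^T$$
The expected output of the network is denoted by d:
$$d = (d_1, d_2, \ldots, d_z)^T$$
The weight matrix between the input layer and the competition layer is expressed by V:
$$V = (V_1, V_2, \ldots, V_j, \ldots, V_m)$$
where the column vector $V_j$ is the inner star weight vector corresponding to the jth neuron in the competition layer.
The weight matrix between the competition layer and the output layer is represented by W:
$$W = (W_1, W_2, \ldots, W_k, \ldots, W_z)$$
where the column vector $W_k$ is the weight vector corresponding to the kth neuron in the output layer.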
As shown in Figure 2(a), after each layer of the network has been trained according to the learning rules, input vectors are sent to the network in the running stage, and the competition layer performs the competition calculation on them. The neuron with the maximum net input wins the competition, becomes the representative of the class of the current input pattern, and is the active neuron shown in the figure, with an output value of 1. The remaining neurons are inactive, with an output value of 0.

As shown in Figure 2(b), after a neuron wins the competition, it excites the neurons in the output layer to produce the output pattern shown in the figure. The losing neurons output 0, so they contribute nothing to the net input of the output-layer neurons and do not affect the output value. The output is therefore determined by the outer star weight vector of the winning neuron.
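The running stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cpn_forward` and the list-of-lists weight layout (one weight vector per competition neuron) are assumptions made for the example.

```python
def cpn_forward(x, V, W):
    """Run one input vector through an already trained CPN (illustrative sketch).

    x: input vector of length n
    V: m weight vectors of length n (input layer -> competition layer)
    W: m weight vectors of length z (competition layer -> output layer),
       W[j] being the outgoing weights of competition neuron j
    """
    # Net input of each competition neuron: inner product of its weights with x.
    net = [sum(v_i * x_i for v_i, x_i in zip(v_j, x)) for v_j in V]
    # The neuron with the largest net input wins the competition.
    winner = max(range(len(net)), key=net.__getitem__)
    # Winner-take-all output: 1 for the winner, 0 for every other neuron.
    y = [1 if j == winner else 0 for j in range(len(V))]
    # Each output is the weighted sum of competition outputs, which
    # reduces to the outgoing weights of the winning neuron.
    z = len(W[0])
    o = [sum(W[j][k] * y[j] for j in range(len(W))) for k in range(z)]
    return winner, y, o
```

Because only one competition neuron is active, the returned output equals the stored weight vector `W[winner]`, which is why the network behaves like a lookup from pattern class to output pattern.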
The learning rules of CPN combine unsupervised learning and supervised learning, so each input vector in the training sample set must be paired with its expected output vector. Training is divided into two stages, and each stage adopts its own learning rule. In the first stage, the competitive learning algorithm is used to train the inner star weight vectors from the input layer to the competition layer. The steps are as follows:
(1) All inner star weights are randomly assigned initial values between 0 and 1 and normalized to unit length, and all input patterns in the training set are also normalized.
(2) Input a pattern $X^p$, $p \in \{1, 2, \ldots, P\}$, where P is the total number of patterns in the training set.
(3) Determine the winning neuron $j^*$, that is, the competition-layer neuron with the largest net input. The competition algorithm of CPN has no winning neighborhood, so only the inner star weight vector of the winning neuron is adjusted. The adjustment rule is
$$V_{j^*}(t+1) = V_{j^*}(t) + \alpha(t)\,[X^p - V_{j^*}(t)]$$
where $\alpha(t)$ is the learning rate, an annealing function that decreases with time.
(4) Repeat steps (2) to (3) until $\alpha(t)$ drops to 0. Note that the weight vector must be normalized again after each adjustment.
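The first training stage can be sketched as follows. This is only an illustration under stated assumptions: the linear annealing schedule, the epoch count, the initial learning rate `alpha0`, and the names `normalize` and `train_inner_star` are hypothetical choices, since the text does not specify them.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else list(v)


def train_inner_star(patterns, m, epochs=50, alpha0=0.5, seed=0):
    """Stage 1: competitive learning of the input -> competition weights."""
    rng = random.Random(seed)
    n = len(patterns[0])
    # Step (1): normalize the inputs and the randomly initialized weights.
    X = [normalize(p) for p in patterns]
    V = [normalize([rng.random() for _ in range(n)]) for _ in range(m)]
    for epoch in range(epochs):
        # Assumed annealing schedule: the learning rate decays linearly toward 0.
        alpha = alpha0 * (1 - epoch / epochs)
        for x in X:  # step (2): present each pattern
            # Step (3): the neuron with the largest net input wins.
            net = [sum(a * b for a, b in zip(v, x)) for v in V]
            j = max(range(m), key=net.__getitem__)
            # Move only the winner toward the input, then renormalize it.
            V[j] = normalize([v + alpha * (x_i - v) for v, x_i in zip(V[j], x)])
    return V
```

Only the winning weight vector moves at each presentation, so over time each competition neuron drifts toward the center of the input patterns it wins.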
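In the second stage, the outer star learning algorithm is adopted to train the outer star weight vectors from the competition layer to the output layer. The steps are as follows:
(1) Input a pattern pair $X^p$ and $d^p$, where the weight matrix from the input layer to the competition layer keeps the training result of the first stage.
(2) Determine the winning neuron $j^*$, whose output satisfies
$$y_j = \begin{cases} 1, & j = j^* \\ 0, & j \neq j^* \end{cases}$$
(3) Adjust the outer star weight vector from the competition layer to the output layer. The adjustment rule is
$$w_{jk}(t+1) = w_{jk}(t) + \beta(t)\,y_j\,[d_k - o_k(t)]$$
where $\beta(t)$ is the learning rate of the outer star rule, which is also an annealing function that decreases with time, and $o_k(t)$ is the output value of the kth neuron in the output layer, calculated by
$$o_k(t) = \sum_{j=1}^{m} w_{jk}(t)\,y_j$$
Because only the winning neuron has a nonzero output, $o_k(t) = w_{j^*k}(t)$, and the outer star weight adjustment rule becomes
$$w_{jk}(t+1) = \begin{cases} w_{jk}(t), & j \neq j^* \\ w_{jk}(t) + \beta(t)\,[d_k - w_{jk}(t)], & j = j^* \end{cases}$$
(4) Repeat steps (1) to (3) until $\beta(t)$ drops to 0.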
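The second stage can be sketched in the same spirit. The sketch assumes that the first-stage weights `V` are given and frozen, and it again uses a hypothetical linear annealing schedule; `train_outer_star` and its parameters are illustrative names, not part of the original method description.

```python
import random


def train_outer_star(patterns, targets, V, epochs=50, beta0=0.5, seed=0):
    """Stage 2: supervised learning of the competition -> output weights.

    V holds the stage-1 weights and is not modified here.
    Returns W, where W[j] is the outgoing weight vector of competition neuron j.
    """
    rng = random.Random(seed)
    m, z = len(V), len(targets[0])
    W = [[rng.random() for _ in range(z)] for _ in range(m)]
    for epoch in range(epochs):
        # Assumed annealing schedule: linear decay of the learning rate.
        beta = beta0 * (1 - epoch / epochs)
        for x, d in zip(patterns, targets):
            # Find the winner with the frozen stage-1 weights.
            net = [sum(v_i * x_i for v_i, x_i in zip(v, x)) for v in V]
            winner = max(range(m), key=net.__getitem__)
            # Only the winner's outgoing weights move toward the desired output.
            W[winner] = [w + beta * (d_k - w) for w, d_k in zip(W[winner], d)]
    return W
```

Each update shrinks the gap between the winner's outgoing weights and the desired output by a factor of (1 - beta), so the stored output pattern converges to the targets associated with that winner.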
3.2. Principle of Wavelet Transform
Firstly, the wavelet transform filters the signal [25] with a group of high-pass and low-pass filters at different scales, which decomposes the original signal into high-frequency and low-frequency components. Then, the signals in the different frequency bands are analyzed and processed. Finally, the filtering process is repeated until the preset threshold is reached.
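At different scales $a$, the basic wavelet (mother wavelet) function $\psi(t)$, shifted by $\tau$, takes the inner product with the signal $f(t)$ to be analyzed, which gives the wavelet transform:
$$WT_f(a, \tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(\frac{t - \tau}{a}\right) dt, \quad a > 0$$
The equivalent frequency domain is expressed as follows:
$$WT_f(a, \tau) = \frac{\sqrt{a}}{2\pi} \int_{-\infty}^{+\infty} F(\omega)\,\Psi^{*}(a\omega)\,e^{j\omega\tau}\, d\omega$$
where $F(\omega)$ and $\Psi(\omega)$ are the Fourier transforms of $f(t)$ and $\psi(t)$, respectively.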
Images in real life are generally two-dimensional signals. Therefore, applying the wavelet transform to image processing requires extending it from one dimension to two dimensions. When the two-dimensional wavelet transform decomposes an artwork image in the frequency domain, four regions are obtained: the low-frequency region LL, which is the approximation component, and the high-frequency regions LH, HL, and HH, which are the horizontal, vertical, and diagonal components, respectively. The low-frequency region LL can be subjected to the wavelet transform again.
Figure 3 is a schematic diagram of the first-level and second-level wavelet decompositions, in which LL1 is the low-frequency subimage of the original image, LH1 and HL1 are the horizontal and vertical subimages, and HH1 is the diagonal high-frequency subimage.
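The four-subband decomposition can be illustrated with a short sketch. To stay self-contained it uses the Haar wavelet (the simplest member of the Daubechies family) with pairwise averaging instead of the orthonormal scaling, so it is not the exact filter bank of the method; the function name `haar_dwt2` and the requirement of even image dimensions are assumptions of the example.

```python
def haar_dwt2(image):
    """One level of a 2-D Haar decomposition (averaging variant, illustrative).

    image: list of rows with even height and even width.
    Returns the subbands (LL, LH, HL, HH), each half the size in both directions.
    """
    def split_rows(rows):
        # Low-pass = pairwise average, high-pass = pairwise half-difference.
        low = [[(r[i] + r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in rows]
        high = [[(r[i] - r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in rows]
        return low, high

    def transpose(mat):
        return [list(col) for col in zip(*mat)]

    # Filter along each row first (horizontal direction) ...
    L, H = split_rows(image)
    # ... then along each column by transposing, filtering, and transposing back.
    LL, LH = (transpose(band) for band in split_rows(transpose(L)))
    HL, HH = (transpose(band) for band in split_rows(transpose(H)))
    return LL, LH, HL, HH
```

Calling `haar_dwt2` again on the returned `LL` gives the second-level decomposition, and each level reduces the number of pixels kept in the approximation by a factor of four.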

3.3. Proposed WTCPN
The proposed WTCPN method is divided into two processes: a training process and a recognition process. In the training process, the training samples are first read to form the original training sample set X, and each artwork image is decomposed by the wavelet transform. The wavelet basis function selected here is Daubechies [26], and the wavelet coefficients obtained are independent of each other. The transformed LL part is then selected as the approximation of the original image.
Next, the improved CPN algorithm is used to classify the images. In the recognition process, the test samples are first subjected to the wavelet transform, and their low-frequency subimages are selected as approximations of the test samples. The test sample recognition space is then constructed and classified according to the improved CPN algorithm.
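Let $X^k = (x_1^k, x_2^k, \ldots, x_n^k)^T$ be the kth input pattern of the CPN and $Y = (y_1, y_2, \ldots, y_m)^T$ be the output of the competition layer. The actual output of the output layer is $O^k = (o_1^k, o_2^k, \ldots, o_z^k)^T$, and the desired output of the output layer is $d^k = (d_1^k, d_2^k, \ldots, d_z^k)^T$. The numbers of neurons in the input layer, the competition layer, and the output layer are n, m, and z, respectively. P is the number of input patterns, so $k = 1, 2, \ldots, P$. $V_j = (v_{1j}, v_{2j}, \ldots, v_{nj})^T$, $j = 1, 2, \ldots, m$, is the connection weight vector from the input layer to the competition layer. $W_l = (w_{1l}, w_{2l}, \ldots, w_{ml})^T$, $l = 1, 2, \ldots, z$, is the connection weight vector from the competition layer to the output layer.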
The improved CPN learning algorithm includes the following steps:
(1) Initialize each component of $V_j$ and $W_l$ with a random value in the interval [0, 1]. At the same time, add a variable t (with an initial value of 0) to each neuron in the competition layer to record the number of times that neuron has won. Set the maximum number of wins of a neuron to T, and specify the error tolerance e.
(2) Provide the kth input pattern $X^k$ to the input layer of the network.
(3) Normalize the connection weight vectors $V_j$.
(4) Find the input activation value of each neuron in the competition layer:
$$s_j = \sum_{i=1}^{n} v_{ij}\,x_i^k, \quad j = 1, 2, \ldots, m$$
(5) Find the maximum activation value $s_a$ among the calculated $s_j$. If the counter t of neuron a is less than T, set t = t + 1 and take neuron a as the winning neuron in the competition layer. Otherwise (t ≥ T), select the maximum activation value $s_b$ among the values other than $s_a$. If the counter t of neuron b is less than T, set t = t + 1 and take neuron b as the winning neuron in the competition layer. Otherwise, continue searching the $s_j$ in descending order of activation value. Set the output of the winning neuron g to 1 and the outputs of the remaining neurons to 0. The connection weight vector corresponding to the winning neuron is $V_g$.
(6) Adjust $V_g$ as follows:
$$v_{ig} \leftarrow v_{ig} + \alpha\,(x_i^k - v_{ig}), \quad i = 1, 2, \ldots, n$$
where $\alpha$ is the learning rate.
(7) Adjust the connection weights from the winning neuron g in the competition layer to the neurons in the output layer, while the other connection weights remain unchanged:
$$w_{gl} \leftarrow w_{gl} + \beta\,(d_l^k - w_{gl}), \quad l = 1, 2, \ldots, z$$
where $\beta$ is the learning rate.
(8) Calculate the weighted sum of the input signals of each neuron in the output layer and take it as the actual output $o_l^k$ of that neuron.
(9) Calculate the error between the actual output $O^k$ of the network and the desired output $d^k$.
(10) Judge whether the error calculated in step (9) is less than the error tolerance e. If so, continue to step (11) to learn the next pattern. If it is greater than the error tolerance, return to step (3) to continue learning.
(11) Return to step (2) until all P input patterns have been provided to the network.
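The win-count limit at the heart of the improved algorithm can be sketched as follows. The helper names `select_winner` and `improved_cpn_step` are hypothetical, the weight normalization and the error-tolerance loop are omitted for brevity, and the output-layer update is written as a move of the winner's outgoing weights toward the desired output, which is an assumption about the exact form of the rule.

```python
def select_winner(activations, wins, max_wins):
    """Pick the most active competition neuron that has won fewer than max_wins times.

    wins is updated in place. Returns the winner index, or None when every
    neuron has already reached the limit.
    """
    # Visit neurons from the largest activation to the smallest.
    order = sorted(range(len(activations)), key=lambda j: activations[j], reverse=True)
    for j in order:
        if wins[j] < max_wins:
            wins[j] += 1
            return j
    return None


def improved_cpn_step(x, d, V, W, wins, max_wins, alpha=0.4, beta=0.5):
    """One presentation of the pattern pair (x, d); V, W, and wins are updated in place."""
    # Activation of each competition neuron for this input.
    s = [sum(v_i * x_i for v_i, x_i in zip(v, x)) for v in V]
    g = select_winner(s, wins, max_wins)
    if g is None:
        raise RuntimeError("every competition neuron has reached the win limit")
    # Move the winner's incoming weights toward the input.
    V[g] = [v + alpha * (x_i - v) for v, x_i in zip(V[g], x)]
    # Move the winner's outgoing weights toward the desired output (assumed form).
    W[g] = [w + beta * (d_l - w) for w, d_l in zip(W[g], d)]
    # With a single active neuron, the network output is the winner's outgoing weights.
    return g, list(W[g])
```

Capping the number of wins forces a second or third neuron to take over once a dominant neuron is exhausted, which is how the method spreads different pattern classes across different competition units. The default values of `alpha` and `beta` follow the settings reported in the experiments.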
4. Experimental Results and Analysis
4.1. Common Datasets and Performance Evaluation Methods
The effectiveness of the WTCPN method is analyzed experimentally on two public datasets, Oxford [27] and Holiday [28]. The Oxford buildings dataset contains 55 query images corresponding to 11 different buildings, and each query image has a rectangular region that delimits the building. The Holiday dataset includes 1,491 holiday pictures divided into 500 groups, each of which shows a different scene or object. In the WTCPN algorithm, the error tolerance is e = 0.01, and the learning rates are α = 0.4 and β = 0.5. The input layer has 4096 neuron nodes, the output layer has 21, and the competition layer has 30. The experiments were run on our own server with the Ubuntu 14.04 operating system, an Intel i5 CPU, and a GTX 1060 graphics card. The data are divided into a training dataset and a verification dataset at a ratio of 4 : 1.
The mean average precision (mAP) is used to evaluate the performance of image retrieval. The average precision is the area under the P-R curve, where P represents precision, which is the ratio of the number of retrieved positive samples to the number of all retrieved images, and R represents recall, which is the ratio of the number of retrieved positive samples to the number of all positive samples in the dataset:
$$\mathrm{mAP} = \frac{1}{Q} \sum_{i=1}^{Q} \frac{1}{N_i} \sum_{k=1}^{n} P_i(k)\,\delta_i(k)$$
where Q represents the number of query images, $N_i$ represents the number of dataset images belonging to the same group as the ith query image, $P_i(k)$ represents the precision of the first k results of the ith query image, $\delta_i(k)$ is an indicator whose value is 1 when the kth result of the ith query image belongs to the same group as the query image and 0 otherwise, and n represents the total number of images.
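A small worked sketch of this metric is given below. It assumes that each query is represented by a ranked list of 0/1 relevance flags together with the number of relevant images in the dataset; the function names are illustrative.

```python
def average_precision(ranked_relevance, num_relevant):
    """Average precision of one query.

    ranked_relevance: 0/1 flags for the ranked results (1 = same group as the query)
    num_relevant: number of dataset images in the same group as the query
    """
    hits = 0
    total = 0.0
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / k  # precision of the first k results
    return total / num_relevant if num_relevant else 0.0


def mean_average_precision(queries):
    """queries: list of (ranked_relevance, num_relevant) pairs, one per query image."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)
```

For example, a query whose ranked results are relevant, irrelevant, relevant, with two relevant images in total, has an average precision of (1/1 + 2/3) / 2, or about 0.83.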
4.2. Experimental Results of Public Datasets
Because the convolution kernel size and the activation function have a great influence on the performance of CPN, two controlled experiments are conducted to identify the CPN architecture with the best recognition performance. In the first group, the convolution kernel size was set to 3 × 3, 5 × 5, 7 × 7, and 9 × 9 in four separate runs, with ReLU as the common activation function. The experimental results are shown in Table 1.
The results in Table 1 show that the small 3 × 3 convolution kernel performs best and that the results become worse as the convolution kernel size increases, because a large convolution kernel is too coarse for the small input image.
In the second group of experiments, the convolution kernel size was set to 3 × 3, and the activation function was changed. The network training results of different activation functions are shown in Table 2.
The experimental results show that the network with the ReLU activation function performs better. Figure 4 shows the results retrieved from the two datasets by the WTCPN method. The first and second rows show images from the Holiday dataset, and the third and fourth rows show images from the Oxford dataset. The word Query below an image marks a query image, TP marks a correctly retrieved related image, and FP marks an incorrectly retrieved image.

4.3. Retrieval Performance of Art Datasets
Highly similar artwork images were obtained from open source websites, with a total of more than 3,600 images divided into six categories of 600 images each. The six categories are national clothing (clothing), saddle (saddle), leather hip flask (pnjh), national craft ornaments (gongyi), high hat (gdm), and Ma Touqin (mtq). The data are divided into a training dataset and a verification dataset at a ratio of 4 : 1. The convolution kernel size is 3 × 3, and the activation function is ReLU. The retrieval result of the WTCPN method on the artwork image dataset is shown in Figure 5.

To show the advantages of WTCPN, it is compared with the widely used BP network and CNN network. The three network models have basically the same topology and all adopt a three-layer structure, which ensures that the three methods are compared under the same conditions. The error tolerance during training is the same, and the numbers of neurons in the input layer and the output layer are the same for all three networks. Accuracy denotes the accuracy in the verification process, and loss denotes the loss value in the verification process. Generally, the higher the accuracy and the lower the loss, the better the trained neural network. The retrieval performance of the three network models on the artwork image dataset is compared in Figures 6 and 7.


It can be seen from Figure 6 that the accuracy value of WTCPN is higher than that of BP network and CNN network, which is 91.83%. It can be seen from Figure 7 that the loss value of WTCPN is the lowest among the three network models, which is about 0.2. Therefore, it is obvious that the training results of BP network and CNN network are not as good as those of WTCPN, which means that WTCPN can get the best retrieval accuracy.
5. Conclusion
In this paper, an intelligent artwork image retrieval method based on wavelet transform and the counter propagation neural network (WTCPN) is proposed. By combining the main features extracted by the wavelet transform with CPN, a high retrieval recognition rate is achieved. Test results on the Oxford and Holiday datasets show that the optimal convolution kernel size of the WTCPN method is 3 × 3 and the optimal activation function is ReLU. Test results on the art dataset show that the WTCPN method has the best retrieval accuracy compared with the BP network and the CNN network, with an accuracy of 91.83% and a loss value of about 0.2. In future work on image retrieval in concurrent environments, a message queue (MQ), MapReduce, and a cache strategy can be added to the WTCPN method to improve access and computing capacity under concurrency and to further improve the art image retrieval system.
Data Availability
The datasets used in the findings of the study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.