Abstract

In recent years, with the development of computer technology and the Internet, image databases have grown day by day, and the classification of image data has become one of the important research issues in obtaining image information. This article studies the role of depth algorithms in network art image classification and print propagation extraction. It presents methods for image classification and print dissemination together with deep learning algorithms and conducts corresponding experiments on the role of depth algorithms in image classification. The experimental results show that the neural network model based on the depth algorithm can effectively identify and classify network images, with a recognition accuracy of more than 80%. The image recognition method based on the depth algorithm thus greatly improves the efficiency of image recognition.

1. Introduction

To manage image data efficiently and classify it accurately, manual effort alone is no longer sufficient; we therefore turn to computer processing power and efficient algorithms for quick and accurate classification. The problem of image classification can be studied from two aspects: quick and efficient classification of massive data, and classification of subcategories among similar images. Massive-data image classification classifies images in terms of breadth; in the field of computer vision, this type of problem tests an algorithm's optimization capability. Image content is ever changing, and describing it with only a few keywords obviously cannot meet classification requirements.

Image classification is very common in life. Massive image data is messy and disorganized, and how to extract the content users are interested in has become a topic of great concern to industry and academia. Research on image classification can solve many practical problems in people's lives and work, for example, the classification of items in Taobao shopping: the algorithms behind it support classification across hundreds of millions of products, efficiently letting users find the products they need and increasing their desire to buy. This requires accurate and efficient classification of images.

On the basis of deep learning research, Lee et al. developed a computer-aided detection system based on the CNN algorithm to evaluate its role in the diagnosis and prediction of periodontally damaged teeth [1]. Sudha and Priyadarshini proposed an advanced deep learning method that uses an improved algorithm to detect vehicles of multiple types and in multiple numbers in an input video [2]. Farooq and Bazaz took a different direction within deep learning: they proposed an online incremental learning technique based on artificial neural networks (ANNs) to build an adaptive, noninvasive analysis model of the COVID-19 pandemic that captures the temporal dynamics of disease spread. The model was validated with historical data, and a 30-day forecast of disease transmission was given for the five most severely affected states in India [3]. With the continuous in-depth study of deep learning algorithms, researchers have begun to investigate the application of depth algorithms in image classification. Yan et al. proposed an image classification framework that goes beyond window sampling on a fixed spatial pyramid and is supported by a new learning algorithm [4]. In recent years, methods based on deep learning have also attracted widespread attention in hyperspectral image classification; however, due to their large number of parameters and complex network structure, deep learning methods may perform poorly when only a few training samples are available [5]. On this topic, Su et al.'s research must be mentioned: to optimize hyperspectral image classification and band selection, they proposed an extreme learning machine (ELM) method triggered by the firefly algorithm (FA) [6]. Although these researchers have done a great deal of work on image classification and depth algorithms, they have paid little attention to the problems that remain at the intersection of the two.

The innovation of this article lies in its combined study of image classification, print dissemination, and deep learning algorithms. Through demonstrations of the image classification process and the extraction of image features, it experimentally examines the practical application of depth algorithms in image classification, providing a theoretical basis for popularizing their use.

2. Depth Algorithm and Image Classification

2.1. Image Classification
2.1.1. Image Recognition

In daily life, images can be seen anytime and anywhere and occupy an important position: visual information accounts for more than 70% of the total information we receive, a phenomenon captured by the saying "hearing is not as good as seeing" [7]. In many scenarios, images convey information to us more intuitively than text or other forms of information. Figure 1 shows the common image types in life.

2.1.2. Image Classification Process

The general process of image classification is mainly divided into the following steps: (1) image preprocessing; (2) establishment of a classification model; (3) feature extraction and output classification results [8]. Figure 2 shows the general process of image classification.

There are two main research approaches to image classification [9, 10]. The first is based on manual features; its core is the design of the features. The second is based on deep features; its core is building a deep learning model. Compared with methods based on manual features, methods based on deep features achieve better classification results [11].

There are two types of image features: global features and local features [12]. Local features usually include color features and texture features [13]. Commonly used image feature extraction algorithms include SIFT, SURF, MSER, Harris-Laplace, and Hessian-Affine [14]. Table 1 summarizes these local feature extraction methods.
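As a concrete illustration (a minimal sketch rather than this article's experimental code; the image path is a placeholder), extracting local SIFT keypoints and descriptors with OpenCV looks like this:

```python
import cv2

# Minimal local-feature extraction sketch with OpenCV.
# "artwork.jpg" is a placeholder path; SIFT ships with opencv-python >= 4.4,
# while SURF requires a contrib build for licensing reasons.
img = cv2.imread("artwork.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each detected keypoint yields a 128-dimensional descriptor vector.
print(len(keypoints), descriptors.shape)
```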

2.1.3. Summary of Commonly Used Depth Features in Image Classification

A feature is a piece of information relevant to solving the computational task of a given application; features are specific structures within the image [15].

Deep features are extracted by deep learning networks. Compared with traditional manual features, deep features have better expressive power: high-level deep features are equivalent to combinations of low-level features, are more abstract, can better express a given object, and are more conducive to image recognition and classification. Popular networks for extracting deep features include AlexNet, VGG, GoogLeNet, and ResNet.
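As a hedged illustration (not the experimental setup of this article), deep features can be obtained by taking an ImageNet-pretrained network and removing its final classification layer so that it returns the high-level feature vector; a minimal torchvision sketch:

```python
import torch
import torchvision.models as models

# Extract deep features from a pretrained ResNet-18 (torchvision >= 0.13
# weights API assumed). Replacing the classifier head with Identity makes
# the network output the 512-dimensional deep feature instead of logits.
resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
resnet.fc = torch.nn.Identity()
resnet.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
    features = resnet(image)             # shape: (1, 512)
```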

(1) ResNet Depth Features. ResNet is far deeper than earlier networks: its largest variant has 152 layers, and it was the first network to exceed 100 layers. The biggest difference between this network and others is its use of residual connections; that is, instead of fitting the desired output directly, each block fits the residual between the output and the input, as shown in the ResNet network structure in Figure 3.

It can be seen from Figure 3 that the input a can reach the output directly, and the residual function G(a) becomes the new optimization goal:

$$G(a) = T(a) - a,$$

where T(a) represents the expected mapping output of the layer, so the block ultimately outputs T(a) = G(a) + a.
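As a brief sketch (channel count and layer choices are illustrative assumptions, not the original ResNet configuration), a residual block that learns G(a) and adds the skipped input back might look like this:

```python
import torch
import torch.nn as nn

# A basic residual block: the stacked layers fit the residual G(a),
# and the input a skips directly to the output, giving T(a) = G(a) + a.
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, a):
        g = self.conv2(self.relu(self.conv1(a)))  # the residual G(a)
        return self.relu(g + a)                   # T(a) = G(a) + a
```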

Because of the large number of layers in the network, the residual structure is optimized to reduce the amount of computation. Figure 4 shows the optimized ResNet network structure.

(2) Comparison of Models for Extracting Depth Features. Table 2 shows the summary and comparison result table of several deep learning models.

From the data in Table 2, we can see that the numbers of layers of ResNet, GoogLeNet, VGG, and AlexNet decrease in that order, with ResNet exceeding 100 layers, clearly more than the other networks. The number and size of the fully connected layers of VGG and AlexNet are exactly the same.

2.2. Depth Algorithm

Deep learning algorithms are an important field of machine learning; they classify data by learning feature representations [16]. A deep learning algorithm is a multilayer processing mechanism that simulates how the human brain and nervous system process external information [17].

The CNN is the most common type of depth algorithm. It combines feature extraction and classification into one process and has several advantages over the plain multilayer neural network: (1) The structure of the CNN is more similar to the human visual processing system and is better suited to 2D and 3D images. (2) The CNN uses convolutional layers to implement the convolution operation, which captures the spatial structure of the image, so it can extract features with stronger representational capability. (3) The pooling layers used by the CNN provide shape invariance, reduce network parameters, prevent overfitting, and improve the convergence speed of the network [18]. Figure 5 shows the structure of the CNN.
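To make the structure concrete, here is a minimal CNN sketch in PyTorch reflecting the convolution-pooling-classifier pattern of Figure 5 (layer sizes and the five-class output are illustrative assumptions):

```python
import torch.nn as nn

# Minimal CNN: convolutions capture spatial structure (advantage 2),
# pooling adds shape invariance and shrinks the maps (advantage 3),
# and a fully connected layer performs the final classification.
class SmallCNN(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # For a 224x224 RGB input, two 2x2 poolings leave a 56x56 map.
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```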

2.2.1. Backpropagation of CNN

The backpropagation algorithm is a learning algorithm suitable for multilayer neuron networks and is based on gradient descent. The input-output relationship of a backpropagation network is essentially a mapping: the function computed by a BP neural network with n inputs and m outputs is a continuous, highly nonlinear mapping from n-dimensional Euclidean space to a finite field in m-dimensional Euclidean space. The backpropagation algorithm of the CNN is based on gradient descent, and each iteration is divided into two steps [19]. For sample a, the error is

$$E^a = \frac{1}{2}\sum_{m}\left(t_m^a - y_m^a\right)^2,$$

where the sum runs over the output dimensions, $t_m^a$ represents the target value of the m-th dimension of the a-th sample, and $y_m^a$ represents the m-th dimension of the output for the a-th sample. In a fully connected network, the output of layer l is

$$x^l = f\left(u^l\right), \qquad u^l = W^l x^{l-1} + b^l.$$

The activation function f generally uses the Sigmoid function.

Next, the partial derivatives of the error with respect to the network parameters are required; that is, the error is differentiated with respect to the biases and the weights.

Since $u^l = W^l x^{l-1} + b^l$, $\partial u^l / \partial b^l = 1$, and $\partial E / \partial b^l = \partial E / \partial u^l = \delta^l$; thus, the sensitivity of the hidden layer can be derived:

$$\delta^l = \left(W^{l+1}\right)^{T} \delta^{l+1} \circ f'\left(u^l\right).$$

The sensitivity of the output layer is directly determined by the error at that layer, which is expressed as follows:

$$\delta^L = f'\left(u^L\right) \circ \left(y - t\right).$$

Since no error function can be used directly in the hidden layers, the backpropagation algorithm uses the above formula to propagate the error backward through each network layer [20]. Finally, the neuron weights are updated. The update rule takes the outer product of the error signal (sensitivity) vector and the layer's input vector:

$$\frac{\partial E}{\partial W^l} = \delta^l \left(x^{l-1}\right)^{T}, \qquad \Delta W^l = -\eta\,\frac{\partial E}{\partial W^l},$$

where η is the learning rate.
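To ground the derivation, here is a minimal NumPy sketch of one backpropagation step for a single-hidden-layer network (the shapes, random data, and learning rate are illustrative assumptions):

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# One gradient-descent step mirroring the derivation above: forward pass,
# output sensitivity, backpropagated hidden sensitivity, then
# outer-product weight updates.
def backprop_step(a, t, W1, W2, lr=0.1):
    x1 = sigmoid(W1 @ a)                      # hidden activation x^1 = f(u^1)
    y = sigmoid(W2 @ x1)                      # network output
    delta2 = (y - t) * y * (1 - y)            # output sensitivity; f'(u) = y(1-y)
    delta1 = (W2.T @ delta2) * x1 * (1 - x1)  # hidden sensitivity
    W2 -= lr * np.outer(delta2, x1)           # dE/dW^2 = delta^2 (x^1)^T
    W1 -= lr * np.outer(delta1, a)            # dE/dW^1 = delta^1 a^T
    return W1, W2

# Example usage: 4 inputs, 3 hidden units, 2 outputs, random data.
rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((3, 4)), rng.standard_normal((2, 3))
W1, W2 = backprop_step(rng.standard_normal(4), np.array([0.0, 1.0]), W1, W2)
```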

2.2.2. CNN Gradient Descent

For the convolutional layer, the convolution process of the CNN can be expressed as

$$x_j^l = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\right),$$

where $M_j$ is the set of input feature maps, $k_{ij}^l$ is the convolution kernel, and $b_j^l$ is the additive bias.

As in the backpropagation algorithm, the gradient descent process of the CNN multiplies the derivative of the convolutional layer's activation function, elementwise, with the error signal map upsampled from the following downsampling layer. Since the weights in the downsampling layer are all θ, it must also multiply by θ:

$$\delta_j^l = \theta_j^{l+1}\left(f'\left(u_j^l\right) \circ \mathrm{up}\left(\delta_j^{l+1}\right)\right),$$

where up(·) denotes the upsampling operation.

Then, given the feature map's error signal, the bias gradient can be found by summing the elements of the error signal map:

$$\frac{\partial E}{\partial b_j} = \sum_{u,v}\left(\delta_j^l\right)_{uv}.$$

Finally, the gradient of the kernel is calculated by backpropagation. Because the weights are shared, the gradients of all positions that use a given weight must be summed:

$$\frac{\partial E}{\partial k_{ij}^l} = \sum_{u,v}\left(\delta_j^l\right)_{uv}\left(p_i^{l-1}\right)_{uv}.$$

Here, $(p_i^{l-1})_{uv}$ is the patch of $x_i^{l-1}$ that is multiplied elementwise by $k_{ij}^l$ during convolution: the value at position (u, v) of the output convolution map is obtained by multiplying the patch at position (u, v) of the previous layer by $k_{ij}^l$.
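A small NumPy sketch makes the two gradients above concrete (shapes are assumed; a single input and output channel are used for clarity):

```python
import numpy as np

def correlate2d_valid(x, k):
    """'Valid' 2-D cross-correlation: slide k over x without padding."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for u in range(oh):
        for v in range(ow):
            out[u, v] = np.sum(x[u:u + kh, v:v + kw] * k)
    return out

rng = np.random.default_rng(0)
x_prev = rng.standard_normal((8, 8))   # input feature map x^{l-1}
delta = rng.standard_normal((6, 6))    # error signal map of the conv layer

# Bias gradient: sum of all elements of the error signal map.
grad_b = delta.sum()
# Kernel gradient: each kernel weight multiplies one pixel of every patch,
# so summing delta_uv * patch_uv equals a 'valid' correlation of the input
# with the error signal map (here recovering a 3x3 kernel gradient).
grad_k = correlate2d_valid(x_prev, delta)  # shape (3, 3)
```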

For the downsampling layer, the numbers of input and output feature maps are the same, but each output map is a downsampled version of the corresponding input map, so the size of the output map changes. The layer is computed as

$$x_j^l = f\left(\theta_j^l\,\mathrm{down}\left(x_j^{l-1}\right) + y_j^l\right),$$

where down(·) is the downsampling function, θ is the multiplicative bias, and y is the additive bias.

To compute the gradients, first calculate the error signal map of this layer. Because a convolutional layer follows the downsampling layer, it is necessary to clarify which area of the input map corresponds to each output pixel. Formula (14) shows the calculation for a given pixel:

$$\delta_j^l = f'\left(u_j^l\right) \circ \mathrm{conv2}\left(\delta_j^{l+1}, \mathrm{rot180}\left(k_j^{l+1}\right), \text{'full'}\right). \tag{14}$$

After the error signal map is obtained, the gradients of the multiplicative bias θ and the additive bias y can be calculated. The gradient of the additive bias y is simply the sum of the elements of the error signal map:

$$\frac{\partial E}{\partial y_j} = \sum_{u,v}\left(\delta_j^l\right)_{uv}.$$

The multiplicative bias θ is associated with the original downsampling map, that is, the feature map obtained after downsampling but before the additive bias is applied. These maps can be stored directly during forward propagation, so there is no need to recompute them during backpropagation.

Therefore, the gradient of the error with respect to θ can be expressed as

$$\frac{\partial E}{\partial \theta_j} = \sum_{u,v}\left(\delta_j^l \circ d_j^l\right)_{uv}, \qquad d_j^l = \mathrm{down}\left(x_j^{l-1}\right).$$
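A minimal sketch of the downsampling-layer gradients, assuming down(·) is 2x2 mean pooling and the map sizes are illustrative:

```python
import numpy as np

def down2(x):
    """The down(.) operator as 2x2 mean pooling over non-overlapping blocks."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

rng = np.random.default_rng(0)
x_prev = rng.standard_normal((8, 8))  # input map x^{l-1}
delta = rng.standard_normal((4, 4))   # error signal map of this layer

d = down2(x_prev)                # the stored "original downsampling map"
grad_y = delta.sum()             # additive bias: sum of error-signal elements
grad_theta = (delta * d).sum()   # multiplicative bias: pair delta with d
```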

2.2.3. Image Recognition Based on Deep Neural Network

(1) The Method of Initializing Weights in Neural Networks and the Existing Problems. In deep learning, the weight initialization method of a neural network has a crucial impact on the convergence speed and performance of the model [21-23]. In a deep neural network, as the number of layers increases, gradient vanishing or gradient explosion can easily occur during gradient descent. Therefore, the initialization of the weights is very important. Although good weight initialization cannot completely solve gradient vanishing and gradient explosion, it is very helpful in mitigating both problems and is very beneficial to model performance and convergence speed. In deep neural networks, the weights are commonly initialized from a standard normal distribution with a mean of 0 and a standard deviation of 1 [24].

(2) Improved Initialization Weight Method. The improved method initializes the weights according to a normal distribution with a mean of 0 and a standard deviation of $1/\sqrt{r}$, where r represents the number of neurons in the input layer. For a neuron with pre-activation z, it is known that

$$z = \sum_{i=1}^{r} w_i a_i. \tag{18}$$

Expanding formula (18), we get

$$\mathrm{Var}(z) = \mathrm{Var}\left(\sum_{i=1}^{r} w_i a_i\right). \tag{19}$$

Transforming formula (19) based on the properties of variance (assuming the weights and inputs are independent with zero mean), we get the following result:

$$\mathrm{Var}(z) = \sum_{i=1}^{r} \mathrm{Var}(w_i)\,\mathrm{Var}(a_i) = r\,\mathrm{Var}(w)\,\mathrm{Var}(a).$$

If the variance of z is to be similar to that of the input data a, that is, if the degrees of dispersion of the pre-activations and the input data are to be similar, then the weights must be initialized to obey a normal distribution with a variance of 1/r.
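A quick numerical check of this derivation (the layer width r = 784 is an illustrative assumption):

```python
import numpy as np

# With unit-variance inputs, naive N(0, 1) weights blow the pre-activation
# variance up to ~r, while scaled N(0, 1/r) weights keep it near 1.
rng = np.random.default_rng(0)
r = 784
a = rng.standard_normal((10000, r))             # unit-variance input data

w_naive = rng.standard_normal(r)                # std 1
w_scaled = rng.standard_normal(r) / np.sqrt(r)  # std 1/sqrt(r)

print(np.var(a @ w_naive))   # ~ r (about 784): prone to exploding signals
print(np.var(a @ w_scaled))  # ~ 1: dispersion matches the input data
```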

2.3. Print Dissemination

Printmaking is a form of painting with strong artistic expressiveness, a variety of visual languages, and inherent reproducibility [25]. Compared with other types of painting, prints are not made as directly as traditional Chinese paintings and oil paintings: they must be created with the help of woodblocks, copper plates, stone plates, silk screens, etc. Figure 6 shows the common types of prints.

Communication media are different from communication forms. Communication forms refer to the specific ways in which communicators act on the audience during communication activities, such as oral communication, letter communication, image communication, and comprehensive communication.

The transformation of modern printmaking's communication function stems mainly from the development of modern printmaking itself, which has changed its direction of development. The artistry of printmaking has been strengthened, and works now take rich and varied forms, breaking through the boundaries of the medium. For modern printmaking, its full value lies not only in the spirit embodied in the picture, but also in the individuality of the printing techniques and materials and in the technical spirit revealed during the printing process. In many current exhibitions, we can easily see that prints are already very rich and diverse. Printmaking has a unique pluralism, and its capacity for dissemination is particularly strong: prints can not only be popularized, but also be kept affordably at home by enthusiasts, given to friends, and so on. This is conducive to the spread and development of prints.

3. Image Recognition and Classification Experiment

3.1. Image Classification Feature Extraction Experiment

Global features refer to the overall attributes of an image; common global features include color features, texture features, and shape features, such as intensity histograms. Because they are low-level visual features at the pixel level, global features offer good invariance, simple calculation, and intuitive representation. The traditional feature extraction methods here mainly use six features, namely, Dense SIFT (DSIFT), color histogram, SURF, HOG, LBP, and Gabor features, and four classifiers, namely, SVM, Random Forest, Naive Bayes, and KNN. The experiment mainly uses single features and combinations of multiple features, encodes the features with the BoW model, and finally compares a variety of classification methods.
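A sketch of such a BoW pipeline (file paths, labels, and the vocabulary size of 200 are illustrative placeholders, not the article's settings): extract local descriptors, cluster them into a visual vocabulary, encode each image as a histogram of visual words, then train a classifier on the histograms.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

sift = cv2.SIFT_create()

def descriptors(path):
    """Local SIFT descriptors of one image (128-d rows)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(img, None)
    return desc if desc is not None else np.zeros((1, 128), np.float32)

def bow_histogram(desc, vocab):
    """Encode descriptors as a normalized visual-word histogram."""
    words = vocab.predict(desc)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

train_paths = ["prints/p01.jpg", "oils/o01.jpg"]  # placeholder file list
labels = [0, 1]                                   # placeholder class labels

all_desc = np.vstack([descriptors(p) for p in train_paths])
vocab = KMeans(n_clusters=200, n_init=10).fit(all_desc)  # visual vocabulary
X = np.array([bow_histogram(descriptors(p), vocab) for p in train_paths])
clf = SVC(kernel="rbf").fit(X, labels)
```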

For the CNN method, this experiment uses the classic AlexNet and ResNet-18 network models with two training modes. The first is direct training, in which the network parameters are randomly generated. The second is pretraining, in which the network parameters are initialized from a pretrained model, usually one trained on the ImageNet database.
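The two modes can be sketched with torchvision's ResNet-18 (the five-category head is an assumption, and the torchvision >= 0.13 weights API is used):

```python
import torch.nn as nn
import torchvision.models as models

# Mode 1: direct training -- all parameters randomly initialized.
scratch = models.resnet18(weights=None, num_classes=5)

# Mode 2: pretraining -- load ImageNet parameters, then replace the
# classifier head to match the target categories before fine-tuning.
pretrained = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
pretrained.fc = nn.Linear(pretrained.fc.in_features, 5)
```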

3.1.1. Experimental Results of Single-Feature Classification

Table 3 shows the experimental results of the single-feature classification methods, where Bayes denotes the Naive Bayes classifier. Except for the HSV color histogram, all features are encoded with the BoW method; the HSV color histogram is classified directly after normalization, without BoW feature coding.

It can be seen from Table 3 that directly using the HSV color histogram feature gives the best classification accuracy, with an average of 80%, followed by the DSIFT feature; the worst are the Gabor and LBP features, whose classification accuracy lies between only 20% and 25%.

3.1.2. Multiple-Feature Combination Experiment Results

In single-feature classification, DSIFT and SURF achieve the better classification results, so these two features are mainly combined with the other features. Table 4 shows the classification results of the two-feature combinations.

3.2. Image Recognition Experiment Based on Depth Algorithm

To verify the role of the depth algorithm in the recognition of images and prints, this experiment randomly selected various prints, Chinese paintings, oil paintings, watercolors, and other network art images for recognition and classification. During recognition, different network models were used across repeated experiments to verify whether the neural network model based on the depth algorithm has more advantages in image recognition and classification than the other network models. Tables 5 and 6 show the test results of image classification performance based on deep learning and the comparison of classification results across the different network models, respectively.

4. Results of Image Recognition and Classification Experiment

4.1. Experimental Results of Feature Extraction Experiments Based on Image Classification

In the experiment, the results of the single-feature classification experiment and the multiple-feature combination experiments were recorded. In single-feature classification, it is clear that the HSV color histogram feature gives the best classification accuracy. In the multifeature combination experiment, the conclusion cannot be read directly from the large volume of data; therefore, from the data in Table 4, we extract the classification results of the DSIFT and SURF features combined with the HSV color histogram, as shown in Figure 7.

According to Figure 7, the HSV color histogram combined with the DSIFT and SURF features achieves better classification performance; the combination with the SURF feature gives the best classification accuracy, with an average accuracy rate of 80%.

In this experiment, two methods of initializing the CNN weights are used to classify and recognize image features, namely, Gaussian-distribution initialization and the Xavier initialization method. Figure 8 shows the experimental results of the two weight initialization methods.

According to Figure 8, it can be clearly seen that CNN image classification based on the Xavier initialization method is more suitable for image recognition: its recognition accuracy stabilizes at 86% to 90% after a certain number of iterations, and as the number of network layers increases, the Xavier initialization method yields a higher recognition rate. Figure 9 compares the image recognition accuracy of the classic AlexNet and ResNet-18 network models in the two training modes.

It can be seen from Figure 9 that the accuracy of image recognition in the pretraining mode is greatly improved over the direct training mode. In the pretraining mode, the accuracy of the AlexNet network model stays between 90% and 93%, while in the direct training mode its recognition accuracy is only 87%.

4.2. Results of Image Recognition Experiment Based on Depth Algorithm

According to the experimental data in Tables 5 and 6, comparison charts of the recognition accuracy for different image types and of the image recognition rate under different network models can be obtained, as shown in Figure 10.

According to Figure 10, image recognition based on the depth algorithm maintains high accuracy in the recognition of prints and web images: throughout the experiment, the recognition accuracy reached 80% regardless of image type. Moreover, the CNN image recognition model based on the depth algorithm surpasses the other network models in recognition rate, accuracy, and recognition speed.

5. Conclusions

With the continuous increase in the types and quantity of network images, traditional network image classification methods can no longer meet the needs of efficient management, and classification places ever higher demands on the professional knowledge and skills of classification personnel. The use of the CNN has greatly improved the accuracy of image recognition and further improved the efficiency of network image recognition, classification, and organization. Its strong feature extraction capability should also be applied widely to more classification and retrieval tasks.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that there are no conflicts of interest with any financial organizations regarding the material reported in this manuscript.

Acknowledgments

This work was supported by the “Xing Liao Talents Plan” project of Liaoning Province (no. XLYC1907112).