Abstract
Strengthening the construction of intellectual property rights of SM-TE (small and medium-sized scientific and technological enterprises) in China is an important measure to speed up the development of SM-TE and to improve their scientific and technological innovation ability and market competitiveness. In this paper, a patent recommendation algorithm based on deep semantic similarity is proposed to solve the problem of low accuracy in calculating the similarity matrix among users when the interaction matrix is sparse. The algorithm trains a Doc2vec DL (Deep Learning) model on the patent corpus and then constructs the semantic similarity matrix among patents through this model. On this basis, to further improve semantic expression and feature extraction, this paper optimizes a CNN (Convolutional Neural Network) model, using a variety of pretrained word vector models, multi-layer classifiers, and related techniques to improve model accuracy and generate feature vectors of different dimensions. The results show that the accuracy, recall rate, and F1 value of the proposed algorithm are better than those of the traditional recommendation algorithm, reaching 22.41%, 20.86%, and 21.51%, respectively. The experiment shows that the approach can guide Chinese enterprises in establishing and improving a risk warning system for independent intellectual property rights, thus reducing enterprise losses.
1. Introduction
SM-TE (Small and Medium-sized Scientific and Technological Enterprises) innovation is an important force to promote social progress, stimulate national economic growth and consolidate the national independent innovation strength. At present, international competition is mainly reflected in the competition of independent innovation forces. Compared with developed countries, China’s small and micro-enterprises in science and technology have insufficient innovation ability, and the low level of intellectual property management is also an urgent problem to be solved. Intellectual property ownership has become an important index to measure the core competitiveness and innovation ability of enterprises. Chinese enterprises urgently need to use independent intellectual property rights to break the international monopoly and blockade, go abroad and strive for greater development space. Due to the simple industrial structure of SM-TE, it is easy to pay attention to the technological development of the leading industries. In addition, SM-TE has a close connection with the market and high market sensitivity. SM-TE with innovative ability must become a new force for independent intellectual property innovation in China.
Risk early-warning research is a hot topic both domestically and internationally. Domestic scholars have looked into the medium-term early warning mechanisms of start-up companies, commercial bank loan risk early warning mechanisms, marketing risk early warning mechanisms, and financial risk early warning mechanisms of small and medium-sized businesses, as well as knowledge management risk early warning and knowledge capital risk early warning mechanisms [1–3]. According to He et al., there is generally no financial risk in enterprises during periods of rapid economic growth, and there are many factors that affect the financial risk of enterprises [4], the most prominent of which are the economic situation, stock price, and inflation. Deng et al. used the univariate analysis method to compare 79 companies from crisis and normal enterprises [5]. Finally, it is discovered that cash flow divided by total liabilities is the best predictor of an enterprise’s financial crisis. Liang et al. used multivariate analysis to assess enterprise financial risk early warning. This method combines financial ratios with multivariate judgment to provide an early warning system for financial risk [6]. In their study, Niu et al. used both cash flow and non-cash flow indicators and proposed a research idea of financial early warning based on cash flow [7]. There are, however, few studies on early detection of intellectual property risks. This is because the emergence of intellectual property risk in the process of enterprise independent innovation is influenced by a variety of factors, and it is difficult to predict intellectual property risk in the process of independent innovation.
Deep learning (DL) is a relatively new branch of machine learning that grew out of neural network research. A DL network has a large number of hidden layers, which give it a complicated internal mapping relationship. Thanks to this complex internal mapping, it can learn the effective characteristics of data and has a strong learning ability. Both the DL network and the BP neural network (BPNN) are machine learning models, but they differ significantly: a BPNN is a shallow neural network, whereas a DL network is a multi-layer deep neural network. Many academic and practical examples demonstrate that DL is better at representing complex functional relationships. The goal of this paper is therefore to apply DL to the early warning of SM-TE’s independent intellectual property risk, to put the scientific concept of development into practice in this field, and to use it flexibly to protect the security of the company’s independent intellectual property, which has both theoretical and practical implications. The innovations of this paper are as follows:
(1) This paper studies the existing research on intellectual property risk early warning in depth. Most existing work focuses on the identification, assessment, and control of intellectual property risk and on response measures; this paper instead tries to build an early warning system for intellectual property risk. The whole system is divided into a risk identification subsystem, a risk assessment subsystem, and a risk early warning subsystem, which is conducive to risk prevention and control throughout the whole process of intellectual property development.
(2) In the aspect of collaboration among users, a patent recommendation algorithm based on deep semantic similarity is proposed to address the low accuracy of the user similarity matrix when the interaction matrix is sparse. The algorithm extracts the nearest neighbors of the target user, estimates the target user’s patent scores from the neighbors’ patent scores, sorts the patents by score, and recommends the patents with the highest scores to the target user.
(3) To further improve the semantic expression and feature extraction ability of the model, the neural network model for feature extraction and analysis of patent texts is optimized and enhanced. Through relevant experiments, the improved model is evaluated and analyzed on multiple pretrained word vector models and multiple data sets.
2. Related Work
2.1. Risk Early Warning Research
Neuner’s research shows that the financial distress of an enterprise does not necessarily lead to bankruptcy or reorganization, but a bankrupt or reorganized enterprise must have been in financial distress [8]. Liu et al. believe that an enterprise is in financial trouble when its serious cash-out problem cannot be solved by conventional means and its operation or structure needs large-scale restructuring [9]. Liu’s model for financial risk early warning has many limiting assumptions compared with multivariate judgment. The logistic model has lower data requirements and is more widely applicable, so it is a better method [10].
Hz et al. introduced artificial neural networks into the field of financial risk early warning and chose a three-layer neural network for early warning. They also carried out an empirical analysis with multiple judgment methods and compared the results [11]. The results show that the accuracy and fault tolerance of artificial neural networks are better. Zhang et al. introduced qualitative indicators such as working environment, internal control, external environment, and business environment, and analyzed them in combination with traditional quantitative indicators such as solvency and profitability indicators [12]. Although the approach was not tested on specific data, introducing qualitative indicators is a good idea. Yudo et al. established a financial risk judgment index system for oil companies and, based on it, a financial risk early warning model using a fuzzy neural network [13]. Li et al. used data processed by a function-transformation GM(1,1) model as the input of a BPNN to give early warning of financial risks [14].
2.2. Research Status of DL Network
DL-based models and algorithms have made remarkable achievements in the fields of computer vision and speech processing. At present, the application of DL in natural language processing has gradually matured. In some natural language processing tasks, such as text classification, sentiment analysis, DL method shows greater advantages than traditional text processing methods.
Nateghi et al. verified through experiments that DL methods using unsupervised training at all levels can describe complex functions well and avoid over-fitting problems caused by network training [15]. Colombo et al. have made great success in using DL neural networks. The input values of its model do not contain artificial features but image pixels, which has become a great breakthrough in the field of image recognition [16]. Panwar et al. combined the grey prediction model and neural network model to study financial early warning [17]; Dhuri et al. use statistical methods to optimize the artificial neural network model and improve the financial early warning model based on the neural network with higher reliability [18]; Hui et al. used DL method to build a neural network model to predict the financial distress of enterprises, with high accuracy [19].
Chen et al. used DL method to extract the features of objects, initialized the network, and then used back propagation algorithm to fine-tune the network parameters [20]. Liu et al. used a self-coding DL neural network in the field of speech recognition [21]. Firstly, DL method was used to extract the features of speech signals. Then it is tested by BPNN and DL network respectively. The results show that the accuracy of DL method is nearly 20% higher than that of the traditional BPNN method, and it has a good effect. Wei et al. have studied the application of DL network in the prediction of stock index futures. In this paper, an automatic encoder and other algorithms are used to establish DL network model, and the comparison is made. Finally, a network prediction system for trading is constructed according to trading choices [22].
3. Methodology
3.1. Patent Recommendation Algorithm
Many science and technology small and microenterprises have yet to develop a perfect intellectual property incentive system, have yet to sign intellectual property confidentiality agreements with their employees, and have neglected intellectual property protection negotiations when collaborating with external units. According to synergy theory, the enhancement of nonlinear interaction among all system elements (capital, technology, equipment, R&D personnel, etc.) leads to the creation of innovation, and the related energy is greater than the innovation energy. Individual movement is governed by coordinated movement, and the system is well-structured, resulting in innovative achievements with dissipative structure characteristics.
Intellectual property is an important wealth and resource [1], which is vital to the development of enterprises and countries. Intellectual property not only represents the core competitiveness of enterprises but also represents the comprehensive national strength of the country. As an important intellectual property right, patent symbolizes the power of various scientific and technological achievements, and it is essential to protect the core technologies of enterprises and countries. Enterprises with high patent content have the initiative to survive and develop [2], while countries with high patent content have competitive advantages in terms of scientific and technological strength and comprehensive national strength [3, 4].
In this section, a patent recommendation algorithm based on deep semantic similarity is proposed. It uses the Doc2vec DL model and a completion strategy to fill the sparse interaction matrix between users and patents, which addresses the low accuracy of the user similarity matrix computed from sparse interactions. Filling the matrix alleviates the sparsity problem and improves the recommendation effect. Once the matrix is filled, the algorithm analyses the collaboration relationship between users, finds potential neighbors with similar interests, uses the neighbors’ scores to predict unknown patent scores, and recommends patents in order of predicted score.
A cross-patent similarity matrix is a matrix that contains all patents in both the horizontal and vertical directions. The intermediate data is the semantic similarity of cross-patents calculated by the Doc2vec DL model, also known as a deep semantic patent similarity. Two patent documents are used as input parameters of the Doc2vec DL model after training, and the vectors of the two patent documents are generated separately. The cosine similarity formula is then used to calculate the semantic similarity of the two patent documents.
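The cosine step above can be illustrated with a minimal Python sketch. It assumes the Doc2vec paragraph vectors have already been produced; the three-dimensional vectors below are hypothetical stand-ins, and the function names are chosen here for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Symmetric patent-by-patent matrix of cosine similarities."""
    n = len(vectors)
    return [[cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical toy vectors standing in for Doc2vec paragraph vectors.
patent_vectors = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
sim = similarity_matrix(patent_vectors)
```

Each row and column of `sim` corresponds to one patent, matching the layout of the cross-patent similarity matrix described above.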
Combined with the cross-patent similarity matrix, the scores of unscored patents are predicted and the interaction matrix is completed. The predicted score is computed from the following quantities: $P_{u,j}$ represents the predicted score of the unscored patent $j$ by the $u$-th registered user; $I_u$ represents the set of patents scored by the $u$-th registered user; $i$ represents a specific patent in the set $I_u$; $j$ represents a specific patent outside the set $I_u$; $\mathrm{sim}(i, j)$ represents the similarity of patents $i$ and $j$; $r_{u,i}$ represents the score of the $u$-th registered user on patent $i$; and $\theta$ represents the threshold value, which is a custom value between 0 and 1.
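The exact prediction formula is not reproduced here, so the following Python sketch shows only one plausible form: a similarity-weighted average over the user's scored patents whose similarity to the target patent exceeds the threshold. The weighting scheme, the strict `>` comparison, and all names (`predict_score`, `complete_row`, `theta`) are assumptions made for illustration, not the paper's confirmed method.

```python
def predict_score(user_scores, sim, target, theta):
    """Estimate one user's score for an unscored patent.

    user_scores: dict mapping patent index -> score given by the user
    sim:         square patent similarity matrix (list of lists)
    target:      index of the unscored patent
    theta:       similarity threshold in [0, 1]
    Returns None when no scored patent is similar enough to the target.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for patent, score in user_scores.items():
        s = sim[patent][target]
        if s > theta:  # assumed: only sufficiently similar patents contribute
            weighted_sum += s * score
            weight_total += s
    return weighted_sum / weight_total if weight_total > 0 else None


def complete_row(user_scores, sim, n_patents, theta):
    """Fill the user's row of the interaction matrix wherever a prediction exists."""
    row = dict(user_scores)
    for j in range(n_patents):
        if j not in user_scores:
            predicted = predict_score(user_scores, sim, j, theta)
            if predicted is not None:
                row[j] = predicted
    return row


# Hypothetical toy data: three patents, one user who scored patents 0 and 2.
sim = [[1.0, 0.8, 0.1],
       [0.8, 1.0, 0.7],
       [0.1, 0.7, 1.0]]
scores = {0: 4.0, 2: 2.0}
completed = complete_row(scores, sim, 3, theta=0.6)
```

Under this assumed form, a low threshold fills many entries from weakly similar patents, while a high threshold leaves most entries empty, which is the trade-off the threshold is meant to control.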
The specific steps of the patent recommendation algorithm based on deep semantic similarity are shown in Figure 1:
(1) Enter the original parameters of the recommendation method;
(2) Complete the interaction matrix between all registered users and all patents;
(3) Calculate the similarity matrix among all registered users;
(4) Obtain the nearest-neighbor user list from the similarity matrix of all registered users;
(5) Find the list of patents that may be recommended according to the nearest-neighbor user list;
(6) Predict the score of the target user on each candidate patent;
(7) Output the recommendation list to the user according to the scores.
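Steps (3) to (7) can be sketched in Python as a small user-based collaborative filter. This is a minimal illustration under stated assumptions: user similarity is taken as cosine similarity over commonly scored patents, and scores are predicted as a similarity-weighted average of neighbor scores. Neither choice is confirmed by the text, and the names `user_similarity`, `recommend`, `k_num`, and `top_n` are hypothetical.

```python
import math


def user_similarity(row_a, row_b):
    """Cosine similarity over the patents present in both users' rows."""
    common = set(row_a) & set(row_b)
    if not common:
        return 0.0
    dot = sum(row_a[p] * row_b[p] for p in common)
    norm_a = math.sqrt(sum(row_a[p] ** 2 for p in common))
    norm_b = math.sqrt(sum(row_b[p] ** 2 for p in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(completed, original, target, k_num, top_n):
    """completed: rows after matrix completion; original: rows of real scores."""
    # Steps (3)-(4): rank the other users by similarity, keep the k_num nearest.
    neighbours = sorted(
        ((user_similarity(completed[target], completed[u]), u)
         for u in completed if u != target),
        reverse=True)[:k_num]
    # Step (5): candidates are patents scored by neighbours but not by the target.
    seen = set(original[target])
    candidates = {p for _, u in neighbours for p in original[u]} - seen
    # Step (6): similarity-weighted average of the neighbours' scores.
    predictions = {}
    for p in candidates:
        raters = [(s, u) for s, u in neighbours if p in original[u]]
        total = sum(s for s, _ in raters)
        if total > 0:
            predictions[p] = sum(s * original[u][p] for s, u in raters) / total
    # Step (7): highest predicted score first, patent id as tie-breaker.
    ranked = sorted(predictions, key=lambda p: (-predictions[p], p))
    return ranked[:top_n]


# Hypothetical toy data; here the completed rows equal the original rows.
original = {
    "alice": {"p1": 5.0, "p2": 4.0},
    "bob": {"p1": 5.0, "p2": 4.0, "p3": 5.0, "p4": 2.0},
    "carol": {"p1": 1.0, "p2": 5.0, "p4": 4.0},
}
recommended = recommend(original, original, "alice", k_num=2, top_n=2)
```

In a full pipeline, the `completed` argument would hold the rows produced by step (2), so that neighbor search benefits from the filled-in scores while candidate patents are still drawn from real scores.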

3.2. Neural Network Model of Patent Feature Extraction
The quality of patent features that characterise the content of patent text is the key to patent text analysis. In the field of natural language processing, text classification problems are first classified using expert-defined rules, and then a knowledge-engineered expert classification system is created. Rules and knowledge systems limit the problems that these two methods can solve, and they are time-consuming and inaccurate. The deep learning method based on word vector and CNN (Convolutional Neural Network) has been gradually tested and practised in text classification to overcome the disadvantages of feature extraction in traditional machine learning methods. This paper proposes a deep learning-based feature extraction method for patent text, combining the application of deep learning in the field of natural language processing.
Considering the performance advantages of deep learning in natural language processing, especially text classification, this paper proposes a neural network model based on text classification for patent feature extraction and patent analysis. The neural network model used in this paper is a supervised learning model, so it requires labeled data sets for training. Model selection, structure, and parameter optimization are all considered. TextCNN is a representative model that uses CNN to perform NLP tasks [18]. It combines the ideas of CNN and the N-gram language model, extracts context features of different ranges from text vectors through convolution kernels of different sizes, and then uses the max-pooling operation to retain the most salient extracted features, thus improving the feature extraction ability and the classification effect on texts.
Assuming that the word vectors of a text are represented by $x_{1:n}$, TextCNN is divided into three stages: the convolution layer, the pooling layer, and the fully connected layer, as shown in Figure 2.

The input layer is $x_{1:n} = x_1 \oplus x_2 \oplus \cdots \oplus x_n$, which represents the word vectors of a patent text. $\oplus$ represents the splicing (concatenation) operation, and $x_{i:j}$ represents the splicing of the $i$-th to $j$-th word vectors in the patent text. $x_{1:n}$ is used as the input of the convolution layer.
Because the Attention mechanism can highlight the key features in long sentences, this paper puts forward the “Word2vec + Attention” model, that is, a set of feature weight matrices corresponding to word vectors are obtained by word vector training, and the final text vector representation is obtained by weighting word vectors based on weights.
Assume that the word vectors of a patent text are $x_1, \ldots, x_n$. In the Word2Vec + Attention model, $u_i$ is the hidden representation calculated from $x_i$, $\alpha_i$ is the weight obtained by normalizing the hidden representations, $W$ is the network parameter, and $v$ represents the text vector obtained by weighting the word vectors with the Attention weight matrix.
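The Word2Vec + Attention weighting can be sketched as follows. The sketch assumes a common additive-attention form: a tanh hidden layer, a dot product with a learned context vector, softmax normalization, and a weighted sum of word vectors. The parameters `W`, `b`, and `context` are hypothetical, and in practice they would be learned rather than set by hand.

```python
import math


def attention_pool(word_vectors, W, b, context):
    """Weight word vectors by softmax-normalized attention scores.

    word_vectors: list of n word vectors, each of dimension d
    W:            k x d weight matrix (list of k rows)
    b:            bias vector of length k
    context:      context vector of length k used to score each word
    Returns (weights, text_vector).
    """
    scores = []
    for x in word_vectors:
        # Hidden representation of the word: tanh(W x + b).
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bias)
                  for row, bias in zip(W, b)]
        scores.append(sum(h * c for h, c in zip(hidden, context)))
    # Softmax over the scores, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Text vector: attention-weighted sum of the word vectors.
    dim = len(word_vectors[0])
    text_vector = [sum(a * x[t] for a, x in zip(weights, word_vectors))
                   for t in range(dim)]
    return weights, text_vector


# Hypothetical toy input: two 2-dimensional word vectors and zero parameters,
# which makes every word equally important.
words = [[1.0, 0.0], [0.0, 1.0]]
weights, text_vector = attention_pool(
    words, W=[[0.0, 0.0], [0.0, 0.0]], b=[0.0, 0.0], context=[1.0, 1.0])
```

With trained parameters, words whose hidden representation aligns with the context vector receive larger weights and dominate the resulting text vector.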
Deep learning has had remarkable success in the fields of computer vision and speech recognition in recent years, which has led to its wide adoption in other fields. When using deep learning to solve natural language processing problems, the first task is to solve the problem of text representation; the deep neural network’s ability to extract feature expressions can then be used instead of relying on complicated manual feature engineering. Word2vec is a set of neural network models for generating word embeddings. In these models, a two-layer shallow neural network is trained to reconstruct the linguistic context of words. In practice, Word2vec provides a faster and more stable initial value for the first word embedding layer of a text processing neural network model, especially when the data set is small. The CNN model is optimized in this paper, including network structure optimization and hyperparameter optimization. The model structure and key parameters are shown in Figure 3.

In Figure 3, the 128 in the input layer indicates the batch size, that is, the number of samples in one training iteration; the 400 in the word embedding layer is the dimension of the pretrained word vector model. In the third layer, the convolution layer, the model uses convolution kernels of lengths 3, 4, and 5 at the same time, with 200 kernels of each length. $1 \times 200 \times m$ represents the dimensions of the feature map after convolution with each kernel length, where the size of $m$ depends on the sentence length and the kernel length.
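How the feature-map length depends on sentence length and kernel length can be shown with a small Python sketch. It assumes stride 1 and no padding, which the text does not state; under that assumption a sentence of $n$ words and a kernel of length $h$ give a feature map of length $n - h + 1$. The bias and activation are omitted, and a single all-ones kernel per length stands in for the 200 learned kernels.

```python
def conv1d_valid(sequence, kernel):
    """Slide one kernel of h rows over n word vectors (stride 1, no padding)."""
    n, h = len(sequence), len(kernel)
    feature_map = []
    for start in range(n - h + 1):
        window = sequence[start:start + h]
        # Sum of element-wise products between the kernel and the window.
        feature_map.append(sum(w * x
                               for k_row, x_row in zip(kernel, window)
                               for w, x in zip(k_row, x_row)))
    return feature_map  # length is n - h + 1


def max_over_time(feature_map):
    """Max-pooling over positions keeps the strongest response of one kernel."""
    return max(feature_map)


# Hypothetical toy sentence: 6 words with 2-dimensional embeddings.
sentence = [[float(i), 1.0] for i in range(6)]
lengths = {}
pooled = []
for h in (3, 4, 5):
    kernel = [[1.0, 1.0] for _ in range(h)]  # one all-ones kernel of length h
    fmap = conv1d_valid(sentence, kernel)
    lengths[h] = len(fmap)
    pooled.append(max_over_time(fmap))
```

After pooling, each kernel contributes a single value regardless of sentence length, so with 200 kernels for each of the three lengths the concatenated feature vector would have 600 entries.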
The word embedding layer is a bidirectional recurrent neural network structure that processes the text in the forward and reverse directions. Here $w_i$ represents the current word, $c_l(w_i)$ represents the left context of the current word, $c_r(w_i)$ represents the right context of the current word, $e(w_i)$ represents the word vector of the word $w_i$, $W$ represents the weight parameters, and $f$ is a nonlinear function.
The representation of the current word $w_i$ is then built from its context representation, that is, from $c_l(w_i)$, $e(w_i)$, and $c_r(w_i)$.
One feature of text processing is that features in the text are closely related to positions, such as the positions of important sentence components, while the latent semantic vectors constructed in the previous layer do not highlight the most important mapped features. A max-pooling operation over all positions is therefore applied to these latent semantic vectors.
The fully connected layer then combines the features extracted from the text by the previous layers through a single-layer neural network.
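For reference, one standard formulation that is consistent with this description is the recurrent convolutional structure written out below. The notation is assumed here rather than taken from the text: $c_l(w_i)$ and $c_r(w_i)$ are the left and right contexts of word $w_i$, $e(w_i)$ is its word vector, $f$ is a nonlinear function, and the $W$ and $b$ terms are learned parameters. The choice of $\tanh$ for the latent layer is also an assumption.

```latex
\begin{align}
c_l(w_i) &= f\left(W^{(l)} c_l(w_{i-1}) + W^{(sl)} e(w_{i-1})\right) \\
c_r(w_i) &= f\left(W^{(r)} c_r(w_{i+1}) + W^{(sr)} e(w_{i+1})\right) \\
x_i &= \left[\, c_l(w_i);\; e(w_i);\; c_r(w_i) \,\right] \\
y_i^{(2)} &= \tanh\left(W^{(2)} x_i + b^{(2)}\right) \\
y^{(3)} &= \max_{1 \le i \le n} y_i^{(2)} \quad \text{(element-wise maximum over positions)} \\
y^{(4)} &= W^{(4)} y^{(3)} + b^{(4)}
\end{align}
```

In this form, the first two equations are the forward and reverse recurrences, the third concatenates context and word vector, the fifth is the max-pooling step, and the last is the single-layer combination of the pooled features.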
3.3. Realization of Intellectual Property Risk Early Warning Model
SM-TE has few funds and talents, and the quality and quantity of the intellectual property rights it owns are not high. First, the foundation of SM-TE intellectual property rights is weak. The parties involved include the government, evaluation agencies, law firms, guarantee agencies, intellectual property trading centers, and others. Only when the institutions involved in enterprise intellectual property guarantee financing cooperate and coordinate the distribution of interests and risks among themselves can the development of the intellectual property guarantee financing business be ensured. At present, there are not many public welfare intellectual property service organizations serving the large number of SM-TE, and they are far from meeting the needs of SM-TE in protecting intellectual property rights.
The growth and evolution of independent intellectual property rights of SM-TE is a complex system. The growth of independent intellectual property rights depends not only on the innovation mechanism and intellectual property awareness within the company but also on the corresponding growth environment. Therefore, it cannot just be based on our subjective desire, design, and control. For enterprises that carry out independent innovation, early warning of intellectual property risks is an important task. Through the early warning of intellectual property risks, we can find risks and take early action to prevent further losses.
There are many links in the risk early-warning process of enterprises’ independent intellectual property rights, and each link requires different elements in the early-warning mechanism. The intellectual property risk early warning index system’s design requirements are in line with the enterprise’s intellectual property management goals, and the indicators have no strong correlation. The index data must be able to accurately reflect the enterprise’s intellectual property risk, as well as the company’s intellectual property management status, problems, and trends. Only when the intellectual property risk warning mechanism operates normally can the intellectual property risk warning process be implemented. The SM-TE intellectual property risk warning process is shown in Figure 4.

The risk identification subsystem identifies potential risk factors by analyzing the risk sources in the process of intellectual property development. Based on the enterprise information database, the subsystem uses information retrieval software tools to compare and analyse the data and literature in the database, and finally identifies the factors that lead to the intellectual property risks of enterprises. In daily operation, once the company’s intellectual property information is found to be highly correlated with existing information in the database, the subsystem sends out a risk monitoring and early warning signal in time, and the case enters the early warning subsystem as soon as possible so that the risk level can be judged.
After quantifying the risk indicators, the risk evaluation subsystem measures and evaluates the degree of risk. The routine management work of enterprise intellectual property risk management is the assessment of intellectual property risk. Companies can assess themselves at key nodes based on their intellectual property development. The risk early warning subsystem divides intellectual property risks into no risk, slight risk, medium risk, and serious risk based on the intelligence monitoring information provided by the first two subsystems. The early warning information is fed into the risk response management link when the system sends out an early warning signal. The company decides whether to keep things as they are or take preventative and control measures based on the early warning signal and which preventative and control measures are available.
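The four warning levels can be illustrated with a short Python sketch. The text gives no numeric cut-offs, so the risk score range of 0 to 1 and the thresholds 0.25, 0.5, and 0.75 are purely hypothetical, as are the names `RiskLevel`, `classify`, and `needs_response`.

```python
from enum import Enum


class RiskLevel(Enum):
    NO_RISK = "no risk"
    SLIGHT = "slight risk"
    MEDIUM = "medium risk"
    SERIOUS = "serious risk"


# Hypothetical upper bounds on a risk score in [0, 1]; not taken from the paper.
THRESHOLDS = [
    (0.25, RiskLevel.NO_RISK),
    (0.50, RiskLevel.SLIGHT),
    (0.75, RiskLevel.MEDIUM),
]


def classify(score):
    """Map a quantified risk score to one of the four warning levels."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("risk score must lie in [0, 1]")
    for upper_bound, level in THRESHOLDS:
        if score < upper_bound:
            return level
    return RiskLevel.SERIOUS


def needs_response(level):
    """Any level other than 'no risk' is passed on to risk response management."""
    return level is not RiskLevel.NO_RISK
```

For example, `classify(0.6)` returns `RiskLevel.MEDIUM` under these assumed cut-offs, and `needs_response` would then route the case to the risk response management link.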
4. Experiment and Results
4.1. Experimental Setup
The experiment of this algorithm is carried out on a local computer. The details of the experimental environment are as follows:
Processor: Intel(R) Core(TM) i7-7700 CPU
Memory: 8.00 GB
Operating system: Microsoft Windows 10
DL development framework: Deeplearning4j 1.0.0-alpha
The experimental data used in this section consists of two parts. The first part comes from the retrieval system of the intellectual property (patent information) public service platform. Patent literature was downloaded from the patent retrieval system as the patent corpus, and 18,124 experimental patent documents were finally obtained.
The second part comes from the user registration data collected in this study. A collected score from a user on a patent is taken to mean that the user likes the patent. The user registration data includes the user id, patent id, and score fields, and there are finally 8,096 user registration records from 133 users.
4.2. Experimental Result Analysis
We use six-fold cross-validation [10]: the user registration data of each user is randomly divided into 6 parts, with 5 parts used as the training set and 1 part as the test set. The final accuracy, recall, and F1 value are the averages of the 6 results. Figure 5 shows the influence of paragraph vector dimension on recommendation results.
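The evaluation procedure can be sketched in Python as follows. The sketch assumes that the paper's "accuracy" is the precision of the recommendation list, which is the usual pairing with recall and F1 but is not stated explicitly; the function names and the fixed random seed are chosen here for illustration.

```python
import random


def k_fold_splits(records, k, seed=0):
    """Yield (train, test) pairs; each record lands in exactly one test fold."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]


def precision_recall_f1(recommended, relevant):
    """Precision, recall, and F1 of a recommendation list against held-out items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical example: 4 recommended patents, 3 held-out relevant patents.
p, r, f1 = precision_recall_f1(["p1", "p2", "p3", "p4"], ["p1", "p3", "p5"])
```

Averaging the three metrics over the folds produced by `k_fold_splits(records, 6)` gives one final value per metric, as in the procedure described above.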

It can be seen that as the paragraph vector dimension increases, the accuracy rate, recall rate, and F1 value first increase and then decrease. When the paragraph vector dimension is less than 240, the semantic information of a paragraph is incomplete; when it is greater than 240, noise is introduced, which leads to errors in feature representation. Therefore, the final paragraph vector dimension of the Doc2vec deep semantic model is 240.
Knum is the number of nearest neighbors selected for the target user, and it affects the recommendation effect. Knum is set to 1, 3, 5, 7, 9, and 11 in turn, with the paragraph vector dimension fixed at 240. Different neighborhood sizes give different accuracy, recall rate, and F1 values. The results are shown in Figure 6.

As can be seen from Figure 6, as the number of neighbors increases, the precision, recall rate, and F1 value first increase and then decrease. When Knum < 7, the neighbor sets with similar interests are not fully mined; when Knum = 7, the recommendation effect is the best; when Knum > 7, neighbors with similar interests are fully mined, but some neighbors with low similarity are also included, which leads to errors in recommendation. So the final number of neighbors is chosen as 7.
This algorithm contains an adjustable parameter, the threshold $\theta$, which represents the threshold of similarity between scored patents and unscored patents and affects the recommendation effect. With the paragraph vector dimension fixed at 240 and Knum fixed at 7, different values of $\theta$ give different accuracy, recall, and F1 values. The results are shown in Figure 7.

It can be seen that as $\theta$ increases, the precision, recall rate, and F1 value first increase and then decrease. When $\theta < 0.6$, the scores of patents with low similarity are also used to complete the interaction matrix, so too much of the matrix is completed, resulting in inaccurate neighbor user sets.
When $\theta = 0.6$, the interaction matrix is completed appropriately and the recommendation effect is the best. When $\theta > 0.6$, only patents with high similarity are used to estimate scores when completing the interaction matrix, so the matrix may remain very sparse and the neighbor user sets are again inaccurate. Therefore, the final $\theta$ is selected as 0.6.
Through the above experiments, the appropriate parameters are obtained: paragraph vector dimension 240, Knum = 7, and $\theta = 0.6$. With these settings, the patent recommendation algorithm based on deep semantic similarity performs best. Therefore, the best configuration of the algorithm in this paper is compared with the traditional recommendation algorithm, and the experimental results are shown in Table 1.
As can be seen from the table, the accuracy, recall, and F1 value of the proposed algorithm are superior to those of the traditional recommendation algorithm, reaching 22.41%, 20.86%, and 21.51%, respectively. The algorithm proposed in this paper uses the Doc2vec DL model and a completion strategy to complete the sparse matrix, thus addressing the low accuracy of the user similarity matrix computed from a sparse interaction matrix. Therefore, the algorithm in this paper is superior to the traditional algorithm in the experimental evaluation results.
This paper considers two kinds of initialization for the word embedding layer: random initialization and initialization from a pretrained word vector model. The CNN model is initialized with each of these methods in turn. Figures 8 and 9 show the accuracy and loss curves of the CNN model under the three initialization settings of the word embedding layer, for both the training and validation processes.


Experiments show that the CNN model has clear advantages in dealing with patent texts. Most of the texts in the patent data set are long texts or Chinese texts, so context information, such as word and sentence order, is particularly important. However, the CNN model requires the size (length) of the convolution kernels to be set manually in order to select different ranges of context information, which makes it unstable. To further verify that the patent feature vectors extracted by the CNN model can express differences in patent text content, this paper uses a comparative experiment to cluster and verify CNN models built with two mapping strategies on two types of patent samples. For each model, the number of results conforming to the verification criterion is counted and expressed as a proportion of the sample data set; the results are shown in Table 2.
The experimental results in Table 2 show that the feature vectors extracted by the CNN model achieve high accuracy in the text-similarity comparison experiment, with an error within 5%–10% of the given result, which is much better than other text feature representation methods. The choice of pretrained word vector model must take into account the difference between its corpus and the actual training set. Training a truly general word vector model remains an ideal, but a general model can be adapted to a specific application in the patent field through transfer learning. The accuracy of patent feature extraction with a neural network model can be greatly improved by borrowing the structural strengths of other models to make up for the deficiencies of the model itself.
5. Conclusions
Innovation plays an important role in the development of Chinese enterprises. Following the process system of risk prevention and control, enterprises can use the risk identification subsystem to find the sources of risk in the process of intellectual property development and identify potential risk factors, and then enter the risk early warning subsystem, which issues an early warning according to the risk monitoring signals. The patent similarity matrix is constructed by using the Doc2vec DL model, and the experimental and analytical results show that the patent recommendation algorithm based on deep semantic similarity designed in this paper is superior to the traditional algorithm. The accuracy, recall, and F1 value of the proposed algorithm are 22.41%, 20.86%, and 21.51%, respectively. In future research, different types of independent intellectual property risk early warning index systems can be established for different types of companies through field research.
Data Availability
The data used to support the findings of this study are available from the author upon request.
Conflicts of Interest
The author does not have any possible conflicts of interest.