Abstract

The market is intricate and complicated, and existing risk warning models suffer from low efficiency and poor generalization when predicting market risk. To address these problems, this study takes stock market risk warning as the research object and proposes a market risk warning model based on LSTM-VaR. Fifteen variables in three categories (basic transaction data, statistical technical indicators, and moving interval data) are selected as stock market characteristic indicators, an LSTM (Long Short-Term Memory) prediction model is constructed, and the standard deviation of stock returns is predicted. Based on the predicted results, the probability distribution of the return rate under the conditional distribution is obtained and the VaR (Value at Risk) is calculated. The 1% and 5% sample quantiles are taken as warning lines, yielding the LSTM-VaR warning model. The results show that the RMSE of the model reaches its minimum of 0.013762 when the activation function of the LSTM-VaR model is the Leaky ReLU function, the number of training epochs is 10, the time window length N is 9, the batch size is 8, the number of neurons in each layer is 50, the dropout probability is 0.1, and Adam is used as the optimizer. Compared with traditional prediction models such as MLP, the proposed model performs better and can effectively realize market risk warning.

1. Introduction

The factors influencing the market are complex and changeable, and their patterns are usually difficult to grasp. In recent decades, market crises have occurred frequently. For example, the global economic crisis of 2008 triggered economic turmoil around the world, leaving tens of millions of workers unemployed, increasing the population in poverty by 50 million, and causing irreparable losses [1]. This shows the significance of market risk warning for the global economy. At present, with the progress of artificial intelligence technology, deep learning, with its good nonlinear mapping ability and fitting and generalization ability, is widely used in market dynamics prediction. For instance, prediction methods based on artificial neural networks have achieved good results in handling market nonlinearity and time-series dependence. Lin Wenhao, Chen Xuebin, et al., drawing on data from the Shanghai securities market, proposed a stock market risk prediction method based on GARCH (Generalized Autoregressive Conditional Heteroskedasticity), which effectively realized the prediction of stock market risk [2, 3]; Li Xinxin, Liu Chengcheng, et al. built a risk prediction model with a generalized vector autoregressive model [4, 5]; Guo Jing, Liu Wenchao, et al. evaluated the inherent volatility risk of the stock market through implied tail risk, greatly improving risk warning ability [6, 7]; Zhou Wenhao, Tian Chongwen, et al. constructed a customer default model for banks on the basis of commercial bank data and logistic regression, so as to improve banks' ability to identify customer risks [8, 9]. However, the above methods rely mainly on quantitative risk analysis. With the development of neural networks, they have begun to be applied to risk prediction. For example, Ren Ni et al. applied deep learning algorithms to financial risk prediction, providing a reference for the application of deep learning in risk prediction [10]. Zhang Qun et al. applied the LSTM algorithm, which is well suited to fitting time-series data, to wind forecasting [11]. The focus of the above studies is how to improve the accuracy of risk prediction. Therefore, to address these problems, and based on an extensive review of the relevant literature, this study takes stock market risk warning as the research object and proposes an LSTM-VaR market risk warning model on the basis of LSTM (Long Short-Term Memory) and VaR (Value at Risk). By using an LSTM network to predict the standard deviation of stock returns and VaR to measure value at risk, stock market risk warning can be realized.

2. Basic Methods

2.1. LSTM Model Introduction

The LSTM model is a variant of the recurrent neural network (RNN). By replacing the neuron structure of the RNN with a structure of three "gates" (input gate, output gate, and forget gate), the problems of vanishing gradients in the RNN loss function and of long-term dependence can be alleviated. Figure 1 shows the LSTM network model structure and the flow of data through the memory unit and gate structure. In the figure, $i_t$ represents the input gate, $f_t$ represents the forget gate, $o_t$ represents the output gate, $C_t$ represents the hidden-layer neuron (cell) state, $\tilde{C}_t$ represents the candidate value generated from the input, $[h_{t-1}, x_t]$ represents the concatenation of the hidden-layer output at time $t-1$ and the input at time $t$, $W_f$ and $b_f$ represent the weight matrix and bias of the forget gate, $W_i$ and $b_i$ represent the weight matrix and bias of the input gate, $W_o$ and $b_o$ represent the weight matrix and bias of the output gate, and $W_C$ and $b_C$ represent the weight matrix and bias of the candidate state.

The state update equations of the LSTM network model are shown in (1)–(5):

$$f_t = \sigma\!\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{1}$$
$$i_t = \sigma\!\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{2}$$
$$\tilde{C}_t = \tanh\!\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{3}$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4}$$
$$o_t = \sigma\!\left(W_o \cdot [h_{t-1}, x_t] + b_o\right), \quad h_t = o_t \odot \tanh(C_t) \tag{5}$$
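For illustration, the following minimal NumPy sketch implements one step of the state update in (1)–(5). It is not the paper's code; the array shapes and weight initialization are left to the caller.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    """One LSTM state update following equations (1)-(5)."""
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)           # forget gate, eq. (1)
    i_t = sigmoid(W_i @ z + b_i)           # input gate, eq. (2)
    c_tilde = np.tanh(W_C @ z + b_C)       # candidate state, eq. (3)
    c_t = f_t * c_prev + i_t * c_tilde     # cell state update, eq. (4)
    o_t = sigmoid(W_o @ z + b_o)           # output gate, eq. (5)
    h_t = o_t * np.tanh(c_t)               # hidden state, eq. (5)
    return h_t, c_t
```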

2.2. VaR Model Introduction

The VaR (value-at-risk) model can be expressed by formula (6):

$$P\!\left(\Delta P \le \mathrm{VaR}\right) = \alpha \tag{6}$$

In the formula, $P$ stands for the probability measure, $\Delta P$ stands for the value loss, and $\alpha$ stands for the confidence level; the larger its value, the higher the degree of risk aversion. The meaning of the above formula is that, within a certain period in the future, the loss of the asset portfolio will not exceed the VaR value with probability $\alpha$. VaR is usually calculated from the empirical distribution of the rate of return based on historical data, as shown in formulas (7) and (8) [12]:

$$r_t \mid \Omega_{t-1} \sim N\!\left(\mu, \sigma_t^2\right) \tag{7}$$
$$\mathrm{VaR}_t = \mu + \Phi^{-1}(1-\alpha)\,\sigma_t \tag{8}$$

In the formulas, $r_t$ is the rate of return, $\mu$ is the average rate of return, $\sigma_t$ is the standard deviation of the rate of return, $\Omega_{t-1}$ is the information set at time $t-1$, and $\Phi^{-1}(\cdot)$ is the inverse cumulative distribution function of the standard normal distribution.
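As an illustration of formula (8), the following minimal Python sketch computes VaR from a conditional normal return distribution. The mean, standard deviation, and confidence levels shown are placeholder values, not the paper's results.

```python
from scipy.stats import norm

def parametric_var(mu, sigma, alpha=0.99):
    """VaR as the (1 - alpha) quantile of a N(mu, sigma^2) return distribution,
    following formula (8): VaR = mu + Phi^{-1}(1 - alpha) * sigma."""
    return mu + norm.ppf(1.0 - alpha) * sigma

# Example with placeholder values for the mean and predicted standard deviation
print(parametric_var(mu=0.0005, sigma=0.012, alpha=0.99))  # approx. -0.0274
print(parametric_var(mu=0.0005, sigma=0.012, alpha=0.95))  # approx. -0.0192
```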

According to the above introductions of the LSTM and VaR models, LSTM has strong nonlinear mapping capability [13], while the VaR model is a risk management method for measuring market risk that is easy to apply and fast to compute [14]. Therefore, combining the LSTM model and the VaR model, this study proposes a market risk warning model based on LSTM-VaR.

3. Market risk warning model based on LSTM-VaR

3.1. Characteristic variables selection

The selection of characteristic variables is a prerequisite for realizing market risk warning. This paper takes stocks as the research object and constructs a market risk warning model based on an optimized LSTM model. According to the literature and the nonlinear characteristics of stock data [15], representative indicators from basic transaction data, statistical technical indicators, and moving interval data are selected as characteristic index variables, as shown in Table 1.

3.2. LSTM-VaR model construction
3.2.1. Activation function

In order to strengthen the learning ability of the network, an activation function is introduced. The Sigmoid function is the standard activation function of the logistic regression model; it is easy to differentiate but prone to the vanishing gradient problem. Its mathematical expression is given in formula (8). The Tanh function is a saturating activation function and still suffers from vanishing gradients; its mathematical expression is given in formula (9) and its simplified form in formula (10). The ReLU function has certain advantages in alleviating the vanishing gradient problem, and its mathematical expression is shown in formula (11). The Leaky ReLU function is an improved variant of the ReLU function [16] that handles the vanishing gradient problem better. Therefore, this study selects the Leaky ReLU function as the activation function of the LSTM-VaR model.
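For reference, the four activation functions discussed above can be written as a minimal NumPy sketch. The Leaky ReLU negative-side slope (0.01 here) is an assumed value; the paper does not report it.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    # slope of the negative branch is an assumption, not taken from the paper
    return np.where(x > 0, x, slope * x)
```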

3.2.2. Loss function selection

Dropout is a method for alleviating model overfitting: by preventing all neurons from participating in every update, it reduces neuron computation, improves computational efficiency, and reduces the network scale. This research reduces the network scale by adding a dropout layer to the LSTM-VaR network (see Section 3.2.4) and selects a suitable loss function to measure the prediction error. Commonly used loss functions include the 0-1 loss, absolute loss, logarithmic loss, and root mean square error (RMSE), whose mathematical expressions are given in formulas (12)–(14). Since the root mean square error handles the fitting problem better, this study chooses it as the loss function of the LSTM-VaR network model [17].

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{15}$$

In formula (15), $n$ represents the number of samples, $y_i$ represents the true value, and $\hat{y}_i$ represents the predicted value.
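A minimal sketch of how RMSE could be used as a custom loss in Keras; the function name and its use here are illustrative assumptions, not the paper's code.

```python
import tensorflow as tf

def rmse_loss(y_true, y_pred):
    """Root mean square error, formula (15), as a Keras-compatible loss."""
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))

# Example usage: model.compile(optimizer="adam", loss=rmse_loss)
```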

3.2.3. Sliding windows

The input of the market risk warning model based on LSTM-VaR is time-series data, and sliding windows are commonly used in time-series prediction [18]. A sliding window divides a period of time into multiple windows of equal length and slides over the series window by window, treating the window data as the basic unit. By using a sliding window, more relevant and time-sensitive data information can be extracted. Figure 2 is a schematic diagram of a sliding window: as the window slides, the previous window becomes invalid and a new window is generated.
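A minimal sketch of how sliding-window samples could be built from a feature matrix. The window length of 9 follows the paper's initial setting; the array names are illustrative.

```python
import numpy as np

def make_windows(features, targets, n=9):
    """Slice a feature matrix into overlapping windows of length n.

    features: array of shape (T, num_features)
    targets:  array of shape (T,), e.g. the standard deviation of returns
    Returns X of shape (T - n, n, num_features) and y of shape (T - n,).
    """
    X, y = [], []
    for start in range(len(features) - n):
        X.append(features[start:start + n])  # one window of n consecutive days
        y.append(targets[start + n])         # value to predict after the window
    return np.array(X), np.array(y)
```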

3.2.4. Dropout layer

The LSTM-VaR model contains a large number of parameters and is prone to over-fitting during training, which produces a model that fits the training samples but generalizes poorly to the test set [18]. Therefore, this paper adds a dropout layer to the model to address this problem. In each training iteration, dropout randomly selects neurons and temporarily hides them, then repeats this selection and optimization [19] until training ends.
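A minimal Keras sketch of an LSTM stack with dropout layers. The layer sizes and dropout probability follow the paper's initial settings, but the overall layer arrangement is an assumption, since the paper does not list the architecture layer by layer.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense, LeakyReLU

def build_lstm(n_steps=9, n_features=15, units=50, dropout_rate=0.1):
    model = Sequential([
        LSTM(units, return_sequences=True, input_shape=(n_steps, n_features)),
        Dropout(dropout_rate),   # randomly hides neurons during training
        LSTM(units),
        Dropout(dropout_rate),
        Dense(units),
        LeakyReLU(),             # Leaky ReLU activation, as selected in Section 3.2.1
        Dense(1),                # predicted standard deviation of returns
    ])
    # MSE is used here for brevity; the paper selects RMSE as the loss
    model.compile(optimizer="adam", loss="mse")
    return model
```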

3.3. Market risk warning process based on LSTM-VaR

Based on the above analysis, the risk warning process for the stock market can be summarized as follows. First, missing-value handling and standardization are performed on the collected stock market data, and risk measures are computed on the processed data. Second, the processed data are fed into the LSTM-VaR model, where the LSTM deep learning model predicts the stock market risk and outputs the prediction results. Finally, VaR is computed from the prediction results to realize risk warning. The process is illustrated in Figure 3.

4. Simulation experiment

4.1. Experimental environment construction and data sources

This experiment was carried out in a Python, TensorFlow, and Keras environment. Taking stock market risk warning as the research object and the day as the time unit, the experiment uses data on the Shanghai and Shenzhen 300 (CSI 300) Index from Oriental Fortune, covering October 5, 2009 to August 5, 2020. The trend of the CSI 300 yield is shown in Figure 4 [20, 21].

Considering that the collected data contain missing values, and to avoid their impact on the risk prediction results, records with missing data were deleted during preprocessing. In addition, because different indicators differ greatly in magnitude and dimension, the data were standardized using formula (16) [22]. In the end, a total of 2,917 sets of historical data were obtained for stock market risk prediction.
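A minimal preprocessing sketch, assuming formula (16) is a z-score standardization; since the formula itself is not reproduced here, that assumption and the DataFrame layout are illustrative.

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop records with missing values, then standardize each indicator column."""
    df = df.dropna()                    # delete records with missing data
    return (df - df.mean()) / df.std()  # z-score standardization (assumed form of formula (16))
```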

The pre-processed data were divided into a training set, a test set, and a validation set in a ratio of 3 : 1 : 1; the specific distribution is shown in Figure 5.

4.2. Evaluation Indicators

In this experiment, root mean square error (RMSE) was used as an indicator to evaluate model performance, as shown in formula (17). The smaller the value is, the better the prediction effect is [23].

According to the definition of VaR, the stock rate of return is a significant indicator affecting value-at-risk prediction. Therefore, it is selected as the warning indicator of stock market risk in this experiment. The stock return rate is usually expressed as a relative return rate, as shown in formula (18) [24]:

$$R_{i,t} = \frac{P_{i,t} - P_{i,t-1}}{P_{i,t-1}} \tag{18}$$

In the formula, $R_{i,t}$ represents the price return rate of index $i$ on day $t$, $P_{i,t}$ represents the daily closing price of the index on day $t$, and $P_{i,t-1}$ represents the daily closing price of the index on day $t-1$.
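A minimal sketch computing the relative return of formula (18) from a closing-price series; the column name is illustrative.

```python
import pandas as pd

def relative_returns(close: pd.Series) -> pd.Series:
    """R_t = (P_t - P_{t-1}) / P_{t-1}, formula (18)."""
    return close.pct_change().dropna()

# Example usage with a hypothetical closing-price column:
# returns = relative_returns(df["close"])
```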

4.3. Parameter Settings

In this experiment, the initial parameters of the LSTM model were set as follows: the number of training epochs was 10, the time window length N was 9, the batch size was 8, the number of neurons in each layer was 50, and the dropout probability was 0.1; the Tanh function was selected as the activation function and Adam as the optimizer [25]. To obtain the optimal LSTM model, the optimal parameters were determined by experimentally observing the RMSE of the model. First, with the other parameters unchanged, different numbers of epochs were used to train the LSTM model, and the RMSE corresponding to each value is shown in Table 2. According to the table, RMSE is smallest when epochs = 10; as the number of epochs increases further, RMSE gradually increases, indicating that the model overfits when epochs > 10. Therefore, epochs was set to 10 in this experiment.
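A hedged sketch of the controlled-variable procedure behind the epochs experiment. The data below are random placeholders and the candidate epoch values are illustrative; only the hyperparameters named in the text (window length, feature count, neurons, dropout, batch size) are taken from the paper.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# Placeholder data: X_* are windowed feature tensors, y_* the target standard deviations
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(1750, 9, 15)), rng.normal(size=(1750, 1))
X_val, y_val = rng.normal(size=(580, 9, 15)), rng.normal(size=(580, 1))

def build_model():
    # Single-layer illustration; the paper reports 50 neurons per layer and dropout 0.1
    model = Sequential([
        LSTM(50, input_shape=(9, 15)),
        Dropout(0.1),
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

for epochs in (5, 10, 20, 50):  # candidate epoch counts (illustrative)
    model = build_model()
    model.fit(X_train, y_train, epochs=epochs, batch_size=8, verbose=0)
    pred = model.predict(X_val, verbose=0).ravel()
    rmse = float(np.sqrt(np.mean((y_val.ravel() - pred) ** 2)))
    print(f"epochs={epochs}: validation RMSE={rmse:.6f}")
```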

Second, with the other parameters unchanged, different batch sizes were used to train the LSTM model, and the RMSE corresponding to each batch size is shown in Table 3. To display the relationship between batch size and RMSE more intuitively, the table is also plotted as a line chart in Figure 6. As can be seen from the figure, the RMSE of the model fluctuates as the batch size increases and reaches its minimum when the batch size is 8. Therefore, the batch size was set to 8 in this experiment.

In the same way, the controlled-variable method was used to experiment with the time window length, the number of neurons in each layer, and the dropout probability of the LSTM model. With the other parameters unchanged, the RMSE of the model under different time window lengths (N), numbers of neurons per layer, and dropout probability values is shown in Figures 7–9.

Figure 7 shows that the RMSE value of the model fluctuates with the increase of N. When N is 16, the RMSE of the model is the minimum. Therefore, the time window length was set as 16 in this experiment.

It can be seen from Figure 8 that the RMSE value of the model is the smallest when the number of neurons in each layer is 100.

As shown in Figure 9, the RMSE value of the model is minimal when the dropout probability value is 0.5. Therefore, the dropout probability of the model was set to 0.5 in this experiment.

Finally, to test the impact of different activation functions on the model, and on the basis of the above optimal parameters, the Leaky ReLU, Tanh, and ReLU functions were each used as the activation function of the LSTM model, and the RMSE values under each activation function are shown in Table 4. According to the table, the RMSE of the model is smallest when the activation function is Leaky ReLU. Therefore, the Leaky ReLU function was selected as the activation function of the LSTM model in this experiment.

Through the above operations, the parameter settings of the experimental model are shown in Table 5.

4.4. Experimental Results
4.4.1. LSTM model verification

In order to verify the performance of the parameter-optimized LSTM model, this study compared its prediction results with those of the LSTM model before parameter optimization and with those of common prediction methods such as MLP (multi-layer perceptron), as shown in Table 6. As can be seen from the table, the optimized LSTM model has the lowest RMSE among the compared methods. Therefore, the optimized LSTM model proposed in this study can predict the stock return rate effectively and has certain advantages.

A histogram of the residual sequence of the LSTM model is shown in Figure 10; as the figure shows, the residual sequence approximately follows a normal distribution. Descriptive statistics are given in Table 7: the residual mean, standard deviation, and skewness are approximately 0, and the kurtosis is close to that of the standard normal distribution.

Then the standard deviation of the return rate was predicted on a rolling basis, and the LSTM model parameters were obtained, as shown in Table 8. Using these parameters to predict the standard deviation, the fitted line chart is shown in Figure 11. It can be seen from the figure that the model fits the standard deviation trend well.

4.4.2. VaR measurement results

Using VaR to evaluate the test set, part of the results is shown in Figure 12. As can be seen from the figure, the VaR of the stock market fluctuates sharply.

The sample quantiles of the stock return rate at 1% and 5% are taken as the warning lines of stock market risk; when the VaR falls below a warning line, the model issues a warning. Figure 13 shows the warning diagram: VaR was at a trough in April 2020. An enlarged view of this area is given in Figure 14, which shows that VaR was below the warning line of the 1% sample quantile from March 13 to April 1, 2020.
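A minimal sketch of the warning rule; the column and variable names are illustrative, while the quantile levels follow the paper.

```python
import pandas as pd

def warning_flags(returns: pd.Series, var_series: pd.Series) -> pd.DataFrame:
    """Flag days on which VaR falls below the 1% and 5% sample-quantile warning lines."""
    line_1pct = returns.quantile(0.01)        # 1% sample quantile of the return rate
    line_5pct = returns.quantile(0.05)        # 5% sample quantile of the return rate
    return pd.DataFrame({
        "VaR": var_series,
        "warn_5pct": var_series < line_5pct,  # mild warning
        "warn_1pct": var_series < line_1pct,  # severe warning
    })
```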

5. Conclusion

To sum up, the proposed model predicts the standard deviation of the stock return rate with an LSTM model. With the Leaky ReLU function as the activation function, 10 training epochs, a time window length N of 9, a batch size of 8, 50 neurons in each layer, a dropout probability of 0.1, and Adam as the optimizer, prediction of the standard deviation of the stock return rate can be effectively realized; in this case the RMSE of the model reaches its minimum of 0.013762. Stock market risk warning is then realized by measuring VaR and taking the 1% and 5% sample quantiles as warning lines. Compared with traditional prediction models such as MLP, the LSTM-VaR model proposed in this study performs better and can realize market risk warning well. However, there are still some shortcomings in the research. The selected characteristic variables are all structured data; adding unstructured data would enrich the data features and better capture market sentiment. In addition, due to equipment limitations and the long training time of the model, the model was trained only once, so there may be some error in the results; it is suggested to train the model several times if conditions permit, or to reduce the training time by optimizing the model. Future research should address these shortcomings to broaden the scope of the study.

Data Availability

The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declared that they have no conflicts of interest regarding this work.

Acknowledgments

This work is supported by the social science development research project of Hebei Province (20210201118) “Research on the industrialization of housing decoration and the selection of supply logistics mode in Hebei Province”.