Abstract
This paper presents a method to solve a multi-period portfolio selection problem on the stock market. In this problem, an investor trades stocks with a finite budget and holds a given integer number of stocks in the portfolio. Trades must be performed through a stockbroker that charges a transaction cost and requires a minimum trade amount. A mathematical model is proposed to deal with the constrained problem. The objective is to find the best risk-return ratio; thus, the Sharpe Ratio and the Treynor Ratio are used as objective functions. The returns are the same for both ratios, but the risk measures differ: Sharpe considers covariance, whereas Treynor considers systematic risk. The returns are predicted using a neural network with Long Short-Term Memory (LSTM). This neural network is compared with simple forecasting methods through the Mean Absolute Percentage Error (MAPE). Computational experiments show the quality of the predictions made by the LSTM. The heteroskedastic risk is estimated by Generalized Autoregressive Conditional Heteroskedasticity (GARCH), adjusting the variance for every period; this risk measure is used in the Sharpe Ratio. The experiment considers a weekly portfolio selection with 5 and 10 stocks over 122 weekly periods for each ratio in the Chilean market. The best portfolio is the Sharpe Ratio with ten stocks, achieving a 62.28% real return and beating the market, represented by the Selective Stock Price Index (IPSA). Even the worst portfolio, the Treynor Ratio with ten stocks, exceeds the IPSA cumulative yield.
1. Introduction
Stocks are a capitalization investment instrument whose profitability depends on the company’s results in its business environment, which is reflected in the prices at which the stock is bought or sold [1, 2]. Financial products are negotiated and exchanged internationally; therefore, any investor has access to products from all over the world [3]. Individuals who want to buy stocks must decide which ones and how many to acquire. This problem was introduced by [5]; without considering risk, as in [4], it is a combinatorial optimization problem known as “The Knapsack Problem,” which is classified as NP-Hard. Markowitz [5] introduced modern portfolio theory, which expresses the expected return of a financial asset as the mean of its historical returns and the risk as the variance of those returns, and which introduces the concept of diversification.
The sale and purchase of stocks must seek to increase the investor’s wealth, where the recommended proportions to invest in the current period compared to those of the previous period reflect the decision to sell, buy, or stay in the current position. An optimization model must be aligned with an econometric prediction method or artificial intelligence scheme of expected returns to obtain the best relationship between risk and return.
The context of the problem is an investor with a finite budget to invest in the stock market who wants to maximize the expected return while minimizing the risk. The stocks must be traded through a stockbroker that charges transaction costs. Every stock is treated as an integer asset. The stock portfolio must be rebalanced weekly. Once the stock portfolio is built, the investor maintains the acquired position until it is time to rebalance the proportions. Therefore, the next week, the investor must decide which stocks to sell, buy, or hold. The assets must be selected according to the investor’s risk aversion and return expectations to find the optimal proportion. This optimal combination must reach the best risk-return ratio [6].
This paper presents a method to solve the multi-period portfolio selection problem on the stock market. A mathematical model is proposed to deal with the constrained problem. The objective is to find the best risk-return ratio; thus, the Sharpe Ratio and the Treynor Ratio are used as objective functions. The returns are the same for both ratios, but the risk measures differ: Sharpe considers covariance, whereas Treynor considers systematic risk. The returns are predicted using a Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM); this neural network is compared with simple forecasting methods through the Mean Absolute Percentage Error (MAPE). A pilot test demonstrated the superior prediction quality of the LSTM.
The heteroskedastic risk is estimated by Generalized Autoregressive Conditional Heteroskedasticity (GARCH), adjusting the variance for every period; this risk measure is used in the Sharpe Ratio. The proposed approach is novel and could be applied to a rich portfolio problem. The main contribution of this paper is a model that allows an investor to allocate money to the stock market, considering constraints such as transaction cost, integrality, minimum trade amount, and budget, among others. The mathematical model analyzes the return and risk to find the best combination. The proposed approach considers multiple periods without reducing the problem to a single period, as is usual in previously published works. Note that we solved a real problem using mathematical models and found the optimal solution for the considered subproblems (initial and rebalancing portfolios). Approximate methods are therefore not necessary, given the quality of the obtained results. Besides, we have considered transaction costs, real local market conditions, and other real characteristics in solving a “Rich Multi-Period Portfolio Selection Problem.” Finally, applying the proposed approach to an emerging market such as the Chilean one allows investment decisions to be supported by quantitative methods.
The paper is organized as follows. Section 2 describes the portfolio problem’s literature review with the considered problem’s main characteristics. Section 3 details the proposed approach. Section 4 shows the computation results of the proposed methodology on the Chilean Market. Finally, Section 5 shows the concluding remarks and the future work section.
2. Literature Review
The stock portfolio problem is based mainly on a collection of financial assets, which is updated by selling current positions and buying new stocks. This is done either by increasing the portfolio’s total available budget or by selling securities to decrease the portfolio size [7].
Volatility is an inherent characteristic of financial time series. Generally, its behavior is not constant, so traditional time series models that assume homoscedastic variance are unsuitable for modeling financial time series [8]. According to [9, 10], most authors have focused on the computational aspects of the stock portfolio problem and have ignored the characteristics of the financial problem, such as time series modeling. In this case, multiple financial assets, transaction costs for each rebalancing period, and the prediction of returns are considered. However, this problem is very difficult to solve, and only some papers consider it [11].
Adebiyi et al. [12] applied the modern portfolio model [5], maximizing the Sharpe Ratio and considering stock lots, investor budget, transaction costs, and minimum and maximum amounts acquired per stock. This work considered a single period. The authors proposed a Particle Swarm Optimization to solve large instances within short computing times. Markowitz [5] stated that the expected return is calculated as the average of the historical returns and that their variance explains the risk. In one approach, an investor wishes to obtain a target return by investing in financial assets while minimizing the risk for that target value. The other approach is to set a risk parameter, which is the risk the investor is willing to accept; the objective function then seeks to maximize the return given that risk aversion. The model proposed by [5] has received various criticisms for its assumptions [13]. One is that the risk is constant over time for a financial time series (homoscedasticity). This assumption is unrealistic because the variance of the returns changes systematically over time, which is called heteroskedasticity [4, 14]. The mean-variance model proposed by [5] is described in the Appendix section. This model includes diversification by choosing the least correlated stocks, minimizes the risk of the portfolio by choosing stocks with a lower variance, and maximizes profitability by looking for stocks with higher average historical returns [15].
However, several authors proposed risk metrics different from the one used in [5], such as Value at Risk (VaR) [16], Conditional Value at Risk (CVaR) [17], semi-variance [18], Generalized Autoregressive Conditional Heteroskedasticity (GARCH) and Dynamic Conditional Correlation Generalized Autoregressive Conditional Heteroskedasticity (DCC-GARCH) [19], and fuzzy-logic adjusted risk [20]. Besides, several authors proposed metrics for measuring returns, such as those based on expert opinions [18], Autoregressive Integrated Moving Average (ARIMA) [12], RNN-LSTM [17], and Fuzzy Logic Adjusted Return [20].
There are two metrics to find the optimal combination of risk and return. The first metric is the Sharpe Ratio (SR), a measure to analyze an investment’s excess return, considering the involved risk. SR is calculated by subtracting the return of a risk-free asset from the expected return and dividing this result by the risk [21]. This ratio incorporates diversification as part of the risk. Another stock market ratio is called the Treynor Ratio (TR). Peiro [22] explained that TR measures the excess return per unit of systematic risk. The calculation of these metrics is detailed in Appendix Section.
In the literature, most current works focus on single-period optimization (SPO) for index tracking portfolio design [23]. However, in the financial markets, these methods may lead to frequent portfolio rebalances, resulting in high transaction costs. Huang et al. [23] proposed a novel multi-period optimization (MPO) approach to index tracking portfolio design, which can account for transaction costs and holding costs. Moghadam et al. [24] proposed a multi-period portfolio selection model considering investors’ dependence, risk aversion, and diminishing sensitivity. A robust optimization approach was considered, and three metaheuristic algorithms were developed for solving large-size problems. Yang et al. [25] addressed the multi-period portfolio problem with short selling under a fuzzy environment. Three types of short-selling constraints, i.e., total short-selling proportion constraint, short-selling cardinality constraint, and lower and upper bound constraint, were considered. Li et al. [26] proposed a predictive control model for a multi-period portfolio optimization problem. In addition to the mean-variance objective, the authors constructed a portfolio whose allocation is given by model predictive control with a risk-parity objective. Finally, Jiang and Wang [27] considered a multi-period multiobjective portfolio selection problem with uncertainty. A weighted-sum approach was introduced to obtain the Pareto front of the problem.
A multi-period portfolio selection problem where the future security return rates are given by experts’ estimations instead of historical data was proposed by [28]. A new mental account concept was introduced to reflect the conflicting risk attitudes for different goals. Also, realistic constraints such as background risk, liquidity risk, transaction cost, and cardinality constraint were considered. García et al. [29] extend the stochastic mean-semivariance model to a fuzzy multiobjective model to measure the performance of a portfolio. Uncertainty of future return and liquidity of each asset is modeled using LR-type fuzzy numbers belonging to the power reference function family. The main novelty of the work is the consideration of realistic constraints by investors.
Nokhandan et al. [30] proposed a Nash bargaining model to solve a novel multi-period competitive portfolio optimization problem for large investors in the stock market. The Competitive Portfolio Model (CPM) was developed following the Cournot competition principle for a static, non-cooperative, and non-zero-sum game with complete information. Real-world conditions such as transaction costs, risk-free assets, and cash were also included to match real-world problems. Three criteria control the model’s investment risk: the average value at risk, the mean absolute semi-deviation, and entropy. Dymova et al. [31] proposed an approach to the bi-criteria multi-period fuzzy portfolio selection based on observing the variance as a measure of portfolio risk. Simple criteria of portfolio risk and return are proposed. A fuzzy portfolio selection one-period model was developed to solve the considered problem. Besides, a new two-stage bi-criteria optimization approach to portfolio selection was developed, tested, and used as the main component of the proposed multi-period portfolio selection model.
Generally, transaction costs are neglected in the decision-making process of the online stock portfolio problem. However, some authors included this concept in the decision, such as [32, 33]. In [32], the authors proposed an adaptive online portfolio selection problem with transaction costs. An adaptive online moving average method (AOLMA) was used to predict future returns by incorporating an adaptive decaying factor into the moving average method. Moon and Yoon [33] considered the online portfolio selection (OLPS) problem with transaction costs and proposed a hybrid genetic reversion strategy that evolves a population of portfolio vectors.
However, building efficient multi-period portfolios is a challenging problem, which includes defining the risk and return metrics and evaluating each share’s position for the period. Early on, Merton [34] proposed a policy in which an investor must continually seek to balance the invested proportions for each asset. However, continuously rebalancing the portfolio implies high transaction costs. Dynamic programming could solve a multi-period problem by choosing the best consecutive decisions; these decisions are affected by the number of stocks, market information, liquidity of assets, and short sales [35].
Brandt and Santa-Clara [36] suggested that a linear function could estimate the investment proportion. Bodnar et al. [37] solved the multi-period problem by assuming a unique repeating period. He et al. [38] proposed a method where a high-order model monthly estimates the risk and return. This approach has been tested in the US and Chinese markets. Skaf and Boyd [39] addressed the multi-period problem with stochastic considerations to increase wealth using dynamic programming and reality constraints. The authors found suboptimal policies to get feasible solutions. Indeed, some policies that help stock trading to handle transaction costs are the “No-Trade Region Policy” and the “Rolling Optimize-and-Hold Policy” proposed by [11]. Babazadeh and Esfahanipour [16] proposed a multi-period optimization considering the Value at Risk (VaR) as a measure of risk, generating an Average VaR model, which includes transaction costs, budget, and maximum and minimum purchases.
According to [40], Recurrent Neural Networks (RNN) were considered a methodology for processing sequential data, such as time series [41]. A neural network with long and short-term memory is one of the most successful RNN architectures [42]. LSTMs include memory cells, a computing unit that replaces traditional artificial neurons in the hidden layer of a network. With these memory cells, the networks can efficiently associate remote memory and inputs over time; thus, they are suitable for capturing the structure of data dynamically over time with high predictability [41].
Peng et al. [43] analyzed the factor zoo from a machine learning perspective, which has theoretical and empirical implications for finance. The authors discussed feature selection in the context of deep neural network models to predict the stock price direction. This work considered a set of 124 technical analysis indicators used as explanatory variables in the recent literature and on specialized trading websites. Various classification metrics, accounting for profitability and transaction cost levels, were used to analyze economic gains.
Muncharaz [44] showed the application of neural networks in creating predictive models. The work considered an RNN with LSTM instead of classic time series models such as the Exponential Smooth Time Series (ETS) and the Autoregressive Integrated Moving Average (ARIMA) model. These models were estimated for 284 stocks from the S&P 500 stock market index, comparing the MAE obtained from their predictions. Rather [45] proposed a new method of predicting stock prices from time series in the context of the investment portfolio problem. A new regression scheme was implemented on a deep neural network based on long short-term memory.
A novel portfolio construction approach using a hybrid model based on machine learning for stock prediction and the well-known mean-variance (MV) model for portfolio selection was proposed by [46]. Two stages were involved in the proposed model: stock prediction and portfolio selection. In the first stage, a hybrid model combining eXtreme Gradient Boosting (XGBoost) with an improved firefly algorithm (IFA) was proposed to predict stock prices for the next period. Zhao et al. [47] considered a multi-period investment portfolio selection problem. First, the portfolio selection model fits the extreme cases of 0% or 100% confidence views. The authors established a new programming problem based on the optimization approach and identified explicit solutions. Second, the authors extended the model to a multi-period form and discretized the results with a scenario tree, which solves the multi-period problems.
The literature review shows that multi-period optimization is widely studied and that the prediction of the series is part of the success of this technique. Many authors mention that the multi-period optimization problem is difficult to solve. Therefore, simplifications that move the problem away from the real conditions of the stock market are often needed. The complexity comes from the consecutive decisions that must be made, where the current decision affects the portfolio over the time horizon. Many authors treated the multi-period problem as a single-period problem repeated several times, which helps reduce the computational complexity.
We propose a methodology for the stock portfolio problem that considers multiple periods without simplification (that is, without treating the problem as a single period repeated several times) and that seeks a better estimate of the returns than the average of the historical values, together with the risk measures considered. We use the Sharpe and Treynor ratios as objective functions to find the best risk-return ratio. In this way, the proposed methodology can be applied adequately in emerging markets.
3. Materials and Methods
The main scientific concepts are explained in this section to approach the portfolio selection problem with sophisticated methods, allowing an investor to allocate money in the best risk-return combination of assets.
3.1. Initial Optimization Model
This approach uses an econometric method to estimate the volatility of a financial time series. Three different prediction methods (Moving Average-MA, Exponential Smoothing-ES, and Long Short-Term Memory-LSTM) have been compared to estimate each asset’s expected logarithmic returns and find the best one. The portfolio selection problem is solved by a mathematical formulation that uses the above information to find the best risk-return ratio of assets, testing two different financial ratios and comparing them with the stock market index. Finally, a mathematical model for rebalancing the portfolio is applied. We have considered the following main aspects:
(1) Short selling is not allowed
(2) The budget is financed once, at the beginning
(3) Money is not withdrawn
(4) The close price is known
The proposed methodology considers several steps:
3.1.1. Managing Returns Heteroskedasticity
The first step is to measure the volatility inherent to the financial time series, whose behavior is not constant. Consequently, homoscedastic variance methods, such as Markowitz’s, are unsuitable for modeling financial time series in the proposed approach. To address this issue, we have considered GARCH models, which are widely used in finance. GARCH stands for Generalized Autoregressive Conditional Heteroskedasticity, and its purpose is to capture the changing values of risk. In this case, the risk is expressed as a variance:

$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \tag{1}$$

where $\varepsilon_{t-i}^2$ are the squares of the perturbations of time period $t-i$, and $\sigma_{t-j}^2$ is the historic variance corresponding to period $t-j$. The parameters $\omega$, $\alpha_i$, and $\beta_j$ could be estimated by the maximum likelihood method. GARCH is available in the Julia language [48] through the package ARCHModels.jl [49].
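As an illustration of how a GARCH model adjusts the variance period by period, the following minimal Python sketch applies the standard GARCH recursion, in which each variance is a constant plus weighted lagged squared perturbations plus weighted lagged variances. It is not the ARCHModels.jl implementation used in this work: the function name `garch_variance`, the coefficient values, and the use of `init_var` for lags before the sample are assumptions made only for this example, and the coefficients are taken as given rather than estimated by maximum likelihood.

```python
def garch_variance(eps, omega, alpha, beta, init_var):
    """Conditional variances of a GARCH model with given coefficients.

    eps      : list of perturbations (residuals), one per period
    omega    : constant term
    alpha    : coefficients applied to lagged squared perturbations
    beta     : coefficients applied to lagged variances
    init_var : value used for lags that fall before the sample (assumption)

    Returns len(eps) + 1 values; the last one is the one-step-ahead forecast.
    """
    sigma2 = []
    for t in range(len(eps) + 1):
        value = omega
        for i, a in enumerate(alpha, start=1):
            lagged = eps[t - i] ** 2 if t - i >= 0 else init_var
            value += a * lagged
        for j, b in enumerate(beta, start=1):
            lagged = sigma2[t - j] if t - j >= 0 else init_var
            value += b * lagged
        sigma2.append(value)
    return sigma2


# Hypothetical weekly perturbations and coefficients, for illustration only.
variances = garch_variance([0.01, -0.03, 0.02], 1e-5, [0.1, 0.05], [0.5, 0.3], 4e-4)
next_week_variance = variances[-1]
```

With two coefficients in each list, the recursion looks two time steps back for both the perturbations and the variances.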
3.1.2. Return Prediction by LSTM
The second step is analyzing the predictive data of logarithmic returns by the RNN. The input data must be considered from a previous exploration time step of the neuron as part of the incoming information. LSTM networks pass more information across the recurrent connection than the traditional RNN. The components of an LSTM unit are input, forget and output gate, block input memory cell, output, activation function, and peephole connections. The input gate protects the unit from irrelevant input events. The forget gate helps the unit forget previous memory contents. The output gate exposes the memory cell’s contents at the LSTM unit’s output. The output of the LSTM block is recurrently connected back to the block input and the gates of the LSTM block. The input, forget, and output gates in an LSTM unit have sigmoid activation functions for the [0, 1] constraint. The LSTM block input and output activation function (usually) is a Tanh Activation Function [50].
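To make the role of the gates concrete, the sketch below computes one forward step of a single LSTM unit with scalar input and state. It is a simplified illustration, not the network used in the experiments: peephole connections are omitted, and the function name `lstm_step`, the weight layout, and the numeric values are assumptions for this example.

```python
import math


def sigmoid(x):
    """Logistic function; keeps gate activations inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One forward step of a single LSTM unit (no peephole connections).

    x      : current input
    h_prev : previous output of the unit (fed back recurrently)
    c_prev : previous memory cell content
    w      : dict mapping "input", "forget", "output", "block" to a tuple
             (input weight, recurrent weight, bias)
    """
    def pre_activation(name):
        w_x, w_h, bias = w[name]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))    # how much new input is admitted
    f = sigmoid(pre_activation("forget"))   # how much old memory is kept
    o = sigmoid(pre_activation("output"))   # how much memory is exposed
    g = math.tanh(pre_activation("block"))  # candidate block input
    c = f * c_prev + i * g                  # updated memory cell
    h = o * math.tanh(c)                    # unit output
    return h, c


# Hypothetical weights, for illustration only.
weights = {name: (0.5, 0.1, 0.0) for name in ("input", "forget", "output", "block")}
h, c = lstm_step(0.02, 0.0, 0.0, weights)
```

Because the output `h` is passed back as `h_prev` at the next step, the unit carries information across periods of the return series.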
Forecasting with simple methods is expected to be less effective for portfolio selection than LSTM. The moving average is calibrated by choosing the number of periods to look back: the closing price data is split so that the first 70% is used to find the value that minimizes MAPE, and the last 30% is used for the comparison with LSTM. Exponential Smoothing follows the same procedure as MA, but the smoothing factor α is calibrated instead of the number of periods.
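The calibration of the simple methods can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the one-step-ahead forecasting convention, and the initialization of the smoothing level with the first price are choices made for this example and are not taken from the original implementation.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def moving_average_forecasts(prices, n):
    """Forecast prices[t] as the mean of the previous n prices, for t >= n."""
    return [sum(prices[t - n:t]) / n for t in range(n, len(prices))]


def exp_smoothing_forecasts(prices, alpha):
    """Forecast prices[t] for t >= 1; the level starts at prices[0]."""
    level = prices[0]
    forecasts = []
    for price in prices[1:]:
        forecasts.append(level)
        level = alpha * price + (1 - alpha) * level
    return forecasts


def calibrate_window(prices, candidates, train_share=0.7):
    """Pick the look-back length with the lowest MAPE on the first 70%."""
    train = prices[:int(len(prices) * train_share)]
    return min(
        candidates,
        key=lambda n: mape(train[n:], moving_average_forecasts(train, n)),
    )


def calibrate_alpha(prices, candidates, train_share=0.7):
    """Pick the smoothing factor with the lowest MAPE on the first 70%."""
    train = prices[:int(len(prices) * train_share)]
    return min(
        candidates,
        key=lambda a: mape(train[1:], exp_smoothing_forecasts(train, a)),
    )
```

The remaining 30% of the series is left untouched by the calibration, so it can be used to compare the calibrated methods with LSTM.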
3.1.3. Experimental Design
The proposed algorithm is described in the flowchart shown in Figure 1. It starts at the first period by inputting the risk-free rate, the budget, and the size of the portfolio. The covariance matrix, the expected logarithmic returns, and the remaining input vectors are also input to the model, corresponding to the first optimization problem (please see Section 3.1.5). The model outputs the transaction cost, the stocks and their quantities (the portfolio), the surplus, the total wealth (including the surplus), and the expected portfolio return minus transaction costs. After these steps, the same matrix and vectors are input for the next period, this time together with the previous portfolio and surplus, so that two values are calculated: the real obtained return and the expected return of the portfolio if it is held. Afterward, the Nth optimization outputs the updated variables. It is then decided whether the expected return from trading is less than the expected return without trading. If so, the previous portfolio is held.
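The control flow of the procedure can be summarized in a short Python sketch. The three callbacks (`optimize_initial`, `optimize_rebalance`, and `expected_return`) are hypothetical placeholders standing in for the optimization models and return estimates described in this section; only the hold-or-trade decision is illustrated.

```python
def run_periods(inputs, optimize_initial, optimize_rebalance, expected_return):
    """Run the initial optimization, then rebalance period by period.

    inputs             : per-period input data (matrix and vectors)
    optimize_initial   : hypothetical callback, data -> (portfolio, surplus)
    optimize_rebalance : hypothetical callback,
                         (data, portfolio, surplus) ->
                         (candidate, new_surplus, expected_return_if_traded)
    expected_return    : hypothetical callback,
                         (data, portfolio) -> expected return if held
    Returns the portfolio held in each period.
    """
    portfolio, surplus = optimize_initial(inputs[0])
    history = [portfolio]
    for data in inputs[1:]:
        candidate, new_surplus, traded = optimize_rebalance(data, portfolio, surplus)
        held = expected_return(data, portfolio)
        # Trade only when trading is not expected to do worse than holding.
        if traded >= held:
            portfolio, surplus = candidate, new_surplus
        history.append(portfolio)
    return history
```

Keeping the previous portfolio when trading is expected to do worse avoids paying transaction costs for no expected gain.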
3.1.4. Simplified Problem
The simplified problem can be solved to optimality because it has no complex constraints; moreover, the solution can be found within a short computing time with the Ipopt® solver. The objective function (2) gives the best risk-return ratio. Therefore, any constraint added to the problem can only yield an equal or worse objective value.
Subject to
3.1.5. Initial Portfolio: Sharpe Ratio
This portfolio selection differs from the following ones because there is only one decision to make: how many shares of each stock to buy. The objective is to maximize the Sharpe Ratio, choosing a given number of stocks. The sets, parameters, and decision variables are the following (see Table 1).
The objective function is calculated by.
Subject to
The objective function (4) maximizes the Sharpe Ratio, which considers the excess return per unit of portfolio risk. Equation (5) shows that the portfolio’s return is given by the proportion of the total investment placed in each stock i multiplied by its expected return. Equations (6) relate the number of shares of each stock to the binary decision of including that stock in the portfolio. These equations limit the number of shares in the portfolio: at most, a quantity equal to the budget divided by the price of the cheapest stock could be acquired. Constraints (7) show that the expected return of stock i is equal to the percentage change of its price with respect to its previous price. Equation (8) shows that the total amount to be invested plus the transaction cost of this investment must be less than or equal to the initial available budget.
The portfolio risk is measured through equation (9). The risk is calculated as the portfolio variance. The variance is equal to the sum of the return covariances weighted by the proportions of the stocks involved, plus the sum of the variances of the returns weighted by the square of the proportion invested in each stock. Equation (10) shows that the total investment amount must be greater than the minimum transaction amount accepted in some markets. Constraint (11) shows that the proportion invested in stock i is equal to the number of shares of stock i owned multiplied by its price, divided by the total investment amount.
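The risk and return terms described above can be illustrated with a small Python sketch. It assumes the standard definition of the Sharpe Ratio, in which the excess return is divided by the standard deviation (the square root of the portfolio variance); the function names and sample numbers are assumptions for this example only.

```python
import math


def portfolio_return(weights, expected_returns):
    """Expected portfolio return: proportions times expected returns."""
    return sum(w * r for w, r in zip(weights, expected_returns))


def portfolio_variance(weights, cov):
    """Portfolio variance: sum over all pairs of w_i * w_j * cov[i][j].

    The diagonal terms give the variances weighted by squared proportions;
    the off-diagonal terms give the weighted covariances.
    """
    n = len(weights)
    return sum(
        weights[i] * weights[j] * cov[i][j] for i in range(n) for j in range(n)
    )


def sharpe_ratio(weights, expected_returns, cov, risk_free):
    """Excess return per unit of portfolio risk (standard deviation)."""
    excess = portfolio_return(weights, expected_returns) - risk_free
    return excess / math.sqrt(portfolio_variance(weights, cov))


# Hypothetical two-stock example.
cov = [[0.04, 0.01], [0.01, 0.09]]
ratio = sharpe_ratio([0.6, 0.4], [0.08, 0.12], cov, 0.03)
```

Stocks with low covariance reduce the off-diagonal terms, which is how diversification enters the ratio.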
Equations (12) determine that if a stock is purchased, the minimum purchase corresponds to one whole unit. Equation (13) shows that the total investment is the sum of the number of shares of each stock multiplied by their respective prices. Equations (14) and (15) determine the maximum and minimum proportion of the portfolio to invest in each stock, respectively. The sum of the proportions must be equal to 1 to ensure that all capital invested in the portfolio is distributed (equation (16)). Equation (17) determines that the number of stocks to be included must be equal to the portfolio size, an arbitrary integer defined by the investor. Finally, (18) determine the nature and integrality of the variables.
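Some of the constraints described for the initial portfolio can be checked with a short Python sketch. It covers only the budget, minimum trade amount, integrality, and portfolio size conditions, and it assumes a proportional broker fee (`fee_rate`), which is an assumption of this example rather than the cost structure of the model; all names are hypothetical.

```python
def check_initial_portfolio(shares, prices, budget, fee_rate, min_trade, size):
    """Check budget, minimum trade, integrality, and portfolio size.

    shares    : number of shares to buy of each stock
    prices    : current price of each stock
    budget    : initial available budget
    fee_rate  : assumed proportional transaction cost (hypothetical)
    min_trade : minimum trade amount required by the broker
    size      : number of different stocks the portfolio must contain
    """
    # Whole, non-negative share counts (short selling is not allowed).
    if any(q < 0 or q != int(q) for q in shares):
        return False
    invested = sum(q * p for q, p in zip(shares, prices))
    cost = fee_rate * invested
    held = sum(1 for q in shares if q > 0)
    return (
        invested + cost <= budget   # investment plus cost within the budget
        and invested >= min_trade   # broker's minimum trade amount
        and held == size            # exact number of stocks in the portfolio
    )


feasible = check_initial_portfolio([10, 0, 5], [100.0, 50.0, 20.0], 1200.0, 0.01, 500.0, 2)
```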
3.1.6. Initial Optimization: Treynor Ratio
The objective function is calculated by.
Subject to.
Constraints (5)–(8) and (10)–(18) plus the following constraints:
The objective function (19) considers the excess return per unit of systematic risk. Equation (20) calculates the portfolio’s systematic risk, equivalent to the sum of the systematic risk of each stock i weighted by its proportion of the total invested money. Finally, equations (21) calculate the systematic risk of stock i, which is equal to the covariance of the returns of stock i with the market returns divided by the variance of the market returns.
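The systematic risk and the Treynor Ratio described above can be sketched in Python as follows. The sample covariance with an n − 1 denominator and the function names are assumptions of this example; the choice of denominator cancels out in the ratio that defines the systematic risk of a stock.

```python
def covariance(x, y):
    """Sample covariance of two equally long return series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def stock_beta(stock_returns, market_returns):
    """Systematic risk: covariance with the market over market variance."""
    return covariance(stock_returns, market_returns) / covariance(
        market_returns, market_returns
    )


def treynor_ratio(weights, expected_returns, betas, risk_free):
    """Excess portfolio return per unit of portfolio systematic risk."""
    port_return = sum(w * r for w, r in zip(weights, expected_returns))
    port_beta = sum(w * b for w, b in zip(weights, betas))
    return (port_return - risk_free) / port_beta


# Hypothetical weekly returns of one stock and of the market index.
beta = stock_beta([0.02, -0.01, 0.03, 0.00], [0.01, -0.02, 0.02, 0.01])
```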
3.2. Rebalancing Optimization: Sharpe Ratio
This formulation is repeated for all remaining periods. The decisions to make are which stocks to buy, sell, or hold. The set and some variables are shared with the first optimization, and some parameters of the initial model are also reused. The additional parameters and decision variables are the following (see Table 2).
The objective function is calculated by.
Subject to.
Constraints (5), (7), (9), (14)–(17) with the following additional constraints:
Equation (22) maximizes the Sharpe Ratio. Equation (23) calculates the portfolio’s profitability, which is diminished by the transaction cost. Equation (24) requires the transaction amount to be greater than the minimum required by the broker. Equations (25) show that the proportion invested in each stock is equal to the number of shares to own multiplied by their respective price, divided by the total investment amount.
The portfolio’s value before optimizing equals the sum of the previously held shares valued at their respective current prices (equation (26)). Equations (27) and (28) restrict the minimum number of stocks and the number of shares to be sold, respectively. Equation (29) defines the total amount, equivalent to the sum of the number of shares to own multiplied by their respective prices. Equation (30) requires the total amount to be purchased to be less than or equal to the total sold amount, plus the money left over from the previous period, minus the cost associated with the transaction. Equation (31) determines that the total purchase amount equals the sum of the number of shares to be purchased multiplied by their respective prices.
Equations (32) relate the number of shares of each stock to the maximum number that can be held. This maximum is equal to the initial value of the portfolio plus the money left over, divided by the price of the cheapest share. Equation (33) indicates that the total sale amount equals the sum of the number of shares to be sold multiplied by their respective prices. Constraints (34) relate the number of shares to buy to the maximum amount, which is equal to the initial value of the portfolio plus the money left over, divided by the price of the cheapest share.
Equation (35) determines that the transaction amount equals the total purchase amount plus the total sale amount. The relationship between the number of shares to be sold and the stocks included in the portfolio is determined by (36). Equations (37) show that the number of shares to be held until the following week must be equal to the number of shares already owned plus the number of shares bought minus the number of shares sold. Equations (38) indicate that a stock can be either purchased or sold, but not both simultaneously. Equation (39) indicates that it is also possible to neither buy nor sell a stock. Finally, (40) determine the nature and integrality of the variables.
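The share balance and the purchase limit of the rebalancing model can be illustrated with the following Python sketch. The function names and the proportional fee on the traded amount (`fee_rate`) are assumptions made for this example; the sketch only checks a given set of trades and does not solve the optimization model.

```python
def apply_trades(owned, bought, sold):
    """Shares held next week: owned + bought - sold, stock by stock."""
    held = []
    for o, b, s in zip(owned, bought, sold):
        if b > 0 and s > 0:
            raise ValueError("a stock cannot be bought and sold in the same period")
        if s > o:
            raise ValueError("cannot sell more shares than owned")
        held.append(o + b - s)
    return held


def purchase_is_affordable(bought, sold, prices, surplus, fee_rate):
    """Purchases must not exceed sales plus leftover money minus the cost.

    fee_rate is an assumed proportional cost on the traded amount,
    where the traded amount is the purchase amount plus the sale amount.
    """
    buy_amount = sum(b * p for b, p in zip(bought, prices))
    sell_amount = sum(s * p for s, p in zip(sold, prices))
    cost = fee_rate * (buy_amount + sell_amount)
    return buy_amount <= sell_amount + surplus - cost


# Hypothetical rebalance: sell 4 shares of the first stock, buy 3 of the second.
held = apply_trades([10, 0, 5], [0, 3, 0], [4, 0, 0])
```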
3.3. Rebalancing Optimization: Treynor Ratio
The objective function is the following:
Subject to constraints (5), (7), (14), (16), (18), (20), (21), and (23)–(40).
4. Results and Discussion
The multi-period optimization problem is implemented using the Julia Mathematical Programming package (JuMP.jl) and solved with the Gurobi® solver v9.1.2. We have performed the computational experiments on the returns of the Chilean market’s high and medium liquidity stocks from June 16th, 2017, to November 18th, 2019. First, the best prediction method must be identified. Next, we look for the most profitable scenario, which must perform better than the IPSA index (the Chilean index).
4.1. Data Source and Data Pre-Processing
The experiment begins by downloading, from the “Santiago Stock Exchange” platform, the daily open, high, low, and close prices and traded volume (OHLCV) of 100 Chilean stocks from 2012 to 2019. These are the most relevant traded stocks for that period. Then, for each stock, the number of days traded between 2012 and 2019 is counted. The initial number of stocks is thus reduced, because only 30 stocks meet the minimum number of trading days.
The daily OHLCV data is transformed into weekly data by finding the highest and lowest price, the open and close price, and the sum of the traded volume within each week. This transformation aims to obtain a longer time horizon than a daily one and more significant changes in stock prices. Furthermore, the data from January 6th, 2012, to October 19th, 2019, is selected, leaving the data set with 407 weeks of OHLCV data. For the same period, the OHLCV data of the IPSA index, which represents the 30 most traded stocks in the Chilean stock market, is downloaded from the Santiago Stock Exchange. The period after the “social outburst” of 2019 and the SARS-CoV-2 pandemic is excluded.
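The daily-to-weekly transformation can be sketched in Python as follows. Grouping days by ISO calendar week and the tuple layout of the records are assumptions of this example, and the sample rows are invented for illustration.

```python
from datetime import date


def to_weekly(daily):
    """Aggregate daily OHLCV rows into weekly OHLCV rows.

    daily : list of (date, open, high, low, close, volume), sorted by date
    Returns a list of (iso_year, iso_week, open, high, low, close, volume).
    """
    weeks = {}
    for day, open_, high, low, close, volume in daily:
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        if key not in weeks:
            # First trading day of the week sets the weekly open.
            weeks[key] = [open_, high, low, close, volume]
        else:
            bar = weeks[key]
            bar[1] = max(bar[1], high)  # weekly high
            bar[2] = min(bar[2], low)   # weekly low
            bar[3] = close              # last close seen so far
            bar[4] += volume            # traded volume is summed
    return [key + tuple(bar) for key, bar in weeks.items()]


# Invented sample rows, for illustration only.
sample = [
    (date(2012, 1, 2), 10, 12, 9, 11, 100),
    (date(2012, 1, 3), 11, 15, 10, 14, 50),
    (date(2012, 1, 9), 14, 14, 13, 13, 70),
]
weekly = to_weekly(sample)
```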
4.2. Prediction Methods
A comparison is performed between the three methods, in which exponential smoothing seems to reach the best MAPE, followed by LSTM and then the moving average of two periods. Table 3 shows the prediction comparison. The first column of Table 3 shows the method used: MA-2 (moving average), ES (exponential smoothing), and LSTM (neural network with Long Short-Term Memory). The second column shows the average MAPE obtained by each forecasting method. Finally, Table 4 shows the standard deviation of the MAPE.
In addition, to test the quality of the prediction, a pilot test has been performed with the three best methods (Figure 2), choosing Sharpe Ratio as the objective function and using the simplified formulation to reach an optimal portfolio.
Exponential Smoothing performs best with an alpha near 1, which makes the predicted price almost equal to the current one. The expected returns are then near zero, making it difficult to decide which risk-return ratio is the best because all expected returns are nearly identical. The moving average has a similar issue: the prediction depends on the current price, and adding the previous value does not help to improve it either. Consequently, LSTM offers the best prediction quality.
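The near-zero expected returns follow directly from the smoothing recursion. As a short worked derivation, assume simple exponential smoothing in its standard form, with notation introduced here for illustration: $P_t$ is the current close price, $\hat{P}_t$ its forecast, $\alpha$ the smoothing parameter, and $\hat{r}_{t+1}$ the expected return implied by the forecast.

$$\hat{P}_{t+1} = \alpha P_t + (1-\alpha)\hat{P}_t$$

$$\hat{r}_{t+1} = \frac{\hat{P}_{t+1} - P_t}{P_t} = (1-\alpha)\,\frac{\hat{P}_t - P_t}{P_t} \;\longrightarrow\; 0 \quad \text{as } \alpha \to 1$$

With $\alpha$ close to 1, every stock therefore receives an expected return close to zero, regardless of how low the MAPE is.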
We need to find the values of p and q that minimize the error criterion, in this case the Akaike information criterion (AIC), using an exhaustive search over values of p and q from 1 to 3. Table 5 shows the parameter calibration.
Hence, GARCH {2, 2} shows the best performance, with the lowest AIC; that is, to adjust the risk, GARCH looks at the perturbations ($\epsilon$) and volatility ($\sigma^2$) from two time steps back. GARCH {2, 2} is then used in the predict function, which is also provided by the ARCHModels.jl package. It works on consecutive batches of 52 weeks of the close price of a stock to adjust a volatility level for the next period; more specifically, the volatility is expressed as variance, and one year of data is used to predict the risk level for the next week. The risk of CHILE, the stock of “Banco de Chile,” is shown in Figure 3. Thus, this method can adjust the heteroskedastic risk value for each period.
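For reference, a GARCH model with two lags of each term can be written in its standard textbook form as below. The symbols $\omega$, $\alpha_i$, and $\beta_j$ are assumptions of this sketch, not notation taken from the calibration tables; $\epsilon_t$ is the perturbation and $\sigma_t^2$ the conditional variance at period $t$.

$$\sigma_t^2 = \omega + \alpha_1 \epsilon_{t-1}^2 + \alpha_2 \epsilon_{t-2}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2$$

The criterion used to compare the candidate orders is, in its usual definition,

$$\mathrm{AIC} = 2k - 2\ln\hat{L}$$

where $k$ is the number of estimated parameters and $\hat{L}$ is the maximized likelihood, so a lower AIC rewards fit while penalizing additional lags.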
The input data are the OHLCV of a stock and of the IPSA, used to predict the stock’s close price for the next week. This case represents one of the 30 stocks considered in the study. The data is split into two sets: 70% goes to a training set that lets the model learn, and the remaining 30% goes to a testing set used to calibrate parameters such as the Number of Neurons (NA), Iterations (EPOCH), and Batch Size (BS); the combinations are chosen following the current literature. A command is included that picks chronologically random data from the time series to avoid overfitting, making a synthetic database; thus, the calibration is not made over the same data used for the optimization. MAPE is the error measure: the mean of the 30 stocks’ MAPEs is calculated for each parameter combination, and the combination with the minimum value, N°18, is chosen.
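The split and the selection rule can be sketched as follows. This is an illustration under assumptions, not the calibration code: the function names are hypothetical, the split is taken as a plain chronological cut, and the MAPE values in the usage example are invented.

```python
def chronological_split(series, train_pct=70):
    """Split a time series into training and testing parts, preserving order."""
    cut = len(series) * train_pct // 100  # integer arithmetic avoids float rounding
    return series[:cut], series[cut:]


def best_combination(mape_by_combo):
    """Pick the parameter combination with the lowest mean MAPE across stocks.

    mape_by_combo: {combination_id: [MAPE of each stock]}.
    Returns (best_id, {combination_id: mean MAPE}).
    """
    means = {cid: sum(vals) / len(vals) for cid, vals in mape_by_combo.items()}
    best_id = min(means, key=means.get)
    return best_id, means


# 407 weekly observations, as in the data set; the values are placeholders.
train, test = chronological_split(list(range(407)))

# Hypothetical MAPE values for two combinations and three stocks.
best_id, means = best_combination({17: [2.0, 4.0, 3.0], 18: [1.0, 2.0, 3.0]})
```

Averaging the MAPE over all stocks before taking the minimum selects one configuration for the whole universe rather than tuning each stock separately.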
As an example, Figure 4 shows the LSTM-based prediction versus the actual price of the “Banco de Chile (CHILE)” stock for the last 122 periods.
4.3. Portfolio Comparison of 5 and 10 Stocks
For each ratio, portfolios of 5 and 10 stocks are selected, with a computing time of 1800 seconds, a budget of 1,000,000 CLP, and LSTM-based expected returns. Figure 5 shows the portfolio information for the first and last two periods for the Sharpe Ratio: the portfolio weekly expected return, “Return” (the real yield of the week), and “Real Rmax” (the weekly return minus the transaction cost). Table 6 shows the portfolio performance comparison.
Figure 5 shows the cumulative returns for the different configurations. SR5 and SR10 show the performance obtained with 5 and 10 stocks using the Sharpe Ratio, respectively. TR5 and TR10 show the performance obtained with 5 and 10 stocks using the Treynor Ratio. Finally, IPSA describes the performance of the Chilean market index. Note that the best portfolio selection is the Sharpe Ratio with ten stocks (SR10), and the worst one is the Treynor Ratio with ten stocks (TR10).
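As an illustration of how weekly returns accumulate once the transaction cost is deducted, consider the sketch below. It assumes that weekly net returns are compounded and that the cost is expressed as a fraction of the invested amount; both are assumptions of this example, as is the function name.

```python
def cumulative_return(weekly_returns, weekly_costs=None):
    """Compound weekly net returns (return minus cost) into a cumulative return.

    weekly_returns and weekly_costs are fractions (0.01 means 1%).
    Compounding is an assumption of this sketch.
    """
    if weekly_costs is None:
        weekly_costs = [0.0] * len(weekly_returns)
    growth = 1.0
    for r, c in zip(weekly_returns, weekly_costs):
        growth *= 1.0 + (r - c)  # net growth factor of the week
    return growth - 1.0


# Hypothetical two-week example.
gross = cumulative_return([0.10, 0.10])
net = cumulative_return([0.10, 0.10], [0.01, 0.01])
```

Comparing `gross` and `net` shows how a small weekly cost erodes the cumulative figure over many periods.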
4.4. Simplified Formulation versus SR10 and TR5
The simplified formulation is applied using the Sharpe Ratio as the objective function. Table 7 shows the cumulative returns of the simplified formulation in the first column, without considering the transaction cost; the remaining columns show the SR10 and TR5 portfolio returns reduced by the transaction cost. The cumulative return has been increasing at a quarterly average of 11.02%, 6.70%, and 2.35% for the simplified formulation, SR10, and TR5 (the latter two with transaction cost), respectively.
Even if the simplified portfolio does not consider the transaction cost, this could be estimated assuming the same budget. Table 8 shows the comparison of the different methods for the considered problem. The first column of Table 8 describes the cumulative return and the cumulative return by considering transaction costs. We have compared the simplified portfolio, the SR10, and TR5. The new value of the cumulative return for the simplified portfolio is reduced by 24.57%, reaching 78.04% as Cumulative Rmax.
Figure 6 shows how the stock selection is close to the border of the cluster; this resembles an efficient frontier, validating the portfolio selection.
4.5. Discussion
The research addresses the design of a portfolio selection method. It begins by forecasting risk with an econometric procedure and returns with artificial intelligence. These predictions feed an exact method that optimizes the multi-period portfolio problem.
An RNN algorithm is compared to simple forecasting methods for time series. Because a simple method achieves a better MAPE than LSTM, the MAPE criterion is discarded. However, a pilot test showed the superior prediction quality of LSTM. Indeed, the Recurrent Neural Network with Long Short-Term Memory performs better in portfolio selection than the simple forecasting methods.
Markowitz’s method is the main concept behind this approach, which seeks the best risk-return ratio; here the Sharpe Ratio performs better than the Treynor Ratio in both the 5- and 10-stock portfolios. This result is attributed to the heteroskedastic risk estimated by GARCH, which is considered in the Sharpe Ratio, unlike the Treynor Ratio, which has a constant systematic risk over the 122 periods. Furthermore, it is concluded that the ten-stock Sharpe Ratio portfolio (SR10) has the largest cumulative return due to the diversification expressed as the correlation between the assets. Although the simplified Sharpe formulation outperforms every portfolio selection, it does not deal with the realistic problem, so its optimal solution can be used only as an upper bound for the proposed method.
This approach resembles a multi-period version of Markowitz’s formulation with a weekly horizon. The expected returns are predicted through LSTM instead of the mean of the historical returns, and the variance for every week is estimated with GARCH, which is used in the Sharpe Ratio. The Treynor Ratio uses a constant risk; this makes the Treynor Ratio perform worse than the Sharpe Ratio, but even the worst portfolio selected (TR10) beats the IPSA index, that is to say, the market.
5. Concluding Remarks
The portfolio selection problem is widely studied in the literature, but the focus is always on the computational dimension of the problem. This paper deals with the complexity of a realistic problem, proposing a mathematical formulation fed by an econometric method and a machine learning method. Therefore, an investor guided by this method need not consider any subjective variable, such as opinion, news, or sentiment. Thus, this approach to the portfolio selection problem allows an investor to decide where to allocate money through a method that helps to find the best combination of assets, considering the transaction cost, the stock integrality, and the minimum trade amount, and ensuring that the budget is not exceeded.
However, the model has some limitations. For instance, the close price is assumed to be known and is used both for prediction and as the bid and ask price; hypothetically, the prediction and the trade are therefore made at the same time, exactly when the market closes, which is virtually impossible. Finally, in practice, the price changes constantly; this poses the challenge of reducing the computing time, using heuristics and metaheuristics to obtain a good solution quickly and thereby limiting the price change between the moment the price is input to the model and the moment a solution is obtained.
In future work, we propose using extended multiobjective methods for logistic problems such as those proposed by [51–53]. Moreover, we could propose metaheuristic algorithms based on the granular tabu concept [54–56] for a large number of stocks. Besides, multicriteria techniques such as those proposed by [2] could solve real problems.
Appendix
A. Markowitz Model
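The mean-variance model proposed by [5] considers the following risk (A.1) and return (A.2) functions:
$$\sigma_p^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij} \tag{A.1}$$
$$E(R_p) = \sum_{i=1}^{n} w_i \mu_i \tag{A.2}$$
Subject to
$$\sum_{i=1}^{n} w_i = 1 \tag{A.3}$$
where $w_i$ and $w_j$ determine the percentage of the invested budget in stocks $i$ and $j$, and $\sigma_{ij}$ is the covariance of returns between stocks $i$ and $j$. Finally, the average of the historical returns of stock $i$ is $\mu_i$. The model is subject to the constraint that the proportions invested must add up to the available budget of the investor, as expressed in equation (A.3).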
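The risk and return functions of the mean-variance model can be evaluated numerically as in the sketch below. The function names and the two-stock figures are hypothetical; weights are fractions of the budget that add up to 1.

```python
def portfolio_return(weights, mean_returns):
    """Expected portfolio return: weighted sum of the average historical returns."""
    return sum(w * m for w, m in zip(weights, mean_returns))


def portfolio_variance(weights, cov):
    """Portfolio risk: double sum of w_i * w_j * covariance(i, j)."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


# Hypothetical two-stock example with uncorrelated returns.
weights = [0.5, 0.5]
mean_returns = [0.02, 0.04]
cov = [[0.04, 0.00],
       [0.00, 0.04]]

expected = portfolio_return(weights, mean_returns)
variance = portfolio_variance(weights, cov)
```

In this example each stock has a variance of 0.04, while the equally weighted portfolio has a variance of 0.02, which illustrates the diversification effect.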
B. Sharpe Ratio
The Sharpe Ratio [57] is defined as the relationship between the additional benefit of an investment fund (the difference between the return of the fund and the return of the risk-free asset) and its volatility, measured as its standard deviation. The Sharpe Ratio is calculated with
$$S = \frac{\bar{R}_p - R_f}{\sigma_p}$$
where $\bar{R}_p$ is the average return of the financial asset (normally a stock or fund), $R_f$ is the average return of the risk-free asset, and $\sigma_p$ is the standard deviation of the returns of the asset [21].
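A minimal numerical sketch of this ratio is given below. The function name, the use of the sample standard deviation, and the return figures are assumptions made for the example.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free):
    """Excess of the average return over the risk-free return, per unit of volatility.

    The sample standard deviation is used here; this choice is an assumption.
    """
    return (mean(returns) - risk_free) / stdev(returns)


# Hypothetical weekly returns and risk-free rate.
ratio = sharpe_ratio([0.01, 0.03, 0.02], risk_free=0.01)
```

Here the average return is 0.02 and the sample standard deviation is 0.01, so the excess return of 0.01 yields a ratio of about 1.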
C. Treynor Ratio
The Treynor Ratio is calculated as the excess return per unit of systematic risk, unlike the Sharpe Ratio, which considers the portfolio’s total risk [58]:
$$T = \frac{\bar{R}_p - R_f}{\beta_p}$$
$\beta_p$ is a systematic risk measure that considers the variation of the portfolio with respect to the market. $\beta_p$ is the weighted average of the individual betas of the portfolio assets:
$$\beta_p = \sum_{i=1}^{n} w_i \beta_i$$
$\beta_i$ corresponds to the relationship between the risk of the asset and the market risk. This value measures the sensitivity of a change in the average return of an individual investment to the change in the market’s return. The beta of the market is equal to 1. If an investment shows a $\beta$ greater than 1, this asset is riskier than the market. An investment with a $\beta$ less than 1 is less risky than the market. An investment with $\beta$ equal to zero is risk-free, such as treasury bonds [1].
$$\beta_i = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)}$$
where $R_i$ is the return of the asset and $R_m$ is the market return.
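The beta of an asset, the portfolio beta, and the resulting ratio can be sketched numerically as follows. The function names and all figures are hypothetical; beta is computed in its usual form, as the covariance with the market divided by the market variance.

```python
from statistics import mean


def beta(asset_returns, market_returns):
    """Sensitivity of the asset to the market: Cov(asset, market) / Var(market)."""
    ma, mm = mean(asset_returns), mean(market_returns)
    cov = sum((a - ma) * (m - mm) for a, m in zip(asset_returns, market_returns))
    var = sum((m - mm) ** 2 for m in market_returns)
    return cov / var  # the common 1/(n-1) factor cancels out


def portfolio_beta(weights, betas):
    """Weighted average of the individual betas of the portfolio assets."""
    return sum(w * b for w, b in zip(weights, betas))


def treynor_ratio(portfolio_return, risk_free, beta_p):
    """Excess return per unit of systematic risk."""
    return (portfolio_return - risk_free) / beta_p


# Hypothetical returns: the asset moves twice as much as the market.
market = [0.01, 0.02, 0.03]
asset = [0.02, 0.04, 0.06]
beta_asset = beta(asset, market)
beta_market = beta(market, market)
beta_p = portfolio_beta([0.5, 0.5], [beta_asset, beta_market])
ratio = treynor_ratio(0.04, 0.01, beta_p)
```

The market’s beta against itself is 1, and the asset in this example, which amplifies every market move by a factor of two, has a beta of about 2.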
Data Availability
The data can be provided upon e-mail request to the corresponding author.
Conflicts of Interest
The author(s) declare(s) that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the University of Bío-Bío, Project number 2160277 GI/EF.