Abstract

The traditional portfolio selection model seriously overestimates its theoretical optimal return. To address this problem, two portfolio selection models based on Bayesian theory are proposed to correct the parameter estimates and enhance portfolio performance. First, a Bayesian-GARCH(1,1) model is built. Second, a Markov chain is applied to describe the state transitions of the parameters, and a Bayesian Markov regime-switching GARCH(1,1) (BMS-GARCH(1,1)) model is constructed. Both models can handle the overestimation problem and can yield self-financing portfolios. In the numerical experiments, both models are examined with data from the Chinese stock market, and their performances are compared and analyzed. The results show that the BMS-GARCH(1,1) model is superior to the Bayesian-GARCH(1,1) model.

1. Introduction

Portfolio optimization and diversification have been instrumental in the development and understanding of financial markets and financial decision-making. The major breakthrough came in 1952 with the publication of Harry Markowitz's theory of portfolio selection, which is now popularly referred to as modern portfolio theory. According to this theory, both the return and the risk of a security can be quantified using the statistical measures of its expected return and standard deviation. Investors should then consider return and risk together and allocate funds among investment alternatives on the basis of their return-risk trade-off. The idea that sound financial decision-making is a quantitative trade-off between return and risk was revolutionary, and it has had a major impact on academic research and the financial industry as a whole.

However, there are still some concerns with the theory. One is that the original mean-variance model depends on estimated parameters, which are uncertain and may be misestimated. Specifically, return-risk optimization can be very sensitive to changes in the inputs, especially when the return and risk estimates are not well aligned or when the problem formulation uses multiple, interacting constraints. As a result, many practitioners consider the output of return-risk optimization to be opaque, unstable, and/or unintuitive. Moreover, estimation errors in the forecasts significantly affect the resulting portfolio weights. For example, it is well known that, in practical applications, equally weighted portfolios often outperform mean-variance portfolios [1–4]. The classical framework has to be modified when used in practice in order to achieve reliability, stability, and robustness with respect to model and estimation errors [5, 6]. In other words, portfolio selection models face estimation risk. Therefore, how to minimize the estimation risk of a portfolio has drawn much attention in academia.

Recently, parameter uncertainty has increasingly been analyzed in a Bayesian framework. The Bayesian method has several advantages: first, it can make full use of prior information; second, it accounts for the uncertainty of the estimated risks and models; third, it makes computation with complicated variables tractable through simulation. In this approach, investors first set a prior for the parameters and then use Bayes' rule to obtain the posterior. As a result, Bayesian methods are gradually being adopted in portfolio selection analysis. Avramov and Zhou [7] reviewed the development of Bayesian portfolio analysis under the following settings: (1) the returns are i.i.d.; (2) the returns are predictable; (3) the mean follows regime switching and the volatility is stochastic. They mainly discussed Bayesian portfolios under parameter uncertainty as well as under model uncertainty. In Fabozzi's review “60 Years of Portfolio Optimization: Practical Challenges and Current Trends,” the authors also speak highly of the application of Bayesian methods in portfolio optimization [8]. In this paper, we take financial data with different history lengths as the application background, focus mainly on parameter uncertainty in portfolio selection, and build two models based on Bayesian theory and modern portfolio theory: a Bayesian-GARCH(1,1) model and a Bayesian-Markov Regime Switching-GARCH(1,1) model. With Bayesian estimation, neither model overestimates its theoretical optimal return, so both can handle the overestimation problem.

The paper is organized as follows. In Section 2, the background studies are briefly introduced, including the literature review and the relevant theories and methods, such as Bayesian theory and Markov regime switching. In Section 3, a Bayesian-GARCH(1,1) model and a Bayesian-Markov Regime Switching-GARCH(1,1) model are constructed. In Section 4, the two models are applied to real data, and the numerical results are illustrated and compared. Section 5 concludes the paper with summary comments and pointers to future work.

2. Background Studies

2.1. Literature Review

It is well known that the parameters in portfolio selection may be uncertain or misestimated and that return-risk optimization can be very sensitive to changes in the inputs. In early studies, most researchers assumed that risky asset returns were i.i.d. and focused on the effect of uncertainty in the mean and variance on portfolio selection. Williams [9] discussed how parameter uncertainty affected portfolio selection. Bawa et al. [10] were the first to consider uncertainty in the parameters of the return distribution in static portfolio selection. Gennotte [11] analyzed portfolio selection when the expected return was uncertain. Kandel and Stambaugh [12] showed that, when returns were predictable, investors who ignored parameter uncertainty preferred risky stocks over short horizons. Zenios et al. [13] developed multiperiod dynamic models for fixed-income portfolio management under uncertainty, using multistage stochastic programming with recourse; the multiperiod models outperformed classical models based on portfolio immunization as well as single-period models. Brennan [14] studied dynamic portfolio selection in discrete time under the assumption that the expected risk premium was uncertain. He argued that financial models should assume that all investment opportunities depend on a series of unobservable variables and that the parameters are not known with certainty. Barberis [15] obtained similar results for long-horizon investors. Pástor and Stambaugh [16] discussed how uncertainty about mispricing affected portfolio selection. Pástor [17] treated model uncertainty by using investors’ beliefs about asset pricing as prior information. Bai et al. [18, 19] developed a bootstrap-corrected estimator to correct the overestimation in portfolio selection and further extended the theory to obtain self-financing portfolios.
In addition, to improve the efficiency of portfolio selection, researchers have extended the models with dynamic programming [13, 20, 21], continuous-time frameworks [22, 23], Markovian regime-switching methods [24], and so on.

Currently, most analyses of parameter uncertainty are carried out in a Bayesian framework. As mentioned before, the Bayesian method has several advantages: first, it can make full use of prior information; second, it accounts for the uncertainty of the estimated risks and models; third, it makes computation with complicated variables tractable through simulation. Investors first set a prior for the parameters and then use Bayes' rule to obtain the posterior. One line of work uses Bayesian theory mainly to estimate uncertain returns or other quantities (such as shrinkage estimators) in the portfolio [15, 25–29]. In addition, for parameter uncertainty in portfolio selection, that is, for evaluating estimation risk or correcting estimation error, researchers also tend to choose Bayesian methods [4, 30–34]. Bayesian theory is also used in model analysis [35], risk measurement or the return-risk trade-off [36], portfolio optimization [37, 38], and so on.

2.2. Modern Portfolio Theory

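According to Markowitz’s portfolio theory, suppose there are $N$ assets. Let $r_{i,t}$ be the excess return of asset $i$ at time $t$, $r_t = (r_{1,t}, \ldots, r_{N,t})'$, and assume that the returns follow a normal distribution $r_t \sim N(\mu, \Sigma)$, where $\mu$ is an $N \times 1$ vector and $\Sigma$ is an $N \times N$ matrix. The weights are denoted as $w = (w_1, \ldots, w_N)'$. The return of the portfolio is

$$r_p = \sum_{i=1}^{N} w_i r_i. \tag{1}$$

The expected return and variance are

$$E(r_p) = \sum_{i=1}^{N} w_i E(r_i), \qquad \sigma_p^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_{ij}, \tag{2}$$

where $r_i$ is the return of asset $i$ and $\sigma_{ij}$ is the covariance of assets $i$ and $j$.

Suppose investors hold the portfolio for a period $\tau$ and their target is to maximize the value at time $T + \tau$, where $T$ is the time when the portfolio is built. The mean-variance model can be expressed as the following optimization problem:

$$\min_{w} \; w' \Sigma w \quad \text{s.t.} \quad w' \mu \ge \mu_0, \quad w' \mathbf{1} = 1, \tag{3}$$

where $\mu_0$ is the permitted minimum return.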
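To make the optimization concrete, the following is a minimal sketch of a minimum-variance portfolio for a given target return. It is not the estimation procedure of this paper: it assumes that the return constraint binds (so it is treated as an equality), that short-selling is allowed, and that the covariance matrix is positive definite. The function names `solve_linear` and `min_variance_weights` and the `budget` argument are hypothetical; setting `budget=0.0` gives weights that sum to zero, as in a self-financing portfolio.

```python
from typing import List, Sequence


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def min_variance_weights(mu: Sequence[float], sigma: Sequence[Sequence[float]],
                         target: float, budget: float = 1.0) -> List[float]:
    """Minimize w' Sigma w subject to w' mu = target and sum(w) = budget."""
    n = len(mu)
    # First-order conditions: 2 Sigma w + l1 mu + l2 1 = 0, plus the two constraints.
    kkt = [[2.0 * sigma[i][j] for j in range(n)] + [mu[i], 1.0] for i in range(n)]
    kkt.append(list(mu) + [0.0, 0.0])
    kkt.append([1.0] * n + [0.0, 0.0])
    rhs = [0.0] * n + [target, budget]
    return solve_linear(kkt, rhs)[:n]
```

With estimation error in `mu` and `sigma`, the weights returned by such a routine inherit that error, which is the sensitivity discussed in the Introduction.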

2.3. Bayesian Theory

Bayes first presented the Bayesian formula in his essay. Later on, Laplace put the new approach into application. Statisticians then developed it into a systematic statistical method, the Bayesian theory. Under this theory, each unknown parameter can be treated as a random variable and described by a probability distribution. This distribution summarizes the information available before sampling and hence is called the prior distribution, or prior.

Traditional statistical estimation depends on the assumed distributions and the data. In contrast, the prior is the foundation of Bayesian estimation, and the posterior is its key output. For an unknown parameter θ, the posterior is the conditional distribution of θ given the sample x, and it contains all the information that is available.
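As a reminder of the standard relation between prior and posterior, Bayes' rule can be written as below. The symbols $\pi(\theta)$ for the prior and $f(x \mid \theta)$ for the sampling density are notation assumed here for illustration.

```latex
\pi(\theta \mid x)
  = \frac{f(x \mid \theta)\,\pi(\theta)}{\int f(x \mid \theta)\,\pi(\theta)\,d\theta}
  \;\propto\; f(x \mid \theta)\,\pi(\theta)
```

The denominator does not depend on θ, which is why simulation methods that only need the posterior up to a constant are convenient.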

2.4. GARCH Model

Engle presented the ARCH (autoregressive conditional heteroskedasticity) model in 1982, and the method has since become popular in empirical studies of volatility. However, in real stock markets, the conditional variance depends not only on one or two previous values but on many lagged quantities, which requires a large number of lags and parameters. To address this, Bollerslev presented the GARCH model:

$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i u_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2, \tag{4}$$

where $\omega > 0$, $\alpha_i \ge 0$, and $\beta_j \ge 0$. In general, the real market can be characterized well when $p = q = 1$.
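The GARCH(1,1) variance recursion can be illustrated with a short simulation. This is a sketch under assumptions not stated in the text: Gaussian innovations, a zero conditional mean, the stationarity condition alpha + beta < 1, and a start at the unconditional variance. The function name `simulate_garch11` and its arguments are hypothetical.

```python
import random
from typing import List, Tuple


def simulate_garch11(n: int, omega: float, alpha: float, beta: float,
                     seed: int = 0) -> Tuple[List[float], List[float]]:
    """Simulate n shocks u_t = sigma_t * eps_t with standard normal eps_t."""
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # assumed start: unconditional variance
    shocks: List[float] = []
    variances: List[float] = []
    for _ in range(n):
        u = (sigma2 ** 0.5) * rng.gauss(0.0, 1.0)
        shocks.append(u)
        variances.append(sigma2)
        # GARCH(1,1) recursion: next variance from last shock and last variance
        sigma2 = omega + alpha * u * u + beta * sigma2
    return shocks, variances
```

Large shocks raise the next-period variance through the `alpha` term, and the `beta` term makes that increase persist, which produces volatility clustering.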

2.5. MCMC Simulation

MCMC sampling was proposed by Metropolis et al. [36] and later generalized by Hastings [40]. MCMC simulation relies on a Markov chain, and the main methods are the Gibbs sampler and the Metropolis–Hastings algorithm. Derin et al. [41] presented the Gibbs sampler in a statistics article, and it is now the most popular sampler in MCMC. With the Gibbs sampler, each parameter is drawn from its full conditional probability density, which makes sampling straightforward whenever these densities have a standard form. However, the Gibbs sampler cannot always handle complicated Bayesian problems in which the full conditional densities derived from the joint density are not of standard form. In these cases, the Metropolis–Hastings (M-H) algorithm is applied. Take a random variable θ as an example: for a symmetric proposal density $q(\theta^* \mid \theta^{(t-1)}) = q(\theta^{(t-1)} \mid \theta^*)$, the acceptance probability of the M-H algorithm is

$$a(\theta^*, \theta^{(t-1)}) = \min\left\{1, \frac{p(\theta^* \mid x)}{p(\theta^{(t-1)} \mid x)}\right\}. \tag{5}$$

In this way, the acceptance probability depends only on a ratio of posterior densities, so neither the proposal density nor the normalizing constant of the posterior needs to be calculated. It should be mentioned that Bayesian parameter estimation based on simulation depends on the convergence of the Markov chain: once the chain has converged, the samples generated by the simulation follow the target posterior distribution. In posterior simulation, the aim is therefore to generate a fully ergodic Markov chain.
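A minimal sketch of the M-H algorithm with a symmetric proposal is given below for a scalar parameter. It assumes a Gaussian random-walk proposal, which is one common symmetric choice and not necessarily the one used later in this paper. The names `random_walk_metropolis`, `log_post`, and `step` are hypothetical.

```python
import math
import random
from typing import Callable, List


def random_walk_metropolis(log_post: Callable[[float], float], start: float,
                           n_iter: int, step: float, seed: int = 0) -> List[float]:
    """Sample a scalar parameter from a density known up to a constant."""
    rng = random.Random(seed)
    theta, lp = start, log_post(start)
    draws: List[float] = []
    for _ in range(n_iter):
        cand = theta + rng.gauss(0.0, step)  # symmetric proposal around theta
        lp_cand = log_post(cand)
        # Accept with probability min(1, p(cand) / p(theta)), evaluated in logs.
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            theta, lp = cand, lp_cand
        draws.append(theta)  # a rejected candidate repeats the current value
    return draws
```

For example, `random_walk_metropolis(lambda x: -0.5 * x * x, 0.0, 20000, 1.0)` targets a standard normal density; discarding the first part of the chain as burn-in mirrors the practice used in the numerical examples.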

3. Portfolio Selection Models Based on Bayes

3.1. Bayesian-GARCH(1,1) Model

In this section, the residuals of the return are assumed to follow a t-distribution, and the posterior is simulated with the Metropolis–Hastings algorithm and the Gibbs sampler. Suppose the return of asset i follows the GARCH(1,1) model as follows:

Equations (6) and (7) show the process that the asset return and volatility follow. Equation (6) is the mean equation, and (7) is the volatility equation, where is the return of asset i at time t, is the variance (volatility) of the return of asset i between time and t, and . For convenience, assume is unchangeable. If the variance is consistent, then (6) turns to a linear regression. For the parameters , , and , they will be estimated with Bayesian methods.
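For reference, a common textbook form of a GARCH(1,1) model with a regression mean is sketched below. The symbols $X_{t-1}$, $\gamma_i$, $\omega_i$, $\alpha_i$, $\beta_i$, and $u_{i,t}$ are assumptions chosen for illustration and may differ from the notation of equations (6) and (7).

```latex
r_{i,t} = X_{t-1}\gamma_i + u_{i,t}, \qquad
u_{i,t} = \sigma_{i,t\mid t-1}\,\varepsilon_{i,t}, \\
\sigma_{i,t\mid t-1}^{2} = \omega_i + \alpha_i\,u_{i,t-1}^{2}
  + \beta_i\,\sigma_{i,t-1\mid t-2}^{2}
```

In this form, a constant conditional variance reduces the first line to an ordinary linear regression with homoscedastic errors.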

3.2. Likelihood Function and Prior of Parameters
3.2.1. Likelihood Function

With the assumptions of assets and parameters above, denote the parameter variable of the model as . Suppose follows a t-distribution with degrees of freedom, the likelihood function can be written as follows:where is a parameter variable of asset i and is its conditional variance. The initial conditional covariance is a constant, and follows a t-distribution with degrees of freedom . Its conditional variance at time t is .
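As an illustration of how such a likelihood can be evaluated, the sketch below computes the log-likelihood of a residual series under a GARCH(1,1) variance recursion with Student-t innovations. It assumes the innovations follow a non-standardized t-distribution with `nu` degrees of freedom and that the initial conditional variance `sigma2_0` is a known constant used for the first observation. The function name and argument names are hypothetical.

```python
import math
from typing import Sequence


def t_garch11_loglik(resid: Sequence[float], omega: float, alpha: float,
                     beta: float, nu: float, sigma2_0: float) -> float:
    """Log-likelihood of residuals u_t = sigma_t * eps_t, eps_t ~ Student-t(nu)."""
    # Log of the t normalizing constant: Gamma((nu+1)/2) / (Gamma(nu/2) sqrt(nu pi))
    const = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi))
    sigma2 = sigma2_0
    total = 0.0
    for u in resid:
        total += (const - 0.5 * math.log(sigma2)
                  - 0.5 * (nu + 1.0) * math.log1p(u * u / (nu * sigma2)))
        sigma2 = omega + alpha * u * u + beta * sigma2  # variance for next step
    return total
```

Adding the log prior to this value gives the log posterior up to a constant, which is all that an M-H step requires.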

3.2.2. Prior of Parameters

In many cases, people’s priors are vague and thus difficult to translate into an informative prior. We therefore want to reflect the uncertainty about the model parameters without substantially influencing the posterior parameter inference. The so-called noninformative priors, also called vague or diffuse priors, are employed to that end. In this section, we suppose that there is a noninformation prior for the conditional variance parameters of asset i:where is an indicator function for constraints of parameters as follows:

Geweke (1993) proposed that the prior of the degrees-of-freedom parameter $\nu$ can be an exponential distribution, with density function

$$\pi(\nu) = \lambda \exp(-\lambda \nu). \tag{11}$$

Under this distribution, the prior mean is $1/\lambda$, so $\lambda$ can be determined from the degrees of freedom expected for the t-distribution. Several empirical studies show that, in financial markets, the degrees of freedom are typically less than 20. Hence, the upper bound K is set to 20. According to Bauwens et al. [42], if a diffuse prior for $\nu$ is specified on $[0, \infty)$, its posterior will not be proper. So, here we use the exponential prior shown in (11). Then, assume that the prior of $\gamma$ follows a normal distribution:

3.3. Parameter Estimation with Bayesian-GARCH(1,1)

Assume there are N assets, of which assets are of T period. Denote the return of asset i as (), and all the T variables are i.i.d, following the same multivariate normal distribution. In addition, the rest of the () assets are of S periods, and denote their return as ().

According to the prior assumptions, the posterior of is

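Obviously, there is no analytic solution for the joint posterior density function. Therefore, it is convenient to represent the t-distribution as a scale mixture of normal distributions before sampling. Suppose the return of asset $i$ follows a t-distribution with $\nu_i$ degrees of freedom, conditional mean $\mu_{i,t}$, and scale $\sigma_{i,t|t-1}^2$:

$$r_{i,t} \sim t_{\nu_i}\left(\mu_{i,t}, \sigma_{i,t|t-1}^2\right). \tag{14}$$

It is equivalent to the normal distribution as follows:

$$r_{i,t} \mid \eta_{i,t} \sim N\left(\mu_{i,t}, \frac{\sigma_{i,t|t-1}^2}{\eta_{i,t}}\right), \tag{15}$$

where the $\eta_{i,t}$ are called mixing variables and follow an i.i.d. Gamma distribution:

$$\eta_{i,t} \sim \text{Gamma}\left(\frac{\nu_i}{2}, \frac{\nu_i}{2}\right). \tag{16}$$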

This substitution helps to make the calculation of the posterior easy. After the replacement, the parameter variable of asset i becomes , and its logarithm likelihood function turns to normal logarithm likelihood function:
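The equivalence between a t-distribution and a normal distribution with Gamma mixing variables can be checked by simulation. The sketch below assumes a standard t-distribution (zero mean, unit scale) and the shape/rate parameterization Gamma(nu/2, nu/2) for the mixing variable; the function name `student_t_via_mixture` is hypothetical.

```python
import random
from typing import List


def student_t_via_mixture(nu: float, n: int, seed: int = 0) -> List[float]:
    """Draw n Student-t(nu) variates as normals with Gamma-distributed precision."""
    rng = random.Random(seed)
    draws: List[float] = []
    for _ in range(n):
        # eta ~ Gamma(shape=nu/2, rate=nu/2); gammavariate expects a scale, 2/nu
        eta = rng.gammavariate(nu / 2.0, 2.0 / nu)
        # Conditional on eta, the draw is normal with variance 1/eta
        draws.append(rng.gauss(0.0, 1.0) / eta ** 0.5)
    return draws
```

For `nu > 2` the sample variance of the draws should be close to `nu / (nu - 2)`, the variance of a t-distribution, which gives a quick sanity check of the representation.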

According to the equation, when and when , . Meantime, the posterior of contains other mixed variables. The logarithm posterior iswhere .

3.3.1. Conditional Posterior Distributions

(1) The Conditional Posterior Distribution of . It can be proved that the full conditional posterior density of is a normal distribution:

(2) The Conditional Posterior Distribution of . It can be proved that the full conditional posterior density of is a gamma distribution:

(3) The Conditional Posterior Distribution of . As there is no standard form for the conditional posterior density of and the degree of freedom (11), the kernel of the posterior distribution iswhere .

Here, we put sampling with the Griddy Gibbs sampler algorithm into use. The kernel of the conditional logarithm posterior distribution of is

Set the reasonable range of as (2, 30). Divide equally, and denote the grid interval as . Sample from the time m, and the other parameters are denoted as .
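The griddy Gibbs idea can be sketched as follows: evaluate the log kernel on a fixed grid, normalize the resulting weights, and invert the discrete cumulative distribution with a uniform draw. This simplified version returns a grid point itself, whereas interpolating the cumulative distribution between grid points is a common refinement. The names `griddy_gibbs_draw` and `log_kernel` are hypothetical, and the kernel must be supplied by the user.

```python
import bisect
import math
import random
from typing import Callable, List, Sequence


def griddy_gibbs_draw(log_kernel: Callable[[float], float],
                      grid: Sequence[float], rng: random.Random) -> float:
    """Draw one grid value with probability proportional to exp(log_kernel)."""
    logs = [log_kernel(x) for x in grid]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logs]
    cdf: List[float] = []
    running = 0.0
    for w in weights:
        running += w
        cdf.append(running)
    u = rng.random() * running  # uniform draw on [0, total mass)
    index = min(bisect.bisect_right(cdf, u), len(grid) - 1)
    return grid[index]
```

A grid for the degrees of freedom could, for instance, be built as `[2.0 + 0.1 * k for k in range(1, 280)]`, where the spacing 0.1 is an arbitrary illustrative choice.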

(4) Conditional Posterior Density of . The posterior distribution kernel of iswhere . Here, we set a proposal density function for , and it follows distribution.

3.3.2. Sampling Algorithm of Bayesian-GARCH(1,1)

We decompose the parameter variable of asset i, where . Then, we can use a coherent sampling strategy to sample from each conditional posterior distribution:

3.4. Portfolio Selection Based on BMS-GARCH(1,1)

Many studies find that returns are heteroscedastic, and other empirical studies show that the model parameters themselves are not fixed over time. Such changes may be due to shifts between different regimes in the data-generating process; business cycle fluctuations, for example, can be seen as an endogenous factor. In such a regime-switching setting, the Markov regime-switching model is well suited to describing the dynamics of returns and variance.

The Markov regime-switching model is proposed by Hamilton in 1989, providing great convenience in modeling the volatility between different states. Then, several researchers try to introduce the method in the GARCH process. For instance, the regime-dependent parameter can be integrated in the standard deviation [44]:where is the state at time t. Besides, the regime-dependent parameters can be part of the variance equation [45], that is,

Models (25) and (26) are both based on the dynamics of the conditional variance since the state dependency makes the likelihood function hard to deal with. According to Henneke et al. [43], the conditional mean can be treated as an ARMA(1,1) process in a Bayesian framework, where the volatilities change across states.

3.4.1. Markov Chain

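Suppose that the volatility has three states, and denote by $S_t \in \{1, 2, 3\}$ the low-volatility, normal-volatility, and high-volatility states. Denote $\pi_{ij}$ as the transition probability from state $i$ to state $j$. The transition probability matrix is

$$\Pi = \begin{pmatrix} \pi_{11} & \pi_{12} & \pi_{13} \\ \pi_{21} & \pi_{22} & \pi_{23} \\ \pi_{31} & \pi_{32} & \pi_{33} \end{pmatrix}, \tag{27}$$

where the sum of each row is 1. The Markov property can be written as follows:

$$P(S_t = j \mid S_{t-1} = i, S_{t-2}, \ldots, S_1) = P(S_t = j \mid S_{t-1} = i) = \pi_{ij}. \tag{28}$$

In matrix (27), each row gives the conditional probability distribution of moving from the realized state $S_{t-1}$ to the state $S_t$, where $\{S_t\}$ is a three-state Markov chain with transition probability matrix $\Pi$.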
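A three-state chain of this kind can be simulated directly from its transition matrix. The sketch below labels the states 0, 1, and 2 (rather than 1, 2, and 3) for indexing convenience; the function name `simulate_chain` and the example matrix are hypothetical.

```python
import random
from typing import List, Sequence


def simulate_chain(p: Sequence[Sequence[float]], start: int, n: int,
                   seed: int = 0) -> List[int]:
    """Simulate n states of a Markov chain with row-stochastic matrix p."""
    for row in p:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to 1")
    rng = random.Random(seed)
    states = [start]
    for _ in range(n - 1):
        row = p[states[-1]]  # the next state depends only on the current state
        states.append(rng.choices(range(len(row)), weights=row)[0])
    return states
```

For instance, `simulate_chain([[0.9, 0.08, 0.02], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]], 0, 100)` produces a path in which the third state is visited only briefly, because its probability of persisting is low.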

Here, we suppose all the parameters in GARCH(1,1) are state-dependent, and for convenience, the calculation below is for asset j ignoring the subscript and the conditional variance equation isand in every stage, there is

As for the return, there is . If the given states are the same, the conditional variances are the same according to equations (29) and (7).

3.4.2. Prior Information

Denote the parameter variables of asset and Markov chain in BMS-GARCH(1,1) model aswhere , , and S is the state path in all the periods, .

For parameters , η, and υ, they have the noninformation prior, which means they are not affected by the states of the BMS-GARCH(1,1) model. The following are for the other parameters.

(1) Prior Distribution of . For , there is a normal distribution prior:where is an indicator function.

(2) Prior Distribution of . In a binomial environment, beta distribution can be a proper prior for probability parameters, and it is called Dirichlet distribution in multivariate situation. Therefore, there is

In order to get the prior parameters , it is necessary to get the prior of every expected value of each transfer probability and figure out the parameters with the equations.

3.4.3. Parameter Estimation with BMS-GARCH(1,1)

BMS-GARCH(1,1) evolves with invisible status variable . Hence, the discrete Markov Chain is called invisible Markov process as well. However, the parameters can be estimated by Bayesian methods: simulate the invisible variable and the parameters at the same time. Here, the distribution of S is multivariable distribution:where is the number of times the chain transferring from status i to status j between time 1 and time T. The first equation in (34) draws the property of a Markov Chain.

According to the GARCH(1,1) model with distribution, the invisible Markov process, and , the joint log posterior distribution of parameter variables in BMS-GARCH(1,1) is shown as follows:where and i is a certain status. The likelihood function depends on the whole sequence of invisible variable S.

(1) The Conditional Posterior Distribution of . For transfer probability , its conditional posterior distribution of is as follows:

where means the variable of all the other parameters except . Equation (36) can be seen as the log of kernel of Dirichlet distribution, can be set with a prior, and is the times of shifting from status i to j.
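Because the Dirichlet prior is conjugate to the transition counts, each row of the transition matrix can be drawn from a Dirichlet distribution whose parameters are the prior parameters plus the observed counts. The sketch below assumes states labeled 0 to k-1 and draws Dirichlet variates by normalizing independent Gamma variates; the function names are hypothetical.

```python
import random
from typing import List, Sequence


def count_transitions(path: Sequence[int], k: int) -> List[List[int]]:
    """Count how often the path moves from state i to state j."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    return counts


def sample_transition_matrix(path: Sequence[int],
                             prior: Sequence[Sequence[float]],
                             rng: random.Random) -> List[List[float]]:
    """Draw each row from Dirichlet(prior row + transition counts)."""
    k = len(prior)
    counts = count_transitions(path, k)
    matrix: List[List[float]] = []
    for i in range(k):
        # Normalized independent Gamma(a, 1) draws are Dirichlet distributed.
        g = [rng.gammavariate(prior[i][j] + counts[i][j], 1.0) for j in range(k)]
        total = sum(g)
        matrix.append([x / total for x in g])
    return matrix
```

Setting every prior parameter to 1 corresponds to a uniform prior on each row of transition probabilities.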

(2) The Conditional Posterior Distribution of S. With the three states, the number of potential possible status paths is . Hence, it is impossible to get the whole variable at a time. But, part of the parameters can be sampled out. In other words, we can do sampling from the full conditional posterior density of . And, the full conditional posterior density iswhere is the variable with S removed and is the status path without the state at time t. With Bayes theorem, becomeswhere is the likelihood function with according to equation (38). Considering Markov properties, there is

The denominator is

Combine equation (38)–(40), then it is possible to figure out the conditional posterior distribution of :

In addition, the proposal density of is similar to that in single statuses with GARCH(1,1) and distribution.
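The single-site update of the state path can be sketched as follows: for each candidate state at time t, combine the transition probability into that state, the transition probability out of it, and the likelihood of the data given the modified path, and then draw from the normalized weights. The likelihood is passed in as a user-supplied function `log_lik`, since its form depends on the model; all names here are hypothetical, and states are labeled from 0.

```python
import math
import random
from typing import Callable, List, Sequence


def _safe_log(x: float) -> float:
    """Log that maps a zero probability to minus infinity."""
    return math.log(x) if x > 0.0 else float("-inf")


def sample_state_at(t: int, path: Sequence[int], p: Sequence[Sequence[float]],
                    log_lik: Callable[[List[int]], float],
                    rng: random.Random) -> int:
    """Draw the state at time t from its full conditional given the other states."""
    log_w: List[float] = []
    for i in range(len(p)):
        trial = list(path)
        trial[t] = i
        lw = log_lik(trial)  # likelihood evaluated on the whole modified path
        if t > 0:  # transition from the previous state into state i
            lw += _safe_log(p[path[t - 1]][i])
        if t < len(path) - 1:  # transition from state i to the next state
            lw += _safe_log(p[i][path[t + 1]])
        log_w.append(lw)
    top = max(log_w)
    if top == float("-inf"):
        raise ValueError("every candidate state has zero probability")
    weights = [math.exp(v - top) for v in log_w]  # choices() normalizes these
    return rng.choices(range(len(p)), weights=weights)[0]
```

Sweeping `t` over all periods and replacing `path[t]` with each draw gives one full update of the state path within an iteration of the sampler.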

3.4.4. Sampling Algorithm of Parameter Estimation with the BMS-GARCH(1,1) Model

The sampling procedure is as follows. In each iteration,
(1) Get the transition probabilities from the posterior density according to equation (36)
(2) Get the state path S with equation (41)
(3) Get the mixing variables by equation (20)
(4) Get the degrees of freedom according to equation (21)
(5) Get the mean-equation parameters with equation (19)
(6) Get a candidate for the conditional variance parameters from their proposal distribution
(7) Check whether each element satisfies the parameter constraints; if not, continue sampling until all the constraints are satisfied
(8) Calculate the acceptance probability mentioned in Section 2; then decide whether or not the candidate can be accepted

With the new parameters from sampling, update the variable θ. Repeat the whole procedure until the Markov chain converges.

According to the parameters and Bayes formula, figure out the posterior distribution . The forecast distribution of return at time is

In most cases, there is not an accurate analytic solution to the formula above. But fortunately, we can find a numerical solution by simulation and construct a portfolio with the returns from sampling. Besides, the models can obtain self-financing portfolios if we set instead of 1 in model (3) as well as removing the restriction of in equation (10) and others. In this way, self-financing portfolios are constructed and the models can be used to estimate the parameters and optimize the portfolios.

4. Numerical Examples

4.1. Data

We select 30 stocks from the Chinese stock markets. Ten stocks are picked from the GEM board, with data covering the period from the very beginning of that market, Nov 6, 2009, to Oct 21, 2011. In addition, 10 stocks are selected from the Shanghai Stock Exchange and 10 from the Shenzhen Stock Exchange; these 20 stocks have a longer history, from Jan 7, 2000, to Oct 21, 2011. All the data are weekly. Because interest rates in China are not yet fully market-determined, we use as the risk-free rate a monthly interest rate converted from the three-month deposit rate. In the following analysis, the sample is taken between Nov 6, 2009, and Oct 21, 2011.

4.2. Parameter Estimation of Bayesian-GARCH(1,1)
4.2.1. Parameter Estimation

We set the freedom degree , which means . The initial variance can be treated as the variance of residuals. We make 10000 iterations and choose the last 5000 as the posterior. According to the Bayesian methods and models stated above, the uncertain parameters are estimated.

4.2.2. Efficient Frontiers

According to the mean-variance model and with the expected return and covariance matrix, the efficient frontiers are shown as follows (risk-averse parameter ). And, the Sharpe ratios are shown in Table 1.

In Figure 1, it can be seen that the efficient frontier without short-selling is below the frontier with short-selling, and the Sharpe ratio is smaller. This suggests short-selling can hedge most nonsystematic risks.
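For reference, the Sharpe ratio of a portfolio can be computed from the weights, the expected excess returns, and the covariance matrix as in the sketch below. It assumes that `mu` already contains excess returns over the risk-free rate; the function name `sharpe_ratio` is hypothetical.

```python
import math
from typing import Sequence


def sharpe_ratio(weights: Sequence[float], mu: Sequence[float],
                 sigma: Sequence[Sequence[float]]) -> float:
    """Return w' mu / sqrt(w' Sigma w) for expected excess returns mu."""
    n = len(weights)
    mean = sum(weights[i] * mu[i] for i in range(n))
    var = sum(weights[i] * weights[j] * sigma[i][j]
              for i in range(n) for j in range(n))
    if var <= 0.0:
        raise ValueError("portfolio variance must be positive")
    return mean / math.sqrt(var)
```

Comparing this quantity across portfolios with and without a no-short-selling constraint is the comparison summarized in Table 1.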

4.3. Parameter Estimation of BMS-GARCH(1,1)
4.3.1. Parameter Estimation

In this model, we suppose that with Dirichlet distribution, the prior parameter , which means the prior of the transfer probability follows uniform distribution. The prior mean of the parameters in conditional variance is set as .

The parameter priors are selected with the following rule: prior mean reflects the different statuses (status 1, status 2, and status 3) of asset. We suppose the sum of and is fixed, and its value fluctuates with statuses. In the case of high volatility, investors’ reaction to the unknown information may be sharper than that in the low volatility case, and then α is greater than β. For simplicity, we suppose the prior covariance matrix of is a unit matrix. With M-H algorithm running 10000 iterations, keep the last 5000 as the posterior inference. Then, the modified mean and variance in BMS-GARCH(1,1) can be estimated under the posterior.

According to the estimation results, it is easy to find that the values of posterior mean of conditional variance parameters are nearly the same with prior, but it is only satisfied with status 1 and status 2. In the case of status 3, there is , implying the status is not steady.

We stochastically pick up one stock with short history data (Hanwei Electronics, ) and one stock with long history data (Dazhonggongyong, ). Their state transition matrixes are shown in Table 2, and the state transfer probability is shown in Figures 2 and 3.

From Table 2 and Figure 2, we can see that, for stock (Hanwei Electronics), its status 3 is a transfer status compared with statuses 1 and 2. Besides stock is newly public on GEM board, it is interesting that it gradually transfers from state 3 (high volatility) to a steady one, state 1 (low volatility). Instead, with stock (Dazhonggongyong; since its data length is longer than that of , here the status transfer sample period is from Jan 7, 2000), when status 3 appears, it will soon transfer to status 1 or status 2. But during the year of 2008, status 3 shows consistency. We believe this is because during the financial crisis, there are more noises in stock market, and the asset is trapped in a state of high volatility.

4.3.2. Efficient Frontier

According to the BMS-GARCH(1,1) model, with the expected return and covariance from the previous section, the portfolio weights can be calculated (the risk-averse parameter ).

Comparing the portfolio weights with BMS-GARCH(1,1) to those with Bayesian-GARCH(1,1), it can be seen that there are obvious differences. We believe that is because there is a state variable in BMS-GARCH(1,1) and the Markov chain contains a lot of information, hence making the results different. Table 3 shows the Sharpe ratios. It can be seen that the Sharpe ratio with BMS-GARCH(1,1) model is much greater than that with Bayesian-GARCH(1,1) and the Sharpe ratios in Section 4.2. It suggests that the state variables overcome the problem of not considering the relationship between long series and short series. In other words, the state variables carry a lot of useful information.

Figure 4 shows the efficient frontiers of portfolios with and without short-selling. In the case of no short-selling, the frontier lies below that with short-selling, and the corresponding Sharpe ratio is smaller as well. This again suggests that short-selling can hedge some nonsystematic risk, which improves the Sharpe ratio. Comparing Figure 4 with Figure 1, the portfolio frontiers based on BMS-GARCH(1,1) lie above the others, implying that the state variable, the Markov chain, carries useful information and is helpful for portfolio selection. Therefore, the portfolio selection model based on BMS-GARCH(1,1) performs better than that based on Bayesian-GARCH(1,1).

5. Conclusions

This paper constructs two models and optimizes portfolios based on them. In the Bayesian-GARCH(1,1) model, the asset returns are assumed to follow a t-distribution, from which the likelihood function is derived. Then, with the Gibbs sampler, the parameters are estimated and corrected.

BMS-GARCH(1,1) can be seen as introducing Markov state variables into the Bayesian-GARCH(1,1) model. In this model, the parameters are state-dependent, and the states form a Markov chain. The state changes can be described by state transition matrices, and the parameters can be estimated with the MCMC method. Finally, we compare the two models with real stock market data. Numerical results show that portfolios based on Bayesian-GARCH(1,1) perform worse than those based on BMS-GARCH(1,1), implying that, after state variables are introduced, the state transitions carry useful information for the investment portfolio.

Investors always want to make the optimal financial decision. However, many investors ignore the uncertainty of the parameters and models, which ultimately leads to a suboptimal portfolio. From this point of view, these models may be of some practical significance. In future work, we can take other informative priors into consideration, extend the models to the multistage setting, or replace the mean-variance framework with other frameworks, such as utility functions or the safety-first framework.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Disclosure

An earlier version of this paper was presented at the 20th Conference of the International Federation of Operational Research Societies.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China, 71271201, 71631008, and 71701138.