Abstract

Recently, various machine learning techniques have been applied to online portfolio optimization (OLPO) problems. These approaches typically explore aggressive strategies to gain excess returns by exploiting irrational phenomena in financial markets. However, existing aggressive OLPO strategies rarely consider downside risk and lack effective trend representation, which leads to poor prediction performance and large investment losses in certain market environments. Besides, prediction with a single model is often unstable and sensitive to noise and outliers, and the selection of optimal parameters becomes a further obstacle to accurate estimation. To overcome these drawbacks, this paper proposes a novel kernel-based aggregating learning (KAL) system for OLPO. It includes a two-step price prediction scheme to improve the accuracy and robustness of the estimation. Specifically, a component price estimator is built by exploiting additional indicator information and the nonstationary nature of financial time series, and an aggregating learning method is then presented to combine multiple component estimators following different principles. Next, this paper constructs an enhanced tracking system by introducing a kernel-based increasing factor to maximize the wealth of the next period. Finally, an online learning algorithm is designed to solve the system objective, which is suitable for large-scale and time-limited situations. Experimental results on several benchmark datasets from diverse real markets show that KAL outperforms other state-of-the-art systems in cumulative wealth and some risk-adjusted metrics. Meanwhile, it can withstand certain levels of transaction costs.

1. Introduction

Portfolio optimization is a fundamental issue in computational finance, which aims to invest wealth in a set of assets to meet some financial demands in the long run. There are two major schools of principles and theories for this problem: (i) Markowitz [1] introduces the mean-variance theory that illustrates the relationship between portfolio expected return and risk; (ii) Kelly [2] presents the Kelly investment criterion, which focuses on multiperiod portfolio selection and tends to maximize the expected log return. Due to the sequential nature of financial market data, it is natural to solve online portfolio optimization (OLPO) problems within the latter framework. In recent years, we have witnessed much research effort from the machine-learning and artificial intelligence communities to design OLPO strategies through diverse prediction models and online learning algorithms (see [3–13] and references therein for more details).

In the finance industry, heuristic principles based on economic phenomena are often adopted. Trend representation is one of the main methods for making future price predictions following this principle. In the survey by Li and Hoi [14], there are three categories of trend representation: pattern-matching, trend-reversing, and trend-following. Pattern-matching tries to find historical patterns that are similar to the current pattern and uses the historical outcomes to predict the asset price. Györfi et al. [15] identify the similarity set by comparing two market windows via the Euclidean distance and then conduct nonparametric kernel-based sequential investment strategies. In addition, Györfi et al. [16] further discuss the nonparametric nearest-neighbor system that searches for historical patterns located among the ℓ nearest neighbors of the current pattern. Trend-reversing and trend-following are frequently observed in financial markets, as shown in Figure 1. Most individual investors trade by analyzing these trends, forming large momentums that drive the asset price up or down. Trend-reversing assumes that poorly performing assets will perform well in the subsequent periods and vice versa. For example, Li et al. [8] present the online moving average reversion (OLMAR) strategy that takes the moving average in a recent time window as a prediction of the future asset price. Huang et al. [9] propose the robust median reversion (RMR) strategy, which exploits the L1-median of recent asset prices as a robust statistic against noise and outliers. However, studies of behavioral finance in [17–19] indicate irrational phenomena in financial markets, which contradicts the efficient capital market model presented by Fama [20]. People tend to believe that well/poorly performing assets will keep rising/falling and thus further push the price along the previous direction. Hence, by following this trend pattern, it is possible to capture potential opportunities for excess returns. Agarwal et al. [21] take a Newton ascent step on the current portfolio to follow the price trend. Lai et al. [11] track the historical peak prices of assets in a recent time window and learn portfolios to catch the potential profit patterns. Besides, Lai et al. [13] propose a short-term sparse portfolio optimization system based on the alternating direction method of multipliers, which concentrates wealth on a small proportion of assets with good increasing potential, and prove that the augmented Lagrangian has a saddle point.

To the best of our knowledge, most OLPO strategies are built according to the trend-reversing principle and can be seen as defensive systems. Few aggressive OLPO systems that can keep up with the state-of-the-art defensive ones have been investigated. As an aggressive system, the strategy of Lai et al. [11] achieves better investing performance than OLMAR and RMR [8, 9], but it lacks effective trend representation and does not take into account downside risk, which may lead to poor prediction performance and large investment losses in some market environments. Besides, in financial practice, technical analysis is one common method to analyze the asset price, which can identify price patterns from historical data by exploiting technical indicators and then suggest future activities. Wang and Zheng [22] investigate the statistical stationarity of well-known technical indicators, including the moving average, Bollinger bands, moving average convergence-divergence, and rate of change, and apply them in high-frequency trading. In modern portfolio theory, investors can also rebalance their portfolios by using various indicator information concerning the financial market, which can characterize the nature of the investment opportunity about to be faced. Moreover, prediction with a single model based on a certain selection criterion is often unstable and sensitive to noisy data and outliers, issues that are seldom considered by existing aggressive systems with effective trend representation. These potential problems lead to estimation errors and thus nonoptimal portfolios. Huang et al. [23] explicitly estimate the next price relative by combining four types of different forecasting estimators and design an online portfolio selection strategy named combination forecasting reversion. Lin et al. [24] exploit the mean reversion principle from a metalearning perspective and formulate a boosting method for price relative prediction. Yang et al. [25] present an online portfolio strategy, named WAACS, which is proved to be a universal portfolio. It utilizes the available side information in markets and applies the weak aggregating algorithm to aggregate the expert advice given by all the constant rebalanced portfolio strategies.

In this paper, we present a novel online learning system named kernel-based aggregating learning (KAL) for portfolio optimization that addresses the above drawbacks in two stages. Firstly, technical indicator information of historical prices, the nonstationary nature of time series data, and a weighted aggregation of multiple estimators are jointly considered to make an improved price relative prediction. Specifically, the indicator information suggests the possible trend pattern of price fluctuations; the autoregressive integrated moving average model deals with the nonstationary nature of price time series; meanwhile, the weighted aggregation improves the robustness of estimation and removes the need to select optimal parameters in hindsight. Then, online convex optimization theory [26–28] is applied to calculate the coefficients of the proposed prediction model. Secondly, an enhanced learning system is built to optimize the portfolio by maximizing future wealth with a kernel-based increasing factor. We then develop a fast algorithm for the KAL objective to make it applicable to large-scale and time-limited situations. Experimental results show that KAL achieves better performance than other state-of-the-art OLPO systems.

The remainder of this paper is organized as follows. In Section 2, we introduce the problem setting and related works. In Section 3, we illustrate the whole KAL system in detail. In Section 4, extensive experiments are conducted on benchmark datasets from real financial markets with diverse assets and over different time spans. In Section 5, concluding remarks are presented.

2. Problem Setting and Related Works

2.1. Problem Setting

The problem setting in this paper is consistent with the standard and common one used in many previous research studies [8–16]. Consider an investment task over a financial market with d assets. On the tth period, the asset prices are represented by a close price vector $\mathbf{p}_t \in \mathbb{R}_+^d$, where $\mathbb{R}_+^d$ denotes the d-dimensional nonnegative number space and each element $p_t^{(i)}$ represents the close price of asset i. Moreover, there is another concept called the price relative [29]:

$$\mathbf{x}_t = \frac{\mathbf{p}_t}{\mathbf{p}_{t-1}},$$

where a division between two vectors denotes an elementwise division in this paper. $x_t^{(i)}$ is the outcome of one unit of wealth invested in the ith asset during the tth trading period. In fact, the price relative is the main form of price information that an OLPO system exploits.

At the beginning of the tth period, we diversify our capital among the d assets. To denote the proportion of the total wealth invested in each asset, we introduce a portfolio vector $\mathbf{b}_t$ lying on the d-dimensional simplex:

$$\mathbf{b}_t \in \Delta_d = \left\{\mathbf{b} \in \mathbb{R}_+^d : \sum_{i=1}^{d} b^{(i)} = 1\right\}.$$

In this paper, we assume that there is no short selling, no borrowing of money, and that all the wealth from the previous period is reinvested in the current period, which leads to the nonnegativity constraint and the equality constraint above.

Since we adopt price relatives, the portfolio wealth grows multiplicatively. The cumulative wealth (CW) at the end of the tth period is denoted by $S_t$. Without loss of generality, suppose the whole investment lasts n periods with initial wealth $S_0 = 1$; then the evolution of $S_n$ is

$$S_n = S_0 \prod_{t=1}^{n} \mathbf{b}_t^{\top}\mathbf{x}_t.$$
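As a concrete illustration of this multiplicative growth, the following minimal Python sketch (with hypothetical toy numbers) computes the cumulative wealth from a sequence of price relatives and portfolio vectors:

```python
import numpy as np

def cumulative_wealth(price_relatives, portfolios, initial_wealth=1.0):
    """Multiplicative wealth evolution: S_n = S_0 * prod_t <b_t, x_t>.

    price_relatives: (n, d) array, x_t = p_t / p_{t-1} for each period.
    portfolios:      (n, d) array, each row on the simplex (nonnegative, sums to 1).
    """
    period_growth = np.einsum("td,td->t", portfolios, price_relatives)  # b_t^T x_t per period
    return initial_wealth * np.prod(period_growth)

# Toy example: 2 assets, 3 periods, equal-weight portfolio in each period.
x = np.array([[1.02, 0.99], [0.98, 1.03], [1.01, 1.00]])
b = np.full((3, 2), 0.5)
print(cumulative_wealth(x, b))  # product of the three per-period growth factors
```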

The OLPO problem can be formulated as a sequential decision task. The portfolio manager aims to design a strategy $\{\mathbf{b}_t\}_{t=1}^{n}$ to maximize the cumulative wealth $S_n$:

$$\max_{\{\mathbf{b}_t\}_{t=1}^{n}} S_n = S_0 \prod_{t=1}^{n} \mathbf{b}_t^{\top}\mathbf{x}_t, \quad \text{s.t. } \mathbf{b}_t \in \Delta_d,\ t = 1, \ldots, n.$$

Only the historical information up to the current period can be used to select the next portfolio vector $\mathbf{b}_{t+1}$, and different portfolio optimization strategies embody different principles of how to use this historical information.

In addition, we make several general assumptions in the above model as a supplement:

(i) Transaction cost: there are no transaction costs or taxes in this OLPO model.
(ii) Market liquidity: one can buy and sell the required quantities at the last closing price of any given trading period.
(iii) Impact cost: market behavior is not affected by an OLPO strategy.

These assumptions are not trivial, and we will empirically analyze the effects of transaction costs in Section 4.

2.2. Related Works

In this section, some related works are briefly introduced; their performance will be compared with that of our KAL system in Section 4.

Uniformly buy-and-hold (UBAH) is a simple and commonly used baseline. The portfolio manager allocates capital equally among the d assets at the beginning and does not rebalance in subsequent periods. It is usually adopted as the market strategy to produce the market index. Another common benchmark is the Beststock (BS) strategy, a special buy-and-hold strategy that invests all the wealth in the best asset in hindsight.

Borodin et al. [30] present the anticorrelation (Anticor) algorithm, which calculates a cross-correlation matrix between two specific market windows and then transfers weights from the previously winning assets to the currently losing assets. Li et al. [31] propose the correlation-driven nonparametric learning (CORN) approach, which identifies the linear similarity between two market windows via correlation and also adopts the idea of pattern-matching. Anticor and CORN each try to mine the correlations between different assets, and both follow the mean reversion principle.

Online moving average reversion (OLMAR) and robust median reversion (RMR) are two state-of-the-art defensive strategies based on the mean reversion principle. Li et al. [8] assume that the asset price in the next period will revert to its moving average (MA) and take the MA as a reference for the asset price trend. There are two types of MA. One is the so-called simple moving average (SMA), which truncates the historical prices via a time window and calculates their arithmetic average. The other is the exponential moving average (EMA), which adopts all historical prices and weights each price exponentially:

$$\mathrm{EMA}_t(\alpha) = \alpha\,\mathbf{p}_t + (1 - \alpha)\,\mathrm{EMA}_{t-1}(\alpha), \quad \alpha \in (0, 1).$$
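The two moving averages reduce to simple array operations. The following minimal Python sketch computes both; the `window` and `alpha` arguments are generic parameters, not the settings used later in the experiments:

```python
import numpy as np

def sma(prices, window):
    """Simple moving average: arithmetic mean of the last `window` close prices.

    prices: (T, d) array of close prices; returns a length-d vector.
    """
    return np.mean(prices[-window:], axis=0)

def ema(prices, alpha):
    """Exponential moving average: EMA_t = alpha * p_t + (1 - alpha) * EMA_{t-1},
    initialized with the first observed price vector."""
    out = prices[0].astype(float)
    for p in prices[1:]:
        out = alpha * p + (1.0 - alpha) * out
    return out
```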

Huang et al. [9] exploit the L1-median of the recent prices as a robust prediction, which is less sensitive to outliers and noise than OLMAR.

Lai et al. [11] explore an aggressive strategy based on the trend-following principle, named the peak price tracking (PPT) strategy, that can keep up with OLMAR and RMR. It extracts the increasing power of assets by using the peak price in a fixed time window as a prediction to capture potential growth opportunities.

3. Kernel-Based Aggregating Learning System

3.1. Motivation

Empirical studies show that real-world financial markets are not always efficient. They often overreact to all kinds of information and create potential opportunities for capturing excess profits. More aggressive OLPO systems with promising performance should therefore be investigated further. PPT, which estimates the next price via the peak price, has achieved good results on most datasets, but it may also produce poor predictions and large investment losses when the market environment is accompanied by downside risk. Besides, existing OLPO strategies often lack an explicit price trend representation that can effectively recognize trend patterns. Moreover, single-model prediction always suffers from noise and outliers in the data and ignores the temporal heterogeneity of historical data, both of which reduce the accuracy and robustness of the estimators. To fill these gaps, in this paper we propose the KAL system, which makes an improved price relative prediction by exploiting indicator information and an aggregating learning method, and optimizes the portfolio via an enhanced tracking system with a kernel-based increasing factor.

3.2. Price Relative Prediction
3.2.1. Price Relative Prediction with Component Estimator

In financial markets, technical analysis is a popular way to analyze the asset price. It exploits technical indicators to identify price patterns and guides investment behavior to make profits. In this paper, we consider the particular situation of each asset and adopt the simple moving average of prices to follow its trend pattern. First, according to whether the close price exceeds its simple moving average in (5) at the end of the tth period, we assume the indicator information has two states as follows:

$$y_{t+1}^{(i)} = \begin{cases} 1, & p_t^{(i)} \geq \mathrm{SMA}_t^{(i)}(w), \\ -1, & p_t^{(i)} < \mathrm{SMA}_t^{(i)}(w), \end{cases}$$

where $y_{t+1}^{(i)}$ represents the indicator information status of asset i at the beginning of the (t+1)th period, $p_t^{(i)}$ is the close price of asset i at the end of the tth period, and $\mathrm{SMA}_t^{(i)}(w)$ is its moving average over the recent w periods. From the perspective of technical analysis, the future price of asset i in the short term has a high probability of rising when $y_{t+1}^{(i)} = 1$ and of falling when $y_{t+1}^{(i)} = -1$.
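A minimal sketch of this indicator, assuming the two states are encoded as +1 (close price at or above its SMA) and -1 (below), matching the reconstruction above:

```python
import numpy as np

def trend_indicator(prices, window):
    """Per-asset trend indicator: +1 if the latest close price is at or above its
    simple moving average over the recent `window` periods, -1 otherwise.

    prices: (T, d) array of close prices; returns a length-d vector of +/-1.
    """
    sma = np.mean(prices[-window:], axis=0)
    return np.where(prices[-1] >= sma, 1, -1)
```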

Time series data in finance are usually not realizations of a stationary process, and some of them may contain deterministic trends. The autoregressive integrated moving average (ARIMA) model is an effective linear model for time series prediction: it has good statistical properties and structural flexibility and can handle the nonstationary characteristics of price sequences well.

We denote by $\nabla^h p_t^{(i)}$ the h-order differences of $p_t^{(i)}$, and $\epsilon_t$ denotes the zero-mean random noise term at time t. The price sequence $p_t^{(i)}$ satisfying the ARIMA(k, h, q) model is formulated as follows:

$$\nabla^h p_t^{(i)} = \sum_{j=1}^{k} \alpha_j \nabla^h p_{t-j}^{(i)} + \sum_{l=1}^{q} \beta_l \epsilon_{t-l} + \epsilon_t,$$

which is parameterized by the three terms (k, h, q), the weight vector $\alpha$ of the autoregressive (AR) part, and $\beta$ of the moving average (MA) part. The original price prediction can be approximated with the AR(k + m, h) model as follows:

$$\hat{p}_t^{(i)}(\gamma) = \sum_{j=1}^{k+m} \gamma_j \nabla^h p_{t-j}^{(i)} + \sum_{l=0}^{h-1} \nabla^l p_{t-1}^{(i)},$$

where m is a properly chosen constant and $\gamma$ is the coefficient vector to be solved, which belongs to a convex compact set $\mathcal{K}$.

At period t, we first make a price prediction $\hat{p}_t(\gamma_t)$, after which the real price $p_t$ is revealed, and then we suffer a loss denoted by $\ell_t(\gamma_t)$. Our goal is to minimize the cumulative loss over a predefined number of iterations T. The regret after T rounds is defined as follows:

$$\mathrm{Regret}_T = \sum_{t=1}^{T} \ell_t(\gamma_t) - \min_{\gamma \in \mathcal{K}} \sum_{t=1}^{T} \ell_t(\gamma).$$

We wish to obtain an efficient algorithm that guarantees this regret grows sublinearly in T, implying that the per-round regret vanishes as T increases.

Now we present a specific online convex optimization algorithm by applying the Online Newton Step method in [27] to solve for the parameter vector $\gamma$ in the model above. Algorithm 1 iteratively optimizes the coefficient vector in an online manner.

Input: Given parameters h, k, m, learning rate η, initial matrix $A_0$, and initial vector $\gamma_1 \in \mathcal{K}$.
(1) for t = 1 to T do
(2) Calculate the price prediction $\hat{p}_t(\gamma_t)$ by (9);
(3) Receive $p_t$ and incur the loss $\ell_t(\gamma_t)$;
(4) Let the gradient $g_t = \nabla \ell_t(\gamma_t)$, and update $A_t = A_{t-1} + g_t g_t^{\top}$;
(5) Calculate the inverse matrix $A_t^{-1}$ by the Sherman–Morrison formula:
(6) $A_t^{-1} = A_{t-1}^{-1} - \dfrac{A_{t-1}^{-1} g_t g_t^{\top} A_{t-1}^{-1}}{1 + g_t^{\top} A_{t-1}^{-1} g_t}$;
(7) Update the coefficient vector $\gamma_{t+1} = \Pi_{\mathcal{K}}^{A_t}\left(\gamma_t - \frac{1}{\eta} A_t^{-1} g_t\right)$, where $\Pi_{\mathcal{K}}^{A_t}$ is the projection in the norm induced by $A_t$;
(8) end for
Output: The coefficient vector $\gamma_{T+1}$.
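A compact Python sketch of one Online Newton Step round from Algorithm 1, assuming a squared prediction loss and, for simplicity, a clipping step onto a box in place of the exact generalized projection onto $\mathcal{K}$ (the true algorithm projects in the norm induced by $A_t$); the parameter names are illustrative only:

```python
import numpy as np

def ons_ar_step(gamma, A_inv, features, target, eta, clip=10.0):
    """One Online Newton Step update for the AR coefficients gamma.

    features: lagged (differenced) prices used by the AR(k+m) predictor.
    target:   the revealed value to be predicted this round.
    A_inv:    running inverse of A_t = sum of gradient outer products,
              maintained via the Sherman-Morrison formula.
    Note: the exact method projects onto the feasible set K in the A_t-norm;
    here we only clip the coefficients, which is a simplification.
    """
    pred = features @ gamma
    grad = 2.0 * (pred - target) * features          # gradient of the squared loss
    Ag = A_inv @ grad                                # Sherman-Morrison update of A_inv
    A_inv = A_inv - np.outer(Ag, Ag) / (1.0 + grad @ Ag)
    gamma = gamma - (1.0 / eta) * (A_inv @ grad)     # Newton-style step
    return np.clip(gamma, -clip, clip), A_inv
```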

It has been proved that this iterative procedure guarantees a proper upper bound on the regret of the prediction, as shown in Theorem 1. The proofs can be found in [27], and we omit the details here.

Theorem 1. Let $k + m \geq 1$, and set $\eta = \frac{1}{2}\min\left\{\frac{1}{4GD}, \lambda\right\}$ and $A_0 = \frac{1}{\eta^2 D^2} I$, where D is the diameter of $\mathcal{K}$, G is the upper bound of $\|\nabla \ell_t\|$ for all t, and the loss functions $\ell_t$ are assumed to be $\lambda$-exp-concave on $\mathcal{K}$. Then, the online sequence $\{\gamma_t\}$ generated by Algorithm 1 guarantees $\mathrm{Regret}_T = O\left(\left(\frac{1}{\lambda} + GD\right)(k+m)\log T\right)$.

PPT [11] proposes a future price prediction named the peak price, which is the maximum price of the asset over the most recent w periods. Following the idea of PPT, we define the nadir price, which is the minimum price of the asset in this time window. The peak and nadir prices of the different assets are gathered as vectors $\mathbf{p}_{\max,t}$ and $\mathbf{p}_{\min,t}$:

$$p_{\max,t}^{(i)} = \max_{0 \leq j \leq w-1} p_{t-j}^{(i)}, \qquad p_{\min,t}^{(i)} = \min_{0 \leq j \leq w-1} p_{t-j}^{(i)}, \qquad i = 1, \ldots, d.$$

We propose a novel and improved future price prediction using the indicator information y introduced above. If $y_{t+1}^{(i)} = 1$, asset i is expected to be in an upward trend, and if $y_{t+1}^{(i)} = -1$, asset i is expected to be in a downward trend. Irrational phenomena in financial markets show that the prices of poorly performing assets tend to keep falling; in this situation, the nadir price can achieve better prediction performance than PPT and OLMAR. Since no short selling is allowed, investors can only make profits when their asset prices increase, and the peak price can extract the increasing power of different assets. It is therefore essential to consider the price trends as well as the increasing power of different assets; hence, we combine the ARIMA prediction $\hat{\mathbf{p}}_{t+1}$, the peak price $\mathbf{p}_{\max,t}$, and the nadir price $\mathbf{p}_{\min,t}$ according to the indicator to design the resulting price prediction:

$$\tilde{p}_{t+1}^{(i)} = \begin{cases} \max\left\{p_{\max,t}^{(i)},\ \hat{p}_{t+1}^{(i)}\right\}, & y_{t+1}^{(i)} = 1, \\[4pt] \min\left\{p_{\min,t}^{(i)},\ \hat{p}_{t+1}^{(i)}\right\}, & y_{t+1}^{(i)} = -1. \end{cases}$$

Then, we produce the resulting price relative prediction of the component estimator:

$$\tilde{\mathbf{x}}_{t+1} = \frac{\tilde{\mathbf{p}}_{t+1}}{\mathbf{p}_t}.$$
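The following Python sketch illustrates one plausible component estimator under the assumptions used in the reconstruction above (upward-flagged assets predicted by the larger of peak price and ARIMA estimate, downward-flagged assets by the smaller of nadir price and ARIMA estimate); it is not the paper's exact equation (12), only an instance of the described combination:

```python
import numpy as np

def component_price_relative(prices, arima_pred, window):
    """Sketch of the component estimator pipeline (indicator -> peak/nadir ->
    combined price -> price relative). The combination rule is an assumption.

    prices:     (T, d) close prices; arima_pred: length-d ARIMA price estimate.
    """
    recent = prices[-window:]
    peak, nadir = recent.max(axis=0), recent.min(axis=0)       # peak/nadir prices
    y = np.where(prices[-1] >= recent.mean(axis=0), 1, -1)     # SMA-based indicator
    pred = np.where(y == 1,
                    np.maximum(peak, arima_pred),              # upward trend
                    np.minimum(nadir, arima_pred))             # downward trend
    return pred / prices[-1]                                   # price relative prediction
```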

3.2.2. Price Relative Prediction with Aggregating Estimator

As we can see, the value of the indicator in (7) depends on the window size w of the simple moving average; thus, the price prediction in (12) changes accordingly and sensitively. Meanwhile, the optimal window size can only be chosen in hindsight. We therefore consider an aggregating approach that combines a set of experts, where each expert estimates the price relative of the next period following the scheme in Section 3.2.1 with a different parameter. The experts are generated by sampling the window size w uniformly from a predefined range, and then we present a weighted aggregation of these experts as the final price relative prediction:

$$\bar{\mathbf{x}}_{t+1} = \sum_{j=1}^{e} q_t^{(j)} \tilde{\mathbf{x}}_{t+1}^{(j)},$$

where e is the total number of experts, $q_t^{(j)}$ is the weight of the jth expert, $\mathbf{q}_t$ belongs to the decision set $\Delta_e$, and $\tilde{\mathbf{x}}_{t+1}^{(j)}$ is the prediction of the jth expert for the (t+1)th period given by (13).

The remaining issue is how to compute the weight assigned to each expert. We now present another online convex optimization algorithm by applying the Online Gradient Descent method, summarized in Algorithm 2, to calculate $\mathbf{q}_t$ in each iteration.

Input: Given the parameter e, the learning rates $\eta_s$, and the initial vector $\mathbf{q}_1 \in \Delta_e$.
(1) for s = 1 to t do
(2) Calculate the aggregated price relative prediction $\bar{\mathbf{x}}_{s}$ by (14);
(3) Receive $\mathbf{x}_{s}$ and incur the loss $\ell_s(\mathbf{q}_s)$;
(4) Let the gradient $g_s = \nabla \ell_s(\mathbf{q}_s)$, and update the weight vector $\mathbf{q}_{s+1} = \Pi_{\Delta_e}\left(\mathbf{q}_s - \eta_s g_s\right)$.
(5) end for
Output: The final price relative prediction $\bar{\mathbf{x}}_{t+1}$.
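A minimal Python sketch of one Online Gradient Descent round from Algorithm 2, assuming a squared loss between the aggregated prediction and the realized price relative; the exact method projects the weights onto the simplex, whereas this sketch only clips and renormalizes them, which is a simplification:

```python
import numpy as np

def ogd_aggregate_step(weights, expert_preds, realized, step):
    """One OGD update of the expert weights.

    weights:      length-e weight vector on the simplex.
    expert_preds: (e, d) price-relative predictions of the e experts.
    realized:     length-d realized price relative x_s.
    """
    combined = weights @ expert_preds                  # aggregated prediction
    grad = 2.0 * expert_preds @ (combined - realized)  # gradient of squared loss w.r.t. weights
    w = np.maximum(weights - step * grad, 0.0)         # simplified feasibility step
    return w / w.sum() if w.sum() > 0 else np.full_like(w, 1.0 / len(w))
```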

As shown in Theorem 2, the regret of the aggregating estimator can also be bounded (see [27] for detailed proofs).

Theorem 2. Assume that the loss functions $\ell_s$ are H-strongly convex on $\Delta_e$. Let $e \geq 1$, and set $\eta_s = 1/(Hs)$, where H is the lower bound of the strong convexity parameter of $\ell_s$ for all s and G is the upper bound of $\|\nabla \ell_s\|$. Then, the online sequence $\{\mathbf{q}_s\}$ generated by Algorithm 2 guarantees $\mathrm{Regret}_t \leq \frac{G^2}{2H}(1 + \log t)$.

The whole indicator information-based price relative prediction scheme with aggregating estimator can be interpreted as shown in Figure 2.

3.3. Portfolio Optimization
3.3.1. Kernel-Based Increasing Factor

After the future price prediction, the second step is to optimize the portfolio according to a certain criterion. Similar to the criteria adopted in [10, 11], a tracking system is conducted in this paper: it invests more wealth in potentially well-performing assets and less wealth in potentially poorly performing ones. We first establish the following KAL objective:

$$\mathbf{b}_{t+1} = \operatorname*{arg\,max}_{\mathbf{b} \in \Delta_d} \mathbf{b}^{\top}\bar{\mathbf{x}}_{t+1}, \quad \text{s.t. } \|\mathbf{b} - \mathbf{b}_t\|_2 \leq \epsilon,$$

where $\|\cdot\|_2$ denotes the Euclidean norm. The maximization of $\mathbf{b}^{\top}\bar{\mathbf{x}}_{t+1}$ is adopted so that b tracks $\bar{\mathbf{x}}_{t+1}$. The constraints on the right of (16) control the deviation from the last portfolio and ensure the feasibility of b.

Instead of the pure increasing factor $\mathbf{b}^{\top}\bar{\mathbf{x}}_{t+1}$, we propose a generalized increasing factor as follows:

$$\mathbf{b}^{\top} M_{t+1}\left(\bar{\mathbf{x}}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right), \qquad \bar{x}_{t+1} = \frac{1}{d}\mathbf{1}^{\top}\bar{\mathbf{x}}_{t+1},$$

where $M_{t+1}$ is a positive definite symmetric matrix that rescales the relative influence of the different assets in the increasing factor and $\bar{x}_{t+1}$ is the average price relative prediction over all assets. $\bar{\mathbf{x}}_{t+1} - \bar{x}_{t+1}\mathbf{1}$ can be seen as a normalized price relative prediction, in which some assets have positive signs while others have negative signs, suggesting an increase or decrease in the investment proportion. The generalized increasing factor in (17) can be seen as an inner product; to maximize this inner product, b should track $M_{t+1}(\bar{\mathbf{x}}_{t+1} - \bar{x}_{t+1}\mathbf{1})$.

As for the setting of $M_{t+1}$, there are many principles that can be followed to achieve different financial targets. In this paper, we first define a kernel matrix $K(\mathbf{u}, \mathbf{v})$ for two vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}_+^d$:

$$K(\mathbf{u}, \mathbf{v}) = \operatorname{diag}\left(\exp\left(-\frac{\left(u^{(1)} - v^{(1)}\right)^2}{2\sigma^2}\right), \ldots, \exp\left(-\frac{\left(u^{(d)} - v^{(d)}\right)^2}{2\sigma^2}\right)\right).$$

It is a positive definite diagonal kernel satisfying Mercer's theorem and measures the similarity between $\mathbf{u}$ and $\mathbf{v}$. If $u^{(i)}$ is close to $v^{(i)}$, the corresponding diagonal element is close to 1; if $u^{(i)}$ is far away from $v^{(i)}$, it is close to 0. From the perspective of technical analysis, the distance between the asset price and its mean value reflects the strength of the trend momentum: the larger the distance, the greater the corresponding strength. For example, if at the end of the tth period the asset price falls heavily below its moving average, the difference between them is large and the asset price is more likely to continue to fall in subsequent periods. Naturally, we hope that assets with this great momentum exert more influence on the optimization; thus, the corresponding elements of $M_{t+1}$ are set larger. We therefore construct $M_{t+1}$ in (19) from the kernel in (18) evaluated between the price relative prediction $\bar{\mathbf{x}}_{t+1}$ and the ensemble moving average of the price relative.
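The sketch below illustrates one possible instantiation of this construction, assuming a Gaussian-type elementwise kernel and a diagonal adjustment matrix whose entries grow as the predicted price relative moves away from its moving average; the bandwidth `sigma` and the exact form of the matrix are assumptions, not the paper's published equations (18) and (19):

```python
import numpy as np

def diagonal_rbf_kernel(u, v, sigma=1.0):
    """Diagonal similarity kernel: entries near 1 when u_i is close to v_i and
    near 0 when they are far apart (a Gaussian-type choice; bandwidth assumed)."""
    return np.diag(np.exp(-((u - v) ** 2) / (2.0 * sigma ** 2)))

def adjustment_matrix(pred_relative, ma_relative, sigma=1.0, delta=1e-3):
    """One plausible form of M: larger diagonal entries for assets whose predicted
    price relative is far from its ensemble moving average; delta keeps the matrix
    strictly positive definite. This is an assumed construction, not equation (19)."""
    K = diagonal_rbf_kernel(pred_relative, ma_relative, sigma)
    return (1.0 + delta) * np.eye(len(pred_relative)) - K
```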

The kernel-based increasing factor cannot be arbitrarily large; thus, a constraint generalized from that of (16) is added to the optimization, leading to the whole KAL portfolio optimization in (20), where $[y]_+$ denotes the positive part of y. The constraint can be seen as a generalized Mahalanobis distance between b and $\mathbf{b}_t$ with the square adjustment matrix, such that the feasible set of b is an ellipsoid centered at $\mathbf{b}_t$. ϵ can be seen as an expected profiting level: if the increasing factor of the current portfolio already reaches ϵ, the potential wealth exceeds the expected level and the portfolio remains unchanged.

3.3.2. Algorithm to Solve KAL

To solve the KAL objective, we design in this section a fast algorithm based on the gradient projection principle. It consists of simple and explicit matrix calculations, which makes it applicable to large-scale and time-limited situations.

First, we relax the simplex constraint in (20) and optimize over $\mathbf{b} \in \mathbb{R}^d$. The gradient of the objective function in (20) is $M_{t+1}(\bar{\mathbf{x}}_{t+1} - \bar{x}_{t+1}\mathbf{1})$; thus, the gradient ascent step is

$$\mathbf{b} = \mathbf{b}_t + \eta\, M_{t+1}\left(\bar{\mathbf{x}}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right).$$

Substituting (21) into the constraint in (20) yields an admissible range for the step size η, as given in (22).

The adjustment matrix in the constraint neutralizes the two factors of $M_{t+1}$ introduced by (21), and we then obtain the interval for η in (23).

If the normalized price relative prediction $\bar{\mathbf{x}}_{t+1} - \bar{x}_{t+1}\mathbf{1}$ vanishes, there is no need to update this portfolio; hence, $\eta = 0$. Otherwise, η can be chosen in the interval of (23). To exploit the full strength of gradient ascent, η is set to the upper bound of this interval, as given in (24).

To ensure that the resulting portfolio is nonnegative, we finally project the above portfolio onto the simplex domain by the algorithm in [32]:

$$\mathbf{b}_{t+1} = \operatorname*{arg\,min}_{\mathbf{b}' \in \Delta_d} \left\|\mathbf{b}' - \mathbf{b}\right\|_2^2,$$

where $\mathbf{b}$ is the relaxed solution obtained from (21) and (24).
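A self-contained Python sketch of the standard sorting-based Euclidean projection onto the probability simplex, assuming this is the type of routine referenced as [32]:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}."""
    u = np.sort(v)[::-1]                          # sort components in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]    # last index keeping positivity
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

# Example: a relaxed portfolio with a negative weight is mapped back to the simplex.
print(project_simplex(np.array([0.7, 0.5, -0.2])))  # -> [0.6, 0.4, 0.0]
```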

The whole KAL system can be summarized as Algorithm 3 and illustrated by Figure 3.

Input: Given the parameters (including the window size w, the number of experts e, and the expected profiting level ϵ), the asset prices in the recent time window together with $\mathbf{p}_t$, and the current portfolio $\mathbf{b}_t$.
(1) Calculate the simple moving average by (5) and the indicator information $y_{t+1}$ by (7).
(2) Calculate the ARIMA prediction $\hat{\mathbf{p}}_{t+1}$ by (9), the peak and nadir prices $\mathbf{p}_{\max,t}$ and $\mathbf{p}_{\min,t}$ by (11), and the price prediction $\tilde{\mathbf{p}}_{t+1}$ by (12).
(3) Calculate the final price relative prediction $\bar{\mathbf{x}}_{t+1}$ by (14) and the adjustment matrix $M_{t+1}$ by (19).
(4) if $\bar{\mathbf{x}}_{t+1} = \bar{x}_{t+1}\mathbf{1}$ then
(5) $\eta = 0$
(6) else
(7) set η to the upper bound of the interval in (23), as in (24)
(8) end if
(9) Optimization: $\mathbf{b} = \mathbf{b}_t + \eta M_{t+1}\left(\bar{\mathbf{x}}_{t+1} - \bar{x}_{t+1}\mathbf{1}\right)$.
(10) Projection: $\mathbf{b}_{t+1} = \operatorname*{arg\,min}_{\mathbf{b}' \in \Delta_d} \|\mathbf{b}' - \mathbf{b}\|_2^2$.
Output: The next portfolio $\mathbf{b}_{t+1}$.

4. Experimental Results

In this section, we use the cumulative wealth and other performance criteria to measure the performance of the proposed KAL system and evaluate its effectiveness by comparing it with seven existing strategies on several real-world datasets.

4.1. Experiment Setting

In our experiment, we focus on six benchmark datasets: (1) NYSE(O), (2) NYSE(N), (3) DJIA, (4) SP500, (5) TSE, and (6) HS300. All of these datasets consist of real-world daily close price relative sequences. Their detailed information is shown in Table 1. NYSE(O) and NYSE(N) are two different datasets from the New York Stock Exchange (NYSE) with different stocks and time spans. NYSE(O) is the well-known NYSE dataset pioneered by Cover [29], and it contains 5651 daily price relatives of 36 stocks in NYSE for a 22-year period from July 3, 1962, to December 31, 1984. NYSE(N) is the extended version of NYSE(O) and was collected by Li et al. [33]. For consistency, this dataset runs from January 1, 1985, to June 3, 2010, consists of 6431 trading days of 23 stocks, and covers the global financial crisis of 2008. DJIA was collected by Borodin et al. [30] and consists of 30 stocks from the Dow Jones Industrial Average, containing price relatives of 507 trading days ranging from January 1, 2001, to January 1, 2003. SP500 and TSE are collected from constituent stocks of the Standard & Poor's 500 and the Toronto Stock Exchange, respectively. Interested readers can check [12] (http://OLPO.stevenhoi.org) or the original papers for the first five datasets. The HS300 dataset was collected by Lai et al. [12] and contains 44 stocks of certain CSI300 constituents from China over a recent time span. It supplements the database of this research area, since the earlier datasets are mainly from North America. As we can see, the datasets mentioned above cover long trading periods from 1962 to 2017 and diversified markets, which enables us to examine how the proposed KAL system performs under different events and crises, such as the dot-com bubble from 1995 to 2001 and the subprime mortgage crisis from 2007 to 2009.

We take five representative state-of-the-art portfolio selection systems (CORN, Anticor, OLMAR, RMR, and PPT) and two trivial ones (Market and Beststock) for comparison with KAL. Due to the diverse principles introduced in Section 2.2, the five state-of-the-art systems show advantages in different parts of the experiments. The parameters for these systems are set to their defaults and according to previous experiments [8, 9, 11, 30, 31]: CORN: w = 5, P = 1, ρ = 0.1; Anticor: w = 5; OLMAR: α = 0.5, ϵ = 10; RMR: w = 5, ϵ = 5; and PPT: w = 5, ϵ = 100. Following similar methods in the related works [8–12, 16, 23, 24, 33], the parameters of our KAL system are empirically set to fixed values for all datasets. The parameters of the ARIMA model are chosen consistently with previous research studies [23, 34]. The chosen time window size is commonly used in stock markets, and the price information in such a window reflects the recent financial environment. To choose the sampling range of the experts and the expected profiting level ϵ, we conduct experiments in Section 4.4 to further evaluate how different choices of these parameters affect the performance metrics.

4.2. Performance Metrics

Performance is evaluated on several common metrics: cumulative wealth (CW), mean excess return (MER), Sharpe ratio (SR), and information ratio (IR). CW is the core metric for evaluating investing performance. MER measures how much better a system is than the market on average. SR and IR are two kinds of risk-adjusted return metrics that trade off risk against return.

4.2.1. Cumulative Wealth

Table 2 shows the final cumulative wealth achieved by various systems on the six benchmark datasets without considering transaction costs.

As we can see, the proposed KAL outperforms the other state-of-the-art systems on five datasets and ranks second on TSE. For instance, KAL achieves much higher CWs (7.35E + 18, 3.42E + 9, 21.45) than PPT (1.31E + 18, 2.89E + 9, 11.76) on NYSE(O), NYSE(N), and SP500, respectively. Only KAL (1.41) and RMR (1.35) among the nontrivial systems perform better than the Market (1.34) on HS300, and KAL achieves a 30% higher CW than PPT. This indicates that KAL is an effective system following the aggressive principle and accumulates more wealth by considering the indicator information. To see how the KAL system works over the entire investment horizon, we plot the CWs of different systems on NYSE(O) and DJIA in Figure 4. The curve of KAL lies above those of the other systems in most periods, suggesting that it achieves effective investing performance in the long run.

4.2.2. Mean Excess Return

In finance, the return is the proportion of wealth that an investor has gained or lost in one period. The daily return of the tth period is $r_t = \mathbf{b}_t^{\top}\mathbf{x}_t - 1$. MER is the long-term average return by which a portfolio selection system exceeds the Market benchmark:

$$\mathrm{MER} = \frac{1}{n}\sum_{t=1}^{n}\left(r_t - r_t^{*}\right),$$

where $r_t$ and $r_t^{*}$ denote the returns of a portfolio selection system and the Market strategy, respectively. At the same time, we take the t statistic as a reference to see whether the return of a system is significantly higher than the Market benchmark. According to the capital asset pricing model, the expected return can be decomposed into a market component and an inherent excess return, so the following linear regression model can be established:

$$r_t = \alpha + \beta\, r_t^{*} + \varepsilon_t,$$

where α is the α-factor representing the active return, β is the β-factor representing the volatility from the market, and $\varepsilon_t$ is the error term. By using the ordinary least squares method, the coefficients α and β can be estimated with n sample pairs of $r_t$ and $r_t^{*}$. We also conduct a right-tailed t-test to examine whether α is significantly higher than 0, showing that the excess return is not due to luck.
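A minimal sketch of this estimation, assuming daily return series for the strategy and the market; it fits the regression by ordinary least squares and reports a right-tailed p value for the alpha coefficient (SciPy is used only for the t distribution):

```python
import numpy as np
from scipy import stats

def alpha_beta_ttest(strategy_returns, market_returns):
    """Estimate alpha and beta in r_strategy = alpha + beta * r_market + e by OLS,
    and test H0: alpha <= 0 against H1: alpha > 0 (right-tailed t-test)."""
    r = np.asarray(strategy_returns, dtype=float)
    m = np.asarray(market_returns, dtype=float)
    X = np.column_stack([np.ones(len(m)), m])
    coef, *_ = np.linalg.lstsq(X, r, rcond=None)
    alpha, beta = coef
    resid = r - X @ coef
    n = len(r)
    sigma2 = resid @ resid / (n - 2)              # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)         # covariance of the OLS coefficients
    t_stat = alpha / np.sqrt(cov[0, 0])
    p_value = 1.0 - stats.t.cdf(t_stat, df=n - 2) # right-tailed p value
    return alpha, beta, p_value
```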

The MERs and the corresponding p values of the t-tests for different systems on the six datasets are shown in Table 3. KAL achieves higher MERs than the other state-of-the-art systems on four datasets and ranks second on TSE. For example, KAL achieves MER = 0.0025 and 0.0003 on SP500 and HS300, compared with OLMAR (0.0020 and −0.0003), RMR (0.0019 and −4.5E − 5), and PPT (0.0022 and −0.0005), respectively. As we can see, KAL has high inherent excess returns, and it is the only state-of-the-art system that achieves a positive MER on HS300. Moreover, PPT obtains significantly better performance than the Market at a confidence level of 99% on five datasets. These results suggest that KAL is an effective and aggressive system that can capture significant excess returns in the long run.

4.2.3. Sharpe Ratio and Information Ratio

A rational investor not only wants to gain an excess return higher than that of the risk-free asset but also wants to balance return against risk. The Sharpe ratio (SR) is a traditional risk-adjusted return measurement, calculated as $\mathrm{SR} = (\bar{r} - r_f)/\sigma_r$, where $\bar{r}$ and $\sigma_r$ are the expectation and the standard deviation of the daily return $r_t$, respectively; they can be estimated from the daily samples of n periods. $r_f$ is the return of a risk-free asset in the financial market (e.g., bank deposits, bonds, and currency funds). In this paper, all the wealth is invested in risky assets, so $r_f$ is set to 0. The information ratio (IR) is also a risk metric, but it directly measures the risk-adjusted excess return of a system compared with the Market benchmark, calculated as the mean of the daily excess returns over the Market divided by their standard deviation.
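Both ratios reduce to simple sample statistics of the daily return series; a minimal sketch with the risk-free rate set to 0, as in this paper:

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    """SR = mean excess return over the risk-free rate / standard deviation."""
    r = np.asarray(returns, dtype=float) - risk_free
    return r.mean() / r.std(ddof=1)

def information_ratio(returns, market_returns):
    """IR = mean of the (strategy - market) daily returns / their standard deviation."""
    diff = np.asarray(returns, dtype=float) - np.asarray(market_returns, dtype=float)
    return diff.mean() / diff.std(ddof=1)
```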

The SRs of different systems are given in Table 4. KAL (0.1082, 0.0998, 0.0848, 0.0552) achieves the highest SRs among the state-of-the-art systems on NYSE(N), DJIA, SP500, and HS300, compared with OLMAR (0.1051, 0.0263, 0.0686, 0.0353), RMR (0.1031, 0.0743, 0.0682, 0.0512), and PPT (0.1085, 0.0801, 0.0719, 0.0246), respectively. The IRs of different systems are given in Table 5. KAL (0.1014, 0.1374, 0.0169) achieves the highest IRs among the state-of-the-art systems on NYSE(N), DJIA, and HS300, compared with OLMAR (0.0970, 0.0448, −0.0061), RMR (0.0951, 0.1045, 0.0099), and PPT (0.1005, 0.1130, −0.0255), respectively. Besides, KAL is competitive to OLMAR, RMR, and PPT on other datasets. These results show that KAL has a good ability in balancing between return and risk; hence, it is robust and reliable for investments.

4.3. Transaction Costs

Transaction costs are an important and unavoidable issue in portfolio selection. We conduct experiments on cumulative wealth according to the proportional transaction cost model and vary the transaction cost ratio from 0 to 0.5%. The CWs of different strategies on the six benchmark datasets are shown in Figure 5. As we can see, when the transaction costs increase, the CWs achieved by all strategies drop considerably, while KAL still outperforms PPT on all the datasets and outperforms OLMAR on most datasets except TSE. Notice that the real transaction cost ratio is usually below 0.5%; therefore, KAL is effective and practically applicable.

4.4. Parameter Sensitivity

Notice that our KAL system has several key parameters: the sampling range of the aggregating estimator and the expected profiting level ϵ for portfolio optimization. We now conduct experiments on all datasets to evaluate how different choices of these parameters affect the CWs. First, we fix the sampling range and let ϵ vary. The results are shown in Figure 6: the CWs remain nearly unchanged as ϵ varies, which indicates that KAL is stable with respect to ϵ, so its empirical value is kept fixed in the remaining experiments. Next, we fix ϵ and let the sampling range vary. The results are plotted in Figure 7 and show that the performance of KAL is good on all datasets when this parameter is around 30; we empirically adopt this conventional value.

5. Conclusion

In this paper, we consider the online portfolio optimization problem from the perspective of aggregating learning and trend-following. So far, few aggressive systems that follow this trend pattern and can keep up with most existing state-of-the-art defensive systems have been investigated in depth. Most previous works lack effective trend representation and suffer large investment losses when the market environment carries downside risk. Meanwhile, they are sensitive to noise and outliers in the data and are limited by optimal parameters chosen in hindsight.

KAL addresses these issues in two stages. First, technical indicator information extracted from historical data is designed, and an aggregating price relative prediction based on this additional information is proposed, with online convex optimization methods applied to calculate the model coefficients. Then, an online learning system is presented to track the increasing power of different assets by maximizing a kernel-based increasing factor. In this way, better performing assets receive more investment, while others receive less. A fast algorithm is also developed for KAL, which is applicable to large-scale and time-limited environments.

Extensive experiments on real-world markets show that the KAL system achieves promising performance. It achieves the highest CWs and the highest significant excess returns on most benchmark datasets, outperforming other state-of-the-art systems. It also delivers robust performance with high SRs and IRs, which are comparable with those of other state-of-the-art systems. In summary, KAL is an effective and efficient OLPO system.

For a further study of the KAL system, it is worth discussing the overfitting problem. Following the theorem in [35], the minimum backtest length (MinBTL, in years) is the length needed to avoid selecting a strategy with a given in-sample SR among N trials when its expected out-of-sample SR is zero. According to the experimental results in Sections 4.2.3 and 4.4, we can roughly calculate an approximate upper bound on the MinBTL of KAL on the six benchmark datasets. After comparing this upper bound with the realistic backtest length, we find that KAL avoids overfitting from this perspective on most datasets. Because MinBTL is merely a necessary, not sufficient, condition for avoiding overfitting, this issue deserves further investigation in our future work.

Data Availability

The MATLAB data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was funded by the National Social Science Fund Project, “Research on Financing Ecology, Financing Efficiency, and Co-evolution Mechanism of Strategic Emerging Industries in China” (Grant no. 15BGL056).