Abstract
China’s livestock market has experienced exceptionally severe price fluctuations over the past few years. In this paper, based on the well-established idea of “forecast combination,” a forecast combination framework with different time scales is proposed to improve the forecast accuracy for livestock products. Specifically, we combine forecasts from multiple time scales, i.e., the short-term forecast and the long-term forecast. Forecasts derived from multiple time scales introduce complementary information about the dynamics of price movements, thus increasing the diversity within the modeling process. Moreover, we investigate a total of ten combination methods with different weighting schemes, including linear and nonlinear combinations. The empirical results show that (i) forecast performance can be remarkably improved with this novel combination idea, and the short-term forecast model is more suitable for products with relatively high volatility, e.g., mutton and beef; (ii) the geometric mean, which provides a nonlinear combination, is the most effective of all the combination methods; and (iii) the variance-based weighting scheme can yield a result superior to the best individual forecast, especially for products such as egg and beef.
1. Introduction
Livestock is an important sector of the agricultural economy in China, accounting for nearly one-third of the total agricultural economic output (, 2017). Owing to various influential factors, e.g., extreme meteorological events, the spread of epidemic disease, and new policy legislation, the prices of livestock products have become volatile over the past ten years. Large fluctuations not only have a great impact on the consumer price index (CPI) [1] but also have adverse effects on people’s livelihood and related industries [2]. Therefore, accurate price prediction will help agricultural practitioners as well as the authorities to make policy decisions.
In the past several years, new forecasting techniques have emerged, for example, the TEI@I (text mining, econometrics, intelligence, and integration) methodology [3], the decomposition and ensemble paradigm [4], and the newly developed convolutional neural networks and long short-term memory networks [5, 6]. All of them have achieved great success in energy and financial market analysis; however, these advanced methods have not yet been applied to the agricultural commodity market [7]. The most widely used models for agricultural price forecasting can be grouped into two categories, i.e., statistical models and computational intelligence models. For example, Maki presented an econometric model to predict the cyclical fluctuations in livestock product prices [8]. Molina et al. used the autoregressive integrated moving average (ARIMA) method to forecast monthly pork prices in the Philippines [9]. Wu et al. proposed an exponential smoothing model to predict the pork price in China [10]. As for computational intelligence models, the time-delay network [11], the feedforward neural network [12], and the support vector machine [13, 14] are widely adopted. Although plenty of single forecast models have been used for different products, scholars have not yet reached a consensus on which one is the best forecasting model in all cases. To improve on the accuracy of individual forecasts and to avoid the respective shortcomings of different single models, the concept of “forecast combination” has been regarded as a well-established and well-tested strategy [15, 16]. By exploiting the diversity among the models to be combined, forecast combinations can often generate results superior to those of their constituent models [17].
In the realm of forecast combination, most of the existing studies focus on combining different types of single models. For example, Ribeiro and Oliveira combined artificial neural networks and stochastic models to predict the price of sugar alcohol sector in Brazil [18]. Xiong et al. combined the seasonal-trend decomposition model and extreme learning machines to predict the price of vegetables in China [19]. An interesting study conducted by Andrawis et al. pointed out that combining multi-time scale might be another promising direction for further improving the forecast accuracy [20]. Multi-time scale, e.g., weekly time series and monthly time series, contain complementary information about the data generation process of the time series. By combining them together, we are then able to take advantage of different kinds of data dynamics, thus providing a better approximation of the data generation process. To the best of our knowledge, this novel idea has only been applied in tourism demand forecasting and has proven to be highly effective. Hence, we introduce this novel idea to the field of agricultural price forecasting and validate its effectiveness, so as to expand the application of this novel combination idea.
Another important issue in the area of forecast combination is how to select the optimal weights for the individual forecasts. The success of the combination strategy depends greatly on the weights selected [21]. Hence, scholars have proposed several combination methods with applications in many areas [22, 23]. Some studies claim that complex or sophisticated combination methods usually do not perform well in the out-of-sample test, while the arithmetic mean often performs better [24, 25]. This phenomenon is known as the “forecast combination puzzle.”
Regarding the forecast combination of livestock products, we attempt to validate this hypothesis and verify whether there exist some combination methods that are superior to the arithmetic mean, or even surpass the best single model. Moreover, different from the current studies that mainly focus on linear combination methods, two nonlinear combination methods that have seldom been used in the area of forecasting have been employed in our study, i.e., geometric mean and harmonic mean.
In general, the objective of this paper is to introduce the idea of forecast combination with multi-time scale for livestock products' price forecasting in China. Pork, egg, mutton, and beef are chosen as the research samples because of their important roles in the livestock market and in people’s daily life. Meanwhile, we investigate various combination methods (i.e., linear and nonlinear methods) and try to determine whether there is a method that can effectively surpass the best individual forecast, and which weighting scheme is most suitable for improving agricultural forecast accuracy in China. Relative to current studies, the main contributions of this paper can be summarized in the following points:
(1) Based on the well-established strategy of forecast combination, this paper proposes a novel forecast combination framework with multi-time scale for livestock products’ price forecasting in China.
(2) To the best of our knowledge, this might be the first attempt to combine forecasts from multi-time scale in the field of agricultural commodities’ price prediction.
(3) The performances of ten combination methods with different weighting schemes are statistically evaluated, and a nonlinear combination method coupled with a performance-based weighting scheme is identified as the superior one for the livestock products in China.
The remainder of this paper is organized as follows. Section 2 introduces the data investigated and the overall research framework of this paper. Section 3 reports the experimental results, including the comparison between different combination methods and the test of two hypotheses. Section 4 analyzes the possible reasons for the empirical results and finally, Section 5 gives some concluding remarks.
2. Data and Methods
2.1. Data Description
In this study, the price series of several livestock products (i.e., pork, beef, mutton, and egg) in the Chinese market are selected as study samples. The sample data are in weekly frequency covering the period from January 2009 to December 2017, with a total of 468 observations. Data from January 2009 to December 2016 are used as the in-sample period (with 416 observations), and data from January 2017 to December 2017 are specified as the out-of-sample period (with 52 observations). All the data are available on http://www.wind.com.cn.
Figure 1 reveals that each price series has a different fluctuation pattern. These differences are shown intuitively by the box plots in Figure 2. To make the graphics more readable and to remove the effect of different price scales, the data shown in Figure 2 are scaled to the range [0, 1] by equation (1). The larger boxes for mutton and beef indicate that their prices are more volatile than those of the other products:
$$x_t' = \frac{x_t - x_{\min}}{x_{\max} - x_{\min}}, \tag{1}$$
where $x_t$ is the original price in week $t$, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum of the series.
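The scaling to [0, 1] described above is ordinary min-max normalization. A minimal sketch follows; the function name `min_max_scale` and the sample prices are illustrative assumptions, not taken from the paper's data.

```python
def min_max_scale(values):
    """Linearly rescale a sequence of prices to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical weekly prices, not taken from the paper's data set.
prices = [20.0, 25.0, 22.5, 30.0]
scaled = min_max_scale(prices)
print(scaled)
```

After scaling, the box plots of different products share a common axis, so box sizes can be compared directly.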


2.2. Forecast Combination Framework
Let $y_t^{w}$ be the weekly price of an agricultural product, where $t$ indexes the week and the superscript $w$ denotes the weekly frequency. We convert the original time series to a monthly time series $y_\tau^{m}$, where $\tau$ indexes the month and $m$ denotes the monthly frequency. That is, $y_\tau^{m}$ equals the average of the $y_t^{w}$ whose weeks fall into month $\tau$. We use the ARIMA model to fit both the weekly and the monthly price data and obtain the weekly forecast $\hat{y}_t^{w}$ and the monthly forecast $\hat{y}_\tau^{m}$, respectively. To combine these two forecasts, the monthly forecast has to be interpolated into a weekly-frequency time series, namely $\hat{y}_t^{m}$, where the superscript $m$ indicates that the forecast is derived from an interpolation of the monthly forecast. This interpolation is implemented with the frequency transformation function in EViews 7. Once they share the same frequency, the short-term and long-term forecasts can be combined. Forecasts derived from different time scales contain complementary information about the price behavior, and by combining them we can take advantage of different kinds of data dynamics, thus providing a better approximation of the real data generation process. The combined forecast can be written as
$$\hat{y}_t^{c} = f\left(\hat{y}_t^{w}, \hat{y}_t^{m}\right), \tag{2}$$
where the function $f(\cdot)$ denotes a particular combination method. All these combination methods are introduced in the following section.
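The data flow of this framework can be sketched in a few lines. This is only an illustration under stated assumptions: the two forecast series are hard-coded stand-ins for ARIMA output, the helper names are hypothetical, and the monthly-to-weekly conversion simply repeats each month's value, which may differ from the EViews frequency transformation used in the paper.

```python
from collections import defaultdict


def weekly_to_monthly(weekly, month_ids):
    """Average the weekly values that fall in each month.

    month_ids[i] is the month index of week i.
    """
    buckets = defaultdict(list)
    for value, month in zip(weekly, month_ids):
        buckets[month].append(value)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}


def monthly_to_weekly(monthly, month_ids):
    """Assumed conversion: repeat each month's value for all of its weeks."""
    return [monthly[month] for month in month_ids]


def combine(short_term, long_term, f):
    """Apply a combination function f to the two forecasts, week by week."""
    return [f(s, l) for s, l in zip(short_term, long_term)]


# Hypothetical data: 8 weeks spread over 2 months.
month_ids = [1, 1, 1, 1, 2, 2, 2, 2]
weekly_prices = [10.0, 12.0, 11.0, 11.0, 14.0, 13.0, 15.0, 14.0]
monthly_prices = weekly_to_monthly(weekly_prices, month_ids)

# Stand-ins for the weekly and monthly ARIMA forecasts.
weekly_forecast = [10.5, 11.5, 11.0, 11.2, 13.5, 13.8, 14.6, 14.1]
monthly_forecast = {1: 11.2, 2: 13.8}

long_term = monthly_to_weekly(monthly_forecast, month_ids)
combined = combine(weekly_forecast, long_term, lambda s, l: 0.5 * s + 0.5 * l)
print(monthly_prices, combined)
```

Any of the combination functions in Section 2.3 can be passed as `f`; the equal-weight average is used here only as a placeholder.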
2.3. Forecast Combination Methods
It is believed that the success of forecast combination strategy depends on how well the weights of each individual forecast are determined. An evaluation set which covers from January 2016 to December 2016 (with 52 observations) is extracted from the in-sample period, in order to estimate the combining weights. Here, we are going to introduce a total of ten combination methods, including linear and nonlinear combinations.
2.3.1. Equal Weight ()
This approach assigns equal weights to each individual forecast. In our research, the two weights are $\omega_1 = \omega_2 = 0.5$. Although it seems quite simple compared with other, more complex weighting schemes, it is usually regarded as a robust method, especially in real prediction settings [24, 26].
2.3.2. Inverse of Forecasts Errors ()
This method was proposed with the weights proportional to their individual inverse of the forecast errors, i.e., mean square error (MSE). The forecast errors are calculated from the evaluation set. The method is abbreviated as and its calculation process is given by [19]
$$\omega_i = \frac{1/\mathrm{MSE}_i}{1/\mathrm{MSE}_1 + 1/\mathrm{MSE}_2}, \quad i = 1, 2, \tag{3}$$
where $\mathrm{MSE}_i$ is the mean square error of forecast $i$ on the evaluation set.
2.3.3. Variance Based ()
The variance-based combination method, which assigns smaller weights to forecasts with higher variability, is another widely adopted approach. Suppose the forecasts from the two individual models are unbiased, with $\sigma_1^2$ and $\sigma_2^2$ as their individual variances and $\sigma_{12}$ as their covariance. By minimizing the variance of the combined forecast, the optimal weights can be written as [24]
$$\omega_1 = \frac{\sigma_2^2 - \sigma_{12}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}, \quad \omega_2 = 1 - \omega_1. \tag{4}$$
2.3.4. Rank Based ()
Although the above-mentioned methods based on minimizing the forecast errors have been proven to be effective in many studies, Gupta and Wilton argued that there were still some deficiencies, i.e., the appearance of negative weights might induce a meaningless combination [27]. Hence, they proposed a method based on the ranks of the forecasts performance of each individual model. This method is considered to be much more robust than the variance-based one.
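Let $r_1$ denote the number of times that model 1 outperforms model 2, and $r_2$ the number of times that model 2 outperforms model 1. The ranks are also estimated from the evaluation set. Then, we have
$$\omega_1 = \frac{r_1}{r_1 + r_2}, \quad \omega_2 = \frac{r_2}{r_1 + r_2}. \tag{5}$$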
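The three performance-based weighting schemes described in Sections 2.3.2 to 2.3.4 can be illustrated as follows. This is a sketch under assumptions: the inverse-MSE weights are taken as proportional to 1/MSE, the variance-based weights use the standard minimum-variance formula with sample (co)variances of the forecast errors, and the rank-based weights are win proportions. Function names and data are hypothetical.

```python
def mse(forecast, actual):
    """Mean square error of a forecast on the evaluation set."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


def inverse_mse_weights(f1, f2, actual):
    """Weights proportional to 1/MSE (assumes both MSEs are positive)."""
    inv1 = 1.0 / mse(f1, actual)
    inv2 = 1.0 / mse(f2, actual)
    total = inv1 + inv2
    return inv1 / total, inv2 / total


def variance_weights(f1, f2, actual):
    """Minimum-variance weights from sample error variances and covariance."""
    e1 = [f - a for f, a in zip(f1, actual)]
    e2 = [f - a for f, a in zip(f2, actual)]
    n = len(actual)
    m1, m2 = sum(e1) / n, sum(e2) / n
    var1 = sum((x - m1) ** 2 for x in e1) / (n - 1)
    var2 = sum((x - m2) ** 2 for x in e2) / (n - 1)
    cov = sum((x - m1) * (y - m2) for x, y in zip(e1, e2)) / (n - 1)
    w1 = (var2 - cov) / (var1 + var2 - 2.0 * cov)
    return w1, 1.0 - w1


def rank_weights(f1, f2, actual):
    """Weights equal to the share of periods in which each model is closer."""
    wins1 = sum(abs(a - y) < abs(b - y) for a, b, y in zip(f1, f2, actual))
    wins2 = sum(abs(b - y) < abs(a - y) for a, b, y in zip(f1, f2, actual))
    if wins1 + wins2 == 0:  # every period is a tie
        return 0.5, 0.5
    return wins1 / (wins1 + wins2), wins2 / (wins1 + wins2)


# Hypothetical evaluation-set data.
actual = [10.0, 12.0, 11.0, 13.0]
model1 = [11.0, 12.0, 12.0, 13.0]
model2 = [12.0, 14.0, 11.0, 15.0]

w_mse = inverse_mse_weights(model1, model2, actual)
w_var = variance_weights(model1, model2, actual)
w_rank = rank_weights(model1, model2, actual)
print(w_mse, w_var, w_rank)
```

Note that the variance-based weight can fall outside [0, 1] when the error covariance is large, which is the deficiency the rank-based method is meant to avoid.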
2.3.5. Nonlinear Combination
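All the methods mentioned above are conventional linear combination methods. Considering that the original price series may contain both linear and nonlinear patterns, we also introduce two nonlinear combination strategies in this paper, i.e., the geometric mean and the harmonic mean [28]. These nonlinear combination methods have seldom been used in the area of forecasting, and we investigate their effectiveness. The combination functions based on the geometric mean and the harmonic mean are expressed by equations (6) and (7), respectively:
$$\hat{y}_t^{c} = \left(\hat{y}_t^{w}\right)^{\omega_1} \left(\hat{y}_t^{m}\right)^{\omega_2}, \tag{6}$$
$$\hat{y}_t^{c} = \left(\frac{\omega_1}{\hat{y}_t^{w}} + \frac{\omega_2}{\hat{y}_t^{m}}\right)^{-1}, \tag{7}$$
where $\hat{y}_t^{w}$ and $\hat{y}_t^{m}$ are the short-term and long-term forecasts for week $t$, and $\omega_1$ and $\omega_2$ are their weights, with $\omega_1 + \omega_2 = 1$.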
Similarly, there are different ways of determining the weights and . Here, we mainly contemplate three schemes. The first one uses the weights obtained from the method and is abbreviated as and . The second one utilizes the weights derived from the method and is abbreviated as and . The third one can be regarded as a special case among the methods above, i.e., . With this equal weighting strategy, the combined forecast based on the geometric mean is written as equation (8) with the abbreviation :
$$\hat{y}_t^{c} = \sqrt{\hat{y}_t^{w}\,\hat{y}_t^{m}}, \tag{8}$$
where $\hat{y}_t^{w}$ and $\hat{y}_t^{m}$ denote the short-term and long-term forecasts for week $t$.
Also, the combined forecast built on the harmonic mean is given by equation (9) with the abbreviation :
$$\hat{y}_t^{c} = \frac{2\,\hat{y}_t^{w}\,\hat{y}_t^{m}}{\hat{y}_t^{w} + \hat{y}_t^{m}}, \tag{9}$$
for the same pair of forecasts, $\hat{y}_t^{w}$ (short-term) and $\hat{y}_t^{m}$ (long-term).
Finally, ten combination methods with different weighting schemes are proposed in this study. Four of them are linear combinations, i.e., , , , and . Six of them belong to nonlinear combination, i.e., , , , , , and . The former part of the abbreviation denotes the combination method, while the latter part indicates the weighting scheme.
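The two nonlinear combination functions can be sketched as weighted geometric and harmonic means. This assumes the standard weighted forms with weights summing to one and strictly positive forecasts (prices); the function names and numbers are illustrative only.

```python
def geometric_combination(short_term, long_term, w1=0.5, w2=0.5):
    """Weighted geometric mean of two positive forecasts."""
    return (short_term ** w1) * (long_term ** w2)


def harmonic_combination(short_term, long_term, w1=0.5, w2=0.5):
    """Weighted harmonic mean of two positive forecasts."""
    return 1.0 / (w1 / short_term + w2 / long_term)


# Hypothetical forecasts for a single week.
s, l = 4.0, 9.0
gm_equal = geometric_combination(s, l)            # equal weights
hm_equal = harmonic_combination(s, l)             # equal weights
gm_weighted = geometric_combination(s, l, 0.75, 0.25)
am_equal = 0.5 * s + 0.5 * l                      # linear benchmark
print(hm_equal, gm_equal, am_equal, gm_weighted)
```

For positive inputs and equal weights, the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean, so the harmonic combination is pulled most strongly toward the smaller of the two forecasts.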
2.4. Forecast Modeling and Evaluation Criteria
2.4.1. Forecast Modeling
The forecast model used in this paper is the ARIMA model, chosen for its maturity and applicability in both scientific research and industrial applications [29]. In a typical ARIMA model of order $(p, d, q)$, the future value of a variable is assumed to be a linear function of several past observations and random errors. The model can be written as [30]
$$\phi(B)\,\nabla^{d} y_t = \theta(B)\,\varepsilon_t.$$
Here, $y_t$ and $\varepsilon_t$ represent the observation and the random error term at time $t$, respectively. $B$ is the backward shift operator, defined as $B y_t = y_{t-1}$. The notation $\nabla^{d}$ equals $(1 - B)^{d}$, where $d$ is the order of differencing. $\phi(B)$ and $\theta(B)$ are the autoregressive (AR) and moving average (MA) operators of order $p$ and $q$, respectively, and they are defined as follows:
$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p, \qquad \theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q.$$
With the procedure proposed by Box and Jenkins in 1976, the orders $(p, d, q)$ can be properly specified according to the AIC and BIC criteria.
2.4.2. Evaluation Criteria
There are plenty of measures for evaluating forecast accuracy in univariate time series forecasting, e.g., mean absolute error (MAE), root mean squared error (RMSE), mean absolute percent error (MAPE), and mean absolute scaled error (MASE). In this paper, we specify RMSE (a scale-dependent measure) and MAPE (a scale-independent measure) as the evaluation criteria. Both are widely used in time series analysis, and they provide complementary approaches for evaluating forecast errors. A smaller value of either measure indicates higher forecast accuracy:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\hat{y}_t - y_t\right)^2}, \qquad \mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N}\left|\frac{\hat{y}_t - y_t}{y_t}\right| \times 100\%.$$
Here, $\hat{y}_t$ is the predicted value and $y_t$ is the corresponding actual value at time $t$, and $N$ is the size of the testing data set, i.e., 52 observations in this research.
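Both criteria are straightforward to compute. The sketch below uses the usual definitions, with MAPE expressed in percent (an assumption about the paper's scaling); the sample values are hypothetical.

```python
import math


def rmse(forecast, actual):
    """Root mean squared error (same unit as the price)."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)


def mape(forecast, actual):
    """Mean absolute percent error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((f - a) / a) for f, a in zip(forecast, actual)) / n


# Hypothetical out-of-sample values.
actual = [10.0, 20.0]
forecast = [11.0, 18.0]
print(rmse(forecast, actual), mape(forecast, actual))
```

RMSE penalizes large errors more heavily, while MAPE allows comparison across products with different price levels.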
Furthermore, the Diebold-Mariano (DM) test is implemented to check whether the difference in accuracy between two forecasts is statistically significantly different from zero [18]. The null hypothesis is that the two competing methods have the same forecast accuracy. Let $g(e_{i,t})$ be the loss function of the forecast error $e_{i,t}$ of model $i$. Then, the DM statistic is given by
$$S = \frac{\bar{d}}{\sqrt{\hat{V}_{\bar{d}}/N}},$$
where $\bar{d} = \frac{1}{N}\sum_{t=1}^{N}\left[g(e_{1,t}) - g(e_{2,t})\right]$ and $\hat{V}_{\bar{d}}$ is a consistent estimate of the asymptotic variance of $\sqrt{N}\,\bar{d}$. Under the null hypothesis, the S-statistic has an asymptotic standard normal distribution [29]. Here, a two-sided t-test is employed to test the S-statistic.
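A simplified version of the DM statistic can be sketched as follows. The assumptions are loud ones: squared-error loss, one-step-ahead forecasts so that the variance term reduces to the sample variance of the loss differential (no autocovariance correction), and a normal approximation for the p-value instead of the t-test mentioned in the text. The function names and error series are hypothetical.

```python
import math


def dm_statistic(errors1, errors2):
    """DM statistic with squared-error loss and no autocovariance correction."""
    d = [a * a - b * b for a, b in zip(errors1, errors2)]  # loss differential
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n  # lag-0 autocovariance
    return d_bar / math.sqrt(var_d / n)


def two_sided_p_value(stat):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(stat) / math.sqrt(2.0))


# Hypothetical forecast errors of two competing models.
e1 = [1.0, 2.0, 1.0, 2.0]
e2 = [2.0, 2.0, 3.0, 1.0]
s_stat = dm_statistic(e1, e2)
p_value = two_sided_p_value(s_stat)
print(s_stat, p_value)
```

A negative statistic means the first model has the smaller average loss; swapping the two error series flips the sign but leaves the p-value unchanged.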
2.5. Hypothesis Formulation
There are always some arguments in the area of forecast combination. The most famous one is the so-called “forecast combination puzzle,” which indicates that equal weighting combination strategy usually has a better performance than the sophisticated combination strategy [21, 30]. However, other scholars also pointed out that equal weighting strategy might only work under the situation when the two competing models have comparable performance [19], and the advantage of forecast combinations is not to have a better performance than the best appropriate model, but to avoid the risk in selecting an appropriate forecast model [22]. These arguments inspire us to investigate whether there is a combination method that can effectively surpass the best individual model, and whether equal weighting has a universal performance in every situation. Hence, we propose two research hypotheses here.
Hypothesis 1. The accuracy of the combined model is on average equal to (not better than) the best single forecast model.
Hypothesis 2. If two single models have different forecast accuracy, then the weighting scheme based on forecast performance will be a better choice. Otherwise, the equal average method should be specified.
The overall experimental process is illustrated in Figure 3.

3. Empirical Results
3.1. Results of the ARIMA Modeling
The forecast model used in this study is ARIMA, which is one of the most popular models for time series forecasting. The main process of fitting the ARIMA model includes the following steps: (a) identifying the possible model order $(p, d, q)$; (b) estimating the unknown parameters; (c) choosing the best-fitted model according to an information criterion; and (d) forecasting the future values based on the best-fitted model. All the modeling for both weekly and monthly data was implemented with the package “forecast” (version 8.3) in R, and the AIC was specified for model selection. To avoid clutter, we only report the estimated weekly forecast models for the four agricultural products as follows:
3.2. Performance of Single Forecast Model
In this paper, two forecast strategies with different time scales are proposed and the results are shown in Table 1. Here, short-term forecast refers to the direct forecast from the original weekly price data, while the long-term forecast means the weekly forecast derived from the interpolation of the monthly forecast. Table 1 shows the forecast accuracy of different forecast strategies, i.e., short-term and long-term forecast, in terms of both RMSE and MAPE criteria.
It can be seen that the short-term forecast is more suitable for beef, while the long-term forecast is more suitable for egg and pork, owing to the smaller RMSE and MAPE values. As for mutton, RMSE and MAPE give conflicting results. To reach a consistent conclusion, we also compare two further evaluation criteria, i.e., mean error (ME) and mean percent error (MPE). Of the four criteria, ME, RMSE, and MPE indicate that the short-term forecast has a smaller error than the long-term forecast. We therefore conclude that the short-term forecast is more suitable for mutton.
Different forecast strategies clearly perform differently across products. A possible reason is that the products differ in their price fluctuations. Figure 2 shows that price volatility is more severe for mutton and beef. In such a highly volatile setting, a modeling procedure that better approximates the short-term data generation process is likely to yield a better forecast. Consequently, the weekly forecast, which is more responsive to movements over a relatively short period, outperforms the monthly forecast.
Here, we define the model with the smaller forecast error as the best single model for each product. That is, the short-term forecast is the best model for mutton and beef, and the long-term forecast is the best model for pork and egg. The model with the relatively larger forecast error is correspondingly defined as the baseline model.
3.3. Performance of Different Combination Methods
A total of ten combination methods are introduced in Section 2.3. The weights for each combination method derived from the evaluation set are listed in Table 2. The former number in the bracket indicates the weight for the short-term forecast, and the latter one denotes that of long-term forecast. Note that the weights of the method are determined by the testing set.
With the weights shown in Table 2, the out-of-sample forecast of each combination method can be obtained. The average forecast errors across all the four products are calculated so as to compare the overall performance of each method. Descriptive analysis with the measurements of mean, standard deviation, minimum, and maximum was performed and presented in Table 3.
Table 3 shows that each combination method has identical performance ranking in both RMSE and MAPE criteria. In general, the top three combination methods are , , and , while the method has the poorest performance. Another interesting finding is that, the method, which is usually regarded as a robust combination strategy, only lists the last three in our experiment. Some detailed discussions will be carried out in Section 4.
3.4. Performance Improvement Ratio
It is common sense that forecast combination is a noninferior strategy in most cases [29]. Noninferior strategy means the combined results are usually better than the worst single model but are not better than the best single one. If the outputs can even beat the best single model, then the combination method can be identified as the superior one. In this section, we evaluate the performance improvement ratio of each method with the worst single model as the baseline. The performance improvement ratio in terms of RMSE and MAPE are defined as follows:
Here, and represent the forecast errors of the baseline model. According to Table 1, we specify the short-term forecast model as the baseline for pork and egg and the long-term forecast model for mutton and beef, respectively. Figure 4 shows the average of different combination methods, and Figure 5 shows the average of different products across all the combination methods. Considering the method often yields the result even worse than the baseline model, it is excluded in the above calculation process.
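The improvement ratio can be illustrated with a short sketch. It assumes the ratio is the relative reduction in error with respect to the baseline model, expressed in percent; the function name and the error values are hypothetical and are not results from the paper.

```python
def improvement_ratio(baseline_error, combined_error):
    """Relative error reduction versus the baseline model, in percent."""
    return (baseline_error - combined_error) / baseline_error * 100.0


# Hypothetical RMSE values for one product.
baseline_rmse = 0.50   # worst single model
combined_rmse = 0.40   # a combination method
ratio = improvement_ratio(baseline_rmse, combined_rmse)
print(ratio)
```

A positive ratio means the combination beats the baseline; a negative ratio means it performs worse than the worst single model. The same function applies to MAPE.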


It can be seen that all the methods have a favorable performance improvement ratio compared to the baseline, especially for the methods and , in both RMSE and MAPE criteria. These results confirm the superiority of forecast combination strategy in the area of agricultural forecasting. As for the specific agricultural products, forecast accuracy of beef and pork can be mostly improved by the combination strategy.
3.5. Hypothesis Test
We have proposed two hypotheses in Section 2.5 according to the existing literature. Here, we are going to validate these hypotheses based on our experimental results.
3.5.1. Hypothesis 1
Hypothesis 1 declares that the accuracy of the combination model is on average equal to (not better than) the best single model. In Section 3.2, the long-term forecast model is defined as the best single model for pork and egg, with a short-term forecast model as the best single one for beef and mutton. Comparing the performance of the best single model with the combined model, we find that the and methods yield favorable results in most cases, as shown in Table 4.
In order to consolidate the superiority of and , a DM test is introduced to check whether the forecast results generated by these methods are statistically better than the best single model. The null hypothesis is that the test model and the counterpart model have equal forecast accuracy. Results in Table 5 show that the combined models (, ) outperform the best single model at a 1% level of statistical significance, for the item egg and beef. Consequently, hypothesis 1 can be rejected for these two specific products.
3.5.2. Hypothesis 2
Hypothesis 2 indicates that if two single models have different forecast accuracy, then the weighting scheme based on forecast performance should be a better strategy. Otherwise, equal weighting strategy should be chosen. Hence, there are two steps for verifying hypothesis 2. First, a DM test is implemented to find whether two single models (i.e., short-term and long-term forecast model) have different forecast performance. Second, the combination methods are divided into different groups and the performance difference between different groups is further investigated.
The results of the DM test provide us the information about whether the difference between two competing models is statistically significantly different from zero [30]. Table 6 shows the S-statistic for each of the pairing models. Results statistically demonstrate the forecast models with multi-time scale have different forecast accuracy at a 5% level of statistical significance across all the products.
Then, we classify the methods of , , and into the group named EW, which indicates the weights are determined by the equal weighting strategy. The remaining methods are classified into another group titled PB, which means the weights are decided by the performance-based approach. From the descriptive analysis shown in Table 3, we find that on average, the top five combination methods are , , , , and , respectively. The top five methods all belong to group PB, while the methods in group EW perform much worse than those of group PB.
Furthermore, considering the equal weighting strategy does not perform well in this study, a DM test is again applied to verify the performance of and other superior combination methods. Table 7 reveals that, in most cases, the performances of are inferior to other combination methods. These results clearly demonstrated that the performance-based weighting scheme is more workable when the two individual models have different forecast performance. Hence, we can not reject hypothesis 2.
4. Discussion
Accurate prediction provides a solid foundation for better planning and administration of agricultural products. The existing literature contributes a lot in improving the forecast accuracy by introducing sophisticated forecast models as well as optimizing the parameters within the models [18, 31, 32]. But, there is still a consensus in the forecasting literature that no single model is the best for all the cases [33, 34]. Thus, the idea of forecast combination evolved and it is considered to be an effective safeguard against the forecast uncertainties [7]. However, most of the previous studies only focused on combining different kinds of single models, and the combination methods investigated were relatively limited. In this paper, we investigate the idea of forecast combination with multi-time scale and verify a total of ten combination methods for the livestock products in China. With the experimental results presented above, some interesting findings can be found as follows:
(1) Combining time series with multi-time scale along with some effective combination methods can remarkably improve the forecast accuracy for the agricultural products. The main reason is that the multi-time scale involves complementary information about the price behavior (i.e., the short-term dynamics and the long-term dynamics). By adding these diversities together, the combined models are able to provide a better approximation of the real data generation process, thus generating a more accurate price prediction. This result confirms the superiority of forecast combination with the multi-time scale.
(2) Considering the model based on different time scales, we find that short-term forecast model has better performance for mutton and beef, while long-term forecast model is more suitable for pork and egg. The possible reason can be ascribed that short-term forecast model has a relative superior capability of capturing the short-term dynamics of price movement. Therefore, compared to the long-term forecast model, it is more likely to generate a precise prediction when faced with a high-volatile prediction situation.
(3) Hibon and Evgeniou pointed out that the advantage of forecast combination is to reduce the risks in selecting the best single model, rather than to generate a forecast better than the best single one based on the experiments of M3 data set (3003 series) [23]. In our study, the forecast performance of the best single model can be improved with some effective combination methods, i.e., and , as shown in Table 4. Consequently, these two methods can be defined as the superior combination method, especially for the product beef and egg. Considering that the M3 data set contains little agricultural time series, this might be the possible reason of why our experimental results are different from previous work.
(4) Considering the approach , which has seldom been used in the area of forecasting, it performs quite well in our study. The main reason is that owing to the nonlinearity of the agricultural price series, geometric mean, which has the property of nonlinear combination, is more likely to generate a favorable combination result. As for another nonlinear method harmonic mean, the variations of this method (i.e., ) ranks last among all the methods. The possible reason is that the method is more sensitive to smaller values and it assigns larger weights to the worse single model, thus generating a poor performance. As can be seen in Table 3, is the most effective way of determining the combination weights. Owing to the intrinsic inverse property of harmonic mean, combined with the most optimal weights, turns out to be the poorest combination method.
(5) All the combination methods investigated in this paper can be classified into two groups, i.e., the equal weighting group EW and the performance-based group PB. The result from Table 7 provides a counterexample for “forecast combination puzzle,” and it confirms the hypothesis that the performance-based weighting scheme is more feasible when the individual forecasts have different forecast performance.
Furthermore, there are two kinds of performance-based weighting schemes in this study, i.e., MSE-based and VAR-based weighting schemes. On average, the VAR-based methods ( and ) are superior to the MSE-based methods ( and ) as can be seen in Table 3. Compared to the MSE-based method, the variance-based method is supposed to be more feasible in the real prediction situation, in which the actual value is always unavailable.
5. Conclusion
In previous studies, scholars mainly focused on the combination of different single models for agricultural commodities' price forecasting. In this paper, we verify the idea of forecast combination with multi-time scale, i.e., short-term and long-term forecasts. Forecast accuracy has been remarkably improved with some effective combination methods. The rationale is that the multi-time scale represents different types of price movement and therefore increases the diversity of the combined results. We therefore believe that this is a promising direction for enhancing the forecast performance for livestock products in China. Furthermore, we find that the short-term forecast model is more suitable for products with higher price fluctuations, i.e., mutton and beef.
The other objective of this paper is to verify the effectiveness of different combination methods with different weighting schemes. Generally speaking, for the livestock price forecasting task, a performance-based weighting scheme is more feasible than the equal weighting scheme. Moreover, the evidence suggests that the nonlinear combination method and the variance-based weighting scheme are the most effective strategies. In most cases, the methods and yield results even superior to those of the best single model, especially for the products egg and beef.
This paper has proposed some directions for improving the forecast accuracy for the livestock products in China; however, a further study is still needed. In the near future, we are planning to explore the performance of nonlinear combination via some artificial intelligence models, i.e., support vector regression (SVR) and extreme learning machine (ELM). Also, compared to the static weighting scheme used in this paper, the performance of a dynamic weighting strategy will be further discussed.
Data Availability
The research data are the average price from China’s livestock market and available on http://www.wind.com.cn. The research data are authoritative and reliable.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (NSFC, 61702197, 71971089, and 91746102).