Abstract
COVID-19, the disease caused by SARS-CoV-2, has affected the entire world, resulting in a death rate higher than the death probability expected before the pandemic. Prior to the COVID-19 pandemic, death probability was assessed in a normal context that differs from the conditions during the pandemic, particularly for the older population. However, there is no evidence to date on excess mortality in Malaysia. Therefore, this study determines the excess mortality rate for specific age groups during the pandemic in Malaysia. Before determining the excess mortality rate, this study evaluates the performance of various parametrized mortality models on the pre-pandemic data set. This study employs the hold-out, repeated hold-out, and leave-one-out cross-validation procedures to identify the optimal mortality law for fitting the mortality data. Based on the goodness-of-fit measures (mean absolute percentage error, mean absolute error, sum square error, and mean square error), the Heligman-Pollard model for men and the Rogers Planck model for women are considered the optimal models. In assessing the excess mortality, both models favour the hold-out technique. When the COVID-19 mortality data are incorporated to forecast the mortality rate for people aged 60 and above, there is an excess mortality rate. However, the effect on the men’s mortality rate appears to be delayed and more prolonged than the effect on the women’s mortality rate. Consequently, the government is recommended to amend the existing policy to reflect the post-COVID-19 mortality forecast.
1. Introduction
COVID-19, the disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), is highly contagious, affects the human respiratory, hepatic, gastrointestinal, and neurological systems, and has had a significant negative impact on the entire world [1]. This pandemic has the potential to have an unparalleled influence around the globe, as the number of deaths continues to rise day by day. Malaysia recorded almost 17,000 daily COVID-19 cases on August 9, 2021, with 212 deaths, making it the Southeast Asian country with the third-highest cumulative cases after Indonesia and the Philippines [2]. Despite the Malaysian government’s implementation of a nationwide movement control order (MCO) on March 18, 2020, the pandemic substantially impacted the population, particularly the number of deaths. Beyond the current overall population, estimating the size of future populations requires understanding mortality rates. Although deaths are unexpected occurrences, the death rate profoundly affects demographers, insurance companies, legislators, and the government. Therefore, all stakeholders should be aware of changes in mortality rates to continue providing the highest quality of service and deliver the best resources for future planning.
According to the Department of Statistics Malaysia, the number of Malaysians aged 60 and over was expected to reach 3.5 million in 2020, accounting for 10.7% of the country's total population. However, the current COVID-19 pandemic is anticipated to lower the number of senior citizens, as COVID-19 mortality rates are higher in countries with a higher proportion of the population aged 65 and over [3,4]. Consequently, the biggest concern articulated prior to the pandemic outbreak concerning an ageing society should be reconsidered, as the mortality rate of elderly citizens has a significant influence during the pandemic. In terms of post-pandemic support for Malaysia’s ageing population, Jamaluddin et al. [5] proposed the following steps: (i) reconsider whether the existing social security framework meets the needs of the population post-pandemic and in the light of an ageing population and (ii) reform the existing social security framework and contemplate a holistic execution plan in light of the projected economic outlook. Furthermore, Chung et al. [6] emphasised the importance of assisting older adults in Malaysia in adapting to their new surroundings and participating actively in social and economic activities.
COVID-19 is exerting a detrimental influence on humans all across the world, resulting in higher mortality and shorter life expectancy. Nevertheless, age-specific mortality rates have yet to be investigated in order to determine whether Malaysia has been experiencing an excess in mortality. Excess mortality rates occur when the total number of deaths during a crisis (i.e., global pandemic) exceeds what would be expected in normal circumstances [7]. If an excess mortality rate was identified, the government, demographers, insurance providers, and policymakers might revise their existing strategy for dealing with the current population numbers. As a result of this event, this study determines the excess mortality for a specific age group in Malaysia during the COVID-19 pandemic. Before investigating the excess mortality rate, this study investigates the optimum resampling method and parametrized mortality model to forecast the mortality rate under normal conditions. Parametrized functions, commonly known as mortality laws, are one-factor models that aim to represent the age pattern of death parsimoniously and they have advantages in the smoothness of predicted rates over time [8]. On the other hand, resampling methods are required as one of the quantitative tools to analyze the existing and anticipated conditions of mortality patterns [9].
As a consequence of the pandemic’s domino effect, the number of deaths is increasing tremendously. Therefore, this study forecasts the mortality rate in normal conditions by employing various resampling methods to fit the Malaysian mortality rate and to identify the best model and resampling method. The main contribution of this research is twofold: (i) identifying the best resampling method and mortality model for forecasting the Malaysian mortality rate in normal conditions and (ii) determining the existence of an excess mortality rate during the pandemic. Section 2 briefly describes the parametric mortality models employed in this study, as well as their development and application. Section 3 briefly discusses several resampling techniques for fitting mortality. Section 4 utilizes goodness-of-fit measures to identify the best model for forecasting the mortality rate under normal conditions in Malaysia. Section 5 determines the existence of excess mortality during COVID-19, and Section 6 discusses the findings. Finally, Section 7 summarises the optimal mortality model and resampling technique to forecast normal and excess mortality rates for specific age groups during a pandemic.
2. Parametrized Mortality Models
This section describes the parametric mortality models used to fit the mortality rates. The mortality models describe the process of individuals in a population dying off over the timeframe of a significant portion of their lives [10]. The mortality rate formulation is expressed in Equation 1:

$$m_{x,t} = \frac{D_{x,t}}{E_{x,t}} \qquad (1)$$

where (i) $m_{x,t}$ is the mortality rate for a specific age group $x$ at a specific time $t$, (ii) $D_{x,t}$ is the number of deaths for a specific age group $x$ at a specific time $t$, and (iii) $E_{x,t}$ is the exposures for a specific age group $x$ at a specific time $t$.
Exposures are computed by averaging the population at the beginning and at the end of a specific time for a certain age group. The number of deaths and the population counts are extracted from the Department of Statistics, Malaysia. The “MortalityLaws” package by Pascariu [10], which was developed in the R software, is utilized to fit the mortality rate. For both men and women, the mortality rate is provided in five-year age groups ranging from 0 to 84 years old. The data set covers the years 1995 to 2018 and is used to fit the mortality rate in normal conditions. Table 1 shows a descriptive analysis of the data characteristics used in this study.
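As a minimal sketch of Equation 1 and the exposure calculation described above, the following Python snippet computes a mortality rate from a death count and the populations at the start and end of a year. The function names and the figures are illustrative assumptions, not values from the Department of Statistics Malaysia data, and the study itself performs this step in R.

```python
def exposure(pop_start: float, pop_end: float) -> float:
    """Approximate exposure as the average of start and end populations."""
    return (pop_start + pop_end) / 2.0


def mortality_rate(deaths: float, pop_start: float, pop_end: float) -> float:
    """Mortality rate: number of deaths divided by exposure."""
    e = exposure(pop_start, pop_end)
    if e <= 0:
        raise ValueError("exposure must be positive")
    return deaths / e


# Hypothetical figures for one age group in one year (not real data).
rate = mortality_rate(deaths=1200, pop_start=99000, pop_end=101000)
print(f"{rate:.4f}")  # 0.0120
```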
Table 2 describes the parametric mortality models to fit the Malaysian mortality rate. The HP model was developed by Heligman and Pollard [11], which consists of eight parameters with three terms: (i) the first term reflects the mortality rate during early childhood from 0 to 9 years old, (ii) the second term defines the adult mortality rate from 10 to 40 years old, and (iii) the third term illustrates the senescence mortality rate for ages over 40 years old. The Nigerian mortality rate was employed by Umar and Chukwudi [15] to investigate the performance of the HP and the Lee and Carter [16] models. Besides that, Silva et al. [17] applied the HP model to estimate life expectancy at birth in Mexico. Also, Kostaki [12] developed the KT model with nine parameters to eliminate a source of systematic error that affects the fit of the HP formula. The HP and KT models differ in the second term, which describes the spread of the accident hump to the left and right of its top [18]. The use of the KT model with mortality data [19–21] demonstrated that the cubic splines modification resulted in the smoothing process to capture the mortality variations over time easily.
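To make the three-term structure of the HP model concrete, the sketch below evaluates the commonly cited Heligman-Pollard form, in which the odds of death q_x / p_x are the sum of a childhood term, an accident-hump term, and a senescence term. The exact parameterisation listed in Table 2 may differ, and the parameter values used here are purely illustrative assumptions rather than fitted Malaysian estimates.

```python
import math


def heligman_pollard_qx(x, A, B, C, D, E, F, G, H):
    """Probability of death q_x from the Heligman-Pollard odds q_x / p_x."""
    # Early-childhood term: falls quickly as age increases.
    child = A ** ((x + B) ** C)
    # Accident-hump term; it vanishes as x -> 0 because log(x) -> -infinity.
    hump = D * math.exp(-E * (math.log(x) - math.log(F)) ** 2) if x > 0 else 0.0
    # Senescence term: grows geometrically with age.
    senescence = G * H ** x
    odds = child + hump + senescence
    return odds / (1.0 + odds)


# Illustrative parameter values only (not fitted to any data set).
params = dict(A=0.0005, B=0.01, C=0.10, D=0.0010, E=10.0, F=20.0,
              G=0.00005, H=1.10)
for age in (0, 20, 60, 80):
    print(age, round(heligman_pollard_qx(age, **params), 5))
```

With these illustrative values the senescence term dominates at older ages, so the computed probability at age 80 exceeds that at age 60.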
Furthermore, the WT model, which was introduced by Wittstein [13] and consists of four parameters with two terms, was applied to investigate the human mortality pattern [22]. The WT model is an alternative to the commonly used logistic function in fitting observed probabilities at the oldest ages [23]. The WT model was postulated to overcome the Gompertz-law anomaly by providing a model that works for many countries with minor flaws, such as lack of model fit for a particular age group [24]. In addition, Rogers and Planck [14] developed the RP model, a multiexponential model that comprises four terms and nine parameters, for modelling migration. The RP model consists of a constant, exponentially dropping child mortality, a double exponential accident hump, and Gompertzian senescent mortality [8].
3. Resampling Methods
This section briefly describes the resampling methods to select the parametric mortality model that best fits the Malaysian case for both men and women in normal circumstances. First, each model is fitted to acquire the parameters for their particular models for the entire study period by using the observed mortality rate expressed in Equation 1. Then, for each model and gender, a different loss function is applied to optimize the model (refer to Table 3). Refer to [10] for further information about the loss function. After acquiring all parameters for each model, hold-out, repeated hold-out, and leave-one-out cross-validation (LOOCV) methods are used to resample each parametric mortality model. Table 4 illustrates the goodness-of-fit measures in fitting the observed mortality data for each model, which are the mean absolute percentage error (MAPE), mean absolute error (MAE), sum square error (SSE), and mean square error (MSE).
Table 4 reveals that the HP model has the lowest values for all goodness-of-fit measures for the men’s mortality rate, followed by the WT, RP, and KT models. The lower the goodness-of-fit measures, the better the parametric mortality model. The bold values indicate the best mortality models according to their respective goodness-of-fit measures. All of the goodness-of-fit measures display a consistent result for each model’s performance when fitting the observed mortality rate. Based on Table 4, the HP model is the best-parametrized mortality model for fitting the observed mortality rate since the model has scored the best values for all measurements. Although the HP fits the men’s mortality rate well, it cannot fit the women’s mortality rate due to an overparameterization issue [25,26]. This issue is also applicable to our research study here. Therefore, Table 5 solely illustrates the goodness-of-fit measures for the KT, WT, and RP models in fitting the women’s mortality rate.
Table 5 shows that the RP model has scored the best value for all goodness-of-fit measurements. Note that the lower the goodness-of-fit measure, the better the parametric mortality model. Therefore, the RP model is the best fit for the women’s mortality rate. The ranking of the remaining models, however, depends on the measure: based on the MAPE values, the RP model is followed by the WT and KT models, whereas based on the MAE, SSE, and MSE values, it is followed by the KT and WT models. It is important to note that this first procedure fits the observed mortality rate to acquire the parameters for each model and gender. While other studies applied MAE and MAPE to determine the level of prediction in forecasting crude oil prices [27,28], this study utilizes the same accuracy measures to determine the best parametric model for forecasting the mortality rate. The next step is to apply the hold-out, repeated hold-out, and LOOCV resampling methods to forecast the parameters for each model and gender.
3.1. Hold-Out Methods
Hold-out methods, also known as out-of-sample methods, require two sets of data: training and testing. The training set is used to fit the model, whereas the testing set is used to evaluate the model’s forecasting performance [29]. In general, the training and testing sets are randomly divided depending on the number of samples. For instance, the training set contains 75% of the sample, while the testing set contains the remaining 25%. However, this varies according to the situation, such as 80% and 20% or 2/3 and 1/3 [30,31]. Because this study involves time-series data, the sample should be ordered chronologically [32]. Refer to Figure 1(a) for an example of how to divide the data. For each mortality model, the steps for applying the hold-out method are as follows:
(1) The data are separated into two sets, with the training set having a 2 : 1 ratio to the testing set. The training set spans the years 1995 through 2010, whereas the testing set spans the years 2011 through 2018.
(2) The testing set parameters are forecasted using the parameters from the training set.
(3) For all age groups, the parameters are fitted to acquire the forecasted mortality rate.
(4) The goodness-of-fit measurements are used to assess each mortality model’s forecasting performance for each age group.
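The chronological 2 : 1 split in step (1) can be sketched in a few lines of Python. This is a minimal illustration under the assumption that the observations are indexed by calendar year; the function name `holdout_split` is hypothetical and the study's own implementation is in R.

```python
def holdout_split(years, train_fraction=2 / 3):
    """Split a chronologically ordered list into training and testing sets."""
    ordered = sorted(years)
    n_train = round(len(ordered) * train_fraction)
    return ordered[:n_train], ordered[n_train:]


years = list(range(1995, 2019))          # 1995..2018, 24 observations
train, test = holdout_split(years)
print(train[0], train[-1], len(train))   # 1995 2010 16
print(test[0], test[-1], len(test))      # 2011 2018 8
```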

3.2. Repeated Hold-Out Methods
The second resampling method is the repeated hold-out, which follows a similar procedure to the hold-out but involves several iterations [33]. An infinite new sample with a certain Bayesian distribution can be generated using this method [34]. Unlike the hold-out method, this method selects the training and testing sets at random for each iteration. The repeated hold-out, as proposed by Bergmeir et al. [35], applies the standard procedure for time series data without any adjustments. Refer to Figure 1(b) for an example of how the repeated hold-out method is applied in the analysis. The repeated hold-out method follows a similar procedure but takes into account the sample’s peculiarities as follows:
(1) The sample is divided into two sets, with two-thirds of the sample serving as the training set and one-third serving as the testing set. The procedure is repeated for a number of iterations. For the first iteration, the training set corresponds to the years 1996, 1997, 1999, 2001, 2002, 2003, 2007, 2008, 2009, 2012, 2013, 2014, 2015, 2016, 2017, and 2018, whereas the testing set corresponds to the years 1995, 1998, 2000, 2004, 2005, 2006, 2010, and 2011.
(2) Since the data are in a random order, the parameters for the testing set are obtained using the package “imputeTS” [36].
(3) The parameters of the testing set are fitted to acquire the forecasted mortality rate for all age groups in the testing set.
(4) The goodness-of-fit measures for each age group are computed to evaluate the forecasting ability of each mortality model.
Using a loop in the R software, different training and testing sets are generated randomly for each iteration and model. The procedure is repeated for 100 iterations, as applied by Atance et al. [9].
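The repeated hold-out loop can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names are hypothetical, the random seed is arbitrary, and simple linear interpolation stands in for the “imputeTS” step, since the specific imputation routine used in the study is not stated here.

```python
import random


def repeated_holdout_splits(years, n_iterations=100, train_fraction=2 / 3, seed=0):
    """Yield (train, test) year lists, drawn at random for each iteration."""
    rng = random.Random(seed)                       # fixed seed for reproducibility
    n_train = round(len(years) * train_fraction)
    for _ in range(n_iterations):
        train = sorted(rng.sample(years, n_train))  # random training years
        held_in = set(train)
        test = [y for y in years if y not in held_in]
        yield train, test


def interpolate(known, year):
    """Linearly interpolate a parameter at `year` from a {year: value} dict.

    Years outside the known range take the nearest known value
    (an assumed stand-in for the imputation step).
    """
    if year in known:
        return known[year]
    ks = sorted(known)
    if year < ks[0]:
        return known[ks[0]]
    if year > ks[-1]:
        return known[ks[-1]]
    lo = max(k for k in ks if k < year)
    hi = min(k for k in ks if k > year)
    w = (year - lo) / (hi - lo)
    return known[lo] + w * (known[hi] - known[lo])


years = list(range(1995, 2019))
splits = list(repeated_holdout_splits(years))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 100 16 8
```

In each iteration, the parameters fitted on the training years would be passed to `interpolate` to obtain values for the held-out years before computing the goodness-of-fit measures.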
3.3. Leave-One-Out Cross-Validation
The third method is leave-one-out cross-validation (LOOCV), which is nearly identical to the first hold-out method but differs in the proportion of the training and testing sets. For a more detailed discussion of the procedure involved in this analysis, see [37]. This method has several advantages over other methods: (a) it minimizes sample bias because the training set is made up of n-1 observations, which covers almost the entire sample and (b) it selects the training and testing sets without involving randomness because almost all data are used for fitting and testing purposes [38]. The training set has a window with a defined period. For the following iteration, a new datum of the time series is added chronologically, also known as an “assessment on a rolling forecasting origin one-step-ahead” [39]. For a visual representation of how this method works, see Figure 1(c). The following are the steps for adapting this method:
(1) The first three years (1995, 1996, and 1997) are utilized as a training set in the first iteration because three is the least number of samples to fit the mortality model [40].
(2) The ARIMA model is used to forecast the parameter for the year 1998 based on the first three time-series data.
(3) The number of training data is raised by one to acquire the next forecasted parameter for the following iteration. Finally, the forecasted parameters are fitted to provide the forecasted mortality rate for each testing set’s age group.
(4) Goodness-of-fit measures are calculated for each age group to evaluate the forecasting ability of each mortality model.
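The expanding training window in steps (1) to (3) can be sketched as a small Python generator. It only enumerates the training and testing years for each iteration; the ARIMA forecast itself is omitted, and the function name `rolling_origin_folds` is a hypothetical label for illustration.

```python
def rolling_origin_folds(years, min_train=3):
    """Yield (train_years, test_year) pairs with an expanding training window."""
    ordered = sorted(years)
    for end in range(min_train, len(ordered)):
        # Train on everything before position `end`, test on the next year.
        yield ordered[:end], ordered[end]


folds = list(rolling_origin_folds(range(1995, 2019)))
print(folds[0])                          # ([1995, 1996, 1997], 1998)
print(folds[-1][0][-1], folds[-1][1])    # 2017 2018
print(len(folds))                        # 21
```

With 24 yearly observations and a minimum training window of three years, this scheme yields 21 one-step-ahead forecasts, for the years 1998 through 2018.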
4. Goodness-of-Fit Measures
This section presents the goodness-of-fit measurements for each model to assess its forecasting ability: (i) MAPE, (ii) MAE, (iii) SSE, and (iv) MSE.
Table 6 summarizes previous studies on goodness-of-fit measures for selecting the best model for fitting and forecasting the mortality rate in normal conditions. Equation 2 displays the MAPE formula, Equation 3 represents the MAE formula, Equation 4 illustrates the SSE formula, and Equation 5 shows the MSE for the hold-out method specifically.
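For reference, the four measures have standard textbook definitions, which the following Python sketch implements for a pair of observed and forecasted series. Whether the study reports MAPE as a percentage or as a fraction is an assumption here (a percentage is used), and the toy numbers are chosen only so that the results are easy to check by hand.

```python
def mape(observed, forecast):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * sum(abs((o - f) / o) for o, f in zip(observed, forecast)) / len(observed)


def mae(observed, forecast):
    """Mean absolute error."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


def sse(observed, forecast):
    """Sum of squared errors."""
    return sum((o - f) ** 2 for o, f in zip(observed, forecast))


def mse(observed, forecast):
    """Mean squared error."""
    return sse(observed, forecast) / len(observed)


# Toy values (not mortality data) chosen for easy hand-checking.
observed = [2.0, 4.0]
forecast = [1.0, 5.0]
print(mape(observed, forecast))  # 37.5
print(mae(observed, forecast))   # 1.0
print(sse(observed, forecast))   # 2.0
print(mse(observed, forecast))   # 1.0
```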
Meanwhile, Equation (6) indicates the goodness-of-fit measure for repeated hold-outs, while Equation (7) exhibits the goodness-of-fit measure for the LOOCVs.
The number of iterations involved in a repeated hold-out is represented by k. Since the procedure is repeated 100 times, k equals 100. On the other hand, n represents the number of observations in this study, which is 24. The forecasting performance of each mortality model is assessed using goodness-of-fit measures. Table 7 shows the goodness-of-fit measures for the men, while Table 8 shows the goodness-of-fit measures for the women.
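One plausible way to combine the measures across iterations, offered here as an assumption rather than as a reproduction of Equations (6) and (7), is to average a given measure over the k repeated hold-out iterations. Writing $G_i$ for any one of MAPE, MAE, SSE, or MSE computed in iteration $i$:

```latex
\overline{G}_{\mathrm{RH}} = \frac{1}{k} \sum_{i=1}^{k} G_i , \qquad k = 100 .
```

An analogous average over the one-step-ahead forecasts would apply to LOOCV, although the exact form used in the study may differ.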
Table 7 presents the goodness-of-fit measures for each parametrized mortality model and resampling method for the men’s mortality data. In terms of resampling methodologies, the repeated hold-out method favours the KT model for men’s mortality, whereas the hold-out method favours the HP, WT, and RP models. Note that the HP model is the best mortality model for fitting the men’s mortality data, and the hold-out method is the best resampling method for the men’s data. Although the RP model has scored the lowest values for all goodness-of-fit measurements of the hold-out resampling method, the HP model has scored the lowest values for three goodness-of-fit measurements. Because the HP model has the lowest values for MAE, SSE, and MSE under the hold-out resampling method, it is selected as the optimal model for the men’s mortality data.
Table 8 illustrates the goodness-of-fit measures in fitting the women’s mortality rate. The hold-out resampling method is the best in fitting the women’s data based on the KT and RP models. However, when it comes to fitting the WT model, the repeated hold-out resampling method is the best. Note that the RP model fits the women’s mortality data well, and the hold-out is the best resampling method for the women’s data. Overall, the HP model for the men’s mortality rate and the RP model for the women’s mortality rate are the best models for fitting the mortality rate. Although the repeated hold-out is the best resampling method for the KT model with men’s data and the WT model with women’s data, the hold-out is the best resampling method in all other cases.
5. Forecasting Mortality under Normal Conditions and COVID-19
The HP model is applied to the men’s data, and the RP model is applied to the women’s data, using the hold-out resampling method to forecast the mortality rate under normal conditions and to account for the COVID-19 mortality data. The mortality rate under normal conditions is forecasted from 2019 to 2030 without considering the COVID-19 mortality data. Then, the COVID-19 mortality data for 2019 and 2020 are included to predict the mortality rate until 2030. The excess mortality rate is determined by comparing the mortality forecast under normal conditions to the mortality forecast using the COVID-19 data. Figure 2 depicts the mortality rate for men aged 60 years and over. The straight line represents the observed mortality rate from 1995 until 2020, while the blue dotted line indicates the fitted mortality rate under normal conditions from 1995 until 2030. The red dotted line indicates the forecasted mortality rate when the COVID-19 mortality data are considered. When the mortality rate for people aged 60 and over is forecasted under normal conditions, it shows a decreasing pattern after 2020. However, when the COVID-19 data for 2019 and 2020 are incorporated in the forecasting, the mortality rate begins to rise in 2023 and continues to rise after that. Forecasting with the COVID-19 data thus indicates an aftereffect on the mortality rate of the male population aged 60 and over.
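The comparison described above amounts to a year-by-year difference between the two forecasts. A minimal Python sketch follows, assuming each forecast is stored as a mapping from year to rate; the function name and all numbers are hypothetical and are not results from this study.

```python
def excess_mortality(baseline, with_covid):
    """Excess rate per year: forecast with COVID-19 data minus baseline forecast."""
    return {year: with_covid[year] - baseline[year]
            for year in baseline if year in with_covid}


# Hypothetical forecasts (deaths per 1,000), not results from this study.
baseline = {2021: 20.0, 2022: 19.5, 2023: 19.0}
with_covid = {2021: 20.0, 2022: 19.5, 2023: 19.5}
excess = excess_mortality(baseline, with_covid)
print(excess)  # {2021: 0.0, 2022: 0.0, 2023: 0.5}
```

A positive value for a given year indicates that the forecast incorporating the COVID-19 data lies above the forecast under normal conditions.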

[Figure 2: Mortality rate for men aged 60 years and over, panels (a) to (e).]
Figure 3 displays the mortality rate of women for the age group of 60 years and over. Based on the observed mortality rate, all age groups showed a decreasing pattern. However, when the mortality rate is forecasted, considering the COVID-19 mortality data, the mortality rate indicates a slight increase in 2023 and then a reduction after that. The women’s mortality rate increases with age, but not as much as the men’s mortality rate. Compared to the women’s mortality rates, the men’s mortality rates have a delayed effect but last longer. This is most likely related to the fact that women have a higher chance of outliving men.

[Figure 3: Mortality rate for women aged 60 years and over, panels (a) to (e).]
6. Discussion
As the mortality rates for the elderly population of 60 years and over improve, the ageing population becomes a major concern [54, 55]. The mortality rate of an ageing population has been steadily increasing in Malaysia from 1950 to 2015 [56]. Furthermore, the number of people aged 60 and over in Malaysia is predicted to increase, resulting in an ageing population. However, when the mortality rate is forecasted using the COVID-19 mortality data for 2019 and 2020, the men’s mortality rate displays a delayed effect, with the trend beginning to increase in 2023. After 2023, the mortality rate increases significantly across all age groups for men aged 60 and over.
Furthermore, when the COVID-19 mortality data are considered, the women’s mortality rate also shows an impact, although it does not last as long as the impact on the men’s mortality rate. COVID-19 therefore appears to have a greater impact on men than on women. The women’s mortality rate is affected only in the short term, until 2023, after which it returns to the stable trend forecasted under normal conditions. The impact of COVID-19 on the mortality rate grows as age increases. The result is consistent with the studies in [7, 57], which demonstrated an excess mortality rate, particularly in countries with a large population of people aged 65 years and above during the COVID-19 pandemic. The excess mortality increased with age above 70 years and correlated in time with the COVID-19 reported mortality rate time series in Italy [58].
Based on the forecasted mortality rates for men and women, the excess mortality rate only takes place after 2023. This is because the mortality rate shows a decreasing pattern and then increases after 2023. It can be concluded that Malaysia’s population is ageing to the point that mortality due to confirmed COVID-19 deaths did not have an immediate impact when the COVID-19 outbreak began in 2020. This is most likely because the growth of the ageing population results from the tremendous achievements of public health policies and social and economic development in Malaysia [59].
Moreover, the mortality rates were higher for ages above 50 due to the pandemic in Spain, but there was no such evidence in the Czech Republic [7]. The empirical result suggests that COVID-19 mortality has a delayed and longer effect on the men’s mortality rate. On the other hand, COVID-19 has a quick and short effect on the mortality rate of women. Although the forecasting only considers the mortality data for two years due to data availability, the results could be used as a benchmark to predict the post-COVID-19 mortality rate for future planning.
7. Conclusion
This study determines the excess mortality rate for particular age groups during the COVID-19 pandemic in Malaysia. In order to determine the excess mortality rate, this study employed various parametrized mortality models to forecast the mortality rate under normal conditions, including the Heligman-Pollard, Kostaki, Wittstein, and Rogers Planck models. The data set involved in this study ranges from the year 1995 to 2018. Furthermore, this study identifies the optimal mortality law for fitting the mortality data utilising multiple resampling methods, namely hold-out, repeated hold-out, and leave-one-out cross-validation. The optimal model for the men’s and women’s mortality rates is determined using a variety of goodness-of-fit measures (mean absolute percentage error, mean absolute error, sum square error, and mean square error). The Heligman-Pollard model for men and the Rogers Planck model for women are the optimal models based on goodness-of-fit measures. Both models favour the hold-out technique as the best resampling method. Although the study uses yearly data, no outliers were detected on inspecting the boxplot. The post-COVID-19 mortality rate is projected for ten years up to 2030 using the COVID-19 mortality data for 2019 and 2020. The empirical results revealed that COVID-19 in Malaysia is associated with an excess mortality rate, particularly among those aged 60 and over. Based on the forecasted mortality rate, the effect on the men’s mortality rate appears to be delayed and longer than the effect on the women’s mortality rate. This is most likely due to the tendency of women to live longer than men. In conclusion, this study recommends amending the existing policy to reflect the post-COVID-19 mortality forecast. Furthermore, the mortality rate suggests that excess mortality is caused not just by the illness itself but also by other factors, including psychological matters such as suicide and delays in health treatment.
Data Availability
Data are available on request. The data are collected from publicly available sources except for the ones that have been mentioned otherwise. The raw data supporting the conclusions of this manuscript will be made available by the authors without undue reservation.
Conflicts of Interest
There are no conflicts of interest related to this work.
Authors’ Contributions
Robiaatul Adawiah Edrus contributed to literature review, data collection, data analysis, research framework, methodology, validation, and initial draft. Zailan Siri performed conceptualization, validation, methodology, formal analysis, and supervision. Mohd Azmi Haron performed conceptualization, validation, methodology, formal analysis, review, editing, and supervision and provided resources. Muhammad Aslam Mohd Safari contributed to conceptualization, validation, methodology, formal analysis, initial draft, and supervision and provided the software. Mohammed K. A. Kaabar contributed to validation, methodology, supervision, initial draft, and editing and provided resources. All authors read and approved the final version.