Abstract
The current study examines the interseasonal characteristics of meteorological drought. For this purpose, a new comprehensive framework is proposed. The framework consists of two major stages. In the first stage, the K-means method is used to identify homogeneous clusters, and Monte Carlo feature selection (MCFS) is applied to select the more important stations from the clusters. In the second stage, the standardized precipitation index at a three-month time scale (SPI-3), the conditional fixed effect binary logistic regression model (CFEBLRM), and the random effect binary logistic regression model (REBLRM) are used. The significance of CFEBLRM and REBLRM is measured by log-likelihood values, the log-likelihood ratio chi-square test (LRCST), the Wald chi-square test (WCT), and p-values. The Hausman test (HT) is applied to detect endogeneity and to select the appropriate model between CFEBLRM and REBLRM. The results from the proposed framework indicate that drought persistence in the summer to autumn and autumn to winter seasons is between 90 and 99 percent. The odds ratio of CFEBLRM for the summer-autumn season indicates that an increase in precipitation decreases drought persistence in the autumn season. The results of the current study help decision-makers to better understand the effects of meteorological drought occurrences and to improve strategies for mitigating drought effects and managing seasonal crops in the Punjab province of Pakistan.
1. Introduction
A drought is defined as an extended period of precipitation shortage that exceeds the normal range, resulting in sustained insufficiencies in the availability of atmospheric, surface, or groundwater resources [1]. Drought occurs when an extended period of extremely dry circumstances results in a major shortage in water availability, either due to a severe lack of rainfall or an unexpectedly lower precipitation amount than projected [2]. Global warming is becoming more severe on a worldwide scale. Several locations around the world have experienced rising temperatures and less precipitation in recent decades [3, 4], leading to an increase in severe drought occurrences. This warming trend is anticipated to worsen, resulting in even more adverse conditions. The impact of global warming extends to weather phenomena, changes in climate patterns, and the availability of water resources. Climate change directly affects agriculture and water resources [5, 6], which means that droughts have a significant impact on the availability of food and economic stability. For instance, in South Asia, drought has affected nearly 2 billion people, and approximately seven to eight billion dollars have been spent since 1990 to lessen the impact of droughts [7]. Droughts’ effects on society and the economy are expected to increase in the future. This is due to rising temperatures and increased human demand for water [8]. The ability to forecast and detect early indications of drought is vital for efficient management and strategic planning of agricultural resources before the drought begins [9].
Drought is generally classified into four major types: meteorological drought (insufficient precipitation), hydrological drought (reduced groundwater and water flow), agricultural drought (insufficient soil moisture), and socioeconomic drought (a gap between water demand and supply) [3, 4, 10, 11]. Among these types, meteorological drought is the most important because its severity directly affects the presence of surface water and moisture in the soil [12, 13]. Meteorological drought arises from a shortage of precipitation, defined as a divergence from ordinary meteorological circumstances that causes the earth’s surface to dry out [11, 14]. Meteorological drought is one of the primary natural hazards. If it persists for a prolonged period, it can bring about other kinds of droughts that have long-lasting effects on ecosystems, the environment, agriculture, and water reservoirs [2, 15, 16]. Over the last 30 years, scientific studies have observed a gradual rise in the average global temperature of the Earth [17, 18]. Multiple studies have linked various climate change scenarios to an observed increase in temperature and greater fluctuations in precipitation [19, 20]. Meteorological drought and agricultural drought are interconnected through various aspects, such as the scarcity of rainfall, deviation from normal meteorological factors like evapotranspiration, insufficiency of soil moisture, and a decline in groundwater levels [21]. When a meteorological drought persists for an extended period, it can cause an agricultural drought because the lack of precipitation leads to a lack of soil moisture, resulting in lower crop production [1, 22, 23]. In addition to impacting crops, droughts also have adverse effects on orchards, forests, and the overall environment.
Hence, droughts are recognized as significant obstacles to achieving optimal agricultural growth and ensuring food security [24].
Further, meteorological drought can cause socioeconomic shifts that might lead to hunger, driving migration and causing large refugee crises [1, 25]. The impact of meteorological droughts may become more severe in the future, as the climate changes due to increased greenhouse gas concentrations in the atmosphere [26]. An extended duration of insufficient soil moisture is closely associated with hydrological drought, which prolongs water shortage [4, 27, 28]. Hence, the other types of droughts are connected to meteorological drought, and therefore, monitoring and evaluating meteorological drought are the primary steps towards improving the performance of operational drought monitoring systems. Moreover, the impact of drought can be managed or reduced by using decision support systems to measure the physical attributes of droughts [29]. Several researchers and climatologists worldwide have used drought indices to investigate meteorological drought and to inform the drought community in developing drought policies and conducting risk management [26, 30–32]. Drought indices are particularly useful tools and play an important role in monitoring droughts and providing a comprehensive overview of drought conditions [33–39]. The Standardized Precipitation Index (SPI) is widely used to assess meteorological drought because it is simple to calculate and can assess drought at different time scales [24, 30, 40–46]. Given the importance of SPI for monitoring meteorological drought, the current research employs SPI for monitoring and modeling the spatiotemporal interseasonal characteristics of meteorological drought in selected stations.
Comprehensive monitoring and modeling of interseasonal drought characteristics are required to reduce the potential negative impacts of drought. Therefore, the current research provides a comprehensive framework to monitor and model the spatiotemporal interseasonal characteristics of meteorological drought in selected stations. For this purpose, we first use the K-means method to identify homogeneous clusters. Monte Carlo feature selection is then used to select the more important stations from the clusters. Further, the standardized precipitation index at a three-month time scale (SPI-3), the conditional fixed effect binary logistic regression model (CFEBLRM), and the random effect binary logistic regression model (REBLRM) are employed. The significance of CFEBLRM and REBLRM is assessed by log-likelihood values, the log-likelihood ratio chi-square test (LRCST), the Wald chi-square test (WCT), and p-values. The Hausman test (HT) is used to detect endogeneity and to select the appropriate model between CFEBLRM and REBLRM. The results of the present study can enable decision-makers to better identify the effects of meteorological drought occurrences and to improve policies for mitigating drought effects in the province of Punjab in Pakistan.
2. Materials and Methods
2.1. Description of the Study Area
The study focuses on the stations of Punjab, the province located in the central eastern region of Pakistan with an area of about 205,344 square kilometers (79,284 square miles) (Figure 1). Punjab's landscape consists mostly of plains, and it is among the most heavily irrigated regions on earth; there are several mountain regions, including the Sulaiman Mountains in the southwest part of the province and the Margalla Hills in the north. The Indus and its major tributaries, the Ravi, Jhelum, Chenab, and Sutlej, flow through it. Punjab is the second largest province of Pakistan. It uses more fertilizers than the other provinces and is the most industrialized province, with its industrial sectors contributing 24% of GDP. Owing to these properties, the Punjab region is vital for the other regions, and it has the lowest rate of poverty among all the regions of Pakistan. Data from 24 stations are chosen as a representative sample of the climate in the study area. Punjab is facing severe water scarcity due to the increasing demand for water resources from population livelihood and the agricultural, energy, and industrial sectors; therefore, Punjab province has become a drought-prone region. Climate change has caused significant disturbances across the entire globe. As a result, climate change scenarios such as an increase in temperature and variability in precipitation threaten the sustainability of Punjab and have a negative effect on its economic and agricultural sectors [47]. Therefore, it is important to monitor drought characteristics by obtaining information about annual drought frequency for the selected stations.
2.2. Data and Methods
We retrieved monthly meteorological time series covering about 40 years, from January 1981 to December 2021, for 24 stations of the Punjab province of Pakistan. These stations have been chosen because of their significant climatological attributes, which play a crucial role in accurately defining and observing the risks associated with droughts. We have included the following variables: precipitation (mm/day), temperature (2 m), wind, profile soil moisture, and dew, which are suitable for the analysis of meteorological drought. The data are collected from the NASA POWER Data Access Viewer (https://power.larc.nasa.gov/data-access-viewer/). The viewer also contains the POWER Global Download widget, which gives access to climatology for the entire globe. The climate in Pakistan changes from season to season. Hence, it is important to recognize these interseasonal fluctuations in climatic characteristics [48]. Therefore, we arrange the data in four seasons according to their homogeneous characteristics: winter (Dec to Feb), spring (March to May), summer (Jun to Aug), and autumn (Sep to Nov) for interseasonal drought characteristics. For the current analysis, the SPI is used at the three-month scale (SPI-3). The dependent variable is binary, where 1 indicates the persistence of the drought in the current season. The entire globe is significantly affected by climate change, which disturbs the balance of nature. A single drought can cost the country millions of dollars, damaging the whole economy with effects that persist for several years. Meteorological drought can affect agriculture, food production, and human health, limit worker productivity, and increase mortality; even a single drought brings considerable risk. The prediction and early signs of drought are particularly important for the management and planning of agricultural resources before the onset of drought.
Therefore, it is important to monitor and model interseasonal drought characteristics in selected stations (Figure 2).
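The seasonal arrangement and the binary dependent variable described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the rule that a season counts as drought when SPI-3 is at or below zero, as well as the pairing of one previous-season value with one current-season value, are assumptions made for the example.

```python
# Month-to-season mapping taken from the text: winter (Dec-Feb),
# spring (Mar-May), summer (Jun-Aug), autumn (Sep-Nov).
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def season_of(month):
    """Return the season name for a calendar month number (1-12)."""
    return SEASON_OF_MONTH[month]


def is_drought(spi3, threshold=0.0):
    """Assumed rule: drought when the SPI-3 value is at or below the threshold."""
    return spi3 <= threshold


def persistence_indicator(prev_spi3, curr_spi3):
    """Return 1 if a drought in the previous season persists into the
    current season, and 0 otherwise (hypothetical pairing of values)."""
    return int(is_drought(prev_spi3) and is_drought(curr_spi3))


print(season_of(7), persistence_indicator(-1.2, -0.3))
```

The indicator returned by `persistence_indicator` plays the role of the binary response in the panel logistic models introduced later.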
2.2.1. K-Means Clustering
Clustering is an unsupervised machine-learning technique that allows one to find patterns and identify groups of similar observations in multivariate data to extract relevant information. Thus, clustering can be very helpful for forming groups of similar objects [49]. Clustering aims to partition data into homogeneous groups to reduce heterogeneity, such that the data in the same cluster are more like each other than the data in other clusters [50]. Different datasets require different clustering methods chosen from different categories of clustering (e.g., partitioning methods, density-based clustering, and hierarchical clustering) [49, 51]. Within these categories, K-means is one of the most widely used clustering algorithms due to its ease of implementation, simplicity, efficiency, and empirical success [50, 52]. Here, K-means is a suitable method for clustering the precipitation data. K-means is a distance-based clustering method that divides the data into K clusters with similar characteristics by minimizing the within-cluster sum of squares and maximizing the between-cluster sum of squares [53, 54]. The K-means algorithm starts with a choice of K, the number of clusters to be formed [54]. It randomly initializes the cluster centroids and assigns each data point to the nearest cluster centroid based on the Euclidean distance between the data points $\{x_i\}$ and the centroids $\{c_j\}$. The distance between a data point and a centroid, each with $p$ coordinates, is defined as

$$d(x_i, c_j) = \sqrt{\sum_{l=1}^{p} (x_{il} - c_{jl})^2}.$$

The objective is to minimize the sum of squared errors between the data points and their respective cluster centroids, defined as

$$J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - c_j \rVert^2,$$

where $x_i$ is the data point, $c_j$ is the cluster centroid, and $C_j$ is the set of points assigned to cluster $j$. In the current study, eight clusters are formed for twenty-four locations, and the locations are assigned to these clusters by the same procedure.
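The assignment and update steps of K-means described in this subsection can be sketched in a few lines. The sketch below is illustrative only: it uses a small hypothetical two-dimensional dataset rather than the station precipitation data, the function names are assumptions, and the study itself does not state which software was used for clustering.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain K-means (Lloyd's algorithm) on a list of tuples.

    Returns cluster labels, centroids, and the within-cluster
    sum of squared errors.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    sse = sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, sse


# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
labels, centroids, sse = kmeans(points, k=2)
print(labels, sse)
```

Because the initial centroids are random, K-means can converge to different local minima of the objective; in practice several random starts are usually compared and the solution with the smallest sum of squared errors is kept.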
2.2.2. Monte Carlo Feature Selection
Monte Carlo feature selection (MCFS) is an algorithm that ranks each attribute or feature in terms of its relative importance in high-dimensional data problems [55–57]. Recently, the authors in [58] have used MCFS to select important stations in the northern region for their analysis. To calculate the relative importance values, we start from a set of d features, from which s subsets of m features are chosen (with m being fixed and smaller than d) (Figure 3). For each feature subset, t trees are created. In the inner loop, each of these t trees is trained on a random 66% of the samples and tested on the remaining 34%. Overall, $s \cdot t$ trees are constructed and evaluated. Both s and t are set to be large to ensure that each feature has opportunities to appear in various feature subsets.
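To determine relative importance, let us first introduce the weighted accuracy, which is denoted as

$$wAcc = \frac{1}{c} \sum_{i=1}^{c} \frac{n_{ii}}{n_{i1} + n_{i2} + \cdots + n_{ic}},$$

where $wAcc$ is the weighted accuracy, $n_{ii}/(n_{i1} + n_{i2} + \cdots + n_{ic})$ is the true positive rate of class $i$, and $n_{ij}$ denotes the number of samples from class $i$ classified as those from class $j$; clearly, $i, j = 1, 2, \ldots, c$ and $\sum_{i,j} n_{ij} = n$.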
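The weighted accuracy is simply the mean of the per-class true positive rates, which can be computed directly from a confusion matrix. The following is a small sketch with a hypothetical two-class confusion matrix; the function name and the numbers are illustrative assumptions.

```python
def weighted_accuracy(confusion):
    """Mean of per-class true positive rates.

    confusion[i][j] is the number of samples from class i that were
    classified as class j. A class with no samples contributes zero.
    """
    c = len(confusion)
    total = 0.0
    for i, row in enumerate(confusion):
        row_sum = sum(row)
        if row_sum > 0:
            total += row[i] / row_sum
    return total / c


# Hypothetical confusion matrix: class 0 has 8 of 10 correct,
# class 1 has 9 of 10 correct.
example = [[8, 2],
           [1, 9]]
print(weighted_accuracy(example))
```

Unlike plain accuracy, this measure gives every class the same weight, so a classifier cannot score well merely by favoring the majority class.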
Here, the relative importance of feature $g_k$ can be defined as

$$RI_{g_k} = \sum_{\tau=1}^{s \cdot t} (wAcc)^{u} \sum_{n_{g_k}(\tau)} GR\big(n_{g_k}(\tau)\big) \left( \frac{\text{no. in } n_{g_k}(\tau)}{\text{no. in } \tau} \right)^{v},$$

where $GR(n_{g_k}(\tau))$ denotes the gain ratio for the tree node $n_{g_k}(\tau)$, no. in $n_{g_k}(\tau)$ is the number of samples in the node $n_{g_k}(\tau)$, and no. in $\tau$ is the number of samples in the root of tree $\tau$. The values of the three parameters m, s, and t were prespecified by a practitioner, and we set $u = v = 1$. In the current study, the Punjab province of Pakistan is divided into eight clusters by the K-means method. The selected stations in each cluster provide a homogeneous pattern of meteorological drought. We use MCFS to select the important stations in the different clusters for analysis; on the basis of this method, we select from each cluster the station with the largest relative importance value compared to the other stations [56]. For instance, Pakpattan is selected as the important station for cluster 1, and Gujranwala is selected as the informative station for cluster 2. In this way, MCFS selects the informative stations for our analysis.
2.2.3. Standardized Precipitation Index
The Standardized Precipitation Index (SPI) method introduced by McKee et al. [40] is used in the current study to evaluate and track the drought condition. The SPI was developed to measure the extent of precipitation deficiency across various time frames, such as one, three, six, nine, and twelve months. These time scales help in determining the impact of drought on the availability of different water resources. Each time scale represents a distinct aspect of drought, where shorter periods assess shorter-term drought events, and longer periods provide insights into longer-term drought patterns [59]. The SPI allows for a comparison of precipitation amounts over a specified period with the historical precipitation totals for that same period across all available years [24, 30, 41, 43, 46]. To compute SPI, a long-term precipitation record of at least 30 years is required [60]. The SPI has several distinguishing features, including simplicity, spatial consistency, a probabilistic nature, and a capacity to represent droughts across both spatial and temporal dimensions. Further, the SPI is relatively easy to calculate in comparison to other indices, and it is helpful in providing early warnings for drought events and in reducing drought damage [1, 18, 58, 61, 62]. The process of SPI analysis involves transforming the rainfall data into a normal distribution using the gamma probability distribution [1]. The SPI can also be calculated by using the marginal probability of precipitation formula instead of the gamma distribution function [42, 63–65]. Following this approach, the current research uses the formula proposed by [42] for standardizing the precipitation data. Various probability distributions are used for standardizing the precipitation data. A detailed discussion of the calculation of SPI is given in [66, 67]. The SPI provides a clear classification of different levels of drought severity.
When the SPI value falls below −1.5, the drought is deemed severe, and when it falls below −2, it is considered extreme. As a probability-based index, the intensity of a precipitation event in the SPI is determined relative to the typical rainfall patterns of a specific area. Calculating the SPI requires a long-term record of precipitation data [18]. This research paper uses the SPI-3, which corresponds to a three-month time scale typically used for evaluating medium-term drought conditions [3]. The SPI-3 is computed in R using the propagate package. SPI-3 helps smooth out short-term fluctuations and provides a better understanding of medium-term trends.
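The general idea behind a three-month standardized index can be illustrated with a short sketch: accumulate precipitation over a three-month window, convert each total to a cumulative probability, and map that probability to a standard normal quantile. The sketch below is an assumption-laden illustration rather than the procedure used in the study: it uses an empirical Gringorten plotting position instead of a fitted distribution selected by BIC, it ranks all totals together rather than by calendar month, and the function names and the sample data are hypothetical.

```python
from statistics import NormalDist


def rolling_sum(values, window=3):
    """Running totals over `window` consecutive months."""
    return [
        sum(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def empirical_spi(totals):
    """Standardize totals through an empirical probability.

    Each total receives the Gringorten plotting position
    p = (rank - 0.44) / (n + 0.12), which is then mapped to a
    standard normal quantile. Tied values get consecutive ranks.
    """
    n = len(totals)
    order = sorted(range(n), key=lambda i: totals[i])
    std_normal = NormalDist()
    spi = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        p = (rank - 0.44) / (n + 0.12)
        spi[idx] = std_normal.inv_cdf(p)
    return spi


def severity(spi_value):
    """Severity classes named in the text; 'other' is a placeholder label."""
    if spi_value < -2.0:
        return "extreme"
    if spi_value < -1.5:
        return "severe"
    return "other"


# Hypothetical monthly precipitation series.
monthly = [10, 0, 5, 20, 30, 1, 2]
totals = rolling_sum(monthly, window=3)
spi3 = empirical_spi(totals)
print(totals, [round(v, 2) for v in spi3])
```

With a real record of several decades, the same steps would normally be applied separately for each ending month so that each total is compared only with totals from the same part of the year.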
2.2.4. Panel Binary Logistic Regression Model
In panel data, observations are collected over time for multiple individuals or groups [68–70]. The study of binary choices of individuals is prevalent in the literature [71, 72]. When the responses are observed spatiotemporally, models that account for both the spatial and temporal structure are preferred; in particular, the use of panel data models for binary responses is now widespread [67, 73–76].
The binary panel model is given as

$$y_{it} = \alpha_i + x_{it}^{\top}\beta + \varepsilon_{it},$$

where $y_{it}$ is the binary dependent variable for individual $i$ at time $t$, $x_{it}$ is a vector of independent variables for individual $i$ at time $t$, $\beta$ is a vector of coefficients associated with each independent variable, and $\alpha_i$ is the individual-specific intercept, which captures the unobserved heterogeneity and represents the baseline log-odds of the dependent variable (drought = 1) for each group. The coefficients $\beta$ represent how the log-odds change with respect to the independent variables. $\varepsilon_{it}$ is called an idiosyncratic error because these errors vary across $i$ as well as across $t$. Ideally, observations are correlated within a group but uncorrelated across groups. The logistic model assumes a linear relationship between the log-odds of the probability of being 1 (success) and the independent variables; $x_{it}^{\top}\beta$ is the linear combination of the independent variables and their corresponding coefficients, also known as the log-odds. In the panel context, both fixed and random logit models are extensions of the standard logit model that consider the presence of individual-specific or group-specific effects. These models are commonly used when analyzing panel data, where observations are grouped into different entities (e.g., locations) and are observed over multiple time periods. The key difference between fixed and random logit models lies in how they treat these individual-specific or group-specific effects. The fixed effect model is used when $\alpha_i$ and $x_{it}$ are correlated, whereas the random effect model is used when $\alpha_i$ and $x_{it}$ are not correlated.
2.2.5. Conditional Fixed Effect Binary Logistic Regression Model (CFEBLRM)
In the CFEBLRM, individual-specific effects are treated as fixed parameters. It assumes that these effects are constant over time for each individual or group in the population. Essentially, the fixed logit model estimates separate intercepts for each individual or group, but these intercepts are not assumed to follow any underlying distribution. This model is also known as the “within-subjects” or “entity-specific” model. The conditional fixed effect binary logistic regression model gives more consistent estimates than the unconditional fixed effect binary logistic regression model. The most common nonlinear function is the logistic function.
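The probability of the binary response in the nonlinear model is

$$P(y_{it} = 1 \mid x_{it}, \alpha_i) = \Lambda(\alpha_i + x_{it}^{\top}\beta) = \frac{\exp(\alpha_i + x_{it}^{\top}\beta)}{1 + \exp(\alpha_i + x_{it}^{\top}\beta)},$$

where $P(y_{it} = 1 \mid x_{it}, \alpha_i)$ is the probability that the response is 1 (drought persistence) rather than 0 (no drought persistence) given the values of the independent variables $x_{it}$, $x_{it}^{\top}\beta$ is the linear combination of the independent variables and their corresponding coefficients, also known as the log-odds, and $\Lambda(\cdot)$ is the cumulative distribution function of the logistic variable, with range zero to one.

The conditional probability for the response variable is given as

$$P\Big(y_{i1}, \ldots, y_{iT} \,\Big|\, \sum_{t=1}^{T} y_{it}\Big) = \frac{\exp\big(\sum_{t=1}^{T} y_{it}\, x_{it}^{\top}\beta\big)}{\sum_{d \in B_i} \exp\big(\sum_{t=1}^{T} d_t\, x_{it}^{\top}\beta\big)},$$

where $B_i = \{ d \in \{0, 1\}^{T} : \sum_{t} d_t = \sum_{t} y_{it} \}$ is the set of all response sequences with the same number of ones as observed for individual $i$.

The conditional probabilities for $T = 2$ are

$$P(y_{i1} = 0, y_{i2} = 1 \mid y_{i1} + y_{i2} = 1) = \frac{\exp(x_{i2}^{\top}\beta)}{\exp(x_{i1}^{\top}\beta) + \exp(x_{i2}^{\top}\beta)}, \qquad P(y_{i1} = 1, y_{i2} = 0 \mid y_{i1} + y_{i2} = 1) = \frac{\exp(x_{i1}^{\top}\beta)}{\exp(x_{i1}^{\top}\beta) + \exp(x_{i2}^{\top}\beta)}.$$

The unobserved $\alpha_i$ in the model makes it difficult to estimate the parameters of the logistic model. Conditioning on the minimal sufficient statistic for $\alpha_i$, namely $\sum_{t} y_{it}$, eliminates $\alpha_i$ from the estimating equation. The parameters of the model are then estimated by maximizing the conditional log-likelihood.

The conditional log-likelihood function is

$$\ln L_c(\beta) = \sum_{i=1}^{N} \ln P\Big(y_{i1}, \ldots, y_{iT} \,\Big|\, \sum_{t=1}^{T} y_{it}\Big).$$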
A low p-value of the Hausman test indicates endogeneity in the model, and in such cases the CFEBLRM is recommended because it is less susceptible to endogeneity concerns and can help mitigate the impact of endogeneity and omitted variable bias.
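The reason conditioning removes the individual-specific effect can be checked numerically for two time periods. The sketch below computes the probability of the response sequence (0, 1) given that exactly one of the two responses equals 1, once from the full logit model with an individual effect and once from the closed form that contains no individual effect. The function names and the input values are hypothetical and serve only to illustrate the cancellation.

```python
import math


def logistic(z):
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def joint_prob(y, xb, alpha):
    """Joint probability of a response sequence under independent
    logit responses with individual effect alpha.

    y is a tuple of 0/1 responses; xb holds the linear predictors
    (without alpha) for each period.
    """
    p = 1.0
    for y_t, xb_t in zip(y, xb):
        p_t = logistic(alpha + xb_t)
        p *= p_t if y_t == 1 else 1.0 - p_t
    return p


def conditional_prob_01(xb, alpha):
    """P(y = (0, 1) | y_1 + y_2 = 1), built from the joint probabilities."""
    p01 = joint_prob((0, 1), xb, alpha)
    p10 = joint_prob((1, 0), xb, alpha)
    return p01 / (p01 + p10)


def closed_form_01(xb):
    """Same conditional probability written without alpha."""
    return math.exp(xb[1]) / (math.exp(xb[0]) + math.exp(xb[1]))


xb = (0.3, -0.8)  # hypothetical linear predictors for periods 1 and 2
for alpha in (-2.0, 0.0, 3.0):
    print(alpha, conditional_prob_01(xb, alpha), closed_form_01(xb))
```

The printed conditional probability is the same for every value of `alpha`, which is why the conditional likelihood can be maximized over the coefficients alone. Groups whose responses are all 0 or all 1 carry no information in this likelihood and drop out of the estimation.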
2.2.6. Random Effect Binary Logistic Regression Model (REBLRM)
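The model is also called the variance components model. The REBLRM is used when there is no endogeneity in the model, meaning that the independent variables and the error term are not correlated. It also controls for the unobserved heterogeneity between the groups in the model. The REBLRM is defined as

$$P(y_{it} = 1 \mid x_{it}, \alpha_i) = \Lambda(\alpha_i + x_{it}^{\top}\beta), \qquad \alpha_i \sim N(0, \sigma_{\alpha}^{2}).$$

Here, $\alpha_i$ is the random individual-specific effect, which is specified as Gaussian distributed.

With $M$ denoting the binomial denominator of the binomial logistic model ($M = 1$ for a Bernoulli response) and $\mu$ the success probability, the probability function in exponential family form is

$$f(y; \mu) = \exp\left\{ y \ln\frac{\mu}{1 - \mu} + M \ln(1 - \mu) + \ln\binom{M}{y} \right\}.$$

For estimating the parameters of the random effect binary logistic model, the log-likelihood function for the Bernoulli model, conditional on the random effects, is given as

$$\ell(\beta \mid \alpha) = \sum_{i=1}^{N} \sum_{t=1}^{T} \Big\{ y_{it}\,(\alpha_i + x_{it}^{\top}\beta) - \ln\big[ 1 + \exp(\alpha_i + x_{it}^{\top}\beta) \big] \Big\},$$

and the random effects $\alpha_i$ are integrated out over their Gaussian distribution to obtain the marginal likelihood that is maximized.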
3. Results
In the current research, 24 stations are selected from the Punjab province of Pakistan for analysis. The selected stations are grouped into eight clusters by the K-means method, with the number of clusters chosen on the basis of the C-index. The MCFS method is applied to choose the important station for each cluster. Moreover, the analysis involves evaluating drought persistence within four different periods for the selected stations; for example, the winter and spring season data are used to calculate the winter to spring drought persistence. For winter to spring drought persistence, precipitation, temperature, wind_speed, and profile_soil_moisture are used as independent variables. Further, the odds ratio describes the relationship between the binary dependent variable and the independent variables. In the current analysis, the binary variable captures the influence of the preceding season on the current season, which is modeled with the CFEBLRM and REBLRM. The significance of CFEBLRM and REBLRM is evaluated by LRCST and WCT, and the Hausman test is used to check for endogeneity between the independent variables and the error term and then to select the appropriate model for the interseasonal meteorological drought persistence of the selected stations in the Punjab province of Pakistan. The persistence of drought has a negative impact on agriculture, economic structure, and living organisms. In the current study, we investigate the characteristics and persistence of meteorological drought in several meteorological stations of Punjab.
The appropriate probability distributions and their BIC values for the Standardized Precipitation Index of selected stations at three-month time scale are given.
Various characteristics of precipitation are given in Figure 4. The details of the important characteristics of precipitation are given in Table 1.
The highest mean precipitation values are noted in Murree and Rawalpindi, at 75.71 and 74.54 mm, respectively. Various probability distributions are used to standardize the SPI-3 values. The distribution selection is based on the minimum value of the Bayesian Information Criterion (BIC). The selected distributions and their BIC values for the selected meteorological stations are given in Table 2. The histograms of the appropriate probability distributions for SPI-3 are provided in Figure 5. For example, the log-normal is the suitable distribution for the Faisalabad and Jhelum stations, with BIC values of −1032.76 and −1255.70, respectively; the distributions for the other stations are given similarly. Moreover, the temporal plots of SPI-3 for several stations are given in Figure 6. Drought is classified into two categories on the basis of the SPI value. Positive values of the SPI indicate wet periods, whereas negative values indicate dry periods; drought is indicated by SPI ≤ 0 and no drought by SPI ≥ 0 [1, 77]. The varying drought categories observed in the selected meteorological stations are given in Figure 7. The figure, created using ArcGIS Pro 2.5 software, shows the total number of droughts of different categories occurring in selected years. The total number of droughts in the month of October for several years is provided in Figure 8. The drought counts for selected years are presented in Figure 9. For example, the drought counts in the spring season (March 1981 to May 2021) for Faisalabad, Rawalpindi, and Sialkot are 104, 110, and 106, respectively, and the counts for the other locations are reported accordingly. Additionally, the drought frequency is presented in Figure 10.
The drought frequency is evaluated as the total number of drought months in a season at a specific location divided by the total number of months in that season; for the spring season, the period is March 1981 to May 2021. For example, in the spring season, the drought frequency for Faisalabad, Rawalpindi, and Sialkot is 85, 89, and 86 percent, respectively, which corresponds to the spring drought counts of 104, 110, and 106 out of 123 spring months. Further, the seasonal drought persistence is presented in Figure 11; it is evaluated as the total number of droughts persisting in the current season from the previous season divided by the total number of droughts in the previous season. For example, for the spring to summer transition in Faisalabad, the number of droughts persisting into the summer season is 46 and the total number of droughts in the previous (spring) season is 104, so the spring to summer drought persistence in Faisalabad is 44 percent. Table 3 provides details about the winter-spring season drought persistence modeling.
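The frequency and persistence ratios defined above are simple to reproduce. The short sketch below recomputes the Faisalabad, Rawalpindi, and Sialkot examples from the counts quoted in the text; the function names are hypothetical, and the denominator of 123 months assumes three spring months in each of the 41 years from 1981 to 2021.

```python
def drought_frequency(drought_months, total_months):
    """Percentage of months in a season that were in drought."""
    return 100.0 * drought_months / total_months


def drought_persistence(persisting, previous_droughts):
    """Percentage of previous-season droughts that persist into
    the current season."""
    return 100.0 * persisting / previous_droughts


# Assumption: 41 spring seasons (1981-2021) with 3 months each.
spring_months = 41 * 3

# Spring drought counts quoted in the text.
spring_counts = {"Faisalabad": 104, "Rawalpindi": 110, "Sialkot": 106}
for station, count in spring_counts.items():
    print(station, round(drought_frequency(count, spring_months)))

# Spring to summer persistence for Faisalabad: 46 of 104 droughts persist.
print(round(drought_persistence(46, 104)))
```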
The log-likelihood values, WCT, LRCST, and p-values for the CFEBLRM and REBLRM are given. The p-values of both models are significant, which indicates that both models are important. However, the HT is used to test for endogeneity between the independent variables and the error term and to select the appropriate model between CFEBLRM and REBLRM for the winter to spring season for the selected locations of Punjab province, Pakistan. The p-value of the HT is 0.05, which indicates that there is no endogeneity between the independent variables and the error term in the winter-spring season data and that REBLRM is an appropriate choice for the winter-spring spatiotemporal drought persistence modeling.
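The logic of the Hausman comparison can be sketched for the simplest case of a single coefficient, where the statistic is the squared difference between the fixed effect and random effect estimates divided by the difference of their variances, referred to a chi-square distribution with one degree of freedom. Everything in this sketch is illustrative: the function names and the input estimates are hypothetical, the study's own test was run on the full coefficient vector in statistical software, and the 0.05 decision level is an assumption chosen to match the decisions reported in the text.

```python
import math


def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for one coefficient.

    The p-value uses the chi-square distribution with one degree of
    freedom, whose upper tail equals erfc(sqrt(h / 2)).
    """
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("variance difference must be positive")
    h = (b_fe - b_re) ** 2 / diff_var
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


def choose_model(p_value, level=0.05):
    """Assumed rule: prefer the fixed effect model when the p-value
    is below the level, otherwise the random effect model."""
    return "CFEBLRM" if p_value < level else "REBLRM"


# Hypothetical estimates and variances for a single coefficient.
h, p = hausman_scalar(b_fe=-0.020, b_re=-0.019, var_fe=0.00002, var_re=0.00001)
print(round(h, 3), round(p, 3), choose_model(p))
```

With several coefficients the same idea applies, with the difference vector and the difference of the covariance matrices replacing the scalars and the degrees of freedom equal to the number of coefficients compared.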
Table 4 presents the results derived from REBLRM for the winter to spring season. The results are obtained in Stata software. The p-values of the variables show a significant effect on meteorological drought persistence. Variables such as precipitation, temperature, and windspeed have a significant influence on meteorological drought persistence in varying seasons. However, a variable can be insignificant for a particular season, which means that it does not have a statistically detectable impact on drought persistence in that season. For example, precipitation, temperature, and windspeed are significant for the winter to spring season. The odds ratio represents how much the odds of persistence change for a one-unit change in an independent variable. The odds ratio of precipitation (0.981), with a significant p-value and 95% confidence interval, indicates that an increase in precipitation in the winter season decreases the probability of drought in the spring season. The changes in the dependent variable are influenced by the combined effect of the other factors that are included in the model. The odds ratio of precipitation (0.981) means that a one mm increase in precipitation multiplies the odds of drought persistence in the spring season by 0.981, a decrease of about 1.9 percent (roughly 0.02). The effect of a one-unit change is therefore small; however, the relationship is significant. Table 5 provides details regarding the spring-summer season drought persistence modeling. The table includes log-likelihood values, WCT, LRCST values, and p-values for both the CFEBLRM and REBLRM models. The significant p-values for both models indicate their importance. Additionally, the HT is employed to examine endogeneity between the independent variables and the error term and to select the appropriate model between CFEBLRM and REBLRM. The HT p-value ≤0.001 confirms the presence of endogeneity in the spring-summer season data. This underscores that CFEBLRM is the suitable choice for modeling spatiotemporal spring-summer drought persistence.
In Table 6, the results obtained from the CFEBLRM analysis for the spring to summer season are given. The p-values associated with the various independent variables demonstrate a significant impact on meteorological drought persistence, indicating the importance of precipitation, temperature, windspeed, and profile soil moisture. The odds ratio of precipitation (0.9977), with a 95% confidence interval and a significant p-value, suggests that an increase in precipitation during the spring season slightly diminishes the likelihood of summer season drought. Thus, a one mm increase in precipitation in spring corresponds to a reduction in the odds of drought persistence during summer of about 0.23 percent (1 − 0.9977 = 0.0023).
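The interpretation of the reported odds ratios can be made concrete with a small calculation. An odds ratio multiplies the odds for each one-unit increase in the variable, so the percentage change in the odds is the odds ratio minus one, times one hundred, and the effect of a larger change is the odds ratio raised to that power. The sketch below applies this to the two precipitation odds ratios quoted in the text; the function names and the 10 mm example are illustrative assumptions.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds for a one-unit increase."""
    return (odds_ratio - 1.0) * 100.0


def log_odds_coefficient(odds_ratio):
    """Logit coefficient corresponding to an odds ratio."""
    return math.log(odds_ratio)


def odds_ratio_for_change(odds_ratio, delta):
    """Multiplicative change in the odds for an increase of `delta` units."""
    return odds_ratio ** delta


# Precipitation odds ratios quoted in the text.
for label, value in (("winter-spring", 0.981), ("spring-summer", 0.9977)):
    print(label, round(percent_change_in_odds(value), 2),
          round(log_odds_coefficient(value), 4))

# Hypothetical 10 mm increase under the winter-spring odds ratio.
print(round(odds_ratio_for_change(0.981, 10), 3))
```

Note that these are changes in the odds, not in the probability itself; the change in probability depends on the baseline level at which the change is evaluated.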
The profile soil moisture has a strong negative relation to drought persistence in the spring to summer seasons. Table 7 presents the log-likelihood values, WCT, and LRCST values. The CFEBLRM and REBLRM both have a high negative log-likelihood value, which indicates that both models are significant. The HT with a p-value ≤0.001 indicates that endogeneity is present in the model, and the HT suggests that the CFEBLRM is appropriate for the analysis of summer to autumn spatiotemporal meteorological drought persistence modeling. Table 8 provides the outcomes of the CFEBLRM analysis applied to model spatiotemporal drought persistence trends during the transition from summer to autumn. The p-value in the table indicates that precipitation is significant.
Precipitation therefore has an impact on the persistence of drought from summer to autumn. The odds ratio of precipitation indicates that a one-unit increase in summer precipitation decreases the persistence of meteorological drought in the autumn season by 0.0593 for the selected locations. Table 9 gives the log-likelihood values, WCT, and LRCST values. The p-values of CFEBLRM and REBLRM indicate that both models are significant. The HT with a p-value of 0.2244 indicates that there is no endogeneity between the independent variables and the error term, and therefore, the Hausman test suggests that the REBLRM is appropriate for the analysis of autumn to winter meteorological drought persistence modeling. Table 10 presents the results obtained from REBLRM. The p-values are significant for several variables for persistence in the autumn to winter season.
This indicates a significant effect of these variables on drought persistence in the winter season. The odds ratio of precipitation (0.9840) shows that an increment in precipitation reduces the odds of drought persistence in the winter season by 0.016, and similarly, a unit change in windspeed and profile soil moisture reduces them by 0.8342 and 0.9902, respectively. The odds ratio of temperature has a positive relation with drought persistence, and temperature is significant for this season. The effect of temperature is determined in the presence of the other factors included in the model. Additionally, rho (ρ) is the proportion of the total variance attributed to the random effects in panel data. The between-group component (σ_u²) represents the variance associated with the random effects within the model; it captures the variability observed between different groups or units. The residual component (σ_e²) represents the residual variance, which captures the unexplained variability within each group after accounting for the random effects. If rho is close to 1, a large portion of the total variance in the data is explained by the random effects between the groups, and the residual variance is relatively small. If rho is close to zero, the total variance is mostly explained by the residual variance within the groups. In the current study, the rho value (0.5151) indicates that 51.51% of the total variance is attributable to the random effects between the groups. In the future, drought persistence should be calculated using these factors, which implies that a new index including them will improve the monitoring and modeling of drought persistence in various seasons in the selected locations. The influence of these factors may vary over time, but their inclusion in calculating the drought characteristics is crucial.
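The relation between rho and the two variance components can be sketched numerically. The example assumes the usual latent-variable convention for a random-effects logit, in which the residual variance is fixed at π²/3 (the variance of the standard logistic distribution); the source does not state which formula its software used, so both this convention and the function names are assumptions.

```python
import math

# Assumed residual variance: variance of the standard logistic distribution.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def rho_from_sigma_u(sigma_u: float) -> float:
    """Share of total variance due to the between-group random effect."""
    var_u = sigma_u ** 2
    return var_u / (var_u + LOGISTIC_RESIDUAL_VAR)


def sigma_u_from_rho(rho: float) -> float:
    """Random-effect standard deviation implied by a given rho (0 <= rho < 1)."""
    return math.sqrt(rho / (1.0 - rho) * LOGISTIC_RESIDUAL_VAR)


reported_rho = 0.5151  # value reported in the text for autumn to winter

# Random-effect standard deviation implied under the assumed convention.
implied_sigma_u = sigma_u_from_rho(reported_rho)
print(round(implied_sigma_u, 3))

# Round trip: recovering rho from the implied standard deviation.
print(round(rho_from_sigma_u(implied_sigma_u), 4))  # 0.5151
```

Under this convention, rho equals 0.5 exactly when the between-group variance matches π²/3, so the reported 0.5151 corresponds to a between-group variance slightly larger than the residual variance.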
Some factors in the current study have only a slight influence, but their effect on drought persistence in varying seasons is significant; therefore, based on the analysis, it is suggested to incorporate these factors when monitoring and modeling meteorological drought. Reliance on a single variable (say, precipitation) is not enough for monitoring and modeling spatiotemporal meteorological drought. The SPI needs to be modified by including new crucial factors.
4. Discussion
Several authors have contributed numerous frameworks to the literature aimed at assessing drought conditions across various climate conditions and environmental regions [57, 78, 79]. Specifically, some researchers monitored and modeled drought characteristics in their frameworks to enhance forecasting accuracy, which is useful for up-to-date decisions and early warning policies [44, 80–83]. Therefore, modeling and monitoring of drought characteristics are crucial for early warning and decision-making [12, 63, 84, 85]. Among the various drought characteristics that can be modeled, Meng et al. [85] and Niaz et al. [37] focused on drought persistence. Numerous investigators employed the logistic regression model for panel data [86–88] to evaluate interseasonal drought persistence in different regions. Recently, Niaz et al. [67] also focused on modeling drought persistence for various seasons, using the precipitation and moisture conditions of the previous seasons. However, the current study is based on various ensemble approaches (K-means clustering and MCFS) and meteorological factors (precipitation, temperature, windspeed, and profile soil moisture) to provide more comprehensive results for spatiotemporal meteorological drought in various seasons over the selected period. By understanding the patterns and influences of the meteorological factors, meteorologists and investigators can better understand and predict meteorological drought, which will assist in mitigating the impacts of drought on society. The addition of new meteorological factors gives this study an advantage over the mentioned studies. The K-means method is employed to divide the locations into homogeneous clusters. The MCFS is utilized for selecting important stations from the varying clusters. Further, in the current study, the SPI is employed at a 3-month time scale to determine drought characteristics across different seasons.
The binary panel logistic model is used to evaluate interseasonal drought persistence. The significance of the CFEBLRM and REBLRM is measured by the LRCST and WCT. The Hausman test is used to choose the appropriate model between the CFEBLRM and REBLRM for the analysis of interseasonal drought persistence. Meteorological drought has a negative impact on the agriculture and socioeconomic conditions of the country. The current study will help researchers develop more accurate drought monitoring and early warning systems, inform the drought community in developing drought policies, and facilitate drought management strategies to avoid negative impacts in the Punjab region.
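The model-selection logic of the Hausman test can be illustrated with a simplified sketch. It handles a single coefficient only, so the statistic follows a chi-square distribution with one degree of freedom; the full test compares the whole coefficient vector using the difference of the covariance matrices. The function names, the 0.05 threshold, and the numerical estimates are hypothetical and are not taken from the study's tables; only the p value 0.2244 comes from the text.

```python
import math


def hausman_single_coefficient(b_fe, var_fe, b_re, var_re):
    """Hausman statistic and p value for one coefficient (chi-square, 1 df).

    b_fe, var_fe: fixed-effects estimate and its variance.
    b_re, var_re: random-effects estimate and its variance.
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    stat = (b_fe - b_re) ** 2 / var_diff
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def choose_model(p_value, alpha=0.05):
    """Small p value: estimates differ systematically, prefer fixed effects."""
    return "CFEBLRM" if p_value < alpha else "REBLRM"


# Hypothetical estimates for illustration only.
stat, p = hausman_single_coefficient(b_fe=-0.061, var_fe=0.0004,
                                     b_re=-0.020, var_re=0.0001)
print(choose_model(p))

# p value reported in the text for the autumn to winter season.
print(choose_model(0.2244))  # REBLRM
```

A significant result means the fixed-effects and random-effects estimates differ systematically, which is the sign of endogeneity that leads to the CFEBLRM; otherwise the more efficient REBLRM is retained.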
5. Conclusion
Meteorological drought is a detrimental natural hazard that can have critical consequences for the affected region. Rainfall deficiency and persistent dry conditions can lead to water shortages, crop failures, and ecological imbalances. Therefore, monitoring and modeling drought are crucial for the concerned organizations to make rational and well-prepared decisions. Building upon this importance, we propose a new framework that integrates advanced computational techniques to analyze meteorological drought and develop robust drought models. In this regard, the K-means method is employed to identify homogeneous clusters, and the MCFS is utilized to choose the more important stations from the varying clusters. The SPI-3 is utilized to quantify drought occurrences. This quantification is used to measure drought persistence, which is employed as the dependent variable. Several meteorological factors are utilized as independent variables: precipitation, temperature, windspeed, and profile soil moisture. The CFEBLRM and the REBLRM are utilized for modeling meteorological drought persistence based on the selected meteorological factors. The significance of the CFEBLRM and REBLRM is determined by log-likelihood values, LRCST, WCT, and p values. The HT is employed to detect endogeneity and indicate the appropriate model between the CFEBLRM and REBLRM. From the outcomes of the current research, the likelihood values indicate the significance of both models. The HT value confirms endogeneity in summer to autumn and supports the CFEBLRM for drought persistence modeling. Interseasonal drought persistence is high, especially from summer to autumn and from autumn to winter.
The drought persistence for summer to autumn is between 90% and 99%, and the odds ratio of precipitation (0.9407; 95% confidence interval 0.9298 to 0.9518) indicates that a one mm increase in summer precipitation reduces the odds of meteorological drought persisting into the autumn season by 0.0593. The likelihood value (−76.7501) shows the significance of the CFEBLRM. Similarly, in the autumn to winter season, the HT with a p value of 0.2244 indicates no endogeneity between the independent variables and the error term, and therefore, the HT suggests that the REBLRM is appropriate for modeling meteorological drought persistence. The likelihood value (−174.7282) indicates the significance of the REBLRM. The drought persistence from autumn to winter ranges between 90% and 98%, signifying a substantial continuation of drought conditions from autumn to winter. The odds ratio of temperature (1.0545) indicates that a one-unit increment of temperature in the autumn season increases the odds of drought persistence in the winter season by about 5%; similarly, the odds ratios of precipitation (0.983), windspeed (0.166), and profile soil moisture (0.0098) indicate that a one-unit increase in precipitation, windspeed, and profile soil moisture in the autumn season reduces the odds of drought persistence in the winter season by about 0.02, 0.8342, and 0.9902, respectively. The results of the present research provide comprehensive and relevant information on the selected factors and thereby deliver accurate and precise information on drought persistence. The obtained information helps decision-makers understand the effects of meteorological drought occurrences, which can contribute to improved drought preparedness and response policies.
Data Availability
The data and codes used for the preparation of the manuscript are available from the corresponding author and can be provided upon request.
Ethical Approval
All procedures followed were in accordance with the ethical standards and with the Helsinki Declaration of 1975, as revised in 2000.
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Authors’ Contributions
All authors contributed equally to this study.
Acknowledgments
The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Large Groups Research Project under grant number RGP.2/44/44. This study was also supported via funding from Prince Sattam Bin Abdulaziz University, project number PSAU/2023/R/1444.